Downloaded from www.biorust.com on Sat Nov 21, 2009 09:34:09

 
Text & Number Management Functions
Tutorial Author - Scrowler (http://forums.biorust.com/member.php?userid=66)

Welcome to yet another Scrowler tutorial on PHP. This tutorial covers a range of text manipulation and management functions, many of which are invaluable for your scripts.

Topics that we will cover in this tutorial include:
- Text length limiting
- HTML formatting
- Replacing letters or words
- Form posted text
- Hashing strings (one-way, inc. mhash lib)
- Number manipulation
- Padding strings
- Converting text cases
- Word wrapping

Woah! What a list! Let’s get started then! This tutorial is written to give the average PHP beginner a firmer grasp on the world of string management with PHP, including some basic math functions. The content is geared towards web use, e.g. stripping HTML tags from a string, converting HTML entities, etc. You will even be taught how to hash strings, including a tour of the mhash PHP extension library.

First up on the agenda: Text length limiting.

Text length limiting
Text length limiting is a very useful feature of PHP, and it’s relatively easy to achieve. Using inbuilt functions of PHP you can cut a string off at a certain point, and you can even check its length with a single function call!

So, let’s say that we have a submission form for a local soccer club to allow coaches to submit overviews of weekly soccer matches. The administrator doesn’t want his database to get huge, because he knows how many teams there are in the club, and he suspects that coaches might get carried away! So what does he do? The answer is simple - He imposes a text length restriction!

On Monday, the administrator is feeling rather harsh, so he decides to use substr() to simply cut everything off the string from a certain point onwards, to limit the length. He reads this tutorial and finds out the syntax for substr() is as follows: [string], [start], [length].

He checks the database and it’s huuuuuge! How can he manage to allow so much data to go in every week? Fortunately, he understands the concepts in this tutorial and is able to solve his problem in a harsh way.  Here is the code he decides to use:
 

<?php

function cutText($string)
{
    # keep only the first 5000 characters
    return substr($string, 0, 5000);
}

$string = cutText($_POST['overview']);

?>

This works successfully! If a coach inputs an overview larger than 5000 characters, anything past the 5000th character is cut off!
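
To see exactly what substr() keeps, here is a minimal sketch that uses a short hard-coded string in place of the form data. The helper name cutTextTo() and the sample sentence are made up for this illustration; only substr() and strlen() come from PHP itself.

```php
<?php

# Hypothetical helper: cut $text down to $limit characters and
# report whether anything had to be thrown away.
function cutTextTo($text, $limit)
{
    $cut = substr($text, 0, $limit);   # start at position 0, keep $limit characters
    $wasCut = strlen($text) > $limit;  # true if the original was longer
    return array($cut, $wasCut);
}

list($short, $wasCut) = cutTextTo("Great match today!", 11);

echo $short;                   # Great match
echo $wasCut ? " (cut)" : "";  # (cut)
```

Keep in mind that substr() and strlen() count bytes, so a multi-byte character (in UTF-8 text, for example) counts as more than one.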

So the week goes by, the next soccer round commences, and a coach goes to submit his weekly overview. He loses half of it, and what does he do? He complains of course! How unfair - They didn’t even back it up!

The administrator decides to be a little nicer this time, and instead of just cutting the end off, he decides to tell the user if it’s too long, so they can cut it down to size manually. For this, he uses the function strlen() which simply takes the input argument [string] and returns its length.
 

<?php

function checkLength($string)
{
    # true when the text fits within the 5000 character limit
    return strlen($string) <= 5000;
}

$text = $_POST['overview'];

if (checkLength($text)) {
    # process the data
} else {
    echo 'Sorry for the trouble, your data was longer than 5000 characters, could you please shorten it and retry? Thanks!';
}

?>

Problem solved! The soccer coach now has the opportunity to shorten his overview instead of completely losing the excess!  This gives us a happy administrator, as his database remains a manageable size, and happy coaches as they can submit their overviews with ease!

That concludes the section on text limiting, so let’s move on to HTML formatting!


HTML Formatting
Often, when you want to output HTML code from a database onto a page and you don’t want it to be rendered as HTML, you need to convert its special characters to their HTML entities. An entity is a representation of a character that the browser displays instead of parsing as markup, e.g. the < symbol’s HTML entity is &lt; and the & symbol’s is &amp;

Surprisingly, converting tags to their entities is as easy as 1, 2, 3 - PHP has a few inbuilt functions that do it for you! We will look at the htmlentities() function in this example, as it covers almost everything you will probably need. This function simply returns the string with every character that has an HTML entity encoded as that entity. So, if we input “<&” it will return “&lt;&amp;”.

Our administrator from the previous section knew of this function, so he implemented it in his scripting, so that if somebody put in some HTML by accident into their overview, it would convert it. Let’s see what he came up with:
 

<?php

$text = $_POST['overview'];

$text = htmlentities($text);

?>

Done! It’s as simple as that! None of the inputted HTML will be parsed, and it will simply appear on the page exactly as it was typed!

Now, let’s say he had a section where the user could input customized HTML code for the headings of their overviews. He thought he could simply take the “entitized” header code (because it’s a separate variable in this example) and convert it back to HTML. Possible? Of course! He uses the function: html_entity_decode(). It accepts an entitized string and returns the HTML’ized version of that string. Let’s see what he came up with:
 

<?php

$header = $_POST['header'];

$entities = htmlentities($header);
# $entities contains the html tags converted to entities

$html = html_entity_decode($entities);
# $html contains the regular html again

?>
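
A quick round trip with a hard-coded string shows both functions at work. The sample header below is made up and stands in for the posted form value; this is only a sketch of the behaviour, not the administrator’s actual script.

```php
<?php

# Made-up input standing in for $_POST['header']
$header = '<h2>Lions & Tigers: 3 > 2</h2>';

$entities = htmlentities($header);
echo $entities . "\n";   # &lt;h2&gt;Lions &amp; Tigers: 3 &gt; 2&lt;/h2&gt;

$html = html_entity_decode($entities);
var_dump($html === $header);   # bool(true)
```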

By now, our administrator is a PHP expert, so he decides that instead of converting the HTML tags to entities, he’ll simply take them all out! To do this, he uses strip_tags(). The arguments for this function are: [string], [allowable tags]. So, if he inputs a string and defines <b> as allowable, it won’t remove the <b> tag, but will remove all other HTML tags. Let’s see what he came up with:
 

<?php

$text = $_POST['overview'];

$text = strip_tags($text, "<h1>");

?>

He’s smart, so he tells PHP to strip every tag except <h1> so that coaches can still have headings in their overviews! Now let’s move on to replacing letters and words.
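
Here is a small sketch of what strip_tags() does with and without the allowable tags argument. The overview text is invented for the example.

```php
<?php

# Made-up overview text containing a heading, a paragraph and bold text
$text = '<h1>Round 3</h1><p>We <b>won</b> 2-1!</p>';

$keepHeadings = strip_tags($text, "<h1>");
echo $keepHeadings . "\n";   # <h1>Round 3</h1>We won 2-1!

$noTags = strip_tags($text);
echo $noTags . "\n";         # Round 3We won 2-1!
```

Notice that only the tags are removed; the text between them stays, and nothing is inserted where a tag used to be.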

Replacing letters or words
This can be an important feature if, for example, the administrator wanted all personal references to him to be replaced with “The Administrator”. Luckily PHP has a couple of functions to do this for him!  To keep things simple, though, we will only look at one: str_replace(). This function accepts the arguments: [look for], [replacement], [string], [optional - count]. You don’t have to supply the count variable, but if you do, PHP fills it with the number of replacements it made. It does not limit how many occurrences are replaced: str_replace() always replaces every occurrence of [look for].

Let’s see what he did:

<?php

$text = $_POST['overview'];

$text = str_replace("John", "The Administrator", $text);

?>

So, if a coach inputs “Unfortunately, John was too lazy to make it to our match today…” it would be returned as: “Unfortunately, The Administrator was too lazy to make it to our match today…”. Voila! You can use this for swear word filters, tooltip entries, anything!
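
The sketch below shows the count argument in action: str_replace() fills it with the number of replacements made. If you only want to replace the first few occurrences, one option (swapped in here, it is not covered elsewhere in this tutorial) is preg_replace(), which accepts a limit. The sample sentence is made up.

```php
<?php

$text = "John said John would call John.";

# The optional fourth argument receives the number of replacements made
$all = str_replace("John", "The Administrator", $text, $count);
echo $all . "\n";    # The Administrator said The Administrator would call The Administrator.
echo $count . "\n";  # 3

# preg_replace() takes a limit: here only the first two matches are replaced
$firstTwo = preg_replace('/John/', 'The Administrator', $text, 2);
echo $firstTwo . "\n";  # The Administrator said The Administrator would call John.
```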

Let’s fly on to form posted text.

Form Posted Text
This is a short section, as there are really only two functions I’m going to cover - ones that add and remove backslashes in front of specific characters (single quotes, double quotes and backslashes) that would otherwise cause the script to work incorrectly, or die altogether. They are used very regularly with web forms to add backslashes in front of quotes, to stop them from ending a quoted string early.

The function we use to add slashes is, fittingly enough, called addslashes(). It simply takes a string in, and returns it with the slashes added. Some older PHP configurations do this automatically for form data (the magic_quotes_gpc setting, which was removed in PHP 5.4), but it’s good to be safe.

Then, once you’ve got through all the posting and transport, simply use stripslashes() to remove those slashes again and your string is as normal!

<?php

$text = $_POST['overview'];

$safe = addslashes($text);
# example – Today Mary said "Hello" is returned as Today Mary said \"Hello\"

$normal = stripslashes($safe);
# example – string is Today Mary said "Hello" again

?>

And that’s all for this subsection. If you want to learn more, try these links:
http://www.php.net/addslashes & http://www.php.net/stripslashes
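
As a quick check of the round trip, here is a sketch with a hard-coded string (made up for the example) that contains both kinds of quote:

```php
<?php

# Made-up form input containing double and single quotes
$text = 'Mary said "Hello" to O\'Neil';

$safe = addslashes($text);
echo $safe . "\n";     # Mary said \"Hello\" to O\'Neil

$normal = stripslashes($safe);
echo $normal . "\n";   # Mary said "Hello" to O'Neil

var_dump($normal === $text);   # bool(true)
```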

Hashing strings: encryption made easy
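Do you want to keep a password secure but don’t want to have to learn those hardcore DES encryption algorithms to do it? Hashing is your answer. It’s easy and fast, and it only works one way: a hash can’t be turned back into the original string. It is often used for passwords all over the WWW, although on current PHP versions the built-in password_hash() and password_verify() functions are the better choice for real password storage.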

Since I like hashing and encryption, I’ll explain a bit about mhash hashing as well as regular PHP hashing. But first, you must decide which of the 2 most common hashing functions you want to use!

The MD5 algorithm returns a hash 32 hexadecimal characters long (16 bytes), while the SHA1 algorithm returns a hash 40 hexadecimal characters long (20 bytes). There is really no difference between the usage of the two, as they both operate in the same way, i.e. they both accept a string argument and return their hash of it.

Q. So hashing is used for passwords... but how good is that if I can’t decrypt it?
A. I asked that question too when I was learning hash-ography. The answer is simple and logical: compare the newly hashed password to an already hashed password.

MD5 and SHA1 hashes can be written out in several ways, such as hexadecimal, base64 or raw binary. PHP’s md5() and sha1() return hexadecimal output by default, which is a mix of the digits 0-9 and the letters a-f.

So, how about we get some hashing done? Let’s use MD5 and SHA1 to hash a string:

<?php

$string = "scrowler likes apples";

$md5 = md5($string);
# $md5 = 8a74418dc9eea9d7e44bd580f9892b9b

$sha1 = sha1($string);
# $sha1 = 6d3ac220bf069eaa182afd67b1256ee1240ece41

?>
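
The different output forms can be compared with a short sketch. It relies on the optional second argument of md5() (true for raw binary output), plus bin2hex() and base64_encode() to convert the raw bytes.

```php
<?php

$string = "scrowler likes apples";

$hex = md5($string);         # 32 hexadecimal characters (the default)
$raw = md5($string, true);   # the same hash as 16 raw bytes

echo strlen($hex) . "\n";            # 32
echo strlen($raw) . "\n";            # 16
echo strlen(sha1($string)) . "\n";   # 40

# Converting the raw bytes to hexadecimal gives the default output again
var_dump(bin2hex($raw) === $hex);    # bool(true)

# base64 is simply another way of writing the same raw bytes
$b64 = base64_encode($raw);
echo $b64 . "\n";
```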

And that’s basic hashing. If you want to compare passwords or phrases, do it like this:

<?php

$storedphrase = "6d3ac220bf069eaa182afd67b1256ee1240ece41";

$string = "scrowler likes apples";

if(sha1($string) === $storedphrase) echo "Yay, they match!";

?>
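
The same comparison can be wrapped in a small function. This is only a sketch: the helper name phraseMatches() is made up, and the stored hash is generated on the spot rather than read from a database.

```php
<?php

# Hypothetical helper: does $attempt hash to the stored value?
function phraseMatches($attempt, $storedHash)
{
    # === compares the two hash strings exactly, character for character
    return sha1($attempt) === $storedHash;
}

# When the phrase is first set, only its hash is stored
$storedHash = sha1("scrowler likes apples");

var_dump(phraseMatches("scrowler likes apples", $storedHash));  # bool(true)
var_dump(phraseMatches("scrowler likes pears", $storedHash));   # bool(false)
```

Using === rather than == avoids PHP’s loose comparison rules, which can treat two different strings that both look like numbers as equal.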

For more advanced users, here’s a little guide to using the mhash library of PHP functions. If you think md5 and sha1 will be fine for you, you can skip forward.

Mhash
Mhash is, in short, an additional PHP library that allows more advanced hashing functions. The list of algorithms that it can use is given below (taken from php.net on 18/3/05):
- MHASH_MD5
- MHASH_SHA1
- MHASH_HAVAL256
- MHASH_HAVAL192
- MHASH_HAVAL160
- MHASH_HAVAL128
- MHASH_RIPEMD160
- MHASH_GOST
- MHASH_TIGER
- MHASH_CRC32
- MHASH_CRC32B

So essentially, it still includes md5 and sha1 but also a number of other hashing algorithms that you can use.

Mhash is a small library, consisting only of 5 functions. So I’ll outline them all.
mhash_count() - Gets the highest available hash ID
mhash_get_block_size() - Gets the block size of the specified hash
mhash_get_hash_name() - Gets the name of the specified hash
mhash_keygen_s2k() - Generates a key
mhash() - Computes the hash

mhash() is the function we will worry about first, as it’s the function that actually does the hashing. It accepts arguments in the following order: [hash name], [string], [key]. Basically, you specify one of the hash constants above in [hash name] without quotes, and you input a string. The key is not required: if you don’t specify one, mhash() returns a standard hash, and if you do, it returns the HMAC instead (Hash-based Message Authentication Code, a hash that can only be reproduced by someone who knows the key). If the algorithm doesn’t support HMAC modes, mhash() returns false.

The difference between using md5("") and using mhash(MHASH_MD5, "") is that mhash() returns raw bin output, which must be converted to hexadecimal by using the function bin2hex(). This way, the functions will both output the same thing.
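
The raw versus hexadecimal difference can be tried out even where mhash isn’t installed. The sketch below swaps in hash() and hash_hmac() from PHP’s separate hash extension; that substitution is an assumption made for this example and those two functions are not part of mhash. The key is also made up.

```php
<?php

$string = "scrowler likes apples";

# Third argument true = raw binary output, the form mhash() returns
$raw = hash('md5', $string, true);
$hex = bin2hex($raw);

var_dump($hex === md5($string));                     # bool(true)
var_dump(hash('sha1', $string) === sha1($string));   # bool(true)

# Supplying a key gives an HMAC instead of a plain hash
$key  = "my secret key";   # made-up key for this example
$hmac = hash_hmac('md5', $string, $key);

var_dump($hmac === md5($string));   # bool(false)
echo strlen($hmac) . "\n";          # 32
```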

mhash_keygen_s2k() is the function we use to generate keys. This function accepts [hash name], [password], [salt], [bytes]. Basically, you input the hash name like you did with mhash(), then you input a password, and a randomly generated salt that must be at most 8 bytes long. If it’s shorter, PHP will pad it with 0’s to make it 8. The bytes integer tells the function how long to make the generated key. Please note that although your salt should be random, you must keep a copy of it, because the same password only produces the same key again when it is combined with the same salt.

mhash_get_block_size() is a simple function that takes in a [hash name] and returns the block size for that algorithm. If the hash doesn’t exist, it will return false.

mhash_get_hash_name() is also a simple function. It takes in the ID of a hash (one of the constants above) and returns the plain name of the hash, i.e. if MHASH_MD5 was input, it would return MD5.

mhash_count() is probably the simplest mhash function. It doesn’t have any inputs and simply returns the highest available hash ID. Hash IDs are numbered from 0, so looping from 0 up to and including this value visits every algorithm in the mhash library.

So, to conclude the mhash section of this tutorial, let’s write a little scriptlet that will hash a string in every algorithm!

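<?php

# mhash_count() returns the highest hash ID, so IDs run from 0 to that value
$highest_id = mhash_count();

$string = "scrowler likes apples";
$used = 0;

for ($i = 0; $i <= $highest_id; $i++) {

    $name = mhash_get_hash_name($i);
    if ($name === false) {
        continue; # no algorithm is assigned to this ID
    }

    echo "<p>Hashing using: ".$name."<br />Original string: <em>".$string."</em><br />";

    # mhash() returns raw binary, so convert it to hexadecimal for display
    $hash = bin2hex(mhash($i, $string));
    echo "Hashed: ".$hash."</p>";
    $used++;

}

echo $used . " hashing algorithms used.";

?>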

That concludes the mhash and hashing section of this tutorial. I hope you learned something about hashing!

Number Manipulation
Well, now that we’ve finished hashing let’s move on to some basic math. PHP has arithmetic operators built in to help you with math, so let’s have a look at some:

Addition: [num] + [num]
Subtraction: [num] - [num]
Multiplication: [num] * [num]
Division: [num] / [num]
Modulus: [num] % [num]

What is modulus? Well, it returns the remainder of num1 divided by num2. E.g. the modulus of 5 and 2 is 1, because 5 / 2 = 2 remainder 1.

For iteration looping, if you want to add one to your iteration identifier, you can use '$i++' instead of '$i += 1' or '$i = $i + 1'. Similarly, if you want to take one off, you can use '$i--'.

Here’s an example:

<?php

$num1 = 5;
$num2 = 2;

$add = $num1 + $num2; # 7
$sub = $num1 - $num2; # 3
$mul = $num1 * $num2; # 10
$div = $num1 / $num2; # 2.5
$mod = $num1 % $num2; # 1

?>
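
Modulus is easiest to appreciate with a practical case. The sketch below uses made-up numbers to split a whole-number division into its quotient and remainder, then uses % together with $i++ in a loop to tell odd from even.

```php
<?php

# 17 oranges shared between 5 players (made-up numbers)
$oranges = 17;
$players = 5;

$each     = (int) floor($oranges / $players);  # 3 whole oranges each
$leftover = $oranges % $players;               # 2 left over

echo $each . "r" . $leftover . "\n";           # 3r2

# A common use of modulus: telling odd from even inside a loop
for ($i = 1; $i <= 4; $i++) {
    echo $i . ($i % 2 == 0 ? " is even" : " is odd") . "\n";
}
```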

Finally, I’ll cover exponents and square roots. PHP has built-in functions to deal with each of these: pow() and sqrt(). Let’s look at pow() first. It accepts the base number and the exponent and is pretty straightforward. Example: pow(2,2) returns 4, pow(3,3) returns 27.

sqrt() accepts one argument only: the number to take the square root of. sqrt(4) returns 2, sqrt(9) returns 3, sqrt(0) returns 0, etc. If your number is negative, its square root is an imaginary number, which PHP can’t represent, so sqrt() returns the special value NAN (“not a number”) instead. To calculate the square root of negative numbers, square root the abs() (positive) value of your number and slap an “i” on the end of it. E.g. sqrt(-9) would return NAN, but if you use '$num=sqrt(abs(-9))."i"' then $num will contain “3i”. For more information about i and imaginary numbers, Google it!

sqrt() can be replicated by the following: pow([num], 0.5). A number to the power of 0.5 is the same as the square root of that number.
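
Here is a short sketch pulling the pow() and sqrt() examples together, including what PHP gives back for a negative number:

```php
<?php

echo pow(2, 2) . "\n";     # 4
echo pow(3, 3) . "\n";     # 27
echo sqrt(9) . "\n";       # 3
echo pow(9, 0.5) . "\n";   # 3

# The square root of a negative number comes back as NAN ("not a number")
var_dump(is_nan(sqrt(-9)));   # bool(true)

# Building the imaginary result by hand
$num = sqrt(abs(-9)) . "i";
echo $num . "\n";             # 3i
```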

Anyway, enough of that! Onwards to Padding strings!

Padding Strings
This will be a short section, as it only uses one function: str_pad(). Going allllll the way back to the first section, let’s imagine that our administrator wants to take somebody’s name inputted via a form, and if it’s not x characters long, add spaces or dashes or something to make it that long. This is the main function of str_pad().

It accepts these arguments: [input], [pad length], [pad], [where]. [Input] is the inputted string, [pad length] is the length you want your string to be, [pad] is the string you will pad it with (it can be numbers, letters, anything), and [where] specifies where to put the padding. If you don’t specify anything, it puts it to the right. You can put any of the following values in there:

STR_PAD_RIGHT
STR_PAD_LEFT
STR_PAD_BOTH


So, if you put in STR_PAD_RIGHT it will put the padding on the right of the existing string. STR_PAD_BOTH will divide the necessary amount of padding by 2 and add half to each side; if the amount is odd, the extra character goes on the right.
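
Since this section has no example of its own, here is a minimal sketch of str_pad() with each of the three options. The name and the pad lengths are made up.

```php
<?php

$name = "Bob";

echo str_pad($name, 6, "-") . "\n";                 # Bob---
echo str_pad($name, 6, "-", STR_PAD_LEFT) . "\n";   # ---Bob
echo str_pad($name, 8, "-", STR_PAD_BOTH) . "\n";   # --Bob---
echo str_pad("7", 3, "0", STR_PAD_LEFT) . "\n";     # 007

# A string that is already long enough is returned unchanged
echo str_pad("Robert", 3, "-") . "\n";              # Robert
```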

Converting Text Cases
We are nearing the end of this article! This section is also reasonably short, so I’m going to outline 4 PHP functions that change character cases:

strtolower() takes a string, and converts it to lowercase.

strtoupper() takes a string and converts it to uppercase.

ucwords() takes a string and converts the first letter of each word to uppercase. If it’s already uppercase it leaves it alone.

ucfirst() takes a string and converts only the first letter of the whole string to uppercase. Once again it leaves it alone if it’s already uppercase.

They each take one string as input, containing the text to be formatted. So, let’s write an example!

<?php

$text = "scrowler LIKES apples";

$lo = strtolower($text); # scrowler likes apples
$up = strtoupper($text); # SCROWLER LIKES APPLES
$wr = ucwords($text); # Scrowler LIKES Apples
$fr = ucfirst($text); # Scrowler LIKES apples

?>
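
These functions can also be combined. As a sketch (the input below is made up), lowercasing first and then applying ucwords() tidies up a name typed with messy capitalisation, and lowercasing both sides lets you compare strings without caring about case:

```php
<?php

# Made-up form input with messy capitalisation
$name = "jOHN mcDONALD";

$tidy = ucwords(strtolower($name));
echo $tidy . "\n";   # John Mcdonald

# Comparing two strings while ignoring case
var_dump(strtolower("Apples") === strtolower("APPLES"));   # bool(true)
```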

And, finally, I’ll conclude the article with a short section on word wrapping.

Word Wrapping
Since this isn’t very challenging, this section will also be short. You can do it with one function! This is useful if you need to break text into lines of at most x characters without splitting words. The function is called wordwrap() and accepts the arguments: [string], [width], [linebreak], [cut]. We won’t worry about the cut argument at the moment.

[Width] specifies the maximum width of a line, and [linebreak] specifies what to put at the end of each line. Use "<br />\n" (in double quotes, so that PHP turns \n into a real newline) to do a linedrop in both <pre> tags and in regular XHTML.

Here’s an example.

<?php

$string = "a long string that will be cut up into smaller parts.";

echo wordwrap($string, 10, "<br />\n");

?>

This will output:

a long<br />
string<br />
that will<br />
be cut up<br />
into<br />
smaller<br />
parts.
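
For the curious, here is a sketch of what the [cut] argument changes. The sample string is made up; a plain newline is used as the line break to keep the output easy to read.

```php
<?php

$string = "Congratulations team";

# Without cut, a word longer than the width is left in one piece
$plain = wordwrap($string, 6, "\n");
echo $plain . "\n";
# Congratulations
# team

# With cut set to true, long words are split at the width
$cut = wordwrap($string, 6, "\n", true);
echo $cut . "\n";
# Congra
# tulati
# ons
# team
```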

It’s that easy. Have fun! Oh, and that concludes the article! I hope you've learned as much as our database administrator! ;) If you have any questions, you can contact me via the Creative Forums - my profile is linked at the top of this tutorial, and from there you can PM or email me.  See ya!




All Content © BioRUST 2009 All Rights Reserved.