| Downloaded from www.biorust.com on Sat Nov 21, 2009 09:34:09 |
| Text & Number Management Functions Tutorial Author - Scrowler (http://forums.biorust.com/member.php?userid=66) |
Welcome to yet another Scrowler tutorial on PHP. This
tutorial covers a range of text manipulation and management functions,
many of which are invaluable for your scripts.
Topics that we will cover in this tutorial include:
- Text length limiting
- HTML formatting
- Replacing letters or words
- Form posted text
- Hashing strings (one way encryption, inc. mhash lib)
- Number manipulation
- Padding strings
- Converting text cases
- Word wrapping
Woah! What a list! Let’s get started then! This tutorial is written for the
average PHP beginner to get a firmer grasp on the world of string management
with PHP, including some basic math functions. The content is geared toward
web use, e.g. stripping HTML tags from a string, converting HTML entities, etc.
You will even be taught how to hash strings, going far into the mhash
PHP extension library.
First up on the agenda: Text length limiting.
Text length limiting
Text length limiting is a very useful feature of PHP, and it’s relatively easy
to achieve. Using inbuilt functions of PHP you can cut a string off at a certain
point, and you can even check its length with a single function call!
So, let’s say that we have a submission form for a local soccer club to allow
coaches to submit overviews of weekly soccer matches. The administrator doesn’t
want his database to get huge, because he knows how many teams there are in the
club, and he suspects that coaches might get carried away! So what does he do?
The answer is simple - He imposes a text length restriction!
On Monday, the administrator is feeling rather harsh, so he decides to use
substr() to simply cut everything off the string from a certain point onwards,
to limit the length. He reads this tutorial and finds out that substr() takes
its arguments in this order: [string], [start], [length]. [start] is the
position to begin at (0 is the first character) and [length] is the maximum
number of characters to keep.
He checks the database and it’s huuuuuge! How can he manage to allow so much
data to go in every week? Fortunately, he understands the concepts in this
tutorial and is able to solve his problem in a harsh way. Here is the code
he decides to use:
|
<?php
function cutText($string) {
    $string = substr($string, 0, 5000);
    return $string;
}

$string = cutText($_POST['overview']);
?> |
This works successfully! If a coach inputs an overview
larger than 5000 characters, anything past the 5000th character is
cut off!
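To see what the three arguments do, it helps to try the same idea on a much shorter string. This is only a small sketch: limit_text() is a made-up helper for illustration (it is not part of PHP or of the administrator's script), and the limit of 5 is chosen just to keep the output readable.

```php
<?php
// Hypothetical helper for illustration: the same idea as cutText(),
// but with the limit passed in so it can be tried on a short string.
function limit_text($string, $max_length) {
    // substr([string], [start], [length]): begin at position 0 and
    // keep at most $max_length characters.
    return substr($string, 0, $max_length);
}

$short = limit_text("Great match today", 5); // cut down to "Great"
$same  = limit_text("Goal", 5);              // already short enough, unchanged

echo $short . "\n";
echo $same . "\n";
```

Note that substr() counts bytes, so text in a multibyte encoding such as UTF-8 can be cut in the middle of a character; mb_substr() from the mbstring extension is the multibyte-aware variant.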
So the week goes by, the next soccer round commences, and a coach goes to
submit his weekly overview. He loses half of it, and what does he do?
He complains of course! How unfair - They didn’t even back it up!
The administrator decides to be a little nicer this time, and instead of
just cutting the end off, he decides to tell the user if it’s too long, so
they can cut it down to size manually. For this, he uses the function strlen() which
simply takes the input argument [string] and returns its length.
|
<?php
function checkLength($string) {
    $length = strlen($string);
    if ($length > 5000) {
        return false;
    } else {
        return true;
    }
}

$text = $_POST['overview'];
if (checkLength($text) == true) {
    # process the data
} else {
    echo 'Sorry for the trouble, your data was longer than 5000 characters, could you please shorten it and retry? Thanks!';
}
?> |
Problem solved! The soccer coach now has the opportunity to shorten his
overview instead of completely losing the excess! This gives us a happy
administrator, as his database remains a manageable size, and
happy coaches as they can submit their overviews with ease!
That concludes the section on text limiting, so let’s move on to HTML
formatting!
HTML Formatting
Often if you want to output HTML code onto a page from a database,
and you don’t want it to be executed as HTML, you need to convert the special
characters in each tag to their HTML entities. An entity is a representation
of a character that the browser displays instead of parsing, e.g. the < symbol’s
HTML entity is &lt; and the & symbol’s is &amp;.
Surprisingly, converting tags to their entities is as easy as 1, 2, 3 - PHP has a
few inbuilt functions that do it for you! We will look at the htmlentities()
function in this example, as it covers most of what you will probably use.
This function simply returns the string with the special characters encoded as
entities. So, if we input “<&” it will return “&lt;&amp;”.
Our administrator from the previous section knew of this function, so he
implemented it in his scripting, so that if somebody put in some HTML by
accident into their overview, it would convert it. Let’s see what he came up
with:
|
<?php
$text = $_POST['overview'];
$text = htmlentities($text);
?> |
Done! It’s as simple as that! None of the inputted HTML will be
parsed, and it will simply appear as you would code it!
Now, let’s say he had a section where the user could input customized HTML
code for their headings of the overviews. He thought he could simply take
the “entitized” header code (because it’s a separate variable in this
example) and convert it back to HTML. Possible? Of course! He uses the
function: html_entity_decode(). It accepts an entitized string and returns
the HTML’ized version of that string. Let’s see what he came up with:
|
<?php
$header = $_POST['header'];
$entities = htmlentities($header);
# $entities contains the html tags converted to entities
$html = html_entity_decode($entities);
# $html contains the regular html again
?> |
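Here is the same round trip with a fixed string instead of form input, so you can see exactly what comes out of each call. The sample text is made up for illustration.

```php
<?php
$raw = '<b>We won</b> & celebrated';

// Convert the special characters to entities so a browser shows the tags as text.
$encoded = htmlentities($raw);

// Convert the entities back to the original characters.
$decoded = html_entity_decode($encoded);

echo $encoded . "\n"; // &lt;b&gt;We won&lt;/b&gt; &amp; celebrated
echo $decoded . "\n"; // <b>We won</b> & celebrated
```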
By now, our administrator is a PHP expert, so
he decides that instead of converting the HTML tags to entities, he’ll
simply take them all out! To do this, he uses strip_tags(). The
arguments for this function are: [string], [allowable tags]. So, if he
inputs a string and defines <b> as allowable, it won’t remove the <b>
tag, but will remove all other HTML tags. Let’s see what he came up
with:
|
<?php
$text = $_POST['overview'];
$text = strip_tags($text, "<h1>");
?> |
He’s smart, so he tells PHP to strip every tag except <h1> so that coaches can still have headings in their overviews! Now let’s move on to replacing letters and words.
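To make the effect of the [allowable tags] argument visible, here is a small sketch with a made-up overview string, run once without any allowed tags and once with <h1> allowed.

```php
<?php
$overview = '<h1>Round 5</h1><p>We <b>won</b> 3-1!</p>';

// No second argument: every tag is removed.
$no_tags = strip_tags($overview);

// <h1> is allowed, so the heading survives and everything else goes.
$keep_h1 = strip_tags($overview, '<h1>');

echo $no_tags . "\n"; // Round 5We won 3-1!
echo $keep_h1 . "\n"; // <h1>Round 5</h1>We won 3-1!
```

Only the tags themselves are removed; the text between them stays, which is why "Round 5" and "We" run together in the first result.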
Replacing letters or words
This can be an important feature if, for example, the administrator
wanted all personal references to him to be replaced with “The Administrator”. Luckily PHP has a
couple of functions to do this for him! To keep things simple,
though, we will only look at one: str_replace(). This function accepts
the arguments: [look for], [replacement], [string], [optional- count].
You don’t have to supply the count variable. If you do, PHP doesn’t read
a limit from it; instead it fills it with the number of replacements it
made. str_replace() always replaces every occurrence of [look for], so to
replace only the first two names and leave the rest you would need a
different function, such as preg_replace() with its limit argument.
Let’s see what he did:
|
<?php
$text = $_POST['overview'];
$text = str_replace("John", "The Administrator", $text);
?> |
So, if a coach inputs “Unfortunately, John was too
lazy to make it to our match today…” it would be returned as:
“Unfortunately, The Administrator was too lazy to make it to our match
today…”. Voila! You can use this for swear word filters, tooltip
entries, anything!
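The optional count argument is easy to misread, so here is a small sketch of what it does. PHP writes the number of replacements into the variable you pass; it does not limit them. For a real limit, one option is preg_replace(), whose fourth argument caps the number of replacements. The sample sentence is made up for illustration.

```php
<?php
$text = "John missed training. John missed the match. John is sorry.";

// PHP fills $count with the number of replacements it made.
$all = str_replace("John", "The Administrator", $text, $count);

// preg_replace() takes a limit: replace only the first two occurrences.
$first_two = preg_replace('/John/', 'The Administrator', $text, 2);

echo $count . "\n";     // 3
echo $first_two . "\n"; // the third "John" is left alone
```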
Let’s fly on to form posted text.
Form Posted Text
This is a short section, as there are really only two functions I’m going
to cover - ones that add and remove backslashes in front of specific
characters (single quotes, double quotes, backslashes and NUL bytes) that would
cause the script to work incorrectly, or die altogether. They are used
very regularly in web forms to escape quotes so that they can’t end a
quoted string early.
The function we use to add slashes is, fittingly enough, called
addslashes(). It simply takes a string in, and returns it with the
slashes added. Older PHP configurations did this to form data
automatically (the magic_quotes_gpc setting, removed in PHP 5.4), so on
current versions you have to call it yourself.
Then, once you’ve got through all the posting and transport, simply use
stripslashes() to remove those slashes again and your string is
back to normal!
|
<?php
$text = $_POST['overview'];
$safe = addslashes($text);
# example – Today Mary said "Hello" is returned as Today Mary said \"Hello\"
$normal = stripslashes($safe);
# example – string is Today Mary said "Hello" again
?> |
And that’s all for this subsection. If you want to learn more, try
these links:
http://www.php.net/addslashes
&
http://www.php.net/stripslashes
Hashing strings: encryption made easy
Do you want to keep a password secure but don’t want to have to learn
those hardcore DES encryption algorithms to do it? Hashing is your
answer. It’s easy, fast, and safe, as it’s one way encryption. It is
often used for passwords all over the WWW.
Since I like hashing and encryption, I’ll explain a bit about mhash
hashing as well as regular PHP hashing. But first, you must decide which
of the 2 most common hashing functions you want to use!
The MD5 algorithm returns a 128-bit hash, which PHP prints as 32 hexadecimal
characters, while the SHA1 algorithm returns a 160-bit hash, printed as 40
hexadecimal characters. There is really no difference between the usage of
the two, as they both operate in the same way, i.e.
they both accept a string argument, and return their hash of it.
Q. So hashing is used for passwords... but how good is that if I can’t
decrypt it?
A. I asked that question too when I was learning hash-ography.
The answer is simple and logical: compare the newly hashed password to
an already hashed password.
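MD5 and SHA1 hashes can be written out in 3 ways: hexadecimal, base64 and
raw binary (STR) output. PHP’s md5() and sha1() return hexadecimal by
default, which is a mix of numbers and the letters a to f; passing true as
a second argument returns the raw binary form instead.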
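As a quick sketch of those output types, the snippet below produces the same MD5 hash three ways. It relies on the optional second argument of md5() and sha1(), which switches them to raw binary output, and on base64_encode() to turn the raw bytes into base64 text.

```php
<?php
$string = "scrowler likes apples";

$hex    = md5($string);        // default: 32 hexadecimal characters
$raw    = md5($string, true);  // raw binary: 16 bytes
$base64 = base64_encode($raw); // the same 16 bytes written as base64 text

echo strlen($hex) . "\n";                // 32
echo strlen($raw) . "\n";                // 16
echo strlen(sha1($string)) . "\n";       // 40
echo strlen(sha1($string, true)) . "\n"; // 20
```

All three forms carry the same information: bin2hex($raw) gives back exactly the string in $hex.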
So, how about we get some hashing done? Let’s use MD5 and SHA1 to hash a
string:
|
<?php
$string = "scrowler likes apples";
$md5 = md5($string);
# $md5 = 8a74418dc9eea9d7e44bd580f9892b9b
$sha1 = sha1($string);
# $sha1 = 6d3ac220bf069eaa182afd67b1256ee1240ece41
?> |
And that’s basic hashing. If you want to compare
passwords or phrases, do it like this:
|
<?php
$storedphrase = "6d3ac220bf069eaa182afd67b1256ee1240ece41";
$string = "scrowler likes apples";
# === compares the two hashes strictly, as strings
if (sha1($string) === $storedphrase) {
    echo "Yay, they match!";
}
?> |
For more advanced users, here’s a little guide to using the mhash library of PHP functions. If you think md5 and sha1 will be fine for you, you can skip forward.
Mhash
Mhash is, in short, an additional PHP library that allows more advanced
hashing functions. The algorithms that it can use are
listed below (taken from php.net on 18/3/05):
- MHASH_MD5
- MHASH_SHA1
- MHASH_HAVAL256
- MHASH_HAVAL192
- MHASH_HAVAL160
- MHASH_HAVAL128
- MHASH_RIPEMD160
- MHASH_GOST
- MHASH_TIGER
- MHASH_CRC32
- MHASH_CRC32B
So essentially, it still includes md5 and sha1 but also a number of
other hashing algorithms that you can use.
Mhash is a small library, consisting only of 5 functions. So I’ll
outline them all.
mhash_count() - Gets the highest available hash ID
mhash_get_block_size() - Gets the block size of the specified hash
mhash_get_hash_name() - Gets the name of the specified hash
mhash_keygen_s2k() - Generates a key
mhash() - Computes the hash
mhash() is the function we will worry about first, as it’s the function
that actually does the hashing. It accepts arguments in the following
order: [hash name], [string], [key].
Basically, you specify one of the hash algorithms above in [hash name]
without quotes, and you input a string. The key is optional: if you
specify one, mhash() returns the HMAC (Hash-based Message Authentication
Code) of the string, and if you don’t, it just returns a
standard hash. If the algorithm doesn’t support HMAC modes, mhash()
returns false.
The difference between using md5("") and using mhash(MHASH_MD5, "") is
that mhash() returns raw binary output, which must be converted to
hexadecimal by using the function bin2hex(). This way, the functions
will both output the same thing.
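Here is a hedged sketch of that comparison, plus the keyed (HMAC) case. One assumption to flag loudly: mhash is not compiled into every PHP build, so the sketch checks for it with function_exists() and otherwise falls back to hash() and hash_hmac() from PHP’s standard hash extension, which can also return raw binary. The key is a made-up value for illustration.

```php
<?php
$string = "scrowler likes apples";
$key    = "secret key"; // hypothetical key, for illustration only

// Assumption: mhash may be missing from this PHP build, so fall back to
// the hash extension, asking it for raw binary output (last argument true).
if (function_exists('mhash')) {
    $raw      = mhash(MHASH_MD5, $string);       // plain hash, raw binary
    $raw_hmac = mhash(MHASH_MD5, $string, $key); // keyed hash (HMAC), raw binary
} else {
    $raw      = hash('md5', $string, true);
    $raw_hmac = hash_hmac('md5', $string, $key, true);
}

// bin2hex() turns the raw bytes into the familiar hexadecimal form.
$hex      = bin2hex($raw);
$hex_hmac = bin2hex($raw_hmac);

var_dump($hex === md5($string)); // bool(true)
echo $hex_hmac . "\n";
```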
mhash_keygen_s2k() is the function we use to generate keys. This function accepts
[hash name], [password], [salt],
[bytes]. Basically, you input the hash name like you did with mhash(),
then you input a password, and a randomly generated salt that must be
<= 8 bytes long. If it’s less, PHP will pad it with 0’s to make it 8.
The bytes integer tells the function how long to make the generated key.
Please note that although your salt should be random, you must store
a copy of it, because you need the same salt to generate the same key again.
mhash_get_block_size() is a simple function that takes in a [hash name]
and returns the block size for that algorithm. If the hash doesn’t
exist, it will return false.
mhash_get_hash_name() is also a simple function. It takes in the
ID of a hash (the value of a [hash name] constant) and returns the name
of that hash, e.g. if MHASH_MD5 was input, it would return MD5. If the
hash doesn’t exist, it returns false.
mhash_count() is probably the simplest mhash function. It doesn’t
have any inputs and simply returns the highest hash ID available in the
mhash library. Hash IDs start at 0, so a loop over every algorithm runs
from 0 up to and including this value.
So, to conclude the mhash section of this tutorial, let’s write a little
scriptlet that will hash a string in every algorithm!
|
<?php
# mhash_count() returns the highest hash ID, so IDs run from 0 up to and including it
$highest_id = mhash_count();
$string = "scrowler likes apples";
$used = 0;
for ($i = 0; $i <= $highest_id; $i++) {
    $name = mhash_get_hash_name($i);
    if ($name === false) {
        continue; # no algorithm is registered under this ID
    }
    echo "<p>Hashing using: ".$name."<br />Original string: <em>".$string."</em><br />";
    $hash = bin2hex(mhash($i, $string));
    echo "Hashed: ".$hash."</p>";
    $used++;
}
echo $used . " hashing algorithms used.";
?> |
That concludes the mhash and hashing section of this
tutorial. I hope you learned something about hashing!
Number Manipulation
Well, now that we’ve finished hashing let’s move on to some basic math. PHP
has arithmetic operators built in to help you with math, so let’s have a
look at some:
Addition: [num] + [num]
Subtraction: [num] - [num]
Multiplication: [num] * [num]
Division: [num] / [num]
Modulus: [num] % [num]
What is modulus? Well, it returns the remainder of num1 divided by num2. E.g.
the modulus of 5 and 2 is 1. 5 / 2 = 2r1. Remainder: 1.
For iteration looping, if you want to add one to your iteration
identifier, you can use '$i++' instead of '$i += 1' or
'$i = $i + 1'.
Similarly, if you want to take one off, you can use '$i--'.
Here’s an example:
Here’s an example:
|
<?php
$num1 = 5;
$num2 = 2;
$add = $num1 + $num2; # 7
$sub = $num1 - $num2; # 3
$mul = $num1 * $num2; # 10
$div = $num1 / $num2; # 2.5
$mod = $num1 % $num2; # 1
?> |
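The '$i++' and '$i--' shorthands and the modulus operator often turn up together in loops. Here is a minimal sketch that collects the even numbers from 1 to 10; the variable names are made up for illustration.

```php
<?php
$even = array();
for ($i = 1; $i <= 10; $i++) { // $i++ adds one after each pass
    if ($i % 2 == 0) {         // a remainder of 0 means $i divides evenly by 2
        $even[] = $i;
    }
}
echo implode(", ", $even) . "\n"; // 2, 4, 6, 8, 10

$countdown = 3;
$countdown--;                     // takes one off, leaving 2
echo $countdown . "\n";
```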
Finally, I’ll cover exponents and square roots. PHP
has built in functions to deal with each of these. They are: pow()
and sqrt(). Let’s look at pow() first.
It accepts the base number and the exponent and is pretty straightforward.
Example: pow(2,2) returns 4, pow(3,3) returns 27.
sqrt() accepts one argument only: the number to take the root of. sqrt(4)
returns 2, sqrt(9) returns 3, and sqrt(0) returns 0. If your number is
negative, its square root is an imaginary number, which PHP can’t represent,
so sqrt() returns NAN (“not a number”) rather than an error message. To
calculate the square root of negative numbers, square root
the abs() (positive) value of your number and slap an “i” on the end of
it. E.g. sqrt(-9) would return NAN, but if you use
'$num=sqrt(abs(-9))."i"' then $num will contain “3i”. For more information
about i and imaginary numbers, Google it!
sqrt() can be replicated by the following: pow([num], 0.5). A number to
the power of 0.5 is the same as the square root of that number.
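The following sketch runs the cases just described. sqrt_label() is a made-up helper for illustration (not a PHP function) that applies the abs() trick only when the input is negative.

```php
<?php
$square = pow(2, 2);   // 4
$cube   = pow(3, 3);   // 27
$root   = sqrt(9);     // 3, returned as a float
$zero   = sqrt(0);     // 0, returned as a float
$bad    = sqrt(-9);    // NAN: PHP has no built-in imaginary numbers
$half   = pow(9, 0.5); // same result as sqrt(9)

// Hypothetical helper: square root as text, with "i" appended
// when the input was negative.
function sqrt_label($num) {
    $root = sqrt(abs($num));
    return $num < 0 ? $root . "i" : (string) $root;
}

var_dump(is_nan($bad));       // bool(true)
echo sqrt_label(-9) . "\n";   // 3i
echo sqrt_label(16) . "\n";   // 4
```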
Anyway, enough of that!
Onwards to Padding strings!
Padding Strings
This will be a short section, as it only uses one function: str_pad().
Going allllll the way back to the first section, let’s imagine that our
administrator wants to take somebody’s name inputted via a form, and if
it’s not x characters long, add spaces or dashes or something to make it
that long. This is the main function of str_pad().
It accepts these arguments: [input], [pad length], [pad], [where].
[Input] is the inputted string, [pad length] is the length you want your string to
be, [pad] is the string of what you will pad it with (it can be numbers,
letters, anything), and [where] specifies where to put the padding. If you
don’t specify anything, it puts it to the right. You can put any of the
following values in there:
STR_PAD_RIGHT
STR_PAD_LEFT
STR_PAD_BOTH
So, if you put in STR_PAD_RIGHT it will put the padding on the right of
the existing string. STR_PAD_BOTH will divide the necessary number of
added pads by 2 and add half to each side; if the number is odd, the extra
one goes on the right. If the string is already [pad length] or longer,
it is returned unchanged.
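Since this section has no example yet, here is a small sketch of str_pad() with each [where] value. The name "Bob", the length of 8 and the dash are made-up values for illustration.

```php
<?php
$name = "Bob";

$right = str_pad($name, 8, "-");               // default is STR_PAD_RIGHT
$left  = str_pad($name, 8, "-", STR_PAD_LEFT);
$both  = str_pad($name, 8, "-", STR_PAD_BOTH); // 5 pads: 2 on the left, 3 on the right
$long  = str_pad("Alexander", 8, "-");         // already longer than 8, so unchanged

echo $right . "\n"; // Bob-----
echo $left . "\n";  // -----Bob
echo $both . "\n";  // --Bob---
echo $long . "\n";  // Alexander
```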
Converting Text Cases
We are nearing the end of this article! This section is also reasonably
short, so I’m going to outline 4 PHP functions that change character cases:
strtolower() takes a string, and converts it to lowercase.
strtoupper() takes a string and converts it to uppercase.
ucwords() takes a string and converts the first letter of each word to
uppercase. If it’s already uppercase it leaves it alone.
ucfirst() takes a string and converts only the first letter of the whole
string to uppercase. Once again it leaves it alone if it’s already uppercase.
They all allow one string of input, containing the text to be formatted.
So, let’s write an example!
|
<?php
$text = "scrowler LIKES apples";
$lo = strtolower($text); # scrowler likes apples
$up = strtoupper($text); # SCROWLER LIKES APPLES
$wr = ucwords($text); # Scrowler LIKES Apples
$fr = ucfirst($text); # Scrowler LIKES apples
?> |
And, finally, I’ll conclude the article with a short section on word wrapping.
Word Wrapping
Since this isn’t very challenging, this section will also be short. You
can do it with one function!
This is useful if you need to break text into lines of at most x characters
without splitting words. The
function is called wordwrap() and accepts the arguments: [string], [width],
[linebreak], [cut]. We won’t worry about the cut argument at the moment.
[Width] specifies the maximum width of a line, and [linebreak] specifies what to
put at the end of each line. Use "<br />\n" (in double quotes, so that \n
becomes a real newline) to do a linedrop in
both <pre> tags and in regular XHTML.
Here’s an example.
|
<?php
$string = "a long string that will be cut up into smaller parts.";
echo wordwrap($string, 10, "<br />\n");
?> |
This will output:
a long<br />
string<br />
that will<br />
be cut up<br />
into<br />
smaller<br />
parts.
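For the curious, here is a short sketch of what the [cut] argument changes. By default wordwrap() never breaks inside a word, so a word longer than [width] is left whole; passing true as the fourth argument cuts it. The sample text is made up, and a plain "\n" is used as the line break to keep the output easy to read.

```php
<?php
// One 16-character "word": w, twelve o's, then "rd."
$long_word = "w" . str_repeat("o", 12) . "rd.";
$text = "A very long " . $long_word;

$kept  = wordwrap($text, 8, "\n");       // long word left in one piece
$split = wordwrap($text, 8, "\n", true); // long word cut after 8 characters

echo $kept . "\n\n";
echo $split . "\n";
```

In the first result the last line is 16 characters wide, twice the requested width; in the second, no line is longer than 8 characters.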
It’s that easy. Have fun! Oh, and that concludes the article! I hope you've learned as much as our database administrator! ;). If you have any questions, you can contact me via the Creative Forums - just click my name below to visit my profile where you can PM or email me. See ya!