Regular Expressions


One of PHP's most useful features is its string processing. Feed PHP any string, and it can process it in any number of ways with a multitude of built-in functions. Finding letter occurrences, replacing certain words, limiting the number of characters, etc. - it's all made very easy.

One very useful function in particular is preg_replace(), which allows you to find certain occurrences of words in an advanced, customized way and replace them with a string of your choice. The searched string can either be a simple string (although I recommend you use str_replace() for that case, due to its superior speed), or it can be a regular expression (REGEX). These regular expressions are like targeted wildcards, albeit MUCH more complex. The aim of this tutorial is to describe how various regular expressions are formulated, what they do, and how to customize them to your own purposes. As you can guess, this is an ADVANCED tutorial, so no effort will be made to explain the preg_replace() or str_replace() functions themselves. If you need this tutorial, you are more than likely able to read the PHP manual anyway... ;)
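
To make the difference between the two functions concrete, here is a minimal sketch. The sample sentence is made up for illustration, and the "i" (ignore case) modifier is just one convenient way to show something str_replace() cannot do in a single call.

```php
<?php
// Hypothetical sample text, chosen only for illustration.
$text = "The cat sat next to another Cat.";

// Fixed string: str_replace() needs no pattern at all (and is case-sensitive).
$plain = str_replace("cat", "dog", $text);

// Pattern: preg_replace() takes a delimited regex; the "i" modifier ignores case.
$regex = preg_replace('/cat/i', 'dog', $text);

echo $plain, "\n"; // The dog sat next to another Cat.
echo $regex, "\n"; // The dog sat next to another dog.
```

Note the slashes around the pattern: preg functions expect the expression to sit between a pair of delimiters, with any modifiers after the closing one.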

Basic REGEX
Make no mistake - REGEX is widely used today, and even the file searches in Microsoft Windows use a simpler relative of it: wildcards. Let me point you towards a simple example:

*.* - This is a wildcard pattern, and in Windows it means "find any file with any extension in a given directory". It is not valid REGEX, though: in PHP a "*" means "zero or more of whatever comes before it", so a pattern cannot start with one. The REGEX for "one or more characters followed by a dot followed by one or more characters" would be .+\..+ (a bare dot means "any character", so the literal dot has to be escaped with a backslash). Let us enhance that a little:

[A-Z]+\..+ - The "[A-Z]" is a character class and it means any single uppercase letter from A to Z. If you want to collect lowercase letters you would enter "[a-z]". If you would like to collect any letter, the obvious solution would be "[A-Za-z]".
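
Here is a small sketch of character classes at work with preg_match(). All subject strings are invented samples, and the last pattern is the regex spelling of "one or more characters, a dot, one or more characters" (the literal dot is escaped, because a bare dot matches any character).

```php
<?php
// All subject strings below are made-up samples.
// preg_match() returns 1 when the pattern is found and 0 when it is not.
$hasUpper  = preg_match('/[A-Z]/', 'hello World');  // 1: "W" is uppercase
$noUpper   = preg_match('/[A-Z]/', 'hello world');  // 0: nothing from A to Z
$inCustom  = preg_match('/[g-p]/', 'abc');          // 0: a, b and c lie outside g-p
$anyLetter = preg_match('/[A-Za-z]/', '1234x');     // 1: "x" is a letter

// "One or more characters, a literal dot, one or more characters".
$looksLikeFile = preg_match('/^.+\..+$/', 'photo.jpg'); // 1

var_dump($hasUpper, $noUpper, $inCustom, $anyLetter, $looksLikeFile);
```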

TIP: If you want to check for a custom range of characters you could always use [g-p], etc.

Occurrence-Counting REGEX
A character class followed by a "*" means "zero or more characters from the selected character class". So this string: [a-z]* would mean "zero or more lowercase letters". If you need to check for at least one occurrence of a letter you would use:

[a-z]+ - A "+" means "one or more occurrences". You can also give an explicit count:

[a-z]{1} - This means "exactly one occurrence of a lowercase letter". For "one or more" in this notation you would write [a-z]{1,}, which is the same as [a-z]+. Ranges are written with a comma, so "two to three occurrences of a lowercase letter" would be: [a-z]{2,3}

If you want to check for an optional character you use the question mark (?), like this: [a-z]? - And the explanation of this line is "an optional lowercase character". Now that we have this covered, let's move on...
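
The quantifiers can be tried out with a few lines of PHP. This is only a sketch: matches_fully() is a hypothetical helper written for this example (it is not a PHP built-in), and it wraps the pattern in ^ and $ so that the whole subject has to match. Keep in mind that PCRE writes a range with a comma, as in {2,3}; a hyphenated "{2-3}" is not treated as a quantifier.

```php
<?php
// Hypothetical helper: true when the WHOLE subject matches the pattern body.
// The ^ and $ anchors keep a partial match from counting.
function matches_fully(string $body, string $subject): bool
{
    return preg_match('/^' . $body . '$/', $subject) === 1;
}

var_dump(matches_fully('[a-z]*', ''));          // true:  "*" accepts zero letters
var_dump(matches_fully('[a-z]+', ''));          // false: "+" needs at least one
var_dump(matches_fully('[a-z]{1}', 'ab'));      // false: {1} means exactly one
var_dump(matches_fully('[a-z]{2,3}', 'abc'));   // true:  two to three letters
var_dump(matches_fully('[a-z]{2,3}', 'abcd'));  // false: four is too many
var_dump(matches_fully('[a-z]?', ''));          // true:  the letter is optional
var_dump(matches_fully('[a-z]{2-3}', 'ab'));    // false: "{2-3}" is not a quantifier
```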

Character-Counting REGEX
^(.){4,6}$ - In PHP REGEX the caret (^) symbol means the beginning of the string, and the dollar ($) symbol means the end of it. If you add the "m" (multiline) modifier after the closing delimiter, they match at the beginning and end of every line instead, where a line ends at a "\n" character. So this expression means "the start of the string followed by 4 to 6 of any character followed by the end of the string". Yes, the dot (.) character means "any character" (except a newline, unless you add the "s" modifier). So (.*) would mean "any amount of any character". The caret (^) can also be used for negating character classes when it is the first character inside the square brackets. By negating I mean matching any character that is NOT in the specified range. So a pattern like ^[^0-9]*$ would mean "the start of the string followed by zero or more characters that are NOT digits followed by the end of the string".
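
A short sketch of the anchors and the negated class, using invented sample strings. The last two calls show the difference the "m" (multiline) modifier makes: without it, ^ and $ refer to the whole subject; with it, each line is tested on its own. Note the comma in {4,6}, which is how PCRE writes a range.

```php
<?php
// Sample subjects are made up for illustration.
$fourToSix = '/^.{4,6}$/';
$short = preg_match($fourToSix, 'abc');      // 0: only three characters
$fits  = preg_match($fourToSix, 'abcde');    // 1: five characters
$long  = preg_match($fourToSix, 'abcdefg');  // 0: seven characters

// [^0-9] is a negated class: any character that is NOT a digit.
$noDigits = '/^[^0-9]*$/';
$clean = preg_match($noDigits, 'hello');     // 1
$dirty = preg_match($noDigits, 'hello5');    // 0

// Without "m", ^ and $ anchor to the whole string, so two lines do not match.
$wholeString = preg_match('/^[a-z]+$/', "one\ntwo"); // 0

// With "m", ^ and $ match at every line; the line "3" is skipped.
$count = preg_match_all('/^[a-z]+$/m', "one\ntwo\n3\nfour", $found);
print_r($found[0]); // one, two, four
```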

The Zen of Brackets
By now you have probably noticed all the different brackets that are used. All of them have a different meaning. Let me explain:

  • The parentheses "(" and ")" are used to group parts of an expression together. Each group also captures the text it matched, and (if you use preg_replace) you can refer back to it in the replacement string with a simple "$n", where n is the number of the group, counting opening parentheses from left to right. So, if you want to extract the text from the second group in this: ^([a-z]+)[A-Z]?([0-5]{1,3})$ you would use "$2" (the first group is ([a-z]+) and the second is ([0-5]{1,3})). And, of course, the usual translation of the pattern to human language is "the start of the string followed by one or more lowercase letters followed by an optional uppercase letter followed by 1 to 3 digits no higher than 5 followed by the end of the string".

  • The curly brackets "{" and "}" hold the minimum/maximum repetition counts. As explained earlier, they can be used to further customize checking for characters in a string instead of the usual "one or more" or "zero or more". The syntax is: {n} for exactly n, e.g. {1}; {n,} for n or more, e.g. {1,}; or {n,m} for no fewer than n and no more than m characters, e.g. {3,7}

  • And finally, of course, there are the square brackets "[" and "]". These represent a character class, which was also explained earlier. The syntax for a range inside one is: [a-b] where a is the range start and b is the range end, e.g. [A-Z]
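
The group references from the first bullet can be sketched like this. The subject strings are invented, and the range is written with a comma ({1,3}), which is the form PCRE accepts. When the pattern does not match, preg_replace() simply returns the subject unchanged.

```php
<?php
// Two capture groups: ([a-z]+) is $1 and ([0-5]{1,3}) is $2.
$pattern = '/^([a-z]+)[A-Z]?([0-5]{1,3})$/';

// Keep only the second group.
$second = preg_replace($pattern, '$2', 'abcX123');      // "123"

// Groups can be reordered freely in the replacement string.
$swapped = preg_replace($pattern, '$2-$1', 'abcX123');  // "123-abc"

// 6, 7 and 8 are higher than 5, so nothing matches and nothing changes.
$untouched = preg_replace($pattern, '$2', 'abc678');    // "abc678"

// preg_match() exposes the same groups as array entries 1 and 2.
preg_match($pattern, 'abcX123', $groups);
echo $groups[1], ' / ', $groups[2], "\n"; // abc / 123
```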

Of course, a pattern doesn't have to be all REGEX. You can also check for occurrences of words in a more advanced way. If, for example, you would like to search for a string containing the word "military" followed by an optional digit followed by the end of the line, you would write something like this: [Mm]ilitary[0-9]?$ Take note that the "[Mm]" is also a character class - it matches either character in the brackets.

You can use all kinds of characters in your searches, but if you want to use a special character (e.g. a bracket) literally you will need to escape it using the all-saving backslash (\). This is, of course, the rule for PHP in general anyway! So, for example, if you want to search for "[word]" you would write the REGEX like this: (\[word\])
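
A quick sketch of escaping in practice, with made-up subject strings. Besides typing the backslashes by hand, PHP offers preg_quote(), which escapes the special characters in a string for you; its second argument names the delimiter so that it gets escaped too.

```php
<?php
// Escaped by hand: \[ and \] match literal square brackets.
$tagFound = preg_match('/\[word\]/', 'a [word] in brackets'); // 1

// preg_quote() produces the same escaping automatically.
$escaped = preg_quote('[word]', '/');                          // \[word\]
$same = preg_match('/' . $escaped . '/', 'a [word] in brackets'); // 1

// The "military" pattern: optional digit, then the end of the subject.
$withDigit = preg_match('/[Mm]ilitary[0-9]?$/', 'Join the military2'); // 1
$notAtEnd  = preg_match('/[Mm]ilitary[0-9]?$/', 'military base');      // 0

var_dump($tagFound, $escaped, $same, $withDigit, $notAtEnd);
```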

Commonly Used Examples
Now that we have all the advanced theory out of the way, here are some frequently used reference expressions found in popular PHP-driven scripts:

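\[b\](.*?)\[/b\] - What you see here is REGEX used to search for text enclosed between a [b] and a [/b] tag. The "?" after ".*" makes the match lazy, so it stops at the first [/b] it finds instead of the last one. This is used very widely among forums, news systems of all kinds, etc.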
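
Here is how that expression might be used with preg_replace(), as a sketch with an invented input string. Because the pattern itself contains a slash, "#" is used as the delimiter instead of "/" (an assumption of this example; escaping the slash would work too). The second call drops the "?" to show what the lazy match protects you from.

```php
<?php
// Made-up forum post with two bold sections.
$post = 'This is [b]bold[/b] and [b]this too[/b].';

// Lazy (.*?): each [b]...[/b] pair is replaced on its own.
$lazy = preg_replace('#\[b\](.*?)\[/b\]#s', '<b>$1</b>', $post);

// Greedy (.*): the match runs from the first [b] to the LAST [/b].
$greedy = preg_replace('#\[b\](.*)\[/b\]#s', '<b>$1</b>', $post);

echo $lazy, "\n";   // This is <b>bold</b> and <b>this too</b>.
echo $greedy, "\n"; // This is <b>bold[/b] and [b]this too</b>.
```

The "s" modifier lets the dot match newlines as well, so a tag pair that spans several lines is still found.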

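^[0-9A-Za-z]{8,15}$ - This could be used in scripts that utilise registration with passwords. This REGEX only accepts a string made up of digits and letters, with a minimum of 8 and a maximum of 15 characters. The ^ and $ anchors matter here: without them, any longer string that merely contains 8 valid characters in a row would pass as well.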
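
A minimal sketch of such a check follows. The function name is_valid_password() and the sample inputs are hypothetical; the pattern is the anchored, comma-separated form (^...{8,15}$). The "D" modifier is an extra precaution added for this example: it stops $ from also matching just before a trailing newline.

```php
<?php
// Hypothetical helper: 8 to 15 characters, digits and letters only.
function is_valid_password(string $password): bool
{
    return preg_match('/^[0-9A-Za-z]{8,15}$/D', $password) === 1;
}

var_dump(is_valid_password('abc12345'));         // true:  8 valid characters
var_dump(is_valid_password('short1'));           // false: too short
var_dump(is_valid_password('abcdefgh12345678')); // false: 16 characters
var_dump(is_valid_password('pass word1'));       // false: contains a space
var_dump(is_valid_password("abc12345\n"));       // false: trailing newline

// Without the anchors, a valid run INSIDE a longer string is enough.
$unanchored = preg_match('/[0-9A-Za-z]{8,15}/', '!!!abcdefgh!!!'); // 1
```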

The Speed Issue & Techniques
Using preg_replace() is definitely convenient, but it isn't too fast, considering that PHP has to compile the pattern and interpret its metacharacters first instead of proceeding straight to the searching. I can't stress this enough: if you want to search a rather large text file for the word "cat" then, FOR THE LOVE OF GOD, use a plain string function such as strstr() or strpos() instead of preg_match(). Don't use preg functions when you're not using REGEX. Trust me on this one!
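
For a fixed word, the plain string functions give the same answer as the regex. The sketch below uses a short made-up sentence instead of a large file; it only shows the calls side by side and makes no timing claims of its own.

```php
<?php
// Stand-in for the contents of a large text file.
$text = "The quick brown cat jumps over the lazy dog.";

// strpos() returns the position or false, so compare with !== false.
$found = strpos($text, 'cat') !== false;        // true

// strstr() returns the rest of the string from the first hit (or false).
$rest = strstr($text, 'cat');                   // "cat jumps over the lazy dog."

// Same answer via the regex engine, which is unnecessary for a fixed word.
$viaRegex = preg_match('/cat/', $text) === 1;   // true

var_dump($found, $rest, $viaRegex);
```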

Also, many newcomers don't see the magic of arrays and proceed with the ignorant way of writing 30 preg calls one after the other instead of just putting the patterns in an array and working through that. preg_replace() even accepts an array of patterns and an array of replacements directly. Arrays are faster, more convenient and, most of all, they won't make your code look messy. Incidentally, if you are still rusty with arrays, you will do well to check out Scrowler's tutorial on arrays, also on Biorust...
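
One way to put arrays to work is to hand preg_replace() an array of patterns and a matching array of replacements, so a single call does the job of several. This is a sketch: the [i] and [u] tags are hypothetical additions modelled on the [b] example, and the input string is invented.

```php
<?php
// Each pattern is paired with the replacement at the same index.
$patterns = [
    '#\[b\](.*?)\[/b\]#s',
    '#\[i\](.*?)\[/i\]#s', // hypothetical tag, modelled on [b]
    '#\[u\](.*?)\[/u\]#s', // hypothetical tag, modelled on [b]
];
$replacements = [
    '<b>$1</b>',
    '<i>$1</i>',
    '<u>$1</u>',
];

$post = '[b]Bold[/b], [i]italic[/i] and [u]underlined[/u].';

// One call applies every pattern in order.
$html = preg_replace($patterns, $replacements, $post);

echo $html, "\n"; // <b>Bold</b>, <i>italic</i> and <u>underlined</u>.
```

Adding another tag then means adding one line to each array rather than another function call.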

Well, this is the end of the tutorial, so if you have any questions (or just want to flame me for writing some inane babble) then proceed to the Biorust forums and leave your opinions there. I promise someone will get back to you.

- Tutorial written by Blodo
