PHPBuilder - Advanced String Processing - How Regular Are Your Expressions Page 3


Advanced String Processing - How Regular Are Your Expressions - Page 3

by: PHP Builder Staff
May 19, 2009

One more example:

$text = "the letter a is a vowel";
$regex = "/the\sletter\s[aeiou]\sis\sa\svowel/i";
preg_match($regex, $text, $matches); // $matches receives the matched text

This reads:

Search for "the letter " followed by one of the letters a,e,i,o,u and none other
Followed by " is a vowel"

On a positive match, $matches[0] will hold "the letter a is a vowel". There will be no other entries in $matches, as the expression contains no bracketed (parenthesised) sections.
In case you're wondering, \s is a special character called a metacharacter, and it matches anything classed as white space.
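To see this in action, here is a minimal sketch that runs the expression above through preg_match(). The second and third subject strings are assumptions added purely for illustration: one uses a tab and an upper-case vowel to show that \s accepts any whitespace character and that the i flag ignores case, the other uses a consonant to show a failed match.

```php
<?php
// Pattern from the article; single quotes keep the backslashes literal.
$regex = '/the\sletter\s[aeiou]\sis\sa\svowel/i';

// Subject string from the article.
$text = "the letter a is a vowel";
$found = preg_match($regex, $text, $matches);

// Hypothetical subject: a tab after "letter" and an upper-case vowel.
$tabbed = "The letter\tE is a vowel";
$foundTabbed = preg_match($regex, $tabbed, $tabbedMatches);

// Hypothetical subject: 'x' is not in [aeiou], so nothing matches.
$foundConsonant = preg_match($regex, "the letter x is a vowel");

echo $found, ": ", $matches[0], "\n";   // 1: the letter a is a vowel
echo count($matches), "\n";             // 1 (no bracketed sections)
echo $foundTabbed, " ", $foundConsonant, "\n"; // 1 0
```

preg_match() returns 1 when the pattern matches and 0 when it does not, so the return value is usually what you test in an if statement.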
The * symbol is also a metacharacter and means "match 0 or more occurrences" of whatever precedes it. A pattern such as /A.*/, for example, will match from the first 'A' it encounters through to the end of the line.

The ^ and $ metacharacters mean the start and end of the text, so a pattern such as /^A.*$/ will match any and all the text in a phrase, but only if it starts with an 'A' right at the beginning. That is different from the previous pattern, which will match on the first 'A' it encounters anywhere in the text and then on the rest of the line. That brings us to my next point.
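The difference between an unanchored and an anchored expression can be sketched as follows. Both patterns and both subject strings here are assumptions chosen for illustration, not listings from the article.

```php
<?php
// Illustrative patterns (assumed, not taken from the article).
$unanchored = '/A.*/';   // first 'A' anywhere, then the rest of the line
$anchored   = '/^A.*$/'; // 'A' must be the very first character

// Hypothetical subject strings.
$startsWithA = "An apple a day";
$aInMiddle   = "Eat An apple";

preg_match($unanchored, $startsWithA, $m1);
preg_match($unanchored, $aInMiddle, $m2);
$anchoredHit  = preg_match($anchored, $startsWithA, $m3);
$anchoredMiss = preg_match($anchored, $aInMiddle);

echo $m1[0], "\n";        // An apple a day
echo $m2[0], "\n";        // An apple
echo $m3[0], "\n";        // An apple a day
echo $anchoredMiss, "\n"; // 0
```

Note that the match is case sensitive without the i flag, which is why the lower-case 'a' in "Eat" is skipped.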
Regular expressions are greedy: a quantifier such as * will try to match the largest amount of text possible at any given point in the match string. That is why you should only use * when it's really necessary. Whenever you can, narrow your search as much as possible, e.g.:

"Alan went to meet marsha"

To get the word 'Alan' use an expression of:
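One possible way to do this, sketched here under the assumption that either spelling the word out or restricting the match to lower-case letters is acceptable, is shown next to the greedy version for comparison. The two narrowed patterns are illustrative choices rather than the only correct answer.

```php
<?php
$text = "Alan went to meet marsha";

// Greedy: .* swallows everything up to the end of the line.
preg_match('/^A.*/', $text, $greedy);

// Narrowed (assumed patterns): spell the word out...
preg_match('/^Alan/', $text, $literal);

// ...or allow only lower-case letters after the 'A', so the space stops it.
preg_match('/^A[a-z]+/', $text, $lettersOnly);

echo $greedy[0], "\n";      // Alan went to meet marsha
echo $literal[0], "\n";     // Alan
echo $lettersOnly[0], "\n"; // Alan
```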


Or use the count-control metacharacters. A number in braces means exactly that many repetitions of the element before it, so .{4} means 4 characters of any description, and only 4 characters. To look for a 4-character word beginning with 'A' right at the beginning of the line, followed by a whitespace character and at least 1 more character, you could write something like /^A.{3}\s.+/ (an 'A' plus exactly 3 more characters). It's also possible to specify ranges.


A pattern such as /A.{1,4}/ specifies an 'A' followed by between 1 and 4 characters: no fewer than 1 and no more than 4.

A pattern such as /^A.{4,}/ means an 'A' at the beginning followed by at least 4 characters, possibly more.
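The count forms described above (an exact count, a range, and an open-ended minimum) can be sketched as follows. The three patterns are assumptions reconstructed from the descriptions in the text, and the short subject strings are hypothetical.

```php
<?php
$text = "Alan went to meet marsha";

// Exact count: 'A' plus exactly 3 more characters.
preg_match('/^A.{3}/', $text, $exact);

// Range: 'A' plus between 1 and 4 characters (greedy, so it takes 4).
preg_match('/A.{1,4}/', $text, $range);

// Minimum only: 'A' at the start plus at least 4 characters.
preg_match('/^A.{4,}/', $text, $atLeast);

// Hypothetical short subjects that fall below the minimum counts.
$tooShort = preg_match('/^A.{4,}/', "Alan"); // only 3 characters after 'A'
$loneA    = preg_match('/A.{1,4}/', "A");    // nothing after 'A'

echo "[", $exact[0], "]\n";   // [Alan]
echo "[", $range[0], "]\n";   // [Alan ]
echo "[", $atLeast[0], "]\n"; // [Alan went to meet marsha]
echo $tooShort, " ", $loneA, "\n"; // 0 0
```

The range example shows greediness again: given the choice of 1 to 4 characters, the expression takes all 4, including the trailing space.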
You can also combine counts with other rules. The element being counted does not have to be a '.', and counts are not limited to '*' or '+'. You can use a character class: a pattern such as /^A[lan]{4,}/ would match on a line beginning with 'A' followed by at least 4 of the characters in the square brackets, in any order, but only the characters in the square brackets.
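As a minimal sketch of a counted character class, assuming the class [lan] and a few hypothetical names as subject strings:

```php
<?php
// Assumed pattern: 'A' at the start, then 4 or more of the letters l, a, n.
$pattern = '/^A[lan]{4,}/';

// Hypothetical subject strings.
preg_match($pattern, "Allan went to meet marsha", $allan);
preg_match($pattern, "Annal went to meet marsha", $annal);

// Only 3 class characters follow the 'A', so the minimum is not met.
$tooFew = preg_match($pattern, "Alan went to meet marsha");

// 'e' is not in the class, so the run stops after a single 'l'.
$wrongLetter = preg_match($pattern, "Alexa went to meet marsha");

echo $allan[0], "\n"; // Allan
echo $annal[0], "\n"; // Annal
echo $tooFew, " ", $wrongLetter, "\n"; // 0 0
```

The order of the letters inside the square brackets does not matter, which is why both "Allan" and "Annal" match.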
We've really only scratched the surface of regular expressions; it's a huge subject on which many books have been written. I urge you to read more about them, and you can always look to the PHP manual; the expressions section is at .phpt.
Next time will be the final part in our series, in which we wrap up and look at some practical examples of what we've learned so far.
It's also your chance to tell me what you'd like to cover. If there is a particular thing you've been trying to do, or a technique you're not sure how to make work, then please leave a comment using the form at the bottom of this page.
Between now and the final article, I'll be checking these comments and using them as a basis for what I put in the last article. Please note, however, that I'm not going to complete your project or your homework assignment for you, so please don't post things like "please show me how to make a project that does xxxx". All I'm looking for are real-world ideas based on common scenarios that you are currently learning.
Until next time
May your expressions remain regular
