[Resolved] PCRE vs. POSIXregexp ????


Merve
10-04-2003, 12:43 PM
Because I'm curious and confused. What's the difference? Which is better? Please shed some light on this issue for me. Oh yes, this poll does time out after 15 days.

Moonglobe
10-04-2003, 01:23 PM
both. POSIX when i'm lazy, and PCRE when i actually want to work. PCRE is faster, and more powerful i believe, but also slightly harder to understand.

Merve
10-04-2003, 01:54 PM
POSIXregex is not explained on php.net, the functions are just listed. PCRE is explained on php.net, and the functions are listed. It makes everything simpler for me.

drawmack
10-04-2003, 02:45 PM
Coming from a Perl background I use PCRE more often than not, though I like the POSIX syntax better:

http://www-sop.inria.fr/mimosa/fp/Bigloo/doc/bigloo-5.11.html

http://directory.google.com/Top/Computers/Programming/Languages/Regular_Expressions/FAQs,_Help,_and_Tutorials/

EDIT: But I voted "you're a dork-ass" just 'cause it's on the list.

Merve
10-04-2003, 02:53 PM
Maybe it's just me, but I find PCRE less confusing and PCRE resources easier to find. I'm not trying to be biased here, as POSIXregexp has a much easier syntax.

drawmack
10-04-2003, 03:06 PM
A lot of times I use PCRE 'cause I've already got the regex laying around in a Perl script and it's easier than rewriting it.

Weedpacket
10-05-2003, 12:12 AM
I was already used to PCRE syntax - the only reason I needed to use POSIX syntax was because PCRE wasn't automatically part of PHP3.

There are a whole bunch of refinements in PCRE syntax that are absent from POSIX syntax (they're mentioned on the POSIX regex manual page) without which you either need to write additional code or an exponentially long regexp to achieve the same effect.

Eg., (okay, so it's a slightly artificial example). Match all strings that lie between the strings "{this}" and "{that}" which do not contain the string "{never}".

PCRE:
/{this}(?:
(?!{(?:that|never)})
.)*{that}/x

POSIX:
{this}
([^{]|{
(
[^nt]|n([^e]|e([^v]|v([^e]|e([^r]|r[^}]))))
|
t([^h]|h([^a]|a([^t]|t[^}])))
)
)*
{that}

(Remembering, of course, that you can't really break POSIX regexps up with whitespace the way you can with PCRE with the /x modifier.)
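For anyone who wants to try the PCRE version, here is a minimal sketch that feeds it to preg_match(). The sample strings are made up for illustration, and the # comment inside the pattern relies on the /x modifier.

```php
<?php
// PCRE pattern from the post above, kept in its /x (extended) layout.
$pattern = '/{this}(?:
    (?!{(?:that|never)})   # refuse to step onto {that} or {never}
    .)*{that}/x';

// Hypothetical sample inputs, not from the original thread.
$samples = [
    '{this} is fine {that}',
    '{this} has {never} inside {that}',
];

foreach ($samples as $s) {
    echo $s, ' => ', preg_match($pattern, $s) ? 'match' : 'no match', "\n";
}
```

The first sample should match and the second should not, since the lookahead stops the repetition as soon as it reaches "{never}".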

And if {this} and {that} is to be case-sensitive, but {never} is not:

PCRE:
/{this}(?:
(?!{(?:that|(?:(?i)never))})
.)*{that}/x

POSIX:
{this}
([^{]|{
(
[^nNt]|[nN]([^eE]|[eE]([^vV]|[vV]([^eE]|[eE]([^rR]|[rR][^}]))))
|
t([^h]|h([^a]|a([^t]|t[^}])))
)
)*
{that}
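A quick way to check the mixed-case version is to run the PCRE pattern against a few inputs. This is only a sketch; the test strings are invented, and it assumes the (?i) option stays confined to the group it appears in, which is how PCRE documents it.

```php
<?php
// {this} and {that} stay case-sensitive; only "never" ignores case.
$pattern = '/{this}(?:
    (?!{(?:that|(?:(?i)never))})
    .)*{that}/x';

// Hypothetical sample inputs, not from the original thread.
$samples = [
    '{this} a {NeVeR} b {that}',  // blocked: {never} in any case
    '{this} a {THAT} b {that}',   // {THAT} is not a terminator, so this matches
    '{THIS} a {that}',            // opening marker has the wrong case
];

foreach ($samples as $s) {
    echo $s, ' => ', preg_match($pattern, $s) ? 'match' : 'no match', "\n";
}
```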

And then there's the /e modifier. With that, preg_replace() can literally do anything with the strings it matches that PHP can do.
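A note for readers on newer PHP: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, and preg_replace_callback() is the documented way to get the same effect. A minimal sketch, with an invented example string:

```php
<?php
// Double every integer in a string. With /e this would have been roughly
// preg_replace('/\d+/e', '$0 * 2', $text); a callback does the same job.
$text = 'width 12, height 30';

$result = preg_replace_callback(
    '/\d+/',
    function (array $m) {
        // $m[0] holds the whole match, as $0 did in the /e replacement.
        return (string) ((int) $m[0] * 2);
    },
    $text
);

echo $result, "\n"; // width 24, height 60
```

The callback receives the match array instead of having the replacement string evaluated as code, which sidesteps the quoting problems /e was known for.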

Merve
10-05-2003, 10:16 AM
Originally posted by Weedpacket

Eg., (okay, so it's a slightly artificial example). Match all strings that lie between the strings "{this}" and "{that}" which do not contain the string "{never}".


I apologise for being a stupid newbie, but do you mean substrings, as in strings that contain this, that, never?

BuzzLY
10-05-2003, 01:12 PM
That doesn't make you a stupid newbie. What makes you a stupid newbie is quoting the entire post when responding with a one-line question. Or, does that make you a dork-ass? ;)

Merve
10-05-2003, 01:20 PM
Thanks for pointing out my stupid-newbidity, BuzzLY. Does POSIXregex have any advantages?

BuzzLY
10-05-2003, 02:04 PM
You're welcome :p

I really don't know of any advantages... I think this debate is similar to print vs. echo. If both do what you want, then use the one that you are most comfortable with. If one does what you want, but the other doesn't, then your choice is obvious.

If you know how to use both, you definitely have an advantage, wouldn't you say?

Merve
10-05-2003, 02:09 PM
Sadly, I only know PCRE...the only advantage I can think of is that POSIXregex is easier to learn..supposedly.

I'm sticking to PCRE for the moment. Thanks for voting me a dork-ass, and thanks for all your help on this matter; it'll help me mess around a bit, and then I'll truly figure out what's better.

LordShryku
10-05-2003, 02:34 PM
You got one-up on me....I don't know much about regular expressions....

Merve
10-05-2003, 02:41 PM
I couldn't write a pattern without referring to the manual and this handy resource: http://www.tote-taste.de/X-Project/regex/

I'm not that great at regex; I just know the basics.

Weedpacket
10-05-2003, 04:53 PM
Originally posted by BuzzLY
You're welcome :p

I really don't know of any advantages... I think this debate is similar to print vs. echo. If both do what you want, then use the one that you are most comfortable with. If one does what you want, but the other doesn't, then your choice is obvious.

If you know how to use both, you definitely have an advantage, wouldn't you say?

There really isn't anything POSIX expressions can do that PCRE expressions can't. Apart from the /.../ delimiters in the latter, a POSIX regexp is a valid PCRE regexp.

And I just checked with my copy of PHP and I've found that that does include things like [[:alnum:]] - supporting those is a compile-time option when building the PCRE library. I remember thinking that those were the one thing that POSIX regexps have that PCRE regexps don't, but it turns out I'm wrong. Of course, [a-zA-Z0-9] is just as short and more obvious anyway.
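The same check is easy to repeat. This sketch compares the POSIX class against the explicit range on a few made-up ASCII strings; behaviour on non-ASCII input can depend on locale and the /u modifier, which is not covered here.

```php
<?php
$posixClass = '/^[[:alnum:]]+$/';
$explicit   = '/^[a-zA-Z0-9]+$/';

// Hypothetical sample inputs, ASCII only.
foreach (['abc123', 'abc 123', 'x_y'] as $s) {
    $a = preg_match($posixClass, $s);
    $b = preg_match($explicit, $s);
    echo $s, ': [[:alnum:]] => ', $a, ', [a-zA-Z0-9] => ', $b, "\n";
}
```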

So any ereg() expression "foo" can be turned into a preg_match() expression "/foo/" (or your choice of delimiter), and any eregi() expression "bar" can be turned into a preg_match() expression "/bar/i". So migrating (to use the euphemism - you going to migrate back at the end of the season?) from POSIX to PCRE is pretty much a trivial task.
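The wrapping step described above can be sketched as a small helper. posix_to_pcre() is a hypothetical name, not a PHP built-in, and it only handles the delimiter and the "i" flag; it picks a delimiter that does not occur in the pattern so nothing needs escaping.

```php
<?php
// Hypothetical helper: wrap a POSIX-style pattern for use with preg_match().
function posix_to_pcre(string $pattern, bool $caseInsensitive = false): string
{
    // Try common delimiters and use the first one absent from the pattern.
    foreach (['/', '#', '~', '%', '!', '@'] as $delim) {
        if (strpos($pattern, $delim) === false) {
            return $delim . $pattern . $delim . ($caseInsensitive ? 'i' : '');
        }
    }
    throw new InvalidArgumentException('No free delimiter for: ' . $pattern);
}

echo posix_to_pcre('^[0-9]+$'), "\n";       // ereg() style  -> /^[0-9]+$/
echo posix_to_pcre('http://', true), "\n";  // eregi() style -> #http://#i
```

Since ereg() and eregi() are gone as of PHP 7.0, a wrapper like this is mostly useful when porting old code.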

Going the other way, though, can be a fair sight more difficult; what's the POSIX equivalent of \b?
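For anyone who hasn't met \b yet, it matches at a word boundary without consuming a character. A minimal sketch with invented sample strings:

```php
<?php
// \b matches the position between a word character and a non-word
// character (or the start/end of the string).
$pattern = '/\bcat\b/';

foreach (['the cat sat', 'concatenate', 'cat!'] as $s) {
    echo $s, ' => ', preg_match($pattern, $s) ? 'match' : 'no match', "\n";
}
```

"the cat sat" and "cat!" should match, while "concatenate" should not, because the "cat" inside it has word characters on both sides.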

Merve
10-06-2003, 08:29 PM
It's great to have your input. I've made my choice now and it's PCRE. Thank you very much.