Date: 06/12/99
- Next message: Jim Winstead: "[PHP-DEV] lxr.php.net"
- Previous message: Bug Database: "[PHP-DEV] Bug #1450 Updated: wrong array size"
- Next in thread: eschmid: "[PHP-DEV] CVS update: php3/doc/functions"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Saturday June 12, 1999 @ 18:00
Author: andrey
Update of /repository/php3/doc/functions
In directory php:/tmp/cvs-serv22468
Modified Files:
pcre.sgml
Log Message:
Started adding sections from PCRE manpage.
Index: php3/doc/functions/pcre.sgml
diff -u php3/doc/functions/pcre.sgml:1.10 php3/doc/functions/pcre.sgml:1.11
--- php3/doc/functions/pcre.sgml:1.10 Fri Jun 11 13:20:19 1999
+++ php3/doc/functions/pcre.sgml Sat Jun 12 18:00:46 1999
@@ -455,6 +455,107 @@
</blockquote>
</refsect1>
</refentry>
+
+ <refentry id="pattern.options">
+ <refnamediv>
+ <refname>Pattern Syntax</refname>
+ <refpurpose>describes PCRE regex syntax</refpurpose>
+ </refnamediv>
+ <refsect1>
+ <title>Description</title>
+ <literallayout>
+ The PCRE library is a set of functions that implement regular
+ expression pattern matching using the same syntax and semantics
+ as Perl 5, with just a few differences (see below). The current
+ implementation corresponds to Perl 5.005.
+ </literallayout>
+
+ <refsect1>
+ <title>Differences from Perl</title>
+ <literallayout>
+ The differences described here are with respect to Perl
+ 5.005.
+
+ 1. By default, a whitespace character is any character that
+ the C library function isspace() recognizes, though it is
+ possible to compile PCRE with alternative character type
+ tables. Normally isspace() matches space, formfeed, newline,
+ carriage return, horizontal tab, and vertical tab. Perl 5 no
+ longer includes vertical tab in its set of whitespace
+ characters. The \v escape that was in the Perl documentation
+ for a long time was never in fact recognized. However, the
+ character itself was treated as whitespace at least up to
+ 5.002. In 5.004 and 5.005 it does not match \s.
+
+ 2. PCRE does not allow repeat quantifiers on lookahead
+ assertions. Perl permits them, but they do not mean what you
+ might think. For example, (?!a){3} does not assert that the
+ next three characters are not "a". It just asserts that the
+ next character is not "a" three times.
+
+ 3. Capturing subpatterns that occur inside negative
+ lookahead assertions are counted, but their entries in the
+ offsets vector are never set. Perl sets its numerical
+ variables from any such patterns that are matched before the
+ assertion fails to match something (thereby succeeding), but
+ only if the negative lookahead assertion contains just one
+ branch.
+
+ 4. Though binary zero characters are supported in the
+ subject string, they are not allowed in a pattern string
+ because it is passed as a normal C string, terminated by
+ zero. The escape sequence "\0" can be used in the pattern to
+ represent a binary zero.
+
+ 5. The following Perl escape sequences are not supported:
+ \l, \u, \L, \U, \E, \Q. In fact these are implemented by
+ Perl's general string-handling and are not part of its
+ pattern matching engine.
+
+ 6. The Perl \G assertion is not supported as it is not
+ relevant to single pattern matches.
+
+ 7. Fairly obviously, PCRE does not support the (?{code})
+ construction.
+
+ 8. There are at the time of writing some oddities in Perl
+ 5.005_02 concerned with the settings of captured strings
+ when part of a pattern is repeated. For example, matching
+ "aba" against the pattern /^(a(b)?)+$/ sets $2 to the value
+ "b", but matching "aabbaa" against /^(aa(bb)?)+$/ leaves $2
+ unset. However, if the pattern is changed to
+ /^(aa(b(b))?)+$/ then $2 (and $3) get set.
+
+ In Perl 5.004 $2 is set in both cases, and that is also true
+ of PCRE. If in the future Perl changes to a consistent state
+ that is different, PCRE may change to follow.
+
+ 9. Another as yet unresolved discrepancy is that in Perl
+ 5.005_02 the pattern /^(a)?(?(1)a|b)+$/ matches the string
+ "a", whereas in PCRE it does not. However, in both Perl and
+ PCRE /^(a)?a/ matched against "a" leaves $1 unset.
+
+ 10. PCRE provides some extensions to the Perl regular
+ expression facilities:
+
+ (a) Although lookbehind assertions must match fixed length
+ strings, each alternative branch of a lookbehind assertion
+ can match a different length of string. Perl 5.005 requires
+ them all to have the same length.
+
+ (b) If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is not
+ set, the $ metacharacter matches only at the very end of
+ the string.
+
+ (c) If PCRE_EXTRA is set, a backslash followed by a letter
+ with no special meaning is faulted.
+
+ (d) If PCRE_UNGREEDY is set, the greediness of the
+ repetition quantifiers is inverted, that is, by default they
+ are not greedy, but if followed by a question mark they are.
+ </literallayout>
+ </refsect1>
+ </refentry>
</reference>
<!-- Keep this comment at the end of the file

