<?xml version="1.0"?>
<?xml-stylesheet
href="http://www.w3.org/2000/08/w3c-synd/style.css" type="text/css"
?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel rdf:about="http://pear.php.net/bugs/search.php">
    <title>PEAR Bug Search Results</title>
    <link>http://pear.php.net/bugs/search.php?cmd=display&amp;package_name%5B0%5D=PHP_LexerGenerator</link>
    <description>Search Results</description>
    <dc:language>en-us</dc:language>
    <dc:creator>pear-webmaster@lists.php.net</dc:creator>
    <dc:publisher>pear-webmaster@lists.php.net</dc:publisher>
    <admin:generatorAgent rdf:resource="http://pear.php.net/bugs"/>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
    <items>
     <rdf:Seq>
      <rdf:li rdf:resource="http://pear.php.net/bug/18454" />
      <rdf:li rdf:resource="http://pear.php.net/bug/16944" />
      <rdf:li rdf:resource="http://pear.php.net/bug/14731" />
      <rdf:li rdf:resource="http://pear.php.net/bug/13252" />
      <rdf:li rdf:resource="http://pear.php.net/bug/13221" />
      <rdf:li rdf:resource="http://pear.php.net/bug/12564" />

     </rdf:Seq>
    </items>
  </channel>

  <image rdf:about="http://pear.php.net/gifs/pearsmall.gif">
    <title>PEAR Bugs</title>
    <url>http://pear.php.net/gifs/pearsmall.gif</url>
    <link>http://pear.php.net/bugs</link>
  </image>

    <item rdf:about="http://pear.php.net/bug/18454">
      <title>PHP_LexerGenerator: Bug 18454 [Open] single quote is not escaped properly in yy_global_pattern</title>
      <link>http://pear.php.net/bugs/18454</link>
      <content:encoded><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by urkle
2011-04-19T02:01:47+00:00
PHP: 5.3.6 OS: Mac OS X Package Version: 0.4.0

Description:
------------
In revision 246683, a change was made to use single quotes instead of double quotes to wrap the $yy_global_pattern variable contents.

However, the content quoting was not changed: double quotes are still escaped and single quotes are NOT escaped.

This needs to be adjusted so that single quotes are escaped when placed in that string.</pre>]]></content:encoded>
      <description><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by urkle
2011-04-19T02:01:47+00:00
PHP: 5.3.6 OS: Mac OS X Package Version: 0.4.0

Description:
------------
In revision 246683, a change was made to use single quotes instead of double quotes to wrap the $yy_global_pattern variable contents.

However, the content quoting was not changed: double quotes are still escaped and single quotes are NOT escaped.

This needs to be adjusted so that single quotes are escaped when placed in that string.</pre>]]></description>
      <dc:date>2011-04-19T02:01:47+00:00</dc:date>
      <dc:creator>urkle &amp;#x61;&amp;#116; outoforder &amp;#x64;&amp;#111;&amp;#x74; cc</dc:creator>
      <dc:subject>PHP_LexerGenerator Bug</dc:subject>
    </item>
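    <!--
    Editorial sketch for bug 18454 (not part of the original report). The
    escaping the reporter asks for can be illustrated as below. Python is used
    only for illustration, and the function name php_single_quoted is
    hypothetical; it is not taken from PHP_LexerGenerator. The sketch relies on
    the fact that inside a PHP single-quoted string only the backslash and the
    single quote need escaping.

```python
def php_single_quoted(text: str) -> str:
    """Return text as a PHP single-quoted string literal (illustrative helper)."""
    # Escape the backslash first so the backslashes added for quotes
    # are not doubled afterwards.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


# Double quotes pass through untouched; single quotes get a backslash.
print(php_single_quoted("it's"))   # prints 'it\'s'
print(php_single_quoted('say "x"'))  # prints 'say "x"'
```

    Doubling every backslash is the conservative choice: PHP reads each doubled
    backslash back as a single one, so a regex such as \d reaches PCRE unchanged.
    -->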
    <item rdf:about="http://pear.php.net/bug/16944">
      <title>PHP_LexerGenerator: Bug 16944 [Open] Bug in regexp lookahead</title>
      <link>http://pear.php.net/bugs/16944</link>
      <content:encoded><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by thue
2009-12-26T06:37:21+00:00
PHP: 5.3.1 OS: All Package Version: 0.3.4

Description:
------------
(?=...) positive lookahead in lexer regexps doesn't work.

The error is because of the following code in LexerGenerator/Regex/Parser.php :

#line 1599 &quot;Parser.php&quot;
#line 428 &quot;Parser.y&quot;
    function yy_r82(){
    $this-&gt;_retvalue = new PHP_LexerGenerator_ParseryyToken('(?=' . $this-&gt;yystack[$this-&gt;yyidx + -1]-&gt;minor-&gt;string . ')', array(
        'pattern '=&gt; '(?=' . $this-&gt;yystack[$this-&gt;yyidx + -1]-&gt;minor['pattern'] . ')'));
    }

Where 'pattern ' should be 'pattern' (without the space).</pre>]]></content:encoded>
      <description><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by thue
2009-12-26T06:37:21+00:00
PHP: 5.3.1 OS: All Package Version: 0.3.4

Description:
------------
(?=...) positive lookahead in lexer regexps doesn't work.

The error is because of the following code in LexerGenerator/Regex/Parser.php :

#line 1599 &quot;Parser.php&quot;
#line 428 &quot;Parser.y&quot;
    function yy_r82(){
    $this-&gt;_retvalue = new PHP_LexerGenerator_ParseryyToken('(?=' . $this-&gt;yystack[$this-&gt;yyidx + -1]-&gt;minor-&gt;string . ')', array(
        'pattern '=&gt; '(?=' . $this-&gt;yystack[$this-&gt;yyidx + -1]-&gt;minor['pattern'] . ')'));
    }

Where 'pattern ' should be 'pattern' (without the space).</pre>]]></description>
      <dc:date>2009-12-26T06:37:21+00:00</dc:date>
      <dc:creator>thuejk &amp;#x61;&amp;#116; gmail &amp;#x64;&amp;#111;&amp;#x74; com</dc:creator>
      <dc:subject>PHP_LexerGenerator Bug</dc:subject>
    </item>
    <item rdf:about="http://pear.php.net/bug/14731">
      <title>PHP_LexerGenerator: Bug 14731 [Open] $ in regex is interpreted as a reference to a variable</title>
      <link>http://pear.php.net/bugs/14731</link>
      <content:encoded><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by lew21
2008-10-02T17:57:54+00:00
PHP: 5.2.6 OS: Linux Package Version: 0.3.4

Description:
------------
$ in regex is interpreted as a reference to a variable

Test script:
---------------
label      = #[$a-zA-Z_\x7f-\xff][$a-zA-Z0-9_\x7f-\xff]*#

Expected result:
----------------
I expected it to work.

Actual result:
--------------
PHP Notice:  Undefined variable: a in /*/lexer.php on line 122
PHP Notice:  Undefined variable: a in /*/lexer.php on line 122</pre>]]></content:encoded>
      <description><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by lew21
2008-10-02T17:57:54+00:00
PHP: 5.2.6 OS: Linux Package Version: 0.3.4

Description:
------------
$ in regex is interpreted as a reference to a variable

Test script:
---------------
label      = #[$a-zA-Z_\x7f-\xff][$a-zA-Z0-9_\x7f-\xff]*#

Expected result:
----------------
I expected it to work.

Actual result:
--------------
PHP Notice:  Undefined variable: a in /*/lexer.php on line 122
PHP Notice:  Undefined variable: a in /*/lexer.php on line 122</pre>]]></description>
      <dc:date>2008-10-09T14:57:52+00:00</dc:date>
      <dc:creator>lew21st &amp;#x61;&amp;#116; gmail &amp;#x64;&amp;#111;&amp;#x74; com</dc:creator>
      <dc:subject>PHP_LexerGenerator Bug</dc:subject>
    </item>
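    <!--
    Editorial sketch for bug 14731 (not part of the original report). The
    notices suggest that the pattern ends up inside a PHP double-quoted string,
    where $a is treated as a variable. Assuming that is the cause, one way to
    emit such a string safely is to escape the backslash, the double quote and
    the dollar sign. Python is used only for illustration, and the name
    php_double_quoted is hypothetical.

```python
def php_double_quoted(text: str) -> str:
    """Return text as a PHP double-quoted string literal (illustrative helper)."""
    escaped = (
        text.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')        # keep the literal from terminating early
        .replace("$", "\\$")        # stop PHP from interpolating variables
    )
    return '"' + escaped + '"'


# The character class from the report, with the dollar sign neutralised.
print(php_double_quoted(r"[$a-zA-Z_\x7f-\xff]"))
```

    Emitting the pattern in single quotes instead avoids interpolation
    altogether, at the cost of having to escape single quotes.
    -->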
    <item rdf:about="http://pear.php.net/bug/13252">
      <title>PHP_LexerGenerator: Feature/Change Request 13252 [Open] Possibility to reuse a rule.</title>
      <link>http://pear.php.net/bugs/13252</link>
      <content:encoded><![CDATA[<pre>PHP_LexerGenerator Feature/Change Request
Reported by clicky
2008-02-28T16:10:34+00:00
PHP: Irrelevant OS: Any Package Version: 0.3.4

Description:
------------
Currently, if you have something like this in your regexp declaration block:
nmstart = /[_a-zA-Z]/
nmchar = /[_a-zA-Z0-9]/

And you want to define ident using something like this:
ident = @{nmstart}{nmchar}*@

You're forced to copy/paste the whole declaration of both nmstart &amp; nmchar in ident's declaration.

I suggest allowing {name} in a regexp pattern to reuse the rule called &quot;name&quot; in the rule being declared. The syntax could be used anywhere in the pattern except inside character classes (that is, /[0-9{name}]/ would still match any digit, '{', 'n', 'a', 'm', 'e' or '}').

This is not part of any PCRE specification though. Therefore, it would require adding some special handling. Also, the PCRE documentation doesn't mention any use of the {name} syntax anywhere. The closest one is r{N,M} which indicates r must be repeated between N &amp; M times, but you probably already know this.

Test script:
---------------
/*!lex2php
%input $this-&gt;data
%counter $this-&gt;N
%token $this-&gt;token
%value $this-&gt;value
%line $this-&gt;line
%matchlongest 1
nmstart = /[_a-zA-Z]/
nmchar = /[_a-zA-Z0-9]/
ident = @{nmstart}{nmchar}*@
*/

Expected result:
----------------
No error. &quot;ident&quot; should be able to match something like: &quot;Some_ID123&quot;. It should not match something like &quot;5ome_invalid_ID&quot; because '5' does not match the pattern expressed by nmstart.



Actual result:
--------------
Popping SUBPATTERN
Popping PATTERN
Popping pattern_declarations
Popping processing_instructions
Popping COMMENTSTART
Popping PHPCODE
Popping $

Exception: Unexpected input at line29: { in /usr/share/php/PHP/LexerGenerator/Regex/Lexer.php on line 202

Call Stack:
    0.0002      53992   1. {main}() /usr/share/php/PHP/LexerGenerator/cli.php:0
    0.0287    1587664   2. PHP_LexerGenerator-&gt;__construct() /usr/share/php/PHP/LexerGenerator/cli.php:3
    0.2520    1605640   3. PHP_LexerGenerator_Parser-&gt;doParse() /usr/share/php/PHP/LexerGenerator.php:283
    0.2522    1605640   4. PHP_LexerGenerator_Parser-&gt;yy_reduce() /usr/share/php/PHP/LexerGenerator/Parser.php:1855
    0.2522    1605640   5. PHP_LexerGenerator_Parser-&gt;yy_r29() /usr/share/php/PHP/LexerGenerator/Parser.php:1716
    0.2522    1605640   6. PHP_LexerGenerator_Parser-&gt;_validatePattern() /usr/share/php/PHP/LexerGenerator/Parser.php:1643
    0.2523    1605640   7. PHP_LexerGenerator_Regex_Lexer-&gt;yylex() /usr/share/php/PHP/LexerGenerator/Parser.php:503
    0.2523    1605640   8. PHP_LexerGenerator_Regex_Lexer-&gt;yylex1() /usr/share/php/PHP/LexerGenerator/Regex/Lexer.php:59</pre>]]></content:encoded>
      <description><![CDATA[<pre>PHP_LexerGenerator Feature/Change Request
Reported by clicky
2008-02-28T16:10:34+00:00
PHP: Irrelevant OS: Any Package Version: 0.3.4

Description:
------------
Currently, if you have something like this in your regexp declaration block:
nmstart = /[_a-zA-Z]/
nmchar = /[_a-zA-Z0-9]/

And you want to define ident using something like this:
ident = @{nmstart}{nmchar}*@

You're forced to copy/paste the whole declaration of both nmstart &amp; nmchar in ident's declaration.

I suggest allowing {name} in a regexp pattern to reuse the rule called &quot;name&quot; in the rule being declared. The syntax could be used anywhere in the pattern except inside character classes (that is, /[0-9{name}]/ would still match any digit, '{', 'n', 'a', 'm', 'e' or '}').

This is not part of any PCRE specification though. Therefore, it would require adding some special handling. Also, the PCRE documentation doesn't mention any use of the {name} syntax anywhere. The closest one is r{N,M} which indicates r must be repeated between N &amp; M times, but you probably already know this.

Test script:
---------------
/*!lex2php
%input $this-&gt;data
%counter $this-&gt;N
%token $this-&gt;token
%value $this-&gt;value
%line $this-&gt;line
%matchlongest 1
nmstart = /[_a-zA-Z]/
nmchar = /[_a-zA-Z0-9]/
ident = @{nmstart}{nmchar}*@
*/

Expected result:
----------------
No error. &quot;ident&quot; should be able to match something like: &quot;Some_ID123&quot;. It should not match something like &quot;5ome_invalid_ID&quot; because '5' does not match the pattern expressed by nmstart.



Actual result:
--------------
Popping SUBPATTERN
Popping PATTERN
Popping pattern_declarations
Popping processing_instructions
Popping COMMENTSTART
Popping PHPCODE
Popping $

Exception: Unexpected input at line29: { in /usr/share/php/PHP/LexerGenerator/Regex/Lexer.php on line 202

Call Stack:
    0.0002      53992   1. {main}() /usr/share/php/PHP/LexerGenerator/cli.php:0
    0.0287    1587664   2. PHP_LexerGenerator-&gt;__construct() /usr/share/php/PHP/LexerGenerator/cli.php:3
    0.2520    1605640   3. PHP_LexerGenerator_Parser-&gt;doParse() /usr/share/php/PHP/LexerGenerator.php:283
    0.2522    1605640   4. PHP_LexerGenerator_Parser-&gt;yy_reduce() /usr/share/php/PHP/LexerGenerator/Parser.php:1855
    0.2522    1605640   5. PHP_LexerGenerator_Parser-&gt;yy_r29() /usr/share/php/PHP/LexerGenerator/Parser.php:1716
    0.2522    1605640   6. PHP_LexerGenerator_Parser-&gt;_validatePattern() /usr/share/php/PHP/LexerGenerator/Parser.php:1643
    0.2523    1605640   7. PHP_LexerGenerator_Regex_Lexer-&gt;yylex() /usr/share/php/PHP/LexerGenerator/Parser.php:503
    0.2523    1605640   8. PHP_LexerGenerator_Regex_Lexer-&gt;yylex1() /usr/share/php/PHP/LexerGenerator/Regex/Lexer.php:59</pre>]]></description>
      <dc:date>2008-02-28T16:10:34+00:00</dc:date>
      <dc:creator>missingno &amp;#x61;&amp;#116; ifrance &amp;#x64;&amp;#111;&amp;#x74; com</dc:creator>
      <dc:subject>PHP_LexerGenerator Feature/Change Request</dc:subject>
    </item>
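    <!--
    Editorial sketch for request 13252 (not part of the original report). The
    proposed {name} reuse could be implemented as a textual expansion pass run
    before the pattern is validated. The version below is a minimal sketch in
    Python; the function name expand_named_rules and the choice to wrap each
    expansion in a non-capturing group are assumptions, and it is not known
    whether PHP_LexerGenerator would accept that group syntax. Character
    classes are tracked naively (a literal ] placed first in a class is not
    handled).

```python
import re

# A reference looks like {name}; quantifiers such as {2,3} start with a digit
# and therefore never match this expression.
_REFERENCE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_named_rules(pattern: str, rules: dict) -> str:
    """Replace {name} with the body of rule name, except inside [...] classes."""
    out = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Copy escaped characters verbatim, including \{ and \[.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if in_class:
            in_class = ch != "]"
        elif ch == "[":
            in_class = True
        elif ch == "{":
            m = _REFERENCE.match(pattern, i)
            if m and m.group(1) in rules:
                # Group the expansion so a following quantifier applies to all of it.
                out.append("(?:" + rules[m.group(1)] + ")")
                i = m.end()
                continue
        out.append(ch)
        i += 1
    return "".join(out)


rules = {"nmstart": "[_a-zA-Z]", "nmchar": "[_a-zA-Z0-9]"}
ident = expand_named_rules("{nmstart}{nmchar}*", rules)
print(ident)  # prints (?:[_a-zA-Z])(?:[_a-zA-Z0-9])*
```

    With these two rules the expanded pattern matches Some_ID123 in full and
    fails at the first character of 5ome_invalid_ID, as the report expects.
    -->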
    <item rdf:about="http://pear.php.net/bug/13221">
      <title>PHP_LexerGenerator: Feature/Change Request 13221 [Open] Unescaped hyphen can't be used to express itself in character classes</title>
      <link>http://pear.php.net/bugs/13221</link>
      <content:encoded><![CDATA[<pre>PHP_LexerGenerator Feature/Change Request
Reported by clicky
2008-02-26T22:07:49+00:00
PHP: Irrelevant OS: Irrelevant Package Version: 0.3.4 and CVS

Description:
------------
If you want to use the hyphen character (-) in a character class, you need to escape it, e.g. by using octal (\055) or hexadecimal (\x2D) notation.
I think the regexp lexer/parser should be made a bit more permissive as, for example, [-a-zA-Z0-9] is a valid character class.

I think this is more of a feature request than a real bug report though...

Test script:
---------------
&lt;?php

class TestLexer {

// ...

/*!lex2php
%input $this-&gt;data
%counter $this-&gt;N
%token $this-&gt;token
%value $this-&gt;value
%line $this-&gt;line
%matchlongest 1
space = /[ \t\n]+/
name  = /[-a-zA-Z0-9]+/
*/
// Please note that using name  = /[\x2Da-zA-Z0-9]+/
// works as expected.

/*!lex2php
%statename START
name {
    echo &quot;Name\n&quot;;
    var_dump($this-&gt;value);
    echo &quot;    name subpatterns: \n&quot;;
    var_dump($yy_subpatterns);
}

space { return FALSE; }
*/

// ...

}

?&gt;

Expected result:
----------------
Plex should accept the file without any exception being raised.

Actual result:
--------------
Reduce (29) [subpattern ::= SUBPATTERN].
Syntax Error on line 27: token '-' while parsing rule:End of Input OPENCHARCLASS Popping SUBPATTERN
Popping PATTERN
Popping processing_instructions
Popping COMMENTSTART
Popping PHPCODE
Popping $

Exception: Unexpected HYPHEN(-), expected one of: NEGATE,TEXT,ESCAPEDBACKSLASH,BACKREFERENCE,COULDBEBACKREF in /usr/share/php/PHP/LexerGenerator/Regex/Parser.php on line 1779

Call Stack:
    0.0002      54040   1. {main}() /usr/share/php/PHP/LexerGenerator/cli.php:0
    0.0260    1586716   2. PHP_LexerGenerator-&gt;__construct() /usr/share/php/PHP/LexerGenerator/cli.php:3
    0.0332    1597664   3. PHP_LexerGenerator_Parser-&gt;doParse() /usr/share/php/PHP/LexerGenerator.php:283
    0.0334    1598224   4. PHP_LexerGenerator_Parser-&gt;yy_reduce() /usr/share/php/PHP/LexerGenerator/Parser.php:1855
    0.0334    1598224   5. PHP_LexerGenerator_Parser-&gt;yy_r29() /usr/share/php/PHP/LexerGenerator/Parser.php:1716
    0.0334    1598224   6. PHP_LexerGenerator_Parser-&gt;_validatePattern() /usr/share/php/PHP/LexerGenerator/Parser.php:1643
    0.0340    1602532   7. PHP_LexerGenerator_Regex_Parser-&gt;doParse() /usr/share/php/PHP/LexerGenerator/Parser.php:505
    0.0341    1602636   8. PHP_LexerGenerator_Regex_Parser-&gt;yy_syntax_error() /usr/share/php/PHP/LexerGenerator/Regex/Parser.php:1878</pre>]]></content:encoded>
      <description><![CDATA[<pre>PHP_LexerGenerator Feature/Change Request
Reported by clicky
2008-02-26T22:07:49+00:00
PHP: Irrelevant OS: Irrelevant Package Version: 0.3.4 and CVS

Description:
------------
If you want to use the hyphen character (-) in a character class, you need to escape it, e.g. by using octal (\055) or hexadecimal (\x2D) notation.
I think the regexp lexer/parser should be made a bit more permissive as, for example, [-a-zA-Z0-9] is a valid character class.

I think this is more of a feature request than a real bug report though...

Test script:
---------------
&lt;?php

class TestLexer {

// ...

/*!lex2php
%input $this-&gt;data
%counter $this-&gt;N
%token $this-&gt;token
%value $this-&gt;value
%line $this-&gt;line
%matchlongest 1
space = /[ \t\n]+/
name  = /[-a-zA-Z0-9]+/
*/
// Please note that using name  = /[\x2Da-zA-Z0-9]+/
// works as expected.

/*!lex2php
%statename START
name {
    echo &quot;Name\n&quot;;
    var_dump($this-&gt;value);
    echo &quot;    name subpatterns: \n&quot;;
    var_dump($yy_subpatterns);
}

space { return FALSE; }
*/

// ...

}

?&gt;

Expected result:
----------------
Plex should accept the file without any exception being raised.

Actual result:
--------------
Reduce (29) [subpattern ::= SUBPATTERN].
Syntax Error on line 27: token '-' while parsing rule:End of Input OPENCHARCLASS Popping SUBPATTERN
Popping PATTERN
Popping processing_instructions
Popping COMMENTSTART
Popping PHPCODE
Popping $

Exception: Unexpected HYPHEN(-), expected one of: NEGATE,TEXT,ESCAPEDBACKSLASH,BACKREFERENCE,COULDBEBACKREF in /usr/share/php/PHP/LexerGenerator/Regex/Parser.php on line 1779

Call Stack:
    0.0002      54040   1. {main}() /usr/share/php/PHP/LexerGenerator/cli.php:0
    0.0260    1586716   2. PHP_LexerGenerator-&gt;__construct() /usr/share/php/PHP/LexerGenerator/cli.php:3
    0.0332    1597664   3. PHP_LexerGenerator_Parser-&gt;doParse() /usr/share/php/PHP/LexerGenerator.php:283
    0.0334    1598224   4. PHP_LexerGenerator_Parser-&gt;yy_reduce() /usr/share/php/PHP/LexerGenerator/Parser.php:1855
    0.0334    1598224   5. PHP_LexerGenerator_Parser-&gt;yy_r29() /usr/share/php/PHP/LexerGenerator/Parser.php:1716
    0.0334    1598224   6. PHP_LexerGenerator_Parser-&gt;_validatePattern() /usr/share/php/PHP/LexerGenerator/Parser.php:1643
    0.0340    1602532   7. PHP_LexerGenerator_Regex_Parser-&gt;doParse() /usr/share/php/PHP/LexerGenerator/Parser.php:505
    0.0341    1602636   8. PHP_LexerGenerator_Regex_Parser-&gt;yy_syntax_error() /usr/share/php/PHP/LexerGenerator/Regex/Parser.php:1878</pre>]]></description>
      <dc:date>2008-02-28T17:46:55+00:00</dc:date>
      <dc:creator>missingno &amp;#x61;&amp;#116; ifrance &amp;#x64;&amp;#111;&amp;#x74; com</dc:creator>
      <dc:subject>PHP_LexerGenerator Feature/Change Request</dc:subject>
    </item>
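    <!--
    Editorial sketch for request 13221 (not part of the original report). Until
    the regexp parser accepts a leading hyphen, a pattern could be preprocessed
    so that a hyphen placed first in a character class is rewritten to the \x2D
    form that the reporter says works. The helper below is a hypothetical
    illustration in Python, not code from PHP_LexerGenerator, and it only looks
    for an unescaped [ followed by an optional ^ and a hyphen; it does not
    track nesting or a [ that appears inside another class.

```python
import re

# An opening bracket not preceded by a backslash, an optional negation, a hyphen.
_LEADING_HYPHEN = re.compile(r"(?<!\\)\[(\^?)-")


def escape_leading_hyphen(pattern: str) -> str:
    """Rewrite a hyphen that opens a character class as \\x2D."""
    # A function is used as the replacement so the backslash is inserted literally.
    return _LEADING_HYPHEN.sub(lambda m: "[" + m.group(1) + "\\x2D", pattern)


rewritten = escape_leading_hyphen("[-a-zA-Z0-9]+")
print(rewritten)  # prints [\x2Da-zA-Z0-9]+
```

    The rewritten class still matches a hyphen, so a name such as foo-bar is
    accepted by both the original and the rewritten pattern.
    -->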
    <item rdf:about="http://pear.php.net/bug/12564">
      <title>PHP_LexerGenerator: Bug 12564 [Open] Unicode support incomplete</title>
      <link>http://pear.php.net/bugs/12564</link>
      <content:encoded><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by instance
2007-12-02T04:45:56+00:00
PHP: 5.2.4 OS: Any Package Version: CVS

Description:
------------
There are still some issues with processing Unicode, so although the parser code works, unit tests (i.e. LexerGeneratorTest::testLexerGeneratorUnicode()) fail.</pre>]]></content:encoded>
      <description><![CDATA[<pre>PHP_LexerGenerator Bug
Reported by instance
2007-12-02T04:45:56+00:00
PHP: 5.2.4 OS: Any Package Version: CVS

Description:
------------
There are still some issues with processing Unicode, so although the parser code works, unit tests (i.e. LexerGeneratorTest::testLexerGeneratorUnicode()) fail.</pre>]]></description>
      <dc:date>2008-01-09T03:05:02+00:00</dc:date>
      <dc:creator>jal &amp;#x61;&amp;#116; ambitonline &amp;#x64;&amp;#111;&amp;#x74; com</dc:creator>
      <dc:subject>PHP_LexerGenerator Bug</dc:subject>
    </item>
</rdf:RDF>
