
Bug #2931 _validateQuotedString() is too strict
Submitted: 2004-12-09 02:19 UTC
From: john at curioussymbols dot com
Assigned: chagenbu
Status: Closed
Package: Mail
PHP Version: 4.3.1
OS: WinXP
Roadmaps: (Not assigned)    


 [2004-12-09 02:19 UTC] john at curioussymbols dot com
Description:
------------
Sample code below. The address string

$address_string = '"Joe Doe \(from Somewhere\)" <doe@example.com>, postmaster@example.com, root';

doesn't parse correctly, and the following script prints 'something is wrong'.

The script is taken from the PHP manual for the imap RFC822 parse function, which works correctly with the same code; I have modified it to use the PEAR equivalent.

I've just downloaded the latest version of PEAR and Mail and tested with those.

Reproduce code:
---------------
<?php
require_once("Mail/RFC822.php");

$address_string = '"Joe Doe \(from Somewhere\)" <doe@example.com>, postmaster@example.com, root';
//$address_string = "Joe Doe from Somewhere <doe@example.com>, postmaster@example.com, root";
echo($address_string."\n");

$address_array = Mail_RFC822::parseAddressList($address_string, "example.com");
if (!is_array($address_array) || count($address_array) < 1) {
    die("something is wrong\n");
}

foreach ($address_array as $val) {
    echo "mailbox : " . $val->mailbox . "<br />\n";
    echo "host : " . $val->host . "<br />\n";
    echo "personal: " . $val->personal . "<br />\n";
}
print_r($address_array);
?>

Expected result:
----------------
Something like this (tested it on another server which had IMAP extensions):

X-Powered-By: PHP/4.0.6
Content-type: text/html

mailbox : doe<br />
host : example.com<br />
personal: Joe Doe (from Somewhere)<br />
adl : <br />
mailbox : postmaster<br />
host : example.com<br />
personal: <br />
adl : <br />
mailbox : root<br />
host : example.com<br />
personal: <br />
adl : <br />

Actual result:
--------------
john@cruncher2 ~ $ php test.php
"Joe Doe \(from Somewhere\)" <doe@example.com>, postmaster@example.com, root
something is wrong

Comments

 [2005-02-07 03:51 UTC] chagenbu
So, the bug, if it is one, is in _validateQuotedString() - it doesn't like the backslashes inside the quoted string. If you remove the backslashes, in fact, it works fine. So here's my question - should we just be checking that a quoted string contains no unescaped quotes and that the ending quotes aren't escaped? I'm not sure on the technical definition of quoted-string from the RFCs and haven't checked yet.
 [2005-02-07 16:13 UTC] john at curioussymbols dot com
I wouldn't put the quotes there myself, but I'm using this function to parse incoming emails, and the incoming emails sometimes have these. It's an Outlook or Outlook Express thing, I believe.

I think the intention of the backslashes is to avoid the 'from somewhere' being interpreted as a 'comment' (see RFC 2822, section 3.2.3, http://www.faqs.org/rfcs/rfc2822.html).

I think that basically you're supposed to have quotes around a name if it has more than one word. So,

John <john@doe.com>
"Doe, John" <john@doe.com>

I'm not sure if the brackets in the following count as a comment or not. I think maybe they do, and that would be why MS escapes them in my original example.

"Doe (John)" <john@doe.com>

Note sections 4.4 and A.6.1 in RFC 2822. These say that you can also (obsoletely) have

John Doe <john@doe.com>

That probably confuses things, but I reckon it still happens now and then with dodgy webmail apps etc.

JP
 [2005-02-07 16:15 UTC] john at curioussymbols dot com
Change that: "I wouldn't put the quotes there myself" to "I wouldn't put the backslashes there myself" but even then, I'm not so sure. I can't see anything in RFC2822 that says you're allowed to have a comment inside a quoted-string.
 [2005-02-07 17:17 UTC] richard
From RFC822 section 3.4.3:

---
A comment is a set of ASCII characters, which is enclosed in matching parentheses and which is not within a quoted-string.
---

And with regard to quoted strings, they are thus:

1. Enclosed by (unescaped) quotes (").
2. Can contain any ASCII char (0 - 127).
3. Quotes, backslashes and CR must be escaped by a backslash.
4. Any character can also be escaped.

Enjoy.
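
The four rules above can be sketched as a single left-to-right scan. This is only an illustration of the rules as listed in this comment, not the code in the Mail package; the function name checkQuotedString() is hypothetical, and it assumes the string is passed in with its enclosing quotes still attached.

```php
<?php
// Hypothetical helper illustrating the four rules; not part of PEAR Mail.
function checkQuotedString($s)
{
    $len = strlen($s);

    // Rule 1: the string must start and end with a double quote.
    if ($len < 2 || $s[0] !== '"' || $s[$len - 1] !== '"') {
        return false;
    }

    // Rule 2: only ASCII characters (0 - 127) are allowed.
    if (preg_match('/[^\x00-\x7F]/', $s)) {
        return false;
    }

    // Walk the characters between the enclosing quotes.
    for ($i = 1; $i < $len - 1; $i++) {
        $c = $s[$i];

        if ($c === '\\') {
            // Rule 4: a backslash escapes whatever character follows it.
            $i++;
            if ($i >= $len - 1) {
                // The backslash escaped the closing quote, so the
                // string is not properly terminated (rule 1).
                return false;
            }
            continue;
        }

        // Rule 3: a bare quote or CR must have been escaped.
        // (A bare backslash cannot reach this point; it is handled above.)
        if ($c === '"' || $c === "\r") {
            return false;
        }
    }

    return true;
}
```

Under these rules the display name from the original report, '"Joe Doe \(from Somewhere\)"', is accepted, because each backslash simply escapes the parenthesis that follows it.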
 [2005-05-12 21:29 UTC] mic at uts dot cc dot utexas dot edu
We are seeing the same problem at our IMP installation. We have changed the test to

return !preg_match('/[\x0D\\\\"]/', preg_replace('/\\\\./', '', $qstring));

This is a two-stage process expressed as a one-liner:

1) account for valid escaped characters (by the expedient of removing them from consideration using preg_replace())
2) fail on any remaining illegal (i.e. unescaped) characters, i.e. carriage returns, backslashes, quotes.

This change has worked so far with our test scripts and with IMP. I hope it is comprehensive enough to represent a worthwhile fix for this bug.
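
For readers who want to try the two stages separately, here is a minimal standalone sketch. The wrapper name isValidQuotedStringBody() is hypothetical, and it assumes $qstring holds only the text between the enclosing quotes; the two regular expressions are the ones quoted in this comment.

```php
<?php
// Hypothetical standalone wrapper around the one-liner proposed above.
// Assumption: $qstring is the text between the enclosing double quotes.
function isValidQuotedStringBody($qstring)
{
    // Stage 1: remove every escaped pair (a backslash plus the one
    // character after it) so valid escapes drop out of consideration.
    $remaining = preg_replace('/\\\\./', '', $qstring);

    // Stage 2: anything left that is a carriage return, a backslash or
    // a double quote was not escaped, so the string is rejected.
    return !preg_match('/[\x0D\\\\"]/', $remaining);
}

// The display name from the original report now passes.
var_dump(isValidQuotedStringBody('Joe Doe \(from Somewhere\)'));

// An unescaped quote in the middle is still rejected.
var_dump(isValidQuotedStringBody('Joe "Doe'));
```

A lone trailing backslash is also rejected: stage 1 finds no character after it to pair with, so the backslash survives into stage 2.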
 [2005-05-18 17:42 UTC] chagenbu
This bug has been fixed in CVS. In case this was a documentation problem, the fix will show up at the end of next Sunday (CET) on pear.php.net. In case this was a pear.php.net website problem, the change will show up on the website in short time. Thank you for the report, and for helping us make PEAR better.