
Bug #3513 support of RFC2231 in header fields.
Submitted: 2005-02-18 05:43 UTC
From: hiroaki dot kawai at gmail dot com Assigned: cipri
Status: Closed Package: Mail_Mime
PHP Version: 4.3.10 OS: ANY
Roadmaps: 1.4.0, 1.4.0RC4, 1.4.0a1    


 [2005-02-18 05:43 UTC] hiroaki dot kawai at gmail dot com
Description:
------------
Multibyte filenames should be encoded as described in RFC 2231. The current CVS head of Mail_Mime does not support this, so I would like to request the feature.

Reproduce code:
---------------
--TEST--
Tests for RFC2231 support in Content-Disposition.
--SKIPIF--
--FILE--
<?php
error_reporting(E_ALL);
$testEncoded="GyRCRnxLXDhsGyhCLnR4dA==";
$test = base64_decode($testEncoded); // Japanese filename in ISO-2022-JP charset.
require_once('Mail/mime.php');
$Mime=new Mail_Mime();
$Mime->setTXTBody('');
$Mime->addAttachment('Japanese file',"text/plain", $test, FALSE);
$body = $Mime->get();
$hdrs = '';
foreach ($Mime->headers() AS $name => $val) {
    $hdrs .= "$name: $val\n";
}
$hdrs .= "To: Receiver <receiver@example.com>\n";
$hdrs .= "From: Sender <sender@example.com>\n";
$hdrs .= "Subject: PEAR::Mail_Mime test mail\n";
require_once('Mail/mimeDecode.php');
$mime_message = "$hdrs\n$body";
$Decoder = new Mail_mimeDecode($mime_message);
$params = array(
    'include_bodies' => TRUE,
    'decode_bodies' => TRUE,
    'decode_headers' => TRUE
);
$Decoded = $Decoder->decode($params);
print_r($Decoded->parts[0]->headers['content-disposition']);
?>
--EXPECT--
attachment; filename*=iso-2022-jp''%1B%24BF%7CK%5C8l%1B%28B.txt
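For readers unfamiliar with the format, the expected header above follows the RFC 2231 extended-parameter form: the parameter name gains a trailing `*`, and the value is `charset'language'` followed by the percent-encoded bytes. The following is only a minimal sketch of that encoding step; `rfc2231_encode_param()` is a hypothetical helper written for illustration and is not part of Mail_Mime.

```php
<?php
// Sketch only: rfc2231_encode_param() is a hypothetical helper, not a Mail_Mime API.
function rfc2231_encode_param($name, $value, $charset, $language = '')
{
    // RFC 2231 extended value: charset'language'percent-encoded-bytes.
    // rawurlencode() percent-encodes every byte except ASCII letters, digits
    // and a few punctuation marks that RFC 2231 also allows unencoded.
    return $name . "*=" . strtolower($charset) . "'" . $language . "'" . rawurlencode($value);
}

// The ISO-2022-JP filename bytes used in the report's test case.
$filename = base64_decode('GyRCRnxLXDhsGyhCLnR4dA==');
echo 'attachment; ' . rfc2231_encode_param('filename', $filename, 'ISO-2022-JP'), "\n";
// prints: attachment; filename*=iso-2022-jp''%1B%24BF%7CK%5C8l%1B%28B.txt
```

The sketch assumes the caller already knows the charset of the raw bytes; it performs no charset detection or conversion.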

Comments

 [2005-02-18 14:01 UTC] hiroaki dot kawai at gmail dot com
I should have made the phpt test code focus on the encoding phase; the code I submitted depended on the decoding code. Here is a replacement.

--TEST--
Tests for RFC2231 support in Content-Disposition.
--SKIPIF--
--FILE--
<?php
mb_internal_encoding('ISO-2022-JP');
error_reporting(E_ALL);
$testEncoded="GyRCRnxLXDhsGyhCLnR4dA==";
$test = base64_decode($testEncoded);
require_once('Mail/mime.php');
$Mime=new Mail_Mime();
$Mime->setTXTBody('');
$Mime->addAttachment('Japanese file',"text/plain", $test, FALSE);
$body = $Mime->get();
$bodyarr=explode("\r\n",$body);
print_r($bodyarr[3]."\r\n");
print_r($bodyarr[4]."\r\n");
?>
--EXPECT--
Content-Disposition: attachment;
 filename*="iso-2022-jp''%1B%24BF%7CK%5C8l%1B%28B.txt"
 [2005-02-18 14:03 UTC] hiroaki dot kawai at gmail dot com
I also created a patch that adds RFC2231 encoding support to the current CVS head version.

Index: pear/Mail_Mime/mimePart.php
===================================================================
RCS file: /repository/pear/Mail_Mime/mimePart.php,v
retrieving revision 1.13
diff -u -r1.13 mimePart.php
--- pear/Mail_Mime/mimePart.php 10 Dec 2004 23:08:26 -0000 1.13
+++ pear/Mail_Mime/mimePart.php 18 Feb 2005 10:48:34 -0000
@@ -166,9 +166,12 @@
 
                 case 'dfilename':
                     if (isset($headers['Content-Disposition'])) {
-                        $headers['Content-Disposition'] .= '; filename="' . $value . '"';
+                        $dfilename = $this->buildDispositionParam("filename", $value);
+                        $headers['Content-Disposition'] .= ';' . MAIL_MIMEPART_CRLF . $dfilename;
+
                     } else {
-                        $dfilename = $value;
+                        $dfilename = $this->buildDispositionParam("filename", $value);
+
                     }
                     break;
 
@@ -347,5 +350,88 @@
         $output = substr($output, 0, -1 * strlen($eol)); // Don't want last crlf
         return $output;
     }
+    function buildDispositionParam($key, $val, $line_max = 76, $charset = NULL)
+    {
+        if(!$charset){
+            if(!extension_loaded('mbstring')){
+                $charset = "US-ASCII";
+            }else{
+                $charset = mb_internal_encoding();
+            }
+        }
+        //if(extension_loaded('mbstring')){
+            $internalcharset=mb_internal_encoding();
+            if($internalcharset==FALSE){
+                $internalcharset="US-ASCII";
+            }
+
+            if($internalcharset == "eucJP-win"){
+                $internalcharset="Windows-31J";
+                $val=mb_convert_encoding($val,"SJIS-win");
+            }
+
+            // Japanese mailer prefer ISO-2022-JP as described in RFC2237
+            if(stristr(mb_language(),'ja')){
+                $charset="ISO-2022-JP";
+            }
+
+            // Avoid 2231 encoding if possible.
+            if(stristr(mb_detect_encoding($val, "ASCII, ".mb_internal_encoding()),"ASCII")==false){
+                $internalcharset="US-ASCII";
+            }else{
+                if(strcasecmp($internalcharset,$charset)){
+                    $val=mb_convert_encoding($val, $charset, $internalcharset);
+                }
+            }
+            // Fix internalcharset name for IANA registry.
+            $charset=mb_preferred_mime_name($charset);
+        //}
+
+        // XXX The sequence below might not be safe for RFC2822, 2231 encoding may hide the problem.
+        # $need_escape = FALSE;
+        # if(mb_strpos($val, "\x09") !== FALSE) $need_escape = TRUE;
+        # if(mb_strpos($val, "\r") !== FALSE) $need_escape = TRUE;
+        # if(mb_strpos($val, "\n") !== FALSE) $need_escape = TRUE;
+
+        $param_pair=" ".$key."=\"".$val."\"";
+        if($charset == "US-ASCII" && strlen($param_pair) <= $line_max) { // && $need_escape might be here.
+            return $param_pair;
+        }
+
+        // RFC2231 encoding.
+        $encval = strtolower($charset) . "''" . str_replace('_', '%5F', rawurlencode($val));
+        $param_pair=" ".$key."*=\"".$encval."\"";
+        if(strlen($param_pair) <= $line_max) { // we don't need folding
+            return $param_pair;
+        }
+
+
+        // folding
+        // 4 bytes for "*0*=", 1 byte for heading ' ', 1 byte for ';', => total 6 bytes
+        $rhs_space = $line_max - strlen($key) - 6;
+
+        // folding
+        $dest = "";
+        for($i = 0 ; strlen($encval) > $rhs_space ; $i++) {
+            $dest .= " " . $key . "*" . $i . "*=";
+            $pos = strrpos(substr($encval, 0, $rhs_space), '%');
+            if( $rhs_space - 2 <= $pos) { // '%' is within the last 2 bytes
+                $dest .= substr($encval, 0, $pos);
+                $encval = substr($encval, $pos);
+            } else {
+                $dest .= substr($encval, 0, $rhs_space);
+                $encval = substr($encval, $rhs_space);
+            }
+            $dest .= ";\r\n";
+            // bytes allowed in right hand space
+            // 3 bytes for "**=", 1 byte for heading ' ', 1 byte for ';' => total 5 bytes
+            $rhs_space = $line_max - strlen(strval($i)) - strlen($key) - 5;
+        }
+        $dest .= " " . $key . "*" . $i . "*=";
+        $dest .= $encval;
+
+        return $dest;
+    }
+
 } // End of class
 ?>
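The folding loop in the patch splits a long encoded value into RFC 2231 continuations (`filename*0*=`, `filename*1*=`, and so on) while avoiding a cut inside a `%XX` triplet. The idea can be sketched in isolation as follows. This is an illustrative rewrite under simplifying assumptions, not the patch's code: `rfc2231_fold_param()` is a hypothetical name, and the input is assumed to be an already percent-encoded value that includes its `charset''` prefix.

```php
<?php
// Sketch only: rfc2231_fold_param() is a hypothetical helper, not the patch's code.
function rfc2231_fold_param($name, $encoded, $lineMax = 76)
{
    $segments = array();
    for ($i = 0; $encoded !== ''; $i++) {
        $prefix = ' ' . $name . '*' . $i . '*=';
        // Reserve one byte for the ";" that ends every line but the last.
        $room = max(3, $lineMax - strlen($prefix) - 1);
        $chunk = substr($encoded, 0, $room);
        if (strlen($chunk) < strlen($encoded)) {
            // If the cut would land inside a "%XX" triplet, back up to the "%".
            $pct = strrpos($chunk, '%');
            if ($pct !== false && $pct > strlen($chunk) - 3) {
                $chunk = substr($chunk, 0, $pct);
            }
        }
        $segments[] = $prefix . $chunk;
        // The cast covers older PHP versions where substr() may return false.
        $encoded = (string) substr($encoded, strlen($chunk));
    }
    return implode(";\r\n", $segments);
}

// Usage: fold an artificially long encoded value at 40 columns.
$encoded = "iso-2022-jp''" . str_repeat('%1B%24B', 12);
echo rfc2231_fold_param('filename', $encoded, 40), "\n";
```

Unlike the patch, the sketch recomputes the available width from the actual prefix on every iteration, so a growing segment index is accounted for automatically.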
 [2005-02-18 14:08 UTC] hiroaki dot kawai at gmail dot com
Oops, sorry. The patch needs a correction. Here is a diff of the diff; it enables the mbstring extension check.

--- mimePart.patch 2005-02-18 22:04:28.123182400 +0900
+++ mimePart.patch.2 2005-02-18 23:06:29.253905600 +0900
@@ -4,7 +4,7 @@
 retrieving revision 1.13
 diff -u -r1.13 mimePart.php
 --- pear/Mail_Mime/mimePart.php 10 Dec 2004 23:08:26 -0000 1.13
-+++ pear/Mail_Mime/mimePart.php 18 Feb 2005 10:48:34 -0000
++++ pear/Mail_Mime/mimePart.php 18 Feb 2005 14:06:29 -0000
 @@ -166,9 +166,12 @@
  
                  case 'dfilename':
@@ -33,7 +33,7 @@
 +                $charset = mb_internal_encoding();
 +            }
 +        }
-+        //if(extension_loaded('mbstring')){
++        if(extension_loaded('mbstring')){
 +            $internalcharset=mb_internal_encoding();
 +            if($internalcharset==FALSE){
 +                $internalcharset="US-ASCII";
@@ -59,7 +59,7 @@
 +            }
 +            // Fix internalcharset name for IANA registry.
 +            $charset=mb_preferred_mime_name($charset);
-+        //}
++        }
 +
 +        // XXX The sequence below might not be safe for RFC2822, 2231 encoding may hide the problem.
 +        # $need_escape = FALSE;
 [2006-12-03 20:23 UTC] cipri (Cipriano Groenendal)
This bug has been fixed in CVS.

If this was a documentation problem, the fix will appear on pear.php.net by the end of next Sunday (CET).

If this was a problem with the pear.php.net website, the change should be live shortly.

Otherwise, the fix will appear in the package's next release.

Thank you for the report and for helping us make PEAR better.

I fixed this by adding another optional parameter to the addAttachment function, where you can specify the charset used in the filename. Your code looked good, but I don't want to create an extra dependency on the mbstring extension, and this solution is a bit more flexible too :)
 [2007-04-28 09:49 UTC] cipri (Cipriano Groenendal)
This should also be fixed correctly now in 1.4.0RC4.
 [2007-05-05 15:09 UTC] cipri (Cipriano Groenendal)
Thank you for your bug report. This issue has been fixed in the latest released version of the package, which you can download at http://pear.php.net/get/Mail_Mime Fixed in 1.4.0