
Bug #4971 function _urlencode() insufficient for international file names
Submitted: 2005-08-02 15:01 UTC
From: wrandels at hsw dot fhz dot ch Assigned: hholzgra
Status: Suspended Package: HTTP_WebDAV_Server
PHP Version: Irrelevant OS: Windows XP, Mac OS X
Roadmaps: 1.1    


 [2005-08-02 15:01 UTC] wrandels at hsw dot fhz dot ch
Description:
------------
Function _urlencode() performs only a minimal encoding of URLs. Unfortunately, this encoding is insufficient for international filenames.

On Windows XP, if a filename contains both space characters and non-ASCII characters, such as "ä" (lower case a with diaeresis), the WebDAV client displays each space character as the literal character sequence "%20".

On Mac OS X, the WebDAV client suppresses all filenames that contain non-ASCII characters.

Test script:
---------------
I changed the _urlencode() function to the following to alleviate the problem:

    function _urlencode($path) {
        $c = explode('/', $path);
        for ($i = 0; $i < count($c); $i++) {
            $c[$i] = str_replace('+', '%20', urlencode($c[$i]));
        }
        return implode('/', $c);
    }

Expected result:
----------------
The code snippet above fixes the encoding issue. However, it does not fix the cross-platform issues that arise when international filenames are used on both Mac OS X and Windows XP.

Mac OS X encodes international filenames using Unicode Normalization Form D (NFD), whereas Windows XP uses Unicode Normalization Form C (NFC). It appears that the WebDAV client in Mac OS X can deal with both normalization forms, but it always submits NFD-encoded names to the server. The WebDAV client in Windows XP treats a name encoded in NFD as a different name than the same name encoded in NFC.

To fix the normalization form issue, I have found it best to always normalize resource names to NFC before sending them in a reply to the client. I am not sure, though, whether normalization should be addressed by the abstract PEAR server class or handled by a concrete subclass.
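The NFC suggestion above could be combined with the per-segment encoding roughly as follows. This is only a sketch, not code from HTTP_WebDAV_Server: the function name encode_path_nfc() is hypothetical, it assumes the path is valid UTF-8, and it relies on the Normalizer class from PHP's intl extension, which may not be installed (the sketch skips normalization in that case). It also uses rawurlencode(), which already emits "%20" for spaces, in place of the urlencode()/str_replace() pair.

```php
<?php
// Hypothetical helper, not part of HTTP_WebDAV_Server.
// Assumes $path is a '/'-separated, UTF-8 encoded resource path.
function encode_path_nfc($path)
{
    $segments = explode('/', $path);
    foreach ($segments as $i => $segment) {
        // Normalize to NFC only when the intl extension provides Normalizer.
        if (class_exists('Normalizer')) {
            $normalized = Normalizer::normalize($segment, Normalizer::FORM_C);
            if ($normalized !== false) {
                $segment = $normalized;
            }
        }
        // rawurlencode() encodes a space as %20 and non-ASCII bytes as %XX,
        // and is applied per segment so the '/' separators are preserved.
        $segments[$i] = rawurlencode($segment);
    }
    return implode('/', $segments);
}
```

Whether such a step belongs in the abstract server class or in a concrete subclass is left open here, as in the report.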

Comments

 [2008-04-24 01:30 UTC] hholzgra (Hartmut Holzgraefe)
* XP space/non-ASCII issue now refiled as bug #13760 (most likely already fixed)
* Unicode normalization issue now refiled as bug #13759 (feature request for past 1.0 versions)