
Bug #19904 UTF-16 surrogate pairs trigger "Excel found unreadable content" error
Submitted: 2013-04-17 00:21 UTC
From: seanch
Assigned:
Status: Feedback
Package: Spreadsheet_Excel_Writer (version 0.9.3)
PHP Version: Irrelevant
OS: Linux
Roadmaps: (Not assigned)    

 [2013-04-17 00:21 UTC] seanch (Sean Callan-Hinsvark)
Description:
------------
If a Unicode string written to a worksheet contains any surrogate pairs, Excel reports an "unreadable content" error when the file is opened, and the data is not displayed. The problem is in the Spreadsheet_Excel_Writer_Worksheet::writeStringBIFF8() method, where mb_strlen($str, 'UTF-16LE') is used to calculate the string's length. Apparently Excel expects Unicode string lengths to be the number of 16-bit code units, not the number of characters.

Test script:
---------------
require_once 'Spreadsheet/Excel/Writer.php';

$excel = new Spreadsheet_Excel_Writer();
$excel->setVersion(8); // Excel 97/2000 format, which allows Unicode characters
$worksheet = $excel->addWorksheet('test');
$worksheet->setInputEncoding('UTF-8');
$utf8_string = html_entity_decode('𝄞', ENT_COMPAT, 'UTF-8'); // musical symbol G clef
$result = $worksheet->writeString(0, 0, $utf8_string);
$excel->send('test.xls');
$excel->close();

Expected result:
----------------
The worksheet should open in Excel without error, with a single (likely undisplayable) character in the first cell.

Actual result:
--------------
When opening the worksheet in Excel, an "Excel found unreadable content" error occurs and the first cell contains no data.
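The length mismatch described above can be reproduced without the package. The following is a minimal sketch, not the merged patch: `utf16_code_unit_count()` is a hypothetical helper introduced only for illustration, and the sketch assumes the mbstring extension is loaded and PHP 7 or later (for `intdiv()` and scalar type declarations).

```php
<?php
// Hypothetical helper (not part of Spreadsheet_Excel_Writer):
// counts 16-bit code units in a string that is already UTF-16LE encoded.
function utf16_code_unit_count(string $utf16le): int
{
    // Every UTF-16 code unit occupies exactly two bytes.
    return intdiv(strlen($utf16le), 2);
}

// U+1D11E MUSICAL SYMBOL G CLEF, written as raw UTF-8 bytes.
$utf8  = "\xF0\x9D\x84\x9E";
$utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');

// mb_strlen() counts characters, so the surrogate pair counts once.
$characters = mb_strlen($utf16, 'UTF-16LE');

// The byte-based count sees both halves of the surrogate pair.
$codeUnits = utf16_code_unit_count($utf16);

echo "characters: $characters, code units: $codeUnits\n";
```

For strings made up only of characters in the Basic Multilingual Plane, the two counts agree, which is presumably why the problem shows up only with surrogate pairs.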

Comments

 [2013-04-17 00:23 UTC] seanch (Sean Callan-Hinsvark)
I guess the HTML entity in my test script isn't being escaped when it's displayed here. That line should be: $utf8_string = html_entity_decode('&amp;#x1D11E;', ENT_COMPAT, 'UTF-8'); // musical symbol G clef
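Because the tracker renders the entity instead of showing it, it may help to check what the test string decodes to. This is a small sketch that assumes the entity was written as the hexadecimal reference for U+1D11E; the exact spelling used in the original script is not recoverable from this page, and a decimal reference would decode to the same bytes.

```php
<?php
// Assumed spelling of the entity: hexadecimal reference for U+1D11E.
$entity = '&#x1D11E;';

$utf8_string = html_entity_decode($entity, ENT_COMPAT, 'UTF-8');

// A character outside the Basic Multilingual Plane takes four bytes in UTF-8.
$byteLength = strlen($utf8_string);
$hexBytes   = bin2hex($utf8_string);

echo "bytes: $byteLength, hex: $hexBytes\n";
```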
 [2013-04-17 00:45 UTC] seanch (Sean Callan-Hinsvark)
 [2014-01-24 00:19 UTC] chealer (Filipus Klutiero)
The attached patch was merged. However, I still get corrupted files with the current version. I realized I was in fact hit by bug #19278. This bug only happens when using BIFF8 (Excel 97/2003). Spreadsheet_Excel_Writer uses an older format by default.
 [2017-05-25 02:33 UTC] sanmai (Alexey Kopytko)
-Status: Open
+Status: Feedback
Thank you for taking the time to report a problem with the package. Unfortunately you are not using a current version of the package -- the problem might already be fixed. Please download a new version from http://pear.php.net/packages.php If you are able to reproduce the bug with one of the latest versions, please change the package version on this bug report to the version you tested and change the status back to "Open". Again, thank you for your continued support of PEAR.