
Bug #4950 Incorrect CDATA serializing
Submitted: 2005-07-30 00:09 UTC
From: ja dot doma at gmail dot com
Assigned: ashnazg
Status: Closed
Package: XML_Util
PHP Version: 4.3.11
OS:
Roadmaps: 1.2.0a1


 [2005-07-30 00:09 UTC] ja dot doma at gmail dot com
Description:
------------
This code:

    require_once "XML/Util.php";

    // creating an XML tag
    $tag = XML_Util::createTag("test", array(), "Content ]]></test> here!", null, XML_UTIL_CDATA_SECTION);

will result in:

    <test><![CDATA[Content ]]></test> here!]]></test>

As you can see, this is wrong behavior. I'm not sure, but I think the only solution is to create two CDATA sections:

    <test><![CDATA[Content ]]>]]><![CDATA[</test> here!]]></test>

Note: the ]]> sequence is outside the CDATA sections. A normal (DOM) parser may merge a sequence of CDATA and text nodes into one DomNode.text field.

Sorry for my bad English =)

Test script:
---------------
    require_once "XML/Util.php";

    // creating an XML tag
    $tag = XML_Util::createTag("test", array(), "Content ]]></test> here!", null, XML_UTIL_CDATA_SECTION);
    echo $tag;

Expected result:
----------------
    <test><![CDATA[Content ]]>]]><![CDATA[</test> here!]]></test>

Actual result:
--------------
    <test><![CDATA[Content ]]></test> here!]]></test>
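To make the failure concrete, here is a minimal standalone sketch of why the actual result is broken. It assumes the pre-fix method simply wrapped the data without escaping it; naiveCDataSection() is a hypothetical stand-in, not the XML_Util source. It shows where a parser would consider the section finished.

```php
<?php
// Hypothetical stand-in for the unescaped behavior described in the report.
// Assumption: the original method wrapped the data without any escaping.
function naiveCDataSection($data)
{
    return sprintf('<![CDATA[%s]]>', strval($data));
}

$section = naiveCDataSection('Content ]]></test> here!');

// A parser ends the section at the first "]]>" after the opener,
// not at the one the serializer appended.
$start = strlen('<![CDATA[');
$end   = strpos($section, ']]>', $start);

$seen = substr($section, $start, $end - $start); // what the parser keeps
$rest = substr($section, $end + 3);              // what leaks out as markup

echo "inside section: [", $seen, "]\n";
echo "left over:      [", $rest, "]\n";
```

Everything after the first terminator, including the stray `</test>`, is parsed as ordinary markup, which is why the enclosing element closes early. here!]]>') { throw new Exception('unexpected section'); }
if ($seen !== 'Content ') { throw new Exception('unexpected section content'); }
if ($rest !== '</test> here!]]>') { throw new Exception('unexpected leftover'); }
</test>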

Comments

 [2005-07-30 00:20 UTC] ja dot doma at gmail dot com
The solution might be (line 642):

    function createCDataSection($data)
    {
        return sprintf("<![CDATA[%s]]>", preg_replace('/\]\]>/', "]]>]]><![CDATA[", strval($data)));
    }
 [2005-07-30 00:34 UTC] ja dot doma at gmail dot com
Oops! Since XML text cannot contain a literal ]]> sequence outside a CDATA section, the > must be escaped as &gt;. So the proper fix for the bug is:

    function createCDataSection($data)
    {
        return sprintf("<![CDATA[%s]]>", preg_replace('/\]\]>/', "]]>]]&gt;<![CDATA[", strval($data)));
    }
 [2008-05-04 18:09 UTC] ashnazg (Chuck Burgess)
I built a phpt test case for this using the submitter's test script, and I also tested both of the submitter's patches. The one _without_ the &gt; entity seems to work, while the other does not. This result is, of course, based on the "Expected result" value shown in the submitter's test script. Neither patch breaks the basic functionality tests in the phpt that I built for the createCDataSection() method. Granted, this was tested on PHP 5.2.4.
 [2008-05-04 23:13 UTC] ashnazg (Chuck Burgess)
patch committed to cvs
 [2008-05-05 06:33 UTC] drry (Drry Drry)
A more proper fix:

    function createCDataSection($data)
    {
        return sprintf("<![CDATA[%s]]>", preg_replace('/\]\]>/', "]]]]><![CDATA[>", strval($data)));
    }
 [2008-05-05 16:32 UTC] ashnazg (Chuck Burgess)
I'm inclined to agree, now that I'm looking directly at the CDATA spec. The occurrence of "]]>" is only allowed to end the CDATA section, and the first patch looks like it allows the literal "]]>" to be listed inside the first CDATA section, presumably on the assumption that the XML parser will see both ]]> end sequences and decide to use the latter one. This new patch splits the literal "]]>" into two pieces, "]]" and ">", thereby avoiding trying to write a literal "]]>".
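The split described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the committed XML_Util code: createCDataSectionSketch() is a hypothetical name, it uses str_replace() where the patch uses preg_replace(), and the round trip through DOMDocument only runs if the DOM extension happens to be loaded.

```php
<?php
// Hypothetical standalone version of the split-based escaping.
function createCDataSectionSketch($data)
{
    // Break every "]]>" into "]]" and ">": close the section after "]]"
    // and reopen a new one before ">", so no section contains "]]>".
    $escaped = str_replace(']]>', ']]]]><![CDATA[>', strval($data));
    return '<![CDATA[' . $escaped . ']]>';
}

$content = 'Content ]]></test> here!';
$xml     = '<test>' . createCDataSectionSketch($content) . '</test>';
echo $xml, "\n";

// Optional round trip: textContent concatenates the adjacent CDATA
// sections, so the original string should come back unchanged.
$roundTrip = null;
if (class_exists('DOMDocument')) {
    $doc = new DOMDocument();
    if ($doc->loadXML($xml)) {
        $roundTrip = $doc->documentElement->textContent;
        var_dump($roundTrip === $content);
    }
}
```

For the report's input this yields two adjacent sections, "Content ]]" and "></test> here!", which a parser joins back into the original text. here!]]></test>') { throw new Exception('unexpected serialization'); }
if (createCDataSectionSketch('plain') !== '<![CDATA[plain]]>') { throw new Exception('plain data should be wrapped unchanged'); }
if ($roundTrip !== null && $roundTrip !== $content) { throw new Exception('round trip mismatch'); }
</test>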
 [2008-05-05 16:39 UTC] ashnazg (Chuck Burgess)
Modified createCDataSection() method as outlined in drry's comment. I have no idea what the attached patch file is for, though.