
Bug #17397 Parsing breaks with inline JavaScript
Submitted: 2010-05-13 21:39 UTC
From: ramv Assigned:
Status: Open Package: XML_HTMLSax (version 2.1.2)
PHP Version: 5.3.1 OS: RHEL
Roadmaps: (Not assigned)    


 [2010-05-13 21:39 UTC] ramv (Ram Viswanadha)
Description:
------------
If the inline JavaScript has a for loop containing a '<' or '>' character, parsing breaks.

Test script:
---------------
<html>
<head>
Test
</head>
<body>
<span id='TestString'>This is a test <span>
<script>
/* Success Handler for YUI Get Script calls */
Test.onScriptsLoaded = function() {
    //console.log('onscriptsloaded');
    for(var i=0;i<Test.GetScriptsCode.length;i++){
        //console.log(Test.GetScriptsCode[i]);
        Test.GetScriptsCode[i]();
    }
};
</script>
</body>
</html>

Parsing this produces broken JavaScript, since the parser consumes the '<' character:

for(var i=0;i Test.GetScriptsCode.length;i++){
            ^^^

Expected result:
----------------
The text in the script block should be treated as one chunk and returned for processing, with </script> invoking the end tag handler.

Comments

 [2010-05-16 12:04 UTC] doconnor (Daniel O'Connor)
Well, the advice in https://developer.mozilla.org/en/Properly_Using_CSS_and_JavaScript_in_XHTML_Documents should really be followed, but since this is a package for parsing badly formed HTML... :( Any ideas how other implementors deal with it?
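As one data point on how other implementations deal with this: Python's standard-library html.parser treats the content of script and style elements as raw text, so a '<' inside a script is not taken as the start of a tag. The sketch below feeds it the test script from this report. The names ScriptCollector, extract_script and SAMPLE are made up for illustration and are not part of XML_HTMLSax or any PEAR package. The parser may deliver the script body in more than one piece, so the pieces are joined before use.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collects the raw text found inside <script> elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []      # pieces of script text, in document order
        self.end_tags = []    # names of end tags seen, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        self.end_tags.append(tag)

    def handle_data(self, data):
        # Script content arrives here as plain data, '<' included.
        if self.in_script:
            self.chunks.append(data)


# The test script from the bug report.
SAMPLE = """<html>
<head>
Test
</head>
<body>
<span id='TestString'>This is a test <span>
<script>
/* Success Handler for YUI Get Script calls */
Test.onScriptsLoaded = function() {
    //console.log('onscriptsloaded');
    for(var i=0;i<Test.GetScriptsCode.length;i++){
        //console.log(Test.GetScriptsCode[i]);
        Test.GetScriptsCode[i]();
    }
};
</script>
</body>
</html>
"""


def extract_script(html):
    """Return (joined script text, list of end tag names) for the given HTML."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks), parser.end_tags


if __name__ == "__main__":
    script_text, end_tags = extract_script(SAMPLE)
    print(script_text)
    print(end_tags)
```

The relevant design choice is that the tokenizer switches to a raw-text mode after <script> and only leaves it at the matching </script>, which is the behaviour the expected result above asks for. Whether the same approach can be retrofitted to this package is an open question.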
 [2010-05-16 12:06 UTC] doconnor (Daniel O'Connor)
Also, if you are using PHP 5.3, there are better options around - http://php.net/tidy + http://php.net/simplexml
 [2010-05-16 22:01 UTC] dufuz (Helgi Þormar Þorbjörnsson)
Start by having a look at http://pear.php.net/package/XML_HTMLSax3 instead, it's the newest version of the package.
 [2010-05-16 22:43 UTC] ramv (Ram Viswanadha)
XML_HTMLSax3-3.0.0_1 has the same problem