
Bug #15385 special characters are converted to % and break sprintf
Submitted: 2008-12-23 22:26 UTC
From: duli Assigned:
Status: Analyzed Package: Text_Wiki (version 1.2.0)
PHP Version: 5.1.6 OS: centos 5.2
Roadmaps: (Not assigned)    


 [2008-12-23 22:26 UTC] duli (Luis Marzagao)
Description:
------------
I'm using wicked, a horde application that uses Text_Wiki. (The horde developers said it was a PEAR bug: http://bugs.horde.org/ticket/7801)

When I create a wiki page with special characters in the name, like "Tributário", the page name gets encoded with %, which causes sprintf on line 134 of pear/Text/Wiki/Render/Xhtml/Wikilink.php to break:

    Warning: sprintf() [function.sprintf]: Too few arguments in /usr/share/pear/Text/Wiki/Render/Xhtml/Wikilink.php on line 134

This happens because sprintf will try to substitute every %, when in fact only the first one should be substituted. So this:

    $href = sprintf($href, $this->urlEncode($page));

ends up like this:

    $href = sprintf("/wicked/display.php?page=%s&referrer=Tribut%C3%A1rio", "parcelamento");

And the error occurs because of the various % after page=%s. So I think every % sign other than the first one should be escaped with %% so that sprintf makes the proper substitution.

Test script:
---------------
    $href = "/wicked/display.php?page=%s&referrer=Tribut%C3%A1rio";
    $page = "parcelamento";
    $href = sprintf($href, $page);

Will show:

    PHP Warning: sprintf(): Too few arguments in /home/duli/test.php on line 4

(Possible) Solution:
----------
If you change line 134 of Wikilink.php to this, then everything works just fine, because every % after the first one will be escaped:

    $href = sprintf(substr_replace($href, ereg_replace("%", "%%", substr($href, strpos($href, '%')+1)), strpos($href, '%')+1), $this->urlEncode($page));

I'm sure there's a cleverer way of fixing it. I'm no PHP expert.

Expected result:
----------------
The page should load normally and the links created on the fly should be fine.

Actual result:
--------------
The page "tributário" loads, but the wikilinks inside the page don't get created, because sprintf thinks it's missing arguments:

    Warning: sprintf() [function.sprintf]: Too few arguments in /usr/share/pear/Text/Wiki/Render/Xhtml/Wikilink.php on line 134

    Call Stack
    #   Time    Memory   Function                                  Location
    1   0.0008  142480   {main}( )                                 ../display.php:0
    2   0.0402  3165472  Page->render( )                           ../display.php:136
    3   0.0402  3165472  Page->display( )                          ../Page.php:508
    4   0.0402  3165472  StandardPage->displayContents( )          ../Page.php:358
    5   0.0459  3563592  Text_Wiki->transform( )                   ../StandardPage.php:102
    6   0.0550  4017424  Text_Wiki->render( )                      ../Wiki.php:918
    7   0.0576  4149312  preg_replace_callback ( )                 ../Wiki.php:1012
    8   0.0582  4150768  Text_Wiki->_renderToken( )                ../Wiki.php:0
    9   0.0582  4150768  Text_Wiki_Render_Xhtml_Wikilink->token( ) ../Wiki.php:1128
    10  0.0583  4150768  sprintf ( )                               ../Wikilink.php:134
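
The escaping idea from the report can be written a little more readably. The following is only a sketch, not the code in Text_Wiki: the helper name `format_first_placeholder` is hypothetical, and it keys on the first `%s` rather than the first `%` (a deliberate variation on the reporter's one-liner). It also avoids `ereg_replace` in favour of `str_replace`.

```php
<?php
// Hypothetical helper, not part of Text_Wiki: keep the first "%s" as the
// only live placeholder and escape every other "%" as "%%" before sprintf().
function format_first_placeholder($template, $page)
{
    $pos = strpos($template, '%s');
    if ($pos === false) {
        // No placeholder at all: nothing to substitute.
        return $template;
    }
    // Escape literal percent signs on both sides of the placeholder.
    $head = str_replace('%', '%%', substr($template, 0, $pos));
    $tail = str_replace('%', '%%', substr($template, $pos + 2));
    return sprintf($head . '%s' . $tail, $page);
}

$href = '/wicked/display.php?page=%s&referrer=Tribut%C3%A1rio';
echo format_first_placeholder($href, 'parcelamento'), "\n";
// prints /wicked/display.php?page=parcelamento&referrer=Tribut%C3%A1rio
```

Note that this turns any other sprintf() directive in the template (such as `%d`) into literal text, which is the backwards-compatibility concern raised in the comments.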

Comments

 [2008-12-24 00:13 UTC] justinpatrin (Justin Patrin)
It looks like the problem here is that the encode is happening one step too early. Try replacing this:

    $href = sprintf($href, $this->urlEncode($page));

with this:

    $href = $this->urlEncode(sprintf($href, $page));

and let me know if that fixes it for you.
 [2008-12-24 00:18 UTC] justinpatrin (Justin Patrin)
Looking back at your test script, it looks like these urlencoded chars may be in your href and not your page name, so this might not work. The urlencode may also convert pieces of the URL that need to not be encoded. Please try the fix, though, and see how it works.

It looks like this is simply a limitation of how the system was implemented. sprintf() turns out not to be the best option here, since URL encoding in the href can (and will) have % signs in it. Escaping everything except the first one wouldn't be right, since the order of those parameters could be changed.

We could escape everything but %s, but this could be a BC break, as some people may be using the other sprintf() options (this may not be a BC issue, though, if the documented API only supports %s).

Another option would be to simply use str_replace('%s', $page, $href); but this, again, would break possible uses of other sprintf() options and would replace all %s with the page instead of just the first %s.

Comments welcome.
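
To make the str_replace() trade-off concrete, here is a small illustration. The template strings are made up for the example (only the first one comes from the report), and the variable names are arbitrary.

```php
<?php
// Template taken from the bug report: percent-encoded bytes follow the %s.
$template = '/wicked/display.php?page=%s&referrer=Tribut%C3%A1rio';

// str_replace() only touches the literal two-character sequence "%s",
// so "%C3" and "%A1" pass through unchanged.
$single = str_replace('%s', 'parcelamento', $template);
echo $single, "\n";
// prints /wicked/display.php?page=parcelamento&referrer=Tribut%C3%A1rio

// Illustrative template with two placeholders: both are replaced,
// which is the "replaces all %s" behaviour mentioned above.
$double = str_replace('%s', 'Penal', '/display.php?page=%s&title=%s');
echo $double, "\n";
// prints /display.php?page=Penal&title=Penal

// Other sprintf() directives are no longer interpreted at all.
$other = str_replace('%s', 'Penal', '/display.php?page=%s&rev=%d');
echo $other, "\n";
// prints /display.php?page=Penal&rev=%d
```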
 [2008-12-24 02:44 UTC] duli (Luis Marzagao)
> The urlencode may also convert pieces of the URL that need to not
> be encoded. Please try the fix, though, and see how it works.

Indeed. Now the $href of the "still not created" pages becomes like this, which I think is not the desired behaviour:

    %2Fwicked%2Fdisplay.php%3Fpage%3DPenal%26referrer%3DWikiHome

And it breaks the link, returning this when you click on it:

    The requested URL /wicked//wicked/display.php?page=Processo_Civil&referrer=WikiHome was not found on this server.

As for the already created pages, which is the case of my page named "Tributário", I can access it, but then the links on this page are broken because of the same error (due to the many %):

    Warning: sprintf() [function.sprintf]: Too few arguments in /usr/share/pear/Text/Wiki/Render/Xhtml/Wikilink.php on line 142

> It looks like this is simply a limitation of how the system was
> implemented.

Yes, I think so. And that's because if one only uses page names in English, one will never come across this issue.

> Another option would be to simply use str_replace('%s', $page, $href);
> but this, again, would break possible uses of other sprintf() options
> and would replace all %s with the page instead of just the first %s.

This solution works, because it only substitutes the % sign followed by an 's', which is not the case for the other existing %, at least for the URL I'm testing:

    Tribut%C3%A1rio

So these % do not get replaced. I wonder if there's a case where some word could be encoded with %s, like Tribut%s%A1rio, for example?!

Anyway, I think the str_replace solution is already better than sprintf, because it will *certainly* break less often (if at all).

Thanks
 [2008-12-24 02:58 UTC] duli (Luis Marzagao)
I have tested a lot of special chars (á, à, é, è, é, í, ì, ó, ò, ú, ù, ä, ë, ç, ã, ẽ etc) and none of them generated an encoded URL with %s, so it's working pretty well.

But it is also necessary to change line 101 from:

    $href = sprintf($href, $this->urlEncode($page)) . $anchor;

to

    $href = str_replace('%s', $page, $href) . $anchor;

Otherwise the links to already created pages will present the exact same issue.
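
On the question of whether an encoded name could ever contain `%s`: percent-encoding writes each escaped byte as `%` followed by two hexadecimal digits, and a literal `%` in the input becomes `%25`, so `%s` should not appear in encoded output. The sketch below checks this with PHP's `rawurlencode()`, which is used here as a stand-in; whether Text_Wiki's `urlEncode()` behaves identically is an assumption. The helper name is hypothetical, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: does the percent-encoded form of $name contain
// the literal sequence "%s"? rawurlencode() stands in for urlEncode().
function encoded_contains_placeholder($name)
{
    return strpos(rawurlencode($name), '%s') !== false;
}

// Includes a name that itself contains "%s" before encoding.
$names = array('Tributário', 'ação', 'ẽ', '100%sure');
foreach ($names as $name) {
    echo rawurlencode($name), ' => ',
        encoded_contains_placeholder($name) ? 'contains %s' : 'no %s', "\n";
}
```

Note that the replacement line quoted above passes `$page` to `str_replace()` without the `$this->urlEncode()` call that the original line had; this check only applies to names that are actually encoded first.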
 [2008-12-24 18:07 UTC] duli (Luis Marzagao)
Just added a patch. It seems to be working pretty fine now. Thanks
 [2009-08-25 12:00 UTC] cweiske (Christian Weiske)
-Status: Feedback
+Status: Analyzed
Only the first %s should be replaced.
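
One way to replace only the first `%s` without going through sprintf() is to locate it with `strpos()` and splice in the value with `substr_replace()`. This is a minimal sketch under that assumption, not the patch attached to this bug; the function name `replace_first_placeholder` is hypothetical, and any URL encoding of the value is left to the caller.

```php
<?php
// Hypothetical helper: substitute $value for the first "%s" only,
// leaving later "%s" sequences and percent-encoded bytes untouched.
function replace_first_placeholder($template, $value)
{
    $pos = strpos($template, '%s');
    if ($pos === false) {
        // No placeholder: return the template unchanged.
        return $template;
    }
    // Replace exactly the two characters of the first "%s".
    return substr_replace($template, $value, $pos, 2);
}

echo replace_first_placeholder(
    '/wicked/display.php?page=%s&referrer=Tribut%C3%A1rio',
    'parcelamento'
), "\n";
// prints /wicked/display.php?page=parcelamento&referrer=Tribut%C3%A1rio
```

Like the str_replace() option, this does not interpret other sprintf() directives in the template.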