
Bug #5880 utf8 issues
Submitted: 2005-11-06 20:30 UTC
From: brian at matzon dot dk
Assigned: tacker
Status: Bogus
Package: File_Bittorrent
PHP Version: Irrelevant
OS:
Roadmaps: (Not assigned)


 [2005-11-06 20:30 UTC] brian at matzon dot dk
Description:
------------
When decoding strings, the $return value should be wrapped in a utf8_decode call. This fixed the issues I had :)

Comments

 [2005-11-06 20:38 UTC] brian at matzon dot dk
fixed summary
 [2005-11-06 20:40 UTC] brian at matzon dot dk
Whoops, this fix also breaks the hash, so I am unable to call getStats afterwards. Ideally, the decoded values should be utf8 decoded when returned, and encoded again as appropriate when used internally. If not, using the API will be tedious at best.
 [2005-11-07 06:15 UTC] nagash at php dot net
Yeah, originally everything was done with utf8 encoding and decoding, but the hash is broken. Even if you encode the value and later decode it, you get the wrong hash! That's why utf8 was originally taken out, but maybe tacker has some ideas to fix it now.
 [2005-11-07 08:32 UTC] tacker at php dot net
Thanks for the report. Could you please provide the torrent(s) you have problems with?
 [2005-11-19 17:16 UTC] tacker at php dot net
File_Bittorrent and the official Bittorrent client produce the same output, so it is not a bug.

$ php torrentinfo.php -t bugs/ticket-12/utf8-test.torrent
name: æøåöüúù.png
filename: utf8-test.torrent
comment:
date: 1131381448
created_by: TorrentCreator 0.3
files: (1)
1: æøåöüúù.png
size: 2797
announce: http://xxxxxx.xx:1250/announce
announce_list:

$ torrentinfo-console bugs/ticket-12/utf8-test.torrent
torrentinfo-console 4.1.7 - decode BitTorrent metainfo files
metainfo file.......: utf8-test.torrent
info hash...........: f083668eaec3330e2f7ea54f7b948d0e82252643
file name...........: æøåöüúù.png
file size...........: 2797 (0 * 32768 + 2797)
tracker announce url: http://xxxxxx.xx:1250/announce
comment.............:
 [2005-11-19 17:50 UTC] brian at matzon dot dk
Wow, that's a bit of a letdown. So just because torrentinfo-console doesn't handle utf8, File_Bittorrent won't either?
 [2005-11-19 18:23 UTC] brian at matzon dot dk
To the best of my knowledge, all strings in a torrent are UTF-8 encoded. Therefore these strings must be decoded before they are used in PHP. However, doing so results in some unknown errors when computing a hash. I am unable to confirm this, though. Consider the following PHP code:

$field = 'æøå';
echo "Name:\t\t\t$field, " . utf8_decode($field) . "\n";
echo "md5:\t\t\t" . md5($field) . "\n";
echo "utf8<->utf8, md5:\t" . md5(utf8_encode(utf8_decode($field))) . "\n";

which yields the following correct names and hashes:

Name:                   æøå, æøå
md5:                    93c0abe43cfa5d4f4fbdd46694442182
utf8<->utf8, md5:       93c0abe43cfa5d4f4fbdd46694442182

If File_Bittorrent does not fix this UTF-8 error, then it's going to be a PITA to use, since *every* field has to be passed through utf8_decode.
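A note on the round trip above: decoding and re-encoding is lossless for valid UTF-8 text such as 'æøå', but not for arbitrary binary data. The following is a minimal sketch of that difference and is not taken from the thread. The helper names `utf8ToLatin1` and `latin1ToUtf8` are hypothetical, and the raw bytes are made up for illustration. The helpers use mbstring when available and fall back to the legacy `utf8_decode`/`utf8_encode` functions, which are deprecated since PHP 8.2.

```php
<?php
// Hypothetical helper: UTF-8 -> ISO-8859-1, as utf8_decode() does.
function utf8ToLatin1(string $s): string
{
    return function_exists('mb_convert_encoding')
        ? mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8')
        : utf8_decode($s);
}

// Hypothetical helper: ISO-8859-1 -> UTF-8, as utf8_encode() does.
function latin1ToUtf8(string $s): string
{
    return function_exists('mb_convert_encoding')
        ? mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1')
        : utf8_encode($s);
}

// Valid UTF-8 text ("æøå" written as explicit bytes).
$text = "\xC3\xA6\xC3\xB8\xC3\xA5";
$textRoundTrip = latin1ToUtf8(utf8ToLatin1($text));

// Made-up raw bytes that are not valid UTF-8, standing in for binary
// data such as concatenated SHA-1 digests.
$raw = "\xFF\xFE\x80\x01";
$rawRoundTrip = latin1ToUtf8(utf8ToLatin1($raw));

var_dump($textRoundTrip === $text); // bool(true): text survives
var_dump($rawRoundTrip === $raw);   // bool(false): binary bytes are altered
```

The second comparison fails because the re-encoded result is always valid UTF-8, while the original bytes were not, so any hash computed over the round-tripped bytes differs from the hash over the originals.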
 [2005-11-26 22:17 UTC] brian at matzon dot dk
I am reopening this bug since I have identified the cause and the fix (though I am unable to provide a patch file). Hacking around in the encode and decode code, I have successfully made it utf8 aware (as it should be!). Basically, with the strings utf8 encoded and decoded, everything works fine, except that we get wrong hashes. The reason is that the 'pieces' string is being encoded and decoded too, which it shouldn't be. The fix therefore must encode and decode only the correct strings, not any and all strings. Currently this means that all strings should be handled except the 'pieces' string. I am unsure about the 'name' in multi-file mode, since it's listed as a "character string" as opposed to a normal string.
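The selective approach described above could look roughly like the following sketch. It is not the File_Bittorrent implementation: `toDisplayString`, `decodeForDisplay`, the `$binaryKeys` parameter, and the sample array are all hypothetical, with the array standing in for whatever the package's decoder returns. The sketch converts every string value except those stored under a `pieces` key, and it returns a copy so the raw data stays available for hashing.

```php
<?php
// Hypothetical helper: UTF-8 -> ISO-8859-1. Uses mbstring when available,
// otherwise the legacy utf8_decode() (deprecated since PHP 8.2).
function toDisplayString(string $s): string
{
    return function_exists('mb_convert_encoding')
        ? mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8')
        : utf8_decode($s);
}

// Hypothetical helper: returns a display copy of a decoded torrent array.
// String values are converted, except those under a key in $binaryKeys.
function decodeForDisplay(array $data, array $binaryKeys = ['pieces']): array
{
    $out = [];
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Recurse into nested dictionaries and lists.
            $out[$key] = decodeForDisplay($value, $binaryKeys);
        } elseif (is_string($value) && !in_array($key, $binaryKeys, true)) {
            $out[$key] = toDisplayString($value);
        } else {
            // Integers and binary strings are copied unchanged.
            $out[$key] = $value;
        }
    }
    return $out;
}

// Made-up sample data; 'pieces' holds 20 raw bytes from a stand-in digest.
$torrent = [
    'announce' => 'http://xxxxxx.xx:1250/announce',
    'info' => [
        'name' => "\xC3\xA6\xC3\xB8\xC3\xA5.png", // "æøå.png" in UTF-8
        'length' => 2797,
        'piece length' => 32768,
        'pieces' => sha1('stand-in piece data', true),
    ],
];

$display = decodeForDisplay($torrent);

var_dump($display['info']['name'] === "\xE6\xF8\xE5.png");              // bool(true)
var_dump($display['info']['pieces'] === $torrent['info']['pieces']);    // bool(true)
```

The strict `in_array` check matters because list entries have integer keys, which should never match the string `'pieces'`. Whether other fields need the same exemption is left open here, as it is in the comment above.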
 [2006-09-04 08:59 UTC] tacker at php dot net (Markus Tacker)
In a UTF-8 shell the output is correct.

$ php torrentinfo.php -t utf8.torrent
name: æøåöüúù.png
filename: utf8.torrent
comment:
date: 1131381448
created_by: TorrentCreator 0.3
files: (1)
1: æøåöüúù.png
size: 2797
announce: http://xxxxxx.xx:1250/announce
announce_list:

$ torrentinfo-console utf8.torrent
torrentinfo-console 4.4.0 - dekodiere BitTorrent Metainformationsdateien
Metainformationsdatei....: utf8.torrent
Informations-Hash........: f083668eaec3330e2f7ea54f7b948d0e82252643
Dateiname................: æøåöüúù.png
Dateigröße.............: 2797 (0 * 32768 + 2797)
Tracker-Ankündigungs-URL: http://xxxxxx.xx:1250/announce
Kommentar................: