Package home | Report new bug | New search | Development Roadmap Status: Open | Feedback | All | Closed Since Version 1.5.2

Bug #17689 Int casting causes incorrect values for large groups on 32-bit systems
Submitted: 2010-08-10 20:29 UTC
From: pmow Assigned: heino
Status: Closed Package: Net_NNTP (version 1.5.0a1)
PHP Version: 5.2.6 OS: Ubuntu
Roadmaps: (Not assigned)    
Subscription  
Comments Add Comment Add patch


Anyone can comment on a bug. Have a simpler test case? Does it work for you on a different platform? Let us know! Just going to say 'Me too!'? Don't clutter the database with that please !
Your email address:
MUST BE VALID
Solve the problem : 31 - 23 = ?

 
 [2010-08-10 20:29 UTC] pmow (Gabriel Campos)
Description: ------------ Because of Net_NNTP's int casting, I've run into an issue with a large newsgroup and a user's 32-bit system. A number around 3,770,000,000,000 is being reported as a lower number, around 2Million. Removing the cast from line 816-818 works, although the possibility exists that there are other areas it should be removed outside of my use case. Test script: --------------- include 'config.php'; $nntp = new nntp; $nntp->doConnect(); $data = $nntp->selectGroup('alt.binaries.boneless'); echo $data['last']; $nntp->doQuit(); Expected result: ---------------- 3770886011 Actual result: -------------- 2147483647

Comments

 [2010-08-10 20:31 UTC] pmow (Gabriel Campos)
correction: reporting around 2Billion not Million
 [2010-08-11 00:10 UTC] heino (Heino H. Gehlsen)
-Status: Open +Status: Verified
Haven’t tested it, but it seems plausible enough to change status to ‘verified’ straight ahead… Whether to remove the casting completely or to cast into floats is another matter! Using the otherwise unmodified substring from the server has its advantages over floats (less processing), but casting to float have its advantages too (less memory usage on such large groups). In hindsight I’d probably have used the unmodified string in this particular case, well at least in the protocol class, but now we also have to consider backward compatibility! Which type would be the lease breakage and the best choice, if we were to simply cross our fingers and hope solving this bug doesn’t break anyone’s code?
 [2010-08-11 17:37 UTC] pmow (Gabriel Campos)
I'm no expert, but when reading about this subject in php's docs I read that it will automatically use the most appropriate type. In addition, php converts between types as needed. So it would be backward compatible, as any code expecting an int will get an int. That being said, if those users' code casts to ints, they will encounter this issue as well. However it isn't Net_NNTP's fault, it's their code. (They would have encountered this issue anyway with large groups) See http://php.net/manual/en/language.types.integer.php Two excerpts that I found interesting: If PHP encounters a number beyond the bounds of the integer type, it will be interpreted as a float instead. Also, an operation which results in a number beyond the bounds of the integer type will return a float instead. To explicitly convert a value to integer , use either the (int) or (integer) casts. However, in most cases the cast is not needed, since a value will be automatically converted if an operator, function or control structure requires an integer argument. A value can also be converted to integer with the intval() function.
 [2010-08-11 23:23 UTC] heino (Heino H. Gehlsen)
From my point of view it’s Net_NNTP which doesn’t comply with the RFC! When I wrote the code I probably focused on minimizing memory usage and possible optimizing serialization/unserialization of the data, and I never thought of the possibility of someone accessing huge groups from a 32 bit system (which ‘only’ supports 2147483647 messages compared to the 64 bit max of 9223372036854775807 messages). I pretty much agree with you on the flexible casting in PHP, and I don’t consider this a major BC-break, especially if casting to floats would fix this bug. There are however still possible issues to consider - like cashed serialized data and such and whether the precision of floats is a problem. Truth be told, the bug should only be relevant on a 32 bit system and, and personally I don’t think the 64 bit users should be bothered unnecessary… All that said I honestly don’t expect recasting to strings (by removing the casting) – except for perhaps a performance penalty under certain conditions and certainly a memory penalty. Floats on the other hand could break something if they are not precise! One solution could be to only cast into integers up to PHP_INT_MAX, and keep the substring after that – or perhaps only cast into integers on 64-bit systems. Any thoughts?
 [2010-08-12 02:00 UTC] pmow (Gabriel Campos)
It seems not casting at all would be best. Casting as a float would indeed result in a loss of precision, although at the moment the largest usenet group a.b.boneless is 3770886011 (10 chars) and the precision that I get when casting is 12 characters. The growth in the last year of this particular group was 1.3B posts. If we assume a linear growth (not really expected) we have 769 years to go. So if you want to use floats, that's fine I think. If you simply want to clean the output I'm guessing is_numeric would work. The output from server would be saved in whatever is most efficient (integer, float/double, string). It seems like this would be ideal, no?
 [2010-08-14 02:17 UTC] heino (Heino H. Gehlsen)
-Status: Verified +Status: Analyzed -Assigned To: +Assigned To: heino
I’ve looked through the code, and the problem is more profound than the few lines you pointed out, yet it’s fixing it doable. To preserve some degree of consistency recasting to strings should also be done where otherwise appropriate! As you point out floats is out of the picture due to loss of precision, but strings would however also be problematic to some degree. PHP would automatically recast the numbers cast as string into integers/floats at will if the user did any calculations, and then we wouldn’t really have solved anything, right? And believe me users will end up having PHP recast the numbers strings to numbers! Fixing this bug, which is actually more of a platform inadequacy to handle a certain task, would only allow users on a 32 bit system to treat article numbers as strings, that’s it. Using the original substring (by not forcefully recasting to an integer) would of cause allow one to use Net_NNTP to fetch articles and store them, or at least their meta data, in a database – but one would have to let the database do any needed calculations, which would of cause make it possible the handle unsigned 32 bit integers... To be honest I’m considering the possibility that this ‘bug’ shouldn’t be fixed at all, and that users who wish to handle newsgroups larger than 2,147,483,647 articles should upgrade to modern 64 bit systems. That said I’m still willing to consider applying a patch, but I’m not the one to make any patch to save your customer/user from a hardware upgrade! Please take into consideration that since Net_NNTP (RFC977 et. al.) was written NNTP has been upgraded to version two (RFC 3977) in the autumn 2006. Also the package was written for PHP4, and doesn’t use exceptions and such. This is ancient code, which was optimized for the past – yet ancient code, which handles present tasks on current modern 64 bit hardware… BTW Some seven years ago wrote an unpublished exception based PHP5 packages for NNTP, POP3 and SMTP (it never got into the PEAR community, since I pretty much left due to the lack of cross package coordination between related packages). In case you should be interested I would search the archieves…
 [2010-08-14 02:58 UTC] pmow (Gabriel Campos)
Casting to strings will only result in the script converting to float for calculations. As you say, the only real benefit is when dumping the string directly into a database which can handle larger numbers. Yes I mentioned floats but the loss of precision will not practically happen for another several hundred years. Would this not be the best option?
 [2010-08-22 20:19 UTC] heino (Heino H. Gehlsen)
-Status: Analyzed +Status: Feedback
Thank you for taking the time to report a problem with the package. This problem may have been already fixed by a previous change that is in the SVN of the package. Please checking out the SVN repository of this package and upgrade svn checkout svn.php.net/repository/pear/packages/Net_NNTP/trunk pear upgrade package2.xml or pear upgrade package.xml If you are able to reproduce the bug with the latest SVN, please change the status back to "Open". Again, thank you for your continued support of PEAR.
 [2010-12-24 18:32 UTC] heino (Heino H. Gehlsen)
This bug is still hanging due to lack of final feedback. I'm about to roll a new version, and would like to make it a beta, but that won't happen unless I have some positive feedback on this bug. This bugfix could theoretically break something, although it's unlikely, so if I don't get any feedback from anyone, I'll might roll back the related commits to get the other fixes out there...
 [2011-08-17 23:39 UTC] heino (Heino H. Gehlsen)
-Status: Feedback +Status: Closed
Thank you for your bug report. This issue has been fixed in the latest released version of the package, which you can download at http://pear.php.net/get/ Closed without feedback :-(