
Request #3935 File_CSV::read should optionally ignore field count
Submitted: 2005-03-23 23:29 UTC
From: crain at fuse dot net Assigned: dufuz
Status: Assigned Package: File_CSV
PHP Version: 4.3.9 OS: Win XP
Roadmaps: (Not assigned)    


 [2005-03-23 23:29 UTC] crain at fuse dot net
Description:
------------
Hi,

I would like to see an optional third parameter to File_CSV::read() that tells the method whether or not to ignore lines with more or fewer fields than calculated by File_CSV::discoverFormat(), e.g. function read($file, &$conf, $ignoreSize = false).

Currently, the test at the end of the method results in valid CSV data not being returned to the calling script, even though a valid CSV file may have lines with varying numbers of elements. If the number of exploded pieces of a line does not equal $conf['fields'] (which, being based only on the first five lines of a file, should be only a guideline in any case), the parsed line is not returned.

The current behavior breaks PEAR::Contact_AddressBook::importFromFile(), which relies on File_CSV::read() for parsing CSV exports from contact managers, which very typically export files with varying numbers of fields. I think it should be up to the calling script to determine how to respond to a mismatch in size.

Thanks!
Andy Crain
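
A minimal sketch of the requested switch, assuming the check simply compares count($fields) with $conf['fields'] as described above. filterByFieldCount() is a hypothetical standalone helper, not part of File_CSV, and returning false for a dropped line is an assumption made for illustration:

```php
<?php
// Hypothetical helper, not part of File_CSV: mimics the field-count test
// described in the report, with the proposed $ignoreSize switch.
function filterByFieldCount(array $fields, array $conf, $ignoreSize = false)
{
    if (!$ignoreSize && count($fields) != $conf['fields']) {
        // Behaviour as described in the report: the parsed line is not returned.
        return false;
    }
    // Either the size matches, or the caller asked to ignore the mismatch.
    return $fields;
}

$conf = array('fields' => 3);       // field count as a discovery step might report it
$row  = array('Andy', 'Crain');     // one field short

$dropped = filterByFieldCount($row, $conf);        // false
$kept    = filterByFieldCount($row, $conf, true);  // array('Andy', 'Crain')
```

With the flag set, the calling script receives the short row and can decide for itself how to treat it.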

Comments

 [2005-03-23 23:50 UTC] crain at fuse dot net
Oops. I should have said $conf['fields'] is based on the first 10 lines of the file, not five.
 [2005-03-24 18:01 UTC] dufuz
What version of File do you have installed?
 [2005-03-24 18:07 UTC] crain at fuse dot net
Sorry about that. I knew I was forgetting something:
Id: File.php,v 1.6 2005/02/27 10:00:26
Id: CSV.php,v 1.3 2005/02/27 10:00:27
 [2005-03-24 18:18 UTC] dufuz
You have an ancient File version. Please update and let me know how this works.
 [2005-03-24 20:57 UTC] crain at fuse dot net
Apologies again: the version numbers I gave previously were from my own repository; in fact, I'm running the most current version of the files:
Id: File.php,v 1.28 2005/01/10 13:25:57 mike Exp
Id: CSV.php,v 1.18 2005/02/18 11:16:14 dufuz Exp
Andy
 [2005-12-15 14:01 UTC] contact at andreass dot net
I would recommend the following approach (I personally need the option to pad short lines with empty fields, as Excel produces a different number of empty fields if there are some at the end of a line):

    read($file, &$conf, $sizeMismatch = 'error')

    if (count($fields) != $conf['fields']) {
        if ($sizeMismatch == 'fix') {
            // Pad the line with empty strings up to the expected field count.
            return array_merge(
                $fields,
                array_fill(count($fields), $conf['fields'] - count($fields), '')
            );
        } elseif ($sizeMismatch == 'ignore') {
            // Return the line as parsed, whatever its size.
            return $fields;
        } else {
            File_CSV::raiseError("Read wrong fields number count: '" . count($fields) . "' expected " . $conf['fields']);
            return true;
        }
    }
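
The three modes proposed above can be tried outside the package with a small standalone sketch. resolveSizeMismatch() is a hypothetical function, not File_CSV code. Two choices here are assumptions that differ from the snippet in the comment: it pads with array_pad() instead of array_merge()/array_fill(), so that lines longer than expected pass through unchanged instead of producing a negative fill count, and it throws an exception in place of File_CSV::raiseError():

```php
<?php
// Hypothetical standalone function; File_CSV::read() has no such parameter.
function resolveSizeMismatch(array $fields, $expected, $sizeMismatch = 'error')
{
    if (count($fields) == $expected) {
        return $fields;
    }
    if ($sizeMismatch == 'fix') {
        // array_pad() only appends; rows longer than $expected are returned as-is.
        return array_pad($fields, $expected, '');
    }
    if ($sizeMismatch == 'ignore') {
        return $fields;
    }
    // Default 'error' mode: stand-in for the package's own error handling.
    throw new UnexpectedValueException(
        "Read wrong fields number count: '" . count($fields) . "' expected " . $expected
    );
}

$fixed   = resolveSizeMismatch(array('a', 'b'), 4, 'fix');     // array('a', 'b', '', '')
$ignored = resolveSizeMismatch(array('a', 'b'), 4, 'ignore');  // array('a', 'b')
```

In the default mode a mismatched line raises an error, so existing callers would keep a strict check unless they opt in to 'fix' or 'ignore'.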
 [2006-03-18 19:27 UTC] contact at andreass dot net (Andreas Schamberger)
What about incorporating my suggestion into the Package?
 [2006-03-20 11:25 UTC] dufuz (Helgi Þormar)
I am still thinking about it. I am leaving on a 3 month trip to Brazil, so it might take some time, but I have been looking at this since I wrote a couple of test cases that haven't yet gotten into CVS.

To tell the truth, if a line has fewer fields than the field count indicates, File_CSV fills the missing fields with empty values. I guess auto discovery can royally mess this up for us, though, since we only make it check the first 10 lines (more can result in speed issues, though it can't hurt to up it to 20, I guess).

So yes, the idea is still valid and will be looked at soon enough.
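
To illustrate why a field count taken from only the first lines is fragile, here is a rough sketch. guessFieldCount() is a hypothetical function, and picking the most frequent count in the sample is an assumption; it is not necessarily how File_CSV::discoverFormat() decides:

```php
<?php
// Hypothetical sketch: guess the field count from the first $sample lines
// by taking the most frequent count seen in that sample.
function guessFieldCount(array $lines, $sample = 10, $separator = ',')
{
    $counts = array();
    foreach (array_slice($lines, 0, $sample) as $line) {
        // Explicit enclosure and escape arguments keep this portable across PHP versions.
        $n = count(str_getcsv($line, $separator, '"', '\\'));
        $counts[$n] = isset($counts[$n]) ? $counts[$n] + 1 : 1;
    }
    if (!$counts) {
        return 0; // nothing to sample
    }
    arsort($counts);   // most frequent count first
    reset($counts);
    return key($counts);
}

// Ten two-field lines followed by fifteen three-field lines.
$lines = array_merge(array_fill(0, 10, 'a,b'), array_fill(0, 15, 'a,b,c'));

$fromTen = guessFieldCount($lines, 10);  // 2: only the short lines were sampled
$fromAll = guessFieldCount($lines, 25);  // 3: the majority of the file
```

With a sample of 10, every later three-field line would count as a mismatch, which is the situation the request wants the caller to be able to handle.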