Vote Details for "Net_IDNA" by neufeind

» Details
  • Voter: Stefan Neufeind 
  • Vote: -1 (not conditional)
  • Reviews: Deep source review
» Comment
I'm in favor of having such a class. But imho PHP4-support as well is needed - and possible!

The mappings you use at the beginning of the source-file are quite "long" - and not too self-explanatory. I'd favor using the tables provided in the RFC. This should imho also include using one table for replacements (replacing with nothing as well as replacing with one or more chars). Your naming $_nm_casemap is somehow irritating - because not only uppercase is mapped, but also several chars (like the German sz [means: "ß"], which is mapped to "ss"). So this should be clarified.

Support for encodings other than UTF-8 and ISO-8859-1 should be added. Currently you use a boolean to specify if input is UTF-8. Imho that should be extended to provide a more open interface. (Note: And then we could maybe use UCS-4 in the class, since it makes multibyte-string-handling easier and probably faster - we might want to test that). It also saves you from decomposing the string into an array.

I wonder if using UTF-8 is "that easy". The RFC uses Unicode in UCS4-representation for it's explanations, and so does your class internally. What if I'm able to provide already existing UCS4 to your class? That would ease the work, but it's not possible with the current class.

Nameprep should be a separate function, as it's also a separate process by definition (RFC).

The RFC define that separate labels (as described in RFC3490) are en/decoded. However I didn't find any place in the source where you split the domain-name at a dot, which separates labels. According to the RFC all labels are encoded/decoded separately and later joined with dots again.

Imho generally a more RFC-strict implementation would be useful.

All in all: I think given the parts mentioned above are added/corrected the class is "okay". However, as has already been discussed on the mailinglists, several developers would favor multi-charset-support as well as an open API to support several extensions if they are present (mbstring, libiconv, ... as well as libidn !!!).

Please have a look at the implementation at http://pear.speedpartner.de/ (as David Rech already mentioned in his comment). It has most of the "missing" features. The "to-do" points are supporting other character-conversion libraries (e.g. libiconv) apart from the currently used mbstring. Also support for libidn is already in the works. (Those two things are what David Rech is already working on, as an addition to the existing implementation. Hope to see it finished soon :-)) ).

PS: Sorry, I was not able to give feedback on this package earlier due to workload.