Comments for "I18N_UnicodeNormalizer"

» Submit Your Comment
Comments are only accepted during the "Proposal" phase. This proposal is currently in the "Finished" phase.
» Comments
  • David Sanders  [2007-06-11 11:49 UTC]

    Instead of commenting on the usual tech stuff I'm just going to say: As a dev who work on database systems, especially on the comparison of people's names, etc, I quite like this idea.
  • Lukas Smith  [2007-06-11 11:57 UTC]

    Have you looked at the following .. http://pecl.php.net/package/translit .. without knowing what all the jargon words mean in your proposal, I guess it handles parts of your feature set in C. So maybe integrating it as an optional backend for those parts would be a good idea.
  • Till Klampaeckel  [2007-06-11 21:33 UTC]

    In general I like this a lot as well, since it looks like what I've been looking for in PHP. ;-)

    I am not sure I like the __construct($root='') part where you prepend a root directory to constant (a path) which is defined inside the class and not adjustable (through a set()).

    Instead I'd remove the constant and only use the $root (and probably call it something else). Not sure if you have to since this doesn't improve performance but it wouldn't hurt to test if the directory exists and so on and throw an Exception.

    In general, some of your code is hard to read (lines are pretty long) and some of the if()'s seem a bit complicated - for example, exit early, instead of long code inside the condition and exit in the else part.

    Then, I'd (personally) always use empty() to test instead of, if ($foo != '').

    But I guess that's called CS or maybe "IMHO". :-)

    Also, you shoud list the multibyte extension as a dependency.
  • Michel Corne  [2007-06-15 17:26 UTC]

    Thanks for your comments. Please read my own comments below.
    I would be interested to have some feedback from other folks, especially those who have an interest in internationalization stuff. Also, feedback from QA folks would be very much appreciated to let me know if the testing suite and the API doc is in line with PEAR's standards.
    MC

    [pear-dev] "...Have you looked at the following .. http://pecl.php.net/package/translit .."
    [MC] Actually transliteration is very different from normalization so I doubt that there is much that I could leverage. I have not checked that package yet. There are 2 only methods that I wrote that are somewhat similar to other packages.

    [pear-dev] "... not sure I like the __construct($root='') part where you prepend a root directory to constant (a path) which is defined inside the class and not adjustable..."
    [MC] The $root is not meant to be used in production. This is just a way to trick the class to include a set of files similar to the production ones but customized for some of the testing. I will make this clearer in the API doc.

    [pear-dev] "... some of your code is hard to read (lines are pretty long)..."
    [MC] The lines are long because there is the code itself and a comment right after. I comment each line because it is easier for me to maintain the code when I go back to it weeks/months after. Also, we have large screens these days and editors with smart highlighting which I believe makes the whole things pretty readable. We also all have different styles to program I guess :-)

    [pear-dev] "... some of the if()'s seem a bit complicated - for example, exit early, instead of long code inside the condition and exit in the else part..."
    [MC] Good point regarding exiting early. I used to do as you say for years. Then I changed things more recently. I may have to change that back... The normalization rules are not easy to understand nor to implement which may reflect in the code. This requires many conditions tests and recursive programing. I am open to simplifying what I wrote, please be more specific so I can take a look at it again.

    [pear-dev] "...I'd (personally) always use empty() to test instead of, if ($foo != '')..."
    [MC] Yeah, yeah... I use empty() too... depends on the day :-) Note that it is not strictly the same thing.

    [pear-dev] "...you should list the multibyte extension as a dependency..."
    [MC] I did not know that we had to list extensions as dependencies. No problem. Where should I list that?