
Request #4666 Different database charsets
Submitted: 2005-06-23 08:31 UTC
From: dan at yes dot lt Assigned: quipo
Status: Closed Package: MDB2
PHP Version: Irrelevant OS:
Roadmaps: (Not assigned)    


 [2005-06-23 08:31 UTC] dan at yes dot lt
Description:
------------
Currently, when working with databases, all queries and results must use the same charset as PHP's internal charset. If the database charset differs from the internal one, we must convert all queries and results manually.

So how about implementing an additional URI parameter "charset" (or "encoding") and doing the charset conversion automatically? For example, we could work with a database that uses ISO-8859-13 (Baltic) while our internal charset is UTF-8, just by adding &charset=ISO-8859-13 to the database URI.

Below is my working implementation of this functionality:

function _convertQuery($query)
{
    if (! empty($this->dsn['charset'])) {
        $database = $this->dsn['charset'];
        if (extension_loaded('mbstring')) {
            // Working with "MultiByte String" extension
            $internal = mb_internal_encoding();
            // Checking if database charset is different from internal charset
            if (strcasecmp($database, $internal) !== 0) {
                // Converting query
                return mb_convert_encoding($query, $database, $internal);
            }
        } elseif (extension_loaded('iconv')) {
            // Working with "Iconv" extension
            $internal = iconv_get_encoding('internal_encoding');
            // Checking if database charset is different from internal charset
            if (strcasecmp($database, $internal) !== 0) {
                // Converting query
                return iconv($internal, $database, $query);
            }
        } else {
            $this->raiseError('Charset conversion not supported', 0, PEAR_ERROR_DIE);
        }
    }
    return $query;
}

function simpleQuery($query)
{
    // prepare query
    ...
    $query = $this->_convertQuery($query);
    ...
    // execute query
}

function _convertResult(&$arr)
{
    if (! empty($this->dsn['charset'])) {
        $database = $this->dsn['charset'];
        if (extension_loaded('mbstring')) {
            // Working with "MultiByte String" extension
            $internal = mb_internal_encoding();
            // Checking if database charset is different from internal charset
            if (strcasecmp($database, $internal) !== 0) {
                // Converting result (NULLs and other non-string values are left untouched)
                foreach ($arr as $key => $val) {
                    if (is_string($val)) {
                        $arr[$key] = mb_convert_encoding($val, $internal, $database);
                    }
                }
            }
        } elseif (extension_loaded('iconv')) {
            // Working with "Iconv" extension
            $internal = iconv_get_encoding('internal_encoding');
            // Checking if database charset is different from internal charset
            if (strcasecmp($database, $internal) !== 0) {
                // Converting result (NULLs and other non-string values are left untouched)
                foreach ($arr as $key => $val) {
                    if (is_string($val)) {
                        $arr[$key] = iconv($database, $internal, $val);
                    }
                }
            }
        } else {
            $this->raiseError('Charset conversion not supported', 0, PEAR_ERROR_DIE);
        }
    }
}

function fetchInto($result, &$arr, $fetchmode, $rownum = null)
{
    // fetch $arr
    ...
    $this->_convertResult($arr);
    ...
    // apply portability
}
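The conversion step in the methods above can be tried outside the driver. The following is only a minimal sketch in current PHP syntax: the function name convertCharset() is made up for this example and is not part of MDB2, and it takes both charsets as arguments instead of reading $this->dsn and the internal encoding.

```php
<?php
// Hypothetical helper for illustration; not part of MDB2 or PEAR.
// Converts $text from charset $from to charset $to, preferring mbstring
// and falling back to iconv, as in the request's implementation.
function convertCharset(string $text, string $from, string $to): string
{
    // Nothing to do when both sides already use the same charset.
    if (strcasecmp($from, $to) === 0) {
        return $text;
    }
    if (extension_loaded('mbstring')) {
        return mb_convert_encoding($text, $to, $from);
    }
    if (extension_loaded('iconv')) {
        $converted = iconv($from, $to, $text);
        if ($converted === false) {
            throw new RuntimeException("Cannot convert from $from to $to");
        }
        return $converted;
    }
    throw new RuntimeException('Charset conversion not supported');
}

// Three Lithuanian letters, written as escapes so the example does not
// depend on the encoding of this source file.
$internal = "\u{0105}\u{010D}\u{0119}";                      // UTF-8, 2 bytes each
$forDatabase = convertCharset($internal, 'UTF-8', 'ISO-8859-13');
$roundTrip = convertCharset($forDatabase, 'ISO-8859-13', 'UTF-8');

echo strlen($internal), ' ', strlen($forDatabase), "\n";     // 6 3
var_dump($roundTrip === $internal);                          // bool(true)
```

Queries would be converted with the charsets in one order and results with the order reversed, which is the only difference between _convertQuery() and _convertResult() above.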

Comments

 [2005-08-11 20:47 UTC] pear dot php dot net at chsc dot dk
I believe mysqli_set_character_name() does all this for you if you are using mysqli. But your code is necessary for drivers such as mysql and others that do not support this natively. BTW, I don't think it should be part of the DSN but rather an option, because the character set is only relevant within the application, as opposed to the DSN, which is supplied from outside the application.
 [2005-08-11 20:55 UTC] pear dot php dot net at chsc dot dk
Sorry, I meant mysqli_set_charset(). The mysql case can probably be solved using $DB->query("SET NAMES utf8") or similar (see http://dev.mysql.com/doc/mysql/en/charset-connection.html). But of course it would be nice if all this were standardized in the DB library.
 [2005-08-11 21:11 UTC] dan at yes dot lt
Yes, it would be nice if all this were standardized in the DB library. And yes, the charset could be set as an option. There could also be default methods for databases with no charset support, and overridden methods for specific databases (e.g. mysql, mysqli). P.S. Thanks for the MySQL charset queries :) But our software must be DB independent (today it works on mysql, mssql, pgsql and oracle), so we need that DB library feature :)
 [2005-10-18 10:15 UTC] cryptographite at comcast dot net
Adding a DSN option for 'set names' would be wonderful. Sometimes doing a query() just isn't possible (such as when DB is being loaded by another PEAR module and you don't have direct query access to the connection).
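To make the DSN idea concrete, here is a rough sketch of how a charset option could be read from a URI-style DSN and turned into a 'SET NAMES' statement. It uses only PHP's built-in parse_url() and parse_str(); the helper name dsnOptions() and the sample DSN are invented for this example, and this is not how MDB2's own DSN parser works.

```php
<?php
// Hypothetical helper for illustration; MDB2 has its own DSN parser,
// which this sketch does not reproduce.
function dsnOptions(string $dsn): array
{
    // Everything after "?" in the DSN is treated as key=value options.
    $query = parse_url($dsn, PHP_URL_QUERY);
    $options = [];
    if (is_string($query)) {
        parse_str($query, $options);
    }
    return $options;
}

// Made-up DSN with the proposed option appended.
$options = dsnOptions('mysql://user:secret@localhost/appdb?charset=utf8');
$charset = $options['charset'] ?? null;

// Only accept plain charset names before placing them into SQL.
if (is_string($charset) && preg_match('/^[A-Za-z0-9_-]+$/', $charset)) {
    // A driver would send this right after connecting (MySQL syntax).
    echo "SET NAMES '$charset'\n";
}
```

Because the option travels with the DSN, it would also reach a connection that is opened by another PEAR module on the application's behalf.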
 [2005-11-14 23:18 UTC] art at siit dot net
$DB->query("SET CHARACTER SET 'utf8'"); is required in order to convert the results. I have added an entry about supporting input/result charsets to MDB2's todo list at http://oss.backendmedia.com/MDB2/ToDo

So far I have found information about charset conversion for two DBMSs:
- MySQL (SET NAMES charset, SET CHARACTER SET 'charset')
- SQLite (PRAGMA encoding="charset"; not really a 'conversion', it only has an effect on new table creation)

PostgreSQL also supports UTF-8, but I don't know how to set it there, or in other DBMSs.
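As a rough summary of the statements collected so far, the sketch below maps a driver name to the SQL that switches the connection charset. The function charsetStatement() is hypothetical and not part of MDB2. The PostgreSQL entry is an addition to the list above: PostgreSQL accepts SET client_encoding TO '...' for the client side. SQLite is left out because, as noted above, its PRAGMA does not convert anything.

```php
<?php
// Hypothetical helper for illustration; not part of MDB2.
// Returns the SQL that switches the connection charset for a driver,
// or null when no such statement is known for that driver.
function charsetStatement(string $driver, string $charset): ?string
{
    // Only accept plain charset names, since the value is placed into SQL.
    if (!preg_match('/^[A-Za-z0-9_-]+$/', $charset)) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }
    switch ($driver) {
        case 'mysql':
        case 'mysqli':
            return "SET NAMES '$charset'";
        case 'pgsql':
            return "SET client_encoding TO '$charset'";
        default:
            // Unknown here: fall back to converting in PHP instead.
            return null;
    }
}

echo charsetStatement('mysql', 'utf8'), "\n";   // SET NAMES 'utf8'
echo charsetStatement('pgsql', 'UTF8'), "\n";   // SET client_encoding TO 'UTF8'
var_dump(charsetStatement('mssql', 'utf8'));    // NULL
```

Charset names are DBMS specific (MySQL spells it utf8, PostgreSQL UTF8), so a portable API would also need a name mapping, which this sketch leaves out.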
 [2005-11-14 23:30 UTC] art at siit dot net
Note that $DB->query("SET CHARACTER SET 'utf8'"); is MySQL-only (it is just an example). For the MDB2 API, maybe we could have something like $DB->setCharset("utf-8"); or should we set the charset at the result / datatype level?
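One possible shape for such a setCharset() call, combined with the earlier idea of default methods plus driver-specific overrides, is sketched below. Both classes and the needsManualConversion() method are invented for illustration and are not MDB2's driver classes; statements are only recorded in an array instead of being sent to a server.

```php
<?php
// Hypothetical classes for illustration only; not MDB2's real driver API.

class GenericDriverSketch
{
    /** Statements that would be sent to the server (recorded, not executed). */
    public $sent = [];
    protected $charset = null;

    /** Default: no native charset support, so only remember the charset. */
    public function setCharset(string $charset): void
    {
        if (!preg_match('/^[A-Za-z0-9_-]+$/', $charset)) {
            throw new InvalidArgumentException("Invalid charset name: $charset");
        }
        $this->charset = $charset;
    }

    /** True when queries and rows must be converted in PHP. */
    public function needsManualConversion(): bool
    {
        return $this->charset !== null;
    }
}

class MysqlDriverSketch extends GenericDriverSketch
{
    /** Override: let the server convert by switching the connection charset. */
    public function setCharset(string $charset): void
    {
        parent::setCharset($charset);            // reuse the name validation
        $this->sent[] = "SET NAMES '$charset'";  // a real driver would execute this
    }

    public function needsManualConversion(): bool
    {
        return false;
    }
}

$generic = new GenericDriverSketch();
$generic->setCharset('ISO-8859-13');
$mysql = new MysqlDriverSketch();
$mysql->setCharset('utf8');                      // MySQL spells the name "utf8"

var_dump($generic->needsManualConversion());     // bool(true)
var_dump($mysql->needsManualConversion());       // bool(false)
echo $mysql->sent[0], "\n";                      // SET NAMES 'utf8'
```

Setting the charset on the connection keeps application code identical across drivers. Setting it per result or datatype would need a different design, which this sketch does not cover.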
 [2005-11-16 11:15 UTC] art at siit dot net
Added more info at http://oss.backendmedia.com/MDB2/CharacterSet
It seems we have at least 5 different types of charset to deal with: client, connection, database, table, and results (not all of them may be settable in every DBMS). Please check it out and comment :)
 [2006-06-30 21:19 UTC] tokul at users dot sourceforge dot net (Tomas Kuliavas)
Include library version information when you add new options. I think the DSN charset option was introduced in MDB2 v2.1.0, and your docs do not mention that. There is only a short notice in the changelog about the added SetCharset function.