



Feature #1341


keep consistency between browser encoding and mysql database encoding

Added by Gilles Ballanger about 16 years ago. Updated about 16 years ago.




After lazily importing issues directly into the MySQL database (I know this is bad practice; it is better to use a Ruby import script via the Redmine API), I see that the issue subject (and the description too) are badly encoded in UTF-8:

If I import a record via SQL using

INSERT INTO `issues` (`tracker_id`, `project_id`, `subject`, `description`, `due_date`, `category_id`, `status_id`, `assigned_to_id`, `priority_id`, `fixed_version_id`, `author_id`, `lock_version`, `created_on`, `updated_on`, `start_date`, `done_ratio`, `estimated_hours`) VALUES 
(4, 1, 'é', 'é', NULL, NULL, 1, NULL, 4, 5, 3, 0, '2008-05-30 14:19:43', '2008-05-30 14:19:43', '2008-05-30', 0, NULL);

the resulting database dump for this record is
INSERT INTO `issues` (`id`, `tracker_id`, `project_id`, `subject`, `description`, `due_date`, `category_id`, `status_id`, `assigned_to_id`, `priority_id`, `fixed_version_id`, `author_id`, `lock_version`, `created_on`, `updated_on`, `start_date`, `done_ratio`, `estimated_hours`) VALUES 
(234, 4, 1, 0xc3a9, 0xc3a9, NULL, NULL, 1, NULL, 4, 5, 3, 0, '2008-05-30 14:19:43', '2008-05-30 14:19:43', '2008-05-30', 0, NULL);

and the browser shows a '�' character in place of 'é'.

If I insert an issue via the browser with subject='é' and description='é', the dumped database is

INSERT INTO `issues` (`id`, `tracker_id`, `project_id`, `subject`, `description`, `due_date`, `category_id`, `status_id`, `assigned_to_id`, `priority_id`, `fixed_version_id`, `author_id`, `lock_version`, `created_on`, `updated_on`, `start_date`, `done_ratio`, `estimated_hours`) VALUES 
(235, 1, 1, 0xc383c2a9, 0xc383c2a9, NULL, NULL, 1, NULL, 4, NULL, 3, 0, '2008-06-01 13:14:20', '2008-06-01 13:14:20', '2008-06-01', 0, NULL);

=> the 'é' character was encoded as hex c3 83 c2 a9 (the correct UTF-8 encoding is c3 a9)

This produces "Ã©" in place of "é" in the MySQL database dump, but a correct é character in the issue as shown in the browser.

My knowledge of Ruby is not sufficient to reproduce this kind of string encoding interpretation, but I can do it in Python (2.x).
First I encode the 'é' character in UTF-8:

>>> unicode("é","utf-8").encode("utf-8")
'\xc3\xa9'

If I take each byte value, declare it as a Unicode string, and re-encode it in UTF-8, I get the same bad encoding:

>>> (u"\xc3").encode("utf-8")
'\xc3\x83'
>>> (u"\xa9").encode("utf-8")
'\xc2\xa9'

So perhaps there is a double encoding conversion somewhere between what is sent from the browser and what is written to the database?
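The suspected double conversion can be reproduced in a few lines of Python 3. This is only a sketch of the hypothesis, assuming the UTF-8 bytes are misread as Latin-1 somewhere along the way; the variable names are illustrative and not taken from Redmine or MySQL.

```python
# Sketch of the suspected double encoding (Python 3).
# "\u00e9" is the character 'é', written as an escape to avoid
# depending on the encoding of this source file.
correct = "\u00e9".encode("utf-8")

# Assumption: some layer treats each UTF-8 byte as one Latin-1
# character and then encodes the result as UTF-8 a second time.
double = correct.decode("latin-1").encode("utf-8")

print(correct.hex())            # c3a9
print(double.hex())             # c383c2a9
print(double.decode("utf-8"))   # Ã©
```

The second hex value matches the bytes seen in the dump of the issue created through the browser.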

Once again, importing directly into the database is a very bad idea (this is a perfect example), but this inconsistency between database encoding and page rendering could be a source of problems in the future ...

Actions #1

Updated by Thomas Löber about 16 years ago

What are the values of your MySQL variables?

mysql> show variables like 'character%';
| Variable_name            | Value                      |
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |

You may set the MySQL character set variables in my.cnf (e.g. /etc/mysql/my.cnf).

For the server (in the [mysqld] section):

character-set-server = utf8

For the client (in the [client] section):

default-character-set = utf8

The character set setting for the Rails connection to MySQL is in config/database.yml:

  adapter: mysql
  encoding: utf8

Actions #2

Updated by Gilles Ballanger about 16 years ago

  • Status changed from New to Resolved

Original situation:

mysql> show variables like 'character%';
| Variable_name            | Value                      |
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |

all wrong :( ...

After adapting the server, client, and Redmine configuration files, consistency is back. :)

Of course, the issues already in the database with bad encoding appear with the wrong characters, but new ones with "good" UTF-8 encoding are displayed correctly.
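For the issues that were stored before the fix, the double encoding can in principle be reversed at the string level. The helper below is a hypothetical sketch (the name `undo_double_encoding` is made up here, and nothing like it is part of Redmine); it assumes exactly one round of UTF-8 text misread as Latin-1. MySQL's `latin1` is actually closer to Windows-1252, so some characters may need that codec instead, and any real repair should be tried on a backup first.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one round of 'UTF-8 bytes misread as Latin-1'.

    Returns the input unchanged when it does not look double encoded.
    """
    try:
        # Map each character back to the single byte it came from,
        # then decode those bytes as the UTF-8 they originally were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or not valid UTF-8 afterwards:
        # assume the text was already correct.
        return text


print(undo_double_encoding("\u00c3\u00a9"))  # é  (was "Ã©")
print(undo_double_encoding("\u00e9"))        # é  (already correct, unchanged)
print(undo_double_encoding("plain ASCII"))   # plain ASCII
```

Falling back to the original string keeps the helper safe to run over a mix of old, double-encoded rows and new, correctly encoded ones.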

Thanks for your solution.

Actions #3

Updated by Jean-Philippe Lang about 16 years ago

  • Status changed from Resolved to Closed
  • Resolution set to Invalid
