Defect #2664
closed
Mercurial: Repository path encoding of non UTF-8 characters
100%
Description
Environment
- Server OS: Debian Lenny
- Redmine: svn rev 2361 (same problem with 0.8.0)
- Ruby: 1.8.6
- RubyGems: 1.3.1
- Rails: 2.1.2
- PostgreSQL: 8.3.5
- Mercurial: 1.0.1
- System locale: en_us.UTF8
- Database encoding: utf8
- Database locale: fr_FR.UTF8 (same problem with en_us.UTF8)
Error
Running: ruby script/runner "Repository.fetch_changesets" -e production
gives the following errors:
/home/redmine/redmine-0.8.0/vendor/rails/railties/lib/commands/runner.rb:47: /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb:147:in `log': RuntimeError: ERROR C22021 Minvalid byte sequence for encoding "UTF8": 0xe97365 HThis error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". Fwchar.c L1545 Rreport_invalid_encoding: INSERT INTO "changes" ("changeset_id", "action", "revision", "branch", "from_path", "path", "from_revision") VALUES(781, E'A', NULL, NULL, NULL, E'/Quantity/doc/Présentation du projet.pdf', NULL) RETURNING "id" (ActiveRecord::StatementInvalid)
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:484:in `execute'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:929:in `select_raw'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:916:in `select'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all_without_query_cache'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb:61:in `select_all'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:13:in `select_one'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:19:in `select_value'
    from /home/redmine/redmine-0.8.0/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:433:in `insert'
    ... 31 levels...
    from /home/redmine/redmine-0.8.0/vendor/rails/railties/lib/commands/runner.rb:47
    from /home/redmine/apps/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
    from /home/redmine/apps/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
    from script/runner:3
The error seems quite similar to #834, #917, and #1663, but it does not happen on the same table. Here, the problem comes from the "changes" table, while the already reported (and corrected) issues refer to a problem on the "changesets" table.
The problem seems to come from file paths that are not converted to UTF-8 (as we can see, there is an 'é' character in the file path).
I have tried different encodings in the repository tab settings without success.
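A minimal sketch of why the database rejects this path, assuming the repository stores the name in ISO-8859-1 (the report does not confirm the actual encoding). It uses the Ruby 1.9+ String API; the Ruby 1.8.6 in this report would need Iconv instead. The helper names valid_utf8? and to_utf8 are made up for illustration.

```ruby
# Returns true when the raw bytes form well-formed UTF-8.
def valid_utf8?(bytes)
  bytes.dup.force_encoding("UTF-8").valid_encoding?
end

# Reinterprets raw bytes in the given source encoding, then transcodes to UTF-8.
def to_utf8(bytes, source_encoding)
  bytes.dup.force_encoding(source_encoding).encode("UTF-8")
end

# The reported path as raw bytes: 0xE9 is "é" in ISO-8859-1 (an assumption),
# followed by the ASCII letters "s" (0x73) and "e" (0x65), i.e. 0xe97365.
RAW_PATH = "/Quantity/doc/Pr\xE9sentation du projet.pdf".b

# As UTF-8 the sequence is malformed: 0xE9 announces a three-byte character,
# but 0x73 is not a continuation byte.
puts valid_utf8?(RAW_PATH)

# After conversion from the assumed source encoding the path is valid UTF-8.
puts to_utf8(RAW_PATH, "ISO-8859-1")
```

The first call prints false, which matches the "invalid byte sequence" complaint from PostgreSQL; the converted string is what would need to reach the INSERT.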
Updated by Jérémie Delaitre almost 16 years ago
I just noticed something weird with Mercurial.
When I try to remove the file mentioned above, Mercurial does not succeed...
So the problem may lie in Mercurial rather than Redmine.
Updated by Daniel Lima about 15 years ago
I have the same issue. My environment is Redmine 0.8.4 on Windows Server 2003. My repository is Mercurial, with some special characters in file paths, like 'ç', 'ã', 'õ'.
Updated by Yuya Nishihara over 14 years ago
That's because Mercurial (and also Git) treat file names as byte strings.
Here we need to convert them to UTF-8, but there's no reliable information about the file name encoding.
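To illustrate why the encoding cannot be detected reliably, here is a small sketch (Ruby 1.9+ String API; the helper name decode_candidates and the sample bytes are made up for illustration). The same bytes decode without error under more than one encoding and give different text, so the encoding has to be supplied from outside.

```ruby
# Decodes the same raw bytes under each candidate encoding.
# Returns a hash of encoding name => UTF-8 text, or nil when the bytes
# are not well-formed in that encoding.
def decode_candidates(bytes, encodings)
  encodings.each_with_object({}) do |name, result|
    candidate = bytes.dup.force_encoding(name)
    result[name] = candidate.valid_encoding? ? candidate.encode("UTF-8") : nil
  end
end

# A file name as a byte string, as the SCM would report it.
RAW_NAME = "caf\xE9".b

decode_candidates(RAW_NAME, %w[UTF-8 ISO-8859-1 Windows-1251]).each do |name, text|
  puts "#{name}: #{text.inspect}"
end
```

The bytes are rejected as UTF-8 but accepted by both single-byte encodings, with a Latin letter in one case and a Cyrillic letter in the other.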
Wei Li wrote:
I have the same issue with Bazaar.
I'm not sure about Bazaar, but it must handle paths as UTF-8, so it seems strange.
Updated by Rui Tang over 14 years ago
I'm using Redmine 0.9.3 on Windows Server 2003 and have the same problem.
C:\redmine-0.9>ruby script/runner "Repository.fetch_changesets" -e production
c:/ruby/lib/ruby/gems/1.8/gems/rails-2.3.5/lib/commands/runner.rb:48: c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/abstract_adapter.rb:219:in `log': Mysql::Error: Incorrect string value: '\xB2\xE2\xCA\xD4\xB9\xDC...' for column 'path' at row 1: INSERT INTO `changes` (`changeset_id`, `action`, `revision`, `branch`, `from_path`, `path`, `from_revision`) VALUES (
ActiveRecord::StatementInvalid)
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/mysql_adapter.rb:323:in `execute'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/abstract/database_statements.rb:259:in `insert_sql'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/mysql_adapter.rb:333:in `insert_sql'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/abstract/database_statements.rb:44:in `insert_without_query_dirty'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/abstract/query_cache.rb:18:in `insert'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/base.rb:2908:in `create_without_timestamps'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/timestamp.rb:53:in `create_without_callbacks'
    from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-2.3.5/lib/active_record/callbacks.rb:266:in `create'
    ... 30 levels...
    from c:/ruby/lib/ruby/gems/1.8/gems/rails-2.3.5/lib/commands/runner.rb:48
    from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
    from c:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
    from script/runner:3
Updated by Yuya Nishihara over 14 years ago
Yuya Nishihara wrote:
That's because Mercurial (and also Git) treats file names as byte string.
Here we need to convert them to UTF-8, but, there's no reliable info about file name encoding.
Hi, I made a patch to fix the issue.
It adds a repositories.path_encoding column, which can be configured via Settings -> Repository tab.
Since it changes the database schema, rake db:migrate is necessary. Please try it with care.
Updated by Toshi MARUYAMA over 14 years ago
Yuya Nishihara wrote:
That's because Mercurial (and also Git) treats file names as byte string.
Here we need to convert them to UTF-8, but, there's no reliable info about file name encoding.
Wei Li wrote:
I have the same issue with Bazaar.
I'm not sure about Bazaar, but it must handle paths as UTF-8, so it seems strange.
I asked about this Bazaar problem and #5578 on the Mercurial-ja Google group (in Japanese).
The cause is the same as in #5578.
Bazaar issue: want an option to set the output encoding, especially on win32.
I also got a suggestion that the XMLOutput plugin is better than "bzr log".
Updated by Toshi MARUYAMA over 14 years ago
- File git-bzr.patch added
The Git problem is reported in #5251.
I tried Git and Bazaar and could display paths with multi-byte characters.
This patch is for Git and Bazaar.
Updated by Yuya Nishihara over 14 years ago
Toshi Maruyama wrote:
Git problem is reported at #5251.
I tried git and Bazaar and I could display multi-bytes characters path.
This patch is for git and Bazaar.
Git and Mercurial have exactly the same problem: they treat file names as bytes, so the patch for Git seems reasonable.
But Bazaar's problem sounds different to me. It lies in the communication layer between Redmine and Bazaar. They should talk in UTF-8 but currently do not.
Updated by xiaoyu yin over 14 years ago
To share my experience:
My systems are Windows XP SP3 and Windows Server 2003.
My steps are:
1. Uninstall Redmine and reinstall it.
2. Create an hg repository in the Redmine folder.
3. Import the patch.
4. Run the "rake db:migrate RAILS_ENV=production" command.
5. Restart the Redmine service.
The path_encoding column was added successfully.
I then tested the encodings in the list one by one; "GBK" is the correct one for me.
Good luck!
Updated by xiaoyu yin over 14 years ago
By the way: if you have data in the database, please back it up first and restore it after the path_encoding column has been added successfully.
Updated by Toshi MARUYAMA over 14 years ago
Additionally, you need to delete any repository setting created before the patch was applied and recreate the same repository from the Redmine settings tab.
Updated by Toshi MARUYAMA almost 14 years ago
- Status changed from New to Closed
Updated by Toshi MARUYAMA almost 14 years ago
- Status changed from Closed to Reopened
- Assignee set to Toshi MARUYAMA
- Priority changed from High to Low
Updated by Toshi MARUYAMA almost 14 years ago
- Status changed from Reopened to 7
Updated by Toshi MARUYAMA almost 14 years ago
- Target version set to Unplanned backlogs
Updated by bo ye almost 14 years ago
Please fix this first in a later version of Redmine (like 1.1.2?) if the #4455 Mercurial overhaul cannot be done soon.
This problem has completely stopped us from using hg with Redmine.
Updated by Toshi MARUYAMA almost 14 years ago
- Subject changed from Redmine+Mercurial+PostgreSQL: path encoding and multi-bytes characters to Repository path encoding of non UTF-8 characters (Mercurial, Git and CVS)
Updated by Toshi MARUYAMA almost 14 years ago
- File 20110207-db.diff added
- File 20110207-git-cvs-fs.diff added
- File 20110207-impl.diff added
These are patches for svn trunk r4799 and 1.1 stable r4800.
- 20110207-impl.diff is the main patch.
- 20110207-db.diff is the DB migration. If you have applied Yuya's issue-2664-0.9-stable-2010-04-11.patch, you do not need to apply this patch or run "rake db:migrate". If you have not applied Yuya's patch, you need to apply this patch and run "rake db:migrate".
- 20110207-git-cvs-fs.diff is for Git, CVS and Filesystem.
Updated by Toshi MARUYAMA almost 14 years ago
- Subject changed from Repository path encoding of non UTF-8 characters (Mercurial, Git and CVS) to Repository path encoding of non UTF-8 characters (Mercurial, Git, CVS and Filesystem)
Updated by Toshi MARUYAMA almost 14 years ago
- File hg-ruby-1.9.diff added
This is an ad hoc Mercurial adapter patch for Redmine SVN trunk and Ruby 1.9.
I confirmed that it runs on my Japanese Windows Vista with MinGW Ruby 1.9.2.
There is another "IO.popen" issue, #6090.
source:tags/1.1.1/lib/redmine/scm/adapters/abstract_adapter.rb#L184
I think we need to refactor "IO.popen", as in Yuya's Mercurial overhaul.
Updated by Paolo Losi almost 14 years ago
I can confirm that the patches (see note 19) solve the problem for us.
Since the issue is blocking, we would like to know whether there is a way to back out the patches and undo the schema migration once an official release addresses this issue.
Thanks
Updated by Paolo Losi almost 14 years ago
Paolo Losi wrote:
I can confirm that the patches (see note 19) solve the problem for us.
Since the issue is blocking, we would like to know if
the is a method to backout the patches and undo the schema
migration when there will be an official release that addresses this issue.
Answering myself:
rake db:migrate:down
Sorry for the noise
Updated by bo ye almost 14 years ago
Wow, these patches work great!!
It seems even better than before; at least issues can now be linked with r####.
Please include this in the next minor version, 1.1.2. You have my vote. :)
There is a minor problem with the patches, though: they don't work with the code review plugin. The error on the repository page:
NoMethodError in Code_review#update_revisions_view
Showing vendor/plugins/redmine_code_review/app/views/code_review/_update_revisions.html.erb where line #6 raised:
undefined method `review_count' for #<Changeset:0x63e7320>
Extracted source (around line #6):
3: # and open the template in the editor.
4: %>
5:
6: <script type="text/javascript">
7: <% @changesets.each do |changeset| %>
8: <%
9: if changeset.review_count > 0
Toshi MARUYAMA wrote:
These are patches for svn trunk r4799 and 1.1 stable r4800.
- 20110207-impl.diff is main patch.
- 20110207-db.diff is DB migration. If you have applied Yuya's issue-2664-0.9-stable-2010-04-11.patch, you don't need to apply this patch or you don't need to run "rake db:migrate". If you have not applied Yuya's patch, you need to apply this patch, and run "rake db:migrate".
- 20110207-git-cvs-fs.diff is for Git, CVS and Filesystem.
Updated by Toshi MARUYAMA almost 14 years ago
bo ye wrote:
please make this to the next minor version 1.1.2. you have my vote. :)
This feature is a big behaviour change and includes a DB migration.
So I think it is difficult to apply it to 1.1-stable.
But we should consider applying it to 1.2.
Yuya, what do you think?
Updated by Yuya Nishihara almost 14 years ago
Toshi MARUYAMA wrote:
bo ye wrote:
please make this to the next minor version 1.1.2. you have my vote. :)
This feature has big behaviour change and has a db migrate.
So, I think it is difficult to apply 1.1 stable.
But, we need to consider to apply 1.2.
Yuya, what do you think?
Same idea. For now, you can work around the issue by:
- putting lib/redmine/scm/adapters/path_encodable_wrapper.rb in place
- applying the patch only for app/models/repository.rb
- and, in place of db:migrate, replacing this line in the def new_scm method:
scm = Redmine::Scm::Adapters::PathEncodableWrapper.new(scm, path_encoding) unless path_encoding.blank?
by
scm = Redmine::Scm::Adapters::PathEncodableWrapper.new(scm, 'encoding-name-of-your-repo')
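For readers who want a feel for what a wrapper of this kind does, here is a rough sketch. It is not the PathEncodableWrapper from the patch, whose contents are not shown in this thread; FakeScm, PathEncodingWrapperSketch and the single entries method are assumptions made for illustration (Ruby 1.9+ String API).

```ruby
# Stand-in for a real SCM adapter (hypothetical). It records the path it
# was asked for and returns entries as raw bytes, as Mercurial would.
class FakeScm
  attr_reader :last_path

  def entries(path = nil)
    @last_path = path
    ["doc/Pr\xE9sentation.pdf".b]
  end
end

# Hypothetical wrapper: paths coming out of the adapter are converted from
# the repository encoding to UTF-8; paths going in are converted back.
class PathEncodingWrapperSketch
  def initialize(scm, path_encoding)
    @scm = scm
    @path_encoding = path_encoding
  end

  def entries(path = nil)
    @scm.entries(to_repo(path)).map { |entry| from_repo(entry) }
  end

  private

  # Repository bytes -> UTF-8 text for Redmine and the database.
  def from_repo(path)
    path.dup.force_encoding(@path_encoding).encode("UTF-8")
  end

  # UTF-8 text from Redmine -> raw bytes in the repository encoding.
  def to_repo(path)
    path && path.encode(@path_encoding).b
  end
end

wrapped = PathEncodingWrapperSketch.new(FakeScm.new, "ISO-8859-1")
puts wrapped.entries.first
```

Hard-coding the encoding name, as in the workaround, simply fixes the second constructor argument instead of reading it from a database column.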
Updated by Toshi MARUYAMA almost 14 years ago
Ruby 1.9 compatibility and tests are very serious matters.
Please see source:trunk/test/unit/lib/redmine/scm/adapters/git_adapter_test.rb@4810#L77 .
Updated by Toshi MARUYAMA almost 14 years ago
Japanese Shift_JIS and Traditional Chinese Big5 have the 0x5c (backslash) problem: the second byte of a multi-byte character can be 0x5c, so these encodings are not fully compatible with ASCII.
Japanese EUC-JP is compatible with ASCII.
Ruby uses the ANSI API to start a process on Windows.
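The 0x5c problem can be seen directly with a short sketch (Ruby 1.9+; the helper name hex_bytes is made up for illustration). U+8868 is a common kanji whose Shift_JIS form ends in the same byte as an ASCII backslash, which is why naive shell quoting or path splitting can corrupt such names.

```ruby
# U+8868 is a common kanji; written as an escape so the source stays ASCII.
KANJI = "\u8868"

# Returns the bytes of the text in the given encoding as hex strings.
def hex_bytes(text, encoding)
  text.encode(encoding).bytes.map { |b| format("%02X", b) }
end

puts hex_bytes(KANJI, "Shift_JIS").join(" ")  # second byte is 5C
puts hex_bytes(KANJI, "EUC-JP").join(" ")     # both bytes are above 7F
puts hex_bytes("\\", "US-ASCII").join(" ")    # the ASCII backslash itself
```

In EUC-JP every byte of a multi-byte character has the high bit set, so no byte can be mistaken for an ASCII character.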
Updated by Toshi MARUYAMA almost 14 years ago
Subversion supports URL encoding for paths, and Redmine uses it.
I think the Redmine Mercurial adapter needs to wrap the command-line paths of cat, diff and annotate, as Yuya's Mercurial overhaul helper extension does.
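As a generic illustration of URL (percent) encoding of a path, and not Redmine's or Subversion's actual code, the following sketch turns every byte outside a small safe set into %XX. The helper name percent_encode_path and the choice of safe characters are assumptions.

```ruby
# Percent-encodes every byte that is not an unreserved character or "/".
# Works on raw bytes, so the result is pure ASCII whatever the encoding
# of the original path.
def percent_encode_path(path)
  encoded = path.b.gsub(/[^A-Za-z0-9_.\-\/~]/) { |byte| format("%%%02X", byte.ord) }
  encoded.force_encoding("US-ASCII")
end

puts percent_encode_path("/doc/Pr\xE9sentation du projet.pdf".b)
```

Because the result contains only ASCII, it can be passed on a command line without depending on the console code page.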
Updated by Toshi MARUYAMA almost 14 years ago
I have started implementing this in a new way.
Ruby 1.9 compatibility is a very serious concern.
Updated by Toshi MARUYAMA almost 14 years ago
- Subject changed from Repository path encoding of non UTF-8 characters (Mercurial, Git, CVS and Filesystem) to Repository path encoding of non UTF-8 characters (Mercurial and Filesystem)
- Priority changed from Low to Normal
- % Done changed from 0 to 20
Updated by Toshi MARUYAMA over 13 years ago
- Subject changed from Repository path encoding of non UTF-8 characters (Mercurial and Filesystem) to Mercurial: Repository path encoding of non UTF-8 characters
- % Done changed from 20 to 60
Updated by Toshi MARUYAMA over 13 years ago
- File ruby-1.9.2-japanese-windows.png added
- Target version changed from Unplanned backlogs to 1.2.0
I can't run it on my Japanese Windows with Ruby 1.9.2 without the #4050 Ruby-1.9-Encoding.default_external.diff patch.
Even after applying this patch, I got the following error.
[2011-03-04 20:51:58] ERROR Encoding::InvalidByteSequenceError: "\x9C" followed by "-" on Windows-31J
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rails-2.3.11/lib/rails/rack/static.rb:37:in `file?'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rails-2.3.11/lib/rails/rack/static.rb:37:in `file_exist?'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rails-2.3.11/lib/rails/rack/static.rb:18:in `call'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rack-1.1.0/lib/rack/urlmap.rb:47:in `block in call'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rack-1.1.0/lib/rack/urlmap.rb:41:in `each'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rack-1.1.0/lib/rack/urlmap.rb:41:in `call'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rails-2.3.11/lib/rails/rack/log_tailer.rb:17:in `call'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rack-1.1.0/lib/rack/content_length.rb:13:in `call'
    r:/Ruby192/lib/ruby/gems/1.9.1/gems/rack-1.1.0/lib/rack/handler/webrick.rb:48:in `service'
    r:/Ruby192/lib/ruby/1.9.1/webrick/httpserver.rb:111:in `service'
    r:/Ruby192/lib/ruby/1.9.1/webrick/httpserver.rb:70:in `run'
    r:/Ruby192/lib/ruby/1.9.1/webrick/server.rb:183:in `block in start_thread'
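The same class of error can be reproduced in isolation with String#encode, as a sketch (Ruby 1.9+; the helper name decode_windows_31j is made up). This only shows how the byte pair from the message is rejected; it does not identify where those bytes came from in the trace above.

```ruby
# Tries to read raw bytes as Windows-31J and convert them to UTF-8.
# Returns the converted text, or a description of the failure.
def decode_windows_31j(bytes)
  bytes.dup.force_encoding("Windows-31J").encode("UTF-8")
rescue Encoding::InvalidByteSequenceError => e
  "conversion failed: #{e.message}"
end

# 0x9C is a lead byte in Windows-31J, but "-" (0x2D) is not a valid
# trail byte, so the pair is rejected.
puts decode_windows_31j("\x9C-".b)

# A well-formed two-byte character converts without error.
puts decode_windows_31j("\x95\x5C".b)
```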
Updated by Toshi MARUYAMA over 13 years ago
The "Files" module has similarly strange behavior on my Japanese Windows with Ruby 1.9.2.
I am giving up on fixing it.
Updated by Toshi MARUYAMA over 13 years ago
- Status changed from 7 to Closed
- % Done changed from 90 to 100
- Resolution set to Fixed
I finished implementing this feature as of r5001.
I confirmed that it runs on my Japanese Windows with Ruby 1.8 and on Linux with Ruby 1.8.
On Linux with the #4050 Ruby-1.9-Encoding.default_external.diff patch, I confirmed that it runs in an ISO-8859-1 locale.