Defect #2274

closed

Filesystem Repository path encoding of non UTF-8 characters

Added by Toni Kerschbaum almost 16 years ago. Updated over 13 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Toshi MARUYAMA
Category:
SCM
Target version:
Start date:
2008-12-04
Due date:
% Done:

100%

Estimated time:
Resolution:
Fixed
Affected version:

Description

The filesystem repository has some issues with special characters in file or directory names.
If the name of a file or directory contains a special character (ü, ö or ä, for example), the codepage gets mixed up
and the browser has trouble displaying it (see attached pictures). If one corrects the codepage manually,
the files in the repo are displayed correctly, but every other special character (table headers, etc.) gets messed up.

In addition, if a directory name contains a special character, the directory itself is browseable, but every subdirectory and file in it is not.
That is mainly because every '/' after the special character is converted to %2F and every special character is percent-encoded (ü becomes %FC, for example), which completely breaks the URI and results in a 404 (Not Found) error. For instance:

http://redmine.my.domain/repositories/browse/project/Brücke -> http://redmine.my.domain/repositories/browse/project/Br%FCcke
http://redmine.my.domain/repositories/browse/project/Brücke/übersicht.jpg -> http://redmine.my.domain/repositories/browse/project/Br%FCcke%2F%FCbersicht.jpg

Manually replacing every %2F with / did not really help: the Redmine page loads, but with an error saying that the file does not exist in the repo. Replacing every %-code with its special character leads to a 500 (Internal Server Error).
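The two URLs above can be reproduced outside Redmine. The following is only a minimal Python sketch, not Redmine's code: `encode_repo_path` is a hypothetical helper, and it assumes the path was percent-encoded as Latin-1 bytes (which is what %FC for ü suggests). It shows that escaping the path as a single segment yields the %2F seen above, while escaping it with '/' left intact keeps the directory structure.

```python
from urllib.parse import quote


def encode_repo_path(path: str, encoding: str = "utf-8", keep_slash: bool = True) -> str:
    """Percent-encode a repository path (hypothetical helper, not Redmine's API).

    keep_slash=False treats the whole path as one segment, so '/' becomes %2F.
    """
    safe = "/" if keep_slash else ""
    return quote(path, safe=safe, encoding=encoding)


path = "Brücke/übersicht.jpg"

# Whole path escaped as one segment, Latin-1 bytes: matches the broken URL.
print(encode_repo_path(path, encoding="latin-1", keep_slash=False))
# Br%FCcke%2F%FCbersicht.jpg

# Same bytes, but '/' kept as a path separator.
print(encode_repo_path(path, encoding="latin-1"))
# Br%FCcke/%FCbersicht.jpg

# UTF-8 bytes, which is what most web stacks expect in a URI.
print(encode_repo_path(path))
# Br%C3%BCcke/%C3%BCbersicht.jpg
```

Whether the server then finds the file still depends on how it maps the decoded bytes back to the name stored on disk, which this sketch does not cover.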

Setting the codepages under Administration -> Repositories -> Codepages does not affect this in any way. Tried settings:
  • UTF-8, ISO 8859-1, ISO 8859-15, CP1252
  • ISO-8859-1, ISO-8859-15, UTF-8, CP1252
  • ISO-8859-15, ISO-8859-1, UTF-8, CP1252
Tested under:
  • Redmine 0.7.3.devel.2079 (MySQL)
  • ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
  • Rails 2.1.0
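For illustration only: a fallback list like the codepage settings tried above is typically applied by attempting each encoding in order and keeping the first one that decodes without error. The sketch below is a Python assumption about that general technique, not Redmine's implementation, and `decode_path` is a hypothetical name. It also shows why the order matters: ISO-8859-1 accepts any byte sequence, so any encoding listed after it is never reached.

```python
def decode_path(raw: bytes, candidates):
    """Return (text, encoding) for the first candidate that decodes raw.

    Hypothetical helper: tries the encodings strictly in the given order.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding could decode the path")


# A Latin-1 encoded directory name, as a Western European Windows system might produce.
latin1_raw = "Brücke".encode("latin-1")

# UTF-8 first: the lone 0xFC byte is invalid UTF-8, so the next candidate is used.
print(decode_path(latin1_raw, ["utf-8", "iso-8859-1", "iso-8859-15", "cp1252"]))
# ('Brücke', 'iso-8859-1')

# ISO-8859-1 first: it never fails, so genuine UTF-8 names are misread.
utf8_raw = "Brücke".encode("utf-8")
print(decode_path(utf8_raw, ["iso-8859-1", "utf-8"]))
# ('BrÃ¼cke', 'iso-8859-1')
```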

Files

repo_enc_bug_01.gif (20.5 KB) Encoding bug (Firefox 3.0.4 Encoding: Auto/UTF-8, Windows XP SP3) Toni Kerschbaum, 2008-12-13 20:09
repo_enc_bug_02.gif (20 KB) Encoding bug (Firefox 3.0.4 Encoding: ISO-8859-1, Windows XP SP3) Toni Kerschbaum, 2008-12-13 20:09
fs-setting.png (13.2 KB) Toshi MARUYAMA, 2011-02-08 08:14
fs-browse.png (28.9 KB) Toshi MARUYAMA, 2011-02-08 08:14

Related issues

Related to Redmine - Defect #2664: Mercurial: Repository path encoding of non UTF-8 characters (Closed, Toshi MARUYAMA, 2009-02-04)

Actions #1

Updated by Paul Rivier almost 16 years ago

  • Assignee set to Paul Rivier

Hi Tony,

I can't reproduce it here with characters like é, €, ä, ß or à. My only available environment is Linux, so it might be related to your Windows environment. Can anybody else confirm this bug and narrow it down as much as possible?

Actions #2

Updated by Toni Kerschbaum almost 16 years ago

I just noticed that I forgot to upload the pictures, so here they are.

It could be related to my Windows environment (Windows Server 2003). If I can help you in any way to narrow or track down the problem, please let me know.

Also, the path in the first two pictures remains fully browseable (and the files downloadable), but some other paths and directories do not. Sadly, I can't find any notable difference in those paths and filenames.

Actions #3

Updated by Toshi MARUYAMA almost 14 years ago

  • Assignee changed from Paul Rivier to Toshi MARUYAMA
Actions #4

Updated by Toshi MARUYAMA almost 14 years ago

  • Subject changed from Filesystem Repository and (german) special chars to Filesystem Repository path encoding of non UTF-8 characters
Actions #5

Updated by Toshi MARUYAMA almost 14 years ago

Try the patches in #2664 note-19.

The attached images are screenshots from my Japanese Windows Vista.

Actions #6

Updated by Toshi MARUYAMA over 13 years ago

  • Status changed from New to 7
  • Target version set to 1.2.0
  • % Done changed from 0 to 80
Actions #7

Updated by Toshi MARUYAMA over 13 years ago

  • Status changed from 7 to Closed
  • % Done changed from 80 to 100
  • Resolution set to Fixed

I finished the implementation as of r4944.

It is impossible to prepare a test tarball and test non-UTF-8 path encodings on all OSs, filesystems and languages.

If a tarball contains files with Latin-1 encoded paths, I can't extract it on my Japanese Windows.

Please refer to:
http://mercurial.selenic.com/wiki/EncodingStrategy?action=recall&rev=6
