Defect #2274
closed
Filesystem Repository path encoding of non UTF-8 characters
100%
Description
The filesystem repository has some issues regarding special characters in filenames or directory names.
If the name of a file or directory contains a special character (such as ü, ö or ä), the codepage gets mixed up
and the browser has trouble displaying it (see attached pictures). If one corrects the codepage manually,
the files in the repo are displayed correctly, but every other special character (table headers, etc.) gets messed up.
In addition, if a directory name contains a special character, the directory itself is browsable, but none of its subdirectories or files are.
That is mainly because every '/' after the special character is converted to %2F and every special character is converted to its percent-encoded byte, which completely breaks the URI and results in a 404 (Not Found) error. For instance:
http://redmine.my.domain/repositories/browse/project/Brücke -> http://redmine.my.domain/repositories/browse/project/Br%FCcke
http://redmine.my.domain/repositories/browse/project/Brücke/übersicht.jpg -> http://redmine.my.domain/repositories/browse/project/Br%FCcke%2F%FCbersicht.jpg
Manually replacing every %2F with / did not really help. Redmine then renders the page, but reports that the file does not exist in the repo. Replacing every percent code with its special character leads to a 500 (Internal Server Error).
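The two URLs above can be reproduced outside Redmine. The following is only a sketch in Python (Redmine itself is written in Ruby), and the helper names `encode_whole` and `encode_per_segment` are made up for illustration. It assumes the filenames are stored as Latin-1 bytes, which is what the %FC in the broken URLs suggests. Escaping the whole path in one go also escapes the separator, while escaping each segment on its own keeps the '/' intact.

```python
from urllib.parse import quote

def encode_whole(path: str, charset: str) -> str:
    # Escape the entire path at once: '/' is not treated as safe,
    # so the separator itself becomes %2F.
    return quote(path, safe="", encoding=charset)

def encode_per_segment(path: str, charset: str) -> str:
    # Escape each path segment separately and re-join with a literal '/'.
    return "/".join(quote(seg, safe="", encoding=charset) for seg in path.split("/"))

path = "Brücke/übersicht.jpg"
print(encode_whole(path, "latin-1"))        # Br%FCcke%2F%FCbersicht.jpg
print(encode_per_segment(path, "latin-1"))  # Br%FCcke/%FCbersicht.jpg
print(encode_per_segment(path, "utf-8"))    # Br%C3%BCcke/%C3%BCbersicht.jpg
```

The first line of output matches the path part of the broken URL reported above. The last line shows what the same path looks like when the characters are escaped as UTF-8 rather than Latin-1.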
Setting the codepages under Administration -> Repositories -> Codepages does not affect this in any way. Tried settings:
- UTF-8, ISO-8859-1, ISO-8859-15, CP1252
- ISO-8859-1, ISO-8859-15, UTF-8, CP1252
- ISO-8859-15, ISO-8859-1, UTF-8, CP1252
Environment:
- Redmine 0.7.3.devel.2079 (MySQL)
- ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
- Rails 2.1.0
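For reference, here is a rough Python sketch of what an ordered codepage list means, assuming the setting is a list of candidate encodings that are tried in turn until one decodes the bytes. That is an assumption about the setting, not something confirmed in this report, and `decode_with_fallback` is a hypothetical helper, not a Redmine function. The order matters because ISO-8859-1 accepts any byte sequence, so nothing listed after it is ever reached.

```python
def decode_with_fallback(raw: bytes, encodings):
    """Return (text, encoding_used) for the first encoding that decodes raw."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Bytes are not valid in this encoding (or the name is unknown):
            # move on to the next candidate.
            continue
    # Nothing matched: decode as UTF-8 and substitute undecodable bytes.
    return raw.decode("utf-8", errors="replace"), None

latin1_name = "Brücke".encode("latin-1")   # b'Br\xfccke'
utf8_name = "Brücke".encode("utf-8")       # b'Br\xc3\xbccke'

# UTF-8 first: the Latin-1 bytes are rejected by UTF-8 and fall through.
print(decode_with_fallback(latin1_name, ["UTF-8", "ISO-8859-1", "ISO-8859-15", "CP1252"]))

# ISO-8859-1 first: UTF-8 bytes are "successfully" decoded into mojibake.
print(decode_with_fallback(utf8_name, ["ISO-8859-1", "ISO-8859-15", "UTF-8", "CP1252"]))
```

A fallback list like this only changes how bytes are turned into text. It does not change how the path is escaped into the URL, which would be consistent with the setting having no effect on the 404 described above.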
Updated by Paul Rivier about 16 years ago
- Assignee set to Paul Rivier
Hi Tony,
I can't reproduce it here with characters like é € ä ß or à. My only available environment is Linux, so it might be related to your Windows environment. Can anybody else confirm this bug and narrow it down as much as possible?
Updated by Toni Kerschbaum about 16 years ago
- File repo_enc_bug_01.gif added
- File repo_enc_bug_02.gif added
I just noticed that I forgot to upload the pictures, so here they are.
It could be related to my Windows environment (Windows Server 2003). If I can help you in any way to narrow or track down the problem, please let me know.
Also, the path in the first two pictures remains fully browsable (and the files downloadable), but some other paths and directories do not. Sadly, I can't find any notable difference between those paths and filenames.
Updated by Toshi MARUYAMA almost 14 years ago
- Assignee changed from Paul Rivier to Toshi MARUYAMA
Updated by Toshi MARUYAMA almost 14 years ago
- Subject changed from Filesystem Repository and (german) special chars to Filesystem Repository path encoding of non UTF-8 characters
Updated by Toshi MARUYAMA almost 14 years ago
- File fs-setting.png added
- File fs-browse.png added
Updated by Toshi MARUYAMA almost 14 years ago
- Status changed from New to 7
- Target version set to 1.2.0
- % Done changed from 0 to 80
Updated by Toshi MARUYAMA almost 14 years ago
- Status changed from 7 to Closed
- % Done changed from 80 to 100
- Resolution set to Fixed
I finished implementing this in the commits up to r4944.
It is impossible to prepare a test tarball and test non UTF-8 path encodings on all OSs, filesystems and languages.
For example, if a tarball contains files with Latin-1 encoded paths, I can't extract it on my Japanese Windows.
Please refer to:
http://mercurial.selenic.com/wiki/EncodingStrategy?action=recall&rev=6
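As a rough illustration of the general strategy, and not the code committed in r4944, the sketch below treats on-disk paths as raw bytes in one configured encoding and uses UTF-8 everywhere else. It is written in Python rather than Ruby, and the `path_encoding` parameter and both function names are hypothetical. Each path segment is escaped separately so that '/' survives the round trip.

```python
from urllib.parse import quote, unquote

def fs_path_to_url(raw: bytes, path_encoding: str) -> str:
    # Decode the on-disk bytes with the configured encoding, then
    # percent-encode each segment as UTF-8 for use in a URL.
    text = raw.decode(path_encoding)
    return "/".join(quote(seg, safe="") for seg in text.split("/"))

def url_to_fs_path(url_path: str, path_encoding: str) -> bytes:
    # Reverse direction: unescape each segment (UTF-8), then re-encode
    # to the bytes the filesystem actually stores.
    text = "/".join(unquote(seg) for seg in url_path.split("/"))
    return text.encode(path_encoding)

raw = "Brücke/übersicht.jpg".encode("latin-1")
url = fs_path_to_url(raw, "latin-1")
print(url)                                  # Br%C3%BCcke/%C3%BCbersicht.jpg
print(url_to_fs_path(url, "latin-1") == raw)  # True
```

This also hints at why testing is hard: the raw bytes are only meaningful together with the encoding they were written in, and a machine whose filesystem uses a different encoding may not be able to create those names at all.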