Feature #2371

closed

character encoding for attachment file

Added by youngseok yi about 15 years ago. Updated over 12 years ago.

Status: Closed
Priority: Low
Assignee: Toshi MARUYAMA
Category: Attachments
Target version:
Start date: 2008-12-22
Due date:
% Done: 100%
Estimated time:
Resolution: Fixed

Description

As of r814, the default encoding for repositories can be configured.

Diff or patch attachments require a similar configuration.

I think the 2nd option may be enough and useful.


Files

attachment-encoding.patch (1017 Bytes), Yuya Nishihara, 2010-06-27 12:14
general-settings.png (28.2 KB), Toshi MARUYAMA, 2011-11-17 11:17

Related issues

Related to Redmine - Defect #9143: Partial diff comparison should be done on actual code, not on html (Closed, Jean-Philippe Lang, 2011-08-29)
Related to Redmine - Defect #4608: Mail attachment name encoding is incorectly handled (Closed, 2010-01-19)
Has duplicate Redmine - Feature #4577: convert text file attached an issue to utf-8. (Closed, 2010-01-14)
Has duplicate Redmine - Defect #3652: Unicode Support for TXT-Files (Closed, 2009-07-22)

Actions #1

Updated by Yuya Nishihara over 13 years ago

youngseok yi wrote:

  • follow encoding of repository.

The attached patch implements it with minimal changes: attachment-encoding.patch

A proper solution would be something like:
  1. move to_utf8 to separate module, e.g. RepoFilesHelper
  2. make AttachmentsHelper and RepositoriesHelper include RepoFilesHelper
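
The two steps above could be sketched roughly as follows. This is only an illustration of the module layout, not the code of the attached patch or of Redmine itself: the body of to_utf8 is a simplified stand-in, it uses String#encode instead of Iconv, and the source_encoding parameter is an assumption standing in for the repository encoding setting.

```ruby
# Step 1: hypothetical shared module holding the conversion helper.
module RepoFilesHelper
  # Simplified stand-in: treat str as being in source_encoding
  # and return a copy transcoded to UTF-8.
  def to_utf8(str, source_encoding = 'UTF-8')
    str.dup.force_encoding(source_encoding).encode('UTF-8')
  end
end

# Step 2: both helpers include the shared module instead of
# each carrying its own copy of to_utf8.
module RepositoriesHelper
  include RepoFilesHelper
end

module AttachmentsHelper
  include RepoFilesHelper
end
```

With this layout, any view code that mixes in AttachmentsHelper gets the same to_utf8 as the repository views.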
Actions #2

Updated by Toshi MARUYAMA almost 13 years ago

  • Assignee set to Toshi MARUYAMA
Actions #3

Updated by Toshi MARUYAMA almost 13 years ago

  • Target version set to Candidate for next major release
Actions #4

Updated by Toshi MARUYAMA almost 13 years ago

  • Target version changed from Candidate for next major release to 1.3.0
Actions #5

Updated by Toshi MARUYAMA over 12 years ago

  • Subject changed from encoding for diff or patch attachment file to encoding for attachment file
Actions #6

Updated by Toshi MARUYAMA over 12 years ago

  • Subject changed from encoding for attachment file to character encoding for attachment file
Actions #7

Updated by Etienne Massip over 12 years ago

Toshi, won't your last commit prevent me from attaching an iso8859-1 encoded patch to this issue and seeing it fine?

Actions #8

Updated by Toshi MARUYAMA over 12 years ago

Etienne Massip wrote:

Toshi, won't your last commit prevent me from attaching an iso8859-1 encoded patch to this issue and seeing it fine?

The goal of this feature is that attachment files and patches are converted according to the repository encodings setting.

Actions #9

Updated by Etienne Massip over 12 years ago

I'm not sure this is a good idea; repositories may return data using a specific encoding, but attachments are usually stored on FS without transformation, so assuming that they're "very likely to be encoded the same way data in SCM is" is not necessarily true.

For example, my encoding list starts with UTF-8, and my locale (Fr) would suggest that files uploaded by users are probably encoded in ISO-8859-15/CP1252; so assuming that the uploaded text files are in UTF-8 means that they will be rendered stripped and that I will probably often lose some chars, which is the current situation.

I would prefer to be able to specify a distinct default encoding for text attachments, which would be ISO-8859-15/CP1252 (it could default to the server's default encoding), and render with something like bom_present?(str) ? str : Iconv.conv('UTF-8', Setting.default_encoding, str).
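
A rough sketch of that idea in current Ruby, where Iconv is no longer part of the standard library. Everything here is an assumption made for illustration: bom_present? is a hypothetical helper that only checks for the UTF-8 byte order mark, render_attachment_text is an invented name, and default_encoding stands in for the proposed attachment setting.

```ruby
# The three bytes of the UTF-8 byte order mark.
UTF8_BOM = "\xEF\xBB\xBF".b.freeze

# Hypothetical helper: true if the raw bytes start with a UTF-8 BOM.
def bom_present?(str)
  str.b.start_with?(UTF8_BOM)
end

# If a BOM is present, trust the file to be UTF-8; otherwise
# transcode from the configured default encoding.
def render_attachment_text(str, default_encoding)
  if bom_present?(str)
    str.dup.force_encoding('UTF-8')
  else
    str.dup.force_encoding(default_encoding).encode('UTF-8')
  end
end
```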

Actions #10

Updated by Toshi MARUYAMA over 12 years ago

UTF-8 is very strict.
It is a very rare case that ISO-8859-1 characters are misinterpreted as UTF-8.
http://groups.google.com/group/thg-dev/browse_thread/thread/6c258628e3fce8/09e9dbe4a030e51d
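
A small illustration of that strictness, using plain Ruby string methods rather than anything from Redmine. The sample word is an arbitrary choice: the byte 0xE9 is "é" in ISO-8859-1, but in UTF-8 it would have to start a three-byte sequence.

```ruby
# "café" encoded as ISO-8859-1: the last byte is 0xE9.
latin1_bytes = "caf\xE9".b

# Read as UTF-8, the bytes fail validation, because 0xE9 announces
# a three-byte sequence and no continuation bytes follow.
as_utf8 = latin1_bytes.dup.force_encoding('UTF-8')
puts as_utf8.valid_encoding?    # false

# Read as ISO-8859-1, every byte is a valid character.
as_latin1 = latin1_bytes.dup.force_encoding('ISO-8859-1')
puts as_latin1.valid_encoding?  # true
puts as_latin1.encode('UTF-8')  # café
```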

Actions #11

Updated by Toshi MARUYAMA over 12 years ago

In Redmine 1.2.2, the repository encoding conversion is done at this line:
source:tags/1.2.2/app/helpers/repositories_helper.rb#L140

With the setting "UTF-8,ISO-8859-1",
if conversion from UTF-8 fails, Redmine converts from ISO-8859-1.
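
The fallback order can be sketched like this. It is a simplified illustration, not the code at the line referenced above: convert_with_fallback is a hypothetical name, and it uses String#valid_encoding? and String#encode where Redmine 1.2.2 used Iconv.

```ruby
# Try each configured encoding in order; the first one in which the
# bytes are valid wins. Hypothetical helper, not Redmine's own code.
def convert_with_fallback(bytes, encoding_list)
  encoding_list.split(',').map(&:strip).each do |name|
    candidate = bytes.dup.force_encoding(name)
    next unless candidate.valid_encoding?
    begin
      return candidate.encode('UTF-8')
    rescue EncodingError
      next # transcoding failed, try the next encoding in the list
    end
  end
  bytes # nothing matched: return the input unchanged
end

latin1 = "caf\xE9".b
puts convert_with_fallback(latin1, 'UTF-8,ISO-8859-1')  # café
```

UTF-8 is tried first and rejected, so the ISO-8859-1 entry is the one that performs the conversion.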

Japanese users work with three encodings: UTF-8, EUC-JP and Shift_JIS (CP932).
This Redmine feature is a big advantage in Japan.

Actions #12

Updated by Etienne Massip over 12 years ago

So if I understand correctly, following the order of the encoding list, it will try and fail to convert the ISO-8859-1 file from UTF-8 to UTF-8, and then try and succeed in converting it from ISO-8859-1 to UTF-8?

Guess it will work...

Actions #13

Updated by Etienne Massip over 12 years ago

What if the administrator does not set UTF-8 at the start of the list?
Can't you str.is_utf8? ? str : try Iconv.conv('UTF-8', Setting.encodings)?

Actions #14

Updated by Toshi MARUYAMA over 12 years ago

Etienne Massip wrote:

repositories may return data using a specific encoding,

It is not true.
SCMs do not have encoding information (metadata) for file contents.
http://mercurial.selenic.com/wiki/EncodingStrategy?action=recall&rev=21#Unknown_byte_strings

Actions #15

Updated by Etienne Massip over 12 years ago

Toshi MARUYAMA wrote:

It is not true.
SCMs do not have encoding information (metadata) for file contents.

Well, that's why I said may :-)

Actions #16

Updated by Toshi MARUYAMA over 12 years ago

Etienne Massip wrote:

What if the administrator does not set UTF-8 at the start of the list?

This is a very rare case in Japan.
The setting "UTF-8,EUC-JP,Shift_JIS" is popular in Japan.
This order is strict.

If a single-byte character set (e.g. ISO-8859-1) is at the start of the list, all characters are converted to UTF-8 from that encoding.
But I think this is a very rare case anywhere in the world.
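
To illustrate why the position of a single-byte encoding matters, here is a small Ruby sketch. It uses plain string methods, not Redmine code, and the sample word is an arbitrary choice.

```ruby
utf8_bytes = "caf\u00E9".b   # valid UTF-8 bytes: 63 61 66 C3 A9

# Every byte sequence is valid ISO-8859-1, so with that encoding first
# the UTF-8 bytes are read as five Latin-1 characters and transcoded.
misread = utf8_bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts misread                 # cafÃ©

# With UTF-8 first, the same bytes validate and pass through unchanged.
correct = utf8_bytes.dup.force_encoding('UTF-8')
puts correct.valid_encoding? # true
puts correct                 # café
```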

Can't you str.is_utf8? ? str : try Iconv.conv('UTF-8', Setting.encodings)?

The default repository encodings setting is empty.
This is equivalent to a default of UTF-8.
And I think it is better for the administrator to set UTF-8 at the start of the list explicitly.

Actions #17

Updated by Toshi MARUYAMA over 12 years ago

  • % Done changed from 0 to 100
Actions #18

Updated by Anton Statutov over 12 years ago

Does this feature fix #4608?

Actions #19

Updated by Mischa The Evil over 12 years ago

Anton Statutov wrote:

Does this feature fix #4608?

I don't think so.

Actions #20

Updated by Toshi MARUYAMA over 12 years ago

  • Status changed from New to Closed
  • Resolution set to Fixed

Committed in r7885.
