Defect #11089

closed

UTF-8 encoding not showing correctly when looking highlighted php file contents

Added by Troex Nevelin over 12 years ago. Updated over 11 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Text formatting
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Resolution:
Affected version:

Description

Ruby version              1.9.3 (x86_64-linux)
RubyGems version          1.8.11
Rack version              1.4
Rails version             3.2.5
Database adapter          mysql2
Database schema version   20120422150750
Git version               1.7.2.5

When I request a file to view its contents (repository/revisions/HASH/entry), I get '???' instead of the UTF-8 text.
I'm using Git SCM and my files are valid UTF-8 (without BOM). I have this problem with Chinese, Russian, Thai and other non-Latin scripts.
However, diffs and attached UTF-8 files display correctly.


Files

utf-8-not-shown-in-file-contents-view.png (50.4 KB), Troex Nevelin, 2012-06-04 19:13
diff-view-is-okay.png (56.6 KB), Troex Nevelin, 2012-06-04 19:13
gh-new-d7e2a66d.png (53.8 KB), Toshi MARUYAMA, 2012-06-05 15:13
git-show-component.php.txt (2.61 KB), Troex Nevelin, 2012-06-05 15:36
git-show-component.php (2.61 KB), the same file but with a different extension, Troex Nevelin, 2012-06-05 15:39
issue-attached-files.png (15.2 KB), Troex Nevelin, 2012-06-05 15:45
test-file-with-php-ext.png (145 KB), Troex Nevelin, 2012-06-05 15:45
test-file-with-txt-ext.png (93.5 KB), Troex Nevelin, 2012-06-05 15:45
def.php (31 Bytes), gehao liu, 2012-06-11 11:09
def.py (38 Bytes), gehao liu, 2012-06-11 11:09
def.txt (31 Bytes), gehao liu, 2012-06-11 11:09
def.php.png (3.93 KB), Toshi MARUYAMA, 2013-06-04 15:40

Related issues

Has duplicate Redmine - Defect #11131: repository View and Annotate code Utf-8 show ??? ,diff is right (Closed)

Has duplicate Redmine - Defect #14445: <code> inside <pre> destroy polish diacritic of PHP (Closed)

Actions #1

Updated by Etienne Massip over 12 years ago

Did you set any value in the Attachments and repositories encodings setting (in Administration/Settings General tab)?

If not, could you try setting it?

Actions #2

Updated by Troex Nevelin over 12 years ago

Yes, I have tried setting it to UTF-8, but it has no effect.

Actions #3

Updated by Toshi MARUYAMA over 12 years ago

  • Subject changed from UTF-8 encoding not showing correctly when looking file contents to Git: encoding not showing correctly when looking file contents
Actions #4

Updated by Toshi MARUYAMA over 12 years ago

Redmine uses "git show".
source:tags/2.0.1/lib/redmine/scm/adapters/git_adapter.rb#L372

Git 1.7.3.4, "git show --help" says

The contents of the blob objects are uninterpreted sequences of bytes. 
There is no encoding translation at the core level.

Actions #5

Updated by Troex Nevelin over 12 years ago

I understand that git stores files in binary form, but calling this from the console:

git show --no-color HEAD:.../lang/ru/component.php

returns valid UTF-8 text. As I understand it, Redmine tries to guess the encoding and sanitises the content to make sure no invalid characters pass to the view.

For example, source:trunk/config/locales/ja.yml displays correctly (but it uses SVN).

I think there is an encoding-guess problem in source:tags/2.0.1/lib/redmine/codeset_util.rb#L84: does calling .to_utf8_by_setting_internal(str) set the ASCII-8BIT encoding on line 94?
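
To illustrate the difference between bytes that are merely tagged as binary and bytes that are actually invalid, here is a minimal Ruby sketch. The helper name tag_as_utf8 is hypothetical and this is not Redmine's code from codeset_util.rb; it only assumes that SCM output arrives as an ASCII-8BIT string.

```ruby
# Hypothetical helper, NOT Redmine's implementation: tag raw bytes as UTF-8
# and report whether they actually form valid UTF-8.
def tag_as_utf8(raw)
  str = raw.dup.force_encoding(Encoding::UTF_8) # changes the tag, not the bytes
  [str, str.valid_encoding?]
end

# Simulate command output: the bytes of a Russian word, tagged as binary.
raw = "\u041F\u0440\u0438\u0432\u0435\u0442".dup.force_encoding(Encoding::BINARY)

text, valid = tag_as_utf8(raw)
puts raw.encoding  # the binary tag the bytes arrived with
puts text.encoding # the tag after force_encoding
puts valid         # whether the bytes are valid UTF-8
```

If the bytes are valid UTF-8, re-tagging alone should be enough for display; replacement characters such as '?' would only be expected where a conversion step treats the bytes as some other encoding.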

Actions #6

Updated by Toshi MARUYAMA over 12 years ago

I cannot reproduce.
https://github.com/redmine/redmine/commit/d7e2a66d

Could you attach this "git show" output file?

Actions #7

Updated by Troex Nevelin over 12 years ago

git show --no-color HEAD:.../lang/ru/component.php > git-show-component.php.txt

I'm running Redmine on Debian 6 with ruby 1.9.3p125 (2012-02-16) [x86_64-linux], a package compiled from the Debian ruby repository, using the unicorn rack server.

I'm almost sure this is a problem local to my setup. Can you guide me on how to debug it? I'm familiar with Ruby and RoR. I tried to output the raw content in app/views/common/_file.html.erb, but it gives me an ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT) error.
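
For reference, the same kind of encoding error can be reproduced in plain Ruby, outside Rails. This is a sketch under the assumption that the template text is UTF-8 and the file content is still tagged ASCII-8BIT; the variable names are made up for illustration.

```ruby
# UTF-8 text, standing in for the template side.
utf8 = "\u0444\u0430\u0439\u043B: "

# Non-ASCII content still tagged as binary, standing in for raw file bytes.
binary = "\u0442\u0435\u043A\u0441\u0442".dup.force_encoding(Encoding::BINARY)

error_message = nil
begin
  utf8 + binary # both sides contain non-ASCII bytes and the encodings differ
rescue Encoding::CompatibilityError => e
  error_message = e.message
end
puts error_message

# Re-tagging the binary string as UTF-8 makes the concatenation work.
fixed = utf8 + binary.dup.force_encoding(Encoding::UTF_8)
puts fixed.valid_encoding?
```

This only shows why the raw output fails in the view; it does not explain the '???' in the highlighted output.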

Actions #9

Updated by Troex Nevelin over 12 years ago

I've run one more test on my setup: I attached the same file to an issue with two different extensions, .txt and .php. When I view the attached files, I get the problem with the syntax-highlighted one. So this is not only a Git-related problem.

But the problem does not occur here in this ticket.

# grep coderay Gemfile.lock 
    coderay (1.0.6)
  coderay (~> 1.0.6)
Actions #10

Updated by Toshi MARUYAMA over 12 years ago

  • Subject changed from Git: encoding not showing correctly when looking file contents to UTF-8 encoding not showing correctly when looking file contents
  • Category deleted (SCM)
Actions #11

Updated by gehao liu over 12 years ago

  • Status changed from New to Resolved

Same problem!
The .txt extension is OK,
the Python .py extension is OK,
the .php extension is wrong.
The problem is in viewing syntax-highlighted files!

Actions #12

Updated by gehao liu over 12 years ago

gehao liu wrote:

Same problem!
The .txt extension is OK,
the Python .py extension is OK,
the .php extension is wrong.
The problem is in viewing syntax-highlighted files!

Actions #13

Updated by Toshi MARUYAMA over 12 years ago

  • Status changed from Resolved to New
Actions #14

Updated by gehao liu over 12 years ago

This is a bug in coderay 1.0.6; it affects only PHP files.

Actions #15

Updated by András Kolesár over 11 years ago

The coderay PHP encoding issue has been solved:
https://github.com/rubychan/coderay/issues/40

Checked: it works fine with the updated coderay/scanners/php.rb file.

Actions #16

Updated by Etienne Massip over 11 years ago

  • Subject changed from UTF-8 encoding not showing correctly when looking file contents to UTF-8 encoding not showing correctly when looking highlighted file contents
  • Category set to Text formatting
  • Status changed from New to Confirmed
  • Target version set to Candidate for next minor release

Upgrade the coderay dependency to 1.0.9 or 1.1.

Actions #17

Updated by Toshi MARUYAMA over 11 years ago

  • Related to Defect #13692: warning: already initialized constant on Ruby 1.8.7 added
Actions #18

Updated by Toshi MARUYAMA over 11 years ago

  • Related to deleted (Defect #13692: warning: already initialized constant on Ruby 1.8.7)
Actions #19

Updated by Toshi MARUYAMA over 11 years ago

The Coderay version is defined as "~> 1.0.6" in source:tags/2.3.1/Gemfile#L6
So, Coderay 1.0.9 is installed with 2.3.1.

This is the image for the PHP file from note 12.
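
As a quick check of how the "~> 1.0.6" constraint resolves, here is a small sketch using the RubyGems API. The list of versions tried is chosen only for illustration.

```ruby
require "rubygems"

# The pessimistic constraint from the Gemfile: >= 1.0.6 and < 1.1.
requirement = Gem::Requirement.new("~> 1.0.6")

%w[1.0.6 1.0.9 1.1.0].each do |v|
  ok = requirement.satisfied_by?(Gem::Version.new(v))
  puts "#{v}: #{ok}"
end
```

So an install can pick up 1.0.9 without a Gemfile change, while moving to 1.1 would need the constraint to be loosened.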

Actions #20

Updated by Toshi MARUYAMA over 11 years ago

  • Subject changed from UTF-8 encoding not showing correctly when looking highlighted file contents to UTF-8 encoding not showing correctly when looking highlighted php file contents
Actions #21

Updated by Toshi MARUYAMA over 11 years ago

  • Target version deleted (Candidate for next minor release)
Actions #22

Updated by Toshi MARUYAMA over 11 years ago

  • Has duplicate Defect #14445: <code> inside <pre> destroy polish diacritic of PHP added