Feature #3396

closed

Git: use --encoding=UTF-8 in "git log"

Added by Vitaliy Ischenko over 15 years ago. Updated over 13 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Toshi MARUYAMA
Category:
SCM
Target version:
Start date:
2009-05-20
Due date:
% Done:

0%

Estimated time:
Resolution:
Fixed

Description

The global setting for repository log encoding is useless for Git.
Git has its own config option, i18n.logoutputencoding: if it is empty, the log encoding is UTF-8;
otherwise the value specified by the option is used.


Related issues

Related to Redmine - Defect #3196: Don't properly support encoding of repositories (git) (Closed, Toshi MARUYAMA, 2009-04-17)
Related to Redmine - Feature #1735: Per project repository log encoding setting (Closed, Toshi MARUYAMA, 2008-08-03)
Related to Redmine - Defect #5251: Git: Repository path encoding of non UTF-8 characters (Closed, Toshi MARUYAMA, 2010-04-07)
Related to Redmine - Defect #2664: Mercurial: Repository path encoding of non UTF-8 characters (Closed, Toshi MARUYAMA, 2009-02-04)
Related to Redmine - Defect #4773: Redmine+Git+PostgresSQL 8.4 fails with linux kernel tree (encoding) (Closed, Jean-Philippe Lang, 2010-02-09)
Related to Redmine - Defect #7597: Subversion and Mercurial log have the possibility to miss encoding (Closed, Toshi MARUYAMA, 2011-02-10)

Actions #1

Updated by Jean-Philippe Lang over 15 years ago

This is pretty vague. What do you expect exactly?
I'm not a git user, so any detail is welcome.
Thanks.

Actions #2

Updated by Vitaliy Ischenko over 15 years ago

Git has a per-repository config option, i18n.logOutputEncoding, which stores the encoding used for log output from git-log.

From http://www.kernel.org/pub/software/scm/git/docs/git-config.html

i18n.logOutputEncoding
   Character encoding the commit messages are converted to when running git-log and friends.

If it is empty or unset, the output will be UTF-8 encoded;
otherwise the value specified in this option will be used.

You can get this value with `git config i18n.logOutputEncoding`.
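
For illustration, the rule described in this comment could be sketched in Ruby roughly as follows. This is a minimal sketch, not Redmine code: the method names `effective_log_encoding` and `repository_log_encoding` are hypothetical, and the fallback to UTF-8 simply follows the rule as stated above.

```ruby
require "open3"

# Hypothetical helper (not Redmine code): apply the rule quoted above.
# `configured` is the raw output of `git config i18n.logOutputEncoding`;
# nil or a blank string means the option is unset.
def effective_log_encoding(configured)
  value = configured.to_s.strip
  value.empty? ? "UTF-8" : value
end

# Hypothetical wrapper: ask git for the per-repository value.
# `git config <key>` prints nothing when the key is unset, so the
# helper above then falls back to UTF-8.
def repository_log_encoding(git_dir)
  out, _status = Open3.capture2("git", "--git-dir=#{git_dir}",
                                "config", "i18n.logOutputEncoding")
  effective_log_encoding(out)
end
```

The pure helper is kept separate from the git call so that the decision rule can be checked without a repository at hand.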

Actions #3

Updated by Jean-Philippe Lang over 15 years ago

  • Tracker changed from Defect to Feature
Actions #4

Updated by Toshi MARUYAMA almost 14 years ago

  • Status changed from New to 7
  • Assignee set to Toshi MARUYAMA
Actions #5

Updated by Toshi MARUYAMA almost 14 years ago

Actions #6

Updated by Toshi MARUYAMA almost 14 years ago

Additional reference.
http://www.kernel.org/pub/software/scm/git/docs/git.html

-c <name>=<value>

Pass a configuration parameter to the command. The value given will override values from configuration files. The <name> is expected in the same format as listed by git config (subkeys separated by dots).
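
As a rough sketch of how this could be used, the Ruby snippet below builds the argument list for "git log" in the two ways discussed in this ticket: with the `--encoding` option of git log, or with the `-c` override quoted above. The method name `git_log_command` and its parameters are hypothetical and are not taken from the Redmine source; only the git options themselves come from the git documentation.

```ruby
# Hypothetical helper (not Redmine code): build the argv for "git log"
# so that commit messages are re-coded to `encoding` on output.
def git_log_command(encoding = "UTF-8", use_config_override: false)
  if use_config_override
    # -c is an option of git itself, so it goes before the subcommand.
    ["git", "-c", "i18n.logOutputEncoding=#{encoding}", "log"]
  else
    # --encoding is an option of the log subcommand.
    ["git", "log", "--encoding=#{encoding}"]
  end
end
```

Returning an array rather than a single string lets the caller pass the arguments to `Open3.capture2` or `system` without shell quoting issues.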

Actions #7

Updated by Jean-François Dagenais over 13 years ago

I wrote an answer to Weverton Morais about how I patched a problem we had that I believe is related to this ticket. I maintain a modified Linux kernel git repo, so there are lots of international names in there. I narrowed it down to a simple scenario that reproduces it.

Try making a dummy git commit with this name:

git commit -am"dummy test character encoding" --allow-empty --author="blaŻbla <tata@toto.com>" 

Then do the changeset fetch, I use

ruby script/runner "Repository.fetch_changesets" 

or the /sys/fetch_changesets with the key.

The logs will show a collation error on a query. We use git on Linux platforms and never worried about encoding, so I believe our platforms default to UTF-8.

As my answer said, the problem seemed to be that all of the tables created by Redmine (or by TurnKey Linux, the base of our install?) defaulted to latin1. In any case, the fetch_changesets code should account for the difference in encoding if needed.

Actions #8

Updated by Jean-François Dagenais over 13 years ago

... so the point is, it's not just the file paths inside the repo, or the commit logs, but all text contained within the repo it seems.

Actions #9

Updated by Vitaliy Ischenko over 13 years ago

Jean-François Dagenais wrote:

... so the point is, it's not just the file paths inside the repo, or the commit logs, but all text contained within the repo it seems.

According to the docs this is false: i18n.commitencoding relates only to the log message; all other parts (file paths, author, committer and other commit object headers) should be treated as uninterpreted sequences of non-NUL bytes.
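
Since those header fields are plain bytes, a consumer that stores them as text has to pick an interpretation itself. A minimal Ruby sketch of one possible approach is shown below, assuming the bytes are usually UTF-8 and using ISO-8859-1 as an arbitrary fallback. The method name `to_utf8` and the choice of fallback are assumptions for illustration, not what Redmine does.

```ruby
# Hypothetical helper (not Redmine code): turn raw header bytes
# (author, committer, path) into a valid UTF-8 string.
def to_utf8(raw, fallback = "ISO-8859-1")
  str = raw.dup.force_encoding("UTF-8")
  # Keep the bytes untouched when they already form valid UTF-8.
  return str if str.valid_encoding?

  # Otherwise reinterpret the bytes in the fallback encoding and
  # transcode, replacing anything that cannot be converted.
  raw.dup.force_encoding(fallback)
     .encode("UTF-8", invalid: :replace, undef: :replace, replace: "?")
end
```

With this sketch, an author name such as the "blaŻbla" example from comment #7 passes through unchanged when its bytes are already UTF-8, while Latin-1 bytes are transcoded instead of producing an invalid string.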

Actions #10

Updated by Toshi MARUYAMA over 13 years ago

  • Subject changed from read Git log encoding from i18n.logoutputencoding to Git: use --encoding=UTF-8 in "git log"
Actions #11

Updated by Toshi MARUYAMA over 13 years ago

  • Status changed from 7 to Closed
  • Target version set to 1.2.0
  • Resolution set to Fixed

Implemented in commits up to r4964.
