From: Jakub Narebski <jnareb@gmail.com>
To: Steffen Prohaska <prohaska@zib.de>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Git Mailing List <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Finn Arne Gangstad <finnag@pvv.org>,
	Junio C Hamano <gitster@pobox.com>,
	Michele Ballabio <barra_cuda@katamail.com>
Subject: Re: [RFC/PATCH] git-gui: Use gitattribute "encoding" for file content display
Date: Wed, 23 Jan 2008 11:28:35 +0100
Message-ID: <200801231128.36504.jnareb@gmail.com>
In-Reply-To: <EFF72DA9-A717-44A1-9C5C-649D08BB7E96@zib.de>

On Wed, 23 Jan 2008, Steffen Prohaska wrote:
> On Jan 23, 2008, at 6:55 AM, Junio C Hamano wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>
>>> git-gui: Use gitattribute "encoding" for file content display
>>>
>>> Most folks using git-gui on internationalized files have complained
>>> that it doesn't recognize UTF-8 correctly.  In the past we have just
>>> ignored the problem and showed the file contents as binary/US-ASCII,
>>> which is wrong no matter how you look at it.
>>
>> Hmmm.
>>
>> At least for now in 1.5.4, I'd prefer the way gitk shows UTF-8
>> (if I recall correctly latin-1 or other legacy encoding, as long
>> as LANG/LC_* is given appropriately, as well) contents without
>> per-path configuration without introducing new attributes.
> 
> Shouldn't we first try harder to get things right without adding
> an attribute?  Maybe we could continue a good tradition and look
> at the content first: we could look for hints in the
> file about the encoding.  XML and many text files contain such
> hints already to help editors.  For example,  Python source can
> explicitly contain the encoding [1]; and I guess there are many
> other examples.

For example, LaTeX files either use the inputenc package to set the
encoding (e.g. \usepackage[latin2]{inputenc}) or use a magic first line
to specify a TCX (TeX character translation) file
(e.g. %& -translate-file=il2-t1).

Emacs encourages the use of file variables, either in a magic first
line or in a block at the end of the file; I think the same is true
for Vim.
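
To make the idea concrete, here is a rough sketch in Python (not git-gui's
Tcl, and not a proposed patch) of scanning the start and end of a file for
such hints.  The function name, the 1024-byte window and the small table of
inputenc names are assumptions made for illustration.  The first pattern is
loosely modelled on the "coding[:=]" match that Python uses for its own
source files; it is deliberately permissive, so it can misfire on ordinary
prose that happens to contain "coding:".  The TCX first line is not handled.

```python
import re
from typing import Optional

# Matches Emacs "coding: NAME" (first line or Local Variables block) and
# Vim "fileencoding=NAME" / "fenc=NAME" modelines.
EDITOR_HINT = re.compile(rb"(?:coding[:=]|fenc=)\s*([-\w.]+)")

# Matches LaTeX "\usepackage[NAME]{inputenc}".
LATEX_HINT = re.compile(rb"\\usepackage\[([-\w]+)\]\{inputenc\}")

# Illustrative subset of inputenc option names mapped to codec names.
LATEX_NAMES = {"latin1": "iso-8859-1", "latin2": "iso-8859-2", "utf8": "utf-8"}


def find_encoding_hint(data: bytes, window: int = 1024) -> Optional[str]:
    """Return an encoding name declared inside the file, or None."""
    head, tail = data[:window], data[-window:]

    # Editor hints may sit at the top or at the bottom of the file.
    for chunk in (head, tail):
        match = EDITOR_HINT.search(chunk)
        if match:
            return match.group(1).decode("ascii").lower()

    # The LaTeX preamble is only expected near the top.
    match = LATEX_HINT.search(head)
    if match:
        name = match.group(1).decode("ascii").lower()
        return LATEX_NAMES.get(name, name)

    return None
```

A caller would try find_encoding_hint(contents) first and move on to
other detection only when it returns None.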


I'd then like this detection to be at least as configurable as
diff.*.funcname is for diff.

> If we don't find a direct hint, we could have 
> some magic auto-detection similar to what we do for autocrlf.

We can at least check for the UTF-16 magic first two bytes (the BOM),
and detect whether the file contains a byte sequence which is invalid in
UTF-8 (for performance, I guess, checking only the beginning of the
file)...

> As a fallback the user could specify a default encoding.  But only
> as a last resort, I'd use explicit attributes.

...and then fall back to a default encoding, like gitweb does.
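
The chain sketched in the last two replies (BOM check, then UTF-8 validity
on the start of the file, then a fallback) could look roughly like the
following.  Again this is Python for illustration only; the function name,
the 4096-byte sample size and the "latin-1" default are assumptions, not
anything git-gui or gitweb actually implements in this form.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "latin-1",
                   sample: int = 4096) -> str:
    """Guess an encoding from the first bytes of a file."""
    # UTF-16 files commonly start with a two-byte byte order mark.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # Only inspect the beginning of the file, for performance.
    head = data[:sample]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # If the sample was cut short, a multi-byte character may be
        # split at the boundary; final=False tolerates that case.
        decoder.decode(head, final=len(data) <= sample)
    except UnicodeDecodeError:
        # Invalid UTF-8 found: use the user-configured fallback.
        return fallback
    return "utf-8"
```

Pure ASCII content is reported as "utf-8" here, which is harmless since
ASCII is a subset of UTF-8.  Content that is invalid UTF-8 only past the
sampled prefix will be misdetected; that is the price of checking only
the beginning of the file.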

-- 
Jakub Narebski
Poland
