From: Jakub Narebski <jnareb@gmail.com>
To: Steffen Prohaska <prohaska@zib.de>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Git Mailing List <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Finn Arne Gangstad <finnag@pvv.org>,
	Junio C Hamano <gitster@pobox.com>,
	Michele Ballabio <barra_cuda@katamail.com>
Subject: Re: [RFC/PATCH] git-gui: Use gitattribute "encoding" for file content display
Date: Wed, 23 Jan 2008 11:28:35 +0100
Message-ID: <200801231128.36504.jnareb@gmail.com>
In-Reply-To: <EFF72DA9-A717-44A1-9C5C-649D08BB7E96@zib.de>

On Wed, 23 Jan 2008, Steffen Prohaska wrote:
> On Jan 23, 2008, at 6:55 AM, Junio C Hamano wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>
>>> git-gui: Use gitattribute "encoding" for file content display
>>>
>>> Most folks using git-gui on internationalized files have complained
>>> that it doesn't recognize UTF-8 correctly.  In the past we have just
>>> ignored the problem and showed the file contents as binary/US-ASCII,
>>> which is wrong no matter how you look at it.
>>
>> Hmmm.
>>
>> At least for now in 1.5.4, I'd prefer the way gitk shows UTF-8
>> (if I recall correctly latin-1 or other legacy encoding, as long
>> as LANG/LC_* is given appropriately, as well) contents without
>> per-path configuration without introducing new attributes.
> 
> Shouldn't we first try harder to get things right without adding
> an attribute?  Maybe we could continue a good tradition and look
> at the content of the first: we could first look for hints in the
> file about the encoding.  XML and many text files contain such
> hints already to help editors.  For example,  Python source can
> explicitly contain the encoding [1]; and I guess there are many
> other examples.

For example LaTeX files either use inputenc package to set encoding
(e.g. \usepackage[latin2]{inputenc}) or use magic first line to
specify TCX (TeX character translation) file 
(e.g. %& -translate-file=il2-t1).

Emacs encourages the use of file variables, either in the form of a
magic first line or as file variables at the end of the file; I think
the same is true for Vim.
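
To make the idea of in-file hints concrete, here is a rough sketch in
Python (not git-gui's Tcl, and not code from any patch in this thread).
The pattern list and the find_encoding_hint() name are hypothetical; the
patterns only cover the Emacs/Python "coding:" form, a Vim modeline, and
the LaTeX inputenc line mentioned above. The TCX first line is left out
on purpose, because it names a translation file rather than an encoding.

```python
import re

# Hypothetical hint patterns (an assumption for illustration only).
# Each one captures the declared encoding name in group 1.
HINT_PATTERNS = [
    # Emacs file variable or Python declaration: "-*- coding: utf-8 -*-"
    re.compile(rb"coding[:=]\s*([-\w.]+)"),
    # Vim modeline: "vim: set fileencoding=utf-8 :" or "vim: fenc=latin1"
    re.compile(rb"\bvim?:.*?\b(?:fileencoding|fenc)=([-\w.]+)"),
    # LaTeX: \usepackage[latin2]{inputenc}
    re.compile(rb"\\usepackage\[([-\w]+)\]\{inputenc\}"),
]


def find_encoding_hint(data, head_lines=10, tail_lines=10):
    """Return the encoding name declared in the file, or None.

    Only the first and last few lines are scanned, since hints
    conventionally sit at the top or the bottom of a file.
    """
    lines = data.splitlines()
    for line in lines[:head_lines] + lines[-tail_lines:]:
        for pattern in HINT_PATTERNS:
            match = pattern.search(line)
            if match:
                # Normalise to lower case; the name is not validated here.
                return match.group(1).decode("ascii").lower()
    return None
```

The patterns are deliberately loose, so ordinary text that happens to
contain "coding: something" near the top would be taken as a hint; a
real implementation would want to be stricter, or user-configurable.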


I'd then like this hint detection to be at least as configurable as
diff.*.funcname is for diff.

> If we don't find a direct hint, we could have 
> some magic auto-detection similar to what we do for autocrlf.

We can at least try to check for the UTF-16 magic first two bytes
(the BOM), and detect whether we have a byte sequence which is invalid
in UTF-8 (for performance, I guess, checking only the beginning of the
file)...

> As a fallback the user could specify a default encoding.  But only
> as a last resort, I'd use explicit attributes.

...and then fall back to a fallback encoding, like gitweb does.
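
The detection chain described above (UTF-16 magic bytes, then UTF-8
validity of the beginning of the file, then a fallback) could look
roughly like this. It is a Python sketch under assumptions, not git-gui
or gitweb code: the guess_encoding() name, the 8192-byte sample size and
the "latin-1" default are all made up for illustration.

```python
import codecs


def guess_encoding(data, fallback="latin-1", sample_size=8192):
    """Guess an encoding for raw file content (hypothetical helper)."""
    # 1. UTF-16 byte order mark in the first two bytes.
    #    (A UTF-32 LE BOM starts with the same two bytes; ignored here.)
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # 2. Check only the beginning of the file for invalid UTF-8.
    sample = data[:sample_size]
    whole_file = len(data) <= sample_size
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # With final=False, a multi-byte character cut off at the
        # sample boundary is buffered instead of reported as an error.
        decoder.decode(sample, final=whole_file)
    except UnicodeDecodeError:
        # 3. Not valid UTF-8: use the user-specified fallback.
        return fallback
    return "utf-8"
```

Pure ASCII content comes out as "utf-8", which is harmless because ASCII
is a subset of it. For example, guess_encoding(blob, fallback="latin2")
would report "latin2" for a legacy ISO-8859-2 file, since bytes such as
0xBF cannot start a UTF-8 sequence.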

-- 
Jakub Narebski
Poland

