From: Jakub Narebski <jnareb@gmail.com>
To: Steffen Prohaska <prohaska@zib.de>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
Git Mailing List <git@vger.kernel.org>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Finn Arne Gangstad <finnag@pvv.org>,
Junio C Hamano <gitster@pobox.com>,
Michele Ballabio <barra_cuda@katamail.com>
Subject: Re: [RFC/PATCH] git-gui: Use gitattribute "encoding" for file content display
Date: Wed, 23 Jan 2008 11:28:35 +0100
Message-ID: <200801231128.36504.jnareb@gmail.com>
In-Reply-To: <EFF72DA9-A717-44A1-9C5C-649D08BB7E96@zib.de>
On Wed, 23 Jan 2008, Steffen Prohaska wrote:
> On Jan 23, 2008, at 6:55 AM, Junio C Hamano wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>
>>> git-gui: Use gitattribute "encoding" for file content display
>>>
>>> Most folks using git-gui on internationalized files have complained
>>> that it doesn't recognize UTF-8 correctly. In the past we have just
>>> ignored the problem and showed the file contents as binary/US-ASCII,
>>> which is wrong no matter how you look at it.
>>
>> Hmmm.
>>
>> At least for now in 1.5.4, I'd prefer the way gitk shows UTF-8
>> (if I recall correctly latin-1 or other legacy encoding, as long
>> as LANG/LC_* is given appropriately, as well) contents without
>> per-path configuration without introducing new attributes.
>
> Shouldn't we first try harder to get things right without adding
> an attribute? Maybe we could continue a good tradition and look
> at the content first: we could first look for hints in the
> file about the encoding. XML and many text files contain such
> hints already to help editors. For example, Python source can
> explicitly contain the encoding [1]; and I guess there are many
> other examples.
For example, LaTeX files either use the inputenc package to set the
encoding (e.g. \usepackage[latin2]{inputenc}) or use a magic first line
to specify a TCX (TeX character translation) file
(e.g. %& -translate-file=il2-t1).

Emacs encourages the use of file variables, either in a magic first
line or in a block at the end of the file; I think the same is true
for Vim.

I'd then like this detection to be at least as configurable as
diff.*.funcname is for diff.
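To make the idea of configurable in-file hints concrete, here is a
minimal sketch in Python. It is not code from git-gui; the names
HINT_PATTERNS and find_encoding_hint and the 1024-byte scan window are
assumptions made for illustration. It searches the start and end of a
file for a few of the markers mentioned above and returns the declared
name as written, leaving the mapping to an actual codec to the caller:

```python
import re

# Hypothetical, user-extensible list of hint patterns (illustration only).
# Each pattern captures the declared encoding name in group 1.
HINT_PATTERNS = [
    # XML declaration: <?xml version="1.0" encoding="ISO-8859-2"?>
    re.compile(rb"<\?xml[^>]*encoding=[\"']([-\w.]+)[\"']"),
    # LaTeX: \usepackage[latin2]{inputenc}
    re.compile(rb"\\usepackage\[([-\w]+)\]\{inputenc\}"),
    # Emacs / Python (PEP 263) cookie: -*- coding: utf-8 -*-
    # This also matches Vim modelines such as "fileencoding=utf-8".
    re.compile(rb"coding[:=]\s*([-\w.]+)"),
]

# Assumed scan window: only the head and tail of the file are inspected.
SCAN_BYTES = 1024


def find_encoding_hint(data, patterns=HINT_PATTERNS):
    """Return the encoding name declared inside the file, or None."""
    head = data[:SCAN_BYTES]
    tail = data[-SCAN_BYTES:]
    for chunk in (head, tail):
        for pattern in patterns:
            match = pattern.search(chunk)
            if match:
                # The captured bytes are ASCII by construction of the patterns.
                return match.group(1).decode("ascii")
    return None
```

The pattern list plays the role that diff.*.funcname plays for diff: a
project could extend it per file type without touching the detection
code. Names such as latin2 would still need to be translated to
whatever the display toolkit calls that encoding.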
> If we don't find a direct hint, we could have
> some magic auto-detection similar to what we do for autocrlf.
We can at least check for the UTF-16 byte order mark in the first two
bytes, and detect whether the file contains a byte sequence that is
invalid in UTF-8 (for performance, I guess, checking only the beginning
of the file)...
> As a fallback the user could specify a default encoding. But only
> as a last resort, I'd use explicit attributes.
...and then fall back to a fallback encoding, like gitweb does.
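The detection chain described above (UTF-16 byte order mark, then a
UTF-8 validity check on the beginning of the file, then a fallback)
could be sketched roughly as follows. This is an illustration in
Python, not code from git-gui or gitweb; the function name
guess_encoding, the 8192-byte probe size and the "latin-1" default are
assumptions:

```python
import codecs


def guess_encoding(data, fallback="latin-1", probe_size=8192):
    """Guess the encoding of `data` (bytes) with cheap heuristics."""
    # 1. UTF-16 byte order mark in the first two bytes.
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"

    # 2. Check only the beginning of the file for invalid UTF-8.
    probe = data[:probe_size]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character that was cut in
        # half by the probe boundary; for a whole file we use final=True.
        decoder.decode(probe, final=len(data) <= probe_size)
    except UnicodeDecodeError:
        # 3. Not valid UTF-8: use the configured fallback encoding.
        return fallback
    return "utf-8"
```

Pure ASCII content is reported as UTF-8 here, which is harmless since
ASCII is a subset of UTF-8. An in-file hint or an explicit attribute,
if present, would take precedence over this guess.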
--
Jakub Narebski
Poland