From: Karsten Blees <karsten.blees@gmail.com>
To: Ken Ismert <kismert@gmail.com>
Cc: msysgit@googlegroups.com, git@vger.kernel.org
Subject: Re: Script for handling UTF-16 files
Date: Wed, 10 Apr 2013 20:59:56 +0200
Message-ID: <5165B6AC.5090403@gmail.com>
In-Reply-To: <608a349b-cc71-4cba-9197-3783049e9f47@googlegroups.com>
On 10.04.2013 01:47, Ken Ismert wrote:
>
> I bumped into the UTF-16 display problem with Git Extensions running on top of msysGit. After lots of searching and experimenting, I came up with a solution that works for me.
>
> Note: Please see questions below.
>
> This method is for MSysGit 1.8.1, and is tested on Windows XP. I use Git Extensions 2.44, but since the changes are at the Git level, they should work for Git Gui as well. Steps:
There has been a discussion about handling UTF-16 on the git ML a while back, see http://thread.gmane.org/gmane.comp.version-control.git/159708
As suggested there, I would try to use a clean/smudge filter, i.e. store UTF-16 files as UTF-8 in the repository and convert them back to UTF-16 on checkout. That way git can treat your UTF-16 files as text in most cases: you can merge them, git-grep works, and gitattributes work (eol-conversion, ident-replacement, built-in diff patterns...).
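To make the clean/smudge idea concrete, here is a minimal sketch of such a filter as a small Python script. It is only an illustration under stated assumptions, not something that ships with git or msysGit: the script name utf16filter.py, the filter name "utf16", and the choice of little-endian UTF-16 with a BOM on checkout are all hypothetical. If iconv is available, plain iconv calls in both directions would presumably do the same job.

```python
import codecs
import sys


def clean(data: bytes) -> bytes:
    """Working tree -> repository: decode UTF-16, emit UTF-8.

    The generic "utf-16" codec uses the BOM, if present, to pick the
    byte order and drops it from the decoded text.
    """
    return data.decode("utf-16").encode("utf-8")


def smudge(data: bytes) -> bytes:
    """Repository -> working tree: decode UTF-8, emit UTF-16.

    Assumption: the working tree wants little-endian UTF-16 with a BOM,
    which is what most Windows tools write.
    """
    return codecs.BOM_UTF16_LE + data.decode("utf-8").encode("utf-16-le")


# Filter entry point: git pipes the file content through stdin/stdout.
# Does nothing unless called with exactly one argument, "clean" or "smudge".
if __name__ == "__main__" and sys.argv[1:] in (["clean"], ["smudge"]):
    convert = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.buffer.write(convert(sys.stdin.buffer.read()))
```

With the hypothetical names above, the filter would be wired up by setting filter.utf16.clean to "python utf16filter.py clean" and filter.utf16.smudge to "python utf16filter.py smudge" in the git config, and by adding a line such as "*.txt filter=utf16" to .gitattributes (the file pattern is only an example). Note that the sketch does not preserve the original byte order or the absence of a BOM, so a file checked in as big-endian UTF-16 would come back as little-endian.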
If you use a textconv filter, UTF-16 content will be treated as binary by most git operations.
There's also an 'encoding' attribute and a 'gui.encoding' setting which in theory should solve your issue (i.e. specify the encoding of files for display by GUI tools). I don't know if Git Extensions supports that, or whether it's supposed to work for binary files at all.
> 3) Modify the global ~/Git/etc/gitconfig or your local ~/.git/config file, and add these lines:
>
> [diff "astextutf16"]
> textconv = astextutf16
Why not simply "textconv = iconv -f utf-16 -t utf-8", without the extra script?
> c) I had success with iconv, but is there any built-in UTF-16 to UTF-8 converter that ships with msysGit?
There are ready-to-use UTF-conversion functions in the codebase, but these are not accessible as a git command or built-in filter.
> As a quick fix, how hard would it be to add a 'utf16' diff filter, similar to cpp or csharp? Or is this simply the wrong place to put in a work-around?
As described above, I think a diff filter is not the right tool for the job. The only universal format for text content that works reasonably well with established text-based technologies (merge algorithms, regex etc.) is UTF-8. If we want to benefit from these technologies, git should store text files as UTF-8 and convert from / to platform-specific formats on checkin / checkout or for display.
Bye,
Karsten