git.vger.kernel.org archive mirror
* filenames in repo are in CP1251; I have linux and UTF-8
@ 2011-08-02  7:54 Ilya Basin
  2011-08-02 13:02 ` Dmitry Potapov
From: Ilya Basin @ 2011-08-02  7:54 UTC
  To: git

Hi list! Most of our developers use msysgit, which stores filenames in
CP1251 encoding. Two of us use Cygwin and Linux.

When checking out on either Linux or Cygwin, Russian filenames are garbled.

On Cygwin it is even worse, because the conversion of invalid UTF-8 bytes
to Unicode that Cygwin performs is irreversible, causing git to miss
the files it has just checked out. A workaround for Cygwin is to change
LANG from "C.UTF-8" to "ru_RU.CP1251". It fixes filenames, but breaks
everything else: log (fixed by i18n.logOutputEncoding), status, show,
diff.
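
To make the failure mode concrete, here is a minimal sketch of why a
CP1251 name cannot be read as UTF-8. It assumes iconv is installed and
uses a made-up name, "файл" ("file"), written as octal escapes so the
bytes are explicit; neither detail comes from the setup described above.

```shell
# Hypothetical example name: "файл" encoded in CP1251
# (bytes 0xF4 0xE0 0xE9 0xEB, written in octal for portability).
cp1251_name() { printf '\364\340\351\353'; }

# Decode the bytes as CP1251 and re-encode them as UTF-8 for display.
cp1251_name | iconv -f CP1251 -t UTF-8; echo

# Ask iconv to read the same bytes as UTF-8; it rejects them as invalid.
if cp1251_name | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi
```

Any layer that insists on decoding such a name as UTF-8 has to substitute
something for the invalid bytes, which is where the information is lost.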

Unlike on Cygwin, LANG has no effect for git filenames on Linux. Good
thing, there's no conversion to unicode, so files aren't lost at
checkout.

Question: is there a way to tell git on Linux to use UTF-8 filenames
for the working tree, while storing them in CP1251?


* Re: filenames in repo are in CP1251; I have linux and UTF-8
  2011-08-02  7:54 filenames in repo are in CP1251; I have linux and UTF-8 Ilya Basin
@ 2011-08-02 13:02 ` Dmitry Potapov
From: Dmitry Potapov @ 2011-08-02 13:02 UTC
  To: Ilya Basin; +Cc: git

Hi!

On Tue, Aug 2, 2011 at 11:54 AM, Ilya Basin <basinilya@gmail.com> wrote:
>
> Unlike on Cygwin, LANG has no effect for git filenames on Linux. Good
> thing, there's no conversion to unicode, so files aren't lost at
> checkout.

LANG has no effect on open() or other system calls on Linux, so
all filenames are created with exactly the bytes stored in Git; in your
case, that is CP1251.
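
A rough way to check this yourself, assuming a Linux filesystem that
accepts arbitrary bytes in names, plus iconv and od. The directory, the
name "файл.txt" and the helper list_names_as_utf8 are made up for
illustration:

```shell
# Throwaway directory; its location is arbitrary.
demo_dir=$(mktemp -d)

# Hypothetical file name "файл.txt": the Cyrillic part is given as CP1251
# bytes in octal (0xF4 0xE0 0xE9 0xEB), independent of the current LANG.
cp1251_file="$(printf '\364\340\351\353').txt"
: > "$demo_dir/$cp1251_file"

# Print every name in a directory, re-encoded from CP1251 to UTF-8 for
# display only. The names on disk are left untouched.
list_names_as_utf8() {
    (cd "$1" && for f in *; do printf '%s\n' "$f"; done) |
        iconv -f CP1251 -t UTF-8
}

# Dump the bytes of each stored name: they start with the CP1251 bytes
# f4 e0 e9 eb rather than a UTF-8 sequence.
(cd "$demo_dir" && for f in *; do printf '%s' "$f" | od -An -tx1; done)

# Show the same names in readable form on a UTF-8 terminal.
list_names_as_utf8 "$demo_dir"
```

This only changes how the names are displayed; the files keep their
CP1251 names.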

AFAIK, it should not be difficult to configure CP1251 on Linux. You
generate the required locale and then start xterm with this locale,
specifying some CP1251 font.

I have never done it before, but the following steps seem to work:

# generate CP1251
sudo localedef -c -i ru_RU -f CP1251 ru_RU.CP1251

# make sure that it was generated
# you should see ru_RU.cp1251 in the output
locale -a | grep ru_RU

# start xterm
LANG=ru_RU.cp1251 xterm -fn '-monotype-courier new-semilight-r-normal--0-0-0-0-c-0-microsoft-cp1251' &

You can try another font. See the output of xlsfonts:
  xlsfonts | grep -i cp1251

>
> Question: is there a way to tell git on Linux to use UTF-8 filenames
> for the working tree, while storing them in CP1251?

The short answer is no. A more detailed answer is that there was
a series of patches intended to allow filename conversion:

http://article.gmane.org/gmane.comp.version-control.git/119224

but a proper implementation turned out to be very invasive and
no one seems to care about it very deeply, so it was dropped.
See details here:

http://article.gmane.org/gmane.comp.version-control.git/122860
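
If the goal is only to read the tracked names on a UTF-8 terminal, a
display-side sketch is possible without changing what is stored or
checked out. It assumes iconv and a git new enough to have the -C option
(1.8.5 or later); the repository and the file name are hypothetical, and
the names in the working tree remain CP1251:

```shell
# Throwaway repository with one file whose name is CP1251-encoded
# ("файл.txt", Cyrillic bytes written in octal).
repo=$(mktemp -d)
git init -q "$repo"
: > "$repo/$(printf '\364\340\351\353').txt"
git -C "$repo" add .

# core.quotePath=false makes git print bytes above 0x80 as they are
# instead of octal-escaping them; iconv then re-encodes the listing
# from CP1251 to UTF-8 for display.
git -C "$repo" -c core.quotePath=false ls-files | iconv -f CP1251 -t UTF-8
```

This does not answer the original question, since git still creates and
looks up the files under their CP1251 names.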


Dmitry
