From: "Shawn O. Pearce" <spearce@spearce.org>
To: Marc Strapetz <marc.strapetz@syntevo.com>
Cc: EGit developer discussion <egit-dev@eclipse.org>,
git@vger.kernel.org, robin.rosenberg@dewire.com
Subject: Re: [egit-dev] Re: jgit problems for file paths with non-ASCII characters
Date: Thu, 26 Nov 2009 12:03:35 -0800
Message-ID: <20091126200335.GW11919@spearce.org>
In-Reply-To: <4B0E8FF2.8040206@syntevo.com>
Marc Strapetz <marc.strapetz@syntevo.com> wrote:
> > We should try to work harder with the git-core folks to get character
> > set encoding for file names worked out. We might be able to use a
> > configuration setting in the repository to tell us what the proper
> > encoding should be, and if not set, assume UTF-8.
>
> I agree that this should be the ultimate goal, though the default should
> better be "system encoding" for compatibility with current git
> repositories and instead have newer git versions always set encoding to
> UTF-8. Thus, for our jgit clone I've introduced a system property to
> configure Constants.PATH_ENCODING set to system encoding. It's used by
> PathFilter and this resolves my original problem.
That's probably a good point. Using the system encoding on a
repository may produce file names that are more compatible with
git-core. But we probably don't want the encoding to be a single
constant for the whole JVM. We probably need to support a
per-repository configuration of the path name encoding, so that
we can eventually move to a non-platform-specific encoding.
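To make the per-repository idea concrete, here is a rough sketch
and not actual JGit code. It assumes the repository configuration
is available as a plain Map, and the key name "i18n.pathEncoding"
is purely hypothetical (neither git-core nor JGit defines it).
If the key is unset or names an unknown charset, it falls back
to UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class PathEncodingResolver {
    // Hypothetical config key, used only for illustration.
    static final String PATH_ENCODING_KEY = "i18n.pathEncoding";

    /**
     * Picks the charset for path names of one repository.
     * The Map stands in for that repository's config file.
     */
    static Charset resolve(Map<String, String> repoConfig) {
        String name = repoConfig.get(PATH_ENCODING_KEY);
        if (name == null || name.isEmpty()) {
            return StandardCharsets.UTF_8; // not set: assume UTF-8
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.UTF_8;
        }
    }
}
```

Because the lookup takes the repository's own configuration as
its argument, two repositories open in the same JVM can use
different path encodings, which a single JVM-wide constant or
system property cannot express.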
> I have tried to switch more usages from Constants.CHARACTER_ENCODING to
> Constants.PATH_ENCODING, but ended up in confusion due to my lack of
> understanding: primarily because I couldn't tell anymore whether encoded
> strings were file names or not.
Heh. Yeah. There are a number of file name encoding sites. I think
they include everything in the treewalk package, as well as the
GitIndex, Tree and DirCache* classes, plus the Patch class and its
FileHeader friend.
> Does it make sense to explicitly
> distinguish encoding usages in that way? We could try to contribute here
> (and hopefully cause less review effort to jgit developers than the
> changes itself are worth ;-)
Yes, it does, because we eventually need to support encodings
other than the UTF-8 we currently assume for file names,
especially if a repository is using the local filesystem encoding
and that isn't UTF-8.
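One way to keep the two usages apart is to route every path name
conversion through a small helper that carries its own charset,
so a call site says explicitly whether it is handling a path or
other text. The following is only a sketch under that assumption;
PathCodec and its method names are made up for illustration and
do not exist in JGit:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PathCodec {
    // Charset used for path names only; chosen per repository.
    private final Charset pathCharset;

    public PathCodec(Charset pathCharset) {
        this.pathCharset = pathCharset;
    }

    /** Encodes a path name for storage in a tree or index entry. */
    public byte[] encodePath(String path) {
        return path.getBytes(pathCharset);
    }

    /** Decodes the raw bytes of a path name. */
    public String decodePath(byte[] raw) {
        return new String(raw, pathCharset);
    }

    /** Non-path text stays on UTF-8, independent of pathCharset. */
    public static byte[] encodeText(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }
}
```

With this split, the path "\u00fc.txt" encodes to 5 bytes under
ISO-8859-1 and to 6 bytes under UTF-8, while encodeText() gives
the same result no matter which path charset is configured.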
--
Shawn.