From: Thomas Singer <thomas.singer@syntevo.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: EGit developer discussion <egit-dev@eclipse.org>,
	Marc Strapetz <marc.strapetz@syntevo.com>,
	git@vger.kernel.org
Subject: Re: [egit-dev] Re: jgit problems for file paths with non-ASCII characters
Date: Thu, 26 Nov 2009 14:09:09 +0100
Message-ID: <4B0E7DF5.9040007@syntevo.com>
In-Reply-To: <20091126005423.GM11919@spearce.org>

> But as you said, this still doesn't make the Apple normal form
> any easier.  Though if we know we are on such a strange filesystem
> we might be able to assume the paths in the repository are equally
> damaged.  Or not.

Well, if the git-core folks could standardize on, e.g., composed UTF-8
(rather than just UTF-8) for storing file names in the repository, then
everything should be clear, shouldn't it?
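
To illustrate what "composed UTF-8" would buy us, here is a minimal
sketch, assuming "composed" means Unicode normalization form NFC. The
class name ComposedPathDemo and the helper toComposedUtf8 are made up
for this example and are not part of jgit. It shows that the same
visible file name has different bytes in composed and decomposed form,
and that normalizing before encoding makes the bytes agree:

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.Arrays;

final class ComposedPathDemo {
    /** Hypothetical helper: normalize a path to composed form (NFC), then encode as UTF-8. */
    static byte[] toComposedUtf8(String path) {
        return Normalizer.normalize(path, Normalizer.Form.NFC)
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String composed = "\u00e4.txt";    // "ä" as one code point
        String decomposed = "a\u0308.txt"; // "a" followed by a combining diaeresis

        // Encoded as-is, the two spellings of the same name differ byte for byte.
        byte[] rawComposed = composed.getBytes(StandardCharsets.UTF_8);
        byte[] rawDecomposed = decomposed.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(rawComposed, rawDecomposed)); // false

        // After normalizing to NFC, both yield the same byte sequence.
        System.out.println(Arrays.equals(
                toComposedUtf8(composed), toComposedUtf8(decomposed))); // true
    }
}
```

This only covers the encoding side; it says nothing about repositories
that already contain decomposed or non-UTF-8 names.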

--
Best regards,
Thomas Singer
=============
syntevo GmbH
http://www.syntevo.com
http://blog.syntevo.com


Shawn O. Pearce wrote:
> Robin Rosenberg <robin.rosenberg@dewire.com> wrote:
>> On Wednesday 25 November 2009 14:47:25, Marc Strapetz wrote:
>>> I have noticed that jgit converts file paths to UTF-8 when querying the
>>> repository.
> ...
>>> Is this a bug or a misconfiguration of my repository? I'm using jgit
>>> (commit e16af839e8a0cc01c52d3648d2d28e4cb915f80f) on Windows.
>> A bug. 
>>
>> The problem here is that we need to allow multiple encodings since there
>> is no reliable encoding specified anywhere.
> 
> This is a design fault of both Linux and git.  git gets a byte
> sequence from readdir and stores that as-is into the repository.
> We have no way of knowing what that encoding is.  So now everyone
> touching a Git repository is screwed.
> 
>> The approach I advocate is
>> the one we use for handling encoding in general. I.e. if it looks like UTF-8,
>> treat it as such, else fall back. This is expensive, however,
> 
> We should try to work harder with the git-core folks to get character
> set encoding for file names worked out.  We might be able to use a
> configuration setting in the repository to tell us what the proper
> encoding should be, and if not set, assume UTF-8.
> 
>> and then we have
>> all the other issues with case-insensitive names and the funny property that
>> Unicode has when it allows characters to be encoded using multiple sequences
>> of code points, as employed by Apple.
> 
> But as you said, this still doesn't make the Apple normal form
> any easier.  Though if we know we are on such a strange filesystem
> we might be able to assume the paths in the repository are equally
> damaged.  Or not.
> 
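
For reference, the "if it looks like UTF-8, treat it as such, else fall
back" approach quoted above could be sketched roughly as follows. This
is only an illustration, not jgit's actual code: the class PathDecoder,
the method decodePath, and the choice of ISO-8859-1 as the fallback in
the usage example are all assumptions made for this sketch. The fallback
charset is a parameter, so it could just as well come from a repository
setting if one were ever agreed on.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class PathDecoder {
    /**
     * Hypothetical helper: decode raw path bytes as strict UTF-8 and,
     * if they are not valid UTF-8, decode them with the given fallback.
     */
    static String decodePath(byte[] raw, Charset fallback) {
        try {
            // REPORT makes the decoder throw instead of silently
            // substituting U+FFFD for malformed input.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the fallback encoding.
            return new String(raw, fallback);
        }
    }

    public static void main(String[] args) {
        String name = "\u00e4.txt";
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = name.getBytes(StandardCharsets.ISO_8859_1);

        // Both byte sequences decode back to the same name.
        System.out.println(decodePath(utf8, StandardCharsets.ISO_8859_1).equals(name));
        System.out.println(decodePath(latin1, StandardCharsets.ISO_8859_1).equals(name));
    }
}
```

The strict decode has to scan every path, which is where the cost
mentioned above comes from, and a byte sequence in some other encoding
can still happen to be valid UTF-8, so this remains a heuristic.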
