git.vger.kernel.org archive mirror
From: Sam Vilain <sam.vilain@catalyst.net.nz>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Peter Karlsson <peter@softwolves.pp.se>,
	Mark Junker <mjscod@web.de>, Pedro Melo <melo@simplicidade.org>,
	Martin Langhoff <martin.langhoff@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Dmitry Potapov <dpotapov@gmail.com>, Kevin Ballard <kevin@sb.org>
Subject: Re: [PATCH] [RFC] Design for pathname encoding gitattribute [RESEND]
Date: Tue, 22 Jan 2008 22:57:27 +1300
Message-ID: <4795BE07.4040500@catalyst.net.nz>
In-Reply-To: <7vr6gatidd.fsf@gitster.siamese.dyndns.org>

Junio C Hamano wrote:
> To support the above scenarios, I think each instance of
> repository needs to be able to say "this path (specified with a
> matching pattern in the filename encoding) should be converted
> this way coming in, and that way going out."  UTF-8 only project
> would have NKC<->NKD on HFS+ partition, and nothing on
> everywhere else.

I think there is another reason to do this - simple sanity.  Two people
adding the same filename should not end up with different tree IDs just
because they, for whatever reason, entered different but equivalent
spellings that share the same Unicode NFC form.
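
To make that concrete, here is a minimal sketch in Python.  It is not Git
code: name_digest() is a hypothetical stand-in that only hashes the name's
UTF-8 bytes, on the assumption that a tree ID depends on the exact bytes of
each entry name.  The example filename is made up.

```python
import hashlib
import unicodedata

# The same visible filename, entered two different ways.
composed = "caf\u00e9.txt"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301.txt"  # "e" followed by U+0301 COMBINING ACUTE ACCENT


def name_digest(name: str) -> str:
    """Hypothetical stand-in for a tree ID: hash the UTF-8 bytes of the name."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


def normalized(name: str) -> str:
    """Return the NFC form of a name."""
    return unicodedata.normalize("NFC", name)


# The raw byte sequences differ, so anything hashing them differs too.
bytes_differ = composed.encode("utf-8") != decomposed.encode("utf-8")

# After normalizing to one agreed form, both spellings are identical.
same_after_nfc = normalized(composed) == normalized(decomposed)
```

With a project-wide normalized form, both people would record the same
bytes and so arrive at the same ID.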

But that rule of sanity breaks the sanity of C (byte-for-byte) semantics,
so it must be a per-project setting.  Not a necessity, but a good feature
I think.  It can be enforced with external scripts/hooks of course.
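
As a rough idea of what such an external check might look like, here is a
sketch in Python.  The function name non_normalized_paths() and the choice
of NFC as the project's form are assumptions for illustration.  How a hook
would obtain the list of paths from Git is left out.

```python
import unicodedata
from typing import Iterable, List


def non_normalized_paths(paths: Iterable[str], form: str = "NFC") -> List[str]:
    """Return the paths that are not already in the given normalization form.

    A hook could reject the commit when this list is non-empty.
    """
    return [p for p in paths if unicodedata.normalize(form, p) != p]


# Hypothetical input: one composed name, one decomposed name, one ASCII name.
offenders = non_normalized_paths(
    ["docs/caf\u00e9.txt", "docs/cafe\u0301.txt", "README"]
)
```

Only the decomposed spelling is reported, since the other two names are
already in NFC.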

What happens on the way in and out of the filesystem, I see that as a
side issue.  Once you define what the normalized form is for the
project, then the features should just fall into place without messy
heuristics.  There is also a correct behaviour when faced with
filesystems that have a different idea about who enforces encoding rules
- so long as you can detect what those ideas are :).  It also means that
users can choose to use the same local encoding as their locale, which
might interoperate better with other apps.

The readdir() (case|normalization) tolerance change is good in its own
right, but it's a slightly different scenario, and a question independent
of what the normalized form is.  Of course, on case-folding,
Unicode-normalizing filesystems you'd have to have a mixture of these
settings for sane operation.
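
For illustration only, a tolerant lookup of that kind could be sketched in
Python as below.  This is not the change being discussed: match_key() and
find_entry() are hypothetical names, and comparing NFC plus casefold() is
just one plausible definition of "the same name".

```python
import unicodedata
from typing import Iterable, Optional


def match_key(name: str) -> str:
    """Comparison key that ignores both case and normalization differences."""
    folded = unicodedata.normalize("NFC", name).casefold()
    # Case folding can produce non-normalized text, so normalize once more.
    return unicodedata.normalize("NFC", folded)


def find_entry(wanted: str, entries: Iterable[str]) -> Optional[str]:
    """Return the directory entry that matches `wanted` under match_key().

    `entries` stands in for the names a readdir() loop would return.
    Returns None when nothing matches.
    """
    key = match_key(wanted)
    for entry in entries:
        if match_key(entry) == key:
            return entry
    return None
```

The key is used only for comparison; the name actually stored would still
be whatever the project's normalized form says it should be.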

On the chicken and egg thing, I guess .gitattributes is too late, you're
right - unless you say that at each directory level, the globbing is
always C.  But I haven't thought about that very hard.  I was just
re-using a mechanism that already exists rather than try to invent
something new.  I do agree with Dscho's point that mixing encodings in a
repository is not necessarily a use case worth catering for.

Sam.

