From: Junio C Hamano <gitster@pobox.com>
To: Jay Soffian <jaysoffian@gmail.com>
Cc: John Tapsell <johnflux@gmail.com>,
	Teemu Likonen <tlikonen@iki.fi>,
	git@vger.kernel.org
Subject: Re: non-ascii filenames issue
Date: Sun, 05 Apr 2009 12:29:00 -0700
Message-ID: <7vfxgmrjb7.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <76718490904050923j105e383dsf650afa0a0687858@mail.gmail.com> (Jay Soffian's message of "Sun, 5 Apr 2009 12:23:35 -0400")

Jay Soffian <jaysoffian@gmail.com> writes:

> On Sun, Apr 5, 2009 at 6:51 AM, John Tapsell <johnflux@gmail.com> wrote:
>> Unfortunately not, because for some absolutely crazy reason
>
> Bzzt. http://article.gmane.org/gmane.comp.version-control.git/50830

I do not think the message gives enough information on the issue, as "a
pathname is a slash separated sequence of path components terminated with
a NUL, and a path component is an uninterpreted sequence of bytes
excluding NUL and slash" is simply a UNIX tradition the original git
design took as _given_, so the "some absolutely crazy reason" comment does
not even deserve refuting.

There is _no_ reason, crazy or otherwise.  If you start from "a pathname
is an uninterpreted sequence of bytes" tradition, it is a design parameter
and "how things are", and you simply do not argue with them.  And the
message you quoted doesn't, either.

	Side note: I am not saying that we should not ever change that
	particular design parameter.  I am just explaining why 50830 is
	not a good counterargument to quote against the "some absolutely
	crazy reason" accusation.
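
The byte-sequence view of pathnames described above can be illustrated with a
minimal Python sketch. It is not git code; `split_path` and the sample
filenames are hypothetical and chosen only to show that names which look the
same to a human are distinct pathnames under this rule.

```python
def split_path(path: bytes) -> list[bytes]:
    """Split a slash-separated byte pathname into its components.

    Components are uninterpreted bytes: no encoding is assumed or checked.
    """
    if b"\x00" in path:
        raise ValueError("NUL terminates a pathname and cannot appear inside it")
    # Drop empty components produced by leading, trailing, or doubled slashes.
    return [component for component in path.split(b"/") if component]


# Three byte sequences that a human would all read as the same filename
# (hypothetical examples):
nfc = "caf\u00e9.txt".encode("utf-8")       # precomposed e-acute, UTF-8
nfd = "cafe\u0301.txt".encode("utf-8")      # "e" + combining acute, UTF-8
latin1 = "caf\u00e9.txt".encode("latin-1")  # precomposed e-acute, Latin-1

# Compared byte for byte, they are three different names.
distinct_names = {nfc, nfd, latin1}
```

Under the "uninterpreted sequence of bytes" tradition, all three entries in
`distinct_names` could coexist in one directory as separate files.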

> And, as always, patches welcomed.

Before patches, you need a sound design and justification.

At least you need to consider the following (the early ones are easier):

 - Do we unify pathnames to some canonical encoding internally and do the
   matching in the canonical space?   What's the internal representation
   (presumably UTF-8)?

 - How should a user tell git the pathname conversion rules between the
   internal representation and the filesystem representation?  A
   per-repository config variable?

 - How should this interact with patch+apply dataflow (including "rebase"
   without -i/-m)?  Should pathnames in diffs be in canonical form?

 - How should this interact with case-challenged and/or Unicode-corrupting
   filesystems such as NTFS and HFSplus whose creat(), readdir(), and
   stat() contradict each other?

 - What should happen when the pathname in the canonical representation
   recorded in the history cannot be externalized on a particular
   filesystem?  Does it degrade gracefully and give some escape hatch,
   and if so how?
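
To make the first and last questions above concrete, here is a minimal Python
sketch of one possible scheme. It is an illustration only, not a proposal
from this thread: it assumes UTF-8 in NFC as the canonical form (the message
says only "presumably UTF-8"), and `to_canonical`, `externalize`, and the
`fs_encoding` parameter are hypothetical names standing in for whatever
conversion rule a user would configure.

```python
import unicodedata


def to_canonical(fs_name: bytes, fs_encoding: str) -> bytes:
    """Convert a filesystem name to the assumed canonical form (UTF-8, NFC)."""
    text = fs_name.decode(fs_encoding)
    return unicodedata.normalize("NFC", text).encode("utf-8")


def externalize(canonical: bytes, fs_encoding: str) -> bytes:
    """Convert a canonical name back to a filesystem's encoding.

    Raises UnicodeEncodeError when the name cannot be represented there.
    """
    return canonical.decode("utf-8").encode(fs_encoding)


# The same visible name as produced by three hypothetical filesystems.
from_decomposing_fs = "cafe\u0301.txt".encode("utf-8")  # decomposed form
from_utf8_fs = "caf\u00e9.txt".encode("utf-8")          # precomposed form
from_latin1_fs = "caf\u00e9.txt".encode("latin-1")

# Matching in the canonical space: all three map to one canonical name.
canonical_names = {
    to_canonical(from_decomposing_fs, "utf-8"),
    to_canonical(from_utf8_fs, "utf-8"),
    to_canonical(from_latin1_fs, "latin-1"),
}

# A canonical name that a Latin-1 filesystem cannot hold.
japanese = "\u65e5\u672c\u8a9e.txt".encode("utf-8")
try:
    externalize(japanese, "latin-1")
    externalizable = True
except UnicodeEncodeError:
    externalizable = False  # the case that needs an "escape hatch"
```

The sketch leaves the hard parts open: it does not say what the escape hatch
should be, how diffs should spell pathnames, or how to cope with a filesystem
that silently rewrites the bytes it was given.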


Thread overview: 11+ messages
2009-04-05  9:36 non-ascii filenames issue Gregory Petrosyan
2009-04-05  9:54 ` Teemu Likonen
2009-04-05 10:01   ` Gregory Petrosyan
2009-04-05 10:51     ` John Tapsell
2009-04-05 16:23       ` Jay Soffian
2009-04-05 19:29         ` Junio C Hamano [this message]
2009-04-05 20:22           ` Jay Soffian
2009-04-06  7:28       ` Peter Krefting
2009-04-06  9:12         ` Johannes Schindelin
2009-04-06 22:33           ` Dmitry Potapov
2009-04-07  8:26         ` demerphq
