git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH] Fix "inside work tree" detection on case-insensitive filesystems
Date: Fri, 1 Jan 2016 17:24:13 +0100	[thread overview]
Message-ID: <5686A82D.6020105@alum.mit.edu> (raw)
In-Reply-To: <alpine.DEB.2.20.1601011554240.14434@virtualbox>

On 01/01/2016 03:58 PM, Johannes Schindelin wrote:
> Hi Michael,
> 
> On Tue, 29 Dec 2015, Michael Haggerty wrote:
> 
>> On 09/28/2015 06:12 PM, Johannes Schindelin wrote:
>>> Git has a config variable to indicate that it is operating on a file
>>> system that is case-insensitive: core.ignoreCase. But the
>>> `dir_inside_of()` function did not respect that. As a result, if Git's
>>> idea of the current working directory disagreed in its upper/lower case
>>> with the `GIT_WORK_TREE` variable (e.g. `C:\test` vs `c:\test`) the
>>> user would be greeted by the error message
>>>
>>> 	fatal: git-am cannot be used without a working tree.
>>>
>>> when trying to run a rebase.
>>>
>>> This fixes https://github.com/git-for-windows/git/issues/402 (reported by
>>> Daniel Harding).
>>
>> I was just going through the 2.7 release notes when I saw this patch.
> 
> Thanks for your diligence!
> 
>> My understanding was that many of the case-insensitive filesystems also
>> support Unicode. Is the byte-oriented code in this patch adequate? I
>> would have thought it necessary to use a Unicode-aware algorithm here,
>> that knows:
>>
>> * that bytes != characters
> 
> I am not sure that we can in general assume that the file name is UTF-8...
> Or does Git always assume that?

No, it does not. I would say that Git resolutely refuses to take a stand
on whether/what encoding is used for filenames within its object database.

But as far as I understand, here we have a slightly different situation:
we have a string from an environment variable, and we have the current
working directory. Neither of these things are paths from the Git ODB,
and we don't have to take a stand about whether two paths from the ODB
are equivalent.

>> * how to do a case-insensitive comparison of strings that include
>> non-ASCII characters
> 
> I was worrying about that, too, but decided to punt on it when I realized
> that no other case-insensitive code in Git bothers about those characters.
> 
>> * (possibly) insensitivity to NFC vs. NFD vs. non-normalized forms
> 
> Whoa... I really would like to stay away from that collection of
> potholes...
> 
>> I suppose that such OSs have built-in functions for deciding whether two
>> paths are equivalent. Possibly these could be used?
> 
> We could, in theory, try to do that, but what about the OSes that do *not*
> have those functions? We would need our own fallback anyway, so why not
> guarantee consistency and use our own functions only?

I don't see why consistency across OSes is an issue here. Whenever we
are talking about paths on a particular filesystem, it is the
filesystem's idea of equivalence that is relevant.

Suppose we had a wrapper `filesystem_paths_equivalent(path1, path2)`. If
an OS provides a function that can be used for such a comparison, we can
use that. If not, but `ignore_case` is set, we can use your
`cmp_icase()`-based function. Otherwise, the default implementation
would be `!strcmp()`.

I don't think it would be worth going to all this trouble for this one
callsite (at least under my assumption that this has no security
implications). But I would have expected this issue to come up in enough
places that such an abstraction would be helpful.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

      reply	other threads:[~2016-01-01 16:32 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-28 16:12 [PATCH] Fix "inside work tree" detection on case-insensitive filesystems Johannes Schindelin
2015-12-29 14:47 ` Michael Haggerty
2016-01-01 14:58   ` Johannes Schindelin
2016-01-01 16:24     ` Michael Haggerty [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5686A82D.6020105@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).