From: Theodore Tso <tytso@MIT.EDU>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Hommey <mh@glandium.org>, Kevin Ballard <kevin@sb.org>,
	git@vger.kernel.org
Subject: Re: git on MacOSX and files with decomposed utf-8 file names
Date: Wed, 23 Jan 2008 12:12:58 -0500
Message-ID: <20080123171258.GB32663@mit.edu>
In-Reply-To: <alpine.LFD.1.00.0801230808440.1741@woody.linux-foundation.org>

On Wed, Jan 23, 2008 at 08:16:33AM -0800, Linus Torvalds wrote:
> 
> 
> On Wed, 23 Jan 2008, Theodore Tso wrote:
> > 
> > So this demonstrates that on my MacOS 10.4.11 system, on NFS, MacOS is
> > doing no normalization, as it is creating two files.  On HFS+, MacOS
> > is mapping both filenames to the same decomposed name.
> 
> Well, it demonstrates that (a) the OS and (b) _perl_ don't mangle 
> filenames on non-HFS+ filesystems.

Well, "touch" actually, since that was what was creating the files;
I only used perl because it was the easiest way to guarantee exactly
how the filenames would be generated.
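
To repeat the experiment without perl, here is a minimal Python sketch
of the same test.  It creates one file under the precomposed (NFC)
spelling of a name and one under the decomposed (NFD) spelling, then
counts the directory entries.  The function name and the sample
filename are made up for illustration; nothing here is a git or MacOS
API.

```python
import os
import tempfile
import unicodedata


def count_normalization_variants(directory):
    """Create NFC and NFD spellings of one name; return how many files exist.

    2 means the filesystem kept both byte sequences as distinct names.
    1 means it mapped both spellings onto a single entry.
    """
    base = "M\u00e4rchen.txt"  # hypothetical sample name (precomposed a-umlaut)
    nfc = unicodedata.normalize("NFC", base)
    nfd = unicodedata.normalize("NFD", base)
    # Work in a scratch directory so the probe cannot touch real files.
    probe_dir = tempfile.mkdtemp(prefix="norm-probe-", dir=directory)
    try:
        for name in (nfc, nfd):
            # Equivalent of "touch": create the file if it is not there yet.
            with open(os.path.join(probe_dir, name), "a"):
                pass
        return len(os.listdir(probe_dir))
    finally:
        for entry in os.listdir(probe_dir):
            os.remove(os.path.join(probe_dir, entry))
        os.rmdir(probe_dir)
```

A result of 2 corresponds to the NFS behaviour described above (two
files); a result of 1 corresponds to what HFS+ did.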

> The problem is that since most native applications *expect* that name 
> mangling, they'll probably do name mangling of their own (internally) just 
> to compare the names!
> 
> So I would not be surprised if the globbing libraries, for example, will 
> do NFD-mangling in order to glob "correctly", so even programs ported from 
> real Unix might end up getting pathnames subtly changed into NFD as part 
> of some hot library-on-library action with UTF hackery inside.

It's worse than that.  You can specify at format time whether or not
HFS+ is case-sensitive, and of course, there is UFS, which
I expect does no Unicode normalization at all, much like NFS.  I
suspect what you've pointed out is why certain MacOS programs break
horribly when run on non-HFS+ filesystems, though.  And if that is the
case, then those same programs might not be reliable if the user's
home directory is stored on NFS --- as it would be in an
enterprise/corporate environment, if Apple ever wants to have any hope
of penetrating that market.

Because of this, git code won't be able to just check for HFS+; it
will probably have to do a run-time test to see whether the
filesystem is doing case-folding, since that can be turned on
or off on a per-filesystem basis.  Also unknown, and worth testing,
is whether turning off case-folding also turns off Unicode
normalization.  It may be that they did this so that HFS+ could be UFS
compatible, since Darwin *must* be built on a UFS filesystem,
reflecting its Mach/BSD heritage.  (I ran across this while doing my
web research; apparently HFS+ has been causing Apple headaches
internally.  Heh.  :-)
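
A run-time test of that kind could look roughly like the Python sketch
below: create a file under one spelling, then ask whether a
differently-cased spelling resolves to it.  The function name
folds_case and the probe filenames are hypothetical, and this is only
an illustration of the idea, not how git does or should implement it.

```python
import os
import tempfile


def folds_case(directory):
    """Return True if the filesystem holding `directory` folds case."""
    # Use a private scratch directory so no existing file can interfere.
    probe_dir = tempfile.mkdtemp(prefix="case-probe-", dir=directory)
    created = os.path.join(probe_dir, "CaseProbe")
    try:
        with open(created, "w"):
            pass
        # On a case-folding filesystem the lowercase spelling finds the
        # file we just created; on a case-sensitive one it does not exist.
        return os.path.exists(os.path.join(probe_dir, "caseprobe"))
    finally:
        if os.path.exists(created):
            os.remove(created)
        os.rmdir(probe_dir)
```

Because the answer depends on the filesystem rather than the OS, the
probe has to run against the directory that will actually hold the
working tree.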

> Things like the finder etc, which must be very aware of the fact that
> filenames get corrupted, would presumably internally always convert
> everything they get into NFD in order to compare names from different
> sources. And as part of that, programs may well corrupt the name before
> they then use it to create a pathname.

Well, hopefully not everyone inside Apple's OS groups is a total
moron, and they actually use a utf8_str_equiv() routine instead of
strcmp() to do their Unicode comparisons.  But then again, maybe
not...
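
To make the distinction concrete, here is a minimal Python sketch of
what such a comparison routine might do.  The name utf8_str_equiv() is
just the placeholder used above, not a real library function, and
using NFD as the common form is an assumption; any single normal form
applied to both sides would serve.

```python
import unicodedata


def utf8_str_equiv(a: bytes, b: bytes) -> bool:
    """Compare two UTF-8 byte strings for canonical equivalence."""
    # Decode, bring both sides to the same normal form, then compare.
    # Neither input is modified, so the caller keeps the original bytes.
    left = unicodedata.normalize("NFD", a.decode("utf-8"))
    right = unicodedata.normalize("NFD", b.decode("utf-8"))
    return left == right


precomposed = "\u00e4".encode("utf-8")   # a-umlaut as one code point
decomposed = "a\u0308".encode("utf-8")   # "a" plus combining diaeresis
print(precomposed == decomposed)                  # byte compare, like strcmp()
print(utf8_str_equiv(precomposed, decomposed))    # equivalence compare
```

The point of comparing this way is that the names can be matched
without rewriting either one, so nothing gets converted to NFD before
it is used to create a pathname.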

> The fact that your perl program works under NFS, but creates NFD on a VFAT 
> volume, does imply that they probably used at least some of the same 
> routines they use in HFS+ for VFAT. Not entirely surprising: doing case 
> insensitive stuff with Unicode is nasty code, so why not share it (even if 
> it's then incorrect for FAT)..
> 
> Piece of crap it is, though. Apple has painted themselves into a nasty 
> corner there.

No kidding!!

							- Ted

