git.vger.kernel.org archive mirror
From: Theodore Tso <tytso@MIT.EDU>
To: Kevin Ballard <kevin@sb.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Karlsson <peter@softwolves.pp.se>,
	Mark Junker <mjscod@web.de>, Pedro Melo <melo@simplicidade.org>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git on MacOSX and files with decomposed utf-8 file names
Date: Mon, 21 Jan 2008 15:15:30 -0500
Message-ID: <20080121201530.GF29792@mit.edu>
In-Reply-To: <998717B0-0165-4383-AAB8-33BD2A49954E@sb.org>

On Mon, Jan 21, 2008 at 03:01:43PM -0500, Kevin Ballard wrote:
>
> You seem to be under the impression that I'm advocating that git treat all 
> filenames as unicode strings, and thus change its hashing algorithm as 
> described. I am not. I am saying that, if git only had to deal with HFS+, 
> then it could treat all filenames as strings, etc. However, since git does 
> not only have to deal with HFS+, this will not work. What I am describing 
> is an ideal, not a practicality.

Well, why are you arguing on the git list about precisely that (when
you responded to Linus), then?

> In other words, what I'm saying is that treating filenames as strings works 
> perfectly fine, *provided you can do that 100% of the time*. git cannot do 
> that 100% of the time, therefore it's not appropriate here. The purpose of 
> this argument is to illustrate that treating filenames as strings isn't 
> wrong, it's simply incompatible with treating filenames as byte sequences.

No, it's still broken, because of the Unicode-is-not-static problem.
What happens when you start adding more composable characters, which
some future version of HFS+ will start breaking apart? 

Presumably the whole *reason* why HFS+ was corrupting strings was so
that "stupid applications" that only did byte comparisons would work
correctly.  But when you upgrade from Mac OS 10.5 to 10.6, and it adds
support for new composable characters, and you now take a USB hard
drive that was hooked up to a MacBook Air, running one version of
MacOS, and hook it up to another Macintosh, running another version of
MacOS, the normalization algorithm will be different, so the byte
comparisons won't work.  
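
To make the byte-comparison point concrete, here is a minimal sketch in
Python. It assumes standard Unicode NFD (via the stdlib unicodedata
module) as a stand-in for whatever decomposition a given HFS+ version
applies; the filename and the helper names are hypothetical.

```python
import unicodedata

# Hypothetical filename in composed form: U+00E9 is a single code point.
composed = "caf\u00e9.txt"
# Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", composed)


def same_bytes(a: str, b: str) -> bool:
    """Compare two names the way a byte-oriented application would."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfd(a: str, b: str) -> bool:
    """Compare two names after normalizing both with this library's tables."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


print(same_bytes(composed, decomposed))      # byte-wise comparison
print(same_after_nfd(composed, decomposed))  # normalization-aware comparison
```

The second comparison only agrees as long as both sides normalize with
the same tables, which is exactly what cannot be assumed across
versions.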

So all of this extra work which MacOS put in to corrupt filenames
behind our back doesn't actually do any good; applications still need
to be smart, or there will be rare, hard-to-reproduce bugs
nevertheless.  So if MacOS wants to supply Unicode libraries that
compare strings keeping in mind Unicode "equivalences" it can be our
guest (although how they deal with different versions of Unicode with
different equivalence classes will be their cross to bear).  BUT MacOS
X SHOULD NOT BE CORRUPTING FILENAMES.  TO DO SO IS BROKEN.

Even Microsoft got this right; its filesystem is case-preserving, but
it has case-insensitive lookups.  Hence, it is not corrupting
filenames behind the application's back, unlike MacOS.
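
The case-preserving, case-insensitive behaviour can be sketched as a
toy in-memory directory in Python. The class name CasePreservingDir and
its methods are hypothetical, and str.casefold() is an assumption
standing in for a real filesystem's own case-mapping table.

```python
class CasePreservingDir:
    """Toy directory: stores names exactly as given, looks them up case-insensitively."""

    def __init__(self) -> None:
        # Maps the folded lookup key to the name exactly as it was created.
        self._entries: dict[str, str] = {}

    def create(self, name: str) -> None:
        key = name.casefold()
        if key in self._entries:
            raise FileExistsError(name)
        # The stored name is untouched; only the lookup key is folded.
        self._entries[key] = name

    def lookup(self, name: str) -> str:
        # Any casing finds the entry; the original spelling is returned.
        return self._entries[name.casefold()]

    def listing(self) -> list[str]:
        return list(self._entries.values())


d = CasePreservingDir()
d.create("ReadMe.TXT")
print(d.lookup("readme.txt"))
print(d.listing())
```

The application always gets back the bytes it wrote; the equivalence
is applied only at lookup time.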

						- Ted
