From: Theodore Tso <tytso@MIT.EDU>
To: Kevin Ballard <kevin@sb.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Peter Karlsson <peter@softwolves.pp.se>,
Mark Junker <mjscod@web.de>, Pedro Melo <melo@simplicidade.org>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git on MacOSX and files with decomposed utf-8 file names
Date: Mon, 21 Jan 2008 14:57:03 -0500
Message-ID: <20080121195703.GE29792@mit.edu>
In-Reply-To: <C6C0E6A1-053B-48CE-90B3-8FFB44061C3B@sb.org>
On Mon, Jan 21, 2008 at 02:05:51PM -0500, Kevin Ballard wrote:
> You're right, but it doesn't have to treat it as a binary stream at the
> level I care about. I mean, no matter what you do at some level the string
> is evaluated as a binary stream. For our purposes, just redefine the
> hashing algorithm to hash all equivalent strings the same, and you can
> implement that by using SHA1 on a particular encoding of the string.
That's horribly broken, for a couple of reasons. First of all,
changing the hash algorithm breaks compatibility with existing
repositories; sure, you can try to guess which normalization is least
likely to break existing repositories (which won't be the native MacOSX
normalization, since the combined character is more likely to be used
in other environments), but there's still no guarantee that there
aren't filenames stored as some other form of byte-string.
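To make the compatibility point concrete, here is a minimal Python sketch. It is only an illustration, not git code: git does not hash filenames this way, and the helper names `sha1_of_name` and `sha1_of_normalized_name` and the sample filename are made up for the example. It shows that the composed and decomposed spellings are different byte strings, that normalizing before hashing makes them collide as proposed, and that doing so changes the hash of a name that was already stored in decomposed form.

```python
import hashlib
import unicodedata


def sha1_of_name(name: str) -> str:
    """SHA-1 of the raw UTF-8 bytes of a name (no normalization)."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


def sha1_of_normalized_name(name: str, form: str = "NFC") -> str:
    """SHA-1 of the name after Unicode normalization to the given form."""
    normalized = unicodedata.normalize(form, name)
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


composed = "M\u00e4rchen"      # a-umlaut as the single code point U+00E4
decomposed = "Ma\u0308rchen"   # 'a' followed by U+0308 COMBINING DIAERESIS

# As byte strings the two spellings differ, so their plain hashes differ.
raw_differs = sha1_of_name(composed) != sha1_of_name(decomposed)

# Hashing a normalized encoding makes equivalent strings hash the same.
normalized_matches = (
    sha1_of_normalized_name(composed) == sha1_of_normalized_name(decomposed)
)

# A name already stored in decomposed form gets a different hash once
# the scheme switches to NFC, which is the compatibility break.
breaks_existing = sha1_of_name(decomposed) != sha1_of_normalized_name(decomposed)

print(raw_differs, normalized_matches, breaks_existing)
```

Whichever normalization form is picked, names stored in the other form (or in some non-Unicode byte encoding) end up with hashes that no longer match what existing repositories recorded.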
Secondly, the hash algorithm would not be stable. Unicode is not
static, and new characters can get added that may be composable, and
thus would be normalized differently. This is one of the reasons why
Unicode is so horribly broken as a standard. It was originally
created by representatives from the printing world that were horribly
clueless about what was needed with respect to canonicalization,
so they compromised and allowed both forms, not realizing
what a massive f*ckup this would cause later on. So people have over
the years piled kludges on top of kludges in order to make Unicode
"work".
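The stability concern can be sketched in Python as well, under some assumptions. The standard `unicodedata` module ships the current character database and also a frozen Unicode 3.2 copy as `unicodedata.ucd_3_2_0`. The helper `normalized_sha1` is hypothetical, and U+1B06 (BALINESE LETTER AKARA TEDUNG, added in Unicode 5.0 with a canonical decomposition) is just one convenient example of a character that the older database does not know about.

```python
import hashlib
import unicodedata


def normalized_sha1(name: str, normalize) -> str:
    """SHA-1 of the NFD form of a name, using the given normalize function."""
    return hashlib.sha1(normalize("NFD", name).encode("utf-8")).hexdigest()


name = "\u1b06"  # unassigned in Unicode 3.2, decomposable since Unicode 5.0

old_normalize = unicodedata.ucd_3_2_0.normalize  # Unicode 3.2 data
new_normalize = unicodedata.normalize            # data bundled with this Python

old_form = old_normalize("NFD", name)  # left alone: the character is unknown
new_form = new_normalize("NFD", name)  # split into base letter + vowel sign

# The same name hashes differently depending on the Unicode version used.
hash_changed = normalized_sha1(name, old_normalize) != normalized_sha1(
    name, new_normalize
)

print(len(old_form), len(new_form), hash_changed)
```

A hash defined as "SHA1 of the normalized name" would therefore depend on which version of the Unicode tables the tool was built against.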
So we can't blame all of the craziness on the MacOS designers,
although they seem to have been very creative about how to take a
bad situation and make it worse....
- Ted