From: Robin Rosenberg <robin.rosenberg.lists@dewire.com>
To: Steffen Prohaska <prohaska@zib.de>
Cc: Git Mailing List <git@vger.kernel.org>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Eli Zaretskii <eliz@gnu.org>,
Daniel Barkalow <barkalow@iabervon.org>,
Alex Riesen <raa.lkml@gmail.com>,
tsuna@lrde.epita.fr, Andreas Ericsson <ae@op5.se>
Subject: Re: Switching from CVS to GIT
Date: Wed, 17 Oct 2007 21:33:32 +0200
Message-ID: <200710172133.34273.robin.rosenberg.lists@dewire.com>
In-Reply-To: <4D822762-D344-465E-B77D-90A64D61F5A9@zib.de>
On Tuesday 16 October 2007, Steffen Prohaska wrote:
>
> On Oct 16, 2007, at 2:33 PM, Johannes Schindelin wrote:
>
> >> Maybe we need a configuration similar to core.autocrlf (which controls
> >> newline conversion) to control filename comparison and normalization?
> >>
> >> Most obviously for the case (in-)sensitivity on Windows, but I also
> >> remember the unicode normalization happening on Mac's HFS filesystem
> >> that caused trouble in the past.
> >
> > Robin Rosenberg has some preliminary code for that. The idea is to wrap
> > all filesystem operations in cache.h, and do a filename normalisation
> > first.
>
> At that point we could add a safety check. Paths that differ only by
> case, or whitespace, or ... (add general and project specific rules here)
> should be denied. This would guarantee that tree objects can always be
> checked out. Even if the filesystem capabilities are limited.
>
> Robin, what do you think?
My code only normalizes filenames to UTF-8 inside git, which isn't the same
thing. I think it can be extended to handle MacOSX's normalized UTF-8 and
Windows' UTF-16, so that when you check something out from git there will be
no surprises. Case insensitivity is another dimension. I have no idea how the
code performs; it's more of a proof that it can be done.
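To illustrate what normalizing to a single form means in practice: the same visible name can arrive as different byte sequences depending on which system produced it. Below is a minimal Python sketch, assuming names are kept as UTF-8 in precomposed (NFC) form. `to_repo_name` is a hypothetical helper, not a function in git or in the code discussed here, and the choice of NFC is an assumption made only for the example.

```python
import unicodedata


def to_repo_name(fs_name: str) -> bytes:
    """Return the name as UTF-8 bytes in precomposed (NFC) form.

    Assumption: NFC is the form kept in the repository. That is a choice
    made for this sketch, not something git specifies.
    """
    return unicodedata.normalize("NFC", fs_name).encode("utf-8")


# "e with acute" as one code point (precomposed form).
precomposed = "caf\u00e9.txt"
# "e" followed by a combining acute accent (decomposed form, roughly
# what an HFS+ volume hands back).
decomposed = "cafe\u0301.txt"

# The raw UTF-8 bytes differ...
print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))
# ...but after normalization both map to the same repository name.
print(to_repo_name(precomposed) == to_repo_name(decomposed))
```

Whether NFC or a decomposed form is the better canonical choice is a separate question; the point is only that one form has to be picked and applied consistently.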
The code cannot "fail": it always does something reasonable, such as not
converting when conversion is not possible. Something else has to be done for
validation.
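The split between a conversion that never rejects a name and a separate validation step could look roughly like this. It is a Python sketch under stated assumptions: `to_utf8` and `find_collisions` are hypothetical names, cp1252 stands in for whatever the local code page happens to be, and folding with NFC plus `casefold()` is only one possible rule for "differs only by case".

```python
import unicodedata
from collections import defaultdict


def to_utf8(raw: bytes, local_encoding: str = "cp1252") -> bytes:
    """Best-effort conversion to UTF-8; undecodable input is returned as-is."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave it alone
    except UnicodeDecodeError:
        pass
    try:
        return raw.decode(local_encoding).encode("utf-8")
    except UnicodeDecodeError:
        return raw  # conversion not possible: keep the bytes unchanged


def find_collisions(paths):
    """Group paths that a limited filesystem could not tell apart."""
    groups = defaultdict(list)
    for path in paths:
        # Assumed comparison rule: same NFC form, ignoring case.
        key = unicodedata.normalize("NFC", path).casefold()
        groups[key].append(path)
    return [names for names in groups.values() if len(names) > 1]


print(to_utf8(b"caf\xe9.txt"))  # single-byte "e with acute" from a code page
print(to_utf8(b"\x81.txt"))     # 0x81 is undefined in cp1252, so left alone
print(find_collisions(["Makefile", "makefile", "README", "src/Foo.c"]))
```

The converter only ever returns bytes, so it cannot be used to refuse anything; a check like `find_collisions` would have to run separately, for instance before a tree is written, and a project could add its own rules to the key function.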
The UTF-16 that Windows uses is not an issue at present, because git only
deals with the local code page. Jgit does deal with it, but isn't very smart
about it either, because git doesn't say anything about filename encoding,
while Windows, MacOSX, CIFS and other filesystems do.
The fact that git uses eight-bit file names may also be one reason it is
slower on Windows: the eight-bit Win32 API converts all strings and
filenames to and from the native UTF-16 encoding on *every* system call, in
and out, and that's a lot of work when you do it thousands of times. If git
did the conversion itself, it could be made smarter, better suited to git's
purposes and, most importantly, faster. I have no idea how big the
performance hit is; someone has to measure it.
I notice that a number of SCMs out there, including one with a \$\d{4} price
tag, get you into trouble if you rename a file from Foo to FOO on Windows.
-- robin