* Moving a directory into another fails
@ 2006-07-26 15:00 Jon Smirl
2006-07-26 22:34 ` Nicolas Vilz
0 siblings, 1 reply; 24+ messages in thread
From: Jon Smirl @ 2006-07-26 15:00 UTC (permalink / raw)
To: git
I cloned a git project. Then in the original I did mkdir for a new
directory and used git mv to move an existing directory into it. I then
used cg diff to generate a patch for the move.
When I use cg patch to apply this patch to the cloned tree it fails.
This seems to be a problem in the git code, not cg. It is not picking
up the creation of the new intervening subdirectory correctly.
I just synced and this does not work in the current code.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Moving a directory into another fails
2006-07-26 15:00 Moving a directory into another fails Jon Smirl
@ 2006-07-26 22:34 ` Nicolas Vilz
2006-07-26 23:03 ` Jon Smirl
0 siblings, 1 reply; 24+ messages in thread
From: Nicolas Vilz @ 2006-07-26 22:34 UTC (permalink / raw)
To: Jon Smirl; +Cc: git
On Wed, Jul 26, 2006 at 11:00:48AM -0400, Jon Smirl wrote:
> I cloned a git project. Then in the original I did mkdir for a new
> directory and used git mv to move an existing directory into it. I then
> used cg diff to generate a patch for the move.
>
> When I use cg patch to apply this patch to the cloned tree it fails.
> This seems to be a problem in the git code, not cg. It is not picking
> up the creation of the new intervening subdirectory correctly.
>
> I just synced and this does not work in the current code.
I tried to reproduce your scenario, and before that I set up a test
repository.
(1) mkdir git_test
(2) cd git_test
(3) git init-db
(4) vim test.txt
# fill in some bogus text
(5) mkdir testing
(6) cd testing
(7) vim test1.txt
# again, fill in some bogus text
(8) cd ..
(9) cg add test.txt testing/test1.txt
(10) cg commit -C
# just give a fancy commit message...
(11) cd ..
(12) mkdir bare_git
(13) cd bare_git
(14) mkdir git_test.git
(15) GIT_DIR=git_test.git git init-db
(16) cd ../git_test
(17) git push ../bare_git/git_test.git --all
(18) cd ../
(19) git clone bare_git/git_test.git git_test2
(20) cd git_test
(21) mkdir blah_test
(22) git mv testing/ blah_test/
(23) cg diff > ../mkdir_patch.diff
(24) cd ..
(25) cd git_test2/
(26) cg patch < ../mkdir_patch.diff
From the last step (26) I get:
patching file blah_test/testing/test1.txt
patching file testing/test1.txt
touch: cannot touch `testing/test1.txt': No such file or directory
Adding file blah_test/testing/test1.txt
Removing file testing/test1.txt
But the result is correct. There is no testing directory in here
anymore, and inside blah_test there is my testing dir with the file
test1.txt in it.
Did I miss something?
I use cogito-0.17.3 with git version 1.4.1
(obviously without that recently rewritten git-mv...)
Nicolas
* Re: Moving a directory into another fails
2006-07-26 22:34 ` Nicolas Vilz
@ 2006-07-26 23:03 ` Jon Smirl
2006-07-26 23:25 ` Nicolas Vilz
2006-07-28 1:43 ` Petr Baudis
0 siblings, 2 replies; 24+ messages in thread
From: Jon Smirl @ 2006-07-26 23:03 UTC (permalink / raw)
To: Nicolas Vilz; +Cc: git
This is a simpler sequence
cg clone git foo
cg clone git foo1
cd foo
mkdir zzz
git mv gitweb zzz
cg diff >patch
cd ../foo1
cg patch <../foo/patch
Fails with these errors. We have determined that git apply patch is ok
and this is a bug in cg patch.
[jonsmirl@jonsmirl foo1]$ cg patch <../foo/patch
mv: cannot move `gitweb/README' to `zzz/gitweb/README': No such file
or directory
mv: cannot move `gitweb/gitweb.cgi' to `zzz/gitweb/gitweb.cgi': No
such file or directory
mv: cannot move `gitweb/gitweb.css' to `zzz/gitweb/gitweb.css': No
such file or directory
mv: cannot stat `"gitweb/test/M\\303\\244rchen"': No such file or directory
mv: cannot move `gitweb/test/file with spaces' to
`zzz/gitweb/test/file with spaces': No such file or directory
mv: cannot move `gitweb/test/file+plus+sign' to
`zzz/gitweb/test/file+plus+sign': No such file or directory
patch: **** Only garbage was found in the patch input.
Removing file gitweb/README
Adding file zzz/gitweb/README
error: zzz/gitweb/README: does not exist and --remove not passed
fatal: Unable to process file zzz/gitweb/README
cg-add: warning: not all items could have been added
Removing file gitweb/gitweb.cgi
Adding file zzz/gitweb/gitweb.cgi
error: zzz/gitweb/gitweb.cgi: does not exist and --remove not passed
fatal: Unable to process file zzz/gitweb/gitweb.cgi
cg-add: warning: not all items could have been added
Removing file gitweb/gitweb.css
Adding file zzz/gitweb/gitweb.css
error: zzz/gitweb/gitweb.css: does not exist and --remove not passed
fatal: Unable to process file zzz/gitweb/gitweb.css
cg-add: warning: not all items could have been added
Removing file "gitweb/test/Märchen"
Adding file "zzz/gitweb/test/Märchen"
error: "zzz/gitweb/test/Märchen": does not exist and --remove not passed
fatal: Unable to process file "zzz/gitweb/test/Märchen"
cg-add: warning: not all items could have been added
Removing file gitweb/test/file with spaces
Adding file zzz/gitweb/test/file with spaces
error: zzz/gitweb/test/file with spaces: does not exist and --remove not passed
fatal: Unable to process file zzz/gitweb/test/file with spaces
cg-add: warning: not all items could have been added
Removing file gitweb/test/file+plus+sign
Adding file zzz/gitweb/test/file+plus+sign
error: zzz/gitweb/test/file+plus+sign: does not exist and --remove not passed
fatal: Unable to process file zzz/gitweb/test/file+plus+sign
cg-add: warning: not all items could have been added
[jonsmirl@jonsmirl foo1]$
--
Jon Smirl
jonsmirl@gmail.com
* Re: Moving a directory into another fails
2006-07-26 23:03 ` Jon Smirl
@ 2006-07-26 23:25 ` Nicolas Vilz
2006-07-28 1:43 ` Petr Baudis
1 sibling, 0 replies; 24+ messages in thread
From: Nicolas Vilz @ 2006-07-26 23:25 UTC (permalink / raw)
To: Jon Smirl; +Cc: git
On Wed, Jul 26, 2006 at 07:03:30PM -0400, Jon Smirl wrote:
> This is a simpler sequence
>
> cg clone git foo
> cg clone git foo1
> cd foo
> mkdir zzz
> git mv gitweb zzz
> cg diff >patch
> cd ../foo1
> cg patch <../foo/patch
>
> Fails with these errors. We have determined that git apply patch is ok
> and this is a bug in cg patch.
Well, perhaps I should react faster and I shouldn't pause my fetchmail
for 2 or 3 hours... this is bad for this list :) You are kind of fast :)
Nicolas
* Re: Moving a directory into another fails
2006-07-26 23:03 ` Jon Smirl
2006-07-26 23:25 ` Nicolas Vilz
@ 2006-07-28 1:43 ` Petr Baudis
2006-12-04 18:19 ` Stefan Pfetzing
1 sibling, 1 reply; 24+ messages in thread
From: Petr Baudis @ 2006-07-28 1:43 UTC (permalink / raw)
To: Jon Smirl; +Cc: Nicolas Vilz, git
Dear diary, on Thu, Jul 27, 2006 at 01:03:30AM CEST, I got a letter
where Jon Smirl <jonsmirl@gmail.com> said that...
> This is a simpler sequence
>
> cg clone git foo
> cg clone git foo1
> cd foo
> mkdir zzz
> git mv gitweb zzz
> cg diff >patch
> cd ../foo1
> cg patch <../foo/patch
Even simpler one:
mkdir zzz
cg-mv gitweb zzz
cg-diff | cg-patch -R
(which would even undo the mess supposing that it worked properly)
> [jonsmirl@jonsmirl foo1]$ cg patch <../foo/patch
> mv: cannot move `gitweb/README' to `zzz/gitweb/README': No such file
> or directory
Oops. Thanks, fixed with this:
diff --git a/cg-patch b/cg-patch
index cc82f1f..923df0e 100755
--- a/cg-patch
+++ b/cg-patch
@@ -145,6 +145,8 @@ redzone_border()
echo "$file1: rename destination $file2 already exists, NOT RENAMING" >&2
return
fi
+ # FIXME: Remove stale empty directories related to $mvfrom
+ case $mvto in */*) mkdir -p "${mvto%/*}";; esac
mv "$mvfrom" "$mvto"
fi
if [ "$op" = "delete" -o "$op" = "rename" ]; then
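For readers unfamiliar with the idiom in the fix above: `${mvto%/*}` strips the last path component, so `mkdir -p` creates any missing parent directories of the rename target before `mv` runs. A minimal standalone sketch of the same idea follows. The function name `move_with_parents` is made up for illustration and is not part of cg-patch.

```shell
#!/bin/sh
# Sketch only: move a file, creating the destination's parent directories first.
# move_with_parents is a hypothetical helper, not a cg-patch function.
move_with_parents() {
    mvfrom=$1
    mvto=$2
    # Only a path containing a slash has a parent directory to create.
    case $mvto in
    */*) mkdir -p "${mvto%/*}" ;;
    esac
    mv "$mvfrom" "$mvto"
}

# Usage, in a scratch directory (run in a subshell so the caller's cwd is kept):
(
    scratch=$(mktemp -d) && cd "$scratch" || exit 1
    mkdir gitweb && echo hello >gitweb/README
    move_with_parents gitweb/README zzz/gitweb/README
    ls zzz/gitweb
    cd / && rm -rf "$scratch"
)
```

Like the patch, this leaves the now-empty source directory behind, which is what the FIXME comment refers to.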
> mv: cannot stat `"gitweb/test/M\\303\\244rchen"': No such file or directory
Junio, how am I supposed to unmangle this *censored* stuff?
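The quoted form above is how git prints paths that contain bytes outside printable ASCII: the name is wrapped in double quotes and such bytes are written as backslash-octal escapes. One possible way to undo that in a shell script is sketched below. It relies on `printf` expanding `\ddd` octal escapes in its format string, covers only the octal and standard single-letter escapes (not an escaped double quote), and the helper name `unquote_git_path` is hypothetical.

```shell
#!/bin/sh
# Sketch only: turn a C-style quoted path as printed by git back into raw bytes.
# unquote_git_path is a hypothetical helper, not an existing git or cogito tool.
unquote_git_path() {
    path=$1
    case $path in
    \"*\")
        # Strip the surrounding double quotes.
        path=${path#\"}
        path=${path%\"}
        # Double every % so printf prints it literally, then let printf
        # expand the backslash escapes in what is now its format string.
        fmt=$(printf '%s' "$path" | sed 's/%/%%/g')
        # The leading %s with an empty argument keeps a name that starts
        # with '-' from being mistaken for an option.
        printf "%s$fmt\n" ""
        ;;
    *)
        # Unquoted paths are already literal.
        printf '%s\n' "$path"
        ;;
    esac
}

unquote_git_path '"gitweb/test/M\303\244rchen"'
```

The result is only as good as the input: if an intermediate tool has doubled the backslashes, as in the `mv` error above, that extra layer has to be removed first.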
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Snow falling on Perl. White noise covering line noise.
Hides all the bugs too. -- J. Putnam
* Re: Re: Moving a directory into another fails
2006-07-28 1:43 ` Petr Baudis
@ 2006-12-04 18:19 ` Stefan Pfetzing
2006-12-04 18:56 ` Jakub Narebski
0 siblings, 1 reply; 24+ messages in thread
From: Stefan Pfetzing @ 2006-12-04 18:19 UTC (permalink / raw)
To: git
Hi Folks,
2006/7/28, Petr Baudis <pasky@suse.cz>:
> > mv: cannot stat `"gitweb/test/M\\303\\244rchen"': No such file or directory
since when is this file in the official git.git tree?
It's quite problematic when used on HFS+, because HFS+ uses UTF-16 internally, IMHO.
Git always thinks there is a new file in my git.git clone.
--- snip ---
dreamind@paris:~/src/git% git status
# Untracked files:
# (use "git add" to add to commit)
#
# gitweb/test/Märchen
nothing to commit
--- snap ---
bye
dreamind
--
http://www.dreamind.de/
* Re: Re: Moving a directory into another fails
2006-12-04 18:19 ` Stefan Pfetzing
@ 2006-12-04 18:56 ` Jakub Narebski
2006-12-04 19:03 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jakub Narebski @ 2006-12-04 18:56 UTC (permalink / raw)
To: git
Stefan Pfetzing wrote:
> 2006/7/28, Petr Baudis <pasky@suse.cz>:
>>>
>>> mv: cannot stat `"gitweb/test/M\\303\\244rchen"': No such file or directory
>>>
> since when is this file in the official git.git tree?
Since merging in gitweb, Sat Jun 10, 2006.
> It's quite problematic when used on HFS+, because HFS+ uses UTF-16 internally, IMHO.
>
> Git always thinks there is a new file in my git.git clone.
The problem is that git tries to be content agnostic, and that
includes being encoding agnostic.
I personally think that because the same repository might be deployed
on different systems with different file name encoding (and this is not
something you have control over, contrary to commit/tag message encoding,
and encoding in files), git should acquire core.filesystemEncoding
configuration variable which would encode from filesystem encoding used
in working directory and perhaps index to UTF-8 encoding used in repository
(in tree objects) and perhaps index.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Re: Moving a directory into another fails
2006-12-04 18:56 ` Jakub Narebski
@ 2006-12-04 19:03 ` Johannes Schindelin
2006-12-04 19:10 ` Jakub Narebski
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-04 19:03 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Hi,
On Mon, 4 Dec 2006, Jakub Narebski wrote:
> [...] git should acquire core.filesystemEncoding configuration variable
> which would encode from filesystem encoding used in working directory
> and perhaps index to UTF-8 encoding used in repository (in tree objects)
> and perhaps index.
So, you want to pull in all thinkable encodings? Of course, you could rely
on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
uses it already. But I never use mailinfo, so I do not need libiconv.)
Ciao,
Dscho
* Re: Re: Moving a directory into another fails
2006-12-04 19:03 ` Johannes Schindelin
@ 2006-12-04 19:10 ` Jakub Narebski
2006-12-04 19:10 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jakub Narebski @ 2006-12-04 19:10 UTC (permalink / raw)
To: git
Johannes Schindelin wrote:
> On Mon, 4 Dec 2006, Jakub Narebski wrote:
>
>> [...] git should acquire core.filesystemEncoding configuration variable
>> which would encode from filesystem encoding used in working directory
>> and perhaps index to UTF-8 encoding used in repository (in tree objects)
>> and perhaps index.
>
> So, you want to pull in all thinkable encodings? Of course, you could rely
> on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
> uses it already. But I never use mailinfo, so I do not need libiconv.)
A conditional dependency. If you don't have libiconv, this feature wouldn't
be available.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Re: Moving a directory into another fails
2006-12-04 19:10 ` Jakub Narebski
@ 2006-12-04 19:10 ` Johannes Schindelin
2006-12-04 19:37 ` Jakub Narebski
2006-12-04 20:26 ` Linus Torvalds
0 siblings, 2 replies; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-04 19:10 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Hi,
On Mon, 4 Dec 2006, Jakub Narebski wrote:
> Johannes Schindelin wrote:
>
> > On Mon, 4 Dec 2006, Jakub Narebski wrote:
> >
> >> [...] git should acquire core.filesystemEncoding configuration variable
> >> which would encode from filesystem encoding used in working directory
> >> and perhaps index to UTF-8 encoding used in repository (in tree objects)
> >> and perhaps index.
> >
> > So, you want to pull in all thinkable encodings? Of course, you could rely
> > on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
> > uses it already. But I never use mailinfo, so I do not need libiconv.)
>
> A conditional dependency. If you don't have libiconv, this feature wouldn't
> be available.
You are speaking as somebody compiling git from source. We are a minority.
Ciao,
Dscho
* Re: Re: Moving a directory into another fails
2006-12-04 19:10 ` Johannes Schindelin
@ 2006-12-04 19:37 ` Jakub Narebski
[not found] ` <7617FA7E-D49A-4A4C-B033-C2CB20623F5F@wf227.com>
2006-12-04 20:26 ` Linus Torvalds
1 sibling, 1 reply; 24+ messages in thread
From: Jakub Narebski @ 2006-12-04 19:37 UTC (permalink / raw)
To: git
Johannes Schindelin wrote:
> On Mon, 4 Dec 2006, Jakub Narebski wrote:
>
>> Johannes Schindelin wrote:
>>
>>> On Mon, 4 Dec 2006, Jakub Narebski wrote:
>>>
>>>> [...] git should acquire core.filesystemEncoding configuration variable
>>>> which would encode from filesystem encoding used in working directory
>>>> and perhaps index to UTF-8 encoding used in repository (in tree objects)
>>>> and perhaps index.
>>>
>>> So, you want to pull in all thinkable encodings? Of course, you could rely
>>> on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
>>> uses it already. But I never use mailinfo, so I do not need libiconv.)
>>
>> A conditional dependency. If you don't have libiconv, this feature wouldn't
>> be available.
>
> You are speaking as somebody compiling git from source. We are a minority.
Usually iconv is in libc.
# Define NEEDS_LIBICONV if linking with libc is not enough (Darwin).
Hmm... perhaps not that usually. The uname based configuration in Makefile
(not the test based configuration provided by autoconf generated
./configure script) sets NEEDS_LIBICONV for: Darwin, SunOS 5.8, Cygwin,
FreeBSD and OpenBSD, some versions of NetBSD, AIX.
And HFS+ is on MacOS X / Darwin, without iconv in libc...
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Re: Moving a directory into another fails
2006-12-04 19:10 ` Johannes Schindelin
2006-12-04 19:37 ` Jakub Narebski
@ 2006-12-04 20:26 ` Linus Torvalds
2006-12-04 20:51 ` Linus Torvalds
` (3 more replies)
1 sibling, 4 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-12-04 20:26 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
On Mon, 4 Dec 2006, Johannes Schindelin wrote:
>
> On Mon, 4 Dec 2006, Jakub Narebski wrote:
>
> > Johannes Schindelin wrote:
> >
> > > On Mon, 4 Dec 2006, Jakub Narebski wrote:
> > >
> > >> [...] git should acquire core.filesystemEncoding configuration variable
> > >> which would encode from filesystem encoding used in working directory
> > >> and perhaps index to UTF-8 encoding used in repository (in tree objects)
> > >> and perhaps index.
> > >
> > > So, you want to pull in all thinkable encodings? Of course, you could rely
> > > on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
> > > uses it already. But I never use mailinfo, so I do not need libiconv.)
> >
> > A conditional dependency. If you don't have libiconv, this feature wouldn't
> > be available.
>
> You are speaking as somebody compiling git from source. We are a minority.
You guys are ignoring the _real_ problem.
It has nothing at all to do with dependencies on external packages. The
REAL problem is that if you do locale-dependent trees and other git
objects, git will STOP WORKING.
A filename in a tree object _has_ to be seen as a pure 8-bit character
stream. They _have_ to be compared with "memcmp()", and they have to sort
the same way and mean EXACTLY the same thing for everybody.
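To illustrate the point about ordering: a byte-wise comparison gives the same order no matter which locale the user has set, while locale-aware collation may not. The sketch below uses `LC_ALL=C sort` as a stand-in for memcmp()-style ordering. It is an illustration only, not git's actual tree-sorting code, and `sort_bytewise` is a made-up name.

```shell
#!/bin/sh
# Sketch only: order names by raw byte value, independent of the caller's locale.
# sort_bytewise is a hypothetical helper.
sort_bytewise() {
    # LC_ALL=C makes sort compare byte values instead of using locale collation.
    LC_ALL=C sort
}

# 'B' (0x42) sorts before 'a' (0x61) and 'b' (0x62) in byte order.
printf '%s\n' b B a | sort_bytewise
```

Running plain `sort` on the same input under a non-C locale can put the names in a different order, which is exactly the kind of disagreement between users that a content-addressed tree cannot tolerate.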
If a filesystem cannot represent that name AS THAT BYTE SEQUENCE then the
filesystem is broken. No ifs, buts, maybes about it. I'm sorry, but that's
how it is.
This is _exactly_ the same issue as case independence. Git does not ignore
case, and it really CANNOT ignore case. Ignoring case would cause horrible
and deep problems, and it has nothing to do with dependencies on libraries
(although it _would_ get much much worse from locale settings, and again
having different locales compare the same name differently because case
rules are different).
So it really boils down to one thing: git saves a byte stream. Not text.
This is true for all levels of the git archive. It's true for blob
content, it's true for filenames in trees, and it is true for commits. The
commit message is actually somewhat easier (because we have nothing to
"compare" it to afterwards in the checked-out tree), so the commit message
is the _one_ thing we can kind of play games with, but even there, once
it's done, it's done, and it's just a stream of bytes.
* Re: Re: Moving a directory into another fails
2006-12-04 20:26 ` Linus Torvalds
@ 2006-12-04 20:51 ` Linus Torvalds
2006-12-04 20:54 ` Shawn Pearce
` (2 subsequent siblings)
3 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-12-04 20:51 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
On Mon, 4 Dec 2006, Linus Torvalds wrote:
>
> If a filesystem cannot represent that name AS THAT BYTE SEQUENCE then the
> filesystem is broken. No ifs, buts, maybes about it. I'm sorry, but that's
> how it is.
Btw, what this means in practice is that when git creates a file with a
certain sequence of bytes, then
(a) readdir had better return _that_ sequence of bytes, or git will see
it as something else.
(b) opening it with that same sequence of bytes had better work.
This does not mean that a filesystem may not internally use some other
encoding. It just means that the filesystem - when converting back and
forth between the internal encoding and the one it shows to user space -
had better convert back to the exact same thing.
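Conditions (a) and (b) can be checked from a script. The following is a rough sketch under a few assumptions: `name_roundtrips` is a hypothetical helper, it ignores dot-files and names containing '/', and it tests whichever filesystem `mktemp` happens to use.

```shell
#!/bin/sh
# Sketch only: does the filesystem hand a file name back byte-for-byte?
# name_roundtrips is a hypothetical helper, not part of git.
name_roundtrips() {
    name=$1
    dir=$(mktemp -d) || return 2
    status=1
    if : >"$dir/$name"; then
        # (a) the directory listing must return exactly the bytes we used
        for entry in "$dir"/*; do
            [ "${entry##*/}" = "$name" ] && status=0
        done
        # (b) opening the file by the original bytes must still work
        [ -f "$dir/$name" ] || status=1
    fi
    rm -rf "$dir"
    return "$status"
}

# Try it with a name that contains the two bytes \303\244.
if name_roundtrips "$(printf 'M\303\244rchen')"; then
    echo "name came back unchanged"
else
    echo "name was altered or could not be created"
fi
```

Which branch is taken depends on the filesystem under the temporary directory, so no particular output is implied here.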
Also, note that for most projects, even a broken filesystem doesn't
actually matter - it's enough that the filesystem gets the conversions
right for the particular set of names in a particular project. So any
project that just has 7-bit filenames will obviously never even see any
issues at all, even if the filesystem it runs on then does something
strange with 8-bit filenames.
This is one reason why UNIX's "everything is a stream of bytes" is so
important, and why programs should generally work with byte streams, not
"wide strings" or similar. It's the only way that you can reliably work
across different locales. Use wide strings and locale-specific stuff
_only_ for actually showing users something on the tty, for example.
* Re: Re: Moving a directory into another fails
2006-12-04 20:26 ` Linus Torvalds
2006-12-04 20:51 ` Linus Torvalds
@ 2006-12-04 20:54 ` Shawn Pearce
2006-12-04 20:56 ` Jakub Narebski
2006-12-04 21:05 ` Johannes Schindelin
3 siblings, 0 replies; 24+ messages in thread
From: Shawn Pearce @ 2006-12-04 20:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Johannes Schindelin, Jakub Narebski, git
Linus Torvalds <torvalds@osdl.org> wrote:
> You guys are ignoring the _real_ problem.
>
> It has nothing at all to do with dependencies on external packages. The
> REAL problem is that if you do locale-dependent trees and other git
> objects, git will STOP WORKING.
Yes!
In jgit I assumed all tree entry names were encoded in UTF8.
Then I later learned they aren't. Foolish me.
As Linus points out, it's a HUGE problem that the caller of
git-write-tree gets to decide what encoding should be used for
that tree. Especially if someone else wants to use a different
encoding for the same filename (think ISO-8859-1 vs. UTF-8)!
I'd rather just force the tree entry names to be encoded in UTF-8
always, as it's compact for most western texts (which many filenames
are), and at least degrades to supporting the non western texts.
A per-project setting is essentially impossible as we have
no such concept today, and a per-repository setting (like
i18n.commitEncoding) lets two different users encode the same
filename differently, which means two different tree SHA1s with
the exact same content... not correct!
> This is true for all levels of the git archive. It's true for blob
> content, it's true for filenames in trees, and it is true for commits. The
> commit message is actually somewhat easier (because we have nothing to
> "compare" it to afterwards in the checked-out tree), so the commit message
> is the _one_ thing we can kind of play games with, but even there, once
> it's done, it's done, and it's just a stream of bytes.
Commit encoding is a problem. Clearly the "header parts"
(tree, parent) are US-ASCII but the author and committer lines
can be anything. So can the body. And we have no way of knowing
what encoding was used years later, we can only guess and display
it wrong.
We really should either normalize all commit messages to a single
encoding (again, UTF-8) or embed the encoding as part of the headers
somehow (e.g. look at how XML embeds the document encoding in the
start of the document).
* Re: Moving a directory into another fails
2006-12-04 20:26 ` Linus Torvalds
2006-12-04 20:51 ` Linus Torvalds
2006-12-04 20:54 ` Shawn Pearce
@ 2006-12-04 20:56 ` Jakub Narebski
2006-12-04 21:05 ` Johannes Schindelin
3 siblings, 0 replies; 24+ messages in thread
From: Jakub Narebski @ 2006-12-04 20:56 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Johannes Schindelin, git
On Monday, 4 December 2006 21:26, Linus Torvalds wrote:
>>>> On Mon, 4 Dec 2006, Jakub Narebski wrote:
>>>>
>>>>> [...] git should acquire core.filesystemEncoding configuration variable
>>>>> which would encode from filesystem encoding used in working directory
>>>>> and perhaps index to UTF-8 encoding used in repository (in tree objects)
>>>>> and perhaps index.
> You guys are ignoring the _real_ problem.
>
> It has nothing at all to do with dependencies on external packages. The
> REAL problem is that if you do locale-dependent trees and other git
> objects, git will STOP WORKING.
>
> A filename in a tree object _has_ to be seen as a pure 8-bit character
> stream. They _have_ to be compared with "memcmp()", and they have to sort
> the same way and mean EXACTLY the same thing for everybody.
What I propose is having the filename in a tree object UTF-8 encoded. I don't
know if git relies heavily on the filename encoding on the filesystem (in the
working area) being the same as in the index and in a tree object.
Although I'm not sure what the problem is. You check out a non-US-ASCII
filename from git; the file can have strange characters in its name, but it
should encode to the filename exactly as it is in git. The problem might be
characters in a filename that the filesystem forbids, perhaps.
Although Wolfgang Fischer wrote (to me and Johannes Schindelin) that HFS+
accepts UTF8-NFC (Normalization-Form-Composed) when creating a file, while
readdir returns the encoding used by HFS+, which is UTF8-NFD
(Normalization-Form-Decomposed). [Expletive censored]
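A small sketch of why this matters to git: the composed and decomposed spellings look identical on screen but are different byte sequences, so a byte-wise comparison treats them as two different names. The helper names `byte_len` and `same_bytes` below are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the same visible name in two Unicode normalization forms.
nfc=$(printf 'M\303\244rchen')     # U+00E4, one precomposed code point
nfd=$(printf 'Ma\314\210rchen')    # U+0061 followed by combining U+0308

# Number of bytes in a string (hypothetical helper).
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# True only if both strings are byte-for-byte identical (hypothetical helper).
same_bytes() {
    [ "$1" = "$2" ]
}

echo "NFC: $(byte_len "$nfc") bytes, NFD: $(byte_len "$nfd") bytes"
if same_bytes "$nfc" "$nfd"; then
    echo "same name as far as git is concerned"
else
    echo "two different names as far as git is concerned"
fi
```

The composed form is 8 bytes and the decomposed form is 9, which is why a file checked out under one spelling and listed back under the other shows up as untracked.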
> If a filesystem cannot represent that name AS THAT BYTE SEQUENCE then the
> filesystem is broken. No ifs, buts, maybes about it. I'm sorry, but that's
> how it is.
We have some configuration variables to work around broken filesystems,
like core.ignoreStat, so why not core.filesystemEncoding?
--
Jakub Narebski
* Re: Moving a directory into another fails
[not found] ` <7617FA7E-D49A-4A4C-B033-C2CB20623F5F@wf227.com>
@ 2006-12-04 21:01 ` Johannes Schindelin
0 siblings, 0 replies; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-04 21:01 UTC (permalink / raw)
To: Wolfgang Fischer; +Cc: Jakub Narebski, git
Hi,
thank you, Wolfgang, for back Cc'ing me (Jakub never does that...), but I
am Cc'ing the git list here, also. Hope both of you don't mind.
On Mon, 4 Dec 2006, Wolfgang Fischer wrote:
> On 04.12.2006, at 20:37, Jakub Narebski wrote:
>
> > And HFS+ is on MacOS X / Darwin, without iconv in libc...
>
> And, what is even worse, is the fact that HFS+ uses an encoding, which is not
> represented in libiconv.
>
> If you CREATE a file, you can use UTF8-NFC
> (Normalization-Form-Composed), but if you later READDIR a directory, you
> will get the very same name back in the encoding used by HFS+, which is
> UTF8-NFD Normalization-Form-Decomposed. The difference is noticeable for
> some non-ASCII characters like e.g.
>
> LATIN SMALL LETTER A WITH DIAERESIS U+00E4 or U+0061 U+0308 in Unicode.
>
> If you need a sane backward mapping, one has to use some CoreFoundation
> interface, for which I removed the details out of my brain, in order to
> reclaim that memory area (garbage collection!). But I can help you with
> some details and probably code, if you really need that conversion
> direction.
Yes. When my iBook was still alive, I saw that problem, too: writing and
reading filenames were completely different issues.
Worse, you can experience the same on USB-Sticks when accessing them with
different OSes. For example, when checking out a git repo on a stick with
Linux, and then calling git-status on the same stick with Windows XP, you
see an issue with the file "Märchen", like you did on MacOSX.
So, please, please, please do not try to be smart about filename encodings
in git, but just DO NOT USE ANYTHING BUT ASCII IN FILENAMES IF THE
REPOSITORY IS GOING TO BE PUT ON DIFFERENT OPERATING SYSTEMS/FILE SYSTEMS.
(Wow, the Caps Lock key is _not_ dead after all. I must have been infected
by Linus...)
Ciao,
Dscho
* Re: Re: Moving a directory into another fails
2006-12-04 20:26 ` Linus Torvalds
` (2 preceding siblings ...)
2006-12-04 20:56 ` Jakub Narebski
@ 2006-12-04 21:05 ` Johannes Schindelin
2006-12-04 21:23 ` Linus Torvalds
3 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-04 21:05 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jakub Narebski, git
Hi,
On Mon, 4 Dec 2006, Linus Torvalds wrote:
> On Mon, 4 Dec 2006, Johannes Schindelin wrote:
> >
> > On Mon, 4 Dec 2006, Jakub Narebski wrote:
> >
> > > Johannes Schindelin wrote:
> > >
> > > > On Mon, 4 Dec 2006, Jakub Narebski wrote:
> > > >
> > > >> [...] git should acquire core.filesystemEncoding configuration variable
> > > >> which would encode from filesystem encoding used in working directory
> > > >> and perhaps index to UTF-8 encoding used in repository (in tree objects)
> > > >> and perhaps index.
> > > >
> > > > So, you want to pull in all thinkable encodings? Of course, you could rely
> > > > on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
> > > > uses it already. But I never use mailinfo, so I do not need libiconv.)
> > >
> > > A conditional dependency. If you don't have libiconv, this feature wouldn't
> > > be available.
> >
> > You are speaking as somebody compiling git from source. We are a minority.
>
> You guys are ignoring the _real_ problem.
>
> It has nothing at all to do with dependencies on external packages. The
> REAL problem is that if you do locale-dependent trees and other git
> objects, git will STOP WORKING.
The issue was _not_ locale-dependent trees, but file systems which
_change_ the encoding. And even then, Jakub's proposed reencoding could
work, because it is an _encoding_ after all, i.e. bijective (a reversible
mapping, for you non-Math guys). Not at all comparable to case
insensitivity, which _loses_ information.
But for reasons described in another mail, there are more fundamental
problems with encodings, especially with MacOSX which (braindeadly)
encodes _differently_ when writing and reading.
So, we reach the same conclusion, but for different reasons.
Ciao,
* Re: Re: Moving a directory into another fails
2006-12-04 21:05 ` Johannes Schindelin
@ 2006-12-04 21:23 ` Linus Torvalds
2006-12-05 7:34 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-12-04 21:23 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
On Mon, 4 Dec 2006, Johannes Schindelin wrote:
>
> The issue was _not_ locale-dependent trees, but file systems which
> _change_ the encoding.
Correct. However, it doesn't really change the issue: some byte streams
may simply not work in certain encodings.
You could, of course, basically do some kind of "escape high characters"
on the filename if it has characters in it that you suspect might cause
problems, but you'd better make 100% sure that it really is 100%
reversible (and you need to do all the real operations on the _native_git_
version of the filename).
So we _could_ use a flag that says "escape all filenames", but it would
not be a _locale_ setting, it would really be a per-repository setting,
and it wouldn't be "iconv", it would be something similar to what we do
for "git diff" when we escape filenames with strange characters in them.
We could do it by changing every "open()/creat()" and "[l]stat()" on the
working tree with something that first escapes the filename.
Then, people with broken filesystems could set
[core]
escapefilenames = true
and instead of seeing 8-bit filenames, they'd see filenames with 7 bits
and escapes. They could work with such a repo, for sure. It would be ugly
as hell, though.
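To make the idea concrete, here is a rough sketch of a fully reversible escape of the kind described above, written as two shell functions. The escape format (`\0ooo`, chosen only because `printf '%b'` can decode it) and the names `escape_name` and `unescape_name` are assumptions for illustration. This is neither git's own quoting format nor an existing git option.

```shell
#!/bin/sh
# Sketch only: a reversible 7-bit escape for file names.
# escape_name / unescape_name are hypothetical helpers.

# Write every byte >= 0x80, and the backslash itself, as \0ooo (octal).
escape_name() {
    # od prints each byte of the name as a three-digit octal number.
    for oct in $(printf '%s' "$1" | od -An -v -to1); do
        dec=$((0$oct))
        if [ "$dec" -ge 128 ] || [ "$dec" -eq 92 ]; then
            printf '\\0%s' "$oct"      # keep the escape as literal text
        else
            printf '%b' "\\0$oct"      # emit the plain byte itself
        fi
    done
    echo
}

# Reverse the mapping: printf %b expands each \0ooo back into one byte.
unescape_name() {
    printf '%b\n' "$1"
}

escaped=$(escape_name "$(printf 'M\303\244rchen')")
echo "$escaped"
unescape_name "$escaped"
```

Because the backslash is escaped too, every backslash in the escaped form starts a `\0ooo` sequence, which is what makes the mapping reversible. Names ending in a newline are not handled, since command substitution drops trailing newlines.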
* Re: Re: Moving a directory into another fails
2006-12-04 21:23 ` Linus Torvalds
@ 2006-12-05 7:34 ` Johannes Schindelin
2006-12-05 9:36 ` Jakub Narebski
2006-12-05 17:11 ` Linus Torvalds
0 siblings, 2 replies; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-05 7:34 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jakub Narebski, git
Hi,
On Mon, 4 Dec 2006, Linus Torvalds wrote:
> [core]
> escapefilenames = true
I think this goes too far. The problem _only_ showed up with a made-up
test case for gitweb. Let's bite the apple when we _have_ to (which I
doubt will happen, because for the most part, developers understand that
spaces and umlauts have _no_ place in filenames, basically since UNIX was
invented by stupid US Americans who did not know anything about nice
filenames, let alone other languages than English and C).
Ciao,
Dscho
* Re: Moving a directory into another fails
2006-12-05 7:34 ` Johannes Schindelin
@ 2006-12-05 9:36 ` Jakub Narebski
2006-12-05 14:11 ` filesystem encodings and gitweb tests, was " Johannes Schindelin
2006-12-05 17:11 ` Linus Torvalds
1 sibling, 1 reply; 24+ messages in thread
From: Jakub Narebski @ 2006-12-05 9:36 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, git
Johannes Schindelin wrote:
> On Mon, 4 Dec 2006, Linus Torvalds wrote:
>
>> [core]
>> escapefilenames = true
>
> I think this goes too far. The problem _only_ showed up with a made-up
> test case for gitweb. Let's bite the apple when we _have_ to (which I
> doubt will happen, because for the most part, developers understand that
> spaces and umlauts have _no_ place in filenames, basically since UNIX was
> invented by stupid US Americans who did not know anything about nice
> filenames, let alone other languages than English and C).
No, the problem showed up with stupid HFS+, which uses one encoding when
creating a file and a different one for readdir.
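A small sketch of what such a mismatch looks like, assuming the difference is Unicode normalization (HFS+ stores names in a decomposed form, so readdir can return different bytes than the ones used to create the file). The example name and the same_name helper are made up for illustration.

```python
import unicodedata

name = "M\u00e4rchen"  # 'a with diaeresis' as one precomposed code point

composed = unicodedata.normalize("NFC", name)    # what was passed at creation
decomposed = unicodedata.normalize("NFD", name)  # what readdir might hand back

composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to the same normal form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

A byte-wise comparison treats the two forms as different files, which is how a file can show up as both missing and untracked; comparing after normalization does not.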
Perhaps we should remove gitweb/test directory, and move testing gitweb
to proper place, t/ directory.
By the way, would it be correct to use external tools (if they exist),
namely HTMLtidy in gitweb output test to-be-written?
--
Jakub Narebski
* filesystem encodings and gitweb tests, was Re: Moving a directory into another fails
2006-12-05 9:36 ` Jakub Narebski
@ 2006-12-05 14:11 ` Johannes Schindelin
2006-12-05 14:29 ` Jakub Narebski
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-05 14:11 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Linus Torvalds, git
Hi,
On Tue, 5 Dec 2006, Jakub Narebski wrote:
> Johannes Schindelin wrote:
>
> > On Mon, 4 Dec 2006, Linus Torvalds wrote:
> >
> >> [core]
> >> escapefilenames = true
> >
> > I think this goes too far. The problem _only_ showed up with a made-up
> > test case for gitweb. Let's bite the apple when we _have_ to (which I
> > doubt will happen, because for the most part, developers understand that
> > spaces and umlauts have _no_ place in filenames, basically since UNIX was
> > invented by stupid US Americans who did not know anything about nice
> > filenames, let alone other languages than English and C).
>
> No, the problem showed up with stupid HFS+, which uses one encoding when
> creating a file and a different one for readdir.
This is just one of the problems. I described another problem in this
thread, namely a repo on a usb stick being accessed from different hosts.
> Perhaps we should remove gitweb/test directory, and move testing gitweb
> to proper place, t/ directory.
If you do that, please make sure that these tests can be disabled (a la
svn tests), so that people who are not interested in gitweb, or who lack
the programs to test it, do not have to suffer.
> By the way, would it be correct to use external tools (if they exist),
> namely HTMLtidy in gitweb output test to-be-written?
(Yes, they exist. HTMLtidy for example ;-)
IMHO if such a tool is common enough, you should use it. If anybody steps
forward providing automated HTTP tests, I will not complain, and certainly
not about testing with things like HTMLtidy.
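One way to read the two requests above (tests that can be switched off, and tests that depend on an external tool) is a guard like the following sketch. The variable name NO_GITWEB_TESTS and the skip_reason helper are hypothetical, chosen by analogy with the svn tests; "tidy" is assumed to be the name of the HTML Tidy executable.

```python
import os
import shutil


def skip_reason(required_tools, env=None, which=shutil.which):
    """Return a reason to skip the tests, or None if they should run."""
    env = os.environ if env is None else env
    # Explicit opt-out (hypothetical variable name).
    if env.get("NO_GITWEB_TESTS"):
        return "gitweb tests disabled by NO_GITWEB_TESTS"
    # Skip quietly when an external tool is not installed.
    missing = [tool for tool in required_tools if which(tool) is None]
    if missing:
        return "missing tools: " + ", ".join(missing)
    return None


if __name__ == "__main__":
    reason = skip_reason(["tidy"])
    print("skip: " + reason if reason else "run gitweb output tests")
```

Passing env and which as parameters keeps the guard itself testable without touching the real environment.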
Ciao,
Dscho
* Re: filesystem encodings and gitweb tests, was Re: Moving a directory into another fails
2006-12-05 14:11 ` filesystem encodings and gitweb tests, was " Johannes Schindelin
@ 2006-12-05 14:29 ` Jakub Narebski
2006-12-05 14:40 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jakub Narebski @ 2006-12-05 14:29 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, git
Johannes Schindelin wrote:
> On Tue, 5 Dec 2006, Jakub Narebski wrote:
>
>> No, the problem showed up with stupid HFS+, which uses one encoding when
>> creating a file and a different one for readdir.
>
> This is just one of the problems. I described another problem in this
> thread, namely a repo on a usb stick being accessed from different hosts.
That is not much of a problem. Yes, the filenames on different hosts would
_look_ different, but shouldn't be detected as new files.
--
Jakub Narebski
* Re: filesystem encodings and gitweb tests, was Re: Moving a directory into another fails
2006-12-05 14:29 ` Jakub Narebski
@ 2006-12-05 14:40 ` Johannes Schindelin
0 siblings, 0 replies; 24+ messages in thread
From: Johannes Schindelin @ 2006-12-05 14:40 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Linus Torvalds, git
Hi,
On Tue, 5 Dec 2006, Jakub Narebski wrote:
> Johannes Schindelin wrote:
>
> > On Tue, 5 Dec 2006, Jakub Narebski wrote:
> >
> >> No, the problem showed up with stupid HFS+, which uses one encoding when
> >> creating a file and a different one for readdir.
> >
> > This is just one of the problems. I described another problem in this
> > thread, namely a repo on a usb stick being accessed from different hosts.
>
> That is not much of a problem. Yes, the filenames on different hosts would
> _look_ different, but shouldn't be detected as new files.
Sorry, I should have been clearer. I meant a repo _and_ a working
directory going along with it, on a USB stick. They do look different on
different hosts, and git-status looks different as a consequence ;-)
Ciao,
Dscho
* Re: Re: Moving a directory into another fails
2006-12-05 7:34 ` Johannes Schindelin
2006-12-05 9:36 ` Jakub Narebski
@ 2006-12-05 17:11 ` Linus Torvalds
1 sibling, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-12-05 17:11 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
On Tue, 5 Dec 2006, Johannes Schindelin wrote:
>
> On Mon, 4 Dec 2006, Linus Torvalds wrote:
>
> > [core]
> > escapefilenames = true
>
> I think this goes too far.
Sure, I agree that in _practice_ this isn't actually a problem, because
people have long since learnt to avoid strange filenames in SCMs, simply
because you can't get it right with insane filesystems.
That said, it might be a good idea to abstract out the create/read phase
for filenames in the working tree regardless, since that also tends to be
an area where other issues can come up (whoops - '/' vs '\' as the
directory separator etc).
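A minimal sketch of what abstracting out that phase could look like, assuming that stored names always use '/' and that every working-tree access goes through one pair of translation functions. The names to_native and from_native are hypothetical and only the separator issue is handled here.

```python
import os


def to_native(repo_path: str, sep: str = os.sep) -> str:
    """Translate a stored '/'-separated name to the working-tree form."""
    return repo_path.replace("/", sep)


def from_native(native_path: str, sep: str = os.sep) -> str:
    """Translate a working-tree name back to the stored '/'-separated form."""
    return native_path.replace(sep, "/")
```

With a single choke point like this, other per-filesystem translations (such as escaping) would have one place to live. Note that the mapping is only reversible as long as stored names never contain the native separator themselves, for example a literal backslash in a name checked out where '\' is the separator.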
end of thread
Thread overview: 24+ messages
2006-07-26 15:00 Moving a directory into another fails Jon Smirl
2006-07-26 22:34 ` Nicolas Vilz
2006-07-26 23:03 ` Jon Smirl
2006-07-26 23:25 ` Nicolas Vilz
2006-07-28 1:43 ` Petr Baudis
2006-12-04 18:19 ` Stefan Pfetzing
2006-12-04 18:56 ` Jakub Narebski
2006-12-04 19:03 ` Johannes Schindelin
2006-12-04 19:10 ` Jakub Narebski
2006-12-04 19:10 ` Johannes Schindelin
2006-12-04 19:37 ` Jakub Narebski
[not found] ` <7617FA7E-D49A-4A4C-B033-C2CB20623F5F@wf227.com>
2006-12-04 21:01 ` Johannes Schindelin
2006-12-04 20:26 ` Linus Torvalds
2006-12-04 20:51 ` Linus Torvalds
2006-12-04 20:54 ` Shawn Pearce
2006-12-04 20:56 ` Jakub Narebski
2006-12-04 21:05 ` Johannes Schindelin
2006-12-04 21:23 ` Linus Torvalds
2006-12-05 7:34 ` Johannes Schindelin
2006-12-05 9:36 ` Jakub Narebski
2006-12-05 14:11 ` filesystem encodings and gitweb tests, was " Johannes Schindelin
2006-12-05 14:29 ` Jakub Narebski
2006-12-05 14:40 ` Johannes Schindelin
2006-12-05 17:11 ` Linus Torvalds