* Error writing loose object on Cygwin
From: Shawn Pearce @ 2006-07-12 3:57 UTC
To: git
I've got a weird bug that a coworker just found today on Cygwin
running on XP. He was trying to do `git add foo` in a brand new
repository which was stored on a Solaris server[*1*] and mounted
on his XP desktop by way of samba[*2*].
My coworker received the "unable to write sha1 filename %s:%s"
error in move_temp_to_file during git-add. After sprinkling some
printfs all over that area of sha1_file.c I concluded that GIT was
receiving back EACCES as the error from the first link attempt in
link_temp_to_file (rather than ENOENT) when the parent directory
didn't exist.
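
A minimal sketch of how the errno in question could be checked outside of GIT, assuming a POSIX-like environment. The helper name probe_link_errno and the paths passed to it are hypothetical and are not part of sha1_file.c; the sketch only mirrors the failing step (a link() whose target directory does not exist).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create an empty file at tmppath, try to hard-link it to target, and
 * report what link() said.  Returns 0 if the link succeeded, the errno
 * from link() if it failed, or -1 if the temporary file could not be
 * created.  Both files are removed again before returning.
 */
static int probe_link_errno(const char *tmppath, const char *target)
{
	int fd, err = 0;

	fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	if (fd < 0)
		return -1;
	close(fd);

	if (link(tmppath, target) < 0)
		err = errno;	/* save before unlink() can change it */
	else
		unlink(target);	/* link unexpectedly succeeded; clean up */

	unlink(tmppath);
	return err;
}
```

Calling it with a target whose parent directory is missing, for example probe_link_errno("probe_tmp_file", "probe_missing_dir/object"), and printing the result with strerror() on each client would show whether that client sees ENOENT or EACCES.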
Reproducing it on XP systems was easy, as was working around the
problem:
    mkdir foo
    cd foo
    git init-db
    echo foo>foo
    git add foo             # dies with "unable to write sha1 filename"
    mkdir .git/objects/25
    git add foo             # now succeeds without error
What's more interesting is Windows 2000 systems accessing the same
Solaris server and the same samba server with the same version of
Cygwin didn't have any problems (the first git add succeeded).
This was Cygwin 1.5.19-4 and GIT 1.4.1. The tiny patch below fixes
the issue for us, but certainly seems like not the best way to go
about this... But right now I've got my coworkers running GIT 1.4.1
plus the patch below.
Has anyone else seen this type of behavior before? Any suggestions
on debugging this issue?
Footnotes:
[*1*] Yes, this Solaris server is the same one that has the old
compiler and almost no GNU tools, which means GIT, StGIT,
cogito and pg's higher level functions are all broken...
[*2*] Yes, Solaris is a real UNIX and the coworker should just use
GIT there. The problem is we have some GIT based scripts which
mirror a version control tool that is only available through
a Java applet running in Internet Explorer on a Windows
system. Which means although we can use GIT on Solaris
there are some operations that we need to execute on
Windows...
-->8--
Assume EACCES means ENOENT when creating sha1 objects.
---
 sha1_file.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 8179630..c04d6a5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1344,7 +1344,7 @@ static int link_temp_to_file(const char
 	 * else succeeded.
 	 */
 	ret = errno;
-	if (ret == ENOENT) {
+	if (ret == ENOENT || ret == EACCES) {
 		char *dir = strrchr(filename, '/');
 		if (dir) {
 			*dir = 0;
--
1.4.1

* Re: Error writing loose object on Cygwin
From: Junio C Hamano @ 2006-07-12 4:15 UTC
To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:
> Has anyone else seen this type of behavior before? Any suggestions
> on debugging this issue?

I would suggest raising this (politely) to Cygwin people.

* Re: Error writing loose object on Cygwin
From: Linus Torvalds @ 2006-07-12 4:36 UTC
To: Junio C Hamano; +Cc: Shawn Pearce, git

On Tue, 11 Jul 2006, Junio C Hamano wrote:
> Shawn Pearce <spearce@spearce.org> writes:
>
> > Has anyone else seen this type of behavior before? Any suggestions
> > on debugging this issue?
>
> I would suggest raising this (politely) to Cygwin people.

Well, since it apparently works with W2000, and breaks with XP, I suspect
it's actually Windows that just returns the wrong error code.

It's entirely possible that we should just make that whole

	if (ret == ENOENT)

go away. Yes, it's the right error code if a subdirectory is missing, and
yes, POSIX requires it, and yes, WXP is probably just a horrible piece of
sh*t, but on the other hand, I don't think git really has any serious
reason to even care.

So we might as well say that if the link() fails for _any_ reason, we'll
try to see if doing the mkdir() and re-trying the link helps.

		Linus
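
The idea of retrying after a mkdir() whenever link() fails, for any reason, can be sketched as a standalone function. This is an illustration under assumptions, not the code in sha1_file.c: the name link_with_mkdir_retry is hypothetical, and the shared-permission adjustment that GIT performs on the new directory is left out.

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hard-link tmppath to filename.  If the first link() fails for any
 * reason, create the parent directory of filename and try once more.
 * Returns 0 on success, otherwise the errno of the last link() attempt.
 * filename is modified temporarily and restored before returning.
 */
static int link_with_mkdir_retry(const char *tmppath, char *filename)
{
	char *dir;
	int ret;

	if (!link(tmppath, filename))
		return 0;
	ret = errno;

	dir = strrchr(filename, '/');
	if (dir) {
		*dir = 0;
		/* Result ignored on purpose: the directory may already exist. */
		mkdir(filename, 0777);
		*dir = '/';
		if (!link(tmppath, filename))
			return 0;
		ret = errno;
	}
	return ret;
}
```

The cost of dropping the errno check is one extra mkdir() and one extra link() on every failure, including failures that a missing directory cannot explain, such as a filesystem without hard-link support.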

* Re: Error writing loose object on Cygwin
From: Shawn Pearce @ 2006-07-12 5:00 UTC
To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> On Tue, 11 Jul 2006, Junio C Hamano wrote:
>
> > Shawn Pearce <spearce@spearce.org> writes:
> >
> > > Has anyone else seen this type of behavior before? Any suggestions
> > > on debugging this issue?
> >
> > I would suggest raising this (politely) to Cygwin people.
>
> Well, since it apparently works with W2000, and breaks with XP, I suspect
> it's actually Windows that just returns the wrong error code.
>
> It's entirely possible that we should just make that whole
>
> 	if (ret == ENOENT)
>
> go away. Yes, it's the right error code if a subdirectory is missing, and
> yes, POSIX requires it, and yes, WXP is probably just a horrible piece of
> sh*t, but on the other hand, I don't think git really has any serious
> reason to even care.
>
> So we might as well say that if the link() fails for _any_ reason, we'll
> try to see if doing the mkdir() and re-trying the link helps.

Hmm. It's a single mkdir call before we give up and tell the user
something is wrong.

The following change appears to work OK here on a reasonably POSIX
compliant system (OK meaning it reports errors reasonably). Given that
this type of error (failed link) shouldn't happen that often, except
on Coda or FAT (according to a comment in move_temp_to_file), I
guess the change is OK and comes with little penalty. But for Coda
and FAT users things are going to slow down a little bit as we try
mkdir for every new loose object being created before we try rename.

Tomorrow when I get access to my Cygwin system again I'll try to
write up a tiny test case which shows the error behavior we are
seeing and send it to the Cygwin mailing list, as this really does
seem to be a Cygwin or Windows issue. But of course having GIT
handle this case slightly better wouldn't be bad either. :-)

diff --git a/sha1_file.c b/sha1_file.c
index 8734d50..db4bddc 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1336,26 +1336,23 @@ static int link_temp_to_file(const char
 		return 0;
 
 	/*
-	 * Try to mkdir the last path component if that failed
-	 * with an ENOENT.
+	 * Try to mkdir the last path component if that failed.
 	 *
 	 * Re-try the "link()" regardless of whether the mkdir
 	 * succeeds, since a race might mean that somebody
 	 * else succeeded.
 	 */
 	ret = errno;
-	if (ret == ENOENT) {
-		char *dir = strrchr(filename, '/');
-		if (dir) {
-			*dir = 0;
-			mkdir(filename, 0777);
-			if (adjust_shared_perm(filename))
-				return -2;
-			*dir = '/';
-			if (!link(tmpfile, filename))
-				return 0;
-			ret = errno;
-		}
+	char *dir = strrchr(filename, '/');
+	if (dir) {
+		*dir = 0;
+		mkdir(filename, 0777);
+		if (adjust_shared_perm(filename))
+			return -2;
+		*dir = '/';
+		if (!link(tmpfile, filename))
+			return 0;
+		ret = errno;
 	}
 	return ret;
 }

* Re: Error writing loose object on Cygwin
From: Junio C Hamano @ 2006-07-13 4:27 UTC
To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:
> Tomorrow when I get access to my Cygwin system again I'll try to
> write up a tiny test case which shows the error behavior we are
> seeing and send it to the Cygwin mailing list, as this really does
> seem to be a Cygwin or Windows issue. But of course having GIT
> handle this case slightly better wouldn't be bad either. :-)

Surely, and thanks. I'll wait for a follow-up report, and until
then will hold onto this patch.

* Re: Error writing loose object on Cygwin
From: Christopher Faylor @ 2006-07-13 5:51 UTC
To: Junio C Hamano, Shawn Pearce, git

On Tue, Jul 11, 2006 at 09:15:38PM -0700, Junio C Hamano wrote:
>Shawn Pearce <spearce@spearce.org> writes:
>
>> Has anyone else seen this type of behavior before? Any suggestions
>> on debugging this issue?
>
>I would suggest raising this (politely) to Cygwin people.

I lost the thread here but wasn't this referring to a samba mount? If so,
it would be samba that's returning the wrong "errno".

cgf

* Re: Error writing loose object on Cygwin
From: Shawn Pearce @ 2006-07-14 3:34 UTC
To: Christopher Faylor; +Cc: git

Christopher Faylor <me@cgf.cx> wrote:
> On Tue, Jul 11, 2006 at 09:15:38PM -0700, Junio C Hamano wrote:
> >Shawn Pearce <spearce@spearce.org> writes:
> >
> >> Has anyone else seen this type of behavior before? Any suggestions
> >> on debugging this issue?
> >
> >I would suggest raising this (politely) to Cygwin people.
>
> I lost the thread here but wasn't this referring to a samba mount? If so,
> it would be samba that's returning the wrong "errno".

I thought about that but Windows 2000 talking to the same samba
server issues back the correct errno. Running the exact same Cygwin
and GIT binaries (we've at least standardized on that). So it
seems weird that a samba server is issuing the correct error code
to a Windows 2000 client but the wrong one to a Windows XP client.
(In both cases the clients are accessing directories on the same
filesystem on the UNIX server.)

-- 
Shawn.

* Re: Error writing loose object on Cygwin
From: Christopher Faylor @ 2006-07-14 5:24 UTC
To: Shawn Pearce, git

On Thu, Jul 13, 2006 at 11:34:35PM -0400, Shawn Pearce wrote:
>Christopher Faylor <me@cgf.cx> wrote:
>> On Tue, Jul 11, 2006 at 09:15:38PM -0700, Junio C Hamano wrote:
>> >Shawn Pearce <spearce@spearce.org> writes:
>> >
>> >> Has anyone else seen this type of behavior before? Any suggestions
>> >> on debugging this issue?
>> >
>> >I would suggest raising this (politely) to Cygwin people.
>>
>> I lost the thread here but wasn't this referring to a samba mount? If so,
>> it would be samba that's returning the wrong "errno".
>
>I thought about that but Windows 2000 talking to the same samba
>server issues back the correct errno. Running the exact same Cygwin
>and GIT binaries (we've at least standardized on that). So it
>seems weird that a samba server is issuing the correct error code
>to a Windows 2000 client but the wrong one to a Windows XP client.
>(In both cases the clients are accessing directories on the same
>filesystem on the UNIX server.)

It's entirely possible that samba is behaving differently with different
versions of windows. OTOH, I believe that EACCES is the catch-all for
windows errors when translating into errnos so possibly it is an uncaught
error translation.

If you have the inclination and time, you could run the session under
strace ("strace -o strace.out git ..."), snip twenty or thirty lines on
each side of the place where the errno translation is happening, and
send it to the cygwin list at cygwin at cygwin; maybe something will be
obvious.

Note that cygwin's strace is not anything like any other strace and is
quite a bit more wordy, so this file will be pretty large. That's why I
ask for some careful editing before sending it to the mailing list.

The errno number for EACCES on cygwin is 13.

cgf
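
The "catch-all" behavior described above can be pictured with a small translation function. This is only a sketch of the general pattern and not Cygwin's actual translation table, which is not shown in this thread: the function name and the DEMO_ constants are hypothetical, and only three Windows error codes are listed for illustration.

```c
#include <errno.h>

/* Illustrative subset of Windows error codes; names are hypothetical. */
enum {
	DEMO_ERROR_FILE_NOT_FOUND = 2,
	DEMO_ERROR_PATH_NOT_FOUND = 3,
	DEMO_ERROR_ACCESS_DENIED  = 5
};

/*
 * Map a Windows-style error code to a POSIX errno value.  Codes that
 * have no explicit entry fall through to EACCES, so an unexpected code
 * reported for a missing directory would surface as EACCES rather than
 * ENOENT.
 */
static int demo_win_error_to_errno(unsigned long win_error)
{
	switch (win_error) {
	case DEMO_ERROR_FILE_NOT_FOUND:
	case DEMO_ERROR_PATH_NOT_FOUND:
		return ENOENT;
	case DEMO_ERROR_ACCESS_DENIED:
		return EACCES;
	default:
		return EACCES;	/* catch-all for untranslated codes */
	}
}
```

Under this pattern, a client that reports a "path not found" condition with a code missing from the table would make link() appear to fail with EACCES, which would be consistent with the symptom reported at the start of the thread.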

* Re: Error writing loose object on Cygwin
From: Linus Torvalds @ 2006-07-14 5:26 UTC
To: Shawn Pearce; +Cc: Christopher Faylor, git

On Thu, 13 Jul 2006, Shawn Pearce wrote:
>
> I thought about that but Windows 2000 talking to the same samba
> server issues back the correct errno. Running the exact same Cygwin
> and GIT binaries (we've at least standardized on that). So it
> seems weird that a samba server is issuing the correct error code
> to a Windows 2000 client but the wrong one to a Windows XP client.

The samba connection protocol is fairly involved, and it will, as far as I
know, do a variety of "negotiation" of capabilities of both ends.

What a W2000 client does can very possibly be very different from what a
WXP client does, which in turn is certainly going to be different from a
W98 client. It will simply talk a different version of the protocol.

I am also told that the error codes actually differ between different
versions of the samba protocol - not in the sense that different events
generate different error codes, but that the _same_ error (say "ENOENT")
is actually represented with different numbering in "old Windows SMB" and
"new windows SMB".

I don't know the details, and may have gotten them wrong, but the point
is, it's not at all impossible that the exact same version of Samba on
the server will negotiate a different protocol because the client OS is
different, and even though the Cygwin libraries and git binaries are the
exact same libraries/binaries, they might get different error codes from
the same system call.

(This may also explain why there are two "samba clients" in the kernel:
the CONFIG_SMB and CONFIG_CIFS. CIFS is the "new version SMB", and the
CIFS client currently doesn't even understand the old version - so you
might use SMB for old servers, and CIFS for new servers)

That said, I thought W2000 and WXP both negotiated the "new" protocol, but
there are probably config details even within that one..

		Linus