From: Linus Torvalds <torvalds@osdl.org>
To: Alex Riesen <raa.lkml@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: bug: git-repack -a -d produces broken pack on NFS
Date: Thu, 27 Apr 2006 16:54:34 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0604271630030.3701@g5.osdl.org>
In-Reply-To: <20060427213207.GA6709@steel.home>
Ok, trying to think some more about this..
On Thu, 27 Apr 2006, Alex Riesen wrote:
>
> $SRC/linux.git$ git repack -a -d
> Generating pack...
> Done counting 235947 objects.
> Deltifying 235947 objects.
> 100% (235947/235947) done
> Writing 235947 objects.
> 100% (235947/235947) done
> Total 235947, written 235947 (delta 182131), reused 235466 (delta 181650)
> Pack pack-6dcda5a7782864d57ec44bd30ebec13b07df2c87 created.
> $SRC/linux.git$ git fsck-objects --full
> git-fsck-objects: error: Packfile .git/objects/pack/pack-6dcda5a7782864d57ec44bd30ebec13b07df2c87.pack SHA1 mismatch with idx
This is interesting on so many levels.
First off, the index file or the pack-file is clearly somehow corrupt,
because when you then try to do the "git clone" off the result later on
(which won't actually check the SHA1's), it gets
> git-index-pack: fatal: packfile '/mnt/large/tmp/raa/tmp/.git/objects/pack/tmp-wcRvk5': bad object at offset 102601801: inflate returned -3
which means that either the offset was wrong, or the data at that offset
was wrong.
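For reference, -3 is zlib's Z_DATA_ERROR. The following is only a rough illustration of the two cases, not git code: the "pack" here is a made-up 4-byte header followed by a single zlib stream, and `try_inflate` is a hypothetical helper. Both a wrong starting offset and damaged bytes at the right offset end in the same error code.

```python
import zlib

payload = b"blob contents " * 64
stream = zlib.compress(payload)


def try_inflate(buf):
    """Return the inflated bytes, or the zlib error text if inflate fails."""
    try:
        return zlib.decompress(buf)
    except zlib.error as exc:
        return str(exc)


# Hypothetical "pack": a 4-byte header, then one zlib stream at offset 4.
pack = b"PACK" + stream

# Right offset, intact data: inflates back to the payload.
good = try_inflate(pack[4:])

# Wrong offset: inflate starts on bytes that are not a zlib header.
wrong_offset = try_inflate(pack[0:])

# Right offset, wrong data: flip the last byte of the stream's checksum.
damaged = bytearray(pack)
damaged[-1] ^= 0xFF
wrong_data = try_inflate(bytes(damaged[4:]))

print(good == payload)
print(wrong_offset)
print(wrong_data)
```

The error code alone does not say which of the two happened, which is why the message by itself cannot settle it.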
That made me suspect the object re-use code - it might have been broken in
the original pack, and then on re-use the broken data would have been just
copied over.
HOWEVER - that doesn't actually fly as an explanation, because even if the
data itself was broken, the repack would have re-generated the SHA1, so if
the problem had been about copying an already broken pack over, you'd have
gotten the "git clone" error, but you would _not_ have gotten the "pack
SHA1 does not match index" error.
So in order for the SHA1 to not match, we literally must have corrupted
things when we created the pack-file.
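A small sketch of that argument, under stated assumptions: the pack ends with a 20-byte SHA1 of everything before it, and the idx ends with a copy of that pack checksum followed by the idx's own checksum. `write_pack`, `write_idx` and the two check functions are hypothetical names for illustration, and the "objects" are just placeholder bytes.

```python
import hashlib


def write_pack(body: bytes) -> bytes:
    """Append the SHA1 of the body as a 20-byte trailer."""
    return body + hashlib.sha1(body).digest()


def write_idx(entries: bytes, pack: bytes) -> bytes:
    """Record the pack's trailer in the idx, then checksum the idx itself."""
    head = entries + pack[-20:]
    return head + hashlib.sha1(head).digest()


def pack_trailer_ok(pack: bytes) -> bool:
    """True if the trailer is the SHA1 of the bytes that precede it."""
    return hashlib.sha1(pack[:-20]).digest() == pack[-20:]


def pack_matches_idx(pack: bytes, idx: bytes) -> bool:
    """True if the pack checksum stored in the idx equals the pack trailer."""
    return idx[-40:-20] == pack[-20:]


# Case 1: already-broken object data is copied into a freshly written pack.
# The checksums are computed over what was written, so they still agree.
broken_body = b"object-1" + b"\x00garbage\x00" + b"object-3"
pack = write_pack(broken_body)
idx = write_idx(b"offsets...", pack)
reuse_is_consistent = pack_trailer_ok(pack) and pack_matches_idx(pack, idx)

# Case 2: the pack bytes change after the checksum was computed.
corrupted = bytearray(pack)
corrupted[3] ^= 0xFF
corruption_detected = not pack_trailer_ok(bytes(corrupted))

print(reuse_is_consistent, corruption_detected)
```

Only the second case can produce a checksum complaint, which is the reason object re-use does not explain the fsck error.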
However, I've stared and stared at the sha1file writing code, and I don't
see how you _could_ corrupt it. We use it with interruptible file
descriptors all the time (sockets - the exact same code is used to
transfer packs over the network), and that "intr" shouldn't matter one
whit. We're doing very safe things, as far as I can tell.
The thing is, even if a wild pointer corrupts the write buffer for the
sha1file writing code somehow, we actually always do the "calculate the
SHA1" and "flush the buffer to the file" together. So even if somebody
corrupted the buffer, we'd still generate the "right" SHA1 (of the
corrupted buffer).
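To make that concrete, here is a minimal sketch of a writer that hashes and flushes in the same step. It is an illustration in Python, not the actual sha1file code; the class name `HashedWriter`, its buffer size and its methods are assumptions made for the example.

```python
import hashlib
import io


class HashedWriter:
    """Buffer writes; on flush, feed the buffer to SHA1 and to the file together."""

    def __init__(self, out, bufsize=8192):
        self.out = out
        self.ctx = hashlib.sha1()
        self.buf = bytearray()
        self.bufsize = bufsize

    def write(self, data):
        self.buf += data
        if len(self.buf) >= self.bufsize:
            self.flush()

    def flush(self):
        # The same bytes go to the hash and to the file, in one place.
        self.ctx.update(self.buf)
        self.out.write(self.buf)
        self.buf.clear()

    def close(self):
        self.flush()
        digest = self.ctx.digest()
        self.out.write(digest)  # 20-byte trailer
        return digest


original = b"object data " * 10
out = io.BytesIO()
writer = HashedWriter(out)
writer.write(original)

# Simulate a wild pointer scribbling on the pending buffer before the flush.
writer.buf[5] ^= 0xFF

digest = writer.close()
written = out.getvalue()
body, trailer = written[:-20], written[-20:]

print(body != original)                        # the data really is corrupted
print(hashlib.sha1(body).digest() == trailer)  # yet the trailer still matches
```

In this model the only way to get a mismatching trailer is to damage the hash state itself or the bytes after they leave the writer.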
So the only thing that I can see that can generate bad SHA1 checksums is:
 - actual problem in the SHA1 buffers themselves (ie a wild pointer
   corrupting the "SHA1_CTX" thing itself)
 - real filesystem corruption. With NFS, the UDP checksums aren't all that
   strong, but the ethernet CRC should catch things (there have been
   reports of network cards that don't check the CRC well, but quite
   frankly, I haven't seen one in a _loong_ time)
 - RAM corruption and/or kernel NFS bugs.
I'll continue to stare at the code, but I can't see anything even remotely
suspicious in git itself so far.
Linus