From: Jeff King <peff@peff.net>
To: "Dale R. Worley" <worley@alum.mit.edu>
Cc: git@vger.kernel.org
Subject: Re: "git fsck" fails on malloc of 80 G
Date: Wed, 18 Dec 2013 16:58:21 -0500
Message-ID: <20131218215821.GA14276@sigill.intra.peff.net>
In-Reply-To: <201312180306.rBI36KCm016209@hobgoblin.ariadne.com>

On Tue, Dec 17, 2013 at 10:06:20PM -0500, Dale R. Worley wrote:

> Here's the basic backtrace information, and the values of the "size"
> variables, which seem to be the immediate culprits:
>
> [...]
> #1 0x00000000004f3633 in xmallocz (size=size@entry=80530636800)
>     at wrapper.c:73
> #2 0x00000000004d922f in unpack_compressed_entry (p=p@entry=0x7e4020,
>     w_curs=w_curs@entry=0x7fffffffc9f0, curpos=654214694, size=80530636800)
>     at sha1_file.c:1797
> #3 0x00000000004db4cb in unpack_entry (p=p@entry=0x7e4020,
>     obj_offset=654214688, final_type=final_type@entry=0x7fffffffd088,
>     final_size=final_size@entry=0x7fffffffd098) at sha1_file.c:2072
> #4 0x00000000004b1e3f in verify_packfile (base_count=0, progress=0x9bdd80,
>     fn=0x42fc00 <fsck_obj_buffer>, w_curs=0x7fffffffd090, p=0x7e4020)
>     at pack-check.c:119

Thanks, that's helpful. Unfortunately, the patch I mentioned before won't
help you. The packfile format (like the experimental loose format that my
patch dropped) stores the size outside of the zlib stream, so zlib's
checksum does not cover it. So it has the same problem: we allocate the
buffer for the zlib results up front, using that unverified size, before
inflating anything.

The pack index does store a crc (calculated when we made or received
the pack) over each object's on-disk representation. So we could check
that, though doing it on every access has performance implications.

The pack data itself also has a SHA-1 checksum over the whole thing. We
should probably do a better job in verify-pack of:

  1. Checking the whole sha1 checksum before doing anything else.

  2. In the uncommon case that it fails, checking each individual object
     crc to find the broken object (and if none, assuming either the
     header or the checksum itself is what got munged).

In the meantime, you should be able to do step 1 manually like:

  # check first N-20 bytes of packfile against the checksum in the
  # final 20 bytes. NB: pretty sure this use of "head" is a GNU-ism,
  # and of course you need openssl
  for i in objects/pack/*.pack; do
    tail -c 20 "$i" >want.tmp &&
    head -c -20 "$i" | openssl sha1 -binary >have.tmp &&
    cmp want.tmp have.tmp ||
    echo >&2 "broken: $i"
  done
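
If your "head" does not accept a negative count, something like the
following might work instead. It is only a sketch: sha1_hex and
check_pack_trailer are names I made up for illustration, not git
commands. It computes the byte count with "wc -c" and passes a positive
count to "head -c" (still not strictly POSIX, but widely available), and
it compares hex digests rather than temporary files. It assumes either
sha1sum or openssl is installed, and that you run it from the .git
directory as above.

```shell
# Sketch only: sha1_hex and check_pack_trailer are illustrative names,
# not git commands.

# Print the SHA-1 of stdin as lowercase hex, using whichever tool exists.
sha1_hex () {
	if command -v sha1sum >/dev/null 2>&1; then
		sha1sum | cut -d' ' -f1
	else
		openssl sha1 -binary | od -An -tx1 | tr -d ' \n'
	fi
}

# Succeed if the last 20 bytes of "$1" are the SHA-1 of everything
# before them. Returns 2 if the file is unreadable or too short.
check_pack_trailer () {
	pack=$1
	size=$(wc -c <"$pack") || return 2
	size=$((size))                  # strip any padding around the count
	[ "$size" -gt 20 ] || return 2
	want=$(tail -c 20 "$pack" | od -An -tx1 | tr -d ' \n')
	have=$(head -c "$((size - 20))" "$pack" | sha1_hex)
	[ "$want" = "$have" ]
}

for i in objects/pack/*.pack; do
	[ -e "$i" ] || continue         # glob matched nothing
	check_pack_trailer "$i" || echo >&2 "broken: $i"
done
```

A pack that prints "broken" here has a trailer that does not match its
contents, which is the case where step 2 above would be needed to find
the damaged object.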

git-fsck should be doing this check itself, but I wonder if you are not
making it that far.

> If I understand the code correctly, the object header buffer
> \260\200\200\200\340\022x\234\354\301\001\001
> really does encode the size value 0x12c0000000.

If it does, and you do not have an 80G file, then it sounds like you may
have a corrupt packfile.
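
For reference, the decoding can be checked by hand. A pack object header
keeps the type in bits 6-4 of the first byte and the low four size bits
in bits 3-0; each following byte contributes seven more size bits, least
significant group first, for as long as the high bit of the previous
byte is set. A rough sketch of that in shell (decode_pack_header is a
made-up name, and it assumes your shell does 64-bit arithmetic, which
bash does):

```shell
# Sketch only: decode_pack_header is an illustrative name, not a git command.
# Arguments are the header bytes as shell integer constants; a leading 0
# means octal, so the escapes from the gdb output can be passed as-is.
decode_pack_header () {
	[ $# -gt 0 ] || return 1
	byte=$(($1)); shift
	type=$(( (byte >> 4) & 7 ))     # bits 6-4 of the first byte: object type
	size=$(( byte & 15 ))           # bits 3-0 of the first byte: low size bits
	shift_bits=4
	while [ $(( byte & 128 )) -ne 0 ]; do   # high bit set: another byte follows
		[ $# -gt 0 ] || return 1            # ran out of bytes mid-header
		byte=$(($1)); shift
		size=$(( size | ((byte & 127) << shift_bits) ))
		shift_bits=$(( shift_bits + 7 ))
	done
	printf 'type=%d size=%d (0x%x)\n' "$type" "$size" "$size"
}

# The first six bytes quoted above; prints:
#   type=3 size=80530636800 (0x12c0000000)
decode_pack_header 0260 0200 0200 0200 0340 022
```

So the header as quoted does say "blob of 80530636800 bytes", matching
the size in the backtrace. The "x\234" that follows (0x78 0x9c) is the
usual start of a zlib stream, so the header ends where one would expect.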

-Peff