git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Joey Hess <joey@kitenet.net>
Cc: git@vger.kernel.org
Subject: Re: corrupt object memory allocation error
Date: Wed, 20 Nov 2013 16:33:48 -0500
Message-ID: <20131120213348.GA29004@sigill.intra.peff.net>
In-Reply-To: <20131120203350.GA31139@kitenet.net>

On Wed, Nov 20, 2013 at 04:33:50PM -0400, Joey Hess wrote:

> I've got a git repository of < 2 mb, where git wants to
> allocate a rather insane amount of memory:
> 
> >git fsck
> Checking object directories: 100% (256/256), done.
> fatal: Out of memory, malloc failed (tried to allocate 124865231165 bytes)
> 
> > git show 11644b5a075dc1425e01fbba51c045cea2d0c408
> fatal: Out of memory, malloc failed (tried to allocate 124865231165 bytes)
> 
> The problem seems to be the attached object file, which has gotten
> corrupted, presumably in the header that git reads to see how large it
> is. Thought I'd report this in case there is some easy way to
> add a sanity check.

Definitely a corrupt object. The start is not a valid zlib header, so we
guess that it is an "experimental loose object". This is a format that
git wrote for a very short period as a performance experiment; it didn't
pan out and we no longer write it.

The experimental format stores the (purported) object size outside of
the checksummed zlib data (whereas the normal format has a human-readable
header that gets zlib'd). Your corrupted bytes end up specifying a
ridiculously large size.
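
To illustrate how a few junk bytes can turn into an enormous size, here is a
minimal sketch of the variable-length type/size header used by pack entries,
assuming the experimental loose format is decoded the same way. The function
name and the sample bytes are hypothetical; this is not git's actual code.

```python
def decode_object_header(buf: bytes):
    """Decode a pack-style type/size header (illustrative sketch).

    First byte: bit 7 = "more bytes follow", bits 4-6 = object type,
    bits 0-3 = low 4 bits of the size. Each following byte contributes
    7 more size bits, least significant group first.
    """
    c = buf[0]
    obj_type = (c >> 4) & 0x07
    size = c & 0x0F
    shift = 4
    used = 1
    while c & 0x80:
        c = buf[used]
        used += 1
        size |= (c & 0x7F) << shift
        shift += 7
    return obj_type, size, used


# A well-formed two-byte header: type 1, size 165.
print(decode_object_header(bytes([0x95, 0x0A])))

# Hypothetical junk: every continuation bit set, so the size keeps growing.
junk = bytes([0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7F])
print(decode_object_header(junk)[1])  # a size of roughly half a terabyte
```

Nothing in this header is covered by a checksum, so the decoder has no way to
tell a plausible size from garbage before an allocation is attempted.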

I wonder if it is time to drop reading support for the experimental
objects. The format was never widely used, and was deprecated in v1.5.2 by
726f852 (deprecate the new loose object header format, 2007-05-09). That
would improve the case when the initial bytes of a loose object are
corrupted, because we would complain about the bogus zlib data before
trying to allocate the buffer.
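
As a rough sketch of why the normal format fails earlier: the "type size"
header sits inside the zlib stream, so the data must inflate cleanly before
any size is read. The helper name below is hypothetical, and inflating the
whole object at once is a simplification for illustration.

```python
import zlib


def read_legacy_loose_object(raw: bytes):
    """Parse a normal-format loose object: zlib("<type> <size>\\0<body>")."""
    # Corrupt input raises zlib.error here, before any size is trusted.
    data = zlib.decompress(raw)
    header, sep, body = data.partition(b"\0")
    if not sep:
        raise ValueError("missing NUL after object header")
    obj_type, _, size_text = header.partition(b" ")
    size = int(size_text.decode("ascii"))
    if size != len(body):
        raise ValueError("header size does not match body length")
    return obj_type.decode("ascii"), size, body


good = zlib.compress(b"blob 5\0hello")
print(read_legacy_loose_object(good))

try:
    read_legacy_loose_object(b"\xff\xff\xff\xff")
except zlib.error as err:
    print("rejected before sizing anything:", err)
```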

The problem would still remain for packfiles, which use a similar
encoding, but I suspect it is less common there. For a single-byte
corruption, it is unlikely to be right in the length header. But for
absolute junk that is not git data at all, the first bytes are very
likely to be corrupted. In the pack case, we would notice early that it
does not look like a packfile; for the loose object, we have no such
header and proceed with the allocation.
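
For comparison, a packfile starts with a fixed 12-byte header (the signature
"PACK", a version number, and an object count, all big-endian), which is
presumably the early check meant here. A minimal sketch with a hypothetical
function name:

```python
import struct


def parse_pack_header(buf: bytes):
    """Validate the 12-byte packfile header and return (version, count)."""
    if len(buf) < 12:
        raise ValueError("too short to be a packfile")
    signature, version, count = struct.unpack(">4sII", buf[:12])
    if signature != b"PACK":
        raise ValueError("bad pack signature")
    if version not in (2, 3):
        raise ValueError("unsupported pack version %d" % version)
    return version, count


print(parse_pack_header(struct.pack(">4sII", b"PACK", 2, 3)))

try:
    parse_pack_header(b"\xff" * 12)
except ValueError as err:
    print("not a packfile:", err)
```

Random junk is very unlikely to start with those four signature bytes, so
it is rejected before any per-object size header is decoded.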

As for your specific corruption, I can't make heads or tails of it. It
is not a single-bit error. The first two bytes of a loose object should
always be <0x78, 0x01>, which is the standard zlib deflate header. Your
bytes aren't even close, and decoding the rest with a corrupted zlib
header seems fruitless.
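
A quick way to test whether two leading bytes could be a zlib header at all
is the check from RFC 1950: the compression method must be 8 (deflate), the
window-size field must be at most 7, and the two bytes read as a big-endian
16-bit value must be a multiple of 31. The sketch below uses a hypothetical
function name and is not taken from git's source.

```python
import zlib


def looks_like_zlib_header(first_two: bytes) -> bool:
    """Return True if the two bytes form a valid zlib (RFC 1950) header."""
    if len(first_two) < 2:
        return False
    cmf, flg = first_two[0], first_two[1]
    # Low nibble 8 = deflate; top bit clear = window size field <= 7.
    if (cmf & 0x8F) != 0x08:
        return False
    # The FCHECK bits make the 16-bit value divisible by 31.
    return ((cmf << 8) | flg) % 31 == 0


print(looks_like_zlib_header(bytes([0x78, 0x01])))        # the header named above
print(looks_like_zlib_header(zlib.compress(b"data")[:2]))  # any real zlib stream
print(looks_like_zlib_header(b"\xff\xff"))                 # junk bytes
```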

You don't happen to have another copy of the object (or of the data
contained in the object, such as the working tree file), do you? It
might be interesting to see a comparison of the bytes of the correct
data and your corruption.

-Peff
