From: Andreas Ericsson <ae@op5.se>
To: David Abrahams <dave@boostpro.com>
Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: "malloc failed"
Date: Thu, 29 Jan 2009 14:41:15 +0100
Message-ID: <4981B1FB.6030700@op5.se>
In-Reply-To: <87pri6qmvm.fsf@mcbain.luannocracy.com>

David Abrahams wrote:
> on Thu Jan 29 2009, Jeff King <peff-AT-peff.net> wrote:
> 
>> On Thu, Jan 29, 2009 at 12:20:41AM -0500, Jeff King wrote:
>>
>>> Ok, that _is_ big. ;) I wouldn't be surprised if there is some corner of
>>> the code that barfs on a single object that doesn't fit in a signed
>>> 32-bit integer; I don't think we have any test coverage for stuff that
>>> big.
>> Sure enough, that is the problem. With the patch below I was able to
>> "git add" and commit a 3 gigabyte file of random bytes (so even the
>> deflated object was 3G).
>>
>> I think it might be worth applying as a general cleanup, but I have no
>> idea if other parts of the system might barf on such an object.
>>
>> -- >8 --
>> Subject: [PATCH] avoid 31-bit truncation in write_loose_object
>>
>> The size of the content we are adding may be larger than
>> 2.1G (i.e., "git add gigantic-file"). Most of the code-path
>> to do so uses size_t or unsigned long to record the size,
>> but write_loose_object uses a signed int.
>>
>> On platforms where "int" is 32-bits (which includes x86_64
>> Linux platforms), we end up passing malloc a negative size.
> 
> 
> Good work.  I don't know if this matters to you, but I think on a 32-bit
> platform you'll find that size_t, which is supposed to be able to hold
> the size of the largest representable *memory block*, is only 4 bytes
> large:
> 
>   #include <limits.h>
>   #include <stdio.h>
> 
>   int main()
>   {
>     printf("sizeof(size_t) = %zu", sizeof(size_t));
>   }
> 
> Prints "sizeof(size_t) = 4" on my core duo.
> 

It has nothing to do with type size, and everything to do with
signedness. A size_t cannot be negative, while an int can.
Using the correct signedness everywhere doubles the capacity
in places where negative values are clearly bogus, such as
this one. On 32-bit platforms, the upper limit for what git can
handle is now 4GB, which is expected. To go beyond that, we'd
need to rework the algorithm to handle the data in chunks
instead of as a whole. Some day that might turn out to be
necessary, but today is not that day.
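
To illustrate the signedness point, here is a minimal sketch. It is not
the git code: the helper names are made up for illustration, and it
assumes a two's-complement platform where converting an out-of-range
value to a signed 32-bit integer wraps around (the C standard leaves
that conversion implementation-defined).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sanity check for this sketch: size_t is at least 32 bits wide. */
static_assert(sizeof(size_t) >= sizeof(int32_t), "size_t narrower than 32 bits");

/* Hypothetical helper: a byte count after being stored in a signed
 * 32-bit variable. The out-of-range conversion is implementation-defined;
 * two's-complement platforms wrap to a negative value. */
int32_t stored_as_int32(uint64_t len)
{
    return (int32_t)(uint32_t)len;
}

/* Hypothetical helper: the same byte count stored in an unsigned 32-bit
 * variable, which is what size_t is on a typical 32-bit platform. */
uint32_t stored_as_uint32(uint64_t len)
{
    return (uint32_t)len;
}

/* The request size an allocator would see if the signed value were
 * passed on. Converting a negative integer to size_t is well-defined:
 * the value is reduced modulo SIZE_MAX + 1. */
size_t seen_by_malloc(int32_t len)
{
    return (size_t)len;
}
```

With a 3 GiB length (3ULL << 30), stored_as_int32() should come back
negative on such platforms, while stored_as_uint32() keeps the value
intact. On a platform with a 64-bit size_t, seen_by_malloc() then turns
that negative number into a request far larger than the original 3 GiB,
which would explain a failing malloc.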

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
