All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin Koegler <martin.koegler@chello.at>
To: Junio C Hamano <gitster@pobox.com>
Cc: Martin Koegler <martin.koegler@chello.at>,
	git@vger.kernel.org, Johannes.Schindelin@gmx.de
Subject: Re: [PATCH V2 1/2] Fix delta integer overflows
Date: Fri, 11 Aug 2017 09:43:41 +0200	[thread overview]
Message-ID: <20170811074341.GD15128@mail.zuhause> (raw)
In-Reply-To: <xmqqmv772nmc.fsf@gitster.mtv.corp.google.com>

On Thu, Aug 10, 2017 at 01:07:07PM -0700, Junio C Hamano wrote:
> > The current delta code produces incorrect pack objects for files > 4GB.
> >
> > Signed-off-by: Martin Koegler <martin.koegler@chello.at>
> 
> I am a bit torn on this change.
> 
> Given that this is not merely a local storage format but it also is
> an interchange format, we would probably want to make sure that the
> receiving end (e.g. get_delta_hdr_size() that is used at the
> beginning of patch_delta()) on a platform whose size_t is smaller
> than that of a platform that produced the delta stream with this
> code behaves "sensibly".

Overflows would already be detected during unpack:
* Assuming size_t = uint32, the system should just be able to handle up to 4GB of process memory.
So loading any source blob larger than 4GB should already fail.
* Assuming size_t = uint32 and a source blob size < 4 GB, the target blob size would be readed
truncated. apply_delta checks, that the generated result matches the encoded size - this check would
fail.
 
> If we replaced ulong we use in create/patch delta codepaths with
> uint32_t, that would be safer, just because the encoder would not be
> able to emit varint that is larger than the receivers to handle.
> But that defeats the whole point of using varint() to encode the
> sizes in the first place.  It was partly done for space saving, but
> more for allowing larger sizes and larger ulong in the future
> without having to change the file format.

The ondisk-format is able to handle larger sizes [using a slightly worse compression].
The current implementation is just buggy.

I would not move to uint32_t. The remaing part of git uses "unsigned long", so the 
delta code could still be called with larger files.

We will also see more RAM as well as CPU power - reducing the limits just because of older plattforms,
which can't even handle such large blobs, is the wrong way.

Regards,
Martin

  parent reply	other threads:[~2017-08-11  7:43 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-10  7:01 [PATCH V2 1/2] Fix delta integer overflows Martin Koegler
2017-08-10  7:01 ` [PATCH V2 2/2] Convert size datatype to size_t Martin Koegler
2017-08-10 14:46   ` Johannes Schindelin
2017-08-10 22:04   ` Junio C Hamano
2017-08-11  7:12     ` Martin Koegler
2017-08-10 20:07 ` [PATCH V2 1/2] Fix delta integer overflows Junio C Hamano
2017-08-10 20:36   ` Jeff King
2017-08-11 18:43     ` Junio C Hamano
2017-08-11  7:43   ` Martin Koegler [this message]
2017-08-11 18:40     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170811074341.GD15128@mail.zuhause \
    --to=martin.koegler@chello.at \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.