git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicolas Pitre <nico@fluxnic.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] sha1_file: don't malloc the whole compressed result when writing out objects
Date: Mon, 22 Feb 2010 00:50:18 -0500 (EST)	[thread overview]
Message-ID: <alpine.LFD.2.00.1002220034540.1946@xanadu.home> (raw)
In-Reply-To: <7v635p4z26.fsf@alter.siamese.dyndns.org>

On Sun, 21 Feb 2010, Junio C Hamano wrote:

> Nicolas Pitre <nico@fluxnic.net> writes:
> 
> > The whole point is to detect data incoherencyes.
> 
> Yes.  We want to make sure that the SHA-1 we compute is over what we fed
> deflate().
> 
> > So current sequence of events is as follows:
> >
> > T0	write_sha1_file_prepare() is called
> > T1	start initial SHA1 computation on data buffer
> > T2	in the middle of initial SHA1 computation
> > T3	end of initial SHA1 computation -> object name is determined
> > T4	write_loose_object() is called
> > ...	enter the write loop
> > T5+n	deflate() called on buffer n
> > T6+n	git_SHA1_Update(() called on the same buffer n
> > T7+n	deflated data written out
> > ...
> > Tend	abort if result of T6+n doesn't match object name from T3
> >
> > So... what can happen:
> >
> > 1) Data is externally modified before T5+n: deflated data and its CRC32 
> >    will be coherent with the SHA1 computed in T6+n, but incoherent with 
> >    the SHA1 used for the object name. Wrong data is written to the 
> >    object even if it will inflate OK. We really want to prevent that 
> >    from happening. The test in Tend will fail.
> >
> > 2) Data is externally modified between T5+n and T6+n: the deflated data 
> >    and CRC32 will be coherent with the object name but incoherent with 
> >    the parano_sha1.  Although written data will be OK, this is way too 
> >    close from being wrong, and the test in Tend will fail.  If there is 
> >    more than one round into the loop and the external modifications are 
> >    large enough then this becomes the same as case 1 above.
> >
> > 3) Data is externally modified in T2: again the test in Tend will fail.
> >
> > So in all possible cases I can think of, the write will abort.
> 
> There is one pathological case.
> 
> Immediately before T5+n (or between T5+n and T6+n), the external process
> changes the data deflate() is working on, but before T6+n, the external
> process changes the data back.  Two SHA-1's computed may match, but it is
> not a hash over what was deflated(); you won't be able to abort.

And what real life case would trigger this?  Given the size of the 
window for this to happen, what are your chances?

Of course the odds for me to be struck by lightning also exist.  And if 
I work really really hard at it then I might be able to trigger that 
pathological case above even before the next thunderstorm.  But in 
practice I'm hardly concerned by either of those possibilities.


Nicolas

  reply	other threads:[~2010-02-22  5:50 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-21  4:27 [PATCH] sha1_file: don't malloc the whole compressed result when writing out objects Nicolas Pitre
2010-02-21 19:45 ` Junio C Hamano
2010-02-21 21:26   ` Nicolas Pitre
2010-02-21 22:22     ` Junio C Hamano
2010-02-21 22:30     ` Junio C Hamano
2010-02-22  1:35       ` Nicolas Pitre
2010-02-22  5:30         ` Junio C Hamano
2010-02-22  5:50           ` Nicolas Pitre [this message]
2010-02-22  6:17             ` Junio C Hamano
2010-02-22  6:31               ` Junio C Hamano
2010-02-22 17:36                 ` Nicolas Pitre
2010-02-22 19:55                   ` Junio C Hamano
2010-02-22  6:27             ` Dmitry Potapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.1002220034540.1946@xanadu.home \
    --to=nico@fluxnic.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).