From: Junio C Hamano <gitster@pobox.com>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] sha1_file: don't malloc the whole compressed result when writing out objects
Date: Sun, 21 Feb 2010 22:31:54 -0800
Message-ID: <7v4ol9vl0l.fsf@alter.siamese.dyndns.org>
In-Reply-To: <7v8walyesu.fsf@alter.siamese.dyndns.org> (Junio C. Hamano's message of "Sun, 21 Feb 2010 22:17:53 -0800")

Junio C Hamano <gitster@pobox.com> writes:
> Nicolas Pitre <nico@fluxnic.net> writes:
>
>> And what real life case would trigger this? Given the size of the
>> window for this to happen, what are your chances?
>
>> Of course the odds for me to be struck by lightning also exist. And if
>> I work really really hard at it then I might be able to trigger that
>> pathological case above even before the next thunderstorm. But in
>> practice I'm hardly concerned by either of those possibilities.
>
> The number of real-life cases where any of this triggers for me is zero,
> as I won't be mistreating git as a continuous & asynchronous back-up tool.
>
> But then that would make the whole discussion moot. There are people who
> file "bug reports" with an artificial reproduction recipe built around a
> loop that runs dd continuously overwriting a file while "git add" is asked
> to add it.

Having said all that, I like your approach better. It is not worth paying
the price of an unnecessary memcpy(3) that would _only_ help catch the
insanely artificial test case, whereas your patch strikes a good balance: a
small overhead to catch the easier-to-trigger cases (whether caused by
stupidity, malice or mistake).

So I am tempted to discard the "paranoia" patch and replace it with your two
patches, with the following caveat added to the log message.

--- /var/tmp/2 2010-02-21 22:23:30.000000000 -0800
+++ /var/tmp/1 2010-02-21 22:23:22.000000000 -0800
@@ -21,7 +21,9 @@
  deflate operation has consumed that data, and make sure it matches
  with the expected SHA1. This way we can rely on the CRC32 checked by
  the inflate operation to provide a good indication that the data is still
- coherent with its SHA1 hash.
+ coherent with its SHA1 hash. One pathological case we ignore is when
+ the data is modified before (or during) deflate call, but changed back
+ before it is hashed.

  There is some overhead of course. Using 'git add' on a set of large files:
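
To illustrate the check described in the log message above, here is a rough
sketch in Python rather than git's C. It is not the code from the patch:
write_checked(), its on_chunk hook and the chunk size are hypothetical names
chosen for this example, and the object header that git hashes along with
the contents is left out. The idea it tries to show is only the ordering:
each chunk is hashed after deflate has consumed it, and the result is
compared with the SHA-1 computed earlier.

```python
import hashlib
import zlib


def write_checked(buf, expected_sha1, chunk_size=4096, on_chunk=None):
    """Deflate buf chunk by chunk and verify it against expected_sha1.

    Each chunk is fed to the hash only after deflate has consumed it, so a
    change made to buf before a chunk is compressed shows up as a mismatch.
    on_chunk is a hypothetical hook used here to simulate a concurrent writer.
    """
    compressor = zlib.compressobj()
    sha = hashlib.sha1()
    out = []
    view = memoryview(buf)
    for start in range(0, len(view), chunk_size):
        if on_chunk is not None:
            on_chunk(start)
        chunk = view[start:start + chunk_size]
        out.append(compressor.compress(chunk))  # deflate consumes the chunk
        sha.update(chunk)                       # hash it only afterwards
    out.append(compressor.flush())
    if sha.hexdigest() != expected_sha1:
        raise ValueError("data changed while it was being written")
    return b"".join(out)


if __name__ == "__main__":
    data = bytearray(b"x" * 10000)
    expected = hashlib.sha1(data).hexdigest()

    def scribble(start):
        # Simulated writer: modify the buffer before the second chunk is read.
        if start == 4096:
            data[5000] = ord("y")

    try:
        write_checked(data, expected, on_chunk=scribble)
    except ValueError as err:
        print("rejected:", err)
```

A buffer that is modified and then restored between the deflate call and the
hash update would still pass this check, which is the pathological case the
log message says we ignore.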