From: Bill Zaumen <bill.zaumen@gmail.com>
To: Shawn Pearce <spearce@spearce.org>
Cc: git@vger.kernel.org, gitster@pobox.com, pclouds@gmail.com,
peff@peff.net, torvalds@linux-foundation.org
Subject: Re: [PATCH] Implement fast hash-collision detection
Date: Tue, 29 Nov 2011 20:01:45 -0800
Message-ID: <1322625705.1728.299.camel@yos>
In-Reply-To: <CAJo=hJtFT55Ucyij9esr3Hd9yJ6XCxatK7vjPOLMKow57HqBoQ@mail.gmail.com>
Note: for some reason my email is not showing up on the mailing list.
I'm trying a different email address - previously my 'From' field
contained a subaddress "+git" but gmail won't put that in the 'Sender'
field, so possibly the email is being filtered for that reason.
On Tue, 2011-11-29 at 09:08 -0800, Shawn Pearce wrote:
> I don't think you understand how these thin packs are processed.
I think the confusion was due to my being a bit too terse. The
documentation clearly states that thin packs allow a delta to be
sent when it is based on an object that the server and client
have in common, given the commits each already has. With one server
and one client, there isn't an issue. The case I meant is the one in
which a user fetches from one server, gets a forged blob, and then
fetches from another server that has the original blob and additional
commits along the same branch. If the second server bases a delta on
the original blob and the client applies that delta to the forged
blob, the client will most likely end up with a blob whose SHA-1 hash
differs from the one expected. Since an object referenced by a tree
is then missing (no object with the expected SHA-1 hash), the
repository is corrupted.
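
To make the scenario concrete, here is a minimal Python sketch. It
assumes a toy copy/insert delta format (not git's actual pack delta
encoding), and it simulates the collision by storing the forged blob
under the original blob's object ID, since no real SHA-1 collision is
involved. The names blob_id, apply_delta, and simulate_second_fetch
are hypothetical and exist only for this illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Object ID of a blob: SHA-1 over 'blob <size>\\0' followed by the content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def apply_delta(base: bytes, delta) -> bytes:
    """Apply a toy delta: a list of ('copy', offset, length) or ('insert', data)."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def simulate_second_fetch():
    original = b"allow = verify(sig)\n"
    forged = b"allow = True       \n"
    base_id = blob_id(original)

    # The second server builds the new version as a delta on the ORIGINAL blob.
    delta = [("copy", 0, len(original)), ("insert", b"# reviewed\n")]
    expected_id = blob_id(apply_delta(original, delta))

    # Assumption: the client already holds the forged blob under the same ID
    # (this stands in for a hash collision; it is not a real one).
    client_store = {base_id: forged}
    actual_id = blob_id(apply_delta(client_store[base_id], delta))
    return expected_id, actual_id


if __name__ == "__main__":
    expected_id, actual_id = simulate_second_fetch()
    print("expected:", expected_id)
    print("actual:  ", actual_id)
```

The two IDs differ, so the tree that names the expected ID refers to
an object the client never ends up with.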
The "first to arrive wins" policy isn't sufficient in one specific case:
multiple remote repositories where new commits are added asynchronously,
with the repositories out of sync possibly for days at a time (e.g.,
over a 3-day weekend). In this case, the first to arrive at one
repository may not be the first to arrive at another, so what happens at
a particular client in the presence of hash collisions depends on the
sequence of remotes from which updates were fetched. The risk arises
during the window in which the repositories are out of sync.
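
The order dependence can be sketched in a few lines of Python. This
is only a model: the object stores are plain dictionaries, OID is a
placeholder ID shared by both blobs to stand in for a collision, and
fetch and client_after are hypothetical helpers, not anything in git.

```python
OID = "abc123"  # placeholder object ID shared by both blobs (simulated collision)

server_a = {OID: b"original contents"}
server_b = {OID: b"forged contents"}


def fetch(store: dict, remote: dict) -> None:
    """Copy objects from a remote; an object the client already has is kept."""
    for oid, content in remote.items():
        store.setdefault(oid, content)  # "first to arrive wins"


def client_after(*remotes: dict) -> bytes:
    """Return what a fresh client holds for OID after fetching in the given order."""
    store: dict = {}
    for remote in remotes:
        fetch(store, remote)
    return store[OID]


if __name__ == "__main__":
    print(client_after(server_a, server_b))
    print(client_after(server_b, server_a))
```

Two clients that fetch from the same pair of remotes in opposite
order end up holding different contents for the same ID.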
Regarding the kernel.org problem that you used as a separate example,
while it was fortunately possible to rebuild things (and git provided
significant advantages), earlier detection of the problem might have
reduced the time for which kernel.org was down. Early detection of
errors in general is a good practice if it can be done at a reasonable
cost.
> Trust. Review. Verify.
While good advice in principle, you should keep in mind that there are
a lot of people out there working at various companies who are not as
capable as you are. Some of them are overworked and make mistakes
because they've been working 16-hour days for weeks trying to meet a
deadline. Given that, extra checks to catch problems early
are probably a good idea if they don't impact performance significantly.