From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH 0/4] hash.h: support choosing a separate SHA-1 for non-cryptographic uses
Date: Thu, 05 Sep 2024 08:41:16 -0700
Message-ID: <xmqq34me5crn.fsf@gitster.g>
In-Reply-To: <20240905103736.GC2556395@coredump.intra.peff.net> (Jeff King's message of "Thu, 5 Sep 2024 06:37:36 -0400")
Jeff King <peff@peff.net> writes:
> Probably the solution is:
>
> - renaming packfiles into place should use finalize_object_file() to
> avoid collisions. That happens for tmp-objdir migration already,
> but we should do it more directly in index-pack (and maybe even
> pack-objects). And possibly we should implement an actual
> byte-for-byte comparison if we think we saw a collision, rather than
> just assuming that the write was effectively a noop (see the FIXME
> in that function). That becomes more important if the checksum gets
> more likely to collide accidentally (we essentially ignore the
> possibility that sha1 would ever do so).
Yes. I somehow mistakenly thought that Taylor analyzed the code
path when brian raised the "we rename over, overwriting existing
files" issue, and I included fixing it as one of the steps necessary
to safely switch the tail sum to a weaker and faster hash, but after
reading the thread again, it seems that I was hallucinating X-<.
This needs to be corrected.
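To make the "never rename over an existing file, and compare byte-for-byte on a name clash" idea concrete, here is a rough sketch in Python. It is not the actual finalize_object_file() code; the function name finalize_file() and the error handling are made up for illustration, and it assumes a filesystem that supports hard links.

```python
import filecmp
import os


def finalize_file(tmp_path, final_path):
    """Move tmp_path into place without ever overwriting existing data.

    Hypothetical helper for illustration only, not Git's implementation.
    """
    try:
        # link() fails with EEXIST instead of silently replacing the target,
        # which is the property a plain rename() lacks.
        os.link(tmp_path, final_path)
    except FileExistsError:
        # The name is already taken.  Only treat the write as a no-op if
        # the existing file is byte-for-byte identical to the new one.
        if not filecmp.cmp(tmp_path, final_path, shallow=False):
            raise RuntimeError(
                f"collision: {final_path} exists with different contents"
            )
    # Either the link succeeded or the contents matched; drop the temp name.
    os.unlink(tmp_path)
```

With this shape, data that is already in place always wins; a colliding file with different contents is reported rather than installed.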
> - possibly object_creation_mode should have a more strict setting that
> refuses to fall back to renames. Or alternatively, we should do our
> own check for existence when falling back to a rename() in
> finalize_object_file().
True, too.
> - at some moment we will have moved pack-XYZ.pack into place, but not
> yet the matching idx. So we'll have the old idx and the new pack. An
> object lookup at that moment could cause us to find the object using
> the old idx, but then get the data out of the new pack file,
> replacing real data with the attacker's data. It's a pretty small
> window, but probably possible with enough simultaneous reading
> processes. Not something you'd probably want to spend $40k
> generating a collision for, but if we used a weak enough checksum,
> then attempts become cheap.
This reminds me why we changed the hash we use to name packfiles; we
used to use "hash of sorted object names contained in the pack", but
that would mean a (forced) repack of a sole pack of a fully packed
repository can create a packfile with contents and object layout
different from the original but with the same name, creating this
exact race for yourself without involving any evil attacker. We of
course use the hash of the actual pack data stream these days, and
that would avoid this problem.
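A toy illustration of the difference between the two naming schemes, in Python. The helper names are hypothetical, the byte strings are stand-ins rather than real pack data, and hashing the whole stream is a simplification of how the pack checksum is actually computed.

```python
import hashlib


def name_from_object_list(oids):
    """Old scheme: hash of the sorted object names contained in the pack."""
    h = hashlib.sha1()
    for oid in sorted(oids):
        h.update(bytes.fromhex(oid))
    return h.hexdigest()


def name_from_pack_stream(pack_bytes):
    """Current scheme (simplified): hash of the pack data stream itself."""
    return hashlib.sha1(pack_bytes).hexdigest()


# The same two objects, packed twice with a different layout.
oids = [
    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
    "ce013625030ba8dba906f756967f9e9ca394464a",
]
original_pack = b"stand-in for pack data, layout A"
repacked_pack = b"stand-in for pack data, layout B"

# Old scheme: both packs get the same name even though the bytes differ.
old_names_clash = name_from_object_list(oids) == name_from_object_list(oids[::-1])

# Current scheme: different bytes give different names.
new_names_differ = (
    name_from_pack_stream(original_pack) != name_from_pack_stream(repacked_pack)
)
```

Under the old scheme the repacked file would be renamed over the original's name while the old idx may still be in use, which is the window described above.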
It is funny to compare this with the reason why we switched how we
name individual objects very early in the project's history. We
used to name an object after the hash value of the _compressed_ object
header plus payload, but that obviously means a different compression
level (or an improvement of the compressor implementation) would give
different names to the same contents, and that is why we swapped the
order to use the hash value of the object header plus payload before
compression. Of course, that _requires_ us to avoid overwriting an
existing file to foil collision attacks. That brings us back to the
topic here ;-).
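The object-naming point can be checked with a few lines of Python. object_name() follows the "hash of header plus payload before compression" rule; old_style_name() is only an approximation of the early scheme for illustration, and both function names are made up here.

```python
import hashlib
import zlib


def object_name(obj_type, payload):
    """Hash the uncompressed header plus payload (today's rule)."""
    header = f"{obj_type} {len(payload)}\0".encode()
    return hashlib.sha1(header + payload).hexdigest()


def old_style_name(obj_type, payload, level):
    """Approximation of the early scheme: hash the compressed bytes."""
    header = f"{obj_type} {len(payload)}\0".encode()
    return hashlib.sha1(zlib.compress(header + payload, level)).hexdigest()


payload = b"hello\n"

# Independent of how the object is later compressed on disk.
stable = object_name("blob", payload)

# Changes with the compression level even though the contents are the same.
fast = old_style_name("blob", payload, 1)
best = old_style_name("blob", payload, 9)
```

The name computed by object_name() for an empty blob is the familiar e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, whatever compression level is used when the object is written.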
> So I think we really do need to address this to prefer local data. At
> which point it should be safe to use a much weaker checksum. But IMHO it
> is still worth having a "fast" SHA1. Even if the future is SHA256 repos
> or xxHash pack trailers, older clients will still use SHA1.
Yup. 100% agreed.
Thanks.