From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 02/10] builtin/fast-import: fix segfault with unsafe SHA1
Date: Tue, 7 Jan 2025 13:06:20 +0100
Message-ID: <Z30YvHKjA_b6_xwt@pks.im>
In-Reply-To: <Z3wsZjAqbfI/EdVe@nand.local>

On Mon, Jan 06, 2025 at 02:17:58PM -0500, Taylor Blau wrote:
> On Fri, Jan 03, 2025 at 02:08:01PM +0100, Patrick Steinhardt wrote:
> > On Mon, Dec 30, 2024 at 12:22:34PM -0500, Taylor Blau wrote:
> > > On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> > > > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > > > index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> > > > --- a/builtin/fast-import.c
> > > > +++ b/builtin/fast-import.c
> > > > @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
> > > >  		|| (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
> > > >  		cycle_packfile();
> > > >
> > > > -	the_hash_algo->init_fn(&checkpoint.ctx);
> > > > +	the_hash_algo->unsafe_init_fn(&checkpoint.ctx);
> > >
> > > This will obviously fix the issue at hand, but I don't think this is any
> > > less brittle than before. The hash function implementation here needs to
> > > agree with that used in the hashfile API. This change makes that
> > > happen, but only using side information that the hashfile API uses the
> > > unsafe variants.
> >
> > Yup, I only cared about fixing the segfault because we're close to the
> > v2.48 release. I agree that the overall state is still extremely brittle
> > right now.
> >
> > [snip]
> > > I think we should perhaps combine forces here. My ideal end-state is to
> > > have the unsafe_hash_algo() stuff land from my earlier series, then have
> > > these two fixes (adjusted to the new world order as above), and finally
> > > the Meson fixes after that.
> > >
> > > Does that seem like a plan to you? If so, I can put everything together
> > > and send it out (if you're OK with me forging your s-o-b).
> >
> > I think the ideal state would be if the hashing function used was stored
> > as part of `struct git_hash_ctx`. The flow would then become, for
> > example:
> >
> >     ```
> >     struct git_hash_ctx ctx;
> >     struct object_id oid;
> >
> >     git_hash_sha1_init(&ctx);
> >     git_hash_update(&ctx, data);
> >     git_hash_final_oid(&oid, &ctx);
> >     ```
> >
> > Note how the intermediate calls don't need to know which hash function
> > you used to initialize the `struct git_hash_ctx` -- the structure itself
> > should remember what it has been initialized with and do the right thing.
> 
> I'm not sure I'm following you here. In the stream_blob() function
> within fast-import, the problem isn't that we're switching hash
> functions mid-stream, but that we're initializing the hashfile_context
> structure with the wrong hash function to begin with.

True, but it would have been a non-issue if the hash context itself knew
which hash function to use for updates. Sure, we would've used the slow
variant of SHA1 instead of the fast-but-unsafe one. But that feels like
the lesser evil compared to crashing.
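
To make that concrete, here is a minimal sketch of a context that
remembers its algorithm. All names (`toy_hash_ctx`, `toy_hash_algo`,
`toy_safe`, `toy_unsafe`, ...) are made up for illustration, and the two
toy hashes merely stand in for the safe and unsafe SHA1 variants; this is
not Git's actual hash API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_hash_ctx;

/* Hypothetical vtable; one instance per hash variant. */
struct toy_hash_algo {
	const char *name;
	void (*init_fn)(struct toy_hash_ctx *ctx);
	void (*update_fn)(struct toy_hash_ctx *ctx, const void *buf, size_t len);
	uint32_t (*final_fn)(struct toy_hash_ctx *ctx);
};

struct toy_hash_ctx {
	const struct toy_hash_algo *algop; /* set once by toy_hash_init() */
	uint32_t state;
};

/* Toy stand-in for the "safe" variant: 32-bit FNV-1a. */
static void fnv_init(struct toy_hash_ctx *ctx) { ctx->state = 2166136261u; }
static void fnv_update(struct toy_hash_ctx *ctx, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	for (size_t i = 0; i < len; i++)
		ctx->state = (ctx->state ^ p[i]) * 16777619u;
}
static uint32_t fnv_final(struct toy_hash_ctx *ctx) { return ctx->state; }

/*
 * Toy stand-in for the "unsafe" variant: plain byte sum. Unlike the real
 * SHA1 variants, the two toy hashes yield different digests.
 */
static void sum_init(struct toy_hash_ctx *ctx) { ctx->state = 0; }
static void sum_update(struct toy_hash_ctx *ctx, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	for (size_t i = 0; i < len; i++)
		ctx->state += p[i];
}
static uint32_t sum_final(struct toy_hash_ctx *ctx) { return ctx->state; }

const struct toy_hash_algo toy_safe = { "fnv1a", fnv_init, fnv_update, fnv_final };
const struct toy_hash_algo toy_unsafe = { "sum", sum_init, sum_update, sum_final };

/* The only place where the caller names an algorithm. */
void toy_hash_init(struct toy_hash_ctx *ctx, const struct toy_hash_algo *algop)
{
	ctx->algop = algop;
	algop->init_fn(ctx);
}

/* Later calls dispatch through whatever the context was initialized with. */
void toy_hash_update(struct toy_hash_ctx *ctx, const void *buf, size_t len)
{
	assert(ctx->algop);
	ctx->algop->update_fn(ctx, buf, len);
}

uint32_t toy_hash_final(struct toy_hash_ctx *ctx)
{
	assert(ctx->algop);
	return ctx->algop->final_fn(ctx);
}
```

Callers of toy_hash_update() and toy_hash_final() never name an
algorithm, so a context initialized with one variant cannot end up being
fed to the other variant's functions.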

> You snipped it out of your reply, but I think that my suggestion to do:
> 
>     pack_file->algop->init_fn(&checkpoint.ctx);
> 
> would harden us against the broken behavior we're seeing here.
> 
> As a separate defense-in-depth measure, we could teach the hashfile API
> functions that deal with the hashfile_checkpoint structure to ensure
> that the hashfile and its checkpoint both use the same algorithm (by
> adding a hash_algo field to the hashfile_checkpoint structure).

I would think that it would be even harder to misuse if it were the
hash API, rather than the hashfile API, that remembered the algorithm.
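
For comparison, the checkpoint-level check from the quoted suggestion
could look roughly like the sketch below. The structures and functions
(`toy_hashfile`, `toy_checkpoint`, `toy_hashfile_truncate()`) are
hypothetical stand-ins, not the real hashfile API; the sketch only shows
where an extra algorithm field would be recorded and compared:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash algorithm descriptor. */
struct toy_algo {
	const char *name;
};

struct toy_hashfile {
	const struct toy_algo *algop;
	size_t offset;
};

struct toy_checkpoint {
	const struct toy_algo *algop; /* the extra field discussed above */
	size_t offset;
};

/* Record the file's current position together with its algorithm. */
void toy_checkpoint_init(struct toy_checkpoint *cp, const struct toy_hashfile *f)
{
	assert(f->algop);
	cp->algop = f->algop;
	cp->offset = f->offset;
}

/*
 * Rewind the file to the checkpoint. Returns 0 on success, or -1 if the
 * checkpoint was set up with a different algorithm than the file uses.
 */
int toy_hashfile_truncate(struct toy_hashfile *f, const struct toy_checkpoint *cp)
{
	if (cp->algop != f->algop)
		return -1;
	f->offset = cp->offset;
	return 0;
}
```

This catches a mismatch only in code paths that go through the
checkpoint functions, whereas a context that carries its own algorithm
covers every user of the hash API.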

Patrick
