git.vger.kernel.org archive mirror
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 07/28] commit: avoid leaking already-saved buffer
Date: Thu, 26 Sep 2024 15:50:05 +0200
Message-ID: <ZvVmjbMujO1h2sQp@pks.im>
In-Reply-To: <20240924215434.GG1143820@coredump.intra.peff.net>

On Tue, Sep 24, 2024 at 05:54:34PM -0400, Jeff King wrote:
> When we parse a commit via repo_parse_commit_internal(), if
> save_commit_buffer is set we'll stuff the buffer of the object contents
> into a cache, overwriting any previous value.

Interesting. I saw some cases that I think could be caused by this, but
couldn't make much sense of them.

> This can result in a leak of that previously cached value, though it's
> rare in practice. If we have a value in the cache it would have come
> from a previous parse, and during that parse we'd set the object.parsed
> flag, causing any subsequent parse attempts to exit without doing any
> work.
> 
> But it's possible to "unparse" a commit, which we do when registering a
> commit graft. And since shallow fetches are implemented using grafts,
> the leak is triggered in practice by t5539.
> 
> There are a number of possible ways to address this:
> 
>   1. the unparsing function could clear the cached commit buffer, too. I

s/the/The/

>      think this would work for the case I found, but I'm not sure if
>      there are other ways to end up in the same state (an unparsed
>      commit with an entry in the commit buffer cache).
> 
>   2. when we parse, we could check the buffer cache and prefer it to

s/when/When/

>      reading the contents from the object database. In theory the
>      contents of a particular sha1 are immutable, but the code in
>      question is violating the immutability with grafts. So this
>      approach makes me a bit nervous, although I think it would work in
>      practice (the grafts are applied to what we parse, but we still
>      retain the original contents).
> 
>   3. We could realize the cache is already populated and discard its
>      contents before overwriting. It's possible some other code could be
>      holding on to a pointer to the old cache entry (and we'd introduce
>      a use-after-free), but I think the risk of that is relatively low.
> 
>   4. The reverse of (3): when the cache is populated, don't bother
>      saving our new copy. This is perhaps a little weird, since we'll
>      have just populated the commit struct based on a different buffer.
>      But the two buffers should be the same, even in the presence of
>      grafts (as in (2) above).
> 
> I went with option 4. It addresses the leak directly and doesn't carry
> any risk of breaking other assumptions. And it's the same technique used
> by parse_object_buffer() for this situation, though I'm not sure when it
> would even come up there. The extra safety has been there since
> bd1e17e245 (Make "parse_object()" also fill in commit message buffer
> data., 2005-05-25).

Okay. This feels a bit weird indeed, but the fact that we already use
the same strategy in other places makes me feel way safer.

> This lets us mark t5539 as leak-free.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  commit.c                      | 3 ++-
>  t/t5539-fetch-http-shallow.sh | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/commit.c b/commit.c
> index 3a54e4db0d..cc03a93036 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -595,7 +595,8 @@ int repo_parse_commit_internal(struct repository *r,
>  	}
>  
>  	ret = parse_commit_buffer(r, item, buffer, size, 0);
> -	if (save_commit_buffer && !ret) {
> +	if (save_commit_buffer && !ret &&
> +	    !get_cached_commit_buffer(r, item, NULL)) {
>  		set_commit_buffer(r, item, buffer, size);
>  		return 0;
>  	}

And the fix is trivial.
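
To illustrate why option 4 plugs the leak, here is a toy model of the
parse / unparse / re-parse sequence described in the commit message. It
is only a sketch and not the code in commit.c: `struct toy_commit`,
`toy_parse()`, `toy_unparse()` and the `live_buffers` counter are all
hypothetical names made up for this example, and the `keep_existing`
flag merely switches between the old behavior (overwrite the cache
slot) and the new one (keep whatever is already cached).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a commit and its slot in the buffer cache. */
struct toy_commit {
	int parsed;        /* plays the role of object.parsed */
	char *cached_buf;  /* plays the role of the cached commit buffer */
};

/* Number of buffers that were allocated and not yet freed. */
static int live_buffers;

/* Pretend to read the object contents from the object database. */
static char *toy_read_object(void)
{
	char *buf = malloc(16);
	if (!buf)
		abort();
	strcpy(buf, "tree deadbeef\n");
	live_buffers++;
	return buf;
}

static void toy_free_buffer(char *buf)
{
	if (!buf)
		return;
	assert(live_buffers > 0);
	free(buf);
	live_buffers--;
}

/*
 * Parse the commit unless it is already marked as parsed.
 * With keep_existing set, an already-populated cache slot is left
 * alone and the freshly read buffer is freed (option 4). Without it,
 * the slot is overwritten and the old buffer becomes unreachable.
 */
static void toy_parse(struct toy_commit *c, int keep_existing)
{
	char *buf;

	if (c->parsed)
		return;

	buf = toy_read_object();
	c->parsed = 1;

	if (keep_existing && c->cached_buf) {
		toy_free_buffer(buf);
		return;
	}
	c->cached_buf = buf; /* any previous entry is lost here */
}

/* Registering a graft resets the parsed flag but keeps the cache. */
static void toy_unparse(struct toy_commit *c)
{
	c->parsed = 0;
}
```

Running parse, unparse, parse with `keep_existing` set leaves exactly
one live buffer, which the owner of the cache can still free. The same
sequence without it leaves two live buffers, only one of which is still
reachable through `cached_buf`.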

Patrick
