From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 07/28] commit: avoid leaking already-saved buffer
Date: Thu, 26 Sep 2024 15:50:05 +0200
Message-ID: <ZvVmjbMujO1h2sQp@pks.im>
In-Reply-To: <20240924215434.GG1143820@coredump.intra.peff.net>
On Tue, Sep 24, 2024 at 05:54:34PM -0400, Jeff King wrote:
> When we parse a commit via repo_parse_commit_internal(), if
> save_commit_buffer is set we'll stuff the buffer of the object contents
> into a cache, overwriting any previous value.
Interesting. I saw some cases that I think could be caused by this, but
couldn't make much sense of them.
> This can result in a leak of that previously cached value, though it's
> rare in practice. If we have a value in the cache it would have come
> from a previous parse, and during that parse we'd set the object.parsed
> flag, causing any subsequent parse attempts to exit without doing any
> work.
>
> But it's possible to "unparse" a commit, which we do when registering a
> commit graft. And since shallow fetches are implemented using grafts,
> the leak is triggered in practice by t5539.
>
> There are a number of possible ways to address this:
>
> 1. the unparsing function could clear the cached commit buffer, too. I
s/the/The/
> think this would work for the case I found, but I'm not sure if
> there are other ways to end up in the same state (an unparsed
> commit with an entry in the commit buffer cache).
>
> 2. when we parse, we could check the buffer cache and prefer it to
s/when/When/
> reading the contents from the object database. In theory the
> contents of a particular sha1 are immutable, but the code in
> question is violating the immutability with grafts. So this
> approach makes me a bit nervous, although I think it would work in
> practice (the grafts are applied to what we parse, but we still
> retain the original contents).
>
> 3. We could realize the cache is already populated and discard its
> contents before overwriting. It's possible some other code could be
> holding on to a pointer to the old cache entry (and we'd introduce
> a use-after-free), but I think the risk of that is relatively low.
>
> 4. The reverse of (3): when the cache is populated, don't bother
> saving our new copy. This is perhaps a little weird, since we'll
> have just populated the commit struct based on a different buffer.
> But the two buffers should be the same, even in the presence of
> grafts (as in (2) above).
>
> I went with option 4. It addresses the leak directly and doesn't carry
> any risk of breaking other assumptions. And it's the same technique used
> by parse_object_buffer() for this situation, though I'm not sure when it
> would even come up there. The extra safety has been there since
> bd1e17e245 (Make "parse_object()" also fill in commit message buffer
> data., 2005-05-25).
Okay. This feels a bit weird indeed, but the fact that we already use
the same strategy in other places makes me feel way safer.
> This lets us mark t5539 as leak-free.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> commit.c | 3 ++-
> t/t5539-fetch-http-shallow.sh | 1 +
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/commit.c b/commit.c
> index 3a54e4db0d..cc03a93036 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -595,7 +595,8 @@ int repo_parse_commit_internal(struct repository *r,
>  	}
>
>  	ret = parse_commit_buffer(r, item, buffer, size, 0);
> -	if (save_commit_buffer && !ret) {
> +	if (save_commit_buffer && !ret &&
> +	    !get_cached_commit_buffer(r, item, NULL)) {
>  		set_commit_buffer(r, item, buffer, size);
>  		return 0;
>  	}
And the fix is trivial.
Patrick