From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Scott Leggett <scott@sl.id.au>
Subject: Re: [PATCH] commit-graph: retain commit slab when closing NULL commit_graph
Date: Fri, 5 Jan 2024 12:07:44 -0500 [thread overview]
Message-ID: <ZZg3YIEDCjbbGyVX@nand.local> (raw)
In-Reply-To: <20240105054142.GA2035092@coredump.intra.peff.net>
On Fri, Jan 05, 2024 at 12:41:42AM -0500, Jeff King wrote:
> This fixes a regression introduced in ac6d45d11f (commit-graph: move
> slab-clearing to close_commit_graph(), 2023-10-03), in which running:
>
> git -c fetch.writeCommitGraph=true fetch --recurse-submodules
>
> multiple times in a freshly cloned repository causes a segfault. What
> happens in the second (and subsequent) runs is this:
>
> 1. We make a "struct commit" for any ref tips which we're storing
> (even if we already have them, they still go into FETCH_HEAD).
>
> Because the first run will have created a commit graph, we'll find
> those commits in the graph.
>
> The commit struct is therefore created with a NULL "maybe_tree"
> entry, because we can load its oid from the graph later. But to do
> that we need to remember that we got the commit from the graph,
> which is recorded in a global commit_graph_data_slab object.
>
> 2. Because we're using --recurse-submodules, we'll try to fetch each
> of the possible submodules. That implies creating a separate
> "struct repository" in-process for each submodule, which will
> require a later call to repo_clear().
>
> The call to repo_clear() calls raw_object_store_clear(), which in
> turn calls close_object_store(), which in turn calls
> close_commit_graph(). And the latter frees the commit graph data
> slab.
>
> 3. Later, when trying to write out a new commit graph, we'll ask for
> their tree oid via get_commit_tree_oid(), which will see that the
> object is parsed but with a NULL maybe_tree field. We'd then
> usually pull it from the graph file, but because the slab was
> cleared, we don't realize that we can do so! We end up returning
> NULL and segfaulting.
>
> (It seems questionable that we'd write a graph entry for such a
> commit anyway, since we know we already have one. I didn't
> double-check, but that may simply be another side effect of having
> cleared the slab).
Yeah, I agree that we should not be re-writing a commit-graph entry that
already exists in an earlier (non-removed) layer of the commit-graph.
I haven't thought through all of the details here, but that sounds like
a potentially dangerous thing to be doing.
> So that fixes the regression in the least-risky way possible.
Thanks for the detailed explanation. I think that the fix presented here
is a reasonable approach, and is worthwhile to pick up.
> I do think there's some fragility here that we might want to follow up
> on. We have a global commit_graph_data_slab that contains graph
> positions, and our global commit structs depend on the that slab
> remaining valid. But close_commit_graph() is just about closing _one_
> object store's graph. So it's dangerous to call that function and clear
> the slab without also throwing away any "struct commit" we might have
> parsed that depends on it.
I definitely agree that there seems like a high risk of another
potentially-dangerous bug happening as a result of this area.
One thing that sticks out to me is that we have a scope mismatch
between the commit structs, the raw_object_store, and the commit slabs.
Slabs are tied to the lifetime of the raw_object_store, but the commit
objects are tied to the global lifetime of the process. I wonder if it
would make sense to have a slab per object-store, and require that you
pass both the commit *and* the object-store you're looking it up in as
arguments to any slab-related functions.
I dunno. I have not put a ton of thought into that ^^ approach either,
so let me know if it seems obviously bogus to you or if this is worth
looking into further.
Thanks,
Taylor
next prev parent reply other threads:[~2024-01-05 17:07 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-05 5:41 [PATCH] commit-graph: retain commit slab when closing NULL commit_graph Jeff King
2024-01-05 17:07 ` Taylor Blau [this message]
2024-01-10 11:39 ` Jeff King
2024-01-10 16:38 ` Junio C Hamano
2024-01-11 7:53 ` Jeff King
2024-01-11 18:35 ` Junio C Hamano
2024-01-05 19:56 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZZg3YIEDCjbbGyVX@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=scott@sl.id.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).