From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Scott Leggett <scott@sl.id.au>
Subject: Re: [PATCH] commit-graph: retain commit slab when closing NULL commit_graph
Date: Fri, 5 Jan 2024 12:07:44 -0500 [thread overview]
Message-ID: <ZZg3YIEDCjbbGyVX@nand.local> (raw)
In-Reply-To: <20240105054142.GA2035092@coredump.intra.peff.net>
On Fri, Jan 05, 2024 at 12:41:42AM -0500, Jeff King wrote:
> This fixes a regression introduced in ac6d45d11f (commit-graph: move
> slab-clearing to close_commit_graph(), 2023-10-03), in which running:
>
> git -c fetch.writeCommitGraph=true fetch --recurse-submodules
>
> multiple times in a freshly cloned repository causes a segfault. What
> happens in the second (and subsequent) runs is this:
>
> 1. We make a "struct commit" for any ref tips which we're storing
> (even if we already have them, they still go into FETCH_HEAD).
>
> Because the first run will have created a commit graph, we'll find
> those commits in the graph.
>
> The commit struct is therefore created with a NULL "maybe_tree"
> entry, because we can load its oid from the graph later. But to do
> that we need to remember that we got the commit from the graph,
> which is recorded in a global commit_graph_data_slab object.
>
> 2. Because we're using --recurse-submodules, we'll try to fetch each
> of the possible submodules. That implies creating a separate
> "struct repository" in-process for each submodule, which will
> require a later call to repo_clear().
>
> The call to repo_clear() calls raw_object_store_clear(), which in
> turn calls close_object_store(), which in turn calls
> close_commit_graph(). And the latter frees the commit graph data
> slab.
>
> 3. Later, when trying to write out a new commit graph, we'll ask for
> their tree oid via get_commit_tree_oid(), which will see that the
> object is parsed but with a NULL maybe_tree field. We'd then
> usually pull it from the graph file, but because the slab was
> cleared, we don't realize that we can do so! We end up returning
> NULL and segfaulting.
>
> (It seems questionable that we'd write a graph entry for such a
> commit anyway, since we know we already have one. I didn't
> double-check, but that may simply be another side effect of having
> cleared the slab).
Yeah, I agree that we should not be re-writing a commit-graph entry that
already exists in an earlier (non-removed) layer of the commit-graph.
I haven't thought through all of the details here, but that sounds like
a potentially dangerous thing to be doing.
> So that fixes the regression in the least-risky way possible.
Thanks for the detailed explanation. I think that the fix presented here
is a reasonable approach, and is worthwhile to pick up.
> I do think there's some fragility here that we might want to follow up
> on. We have a global commit_graph_data_slab that contains graph
> positions, and our global commit structs depend on the that slab
> remaining valid. But close_commit_graph() is just about closing _one_
> object store's graph. So it's dangerous to call that function and clear
> the slab without also throwing away any "struct commit" we might have
> parsed that depends on it.
I definitely agree that there seems like a high risk of another
potentially-dangerous bug happening as a result of this area.
One thing that sticks out to me is that we have a scope mismatch
between the commit structs, the raw_object_store, and the commit slabs.
Slabs are tied to the lifetime of the raw_object_store, but the commit
objects are tied to the global lifetime of the process. I wonder if it
would make sense to have a slab per object-store, and require that you
pass both the commit *and* the object-store you're looking it up in as
arguments to any slab-related functions.
I dunno. I have not put a ton of thought into that ^^ approach either,
so let me know if it seems obviously bogus to you or if this is worth
looking into further.
Thanks,
Taylor
next prev parent reply other threads:[~2024-01-05 17:07 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-05 5:41 [PATCH] commit-graph: retain commit slab when closing NULL commit_graph Jeff King
2024-01-05 17:07 ` Taylor Blau [this message]
2024-01-10 11:39 ` Jeff King
2024-01-10 16:38 ` Junio C Hamano
2024-01-11 7:53 ` Jeff King
2024-01-11 18:35 ` Junio C Hamano
2024-01-05 19:56 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZZg3YIEDCjbbGyVX@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=scott@sl.id.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.