From: "Strain, Roger L." <roger.strain@swri.org>
To: Johan Herland <jherland@gmail.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Cc: Avery Pennarun <apenwarr@gmail.com>
Subject: Re: [BUG?] 'git subtree split' replicates unrelated mainline merges into subtree?
Date: Fri, 8 Jan 2021 18:54:32 +0000
Message-ID: <bc2dbc2ea0c343a28f75452e594da0da@swri.org>
In-Reply-To: <20210107015404.3433-1-jherland@gmail.com>
(Apologies for message formatting, fighting Outlook)
> I've been trying to understand how the subtree cache (mis)behaves in
> this case. The cache is initially seeded from find_existing_splits(),
> which finds these lines in the adc8ecf commit message:
>
> git-subtree-mainline: 9b6e8f677b700a00e9f1715e2624bf5ed756dc85
> git-subtree-split: 5280958b2f997c3ce7bff7192cceb19f55b45cd9
>
> and adds these corresponding entries to the cache:
>
> 9b6e8f6 -> 5280958
> 5280958 -> 5280958
>
> In other words, the cache starts out claiming that 5280958 is the
> equivalent subtree commit for the 9b6e8f6 mainline commit.
> However, in my naive understanding this does not make sense, as
> 9b6e8f6 _precedes_ the subtree addition, and has no content in
> the relevant subdir.
I think you've identified the exact problem right here. In the normal
split/rejoin commits, the mainline commit *prior to* the merge commit
does, in fact, represent the same subtree state as the subtree commit
which is also merged at that point. But in the case of an add, that's
not true, and I'm actually a little surprised the same commit message
markers are generated.
What the initial cache seeding really should capture is that the add
merge commit *itself* has the same subtree content. However, that
can't be recorded at the time the merge commit is created: the hash
of that merge commit depends on the commit message, so the message
cannot contain its own hash.
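To illustrate why the merge commit cannot name itself, here is a minimal
Python sketch of how git derives a commit's SHA-1 from its contents. The
tree ID is git's well-known empty tree, and the author/committer line and
subject are placeholders chosen for illustration, not values from the
repository under discussion.

```python
import hashlib


def git_commit_hash(tree, parents, author, committer, message):
    """Return the SHA-1 object ID git assigns to a commit with these fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    # An empty string yields the blank line separating headers from message.
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    # Git hashes "commit <size>\0" followed by the raw commit body.
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Placeholder values (assumptions for illustration only).
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
PARENTS = [
    "9b6e8f677b700a00e9f1715e2624bf5ed756dc85",
    "5280958b2f997c3ce7bff7192cceb19f55b45cd9",
]
WHO = "A U Thor <author@example.com> 1610132072 +0000"

subject = "Add 'sub/' from commit '5280958'\n"
trailer = "\ngit-subtree-split: 5280958b2f997c3ce7bff7192cceb19f55b45cd9\n"

without_trailer = git_commit_hash(EMPTY_TREE, PARENTS, WHO, WHO, subject)
with_trailer = git_commit_hash(EMPTY_TREE, PARENTS, WHO, WHO, subject + trailer)

# Any change to the message changes the hash.
print(without_trailer != with_trailer)
```

Because the message is part of the hashed data, writing the commit's own
hash into the message would change that hash again.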
I think a possible solution to this would be modifying the initial cache
process to isolate the Add commits and handle them differently.
Rather than using the hashes in the commit message, it should map
the merge commit itself to the subtree-split commit, and either do
nothing with the subtree-mainline commit hash, or explicitly set it
to notree.
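As a rough illustration of the idea, the following Python sketch models
the cache as a dictionary and compares the seeding described in the
quoted text with the proposed handling of add commits. It is not the
git-subtree shell code: the function names, the commit dictionaries, and
the `is_add` flag are all hypothetical, and how an add commit would
actually be recognized is left open here.

```python
import re

# Matches the two trailer lines quoted earlier in this thread.
TRAILER = re.compile(
    r"^git-subtree-(mainline|split):\s*([0-9a-f]+)\s*$", re.MULTILINE
)
NOTREE = "notree"


def parse_trailers(message):
    """Return {'mainline': hash, 'split': hash} for the trailers found."""
    return {key: value for key, value in TRAILER.findall(message)}


def seed_current(commits):
    """Seeding as described in the thread: mainline -> split, split -> split."""
    cache = {}
    for commit in commits:
        found = parse_trailers(commit["message"])
        if "mainline" in found and "split" in found:
            cache[found["mainline"]] = found["split"]
            cache[found["split"]] = found["split"]
    return cache


def seed_proposed(commits):
    """Proposed seeding: an add maps the merge commit itself to the split."""
    cache = {}
    for commit in commits:
        found = parse_trailers(commit["message"])
        if "mainline" not in found or "split" not in found:
            continue
        split = found["split"]
        cache[split] = split
        if commit["is_add"]:  # hypothetical flag supplied by the caller
            cache[commit["hash"]] = split
            cache[found["mainline"]] = NOTREE
        else:
            cache[found["mainline"]] = split
    return cache


MAINLINE = "9b6e8f677b700a00e9f1715e2624bf5ed756dc85"
SPLIT = "5280958b2f997c3ce7bff7192cceb19f55b45cd9"
add_commit = {
    "hash": "adc8ecf",
    "is_add": True,
    "message": (
        "Add 'sub/' from commit '5280958'\n\n"
        f"git-subtree-mainline: {MAINLINE}\n"
        f"git-subtree-split: {SPLIT}\n"
    ),
}

print(seed_current([add_commit]))
print(seed_proposed([add_commit]))
```

With the add commit from this thread, the first function maps the
pre-add mainline commit to the subtree commit, while the second maps
adc8ecf to it and marks the mainline commit as notree.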
However, this will complicate the logic of building the initial cache,
as it currently only looks for those simple lines and adds the
mappings, which is erroneous in this case, as you have noted.
It might also be worth changing the commit message provided for
adds so it no longer generates the incorrect assertion that the mainline
commit is identical subtree-wise, but even with that change, support
for correctly handling existing commits would still be ideal.
Currently, we're using a local version of the subtree script based on
some changes laid out here: https://github.com/gitgitgadget/git/pull/493
I'm hopeful that changeset will eventually land here, as it helps with
several complex issues in our repositories. I bring that up also because
it introduces some additional tools for managing the initial cache,
allowing manual mapping of one commit to another. That version might
allow some level of testing on whether this idea would correct the
problem described.
--
Roger Strain