* Re: [PATCH] contrib/subtree: fix split with squashed subtrees
2025-08-24 19:10 [PATCH] contrib/subtree: fix split with squashed subtrees Colin Stagner
@ 2025-09-01 13:54 ` Phillip Wood
2025-09-01 20:43 ` Colin Stagner
2025-09-05 2:27 ` [PATCH v2] " Colin Stagner
2025-09-10 3:11 ` [PATCH v3] " Colin Stagner
2 siblings, 1 reply; 15+ messages in thread
From: Phillip Wood @ 2025-09-01 13:54 UTC (permalink / raw)
To: Colin Stagner, git
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
Hi Colin
On 24/08/2025 20:10, Colin Stagner wrote:
> 98ba49ccc2 (subtree: fix split processing with multiple subtrees
> present, 2023-12-01) increases the performance of
>
> git subtree split --prefix=subA
>
> by ignoring subtree merges which are outside of `subA/`. It also
> introduces a regression. Subtree merges that should be retained
> are incorrectly ignored if they:
>
> 1. are nested under `subA/`; and
> 2. are merged with `--squash`.
>
> For example, a subtree merged like:
>
> git subtree merge --squash --prefix=subA/subB "$rev"
> # ^^^^^^^^ ^^^^
>
> is erroneously ignored during a split of `subA`. This causes
> missing tree files and different commit hashes starting in
> git v2.44.0-rc0.
>
> The method:
>
> should_ignore_subtree_split_commit REV
>
> should test only if REV is a subtree commit, but the combination of
>
> git log -1 --grep=...
>
> actually searches all *parent* commits until a `--grep` match is
> discovered. Limit these checks to the current REV only.
Thanks for the clear explanation of the problem and the proposed solution.
> Tests now cover nested subtrees.
Great
> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index 3fddba797c..139049351d 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -788,17 +788,17 @@ ensure_valid_ref_format () {
> # Usage: check if a commit from another subtree should be
> # ignored from processing for splits
> should_ignore_subtree_split_commit () {
> assert test $# = 1
> local rev="$1"
> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
> + if test -n "$(git log -1 --grep="git-subtree-dir:" "$rev^!")"
This makes sense as we only want to grep the current commit. We could
drop the "-1" as we're only considering a single commit.
> then
> - if test -z "$(git log -1 --grep="git-subtree-mainline:" $rev)" &&
> - test -z "$(git log -1 --grep="git-subtree-dir: $arg_prefix$" $rev)"
> + if test -z "$(git log -1 --grep="git-subtree-mainline:" "$rev^!")" &&
> + test -z "$(git log -1 --grep="git-subtree-dir: $arg_prefix$" "$rev^!")"
I'm less sure about this change. Is the second test checking making sure
we don't prune this commit if it has an ancestor that is a subtree merge
for the subtree we're interested in? It would be very helpful if Zach
could comment on what was intended here.
If it turns out that all three tests only want to consider a single
commit then it would be be more efficient to run a single git command
and check the output with something like
git show -s
--format='%(trailers:key=git-subtree-dir,key=git-subtree-mainline' $rev
| while read trailer
do
# check trailers here using case "$trailer"
done
Thanks
Phillip
> then
> return 0
> fi
> fi
> return 1
> }
>
> # Usage: process_split_commit REV PARENTS
> diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh
> index 3edbb33af4..1ddc213621 100755
> --- a/contrib/subtree/t/t7900-subtree.sh
> +++ b/contrib/subtree/t/t7900-subtree.sh
> @@ -68,6 +68,34 @@ test_create_pre2_32_repo () {
> git -C "$1-clone" replace HEAD^2 $new_commit
> }
>
> +# test_create_subtree_add REPO ORPHAN PREFIX FILENAME ...
> +#
> +# Create a simple subtree on a new branch named ORPHAN in REPO.
> +# The subtree is then merged into the current branch of REPO,
> +# under PREFIX. The generated subtree has has one commit
> +# with subject and tag FILENAME with a single file "FILENAME.t"
> +#
> +# When this method returns:
> +# - the current branch of REPO will have file PREFIX/FILENAME.t
> +# - REPO will have a branch named ORPHAN with subtree history
> +#
> +# additional arguments are forwarded to "subtree add"
> +test_create_subtree_add () {
> + (
> + cd "$1" &&
> + orphan="$2" &&
> + prefix="$3" &&
> + filename="$4" &&
> + shift 4 &&
> + last="$(git branch --show-current)" &&
> + git checkout --orphan "$orphan" &&
> + git rm -rf . &&
> + test_commit "$filename" &&
> + git checkout "$last" &&
> + git subtree add --prefix="$prefix" "$@" "$orphan"
> + )
> +}
> +
> test_expect_success 'shows short help text for -h' '
> test_expect_code 129 git subtree -h >out 2>err &&
> test_must_be_empty err &&
> @@ -426,6 +454,48 @@ test_expect_success 'split with multiple subtrees' '
> --squash --rejoin -d -m "Sub B Split 1" 2>&1 | grep -w "\[1\]")" = ""
> '
>
> +# When subtree split-ing a directory that has other subtree
> +# *merges* underneath it, the split must include those subtrees.
> +# This test creates a nested subtree, `subA/subB`, and tests
> +# that the tree is correct after a subtree split of `subA/`.
> +# The test covers:
> +# - An initial `subtree add`; and
> +# - A follow-up `subtree merge`
> +# both with and without `--squashed`.
> +for is_squashed in '' 'y';
> +do
> + test_expect_success "split keeps nested ${is_squashed:+--squash }subtrees that are part of the split" '
> + subtree_test_create_repo "$test_count" &&
> + (
> + cd "$test_count" &&
> + mkdir subA &&
> + test_commit subA/file1 &&
> + git branch -m main &&
> + test_create_subtree_add \
> + . mksubtree subA/subB file2 ${is_squashed:+--squash} &&
> + test -e subA/file1.t &&
> + test -e subA/subB/file2.t &&
> + git subtree split --prefix=subA --branch=bsplit &&
> + git checkout bsplit &&
> + test -e file1.t &&
> + test -e subB/file2.t &&
> + git checkout mksubtree &&
> + git branch -D bsplit &&
> + test_commit file3 &&
> + git checkout main &&
> + git subtree merge \
> + ${is_squashed:+--squash} \
> + --prefix=subA/subB mksubtree &&
> + test -e subA/subB/file3.t &&
> + git subtree split --prefix=subA --branch=bsplit &&
> + git checkout bsplit &&
> + test -e file1.t &&
> + test -e subB/file2.t &&
> + test -e subB/file3.t
> + )
> + '
> +done
> +
> test_expect_success 'split sub dir/ with --rejoin from scratch' '
> subtree_test_create_repo "$test_count" &&
> test_create_commit "$test_count" main1 &&
>
> base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] contrib/subtree: fix split with squashed subtrees
2025-09-01 13:54 ` Phillip Wood
@ 2025-09-01 20:43 ` Colin Stagner
2025-09-02 13:22 ` Phillip Wood
0 siblings, 1 reply; 15+ messages in thread
From: Colin Stagner @ 2025-09-01 20:43 UTC (permalink / raw)
To: git, phillip.wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher,
Colin Stagner
On 9/1/25 08:54, Phillip Wood wrote:
Colin Stagner <ask+git@howdoi.land> writes:
>> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
>> + if test -n "$(git log -1 --grep="git-subtree-dir:" "$rev^!")"
>
> We could drop the "-1" as we're only considering a single commit.
Concur.
>> - if test -z "$(git log -1 --grep="git-subtree-mainline:"
>> $rev)" &&
>> - test -z "$(git log -1 --grep="git-subtree-dir:
>> $arg_prefix$" $rev)"
>> + if test -z "$(git log -1 --grep="git-subtree-mainline:"
>> "$rev^!")" &&
>> + test -z "$(git log -1 --grep="git-subtree-dir:
>> $arg_prefix$" "$rev^!")"
>
> I'm less sure about this change. Is the second test checking
> making sure we don't prune this commit if it has an ancestor
> that is a subtree merge for the subtree we're interested in?
The outer loop in git-subtree.sh:983 appears to iterate from the root
commit forwards… and not from the HEAD backwards.
git rev-list --topo-order --reverse --parents $rev $unrevs
# ^^^^^^^^^
Since the iteration is ancestor-first, I'm having difficulty seeing why
`should_ignore_subtree_split_commit()` would want to do an ancestor
traversal at all. It already sees the commits ancestor-first. But there
could be a reason that I don't know.
Here is a more long-winded breakdown of these tests. From what I can
determine:
test -z "$(git log -1 --grep="git-subtree-mainline:" "$rev")
excludes squashed commits created from
git subtree merge --prefix subM --squash srcBranch
The --squash creates two commits:
1. A single-parent "Squashed 'subM/' content from", which
squashes the changes from srcBranch. This commit's tree
is like the one on srcBranch. It does not have the `subM/`
prefix.
2. A merge commit which rewrites the tree in (1) to add
the `subM/` leading directory, then merge it with the
current branch. The merge commit doesn't have any
`git-subtree:` trailers.
We must exclude (1) since the trees aren't actually compatible with
HEAD. (They don't have the `subM` prefix). We must keep (2). The above
`test -z` appears to do this.
I am *much* less certain about the second test:
test -z "$(git log -1 \
--grep="git-subtree-dir: $arg_prefix$" $rev)"
I think this was intended to keep the mainline portion from a previous
`git subtree split --rejoin`. But if I remove this `test -z`, all the
unit tests still pass—including mine. There may not be any test coverage
for this line. I will probably omit this `test` from v2.
> It would be very helpful if Zach could comment on what was intended here.
Yes, this would aid my understanding a lot.
> If it turns out that all three tests only want to consider a single
> commit then it would be be more efficient to run a single git command
> and check the output with something like
>
> git show -s --format='%(trailers:key=git-subtree-dir,key=git-
> subtree-mainline' $rev | while read trailer
> do
> # check trailers here using case "$trailer"
> done
This is a cleaner approach, and I'll explore it for v2. Any objection to
long options like `--no-patch` instead of `-s`? I find these are better
for scripts since there's less hunting around in man pages.
Thanks for your review,
Colin
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] contrib/subtree: fix split with squashed subtrees
2025-09-01 20:43 ` Colin Stagner
@ 2025-09-02 13:22 ` Phillip Wood
2025-09-02 14:57 ` Phillip Wood
0 siblings, 1 reply; 15+ messages in thread
From: Phillip Wood @ 2025-09-02 13:22 UTC (permalink / raw)
To: Colin Stagner, git, phillip.wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
On 01/09/2025 21:43, Colin Stagner wrote:
> On 9/1/25 08:54, Phillip Wood wrote:
> Colin Stagner <ask+git@howdoi.land> writes:
>>> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
>>> + if test -n "$(git log -1 --grep="git-subtree-dir:" "$rev^!")"
>>
>> We could drop the "-1" as we're only considering a single commit.
>
> Concur.
>
>>> - if test -z "$(git log -1 --grep="git-subtree-mainline:"
>>> $rev)" &&
>>> - test -z "$(git log -1 --grep="git-subtree-dir:
>>> $arg_prefix$" $rev)"
>>> + if test -z "$(git log -1 --grep="git-subtree-mainline:"
>>> "$rev^!")" &&
>>> + test -z "$(git log -1 --grep="git-subtree-dir:
>>> $arg_prefix$" "$rev^!")"
>>
>> I'm less sure about this change. Is the second test checking
>> making sure we don't prune this commit if it has an ancestor
>> that is a subtree merge for the subtree we're interested in?
>
> The outer loop in git-subtree.sh:983 appears to iterate from the root
> commit forwards… and not from the HEAD backwards.
>
> git rev-list --topo-order --reverse --parents $rev $unrevs
> # ^^^^^^^^^
>
> Since the iteration is ancestor-first, I'm having difficulty seeing why
> `should_ignore_subtree_split_commit()` would want to do an ancestor
> traversal at all. It already sees the commits ancestor-first. But there
> could be a reason that I don't know.
Ah, I was only looking at this patch, not how it was called. That begs
the question "what's the point of these checks if we've already visited
all the ancestors anyway". I think the answer is that it is pruning the
recursion that happens in check_parent() and checking the commits that
come from that rev-list command is pointless. The regression test
introduced with this function only looks at $extracount which comes from
the recursion. I haven't looked too closely but it would be nice if we
could move this check so it is only run when check_parents() is recursing.
> Here is a more long-winded breakdown of these tests. From what I can
> determine:
>
> test -z "$(git log -1 --grep="git-subtree-mainline:" "$rev")
>
> excludes squashed commits created from
>
> git subtree merge --prefix subM --squash srcBranch
> > The --squash creates two commits:
>
> 1. A single-parent "Squashed 'subM/' content from", which
> squashes the changes from srcBranch. This commit's tree
> is like the one on srcBranch. It does not have the `subM/`
> prefix.
Yes, and the commit message comes from squash_msg() which adds the
"git-subtree-dir:" and "git-subtree-split:" trailers but not
"git-subtree-mainline:".
> 2. A merge commit which rewrites the tree in (1) to add
> the `subM/` leading directory, then merge it with the
> current branch. The merge commit doesn't have any
> `git-subtree:` trailers.
Indeed. Running
git subtree split --squash --rejoin --prefix subM
will create a squash commit as above and a merge with a commit message
created by rejoin_msg() and contains the "git-subtree-dir:",
"git-subtree-split:" and "git-subtree-mainline:" trailers.
> We must exclude (1) since the trees aren't actually compatible with
> HEAD. (They don't have the `subM` prefix). We must keep (2). The above
> `test -z` appears to do this.
So we'll exclude squashed merges but not squashed splits? I think you're
right that we don't want "git log" to walk the history here.
> I am *much* less certain about the second test:
>
> test -z "$(git log -1 \
> --grep="git-subtree-dir: $arg_prefix$" $rev)"
>
> I think this was intended to keep the mainline portion from a previous
> `git subtree split --rejoin`. But if I remove this `test -z`, all the
> unit tests still pass—including mine. There may not be any test coverage
> for this line. I will probably omit this `test` from v2.
I'm not very familiar with git-subtree but I thought this was ensuring
that we did not exclude the ancestors of a squash or split that involves
the subtree that we're interested in. I wouldn't be surprised if the
test coverage was lacking.
>> It would be very helpful if Zach could comment on what was intended here.
>
> Yes, this would aid my understanding a lot.
and mine too.
>> If it turns out that all three tests only want to consider a single
>> commit then it would be be more efficient to run a single git command
>> and check the output with something like
>>
>> git show -s --format='%(trailers:key=git-subtree-dir,key=git-
>> subtree-mainline' $rev | while read trailer
>> do
>> # check trailers here using case "$trailer"
>> done
>
> This is a cleaner approach, and I'll explore it for v2. Any objection to
> long options like `--no-patch` instead of `-s`? I find these are better
> for scripts since there's less hunting around in man pages.
Sure, I used '-s' out of habit but '--no-patch' would be clearer
(especially as I can't see any link between the short and long option
names in this case)
Thanks
Phillip
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] contrib/subtree: fix split with squashed subtrees
2025-09-02 13:22 ` Phillip Wood
@ 2025-09-02 14:57 ` Phillip Wood
2025-09-04 1:34 ` Colin Stagner
0 siblings, 1 reply; 15+ messages in thread
From: Phillip Wood @ 2025-09-02 14:57 UTC (permalink / raw)
To: Colin Stagner, git, phillip.wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
On 02/09/2025 14:22, Phillip Wood wrote:
> On 01/09/2025 21:43, Colin Stagner wrote:
>> On 9/1/25 08:54, Phillip Wood wrote:
>> Colin Stagner <ask+git@howdoi.land> writes:
>>
>> The outer loop in git-subtree.sh:983 appears to iterate from the root
>> commit forwards… and not from the HEAD backwards.
>>
>> git rev-list --topo-order --reverse --parents $rev $unrevs
>> # ^^^^^^^^^
>>
>> Since the iteration is ancestor-first, I'm having difficulty seeing
>> why `should_ignore_subtree_split_commit()` would want to do an
>> ancestor traversal at all. It already sees the commits ancestor-first.
>> But there could be a reason that I don't know.
>
> Ah, I was only looking at this patch, not how it was called. That begs
> the question "what's the point of these checks if we've already visited
> all the ancestors anyway". I think the answer is that it is pruning the
> recursion that happens in check_parent() and checking the commits that
> come from that rev-list command is pointless. The regression test
> introduced with this function only looks at $extracount which comes from
> the recursion. I haven't looked too closely but it would be nice if we
> could move this check so it is only run when check_parents() is recursing.
Sorry that's not quite right. check_parents() recurses into
process_split_commit() rather than the loop that call
should_ignore_subtree_split_commit(). I think what this check does do is
prune some parents which stops check_parents() from recursing into other
subtrees so the check is in the right place.
Thanks
Phillip
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] contrib/subtree: fix split with squashed subtrees
2025-09-02 14:57 ` Phillip Wood
@ 2025-09-04 1:34 ` Colin Stagner
0 siblings, 0 replies; 15+ messages in thread
From: Colin Stagner @ 2025-09-04 1:34 UTC (permalink / raw)
To: Phillip Wood, git, phillip.wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
On 9/2/25 09:57, Phillip Wood wrote:
> On 01/09/2025 21:43, Colin Stagner wrote:
>>
>> The outer loop in git-subtree.sh:983 appears to iterate from the root
>> commit forwards… and not from the HEAD backwards.
>>
>> git rev-list --topo-order --reverse --parents $rev $unrevs
>> # ^^^^^^^^^
>>
>> Since the iteration is ancestor-first, I'm having difficulty seeing
>> why `should_ignore_subtree_split_commit()` would want to do an
>> ancestor traversal at all. It already sees the commits ancestor-
>> first.
> check_parents() recurses into process_split_commit() rather than the
> loop that call should_ignore_subtree_split_commit(). I think what this
> check does do is prune some parents which stops check_parents() from
> recursing into other subtrees so the check is in the right place.
I agree. In the original patch [1], Zach indicated that the check
results in a significant speedup for rejoin-heavy repos. The check is
clearly doing something.
Performance improvements may be possible. Instead of looking at commits
one at a time, this operation might be faster as part of a one-shot
HEAD-to-root traversal:
git log --grep 'for stuff' --format='%(trailers:...)' $unrev..HEAD
Commits that are deemed "uninteresting" or unnecessary could then be
provided, in bulk, as negative refs to the `git rev-list` traversal.
But my plan is to make the smallest and most portable maint-2.44 bugfix.
I think that non-essential performance changes are a task for later.
>> I am *much* less certain about the second test:
>>
>> test -z "$(git log -1 \
>> --grep="git-subtree-dir: $arg_prefix$" $rev)"
>>
>> If I remove this `test -z`, all the unit tests still pass—including mine. There may not be any test coverage for this line.
> I'm not very familiar with git-subtree but I thought this was ensuring
> that we did not exclude the ancestors of a squash or split that involves
> the subtree that we're interested in.
It does, but I am still having problems finding commits that actually
trigger it.
It appears that `find_existing_splits()` in git-subtree.sh:459 filters
out the commits that the `test -z` I quoted above would otherwise
detect. `find_existing_splits()` searches for a previous --rejoin commit
to use as an `$unrev` stopping point for the rev-walk. It searches for
commits matching
git log --grep="^git-subtree-dir: $dir/*\$"
in combination with `git-subtree-mainline:`.
This is essentially the same test as in
`should_ignore_subtree_split_commit()`.
In --ignore-joins mode, `find_existing_splits()` looks for different
commits. I experimented a bit with adding --ignore-joins to some of the
existing unit tests, but I still could not find any instance where this
`test -z` makes a difference.
That said... I am inclined to keep this second test. The bug I am
patching is the result of an overzealous prune. The last thing I want to
do is to inadvertently prune commits we need for the sake of a
performance boost.
> I wouldn't be surprised if the test coverage was lacking.
There don't appear to be any tests at all for --ignore-joins, aside from
option parsing.
[1]: 98ba49ccc2 (subtree: fix split processing with multiple subtrees
present, 2023-12-01)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2] contrib/subtree: fix split with squashed subtrees
2025-08-24 19:10 [PATCH] contrib/subtree: fix split with squashed subtrees Colin Stagner
2025-09-01 13:54 ` Phillip Wood
@ 2025-09-05 2:27 ` Colin Stagner
2025-09-08 15:21 ` Phillip Wood
2025-09-10 3:11 ` [PATCH v3] " Colin Stagner
2 siblings, 1 reply; 15+ messages in thread
From: Colin Stagner @ 2025-09-05 2:27 UTC (permalink / raw)
To: git, phillip.wood, Phillip Wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher,
Colin Stagner
98ba49ccc2 (subtree: fix split processing with multiple subtrees
present, 2023-12-01) increases the performance of
git subtree split --prefix=subA
by ignoring subtree merges which are outside of `subA/`. It also
introduces a regression. Subtree merges that should be retained
are incorrectly ignored if they:
1. are nested under `subA/`; and
2. are merged with `--squash`.
For example, a subtree merged like:
git subtree merge --squash --prefix=subA/subB "$rev"
# ^^^^^^^^ ^^^^
is erroneously ignored during a split of `subA`. This causes
missing tree files and different commit hashes starting in
git v2.44.0-rc0.
The method:
should_ignore_subtree_split_commit REV
should test only a single commit REV, but the combination of
git log -1 --grep=...
actually searches all *parent* commits until a `--grep` match is
discovered.
Rewrite this method to test only one REV at a time. Extract commit
information with a single `git` call as opposed to three. The
`test` conditions for rejecting a commit remain unchanged.
Unit tests now cover nested subtrees.
Signed-off-by: Colin Stagner <ask+git@howdoi.land>
---
Notes:
This bugfix patch is intended for maint-2.44 and up.
v2 rewrites `should_ignore_subtree_split_commit()` completely. In
addition to the review comments, v2 also:
* adds `--no-show-signature` to align with
8841b5222c (subtree: fix add and pull for GPG-signed commits,
2018-02-23)
* removes use of `local` from `should_ignore_subtree_split_commit`.
`local` is not part of POSIX sh. Other uses of `local` remain in
untouched code.
* unit tests are unchanged since v1.
contrib/subtree/git-subtree.sh | 36 +++++++++++----
contrib/subtree/t/t7900-subtree.sh | 70 ++++++++++++++++++++++++++++++
2 files changed, 98 insertions(+), 8 deletions(-)
diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 5dab3f506c..c3cd60d341 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -783,24 +783,44 @@ ensure_clean () {
# Usage: ensure_valid_ref_format REF
ensure_valid_ref_format () {
assert test $# = 1
git check-ref-format "refs/heads/$1" ||
die "fatal: '$1' does not look like a ref"
}
-# Usage: check if a commit from another subtree should be
+# Usage: should_ignore_subtree_split_commit REV
+#
+# Check if REV is a commit from another subtree and should be
# ignored from processing for splits
should_ignore_subtree_split_commit () {
assert test $# = 1
- local rev="$1"
+
+ git show \
+ --no-patch \
+ --no-show-signature \
+ --format='%(trailers:key=git-subtree-dir,key=git-subtree-mainline)' \
+ "$1" |
+ (
+ have_mainline=
+ subtree_dir=
+
+ while read -r trailer val
+ do
+ case "$trailer" in
+ (git-subtree-dir:)
+ subtree_dir="${val%/}" ;;
+ (git-subtree-mainline:)
+ have_mainline=y ;;
+ esac
+ done
+
- if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
+ if test -n "${subtree_dir:-}" &&
+ test -z "${have_mainline:-}" &&
+ test "${subtree_dir}" != "$arg_prefix"
then
- if test -z "$(git log -1 --grep="git-subtree-mainline:" $rev)" &&
- test -z "$(git log -1 --grep="git-subtree-dir: $arg_prefix$" $rev)"
- then
- return 0
- fi
+ return 0
fi
return 1
+ )
}
# Usage: process_split_commit REV PARENTS
diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh
index ca4df5be83..8bd45e7be7 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -67,6 +67,34 @@ test_create_pre2_32_repo () {
git -C "$1-clone" replace HEAD^2 $new_commit
}
+# test_create_subtree_add REPO ORPHAN PREFIX FILENAME ...
+#
+# Create a simple subtree on a new branch named ORPHAN in REPO.
+# The subtree is then merged into the current branch of REPO,
+# under PREFIX. The generated subtree has has one commit
+# with subject and tag FILENAME with a single file "FILENAME.t"
+#
+# When this method returns:
+# - the current branch of REPO will have file PREFIX/FILENAME.t
+# - REPO will have a branch named ORPHAN with subtree history
+#
+# additional arguments are forwarded to "subtree add"
+test_create_subtree_add () {
+ (
+ cd "$1" &&
+ orphan="$2" &&
+ prefix="$3" &&
+ filename="$4" &&
+ shift 4 &&
+ last="$(git branch --show-current)" &&
+ git checkout --orphan "$orphan" &&
+ git rm -rf . &&
+ test_commit "$filename" &&
+ git checkout "$last" &&
+ git subtree add --prefix="$prefix" "$@" "$orphan"
+ )
+}
+
test_expect_success 'shows short help text for -h' '
test_expect_code 129 git subtree -h >out 2>err &&
test_must_be_empty err &&
@@ -425,6 +453,48 @@ test_expect_success 'split with multiple subtrees' '
--squash --rejoin -d -m "Sub B Split 1" 2>&1 | grep -w "\[1\]")" = ""
'
+# When subtree split-ing a directory that has other subtree
+# *merges* underneath it, the split must include those subtrees.
+# This test creates a nested subtree, `subA/subB`, and tests
+# that the tree is correct after a subtree split of `subA/`.
+# The test covers:
+# - An initial `subtree add`; and
+# - A follow-up `subtree merge`
+# both with and without `--squashed`.
+for is_squashed in '' 'y';
+do
+ test_expect_success "split keeps nested ${is_squashed:+--squash }subtrees that are part of the split" '
+ subtree_test_create_repo "$test_count" &&
+ (
+ cd "$test_count" &&
+ mkdir subA &&
+ test_commit subA/file1 &&
+ git branch -m main &&
+ test_create_subtree_add \
+ . mksubtree subA/subB file2 ${is_squashed:+--squash} &&
+ test -e subA/file1.t &&
+ test -e subA/subB/file2.t &&
+ git subtree split --prefix=subA --branch=bsplit &&
+ git checkout bsplit &&
+ test -e file1.t &&
+ test -e subB/file2.t &&
+ git checkout mksubtree &&
+ git branch -D bsplit &&
+ test_commit file3 &&
+ git checkout main &&
+ git subtree merge \
+ ${is_squashed:+--squash} \
+ --prefix=subA/subB mksubtree &&
+ test -e subA/subB/file3.t &&
+ git subtree split --prefix=subA --branch=bsplit &&
+ git checkout bsplit &&
+ test -e file1.t &&
+ test -e subB/file2.t &&
+ test -e subB/file3.t
+ )
+ '
+done
+
test_expect_success 'split sub dir/ with --rejoin from scratch' '
subtree_test_create_repo "$test_count" &&
test_create_commit "$test_count" main1 &&
base-commit: 09669c729af92144fde84e97d358759b5b42b555
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH v2] contrib/subtree: fix split with squashed subtrees
2025-09-05 2:27 ` [PATCH v2] " Colin Stagner
@ 2025-09-08 15:21 ` Phillip Wood
2025-09-10 1:56 ` Colin Stagner
0 siblings, 1 reply; 15+ messages in thread
From: Phillip Wood @ 2025-09-08 15:21 UTC (permalink / raw)
To: Colin Stagner, git, phillip.wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
Hi Colin
This is looking good. I've left a few comments below, especially on the
new test.
On 05/09/2025 03:27, Colin Stagner wrote:
>
> -# Usage: check if a commit from another subtree should be
> +# Usage: should_ignore_subtree_split_commit REV
> +#
> +# Check if REV is a commit from another subtree and should be
> # ignored from processing for splits
> should_ignore_subtree_split_commit () {
> assert test $# = 1
> - local rev="$1"
> +
> + git show \
> + --no-patch \
> + --no-show-signature \
> + --format='%(trailers:key=git-subtree-dir,key=git-subtree-mainline)' \
> + "$1" |
> + (
> + have_mainline=
> + subtree_dir=
> +
> + while read -r trailer val
> + do
> + case "$trailer" in
> + (git-subtree-dir:)
> + subtree_dir="${val%/}" ;;
> + (git-subtree-mainline:)
> + have_mainline=y ;;
> + esac
> + done
This looks good, we run git log once and then parse the trailers. We do
not use the optional '(' in case statements in our code though.
> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
> + if test -n "${subtree_dir:-}" &&
> + test -z "${have_mainline:-}" &&
> + test "${subtree_dir}" != "$arg_prefix"
If we have a git-subtree-dir: trailer whose value does not match the
subtree we're interested in and there is no git-subtree-mainline:
trailer then we skip this commit - good. What's the idea behind using
"${var:-}" rather than "{var}"?
> then
> - if test -z "$(git log -1 --grep="git-subtree-mainline:" $rev)" &&
> - test -z "$(git log -1 --grep="git-subtree-dir: $arg_prefix$" $rev)"
> - then
> - return 0
> - fi
> + return 0
> fi
> return 1
> + )
> }
>
> # Usage: process_split_commit REV PARENTS
> diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh
> index ca4df5be83..8bd45e7be7 100755
> --- a/contrib/subtree/t/t7900-subtree.sh
> +++ b/contrib/subtree/t/t7900-subtree.sh
> @@ -67,6 +67,34 @@ test_create_pre2_32_repo () {
> git -C "$1-clone" replace HEAD^2 $new_commit
> }
>
> +# test_create_subtree_add REPO ORPHAN PREFIX FILENAME ...
> +#
> +# Create a simple subtree on a new branch named ORPHAN in REPO.
> +# The subtree is then merged into the current branch of REPO,
> +# under PREFIX. The generated subtree has has one commit
> +# with subject and tag FILENAME with a single file "FILENAME.t"
> +#
> +# When this method returns:
> +# - the current branch of REPO will have file PREFIX/FILENAME.t
> +# - REPO will have a branch named ORPHAN with subtree history
> +#
> +# additional arguments are forwarded to "subtree add"
> +test_create_subtree_add () {
> + (
> + cd "$1" &&
> + orphan="$2" &&
> + prefix="$3" &&
> + filename="$4" &&
> + shift 4 &&
> + last="$(git branch --show-current)" &&
> + git checkout --orphan "$orphan" &&
> + git rm -rf . &&
If you use "git switch --orphan" that clears the worktree for you
> + test_commit "$filename" &&
> + git checkout "$last" &&
I think this could be "git checkout @{-1}" and then we'd avoid having to
run "git branch" above
> + git subtree add --prefix="$prefix" "$@" "$orphan"
> + )
> +}
> +
> test_expect_success 'shows short help text for -h' '
> test_expect_code 129 git subtree -h >out 2>err &&
> test_must_be_empty err &&
> @@ -425,6 +453,48 @@ test_expect_success 'split with multiple subtrees' '
> --squash --rejoin -d -m "Sub B Split 1" 2>&1 | grep -w "\[1\]")" = ""
> '
>
> +# When subtree split-ing a directory that has other subtree
> +# *merges* underneath it, the split must include those subtrees.
> +# This test creates a nested subtree, `subA/subB`, and tests
> +# that the tree is correct after a subtree split of `subA/`.
> +# The test covers:
> +# - An initial `subtree add`; and
> +# - A follow-up `subtree merge`
> +# both with and without `--squashed`.
> +for is_squashed in '' 'y';
no need for ';' at the end of the line
> +do
> + test_expect_success "split keeps nested ${is_squashed:+--squash }subtrees that are part of the split" '
> + subtree_test_create_repo "$test_count" &&
> + (
> + cd "$test_count" &&
> + mkdir subA &&
> + test_commit subA/file1 &&
> + git branch -m main &&
For tests that depend on the default branch name you can add
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
to the start of the file before it sources test-lib.sh. Or we could
change subtree_create_repo() to pass all its arguments along to
test_create_repo() and use "subtree_create_repo --initial-branch=main
$test_count"
> + test_create_subtree_add \
> + . mksubtree subA/subB file2 ${is_squashed:+--squash} &&
> + test -e subA/file1.t &&
We have test_path_is_file() for this which prints a useful diagnostic
message it it fails
Thanks
Phillip
> + test -e subA/subB/file2.t &&
> + git subtree split --prefix=subA --branch=bsplit &&
> + git checkout bsplit &&
> + test -e file1.t &&
> + test -e subB/file2.t &&
> + git checkout mksubtree &&
> + git branch -D bsplit &&
> + test_commit file3 &&
> + git checkout main &&
> + git subtree merge \
> + ${is_squashed:+--squash} \
> + --prefix=subA/subB mksubtree &&
> + test -e subA/subB/file3.t &&
> + git subtree split --prefix=subA --branch=bsplit &&
> + git checkout bsplit &&
> + test -e file1.t &&
> + test -e subB/file2.t &&
> + test -e subB/file3.t
> + )
> + '
> +done
> +
> test_expect_success 'split sub dir/ with --rejoin from scratch' '
> subtree_test_create_repo "$test_count" &&
> test_create_commit "$test_count" main1 &&
>
> base-commit: 09669c729af92144fde84e97d358759b5b42b555
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v2] contrib/subtree: fix split with squashed subtrees
2025-09-08 15:21 ` Phillip Wood
@ 2025-09-10 1:56 ` Colin Stagner
2025-09-10 2:02 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Colin Stagner @ 2025-09-10 1:56 UTC (permalink / raw)
To: phillip.wood, git; +Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
Phillip,
Hello again! I have adopted your recommendations everywhere except for
`git checkout @{-1}`. Details below.
On 9/8/25 10:21, Phillip Wood wrote:
> On 05/09/2025 03:27, Colin Stagner wrote:
>> + while read -r trailer val
>> + do
>> + case "$trailer" in
>> + (git-subtree-dir:)
>> + subtree_dir="${val%/}" ;;
>> + (git-subtree-mainline:)
>> + have_mainline=y ;;
>> + esac
>> + done
>
> We do not use the optional '(' in case statements
Will fix in v3.
>
>> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
>> + if test -n "${subtree_dir:-}" &&
>> + test -z "${have_mainline:-}" &&
>> + test "${subtree_dir}" != "$arg_prefix"
>
> What's the idea behind using "${var:-}" rather than "{var}"?
I write a lot of shell scripts that run "set -u" (aka "set -o nounset"),
so I do this a lot when testing for empty vars. In this case, it's not
actually necessary since `have_mainline` is explicitly defined above.
And we don't run `set -u` anyway.
Will remove from v3.
>> +test_create_subtree_add () {
>> + (
>> + cd "$1" &&
>> + orphan="$2" &&
>> + prefix="$3" &&
>> + filename="$4" &&
>> + shift 4 &&
>> + last="$(git branch --show-current)" &&
>> + git checkout --orphan "$orphan" &&
>> + git rm -rf . &&
>
> If you use "git switch --orphan" that clears the worktree for you
Very useful. I'll start using it in v3.
>> + test_commit "$filename" &&
>> + git checkout "$last" &&
>
> I think this could be "git checkout @{-1}" and then we'd avoid having to
> run "git branch" above
I experimented with this, but I couldn't get it to work on git 2.44.
Although the reflog shows the refs I expect, using
git switch '@{-1}'
dies with
fatal: invalid reference: @{-1}
checkout doesn't work either. Perhaps there is something about --orphan
that is messing up the history.
I could make `test_create_subtree_add` take a mainline branch name,
but... unless there's something unsound about v2, I think we should just
keep v2. `git branch --show-current` looks like well-defined porcelain.
Any other ideas?
>> +# The test covers:
>> +# - An initial `subtree add`; and
>> +# - A follow-up `subtree merge`
>> +# both with and without `--squashed`.
>> +for is_squashed in '' 'y';
>
> no need for ';' at the end of the line
Fixed for v3.
>> + subtree_test_create_repo "$test_count" &&
>> + (
>> + cd "$test_count" &&
>> + mkdir subA &&
>> + test_commit subA/file1 &&
>> + git branch -m main &&
>
> For tests that depend on the default branch name you can add
>
> GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> to the start of the file before it sources test-lib.sh.
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main looks very common, so I'll go
with that for v3.
>> + test_create_subtree_add \
>> + . mksubtree subA/subB file2 ${is_squashed:+--squash} &&
>> + test -e subA/file1.t &&
>
> We have test_path_is_file() for this which prints a useful diagnostic
> message
Fixed all occurrences in v3.
Colin
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v2] contrib/subtree: fix split with squashed subtrees
2025-09-10 1:56 ` Colin Stagner
@ 2025-09-10 2:02 ` Junio C Hamano
2025-09-10 3:00 ` Colin Stagner
0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2025-09-10 2:02 UTC (permalink / raw)
To: Colin Stagner
Cc: phillip.wood, git, Zach FettersMoore, Christian Couder,
Patrik Weiskircher
Colin Stagner <ask+git@howdoi.land> writes:
>>> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
>>> + if test -n "${subtree_dir:-}" &&
>>> + test -z "${have_mainline:-}" &&
>>> + test "${subtree_dir}" != "$arg_prefix"
>> What's the idea behind using "${var:-}" rather than "{var}"?
>
> I write a lot of shell scripts that run "set -u" (aka "set -o
> nounset"), so I do this a lot when testing for empty vars. In this
> case, it's not actually necessary since `have_mainline` is explicitly
> defined above. And we don't run `set -u` anyway.
Besides, "if test -n ${subtree_dir-}" without colon would be the
more proper way for those who care about "set -u", wouldn't it? It
is not that you want to substitute with an empty string that comes
between that "-" and "}" when subtree_dir is unset or set to empty.
You are preparing for the case where the variable is truly not set,
and the variable being set to an empty string is not something you
are worried about. THe same for ${have_mainline:-}.
>> If you use "git switch --orphan" that clears the worktree for you
>
> Very useful. I'll start using it in v3.
Excellent suggestion.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v2] contrib/subtree: fix split with squashed subtrees
2025-09-10 2:02 ` Junio C Hamano
@ 2025-09-10 3:00 ` Colin Stagner
2025-09-10 15:10 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Colin Stagner @ 2025-09-10 3:00 UTC (permalink / raw)
To: Junio C Hamano
Cc: phillip.wood, git, Zach FettersMoore, Christian Couder,
Patrik Weiskircher
On 9/9/25 21:02, Junio C Hamano wrote:
> Besides, "if test -n ${subtree_dir-}" without colon would be the
> more proper way for those who care about "set -u", wouldn't it? It
> is not that you want to substitute with an empty string that comes
> between that "-" and "}" when subtree_dir is unset or set to empty.
> You are preparing for the case where the variable is truly not set,
> and the variable being set to an empty string is not something you
> are worried about.
Yes, "test -n ${subtree_dir-}" is definitely the more correct expression.
At the very real risk of embarrassing myself in public today... in the
particular case of a "test -n," is there actually an appreciable
difference? Either way, the output of the substitution is empty if the
input is empty or undefined. Here, "test -n ${subtree_dir:-}" is merely
less efficient. Right?
The difference between "${x:-}" vs "${x-}" really starts to matter if
you want to permit the empty string (or not). It also matters if you
call a command that has side effects.
(And in the context of this patch, neither are necessary.)
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v2] contrib/subtree: fix split with squashed subtrees
2025-09-10 3:00 ` Colin Stagner
@ 2025-09-10 15:10 ` Junio C Hamano
0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2025-09-10 15:10 UTC (permalink / raw)
To: Colin Stagner
Cc: phillip.wood, git, Zach FettersMoore, Christian Couder,
Patrik Weiskircher
Colin Stagner <ask+git@howdoi.land> writes:
> On 9/9/25 21:02, Junio C Hamano wrote:
>> Besides, "if test -n ${subtree_dir-}" without colon would be the
>> more proper way for those who care about "set -u", wouldn't it? It
>> is not that you want to substitute with an empty string that comes
>> between that "-" and "}" when subtree_dir is unset or set to empty.
>> You are preparing for the case where the variable is truly not set,
>> and the variable being set to an empty string is not something you
>> are worried about.
> Yes, "test -n ${subtree_dir-}" is definitely the more correct expression.
>
> At the very real risk of embarrassing myself in public today... in the
> particular case of a "test -n," is there actually an appreciable
> difference? Either way, the output of the substitution is empty if the
> input is empty or undefined. Here, "test -n ${subtree_dir:-}" is
> merely less efficient. Right?
>
> The difference between "${x:-}" vs "${x-}" really starts to matter if
> you want to permit the empty string (or not). It also matters if you
> call a command that has side effects.
>
> (And in the context of this patch, neither are necessary.)
Correct. There is no practical difference.
Your explanation for using the "default values" parameter expansion
in this script, knowing that "set -u" is not in use, being it is out
of inertia, I would have expected them to be written in a way that
is suitable when "set -u" is in use, which is without colon. Doing
something "different" on a variable that is set but set to an empty
string is not something you would want to do to deal with "set -u",
so I found it strange to see the colon there.
There is no practical difference, since the "default value"
specified is an empty string, so a variable set to an empty string
will use the empty string between ":-" and "}" instead of its value
that is another empty string, and you can tell these two empty
strings apart in the result ;-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v3] contrib/subtree: fix split with squashed subtrees
2025-08-24 19:10 [PATCH] contrib/subtree: fix split with squashed subtrees Colin Stagner
2025-09-01 13:54 ` Phillip Wood
2025-09-05 2:27 ` [PATCH v2] " Colin Stagner
@ 2025-09-10 3:11 ` Colin Stagner
2025-09-10 9:39 ` Phillip Wood
2 siblings, 1 reply; 15+ messages in thread
From: Colin Stagner @ 2025-09-10 3:11 UTC (permalink / raw)
To: git, phillip.wood, Phillip Wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher,
Colin Stagner
98ba49ccc2 (subtree: fix split processing with multiple subtrees
present, 2023-12-01) increases the performance of
git subtree split --prefix=subA
by ignoring subtree merges which are outside of `subA/`. It also
introduces a regression. Subtree merges that should be retained
are incorrectly ignored if they:
1. are nested under `subA/`; and
2. are merged with `--squash`.
For example, a subtree merged like:
git subtree merge --squash --prefix=subA/subB "$rev"
# ^^^^^^^^ ^^^^
is erroneously ignored during a split of `subA`. This causes
missing tree files and different commit hashes starting in
git v2.44.0-rc0.
The method:
should_ignore_subtree_split_commit REV
should test only a single commit REV, but the combination of
git log -1 --grep=...
actually searches all *parent* commits until a `--grep` match is
discovered.
Rewrite this method to test only one REV at a time. Extract commit
information with a single `git` call as opposed to three. The
`test` conditions for rejecting a commit remain unchanged.
Unit tests now cover nested subtrees.
Signed-off-by: Colin Stagner <ask+git@howdoi.land>
---
Notes:
This bugfix patch is intended for maint-2.44 and up.
contrib/subtree/git-subtree.sh | 36 +++++++++++----
contrib/subtree/t/t7900-subtree.sh | 71 ++++++++++++++++++++++++++++++
2 files changed, 99 insertions(+), 8 deletions(-)
diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 5dab3f506c..ad9b9b0191 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -783,24 +783,44 @@ ensure_clean () {
# Usage: ensure_valid_ref_format REF
ensure_valid_ref_format () {
assert test $# = 1
git check-ref-format "refs/heads/$1" ||
die "fatal: '$1' does not look like a ref"
}
-# Usage: check if a commit from another subtree should be
+# Usage: should_ignore_subtree_split_commit REV
+#
+# Check if REV is a commit from another subtree and should be
# ignored from processing for splits
should_ignore_subtree_split_commit () {
assert test $# = 1
- local rev="$1"
+
+ git show \
+ --no-patch \
+ --no-show-signature \
+ --format='%(trailers:key=git-subtree-dir,key=git-subtree-mainline)' \
+ "$1" |
+ (
+ have_mainline=
+ subtree_dir=
+
+ while read -r trailer val
+ do
+ case "$trailer" in
+ git-subtree-dir:)
+ subtree_dir="${val%/}" ;;
+ git-subtree-mainline:)
+ have_mainline=y ;;
+ esac
+ done
+
- if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
+ if test -n "${subtree_dir}" &&
+ test -z "${have_mainline}" &&
+ test "${subtree_dir}" != "$arg_prefix"
then
- if test -z "$(git log -1 --grep="git-subtree-mainline:" $rev)" &&
- test -z "$(git log -1 --grep="git-subtree-dir: $arg_prefix$" $rev)"
- then
- return 0
- fi
+ return 0
fi
return 1
+ )
}
# Usage: process_split_commit REV PARENTS
diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh
index ca4df5be83..25be40e12b 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -9,6 +9,9 @@ This test verifies the basic operation of the add, merge, split, pull,
and push subcommands of git subtree.
'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
TEST_DIRECTORY=$(pwd)/../../../t
. "$TEST_DIRECTORY"/test-lib.sh
@@ -67,6 +70,33 @@ test_create_pre2_32_repo () {
git -C "$1-clone" replace HEAD^2 $new_commit
}
+# test_create_subtree_add REPO ORPHAN PREFIX FILENAME ...
+#
+# Create a simple subtree on a new branch named ORPHAN in REPO.
+# The subtree is then merged into the current branch of REPO,
+# under PREFIX. The generated subtree has has one commit
+# with subject and tag FILENAME with a single file "FILENAME.t"
+#
+# When this method returns:
+# - the current branch of REPO will have file PREFIX/FILENAME.t
+# - REPO will have a branch named ORPHAN with subtree history
+#
+# additional arguments are forwarded to "subtree add"
+test_create_subtree_add () {
+ (
+ cd "$1" &&
+ orphan="$2" &&
+ prefix="$3" &&
+ filename="$4" &&
+ shift 4 &&
+ last="$(git branch --show-current)" &&
+ git switch --orphan "$orphan" &&
+ test_commit "$filename" &&
+ git checkout "$last" &&
+ git subtree add --prefix="$prefix" "$@" "$orphan"
+ )
+}
+
test_expect_success 'shows short help text for -h' '
test_expect_code 129 git subtree -h >out 2>err &&
test_must_be_empty err &&
@@ -425,6 +455,47 @@ test_expect_success 'split with multiple subtrees' '
--squash --rejoin -d -m "Sub B Split 1" 2>&1 | grep -w "\[1\]")" = ""
'
+# When subtree split-ing a directory that has other subtree
+# *merges* underneath it, the split must include those subtrees.
+# This test creates a nested subtree, `subA/subB`, and tests
+# that the tree is correct after a subtree split of `subA/`.
+# The test covers:
+# - An initial `subtree add`; and
+# - A follow-up `subtree merge`
+# both with and without `--squashed`.
+for is_squashed in '' 'y'
+do
+ test_expect_success "split keeps nested ${is_squashed:+--squash }subtrees that are part of the split" '
+ subtree_test_create_repo "$test_count" &&
+ (
+ cd "$test_count" &&
+ mkdir subA &&
+ test_commit subA/file1 &&
+ test_create_subtree_add \
+ . mksubtree subA/subB file2 ${is_squashed:+--squash} &&
+ test_path_is_file subA/file1.t &&
+ test_path_is_file subA/subB/file2.t &&
+ git subtree split --prefix=subA --branch=bsplit &&
+ git checkout bsplit &&
+ test_path_is_file file1.t &&
+ test_path_is_file subB/file2.t &&
+ git checkout mksubtree &&
+ git branch -D bsplit &&
+ test_commit file3 &&
+ git checkout main &&
+ git subtree merge \
+ ${is_squashed:+--squash} \
+ --prefix=subA/subB mksubtree &&
+ test_path_is_file subA/subB/file3.t &&
+ git subtree split --prefix=subA --branch=bsplit &&
+ git checkout bsplit &&
+ test_path_is_file file1.t &&
+ test_path_is_file subB/file2.t &&
+ test_path_is_file subB/file3.t
+ )
+ '
+done
+
test_expect_success 'split sub dir/ with --rejoin from scratch' '
subtree_test_create_repo "$test_count" &&
test_create_commit "$test_count" main1 &&
base-commit: 09669c729af92144fde84e97d358759b5b42b555
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH v3] contrib/subtree: fix split with squashed subtrees
2025-09-10 3:11 ` [PATCH v3] " Colin Stagner
@ 2025-09-10 9:39 ` Phillip Wood
2025-09-11 16:01 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Phillip Wood @ 2025-09-10 9:39 UTC (permalink / raw)
To: Colin Stagner, git, phillip.wood
Cc: Zach FettersMoore, Christian Couder, Patrik Weiskircher
Hi Colin
Thanks for working on this. I'm not particularly familiar with
git-subtree but as far as I can see this version looks good.
Thanks
Phillip
On 10/09/2025 04:11, Colin Stagner wrote:
> 98ba49ccc2 (subtree: fix split processing with multiple subtrees
> present, 2023-12-01) increases the performance of
>
> git subtree split --prefix=subA
>
> by ignoring subtree merges which are outside of `subA/`. It also
> introduces a regression. Subtree merges that should be retained
> are incorrectly ignored if they:
>
> 1. are nested under `subA/`; and
> 2. are merged with `--squash`.
>
> For example, a subtree merged like:
>
> git subtree merge --squash --prefix=subA/subB "$rev"
> # ^^^^^^^^ ^^^^
>
> is erroneously ignored during a split of `subA`. This causes
> missing tree files and different commit hashes starting in
> git v2.44.0-rc0.
>
> The method:
>
> should_ignore_subtree_split_commit REV
>
> should test only a single commit REV, but the combination of
>
> git log -1 --grep=...
>
> actually searches all *parent* commits until a `--grep` match is
> discovered.
>
> Rewrite this method to test only one REV at a time. Extract commit
> information with a single `git` call as opposed to three. The
> `test` conditions for rejecting a commit remain unchanged.
>
> Unit tests now cover nested subtrees.
>
> Signed-off-by: Colin Stagner <ask+git@howdoi.land>
> ---
>
> Notes:
> This bugfix patch is intended for maint-2.44 and up.
>
> contrib/subtree/git-subtree.sh | 36 +++++++++++----
> contrib/subtree/t/t7900-subtree.sh | 71 ++++++++++++++++++++++++++++++
> 2 files changed, 99 insertions(+), 8 deletions(-)
>
> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index 5dab3f506c..ad9b9b0191 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -783,24 +783,44 @@ ensure_clean () {
> # Usage: ensure_valid_ref_format REF
> ensure_valid_ref_format () {
> assert test $# = 1
> git check-ref-format "refs/heads/$1" ||
> die "fatal: '$1' does not look like a ref"
> }
>
> -# Usage: check if a commit from another subtree should be
> +# Usage: should_ignore_subtree_split_commit REV
> +#
> +# Check if REV is a commit from another subtree and should be
> # ignored from processing for splits
> should_ignore_subtree_split_commit () {
> assert test $# = 1
> - local rev="$1"
> +
> + git show \
> + --no-patch \
> + --no-show-signature \
> + --format='%(trailers:key=git-subtree-dir,key=git-subtree-mainline)' \
> + "$1" |
> + (
> + have_mainline=
> + subtree_dir=
> +
> + while read -r trailer val
> + do
> + case "$trailer" in
> + git-subtree-dir:)
> + subtree_dir="${val%/}" ;;
> + git-subtree-mainline:)
> + have_mainline=y ;;
> + esac
> + done
> +
> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
> + if test -n "${subtree_dir}" &&
> + test -z "${have_mainline}" &&
> + test "${subtree_dir}" != "$arg_prefix"
> then
> - if test -z "$(git log -1 --grep="git-subtree-mainline:" $rev)" &&
> - test -z "$(git log -1 --grep="git-subtree-dir: $arg_prefix$" $rev)"
> - then
> - return 0
> - fi
> + return 0
> fi
> return 1
> + )
> }
>
> # Usage: process_split_commit REV PARENTS
> diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh
> index ca4df5be83..25be40e12b 100755
> --- a/contrib/subtree/t/t7900-subtree.sh
> +++ b/contrib/subtree/t/t7900-subtree.sh
> @@ -9,6 +9,9 @@ This test verifies the basic operation of the add, merge, split, pull,
> and push subcommands of git subtree.
> '
>
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> TEST_DIRECTORY=$(pwd)/../../../t
> . "$TEST_DIRECTORY"/test-lib.sh
>
> @@ -67,6 +70,33 @@ test_create_pre2_32_repo () {
> git -C "$1-clone" replace HEAD^2 $new_commit
> }
>
> +# test_create_subtree_add REPO ORPHAN PREFIX FILENAME ...
> +#
> +# Create a simple subtree on a new branch named ORPHAN in REPO.
> +# The subtree is then merged into the current branch of REPO,
> +# under PREFIX. The generated subtree has has one commit
> +# with subject and tag FILENAME with a single file "FILENAME.t"
> +#
> +# When this method returns:
> +# - the current branch of REPO will have file PREFIX/FILENAME.t
> +# - REPO will have a branch named ORPHAN with subtree history
> +#
> +# additional arguments are forwarded to "subtree add"
> +test_create_subtree_add () {
> + (
> + cd "$1" &&
> + orphan="$2" &&
> + prefix="$3" &&
> + filename="$4" &&
> + shift 4 &&
> + last="$(git branch --show-current)" &&
> + git switch --orphan "$orphan" &&
> + test_commit "$filename" &&
> + git checkout "$last" &&
> + git subtree add --prefix="$prefix" "$@" "$orphan"
> + )
> +}
> +
> test_expect_success 'shows short help text for -h' '
> test_expect_code 129 git subtree -h >out 2>err &&
> test_must_be_empty err &&
> @@ -425,6 +455,47 @@ test_expect_success 'split with multiple subtrees' '
> --squash --rejoin -d -m "Sub B Split 1" 2>&1 | grep -w "\[1\]")" = ""
> '
>
> +# When subtree split-ing a directory that has other subtree
> +# *merges* underneath it, the split must include those subtrees.
> +# This test creates a nested subtree, `subA/subB`, and tests
> +# that the tree is correct after a subtree split of `subA/`.
> +# The test covers:
> +# - An initial `subtree add`; and
> +# - A follow-up `subtree merge`
> +# both with and without `--squashed`.
> +for is_squashed in '' 'y'
> +do
> + test_expect_success "split keeps nested ${is_squashed:+--squash }subtrees that are part of the split" '
> + subtree_test_create_repo "$test_count" &&
> + (
> + cd "$test_count" &&
> + mkdir subA &&
> + test_commit subA/file1 &&
> + test_create_subtree_add \
> + . mksubtree subA/subB file2 ${is_squashed:+--squash} &&
> + test_path_is_file subA/file1.t &&
> + test_path_is_file subA/subB/file2.t &&
> + git subtree split --prefix=subA --branch=bsplit &&
> + git checkout bsplit &&
> + test_path_is_file file1.t &&
> + test_path_is_file subB/file2.t &&
> + git checkout mksubtree &&
> + git branch -D bsplit &&
> + test_commit file3 &&
> + git checkout main &&
> + git subtree merge \
> + ${is_squashed:+--squash} \
> + --prefix=subA/subB mksubtree &&
> + test_path_is_file subA/subB/file3.t &&
> + git subtree split --prefix=subA --branch=bsplit &&
> + git checkout bsplit &&
> + test_path_is_file file1.t &&
> + test_path_is_file subB/file2.t &&
> + test_path_is_file subB/file3.t
> + )
> + '
> +done
> +
> test_expect_success 'split sub dir/ with --rejoin from scratch' '
> subtree_test_create_repo "$test_count" &&
> test_create_commit "$test_count" main1 &&
>
> base-commit: 09669c729af92144fde84e97d358759b5b42b555
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v3] contrib/subtree: fix split with squashed subtrees
2025-09-10 9:39 ` Phillip Wood
@ 2025-09-11 16:01 ` Junio C Hamano
0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2025-09-11 16:01 UTC (permalink / raw)
To: Phillip Wood
Cc: Colin Stagner, git, phillip.wood, Zach FettersMoore,
Christian Couder, Patrik Weiskircher
Phillip Wood <phillip.wood123@gmail.com> writes:
> Hi Colin
>
> Thanks for working on this. I'm not particularly familiar with
> git-subtree but as far as I can see this version looks good.
>
> Thanks
>
> Phillip
Thanks, both of you. Queued.
^ permalink raw reply [flat|nested] 15+ messages in thread