* [Bug] Git subtree regression
From: dev @ 2025-12-26 19:58 UTC
To: git

Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)

I use git subtrees to manage the monorepo `https://github.com/athena-framework/athena`. When using git 2.52.0, I can add a new remote for, say, the `clock` component via `git remote add clock git@github.com:athena-framework/clock.git`, then do a `subtree push` via `git subtree push --prefix="src/components/clock" "clock" master`.

What did you expect to happen? (Expected behavior)

I expected it to work and say `Everything up-to-date`, because it is up to date.

What happened instead? (Actual behavior)

It fails because of:

```
To github.com:athena-framework/clock.git
 ! [rejected]        0efb3d9858e3bfee65165508aeeacc50417c9a99 -> master (non-fast-forward)
error: failed to push some refs to 'github.com:athena-framework/clock.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. If you want to integrate the remote changes,
hint: use 'git pull' before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
```

What's different between what you expected and what actually happened?

Seems to be a regression of https://github.com/git/git/commit/83f9dad7d6fb5988b68f80b25bd87c68693195dd as it used to work and now it doesn't.

Anything else you want to add:

I did some initial exploration and it might have something to do with the `clock` component originally being added via `git subtree add --squash`.

For another component:

- git 2.51.1: split produces 92 commits, properly connected to original repo history
- git 2.52.0: split produces 8 commits, disconnected history with a new root

The `git-subtree-split:` marker in the squash commit body doesn't seem to be honored in 2.52.0.

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.

[System Info]
git version:
git version 2.52.0
cpu: x86_64
built from commit: 9a2fb147f2c61d0cab52c883e7e26f5b7948e3ed
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
rust: enabled
libcurl: 8.17.0
OpenSSL: OpenSSL 3.6.0 1 Oct 2025
zlib-ng: 2.2.5
SHA-1: SHA1_DC
SHA-256: SHA256_BLK
default-ref-format: files
default-hash: sha1
uname: Linux 6.18.2-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 18 Dec 2025 18:00:18 +0000 x86_64
compiler info: gnuc: 15.2
libc info: glibc: 2.42
$SHELL (typically, interactive shell): /bin/bash
* Re: [Bug] Git subtree regression
From: george @ 2025-12-30 17:07 UTC
To: george; +Cc: git

---
I explored this more and think I found the root cause.

Commit `83f9dad7d6fb5988b68f80b25bd87c68693195dd` changed `should_ignore_subtree_split_commit()` to examine only a commit's own trailers via `git show --format='%(trailers:...)'`.
The old code used `git log -1 --grep=...` which had the important side effect of searching through ancestor commits.

In a multi-subtree monorepo with this topology:

```
main:    A---B---M---E     (B = subtree add --squash for subA)
                /
feature:   C---D           (D = subtree add --squash for subB)
```

When splitting `subA`, commits `C` and `D` from the feature branch should be **ignored** because they belong to a branch that only contains `subB`, not `subA`.

## Old behavior (2.51.1)

`git log -1 --grep="git-subtree-dir:"` on commit `C` would traverse ancestors and find `D`'s subtree markers for `subB`, correctly identifying `C` as belonging to another subtree's branch.

## New behavior (2.52.0)

`git show` on commit `C` finds no trailers (regular commits don't have them), so `C` is **not** ignored. This breaks the split because both parents of the merge `M` are processed, but `C` has no cache entry, leading to disconnected history.

Thus, split operations produce fewer commits than expected with broken parent chains, breaking push/pull workflows to upstream subtree repositories.
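The difference between the two lookups can be seen in isolation with a minimal sketch. It assumes a throwaway repository; the commit subjects and the all-zero split hash are placeholders and are not taken from the athena repo, and the two commands only mirror the general shape of the old and new checks rather than the exact code in `git-subtree`.

```shell
#!/bin/sh
# Sketch: ancestry search (`git log --grep`) vs. own-trailer lookup (`git show`).
set -e

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email "test@example.com"
git config user.name "Test User"

# Commit 1: carries subtree-style trailers in its last paragraph (placeholder values).
git commit -q --allow-empty -F - <<'EOF'
Squashed 'subB/' content from commit 0000000

git-subtree-dir: subB
git-subtree-split: 0000000000000000000000000000000000000000
EOF

# Commit 2: an ordinary commit on top, with no trailers of its own.
git commit -q --allow-empty -m "ordinary change"

# Ancestry search: walks back from HEAD and stops at the first matching commit.
via_log=$(git log -1 --grep="git-subtree-dir:" --format='%s' HEAD)

# Own-trailer lookup: only inspects the message of HEAD itself.
via_show=$(git show -s --format='%(trailers:key=git-subtree-dir,valueonly)' HEAD)

echo "log --grep found:    ${via_log:-<nothing>}"
echo "show trailers found: ${via_show:-<nothing>}"
```

Run against the ordinary commit, the `--grep` form reports the older squash commit while the trailer form reports nothing, which is the behavioral gap described above.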
## Reproduction

This can be reproduced via my monorepo: https://github.com/athena-framework/athena

```bash
# 2.51.1 produces correct result (matches the number of commits in the `athena-framework/clock` repo)
$ git subtree split --prefix="src/components/clock"
4ee66f8198b2532110b75a36575e363ccccff47e  # 20 commits, connected to remote

# 2.52.0 produces broken result
$ git subtree split --prefix="src/components/clock"
0efb3d9858e3bfee65165508aeeacc50417c9a99  # 7 commits, disconnected

# The commits have identical trees but different parents:
$ git cat-file -p 4ee66f8198b2532110b75a36575e363ccccff47e
tree 8333b0cbb2a10528f8c803812af7a8e603e70367
parent d72f22f28ca5ed57ef3c2df74f0abd5569ac5934  # Connected to 19-commit history

$ git cat-file -p 0efb3d9858e3bfee65165508aeeacc50417c9a99
tree 8333b0cbb2a10528f8c803812af7a8e603e70367
parent 81c5dbe70ce26a7758fbe7f87b3ce0704043cfb1  # Only 6 commits, disconnected
```
* Re: [Bug] Git subtree regression
From: Colin Stagner @ 2026-01-04 4:52 UTC
To: george; +Cc: git

Hello, George! Thanks for looking into this.

On 12/30/25 11:07, george@mail.dietrich.pub wrote:
> ---
> I explored this more and think I found the root cause.
>
> Commit `83f9dad7d6fb5988b68f80b25bd87c68693195dd` changed `should_ignore_subtree_split_commit()` to examine only a commit's own trailers via `git show --format='%(trailers:...)'`.
> The old code used `git log -1 --grep=...` which had the important side effect of searching through ancestor commits.

The old `--grep=...` approach was introduced as a performance speedup for large splits. I don't believe the original author intended to alter the split result, but the old approach inadvertently did in some cases.

> # 2.52.0 produces broken result
> $ git subtree split --prefix="src/components/clock"
> 0efb3d9858e3bfee65165508aeeacc50417c9a99 # 7 commits,

On v2.52.0 on my machine, I get an error instead:

    fatal: could not rev-parse split hash d0ed70566b3e962fbff71145d8155986b48c6885 from commit 5817d4435bf448f526c3b0049f00e6500277e4bb

I presume I need more history than just master to make this work.

Can you test this split command in git v2.43.7? This is before `should_ignore_subtree_split_commit()` was introduced.

I'd like to distill this down into a minimum working example that doesn't depend on an external repo like athena. Namely, some shell instructions that start from an empty `git init` and create a repo with the bug condition. That way, we know exactly and narrowly what sort of history graph produces the bug.

I think I have almost enough information here to do that, but you're welcome to try writing an MWE yourself.

> In a multi-subtree monorepo with this topology:
>
> ```
> main:    A---B---M---E     (B = subtree add --squash for subA)
>                 /
> feature:   C---D           (D = subtree add --squash for subB)
> ```

Just to verify: in this example, is commit M a normal merge commit? Or is it also created with subtree?

Colin
* Re: [Bug] Git subtree regression
From: george @ 2026-01-04 14:27 UTC
To: ask+git; +Cc: george, git

---
Ahh, yes. It seems you also need to add `clock` as a remote and fetch it:

```
$ git clone git@github.com:athena-framework/athena.git
$ cd athena
$ git remote add clock git@github.com:athena-framework/clock.git
$ git fetch clock
$ git subtree split --prefix="src/components/clock"
0efb3d9858e3bfee65165508aeeacc50417c9a99
```

I wasn't able to use _exactly_ 2.43.7, but I was able to use 2.43.0, which would still be before that other change. It also produced the expected commit hash, unlike 2.52.0. It was also significantly faster: it took ~9s vs 2.51.1, which was ~26s. 2.52.0 was better at ~14s, but of course produces the wrong hash.

The script below reproduces the issue quite well, and shows what the root cause likely is. It does seem one component was added differently, as a non-merge commit, which seems to break things. Looking at the Athena monorepo, this can somewhat be confirmed via https://github.com/athena-framework/athena/commits/master/?after=ee21a41e9dfc969e759b532d45c0c0faa21876d6+0. Note how the first two commits show up as verified, whereas when I normally do `git subtree add --squash` and push directly to main, they show up as unverified.

```
#!/bin/bash
#
# THE BUG:
# When a commit's direct parent is a squash commit for a DIFFERENT subtree,
# and that squash commit's ancestry includes OUR subtree's squash commit,
# the split breaks.
#
# Old code: `git log -1 --grep` searches ancestry, finds our marker → don't ignore
# New code: only checks parent's own trailers → ignores → breaks parent chain
#
# This pattern occurs when subtree squash commits are cherry-picked or rebased
# into a linear history (instead of the normal merge structure).

set -e

KEEP_TMPDIR="${KEEP_TMPDIR:-}"
TMPDIR=$(mktemp -d)
echo "Working directory: $TMPDIR"

cleanup() {
    if [ -n "$KEEP_TMPDIR" ]; then
        echo "Preserving temp directory: $TMPDIR"
    else
        rm -rf "$TMPDIR"
    fi
}
trap cleanup EXIT

create_repo() {
    local repo="$1"
    git init -b main "$repo"
    git -C "$repo" config user.email "test@test.com"
    git -C "$repo" config user.name "Test User"
    git -C "$repo" config log.date relative
}

create_commit() {
    local repo="$1"
    local name="$2"
    (
        cd "$repo"
        mkdir -p "$(dirname "$name")"
        echo "$name" > "$name"
        git add "$name"
        git commit -m "$name"
    )
}

cd "$TMPDIR"

echo "=== Creating repositories ==="
create_repo monorepo
create_repo subA
create_repo subB

echo "=== Creating upstream commits ==="
create_commit subA subA1
create_commit subA subA2
create_commit subB subB1
create_commit subB subB2

echo "=== Setting up monorepo with linear squash structure ==="

# Initial commit
create_commit monorepo main1

# Add subA with --squash (normal way - creates merge)
git -C monorepo fetch ../subA HEAD
git -C monorepo subtree add --prefix=subA --squash FETCH_HEAD

# Make a change in subA
create_commit monorepo subA/change1

# Now we simulate cherry-picking JUST the squash commit for subB
# (This is what seems to have happened in the athena repo)
# First, get subB ready
git -C monorepo fetch ../subB HEAD

# Create a LINEAR squash commit for subB (simulating cherry-pick of just the squash commit)
# This is the key pattern that triggers the bug - a squash commit as a regular linear commit
(
    cd monorepo
    mkdir -p subB
    git -C ../subB archive HEAD | tar -x -C subB
    git add subB
    # Create a squash-style commit with subtree trailers but as a LINEAR commit
    # Trailers must be in the last paragraph, separated by blank line
    subB_short=$(git -C ../subB rev-parse --short HEAD)
    subB_full=$(git -C ../subB rev-parse HEAD)
    git commit -F - <<EOF
Squashed 'subB/' content from commit $subB_short

git-subtree-dir: subB
git-subtree-split: $subB_full
EOF
)

echo ""
echo "=== Key structure: subB squash is a LINEAR commit, not a merge ==="
git -C monorepo log -1 --format='%H %s' HEAD
echo "Parent count: $(git -C monorepo cat-file -p HEAD | grep -c '^parent')"

# Now make a commit that touches subA
# This commit's parent is the subB squash commit (linear)
create_commit monorepo subA/change2

echo ""
echo "=== Repository structure ==="
git -C monorepo log --oneline --graph

# Verify the squash commit's ancestry includes subA's marker
subB_squash=$(git -C monorepo rev-parse HEAD^)
echo ""
echo "=== Checking ancestry of subB squash commit ($subB_squash) ==="
echo "Looking for subA marker in ancestry..."
if git -C monorepo log -1 --grep="git-subtree-dir: subA" "$subB_squash" --oneline 2>/dev/null; then
    echo "  FOUND - old code would search this and NOT ignore"
else
    echo "  NOT FOUND - test setup may be incomplete"
fi

echo ""
echo "=== Running subtree split on subA ==="
split_hash=$(git -C monorepo subtree split --prefix=subA 2>/dev/null)
echo "Split hash: $split_hash"

split_count=$(git -C monorepo rev-list --count "$split_hash")
echo "Commits in split: $split_count"

echo ""
echo "=== Split history ==="
git -C monorepo log --oneline "$split_hash"

echo ""
echo "=== Result ==="
# Expected: 4 commits (2 upstream + 2 local changes)
if [ "$split_count" -ge 4 ]; then
    echo "PASS: Split produced connected history ($split_count commits)"
    exit 0
else
    echo "FAIL: Split produced disconnected history (only $split_count commits, expected >= 4)"
    exit 1
fi
```
* Re: [Bug] Git subtree regression
From: Colin Stagner @ 2026-01-05 3:36 UTC
To: george; +Cc: git

On 1/4/26 08:27, george@mail.dietrich.pub wrote:
> It does seem one component was added differently, as a non-merge commit, which seems to break things.

> ```
> # Create a LINEAR squash commit for subB (simulating cherry-pick of just the squash commit)
> # This is the key pattern that triggers the bug - a squash commit as a regular linear commit
> (
>     cd monorepo
>     mkdir -p subB
>     git -C ../subB archive HEAD | tar -x -C subB
>     git add subB
>     # Create a squash-style commit with subtree trailers but as a LINEAR commit
>     # Trailers must be in the last paragraph, separated by blank line
>     subB_short=$(git -C ../subB rev-parse --short HEAD)
>     subB_full=$(git -C ../subB rev-parse HEAD)
>     git commit -F - <<EOF
> Squashed 'subB/' content from commit $subB_short
>
> git-subtree-dir: subB
> git-subtree-split: $subB_full
> EOF
> )
> ```

Yes, this is very likely to cause breakage. Normally,

    git subtree merge -P subA --squash

makes two commits, in this order:

1. Squashed 'subA/' content from commit f00...
2. Merge commit (1) as 'subA'

Commit 1 updates the subtree but does *not* rewrite paths. If you `git show` one, you will see that it has files like

    subA1
    subA2

and *not* subA/subA1. The path rewrite actually takes place in Commit 2 (the merge), via the `-Xsubtree` merge strategy option.

`should_ignore_subtree_split_commit` tries to search for commits like (1), which all have the `git-subtree-*` trailer. Normally, these commits either have:

* no parents, if they result from a new `git subtree add --squash`; OR
* only parents which are also "Squashed 'subA/' content," if they result from a follow-up `git subtree merge --squash`

We can safely ignore these commits, and all of their parents, during a `subtree split` if they belong to a different subtree.

Of course, that heuristic doesn't work if the commit has been rebased onto other unrelated history, which is what happened in your repo.

I suspect the best way out may be to remove the `should_ignore_subtree_split_commit` heuristic entirely. It is mostly useful for repos that use `split --rejoin` a lot, and the check itself is slow. WDYT?

> Note how the first two commits show up as verified, whereas when I normally do `git subtree add --squash` and push directly to main, they show up as unverified.

git v2.51.0 also adds --gpg-sign compatibility to subtree. Perhaps this is what you are seeing?

> It seems you also need to add `clock` as a remote and fetch it:

Ah, thanks.

Personally, I'm a big advocate for the monorepo layout. In my experience, it makes almost every task easier and faster.

Colin
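The parent rule described in the message above (a squash commit normally has no parents, or only parents that are themselves squash commits) suggests a simple diagnostic for spotting squash commits that ended up on top of unrelated history. The following is a hypothetical sketch, not part of `git-subtree`: the helper `subtree_dir_of` and the throwaway repository are assumptions made for illustration, and the trailer values are placeholders.

```shell
#!/bin/sh
# Sketch: flag commits that carry a git-subtree-dir trailer but have a
# parent without one, i.e. a squash commit sitting on unrelated history.
set -e

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email "test@example.com"
git config user.name "Test User"

# Unrelated mainline history.
git commit -q --allow-empty -m "unrelated mainline work"

# A squash-style commit created linearly on top of it (placeholder trailer values).
git commit -q --allow-empty -F - <<'EOF'
Squashed 'subB/' content from commit 0000000

git-subtree-dir: subB
git-subtree-split: 0000000000000000000000000000000000000000
EOF

# Hypothetical helper: print the git-subtree-dir trailer of one commit (empty if none).
subtree_dir_of() {
    git show -s --format='%(trailers:key=git-subtree-dir,valueonly)' "$1"
}

suspect=""
for commit in $(git rev-list HEAD); do
    # Only commits that carry the trailer are of interest.
    [ -n "$(subtree_dir_of "$commit")" ] || continue
    # %P expands to the space-separated parent hashes of the commit.
    for parent in $(git show -s --format='%P' "$commit"); do
        if [ -z "$(subtree_dir_of "$parent")" ]; then
            suspect="$suspect $commit"
            break
        fi
    done
done

echo "squash commits with non-squash parents:${suspect:- none}"
```

In a real monorepo the same loop could be pointed at `master` to list candidates worth inspecting by hand; it only reports the pattern and makes no claim about how `subtree split` treats each commit.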
* Re: [Bug] Git subtree regression
From: george @ 2026-01-06 4:55 UTC
To: ask+git; +Cc: george, git

> I suspect the best way out may be to remove the `should_ignore_subtree_split_commit` heuristic entirely.
> It is mostly useful for repos that use `split --rejoin` a lot, and the check itself is slow. WDYT?

I think this sounds reasonable, yeah. Should be sure to include a spec to capture this case going forward too ofc. This would at least fix the breaking change, and some other heuristic could be applied in the future.

> git v2.51.0 also adds --gpg-sign compatibility to subtree.
> Perhaps this is what you are seeing?

Hmm, I don't think so as I'm just learning about this now. Will give it a shot next time I have to add another component tho!
* Re: [Bug] Git subtree regression
From: Colin Stagner @ 2026-01-10 1:25 UTC
To: george; +Cc: git

George,

Can you have a look at the patch in

    <20260110011811.788219-1-ask+git@howdoi.land>

and see if it solves this issue?

Colin
* Re: [Bug] Git subtree regression
From: george @ 2026-01-10 17:22 UTC
To: ask+git; +Cc: george, git

I did! Thank you so much! It seems it not only produces the correct commit hash but is also quite a bit more performant.

```sh
$ git --version
git version 2.51.1
$ time git subtree split --prefix="src/components/clock"
4ee66f8198b2532110b75a36575e363ccccff47e

real    0m32.971s
user    0m18.856s
sys     0m14.627s

$ git --version
git version 2.52.0
$ time git subtree split --prefix="src/components/clock"
0efb3d9858e3bfee65165508aeeacc50417c9a99

real    0m18.680s
user    0m7.698s
sys     0m12.842s

$ /home/george/dev/git/git/git --version
git version 2.52.0.408.gecb62f5599
$ time /home/george/dev/git/git/git subtree split --prefix="src/components/clock"
4ee66f8198b2532110b75a36575e363ccccff47e

real    0m10.816s
user    0m3.909s
sys     0m7.755s
```

Thanks again!
* Re: [Bug] Git subtree regression
From: Colin Stagner @ 2026-02-15 20:36 UTC
To: george; +Cc: git

George,

My original patch for this issue introduced other regressions and needed to be reverted. I don't recommend using it.

Instead, can you take a look at:

    https://lore.kernel.org/git/20260215201748.889866-1-ask+git@howdoi.land/

which removes the "ignore other splits" optimization altogether. After some research, I suspect that this optimization may not have enough information to work correctly and preserve history in all cases.

I'd also appreciate testing of

    https://lore.kernel.org/git/20260215201748.889866-1-ask+git@howdoi.land/

which fixes a "recursion depth exceeded" bug on Debian/Ubuntu.

I've CC'd you on both of these patch series.

I have tested both of these on selected subdirectories of your athena repository. They seem to work. But I'd appreciate it if you could look at all the splits you normally do and see if the patches correctly preserve history for you.

Thanks,

Colin
* Re: [Bug] Git subtree regression
From: D. Ben Knoble @ 2026-02-16 21:25 UTC
To: Colin Stagner; +Cc: george, git

On Mon, Feb 16, 2026 at 3:26 PM Colin Stagner <ask+git@howdoi.land> wrote:
>
> George,
>
> My original patch for this issue introduced other regressions and needed
> to be reverted. I don't recommend using it.
>
> Instead, can you take a look at:
>
> > https://lore.kernel.org/git/20260215201748.889866-1-ask+git@howdoi.land/

[snip]

> https://lore.kernel.org/git/20260215201748.889866-1-ask+git@howdoi.land/
>
> which fixes a "recursion depth exceeded" bug on Debian/Ubuntu.
>
> I've CC'd you on both of these patch series.

JFYI: looks like you pasted the same link twice ;)

--
D. Ben Knoble
* Re: [Bug] Git subtree regression
From: george @ 2026-02-18 4:29 UTC
To: ask+git; +Cc: george, git

Ahh bummer, thanks for the follow-up.

I'm on Arch and don't seem to suffer from the same recursion limitation as Debian/Ubuntu do. Because of that I'll defer checking out that set of patches to someone else.

I did however check out the other and can confirm it looks good! I went thru each of the components and asserted that the split hash matches the latest commit on each of the related repos. Also asserted that `subtree push` results in an `Everything up-to-date` message.

Thanks again for all your work on this.