Git development
 help / color / mirror / Atom feed
* [PATCH] log: let --follow follow renames in merge commits
@ 2026-05-12  7:21 Miklos Vajna
  2026-05-19  6:17 ` Miklos Vajna
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2026-05-12  7:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt, brian m. carlson

Have a repo with a subtree merge, do a 'git log --follow prefix/test.c',
the output only contains history in the outer repo, not commits that
were merged via a subtree merge.

This is inconsistent, since doing a 'git blame prefix/test.c' does find
the original commits. This works because find_rename() in blame.c is
invoked for each parent, and there diff_tree_oid() is used, which uses
try_to_follow_renames(). This means that in case a rename happens as
part of a merge commit, git blame can follow that rename.

Fix the problem in a similar way for the 'git log --follow' case: in
case log_tree_diff() finds a merge commit and it would return early,
then do some extra work in the follow_renames case first. Check each
parent, use diff_tree_oid() and if found_follow is set, then work with
that parent instead of returning.

This means that users examining the history of a repo with subtree
merges can see all commits to a file with a single 'git log --follow'
invocation, instead of one invocation for the outer repo and one for the
history before the subtree merge.

Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
---

Hi Junio,

I sent this out a week ago at
<https://lore.kernel.org/git/afmfSa-p-9vuDL3E@collabora.com/T/#u>, I
didn't get any reply to it -- so I'm somewhat optimistic that the patch
itself is a good idea, seeing no negative comments.

So this is a resend, this time to you, CC'ing the list, rather than the
other way around.

Could you please review this?

Thanks,

Miklos

 log-tree.c                          | 20 ++++++++++++++++-
 t/meson.build                       |  1 +
 t/t4218-log-follow-subtree-merge.sh | 34 +++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100755 t/t4218-log-follow-subtree-merge.sh

diff --git a/log-tree.c b/log-tree.c
index 7e048701d0..bce09c7dac 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1142,8 +1142,26 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 				/* Show parent info for multiple diffs */
 				log->parent = parents->item;
 			}
-		} else
+		} else {
+			if (opt->diffopt.flags.follow_renames) {
+				/*
+				 * Detect a rename across one of the parents.
+				 * Check each parent till we find a follow.
+				 */
+				struct commit_list *p;
+				for (p = parents; p; p = p->next) {
+					parse_commit_or_die(p->item);
+					diff_tree_oid(get_commit_tree_oid(p->item),
+						      oid, "", &opt->diffopt);
+					diff_queue_clear(&diff_queued_diff);
+					if (opt->diffopt.found_follow) {
+						opt->diffopt.found_follow = 0;
+						break;
+					}
+				}
+			}
 			return 0;
+		}
 	}
 
 	showed_log = 0;
diff --git a/t/meson.build b/t/meson.build
index 7528e5cda5..b4ae8d76d8 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -574,6 +574,7 @@ integration_tests = [
   't4215-log-skewed-merges.sh',
   't4216-log-bloom.sh',
   't4217-log-limit.sh',
+  't4218-log-follow-subtree-merge.sh',
   't4252-am-options.sh',
   't4253-am-keep-cr-dos.sh',
   't4254-am-corrupt.sh',
diff --git a/t/t4218-log-follow-subtree-merge.sh b/t/t4218-log-follow-subtree-merge.sh
new file mode 100755
index 0000000000..7ca607cbb8
--- /dev/null
+++ b/t/t4218-log-follow-subtree-merge.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='Test --follow follows renames across subtree merges'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'setup subtree-merged repository' '
+	git init inner &&
+	echo inner >inner/inner.txt &&
+	git -C inner add inner.txt &&
+	git -C inner commit -m "inner init" &&
+
+	git init outer &&
+	echo outer >outer/outer.txt &&
+	git -C outer add outer.txt &&
+	git -C outer commit -m "outer init" &&
+
+	git -C outer fetch ../inner master &&
+	git -C outer merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C outer read-tree --prefix=inner/ -u FETCH_HEAD &&
+	git -C outer commit -m "Merge inner repo into inner/ subdirectory"
+'
+
+test_expect_success '--follow finds the pre-merge commit through a subtree merge' '
+	git -C outer log --follow --pretty=tformat:%s inner/inner.txt >actual &&
+	echo "inner init" >expect &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: let --follow follow renames in merge commits
  2026-05-12  7:21 [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
@ 2026-05-19  6:17 ` Miklos Vajna
  2026-05-19  6:37   ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2026-05-19  6:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt, brian m. carlson

Hi Junio,

On Tue, May 12, 2026 at 09:21:17AM +0200, Miklos Vajna <vmiklos@collabora.com> wrote:
> I sent this out a week ago at
> <https://lore.kernel.org/git/afmfSa-p-9vuDL3E@collabora.com/T/#u>, I
> didn't get any reply to it -- so I'm somewhat optimistic that the patch
> itself is a good idea, seeing no negative comments.
> 
> So this is a resend, this time to you, CC'ing the list, rather than the
> other way around.
> 
> Could you please review this?

I'm a bit confused regarding what can be a next step here. I
understanding you were away for 3 weeks, so there is a lot to process.
:-) Should I just wait more or should I resend this?

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: let --follow follow renames in merge commits
  2026-05-19  6:17 ` Miklos Vajna
@ 2026-05-19  6:37   ` Junio C Hamano
  2026-05-19  8:14     ` Junio C Hamano
  2026-05-20 13:28     ` [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
  0 siblings, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2026-05-19  6:37 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git, Patrick Steinhardt, brian m. carlson

Miklos Vajna <vmiklos@collabora.com> writes:

> Hi Junio,
>
> On Tue, May 12, 2026 at 09:21:17AM +0200, Miklos Vajna <vmiklos@collabora.com> wrote:
>> I sent this out a week ago at
>> <https://lore.kernel.org/git/afmfSa-p-9vuDL3E@collabora.com/T/#u>, I
>> didn't get any reply to it -- so I'm somewhat optimistic that the patch
>> itself is a good idea, seeing no negative comments.

The patch collecting no comments is just that--nobody so far is
interested enough to drop other things they were doing to give
supporting code reviews---and "no news" does not mean a good news.

>> So this is a resend, this time to you, CC'ing the list, rather than the
>> other way around.
>> 
>> Could you please review this?
>
> I'm a bit confused regarding what can be a next step here. I
> understanding you were away for 3 weeks, so there is a lot to process.
> :-) Should I just wait more or should I resend this?

Rather, ask other reviewers; when I do not comment on a patch, I
often am not interested, or too busy and the change does not look
interesting enough to me to make me drop what I am doing.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: let --follow follow renames in merge commits
  2026-05-19  6:37   ` Junio C Hamano
@ 2026-05-19  8:14     ` Junio C Hamano
  2026-06-08  6:35       ` [PATCH] log: improve --follow following renames for non-linear history Miklos Vajna
  2026-05-20 13:28     ` [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2026-05-19  8:14 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: git, Patrick Steinhardt, brian m. carlson

Junio C Hamano <gitster@pobox.com> writes:

>>> Could you please review this?
>>
>> I'm a bit confused regarding what can be a next step here. I
>> understanding you were away for 3 weeks, so there is a lot to process.
>> :-) Should I just wait more or should I resend this?
>
> Rather, ask other reviewers; when I do not comment on a patch, I
> often am not interested, or too busy and the change does not look
> interesting enough to me to make me drop what I am doing.

Addendum.  As I said in

    https://lore.kernel.org/git/xmqqqzni967o.fsf@gitster.g/

and the subsequent discussion concluded, the "==follow" checkbox
feature is meant to work well only in a linear history, and that is
inherent to the way it "follows" the single path.

It does not follow different pathname(s) while following a set of
different histories merged, e.g., in a history like this (as usual
time flows from left to right)

    ----o----A----o
                   \
                    M----o----o----o
                   /
    ----x----B----x

you may start following path F at the HEAD, and after crossing the
merge M, one history may find out that path F came from path G.
The traveral starts with "F" as the sole element in the pathspec,
but once the traversal hits that commit (say, A), the traversal
switches to use "G" as the sole element in the pathspec and follows
the history down.  Even if the other history (i.e., 'x' on the lower
history) had path F all along, once the pathspec is swapped to
follow "G" on the upper lineage of the history, traversal of the
lower lineage that happens after the traversal passes 'A" _will_ try
to follow "G" that may not exist at all.  Or 'x' may have done the
same rename from "G" to "F" at "B".  Depending on the order in which
"A" and any of these commits on the lower history are visited, the
commit that is a child of "B" (which has the path at "F") may be
visited after "A", in which case the path in question "F" will not
be looked for in it.

A minor "tweak" that does not solve this inherent design issue does
not interest me, so...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: let --follow follow renames in merge commits
  2026-05-19  6:37   ` Junio C Hamano
  2026-05-19  8:14     ` Junio C Hamano
@ 2026-05-20 13:28     ` Miklos Vajna
  2026-05-22  5:43       ` Jeff King
  1 sibling, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2026-05-20 13:28 UTC (permalink / raw)
  To: Elijah Newren, Jeff King; +Cc: git

Hi Elijah, Jeff,

On Tue, May 19, 2026 at 03:37:54PM +0900, Junio C Hamano <gitster@pobox.com> wrote:
> > :-) Should I just wait more or should I resend this?
> 
> Rather, ask other reviewers

I did a small improvement to how 'git log --follow' works, as in if the
rename happens inside the merge commit itself, then the rename was
detected "vs the first parent", but it wasn't detected "vs other
parents", which is painful with a "subtree" merge commit.

I'm not sure if it adds value, but I can append a one-paragraph summary
of Junio's comment in this thread to the end the commit message, to be
more explicit that the inherent limitation of the current log follow
design (single path, once a rename is detected, we only care about the
new path) is not changed with the patch, this is just a fix patch so
'git log' works better, similar to how 'git blame' already does.

May I ask you to review the patch?

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: let --follow follow renames in merge commits
  2026-05-20 13:28     ` [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
@ 2026-05-22  5:43       ` Jeff King
  2026-05-23  6:04         ` [PATCH] log: improve --follow following " Miklos Vajna
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2026-05-22  5:43 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Elijah Newren, git

On Wed, May 20, 2026 at 03:28:11PM +0200, Miklos Vajna wrote:

> Hi Elijah, Jeff,
> 
> On Tue, May 19, 2026 at 03:37:54PM +0900, Junio C Hamano <gitster@pobox.com> wrote:
> > > :-) Should I just wait more or should I resend this?
> > 
> > Rather, ask other reviewers
> 
> I did a small improvement to how 'git log --follow' works, as in if the
> rename happens inside the merge commit itself, then the rename was
> detected "vs the first parent", but it wasn't detected "vs other
> parents", which is painful with a "subtree" merge commit.
> 
> I'm not sure if it adds value, but I can append a one-paragraph summary
> of Junio's comment in this thread to the end the commit message, to be
> more explicit that the inherent limitation of the current log follow
> design (single path, once a rename is detected, we only care about the
> new path) is not changed with the patch, this is just a fix patch so
> 'git log' works better, similar to how 'git blame' already does.
> 
> May I ask you to review the patch?

I saw Junio's comment. I was about to write something very similar
before I saw that he had already done so. ;)

I think we can probably all agree that both before and after your patch,
--follow is never going to do the _right_ thing, which is to follow
paths independently down both sides of history.

I am OK conceptually with making the current broken behavior slightly
more useful if it is easy to do. But I am not sure if we are making
things more useful here or not. If we see a merge where the file "bar"
was previous "foo" on one side and "bar" on the other, our broken follow
is going to either pick "foo" or "bar" to continue with as we traverse.
But which one is right? Whichever name we choose, we are potentially
omitting results from the other side.

Right now we pick the first-parent name always. But it does not seem
more correct to me to pick one from another parent. You'd be missing
further commits using the original name along the first-parent track.

There might be a more useful rule like: if the path is untouched versus
the merge result in all parents but one (i.e., TREESAME), then choose
the parent where it was changed, including any --follow processing. But
we already do something like that for history simplification. Which
makes me wonder if you could get the results you want through some use
of history-simplification flags.

Or maybe we already do that TREESAME check. Simplification kicks in when
the traversal is limited by path, and --follow mode by definition has
such a path. But I'm not sure if the --follow code would see the
simplified parent list or not.

So I dunno. Probably some experimenting could yield more analysis there,
but with the patch as-is I'm not convinced that it is not going to make
some cases worse.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] log: improve --follow following renames in merge commits
  2026-05-22  5:43       ` Jeff King
@ 2026-05-23  6:04         ` Miklos Vajna
  2026-05-30  6:28           ` Miklos Vajna
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2026-05-23  6:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Elijah Newren, git

Have a repo with a subtree merge, do a 'git log --follow prefix/test.c',
the output only contains history in the outer repo, not commits that
were merged via a subtree merge.

There is an inherent limitation of the current 'git log --follow'
design, since it's limited to a single filename, and once 'git log' sees
a rename, it only tracks the new path, which only works with mostly
linear history.  Still, 'git blame prefix/test.c' does find the original
commits, so it's fair to expect 'git log --follow' can do the same.

Fix the problem by improving when to update the followed path in
log_tree_diff(). If the path is untouched versus the merge result in all
parents but one, then choose the parent where it was changed, including
any --follow processing.

This is almost the same as requiring that all but one parents are
TREESAME, except we don't consider the addition of a file as
"interesting". With this, the pre-merge history of subtree merge is
visible in git log, but the behavior is unchanged for other cases (e.g.
when a file was previously named differently on multiple parents).

Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
---

Hi Jeff,

On Fri, May 22, 2026 at 01:43:12AM -0400, Jeff King <peff@peff.net> wrote:
> I think we can probably all agree that both before and after your patch,
> --follow is never going to do the _right_ thing, which is to follow
> paths independently down both sides of history.

Sure.

> I am OK conceptually with making the current broken behavior slightly
> more useful if it is easy to do. But I am not sure if we are making
> things more useful here or not. If we see a merge where the file "bar"
> was previous "foo" on one side and "bar" on the other, our broken follow
> is going to either pick "foo" or "bar" to continue with as we traverse.
> But which one is right? Whichever name we choose, we are potentially
> omitting results from the other side.

Indeed, I didn't consider this case.

> There might be a more useful rule like: if the path is untouched versus
> the merge result in all parents but one (i.e., TREESAME), then choose
> the parent where it was changed, including any --follow processing.

I like this idea: it keeps working with the subtree use-case I have in
mind and goes back to not change behavior when the file has history on
multiple parents.

> So I dunno. Probably some experimenting could yield more analysis there,

I think requiring TREESAME for all but one parents is too strict, since
a subtree merge will look like an addition vs the first parent and will
look like a rename on the first parent. It seems to me that handling
addition as TREESAME can be correct: if the file was just added, that
suggests it has no prior history.

So a slightly relaxed rule could be: if the path is untouched or just
added versus the merge result in all parents but one, then choose the
parent where it was changed, including any --follow processing.

Here is a patch that implements that idea. It works for the subtree
merge use-case I outlined and I also added a test to show that the
behavior is unchanged for the "multiple parents have actual history for
this file" case you mentioned.

What do you think?

Thanks,

Miklos

 log-tree.c                          | 55 ++++++++++++++++++++++++-
 t/meson.build                       |  1 +
 t/t4218-log-follow-subtree-merge.sh | 64 +++++++++++++++++++++++++++++
 3 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100755 t/t4218-log-follow-subtree-merge.sh

diff --git a/log-tree.c b/log-tree.c
index 7e048701d0..368144fafc 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1142,8 +1142,61 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 				/* Show parent info for multiple diffs */
 				log->parent = parents->item;
 			}
-		} else
+		} else {
+			if (opt->diffopt.flags.follow_renames) {
+				/*
+				 * If the path is untouched in all parents but
+				 * one, then choose the parent where it was
+				 * changed.
+				 */
+				struct commit_list *p;
+				struct commit *changed_parent = NULL;
+				int n_changed = 0;
+
+				for (p = parents; p; p = p->next) {
+					struct diff_options diff_opts;
+					int interesting = 0;
+					int i;
+
+					parse_commit_or_die(p->item);
+					repo_diff_setup(opt->diffopt.repo, &diff_opts);
+					copy_pathspec(&diff_opts.pathspec,
+						      &opt->diffopt.pathspec);
+					diff_opts.flags.recursive = 1;
+					diff_opts.flags.follow_renames = 1;
+					diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+					diff_setup_done(&diff_opts);
+					diff_tree_oid(get_commit_tree_oid(p->item),
+						      oid, "", &diff_opts);
+
+					for (i = 0; i < diff_queued_diff.nr; i++) {
+						struct diff_filepair *pair = diff_queued_diff.queue[i];
+						if (DIFF_FILE_VALID(pair->one)) {
+							interesting = 1;
+							break;
+						}
+					}
+
+					diff_queue_clear(&diff_queued_diff);
+					diff_free(&diff_opts);
+
+					if (interesting) {
+						n_changed++;
+						changed_parent = p->item;
+						if (n_changed > 1)
+							break;
+					}
+				}
+
+				if (n_changed == 1) {
+					diff_tree_oid(get_commit_tree_oid(changed_parent),
+						      oid, "", &opt->diffopt);
+					diff_queue_clear(&diff_queued_diff);
+					opt->diffopt.found_follow = 0;
+				}
+			}
 			return 0;
+		}
 	}
 
 	showed_log = 0;
diff --git a/t/meson.build b/t/meson.build
index 7528e5cda5..b4ae8d76d8 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -574,6 +574,7 @@ integration_tests = [
   't4215-log-skewed-merges.sh',
   't4216-log-bloom.sh',
   't4217-log-limit.sh',
+  't4218-log-follow-subtree-merge.sh',
   't4252-am-options.sh',
   't4253-am-keep-cr-dos.sh',
   't4254-am-corrupt.sh',
diff --git a/t/t4218-log-follow-subtree-merge.sh b/t/t4218-log-follow-subtree-merge.sh
new file mode 100755
index 0000000000..fc846ebeb4
--- /dev/null
+++ b/t/t4218-log-follow-subtree-merge.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+test_description='Test --follow follows renames across subtree merges'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'setup subtree-merged repository' '
+	git init inner &&
+	echo inner >inner/inner.txt &&
+	git -C inner add inner.txt &&
+	git -C inner commit -m "inner init" &&
+
+	git init outer &&
+	echo outer >outer/outer.txt &&
+	git -C outer add outer.txt &&
+	git -C outer commit -m "outer init" &&
+
+	git -C outer fetch ../inner master &&
+	git -C outer merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C outer read-tree --prefix=inner/ -u FETCH_HEAD &&
+	git -C outer commit -m "Merge inner repo into inner/ subdirectory"
+'
+
+test_expect_success '--follow finds the pre-merge commit through a subtree merge' '
+	git -C outer log --follow --pretty=tformat:%s inner/inner.txt >actual &&
+	echo "inner init" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup merge with rename sources on multiple parents' '
+	git init left &&
+	printf "shared content\n" >left/a.txt &&
+	git -C left add a.txt &&
+	git -C left commit -m "left: a.txt" &&
+
+	git init right &&
+	printf "shared content\n" >right/b.txt &&
+	git -C right add b.txt &&
+	git -C right commit -m "right: b.txt" &&
+
+	git -C left fetch ../right master &&
+	git -C left merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C left rm a.txt &&
+	printf "shared content\n" >left/c.txt &&
+	git -C left add c.txt &&
+	git -C left commit -m "Merge: rename to c.txt" &&
+
+	printf "more content\n" >>left/c.txt &&
+	git -C left add c.txt &&
+	git -C left commit -m "modify c.txt"
+'
+
+test_expect_success '--follow does not switch when multiple parents supply a rename source' '
+	git -C left log --follow --pretty=tformat:%s c.txt >actual &&
+	echo "modify c.txt" >expect &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: improve --follow following renames in merge commits
  2026-05-23  6:04         ` [PATCH] log: improve --follow following " Miklos Vajna
@ 2026-05-30  6:28           ` Miklos Vajna
  2026-06-04 12:20             ` Miklos Vajna
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2026-05-30  6:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Elijah Newren, git

Hi Jeff,

On Sat, May 23, 2026 at 08:04:50AM +0200, Miklos Vajna <vmiklos@collabora.com> wrote:
> > There might be a more useful rule like: if the path is untouched versus
> > the merge result in all parents but one (i.e., TREESAME), then choose
> > the parent where it was changed, including any --follow processing.
> 
> I like this idea: it keeps working with the subtree use-case I have in
> mind and goes back to not change behavior when the file has history on
> multiple parents.
> 
> > So I dunno. Probably some experimenting could yield more analysis there,
> 
> I think requiring TREESAME for all but one parents is too strict, since
> a subtree merge will look like an addition vs the first parent and will
> look like a rename on the first parent. It seems to me that handling
> addition as TREESAME can be correct: if the file was just added, that
> suggests it has no prior history.
> 
> So a slightly relaxed rule could be: if the path is untouched or just
> added versus the merge result in all parents but one, then choose the
> parent where it was changed, including any --follow processing.

Could you please comment on this, if this tweaked rule and its
implementation in the patch looks OK to you? Let me know if I should
just wait some more.

I would hope this addresses your concern where naively following an
other parent just makes one use-case better and can be worse in other
cases.

This also explains why the normal history simplification is not enough
here: the "added vs parent" is a change that is not interesting in this
case, but is more than TREESAME.

Finally, because I forgot to react to that earlier: I'm not against the
idea to attempt to improve --follow work better when visiting a tree of
commits in general, but sounds like a larger rework, so it would be nice
to have a fix for the subtree use-case first.

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: improve --follow following renames in merge commits
  2026-05-30  6:28           ` Miklos Vajna
@ 2026-06-04 12:20             ` Miklos Vajna
  0 siblings, 0 replies; 11+ messages in thread
From: Miklos Vajna @ 2026-06-04 12:20 UTC (permalink / raw)
  To: Jeff King; +Cc: Elijah Newren, git

Hi Jeff,

On Sat, May 30, 2026 at 08:29:03AM +0200, Miklos Vajna <vmiklos@collabora.com> wrote:
> Could you please comment on this, if this tweaked rule and its
> implementation in the patch looks OK to you? Let me know if I should
> just wait some more.

Just to come back to this, the idea was to make the --follow behavior
slightly more useful by not always assuming we should follow a first
parent in merge commits, but see if only one parent has effective
changes to the followed file, and if so, follow that one.

I did this by doing a diff on the followed path in each parent, then
mark the parent as "interesting" if DIFF_FILE_VALID() says so. This is
true if the file is touched or the rename happens inside the merge
commit (vs that parent), but it's not true if the file is really not
touched or the file only shows up as an addition. And if we have only
have one interesting parent, then switch to this, even if it's not the
first parent. With this rule, I think we address your worry case about
"making some other cases" worse and this still works for the subtree
case, and this is relatively easy to do.

What do you think?

Thanks,

Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] log: improve --follow following renames for non-linear history
  2026-05-19  8:14     ` Junio C Hamano
@ 2026-06-08  6:35       ` Miklos Vajna
  2026-06-08 15:10         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2026-06-08  6:35 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git

Have a repo with a subtree merge, do a 'git log --follow prefix/test.c',
the output only contains history in the outer repo, not commits that
were merged via a subtree merge.

What happened is that 'git log --follow' used to store the followed path
only in opt->diffopt.pathspec, so in case the commit history is
non-linear, and multiple parents had renames to the followed path, then
the end result wasn't really defined: the first commit that happened to
be visited in one of the parents updated opt->diffopt.pathspec, and from
that point, only that updated path was visited.

Fix the problem by introducing a commit -> path map
(follow_pathspec_slab) that stores that will be path to follow when
visiting that parent. At the top of log_tree_commit(), if the slab has
an entry for this commit, we replace opt->diffopt.pathspec with it, so
the correct path is followed, even if an unrelated sub-tree changed the
path to be followed to something else. After log_tree_diff() runs, we
record each parent's path in the slab: for a non-merge commit,
try_to_follow_renames() inside diff_tree_oid() has already updated
opt->diffopt.pathspec to the parent's name, so we just record it. For a
merge, log_tree_diff() is a no-op; we run a separate
diff_tree_oid(parent, commit, ...) with follow_renames=1 for each parent
and record the path it finds. As a result, the walk order doesn't
matter, which was exactly the source of problems previously.

This helps with subtree merges (rename happens inside the merge commit),
but also the general case when the rename happens in the history of
parents, not in the merge commit itself.
---

Hi Junio, Jeff,

On Tue, May 19, 2026 at 05:14:54PM +0900, Junio C Hamano <gitster@pobox.com> wrote:
> A minor "tweak" that does not solve this inherent design issue does
> not interest me, so...

Here is a patch that attempts to actually solve the problem you point
out here, by tracking what path should be followed for multiple branches
of the history.

It obsoletes the previous two "tweak" patches in this thread.

Hopefully this one is more interesting. :-)

Thanks,

Miklos

 Documentation/config/log.adoc |   3 +-
 log-tree.c                    | 116 ++++++++++++++++++++++++++++++++++
 log-tree.h                    |   1 +
 revision.c                    |   2 +
 revision.h                    |   4 ++
 t/meson.build                 |   1 +
 t/t4218-log-follow-merge.sh   |  80 +++++++++++++++++++++++
 7 files changed, 205 insertions(+), 2 deletions(-)
 create mode 100755 t/t4218-log-follow-merge.sh

diff --git a/Documentation/config/log.adoc b/Documentation/config/log.adoc
index f20cc25cd7..757a7be196 100644
--- a/Documentation/config/log.adoc
+++ b/Documentation/config/log.adoc
@@ -53,8 +53,7 @@ This is the same as the `--decorate` option of the `git log`.
 `log.follow`::
 	If `true`, `git log` will act as if the `--follow` option was used when
 	a single <path> is given.  This has the same limitations as `--follow`,
-	i.e. it cannot be used to follow multiple files and does not work well
-	on non-linear history.
+	i.e. it cannot be used to follow multiple files.
 
 `log.graphColors`::
 	A list of colors, separated by commas, that can be used to draw
diff --git a/log-tree.c b/log-tree.c
index 7e048701d0..e7f098e571 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -3,6 +3,7 @@
 
 #include "git-compat-util.h"
 #include "commit-reach.h"
+#include "commit-slab.h"
 #include "config.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1089,6 +1090,96 @@ static int do_remerge_diff(struct rev_info *opt,
 	return !opt->loginfo;
 }
 
+/* Per-commit pathspec storage for --follow across merges */
+define_commit_slab(follow_pathspec_slab, char *);
+
+static const char *pathspec_single_path(const struct pathspec *ps)
+{
+	if (ps->nr != 1)
+		return NULL;
+	return ps->items[0].match;
+}
+
+static void set_pathspec_to_single_path(struct pathspec *ps, const char *path)
+{
+	const char *paths[2] = { path, NULL };
+
+	clear_pathspec(ps);
+	parse_pathspec(ps,
+		       PATHSPEC_ALL_MAGIC & ~PATHSPEC_LITERAL,
+		       PATHSPEC_LITERAL_PATH, "", paths);
+}
+
+static void remember_follow_pathspec(struct rev_info *opt,
+				     struct commit *c, const char *path)
+{
+	char **slot;
+
+	if (!path)
+		return;
+	if (!opt->follow_pathspec_slab) {
+		opt->follow_pathspec_slab = xmalloc(sizeof(*opt->follow_pathspec_slab));
+		init_follow_pathspec_slab(opt->follow_pathspec_slab);
+	}
+	slot = follow_pathspec_slab_at(opt->follow_pathspec_slab, c);
+	if (*slot && !strcmp(*slot, path))
+		return;
+	free(*slot);
+	*slot = xstrdup(path);
+}
+
+static const char *recall_follow_pathspec(struct rev_info *opt,
+					  struct commit *c)
+{
+	char **slot;
+
+	if (!opt->follow_pathspec_slab)
+		return NULL;
+	slot = follow_pathspec_slab_peek(opt->follow_pathspec_slab, c);
+	return slot ? *slot : NULL;
+}
+
+static void free_follow_pathspec_slot(char **slot)
+{
+	FREE_AND_NULL(*slot);
+}
+
+void release_follow_pathspec_slab(struct rev_info *opt)
+{
+	if (!opt->follow_pathspec_slab)
+		return;
+	deep_clear_follow_pathspec_slab(opt->follow_pathspec_slab,
+					free_follow_pathspec_slot);
+	FREE_AND_NULL(opt->follow_pathspec_slab);
+}
+
+/* Compute the followed pathspec that should apply to parent. */
+static void propagate_follow_pathspec_to_parent(struct rev_info *opt,
+						struct commit *commit,
+						struct commit *parent)
+{
+	struct diff_options diff_opts;
+	const char *path;
+
+	parse_commit_or_die(parent);
+	repo_diff_setup(opt->diffopt.repo, &diff_opts);
+	copy_pathspec(&diff_opts.pathspec, &opt->diffopt.pathspec);
+	diff_opts.flags.recursive = 1;
+	diff_opts.flags.follow_renames = 1;
+	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_setup_done(&diff_opts);
+	diff_tree_oid(get_commit_tree_oid(parent),
+		      get_commit_tree_oid(commit),
+		      "", &diff_opts);
+
+	path = pathspec_single_path(&diff_opts.pathspec);
+	if (path)
+		remember_follow_pathspec(opt, parent, path);
+
+	diff_queue_clear(&diff_queued_diff);
+	diff_free(&diff_opts);
+}
+
 /*
  * Show the diff of a commit.
  *
@@ -1179,6 +1270,16 @@ int log_tree_commit(struct rev_info *opt, struct commit *commit)
 	opt->loginfo = &log;
 	opt->diffopt.no_free = 1;
 
+	/* Any recorded pathspec for this commit? If so, restore it. */
+	if (opt->diffopt.flags.follow_renames) {
+		const char *stored = recall_follow_pathspec(opt, commit);
+		if (stored) {
+			const char *current = pathspec_single_path(&opt->diffopt.pathspec);
+			if (!current || strcmp(current, stored))
+				set_pathspec_to_single_path(&opt->diffopt.pathspec, stored);
+		}
+	}
+
 	/* NEEDSWORK: no restoring of no_free?  Why? */
 	if (opt->line_level_traverse)
 		return line_log_print(opt, commit);
@@ -1195,6 +1296,21 @@ int log_tree_commit(struct rev_info *opt, struct commit *commit)
 		fprintf(opt->diffopt.file, "\n%s\n", opt->break_bar);
 	if (shown)
 		show_diff_of_diff(opt);
+
+	/* Record what pathspec each parent of this commit should use */
+	if (opt->diffopt.flags.follow_renames) {
+		struct commit_list *parents = get_saved_parents(opt, commit);
+		if (parents && parents->next) {
+			struct commit_list *p;
+			for (p = parents; p; p = p->next)
+				propagate_follow_pathspec_to_parent(opt, commit,
+								    p->item);
+		} else if (parents) {
+			remember_follow_pathspec(opt, parents->item,
+				pathspec_single_path(&opt->diffopt.pathspec));
+		}
+	}
+
 	opt->loginfo = NULL;
 	maybe_flush_or_die(opt->diffopt.file, "stdout");
 	opt->diffopt.no_free = no_free;
diff --git a/log-tree.h b/log-tree.h
index 07924be8bc..e8679b6c4a 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -26,6 +26,7 @@ struct decoration_options {
 int parse_decorate_color_config(const char *var, const char *slot_name, const char *value);
 int log_tree_diff_flush(struct rev_info *);
 int log_tree_commit(struct rev_info *, struct commit *);
+void release_follow_pathspec_slab(struct rev_info *);
 void show_log(struct rev_info *opt);
 void format_decorations(struct strbuf *sb, const struct commit *commit,
 			enum git_colorbool use_color, const struct decoration_options *opts);
diff --git a/revision.c b/revision.c
index 5693618be4..caa85fb4c6 100644
--- a/revision.c
+++ b/revision.c
@@ -26,6 +26,7 @@
 #include "decorate.h"
 #include "string-list.h"
 #include "line-log.h"
+#include "log-tree.h"
 #include "mailmap.h"
 #include "commit-slab.h"
 #include "cache-tree.h"
@@ -3284,6 +3285,7 @@ void release_revisions(struct rev_info *revs)
 	line_log_free(revs);
 	oidset_clear(&revs->missing_commits);
 	release_revisions_bloom_keyvecs(revs);
+	release_follow_pathspec_slab(revs);
 }
 
 static void add_child(struct rev_info *revs, struct commit *parent, struct commit *child)
diff --git a/revision.h b/revision.h
index c9a11827cc..607113ca74 100644
--- a/revision.h
+++ b/revision.h
@@ -65,6 +65,7 @@ struct repository;
 struct rev_info;
 struct string_list;
 struct saved_parents;
+struct follow_pathspec_slab;
 struct bloom_keyvec;
 struct bloom_filter_settings;
 struct option;
@@ -354,6 +355,9 @@ struct rev_info {
 	/* copies of the parent lists, for --full-diff display */
 	struct saved_parents *saved_parents_slab;
 
+	/* per-commit pathspec for --follow across merges */
+	struct follow_pathspec_slab *follow_pathspec_slab;
+
 	struct commit_list *previous_parents;
 	struct commit_list *ancestry_path_bottoms;
 	const char *break_bar;
diff --git a/t/meson.build b/t/meson.build
index 2af8d01279..cd43e0609a 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -576,6 +576,7 @@ integration_tests = [
   't4215-log-skewed-merges.sh',
   't4216-log-bloom.sh',
   't4217-log-limit.sh',
+  't4218-log-follow-merge.sh',
   't4252-am-options.sh',
   't4253-am-keep-cr-dos.sh',
   't4254-am-corrupt.sh',
diff --git a/t/t4218-log-follow-merge.sh b/t/t4218-log-follow-merge.sh
new file mode 100755
index 0000000000..7a1b6fcb84
--- /dev/null
+++ b/t/t4218-log-follow-merge.sh
@@ -0,0 +1,80 @@
+#!/bin/sh
+
+test_description='Test --follow follows renames across merges'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'setup subtree-merged repository' '
+	git init inner &&
+	echo inner >inner/inner.txt &&
+	git -C inner add inner.txt &&
+	git -C inner commit -m "inner init" &&
+
+	git init outer &&
+	echo outer >outer/outer.txt &&
+	git -C outer add outer.txt &&
+	git -C outer commit -m "outer init" &&
+
+	git -C outer fetch ../inner master &&
+	git -C outer merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C outer read-tree --prefix=inner/ -u FETCH_HEAD &&
+	git -C outer commit -m "Merge inner repo into inner/ subdirectory"
+'
+
+test_expect_success '--follow finds the pre-merge commit through a subtree merge' '
+	git -C outer log --follow --pretty=tformat:%s inner/inner.txt >actual &&
+	echo "inner init" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup merge of two branches that both renamed a file to README' '
+	git init foo &&
+	mkdir foo/foo &&
+	echo "foo readme" >foo/foo/README &&
+	git -C foo add foo/README &&
+	git -C foo commit -m "add foo README" &&
+
+	git -C foo mv foo/README README &&
+	git -C foo commit -m "promote foo README to toplevel" &&
+
+	echo "foo c" >foo/foo.c &&
+	git -C foo add foo.c &&
+	git -C foo commit -m "add foo C impl" &&
+
+	git init bar &&
+	mkdir bar/bar &&
+	echo "bar readme" >bar/bar/README &&
+	git -C bar add bar/README &&
+	git -C bar commit -m "add bar README" &&
+
+	git -C bar mv bar/README README &&
+	git -C bar commit -m "promote bar README to toplevel" &&
+
+	echo "bar c" >bar/bar.c &&
+	git -C bar add bar.c &&
+	git -C bar commit -m "add bar C impl" &&
+
+	git -C foo fetch ../bar master &&
+	git -C foo merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C foo checkout FETCH_HEAD -- bar.c &&
+	git -C foo commit -m "merge bar into foo"
+'
+
+test_expect_success '--follow follows renames across both sides of a merge' '
+	git -C foo log --follow --pretty=tformat:%s README >actual &&
+	sort actual >actual.sorted &&
+	cat >expect <<-\EOF &&
+	add bar README
+	add foo README
+	promote bar README to toplevel
+	promote foo README to toplevel
+	EOF
+	test_cmp expect actual.sorted
+'
+
+test_done
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] log: improve --follow following renames for non-linear history
  2026-06-08  6:35       ` [PATCH] log: improve --follow following renames for non-linear history Miklos Vajna
@ 2026-06-08 15:10         ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2026-06-08 15:10 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Jeff King, git

Miklos Vajna <vmiklos@collabora.com> writes:

> Have a repo with a subtree merge, do a 'git log --follow prefix/test.c',
> the output only contains history in the outer repo, not commits that
> were merged via a subtree merge.
>
> What happened is that 'git log --follow' used to store the followed path
> only in opt->diffopt.pathspec, so in case the commit history is
> non-linear, and multiple parents had renames to the followed path, then
> the end result wasn't really defined: the first commit that happened to
> be visited in one of the parents updated opt->diffopt.pathspec, and from
> that point, only that updated path was visited.

When describing a problematic symptom you are trying to improve, you
should talk about the current state of the system in the present
tense.  "used to store" makes it sound like in ancient times back
when Linus wrote the first version of this feature it was so, but a
few years ago that changed, but that is not what you want to say, is
it?

The above may sound picky, but using the consistent style of
description makes it easier to follow the thought process,
especially when you need to read many commits to understand what is
going on.

> Fix the problem by introducing a commit -> path map
> (follow_pathspec_slab) that stores that will be path to follow when
> visiting that parent. At the top of log_tree_commit(), if the slab has
> an entry for this commit, we replace opt->diffopt.pathspec with it, so
> the correct path is followed, even if an unrelated sub-tree changed the
> path to be followed to something else.

Can a "map" cut it?

If a history forked at commit A, with two children commit B and
commit C, and you started traversing the history from a much later
descendant M that merges these two lines of history (i.e., M^1
contains B, M^2 contains C, and A==B^1==C^1), while traversing down
from M to B you may find that you need to follow path1 and similarly
somewhere between M down to C the path you are following may be
path2.  And the traversal meets at A.  The slab records path1 for B
and path2 for C.  Wouldn't you need to be able to store both path1
and path2 for commit A?  What path do you need to pay attention to
when traversing past A to its ancestors?

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-06-08 15:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-12  7:21 [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
2026-05-19  6:17 ` Miklos Vajna
2026-05-19  6:37   ` Junio C Hamano
2026-05-19  8:14     ` Junio C Hamano
2026-06-08  6:35       ` [PATCH] log: improve --follow following renames for non-linear history Miklos Vajna
2026-06-08 15:10         ` Junio C Hamano
2026-05-20 13:28     ` [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
2026-05-22  5:43       ` Jeff King
2026-05-23  6:04         ` [PATCH] log: improve --follow following " Miklos Vajna
2026-05-30  6:28           ` Miklos Vajna
2026-06-04 12:20             ` Miklos Vajna

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox