All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miklos Vajna <vmiklos@collabora.com>
To: Jeff King <peff@peff.net>
Cc: Elijah Newren <newren@gmail.com>, git@vger.kernel.org
Subject: [PATCH] log: improve --follow following renames in merge commits
Date: Sat, 23 May 2026 08:04:50 +0200	[thread overview]
Message-ID: <ahFDgq4TAcs29zCA@collabora.com> (raw)
In-Reply-To: <20260522054312.GD861761@coredump.intra.peff.net>

Have a repo with a subtree merge, do a 'git log --follow prefix/test.c',
the output only contains history in the outer repo, not commits that
were merged via a subtree merge.

There is an inherent limitation of the current 'git log --follow'
design, since it's limited to a single filename, and once 'git log' sees
a rename, it only tracks the new path, which only works with mostly
linear history.  Still, 'git blame prefix/test.c' does find the original
commits, so it's fair to expect 'git log --follow' can do the same.

Fix the problem by improving when to update the followed path in
log_tree_diff(). If the path is untouched versus the merge result in all
parents but one, then choose the parent where it was changed, including
any --follow processing.

This is almost the same as requiring that all but one parents are
TREESAME, except we don't consider the addition of a file as
"interesting". With this, the pre-merge history of subtree merge is
visible in git log, but the behavior is unchanged for other cases (e.g.
when a file was previously named differently on multiple parents).

Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
---

Hi Jeff,

On Fri, May 22, 2026 at 01:43:12AM -0400, Jeff King <peff@peff.net> wrote:
> I think we can probably all agree that both before and after your patch,
> --follow is never going to do the _right_ thing, which is to follow
> paths independently down both sides of history.

Sure.

> I am OK conceptually with making the current broken behavior slightly
> more useful if it is easy to do. But I am not sure if we are making
> things more useful here or not. If we see a merge where the file "bar"
> was previous "foo" on one side and "bar" on the other, our broken follow
> is going to either pick "foo" or "bar" to continue with as we traverse.
> But which one is right? Whichever name we choose, we are potentially
> omitting results from the other side.

Indeed, I didn't consider this case.

> There might be a more useful rule like: if the path is untouched versus
> the merge result in all parents but one (i.e., TREESAME), then choose
> the parent where it was changed, including any --follow processing.

I like this idea: it keeps working with the subtree use-case I have in
mind and goes back to not change behavior when the file has history on
multiple parents.

> So I dunno. Probably some experimenting could yield more analysis there,

I think requiring TREESAME for all but one parents is too strict, since
a subtree merge will look like an addition vs the first parent and will
look like a rename on the first parent. It seems to me that handling
addition as TREESAME can be correct: if the file was just added, that
suggests it has no prior history.

So a slightly relaxed rule could be: if the path is untouched or just
added versus the merge result in all parents but one, then choose the
parent where it was changed, including any --follow processing.

Here is a patch that implements that idea. It works for the subtree
merge use-case I outlined and I also added a test to show that the
behavior is unchanged for the "multiple parents have actual history for
this file" case you mentioned.

What do you think?

Thanks,

Miklos

 log-tree.c                          | 55 ++++++++++++++++++++++++-
 t/meson.build                       |  1 +
 t/t4218-log-follow-subtree-merge.sh | 64 +++++++++++++++++++++++++++++
 3 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100755 t/t4218-log-follow-subtree-merge.sh

diff --git a/log-tree.c b/log-tree.c
index 7e048701d0..368144fafc 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1142,8 +1142,61 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 				/* Show parent info for multiple diffs */
 				log->parent = parents->item;
 			}
-		} else
+		} else {
+			if (opt->diffopt.flags.follow_renames) {
+				/*
+				 * If the path is untouched in all parents but
+				 * one, then choose the parent where it was
+				 * changed.
+				 */
+				struct commit_list *p;
+				struct commit *changed_parent = NULL;
+				int n_changed = 0;
+
+				for (p = parents; p; p = p->next) {
+					struct diff_options diff_opts;
+					int interesting = 0;
+					int i;
+
+					parse_commit_or_die(p->item);
+					repo_diff_setup(opt->diffopt.repo, &diff_opts);
+					copy_pathspec(&diff_opts.pathspec,
+						      &opt->diffopt.pathspec);
+					diff_opts.flags.recursive = 1;
+					diff_opts.flags.follow_renames = 1;
+					diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+					diff_setup_done(&diff_opts);
+					diff_tree_oid(get_commit_tree_oid(p->item),
+						      oid, "", &diff_opts);
+
+					for (i = 0; i < diff_queued_diff.nr; i++) {
+						struct diff_filepair *pair = diff_queued_diff.queue[i];
+						if (DIFF_FILE_VALID(pair->one)) {
+							interesting = 1;
+							break;
+						}
+					}
+
+					diff_queue_clear(&diff_queued_diff);
+					diff_free(&diff_opts);
+
+					if (interesting) {
+						n_changed++;
+						changed_parent = p->item;
+						if (n_changed > 1)
+							break;
+					}
+				}
+
+				if (n_changed == 1) {
+					diff_tree_oid(get_commit_tree_oid(changed_parent),
+						      oid, "", &opt->diffopt);
+					diff_queue_clear(&diff_queued_diff);
+					opt->diffopt.found_follow = 0;
+				}
+			}
 			return 0;
+		}
 	}
 
 	showed_log = 0;
diff --git a/t/meson.build b/t/meson.build
index 7528e5cda5..b4ae8d76d8 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -574,6 +574,7 @@ integration_tests = [
   't4215-log-skewed-merges.sh',
   't4216-log-bloom.sh',
   't4217-log-limit.sh',
+  't4218-log-follow-subtree-merge.sh',
   't4252-am-options.sh',
   't4253-am-keep-cr-dos.sh',
   't4254-am-corrupt.sh',
diff --git a/t/t4218-log-follow-subtree-merge.sh b/t/t4218-log-follow-subtree-merge.sh
new file mode 100755
index 0000000000..fc846ebeb4
--- /dev/null
+++ b/t/t4218-log-follow-subtree-merge.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+test_description='Test --follow follows renames across subtree merges'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'setup subtree-merged repository' '
+	git init inner &&
+	echo inner >inner/inner.txt &&
+	git -C inner add inner.txt &&
+	git -C inner commit -m "inner init" &&
+
+	git init outer &&
+	echo outer >outer/outer.txt &&
+	git -C outer add outer.txt &&
+	git -C outer commit -m "outer init" &&
+
+	git -C outer fetch ../inner master &&
+	git -C outer merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C outer read-tree --prefix=inner/ -u FETCH_HEAD &&
+	git -C outer commit -m "Merge inner repo into inner/ subdirectory"
+'
+
+test_expect_success '--follow finds the pre-merge commit through a subtree merge' '
+	git -C outer log --follow --pretty=tformat:%s inner/inner.txt >actual &&
+	echo "inner init" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup merge with rename sources on multiple parents' '
+	git init left &&
+	printf "shared content\n" >left/a.txt &&
+	git -C left add a.txt &&
+	git -C left commit -m "left: a.txt" &&
+
+	git init right &&
+	printf "shared content\n" >right/b.txt &&
+	git -C right add b.txt &&
+	git -C right commit -m "right: b.txt" &&
+
+	git -C left fetch ../right master &&
+	git -C left merge -s ours --no-commit --allow-unrelated-histories \
+		FETCH_HEAD &&
+	git -C left rm a.txt &&
+	printf "shared content\n" >left/c.txt &&
+	git -C left add c.txt &&
+	git -C left commit -m "Merge: rename to c.txt" &&
+
+	printf "more content\n" >>left/c.txt &&
+	git -C left add c.txt &&
+	git -C left commit -m "modify c.txt"
+'
+
+test_expect_success '--follow does not switch when multiple parents supply a rename source' '
+	git -C left log --follow --pretty=tformat:%s c.txt >actual &&
+	echo "modify c.txt" >expect &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.51.0


  reply	other threads:[~2026-05-23  6:05 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-12  7:21 [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
2026-05-19  6:17 ` Miklos Vajna
2026-05-19  6:37   ` Junio C Hamano
2026-05-19  8:14     ` Junio C Hamano
2026-05-20 13:28     ` Miklos Vajna
2026-05-22  5:43       ` Jeff King
2026-05-23  6:04         ` Miklos Vajna [this message]
2026-05-30  6:28           ` [PATCH] log: improve --follow following " Miklos Vajna
2026-06-04 12:20             ` Miklos Vajna

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahFDgq4TAcs29zCA@collabora.com \
    --to=vmiklos@collabora.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.