All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Miklos Vajna <vmiklos@collabora.com>
Cc: Jeff King <peff@peff.net>,  git@vger.kernel.org
Subject: Re: [PATCH v2] log: improve --follow following renames for non-linear history
Date: Thu, 11 Jun 2026 15:32:42 -0700	[thread overview]
Message-ID: <xmqqo6hglncl.fsf@gitster.g> (raw)
In-Reply-To: <aipTOsH8LKTSwglj@collabora.com> (Miklos Vajna's message of "Thu, 11 Jun 2026 08:18:34 +0200")

Miklos Vajna <vmiklos@collabora.com> writes:

> situation when determining what path to follow for a specific commit
> with multiple previously visited children.
> ---

Missing sign-off; omitting sign-off to say that this is primarily
for requesting comments and not ready for application (often we see
RFC on the Subject line when this is done) is fine, though.

>> Can a "map" cut it?
>> 
>> If a history forked at commit A, with two children commit B and
>> commit C, and you started traversing the history from a much later
>> descendant M that merges these two lines of history (i.e., M^1
>> contains B, M^2 contains C, and A==B^1==C^1), while traversing down
>> from M to B you may find that you need to follow path1 and similarly
>> somewhere between M down to C the path you are following may be
>> path2.  And the traversal meets at A.  The slab records path1 for B
>> and path2 for C.  Wouldn't you need to be able to store both path1
>> and path2 for commit A?  What path do you need to pay attention to
>> when traversing past A to its ancestors?
>
> Indeed, I focused on merge commits and their parents and I did not 
> consider that slab[A] may be set to path1 when visiting one parent and 
> then slab[A] may be set to path2 when visiting an other parent -- even 
> if "A" itself is just a plain commit with no renames and is not a merge.

"A" in my example is a fork point.  One of A's children may arrive
at A following path1 while another child may come to A following
path2.  IOW, in the history below:

      B---X---o---M---o---Z
     /           /
    A---C---Y---o

 * A has the original path at "path0"; so do B and C.
 * X renames "path0" to "path1"
 * Y renames "path0" to "path2"
 * M merges path1 coming from upper and path2 from lower history
   and records the result at path "path".
 * Z has "path".

You run "git log --follow Z -- path".

My answer to my (rhetorical) question (Can a "map" cut it?) actually
was "we probably can", since our "rename following" code does not
handle cases where two paths in a parent is merged into a single
path in a child, or a single path in a parent is split to form
multiple paths in a child.

So the "what path are we following?" slab would need to keep track
of a single path.  From Z down to M, we follow "path".  "path1" is
followed from M to X and "path2" is followed from M to Y.  And from
X to A, and Y to A, we follow "path0".  IOW, I did not think we need
two paths recorded for one commit.

Are any of your test cases added by this patch behave differently
with this version (vs the "single path assigned to each commit"
version you had earlier)?  If so, then obviously there is some hole
in my above discussion.  

One case that _could_ break down is if a rename on one track (say,
at Y) is so huge that it is not recognised as a rename.  Then from Y
down to A we would probably try to track "path2" (because we fail to
notice that "path2" came from "path0") and declare that "path2"
appeared at Y from nowhere.  But even then, we shouldn't propagate
"path2" down to A, so A would get only "path0" which was what we
follow going from X down to A, I think.  Still no need for following
multiple paths at a fork point.

> +	/* Any recorded paths for this commit? If so, restore it */
> +	if (opt->diffopt.flags.follow_renames) {
> +		paths = get_follow_pathspec_at(opt, commit);
> +		if (!paths->nr) {
> +			const char *path = pathspec_single_path(&opt->diffopt.pathspec);
> +			if (path)
> +				string_list_insert(paths, path);

We do not need to worry about deduplicating, as string_list_insert()
will automatically takes care of that for us, which is nice.

> +		}
> +		set_pathspec_to_paths(&opt->diffopt.pathspec, paths);
> +		if (paths->nr > 1) {
> +			/* diff_check_follow_pathspec() doesn't handle multiple paths */
> +			saved_follow_renames = opt->diffopt.flags.follow_renames;
> +			opt->diffopt.flags.follow_renames = 0;
> +		}
> +	}

Eek. That's a subtle workaround to break the built-in safety to
ensure there is only one pathspec element while following.


  reply	other threads:[~2026-06-11 22:32 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-12  7:21 [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
2026-05-19  6:17 ` Miklos Vajna
2026-05-19  6:37   ` Junio C Hamano
2026-05-19  8:14     ` Junio C Hamano
2026-06-08  6:35       ` [PATCH] log: improve --follow following renames for non-linear history Miklos Vajna
2026-06-08 15:10         ` Junio C Hamano
2026-06-11  6:18           ` [PATCH v2] " Miklos Vajna
2026-06-11 22:32             ` Junio C Hamano [this message]
2026-05-20 13:28     ` [PATCH] log: let --follow follow renames in merge commits Miklos Vajna
2026-05-22  5:43       ` Jeff King
2026-05-23  6:04         ` [PATCH] log: improve --follow following " Miklos Vajna
2026-05-30  6:28           ` Miklos Vajna
2026-06-04 12:20             ` Miklos Vajna

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqo6hglncl.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=vmiklos@collabora.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.