* [PATCH] log: --remerge-diff needs to keep around commit parents
@ 2024-11-08 13:43 Johannes Schindelin via GitGitGadget
2024-11-08 15:47 ` Elijah Newren
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-11-08 13:43 UTC (permalink / raw)
To: git; +Cc: Johannes Sixt, Elijah Newren, Johannes Schindelin,
Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
To show a remerge diff, the merge needs to be recreated. For that to
work, the merge base(s) need to be found, which means that the commits'
parents have to be traversed until common ancestors are found (if any).
However, one optimization that hails all the way back to
cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release
the commit's list of parents immediately after showing it. This can break
the merge base computation.
Note that it matters more clearly when traversing the commits in
reverse: In that instance, if a parent of a merge commit has been shown
as part of the `git log` command, by the time the merge commit's diff
needs to be computed, that parent commit's list of parent commits will
have been set to `NULL` and as a result no merge base will be found.
Let's fix this by special-casing the `remerge_diff` mode, similar to
what we did with reflogs in f35650dff6a4 (log: do not free parents when
walking reflog, 2017-07-07).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
log: --remerge-diff needs to keep around commit parents
This fixes a bug that is pretty much as old as the remerge-diff
machinery itself. I noticed it while writing my reply to Hannes Sixt's
concerns about my range-diff --diff-merges=<mode> patch
(https://lore.kernel.org/git/af576487-5de2-fba3-b341-3c082322c9ec@gmx.de/).
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1825%2Fdscho%2Flog-remerge-diff-needs-commit-parents-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1825/dscho/log-remerge-diff-needs-commit-parents-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1825
builtin/log.c | 2 +-
t/t4069-remerge-diff.sh | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/builtin/log.c b/builtin/log.c
index c0a8bb95e98..a297c6caf59 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -522,7 +522,7 @@ static int cmd_log_walk_no_free(struct rev_info *rev)
* but we didn't actually show the commit.
*/
rev->max_count++;
- if (!rev->reflog_info) {
+ if (!rev->reflog_info && !rev->remerge_diff) {
/*
* We may show a given commit multiple times when
* walking the reflogs.
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe0..a68c6bfa036 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -317,4 +317,11 @@ test_expect_success 'remerge-diff turns off history simplification' '
test_cmp expect actual
'
+test_expect_success 'remerge-diff with --reverse' '
+ git log -1 --remerge-diff --oneline ab_resolution^ >expect &&
+ git log -1 --remerge-diff --oneline ab_resolution >>expect &&
+ git log -2 --remerge-diff --oneline ab_resolution --reverse >actual &&
+ test_cmp expect actual
+'
+
test_done
base-commit: bea9ecd24b0c3bf06cab4a851694fe09e7e51408
--
gitgitgadget
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] log: --remerge-diff needs to keep around commit parents
2024-11-08 13:43 [PATCH] log: --remerge-diff needs to keep around commit parents Johannes Schindelin via GitGitGadget
@ 2024-11-08 15:47 ` Elijah Newren
2024-11-11 1:52 ` Junio C Hamano
2024-11-11 1:46 ` Junio C Hamano
2024-11-11 18:33 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2 siblings, 1 reply; 9+ messages in thread
From: Elijah Newren @ 2024-11-08 15:47 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, Johannes Sixt, Johannes Schindelin
On Fri, Nov 8, 2024 at 5:43 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> To show a remerge diff, the merge needs to be recreated. For that to
> work, the merge base(s) need to be found, which means that the commits'
> parents have to be traversed until common ancestors are found (if any).
>
> However, one optimization that hails all the way back to
> cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release
> the commit's list of parents immediately after showing it. This can break
> the merge base computation.
>
> Note that it matters more clearly when traversing the commits in
> reverse: In that instance, if a parent of a merge commit has been shown
> as part of the `git log` command, by the time the merge commit's diff
> needs to be computed, that parent commit's list of parent commits will
> have been set to `NULL` and as a result no merge base will be found.
>
> Let's fix this by special-casing the `remerge_diff` mode, similar to
> what we did with reflogs in f35650dff6a4 (log: do not free parents when
> walking reflog, 2017-07-07).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> log: --remerge-diff needs to keep around commit parents
>
> This fixes a bug that is pretty much as old as the remerge-diff
> machinery itself. I noticed it while writing my reply to Hannes Sixt's
> concerns about my range-diff --diff-merges=<mode> patch
> (https://lore.kernel.org/git/af576487-5de2-fba3-b341-3c082322c9ec@gmx.de/).
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1825%2Fdscho%2Flog-remerge-diff-needs-commit-parents-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1825/dscho/log-remerge-diff-needs-commit-parents-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1825
>
> builtin/log.c | 2 +-
> t/t4069-remerge-diff.sh | 7 +++++++
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index c0a8bb95e98..a297c6caf59 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -522,7 +522,7 @@ static int cmd_log_walk_no_free(struct rev_info *rev)
> * but we didn't actually show the commit.
> */
> rev->max_count++;
> - if (!rev->reflog_info) {
> + if (!rev->reflog_info && !rev->remerge_diff) {
> /*
> * We may show a given commit multiple times when
> * walking the reflogs.
Should this comment be updated to reflect the expanded rationale for
this block's purpose?
> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> index 07323ebafe0..a68c6bfa036 100755
> --- a/t/t4069-remerge-diff.sh
> +++ b/t/t4069-remerge-diff.sh
> @@ -317,4 +317,11 @@ test_expect_success 'remerge-diff turns off history simplification' '
> test_cmp expect actual
> '
>
> +test_expect_success 'remerge-diff with --reverse' '
> + git log -1 --remerge-diff --oneline ab_resolution^ >expect &&
> + git log -1 --remerge-diff --oneline ab_resolution >>expect &&
> + git log -2 --remerge-diff --oneline ab_resolution --reverse >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
>
> base-commit: bea9ecd24b0c3bf06cab4a851694fe09e7e51408
> --
> gitgitgadget
Wow, nice find, and I appreciate the nice simple testcase
demonstrating what had been the problem.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] log: --remerge-diff needs to keep around commit parents
2024-11-08 13:43 [PATCH] log: --remerge-diff needs to keep around commit parents Johannes Schindelin via GitGitGadget
2024-11-08 15:47 ` Elijah Newren
@ 2024-11-11 1:46 ` Junio C Hamano
2024-11-11 18:32 ` Johannes Schindelin
2024-11-11 18:33 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2024-11-11 1:46 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, Johannes Sixt, Elijah Newren, Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> To show a remerge diff, the merge needs to be recreated. For that to
> work, the merge base(s) need to be found, which means that the commits'
> parents have to be traversed until common ancestors are found (if any).
>
> However, one optimization that hails all the way back to
> cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release
> the commit's list of parents immediately after showing it. This can break
> the merge base computation.
> Note that it matters more clearly when traversing the commits in
> reverse: In that instance, if a parent of a merge commit has been shown
> as part of the `git log` command, by the time the merge commit's diff
> needs to be computed, that parent commit's list of parent commits will
> have been set to `NULL` and as a result no merge base will be found.
Ouch.
I am curious about "more clearly" in the above, though. Do we also
lose parents before a base can be computed during a forward
traversal? Unlike the reflog walk, we won't show the same commit
twice, so unless we are traversing some of the history of our
parents *and* showing these ancestors found there _before_ we need
to show the merge in question, I do not think the issue would
surface during a forward traversal. So the problematic case is when
we are not doing topological display, and traversing down to a
commit with skewed timestamp causes it to be shown (and lose its
parents due to memory optimization) because it appears much newer
than our merge, and there is an ancestry chain leading to it that
passes commits other than our merge, or something?
> Let's fix this by special-casing the `remerge_diff` mode, similar to
> what we did with reflogs in f35650dff6a4 (log: do not free parents when
> walking reflog, 2017-07-07).
OK.
> diff --git a/builtin/log.c b/builtin/log.c
> index c0a8bb95e98..a297c6caf59 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -522,7 +522,7 @@ static int cmd_log_walk_no_free(struct rev_info *rev)
> * but we didn't actually show the commit.
> */
> rev->max_count++;
> - if (!rev->reflog_info) {
> + if (!rev->reflog_info && !rev->remerge_diff) {
> /*
> * We may show a given commit multiple times when
> * walking the reflogs.
OK, that's trivial ;-)
> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> index 07323ebafe0..a68c6bfa036 100755
> --- a/t/t4069-remerge-diff.sh
> +++ b/t/t4069-remerge-diff.sh
> @@ -317,4 +317,11 @@ test_expect_success 'remerge-diff turns off history simplification' '
> test_cmp expect actual
> '
>
> +test_expect_success 'remerge-diff with --reverse' '
> + git log -1 --remerge-diff --oneline ab_resolution^ >expect &&
> + git log -1 --remerge-diff --oneline ab_resolution >>expect &&
> + git log -2 --remerge-diff --oneline ab_resolution --reverse >actual &&
> + test_cmp expect actual
> +'
This is the trivially reproducible "show in reverse" case. Nice
finding.
Will queue.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] log: --remerge-diff needs to keep around commit parents
2024-11-08 15:47 ` Elijah Newren
@ 2024-11-11 1:52 ` Junio C Hamano
0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2024-11-11 1:52 UTC (permalink / raw)
To: Elijah Newren
Cc: Johannes Schindelin via GitGitGadget, git, Johannes Sixt,
Johannes Schindelin
Elijah Newren <newren@gmail.com> writes:
>> - if (!rev->reflog_info) {
>> + if (!rev->reflog_info && !rev->remerge_diff) {
>> /*
>> * We may show a given commit multiple times when
>> * walking the reflogs.
>
> Should this comment be updated to reflect the expanded rationale for
> this block's purpose?
Ah, we probably should. Especially if the newly discovered failure
mode is not due to having to show the same commit multiple times.
During reflog walk, we may show the same commit more than once,
so do not discard the parents. There may be a merge commit that
is descendent from the current commit, and in order to show it with
remerge-diff enabled, we need its ancestry chain, including our
parents, to compute the merge base.
or something? It is still not clear to me in what scenario a merge
commit, in a forward traversal, is shown _after_ one of its ancestor
commit is shown (and due to the memory optimization, loses its
parents).
Thanks, both.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] log: --remerge-diff needs to keep around commit parents
2024-11-11 1:46 ` Junio C Hamano
@ 2024-11-11 18:32 ` Johannes Schindelin
2024-11-11 22:13 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin @ 2024-11-11 18:32 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, Johannes Sixt,
Elijah Newren
Hi Junio,
On Mon, 11 Nov 2024, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > To show a remerge diff, the merge needs to be recreated. For that to
> > work, the merge base(s) need to be found, which means that the commits'
> > parents have to be traversed until common ancestors are found (if any).
> >
> > However, one optimization that hails all the way back to
> > cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release
> > the commit's list of parents immediately after showing it. This can break
> > the merge base computation.
>
> > Note that it matters more clearly when traversing the commits in
> > reverse: In that instance, if a parent of a merge commit has been shown
> > as part of the `git log` command, by the time the merge commit's diff
> > needs to be computed, that parent commit's list of parent commits will
> > have been set to `NULL` and as a result no merge base will be found.
>
> Ouch.
>
> I am curious about "more clearly" in the above, though.
I consider these examples less clear, but they are still affected:
git show --remerge-diff v2.45.2^0
vs
git show --remerge-diff v2.44.2^0 v2.45.2^0
git show --remerge-diff v2.45.1~1
vs
git log --topo-order --first-parent --remerge-diff v2.44.2 v2.45.1
Concretely, these diffs should be empty, but are not:
git diff --no-index \
<(git show --remerge-diff v2.45.2^0 | sed 1d) \
<(git show --remerge-diff v2.44.2^0 v2.45.2^0 | sed 1,/^commit/d)
and
git diff --no-index \
<(git show --remerge-diff v2.45.1~1 | grep -v ^commit) \
<(git log --topo-order --first-parent -6 --remerge-diff v2.44.2 v2.45.1 |
sed '1,/^commit 1c00f92eb5ee4a48ab615eefa41f2dd6024d43bc/d;/^commit/,$d')
No `--reverse` required, not even clock skew.
Ciao,
Johannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] log: --remerge-diff needs to keep around commit parents
2024-11-08 13:43 [PATCH] log: --remerge-diff needs to keep around commit parents Johannes Schindelin via GitGitGadget
2024-11-08 15:47 ` Elijah Newren
2024-11-11 1:46 ` Junio C Hamano
@ 2024-11-11 18:33 ` Johannes Schindelin via GitGitGadget
2024-12-12 10:29 ` [PATCH v3] " Johannes Schindelin via GitGitGadget
2 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-11-11 18:33 UTC (permalink / raw)
To: git; +Cc: Johannes Sixt, Elijah Newren, Johannes Schindelin,
Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
To show a remerge diff, the merge needs to be recreated. For that to
work, the merge base(s) need to be found, which means that the commits'
parents have to be traversed until common ancestors are found (if any).
However, one optimization that hails all the way back to
cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release
the commit's list of parents immediately after showing it. This can break
the merge base computation.
Note that it matters more clearly when traversing the commits in
reverse: In that instance, if a parent of a merge commit has been shown
as part of the `git log` command, by the time the merge commit's diff
needs to be computed, that parent commit's list of parent commits will
have been set to `NULL` and as a result no merge base will be found.
Let's fix this by special-casing the `remerge_diff` mode, similar to
what we did with reflogs in f35650dff6a4 (log: do not free parents when
walking reflog, 2017-07-07).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
log: --remerge-diff needs to keep around commit parents
This fixes a bug that is pretty much as old as the remerge-diff
machinery itself. I noticed it while writing my reply to Hannes Sixt's
concerns about my range-diff --diff-merges=<mode> patch
(https://lore.kernel.org/git/af576487-5de2-fba3-b341-3c082322c9ec@gmx.de/).
Changes since v1:
* Amended the code comment of the affected conditional code block in
harmony with the newly-added condition.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1825%2Fdscho%2Flog-remerge-diff-needs-commit-parents-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1825/dscho/log-remerge-diff-needs-commit-parents-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1825
Range-diff vs v1:
1: 68238315c14 ! 1: afff27f8222 log: --remerge-diff needs to keep around commit parents
@@ builtin/log.c: static int cmd_log_walk_no_free(struct rev_info *rev)
+ if (!rev->reflog_info && !rev->remerge_diff) {
/*
* We may show a given commit multiple times when
- * walking the reflogs.
+- * walking the reflogs.
++ * walking the reflogs. Therefore we still need it.
++ *
++ * Likewise, we potentially still need the parents
++ * of * already shown commits to determine merge
++ * bases when showing remerge diffs.
+ */
+ free_commit_buffer(the_repository->parsed_objects,
+ commit);
## t/t4069-remerge-diff.sh ##
@@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff turns off history simplification' '
builtin/log.c | 8 ++++++--
t/t4069-remerge-diff.sh | 7 +++++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/builtin/log.c b/builtin/log.c
index c0a8bb95e98..2f88a3e607d 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -522,10 +522,14 @@ static int cmd_log_walk_no_free(struct rev_info *rev)
* but we didn't actually show the commit.
*/
rev->max_count++;
- if (!rev->reflog_info) {
+ if (!rev->reflog_info && !rev->remerge_diff) {
/*
* We may show a given commit multiple times when
- * walking the reflogs.
+ * walking the reflogs. Therefore we still need it.
+ *
+ * Likewise, we potentially still need the parents
+ * of * already shown commits to determine merge
+ * bases when showing remerge diffs.
*/
free_commit_buffer(the_repository->parsed_objects,
commit);
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe0..a68c6bfa036 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -317,4 +317,11 @@ test_expect_success 'remerge-diff turns off history simplification' '
test_cmp expect actual
'
+test_expect_success 'remerge-diff with --reverse' '
+ git log -1 --remerge-diff --oneline ab_resolution^ >expect &&
+ git log -1 --remerge-diff --oneline ab_resolution >>expect &&
+ git log -2 --remerge-diff --oneline ab_resolution --reverse >actual &&
+ test_cmp expect actual
+'
+
test_done
base-commit: bea9ecd24b0c3bf06cab4a851694fe09e7e51408
--
gitgitgadget
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] log: --remerge-diff needs to keep around commit parents
2024-11-11 18:32 ` Johannes Schindelin
@ 2024-11-11 22:13 ` Junio C Hamano
0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2024-11-11 22:13 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Johannes Schindelin via GitGitGadget, git, Johannes Sixt,
Elijah Newren
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> I consider these examples less clear, but they are still affected:
>
> git show --remerge-diff v2.45.2^0
> vs
> git show --remerge-diff v2.44.2^0 v2.45.2^0
>
> git show --remerge-diff v2.45.1~1
> vs
> git log --topo-order --first-parent --remerge-diff v2.44.2 v2.45.1
>
> Concretely, these diffs should be empty, but are not:
>
> git diff --no-index \
> <(git show --remerge-diff v2.45.2^0 | sed 1d) \
> <(git show --remerge-diff v2.44.2^0 v2.45.2^0 | sed 1,/^commit/d)
>
> and
>
> git diff --no-index \
> <(git show --remerge-diff v2.45.1~1 | grep -v ^commit) \
> <(git log --topo-order --first-parent -6 --remerge-diff v2.44.2 v2.45.1 |
> sed '1,/^commit 1c00f92eb5ee4a48ab615eefa41f2dd6024d43bc/d;/^commit/,$d')
>
> No `--reverse` required, not even clock skew.
As we often tell contributors, questions in reviewer comments should
be also answered in an updated patch, so that future readers of "git
log", who cannot ask direct questions to the author of the patch, do
not have to ask the same questions. Can we add an explanation how
this affects forward traversal in a three-line paragraph?
Thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v3] log: --remerge-diff needs to keep around commit parents
2024-11-11 18:33 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
@ 2024-12-12 10:29 ` Johannes Schindelin via GitGitGadget
2024-12-13 15:02 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-12-12 10:29 UTC (permalink / raw)
To: git; +Cc: Johannes Sixt, Elijah Newren, Johannes Schindelin,
Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
To show a remerge diff, the merge needs to be recreated. For that to
work, the merge base(s) need to be found, which means that the commits'
parents have to be traversed until common ancestors are found (if any).
However, one optimization that hails all the way back to cb115748ec0d
(Some more memory leak avoidance, 2006-06-17) is to release the commit's
list of parents immediately after showing it _and to set that parent
list to `NULL`_. This can break the merge base computation.
This problem is most obvious when traversing the commits in reverse: In
that instance, if a parent of a merge commit has been shown as part of
the `git log` command, by the time the merge commit's diff needs to be
computed, that parent commit's list of parent commits will have been set
to `NULL` and as a result no merge base will be found (even if one
should be found).
Traversing commits in reverse is far from the only circumstance in which
this problem occurs, though. There are many avenues to traversing at
least one commit in the revision walk that will later be part of a merge
base computation, for example when not even walking any revisions in
`git show <merge1> <merge2>` where `<merge1>` is part of the commit
graph between the parents of `<merge2>`.
Another way to force a scenario where a commit is traversed before it
has to be traversed again as part of a merge base computation is to
start with two revisions (where the first one is reachable from the
second but not in a first-parent ancestry) and show the commit log with
`--topo-order` and `--first-parent`.
Let's fix this by special-casing the `remerge_diff` mode, similar to
what we did with reflogs in f35650dff6a4 (log: do not free parents when
walking reflog, 2017-07-07).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
log: --remerge-diff needs to keep around commit parents
This fixes a bug that is pretty much as old as the remerge-diff
machinery itself. I noticed it while writing my reply to Hannes Sixt's
concerns about my range-diff --diff-merges=<mode> patch
(https://lore.kernel.org/git/af576487-5de2-fba3-b341-3c082322c9ec@gmx.de/).
Changes since v2:
* Amended the commit message to illustrate why reverse walks are not
the only instances where this fix is relevant.
Changes since v1:
* Amended the code comment of the affected conditional code block in
harmony with the newly-added condition.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1825%2Fdscho%2Flog-remerge-diff-needs-commit-parents-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1825/dscho/log-remerge-diff-needs-commit-parents-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1825
Range-diff vs v2:
1: afff27f8222 ! 1: ab75a565b4c log: --remerge-diff needs to keep around commit parents
@@ Commit message
work, the merge base(s) need to be found, which means that the commits'
parents have to be traversed until common ancestors are found (if any).
- However, one optimization that hails all the way back to
- cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release
- the commit's list of parents immediately after showing it. This can break
- the merge base computation.
-
- Note that it matters more clearly when traversing the commits in
- reverse: In that instance, if a parent of a merge commit has been shown
- as part of the `git log` command, by the time the merge commit's diff
- needs to be computed, that parent commit's list of parent commits will
- have been set to `NULL` and as a result no merge base will be found.
+ However, one optimization that hails all the way back to cb115748ec0d
+ (Some more memory leak avoidance, 2006-06-17) is to release the commit's
+ list of parents immediately after showing it _and to set that parent
+ list to `NULL`_. This can break the merge base computation.
+
+ This problem is most obvious when traversing the commits in reverse: In
+ that instance, if a parent of a merge commit has been shown as part of
+ the `git log` command, by the time the merge commit's diff needs to be
+ computed, that parent commit's list of parent commits will have been set
+ to `NULL` and as a result no merge base will be found (even if one
+ should be found).
+
+ Traversing commits in reverse is far from the only circumstance in which
+ this problem occurs, though. There are many avenues to traversing at
+ least one commit in the revision walk that will later be part of a merge
+ base computation, for example when not even walking any revisions in
+ `git show <merge1> <merge2>` where `<merge1>` is part of the commit
+ graph between the parents of `<merge2>`.
+
+ Another way to force a scenario where a commit is traversed before it
+ has to be traversed again as part of a merge base computation is to
+ start with two revisions (where the first one is reachable from the
+ second but not in a first-parent ancestry) and show the commit log with
+ `--topo-order` and `--first-parent`.
Let's fix this by special-casing the `remerge_diff` mode, similar to
what we did with reflogs in f35650dff6a4 (log: do not free parents when
builtin/log.c | 8 ++++++--
t/t4069-remerge-diff.sh | 7 +++++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/builtin/log.c b/builtin/log.c
index c0a8bb95e98..2f88a3e607d 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -522,10 +522,14 @@ static int cmd_log_walk_no_free(struct rev_info *rev)
* but we didn't actually show the commit.
*/
rev->max_count++;
- if (!rev->reflog_info) {
+ if (!rev->reflog_info && !rev->remerge_diff) {
/*
* We may show a given commit multiple times when
- * walking the reflogs.
+ * walking the reflogs. Therefore we still need it.
+ *
+ * Likewise, we potentially still need the parents
+ * of * already shown commits to determine merge
+ * bases when showing remerge diffs.
*/
free_commit_buffer(the_repository->parsed_objects,
commit);
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe0..a68c6bfa036 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -317,4 +317,11 @@ test_expect_success 'remerge-diff turns off history simplification' '
test_cmp expect actual
'
+test_expect_success 'remerge-diff with --reverse' '
+ git log -1 --remerge-diff --oneline ab_resolution^ >expect &&
+ git log -1 --remerge-diff --oneline ab_resolution >>expect &&
+ git log -2 --remerge-diff --oneline ab_resolution --reverse >actual &&
+ test_cmp expect actual
+'
+
test_done
base-commit: bea9ecd24b0c3bf06cab4a851694fe09e7e51408
--
gitgitgadget
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v3] log: --remerge-diff needs to keep around commit parents
2024-12-12 10:29 ` [PATCH v3] " Johannes Schindelin via GitGitGadget
@ 2024-12-13 15:02 ` Junio C Hamano
0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2024-12-13 15:02 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, Johannes Sixt, Elijah Newren, Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> This problem is most obvious when traversing the commits in reverse: In
> that instance, if a parent of a merge commit has been shown as part of
> the `git log` command, by the time the merge commit's diff needs to be
> computed, that parent commit's list of parent commits will have been set
> to `NULL` and as a result no merge base will be found (even if one
> should be found).
>
> Traversing commits in reverse is far from the only circumstance in which
> this problem occurs, though. There are many avenues to traversing at
> least one commit in the revision walk that will later be part of a merge
> base computation, for example when not even walking any revisions in
> `git show <merge1> <merge2>` where `<merge1>` is part of the commit
> graph between the parents of `<merge2>`.
>
> Another way to force a scenario where a commit is traversed before it
> has to be traversed again as part of a merge base computation is to
> start with two revisions (where the first one is reachable from the
> second but not in a first-parent ancestry) and show the commit log with
> `--topo-order` and `--first-parent`.
>
> Let's fix this by special-casing the `remerge_diff` mode, similar to
> what we did with reflogs in f35650dff6a4 (log: do not free parents when
> walking reflog, 2017-07-07).
Nicely described.
> This fixes a bug that is pretty much as old as the remerge-diff
> machinery itself. I noticed it while writing my reply to Hannes Sixt's
> concerns about my range-diff --diff-merges=<mode> patch
> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> index 07323ebafe0..a68c6bfa036 100755
> --- a/t/t4069-remerge-diff.sh
> +++ b/t/t4069-remerge-diff.sh
> @@ -317,4 +317,11 @@ test_expect_success 'remerge-diff turns off history simplification' '
> test_cmp expect actual
> '
>
> +test_expect_success 'remerge-diff with --reverse' '
> + git log -1 --remerge-diff --oneline ab_resolution^ >expect &&
> + git log -1 --remerge-diff --oneline ab_resolution >>expect &&
> + git log -2 --remerge-diff --oneline ab_resolution --reverse >actual &&
> + test_cmp expect actual
> +'
> +
Will queue. Let me mark it for 'next'.
Thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-12-13 15:03 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-08 13:43 [PATCH] log: --remerge-diff needs to keep around commit parents Johannes Schindelin via GitGitGadget
2024-11-08 15:47 ` Elijah Newren
2024-11-11 1:52 ` Junio C Hamano
2024-11-11 1:46 ` Junio C Hamano
2024-11-11 18:32 ` Johannes Schindelin
2024-11-11 22:13 ` Junio C Hamano
2024-11-11 18:33 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2024-12-12 10:29 ` [PATCH v3] " Johannes Schindelin via GitGitGadget
2024-12-13 15:02 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).