From: Tamir Duberstein <tamird@gmail.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Karthik Nayak <karthik.188@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Victoria Dye <vdye@github.com>,
Derrick Stolee <stolee@gmail.com>,
Elijah Newren <newren@gmail.com>,
Kristofer Karlsson <krka@spotify.com>,
Tamir Duberstein <tamird@gmail.com>
Subject: [PATCH v3 1/3] commit-reach: handle cycles in contains walk
Date: Thu, 11 Jun 2026 20:00:13 -0700 [thread overview]
Message-ID: <20260611-ref-filter-memoized-contains-v3-1-b26af3dba285@gmail.com> (raw)
In-Reply-To: <20260611-ref-filter-memoized-contains-v3-0-b26af3dba285@gmail.com>
The memoized contains traversal used by git tag assumes that commit
ancestry is acyclic. Replacement refs can violate that assumption,
causing it to keep pushing an already active commit until memory is
exhausted.
Mark commits while they are active. If the traversal encounters an
active commit, discard the cache and retry the candidate with the
cycle-safe reachability walk. Cache the candidate's result so a later
walk that reaches it can reuse the answer. Die if the fallback cannot
read ancestry.
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
commit-reach.c | 29 +++++++++++++++++++++++++----
commit-reach.h | 3 ++-
t/t7004-tag.sh | 21 +++++++++++++++++++++
3 files changed, 48 insertions(+), 5 deletions(-)
diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..1d34d66fe8 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -708,7 +708,8 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
/*
* Test whether the candidate is contained in the list.
- * Do not recurse to find out, though, but return -1 if inconclusive.
+ * Do not recurse to find out, though, but return CONTAINS_UNKNOWN if
+ * inconclusive.
*/
static enum contains_result contains_test(struct commit *candidate,
const struct commit_list *want,
@@ -744,7 +745,7 @@ static void push_to_contains_stack(struct commit *candidate, struct contains_sta
}
static enum contains_result contains_tag_algo(struct commit *candidate,
- const struct commit_list *want,
+ struct commit_list *want,
struct contains_cache *cache)
{
struct contains_stack contains_stack = { 0, 0, NULL };
@@ -765,6 +766,7 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
if (result != CONTAINS_UNKNOWN)
return result;
+ *contains_cache_at(cache, candidate) = CONTAINS_IN_PROGRESS;
push_to_contains_stack(candidate, &contains_stack);
while (contains_stack.nr) {
struct contains_stack_entry *entry = &contains_stack.contains_stack[contains_stack.nr - 1];
@@ -776,8 +778,8 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
contains_stack.nr--;
}
/*
- * If we just popped the stack, parents->item has been marked,
- * therefore contains_test will return a meaningful yes/no.
+ * A parent may have just been popped and marked, or may still
+ * be active when replacement refs create a cycle.
*/
else switch (contains_test(parents->item, want, cache, cutoff)) {
case CONTAINS_YES:
@@ -787,13 +789,32 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
case CONTAINS_NO:
entry->parents = parents->next;
break;
+ case CONTAINS_IN_PROGRESS:
+ /*
+ * Partial negative answers are not safe across a cycle.
+ * Discard them and use the cycle-safe reachability walk.
+ */
+ goto cycle;
case CONTAINS_UNKNOWN:
+ *contains_cache_at(cache, parents->item) =
+ CONTAINS_IN_PROGRESS;
push_to_contains_stack(parents->item, &contains_stack);
break;
}
}
free(contains_stack.contains_stack);
return contains_test(candidate, want, cache, cutoff);
+
+cycle:
+ free(contains_stack.contains_stack);
+ clear_contains_cache(cache);
+
+ result = repo_is_descendant_of(the_repository, candidate, want);
+ if (result < 0)
+ die(_("failed to check reachability after detecting a cycle"));
+ *contains_cache_at(cache, candidate) =
+ result ? CONTAINS_YES : CONTAINS_NO;
+ return result ? CONTAINS_YES : CONTAINS_NO;
}
int commit_contains(struct ref_filter *filter, struct commit *commit,
diff --git a/commit-reach.h b/commit-reach.h
index 3f3a563d8a..f908d305b1 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -73,7 +73,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
enum contains_result {
CONTAINS_UNKNOWN = 0,
CONTAINS_NO,
- CONTAINS_YES
+ CONTAINS_YES,
+ CONTAINS_IN_PROGRESS
};
define_commit_slab(contains_cache, enum contains_result);
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index d918005dd9..4044bab006 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1611,6 +1611,27 @@ test_expect_success 'checking that first commit is in all tags (hash)' '
test_cmp expected actual
'
+test_expect_success 'tag --contains handles cyclic replacement histories' '
+ first=$(git rev-parse HEAD~2) &&
+ second=$(git rev-parse HEAD~) &&
+ third=$(git rev-parse HEAD) &&
+ test_when_finished "
+ git replace -d $first &&
+ git replace -d $third &&
+ git tag -d cycle-a cycle-b
+ " &&
+ git tag cycle-a "$first" &&
+ git tag cycle-b "$third" &&
+ git replace --graft "$first" "$third" "$second" &&
+ git replace --graft "$third" "$first" &&
+ cat >expected <<-\EOF &&
+ cycle-a
+ cycle-b
+ EOF
+ git tag --contains="$second" --list "cycle-*" >actual &&
+ test_cmp expected actual
+'
+
# other ways of specifying the commit
test_expect_success 'checking that first commit is in all tags (tag)' '
cat >expected <<-\EOF &&
--
2.54.0.548.gbe7bb2469c
next prev parent reply other threads:[~2026-06-12 3:00 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-09 2:36 [PATCH v2 0/2] Reuse --contains traversal results Tamir Duberstein
2026-06-09 2:36 ` [PATCH v2 1/2] commit-reach: handle cycles in contains walk Tamir Duberstein
2026-06-11 7:29 ` Jeff King
2026-06-12 2:40 ` Tamir Duberstein
2026-06-09 2:36 ` [PATCH v2 2/2] ref-filter: memoize --contains with generations Tamir Duberstein
2026-06-10 11:47 ` Karthik Nayak
2026-06-10 12:20 ` Tamir Duberstein
2026-06-11 8:16 ` Karthik Nayak
2026-06-11 20:10 ` Tamir Duberstein
2026-06-11 8:22 ` Jeff King
2026-06-12 2:40 ` Tamir Duberstein
2026-06-12 3:00 ` [PATCH v3 0/3] Reuse --contains traversal results Tamir Duberstein
2026-06-12 3:00 ` Tamir Duberstein [this message]
2026-06-12 6:53 ` [PATCH v3 1/3] commit-reach: handle cycles in contains walk Kristofer Karlsson
2026-06-12 21:26 ` Tamir Duberstein
2026-06-12 3:00 ` [PATCH v3 2/3] ref-filter: memoize --contains with generations Tamir Duberstein
2026-06-12 3:00 ` [PATCH v3 3/3] commit-reach: die on contains walk errors Tamir Duberstein
2026-06-12 21:49 ` [PATCH v4 0/3] Reuse --contains traversal results Tamir Duberstein
2026-06-12 21:49 ` [PATCH v4 1/3] commit-reach: reject cycles in contains walk Tamir Duberstein
2026-06-12 21:49 ` [PATCH v4 2/3] ref-filter: memoize --contains with generations Tamir Duberstein
2026-06-12 21:49 ` [PATCH v4 3/3] commit-reach: die on contains walk errors Tamir Duberstein
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260611-ref-filter-memoized-contains-v3-1-b26af3dba285@gmail.com \
--to=tamird@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
--cc=krka@spotify.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox