From: Stepan Kasal <kasal@ucw.cz>
To: Jeff King <peff@peff.net>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
git@vger.kernel.org,
Jean-Jacques Lafay <jeanjacques.lafay@gmail.com>
Subject: [PATCH v2] git tag --contains : avoid stack overflow
Date: Wed, 23 Apr 2014 09:53:25 +0200 [thread overview]
Message-ID: <20140423075325.GA7268@camelia.ucw.cz> (raw)
In-Reply-To: <20140417215817.GA822@sigill.intra.peff.net>
From: Jean-Jacques Lafay <jeanjacques.lafay@gmail.com>
In large repos, the recursion implementation of contains(commit,
commit_list) may result in a stack overflow. Replace the recursion with
a loop to fix it.
This problem is more apparent on Windows than on Linux, where the stack
is more limited by default.
See also this thread on the msysGit list:
https://groups.google.com/d/topic/msysgit/FqT6boJrb2g/discussion
[jes: re-written to imitate the original recursion more closely]
Thomas Braun pointed out several documentation shortcomings.
Tests are run only if ulimit -s is available. This means they cannot
be run on Windows.
Signed-off-by: Jean-Jacques Lafay <jeanjacques.lafay@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Tested-by: Stepan Kasal <kasal@ucw.cz>
---
Hello,
I have found out that "ulimit -s" does not work on Windows.
Adding this as a prerequisite, we will skip the test there.
On Thu, Apr 17, 2014 at 05:32:38PM -0400, Jeff King wrote:
> So we can bump the depth further; probably 4000 is enough for any system
> to fail with a 64k stack. The deeper we make it, the longer it takes to
> run the test, though. At 4000, my machine seems to take about 300ms to
> run it. That's may be OK.
Consequently, we can accept this proposal.
Stepan Kasal
builtin/tag.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++----------
t/t7004-tag.sh | 23 +++++++++++++++
2 files changed, 98 insertions(+), 15 deletions(-)
diff --git a/builtin/tag.c b/builtin/tag.c
index 6c7c6bd..f344002 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -80,11 +80,19 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
return 0;
}
-static int contains_recurse(struct commit *candidate,
+enum contains_result {
+ CONTAINS_UNKNOWN = -1,
+ CONTAINS_NO = 0,
+ CONTAINS_YES = 1,
+};
+
+/*
+ * Test whether the candidate or one of its parents is contained in the list.
+ * Do not recurse to find out, though, but return -1 if inconclusive.
+ */
+static enum contains_result contains_test(struct commit *candidate,
const struct commit_list *want)
{
- struct commit_list *p;
-
/* was it previously marked as containing a want commit? */
if (candidate->object.flags & TMP_MARK)
return 1;
@@ -92,26 +100,78 @@ static int contains_recurse(struct commit *candidate,
if (candidate->object.flags & UNINTERESTING)
return 0;
/* or are we it? */
- if (in_commit_list(want, candidate))
+ if (in_commit_list(want, candidate)) {
+ candidate->object.flags |= TMP_MARK;
return 1;
+ }
if (parse_commit(candidate) < 0)
return 0;
- /* Otherwise recurse and mark ourselves for future traversals. */
- for (p = candidate->parents; p; p = p->next) {
- if (contains_recurse(p->item, want)) {
- candidate->object.flags |= TMP_MARK;
- return 1;
- }
- }
- candidate->object.flags |= UNINTERESTING;
- return 0;
+ return -1;
}
-static int contains(struct commit *candidate, const struct commit_list *want)
+/*
+ * Mimicking the real stack, this stack lives on the heap, avoiding stack
+ * overflows.
+ *
+ * At each recursion step, the stack items points to the commits whose
+ * ancestors are to be inspected.
+ */
+struct stack {
+ int nr, alloc;
+ struct stack_entry {
+ struct commit *commit;
+ struct commit_list *parents;
+ } *stack;
+};
+
+static void push_to_stack(struct commit *candidate, struct stack *stack)
+{
+ int index = stack->nr++;
+ ALLOC_GROW(stack->stack, stack->nr, stack->alloc);
+ stack->stack[index].commit = candidate;
+ stack->stack[index].parents = candidate->parents;
+}
+
+static enum contains_result contains(struct commit *candidate,
+ const struct commit_list *want)
{
- return contains_recurse(candidate, want);
+ struct stack stack = { 0, 0, NULL };
+ int result = contains_test(candidate, want);
+
+ if (result != CONTAINS_UNKNOWN)
+ return result;
+
+ push_to_stack(candidate, &stack);
+ while (stack.nr) {
+ struct stack_entry *entry = &stack.stack[stack.nr - 1];
+ struct commit *commit = entry->commit;
+ struct commit_list *parents = entry->parents;
+
+ if (!parents) {
+ commit->object.flags |= UNINTERESTING;
+ stack.nr--;
+ }
+ /*
+ * If we just popped the stack, parents->item has been marked,
+ * therefore contains_test will return a meaningful 0 or 1.
+ */
+ else switch (contains_test(parents->item, want)) {
+ case CONTAINS_YES:
+ commit->object.flags |= TMP_MARK;
+ stack.nr--;
+ break;
+ case CONTAINS_NO:
+ entry->parents = parents->next;
+ break;
+ case CONTAINS_UNKNOWN:
+ push_to_stack(parents->item, &stack);
+ break;
+ }
+ }
+ free(stack.stack);
+ return contains_test(candidate, want);
}
static void show_tag_lines(const unsigned char *sha1, int lines)
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 143a8ea..db82f6d 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1423,4 +1423,27 @@ EOF
test_cmp expect actual
'
+ulimit_stack="ulimit -s 64"
+test_lazy_prereq ULIMIT 'bash -c "'"$ulimit_stack"'"'
+
+>expect
+# we require bash and ulimit, this excludes Windows
+test_expect_success ULIMIT '--contains works in a deep repo' '
+ i=1 &&
+ while test $i -lt 4000
+ do
+ echo "commit refs/heads/master
+committer A U Thor <author@example.com> $((1000000000 + $i * 100)) +0200
+data <<EOF
+commit #$i
+EOF"
+ test $i = 1 && echo "from refs/heads/master^0"
+ i=$(($i + 1))
+ done | git fast-import &&
+ git checkout master &&
+ git tag far-far-away HEAD^ &&
+ bash -c "'"$ulimit_stack"' && git tag --contains HEAD >actual" &&
+ test_cmp expect actual
+'
+
test_done
--
1.9.2.msysgit.0.158.g49754fb
next prev parent reply other threads:[~2014-04-23 7:53 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-16 14:15 [PATCH] git tag --contains : avoid stack overflow Stepan Kasal
2014-04-16 15:46 ` Jeff King
2014-04-17 17:31 ` Johannes Schindelin
2014-04-17 21:32 ` Jeff King
2014-04-17 21:52 ` Johannes Schindelin
2014-04-17 21:58 ` Jeff King
2014-04-23 7:53 ` Stepan Kasal [this message]
2014-04-23 14:28 ` [PATCH v2] " Johannes Schindelin
2014-04-23 15:45 ` Stepan Kasal
2014-04-23 19:12 ` Junio C Hamano
2014-04-23 19:16 ` Jeff King
2014-04-23 20:48 ` Junio C Hamano
2014-04-23 20:55 ` Jeff King
2014-04-23 21:05 ` Junio C Hamano
2014-04-24 12:20 ` Stepan Kasal
2014-04-24 12:24 ` [PATCH v3] git tag --contains: " Stepan Kasal
2014-04-25 5:54 ` Jeff King
2014-09-20 18:18 ` Andreas Schwab
2014-09-23 16:05 ` Jeff King
2014-09-23 21:48 ` Andreas Schwab
2014-09-23 22:41 ` Junio C Hamano
2014-04-23 19:59 ` [PATCH v2] git tag --contains : " Stepan Kasal
2014-04-23 19:17 ` Jeff King
2014-04-23 21:14 ` Johannes Schindelin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140423075325.GA7268@camelia.ucw.cz \
--to=kasal@ucw.cz \
--cc=Johannes.Schindelin@gmx.de \
--cc=git@vger.kernel.org \
--cc=jeanjacques.lafay@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).