public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Amisha Chhajed <amishhhaaaa@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <stolee@gmail.com>,
	Elijah Newren <newren@gmail.com>, Jeff King <peff@peff.net>,
	Amisha Chhajed <amishhhaaaa@gmail.com>
Subject: [PATCH v7] sparse-checkout: optimize string_list construction and add tests to verify deduplication.
Date: Wed, 21 Jan 2026 18:30:05 +0530	[thread overview]
Message-ID: <20260121130005.72375-1-amishhhaaaa@gmail.com> (raw)
In-Reply-To: <20260120153829.48044-1-amishhhaaaa@gmail.com>

Improve O(n^2) complexity to O(n log n) while building a sorted
'string_list' by constructing it unsorted then sorting it
followed by removing duplicates.

sparse-checkout deduplicates repeated cone-mode patterns,
but this behaviour was previously untested, add tests that
verify that sparse-checkout file contain each cone
pattern only once and sparse-checkout list reports each pattern
only once.

Signed-off-by: Amisha Chhajed <amishhhaaaa@gmail.com>
---
 builtin/sparse-checkout.c          |  7 +++--
 t/t1091-sparse-checkout-builtin.sh | 48 ++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 15d51e60a8..7dfb276bf0 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -91,10 +91,11 @@ static int sparse_checkout_list(int argc, const char **argv, const char *prefix,
 
 		hashmap_for_each_entry(&pl.recursive_hashmap, &iter, pe, ent) {
 			/* pe->pattern starts with "/", skip it */
-			string_list_insert(&sl, pe->pattern + 1);
+			string_list_append(&sl, pe->pattern + 1);
 		}
 
 		string_list_sort(&sl);
+		string_list_remove_duplicates(&sl, 0);
 
 		for (i = 0; i < sl.nr; i++) {
 			quote_c_style(sl.items[i].string, NULL, stdout, 0);
@@ -289,7 +290,7 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 		if (!hashmap_contains_parent(&pl->recursive_hashmap,
 					     pe->pattern,
 					     &parent_pattern))
-			string_list_insert(&sl, pe->pattern);
+			string_list_append(&sl, pe->pattern);
 	}
 
 	string_list_sort(&sl);
@@ -311,7 +312,7 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 		if (!hashmap_contains_parent(&pl->recursive_hashmap,
 					     pe->pattern,
 					     &parent_pattern))
-			string_list_insert(&sl, pe->pattern);
+			string_list_append(&sl, pe->pattern);
 	}
 
 	strbuf_release(&parent_pattern);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index b2da4feaef..cd0aed9975 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -817,6 +817,54 @@ test_expect_success 'cone mode clears ignored subdirectories' '
 	test_cmp expect out
 '
 
+test_expect_success 'sparse-checkout deduplicates repeated cone patterns' '
+	rm -f repo/.git/info/sparse-checkout &&
+	git -C repo sparse-checkout init --cone &&
+	git -C repo sparse-checkout add --stdin <<-\EOF &&
+	foo/bar/baz
+	a/b/c
+	foo/bar/baz
+	a/b
+	EOF
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/a/
+	!/a/*/
+	/foo/
+	!/foo/*/
+	/foo/bar/
+	!/foo/bar/*/
+	/a/b/
+	/foo/bar/baz/
+	EOF
+	test_cmp expect repo/.git/info/sparse-checkout
+'
+
+test_expect_success 'sparse-checkout list deduplicates repeated cone patterns' '
+	rm -f repo/.git/info/sparse-checkout &&
+	git -C repo sparse-checkout init --cone &&
+	cat <<-\EOF >repo/.git/info/sparse-checkout &&
+	/*
+	!/*/
+	/a/
+	!/a/*/
+	/foo/
+	!/foo/*/
+	/foo/bar/
+	!/foo/bar/*/
+	/a/b/
+	/foo/bar/baz/
+	/foo/bar/baz/
+	EOF
+	git -C repo sparse-checkout list >actual &&
+	cat <<-\EOF >expect &&
+	a/b
+	foo/bar/baz
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'malformed cone-mode patterns' '
 	git -C repo sparse-checkout init --cone &&
 	mkdir -p repo/foo/bar &&
-- 
2.51.0


  parent reply	other threads:[~2026-01-21 13:00 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-14 19:28 [PATCH] sparse-checkout: optimize string_list construction amisha
2026-01-14 21:35 ` Jeff King
2026-01-18  2:39   ` Derrick Stolee
2026-01-15 12:56 ` [PATCH v2] " amisha
2026-01-15 13:09 ` [PATCH v3] " amisha
2026-01-15 13:15   ` Amisha Chhajed
2026-01-15 20:09     ` Jeff King
2026-01-16 17:03       ` Amisha Chhajed
2026-01-15 22:26     ` René Scharfe
2026-01-16  8:30       ` Amisha Chhajed
2026-01-16 17:17         ` Junio C Hamano
2026-01-18  2:46           ` Derrick Stolee
2026-01-18 13:09             ` Amisha Chhajed
2026-01-15 13:55   ` Junio C Hamano
2026-01-16 16:50 ` [PATCH] " amisha
2026-01-16 19:11   ` Junio C Hamano
2026-01-18 13:07     ` Amisha Chhajed
2026-01-19  5:32     ` Jeff King
2026-01-19 19:06       ` Junio C Hamano
2026-01-19 12:33 ` [PATCH v5 1/2] " amisha
2026-01-19 17:04   ` Derrick Stolee
2026-01-19 18:33     ` Pushkar Singh
2026-01-20 15:47     ` Amisha Chhajed
2026-01-20 15:38 ` [PATCH v6] sparse-checkout: optimize string_list construction and add tests to verify deduplication amisha
2026-01-20 20:37   ` Derrick Stolee
2026-01-21 13:00   ` Amisha Chhajed [this message]
2026-01-21 16:28     ` [PATCH v7] " Derrick Stolee
2026-01-21 16:51       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260121130005.72375-1-amishhhaaaa@gmail.com \
    --to=amishhhaaaa@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox