All of lore.kernel.org
 help / color / mirror / Atom feed
From: amisha <amishhhaaaa@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <stolee@gmail.com>,
	Elijah Newren <newren@gmail.com>, Jeff King <peff@peff.net>,
	amisha <amishhhaaaa@gmail.com>
Subject: [PATCH v6] sparse-checkout: optimize string_list construction and add tests to verify deduplication.
Date: Tue, 20 Jan 2026 21:08:29 +0530	[thread overview]
Message-ID: <20260120153829.48044-1-amishhhaaaa@gmail.com> (raw)
In-Reply-To: <20260114192803.4852-1-amishhhaaaa@gmail.com>

From: Amisha Chhajed <amishhhaaaa@gmail.com>

Improve O(n^2) complexity to O(n log n) while building a sorted
'string_list' by constructing it unsorted then sorting it
followed by removing duplicates.

sparse-checkout deduplicates repeated cone-mode patterns,
but this behaviour was previously untested, add tests that
verify that sparse-checkout file contain each cone
pattern only once and sparse-checkout list reports each pattern
only once.

Signed-off-by: Amisha Chhajed <amishhhaaaa@gmail.com>
---
 builtin/sparse-checkout.c          |  7 +++--
 t/t1091-sparse-checkout-builtin.sh | 48 ++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 15d51e60a8..7dfb276bf0 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -91,10 +91,11 @@ static int sparse_checkout_list(int argc, const char **argv, const char *prefix,
 
 		hashmap_for_each_entry(&pl.recursive_hashmap, &iter, pe, ent) {
 			/* pe->pattern starts with "/", skip it */
-			string_list_insert(&sl, pe->pattern + 1);
+			string_list_append(&sl, pe->pattern + 1);
 		}
 
 		string_list_sort(&sl);
+		string_list_remove_duplicates(&sl, 0);
 
 		for (i = 0; i < sl.nr; i++) {
 			quote_c_style(sl.items[i].string, NULL, stdout, 0);
@@ -289,7 +290,7 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 		if (!hashmap_contains_parent(&pl->recursive_hashmap,
 					     pe->pattern,
 					     &parent_pattern))
-			string_list_insert(&sl, pe->pattern);
+			string_list_append(&sl, pe->pattern);
 	}
 
 	string_list_sort(&sl);
@@ -311,7 +312,7 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 		if (!hashmap_contains_parent(&pl->recursive_hashmap,
 					     pe->pattern,
 					     &parent_pattern))
-			string_list_insert(&sl, pe->pattern);
+			string_list_append(&sl, pe->pattern);
 	}
 
 	strbuf_release(&parent_pattern);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index b2da4feaef..ede5c6b3f3 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -817,6 +817,54 @@ test_expect_success 'cone mode clears ignored subdirectories' '
 	test_cmp expect out
 '
 
+test_expect_success 'sparse-checkout deduplicates repeated cone patterns' '
+    rm -f repo/.git/info/sparse-checkout &&
+    git -C repo sparse-checkout init --cone &&
+    git -C repo sparse-checkout add --stdin <<-\EOF &&
+	foo/bar/baz
+	a/b/c
+	foo/bar/baz
+	a/b
+	EOF
+    cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/a/
+	!/a/*/
+	/foo/
+	!/foo/*/
+	/foo/bar/
+	!/foo/bar/*/
+	/a/b/
+	/foo/bar/baz/
+	EOF
+    test_cmp expect repo/.git/info/sparse-checkout
+'
+
+test_expect_success 'sparse-checkout list deduplicates repeated cone patterns' '
+    rm -f repo/.git/info/sparse-checkout &&
+    git -C repo sparse-checkout init --cone &&
+    cat <<-\EOF >repo/.git/info/sparse-checkout &&
+	/*
+	!/*/
+	/a/
+	!/a/*/
+	/foo/
+	!/foo/*/
+	/foo/bar/
+	!/foo/bar/*/
+	/a/b/
+	/foo/bar/baz/
+	/foo/bar/baz/
+	EOF
+    git -C repo sparse-checkout list >actual &&
+    cat <<-\EOF >expect &&
+	a/b
+	foo/bar/baz
+	EOF
+    test_cmp expect actual
+'
+
 test_expect_success 'malformed cone-mode patterns' '
 	git -C repo sparse-checkout init --cone &&
 	mkdir -p repo/foo/bar &&
-- 
2.51.0


  parent reply	other threads:[~2026-01-20 15:38 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-14 19:28 [PATCH] sparse-checkout: optimize string_list construction amisha
2026-01-14 21:35 ` Jeff King
2026-01-18  2:39   ` Derrick Stolee
2026-01-15 12:56 ` [PATCH v2] " amisha
2026-01-15 13:09 ` [PATCH v3] " amisha
2026-01-15 13:15   ` Amisha Chhajed
2026-01-15 20:09     ` Jeff King
2026-01-16 17:03       ` Amisha Chhajed
2026-01-15 22:26     ` René Scharfe
2026-01-16  8:30       ` Amisha Chhajed
2026-01-16 17:17         ` Junio C Hamano
2026-01-18  2:46           ` Derrick Stolee
2026-01-18 13:09             ` Amisha Chhajed
2026-01-15 13:55   ` Junio C Hamano
2026-01-16 16:50 ` [PATCH] " amisha
2026-01-16 19:11   ` Junio C Hamano
2026-01-18 13:07     ` Amisha Chhajed
2026-01-19  5:32     ` Jeff King
2026-01-19 19:06       ` Junio C Hamano
2026-01-19 12:33 ` [PATCH v5 1/2] " amisha
2026-01-19 17:04   ` Derrick Stolee
2026-01-19 18:33     ` Pushkar Singh
2026-01-20 15:47     ` Amisha Chhajed
2026-01-20 15:38 ` amisha [this message]
2026-01-20 20:37   ` [PATCH v6] sparse-checkout: optimize string_list construction and add tests to verify deduplication Derrick Stolee
2026-01-21 13:00   ` [PATCH v7] " Amisha Chhajed
2026-01-21 16:28     ` Derrick Stolee
2026-01-21 16:51       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260120153829.48044-1-amishhhaaaa@gmail.com \
    --to=amishhhaaaa@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.