public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: amisha <amishhhaaaa@gmail.com>
Cc: git@vger.kernel.org,  stolee@gmail.com,  newren@gmail.com,
	 peff@peff.net
Subject: Re: [PATCH] sparse-checkout: optimize string_list construction
Date: Fri, 16 Jan 2026 11:11:16 -0800	[thread overview]
Message-ID: <xmqqqzrp74q3.fsf@gitster.g> (raw)
In-Reply-To: <20260116165003.95314-1-amishhhaaaa@gmail.com> (amisha's message of "Fri, 16 Jan 2026 22:20:03 +0530")

amisha <amishhhaaaa@gmail.com> writes:

> Subject: Re: [PATCH] sparse-checkout: optimize string_list construction

It would have been nice to see [PATCH v2] or whatever that signals
that there is an earlier iteration.

> From: Amisha Chhajed <amishhhaaaa@gmail.com>
>
> Improve O(n^2) complexity to O(n log n) while building a sorted
> 'string_list' by constructing it unsorted then sorting it
> followed by removing duplicates.

By the way, do we have t/perf/ that substanticates the performance
claim here (in other words, how much improvement are we expecting in
practice)?

Also, have you found out why the previous round that did not remove
duplicates saw no failed tests?  Perhaps it is a good idea to add
some test that would notice if we failed to add calls to
remove_duplicates in this patch?

This is an unrelated tangent, a possible #leftoverbits material, but
should not be part of this patch (or even in the same series as this
patch).  I notice that string_list_remove_duplicates() almost always
immediately follow a call to string_list_sort() of the same
instance, which makes me wonder if we would be better off if we had
a variant of string_list_sort(), and call it string_list_sort_u().

Thanks.

> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> index 15d51e60a8..7dfb276bf0 100644
> --- a/builtin/sparse-checkout.c
> +++ b/builtin/sparse-checkout.c
> @@ -91,10 +91,11 @@ static int sparse_checkout_list(int argc, const char **argv, const char *prefix,
>  
>  		hashmap_for_each_entry(&pl.recursive_hashmap, &iter, pe, ent) {
>  			/* pe->pattern starts with "/", skip it */
> -			string_list_insert(&sl, pe->pattern + 1);
> +			string_list_append(&sl, pe->pattern + 1);
>  		}
>  
>  		string_list_sort(&sl);
> +		string_list_remove_duplicates(&sl, 0);
>  
>  		for (i = 0; i < sl.nr; i++) {
>  			quote_c_style(sl.items[i].string, NULL, stdout, 0);
> @@ -289,7 +290,7 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
>  		if (!hashmap_contains_parent(&pl->recursive_hashmap,
>  					     pe->pattern,
>  					     &parent_pattern))
> -			string_list_insert(&sl, pe->pattern);
> +			string_list_append(&sl, pe->pattern);
>  	}
>  
>  	string_list_sort(&sl);
> @@ -311,7 +312,7 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
>  		if (!hashmap_contains_parent(&pl->recursive_hashmap,
>  					     pe->pattern,
>  					     &parent_pattern))
> -			string_list_insert(&sl, pe->pattern);
> +			string_list_append(&sl, pe->pattern);
>  	}
>  
>  	strbuf_release(&parent_pattern);

  reply	other threads:[~2026-01-16 19:11 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-14 19:28 [PATCH] sparse-checkout: optimize string_list construction amisha
2026-01-14 21:35 ` Jeff King
2026-01-18  2:39   ` Derrick Stolee
2026-01-15 12:56 ` [PATCH v2] " amisha
2026-01-15 13:09 ` [PATCH v3] " amisha
2026-01-15 13:15   ` Amisha Chhajed
2026-01-15 20:09     ` Jeff King
2026-01-16 17:03       ` Amisha Chhajed
2026-01-15 22:26     ` René Scharfe
2026-01-16  8:30       ` Amisha Chhajed
2026-01-16 17:17         ` Junio C Hamano
2026-01-18  2:46           ` Derrick Stolee
2026-01-18 13:09             ` Amisha Chhajed
2026-01-15 13:55   ` Junio C Hamano
2026-01-16 16:50 ` [PATCH] " amisha
2026-01-16 19:11   ` Junio C Hamano [this message]
2026-01-18 13:07     ` Amisha Chhajed
2026-01-19  5:32     ` Jeff King
2026-01-19 19:06       ` Junio C Hamano
2026-01-19 12:33 ` [PATCH v5 1/2] " amisha
2026-01-19 17:04   ` Derrick Stolee
2026-01-19 18:33     ` Pushkar Singh
2026-01-20 15:47     ` Amisha Chhajed
2026-01-20 15:38 ` [PATCH v6] sparse-checkout: optimize string_list construction and add tests to verify deduplication amisha
2026-01-20 20:37   ` Derrick Stolee
2026-01-21 13:00   ` [PATCH v7] " Amisha Chhajed
2026-01-21 16:28     ` Derrick Stolee
2026-01-21 16:51       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqqzrp74q3.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=amishhhaaaa@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox