git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, karthik nayak <karthik.188@gmail.com>
Subject: Re: [PATCH v2 6/6] refs/reftable: wire up support for exclude patterns
Date: Tue, 17 Sep 2024 05:26:48 -0400	[thread overview]
Message-ID: <ZulLWJWLgU7gL1MK@nand.local> (raw)
In-Reply-To: <050f4906393d1f9c58a6b6bc695b004695d219be.1726476401.git.ps@pks.im>

On Mon, Sep 16, 2024 at 10:50:16AM +0200, Patrick Steinhardt wrote:
> This makes t1419 work with the "reftable" backend with some slight
> modifications. Of course it also speeds up listing of references with
> hidden refs. The following benchmark prints one reference with 1 million
> hidden references:
>
>     Benchmark 1: HEAD~
>       Time (mean ± σ):      93.3 ms ±   2.1 ms    [User: 90.3 ms, System: 2.5 ms]
>       Range (min … max):    89.8 ms …  97.2 ms    33 runs
>
>     Benchmark 2: HEAD
>       Time (mean ± σ):       4.2 ms ±   0.6 ms    [User: 2.2 ms, System: 1.8 ms]
>       Range (min … max):     3.1 ms …   8.1 ms    765 runs
>
>     Summary
>       HEAD ran
>        22.15 ± 3.19 times faster than HEAD~

Very nice.

> +		/*
> +		 * When the reference name is lexicographically bigger than the
> +		 * current exclude pattern we know that it won't ever match any
> +		 * of the following references, either. We thus advance to the
> +		 * next pattern and re-check whether it matches.
> +		 *
> +		 * Otherwise, if it's smaller, then we do not have a match and
> +		 * thus want to show the current reference.
> +		 */
> +		cmp = strncmp(iter->ref.refname, pattern,
> +			      iter->exclude_patterns_strlen);
> +		if (cmp > 0) {
> +			iter->exclude_patterns_index++;
> +			iter->exclude_patterns_strlen = 0;
> +			continue;
> +		}
> +		if (cmp < 0)
> +			return 0;

Perhaps I am showing my ignorance for the reftable backend, but is it OK
to call strncmp() against all patterns here?

In the packed-refs implementation which I worked on with Peff sometime
last year, we rejected exclude patterns that contain special glob
characters in them for exactly this reason.

The implementation that I'm referring to has a helpful comment that
jogged my memory for what we were thinking at the time (see the
function refs/packed-backend.c::populate_excluded_jump_list()).

Ugh, I just read the next hunk below, so ignore me here ;-).

> +static char **filter_exclude_patterns(const char **exclude_patterns)
> +{
> +	size_t filtered_size = 0, filtered_alloc = 0;
> +	char **filtered = NULL;
> +
> +	if (!exclude_patterns)
> +		return NULL;
> +
> +	for (size_t i = 0; ; i++) {
> +		const char *exclude_pattern = exclude_patterns[i];
> +		int has_glob = 0;
> +
> +		if (!exclude_pattern)
> +			break;
> +
> +		for (const char *p = exclude_pattern; *p; p++) {
> +			has_glob = is_glob_special(*p);
> +			if (has_glob)
> +				break;
> +		}
> +		if (has_glob)
> +			continue;
> +
> +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> +		filtered[filtered_size++] = xstrdup(exclude_pattern);
> +	}
> +
> +	if (filtered_size) {
> +		QSORT(filtered, filtered_size, qsort_strcmp);
> +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> +		filtered[filtered_size++] = NULL;
> +	}
> +
> +	return filtered;
> +}

Ohhh, here's where we filter out un-excludeable patterns. Ignore me :-).

One question I had reading this is why we don't filter these out on the
fly in the iterator itself instead of allocating a separate array that
we have to xstrdup() into and free later on.

We may be at the point of diminishing returns here, but I wonder if
allocating this thing is more expensive than a few redundant strcmp()s
and calls to is_glob_special(). I dunno.

Thanks,
Taylor

  reply	other threads:[~2024-09-17  9:26 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-09 11:31 [PATCH 0/6] refs/reftable: wire up exclude patterns Patrick Steinhardt
2024-09-09 11:31 ` [PATCH 1/6] refs: properly apply exclude patterns to namespaced refs Patrick Steinhardt
2024-09-13 11:35   ` karthik nayak
2024-09-16  6:56     ` Patrick Steinhardt
2024-09-09 11:31 ` [PATCH 2/6] builtin/receive-pack: fix exclude patterns when announcing refs Patrick Steinhardt
2024-09-13 11:50   ` karthik nayak
2024-09-09 11:31 ` [PATCH 3/6] Makefile: stop listing test library objects twice Patrick Steinhardt
2024-09-09 11:31 ` [PATCH 4/6] t/unit-tests: introduce reftable library Patrick Steinhardt
2024-09-09 11:31 ` [PATCH 5/6] reftable/reader: make table iterator reseekable Patrick Steinhardt
2024-09-13 12:11   ` karthik nayak
2024-09-16  6:56     ` Patrick Steinhardt
2024-09-17 16:44       ` karthik nayak
2024-09-09 11:31 ` [PATCH 6/6] refs/reftable: wire up support for exclude patterns Patrick Steinhardt
2024-09-13 12:47   ` karthik nayak
2024-09-16  6:56     ` Patrick Steinhardt
2024-09-17 17:31       ` karthik nayak
2024-09-13 12:48 ` [PATCH 0/6] refs/reftable: wire up " karthik nayak
2024-09-16  6:56   ` Patrick Steinhardt
2024-09-16  8:49 ` [PATCH v2 " Patrick Steinhardt
2024-09-16  8:50   ` [PATCH v2 1/6] refs: properly apply exclude patterns to namespaced refs Patrick Steinhardt
2024-09-17  9:12     ` Taylor Blau
2024-09-17  9:33       ` Patrick Steinhardt
2024-09-17  9:38         ` Taylor Blau
2024-09-17  9:44           ` Patrick Steinhardt
2024-09-17  9:52             ` Taylor Blau
2024-09-17  9:55               ` Patrick Steinhardt
2024-09-16  8:50   ` [PATCH v2 2/6] builtin/receive-pack: fix exclude patterns when announcing refs Patrick Steinhardt
2024-09-17  9:16     ` Taylor Blau
2024-09-16  8:50   ` [PATCH v2 3/6] Makefile: stop listing test library objects twice Patrick Steinhardt
2024-09-16  8:50   ` [PATCH v2 4/6] t/unit-tests: introduce reftable library Patrick Steinhardt
2024-09-16  8:50   ` [PATCH v2 5/6] reftable/reader: make table iterator reseekable Patrick Steinhardt
2024-09-16  8:50   ` [PATCH v2 6/6] refs/reftable: wire up support for exclude patterns Patrick Steinhardt
2024-09-17  9:26     ` Taylor Blau [this message]
2024-09-17  9:39       ` Patrick Steinhardt
2024-09-17  9:53         ` Taylor Blau
2024-09-17 17:33   ` [PATCH v2 0/6] refs/reftable: wire up " karthik nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZulLWJWLgU7gL1MK@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).