Re: [PATCH] ref-filter: format iteratively with lexicographic refname sorting

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, Victoria Dye <vdye@github.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH] ref-filter: format iteratively with lexicographic refname sorting
Date: Thu, 17 Oct 2024 06:28:54 +0200	[thread overview]
Message-ID: <ZxCShtdTlv7t5fYy@pks.im> (raw)
In-Reply-To: <20241017024828.GC1858436@coredump.intra.peff.net>

On Wed, Oct 16, 2024 at 10:48:28PM -0400, Jeff King wrote:
> On Wed, Oct 16, 2024 at 06:11:47PM -0400, Taylor Blau wrote:
> 
> > On Wed, Oct 16, 2024 at 08:00:30AM +0200, Patrick Steinhardt wrote:
> > > But there is one exception here where we _can_ get away with sorting
> > > refs while streaming: ref backends sort references returned by their
> > > iterators in lexicographic order. So if the following conditions are all
> > > true we can do iterative streaming:
> > >
> > >   - The caller uses at most a single name pattern. Otherwise we'd have
> > >     to sort results from multiple invocations of the iterator.
> > >
> > >   - There must be at most a single sorting specification, as otherwise
> > >     we're not using plain lexicographic ordering.
> > >
> > >   - The sorting specification must use the "refname".
> > >
> > >   - The sorting specification must not be using any flags, like
> > >     case-insensitive sorting.
> > 
> > Perhaps a niche case, but what about ancient packed-refs files that were
> > written before the 'sorted' capability was introduced?
> 
> We should be OK there. In that case we actually read in and sort the
> packed-refs entries ourselves. We have to, since we do an in-order merge
> with the loose refs while iterating.
> 
> I do think this optimization is worth doing, and not a problem with our
> current backends. The biggest worries would be:
> 
>   1. Some new ref backend that doesn't return sorted results. I find
>      this unlikely, and anyway it's easily caught by having coverage in
>      the test suite (which I assume we already have, but I didn't look).

My assumption is that a ref iterator that _isn't_ sorted would lead to
undesirable behaviour. I'd be surprised if git-show-ref(1) started to
show refs in a random order. So we have essentially baked the
requirement that ref iterators return refs in lexicographic order into
our codebase already.

>   2. Some new flag combination that requires disabling the optimization,
>      and which must be dealt with in the code. This seems unlikely to me
>      but not impossible. I think enabling the optimization is worth it,
>      though.

And this part is an issue with or without my patch. The logic we have
in the ref-filter API is quite fragile, and everybody who wants to add
some new flags must remember to update `can_do_iterative_format()`. I'm
not really a huge fan of that, but on the other hand the subsystem does
not change all that frequently anyway.

> > >     to sort results from multiple invocations of the iterator.
> 
> I think this part is erring on the cautious side, as we can often
> collapse these into a single iteration due to the ref-prefix work. It
> may be OK to keep using the slower code here if multiple patterns aren't
> commonly used, but I'd suspect that:
> 
>   git for-each-ref --sort=refname refs/heads refs/tags
> 
> could benefit.

Mh. So we do end up using `refs_for_each_fullref_in_prefixes()`, which
may or may not end up collapsing the prefixes. We do sort and dedup the
prefixes via `find_longest_prefixes()`, so we don't have to worry about
e.g. `git for-each-ref refs/tags refs/heads refs/tags`.

So... it should be fine to also use this with multiple patterns? None of
our tests fail, either, which reassures me a bit.

I'll send a v2 that loosens this requirement.

Thanks for your input!

Patrick

next prev parent reply	other threads:[~2024-10-17  4:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16  6:00 [PATCH] ref-filter: format iteratively with lexicographic refname sorting Patrick Steinhardt
2024-10-16 22:11 ` Taylor Blau
2024-10-17  2:48   ` Jeff King
2024-10-17  4:28     ` Patrick Steinhardt [this message]
2024-10-21 12:36       ` karthik nayak
2024-10-21 20:45         ` Taylor Blau
2024-10-17  5:09 ` [PATCH v2] " Patrick Steinhardt
2024-10-17 20:57   ` Taylor Blau
2024-10-21 11:10   ` Toon Claes
2024-10-21 11:33     ` Patrick Steinhardt
2024-10-21 11:33 ` [PATCH v3] " Patrick Steinhardt
2024-10-21 20:46   ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZxCShtdTlv7t5fYy@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).