All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, Victoria Dye <vdye@github.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH] ref-filter: format iteratively with lexicographic refname sorting
Date: Thu, 17 Oct 2024 06:28:54 +0200	[thread overview]
Message-ID: <ZxCShtdTlv7t5fYy@pks.im> (raw)
In-Reply-To: <20241017024828.GC1858436@coredump.intra.peff.net>

On Wed, Oct 16, 2024 at 10:48:28PM -0400, Jeff King wrote:
> On Wed, Oct 16, 2024 at 06:11:47PM -0400, Taylor Blau wrote:
> 
> > On Wed, Oct 16, 2024 at 08:00:30AM +0200, Patrick Steinhardt wrote:
> > > But there is one exception here where we _can_ get away with sorting
> > > refs while streaming: ref backends sort references returned by their
> > > iterators in lexicographic order. So if the following conditions are all
> > > true we can do iterative streaming:
> > >
> > >   - The caller uses at most a single name pattern. Otherwise we'd have
> > >     to sort results from multiple invocations of the iterator.
> > >
> > >   - There must be at most a single sorting specification, as otherwise
> > >     we're not using plain lexicographic ordering.
> > >
> > >   - The sorting specification must use the "refname".
> > >
> > >   - The sorting specification must not be using any flags, like
> > >     case-insensitive sorting.
> > 
> > Perhaps a niche case, but what about ancient packed-refs files that were
> > written before the 'sorted' capability was introduced?
> 
> We should be OK there. In that case we actually read in and sort the
> packed-refs entries ourselves. We have to, since we do an in-order merge
> with the loose refs while iterating.
> 
> I do think this optimization is worth doing, and not a problem with our
> current backends. The biggest worries would be:
> 
>   1. Some new ref backend that doesn't return sorted results. I find
>      this unlikely, and anyway it's easily caught by having coverage in
>      the test suite (which I assume we already have, but I didn't look).

My assumption is that a ref iterator that _isn't_ sorted would lead to
undesirable behaviour. I'd be surprised if git-show-ref(1) started to
show refs in a random order. So we have essentially baked the
requirement that ref iterators return refs in lexicographic order into
our codebase already.

>   2. Some new flag combination that requires disabling the optimization,
>      and which must be dealt with in the code. This seems unlikely to me
>      but not impossible. I think enabling the optimization is worth it,
>      though.

And this part is an issue with or without my patch. The logic we have
in the ref-filter API is quite fragile, and everybody who wants to add
some new flags must remember to update `can_do_iterative_format()`. I'm
not really a huge fan of that, but on the other hand the subsystem does
not change all that frequently anyway.

> > >     to sort results from multiple invocations of the iterator.
> 
> I think this part is erring on the cautious side, as we can often
> collapse these into a single iteration due to the ref-prefix work. It
> may be OK to keep using the slower code here if multiple patterns aren't
> commonly used, but I'd suspect that:
> 
>   git for-each-ref --sort=refname refs/heads refs/tags
> 
> could benefit.

Mh. So we do end up using `refs_for_each_fullref_in_prefixes()`, which
may or may not end up collapsing the prefixes. We do sort and dedup the
prefixes via `find_longest_prefixes()`, so we don't have to worry about
e.g. `git for-each-ref refs/tags refs/heads refs/tags`.

So... it should be fine to also use this with multiple patterns? None of
our tests fail, either, which reassures me a bit.

I'll send a v2 that loosens this requirement.

Thanks for your input!

Patrick

  reply	other threads:[~2024-10-17  4:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16  6:00 [PATCH] ref-filter: format iteratively with lexicographic refname sorting Patrick Steinhardt
2024-10-16 22:11 ` Taylor Blau
2024-10-17  2:48   ` Jeff King
2024-10-17  4:28     ` Patrick Steinhardt [this message]
2024-10-21 12:36       ` karthik nayak
2024-10-21 20:45         ` Taylor Blau
2024-10-17  5:09 ` [PATCH v2] " Patrick Steinhardt
2024-10-17 20:57   ` Taylor Blau
2024-10-21 11:10   ` Toon Claes
2024-10-21 11:33     ` Patrick Steinhardt
2024-10-21 11:33 ` [PATCH v3] " Patrick Steinhardt
2024-10-21 20:46   ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZxCShtdTlv7t5fYy@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.