From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Michael Montalbo <mmontalbo@gmail.com>,
git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: weird quadratic reftable behavior, was: Re: [PATCH 3/3] t5551: pack refs after creating many tags
Date: Wed, 1 Jul 2026 08:17:45 +0200 [thread overview]
Message-ID: <akSxCUfm2P7ocLJX@pks.im> (raw)
In-Reply-To: <20260630235850.GB3759976@coredump.intra.peff.net>
On Tue, Jun 30, 2026 at 07:58:50PM -0400, Jeff King wrote:
> On Tue, Jun 30, 2026 at 07:47:02PM -0400, Jeff King wrote:
>
> > There was one other oddity I didn't quite resolve. You may notice the
> > gross reftable.orig stuff in hyperfine. I originally wrote this as:
> >
> > git for-each-ref --format="delete %(refname)" | git update-ref --stdin
> >
> > but for some reason that causes the subsequent update-ref to loop
> > infinitely on merged_iter_next_entry(). It does so reliably, but I can't
> > reproduce it outside of hyperfine. Super weird, and I'm sure I'm missing
> > something obvious.
>
> Ah, maybe not infinite, but probably quadratic. The key is that you have
> to delete a lot of refs and then try to insert them again. So with this
> script:
>
> nr=$1; shift
> rm -rf .git
>
> git init --ref-format=reftable
> blob=$(echo foo | git hash-object -w --stdin)
> seq -f "create refs/tags/foo-%g $blob" $nr >input
> git update-ref --stdin <input
> git for-each-ref --format="delete %(refname)" | git update-ref --stdin
> time git update-ref --stdin <input
>
> I get results like this:
>
> nr | runtime
> ------------
> 1000 | 0.125s
> 2000 | 0.454s
> 4000 | 1.811s
> 8000 | 7.091s
>
> So for every doubling of the input size, the runtime quadruples. I guess
> it is iterating through some deleted tombstone entries, but I'm not sure
> why.
>
> That's probably a more interesting and productive performance problem to
> work on than micro-optimizing out the last few microseconds of writing. :)
This is a known issue, I think [1].
The problem here is the tombstoning: when you delete all references,
chances are that they are not truly gone but that every reference is
just tombstoned. The problem with this is that reading refs may now take
signifciantly more time as we cannot just say "this stack is empty".
Instead, we need to figure out that it is empty by processing all the
tombstones, and that takes a lot of time.
I remember that I did some digging back then and improved the status quo
quite significantly by optimizing `refs_verify_refname_available()`. I'm
sure there are more opportunities for optimization here though -- I have
a feeling that we for example exhaust the merged iterator until its end
when searching for a specific refname, where we could easily abort once
the observed tombstone name sorts lexicographically after the needle.
But eventually I decided to not care too much about this edge case, as
it seems very specific to this artificial benchmark scenario. Which of
course doesn't mean that it's not worth doing, I just had bigger fish to
fry and didn't get around to it yet.
Patrick
[1]: <Z602dzQggtDdcgCX@tapette.crustytoothpaste.net>
next prev parent reply other threads:[~2026-07-01 6:17 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-20 15:33 [RFH] Why do osx CI jobs so unreliable? Michael Montalbo
2026-06-21 21:34 ` Jeff King
2026-06-22 4:42 ` Patrick Steinhardt
2026-06-22 9:47 ` Patrick Steinhardt
2026-06-22 9:55 ` Patrick Steinhardt
2026-06-22 10:29 ` Patrick Steinhardt
2026-06-26 3:27 ` Michael Montalbo
2026-06-26 5:16 ` Jeff King
2026-06-26 10:50 ` Patrick Steinhardt
2026-06-26 13:45 ` Junio C Hamano
2026-06-26 23:26 ` Michael Montalbo
2026-06-28 7:57 ` [PATCH 0/3] fixing expensive http test timeouts Jeff King
2026-06-28 8:00 ` [PATCH 1/3] t/lib-httpd: bump apache timeout Jeff King
2026-07-02 3:24 ` Michael Montalbo
2026-06-28 8:03 ` [PATCH 2/3] t5551: put many-tags case into its own repo Jeff King
2026-06-28 21:44 ` Junio C Hamano
2026-06-29 0:34 ` Jeff King
2026-06-29 14:42 ` Junio C Hamano
2026-06-29 20:36 ` Jeff King
2026-06-28 8:07 ` [PATCH 3/3] t5551: pack refs after creating many tags Jeff King
2026-06-28 21:25 ` Junio C Hamano
2026-06-29 5:57 ` Patrick Steinhardt
2026-06-29 20:35 ` Jeff King
2026-06-30 9:05 ` Patrick Steinhardt
2026-06-30 23:47 ` Jeff King
2026-06-30 23:58 ` weird quadratic reftable behavior, was: " Jeff King
2026-07-01 6:17 ` Patrick Steinhardt [this message]
2026-07-01 8:00 ` Jeff King
2026-07-01 9:04 ` Kristofer Karlsson
2026-07-01 10:07 ` Patrick Steinhardt
2026-07-03 12:09 ` Kristofer Karlsson
2026-06-29 7:33 ` [PATCH 0/3] fixing expensive http test timeouts Patrick Steinhardt
2026-06-29 14:39 ` Junio C Hamano
2026-06-29 16:09 ` Patrick Steinhardt
2026-06-29 16:19 ` Junio C Hamano
2026-06-30 9:05 ` Patrick Steinhardt
2026-06-30 19:21 ` Junio C Hamano
2026-07-01 6:56 ` Patrick Steinhardt
2026-06-26 23:43 ` [RFH] Why do osx CI jobs so unreliable? Jeff King
2026-06-22 5:05 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=akSxCUfm2P7ocLJX@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mmontalbo@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox