From: Patrick Steinhardt <ps@pks.im>
To: Christian Couder <christian.couder@gmail.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
	git@vger.kernel.org, Karthik Nayak <karthik.188@gmail.com>
Subject: Re: Poor performance using reftable with many refs
Date: Thu, 13 Feb 2025 14:21:24 +0100
Message-ID: <Z63x1MTZfVfI7Q1a@pks.im>
In-Reply-To: <CAP8UFD3E8_mTwneUgNkC_hZbkaeznAT-dG9njT5wjnm-=iMmcw@mail.gmail.com>

On Thu, Feb 13, 2025 at 10:27:39AM +0100, Christian Couder wrote:
> On Thu, Feb 13, 2025 at 8:13 AM Patrick Steinhardt <ps@pks.im> wrote:
> 
> > We end up with two tables: the first one has been created when cloning
> > the repository and contains all references. The second one has been
> > created when deleting all references, so it only contains ref deletions.
> > Because deletions don't have to carry an object ID, the resulting table
> > is also much smaller. This has the effect that auto-compaction does not
> > kick in, because we see that the geometric sequence is still intact.
> 
> Not that I think we should work on this right now, but theoretically,
> could we "just" count the number of entries in each file and base the
> geometric sequence on the number of entries in each file instead of
> file size?

In theory we could, and that may indeed lead to better results in edge
cases like this one. If either the header or the footer of a reftable
contained a total count of the records it holds, that might have been a
viable thing to do. But they don't, so we'd have to open and parse every
reftable in full to get those counts.

Because of that I think the cost of this would ultimately outweigh the
benefit. After all, this logic kicks in on every write to determine if we
need to auto-compact. As a result, it needs to remain cheap.
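
To make the edge case concrete, here is a minimal Python sketch of a
geometric-sequence check. It is not the actual reftable code: the
function name, the factor of 2 and all numbers are assumptions picked
for illustration only.

```python
def geometric_sequence_intact(values, factor=2):
    """Return True if every table is at least `factor` times as large
    as the next newer one. `values` is ordered oldest to newest.

    Simplified, hypothetical check; the real heuristic may differ.
    """
    return all(older >= factor * newer
               for older, newer in zip(values, values[1:]))


# Made-up numbers: the first table holds all refs, the second holds one
# deletion per ref. Deletions carry no object ID, so the second table is
# much smaller in bytes even though it has the same number of records.
sizes_in_bytes = [40_000_000, 8_000_000]
record_counts = [1_000_000, 1_000_000]

# Judged by file size the sequence looks intact: no auto-compaction.
print(geometric_sequence_intact(sizes_in_bytes))

# Judged by record count it is broken: compaction would be triggered.
print(geometric_sequence_intact(record_counts))
```

File sizes can be obtained with a stat call per table, whereas the
record counts above would require parsing each table in full.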

Patrick
