From: Patrick Steinhardt <ps@pks.im>
To: Justin Tobler <jltobler@gmail.com>
Cc: git@vger.kernel.org,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Karthik Nayak <karthik.188@gmail.com>,
K Jayatheerth <jayatheerthkulkarni2005@gmail.com>,
ryenus@gmail.com, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/2] BreakingChanges: announce switch to "reftable" format
Date: Thu, 3 Jul 2025 07:00:21 +0200
Message-ID: <aGYOZVyaR_OYIhtl@pks.im>
In-Reply-To: <q6zyvqpyxobtp65ptrmkdg3kvc2plxmsltaurqf52hglitikir@5p5jpcqc577o>

On Wed, Jul 02, 2025 at 12:17:50PM -0500, Justin Tobler wrote:
> On 25/07/02 12:14PM, Patrick Steinhardt wrote:
> > diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc
> > index c6bd94986c5..c96b5319cdd 100644
> > --- a/Documentation/BreakingChanges.adoc
> > +++ b/Documentation/BreakingChanges.adoc
> > @@ -118,6 +118,45 @@ Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@zombino.com>,
> > <20170223155046.e7nxivfwqqoprsqj@LykOS.localdomain>,
> > <CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@mail.gmail.com>.
> >
> > +* The default storage format for references in newly created repositories will
> > + be changed from "files" to "reftable". The "reftable" format provides
> > + multiple advantages over the "files" format:
> > ++
> > + ** It is impossible to store two references that only differ in casing on
> > + case-insensitive filesystems with the "files" format. This issue is
> > + especially common on Windows, but also on older versions of macOS. As the
> > + "reftable" backend does not use filesystem paths anymore to encode
> > + reference names this problem goes away.
>
> I believe even modern macOS by default uses a case-insensitive
> file-system. Maybe we should instead say:
>
> This limitation is common on Windows and macOS platforms.

Okay, thanks for the clarification. I thought recent versions of macOS
were case-sensitive by default.

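
To illustrate the collision, here is a toy Python sketch (not Git code).
The names `path_key` and `colliding_refs` are made up for this example,
and Python's `casefold` only approximates what a real case-insensitive
filesystem does; the actual folding rules differ between filesystems.

```python
def path_key(refname):
    # Assumption: approximate a case-insensitive filesystem by case-folding
    # the whole path. Real filesystems use their own folding tables.
    return refname.casefold()


def colliding_refs(refnames):
    """Return pairs of distinct ref names that would map to the same path."""
    seen = {}          # folded path -> first ref name stored under it
    collisions = []
    for name in refnames:
        key = path_key(name)
        if key in seen and seen[key] != name:
            collisions.append((seen[key], name))
        else:
            seen.setdefault(key, name)
    return collisions


refs = ["refs/heads/Feature", "refs/heads/feature", "refs/heads/main"]
print(colliding_refs(refs))
```

A backend that stores reference names as data instead of as filesystem
paths never goes through such a folding step, so both names can coexist.
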
> > + ** Similarly, macOS normalizes path names that contain unicode characters,
> > + which has the consequence that you cannot store two names with unicode
> > + characters that are encoded differently with the "files" backend. Again,
> > + this is not an issue with the "reftable" backend.
> > + ** Deleting references with the "files" backend requires Git to rewrite the
> > + complete "packed-refs" file. In large repositories with many references
> > + this file can easily be dozens of megabytes in size, in extreme cases it
> > + may be gigabytes. The "reftable" backend uses tombstone markers for
> > + deleted references and thus does not have to rewrite all of its data.
> > + ** Repository housekeeping with the "files" backend typically performs
> > + all-into-one repacks of references. This can be quite expensive, and
> > + consequently housekeeping is a tradeoff between the number of loose
> > + references that accumulate and slow down operations that read references,
> > + and compressing those loose references into the "packed-refs" file. The
> > + "reftable" backend uses geometric compaction after every write, which
> > + amortizes costs and ensures that the backend is always in a
> > + well-maintained state.
> > + ** Operations that write multiple references at once are not atomic with the
> > + "files" backend. Consequently, Git may see in-between states when it reads
> > + references while a reference transaction is in the process of being
> > + committed to disk.
> > + ** Writing many references at once is slow with the "files" backend because
> > + every reference is created as a separate file. The "reftable" backend
> > + significantly outperforms the "files" backend by multiple orders of
> > + magnitude.
>
> The examples above do a good job at explaining individual technical
> benefits. I do wonder if we should include a more general statement
> aimed at users as to why the change to reftables is beneficial. Maybe
> something like:
>
> The reftables backend addresses several performance concerns as the
> number of references scale in a repository.

I think this would be a bit too handwavy. I'd rather point out
the specific cases where we know it to perform better.

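
As an example of one such specific case, deletion can be modelled in a
few lines of Python. This is a simplified sketch and not the on-disk
format of either backend: plain dictionaries stand in for the
"packed-refs" file and for a stack of tables, the function names are
hypothetical, and the returned record counts are only a rough proxy for
the amount of data written.

```python
def delete_packed(packed, name):
    """'files'-style model: rewrite every remaining record of the packed file."""
    rewritten = {ref: oid for ref, oid in packed.items() if ref != name}
    return rewritten, len(rewritten)  # records written


def delete_with_tombstone(tables, name):
    """'reftable'-style model: append a new table holding one tombstone."""
    tables.append({name: None})  # None marks the reference as deleted
    return 1                     # records written


def lookup(tables, name):
    """Newest table wins; a tombstone hides older values."""
    for table in reversed(tables):
        if name in table:
            return table[name]
    return None


packed = {f"refs/heads/b{i}": "oid" for i in range(1000)}
_, written_packed = delete_packed(packed, "refs/heads/b0")

tables = [dict(packed)]
written_tombstone = delete_with_tombstone(tables, "refs/heads/b0")

print(written_packed, written_tombstone)
```

In this model the rewrite cost grows with the total number of references,
while the tombstone cost stays constant per deleted reference.
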
Patrick