From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee <stolee@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 00/17] object-store: carve out the object database subsystem
Date: Fri, 9 May 2025 13:25:42 +0200 [thread overview]
Message-ID: <aB3mNiiAGE2KHbGY@pks.im> (raw)
In-Reply-To: <xmqqa57oe4mj.fsf@gitster.g>
On Wed, May 07, 2025 at 10:02:12AM -0700, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
>
> > Patches 1 and 2 involve renaming some core structures, and I had
> > some questions around these names (since we hope to be stuck with
> > the new names for a long time). I was thinking out loud on a per-
> > patch basis, but now want to collect my thoughts around these:
> >
> > * raw_object_store currently describes the abstraction that contains
> > all objects that can be accessed within the repository. This may
> > include multiple alternates. Patch 1 renames this to
> > 'object_database'.
> >
> > * object_directory currently describes a single directory that
> > has the same structure as $GIT_DIR/objects/ but may be an alternate
> > or a submodule object directory. Patch 2 renames this to
> > 'odb_backend'.
> >
> > My concerns around this are basically around not liking "backend" for
> > this purpose. When I think of a backend, I'm thinking about the
> > implementation details (like the refs backend being files or reftable)
> > and not multiple distinct locations that have their own objects.
>
> Yup, odb_backend_files (aot odb_backend_redis) or something?
Yeah, that was my vision indeed. I think it works equally well though in
case we name this `odb_alternate`. The benefit of the "alternate"
terminology is that we already use it and it's almost a perfect fit, and
it gives the reader a hint that we may have multiple alternates. On the
other hand, `odb_backend` sounds as if there would only be a single
backend for a `struct object_database`.
So Stolee caused me to reconsider and favor `odb_alternate`. But in the
end I guess that both names would work alright.
> > * 'struct object_directory' could be renamed to 'struct odb_shard' or
> > 'struct odb_slice' or similar. I may even recommend 'odb_partition'
> > though that does imply some disjointness that is not guaranteed (an
> > object can exist in multiple parts).
> >
> > * In the event that we create multiple implementations for storing
> > objects, then a 'struct odb_shard' could point to a backend to help
> > find the appropriate methods for interacting with its storage.
>
> Hmph, I do not have strong opinions, but I consider it an
> implementation detail of one particular backend, namely, the
> filesystem based backend, that it can link together multiple
> object_directory instances and present them as if they form a single
> object database, just like all files within a single object_directory
> form an illusion of a single object database (aka key-value store) even
> though some objects are stored in individual loose object files while
> many others are packed in a single packfile.
>
> I did not expect you would want to go to the world where a single
> "shard" consists of an object_directory backed by the filesystem and
> some other more database-y backend. It is an interesting idea, but
> we'd need to worry about many things we do not have to worry about
> right now. E.g. what do the precedence rules among different
> components within a single "shard" look like? How do we express "in
> this repository, local filesystem-backed piece is consulted first,
> and then check this piece backed by low-cost but high-latency
> storage backend"?
Well, in fact I want to design this from the start so that you can mix
and match different backends. I think it falls out naturally from the
design if an alternate can be backed by anything, and it has a lot of
very interesting features.
Furthermore, it would cause a bunch of problems if we _didn't_ allow for
this, at least for hosting providers:
- Migrations would now need to be atomic across fork networks where
all forks need to be migrated at once so that we don't mix backends.
- Migrations in general would be a pain if we had to do an atomic
migration even for a single object directory. With mixed backends we
can already make a partially-migrated backend available while the
old backend is still in use.
- High-latency storage backends may work well for binary files, but
not for smallish text files.
This all of course still needs to be hashed out. I do want to send an
RFC document to the mailing list soonish, probably in the first half of
the Git 2.51 release cycle, so that we can discuss where to go.
> > I do mention that the rename of the object-store.[c|h] files may be
> > unnecessary, or perhaps could be delayed until this series is merged
> > and the collateral is calmed.
>
> Right now, merge-fix needed against all other topics in flight look
> like this, in order to merge it to 'seen'.
Okay. In that case I'll keep that patch for now.
Patrick
next prev parent reply other threads:[~2025-05-09 11:25 UTC|newest]
Thread overview: 166+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-06 11:09 [PATCH 00/17] object-store: carve out the object database subsystem Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-05-07 0:47 ` Derrick Stolee
2025-05-07 15:27 ` Junio C Hamano
2025-05-06 11:09 ` [PATCH 02/17] object-store: rename `object_directory` to `odb_backend` Patrick Steinhardt
2025-05-07 0:51 ` Derrick Stolee
2025-05-07 1:00 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-05-06 18:06 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-05-07 1:10 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 07/17] " Patrick Steinhardt
2025-05-07 1:12 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-05-07 1:14 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-05-07 1:21 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-12 18:28 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 10/17] odb: get rid of `the_repository` when handling the primary backend Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 11/17] odb: get rid of `the_repository` when handling submodule backends Patrick Steinhardt
2025-05-07 1:25 ` Derrick Stolee
2025-05-07 1:29 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-05-07 1:25 ` Derrick Stolee
2025-05-07 16:38 ` Junio C Hamano
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-05-07 1:40 ` [PATCH 00/17] object-store: carve out the object database subsystem Derrick Stolee
2025-05-07 17:02 ` Junio C Hamano
2025-05-09 11:25 ` Patrick Steinhardt [this message]
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-13 19:28 ` Toon Claes
2025-05-14 4:31 ` Patrick Steinhardt
2025-05-07 23:22 ` Junio C Hamano
2025-05-09 14:12 ` [PATCH v2 " Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-05-13 19:27 ` Toon Claes
2025-05-09 14:12 ` [PATCH v2 02/17] object-store: rename `object_directory` to `odb_alternate` Patrick Steinhardt
2025-05-13 19:28 ` Toon Claes
2025-05-14 4:31 ` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-05-13 19:28 ` Toon Claes
2025-05-14 4:31 ` Patrick Steinhardt
2025-05-14 12:58 ` Junio C Hamano
2025-05-09 14:12 ` [PATCH v2 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 10/17] odb: get rid of `the_repository` when handling the primary alternate Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 11/17] odb: get rid of `the_repository` when handling submodule alternates Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-05-09 21:43 ` [PATCH v2 00/17] object-store: carve out the object database subsystem Junio C Hamano
2025-05-12 18:33 ` Derrick Stolee
2025-05-14 5:12 ` [PATCH v3 " Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-05-22 21:59 ` Justin Tobler
2025-05-14 5:12 ` [PATCH v3 02/17] object-store: rename `object_directory` to `odb_alternate` Patrick Steinhardt
2025-05-22 22:13 ` Justin Tobler
2025-05-26 5:45 ` Patrick Steinhardt
2025-05-27 16:45 ` Justin Tobler
2025-05-28 13:18 ` Toon Claes
2025-05-30 9:39 ` Patrick Steinhardt
2025-05-30 9:39 ` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 10/17] odb: get rid of `the_repository` when handling the primary alternate Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 11/17] odb: get rid of `the_repository` when handling submodule alternates Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-05-14 14:48 ` [PATCH v3 00/17] object-store: carve out the object database subsystem Toon Claes
2025-05-15 8:22 ` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 " Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-06-04 8:55 ` Toon Claes
2025-06-04 11:52 ` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 02/17] object-store: rename `object_directory` to `odb_source` Patrick Steinhardt
2025-06-04 13:24 ` Toon Claes
2025-06-04 13:55 ` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 10/17] odb: get rid of `the_repository` when handling the primary source Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 11/17] odb: get rid of `the_repository` when handling submodule sources Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-06-02 15:38 ` [PATCH v4 00/17] object-store: carve out the object database subsystem Junio C Hamano
2025-06-05 6:46 ` [PATCH v5 " Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 02/17] object-store: rename `object_directory` to `odb_source` Patrick Steinhardt
2025-06-30 2:02 ` Justin Tobler
2025-07-01 12:17 ` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-06-30 2:34 ` Justin Tobler
2025-07-01 12:17 ` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-06-30 2:36 ` Justin Tobler
2025-06-05 6:46 ` [PATCH v5 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-06-30 2:56 ` Justin Tobler
2025-07-01 12:18 ` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 10/17] odb: get rid of `the_repository` when handling the primary source Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 11/17] odb: get rid of `the_repository` when handling submodule sources Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-06-30 3:14 ` Justin Tobler
2025-06-05 6:47 ` [PATCH v5 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-06-30 3:15 ` Justin Tobler
2025-07-01 12:22 ` [PATCH v6 00/17] object-store: carve out the object database subsystem Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 02/17] object-store: rename `object_directory` to `odb_source` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 10/17] odb: get rid of `the_repository` when handling the primary source Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 11/17] odb: get rid of `the_repository` when handling submodule sources Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-07-01 14:26 ` [PATCH v6 00/17] object-store: carve out the object database subsystem Justin Tobler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aB3mNiiAGE2KHbGY@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).