From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee <stolee@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 00/17] object-store: carve out the object database subsystem
Date: Fri, 9 May 2025 13:25:42 +0200 [thread overview]
Message-ID: <aB3mNiiAGE2KHbGY@pks.im> (raw)
In-Reply-To: <xmqqa57oe4mj.fsf@gitster.g>
On Wed, May 07, 2025 at 10:02:12AM -0700, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
>
> > Patches 1 and 2 involve renaming some core structures, and I had
> > some questions around these names (since we hope to be stuck with
> > the new names for a long time). I was thinking out loud on a per-
> > patch basis, but now want to collect my thoughts around these:
> >
> > * raw_object_store currently describes the abstraction that contains
> > all objects that can be accessed within the repository. This may
> > include multiple alternates. Patch 1 renames this to
> > 'object_database'.
> >
> > * object_directory currently describes a single directory that
> > has the same structure as $GIT_DIR/objects/ but may be an alternate
> > or a submodule object directory. Patch 2 renames this to
> > 'odb_backend'.
> >
> > My concerns around this are basically around not liking "backend" for
> > this purpose. When I think of a backend, I'm thinking about the
> > implementation details (like the refs backend being files or reftable)
> > and not multiple distinct locations that have their own objects.
>
> Yup, odb_backend_files (aot odb_backend_redis) or something?
Yeah, that was my vision indeed. I think it works equally well though in
case we name this `odb_alternate`. The benefit of the "alternate"
terminology is that we already use it and it's almost a perfect fit, and
it gives the reader a hint that we may have multiple alternates. On the
other hand, `odb_backend` sounds as if there would only be a single
backend for a `struct object_database`.
So Stolee caused me to reconsider and favor `odb_alternate`. But in the
end I guess that both names would work alright.
> > * 'struct object_directory' could be renamed to 'struct odb_shard' or
> > 'struct odb_slice' or similar. I may even recommend 'odb_partition'
> > though that does imply some disjointness that is not guaranteed (an
> > object can exist in multiple parts).
> >
> > * In the event that we create multiple implementations for storing
> > objects, then a 'struct odb_shard' could point to a backend to help
> > find the appropriate methods for interacting with its storage.
>
> Hmph, I do not have strong opinions, but I consider it an
> implementation detail of one particular backend, namely, the
> filesystem based backend, that it can link together multiple
> object_directory instances and present them as if they form a single
> object database, just like all files within a single object_directory
> form an illusion of a single object database (aka key-value store) even
> though some objects are stored in individual loose object files while
> many others are packed in a single packfile.
>
> I did not expect you would want to go to the world where a single
> "shard" consists of an object_directory backed by the filesystem and
> some other more database-y backend. It is an interesting idea, but
> we'd need to worry about many things we do not have to worry about
> right now. E.g. what do the precedence rules among different
> components within a single "shard" look like? How do we express "in
> this repository, local filesystem-backed piece is consulted first,
> and then check this piece backed by low-cost but high-latency
> storage backend"?
Well, in fact I want to design this from the start so that you can mix
and match different backends. I think it falls out naturally from the
design if an alternate can be backed by anything, and it has a lot of
very interesting features.
Furthermore, it would cause a bunch of problems if we _didn't_ allow for
this, at least for hosting providers:
- Migrations would now need to be atomic across fork networks where
all forks need to be migrated at once so that we don't mix backends.
- Migrations in general would be a pain if we had to do an atomic
migration even for a single object directory. With mixed backends we
can already make a partially-migrated backend available while the
old backend is still in use.
- High-latency storage backends may work well for binary files, but
not for smallish text files.
This all of course still needs to be hashed out. I do want to send an
RFC document to the mailing list soonish, probably in the first half of
the Git 2.51 release cycle, so that we can discuss where to go.
> > I do mention that the rename of the object-store.[c|h] files may be
> > unnecessary, or perhaps could be delayed until this series is merged
> > and the collateral is calmed.
>
> Right now, merge-fix needed against all other topics in flight look
> like this, in order to merge it to 'seen'.
Okay. In that case I'll keep that patch for now.
Patrick
next prev parent reply other threads:[~2025-05-09 11:25 UTC|newest]
Thread overview: 166+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-06 11:09 [PATCH 00/17] object-store: carve out the object database subsystem Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-05-07 0:47 ` Derrick Stolee
2025-05-07 15:27 ` Junio C Hamano
2025-05-06 11:09 ` [PATCH 02/17] object-store: rename `object_directory` to `odb_backend` Patrick Steinhardt
2025-05-07 0:51 ` Derrick Stolee
2025-05-07 1:00 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-05-06 18:06 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-05-07 1:10 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 07/17] " Patrick Steinhardt
2025-05-07 1:12 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-05-07 1:14 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-05-07 1:21 ` Derrick Stolee
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-12 18:28 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 10/17] odb: get rid of `the_repository` when handling the primary backend Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 11/17] odb: get rid of `the_repository` when handling submodule backends Patrick Steinhardt
2025-05-07 1:25 ` Derrick Stolee
2025-05-07 1:29 ` Derrick Stolee
2025-05-06 11:09 ` [PATCH 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-05-07 1:25 ` Derrick Stolee
2025-05-07 16:38 ` Junio C Hamano
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-05-06 11:09 ` [PATCH 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-05-07 1:40 ` [PATCH 00/17] object-store: carve out the object database subsystem Derrick Stolee
2025-05-07 17:02 ` Junio C Hamano
2025-05-09 11:25 ` Patrick Steinhardt [this message]
2025-05-09 11:25 ` Patrick Steinhardt
2025-05-13 19:28 ` Toon Claes
2025-05-14 4:31 ` Patrick Steinhardt
2025-05-07 23:22 ` Junio C Hamano
2025-05-09 14:12 ` [PATCH v2 " Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-05-13 19:27 ` Toon Claes
2025-05-09 14:12 ` [PATCH v2 02/17] object-store: rename `object_directory` to `odb_alternate` Patrick Steinhardt
2025-05-13 19:28 ` Toon Claes
2025-05-14 4:31 ` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-05-13 19:28 ` Toon Claes
2025-05-14 4:31 ` Patrick Steinhardt
2025-05-14 12:58 ` Junio C Hamano
2025-05-09 14:12 ` [PATCH v2 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 10/17] odb: get rid of `the_repository` when handling the primary alternate Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 11/17] odb: get rid of `the_repository` when handling submodule alternates Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-05-09 14:12 ` [PATCH v2 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-05-09 21:43 ` [PATCH v2 00/17] object-store: carve out the object database subsystem Junio C Hamano
2025-05-12 18:33 ` Derrick Stolee
2025-05-14 5:12 ` [PATCH v3 " Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-05-22 21:59 ` Justin Tobler
2025-05-14 5:12 ` [PATCH v3 02/17] object-store: rename `object_directory` to `odb_alternate` Patrick Steinhardt
2025-05-22 22:13 ` Justin Tobler
2025-05-26 5:45 ` Patrick Steinhardt
2025-05-27 16:45 ` Justin Tobler
2025-05-28 13:18 ` Toon Claes
2025-05-30 9:39 ` Patrick Steinhardt
2025-05-30 9:39 ` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 10/17] odb: get rid of `the_repository` when handling the primary alternate Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 11/17] odb: get rid of `the_repository` when handling submodule alternates Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-05-14 5:12 ` [PATCH v3 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-05-14 14:48 ` [PATCH v3 00/17] object-store: carve out the object database subsystem Toon Claes
2025-05-15 8:22 ` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 " Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-06-04 8:55 ` Toon Claes
2025-06-04 11:52 ` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 02/17] object-store: rename `object_directory` to `odb_source` Patrick Steinhardt
2025-06-04 13:24 ` Toon Claes
2025-06-04 13:55 ` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 10/17] odb: get rid of `the_repository` when handling the primary source Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 11/17] odb: get rid of `the_repository` when handling submodule sources Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-06-02 10:27 ` [PATCH v4 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-06-02 15:38 ` [PATCH v4 00/17] object-store: carve out the object database subsystem Junio C Hamano
2025-06-05 6:46 ` [PATCH v5 " Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 02/17] object-store: rename `object_directory` to `odb_source` Patrick Steinhardt
2025-06-30 2:02 ` Justin Tobler
2025-07-01 12:17 ` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-06-30 2:34 ` Justin Tobler
2025-07-01 12:17 ` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-06-30 2:36 ` Justin Tobler
2025-06-05 6:46 ` [PATCH v5 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-06-30 2:56 ` Justin Tobler
2025-07-01 12:18 ` Patrick Steinhardt
2025-06-05 6:46 ` [PATCH v5 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 10/17] odb: get rid of `the_repository` when handling the primary source Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 11/17] odb: get rid of `the_repository` when handling submodule sources Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-06-05 6:47 ` [PATCH v5 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-06-30 3:14 ` Justin Tobler
2025-06-05 6:47 ` [PATCH v5 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-06-30 3:15 ` Justin Tobler
2025-07-01 12:22 ` [PATCH v6 00/17] object-store: carve out the object database subsystem Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 01/17] object-store: rename `raw_object_store` to `object_database` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 02/17] object-store: rename `object_directory` to `odb_source` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 03/17] object-store: rename files to "odb.{c,h}" Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 04/17] odb: introduce parent pointers Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 05/17] odb: get rid of `the_repository` in `find_odb()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 06/17] odb: get rid of `the_repository` in `assert_oid_type()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 07/17] odb: get rid of `the_repository` in `odb_mkstemp()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 08/17] odb: get rid of `the_repository` when handling alternates Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 09/17] odb: get rid of `the_repository` in `for_each()` functions Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 10/17] odb: get rid of `the_repository` when handling the primary source Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 11/17] odb: get rid of `the_repository` when handling submodule sources Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 12/17] odb: trivial refactorings to get rid of `the_repository` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 13/17] odb: rename `oid_object_info()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 14/17] odb: rename `repo_read_object_file()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 15/17] odb: rename `has_object()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 16/17] odb: rename `pretend_object_file()` Patrick Steinhardt
2025-07-01 12:22 ` [PATCH v6 17/17] odb: rename `read_object_with_reference()` Patrick Steinhardt
2025-07-01 14:26 ` [PATCH v6 00/17] object-store: carve out the object database subsystem Justin Tobler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aB3mNiiAGE2KHbGY@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.