From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee <stolee@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 00/17] object-store: carve out the object database subsystem
Date: Fri, 9 May 2025 13:25:42 +0200
Message-ID: <aB3mNiiAGE2KHbGY@pks.im>
In-Reply-To: <xmqqa57oe4mj.fsf@gitster.g>

On Wed, May 07, 2025 at 10:02:12AM -0700, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> > Patches 1 and 2 involve renaming some core structures, and I had
> > some questions around these names (since we hope to be stuck with
> > the new names for a long time). I was thinking out loud on a per-
> > patch basis, but now want to collect my thoughts around these:
> >
> >  * raw_object_store currently describes the abstraction that contains
> >    all objects that can be accessed within the repository. This may
> >    include multiple alternates. Patch 1 renames this to
> >    'object_database'.
> >
> >  * object_directory currently describes a single directory that
> >    has the same structure as $GIT_DIR/objects/ but may be an alternate
> >    or a submodule object directory. Patch 2 renames this to
> >    'odb_backend'.
> >
> > My concerns around this are basically around not liking "backend" for
> > this purpose. When I think of a backend, I'm thinking about the
> > implementation details (like the refs backend being files or reftable)
> > and not multiple distinct locations that have their own objects.
> 
> Yup, odb_backend_files (aot odb_backend_redis) or something?

Yeah, that was indeed my vision. I think it works equally well if we
name this `odb_alternate`, though. The benefit of the "alternate"
terminology is that we already use it and it's almost a perfect fit, and
it gives the reader a hint that we may have multiple alternates. On the
other hand, `odb_backend` sounds as if there would only be a single
backend for a `struct object_database`.

So Stolee caused me to reconsider and favor `odb_alternate`. But in the
end I guess that both names would work alright.
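
To illustrate the naming question, here is a minimal sketch of how the
two structures could relate: one `struct object_database` that chains
together several `struct odb_alternate` entries. The field names and the
three helper functions are hypothetical and heavily simplified; they are
not the definitions from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified types; not Git's actual definitions. */
struct odb_alternate {
	const char *path;           /* location of this object directory */
	struct odb_alternate *next; /* next alternate to consult, or NULL */
};

struct object_database {
	/* Assumption: the first entry is the repository's own object directory. */
	struct odb_alternate *alternates;
};

/* Append `alt` to the end of the database's list of alternates. */
static void odb_append_alternate(struct object_database *db,
				 struct odb_alternate *alt)
{
	struct odb_alternate **tail;

	assert(db && alt);
	tail = &db->alternates;
	while (*tail)
		tail = &(*tail)->next;
	alt->next = NULL;
	*tail = alt;
}

/* Count how many alternates the database currently links together. */
static size_t odb_count_alternates(const struct object_database *db)
{
	const struct odb_alternate *alt;
	size_t n = 0;

	assert(db);
	for (alt = db->alternates; alt; alt = alt->next)
		n++;
	return n;
}

/* Return the alternate registered under `path`, or NULL if there is none. */
static struct odb_alternate *odb_find_alternate(const struct object_database *db,
						const char *path)
{
	struct odb_alternate *alt;

	assert(db && path);
	for (alt = db->alternates; alt; alt = alt->next)
		if (!strcmp(alt->path, path))
			return alt;
	return NULL;
}
```

The point of the sketch is only that the plural relationship (one
database, many alternates) is visible in the type names themselves.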

> >  * 'struct object_directory' could be renamed to 'struct odb_shard' or
> >    'struct odb_slice' or similar. I may even recommend 'odb_partition'
> >    though that does imply some disjointness that is not guaranteed (an
> >    object can exist in multiple parts).
> >
> >  * In the event that we create multiple implementations for storing
> >    objects, then a 'struct odb_shard' could point to a backend to help
> >    find the appropriate methods for interacting with its storage.
> 
> Hmph, I do not have strong opinions, but I consider it an
> implementation detail of one particular backend, namely, the
> filesystem based backend, that it can link together multiple
> object_directory instances and present them as if they form a single
> object database, just like all files within a single object_directory
> form an illusion of a single object database (aka key-value store) even
> though some objects are stored in individual loose object files while
> many others are packed in a single packfile.
> 
> I did not expect you would want to go to the world where a single
> "shard" consists of an object_directory backed by the filesystem and
> some other more database-y backend.  It is an interesting idea, but
> we'd need to worry about many things we do not have to worry about
> right now.  E.g. what do the precedence rules among different
> components within a single "shard" look like?  How do we express "in
> this repository, local filesystem-backed piece is consulted first,
> and then check this piece backed by low-cost but high-latency
> storage backend"?

Well, in fact I want to design this from the start so that you can mix
and match different backends. I think it falls out naturally from the
design if an alternate can be backed by anything, and it enables a lot
of very interesting features.

Furthermore, it would cause a bunch of problems if we _didn't_ allow for
this, at least for hosting providers:

  - Migrations would need to be atomic across fork networks: all forks
    would need to be migrated at once so that we don't mix backends.

  - Migrations in general would be a pain if we had to do an atomic
    migration even for a single object directory. With mixed backends we
    can already make a partially-migrated backend available while the
    old backend is still in use.

  - High-latency storage backends may work well for binary files, but
    not for smallish text files, so a single backend for everything
    would be a poor fit.
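
As a rough sketch of what "mix and match" could mean, each alternate
below points at its own table of backend operations, and a lookup
consults the alternates in list order. Everything here is an assumption
made for illustration: the `odb_backend_ops` table, the toy in-memory
storage and `odb_find_object()` are hypothetical and do not reflect the
eventual design, which is yet to be proposed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct odb_alternate;

/* Hypothetical per-backend operations table. */
struct odb_backend_ops {
	const char *name;
	/* Returns 1 if `alt` holds the object named `oid_hex`, else 0. */
	int (*has_object)(const struct odb_alternate *alt, const char *oid_hex);
};

struct odb_alternate {
	const struct odb_backend_ops *ops; /* how this alternate is stored */
	const char *const *objects;        /* toy storage: NULL-terminated IDs */
	struct odb_alternate *next;        /* next alternate to consult */
};

struct object_database {
	struct odb_alternate *alternates;
};

/* Toy lookup shared by both example backends: scan the ID list. */
static int toy_has_object(const struct odb_alternate *alt, const char *oid_hex)
{
	size_t i;

	for (i = 0; alt->objects[i]; i++)
		if (!strcmp(alt->objects[i], oid_hex))
			return 1;
	return 0;
}

/* Two stand-in backends; real ones would differ in their callbacks. */
static const struct odb_backend_ops files_ops = { "files", toy_has_object };
static const struct odb_backend_ops remote_ops = { "remote", toy_has_object };

/*
 * Consult alternates in list order and return the first one that has
 * the object, or NULL. List order doubles as the precedence rule here.
 */
static const struct odb_alternate *odb_find_object(const struct object_database *db,
						   const char *oid_hex)
{
	const struct odb_alternate *alt;

	assert(db && oid_hex);
	for (alt = db->alternates; alt; alt = alt->next)
		if (alt->ops->has_object(alt, oid_hex))
			return alt;
	return NULL;
}
```

With a local "files" alternate listed ahead of a "remote" one, an object
present in both is served from the local alternate, which is also what a
partially migrated setup would look like in this toy model.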

All of this still needs to be hashed out, of course. I do want to send
an RFC document to the mailing list soonish, probably in the first half
of the Git 2.51 release cycle, so that we can discuss where to go.

> > I do mention that the rename of the object-store.[c|h] files may be
> > unnecessary, or perhaps could be delayed until this series is merged
> > and the collateral is calmed.
> 
> Right now, merge-fix needed against all other topics in flight look
> like this, in order to merge it to 'seen'.

Okay. In that case I'll keep that patch for now.

Patrick
