From: Patrick Steinhardt <ps@pks.im>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 00/17] object-store: carve out the object database subsystem
Date: Fri, 9 May 2025 13:25:38 +0200
Message-ID: <aB3mMtoxCcaOFn0W@pks.im>
In-Reply-To: <5bea19fe-6616-4f01-a78d-9b7da94db899@gmail.com>

On Tue, May 06, 2025 at 09:40:16PM -0400, Derrick Stolee wrote:
> On 5/6/25 7:09 AM, Patrick Steinhardt wrote:
> 
> > this patch series refactors the object store subsystem to become more
> > self-contained by getting rid of `the_repository`. Instead of passing in
> > the repository explicitly, we start to pass in the object store itself,
> > which is in contrast to many other refactorings we did, but in line with
> > what we did for the ref store, as well.
> > 
> > This series also starts to properly scope functions to the carved out
> > object database subsystem, which requires a bit of shuffling. This
> > allows us to have a short-and-sweet `odb_` prefix for functions and
> > prepares us for a future with pluggable object backends.
> > 
> > The series is structured as follows:
> > 
> >    - Patches 1 to 3 rename `struct object_store` and `struct
> >      object_directory` as well as the code files.
> 
> Patches 1 and 2 involve renaming some core structures, and I had
> some questions around these names (since we hope to be stuck with
> the new names for a long time). I was thinking out loud on a per-
> patch basis, but now want to collect my thoughts around these:
> 
>  * raw_object_store currently describes the abstraction that contains
>    all objects that can be accessed within the repository. This may
>    include multiple alternates. Patch 1 renames this to
>    'object_database'.
> 
>  * object_directory currently describes a single directory that
>    has the same structure as $GIT_DIR/objects/ but may be an alternate
>    or a submodule object directory. Patch 2 renames this to
>    'odb_backend'.
> 
> My concerns around this are basically around not liking "backend" for
> this purpose. When I think of a backend, I'm thinking about the
> implementation details (like the refs backend being files or reftable)
> and not multiple distinct locations that have their own objects.

That is very much the intent eventually. Right now the backend is always
the one that uses loose objects and packfiles. But eventually, the goal
is to introduce different backends.

But regardless of that, ...

> In this sense, I'm partial to being brief for the most-common structure
> that will be passed around and then more descriptive about the smaller
> pieces:
> 
>  * 'struct raw_object_store' could be renamed to 'struct odb' to match
>    its use in all of the new odb_*() methods. This represents the
>    "object database abstraction" and consumers don't care if this is
>    one or many object directories in a trench coat.

    NB: I think having a long name here is nicer, even if it's
    abbreviated in the functions. But that's mostly my own preference, I
    don't care too much. I'll keep this as-is in the next iteration, but
    if you feel strongly I'm certainly happy to rename it to `struct
    odb`. Just give me a ping and I'll do so.

>  * 'struct object_directory' could be renamed to 'struct odb_shard' or
>    'struct odb_slice' or similar. I may even recommend 'odb_partition'
>    though that does imply some disjointness that is not guaranteed (an
>    object can exist in multiple parts).
> 
>  * In the event that we create multiple implementations for storing
>    objects, then a 'struct odb_shard' could point to a backend to help
>    find the appropriate methods for interacting with its storage.

... I have decided to rename this to `odb_alternate`. I don't think
"shard" works well, as shard is an extremely generic term that doesn't
really convey much meaning.

On the other hand, I think that `odb_alternate` is quite a good fit. We
already use it all over the place to mean almost exactly what we are
after here. And it doesn't seem far-fetched to have an
`odb_packed_alternate`, `odb_loose_alternate` and `odb_redis_alternate`
for different backends.

The only stretch is that the primary object directory is now the primary
alternate. I think that this is acceptable though.
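
To make the naming concrete, here is a rough sketch of how the two
structures could relate. This is not the code from the series: the
field names as well as `odb_add_alternate()` and
`odb_count_alternates()` are made up for illustration, and the only
assumption carried over from the discussion is that the first entry in
the list is the primary alternate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one location that holds objects (primary or alternate). */
struct odb_alternate {
	struct odb_alternate *next;
	const char *path;	/* e.g. "$GIT_DIR/objects" */
	int is_primary;		/* 1 for the repository's own object directory */
};

/* Hypothetical: the whole object database, i.e. all alternates together. */
struct object_database {
	struct odb_alternate *alternates;	/* head is the primary alternate */
};

/* Append an alternate; the first one added becomes the primary. */
static void odb_add_alternate(struct object_database *odb,
			      struct odb_alternate *alt, const char *path)
{
	struct odb_alternate **tail;

	assert(odb && alt && path);
	alt->next = NULL;
	alt->path = path;
	alt->is_primary = (odb->alternates == NULL);
	for (tail = &odb->alternates; *tail; tail = &(*tail)->next)
		; /* walk to the end of the list */
	*tail = alt;
}

/* Count all alternates, including the primary one. */
static size_t odb_count_alternates(const struct object_database *odb)
{
	const struct odb_alternate *alt;
	size_t n = 0;

	assert(odb);
	for (alt = odb->alternates; alt; alt = alt->next)
		n++;
	return n;
}
```

Callers would only ever hold a `struct object_database` and would not
need to care how many alternates sit behind it.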

>  * "alternate refs" are locked in as names based on the following
>    config key names:
> 
>    - core.alternateRefsCommand
>    - core.alternateRefsPrefix
> 
>    These user-facing names should not change. This may be valuable to
>    make sure that the 'odb_shard's still have a state of "I'm an
>    alternate" or "I'm the base read/write shard for this repo".

Agreed.

> >    - Patches 4 to 12 refactor "odb.c" to get rid of `the_repository`.
> 
> These are carefully done. Thanks. I only have a few nitpicks here
> and there.
> 
> >    - Patches 13 to 17 adjust the name of remaining functions so that they
> >      can be clearly attributed to the ODB. I'm happy to kick these
> >      patches out of this series and resend them at a later point in case
> >      they create too much turmoil.
> 
> I like that these are present, especially because you included
> compatibility macros for in-flight topics.
> 
> I do mention that the rename of the object-store.[c|h] files may be
> unnecessary, or perhaps could be delayed until this series is merged
> and the collateral is calmed.
> 
> ---
> 
> This was clearly a lot of work to put together. Thanks for working
> hard to thoughtfully rename things while refactoring to reduce our
> dependence on global state.

Well. Frankly, the hard work is only just starting. Next step: push down
`struct packed_git` from `struct object_database` to `odb_alternate`.
I'm scared what I'll find there.

Patrick

