public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, Karthik Nayak <karthik.188@gmail.com>,
	Justin Tobler <jltobler@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 12/14] builtin/pack-objects: use `packfile_store_for_each_object()`
Date: Fri, 30 Jan 2026 13:57:45 +0100	[thread overview]
Message-ID: <aXyqPhWDNeKJv2re@pks.im> (raw)
In-Reply-To: <20260129110839.GA1285720@coredump.intra.peff.net>

On Thu, Jan 29, 2026 at 06:08:39AM -0500, Jeff King wrote:
> On Mon, Jan 26, 2026 at 09:53:18AM +0100, Patrick Steinhardt wrote:
> 
> > > Yes, the end result is the same, both your patch and what I wrote here
> > > implement the same GC-specific definition of an object's "mtime". I am
> > > not following the argument about pluggability, though. The concern I
> > > have above is that we are pushing domain-specific logic into the object
> > > storage backend, not the other way around.
> > 
> > To expand on the pluggability bit: every time you add a new backend
> > you'll have to extend the above logic to understand how it represents
> > the mtime. That by itself might be doable, but let's for example
> > consider a backend that is a black box to us (like a shared library that
> > may plug in arbitrary storage logic). In that case you would not even be
> > able to derive the information unless you have a generic layer that lets
> > you convey it to the caller.
> > 
> > So overall I agree with you that there are nuances here, and that the
> > mtimep pointer _can_ be used incorrectly. But I still think that the
> > concept is generic enough across backends, and the refactored logic
> > still works as extended. I'll try to expand the docs and commit message
> > a bit to cover this discussion.
> 
> There's a related concept that I saw while reading some of the earlier
> patches. When you converted fsck, I wondered how you would handle the
> call to read_loose_object(), which takes an actual path. And it needs to
> do so, because we want to make sure we are opening and reading that
> particular copy of the object, and not one from elsewhere.
> 
> The answer is that you punted on it for this series, and we still get
> the path via for_each_loose_file_in_source(). ;) That is OK, but I think
> it will eventually run into the same issue: we will need some kind of
> cursor or context for the iterator to be able to get extended
> information about a particular copy of an object.
> 
> I think there are probably two approaches here:
> 
>   1. The abstract odb API tries to share as little as possible. It gives
>      the caller back an opaque context struct, and that struct can be
>      handed back to the odb to get object contents or other information
>      (perhaps even an mtime!). Under the hood for the current odb
>      implementation this is probably just a pointer to a string with the
>      filesystem path for loose objects, and the usual packed_git/offset
>      pair for packed objects.
> 
>   2. The odb API provides a set of information that a particular backend
>      _might_ implement, and callers can poke at that information and
>      decide how to handle it when it's not available. And so that might
>      include a filesystem path for loose objects, which some backends
>      may choose to leave NULL.
> 
> Option (1) presents a cleaner API for the odb, but it's also more
> restrictive. Anything that a caller _might_ want to do has to be pushed
> down into the API, and it has to start learning about things like
> mtimes. And how to decide what "mtime" means for non-filesystem
> backends.
> 
> Option (2) pushes more work onto the callers. They need to not only look
> up the mtimes themselves (like they do now), but they have to decide how
> to handle the case when no path is available. Which in the worst case
> means a special case for each type of backend, though I think in
> practice they'd probably fall into rough groups.
> 
> I think one thing that appeals to me about option 2, though, is that it
> keeps a lot of the specialized "business logic" together in those
> callers. Most code doesn't are about concepts like mtime or specific
> copies of objects. But when it does, like in repack or fsck, there are
> often subtle assumptions and interpretations. I'd rather see all of that
> lumped together in the fsck code than have it split half-and-half
> between them and the odb code (which is really going to be some backends
> idea of how its concepts can be shoe-horned into the abstract API).

Yup. One thing that I'm planning to do in one of the subsequent patch
series is to expand `struct object_info` to handle this.

Right now, the sturcture contains a `whence` pointer that tells us which
backend the information is stored in. But that concept can be extended
to surface more info: instead of only telling the caller the type, we
can instead return the actual source that the object has been looked up
in.

Furthermore, the `struct object_info::u` union already contains enough
information for us to uniquely identify a specific option. So what we
would do then is to call `odb_source_read_object_info()` on the specific
source and pass it the union.

The loose source wouldn't have to do anything in that case, as the
location of the object is deterministic and there can only be one copy.
But the packed source would inspect `u.packed.pack` and thus know which
specific object we refer to.

This still hinges on a couple intermediate steps, but I think with this
plan we should be able to handle this issue in a way where the caller
doesn't need to know _anything_ about how exactly the ODB source itself
works.

Thanks!

Patrick

  reply	other threads:[~2026-01-30 12:58 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-15 11:04 [PATCH 00/14] odb: introduce `odb_for_each_object()` Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 01/14] odb: rename `FOR_EACH_OBJECT_*` flags Patrick Steinhardt
2026-01-15 18:00   ` Justin Tobler
2026-01-15 11:04 ` [PATCH 02/14] odb: fix flags parameter to be unsigned Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 03/14] object-file: extract function to read object info from path Patrick Steinhardt
2026-01-15 18:31   ` Justin Tobler
2026-01-16  7:03     ` Patrick Steinhardt
2026-01-20  9:09   ` Karthik Nayak
2026-01-15 11:04 ` [PATCH 04/14] object-file: introduce function to iterate through objects Patrick Steinhardt
2026-01-15 20:54   ` Justin Tobler
2026-01-16  7:03     ` Patrick Steinhardt
2026-01-20  9:16   ` Karthik Nayak
2026-01-15 11:04 ` [PATCH 05/14] packfile: extract function to iterate through objects of a store Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 06/14] packfile: introduce function to iterate through objects Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 07/14] odb: introduce `odb_for_each_object()` Patrick Steinhardt
2026-01-15 21:17   ` Justin Tobler
2026-01-16  7:03     ` Patrick Steinhardt
2026-01-16 17:46   ` Justin Tobler
2026-01-19  7:10     ` Patrick Steinhardt
2026-01-20  9:20   ` Karthik Nayak
2026-01-21  7:39     ` Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 08/14] builtin/fsck: refactor to use `odb_for_each_object()` Patrick Steinhardt
2026-01-15 21:24   ` Justin Tobler
2026-01-15 11:04 ` [PATCH 09/14] treewide: enumerate promisor objects via `odb_for_each_object()` Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 10/14] treewide: drop uses of `for_each_{loose,packed}_object()` Patrick Steinhardt
2026-01-15 21:44   ` Justin Tobler
2026-01-16  7:03     ` Patrick Steinhardt
2026-01-16 17:47       ` Justin Tobler
2026-01-19  7:10         ` Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 11/14] odb: introduce mtime fields for object info requests Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 12/14] builtin/pack-objects: use `packfile_store_for_each_object()` Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 13/14] reachable: convert to use `odb_for_each_object()` Patrick Steinhardt
2026-01-15 11:04 ` [PATCH 14/14] odb: drop unused `for_each_{loose,packed}_object()` functions Patrick Steinhardt
2026-01-15 13:50 ` [PATCH 00/14] odb: introduce `odb_for_each_object()` Junio C Hamano
2026-01-16  7:03   ` Patrick Steinhardt
2026-01-16 16:49     ` Junio C Hamano
2026-01-20 15:25 ` [PATCH v2 " Patrick Steinhardt
2026-01-20 15:25   ` [PATCH v2 01/14] odb: rename `FOR_EACH_OBJECT_*` flags Patrick Steinhardt
2026-01-20 15:25   ` [PATCH v2 02/14] odb: fix flags parameter to be unsigned Patrick Steinhardt
2026-01-20 15:25   ` [PATCH v2 03/14] object-file: extract function to read object info from path Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 04/14] object-file: introduce function to iterate through objects Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 05/14] packfile: extract function to iterate through objects of a store Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 06/14] packfile: introduce function to iterate through objects Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 07/14] odb: introduce `odb_for_each_object()` Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 08/14] builtin/fsck: refactor to use `odb_for_each_object()` Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 09/14] treewide: enumerate promisor objects via `odb_for_each_object()` Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 10/14] treewide: drop uses of `for_each_{loose,packed}_object()` Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 11/14] odb: introduce mtime fields for object info requests Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 12/14] builtin/pack-objects: use `packfile_store_for_each_object()` Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 13/14] reachable: convert to use `odb_for_each_object()` Patrick Steinhardt
2026-01-20 15:26   ` [PATCH v2 14/14] odb: drop unused `for_each_{loose,packed}_object()` functions Patrick Steinhardt
2026-01-21 12:50 ` [PATCH v3 00/14] odb: introduce `odb_for_each_object()` Patrick Steinhardt
2026-01-21 12:50   ` [PATCH v3 01/14] odb: rename `FOR_EACH_OBJECT_*` flags Patrick Steinhardt
2026-01-21 12:50   ` [PATCH v3 02/14] odb: fix flags parameter to be unsigned Patrick Steinhardt
2026-01-21 21:11     ` Jeff King
2026-01-22  0:00       ` Taylor Blau
2026-01-22 15:41         ` Junio C Hamano
2026-01-22 19:23           ` Jeff King
2026-01-23 10:57             ` Patrick Steinhardt
2026-01-26 22:32             ` Junio C Hamano
2026-01-22  6:50       ` Patrick Steinhardt
2026-01-22 23:44         ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 03/14] object-file: extract function to read object info from path Patrick Steinhardt
2026-01-22  0:04     ` Taylor Blau
2026-01-22  6:51       ` Patrick Steinhardt
2026-01-22 23:47         ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 04/14] object-file: introduce function to iterate through objects Patrick Steinhardt
2026-01-22  0:15     ` Taylor Blau
2026-01-22  6:52       ` Patrick Steinhardt
2026-01-23  0:01         ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 05/14] packfile: extract function to iterate through objects of a store Patrick Steinhardt
2026-01-22  1:37     ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 06/14] packfile: introduce function to iterate through objects Patrick Steinhardt
2026-01-23  0:06     ` Taylor Blau
2026-01-23  9:42       ` Patrick Steinhardt
2026-01-23  9:52         ` Chris Torek
2026-01-23 16:22           ` Junio C Hamano
2026-01-23 17:45             ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 07/14] odb: introduce `odb_for_each_object()` Patrick Steinhardt
2026-01-23  0:13     ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 08/14] builtin/fsck: refactor to use `odb_for_each_object()` Patrick Steinhardt
2026-01-23  0:32     ` Taylor Blau
2026-01-23  9:42       ` Patrick Steinhardt
2026-01-21 12:50   ` [PATCH v3 09/14] treewide: enumerate promisor objects via `odb_for_each_object()` Patrick Steinhardt
2026-01-23  0:33     ` Taylor Blau
2026-01-21 12:50   ` [PATCH v3 10/14] treewide: drop uses of `for_each_{loose,packed}_object()` Patrick Steinhardt
2026-01-23  0:46     ` Taylor Blau
2026-01-23  9:43       ` Patrick Steinhardt
2026-01-21 12:50   ` [PATCH v3 11/14] odb: introduce mtime fields for object info requests Patrick Steinhardt
2026-01-23  1:06     ` Taylor Blau
2026-01-23  9:43       ` Patrick Steinhardt
2026-01-23 17:48         ` Taylor Blau
2026-01-26  8:53           ` Patrick Steinhardt
2026-01-21 12:50   ` [PATCH v3 12/14] builtin/pack-objects: use `packfile_store_for_each_object()` Patrick Steinhardt
2026-01-23  1:21     ` Taylor Blau
2026-01-23  9:43       ` Patrick Steinhardt
2026-01-23 18:35         ` Taylor Blau
2026-01-26  8:53           ` Patrick Steinhardt
2026-01-29 11:08             ` Jeff King
2026-01-30 12:57               ` Patrick Steinhardt [this message]
2026-01-21 12:50   ` [PATCH v3 13/14] reachable: convert to use `odb_for_each_object()` Patrick Steinhardt
2026-01-21 12:50   ` [PATCH v3 14/14] odb: drop unused `for_each_{loose,packed}_object()` functions Patrick Steinhardt
2026-01-22  1:33   ` [PATCH v3 00/14] odb: introduce `odb_for_each_object()` Taylor Blau
2026-01-22 17:02     ` Junio C Hamano
2026-01-26  9:51 ` [PATCH v4 " Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 01/14] odb: rename `FOR_EACH_OBJECT_*` flags Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 02/14] odb: fix flags parameter to be unsigned Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 03/14] object-file: extract function to read object info from path Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 04/14] object-file: introduce function to iterate through objects Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 05/14] packfile: extract function to iterate through objects of a store Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 06/14] packfile: introduce function to iterate through objects Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 07/14] odb: introduce `odb_for_each_object()` Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 08/14] builtin/fsck: refactor to use `odb_for_each_object()` Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 09/14] treewide: enumerate promisor objects via `odb_for_each_object()` Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 10/14] treewide: drop uses of `for_each_{loose,packed}_object()` Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 11/14] odb: introduce mtime fields for object info requests Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 12/14] builtin/pack-objects: use `packfile_store_for_each_object()` Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 13/14] reachable: convert to use `odb_for_each_object()` Patrick Steinhardt
2026-01-26  9:51   ` [PATCH v4 14/14] odb: drop unused `for_each_{loose,packed}_object()` functions Patrick Steinhardt
2026-02-20 22:59   ` [PATCH v4 00/14] odb: introduce `odb_for_each_object()` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXyqPhWDNeKJv2re@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jltobler@gmail.com \
    --cc=karthik.188@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox