From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH 2/8] packfile: move the MRU list into the packfile store
Date: Wed, 29 Oct 2025 18:39:32 -0400 [thread overview]
Message-ID: <aQKXpM8g3Oy3DVAa@nand.local> (raw)
In-Reply-To: <20251028-pks-packfiles-store-drop-list-v1-2-1a3b82030a7a@pks.im>
On Tue, Oct 28, 2025 at 12:08:32PM +0100, Patrick Steinhardt wrote:
> Packfiles have two lists associated to them:
>
> - A list that keeps track of packfiles in the order that they were
> added to a packfile store.
>
> - A list that keeps track of packfiles in most-recently-used order so
> that packfiles that are more likely to contain a specific object are
> ordered towards the front.
>
> Both of these lists are hosted by `struct packed_git` itself, So to
> identify all packfiles in a repository you simply need to grab the first
> packfile and then iterate the `->next` pointers or the MRU list. This
> pattern has the problem that all packfiles are part of the same list,
> regardless of whether or not they belong to the same object source.
>
> With the upcoming pluggable object database effort this needs to change:
> packfiles should be contained by a single object source, and reading an
> object from any such packfile should use that source to look up the
> object. Consequently, we need to break up the global lists of packfiles
s/lists/list/
> into per-object-source lists.
How does this work for alternates? My understanding is that each
alternate now has its own object source. So to perform an object lookup
in a repository with alternate(s), I am assuming that at some layer we
need to iterate over those sources to then enumerate the packs in that
source looking for some object.
I would have imagined that packfile.c::find_pack_entry() would have to
be adjusted in a similar way as above, but I couldn't find the changes
in this series, so I feel like I must be missing something in my
understanding of how this all works together :-).
Are packs from different sources still connected somehow such that
iterating over the list of packs from one source will enumerate the list
of packs from all sources?
> A first step towards this goal is to move those lists ouf of `struct
s/ouf/out/
> packed_git` and into the packfile store. While the packfile store is
> currently sitting on the `struct object_database` level, the intent is
> to push it down one level into the `struct odb_source` in a subsequent
> patch series.
Before sending, I was confused by "Consequently, we need to break up the
global lists of packfiles [...]", since it wasn't clear whether or not
this series realizes that goal, or pushes us in the direction towards
it.
But this clarifies things, and I think is the reason that we do not see
more invasive changes like needing to enumerate the MRU cache of each
store in order to find an object like I mentioned above.
> Introduce a new `struct packfile_list` that is used to manage lists of
> packfiles and use it to store the list of most-recently-used packfiles
> in `struct packfile_store`. For now, the new list type is only used in a
> single spot, but we'll expand its usage in subsequent patches.
I am a little curious why we need a new list type and implementation
here. Is it to avoid exposing the list as part of struct packed_git like
we are forced to do with list_head?
I could imagine that you might want to avoid exposing the "struct
list_head mru" part of packed_git to avoid the suggestion that all
packfiles (including those from different sources) are part of the same
list. But if that's the case, I wonder if we couldn't have kept the same
mru list and clarified via comment that it is per-store, not global.
I suppose that is a bit of a foot-gun, and perhaps that is what you are
trying to do here, but after reading the patch message a few times I
wasn't clear on what the motivation for the new type was.
Thanks,
Taylor
next prev parent reply other threads:[~2025-10-29 22:39 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-28 11:08 [PATCH 0/8] packfiles: track pack lists via the packfile store Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 1/8] packfile: use a `strmap` to store packs by name Patrick Steinhardt
2025-10-29 22:16 ` Taylor Blau
2025-10-28 11:08 ` [PATCH 2/8] packfile: move the MRU list into the packfile store Patrick Steinhardt
2025-10-29 22:39 ` Taylor Blau [this message]
2025-10-30 8:59 ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 3/8] http: refactor subsystem to use `packfile_list`s Patrick Steinhardt
2025-10-29 14:24 ` Toon Claes
2025-10-30 8:58 ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 4/8] packfile: fix approximation of object counts Patrick Steinhardt
2025-10-29 22:49 ` Taylor Blau
2025-10-30 8:58 ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects Patrick Steinhardt
2025-10-29 14:55 ` Toon Claes
2025-10-29 23:15 ` Taylor Blau
2025-10-30 8:59 ` Patrick Steinhardt
2025-10-29 23:13 ` Taylor Blau
2025-10-30 8:58 ` Patrick Steinhardt
2025-10-30 9:31 ` Toon Claes
2025-10-30 9:52 ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 6/8] packfile: move list of packs into the packfile store Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 7/8] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
2025-10-29 23:25 ` Taylor Blau
2025-10-30 8:58 ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 8/8] packfile: track packs via the MRU list exclusively Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 0/8] packfiles: track pack lists via the packfile store Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 1/8] packfile: use a `strmap` to store packs by name Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 2/8] packfile: move the MRU list into the packfile store Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 3/8] http: refactor subsystem to use `packfile_list`s Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 4/8] packfile: fix approximation of object counts Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 6/8] packfile: move list of packs into the packfile store Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 7/8] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 8/8] packfile: track packs via the MRU list exclusively Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aQKXpM8g3Oy3DVAa@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.