git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/8] midx: start tracking per object database source
Date: Tue, 15 Jul 2025 10:26:23 +0200	[thread overview]
Message-ID: <aHYQr1O9beg64aOW@pks.im> (raw)
In-Reply-To: <aHBISjW/OErOy3Cp@nand.local>

On Thu, Jul 10, 2025 at 07:10:02PM -0400, Taylor Blau wrote:
> On Wed, Jul 09, 2025 at 09:54:49AM +0200, Patrick Steinhardt wrote:
> > Multi-pack indices are tracked via `struct multi_pack_index`. This data
> > structure is stored as a linked list inside `struct object_database`,
> > which is the global database that spans across all of the object
> > sources.
> >
> > This layout causes two problems:
> >
> >   - Multi-pack indices aren't global to an object database, but instead
> >     there can be one multi-pack index per source. This creates a
> >     mismatch between the on-disk layout and how things are organized in
> >     the object database subsystems and makes some parts, like figuring
> >     out whether a source has an MIDX, quite awkward.
> 
> This is a little confusing to me. What do we consider to be an "object
> database", and what do we consider to be a "source"?
> 
> For instance, if I have a repository with one or more alternates, I
> would imagine that each alternate is a separate "source", and the
> sources together comprise the object database. Does that match the way
> you're thinking about it?
> 
> If so, that makes sense. But if not (i.e., we consider all alternates to
> belong to the same object database and share a single source), then I
> don't know how this will interact with the existing MIDX alternates
> mechanism.
> 
> Some clarification here would be helpful, I think.

I'll add some, sure. This whole infra is currently developing, so it
doesn't hurt to reiterate some of the points that have already landed.

> > diff --git a/midx.c b/midx.c
> > index 3c5bc821730..a91231bfcdf 100644
> > --- a/midx.c
> > +++ b/midx.c
> > @@ -724,28 +724,29 @@ int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id)
> >  	return 0;
> >  }
> >
> > -int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local)
> > +int prepare_multi_pack_index_one(struct odb_source *source, int local)
> 
> OK, so the combination of a repository and "object_dir" path becomes a
> new "struct odb_source", which makes sense. But shouldn't "local" also
> be a property of the odb_source?
> 
> I am not familiar enough with the odb_source work thus far to know
> whether that is a good idea or not, but just something that stood out
> to me while reading.

Yes, you're right on point. I have a follow-up patch series that'll
eventually get rid of some duplicate information that is now present in
the ODB source connected to an MIDX.

[snip]
> > +		source->multi_pack_index = m;
> 
> As an aside, I see that we're calling this new field "multi_pack_index".
> I wonder if it would be better to call it "midx", since typing out
> "multi_pack_index" is a little verbose, and "midx" is a common
> abbreviation that I don't think we would cause any confusion with.

Sure, I can do that.

> > diff --git a/midx.h b/midx.h
> > index 9d1374cbd58..b1626a9a7c4 100644
> > --- a/midx.h
> > +++ b/midx.h
> > @@ -3,11 +3,12 @@
> >
> >  #include "string-list.h"
> >
> > +struct bitmapped_pack;
> > +struct git_hash_algo;
> >  struct object_id;
> > +struct odb_source;
> >  struct pack_entry;
> >  struct repository;
> > -struct bitmapped_pack;
> > -struct git_hash_algo;
> 
> (I'm nit-picking, but a similar note here that the unrelated changes
> make it harder to see what is actually going on in this hunk, which is
> adding a declaration for "struct odb_source".)
> 
> > diff --git a/odb.h b/odb.h
> > index e922f256802..8e79c7be520 100644
> > --- a/odb.h
> > +++ b/odb.h
> > @@ -9,10 +

  parent reply	other threads:[~2025-07-15  8:26 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-09  7:54 [PATCH 0/8] odb: track multi-pack-indices via their object sources Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 1/8] midx: start tracking per object database source Patrick Steinhardt
2025-07-10 20:56   ` Justin Tobler
2025-07-10 23:10   ` Taylor Blau
2025-07-10 23:19     ` Taylor Blau
2025-07-15  8:26       ` Patrick Steinhardt
2025-07-15  8:26     ` Patrick Steinhardt [this message]
2025-07-09  7:54 ` [PATCH 2/8] packfile: refactor `prepare_packed_git_one()` to work on sources Patrick Steinhardt
2025-07-10 21:07   ` Justin Tobler
2025-07-10 23:14   ` Taylor Blau
2025-07-15  8:26     ` Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 3/8] midx: stop using linked list when closing MIDX Patrick Steinhardt
2025-07-10 21:31   ` Justin Tobler
2025-07-15  8:26     ` Patrick Steinhardt
2025-07-10 23:22   ` Taylor Blau
2025-07-15  8:26     ` Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 4/8] midx: track whether we have loaded the MIDX Patrick Steinhardt
2025-07-10 22:16   ` Justin Tobler
2025-07-10 23:26     ` Taylor Blau
2025-07-15  8:27       ` Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 5/8] packfile: refactor `get_multi_pack_index()` to work on sources Patrick Steinhardt
2025-07-10 22:35   ` Justin Tobler
2025-07-10 23:56   ` Taylor Blau
2025-07-15  8:27     ` Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 6/8] packfile: stop using linked MIDX list in `find_pack_entry()` Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 7/8] packfile: stop using linked MIDX list in `get_all_packs()` Patrick Steinhardt
2025-07-09  7:54 ` [PATCH 8/8] midx: remove now-unused linked list of multi-pack indices Patrick Steinhardt
2025-07-10 22:48   ` Justin Tobler
2025-07-09 22:04 ` [PATCH 0/8] odb: track multi-pack-indices via their object sources Junio C Hamano
2025-07-10 23:58   ` Taylor Blau
2025-07-15  8:27     ` Patrick Steinhardt
2025-07-15 11:29 ` [PATCH v2 0/7] " Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 1/7] midx: start tracking per object database source Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 2/7] packfile: refactor `prepare_packed_git_one()` to work on sources Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 3/7] midx: stop using linked list when closing MIDX Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 4/7] packfile: refactor `get_multi_pack_index()` to work on sources Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 5/7] packfile: stop using linked MIDX list in `find_pack_entry()` Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 6/7] packfile: stop using linked MIDX list in `get_all_packs()` Patrick Steinhardt
2025-07-15 11:29   ` [PATCH v2 7/7] midx: remove now-unused linked list of multi-pack indices Patrick Steinhardt
2025-07-15 21:59   ` [PATCH v2 0/7] odb: track multi-pack-indices via their object sources Justin Tobler
2025-07-23 21:22   ` Junio C Hamano
2025-07-24  8:00     ` Patrick Steinhardt
2025-08-12 21:58       ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aHYQr1O9beg64aOW@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).