* [PATCH 00/16] packfile: carve out a new packfile store
@ 2025-08-19 8:19 Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
` (19 more replies)
0 siblings, 20 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
Hi,
information about an object database's packfiles is currently distributed
across two different structures:
- `struct packed_git` contains the `next` pointer as well as the
`mru_head`, both of which serve to store the list of packfiles.
- `struct object_database` contains several fields that relate to the
packfiles.
So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.
This patch series introduces a new `struct packfile_store`, which is
about to become the single source of truth for managing packfiles, and
carves out the packfile store subsystem.
This is the first step to make packfiles work with pluggable object
databases. Next steps will be to:
- Move the `struct packed_git::next` and `struct packed_git::mru_head`
pointers into the packfile store so that `struct packed_git` only
tracks a single packfile.
- Push the `struct packfile_store` down one level so that it's not
hosted by the object database anymore, but instead by the object
database source.
Thanks!
Patrick
---
Patrick Steinhardt (16):
packfile: introduce a new `struct packfile_store`
odb: move list of packfiles into `struct packfile_store`
odb: move initialization bit into `struct packfile_store`
odb: move packfile map into `struct packfile_store`
odb: move MRU list of packfiles into `struct packfile_store`
odb: move kept cache into `struct packfile_store`
packfile: reorder functions to avoid function declaration
packfile: refactor `prepare_packed_git()` to work on packfile store
packfile: split up responsibilities of `reprepare_packed_git()`
packfile: refactor `install_packed_git()` to work on packfile store
packfile: always add packfiles to MRU when adding a pack
packfile: introduce function to load and add packfiles
packfile: move `get_multi_pack_index()` into "midx.c"
packfile: remove `get_packed_git()`
packfile: refactor `get_all_packs()` to work on packfile store
packfile: refactor `get_packed_git_mru()` to work on packfile store
builtin/backfill.c | 2 +-
builtin/cat-file.c | 2 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 8 +-
builtin/fsck.c | 8 +-
builtin/gc.c | 12 +-
builtin/grep.c | 2 +-
builtin/index-pack.c | 10 +-
builtin/pack-objects.c | 22 ++--
builtin/pack-redundant.c | 4 +-
builtin/receive-pack.c | 2 +-
builtin/repack.c | 8 +-
bulk-checkin.c | 2 +-
connected.c | 4 +-
fetch-pack.c | 4 +-
http-backend.c | 4 +-
http.c | 4 +-
http.h | 2 +-
midx.c | 26 ++--
midx.h | 2 +
object-name.c | 6 +-
odb.c | 37 ++++--
odb.h | 34 ++---
pack-bitmap.c | 4 +-
pack-objects.c | 2 +-
packfile.c | 293 ++++++++++++++++++++++++--------------------
packfile.h | 111 ++++++++++++++---
server-info.c | 2 +-
t/helper/test-find-pack.c | 2 +-
t/helper/test-pack-mtimes.c | 2 +-
transport-helper.c | 2 +-
31 files changed, 354 insertions(+), 271 deletions(-)
---
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
change-id: 20250806-b4-pks-packfiles-store-a44a608ca396
* [PATCH 01/16] packfile: introduce a new `struct packfile_store`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 9:47 ` Karthik Nayak
2025-08-19 17:32 ` Junio C Hamano
2025-08-19 8:19 ` [PATCH 02/16] odb: move list of packfiles into " Patrick Steinhardt
` (18 subsequent siblings)
19 siblings, 2 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
Information about an object database's packfiles is currently distributed
across two different structures:
- `struct packed_git` contains the `next` pointer as well as the
`mru_head`, both of which serve to store the list of packfiles.
- `struct object_database` contains several fields that relate to the
packfiles.
So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.
Introduce a new `struct packfile_store` which is about to become the
single source of truth for managing packfiles. Right now this data
structure doesn't yet contain anything, but in subsequent patches we
will move all data structures that relate to packfiles and that are
currently contained in `struct object_database` into this new home.
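As a rough illustration of the ownership relation described here, and not code from this series: the database owns the store, and the store keeps a back-pointer to its database. The `toy_*` names are hypothetical stand-ins for `struct object_database` and `struct packfile_store`, and plain `calloc()` stands in for Git's `CALLOC_ARRAY()`.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_store;

/* Hypothetical stand-in for `struct object_database`. */
struct toy_odb {
	struct toy_store *packfiles; /* owned by the database */
};

/* Hypothetical stand-in for `struct packfile_store`. */
struct toy_store {
	struct toy_odb *odb; /* back-pointer to the owner, not owned */
};

/* Zero-allocate the store and record which database it belongs to. */
static struct toy_store *toy_store_new(struct toy_odb *odb)
{
	struct toy_store *store = calloc(1, sizeof(*store));
	if (!store)
		abort();
	store->odb = odb;
	return store;
}

static void toy_store_free(struct toy_store *store)
{
	free(store);
}

/* The database creates its store as part of its own construction. */
static struct toy_odb *toy_odb_new(void)
{
	struct toy_odb *o = calloc(1, sizeof(*o));
	if (!o)
		abort();
	o->packfiles = toy_store_new(o);
	return o;
}

/* The database releases the store before releasing itself. */
static void toy_odb_free(struct toy_odb *o)
{
	toy_store_free(o->packfiles);
	free(o);
}
```

Because the store only holds a pointer back to its owner, re-parenting it later (for example under a per-source structure) would mostly mean changing the type of that one back-pointer.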
Note that this is only a first step: most importantly, we won't (yet)
move the `struct packed_git::next` pointer around. This will happen in a
subsequent patch series though so that `struct packed_git` will really
only host information about the specific packfile it represents.
Further note that the new structure still sits at the wrong level at the
end of this patch series: as mentioned, it should eventually sit at the
level of the object database source, not at the object database level.
But introducing the packfile store now already makes it way easier to
eventually push down the now self-contained data structure by one level.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.c | 1 +
odb.h | 2 ++
packfile.c | 13 +++++++++++++
packfile.h | 18 ++++++++++++++++++
4 files changed, 34 insertions(+)
diff --git a/odb.c b/odb.c
index 2a92a018c4..34b70d0074 100644
--- a/odb.c
+++ b/odb.c
@@ -996,6 +996,7 @@ struct object_database *odb_new(struct repository *repo)
memset(o, 0, sizeof(*o));
o->repo = repo;
+ o->packfiles = packfile_store_new(o);
INIT_LIST_HEAD(&o->packed_git_mru);
hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
pthread_mutex_init(&o->replace_mutex, NULL);
diff --git a/odb.h b/odb.h
index 3dfc66d75a..026ba9386d 100644
--- a/odb.h
+++ b/odb.h
@@ -83,6 +83,7 @@ struct odb_source {
};
struct packed_git;
+struct packfile_store;
struct cached_object_entry;
/*
@@ -128,6 +129,7 @@ struct object_database {
*
* should only be accessed directly by packfile.c
*/
+ struct packfile_store *packfiles;
struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
diff --git a/packfile.c b/packfile.c
index 5d73932f50..8fbf1cfc2d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2333,3 +2333,16 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
*len = hdr - out;
return 0;
}
+
+struct packfile_store *packfile_store_new(struct object_database *odb)
+{
+ struct packfile_store *store;
+ CALLOC_ARRAY(store, 1);
+ store->odb = odb;
+ return store;
+}
+
+void packfile_store_free(struct packfile_store *store)
+{
+ free(store);
+}
diff --git a/packfile.h b/packfile.h
index f16753f2a9..8d31fd619a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -52,6 +52,24 @@ struct packed_git {
char pack_name[FLEX_ARRAY]; /* more */
};
+/*
+ * A store that manages packfiles for a given object database.
+ */
+struct packfile_store {
+ struct object_database *odb;
+};
+
+/*
+ * Allocate and initialize a new empty packfile store for the given object
+ * database.
+ */
+struct packfile_store *packfile_store_new(struct object_database *odb);
+
+/*
+ * Free the packfile store and all its associated state.
+ */
+void packfile_store_free(struct packfile_store *store);
+
static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
const struct hashmap_entry *entry,
const struct hashmap_entry *entry2,
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 02/16] odb: move list of packfiles into `struct packfile_store`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 03/16] odb: move initialization bit " Patrick Steinhardt
` (17 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The object database tracks the list of packfiles it currently knows
about. With the introduction of the `struct packfile_store` we have a
better place to host this list though.
Move the list accordingly. Extract the logic from `odb_clear()` that
knows to close all such packfiles and move it into the new subsystem, as
well.
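The split between closing and freeing can be sketched in isolation as follows. This is a simplified model under assumed names (`toy_pack`, `toy_store`, and an integer `fd` standing in for everything `close_pack()` releases), not the code from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for `struct packed_git`. */
struct toy_pack {
	struct toy_pack *next;
	int fd; /* -1 once the pack has been closed */
};

/* Hypothetical stand-in for `struct packfile_store`. */
struct toy_store {
	struct toy_pack *packs;
};

/* Prepend a pack, so the newest one sits at the head of the list. */
static void toy_store_add(struct toy_store *store, int fd)
{
	struct toy_pack *p = calloc(1, sizeof(*p));
	if (!p)
		abort();
	p->fd = fd;
	p->next = store->packs;
	store->packs = p;
}

/* Release per-pack resources but keep the list nodes around. */
static void toy_store_close(struct toy_store *store)
{
	for (struct toy_pack *p = store->packs; p; p = p->next)
		p->fd = -1;
}

/*
 * Close everything, then free every node. `next` is read before the
 * node is freed so the walk never touches released memory. Returns
 * the number of nodes that were freed.
 */
static size_t toy_store_clear(struct toy_store *store)
{
	size_t freed = 0;

	toy_store_close(store);
	for (struct toy_pack *p = store->packs, *next; p; p = next) {
		next = p->next;
		free(p);
		freed++;
	}
	store->packs = NULL;
	return freed;
}
```

Keeping "close" separate from "free" means a caller can drop file descriptors and mappings without invalidating pointers that other code may still hold to the list nodes.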
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.c | 11 +----------
odb.h | 1 -
packfile.c | 47 ++++++++++++++++++++++++++++++-----------------
packfile.h | 16 +++++++++++++++-
4 files changed, 46 insertions(+), 29 deletions(-)
diff --git a/odb.c b/odb.c
index 34b70d0074..17a9135cbd 100644
--- a/odb.c
+++ b/odb.c
@@ -1038,16 +1038,7 @@ void odb_clear(struct object_database *o)
INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
-
- /*
- * `close_object_store()` only closes the packfiles, but doesn't free
- * them. We thus have to do this manually.
- */
- for (struct packed_git *p = o->packed_git, *next; p; p = next) {
- next = p->next;
- free(p);
- }
- o->packed_git = NULL;
+ packfile_store_free(o->packfiles);
hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
diff --git a/odb.h b/odb.h
index 026ba9386d..273ad0ceaa 100644
--- a/odb.h
+++ b/odb.h
@@ -131,7 +131,6 @@ struct object_database {
*/
struct packfile_store *packfiles;
- struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
struct list_head packed_git_mru;
diff --git a/packfile.c b/packfile.c
index 8fbf1cfc2d..6478e4cc30 100644
--- a/packfile.c
+++ b/packfile.c
@@ -278,7 +278,7 @@ static int unuse_one_window(struct packed_git *current)
if (current)
scan_windows(current, &lru_p, &lru_w, &lru_l);
- for (p = current->repo->objects->packed_git; p; p = p->next)
+ for (p = current->repo->objects->packfiles->packs; p; p = p->next)
scan_windows(p, &lru_p, &lru_w, &lru_l);
if (lru_p) {
munmap(lru_w->base, lru_w->len);
@@ -362,13 +362,8 @@ void close_pack(struct packed_git *p)
void close_object_store(struct object_database *o)
{
struct odb_source *source;
- struct packed_git *p;
- for (p = o->packed_git; p; p = p->next)
- if (p->do_not_close)
- BUG("want to close pack marked 'do-not-close'");
- else
- close_pack(p);
+ packfile_store_close(o->packfiles);
for (source = o->sources; source; source = source->next) {
if (source->midx)
@@ -468,7 +463,7 @@ static int close_one_pack(struct repository *r)
struct pack_window *mru_w = NULL;
int accept_windows_inuse = 1;
- for (p = r->objects->packed_git; p; p = p->next) {
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
if (p->pack_fd == -1)
continue;
find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
@@ -789,8 +784,8 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
if (pack->pack_fd != -1)
pack_open_fds++;
- pack->next = r->objects->packed_git;
- r->objects->packed_git = pack;
+ pack->next = r->objects->packfiles->packs;
+ r->objects->packfiles->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
@@ -974,7 +969,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
count += m->num_objects;
}
- for (p = r->objects->packed_git; p; p = p->next) {
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
if (open_pack_index(p))
continue;
count += p->num_objects;
@@ -1015,7 +1010,7 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
static void rearrange_packed_git(struct repository *r)
{
- sort_packs(&r->objects->packed_git, sort_pack);
+ sort_packs(&r->objects->packfiles->packs, sort_pack);
}
static void prepare_packed_git_mru(struct repository *r)
@@ -1024,7 +1019,7 @@ static void prepare_packed_git_mru(struct repository *r)
INIT_LIST_HEAD(&r->objects->packed_git_mru);
- for (p = r->objects->packed_git; p; p = p->next)
+ for (p = r->objects->packfiles->packs; p; p = p->next)
list_add_tail(&p->mru, &r->objects->packed_git_mru);
}
@@ -1074,7 +1069,7 @@ void reprepare_packed_git(struct repository *r)
struct packed_git *get_packed_git(struct repository *r)
{
prepare_packed_git(r);
- return r->objects->packed_git;
+ return r->objects->packfiles->packs;
}
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
@@ -1095,7 +1090,7 @@ struct packed_git *get_all_packs(struct repository *r)
prepare_midx_pack(r, m, i);
}
- return r->objects->packed_git;
+ return r->objects->packfiles->packs;
}
struct list_head *get_packed_git_mru(struct repository *r)
@@ -1220,7 +1215,7 @@ const struct packed_git *has_packed_and_bad(struct repository *r,
{
struct packed_git *p;
- for (p = r->objects->packed_git; p; p = p->next)
+ for (p = r->objects->packfiles->packs; p; p = p->next)
if (oidset_contains(&p->bad_objects, oid))
return p;
return NULL;
@@ -2081,7 +2076,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
if (source->midx && fill_midx_entry(r, oid, e, source->midx))
return 1;
- if (!r->objects->packed_git)
+ if (!r->objects->packfiles->packs)
return 0;
list_for_each(pos, &r->objects->packed_git_mru) {
@@ -2344,5 +2339,23 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
void packfile_store_free(struct packfile_store *store)
{
+ packfile_store_close(store);
+
+ for (struct packed_git *p = store->packs, *next; p; p = next) {
+ next = p->next;
+ free(p);
+ }
+
free(store);
}
+
+void packfile_store_close(struct packfile_store *store)
+{
+ struct packed_git *p;
+
+ for (p = store->packs; p; p = p->next)
+ if (p->do_not_close)
+ BUG("want to close pack marked 'do-not-close'");
+ else
+ close_pack(p);
+}
diff --git a/packfile.h b/packfile.h
index 8d31fd619a..1404b80917 100644
--- a/packfile.h
+++ b/packfile.h
@@ -57,6 +57,13 @@ struct packed_git {
*/
struct packfile_store {
struct object_database *odb;
+
+ /*
+ * The list of packfiles in the order in which they are being added to
+ * the store. The local packfile typically sits at the head of this
+ * list.
+ */
+ struct packed_git *packs;
};
/*
@@ -66,10 +73,17 @@ struct packfile_store {
struct packfile_store *packfile_store_new(struct object_database *odb);
/*
- * Free the packfile store and all its associated state.
+ * Free the packfile store and all its associated state. All packfiles
+ * tracked by the store will be closed.
*/
void packfile_store_free(struct packfile_store *store);
+/*
+ * Close all packfiles associated with this store. The packfiles won't be
+ * free'd, so they can be re-opened at a later point in time.
+ */
+void packfile_store_close(struct packfile_store *store);
+
static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
const struct hashmap_entry *entry,
const struct hashmap_entry *entry2,
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 02/16] odb: move list of packfiles into " Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 9:57 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 04/16] odb: move packfile map " Patrick Steinhardt
` (16 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The object database knows to skip re-initializing the list of packfiles
in case it's already been initialized. Whether or not that is the case
is tracked via a separate `initialized` bit that is stored in the object
database. With the introduction of the `struct packfile_store` we have a
better place to host this bit though.
Move it accordingly.
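For readers unfamiliar with the pattern, here is a minimal sketch of how such a bit guards lazy initialization and how a "reprepare" resets it. The names are hypothetical, and the `scans` counter merely stands in for the expensive directory scan.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant part of `struct packfile_store`. */
struct toy_store {
	int scans;                /* how often the pretend scan has run */
	unsigned initialized : 1; /* set once the pack list is populated */
};

/* Populate the store at most once until the bit is cleared again. */
static void toy_prepare(struct toy_store *store)
{
	if (store->initialized)
		return;
	store->scans++; /* stands in for scanning the pack directories */
	store->initialized = 1;
}

/* Force a rescan by clearing the bit before preparing again. */
static void toy_reprepare(struct toy_store *store)
{
	store->initialized = 0;
	toy_prepare(store);
}
```

Since the bit and the data it guards now live in the same structure, a second store would get its own independent "already populated" state.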
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.h | 6 ------
packfile.c | 6 +++---
packfile.h | 6 ++++++
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/odb.h b/odb.h
index 273ad0ceaa..970919403d 100644
--- a/odb.h
+++ b/odb.h
@@ -162,12 +162,6 @@ struct object_database {
unsigned long approximate_object_count;
unsigned approximate_object_count_valid : 1;
- /*
- * Whether packed_git has already been populated with this repository's
- * packs.
- */
- unsigned packed_git_initialized : 1;
-
/*
* Submodule source paths that will be added as additional sources to
* allow lookup of submodule objects via the main object database.
diff --git a/packfile.c b/packfile.c
index 6478e4cc30..4e5f84eb09 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,7 +1027,7 @@ static void prepare_packed_git(struct repository *r)
{
struct odb_source *source;
- if (r->objects->packed_git_initialized)
+ if (r->objects->packfiles->initialized)
return;
odb_prepare_alternates(r->objects);
@@ -1039,7 +1039,7 @@ static void prepare_packed_git(struct repository *r)
rearrange_packed_git(r);
prepare_packed_git_mru(r);
- r->objects->packed_git_initialized = 1;
+ r->objects->packfiles->initialized = 1;
}
void reprepare_packed_git(struct repository *r)
@@ -1061,7 +1061,7 @@ void reprepare_packed_git(struct repository *r)
odb_clear_loose_cache(source);
r->objects->approximate_object_count_valid = 0;
- r->objects->packed_git_initialized = 0;
+ r->objects->packfiles->initialized = 0;
prepare_packed_git(r);
obj_read_unlock();
}
diff --git a/packfile.h b/packfile.h
index 1404b80917..573564b19e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,12 @@ struct packfile_store {
* list.
*/
struct packed_git *packs;
+
+ /*
+ * Whether packfiles have already been populated with this store's
+ * packs.
+ */
+ unsigned initialized : 1;
};
/*
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 04/16] odb: move packfile map into `struct packfile_store`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (2 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 03/16] odb: move initialization bit " Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
` (15 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The object database tracks a map of packfiles by their respective paths,
which is used to figure out whether a given packfile has already been
loaded. With the introduction of the `struct packfile_store` we have a
better place to host this map though.
Move the map accordingly. `pack_map_entry_cmp()` isn't used anywhere but
in "packfile.c" anymore after this change, so we convert it to a static
function, as well.
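The "have we loaded this pack already?" check can be modelled with a small name-keyed table. The sketch below is an assumption-laden toy: it uses a fixed-size chained hash table and a djb2-style hash in place of Git's `hashmap` and `strhash()`, and all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUCKETS 16

/* Hypothetical stand-in for `struct packed_git`, keyed by its name. */
struct toy_pack {
	struct toy_pack *chain; /* next entry in the same bucket */
	char *name;
};

struct toy_map {
	struct toy_pack *buckets[TOY_BUCKETS];
};

/* Simple string hash for this sketch only. */
static unsigned toy_hash(const char *s)
{
	unsigned h = 5381;
	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/* Look a pack up by name; returns NULL when it is not known yet. */
static struct toy_pack *toy_map_get(struct toy_map *map, const char *name)
{
	struct toy_pack *p = map->buckets[toy_hash(name) % TOY_BUCKETS];

	for (; p; p = p->chain)
		if (!strcmp(p->name, name))
			return p;
	return NULL;
}

/* Returns 1 if the pack was newly registered, 0 if it was known already. */
static int toy_load_pack(struct toy_map *map, const char *name)
{
	unsigned slot = toy_hash(name) % TOY_BUCKETS;
	size_t len = strlen(name) + 1;
	struct toy_pack *p;

	if (toy_map_get(map, name))
		return 0; /* don't reopen a pack we already have */

	p = calloc(1, sizeof(*p));
	if (!p)
		abort();
	p->name = malloc(len);
	if (!p->name)
		abort();
	memcpy(p->name, name, len);
	p->chain = map->buckets[slot];
	map->buckets[slot] = p;
	return 1;
}

/* Free all entries and leave the map empty. */
static void toy_map_clear(struct toy_map *map)
{
	for (size_t i = 0; i < TOY_BUCKETS; i++) {
		struct toy_pack *p = map->buckets[i], *next;

		for (; p; p = next) {
			next = p->chain;
			free(p->name);
			free(p);
		}
		map->buckets[i] = NULL;
	}
}
```

In the real code the map entries are embedded in `struct packed_git` itself, so the map does not own the packs; the toy version owns its entries only to stay self-contained.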
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 2 +-
odb.c | 2 --
odb.h | 6 ------
packfile.c | 20 ++++++++++++++++++--
packfile.h | 20 ++++++--------------
5 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/midx.c b/midx.c
index 7d407682e6..7f3f74ef2b 100644
--- a/midx.c
+++ b/midx.c
@@ -471,7 +471,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
strbuf_addbuf(&key, &pack_name);
strbuf_strip_suffix(&key, ".idx");
strbuf_addstr(&key, ".pack");
- p = hashmap_get_entry_from_hash(&r->objects->pack_map,
+ p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
strhash(key.buf), key.buf,
struct packed_git, packmap_ent);
if (!p) {
diff --git a/odb.c b/odb.c
index 17a9135cbd..568c820ef8 100644
--- a/odb.c
+++ b/odb.c
@@ -998,7 +998,6 @@ struct object_database *odb_new(struct repository *repo)
o->repo = repo;
o->packfiles = packfile_store_new(o);
INIT_LIST_HEAD(&o->packed_git_mru);
- hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
pthread_mutex_init(&o->replace_mutex, NULL);
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ -1040,6 +1039,5 @@ void odb_clear(struct object_database *o)
close_object_store(o);
packfile_store_free(o->packfiles);
- hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
}
diff --git a/odb.h b/odb.h
index 970919403d..99c1ba7b77 100644
--- a/odb.h
+++ b/odb.h
@@ -148,12 +148,6 @@ struct object_database {
struct cached_object_entry *cached_objects;
size_t cached_object_nr, cached_object_alloc;
- /*
- * A map of packfiles to packed_git structs for tracking which
- * packs have been loaded already.
- */
- struct hashmap pack_map;
-
/*
* A fast, rough count of the number of objects in the repository.
* These two fields are not meant for direct access. Use
diff --git a/packfile.c b/packfile.c
index 4e5f84eb09..6582b0a479 100644
--- a/packfile.c
+++ b/packfile.c
@@ -788,7 +788,7 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
r->objects->packfiles->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
- hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
+ hashmap_add(&r->objects->packfiles->map, &pack->packmap_ent);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -901,7 +901,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
hashmap_entry_init(&hent, hash);
/* Don't reopen a pack we already have. */
- if (!hashmap_get(&data->r->objects->pack_map, &hent, pack_name)) {
+ if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
p = add_packed_git(data->r, full_name, full_name_len, data->local);
if (p)
install_packed_git(data->r, p);
@@ -2329,11 +2329,26 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
return 0;
}
+static int pack_map_entry_cmp(const void *cmp_data UNUSED,
+ const struct hashmap_entry *entry,
+ const struct hashmap_entry *entry2,
+ const void *keydata)
+{
+ const char *key = keydata;
+ const struct packed_git *pg1, *pg2;
+
+ pg1 = container_of(entry, const struct packed_git, packmap_ent);
+ pg2 = container_of(entry2, const struct packed_git, packmap_ent);
+
+ return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
+}
+
struct packfile_store *packfile_store_new(struct object_database *odb)
{
struct packfile_store *store;
CALLOC_ARRAY(store, 1);
store->odb = odb;
+ hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
return store;
}
@@ -2346,6 +2361,7 @@ void packfile_store_free(struct packfile_store *store)
free(p);
}
+ hashmap_clear(&store->map);
free(store);
}
diff --git a/packfile.h b/packfile.h
index 573564b19e..2f84d7d7e6 100644
--- a/packfile.h
+++ b/packfile.h
@@ -65,6 +65,12 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /*
+ * A map of packfile names to packed_git structs for tracking which
+ * packs have been loaded already.
+ */
+ struct hashmap map;
+
/*
* Whether packfiles have already been populated with this store's
* packs.
@@ -90,20 +96,6 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
-static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *entry,
- const struct hashmap_entry *entry2,
- const void *keydata)
-{
- const char *key = keydata;
- const struct packed_git *pg1, *pg2;
-
- pg1 = container_of(entry, const struct packed_git, packmap_ent);
- pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
- return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 05/16] odb: move MRU list of packfiles into `struct packfile_store`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (3 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 04/16] odb: move packfile map " Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-20 12:44 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 06/16] odb: move kept cache " Patrick Steinhardt
` (14 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The object database tracks the list of packfiles in most-recently-used
order, which is mostly used to favor reading from packfiles that contain
most of the objects that we're currently accessing. With the
introduction of the `struct packfile_store` we have a better place to
host this list though.
Move the list accordingly.
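To illustrate why an MRU ordering helps, the following sketch walks a circular doubly-linked list and moves the entry that produced a hit to the front, so the next lookup tries it first. It is a toy model with hypothetical names: `toy_list` imitates the shape of a `list_head`, and matching on `id` stands in for "this pack contains the object".

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list node. */
struct toy_list {
	struct toy_list *next, *prev;
};

/* Hypothetical stand-in for `struct packed_git`. */
struct toy_pack {
	int id;
	struct toy_list mru;
};

/* Recover the containing pack from its embedded list node. */
#define toy_pack_of(ptr) \
	((struct toy_pack *)((char *)(ptr) - offsetof(struct toy_pack, mru)))

static void toy_list_init(struct toy_list *head)
{
	head->next = head->prev = head;
}

static void toy_list_del(struct toy_list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void toy_list_add_front(struct toy_list *e, struct toy_list *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

static void toy_list_add_tail(struct toy_list *e, struct toy_list *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Scan in MRU order; on a hit, move the pack to the front of the list. */
static struct toy_pack *toy_find(struct toy_list *head, int id)
{
	for (struct toy_list *pos = head->next; pos != head; pos = pos->next) {
		struct toy_pack *p = toy_pack_of(pos);

		if (p->id == id) {
			toy_list_del(pos);
			toy_list_add_front(pos, head);
			return p;
		}
	}
	return NULL;
}
```

Lookups tend to cluster in a few packs, so moving the last hit to the front keeps the common case close to a single probe.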
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 2 +-
odb.c | 2 --
odb.h | 4 ----
packfile.c | 11 ++++++-----
packfile.h | 3 +++
5 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/midx.c b/midx.c
index 7f3f74ef2b..7fa2b8473a 100644
--- a/midx.c
+++ b/midx.c
@@ -478,7 +478,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
if (p) {
install_packed_git(r, p);
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
}
diff --git a/odb.c b/odb.c
index 568c820ef8..80ec6fc1fa 100644
--- a/odb.c
+++ b/odb.c
@@ -997,7 +997,6 @@ struct object_database *odb_new(struct repository *repo)
memset(o, 0, sizeof(*o));
o->repo = repo;
o->packfiles = packfile_store_new(o);
- INIT_LIST_HEAD(&o->packed_git_mru);
pthread_mutex_init(&o->replace_mutex, NULL);
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ -1035,7 +1034,6 @@ void odb_clear(struct object_database *o)
free((char *) o->cached_objects[i].value.buf);
FREE_AND_NULL(o->cached_objects);
- INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
packfile_store_free(o->packfiles);
diff --git a/odb.h b/odb.h
index 99c1ba7b77..2dc3bdc79d 100644
--- a/odb.h
+++ b/odb.h
@@ -3,7 +3,6 @@
#include "hashmap.h"
#include "object.h"
-#include "list.h"
#include "oidset.h"
#include "oidmap.h"
#include "string-list.h"
@@ -131,9 +130,6 @@ struct object_database {
*/
struct packfile_store *packfiles;
- /* A most-recently-used ordered version of the packed_git list. */
- struct list_head packed_git_mru;
-
struct {
struct packed_git **packs;
unsigned flags;
diff --git a/packfile.c b/packfile.c
index 6582b0a479..f82856c19e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1017,10 +1017,10 @@ static void prepare_packed_git_mru(struct repository *r)
{
struct packed_git *p;
- INIT_LIST_HEAD(&r->objects->packed_git_mru);
+ INIT_LIST_HEAD(&r->objects->packfiles->mru);
for (p = r->objects->packfiles->packs; p; p = p->next)
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
static void prepare_packed_git(struct repository *r)
@@ -1096,7 +1096,7 @@ struct packed_git *get_all_packs(struct repository *r)
struct list_head *get_packed_git_mru(struct repository *r)
{
prepare_packed_git(r);
- return &r->objects->packed_git_mru;
+ return &r->objects->packfiles->mru;
}
unsigned long unpack_object_header_buffer(const unsigned char *buf,
@@ -2079,10 +2079,10 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
if (!r->objects->packfiles->packs)
return 0;
- list_for_each(pos, &r->objects->packed_git_mru) {
+ list_for_each(pos, &r->objects->packfiles->mru) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) {
- list_move(&p->mru, &r->objects->packed_git_mru);
+ list_move(&p->mru, &r->objects->packfiles->mru);
return 1;
}
}
@@ -2348,6 +2348,7 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
struct packfile_store *store;
CALLOC_ARRAY(store, 1);
store->odb = odb;
+ INIT_LIST_HEAD(&store->mru);
hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
return store;
}
diff --git a/packfile.h b/packfile.h
index 2f84d7d7e6..3022f3a19e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -65,6 +65,9 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /* A most-recently-used ordered version of the packs list. */
+ struct list_head mru;
+
/*
* A map of packfile names to packed_git structs for tracking which
* packs have been loaded already.
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 06/16] odb: move kept cache into `struct packfile_store`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (4 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 18:56 ` Junio C Hamano
2025-08-19 8:19 ` [PATCH 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
` (13 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The object database tracks a cache of "kept" packfiles, which is used by
git-pack-objects(1) to handle cruft objects. With the introduction of
the `struct packfile_store` we have a better place to host this cache
though.
Move the cache accordingly.
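The cache follows a "recompute only when the flags change" pattern, which can be sketched on its own as below. Names are hypothetical, the `int` array stands in for the NULL-terminated array of kept packs, and `rebuilds` exists only to make the behaviour observable in this toy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the `kept_cache` member of the store. */
struct toy_kept_cache {
	int *packs;     /* stand-in for the cached array of kept packs */
	unsigned flags; /* flags the cached array was computed for */
	int rebuilds;   /* bookkeeping for this sketch only */
};

/* Drop the cached array if it was built for different flags. */
static void toy_maybe_invalidate(struct toy_kept_cache *c, unsigned flags)
{
	if (!c->packs)
		return;
	if (c->flags == flags)
		return;
	free(c->packs);
	c->packs = NULL;
	c->flags = 0;
}

/* Return the cached array, rebuilding it only when necessary. */
static int *toy_kept_packs(struct toy_kept_cache *c, unsigned flags)
{
	toy_maybe_invalidate(c, flags);
	if (!c->packs) {
		/* Stands in for filtering all packs by the given flags. */
		c->packs = calloc(1, sizeof(*c->packs));
		if (!c->packs)
			abort();
		c->flags = flags;
		c->rebuilds++;
	}
	return c->packs;
}
```

Repeated calls with the same flags return the same array, while a call with different flags discards and rebuilds it.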
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.h | 9 +--------
packfile.c | 16 ++++++++--------
packfile.h | 5 +++++
3 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/odb.h b/odb.h
index 2dc3bdc79d..f1736b067c 100644
--- a/odb.h
+++ b/odb.h
@@ -124,17 +124,10 @@ struct object_database {
unsigned commit_graph_attempted : 1; /* if loading has been attempted */
/*
- * private data
- *
- * should only be accessed directly by packfile.c
+ * Should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- struct {
- struct packed_git **packs;
- unsigned flags;
- } kept_pack_cache;
-
/*
* This is meant to hold a *small* number of objects that you would
* want odb_read_object() to be able to return, but yet you do not want
diff --git a/packfile.c b/packfile.c
index f82856c19e..f33445a5ff 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2092,19 +2092,19 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
static void maybe_invalidate_kept_pack_cache(struct repository *r,
unsigned flags)
{
- if (!r->objects->kept_pack_cache.packs)
+ if (!r->objects->packfiles->kept_cache.packs)
return;
- if (r->objects->kept_pack_cache.flags == flags)
+ if (r->objects->packfiles->kept_cache.flags == flags)
return;
- FREE_AND_NULL(r->objects->kept_pack_cache.packs);
- r->objects->kept_pack_cache.flags = 0;
+ FREE_AND_NULL(r->objects->packfiles->kept_cache.packs);
+ r->objects->packfiles->kept_cache.flags = 0;
}
struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
{
maybe_invalidate_kept_pack_cache(r, flags);
- if (!r->objects->kept_pack_cache.packs) {
+ if (!r->objects->packfiles->kept_cache.packs) {
struct packed_git **packs = NULL;
size_t nr = 0, alloc = 0;
struct packed_git *p;
@@ -2127,11 +2127,11 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
ALLOC_GROW(packs, nr + 1, alloc);
packs[nr] = NULL;
- r->objects->kept_pack_cache.packs = packs;
- r->objects->kept_pack_cache.flags = flags;
+ r->objects->packfiles->kept_cache.packs = packs;
+ r->objects->packfiles->kept_cache.flags = flags;
}
- return r->objects->kept_pack_cache.packs;
+ return r->objects->packfiles->kept_cache.packs;
}
int find_kept_pack_entry(struct repository *r,
diff --git a/packfile.h b/packfile.h
index 3022f3a19e..f46ea9ceec 100644
--- a/packfile.h
+++ b/packfile.h
@@ -65,6 +65,11 @@ struct packfile_store {
*/
struct packed_git *packs;
+ struct {
+ struct packed_git **packs;
+ unsigned flags;
+ } kept_cache;
+
/* A most-recently-used ordered version of the packs list. */
struct list_head mru;
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 07/16] packfile: reorder functions to avoid function declaration
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (5 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 06/16] odb: move kept cache " Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 19:18 ` Junio C Hamano
2025-08-19 8:19 ` [PATCH 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
` (12 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
Reorder functions so that we can avoid an extra declaration of
`prepare_packed_git()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
packfile.c | 67 +++++++++++++++++++++++++++++++-------------------------------
1 file changed, 33 insertions(+), 34 deletions(-)
diff --git a/packfile.c b/packfile.c
index f33445a5ff..99f2d20812 100644
--- a/packfile.c
+++ b/packfile.c
@@ -946,40 +946,6 @@ static void prepare_packed_git_one(struct odb_source *source, int local)
string_list_clear(data.garbage, 0);
}
-static void prepare_packed_git(struct repository *r);
-/*
- * Give a fast, rough count of the number of objects in the repository. This
- * ignores loose objects completely. If you have a lot of them, then either
- * you should repack because your performance will be awful, or they are
- * all unreachable objects about to be pruned, in which case they're not really
- * interesting as a measure of repo size in the first place.
- */
-unsigned long repo_approximate_object_count(struct repository *r)
-{
- if (!r->objects->approximate_object_count_valid) {
- struct odb_source *source;
- unsigned long count = 0;
- struct packed_git *p;
-
- prepare_packed_git(r);
-
- for (source = r->objects->sources; source; source = source->next) {
- struct multi_pack_index *m = get_multi_pack_index(source);
- if (m)
- count += m->num_objects;
- }
-
- for (p = r->objects->packfiles->packs; p; p = p->next) {
- if (open_pack_index(p))
- continue;
- count += p->num_objects;
- }
- r->objects->approximate_object_count = count;
- r->objects->approximate_object_count_valid = 1;
- }
- return r->objects->approximate_object_count;
-}
-
DEFINE_LIST_SORT(static, sort_packs, struct packed_git, next);
static int sort_pack(const struct packed_git *a, const struct packed_git *b)
@@ -1099,6 +1065,39 @@ struct list_head *get_packed_git_mru(struct repository *r)
return &r->objects->packfiles->mru;
}
+/*
+ * Give a fast, rough count of the number of objects in the repository. This
+ * ignores loose objects completely. If you have a lot of them, then either
+ * you should repack because your performance will be awful, or they are
+ * all unreachable objects about to be pruned, in which case they're not really
+ * interesting as a measure of repo size in the first place.
+ */
+unsigned long repo_approximate_object_count(struct repository *r)
+{
+ if (!r->objects->approximate_object_count_valid) {
+ struct odb_source *source;
+ unsigned long count = 0;
+ struct packed_git *p;
+
+ prepare_packed_git(r);
+
+ for (source = r->objects->sources; source; source = source->next) {
+ struct multi_pack_index *m = get_multi_pack_index(source);
+ if (m)
+ count += m->num_objects;
+ }
+
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
+ if (open_pack_index(p))
+ continue;
+ count += p->num_objects;
+ }
+ r->objects->approximate_object_count = count;
+ r->objects->approximate_object_count_valid = 1;
+ }
+ return r->objects->approximate_object_count;
+}
+
unsigned long unpack_object_header_buffer(const unsigned char *buf,
unsigned long len, enum object_type *type, unsigned long *sizep)
{
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (6 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
` (11 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The `prepare_packed_git()` function and its friends are responsible for
loading packfiles as well as the multi-pack index for a given object
database. Refactor these functions to accept a packfile store instead of
a repository to clarify their scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
packfile.c | 43 +++++++++++++++++++------------------------
1 file changed, 19 insertions(+), 24 deletions(-)
diff --git a/packfile.c b/packfile.c
index 99f2d20812..58e50d7b30 100644
--- a/packfile.c
+++ b/packfile.c
@@ -974,38 +974,33 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
return -1;
}
-static void rearrange_packed_git(struct repository *r)
-{
- sort_packs(&r->objects->packfiles->packs, sort_pack);
-}
-
-static void prepare_packed_git_mru(struct repository *r)
+static void packfile_store_prepare_mru(struct packfile_store *store)
{
struct packed_git *p;
- INIT_LIST_HEAD(&r->objects->packfiles->mru);
+ INIT_LIST_HEAD(&store->mru);
- for (p = r->objects->packfiles->packs; p; p = p->next)
- list_add_tail(&p->mru, &r->objects->packfiles->mru);
+ for (p = store->packs; p; p = p->next)
+ list_add_tail(&p->mru, &store->mru);
}
-static void prepare_packed_git(struct repository *r)
+static void packfile_store_prepare(struct packfile_store *store)
{
struct odb_source *source;
- if (r->objects->packfiles->initialized)
+ if (store->initialized)
return;
- odb_prepare_alternates(r->objects);
- for (source = r->objects->sources; source; source = source->next) {
- int local = (source == r->objects->sources);
+ odb_prepare_alternates(store->odb);
+ for (source = store->odb->sources; source; source = source->next) {
+ int local = (source == store->odb->sources);
prepare_multi_pack_index_one(source, local);
prepare_packed_git_one(source, local);
}
- rearrange_packed_git(r);
+ sort_packs(&store->packs, sort_pack);
- prepare_packed_git_mru(r);
- r->objects->packfiles->initialized = 1;
+ packfile_store_prepare_mru(store);
+ store->initialized = 1;
}
void reprepare_packed_git(struct repository *r)
@@ -1028,25 +1023,25 @@ void reprepare_packed_git(struct repository *r)
r->objects->approximate_object_count_valid = 0;
r->objects->packfiles->initialized = 0;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
obj_read_unlock();
}
struct packed_git *get_packed_git(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
return r->objects->packfiles->packs;
}
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
{
- prepare_packed_git(source->odb->repo);
+ packfile_store_prepare(source->odb->packfiles);
return source->midx;
}
struct packed_git *get_all_packs(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next) {
struct multi_pack_index *m = source->midx;
@@ -1061,7 +1056,7 @@ struct packed_git *get_all_packs(struct repository *r)
struct list_head *get_packed_git_mru(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
return &r->objects->packfiles->mru;
}
@@ -1079,7 +1074,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
unsigned long count = 0;
struct packed_git *p;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (source = r->objects->sources; source; source = source->next) {
struct multi_pack_index *m = get_multi_pack_index(source);
@@ -2069,7 +2064,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
{
struct list_head *pos;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next)
if (source->midx && fill_midx_entry(r, oid, e, source->midx))
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 09/16] packfile: split up responsibilities of `reprepare_packed_git()`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (7 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-20 13:17 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
` (10 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
In `reprepare_packed_git()` we perform a couple of operations:
- We reload alternate object directories.
- We clear the loose object cache.
- We reprepare packfiles.
While the logic is hosted in "packfile.c", it clearly reaches into other
subsystems that aren't related to packfiles.
Split up the responsibility and introduce `odb_reprepare()` which now
becomes responsible for repreparing the whole object database. The
existing `reprepare_packed_git()` function is refactored accordingly and
only cares about reloading the packfile store now.
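To illustrate the intended layering, here is a minimal, self-contained
sketch. It is not the code from this patch: all `toy_*` names and fields
are hypothetical stand-ins, locking is left out entirely, and reloading
alternates and clearing the loose object cache are reduced to flag
updates.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not Git's real structures. */
struct toy_packfile_store {
	int initialized;
	int scans;	/* how often the pack directory was (re)scanned */
};

struct toy_odb {
	int loaded_alternates;
	int loose_cache_valid;
	int approximate_object_count_valid;
	struct toy_packfile_store packfiles;
};

/* Scan for packs once; later calls are no-ops until re-prepared. */
static void toy_packfile_store_prepare(struct toy_packfile_store *store)
{
	if (store->initialized)
		return;
	store->scans++;
	store->initialized = 1;
}

/* Packfile layer: only forgets its own "initialized" bit and rescans. */
static void toy_packfile_store_reprepare(struct toy_packfile_store *store)
{
	store->initialized = 0;
	toy_packfile_store_prepare(store);
}

/* Object database layer: resets everything above the packfile store. */
static void toy_odb_reprepare(struct toy_odb *odb)
{
	assert(odb);
	odb->loaded_alternates = 1;	/* stands in for reloading alternates */
	odb->loose_cache_valid = 0;	/* stands in for clearing the loose cache */
	odb->approximate_object_count_valid = 0;
	toy_packfile_store_reprepare(&odb->packfiles);
}
```

In this sketch, callers that want new objects to become visible would call
`toy_odb_reprepare()`, while the packfile layer no longer needs to know
about alternates or loose objects.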
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/backfill.c | 2 +-
builtin/gc.c | 4 ++--
builtin/receive-pack.c | 2 +-
builtin/repack.c | 2 +-
bulk-checkin.c | 2 +-
connected.c | 2 +-
fetch-pack.c | 4 ++--
object-name.c | 2 +-
odb.c | 25 ++++++++++++++++++++++++-
odb.h | 6 ++++++
packfile.c | 26 ++++----------------------
packfile.h | 9 ++++++++-
transport-helper.c | 2 +-
13 files changed, 53 insertions(+), 35 deletions(-)
diff --git a/builtin/backfill.c b/builtin/backfill.c
index 80056abe47..e80fc1b694 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -53,7 +53,7 @@ static void download_batch(struct backfill_context *ctx)
* We likely have a new packfile. Add it to the packed list to
* avoid possible duplicate downloads of the same objects.
*/
- reprepare_packed_git(ctx->repo);
+ odb_reprepare(ctx->repo->objects);
}
static int fill_missing_blobs(const char *path UNUSED,
diff --git a/builtin/gc.c b/builtin/gc.c
index 0edd94a76f..1d30d1af2c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1041,7 +1041,7 @@ int cmd_gc(int argc,
die(FAILED_RUN, "rerere");
report_garbage = report_pack_garbage;
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (pack_garbage.nr > 0) {
close_object_store(the_repository->objects);
clean_pack_garbage();
@@ -1490,7 +1490,7 @@ static off_t get_auto_pack_size(void)
struct packed_git *p;
struct repository *r = the_repository;
- reprepare_packed_git(r);
+ odb_reprepare(r->objects);
for (p = get_all_packs(r); p; p = p->next) {
if (p->pack_size > max_size) {
second_largest_size = max_size;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1113137a6f..c9288a9c7e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2389,7 +2389,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
status = finish_command(&child);
if (status)
return "index-pack abnormal exit";
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
}
return NULL;
}
diff --git a/builtin/repack.c b/builtin/repack.c
index a4def39197..ee8c80cd95 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1684,7 +1684,7 @@ int cmd_repack(int argc,
goto cleanup;
}
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (delete_redundant) {
int opts = 0;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index b2809ab039..f65439a748 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -90,7 +90,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
strbuf_release(&packname);
/* Make objects we just wrote available to ourselves */
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
}
/*
diff --git a/connected.c b/connected.c
index 18c13245d8..d6e9682fd9 100644
--- a/connected.c
+++ b/connected.c
@@ -72,7 +72,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
* Before checking for promisor packs, be sure we have the
* latest pack-files loaded into memory.
*/
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
do {
struct packed_git *p;
diff --git a/fetch-pack.c b/fetch-pack.c
index 46c39f85c4..3b8960608c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1982,7 +1982,7 @@ static void update_shallow(struct fetch_pack_args *args,
* remote is shallow, but this is a clone, there are
* no objects in repo to worry about. Accept any
* shallow points that exist in the pack (iow in repo
- * after get_pack() and reprepare_packed_git())
+ * after get_pack() and odb_reprepare())
*/
struct oid_array extra = OID_ARRAY_INIT;
struct object_id *oid = si->shallow->oid;
@@ -2107,7 +2107,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
&si, pack_lockfiles);
}
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (!args->cloning && args->deepen) {
struct check_connected_options opt = CHECK_CONNECTED_INIT;
diff --git a/object-name.c b/object-name.c
index 11aa0e6afc..44b0d416ac 100644
--- a/object-name.c
+++ b/object-name.c
@@ -596,7 +596,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
* or migrated from loose to packed.
*/
if (status == MISSING_OBJECT) {
- reprepare_packed_git(r);
+ odb_reprepare(r->objects);
find_short_object_filename(&ds);
find_short_packed_object(&ds);
status = finish_object_disambiguation(&ds, oid);
diff --git a/odb.c b/odb.c
index 80ec6fc1fa..37ed21f53b 100644
--- a/odb.c
+++ b/odb.c
@@ -694,7 +694,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
/* Not a loose object; someone else may have just packed it. */
if (!(flags & OBJECT_INFO_QUICK)) {
- reprepare_packed_git(odb->repo);
+ odb_reprepare(odb->repo->objects);
if (find_pack_entry(odb->repo, real, &e))
break;
}
@@ -1039,3 +1039,26 @@ void odb_clear(struct object_database *o)
string_list_clear(&o->submodule_source_paths, 0);
}
+
+void odb_reprepare(struct object_database *o)
+{
+ struct odb_source *source;
+
+ /*
+ * Reprepare alt odbs, in case the alternates file was modified
+ * during the course of this process. This only _adds_ odbs to
+ * the linked list, so existing odbs will continue to exist for
+ * the lifetime of the process.
+ */
+ o->loaded_alternates = 0;
+ odb_prepare_alternates(o);
+
+ for (source = o->sources; source; source = source->next)
+ odb_clear_loose_cache(source);
+
+ o->approximate_object_count_valid = 0;
+
+ packfile_store_reprepare(o->packfiles);
+
+ obj_read_unlock();
+}
diff --git a/odb.h b/odb.h
index f1736b067c..9810ec60a0 100644
--- a/odb.h
+++ b/odb.h
@@ -155,6 +155,12 @@ struct object_database {
struct object_database *odb_new(struct repository *repo);
void odb_clear(struct object_database *o);
+/*
+ * Clear caches, reload alternates and then reload object sources so that new
+ * objects may become accessible.
+ */
+void odb_reprepare(struct object_database *o);
+
/*
* Find source by its object directory path. Dies in case the source couldn't
* be found.
diff --git a/packfile.c b/packfile.c
index 58e50d7b30..180c95ec1c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1003,28 +1003,10 @@ static void packfile_store_prepare(struct packfile_store *store)
store->initialized = 1;
}
-void reprepare_packed_git(struct repository *r)
+void packfile_store_reprepare(struct packfile_store *store)
{
- struct odb_source *source;
-
- obj_read_lock();
-
- /*
- * Reprepare alt odbs, in case the alternates file was modified
- * during the course of this process. This only _adds_ odbs to
- * the linked list, so existing odbs will continue to exist for
- * the lifetime of the process.
- */
- r->objects->loaded_alternates = 0;
- odb_prepare_alternates(r->objects);
-
- for (source = r->objects->sources; source; source = source->next)
- odb_clear_loose_cache(source);
-
- r->objects->approximate_object_count_valid = 0;
- r->objects->packfiles->initialized = 0;
- packfile_store_prepare(r->objects->packfiles);
- obj_read_unlock();
+ store->initialized = 0;
+ packfile_store_prepare(store);
}
struct packed_git *get_packed_git(struct repository *r)
@@ -1145,7 +1127,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
*
* Other worrying sections could be the call to close_pack_fd(),
* which can close packs even with in-use windows, and to
- * reprepare_packed_git(). Regarding the former, mmap doc says:
+ * odb_reprepare(). Regarding the former, mmap doc says:
* "closing the file descriptor does not unmap the region". And
* for the latter, it won't re-open already available packs.
*/
diff --git a/packfile.h b/packfile.h
index f46ea9ceec..75672c808a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -104,6 +104,14 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
+/*
+ * Clear the packfile caches and try to look up any new packfiles that have
+ * appeared since last preparing the packfiles store.
+ *
+ * This function must be called under the `odb_read_lock()`.
+ */
+void packfile_store_reprepare(struct packfile_store *store);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
@@ -180,7 +188,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-void reprepare_packed_git(struct repository *r);
void install_packed_git(struct repository *r, struct packed_git *pack);
struct packed_git *get_packed_git(struct repository *r);
diff --git a/transport-helper.c b/transport-helper.c
index 0789e5bca5..4d95d84f9e 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -450,7 +450,7 @@ static int fetch_with_fetch(struct transport *transport,
}
strbuf_release(&buf);
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
return 0;
}
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 10/16] packfile: refactor `install_packed_git()` to work on packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (8 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
` (9 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The `install_packed_git()` function adds a packfile to a specific
object store. Refactor it to accept a packfile store instead of a
repository to clarify its scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 2 +-
builtin/index-pack.c | 2 +-
http.c | 2 +-
http.h | 2 +-
midx.c | 2 +-
packfile.c | 11 ++++++-----
packfile.h | 9 +++++++--
7 files changed, 18 insertions(+), 12 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 2c35f9345d..e9d82b31c3 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -901,7 +901,7 @@ static void end_packfile(void)
if (!new_p)
die("core git rejected index %s", idx_name);
all_packs[pack_id] = new_p;
- install_packed_git(the_repository, new_p);
+ packfile_store_add_pack(the_repository->objects->packfiles, new_p);
free(idx_name);
/* Print the boundary */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f91c301bba..ed490dfad4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1645,7 +1645,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
p = add_packed_git(the_repository, final_index_name,
strlen(final_index_name), 0);
if (p)
- install_packed_git(the_repository, p);
+ packfile_store_add_pack(the_repository->objects->packfiles, p);
}
if (!from_stdin) {
diff --git a/http.c b/http.c
index 98853d6483..af2120b64c 100644
--- a/http.c
+++ b/http.c
@@ -2541,7 +2541,7 @@ void http_install_packfile(struct packed_git *p,
lst = &((*lst)->next);
*lst = (*lst)->next;
- install_packed_git(the_repository, p);
+ packfile_store_add_pack(the_repository->objects->packfiles, p);
}
struct http_pack_request *new_http_pack_request(
diff --git a/http.h b/http.h
index 36202139f4..e5a5380c6c 100644
--- a/http.h
+++ b/http.h
@@ -210,7 +210,7 @@ int finish_http_pack_request(struct http_pack_request *preq);
void release_http_pack_request(struct http_pack_request *preq);
/*
- * Remove p from the given list, and invoke install_packed_git() on it.
+ * Remove p from the given list, and invoke packfile_store_add_pack() on it.
*
* This is a convenience function for users that have obtained a list of packs
* from http_get_info_packs() and have chosen a specific pack to fetch.
diff --git a/midx.c b/midx.c
index 7fa2b8473a..95e74c79c1 100644
--- a/midx.c
+++ b/midx.c
@@ -477,7 +477,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
if (!p) {
p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
if (p) {
- install_packed_git(r, p);
+ packfile_store_add_pack(r->objects->packfiles, p);
list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
}
diff --git a/packfile.c b/packfile.c
index 180c95ec1c..186d182c7c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -779,16 +779,17 @@ struct packed_git *add_packed_git(struct repository *r, const char *path,
return p;
}
-void install_packed_git(struct repository *r, struct packed_git *pack)
+void packfile_store_add_pack(struct packfile_store *store,
+ struct packed_git *pack)
{
if (pack->pack_fd != -1)
pack_open_fds++;
- pack->next = r->objects->packfiles->packs;
- r->objects->packfiles->packs = pack;
+ pack->next = store->packs;
+ store->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
- hashmap_add(&r->objects->packfiles->map, &pack->packmap_ent);
+ hashmap_add(&store->map, &pack->packmap_ent);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -904,7 +905,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
p = add_packed_git(data->r, full_name, full_name_len, data->local);
if (p)
- install_packed_git(data->r, p);
+ packfile_store_add_pack(data->r->objects->packfiles, p);
}
free(pack_name);
}
diff --git a/packfile.h b/packfile.h
index 75672c808a..e751a5d93e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -112,6 +112,13 @@ void packfile_store_close(struct packfile_store *store);
*/
void packfile_store_reprepare(struct packfile_store *store);
+/*
+ * Add the pack to the store so that contained objects become accessible via
+ * the store. This moves ownership into the store.
+ */
+void packfile_store_add_pack(struct packfile_store *store,
+ struct packed_git *pack);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
@@ -188,8 +195,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-void install_packed_git(struct repository *r, struct packed_git *pack);
-
struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 11/16] packfile: always add packfiles to MRU when adding a pack
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (9 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-20 13:35 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
` (8 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
When adding a packfile to its store we add it both to the list and map of
packfiles, but we don't append it to the most-recently-used list of
packs. We do know to add the packfile to the MRU list as soon as we
access any of its objects, but in between we're being inconsistent. It
doesn't help that there are some subsystems that _do_ add the packfile
to the MRU after having added it, which only adds to the confusion.
Refactor the code so that we unconditionally add packfiles to the MRU
when adding them to a packfile store.
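The invariant this establishes can be sketched as follows. This is a
simplified illustration only: the `toy_*` types are hypothetical, and a
fixed-size array stands in for the real doubly-linked MRU list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PACKS 8

/* Hypothetical stand-ins for illustration; not Git's real structures. */
struct toy_pack {
	const char *name;
	struct toy_pack *next;
};

struct toy_store {
	struct toy_pack *packs;			/* prepended, newest first */
	struct toy_pack *mru[TOY_MAX_PACKS];	/* appended at the tail */
	size_t mru_nr;
};

/* Adding a pack always registers it in both the list and the MRU. */
static void toy_store_add_pack(struct toy_store *store, struct toy_pack *pack)
{
	assert(store->mru_nr < TOY_MAX_PACKS);
	pack->next = store->packs;
	store->packs = pack;
	store->mru[store->mru_nr++] = pack;
}

/* Returns 1 if the pack is present in the MRU, 0 otherwise. */
static int toy_store_mru_contains(const struct toy_store *store,
				  const struct toy_pack *pack)
{
	for (size_t i = 0; i < store->mru_nr; i++)
		if (store->mru[i] == pack)
			return 1;
	return 0;
}
```

With a single entry point doing both insertions, callers in this sketch
cannot end up with a pack that is in the list but missing from the MRU.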
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 4 +---
packfile.c | 1 +
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/midx.c b/midx.c
index 95e74c79c1..3cfe7884ad 100644
--- a/midx.c
+++ b/midx.c
@@ -476,10 +476,8 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
struct packed_git, packmap_ent);
if (!p) {
p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
- if (p) {
+ if (p)
packfile_store_add_pack(r->objects->packfiles, p);
- list_add_tail(&p->mru, &r->objects->packfiles->mru);
- }
}
strbuf_release(&pack_name);
diff --git a/packfile.c b/packfile.c
index 186d182c7c..8b5e6b96ce 100644
--- a/packfile.c
+++ b/packfile.c
@@ -790,6 +790,7 @@ void packfile_store_add_pack(struct packfile_store *store,
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
hashmap_add(&store->map, &pack->packmap_ent);
+ list_add_tail(&pack->mru, &store->mru);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH 12/16] packfile: introduce function to load and add packfiles
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (10 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-20 13:41 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
` (7 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
We have a recurring pattern where we essentially perform an upsert of a
packfile in case it isn't yet known by the packfile store. The logic to
do so is non-trivial as we have to reconstruct the packfile's key, check
the map of packfiles, then create the new packfile and finally add it to
the store.
Introduce a new function that does this dance for us. Refactor callsites
to use it.
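The "upsert" dance described above can be sketched in isolation like this.
It is a simplified illustration under stated assumptions: the `toy_*`
names are hypothetical, a linear list stands in for the hashmap of
packfiles, and creating a pack is reduced to allocating a struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not Git's real structures. */
struct toy_pack {
	char *pack_name;	/* path ending in ".pack", used as lookup key */
	struct toy_pack *next;
};

struct toy_store {
	struct toy_pack *packs;	/* a linear list stands in for the hashmap */
	size_t nr_loaded;	/* how many packs were actually created */
};

/* Derive the ".pack" key from an ".idx" path; the caller frees the result. */
static char *toy_key_from_idx(const char *idx_path)
{
	size_t len = strlen(idx_path);
	char *key;

	if (len >= 4 && !strcmp(idx_path + len - 4, ".idx"))
		len -= 4;
	key = malloc(len + strlen(".pack") + 1);
	assert(key);
	memcpy(key, idx_path, len);
	strcpy(key + len, ".pack");
	return key;
}

/* Return the known pack for idx_path, creating and adding it if needed. */
static struct toy_pack *toy_store_load_pack(struct toy_store *store,
					    const char *idx_path)
{
	char *key = toy_key_from_idx(idx_path);
	struct toy_pack *p;

	for (p = store->packs; p; p = p->next)
		if (!strcmp(p->pack_name, key))
			break;

	if (!p) {
		p = calloc(1, sizeof(*p));
		assert(p);
		p->pack_name = key;	/* ownership of key moves to the pack */
		key = NULL;
		p->next = store->packs;
		store->packs = p;
		store->nr_loaded++;
	}

	free(key);	/* no-op when the key was handed to a new pack */
	return p;
}
```

Calling `toy_store_load_pack()` twice with the same index path returns the
same pack and creates it only once, which is the behaviour the callsites
previously had to open-code.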
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 4 ++--
builtin/index-pack.c | 10 +++-------
midx.c | 18 ++----------------
packfile.c | 44 +++++++++++++++++++++++++++++++-------------
packfile.h | 8 ++++++++
5 files changed, 46 insertions(+), 38 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index e9d82b31c3..a26e79689d 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -897,11 +897,11 @@ static void end_packfile(void)
idx_name = keep_pack(create_index());
/* Register the packfile with core git's machinery. */
- new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
+ new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
+ idx_name, 1);
if (!new_p)
die("core git rejected index %s", idx_name);
all_packs[pack_id] = new_p;
- packfile_store_add_pack(the_repository->objects->packfiles, new_p);
free(idx_name);
/* Print the boundary */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ed490dfad4..2b78ba7fe4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
hash, "idx", 1);
- if (do_fsck_object) {
- struct packed_git *p;
- p = add_packed_git(the_repository, final_index_name,
- strlen(final_index_name), 0);
- if (p)
- packfile_store_add_pack(the_repository->objects->packfiles, p);
- }
+ if (do_fsck_object)
+ packfile_store_load_pack(the_repository->objects->packfiles,
+ final_index_name, 0);
if (!from_stdin) {
printf("%s\n", hash_to_hex(hash));
diff --git a/midx.c b/midx.c
index 3cfe7884ad..d30feda019 100644
--- a/midx.c
+++ b/midx.c
@@ -454,7 +454,6 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
uint32_t pack_int_id)
{
struct strbuf pack_name = STRBUF_INIT;
- struct strbuf key = STRBUF_INIT;
struct packed_git *p;
pack_int_id = midx_for_pack(&m, pack_int_id);
@@ -466,22 +465,9 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
m->pack_names[pack_int_id]);
-
- /* pack_map holds the ".pack" name, but we have the .idx */
- strbuf_addbuf(&key, &pack_name);
- strbuf_strip_suffix(&key, ".idx");
- strbuf_addstr(&key, ".pack");
- p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
- strhash(key.buf), key.buf,
- struct packed_git, packmap_ent);
- if (!p) {
- p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
- if (p)
- packfile_store_add_pack(r->objects->packfiles, p);
- }
-
+ p = packfile_store_load_pack(r->objects->packfiles,
+ pack_name.buf, m->local);
strbuf_release(&pack_name);
- strbuf_release(&key);
if (!p) {
m->packs[pack_int_id] = MIDX_PACK_ERROR;
diff --git a/packfile.c b/packfile.c
index 8b5e6b96ce..f7916543a6 100644
--- a/packfile.c
+++ b/packfile.c
@@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
list_add_tail(&pack->mru, &store->mru);
}
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
+ const char *idx_path, int local)
+{
+ struct strbuf key = STRBUF_INIT;
+ struct packed_git *p;
+
+ /*
+ * We're being called with the path to the index file, but `pack_map`
+ * holds the path to the packfile itself.
+ */
+ strbuf_addstr(&key, idx_path);
+ strbuf_strip_suffix(&key, ".idx");
+ strbuf_addstr(&key, ".pack");
+
+ p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
+ struct packed_git, packmap_ent);
+ if (!p) {
+ p = add_packed_git(store->odb->repo, idx_path,
+ strlen(idx_path), local);
+ if (p)
+ packfile_store_add_pack(store, p);
+ }
+
+ strbuf_release(&key);
+ return p;
+}
+
void (*report_garbage)(unsigned seen_bits, const char *path);
static void report_helper(const struct string_list *list,
@@ -892,23 +919,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
const char *file_name, void *_data)
{
struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
- struct packed_git *p;
size_t base_len = full_name_len;
if (strip_suffix_mem(full_name, &base_len, ".idx") &&
!(data->m && midx_contains_pack(data->m, file_name))) {
- struct hashmap_entry hent;
- char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
- unsigned int hash = strhash(pack_name);
- hashmap_entry_init(&hent, hash);
-
- /* Don't reopen a pack we already have. */
- if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
- p = add_packed_git(data->r, full_name, full_name_len, data->local);
- if (p)
- packfile_store_add_pack(data->r->objects->packfiles, p);
- }
- free(pack_name);
+ char *trimmed_path = xstrndup(full_name, full_name_len);
+ packfile_store_load_pack(data->r->objects->packfiles,
+ trimmed_path, data->local);
+ free(trimmed_path);
}
if (!report_garbage)
diff --git a/packfile.h b/packfile.h
index e751a5d93e..4971f18f51 100644
--- a/packfile.h
+++ b/packfile.h
@@ -119,6 +119,14 @@ void packfile_store_reprepare(struct packfile_store *store);
void packfile_store_add_pack(struct packfile_store *store,
struct packed_git *pack);
+/*
+ * Open the packfile and add it to the store if it isn't yet known. Returns
+ * either the newly opened packfile or the preexisting packfile. Returns a
+ * `NULL` pointer in case the packfile could not be opened.
+ */
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
+ const char *idx_path, int local);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
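Reader's note on the patch above: `packfile_store_load_pack()` receives the path of the index file but looks the pack up under the path of the packfile itself. The following is a minimal standalone sketch of that key derivation using only the C standard library; the helper name `toy_pack_key_from_idx()` is hypothetical and is not part of Git, which uses its `strbuf` API for this step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: derive the ".pack" lookup key from an ".idx" path.
 * As in the patch, the ".idx" suffix is stripped only if present and
 * ".pack" is appended afterwards. The caller owns the returned string.
 */
static char *toy_pack_key_from_idx(const char *idx_path)
{
	static const char idx_ext[] = ".idx";
	static const char pack_ext[] = ".pack";
	size_t ext_len = sizeof(idx_ext) - 1;
	size_t len;
	char *key;

	assert(idx_path);
	len = strlen(idx_path);

	/* Strip a trailing ".idx", if there is one. */
	if (len >= ext_len && !strcmp(idx_path + len - ext_len, idx_ext))
		len -= ext_len;

	key = malloc(len + sizeof(pack_ext));
	if (!key)
		return NULL;
	memcpy(key, idx_path, len);
	/* sizeof(pack_ext) includes the terminating NUL byte. */
	memcpy(key + len, pack_ext, sizeof(pack_ext));
	return key;
}
```

With this sketch, a path ending in ".idx" and the matching ".pack" path map to the same key, which is what lets the store detect a pack it already knows.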
* [PATCH 13/16] packfile: move `get_multi_pack_index()` into "midx.c"
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (11 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
` (6 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The `get_multi_pack_index()` function is declared and implemented in the
packfile subsystem, even though it really belongs in the multi-pack
index subsystem. The reason for this is likely that it needs to call
`packfile_store_prepare()`, which is not exposed by the packfile system.
In a subsequent commit we're about to add another caller outside of the
packfile system though, so we'll have to expose the function anyway.
Do so now already and move `get_multi_pack_index()` into the MIDX
subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
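Note (not part of the commit message): the reason a getter such as `get_multi_pack_index()` needs access to `packfile_store_prepare()` is the prepare-on-access pattern. Below is a minimal sketch of that pattern, assuming a one-shot "initialized" flag. `struct toy_store`, its fields and both functions are hypothetical stand-ins and do not mirror the real Git structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packfile store; field names are assumptions. */
struct toy_store {
	int initialized; /* set once the store has been prepared */
	int scans;       /* how often the expensive scan actually ran */
	int midx;        /* stand-in for a loaded multi-pack index */
};

/* Prepare the store. Calling this again after the first time is a no-op. */
static void toy_store_prepare(struct toy_store *store)
{
	assert(store);
	if (store->initialized)
		return;
	store->scans++;   /* pretend to scan the pack directories */
	store->midx = 42; /* pretend to load a multi-pack index */
	store->initialized = 1;
}

/*
 * A getter that lives outside the store's subsystem: it can only return
 * valid data if it is able to prepare the store first, which is why the
 * prepare function has to be exposed.
 */
static int toy_get_midx(struct toy_store *store)
{
	toy_store_prepare(store);
	return store->midx;
}
```

Because the prepare step is idempotent, callers can invoke the getter as often as they like without repeating the scan.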
midx.c | 6 ++++++
midx.h | 2 ++
packfile.c | 8 +-------
packfile.h | 10 +++++++++-
4 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/midx.c b/midx.c
index d30feda019..c1b2f141fa 100644
--- a/midx.c
+++ b/midx.c
@@ -95,6 +95,12 @@ static int midx_read_object_offsets(const unsigned char *chunk_start,
return 0;
}
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
+{
+ packfile_store_prepare(source->odb->packfiles);
+ return source->midx;
+}
+
static struct multi_pack_index *load_multi_pack_index_one(struct repository *r,
const char *object_dir,
const char *midx_name,
diff --git a/midx.h b/midx.h
index 076382de8a..8d6ea28682 100644
--- a/midx.h
+++ b/midx.h
@@ -100,6 +100,8 @@ void get_split_midx_filename_ext(const struct git_hash_algo *hash_algo,
struct strbuf *buf, const char *object_dir,
const unsigned char *hash, const char *ext);
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
+
struct multi_pack_index *load_multi_pack_index(struct repository *r,
const char *object_dir,
int local);
diff --git a/packfile.c b/packfile.c
index f7916543a6..bc32c45fe6 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1004,7 +1004,7 @@ static void packfile_store_prepare_mru(struct packfile_store *store)
list_add_tail(&p->mru, &store->mru);
}
-static void packfile_store_prepare(struct packfile_store *store)
+void packfile_store_prepare(struct packfile_store *store)
{
struct odb_source *source;
@@ -1035,12 +1035,6 @@ struct packed_git *get_packed_git(struct repository *r)
return r->objects->packfiles->packs;
}
-struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
-{
- packfile_store_prepare(source->odb->packfiles);
- return source->midx;
-}
-
struct packed_git *get_all_packs(struct repository *r)
{
packfile_store_prepare(r->objects->packfiles);
diff --git a/packfile.h b/packfile.h
index 4971f18f51..1522da96f8 100644
--- a/packfile.h
+++ b/packfile.h
@@ -104,6 +104,15 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
+/*
+ * Prepare the packfile store by loading packfiles and multi-pack indices for
+ * all alternates. This becomes a no-op if the store is already prepared.
+ *
+ * It shouldn't typically be necessary to call this function directly, as
+ * functions that access the store know to prepare it.
+ */
+void packfile_store_prepare(struct packfile_store *store);
+
/*
* Clear the packfile caches and try to look up any new packfiles that have
* appeared since last preparing the packfiles store.
@@ -205,7 +214,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
-struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
struct packed_git *get_all_packs(struct repository *r);
/*
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH 14/16] packfile: remove `get_packed_git()`
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (12 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-20 13:50 ` Karthik Nayak
2025-08-20 13:51 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
` (5 subsequent siblings)
19 siblings, 2 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
We have two different functions to retrieve packfiles for a packfile
store:
- `get_packed_git()` returns the list of packfiles directly.
- `get_all_packs()` does more work and also prepares packfiles that
are being indexed by a multi-pack-index.
The distinction is not immediately obvious. To make matters worse,
`get_packed_git()` returns the same result as `get_all_packs()` once the
latter has been called, as they both refer to the same list.
As it turns out, the distinction isn't necessary. We only have a couple
of callers of `get_packed_git()`, and all of those callers are prepared
to call `get_all_packs()` instead:
- "builtin/gc.c": We explicitly check how many packfiles aren't
contained in the multi-pack-index, so loading extra packfiles that
are indexed by it won't change the result.
- "builtin/grep.c": We only call `get_packed_git()` to eagerly load
packfiles. The preceding commit exposed `packfile_store_prepare()`,
which is a more direct way of achieving the same result.
- "object-name.c": `find_abbrev_len_for_pack()` and `unique_in_pack()`
exit early in case the multi-pack index is set, so both callsites of
`get_packed_git()` know to handle packs loaded via the MIDX already.
Convert all of these sites to use `get_all_packs()` instead and remove
`get_packed_git()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
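Note (not part of the commit message): the order dependence described above can be illustrated with a small sketch. Two getters share one list, and the "cheap" one changes its result as soon as the "full" one has run. All names here (`toy_store`, `toy_get_packed()`, `toy_get_all()`) are hypothetical, and prepending the MIDX-backed pack is a simplification chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pack {
	struct toy_pack *next;
	int via_midx; /* 1 if this pack was loaded through the MIDX */
};

/* Hypothetical store: one shared list plus a lazily added MIDX pack. */
struct toy_store {
	struct toy_pack *packs;
	struct toy_pack midx_pack;
	int midx_packs_loaded;
};

/* The "cheap" getter: returns the shared list as it currently is. */
static struct toy_pack *toy_get_packed(struct toy_store *s)
{
	assert(s);
	return s->packs;
}

/* The "full" getter: also links MIDX-backed packs into the same list. */
static struct toy_pack *toy_get_all(struct toy_store *s)
{
	assert(s);
	if (!s->midx_packs_loaded) {
		s->midx_pack.via_midx = 1;
		s->midx_pack.next = s->packs;
		s->packs = &s->midx_pack;
		s->midx_packs_loaded = 1;
	}
	return s->packs;
}

static size_t toy_count(const struct toy_pack *p)
{
	size_t n = 0;
	for (; p; p = p->next)
		n++;
	return n;
}
```

In this sketch, counting via `toy_get_packed()` yields one pack before `toy_get_all()` has run and two packs afterwards, which is the kind of hidden state that makes keeping both getters hard to justify.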
builtin/gc.c | 2 +-
builtin/grep.c | 2 +-
object-name.c | 4 ++--
packfile.c | 6 ------
packfile.h | 1 -
5 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index 1d30d1af2c..565afda51f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1422,7 +1422,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
if (incremental_repack_auto_limit < 0)
return 1;
- for (p = get_packed_git(the_repository);
+ for (p = get_all_packs(the_repository);
count < incremental_repack_auto_limit && p;
p = p->next) {
if (!p->multi_pack_index)
diff --git a/builtin/grep.c b/builtin/grep.c
index 5df6537333..8f0e21bd70 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1214,7 +1214,7 @@ int cmd_grep(int argc,
if (recurse_submodules)
repo_read_gitmodules(the_repository, 1);
if (startup_info->have_repository)
- (void)get_packed_git(the_repository);
+ packfile_store_prepare(the_repository->objects->packfiles);
start_threads(&opt);
} else {
diff --git a/object-name.c b/object-name.c
index 44b0d416ac..c87995cc1e 100644
--- a/object-name.c
+++ b/object-name.c
@@ -213,7 +213,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
unique_in_midx(m, ds);
}
- for (p = get_packed_git(ds->repo); p && !ds->ambiguous;
+ for (p = get_all_packs(ds->repo); p && !ds->ambiguous;
p = p->next)
unique_in_pack(p, ds);
}
@@ -806,7 +806,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
find_abbrev_len_for_midx(m, mad);
}
- for (p = get_packed_git(mad->repo); p; p = p->next)
+ for (p = get_all_packs(mad->repo); p; p = p->next)
find_abbrev_len_for_pack(p, mad);
}
diff --git a/packfile.c b/packfile.c
index bc32c45fe6..f1526e361c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1029,12 +1029,6 @@ void packfile_store_reprepare(struct packfile_store *store)
packfile_store_prepare(store);
}
-struct packed_git *get_packed_git(struct repository *r)
-{
- packfile_store_prepare(r->objects->packfiles);
- return r->objects->packfiles->packs;
-}
-
struct packed_git *get_all_packs(struct repository *r)
{
packfile_store_prepare(r->objects->packfiles);
diff --git a/packfile.h b/packfile.h
index 1522da96f8..dff0237092 100644
--- a/packfile.h
+++ b/packfile.h
@@ -212,7 +212,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
struct packed_git *get_all_packs(struct repository *r);
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (13 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-20 13:53 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 16/16] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
` (4 subsequent siblings)
19 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The `get_all_packs()` function prepares the packfile store and then
returns its packfiles. Refactor it to accept a packfile store instead of
a repository to clarify its scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
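Note (not part of the commit message): narrowing the parameter from a repository to a store works because the store can still reach its owner through back pointers, as the `store->odb->repo` accesses in the diff below show. Here is a minimal sketch of that shape with hypothetical `toy_*` types; the field layout is an assumption and is much smaller than the real structures.

```c
#include <assert.h>
#include <stddef.h>

struct toy_odb;

struct toy_pack {
	struct toy_pack *next;
	int local;
};

struct toy_repo {
	struct toy_odb *objects;
	const char *name;
};

struct toy_store {
	struct toy_odb *odb; /* back pointer to the owning database */
	struct toy_pack *packs;
};

struct toy_odb {
	struct toy_repo *repo; /* back pointer to the owning repository */
	struct toy_store *packfiles;
};

/* Store-scoped getter: the signature states exactly what is needed. */
static struct toy_pack *toy_store_get_packs(struct toy_store *store)
{
	assert(store);
	return store->packs;
}

/* The store can still find its repository when it has to. */
static const char *toy_store_repo_name(const struct toy_store *store)
{
	return store->odb->repo->name;
}

/* Caller side: pass the store explicitly instead of the repository. */
static size_t toy_count_local_packs(struct toy_repo *r)
{
	size_t n = 0;
	struct toy_pack *p;

	for (p = toy_store_get_packs(r->objects->packfiles); p; p = p->next)
		if (p->local)
			n++;
	return n;
}
```

The caller-side loop mirrors the mechanical conversion done throughout this patch: only the expression that produces the list head changes, while the iteration itself stays the same.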
builtin/cat-file.c | 2 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 4 ++--
builtin/fsck.c | 8 ++++----
builtin/gc.c | 8 ++++----
builtin/pack-objects.c | 18 +++++++++---------
builtin/pack-redundant.c | 4 ++--
builtin/repack.c | 6 +++---
connected.c | 2 +-
http-backend.c | 4 ++--
http.c | 2 +-
object-name.c | 4 ++--
pack-bitmap.c | 4 ++--
pack-objects.c | 2 +-
packfile.c | 14 +++++++-------
packfile.h | 7 ++++++-
server-info.c | 2 +-
t/helper/test-find-pack.c | 2 +-
t/helper/test-pack-mtimes.c | 2 +-
19 files changed, 51 insertions(+), 46 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index fce0b06451c..7124c43fb14 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -854,7 +854,7 @@ static void batch_each_object(struct batch_options *opt,
batch_one_object_bitmapped, &payload)) {
struct packed_git *pack;
- for (pack = get_all_packs(the_repository); pack; pack = pack->next) {
+ for (pack = packfile_store_get_packs(the_repository->objects->packfiles); pack; pack = pack->next) {
if (bitmap_index_contains_pack(bitmap, pack) ||
open_pack_index(pack))
continue;
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a61d3b46aac..471d96a3089 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -129,7 +129,7 @@ int cmd_count_objects(int argc,
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local)
continue;
if (open_pack_index(p))
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index a26e79689d5..4f355118a10 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -975,7 +975,7 @@ static int store_object(
if (e->idx.offset) {
duplicate_count_by_type[type]++;
return 1;
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
+ } else if (find_oid_pack(&oid, packfile_store_get_packs(the_repository->objects->packfiles))) {
e->type = type;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
@@ -1175,7 +1175,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
duplicate_count_by_type[OBJ_BLOB]++;
truncate_pack(&checkpoint);
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
+ } else if (find_oid_pack(&oid, packfile_store_get_packs(the_repository->objects->packfiles))) {
e->type = OBJ_BLOB;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 543a2cdb5cd..e867fd510a3 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -873,14 +873,14 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
int res = 0;
if (show_progress) {
- for (struct packed_git *p = get_all_packs(r); p; p = p->next)
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next)
pack_count++;
progress = start_delayed_progress(the_repository,
"Verifying reverse pack-indexes", pack_count);
pack_count = 0;
}
- for (struct packed_git *p = get_all_packs(r); p; p = p->next) {
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
int load_error = load_pack_revindex_from_disk(p);
if (load_error < 0) {
@@ -1010,7 +1010,7 @@ int cmd_fsck(int argc,
struct progress *progress = NULL;
if (show_progress) {
- for (p = get_all_packs(the_repository); p;
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p;
p = p->next) {
if (open_pack_index(p))
continue;
@@ -1020,7 +1020,7 @@ int cmd_fsck(int argc,
progress = start_progress(the_repository,
_("Checking objects"), total);
}
- for (p = get_all_packs(the_repository); p;
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p;
p = p->next) {
/* verify gives error messages itself */
if (verify_pack(the_repository,
diff --git a/builtin/gc.c b/builtin/gc.c
index 565afda51fe..030d0b0c774 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -488,7 +488,7 @@ static struct packed_git *find_base_packs(struct string_list *packs,
{
struct packed_git *p, *base = NULL;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local || p->is_cruft)
continue;
if (limit) {
@@ -513,7 +513,7 @@ static int too_many_packs(struct gc_config *cfg)
if (cfg->gc_auto_pack_limit <= 0)
return 0;
- for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
+ for (cnt = 0, p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local)
continue;
if (p->pack_keep)
@@ -1422,7 +1422,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
if (incremental_repack_auto_limit < 0)
return 1;
- for (p = get_all_packs(the_repository);
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles);
count < incremental_repack_auto_limit && p;
p = p->next) {
if (!p->multi_pack_index)
@@ -1491,7 +1491,7 @@ static off_t get_auto_pack_size(void)
struct repository *r = the_repository;
odb_reprepare(r->objects);
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if (p->pack_size > max_size) {
second_largest_size = max_size;
max_size = p->pack_size;
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 53a22562503..1c24b84510e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3855,7 +3855,7 @@ static void read_packs_list_from_stdin(struct rev_info *revs)
string_list_sort(&exclude_packs);
string_list_remove_duplicates(&exclude_packs, 0);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
const char *pack_name = pack_basename(p);
if ((item = string_list_lookup(&include_packs, pack_name)))
@@ -4105,7 +4105,7 @@ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs
* Re-mark only the fresh packs as kept so that objects in
* unknown packs do not halt the reachability traversal early.
*/
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
p->pack_keep_in_core = 0;
mark_pack_kept_in_core(fresh_packs, 1);
@@ -4142,7 +4142,7 @@ static void read_cruft_objects(void)
string_list_sort(&discard_packs);
string_list_sort(&fresh_packs);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
const char *pack_name = pack_basename(p);
struct string_list_item *item;
@@ -4394,7 +4394,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
struct packed_git *p;
p = (last_found != (void *)1) ? last_found :
- get_all_packs(the_repository);
+ packfile_store_get_packs(the_repository->objects->packfiles);
while (p) {
if ((!p->pack_local || p->pack_keep ||
@@ -4404,7 +4404,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
return 1;
}
if (p == last_found)
- p = get_all_packs(the_repository);
+ p = packfile_store_get_packs(the_repository->objects->packfiles);
else
p = p->next;
if (p == last_found)
@@ -4441,7 +4441,7 @@ static void loosen_unused_packed_objects(void)
uint32_t loosened_objects_nr = 0;
struct object_id oid;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local || p->pack_keep || p->pack_keep_in_core)
continue;
@@ -4747,7 +4747,7 @@ static void add_extra_kept_packs(const struct string_list *names)
if (!names->nr)
return;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
const char *name = basename(p->pack_name);
int i;
@@ -5186,7 +5186,7 @@ int cmd_pack_objects(int argc,
add_extra_kept_packs(&keep_pack_list);
if (ignore_packed_keep_on_disk) {
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (p->pack_local && p->pack_keep)
break;
if (!p) /* no keep-able packs found */
@@ -5199,7 +5199,7 @@ int cmd_pack_objects(int argc,
* it also covers non-local objects
*/
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local) {
have_non_local_packs = 1;
break;
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index fe81c293e3a..7b2cb3ef1e2 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -566,7 +566,7 @@ static struct pack_list * add_pack(struct packed_git *p)
static struct pack_list * add_pack_file(const char *filename)
{
- struct packed_git *p = get_all_packs(the_repository);
+ struct packed_git *p = packfile_store_get_packs(the_repository->objects->packfiles);
if (strlen(filename) < 40)
die("Bad pack filename: %s", filename);
@@ -581,7 +581,7 @@ static struct pack_list * add_pack_file(const char *filename)
static void load_all(void)
{
- struct packed_git *p = get_all_packs(the_repository);
+ struct packed_git *p = packfile_store_get_packs(the_repository->objects->packfiles);
while (p) {
add_pack(p);
diff --git a/builtin/repack.c b/builtin/repack.c
index ee8c80cd95c..6119e236512 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -267,7 +267,7 @@ static void collect_pack_filenames(struct existing_packs *existing,
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
int i;
const char *base;
@@ -499,7 +499,7 @@ static void init_pack_geometry(struct pack_geometry *geometry,
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (args->local && !p->pack_local)
/*
* When asked to only repack local packfiles we skip
@@ -1140,7 +1140,7 @@ static void combine_small_cruft_packs(FILE *in, size_t combine_cruft_below_size,
struct strbuf buf = STRBUF_INIT;
size_t i;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!(p->is_cruft && p->pack_local))
continue;
diff --git a/connected.c b/connected.c
index d6e9682fd93..d7e07fa6b0d 100644
--- a/connected.c
+++ b/connected.c
@@ -76,7 +76,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
do {
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_promisor)
continue;
if (find_pack_entry_one(oid, p))
diff --git a/http-backend.c b/http-backend.c
index d5dfe762bb5..be4d8263a58 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -608,13 +608,13 @@ static void get_info_packs(struct strbuf *hdr, char *arg UNUSED)
size_t cnt = 0;
select_getanyfile(hdr);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (p->pack_local)
cnt++;
}
strbuf_grow(&buf, cnt * 53 + 2);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (p->pack_local)
strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
}
diff --git a/http.c b/http.c
index af2120b64c7..16a1ab54f34 100644
--- a/http.c
+++ b/http.c
@@ -2416,7 +2416,7 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
* If we already have the pack locally, no need to fetch its index or
* even add it to list; we already have all of its objects.
*/
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (hasheq(p->hash, sha1, the_repository->hash_algo))
return 0;
}
diff --git a/object-name.c b/object-name.c
index c87995cc1e6..e346075394d 100644
--- a/object-name.c
+++ b/object-name.c
@@ -213,7 +213,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
unique_in_midx(m, ds);
}
- for (p = get_all_packs(ds->repo); p && !ds->ambiguous;
+ for (p = packfile_store_get_packs(ds->repo->objects->packfiles); p && !ds->ambiguous;
p = p->next)
unique_in_pack(p, ds);
}
@@ -806,7 +806,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
find_abbrev_len_for_midx(m, mad);
}
- for (p = get_all_packs(mad->repo); p; p = p->next)
+ for (p = packfile_store_get_packs(mad->repo->objects->packfiles); p; p = p->next)
find_abbrev_len_for_pack(p, mad);
}
diff --git a/pack-bitmap.c b/pack-bitmap.c
index d14421ee204..67f9e92ec18 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -665,7 +665,7 @@ static int open_pack_bitmap(struct repository *r,
struct packed_git *p;
int ret = -1;
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if (open_pack_bitmap_1(bitmap_git, p) == 0) {
ret = 0;
/*
@@ -3363,7 +3363,7 @@ int verify_bitmap_files(struct repository *r)
free(midx_bitmap_name);
}
- for (struct packed_git *p = get_all_packs(r);
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles);
p; p = p->next) {
char *pack_bitmap_name = pack_bitmap_filename(p);
res |= verify_bitmap_file(r->hash_algo, pack_bitmap_name);
diff --git a/pack-objects.c b/pack-objects.c
index a9d9855063a..5506f12293c 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -95,7 +95,7 @@ static void prepare_in_pack_by_idx(struct packing_data *pdata)
* (i.e. in_pack_idx also zero) should return NULL.
*/
mapping[cnt++] = NULL;
- for (p = get_all_packs(pdata->repo); p; p = p->next, cnt++) {
+ for (p = packfile_store_get_packs(pdata->repo->objects->packfiles); p; p = p->next, cnt++) {
if (cnt == nr) {
free(mapping);
return;
diff --git a/packfile.c b/packfile.c
index f1526e361c2..b60faf5c3e7 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1029,19 +1029,19 @@ void packfile_store_reprepare(struct packfile_store *store)
packfile_store_prepare(store);
}
-struct packed_git *get_all_packs(struct repository *r)
+struct packed_git *packfile_store_get_packs(struct packfile_store *store)
{
- packfile_store_prepare(r->objects->packfiles);
+ packfile_store_prepare(store);
- for (struct odb_source *source = r->objects->sources; source; source = source->next) {
+ for (struct odb_source *source = store->odb->sources; source; source = source->next) {
struct multi_pack_index *m = source->midx;
if (!m)
continue;
for (uint32_t i = 0; i < m->num_packs + m->num_packs_in_base; i++)
- prepare_midx_pack(r, m, i);
+ prepare_midx_pack(store->odb->repo, m, i);
}
- return r->objects->packfiles->packs;
+ return store->packs;
}
struct list_head *get_packed_git_mru(struct repository *r)
@@ -2101,7 +2101,7 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
* covers, one kept and one not kept, but the midx returns only
* the non-kept version.
*/
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if ((p->pack_keep && (flags & ON_DISK_KEEP_PACKS)) ||
(p->pack_keep_in_core && (flags & IN_CORE_KEEP_PACKS))) {
ALLOC_GROW(packs, nr + 1, alloc);
@@ -2198,7 +2198,7 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
int r = 0;
int pack_errors = 0;
- for (p = get_all_packs(repo); p; p = p->next) {
+ for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next) {
if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
continue;
if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
diff --git a/packfile.h b/packfile.h
index dff02370924..8f501f00947 100644
--- a/packfile.h
+++ b/packfile.h
@@ -128,6 +128,12 @@ void packfile_store_reprepare(struct packfile_store *store);
void packfile_store_add_pack(struct packfile_store *store,
struct packed_git *pack);
+/*
+ * Get all packs managed by the given store, including packfiles that are
+ * referenced by multi-pack indices.
+ */
+struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+
/*
* Open the packfile and add it to the store if it isn't yet known. Returns
* either the newly opened packfile or the preexisting packfile. Returns a
@@ -213,7 +219,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
extern void (*report_garbage)(unsigned seen_bits, const char *path);
struct list_head *get_packed_git_mru(struct repository *r);
-struct packed_git *get_all_packs(struct repository *r);
/*
* Give a rough count of objects in the repository. This sacrifices accuracy
diff --git a/server-info.c b/server-info.c
index 9bb30d9ab71..79234c7fed3 100644
--- a/server-info.c
+++ b/server-info.c
@@ -292,7 +292,7 @@ static void init_pack_info(struct repository *r, const char *infofile, int force
int i;
size_t alloc = 0;
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
/* we ignore things on alternate path since they are
* not available to the pullers in general.
*/
diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c
index 611a13a3261..183a777fc54 100644
--- a/t/helper/test-find-pack.c
+++ b/t/helper/test-find-pack.c
@@ -39,7 +39,7 @@ int cmd__find_pack(int argc, const char **argv)
if (repo_get_oid(the_repository, argv[0], &oid))
die("cannot parse %s as an object name", argv[0]);
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (find_pack_entry_one(&oid, p)) {
printf("%s\n", p->pack_name);
actual_count++;
diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c
index d51aaa3dc40..cfdfae77a6c 100644
--- a/t/helper/test-pack-mtimes.c
+++ b/t/helper/test-pack-mtimes.c
@@ -37,7 +37,7 @@ int cmd__pack_mtimes(int argc, const char **argv)
if (argc != 2)
usage(pack_mtimes_usage);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
strbuf_addstr(&buf, basename(p->pack_name));
strbuf_strip_suffix(&buf, ".pack");
strbuf_addstr(&buf, ".mtimes");
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
* [PATCH 16/16] packfile: refactor `get_packed_git_mru()` to work on packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (14 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
@ 2025-08-19 8:19 ` Patrick Steinhardt
2025-08-19 17:13 ` [PATCH 00/16] packfile: carve out a new " Junio C Hamano
` (3 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-19 8:19 UTC (permalink / raw)
To: git
The `get_packed_git_mru()` function prepares the packfile store and then
returns its packfiles in most-recently-used order. Refactor it to accept
a packfile store instead of a repository to clarify its scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
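Note (not part of the commit message): the caller in "builtin/pack-objects.c" walks the MRU list and moves a pack to the front once it turns out to be useful. The sketch below shows that move-to-front idea with a tiny circular doubly-linked list. It is a self-contained approximation with hypothetical `toy_*` names and does not use Git's own list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list; the head is a sentinel node. */
struct toy_list {
	struct toy_list *next, *prev;
};

struct toy_pack {
	int id;
	struct toy_list mru; /* embedded list node */
};

static void toy_list_init(struct toy_list *head)
{
	head->next = head->prev = head;
}

static void toy_list_add_tail(struct toy_list *entry, struct toy_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and reinsert it directly after the head. */
static void toy_list_move_front(struct toy_list *entry, struct toy_list *head)
{
	assert(entry != head);
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Recover the containing pack from its embedded list node. */
static struct toy_pack *toy_pack_from_mru(struct toy_list *pos)
{
	return (struct toy_pack *)((char *)pos - offsetof(struct toy_pack, mru));
}

/* Find a pack by id and promote it so the next lookup finds it first. */
static struct toy_pack *toy_lookup(struct toy_list *mru, int id)
{
	struct toy_list *pos;

	for (pos = mru->next; pos != mru; pos = pos->next) {
		struct toy_pack *p = toy_pack_from_mru(pos);
		if (p->id == id) {
			toy_list_move_front(pos, mru);
			return p;
		}
	}
	return NULL;
}
```

The lookup returns right after promoting the entry, so the iteration never continues over a list it has just reordered.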
builtin/pack-objects.c | 4 ++--
packfile.c | 6 +++---
packfile.h | 7 +++++--
3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1c24b84510..4e75f14df1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1748,12 +1748,12 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
}
}
- list_for_each(pos, get_packed_git_mru(the_repository)) {
+ list_for_each(pos, packfile_store_get_packs_mru(the_repository->objects->packfiles)) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset, found_mtime);
if (!exclude && want > 0)
list_move(&p->mru,
- get_packed_git_mru(the_repository));
+ packfile_store_get_packs_mru(the_repository->objects->packfiles));
if (want != -1)
return want;
}
diff --git a/packfile.c b/packfile.c
index b60faf5c3e..69844dd6cc 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1044,10 +1044,10 @@ struct packed_git *packfile_store_get_packs(struct packfile_store *store)
return store->packs;
}
-struct list_head *get_packed_git_mru(struct repository *r)
+struct list_head *packfile_store_get_packs_mru(struct packfile_store *store)
{
- packfile_store_prepare(r->objects->packfiles);
- return &r->objects->packfiles->mru;
+ packfile_store_prepare(store);
+ return &store->mru;
}
/*
diff --git a/packfile.h b/packfile.h
index 8f501f0094..f6dc26d08a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -134,6 +134,11 @@ void packfile_store_add_pack(struct packfile_store *store,
*/
struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+/*
+ * Get all packs in most-recently-used order.
+ */
+struct list_head *packfile_store_get_packs_mru(struct packfile_store *store);
+
/*
* Open the packfile and add it to the store if it isn't yet known. Returns
* either the newly opened packfile or the preexisting packfile. Returns a
@@ -218,8 +223,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-struct list_head *get_packed_git_mru(struct repository *r);
-
/*
* Give a rough count of objects in the repository. This sacrifices accuracy
* for speed.
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
* Re: [PATCH 01/16] packfile: introduce a new `struct packfile_store`
2025-08-19 8:19 ` [PATCH 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
@ 2025-08-19 9:47 ` Karthik Nayak
2025-08-20 4:58 ` Patrick Steinhardt
2025-08-19 17:32 ` Junio C Hamano
1 sibling, 1 reply; 102+ messages in thread
From: Karthik Nayak @ 2025-08-19 9:47 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 3020 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> Information about an object database's packfiles is currently distributed
> across two different structures:
>
> - `struct packed_git` contains the `next` pointer as well as the
> `mru_head`, both of which serve to store the list of packfiles.
>
> - `struct object_database` contains several fields that relate to the
> packfiles.
>
> So we don't really have a central data structure that tracks our
> packfiles, and consequently responsibilities aren't always clear cut.
> A consequence for the upcoming pluggable object databases is that this
> makes it very hard to move management of packfiles from the object
> database level down into the object database source.
>
> Introduce a new `struct packfile_store` which is about to become the
> single source of truth for managing packfiles. Right now this data
> structure doesn't yet contain anything, but in subsequent patches we
> will move all data structures that relate to packfiles and that are
> currently contained in `struct object_database` into this new home.
>
> Note that this is only a first step: most importantly, we won't (yet)
> move the `struct packed_git::next` pointer around. This will happen in a
> subsequent patch series though so that `struct packed_git` will really
> only host information about the specific packfile it represents.
>
> Further note that the new structure still sits at the wrong level at the
> end of this patch series: as mentioned, it should eventually sit at the
> level of the object database source, not at the object database level.
> But introducing the packfile store now already makes it way easier to
> eventually push down the now-selfcontained data structure by one level.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> odb.c | 1 +
> odb.h | 2 ++
> packfile.c | 13 +++++++++++++
> packfile.h | 18 ++++++++++++++++++
> 4 files changed, 34 insertions(+)
>
> diff --git a/odb.c b/odb.c
> index 2a92a018c4..34b70d0074 100644
> --- a/odb.c
> +++ b/odb.c
> @@ -996,6 +996,7 @@ struct object_database *odb_new(struct repository *repo)
>
> memset(o, 0, sizeof(*o));
> o->repo = repo;
> + o->packfiles = packfile_store_new(o);
> INIT_LIST_HEAD(&o->packed_git_mru);
> hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
> pthread_mutex_init(&o->replace_mutex, NULL);
> diff --git a/odb.h b/odb.h
> index 3dfc66d75a..026ba9386d 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -83,6 +83,7 @@ struct odb_source {
> };
>
> struct packed_git;
> +struct packfile_store;
> struct cached_object_entry;
>
> /*
> @@ -128,6 +129,7 @@ struct object_database {
> *
> * should only be accessed directly by packfile.c
> */
> + struct packfile_store *packfiles;
>
Nit: The newline spacing makes it seem like the comment above only
applies to `struct packfile_store` while actually it also applies to
`struct packed_git`.
> struct packed_git *packed_git;
> /* A most-recently-used ordered version of the packed_git list. */
[snip]
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-19 8:19 ` [PATCH 03/16] odb: move initialization bit " Patrick Steinhardt
@ 2025-08-19 9:57 ` Karthik Nayak
2025-08-19 16:24 ` Junio C Hamano
2025-08-20 4:58 ` [PATCH 03/16] odb: move initialization bit into `struct packfile_store` Patrick Steinhardt
0 siblings, 2 replies; 102+ messages in thread
From: Karthik Nayak @ 2025-08-19 9:57 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 783 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> diff --git a/packfile.h b/packfile.h
> index 1404b80917..573564b19e 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -64,6 +64,12 @@ struct packfile_store {
> * list.
> */
> struct packed_git *packs;
> +
> + /*
> + * Whether packfiles have already been populated with this store's
> + * packs.
> + */
> + unsigned initialized : 1;
> };
>
Nit: I know this is moved from existing code, but might be nice to
adhere to our format rules here and remove spaces around the bit field.
Tangent: Also this is something that is only mentioned in the
'.clang-format' but not in any of our documentation, should we add it to
the documentation? Usage seems to be around the same for both types.
> /*
>
> --
> 2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-19 9:57 ` Karthik Nayak
@ 2025-08-19 16:24 ` Junio C Hamano
2025-08-20 8:04 ` Karthik Nayak
2025-08-20 4:58 ` [PATCH 03/16] odb: move initialization bit into `struct packfile_store` Patrick Steinhardt
1 sibling, 1 reply; 102+ messages in thread
From: Junio C Hamano @ 2025-08-19 16:24 UTC (permalink / raw)
To: Karthik Nayak; +Cc: Patrick Steinhardt, git
Karthik Nayak <karthik.188@gmail.com> writes:
> Tangent: Also this is something that is only mentioned in the
> '.clang-format' but not in any of our documentation, should we add it to
> the documentation? Usage seems to be around the same for both types.
It merely means that whatever .clang-format does is forcing changes
to half of existing code base without any developer input, let alone
consensus, doesn't it?
A quick look around does indicate that the style with spaces on both sides
was dominant in the code base's early days (like 1.0.0), but that
dominance eroded fairly quickly and by the time 1.6.0 was released
it was already half-half (24 among 43 are with spaces). As you
reported, among 216 hits for "^[ ]*unsigned .*:.*;" in header files
(in 2.50.0), 105 of them are with a space after that colon, and the
rest without, so it is really about the same, indeed.
I think it is a good idea to just pick one for new code and stick to
it, and if we can do without churning existing code, that would be
great.
I have personal preferences, and usually I'd like to hear from
others first before mentioning my preference, but for something this
small and does not affect readability very much, perhaps I can just
pick and dictate? I dunno ;-).
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 00/16] packfile: carve out a new packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (15 preceding siblings ...)
2025-08-19 8:19 ` [PATCH 16/16] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
@ 2025-08-19 17:13 ` Junio C Hamano
2025-08-20 13:55 ` Karthik Nayak
` (2 subsequent siblings)
19 siblings, 0 replies; 102+ messages in thread
From: Junio C Hamano @ 2025-08-19 17:13 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> So we don't really have a central data structure that tracks our
> packfiles, and consequently responsibilities aren't always clear cut.
> A consequence for the upcoming pluggable object databases is that this
> makes it very hard to move management of packfiles from the object
> database level down into the object database source.
>
> This patch series introduces a new `struct packfile_store`, which is
> about to become the single source of truth for managing packfiles, and
> carves out the packfile store subsystem.
Nice.
> This is the first step to make packfiles work with pluggable object
> databases. Next steps will be to:
>
> - Move the `struct packed_git::next` and `struct packed_git::mru_head`
> pointers into the packfile store so that `struct packed_git` only
> tracks a single packfile.
>
> - Push the `struct packfile_store` down one level so that it's not
> hosted by the object database anymore, but instead by the object
> database source.
Makes sense. Each packfile belongs to a single $GIT_DIR/objects/ and
together with loose object files in there form a set of objects in a
single object store. When alternates are in effect, I think we
still out of convenience link these packfiles taken from multiple
places into a single list, which a series like this one may have to
untangle.
Thanks.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 01/16] packfile: introduce a new `struct packfile_store`
2025-08-19 8:19 ` [PATCH 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-08-19 9:47 ` Karthik Nayak
@ 2025-08-19 17:32 ` Junio C Hamano
2025-08-20 4:58 ` Patrick Steinhardt
1 sibling, 1 reply; 102+ messages in thread
From: Junio C Hamano @ 2025-08-19 17:32 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> memset(o, 0, sizeof(*o));
> o->repo = repo;
> + o->packfiles = packfile_store_new(o);
Shouldn't this be called o->packfile_store? It is not like a
packfile_store is merely an array of packfile struct, is it?
> @@ -128,6 +129,7 @@ struct object_database {
> *
> * should only be accessed directly by packfile.c
> */
> + struct packfile_store *packfiles;
So odb has a pointer to packfile_store, which in turn has a pointer
to a(nother) odb?
Hmph. It is unclear what this step has achieved (in other words,
there is no obvious thing that the information stored in the new
structure is used to achieve at this step). Let me read on.
Thanks.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 06/16] odb: move kept cache into `struct packfile_store`
2025-08-19 8:19 ` [PATCH 06/16] odb: move kept cache " Patrick Steinhardt
@ 2025-08-19 18:56 ` Junio C Hamano
2025-08-20 4:58 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Junio C Hamano @ 2025-08-19 18:56 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> The object database tracks a cache of "kept" packfiles, which is used by
> git-pack-objects(1) to handle cruft objects. With the introduction of
> the `struct packfile_store` we have a better place to host this cache
> though.
>
> Move the cache accordingly.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> odb.h | 9 +--------
> packfile.c | 16 ++++++++--------
> packfile.h | 5 +++++
> 3 files changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/odb.h b/odb.h
> index 2dc3bdc79d..f1736b067c 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -124,17 +124,10 @@ struct object_database {
> unsigned commit_graph_attempted : 1; /* if loading has been attempted */
>
> /*
> - * private data
> - *
> - * should only be accessed directly by packfile.c
> + * Should only be accessed directly by packfile.c
> */
Hmph, would this be better done in the step [01/16]? Or did the
removal of kept_pack_cache make the last piece of "private data"
disappear with this step?
> struct packfile_store *packfiles;
>
> - struct {
> - struct packed_git **packs;
> - unsigned flags;
> - } kept_pack_cache;
> -
> /*
> * This is meant to hold a *small* number of objects that you would
> * want odb_read_object() to be able to return, but yet you do not want
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 07/16] packfile: reorder functions to avoid function declaration
2025-08-19 8:19 ` [PATCH 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
@ 2025-08-19 19:18 ` Junio C Hamano
0 siblings, 0 replies; 102+ messages in thread
From: Junio C Hamano @ 2025-08-19 19:18 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
Patrick Steinhardt <ps@pks.im> writes:
> Reorder functions so that we can avoid an extra declaration of
> `prepare_packed_git()`.
Makes sense.
We usually call that "forward declaration" instead, though.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 01/16] packfile: introduce a new `struct packfile_store`
2025-08-19 9:47 ` Karthik Nayak
@ 2025-08-20 4:58 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-20 4:58 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Tue, Aug 19, 2025 at 02:47:32AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/odb.h b/odb.h
> > index 3dfc66d75a..026ba9386d 100644
> > --- a/odb.h
> > +++ b/odb.h
> > @@ -128,6 +129,7 @@ struct object_database {
> > *
> > * should only be accessed directly by packfile.c
> > */
> > + struct packfile_store *packfiles;
> >
>
> Nit: The newline spacing makes it seem like the comment above only
> applies to `struct packfile_store` while actually it also applies to
> `struct packed_git`.
Fair. The remaining structs will go away over subsequent commits anyway,
but we can still make this more obvious in this first step.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 01/16] packfile: introduce a new `struct packfile_store`
2025-08-19 17:32 ` Junio C Hamano
@ 2025-08-20 4:58 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-20 4:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, Aug 19, 2025 at 10:32:20AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > memset(o, 0, sizeof(*o));
> > o->repo = repo;
> > + o->packfiles = packfile_store_new(o);
>
> Shouldn't this be called o->packfile_store? It is not like a
> packfile_store is merely an array of packfile struct, is it?
I was mostly aiming for brevity as there's going to be a bunch of sites
that access the variable. But if you feel strongly about it I can adapt.
> > @@ -128,6 +129,7 @@ struct object_database {
> > *
> > * should only be accessed directly by packfile.c
> > */
> > + struct packfile_store *packfiles;
>
> So odb has a pointer to packfile_store, which in turn has a pointer
> to a(nother) odb?
The ODB has a pointer to a packfile store, and that store has a pointer
to its owning ODB.
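The ownership relation described here can be sketched roughly as follows. This is a trimmed-down illustration and not the code from the patch: only `struct packfile_store`, its `odb` back-pointer, the `packfiles` member and `packfile_store_new()` taking the owning ODB come from the series, while `demo_odb_new()` and `demo_odb_free()` are hypothetical helpers that leave out the repository and every other field.

```c
#include <assert.h>
#include <stdlib.h>

struct object_database;

/* Simplified stand-in: the real structure gains more members later on. */
struct packfile_store {
	struct object_database *odb; /* back-pointer to the owning ODB */
};

struct object_database {
	struct packfile_store *packfiles; /* the store owned by this ODB */
};

/* Allocate a store and remember which ODB owns it. */
static struct packfile_store *packfile_store_new(struct object_database *odb)
{
	struct packfile_store *store = calloc(1, sizeof(*store));
	assert(store);
	store->odb = odb;
	return store;
}

/* Hypothetical helper: create an ODB together with its packfile store. */
static struct object_database *demo_odb_new(void)
{
	struct object_database *o = calloc(1, sizeof(*o));
	assert(o);
	o->packfiles = packfile_store_new(o);
	return o;
}

/* Hypothetical helper: release the ODB and the store it owns. */
static void demo_odb_free(struct object_database *o)
{
	free(o->packfiles);
	free(o);
}
```

With this layout, code that is handed only a store can still reach its owner through `store->odb`, so there is one ODB and one store pointing at each other rather than a second ODB.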
> Hmph. It is unclear what this step has achieved (in other words,
> there is no obvious thing that the information stored in the new
> structure is used to achieve at this step). Let me read on.
Well, this series really only cares about encapsulating access to
packfiles so that it becomes easier in subsequent patch series to move
the data structures around. So next to carving out the subsystem we
don't achieve much, but it's a necessary prerequisite.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-19 9:57 ` Karthik Nayak
2025-08-19 16:24 ` Junio C Hamano
@ 2025-08-20 4:58 ` Patrick Steinhardt
2025-08-20 6:24 ` Junio C Hamano
1 sibling, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-20 4:58 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Tue, Aug 19, 2025 at 02:57:54AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > diff --git a/packfile.h b/packfile.h
> > index 1404b80917..573564b19e 100644
> > --- a/packfile.h
> > +++ b/packfile.h
> > @@ -64,6 +64,12 @@ struct packfile_store {
> > * list.
> > */
> > struct packed_git *packs;
> > +
> > + /*
> > + * Whether packfiles have already been populated with this store's
> > + * packs.
> > + */
> > + unsigned initialized : 1;
> > };
> >
>
> Nit: I know this is moved from existing code, but might be nice to
> adhere to our format rules here and remove spaces around the bit field.
>
> Tangent: Also this is something that is only mentioned in the
> '.clang-format' but not in any of our documentation, should we add it to
> the documentation? Usage seems to be around the same for both types.
Well, now that booleans are allowed I think we should just stop using
width specifiers like this altogether and instead use bool. There's
probably still going to be cases where we use those, but I assume that
the majority of uses of this syntax are for flags.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 06/16] odb: move kept cache into `struct packfile_store`
2025-08-19 18:56 ` Junio C Hamano
@ 2025-08-20 4:58 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-20 4:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, Aug 19, 2025 at 11:56:54AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/odb.h b/odb.h
> > index 2dc3bdc79d..f1736b067c 100644
> > --- a/odb.h
> > +++ b/odb.h
> > @@ -124,17 +124,10 @@ struct object_database {
> > unsigned commit_graph_attempted : 1; /* if loading has been attempted */
> >
> > /*
> > - * private data
> > - *
> > - * should only be accessed directly by packfile.c
> > + * Should only be accessed directly by packfile.c
> > */
>
> Hmph, would this be better done in the step [01/16]? Or did the
> removal of kept_pack_cache make the last piece of "private data"
> disappear with this step?
Yeah, the latter. All packfile-related private data is now encapsulated
in `struct packfile_store`, so I felt like the comment became redundant
with this commit here.
I'll mention this in the commit message.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-20 4:58 ` [PATCH 03/16] odb: move initialization bit into `struct packfile_store` Patrick Steinhardt
@ 2025-08-20 6:24 ` Junio C Hamano
0 siblings, 0 replies; 102+ messages in thread
From: Junio C Hamano @ 2025-08-20 6:24 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Karthik Nayak, git
Patrick Steinhardt <ps@pks.im> writes:
> Well, now that booleans are allowed I think we should just stop using
> width specifiers like this altogether and instead use bool. There's
> probably still going to be cases where we use those, but I assume that
> the majority of uses of this syntax are for flags.
Yeah, unless the structure with these members is designed to exist
in the millions in core (like "struct object" and its descendants),
bool is fine.
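To illustrate the size concern, here is a minimal sketch comparing the two styles. The struct names and the choice of eight flags are made up for the example and are not taken from Git; the exact sizes are implementation-defined, so the code only checks that the bit-field variant is not larger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example: eight flags packed into one storage unit. */
struct demo_flags_as_bits {
	unsigned a : 1, b : 1, c : 1, d : 1;
	unsigned e : 1, f : 1, g : 1, h : 1;
};

/* The same eight flags, each occupying at least one byte. */
struct demo_flags_as_bools {
	bool a, b, c, d;
	bool e, f, g, h;
};

/* Holds on common ABIs, where the eight bit-fields share one unsigned. */
static_assert(sizeof(struct demo_flags_as_bits) <=
	      sizeof(struct demo_flags_as_bools),
	      "bit-fields should not take more room than bools here");

/* Memory needed for `count` instances of a struct of the given size. */
static size_t demo_total_bytes(size_t count, size_t struct_size)
{
	return count * struct_size;
}
```

For a structure that exists once per object database the difference is a handful of bytes, which is why `bool` is attractive there; multiplied across millions of instances via something like `demo_total_bytes()`, the difference starts to matter.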
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-19 16:24 ` Junio C Hamano
@ 2025-08-20 8:04 ` Karthik Nayak
2025-08-22 23:50 ` Junio C Hamano
0 siblings, 1 reply; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 8:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 1641 bytes --]
Junio C Hamano <gitster@pobox.com> writes:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> Tangent: Also this is something that is only mentioned in the
>> '.clang-format' but not in any of our documentation, should we add it to
>> the documentation? Usage seems to be around the same for both types.
>
> It merely means that whatever .clang-format does is forcing changes
> to half of existing code base without any developer input, let alone
> consensus, doesn't it?
>
Yup indeed.
> A quick look around does indicate that the style with spaces on both sides
> was dominant in the code base's early days (like 1.0.0), but that
> dominance eroded fairly quickly and by the time 1.6.0 was released
> it was already half-half (24 among 43 are with spaces). As you
> reported, among 216 hits for "^[ ]*unsigned .*:.*;" in header files
> (in 2.50.0), 105 of them are with a space after that colon, and the
> rest without, so it is really about the same, indeed.
>
> I think it is a good idea to just pick one for new code and stick to
> it, and if we can do without churning existing code, that would be
> great.
>
My thoughts here too. I do like the spaced version since it reads better
for me, but in the end, I care more for consistency.
> I have personal preferences, and usually I'd like to hear from
> others first before mentioning my preference, but for something this
> small and does not affect readability very much, perhaps I can just
> pick and dictate? I dunno ;-).
I wouldn't mind if you picked one over the other, like I mentioned, I
care more that we make it consistent and that the formatter can notify
or fix it for us.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 05/16] odb: move MRU list of packfiles into `struct packfile_store`
2025-08-19 8:19 ` [PATCH 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
@ 2025-08-20 12:44 ` Karthik Nayak
2025-08-20 19:20 ` Jeff King
0 siblings, 1 reply; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 12:44 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 1133 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> The object database tracks the list of packfiles in most-recently-used
> order, which is mostly used to favor reading from packfiles that contain
> most of the objects that we're currently accessing. With the
> introduction of the `struct packfile_store` we have a better place to
> host this list though.
>
> Move the list accordingly.
[snip]
> diff --git a/packfile.h b/packfile.h
> index 2f84d7d7e6..3022f3a19e 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -65,6 +65,9 @@ struct packfile_store {
> */
> struct packed_git *packs;
>
> + /* A most-recently-used ordered version of the packs list. */
> + struct list_head mru;
> +
> /*
> * A map of packfile names to packed_git structs for tracking which
> * packs have been loaded already.
>
> --
> 2.51.0.261.g7ce5a0a67e.dirty
Question: for my understanding, so we maintain a list of `packed_git`
packfiles in `packfile_store.packs` and then the same list is
available in a MRU form in `packfile_store.mru`?
I assume this is to optimize searches to use the mru form? Is there a
reason to not pick the mru list?
Thanks
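For readers unfamiliar with the idea, "most-recently-used order" can be sketched as a move-to-front list: whenever a lookup hits an entry, that entry is moved to the head so that the next lookup tries it first. The code below is a self-contained illustration with hypothetical `demo_*` names; it does not use Git's `list.h`, and matching by name merely stands in for "this pack contains the object we are looking for".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_pack {
	const char *name;
	struct demo_pack *prev, *next;
};

/* Circular doubly linked list with a sentinel head node. */
struct demo_mru {
	struct demo_pack head;
};

static void demo_mru_init(struct demo_mru *m)
{
	m->head.name = NULL;
	m->head.prev = m->head.next = &m->head;
}

/* Add a pack at the tail, i.e. as the least recently used entry. */
static void demo_mru_append(struct demo_mru *m, struct demo_pack *p)
{
	p->prev = m->head.prev;
	p->next = &m->head;
	m->head.prev->next = p;
	m->head.prev = p;
}

/* Unlink a pack from wherever it is and reinsert it at the front. */
static void demo_mru_move_to_front(struct demo_mru *m, struct demo_pack *p)
{
	p->prev->next = p->next;
	p->next->prev = p->prev;
	p->next = m->head.next;
	p->prev = &m->head;
	m->head.next->prev = p;
	m->head.next = p;
}

/* Walk the list in MRU order; on a hit, promote the pack and return it. */
static struct demo_pack *demo_mru_lookup(struct demo_mru *m, const char *name)
{
	struct demo_pack *p;

	assert(name);
	for (p = m->head.next; p != &m->head; p = p->next) {
		if (!strcmp(p->name, name)) {
			demo_mru_move_to_front(m, p);
			return p;
		}
	}
	return NULL;
}
```

The promotion on a hit is the same idea as the `list_move()` call in the `want_object_in_pack_mtime()` hunk of patch 16.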
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 09/16] packfile: split up responsibilities of `reprepare_packed_git()`
2025-08-19 8:19 ` [PATCH 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
@ 2025-08-20 13:17 ` Karthik Nayak
0 siblings, 0 replies; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:17 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> In `reprepare_packed_git()` we perform a couple of operations:
>
> - We reload alternate object directories.
>
> - We clear the loose object cache.
>
> - We reprepare packfiles.
>
> While the logic is hosted in "packfile.c", it clearly reaches into other
> subsystems that aren't related to packfiles.
>
> Split up the responsibility and introduce `odb_reprepare()` which now
> becomes responsible for repreparing the whole object database. The
> existing `reprepare_packed_git()` function is refactored accordingly and
> only cares about reloading the packfile store now.
[snip]
> diff --git a/odb.h b/odb.h
> index f1736b067c..9810ec60a0 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -155,6 +155,12 @@ struct object_database {
> struct object_database *odb_new(struct repository *repo);
> void odb_clear(struct object_database *o);
>
> +/*
> + * Clear caches, reload alternates and then reload object sources so that new
> + * objects may become accessible.
> + */
> +void odb_reprepare(struct object_database *o);
>
I was first wondering why you don't go into details like mentioning the
packfile cleanup. But since we eventually will move it into its own
object source, this reads better.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 11/16] packfile: always add packfiles to MRU when adding a pack
2025-08-19 8:19 ` [PATCH 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
@ 2025-08-20 13:35 ` Karthik Nayak
0 siblings, 0 replies; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:35 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 938 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> When adding a packfile to its store we add it both to the list and map of
> packfiles, but we don't append it to the most-recently-used list of
> packs. We do know to add the packfile to the MRU list as soon as we
> access any of its objects, but in between we're being inconsistent. It
> doesn't help that there are some subsystems that _do_ add the packfile
> to the MRU after having added it, which only adds to the confusion.
>
> Refactor the code so that we unconditionally add packfiles to the MRU
> when adding them to a packfile store.
This makes sense.
It would be really nice if the internals of packfile_store were private,
so that users wouldn't need to know or care about the MRU in the first
place. Up to this patch, we seem to be moving in that direction: as more
things move into `packfile_store`, we can slowly abstract it behind a
strong set of APIs and keep the internals private.
[snip]
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 12/16] packfile: introduce function to load and add packfiles
2025-08-19 8:19 ` [PATCH 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
@ 2025-08-20 13:41 ` Karthik Nayak
2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:41 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 7020 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> We have a recurring pattern where we essentially perform an upsert of a
> packfile in case it isn't yet known by the packfile store. The logic to
> do so is non-trivial as we have to reconstruct the packfile's key, check
> the map of packfiles, then create the new packfile and finally add it to
> the store.
>
I was just thinking about this in the previous patch and how it seemed
weird that the midx.c file was checking and adding a packfile, so good
to see this.
> Introduce a new function that does this dance for us. Refactor callsites
> to use it.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/fast-import.c | 4 ++--
> builtin/index-pack.c | 10 +++-------
> midx.c | 18 ++----------------
> packfile.c | 44 +++++++++++++++++++++++++++++++-------------
> packfile.h | 8 ++++++++
> 5 files changed, 46 insertions(+), 38 deletions(-)
>
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index e9d82b31c3..a26e79689d 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -897,11 +897,11 @@ static void end_packfile(void)
> idx_name = keep_pack(create_index());
>
> /* Register the packfile with core git's machinery. */
> - new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
> + new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
> + idx_name, 1);
>
I assume that the 'packfile_store_load_pack' function here returns a
new/existing packfile.
> if (!new_p)
> die("core git rejected index %s", idx_name);
> all_packs[pack_id] = new_p;
> - packfile_store_add_pack(the_repository->objects->packfiles, new_p);
> free(idx_name);
>
> /* Print the boundary */
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index ed490dfad4..2b78ba7fe4 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
> rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
> hash, "idx", 1);
>
> - if (do_fsck_object) {
> - struct packed_git *p;
> - p = add_packed_git(the_repository, final_index_name,
> - strlen(final_index_name), 0);
> - if (p)
> - packfile_store_add_pack(the_repository->objects->packfiles, p);
> - }
> + if (do_fsck_object)
> + packfile_store_load_pack(the_repository->objects->packfiles,
> + final_index_name, 0);
>
> if (!from_stdin) {
> printf("%s\n", hash_to_hex(hash));
> diff --git a/midx.c b/midx.c
> index 3cfe7884ad..d30feda019 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -454,7 +454,6 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
> uint32_t pack_int_id)
> {
> struct strbuf pack_name = STRBUF_INIT;
> - struct strbuf key = STRBUF_INIT;
> struct packed_git *p;
>
> pack_int_id = midx_for_pack(&m, pack_int_id);
> @@ -466,22 +465,9 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
>
> strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
> m->pack_names[pack_int_id]);
> -
> - /* pack_map holds the ".pack" name, but we have the .idx */
> - strbuf_addbuf(&key, &pack_name);
> - strbuf_strip_suffix(&key, ".idx");
> - strbuf_addstr(&key, ".pack");
> - p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
> - strhash(key.buf), key.buf,
> - struct packed_git, packmap_ent);
> - if (!p) {
> - p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
> - if (p)
> - packfile_store_add_pack(r->objects->packfiles, p);
> - }
> -
> + p = packfile_store_load_pack(r->objects->packfiles,
> + pack_name.buf, m->local);
> strbuf_release(&pack_name);
> - strbuf_release(&key);
>
> if (!p) {
> m->packs[pack_int_id] = MIDX_PACK_ERROR;
> diff --git a/packfile.c b/packfile.c
> index 8b5e6b96ce..f7916543a6 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
> list_add_tail(&pack->mru, &store->mru);
> }
>
> +struct packed_git *packfile_store_load_pack(struct packfile_store *store,
> + const char *idx_path, int local)
> +{
> + struct strbuf key = STRBUF_INIT;
> + struct packed_git *p;
> +
> + /*
> + * We're being called with the path to the index file, but `pack_map`
> + * holds the path to the packfile itself.
> + */
> + strbuf_addstr(&key, idx_path);
> + strbuf_strip_suffix(&key, ".idx");
> + strbuf_addstr(&key, ".pack");
> +
> + p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
> + struct packed_git, packmap_ent);
I was wondering about this in an earlier patch too: is there a reason not
to simply use 'strmap' for 'packfile_store.map'?
> + if (!p) {
> + p = add_packed_git(store->odb->repo, idx_path,
> + strlen(idx_path), local);
> + if (p)
> + packfile_store_add_pack(store, p);
> + }
> +
> + strbuf_release(&key);
> + return p;
> +}
> +
> void (*report_garbage)(unsigned seen_bits, const char *path);
>
> static void report_helper(const struct string_list *list,
> @@ -892,23 +919,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
> const char *file_name, void *_data)
> {
> struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
> - struct packed_git *p;
> size_t base_len = full_name_len;
>
> if (strip_suffix_mem(full_name, &base_len, ".idx") &&
> !(data->m && midx_contains_pack(data->m, file_name))) {
> - struct hashmap_entry hent;
> - char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
> - unsigned int hash = strhash(pack_name);
> - hashmap_entry_init(&hent, hash);
> -
> - /* Don't reopen a pack we already have. */
> - if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
> - p = add_packed_git(data->r, full_name, full_name_len, data->local);
> - if (p)
> - packfile_store_add_pack(data->r->objects->packfiles, p);
> - }
> - free(pack_name);
> + char *trimmed_path = xstrndup(full_name, full_name_len);
> + packfile_store_load_pack(data->r->objects->packfiles,
> + trimmed_path, data->local);
> + free(trimmed_path);
> }
>
> if (!report_garbage)
> diff --git a/packfile.h b/packfile.h
> index e751a5d93e..4971f18f51 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -119,6 +119,14 @@ void packfile_store_reprepare(struct packfile_store *store);
> void packfile_store_add_pack(struct packfile_store *store,
> struct packed_git *pack);
>
> +/*
> + * Open the packfile and add it to the store if it isn't yet known. Returns
> + * either the newly opened packfile or the preexisting packfile. Returns a
> + * `NULL` pointer in case the packfile could not be opened.
> + */
> +struct packed_git *packfile_store_load_pack(struct packfile_store *store,
> + const char *idx_path, int local);
> +
This seems in line with my expectations.
> struct pack_window {
> struct pack_window *next;
> unsigned char *base;
>
> --
> 2.51.0.261.g7ce5a0a67e.dirty
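For readers following along, the key derivation at the top of
`packfile_store_load_pack()` can be sketched in isolation. This is only an
illustration under assumptions: `pack_key_from_idx_path()` is a hypothetical
helper that uses plain libc instead of Git's `strbuf` API, and it is not part
of the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: derive the ".pack" path that is
 * used as the lookup key from the path of an ".idx" file. Returns a newly
 * allocated string that the caller must free(), or NULL if the allocation
 * fails.
 */
static char *pack_key_from_idx_path(const char *idx_path)
{
	static const char idx_suffix[] = ".idx";
	static const char pack_suffix[] = ".pack";
	size_t len = strlen(idx_path);
	size_t suffix_len = strlen(idx_suffix);
	char *key;

	/* Strip a trailing ".idx" if present; otherwise keep the path as-is. */
	if (len >= suffix_len &&
	    !strcmp(idx_path + len - suffix_len, idx_suffix))
		len -= suffix_len;

	key = malloc(len + strlen(pack_suffix) + 1);
	if (!key)
		return NULL;
	memcpy(key, idx_path, len);
	strcpy(key + len, pack_suffix);
	return key;
}
```

With such a helper, "objects/pack/pack-1234.idx" maps to
"objects/pack/pack-1234.pack", which is the form of key the map lookup in the
patch expects.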
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 14/16] packfile: remove `get_packed_git()`
2025-08-19 8:19 ` [PATCH 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
@ 2025-08-20 13:50 ` Karthik Nayak
2025-08-21 6:40 ` Patrick Steinhardt
2025-08-20 13:51 ` Karthik Nayak
1 sibling, 1 reply; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:50 UTC (permalink / raw)
To: Patrick Steinhardt, git
Patrick Steinhardt <ps@pks.im> writes:
> We have two different functions to retrieve packfiles for a packfile
> store:
>
> - `get_packed_git()` returns the list of packfiles directly.
>
> - `get_all_packs()` does more work and also prepares packfiles that
> are being indexed by a multi-pack-index.
>
Question: in what situation would a packfile not be returned by
`get_packed_git()` but still be indexed by a multi-pack-index?
> The distinction is not immediately obvious. Furthermore, to make the
> situation even worse, `get_packed_git()` would return the same result as
> `get_all_packs()` once the latter has been called once as they both
> refer to the same list.
>
> As it turns out, the distinction isn't necessary. We only have a couple
> of callers of `get_packed_git()`, and all of those callers are prepared
> to call `get_all_packs()` instead:
>
> - "builtin/gc.c": We explicitly check how many packfiles aren't
> contained in the multi-pack-index, so loading extra packfiles that
> are indexed by it won't change the result.
>
> - "builtin/grep.c": We only care `get_packed_git()` to prepare eagerly
> load packfiles. In the preceding commit we have started to expose
Nit: the first sentence reads a bit weird.
> `packfile_store_prepare()`, which is a more direct way of achieving
> the same result.
>
> - "object-name.c": `find_abbrev_len_for_pack()` and `unique_in_pack()`
> exit early in case the multi-pack index is set, so both callsites of
> `get_packed_git()` know to handle packs loaded via the MIDX already.
>
> Convert all of these sites to use `get_all_packs()` instead and remove
> `get_packed_git()`.
>
[snip]
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 14/16] packfile: remove `get_packed_git()`
2025-08-19 8:19 ` [PATCH 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
2025-08-20 13:50 ` Karthik Nayak
@ 2025-08-20 13:51 ` Karthik Nayak
1 sibling, 0 replies; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:51 UTC (permalink / raw)
To: Patrick Steinhardt, git
Patrick Steinhardt <ps@pks.im> writes:
[snip]
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 1d30d1af2c..565afda51f 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1422,7 +1422,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
> if (incremental_repack_auto_limit < 0)
> return 1;
>
> - for (p = get_packed_git(the_repository);
> + for (p = get_all_packs(the_repository);
> count < incremental_repack_auto_limit && p;
> p = p->next) {
> if (!p->multi_pack_index)
Nit: We can get rid of the curly braces here
[snip]
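For context, the loop in question counts the packs that are not covered by a
multi-pack-index and stops as soon as the limit is reached. A self-contained
sketch of that logic without the braces, assuming toy types:
`struct toy_gc_pack` and `toy_auto_condition()` are made up for illustration,
and `in_midx` stands in for `p->multi_pack_index`. It only mirrors the part of
the function visible in the hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for `struct packed_git`; only the fields the loop needs. */
struct toy_gc_pack {
	bool in_midx;                 /* stand-in for `p->multi_pack_index` */
	struct toy_gc_pack *next;
};

/*
 * Return 1 if at least `limit` packs are not covered by the MIDX, or if
 * the limit is negative. The walk stops early once the limit is reached.
 */
static int toy_auto_condition(const struct toy_gc_pack *packs, int limit)
{
	const struct toy_gc_pack *p;
	int count = 0;

	if (limit < 0)
		return 1;

	for (p = packs; count < limit && p; p = p->next)
		if (!p->in_midx)
			count++;

	return count >= limit;
}
```

Since the loop body is a single `if` statement, dropping the braces does not
change behaviour.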
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store
2025-08-19 8:19 ` [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
@ 2025-08-20 13:53 ` Karthik Nayak
2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:53 UTC (permalink / raw)
To: Patrick Steinhardt, git
Patrick Steinhardt <ps@pks.im> writes:
> The `get_all_packs()` function prepares the packfile store and then
> returns its packfiles. Refactor it to accept a packfile store instead of
> a repository to clarify its scope.
>
[snip]
Nit: From running the clang formatter, small cleanups:
diff --git a/builtin/gc.c b/builtin/gc.c
index 030d0b0c77..41433b31ed 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1424,10 +1424,9 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
for (p = packfile_store_get_packs(the_repository->objects->packfiles);
count < incremental_repack_auto_limit && p;
- p = p->next) {
+ p = p->next)
if (!p->multi_pack_index)
count++;
- }
return count >= incremental_repack_auto_limit;
}
@@ -1491,13 +1490,12 @@ static off_t get_auto_pack_size(void)
struct repository *r = the_repository;
odb_reprepare(r->objects);
- for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next)
if (p->pack_size > max_size) {
second_largest_size = max_size;
max_size = p->pack_size;
} else if (p->pack_size > second_largest_size)
second_largest_size = p->pack_size;
- }
result_size = second_largest_size + 1;
diff --git a/http-backend.c b/http-backend.c
index be4d8263a5..c5779db79d 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -608,16 +608,14 @@ static void get_info_packs(struct strbuf *hdr, char *arg UNUSED)
size_t cnt = 0;
select_getanyfile(hdr);
- for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (p->pack_local)
cnt++;
- }
strbuf_grow(&buf, cnt * 53 + 2);
- for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (p->pack_local)
strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
- }
strbuf_addch(&buf, '\n');
hdr_nocache(hdr);
diff --git a/http.c b/http.c
index 16a1ab54f3..bf8711d6f8 100644
--- a/http.c
+++ b/http.c
@@ -2416,10 +2416,9 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
* If we already have the pack locally, no need to fetch its index or
* even add it to list; we already have all of its objects.
*/
- for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (hasheq(p->hash, sha1, the_repository->hash_algo))
return 0;
- }
tmp_idx = fetch_pack_index(sha1, base_url);
if (!tmp_idx)
^ permalink raw reply related [flat|nested] 102+ messages in thread
* Re: [PATCH 00/16] packfile: carve out a new packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (16 preceding siblings ...)
2025-08-19 17:13 ` [PATCH 00/16] packfile: carve out a new " Junio C Hamano
@ 2025-08-20 13:55 ` Karthik Nayak
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
19 siblings, 0 replies; 102+ messages in thread
From: Karthik Nayak @ 2025-08-20 13:55 UTC (permalink / raw)
To: Patrick Steinhardt, git
Patrick Steinhardt <ps@pks.im> writes:
> Hi,
>
> information about a object database's packfiles is currently distributed
> across two different structures:
>
> - `struct packed_git` contains the `next` pointer as well as the
> `mru_head`, both of which serve to store the list of packfiles.
>
> - `struct object_database` contains several fields that relate to the
> packfiles.
>
> So we don't really have a central data structure that tracks our
> packfiles, and consequently responsibilities aren't always clear cut.
> A consequence for the upcoming pluggable object databases is that this
> makes it very hard to move management of packfiles from the object
> database level down into the object database source.
>
> This patch series introduces a new `struct packfile_store`, which is
> about to become the single source of truth for managing packfiles, and
> carves out the packfile store subsystem.
>
> This is the first step to make packfiles work with pluggable object
> databases. Next steps will be to:
>
> - Move the `struct packed_git::next` and `struct packed::mru_head`
> pointers into the packfile store so that `struct packed_git` only
> tracks a single packfile.
>
> - Push the `struct packfile_store` down one level so that it's not
> hosted by the object database anymore, but instead by the object
> database source.
>
> Thanks!
>
> Patrick
>
Hello Patrick,
I took some time to read through your patches and comment on them. I
only had some small nits.
Overall they look good to me, but I must say I don't know much about
this part of the codebase.
Thanks,
Karthik
[snip]
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 05/16] odb: move MRU list of packfiles into `struct packfile_store`
2025-08-20 12:44 ` Karthik Nayak
@ 2025-08-20 19:20 ` Jeff King
2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Jeff King @ 2025-08-20 19:20 UTC (permalink / raw)
To: Karthik Nayak; +Cc: Patrick Steinhardt, git
On Wed, Aug 20, 2025 at 05:44:36AM -0700, Karthik Nayak wrote:
> Question: for my understanding, so we maintain a list of `packed_git`
> packfiles in `packfile_store.packs` and then the same list is
> available in a MRU form in `packfile_store.mru`?
>
> I assume this is to optimize searches to use the mru form? Is there a
> reason to not pick the mru list?
Yeah, I believe you should find the same packfiles in the linked list
currently formed by packed_git.next and the MRU list. When I introduced
the MRU list long ago, I cowardly did not want to take the risk that
somebody depended on the order of the original list, and left it in
place.
I think it would _probably_ be OK to just keep a single list (and
manipulate it to keep the MRU property). But whoever does so should take a
close look and make sure that is true. The biggest risk I can think of
is that there could be some code iterating over the packfiles, coupled
with object lookups in their loop body. If those lookups reorder the
list, that would screw up the iteration.
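A toy sketch of that failure mode, under loud assumptions: it uses a
hand-rolled singly linked list rather than Git's `list_head`, and
`toy_mru_mark()` and `toy_count_visited()` are hypothetical names that only
model "a lookup moves the pack it hit to the front".

```c
#include <stddef.h>

/* Toy stand-in for a pack in a single, MRU-ordered list. */
struct toy_pack {
	const char *name;
	struct toy_pack *next;
};

/* Move `p` to the front of the list, as an MRU lookup hit would. */
static void toy_mru_mark(struct toy_pack **head, struct toy_pack *p)
{
	struct toy_pack **link;

	if (*head == p)
		return;
	for (link = head; *link; link = &(*link)->next) {
		if (*link == p) {
			*link = p->next;  /* unlink from its old position */
			p->next = *head;  /* relink at the front */
			*head = p;
			return;
		}
	}
}

/*
 * Walk the list via `next` and count the packs we visit. If `hit` is
 * non-NULL, simulate an object lookup during the first iteration that
 * finds its object in `hit` and therefore moves `hit` to the front.
 */
static size_t toy_count_visited(struct toy_pack **head, struct toy_pack *hit)
{
	struct toy_pack *p;
	size_t visited = 0;

	for (p = *head; p; p = p->next) {
		if (hit && !visited)
			toy_mru_mark(head, hit);
		visited++;
	}
	return visited;
}
```

With a list a, b, c, a plain walk visits all three packs. If a lookup during
the first iteration hits c, then c is moved in front of the iterator's current
position and the walk ends after a and b without ever seeing c.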
IMHO it is probably better to do a change like that outside of Patrick's
series. There already is a lot going on with moving fields around, and
consolidating the lists can happen on top (and would be made easier for
having pulled it out into two adjacent lists).
-Peff
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 05/16] odb: move MRU list of packfiles into `struct packfile_store`
2025-08-20 19:20 ` Jeff King
@ 2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 6:40 UTC (permalink / raw)
To: Jeff King; +Cc: Karthik Nayak, git
On Wed, Aug 20, 2025 at 03:20:08PM -0400, Jeff King wrote:
> On Wed, Aug 20, 2025 at 05:44:36AM -0700, Karthik Nayak wrote:
>
> > Question: for my understanding, so we maintain a list of `packed_git`
> > packfiles in `packfile_store.packs` and then the same list is
> > available in a MRU form in `packfile_store.mru`?
> >
> > I assume this is to optimize searches to use the mru form? Is there a
> > reason to not pick the mru list?
>
> Yeah, I believe you should find the same packfiles in the linked list
> currently formed by packed_git.next and the MRU list. When I introduced
> the MRU list long ago, I cowardly did not want to take the risk that
> somebody depended on the order of the original list, and left it in
> place.
>
> I think it would _probably_ be OK to just keep a single list (and
> manipulate it to keep the MRU property). But whoever does should take a
> close look and make sure that is true. The biggest risk I can think of
> is that there could be some code iterating over the packfiles, coupled
> with object lookups in their loop body. If those lookups reorder the
> list, that would screw up the iteration.
>
> IMHO it is probably better to do a change like that outside of Patrick's
> series. There already is a lot going on with moving fields around, and
> consolidating the lists can happen on top (and would be made easier for
> having pulled it out into two adjacent lists).
There have in fact been a couple of sites where we _didn't_ add packfiles to
the MRU list: "builtin/fast-import.c", "builtin/index-pack.c", and
"packfile.c" via `prepare_pack()`. This is no longer going to be the
case at the end of this patch series, as a later patch adjusts
`packfile_store_add_pack()` to handle this for us.
But I agree, this is something I'd rather want to push into a subsequent
patch series. I've already got one cooking where I change how the lists
are getting handled, so that's a good opportunity to do so.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 12/16] packfile: introduce function to load and add packfiles
2025-08-20 13:41 ` Karthik Nayak
@ 2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 6:40 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Wed, Aug 20, 2025 at 06:41:23AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > index e9d82b31c3..a26e79689d 100644
> > --- a/builtin/fast-import.c
> > +++ b/builtin/fast-import.c
> > @@ -897,11 +897,11 @@ static void end_packfile(void)
> > idx_name = keep_pack(create_index());
> >
> > /* Register the packfile with core git's machinery. */
> > - new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
> > + new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
> > + idx_name, 1);
> >
>
> I assume that the 'packfile_store_load_pack' function here returns a
> new/existing packfile.
Yes, exactly.
> > diff --git a/packfile.c b/packfile.c
> > index 8b5e6b96ce..f7916543a6 100644
> > --- a/packfile.c
> > +++ b/packfile.c
> > @@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
> > list_add_tail(&pack->mru, &store->mru);
> > }
> >
> > +struct packed_git *packfile_store_load_pack(struct packfile_store *store,
> > + const char *idx_path, int local)
> > +{
> > + struct strbuf key = STRBUF_INIT;
> > + struct packed_git *p;
> > +
> > + /*
> > + * We're being called with the path to the index file, but `pack_map`
> > + * holds the path to the packfile itself.
> > + */
> > + strbuf_addstr(&key, idx_path);
> > + strbuf_strip_suffix(&key, ".idx");
> > + strbuf_addstr(&key, ".pack");
> > +
> > + p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
> > + struct packed_git, packmap_ent);
>
> I was wondering from an earlier patch too, is there a reason to simply
> not use 'strmap' for 'packfile_store.map'?
Hm. I cannot think of any, no. I'll leave this as-is in this patch
series though and move it into the next one where I'm revamping how
packfiles are stored.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 14/16] packfile: remove `get_packed_git()`
2025-08-20 13:50 ` Karthik Nayak
@ 2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 6:40 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Wed, Aug 20, 2025 at 06:50:05AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > We have two different functions to retrieve packfiles for a packfile
> > store:
> >
> > - `get_packed_git()` returns the list of packfiles directly.
> >
> > - `get_all_packs()` does more work and also prepares packfiles that
> > are being indexed by a multi-pack-index.
> >
>
> Question, under what situation would a packfile not returned by
> `get_packed_git()` but be indexed by a multi-pack-index.
Okay, this whole commit message isn't all that enlightening. The only
difference between these two functions is that `get_packed_git()` only
calls `packfile_store_prepare()` and then returns the list of packs,
whereas `get_all_packs()` does the same and then also prepares the pack
so that it's properly marked as being indexed by an MIDX.
Now that means that `get_packed_git()` does _less_ work. But both
functions already load the MIDX via `packfile_store_prepare()` anyway,
so the only difference is that we now also call `prepare_midx_pack()`
for each indexed pack. And that function does not do a lot: we have
already loaded the pack itself, so `packfile_store_load_pack()` returns
the already-in-memory pack. So all we end up doing is to figure out
whether the pack is indexed in the MIDX and then store this info in
`p->multi_pack_index` as well as `m->packs[pack_int_id]`.
So the amount of extra work shouldn't really matter. I'll rephrase the
commit message.
> > The distinction is not immediately obvious. Furthermore, to make the
> > situation even worse, `get_packed_git()` would return the same result as
> > `get_all_packs()` once the latter has been called once as they both
> > refer to the same list.
> >
> > As it turns out, the distinction isn't necessary. We only have a couple
> > of callers of `get_packed_git()`, and all of those callers are prepared
> > to call `get_all_packs()` instead:
> >
> > - "builtin/gc.c": We explicitly check how many packfiles aren't
> > contained in the multi-pack-index, so loading extra packfiles that
> > are indexed by it won't change the result.
> >
> > - "builtin/grep.c": We only care `get_packed_git()` to prepare eagerly
> > load packfiles. In the preceding commit we have started to expose
>
> Nit: the first sentence reads a bit weird.
Ah, that was supposed to read "call", not "care".
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store
2025-08-20 13:53 ` Karthik Nayak
@ 2025-08-21 6:40 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 6:40 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Wed, Aug 20, 2025 at 06:53:34AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > The `get_all_packs()` function prepares the packfile store and then
> > returns its packfiles. Refactor it to accept a packfile store instead of
> > a repository to clarify its scope.
> >
>
> [snip]
>
> Nit: From running the clang formatter, small cleanups:
These would all be while-at-it changes though, so I'd prefer to punt on
them for now.
Patrick
^ permalink raw reply [flat|nested] 102+ messages in thread
* [PATCH v2 00/16] packfile: carve out a new packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (17 preceding siblings ...)
2025-08-20 13:55 ` Karthik Nayak
@ 2025-08-21 7:38 ` Patrick Steinhardt
2025-08-21 7:38 ` [PATCH v2 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
` (15 more replies)
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
19 siblings, 16 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:38 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
Hi,
information about an object database's packfiles is currently distributed
across two different structures:
- `struct packed_git` contains the `next` pointer as well as the
`mru_head`, both of which serve to store the list of packfiles.
- `struct object_database` contains several fields that relate to the
packfiles.
So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.
This patch series introduces a new `struct packfile_store`, which is
about to become the single source of truth for managing packfiles, and
carves out the packfile store subsystem.
This is the first step to make packfiles work with pluggable object
databases. Next steps will be to:
- Move the `struct packed_git::next` and `struct packed_git::mru_head`
pointers into the packfile store so that `struct packed_git` only
tracks a single packfile.
- Push the `struct packfile_store` down one level so that it's not
hosted by the object database anymore, but instead by the object
database source.
Changes in v2:
- Convert the `initialized` flag into a boolean.
- Polish some commit messages.
- Some smaller formatting changes to the layout of `struct
object_database`.
- Link to v1: https://lore.kernel.org/r/20250819-b4-pks-packfiles-store-v1-0-1660842e125a@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (16):
packfile: introduce a new `struct packfile_store`
odb: move list of packfiles into `struct packfile_store`
odb: move initialization bit into `struct packfile_store`
odb: move packfile map into `struct packfile_store`
odb: move MRU list of packfiles into `struct packfile_store`
odb: move kept cache into `struct packfile_store`
packfile: reorder functions to avoid function declaration
packfile: refactor `prepare_packed_git()` to work on packfile store
packfile: split up responsibilities of `reprepare_packed_git()`
packfile: refactor `install_packed_git()` to work on packfile store
packfile: always add packfiles to MRU when adding a pack
packfile: introduce function to load and add packfiles
packfile: move `get_multi_pack_index()` into "midx.c"
packfile: remove `get_packed_git()`
packfile: refactor `get_all_packs()` to work on packfile store
packfile: refactor `get_packed_git_mru()` to work on packfile store
builtin/backfill.c | 2 +-
builtin/cat-file.c | 2 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 8 +-
builtin/fsck.c | 8 +-
builtin/gc.c | 12 +-
builtin/grep.c | 2 +-
builtin/index-pack.c | 10 +-
builtin/pack-objects.c | 22 ++--
builtin/pack-redundant.c | 4 +-
builtin/receive-pack.c | 2 +-
builtin/repack.c | 8 +-
bulk-checkin.c | 2 +-
connected.c | 4 +-
fetch-pack.c | 4 +-
http-backend.c | 4 +-
http.c | 4 +-
http.h | 2 +-
midx.c | 26 ++--
midx.h | 2 +
object-name.c | 6 +-
odb.c | 37 ++++--
odb.h | 34 ++---
pack-bitmap.c | 4 +-
pack-objects.c | 2 +-
packfile.c | 293 ++++++++++++++++++++++++--------------------
packfile.h | 110 ++++++++++++++---
server-info.c | 2 +-
t/helper/test-find-pack.c | 2 +-
t/helper/test-pack-mtimes.c | 2 +-
transport-helper.c | 2 +-
31 files changed, 353 insertions(+), 271 deletions(-)
Range-diff versus v1:
1: 49001587ad ! 1: 5f89325948 packfile: introduce a new `struct packfile_store`
@@ odb.h: struct object_database {
*
* should only be accessed directly by packfile.c
*/
+-
+ struct packfile_store *packfiles;
-
struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
+ struct list_head packed_git_mru;
## packfile.c ##
@@ packfile.c: int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
2: 40e8926e48 ! 2: bf3e61a3ac odb: move list of packfiles into `struct packfile_store`
@@ odb.c: void odb_clear(struct object_database *o)
## odb.h ##
@@ odb.h: struct object_database {
+ * should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
-
- struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
struct list_head packed_git_mru;
@@ packfile.h: struct packed_git {
+
+ /*
+ * The list of packfiles in the order in which they are being added to
-+ * the store. The local packfile typically sits at the head of this
-+ * list.
++ * the store.
+ */
+ struct packed_git *packs;
};
3: 7dacc2aeb3 ! 3: e0a167b952 odb: move initialization bit into `struct packfile_store`
@@ Commit message
database. With the introduction of the `struct packfile_store` we have a
better place to host this bit though.
- Move it accordingly.
+ Move it accordingly. While at it, convert the field into a boolean now
+ that we're allowed to use them in our code base.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
@@ packfile.c: static void prepare_packed_git(struct repository *r)
prepare_packed_git_mru(r);
- r->objects->packed_git_initialized = 1;
-+ r->objects->packfiles->initialized = 1;
++ r->objects->packfiles->initialized = true;
}
void reprepare_packed_git(struct repository *r)
@@ packfile.c: void reprepare_packed_git(struct repository *r)
r->objects->approximate_object_count_valid = 0;
- r->objects->packed_git_initialized = 0;
-+ r->objects->packfiles->initialized = 0;
++ r->objects->packfiles->initialized = false;
prepare_packed_git(r);
obj_read_unlock();
}
## packfile.h ##
@@ packfile.h: struct packfile_store {
- * list.
+ * the store.
*/
struct packed_git *packs;
+
@@ packfile.h: struct packfile_store {
+ * Whether packfiles have already been populated with this store's
+ * packs.
+ */
-+ unsigned initialized : 1;
++ bool initialized;
};
/*
4: 970ddf62ba = 4: 5940ac0cc3 odb: move packfile map into `struct packfile_store`
5: 300f68f5f3 ! 5: e1c81f4258 odb: move MRU list of packfiles into `struct packfile_store`
@@ odb.h
#include "oidmap.h"
#include "string-list.h"
@@ odb.h: struct object_database {
+ * should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
-
- /* A most-recently-used ordered version of the packed_git list. */
- struct list_head packed_git_mru;
-
6: 97d9a8322e ! 6: 96eb610300 odb: move kept cache into `struct packfile_store`
@@ Commit message
Move the cache accordingly.
+ This moves the last bit of packfile-related state from the object
+ database into the packfile store. Adapt the comment for the `packfiles`
+ pointer in `struct object_database` to reflect this.
+
Signed-off-by: Patrick Steinhardt <ps@pks.im>
## odb.h ##
@@ odb.h: struct object_database {
+ * Should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
-
- struct {
- struct packed_git **packs;
- unsigned flags;
- } kept_pack_cache;
--
+
/*
* This is meant to hold a *small* number of objects that you would
- * want odb_read_object() to be able to return, but yet you do not want
## packfile.c ##
@@ packfile.c: int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
7: 7e5c801326 ! 7: 2a4392afeb packfile: reorder functions to avoid function declaration
@@ Metadata
## Commit message ##
packfile: reorder functions to avoid function declaration
- Reorder functions so that we can avoid an extra declaration of
+ Reorder functions so that we can avoid a forward declaration of
`prepare_packed_git()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
8: 967232564f ! 8: e92deae728 packfile: refactor `prepare_packed_git()` to work on packfile store
@@ packfile.c: static int sort_pack(const struct packed_git *a, const struct packed
+ sort_packs(&store->packs, sort_pack);
- prepare_packed_git_mru(r);
-- r->objects->packfiles->initialized = 1;
+- r->objects->packfiles->initialized = true;
+ packfile_store_prepare_mru(store);
-+ store->initialized = 1;
++ store->initialized = true;
}
void reprepare_packed_git(struct repository *r)
@@ packfile.c: void reprepare_packed_git(struct repository *r)
r->objects->approximate_object_count_valid = 0;
- r->objects->packfiles->initialized = 0;
+ r->objects->packfiles->initialized = false;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
obj_read_unlock();
9: 11860d282d ! 9: 2fcc09332f packfile: split up responsibilities of `reprepare_packed_git()`
@@ odb.h: struct object_database {
## packfile.c ##
@@ packfile.c: static void packfile_store_prepare(struct packfile_store *store)
- store->initialized = 1;
+ store->initialized = true;
}
-void reprepare_packed_git(struct repository *r)
@@ packfile.c: static void packfile_store_prepare(struct packfile_store *store)
- odb_clear_loose_cache(source);
-
- r->objects->approximate_object_count_valid = 0;
-- r->objects->packfiles->initialized = 0;
+- r->objects->packfiles->initialized = false;
- packfile_store_prepare(r->objects->packfiles);
- obj_read_unlock();
-+ store->initialized = 0;
++ store->initialized = false;
+ packfile_store_prepare(store);
}
10: 7472988a1f = 10: 3097fd3d85 packfile: refactor `install_packed_git()` to work on packfile store
11: e705f1fbf4 = 11: aff98b7a67 packfile: always add packfiles to MRU when adding a pack
12: 70b4d4921e = 12: ab55a14dbc packfile: introduce function to load and add packfiles
13: 80326900fa = 13: abb34ffc8a packfile: move `get_multi_pack_index()` into "midx.c"
14: 98219d6ed4 ! 14: 273f2d698e packfile: remove `get_packed_git()`
@@ Commit message
We have two different functions to retrieve packfiles for a packfile
store:
- - `get_packed_git()` returns the list of packfiles directly.
+ - `get_packed_git()` returns the list of packfiles after having called
+ `prepare_packed_git()`.
- - `get_all_packs()` does more work and also prepares packfiles that
- are being indexed by a multi-pack-index.
+ - `get_all_packs()` calls `prepare_packed_git()`, as well, but also
+ calls `prepare_midx_pack()` for each pack.
- The distinction is not immediately obvious. Furthermore, to make the
- situation even worse, `get_packed_git()` would return the same result as
- `get_all_packs()` once the latter has been called once as they both
- refer to the same list.
+ This means that the latter function also properly loads the info of
+ whether or not a packfile is part of a multi-pack index. Preparing this
+ extra information also shouldn't be significantly more expensive:
- As it turns out, the distinction isn't necessary. We only have a couple
- of callers of `get_packed_git()`, and all of those callers are prepared
- to call `get_all_packs()` instead:
+ - We have already loaded all packfiles via `prepare_packed_git_one()`.
+ So given that multi-pack indices may only refer to packfiles in the
+ same object directory we know that we already loaded each packfile.
- - "builtin/gc.c": We explicitly check how many packfiles aren't
- contained in the multi-pack-index, so loading extra packfiles that
- are indexed by it won't change the result.
+ - The multi-pack index was prepared via `packfile_store_prepare()`
+ already, which calls `prepare_multi_pack_index_one()`.
- - "builtin/grep.c": We only care `get_packed_git()` to prepare eagerly
- load packfiles. In the preceding commit we have started to expose
- `packfile_store_prepare()`, which is a more direct way of achieving
- the same result.
+ - So all that remains to be done is to look up the index of the pack
+ in its multi-pack index so that we can store that info in both the
+ pack itself and the MIDX.
- - "object-name.c": `find_abbrev_len_for_pack()` and `unique_in_pack()`
- exit early in case the multi-pack index is set, so both callsites of
- `get_packed_git()` know to handle packs loaded via the MIDX already.
+ So it is somewhat confusing to readers that one of these two functions
+ claims to load "all" packfiles while the other one doesn't, even though
+ the ultimate difference is way more nuanced.
Convert all of these sites to use `get_all_packs()` instead and remove
- `get_packed_git()`.
+ `get_packed_git()`. There doesn't seem to be a good reason to discern
+ these two functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
15: 3b39e6c4c4 = 15: 15bc2b858c packfile: refactor `get_all_packs()` to work on packfile store
16: 4b2a3429d6 = 16: 0bfb6cf52a packfile: refactor `get_packed_git_mru()` to work on packfile store
---
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
change-id: 20250806-b4-pks-packfiles-store-a44a608ca396
^ permalink raw reply [flat|nested] 102+ messages in thread
* [PATCH v2 01/16] packfile: introduce a new `struct packfile_store`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
@ 2025-08-21 7:38 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 02/16] odb: move list of packfiles into " Patrick Steinhardt
` (14 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:38 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
Information about an object database's packfiles is currently distributed
across two different structures:
- `struct packed_git` contains the `next` pointer as well as the
`mru_head`, both of which serve to store the list of packfiles.
- `struct object_database` contains several fields that relate to the
packfiles.
So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.
Introduce a new `struct packfile_store` which is about to become the
single source of truth for managing packfiles. Right now this data
structure doesn't yet contain anything, but in subsequent patches we
will move all data structures that relate to packfiles and that are
currently contained in `struct object_database` into this new home.
Note that this is only a first step: most importantly, we won't (yet)
move the `struct packed_git::next` pointer around. This will happen in a
subsequent patch series though so that `struct packed_git` will really
only host information about the specific packfile it represents.
Further note that the new structure still sits at the wrong level at the
end of this patch series: as mentioned, it should eventually sit at the
level of the object database source, not at the object database level.
But introducing the packfile store now already makes it way easier to
eventually push down the now self-contained data structure by one level.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.c | 1 +
odb.h | 3 ++-
packfile.c | 13 +++++++++++++
packfile.h | 18 ++++++++++++++++++
4 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/odb.c b/odb.c
index 2a92a018c4..34b70d0074 100644
--- a/odb.c
+++ b/odb.c
@@ -996,6 +996,7 @@ struct object_database *odb_new(struct repository *repo)
memset(o, 0, sizeof(*o));
o->repo = repo;
+ o->packfiles = packfile_store_new(o);
INIT_LIST_HEAD(&o->packed_git_mru);
hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
pthread_mutex_init(&o->replace_mutex, NULL);
diff --git a/odb.h b/odb.h
index 3dfc66d75a..08c3a01f3b 100644
--- a/odb.h
+++ b/odb.h
@@ -83,6 +83,7 @@ struct odb_source {
};
struct packed_git;
+struct packfile_store;
struct cached_object_entry;
/*
@@ -128,7 +129,7 @@ struct object_database {
*
* should only be accessed directly by packfile.c
*/
-
+ struct packfile_store *packfiles;
struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
struct list_head packed_git_mru;
diff --git a/packfile.c b/packfile.c
index 5d73932f50..8fbf1cfc2d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2333,3 +2333,16 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
*len = hdr - out;
return 0;
}
+
+struct packfile_store *packfile_store_new(struct object_database *odb)
+{
+ struct packfile_store *store;
+ CALLOC_ARRAY(store, 1);
+ store->odb = odb;
+ return store;
+}
+
+void packfile_store_free(struct packfile_store *store)
+{
+ free(store);
+}
diff --git a/packfile.h b/packfile.h
index f16753f2a9..8d31fd619a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -52,6 +52,24 @@ struct packed_git {
char pack_name[FLEX_ARRAY]; /* more */
};
+/*
+ * A store that manages packfiles for a given object database.
+ */
+struct packfile_store {
+ struct object_database *odb;
+};
+
+/*
+ * Allocate and initialize a new empty packfile store for the given object
+ * database.
+ */
+struct packfile_store *packfile_store_new(struct object_database *odb);
+
+/*
+ * Free the packfile store and all its associated state.
+ */
+void packfile_store_free(struct packfile_store *store);
+
static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
const struct hashmap_entry *entry,
const struct hashmap_entry *entry2,
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
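For readers following along, the ownership relation introduced by this
patch can be sketched in isolation. This is a minimal standalone sketch
and not Git's code: the `demo_*` names are hypothetical, and plain
`calloc()` stands in for Git's `CALLOC_ARRAY()`. It only shows that the
database owns the store while the store keeps a back-pointer to its
database.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Git's types; only the fields used here. */
struct demo_odb;

struct demo_store {
	struct demo_odb *odb; /* back-pointer to the owning database */
};

struct demo_odb {
	struct demo_store *packfiles; /* the store owned by this database */
};

/* Allocate a zeroed store and remember which database it belongs to. */
struct demo_store *demo_store_new(struct demo_odb *odb)
{
	struct demo_store *store = calloc(1, sizeof(*store));
	if (!store)
		abort();
	store->odb = odb;
	return store;
}

/* The store holds no other state yet, so freeing it is a single call. */
void demo_store_free(struct demo_store *store)
{
	free(store);
}
```

A caller would assign the result of `demo_store_new(&odb)` to
`odb.packfiles` and release it again with `demo_store_free()` once the
database is torn down.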
* [PATCH v2 02/16] odb: move list of packfiles into `struct packfile_store`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
2025-08-21 7:38 ` [PATCH v2 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-25 23:42 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 03/16] odb: move initialization bit " Patrick Steinhardt
` (13 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The object database tracks the list of packfiles it currently knows
about. With the introduction of the `struct packfile_store` we have a
better place to host this list though.
Move the list accordingly. Extract the logic from `odb_clear()` that
knows to close all such packfiles and move it into the new subsystem, as
well.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.c | 11 +----------
odb.h | 1 -
packfile.c | 47 ++++++++++++++++++++++++++++++-----------------
packfile.h | 15 ++++++++++++++-
4 files changed, 45 insertions(+), 29 deletions(-)
diff --git a/odb.c b/odb.c
index 34b70d0074..17a9135cbd 100644
--- a/odb.c
+++ b/odb.c
@@ -1038,16 +1038,7 @@ void odb_clear(struct object_database *o)
INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
-
- /*
- * `close_object_store()` only closes the packfiles, but doesn't free
- * them. We thus have to do this manually.
- */
- for (struct packed_git *p = o->packed_git, *next; p; p = next) {
- next = p->next;
- free(p);
- }
- o->packed_git = NULL;
+ packfile_store_free(o->packfiles);
hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
diff --git a/odb.h b/odb.h
index 08c3a01f3b..6f901c5ac0 100644
--- a/odb.h
+++ b/odb.h
@@ -130,7 +130,6 @@ struct object_database {
* should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
struct list_head packed_git_mru;
diff --git a/packfile.c b/packfile.c
index 8fbf1cfc2d..6478e4cc30 100644
--- a/packfile.c
+++ b/packfile.c
@@ -278,7 +278,7 @@ static int unuse_one_window(struct packed_git *current)
if (current)
scan_windows(current, &lru_p, &lru_w, &lru_l);
- for (p = current->repo->objects->packed_git; p; p = p->next)
+ for (p = current->repo->objects->packfiles->packs; p; p = p->next)
scan_windows(p, &lru_p, &lru_w, &lru_l);
if (lru_p) {
munmap(lru_w->base, lru_w->len);
@@ -362,13 +362,8 @@ void close_pack(struct packed_git *p)
void close_object_store(struct object_database *o)
{
struct odb_source *source;
- struct packed_git *p;
- for (p = o->packed_git; p; p = p->next)
- if (p->do_not_close)
- BUG("want to close pack marked 'do-not-close'");
- else
- close_pack(p);
+ packfile_store_close(o->packfiles);
for (source = o->sources; source; source = source->next) {
if (source->midx)
@@ -468,7 +463,7 @@ static int close_one_pack(struct repository *r)
struct pack_window *mru_w = NULL;
int accept_windows_inuse = 1;
- for (p = r->objects->packed_git; p; p = p->next) {
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
if (p->pack_fd == -1)
continue;
find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
@@ -789,8 +784,8 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
if (pack->pack_fd != -1)
pack_open_fds++;
- pack->next = r->objects->packed_git;
- r->objects->packed_git = pack;
+ pack->next = r->objects->packfiles->packs;
+ r->objects->packfiles->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
@@ -974,7 +969,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
count += m->num_objects;
}
- for (p = r->objects->packed_git; p; p = p->next) {
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
if (open_pack_index(p))
continue;
count += p->num_objects;
@@ -1015,7 +1010,7 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
static void rearrange_packed_git(struct repository *r)
{
- sort_packs(&r->objects->packed_git, sort_pack);
+ sort_packs(&r->objects->packfiles->packs, sort_pack);
}
static void prepare_packed_git_mru(struct repository *r)
@@ -1024,7 +1019,7 @@ static void prepare_packed_git_mru(struct repository *r)
INIT_LIST_HEAD(&r->objects->packed_git_mru);
- for (p = r->objects->packed_git; p; p = p->next)
+ for (p = r->objects->packfiles->packs; p; p = p->next)
list_add_tail(&p->mru, &r->objects->packed_git_mru);
}
@@ -1074,7 +1069,7 @@ void reprepare_packed_git(struct repository *r)
struct packed_git *get_packed_git(struct repository *r)
{
prepare_packed_git(r);
- return r->objects->packed_git;
+ return r->objects->packfiles->packs;
}
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
@@ -1095,7 +1090,7 @@ struct packed_git *get_all_packs(struct repository *r)
prepare_midx_pack(r, m, i);
}
- return r->objects->packed_git;
+ return r->objects->packfiles->packs;
}
struct list_head *get_packed_git_mru(struct repository *r)
@@ -1220,7 +1215,7 @@ const struct packed_git *has_packed_and_bad(struct repository *r,
{
struct packed_git *p;
- for (p = r->objects->packed_git; p; p = p->next)
+ for (p = r->objects->packfiles->packs; p; p = p->next)
if (oidset_contains(&p->bad_objects, oid))
return p;
return NULL;
@@ -2081,7 +2076,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
if (source->midx && fill_midx_entry(r, oid, e, source->midx))
return 1;
- if (!r->objects->packed_git)
+ if (!r->objects->packfiles->packs)
return 0;
list_for_each(pos, &r->objects->packed_git_mru) {
@@ -2344,5 +2339,23 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
void packfile_store_free(struct packfile_store *store)
{
+ packfile_store_close(store);
+
+ for (struct packed_git *p = store->packs, *next; p; p = next) {
+ next = p->next;
+ free(p);
+ }
+
free(store);
}
+
+void packfile_store_close(struct packfile_store *store)
+{
+ struct packed_git *p;
+
+ for (p = store->packs; p; p = p->next)
+ if (p->do_not_close)
+ BUG("want to close pack marked 'do-not-close'");
+ else
+ close_pack(p);
+}
diff --git a/packfile.h b/packfile.h
index 8d31fd619a..d7ac8d24b4 100644
--- a/packfile.h
+++ b/packfile.h
@@ -57,6 +57,12 @@ struct packed_git {
*/
struct packfile_store {
struct object_database *odb;
+
+ /*
+ * The list of packfiles in the order in which they are being added to
+ * the store.
+ */
+ struct packed_git *packs;
};
/*
@@ -66,10 +72,17 @@ struct packfile_store {
struct packfile_store *packfile_store_new(struct object_database *odb);
/*
- * Free the packfile store and all its associated state.
+ * Free the packfile store and all its associated state. All packfiles
+ * tracked by the store will be closed.
*/
void packfile_store_free(struct packfile_store *store);
+/*
+ * Close all packfiles associated with this store. The packfiles won't be
+ * free'd, so they can be re-opened at a later point in time.
+ */
+void packfile_store_close(struct packfile_store *store);
+
static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
const struct hashmap_entry *entry,
const struct hashmap_entry *entry2,
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
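The split between closing and freeing that this patch moves into the
store can be illustrated on a simplified list. The following is a
hedged sketch with hypothetical `demo_*` names: a pack is reduced to a
file descriptor field, and "closing" merely resets that field instead
of calling `close_pack()`. Note how the release loop saves `next`
before freeing each node, the same pattern the patch uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced version of a pack list entry. */
struct demo_pack {
	struct demo_pack *next;
	int fd; /* -1 when the pack is closed */
};

struct demo_store {
	struct demo_pack *packs;
};

/* Prepend a new pack to the list, as install_packed_git() does. */
void demo_store_add(struct demo_store *store, int fd)
{
	struct demo_pack *p = calloc(1, sizeof(*p));
	if (!p)
		abort();
	p->fd = fd;
	p->next = store->packs;
	store->packs = p;
}

/* Close every pack but keep the list entries around for later reuse. */
void demo_store_close(struct demo_store *store)
{
	for (struct demo_pack *p = store->packs; p; p = p->next)
		p->fd = -1;
}

/* Close all packs, then free the entries themselves. */
void demo_store_release(struct demo_store *store)
{
	demo_store_close(store);
	for (struct demo_pack *p = store->packs, *next; p; p = next) {
		next = p->next; /* read before the node is freed */
		free(p);
	}
	store->packs = NULL;
}
```

After `demo_store_close()` the entries are still reachable, which is
what allows packs to be reopened later; only `demo_store_release()`
makes the list unusable.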
* [PATCH v2 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
2025-08-21 7:38 ` [PATCH v2 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 02/16] odb: move list of packfiles into " Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 1:40 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 04/16] odb: move packfile map " Patrick Steinhardt
` (12 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The object database knows to skip re-initializing the list of packfiles
in case it's already been initialized. Whether or not that is the case
is tracked via a separate `initialized` bit that is stored in the object
database. With the introduction of the `struct packfile_store` we have a
better place to host this bit though.
Move it accordingly. While at it, convert the field into a boolean now
that we're allowed to use the `bool` type in our code base.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.h | 6 ------
packfile.c | 6 +++---
packfile.h | 6 ++++++
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/odb.h b/odb.h
index 6f901c5ac0..98e038fa73 100644
--- a/odb.h
+++ b/odb.h
@@ -161,12 +161,6 @@ struct object_database {
unsigned long approximate_object_count;
unsigned approximate_object_count_valid : 1;
- /*
- * Whether packed_git has already been populated with this repository's
- * packs.
- */
- unsigned packed_git_initialized : 1;
-
/*
* Submodule source paths that will be added as additional sources to
* allow lookup of submodule objects via the main object database.
diff --git a/packfile.c b/packfile.c
index 6478e4cc30..17f770e0e0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,7 +1027,7 @@ static void prepare_packed_git(struct repository *r)
{
struct odb_source *source;
- if (r->objects->packed_git_initialized)
+ if (r->objects->packfiles->initialized)
return;
odb_prepare_alternates(r->objects);
@@ -1039,7 +1039,7 @@ static void prepare_packed_git(struct repository *r)
rearrange_packed_git(r);
prepare_packed_git_mru(r);
- r->objects->packed_git_initialized = 1;
+ r->objects->packfiles->initialized = true;
}
void reprepare_packed_git(struct repository *r)
@@ -1061,7 +1061,7 @@ void reprepare_packed_git(struct repository *r)
odb_clear_loose_cache(source);
r->objects->approximate_object_count_valid = 0;
- r->objects->packed_git_initialized = 0;
+ r->objects->packfiles->initialized = false;
prepare_packed_git(r);
obj_read_unlock();
}
diff --git a/packfile.h b/packfile.h
index d7ac8d24b4..cf81091175 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,6 +63,12 @@ struct packfile_store {
* the store.
*/
struct packed_git *packs;
+
+ /*
+ * Whether packfiles have already been populated with this store's
+ * packs.
+ */
+ bool initialized;
};
/*
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
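The role of the `initialized` bit can be shown with a small standalone
sketch. The names below are hypothetical and the expensive directory
scan is replaced by a counter; the point is only that a prepare call is
a no-op once the flag is set, while a reprepare call clears the flag
and scans again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical store that only tracks whether it has been populated. */
struct demo_store {
	bool initialized;
	int scans; /* how often the (simulated) expensive scan ran */
};

/* Populate the store unless that has already happened. */
void demo_prepare(struct demo_store *store)
{
	if (store->initialized)
		return;
	store->scans++; /* stands in for scanning the pack directory */
	store->initialized = true;
}

/* Force a rescan by dropping the flag before preparing again. */
void demo_reprepare(struct demo_store *store)
{
	store->initialized = false;
	demo_prepare(store);
}
```

Calling `demo_prepare()` twice in a row scans once; a subsequent
`demo_reprepare()` scans a second time.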
* [PATCH v2 04/16] odb: move packfile map into `struct packfile_store`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (2 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 03/16] odb: move initialization bit " Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 1:41 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
` (11 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The object database tracks a map of packfiles by their respective paths,
which is used to figure out whether a given packfile has already been
loaded. With the introduction of the `struct packfile_store` we have a
better place to host this map though.
Move the map accordingly. `pack_map_entry_cmp()` isn't used anywhere but
in "packfile.c" anymore after this change, so we convert it to a static
function, as well.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 2 +-
odb.c | 2 --
odb.h | 6 ------
packfile.c | 20 ++++++++++++++++++--
packfile.h | 20 ++++++--------------
5 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/midx.c b/midx.c
index 7d407682e6..7f3f74ef2b 100644
--- a/midx.c
+++ b/midx.c
@@ -471,7 +471,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
strbuf_addbuf(&key, &pack_name);
strbuf_strip_suffix(&key, ".idx");
strbuf_addstr(&key, ".pack");
- p = hashmap_get_entry_from_hash(&r->objects->pack_map,
+ p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
strhash(key.buf), key.buf,
struct packed_git, packmap_ent);
if (!p) {
diff --git a/odb.c b/odb.c
index 17a9135cbd..568c820ef8 100644
--- a/odb.c
+++ b/odb.c
@@ -998,7 +998,6 @@ struct object_database *odb_new(struct repository *repo)
o->repo = repo;
o->packfiles = packfile_store_new(o);
INIT_LIST_HEAD(&o->packed_git_mru);
- hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
pthread_mutex_init(&o->replace_mutex, NULL);
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ -1040,6 +1039,5 @@ void odb_clear(struct object_database *o)
close_object_store(o);
packfile_store_free(o->packfiles);
- hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
}
diff --git a/odb.h b/odb.h
index 98e038fa73..fb37c6ebce 100644
--- a/odb.h
+++ b/odb.h
@@ -147,12 +147,6 @@ struct object_database {
struct cached_object_entry *cached_objects;
size_t cached_object_nr, cached_object_alloc;
- /*
- * A map of packfiles to packed_git structs for tracking which
- * packs have been loaded already.
- */
- struct hashmap pack_map;
-
/*
* A fast, rough count of the number of objects in the repository.
* These two fields are not meant for direct access. Use
diff --git a/packfile.c b/packfile.c
index 17f770e0e0..752a0cee8d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -788,7 +788,7 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
r->objects->packfiles->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
- hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
+ hashmap_add(&r->objects->packfiles->map, &pack->packmap_ent);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -901,7 +901,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
hashmap_entry_init(&hent, hash);
/* Don't reopen a pack we already have. */
- if (!hashmap_get(&data->r->objects->pack_map, &hent, pack_name)) {
+ if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
p = add_packed_git(data->r, full_name, full_name_len, data->local);
if (p)
install_packed_git(data->r, p);
@@ -2329,11 +2329,26 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
return 0;
}
+static int pack_map_entry_cmp(const void *cmp_data UNUSED,
+ const struct hashmap_entry *entry,
+ const struct hashmap_entry *entry2,
+ const void *keydata)
+{
+ const char *key = keydata;
+ const struct packed_git *pg1, *pg2;
+
+ pg1 = container_of(entry, const struct packed_git, packmap_ent);
+ pg2 = container_of(entry2, const struct packed_git, packmap_ent);
+
+ return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
+}
+
struct packfile_store *packfile_store_new(struct object_database *odb)
{
struct packfile_store *store;
CALLOC_ARRAY(store, 1);
store->odb = odb;
+ hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
return store;
}
@@ -2346,6 +2361,7 @@ void packfile_store_free(struct packfile_store *store)
free(p);
}
+ hashmap_clear(&store->map);
free(store);
}
diff --git a/packfile.h b/packfile.h
index cf81091175..9bbef51164 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,12 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /*
+ * A map of packfile names to packed_git structs for tracking which
+ * packs have been loaded already.
+ */
+ struct hashmap map;
+
/*
* Whether packfiles have already been populated with this store's
* packs.
@@ -89,20 +95,6 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
-static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *entry,
- const struct hashmap_entry *entry2,
- const void *keydata)
-{
- const char *key = keydata;
- const struct packed_git *pg1, *pg2;
-
- pg1 = container_of(entry, const struct packed_git, packmap_ent);
- pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
- return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
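The comparison callback that this patch makes static compares against
an explicit key when one is given and against the second entry
otherwise. A reduced sketch of that convention, using a hypothetical
`demo_pack` type and no hashmap at all, might look like this:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical entry type; the real one embeds a hashmap entry. */
struct demo_pack {
	const char *name;
};

/*
 * Return 0 when `a` matches. If `key` is non-NULL it is used as the
 * name to match against; otherwise the name of `b` is used.
 */
int demo_pack_cmp(const struct demo_pack *a, const struct demo_pack *b,
		  const char *key)
{
	return strcmp(a->name, key ? key : b->name);
}
```

This lets a lookup by name succeed without first constructing a full
entry: the caller passes the candidate name as `key`, and the second
entry's name is ignored.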
* [PATCH v2 05/16] odb: move MRU list of packfiles into `struct packfile_store`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (3 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 04/16] odb: move packfile map " Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 06/16] odb: move kept cache " Patrick Steinhardt
` (10 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The object database tracks the list of packfiles in most-recently-used
order, which is mostly used to favor reading from packfiles that contain
most of the objects that we're currently accessing. With the
introduction of the `struct packfile_store` we have a better place to
host this list though.
Move the list accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 2 +-
odb.c | 2 --
odb.h | 4 ----
packfile.c | 11 ++++++-----
packfile.h | 3 +++
5 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/midx.c b/midx.c
index 7f3f74ef2b..7fa2b8473a 100644
--- a/midx.c
+++ b/midx.c
@@ -478,7 +478,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
if (p) {
install_packed_git(r, p);
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
}
diff --git a/odb.c b/odb.c
index 568c820ef8..80ec6fc1fa 100644
--- a/odb.c
+++ b/odb.c
@@ -997,7 +997,6 @@ struct object_database *odb_new(struct repository *repo)
memset(o, 0, sizeof(*o));
o->repo = repo;
o->packfiles = packfile_store_new(o);
- INIT_LIST_HEAD(&o->packed_git_mru);
pthread_mutex_init(&o->replace_mutex, NULL);
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ -1035,7 +1034,6 @@ void odb_clear(struct object_database *o)
free((char *) o->cached_objects[i].value.buf);
FREE_AND_NULL(o->cached_objects);
- INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
packfile_store_free(o->packfiles);
diff --git a/odb.h b/odb.h
index fb37c6ebce..1505e39729 100644
--- a/odb.h
+++ b/odb.h
@@ -3,7 +3,6 @@
#include "hashmap.h"
#include "object.h"
-#include "list.h"
#include "oidset.h"
#include "oidmap.h"
#include "string-list.h"
@@ -130,9 +129,6 @@ struct object_database {
* should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- /* A most-recently-used ordered version of the packed_git list. */
- struct list_head packed_git_mru;
-
struct {
struct packed_git **packs;
unsigned flags;
diff --git a/packfile.c b/packfile.c
index 752a0cee8d..91a7a4064f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1017,10 +1017,10 @@ static void prepare_packed_git_mru(struct repository *r)
{
struct packed_git *p;
- INIT_LIST_HEAD(&r->objects->packed_git_mru);
+ INIT_LIST_HEAD(&r->objects->packfiles->mru);
for (p = r->objects->packfiles->packs; p; p = p->next)
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
static void prepare_packed_git(struct repository *r)
@@ -1096,7 +1096,7 @@ struct packed_git *get_all_packs(struct repository *r)
struct list_head *get_packed_git_mru(struct repository *r)
{
prepare_packed_git(r);
- return &r->objects->packed_git_mru;
+ return &r->objects->packfiles->mru;
}
unsigned long unpack_object_header_buffer(const unsigned char *buf,
@@ -2079,10 +2079,10 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
if (!r->objects->packfiles->packs)
return 0;
- list_for_each(pos, &r->objects->packed_git_mru) {
+ list_for_each(pos, &r->objects->packfiles->mru) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) {
- list_move(&p->mru, &r->objects->packed_git_mru);
+ list_move(&p->mru, &r->objects->packfiles->mru);
return 1;
}
}
@@ -2348,6 +2348,7 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
struct packfile_store *store;
CALLOC_ARRAY(store, 1);
store->odb = odb;
+ INIT_LIST_HEAD(&store->mru);
hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
return store;
}
diff --git a/packfile.h b/packfile.h
index 9bbef51164..d48d46cc1b 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,9 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /* A most-recently-used ordered version of the packs list. */
+ struct list_head mru;
+
/*
* A map of packfile names to packed_git structs for tracking which
* packs have been loaded already.
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
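The most-recently-used behaviour described above boils down to moving
a pack to the front of the list whenever a lookup hits it. The sketch
below reimplements a tiny circular doubly linked list for illustration;
the `demo_*` names are hypothetical and it does not use Git's "list.h".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular list node; `head` is a sentinel without data. */
struct demo_node {
	struct demo_node *prev, *next;
	int id;
};

void demo_list_init(struct demo_node *head)
{
	head->prev = head->next = head;
}

void demo_list_add_tail(struct demo_node *n, struct demo_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink `n` and reinsert it directly after the sentinel. */
void demo_list_move_front(struct demo_node *n, struct demo_node *head)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Find the node with `id`; a hit becomes the new front of the list. */
struct demo_node *demo_lookup(struct demo_node *head, int id)
{
	for (struct demo_node *n = head->next; n != head; n = n->next) {
		if (n->id == id) {
			demo_list_move_front(n, head);
			return n;
		}
	}
	return NULL;
}
```

Repeated lookups that hit the same node find it on the first
iteration, which is the effect the MRU ordering is after.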
* [PATCH v2 06/16] odb: move kept cache into `struct packfile_store`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (4 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 1:46 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
` (9 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The object database tracks a cache of "kept" packfiles, which is used by
git-pack-objects(1) to handle cruft objects. With the introduction of
the `struct packfile_store` we have a better place to host this cache
though.
Move the cache accordingly.
This moves the last bit of packfile-related state from the object
database into the packfile store. Adapt the comment for the `packfiles`
pointer in `struct object_database` to reflect this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.h | 8 +-------
packfile.c | 16 ++++++++--------
packfile.h | 5 +++++
3 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/odb.h b/odb.h
index 1505e39729..f1736b067c 100644
--- a/odb.h
+++ b/odb.h
@@ -124,15 +124,9 @@ struct object_database {
unsigned commit_graph_attempted : 1; /* if loading has been attempted */
/*
- * private data
- *
- * should only be accessed directly by packfile.c
+ * Should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- struct {
- struct packed_git **packs;
- unsigned flags;
- } kept_pack_cache;
/*
* This is meant to hold a *small* number of objects that you would
diff --git a/packfile.c b/packfile.c
index 91a7a4064f..07c574f359 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2092,19 +2092,19 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
static void maybe_invalidate_kept_pack_cache(struct repository *r,
unsigned flags)
{
- if (!r->objects->kept_pack_cache.packs)
+ if (!r->objects->packfiles->kept_cache.packs)
return;
- if (r->objects->kept_pack_cache.flags == flags)
+ if (r->objects->packfiles->kept_cache.flags == flags)
return;
- FREE_AND_NULL(r->objects->kept_pack_cache.packs);
- r->objects->kept_pack_cache.flags = 0;
+ FREE_AND_NULL(r->objects->packfiles->kept_cache.packs);
+ r->objects->packfiles->kept_cache.flags = 0;
}
struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
{
maybe_invalidate_kept_pack_cache(r, flags);
- if (!r->objects->kept_pack_cache.packs) {
+ if (!r->objects->packfiles->kept_cache.packs) {
struct packed_git **packs = NULL;
size_t nr = 0, alloc = 0;
struct packed_git *p;
@@ -2127,11 +2127,11 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
ALLOC_GROW(packs, nr + 1, alloc);
packs[nr] = NULL;
- r->objects->kept_pack_cache.packs = packs;
- r->objects->kept_pack_cache.flags = flags;
+ r->objects->packfiles->kept_cache.packs = packs;
+ r->objects->packfiles->kept_cache.flags = flags;
}
- return r->objects->kept_pack_cache.packs;
+ return r->objects->packfiles->kept_cache.packs;
}
int find_kept_pack_entry(struct repository *r,
diff --git a/packfile.h b/packfile.h
index d48d46cc1b..74cea1a4a9 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,11 @@ struct packfile_store {
*/
struct packed_git *packs;
+ struct {
+ struct packed_git **packs;
+ unsigned flags;
+ } kept_cache;
+
/* A most-recently-used ordered version of the packs list. */
struct list_head mru;
--
2.51.0.261.g7ce5a0a67e.dirty
^ permalink raw reply related [flat|nested] 102+ messages in thread
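The kept cache is rebuilt only when it is requested with different
flags than it was built for. A hedged standalone sketch of that
invalidation rule follows; the `demo_*` names are hypothetical and the
cached array is a single integer rather than a list of packs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache keyed by the flags it was built with. */
struct demo_cache {
	int *items;     /* NULL means the cache has not been built */
	unsigned flags; /* flags the current contents were built for */
	int builds;     /* how often the cache was (re)built */
};

/* Drop the cache if it exists but was built for other flags. */
void demo_maybe_invalidate(struct demo_cache *c, unsigned flags)
{
	if (!c->items)
		return;
	if (c->flags == flags)
		return;
	free(c->items);
	c->items = NULL;
	c->flags = 0;
}

/* Return the cache for `flags`, building it if necessary. */
int *demo_get(struct demo_cache *c, unsigned flags)
{
	demo_maybe_invalidate(c, flags);
	if (!c->items) {
		c->items = calloc(1, sizeof(*c->items));
		if (!c->items)
			abort();
		c->items[0] = (int)flags; /* placeholder payload */
		c->flags = flags;
		c->builds++;
	}
	return c->items;
}
```

Two requests with the same flags share one build; switching flags
discards the old contents and builds once more.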
* [PATCH v2 07/16] packfile: reorder functions to avoid function declaration
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (5 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 06/16] odb: move kept cache " Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 1:47 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
` (8 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
Reorder functions so that we can avoid a forward declaration of
`prepare_packed_git()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
packfile.c | 67 +++++++++++++++++++++++++++++++-------------------------------
1 file changed, 33 insertions(+), 34 deletions(-)
diff --git a/packfile.c b/packfile.c
index 07c574f359..90f15b0c20 100644
--- a/packfile.c
+++ b/packfile.c
@@ -946,40 +946,6 @@ static void prepare_packed_git_one(struct odb_source *source, int local)
string_list_clear(data.garbage, 0);
}
-static void prepare_packed_git(struct repository *r);
-/*
- * Give a fast, rough count of the number of objects in the repository. This
- * ignores loose objects completely. If you have a lot of them, then either
- * you should repack because your performance will be awful, or they are
- * all unreachable objects about to be pruned, in which case they're not really
- * interesting as a measure of repo size in the first place.
- */
-unsigned long repo_approximate_object_count(struct repository *r)
-{
- if (!r->objects->approximate_object_count_valid) {
- struct odb_source *source;
- unsigned long count = 0;
- struct packed_git *p;
-
- prepare_packed_git(r);
-
- for (source = r->objects->sources; source; source = source->next) {
- struct multi_pack_index *m = get_multi_pack_index(source);
- if (m)
- count += m->num_objects;
- }
-
- for (p = r->objects->packfiles->packs; p; p = p->next) {
- if (open_pack_index(p))
- continue;
- count += p->num_objects;
- }
- r->objects->approximate_object_count = count;
- r->objects->approximate_object_count_valid = 1;
- }
- return r->objects->approximate_object_count;
-}
-
DEFINE_LIST_SORT(static, sort_packs, struct packed_git, next);
static int sort_pack(const struct packed_git *a, const struct packed_git *b)
@@ -1099,6 +1065,39 @@ struct list_head *get_packed_git_mru(struct repository *r)
return &r->objects->packfiles->mru;
}
+/*
+ * Give a fast, rough count of the number of objects in the repository. This
+ * ignores loose objects completely. If you have a lot of them, then either
+ * you should repack because your performance will be awful, or they are
+ * all unreachable objects about to be pruned, in which case they're not really
+ * interesting as a measure of repo size in the first place.
+ */
+unsigned long repo_approximate_object_count(struct repository *r)
+{
+ if (!r->objects->approximate_object_count_valid) {
+ struct odb_source *source;
+ unsigned long count = 0;
+ struct packed_git *p;
+
+ prepare_packed_git(r);
+
+ for (source = r->objects->sources; source; source = source->next) {
+ struct multi_pack_index *m = get_multi_pack_index(source);
+ if (m)
+ count += m->num_objects;
+ }
+
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
+ if (open_pack_index(p))
+ continue;
+ count += p->num_objects;
+ }
+ r->objects->approximate_object_count = count;
+ r->objects->approximate_object_count_valid = 1;
+ }
+ return r->objects->approximate_object_count;
+}
+
unsigned long unpack_object_header_buffer(const unsigned char *buf,
unsigned long len, enum object_type *type, unsigned long *sizep)
{
--
2.51.0.261.g7ce5a0a67e.dirty
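A minimal sketch of the idea behind the patch above, not taken from the series: in C a static function has to be declared before its first use, so a caller placed above the callee needs a forward declaration, while a caller placed below it does not. The names `prepare_store()` and `approximate_count()` and both structs are hypothetical stand-ins for `prepare_packed_git()`, `repo_approximate_object_count()` and the real data structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for `struct packed_git`. */
struct pack {
	unsigned long num_objects;
	struct pack *next;
};

/* Hypothetical stand-in for the packfile store. */
struct store {
	struct pack *packs;
	int prepared;
};

/* Defined first, so no forward declaration is needed further down. */
static void prepare_store(struct store *s)
{
	s->prepared = 1;
}

/* The caller sits below its callee and can simply call it. */
unsigned long approximate_count(struct store *s)
{
	unsigned long count = 0;

	prepare_store(s);
	for (struct pack *p = s->packs; p; p = p->next)
		count += p->num_objects;
	return count;
}
```

Had `approximate_count()` been placed above `prepare_store()`, a line such as `static void prepare_store(struct store *s);` would have been required ahead of it, which is the kind of declaration the patch removes.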
* [PATCH v2 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (6 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 1:58 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
` (7 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The `prepare_packed_git()` function and its friends are responsible for
loading packfiles as well as the multi-pack index for a given object
database. Refactor these functions to accept a packfile store instead of
a repository to clarify their scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
packfile.c | 43 +++++++++++++++++++------------------------
1 file changed, 19 insertions(+), 24 deletions(-)
diff --git a/packfile.c b/packfile.c
index 90f15b0c20..2e45a3a05f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -974,38 +974,33 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
return -1;
}
-static void rearrange_packed_git(struct repository *r)
-{
- sort_packs(&r->objects->packfiles->packs, sort_pack);
-}
-
-static void prepare_packed_git_mru(struct repository *r)
+static void packfile_store_prepare_mru(struct packfile_store *store)
{
struct packed_git *p;
- INIT_LIST_HEAD(&r->objects->packfiles->mru);
+ INIT_LIST_HEAD(&store->mru);
- for (p = r->objects->packfiles->packs; p; p = p->next)
- list_add_tail(&p->mru, &r->objects->packfiles->mru);
+ for (p = store->packs; p; p = p->next)
+ list_add_tail(&p->mru, &store->mru);
}
-static void prepare_packed_git(struct repository *r)
+static void packfile_store_prepare(struct packfile_store *store)
{
struct odb_source *source;
- if (r->objects->packfiles->initialized)
+ if (store->initialized)
return;
- odb_prepare_alternates(r->objects);
- for (source = r->objects->sources; source; source = source->next) {
- int local = (source == r->objects->sources);
+ odb_prepare_alternates(store->odb);
+ for (source = store->odb->sources; source; source = source->next) {
+ int local = (source == store->odb->sources);
prepare_multi_pack_index_one(source, local);
prepare_packed_git_one(source, local);
}
- rearrange_packed_git(r);
+ sort_packs(&store->packs, sort_pack);
- prepare_packed_git_mru(r);
- r->objects->packfiles->initialized = true;
+ packfile_store_prepare_mru(store);
+ store->initialized = true;
}
void reprepare_packed_git(struct repository *r)
@@ -1028,25 +1023,25 @@ void reprepare_packed_git(struct repository *r)
r->objects->approximate_object_count_valid = 0;
r->objects->packfiles->initialized = false;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
obj_read_unlock();
}
struct packed_git *get_packed_git(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
return r->objects->packfiles->packs;
}
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
{
- prepare_packed_git(source->odb->repo);
+ packfile_store_prepare(source->odb->packfiles);
return source->midx;
}
struct packed_git *get_all_packs(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next) {
struct multi_pack_index *m = source->midx;
@@ -1061,7 +1056,7 @@ struct packed_git *get_all_packs(struct repository *r)
struct list_head *get_packed_git_mru(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
return &r->objects->packfiles->mru;
}
@@ -1079,7 +1074,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
unsigned long count = 0;
struct packed_git *p;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (source = r->objects->sources; source; source = source->next) {
struct multi_pack_index *m = get_multi_pack_index(source);
@@ -2069,7 +2064,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
{
struct list_head *pos;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next)
if (source->midx && fill_midx_entry(r, oid, e, source->midx))
--
2.51.0.261.g7ce5a0a67e.dirty
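A minimal sketch of the pattern used in the patch above, under simplifying assumptions: the prepare function takes the store itself rather than the whole repository, and an `initialized` flag makes repeated calls cheap. All type and function names here are hypothetical stand-ins, and the `prepare_runs` counter exists only to make the behaviour observable.

```c
#include <stdbool.h>
#include <stddef.h>

struct database;

/* Hypothetical stand-in for `struct packfile_store`. */
struct store {
	struct database *odb;  /* back-pointer to the owning database */
	bool initialized;
	unsigned prepare_runs; /* illustration only: counts real scans */
};

/* Hypothetical stand-in for `struct object_database`. */
struct database {
	struct store *packfiles;
};

/* Takes the store directly; runs the expensive scan at most once. */
void store_prepare(struct store *store)
{
	if (store->initialized)
		return;
	store->prepare_runs++; /* the directory scan would happen here */
	store->initialized = true;
}

/* A caller that only has the database hands down the narrower handle. */
unsigned database_pack_scans(struct database *db)
{
	store_prepare(db->packfiles);
	return db->packfiles->prepare_runs;
}
```

Narrowing the parameter makes it visible at each call site which part of the object database a function actually touches, which is the stated goal of clarifying scope.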
* [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (7 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 2:10 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
` (6 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
In `reprepare_packed_git()` we perform several operations:
- We reload alternate object directories.
- We clear the loose object cache.
- We reprepare packfiles.
While the logic is hosted in "packfile.c", it clearly reaches into other
subsystems that aren't related to packfiles.
Split up the responsibility and introduce `odb_reprepare()` which now
becomes responsible for repreparing the whole object database. The
existing `reprepare_packed_git()` function is refactored accordingly and
only cares about reloading the packfile store now.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/backfill.c | 2 +-
builtin/gc.c | 4 ++--
builtin/receive-pack.c | 2 +-
builtin/repack.c | 2 +-
bulk-checkin.c | 2 +-
connected.c | 2 +-
fetch-pack.c | 4 ++--
object-name.c | 2 +-
odb.c | 25 ++++++++++++++++++++++++-
odb.h | 6 ++++++
packfile.c | 26 ++++----------------------
packfile.h | 9 ++++++++-
transport-helper.c | 2 +-
13 files changed, 53 insertions(+), 35 deletions(-)
diff --git a/builtin/backfill.c b/builtin/backfill.c
index 80056abe47..e80fc1b694 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -53,7 +53,7 @@ static void download_batch(struct backfill_context *ctx)
* We likely have a new packfile. Add it to the packed list to
* avoid possible duplicate downloads of the same objects.
*/
- reprepare_packed_git(ctx->repo);
+ odb_reprepare(ctx->repo->objects);
}
static int fill_missing_blobs(const char *path UNUSED,
diff --git a/builtin/gc.c b/builtin/gc.c
index 0edd94a76f..1d30d1af2c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1041,7 +1041,7 @@ int cmd_gc(int argc,
die(FAILED_RUN, "rerere");
report_garbage = report_pack_garbage;
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (pack_garbage.nr > 0) {
close_object_store(the_repository->objects);
clean_pack_garbage();
@@ -1490,7 +1490,7 @@ static off_t get_auto_pack_size(void)
struct packed_git *p;
struct repository *r = the_repository;
- reprepare_packed_git(r);
+ odb_reprepare(r->objects);
for (p = get_all_packs(r); p; p = p->next) {
if (p->pack_size > max_size) {
second_largest_size = max_size;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1113137a6f..c9288a9c7e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2389,7 +2389,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
status = finish_command(&child);
if (status)
return "index-pack abnormal exit";
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
}
return NULL;
}
diff --git a/builtin/repack.c b/builtin/repack.c
index a4def39197..ee8c80cd95 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1684,7 +1684,7 @@ int cmd_repack(int argc,
goto cleanup;
}
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (delete_redundant) {
int opts = 0;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index b2809ab039..f65439a748 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -90,7 +90,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
strbuf_release(&packname);
/* Make objects we just wrote available to ourselves */
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
}
/*
diff --git a/connected.c b/connected.c
index 18c13245d8..d6e9682fd9 100644
--- a/connected.c
+++ b/connected.c
@@ -72,7 +72,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
* Before checking for promisor packs, be sure we have the
* latest pack-files loaded into memory.
*/
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
do {
struct packed_git *p;
diff --git a/fetch-pack.c b/fetch-pack.c
index 46c39f85c4..3b8960608c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1982,7 +1982,7 @@ static void update_shallow(struct fetch_pack_args *args,
* remote is shallow, but this is a clone, there are
* no objects in repo to worry about. Accept any
* shallow points that exist in the pack (iow in repo
- * after get_pack() and reprepare_packed_git())
+ * after get_pack() and odb_reprepare())
*/
struct oid_array extra = OID_ARRAY_INIT;
struct object_id *oid = si->shallow->oid;
@@ -2107,7 +2107,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
&si, pack_lockfiles);
}
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (!args->cloning && args->deepen) {
struct check_connected_options opt = CHECK_CONNECTED_INIT;
diff --git a/object-name.c b/object-name.c
index 11aa0e6afc..44b0d416ac 100644
--- a/object-name.c
+++ b/object-name.c
@@ -596,7 +596,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
* or migrated from loose to packed.
*/
if (status == MISSING_OBJECT) {
- reprepare_packed_git(r);
+ odb_reprepare(r->objects);
find_short_object_filename(&ds);
find_short_packed_object(&ds);
status = finish_object_disambiguation(&ds, oid);
diff --git a/odb.c b/odb.c
index 80ec6fc1fa..37ed21f53b 100644
--- a/odb.c
+++ b/odb.c
@@ -694,7 +694,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
/* Not a loose object; someone else may have just packed it. */
if (!(flags & OBJECT_INFO_QUICK)) {
- reprepare_packed_git(odb->repo);
+ odb_reprepare(odb->repo->objects);
if (find_pack_entry(odb->repo, real, &e))
break;
}
@@ -1039,3 +1039,26 @@ void odb_clear(struct object_database *o)
string_list_clear(&o->submodule_source_paths, 0);
}
+
+void odb_reprepare(struct object_database *o)
+{
+ struct odb_source *source;
+
+ /*
+ * Reprepare alt odbs, in case the alternates file was modified
+ * during the course of this process. This only _adds_ odbs to
+ * the linked list, so existing odbs will continue to exist for
+ * the lifetime of the process.
+ */
+ o->loaded_alternates = 0;
+ odb_prepare_alternates(o);
+
+ for (source = o->sources; source; source = source->next)
+ odb_clear_loose_cache(source);
+
+ o->approximate_object_count_valid = 0;
+
+ packfile_store_reprepare(o->packfiles);
+
+ obj_read_unlock();
+}
diff --git a/odb.h b/odb.h
index f1736b067c..9810ec60a0 100644
--- a/odb.h
+++ b/odb.h
@@ -155,6 +155,12 @@ struct object_database {
struct object_database *odb_new(struct repository *repo);
void odb_clear(struct object_database *o);
+/*
+ * Clear caches, reload alternates and then reload object sources so that new
+ * objects may become accessible.
+ */
+void odb_reprepare(struct object_database *o);
+
/*
* Find source by its object directory path. Dies in case the source couldn't
* be found.
diff --git a/packfile.c b/packfile.c
index 2e45a3a05f..8e446dda69 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1003,28 +1003,10 @@ static void packfile_store_prepare(struct packfile_store *store)
store->initialized = true;
}
-void reprepare_packed_git(struct repository *r)
+void packfile_store_reprepare(struct packfile_store *store)
{
- struct odb_source *source;
-
- obj_read_lock();
-
- /*
- * Reprepare alt odbs, in case the alternates file was modified
- * during the course of this process. This only _adds_ odbs to
- * the linked list, so existing odbs will continue to exist for
- * the lifetime of the process.
- */
- r->objects->loaded_alternates = 0;
- odb_prepare_alternates(r->objects);
-
- for (source = r->objects->sources; source; source = source->next)
- odb_clear_loose_cache(source);
-
- r->objects->approximate_object_count_valid = 0;
- r->objects->packfiles->initialized = false;
- packfile_store_prepare(r->objects->packfiles);
- obj_read_unlock();
+ store->initialized = false;
+ packfile_store_prepare(store);
}
struct packed_git *get_packed_git(struct repository *r)
@@ -1145,7 +1127,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
*
* Other worrying sections could be the call to close_pack_fd(),
* which can close packs even with in-use windows, and to
- * reprepare_packed_git(). Regarding the former, mmap doc says:
+ * odb_reprepare(). Regarding the former, mmap doc says:
* "closing the file descriptor does not unmap the region". And
* for the latter, it won't re-open already available packs.
*/
diff --git a/packfile.h b/packfile.h
index 74cea1a4a9..e9e60ec21b 100644
--- a/packfile.h
+++ b/packfile.h
@@ -103,6 +103,14 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
+/*
+ * Clear the packfile caches and try to look up any new packfiles that have
+ * appeared since last preparing the packfiles store.
+ *
+ * This function must be called under the `odb_read_lock()`.
+ */
+void packfile_store_reprepare(struct packfile_store *store);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
@@ -179,7 +187,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-void reprepare_packed_git(struct repository *r);
void install_packed_git(struct repository *r, struct packed_git *pack);
struct packed_git *get_packed_git(struct repository *r);
diff --git a/transport-helper.c b/transport-helper.c
index 0789e5bca5..4d95d84f9e 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -450,7 +450,7 @@ static int fetch_with_fetch(struct transport *transport,
}
strbuf_release(&buf);
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
return 0;
}
--
2.51.0.261.g7ce5a0a67e.dirty
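A minimal sketch of the layering described in the patch above, under simplifying assumptions: the database-level function resets the state it owns and then delegates to the store-level function, which only knows about packfiles. The structs and function names are hypothetical stand-ins, and locking is left out of the sketch entirely.

```c
#include <stdbool.h>

/* Hypothetical stand-in for `struct packfile_store`. */
struct store {
	bool initialized;
	unsigned scans; /* illustration only: counts pack directory scans */
};

/* Hypothetical stand-in for `struct object_database`. */
struct database {
	struct store packfiles;
	bool alternates_loaded;
	bool count_valid;
	unsigned loose_cache_entries;
};

static void store_prepare(struct store *s)
{
	if (s->initialized)
		return;
	s->scans++;
	s->initialized = true;
}

/* Store level: forget the "initialized" bit and scan again. */
void store_reprepare(struct store *s)
{
	s->initialized = false;
	store_prepare(s);
}

/* Database level: reset everything that is not packfile specific. */
void database_reprepare(struct database *db)
{
	db->alternates_loaded = false;
	db->alternates_loaded = true;  /* alternates would be re-read here */
	db->loose_cache_entries = 0;   /* drop the loose object cache */
	db->count_valid = false;       /* cached object count is now stale */
	store_reprepare(&db->packfiles);
}
```

With this split, code that only needs fresh packfiles can call the store-level function, while callers that want the whole database refreshed go through the database-level one.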
* [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (8 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-26 2:11 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
` (5 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The `install_packed_git()` function adds a packfile to a specific
object store. Refactor it to accept a packfile store instead of a
repository to clarify its scope, and rename it to
`packfile_store_add_pack()` accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 2 +-
builtin/index-pack.c | 2 +-
http.c | 2 +-
http.h | 2 +-
midx.c | 2 +-
packfile.c | 11 ++++++-----
packfile.h | 9 +++++++--
7 files changed, 18 insertions(+), 12 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 2c35f9345d..e9d82b31c3 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -901,7 +901,7 @@ static void end_packfile(void)
if (!new_p)
die("core git rejected index %s", idx_name);
all_packs[pack_id] = new_p;
- install_packed_git(the_repository, new_p);
+ packfile_store_add_pack(the_repository->objects->packfiles, new_p);
free(idx_name);
/* Print the boundary */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f91c301bba..ed490dfad4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1645,7 +1645,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
p = add_packed_git(the_repository, final_index_name,
strlen(final_index_name), 0);
if (p)
- install_packed_git(the_repository, p);
+ packfile_store_add_pack(the_repository->objects->packfiles, p);
}
if (!from_stdin) {
diff --git a/http.c b/http.c
index 98853d6483..af2120b64c 100644
--- a/http.c
+++ b/http.c
@@ -2541,7 +2541,7 @@ void http_install_packfile(struct packed_git *p,
lst = &((*lst)->next);
*lst = (*lst)->next;
- install_packed_git(the_repository, p);
+ packfile_store_add_pack(the_repository->objects->packfiles, p);
}
struct http_pack_request *new_http_pack_request(
diff --git a/http.h b/http.h
index 36202139f4..e5a5380c6c 100644
--- a/http.h
+++ b/http.h
@@ -210,7 +210,7 @@ int finish_http_pack_request(struct http_pack_request *preq);
void release_http_pack_request(struct http_pack_request *preq);
/*
- * Remove p from the given list, and invoke install_packed_git() on it.
+ * Remove p from the given list, and invoke packfile_store_add_pack() on it.
*
* This is a convenience function for users that have obtained a list of packs
* from http_get_info_packs() and have chosen a specific pack to fetch.
diff --git a/midx.c b/midx.c
index 7fa2b8473a..95e74c79c1 100644
--- a/midx.c
+++ b/midx.c
@@ -477,7 +477,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
if (!p) {
p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
if (p) {
- install_packed_git(r, p);
+ packfile_store_add_pack(r->objects->packfiles, p);
list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
}
diff --git a/packfile.c b/packfile.c
index 8e446dda69..c885046d9f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -779,16 +779,17 @@ struct packed_git *add_packed_git(struct repository *r, const char *path,
return p;
}
-void install_packed_git(struct repository *r, struct packed_git *pack)
+void packfile_store_add_pack(struct packfile_store *store,
+ struct packed_git *pack)
{
if (pack->pack_fd != -1)
pack_open_fds++;
- pack->next = r->objects->packfiles->packs;
- r->objects->packfiles->packs = pack;
+ pack->next = store->packs;
+ store->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
- hashmap_add(&r->objects->packfiles->map, &pack->packmap_ent);
+ hashmap_add(&store->map, &pack->packmap_ent);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -904,7 +905,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
p = add_packed_git(data->r, full_name, full_name_len, data->local);
if (p)
- install_packed_git(data->r, p);
+ packfile_store_add_pack(data->r->objects->packfiles, p);
}
free(pack_name);
}
diff --git a/packfile.h b/packfile.h
index e9e60ec21b..6641238796 100644
--- a/packfile.h
+++ b/packfile.h
@@ -111,6 +111,13 @@ void packfile_store_close(struct packfile_store *store);
*/
void packfile_store_reprepare(struct packfile_store *store);
+/*
+ * Add the pack to the store so that contained objects become accessible via
+ * the store. This moves ownership into the store.
+ */
+void packfile_store_add_pack(struct packfile_store *store,
+ struct packed_git *pack);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
@@ -187,8 +194,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-void install_packed_git(struct repository *r, struct packed_git *pack);
-
struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (9 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-27 1:04 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
` (4 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
When adding a packfile to its store we add it both to the list and map of
packfiles, but we don't append it to the most-recently-used list of
packs. We do know to add the packfile to the MRU list as soon as we
access any of its objects, but in between we're being inconsistent. It
doesn't help that there are some subsystems that _do_ add the packfile
to the MRU after having added it, which only adds to the confusion.
Refactor the code so that we unconditionally add packfiles to the MRU
when adding them to a packfile store.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 4 +---
packfile.c | 1 +
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/midx.c b/midx.c
index 95e74c79c1..3cfe7884ad 100644
--- a/midx.c
+++ b/midx.c
@@ -476,10 +476,8 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
struct packed_git, packmap_ent);
if (!p) {
p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
- if (p) {
+ if (p)
packfile_store_add_pack(r->objects->packfiles, p);
- list_add_tail(&p->mru, &r->objects->packfiles->mru);
- }
}
strbuf_release(&pack_name);
diff --git a/packfile.c b/packfile.c
index c885046d9f..a79d0fc1fa 100644
--- a/packfile.c
+++ b/packfile.c
@@ -790,6 +790,7 @@ void packfile_store_add_pack(struct packfile_store *store,
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
hashmap_add(&store->map, &pack->packmap_ent);
+ list_add_tail(&pack->mru, &store->mru);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
--
2.51.0.261.g7ce5a0a67e.dirty
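A minimal sketch of what "adding a pack" amounts to after the patch above, under simplifying assumptions: the pack is prepended to the singly linked list of packs and appended to the MRU list in the same call, so the two can no longer drift apart. The `mini_list_*` helpers are a hand-rolled, hypothetical stand-in for a circular doubly linked list, and the map of packfiles is omitted.

```c
#include <stddef.h>

/* Hypothetical, minimal circular doubly linked list. */
struct mini_list {
	struct mini_list *next, *prev;
};

static void mini_list_init(struct mini_list *head)
{
	head->next = head->prev = head;
}

static void mini_list_add_tail(struct mini_list *node, struct mini_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical stand-ins for `struct packed_git` and the packfile store. */
struct pack {
	const char *name;
	struct pack *next;
	struct mini_list mru;
};

struct store {
	struct pack *packs;
	struct mini_list mru;
};

void store_init(struct store *s)
{
	s->packs = NULL;
	mini_list_init(&s->mru);
}

/* One entry point updates both the pack list and the MRU list. */
void store_add_pack(struct store *s, struct pack *p)
{
	p->next = s->packs;
	s->packs = p;
	mini_list_add_tail(&p->mru, &s->mru);
}
```

Callers then have no reason to touch the MRU list themselves, which matches the removal of the extra `list_add_tail()` call from "midx.c".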
* [PATCH v2 12/16] packfile: introduce function to load and add packfiles
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (10 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-27 1:12 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
` (3 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
We have a recurring pattern where we essentially perform an upsert of a
packfile in case it isn't yet known by the packfile store. The logic to
do so is non-trivial as we have to reconstruct the packfile's key, check
the map of packfiles, then create the new packfile and finally add it to
the store.
Introduce a new function that does this dance for us. Refactor callsites
to use it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 4 ++--
builtin/index-pack.c | 10 +++-------
midx.c | 18 ++----------------
packfile.c | 44 +++++++++++++++++++++++++++++++-------------
packfile.h | 8 ++++++++
5 files changed, 46 insertions(+), 38 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index e9d82b31c3..a26e79689d 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -897,11 +897,11 @@ static void end_packfile(void)
idx_name = keep_pack(create_index());
/* Register the packfile with core git's machinery. */
- new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
+ new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
+ idx_name, 1);
if (!new_p)
die("core git rejected index %s", idx_name);
all_packs[pack_id] = new_p;
- packfile_store_add_pack(the_repository->objects->packfiles, new_p);
free(idx_name);
/* Print the boundary */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ed490dfad4..2b78ba7fe4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
hash, "idx", 1);
- if (do_fsck_object) {
- struct packed_git *p;
- p = add_packed_git(the_repository, final_index_name,
- strlen(final_index_name), 0);
- if (p)
- packfile_store_add_pack(the_repository->objects->packfiles, p);
- }
+ if (do_fsck_object)
+ packfile_store_load_pack(the_repository->objects->packfiles,
+ final_index_name, 0);
if (!from_stdin) {
printf("%s\n", hash_to_hex(hash));
diff --git a/midx.c b/midx.c
index 3cfe7884ad..d30feda019 100644
--- a/midx.c
+++ b/midx.c
@@ -454,7 +454,6 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
uint32_t pack_int_id)
{
struct strbuf pack_name = STRBUF_INIT;
- struct strbuf key = STRBUF_INIT;
struct packed_git *p;
pack_int_id = midx_for_pack(&m, pack_int_id);
@@ -466,22 +465,9 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
m->pack_names[pack_int_id]);
-
- /* pack_map holds the ".pack" name, but we have the .idx */
- strbuf_addbuf(&key, &pack_name);
- strbuf_strip_suffix(&key, ".idx");
- strbuf_addstr(&key, ".pack");
- p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
- strhash(key.buf), key.buf,
- struct packed_git, packmap_ent);
- if (!p) {
- p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
- if (p)
- packfile_store_add_pack(r->objects->packfiles, p);
- }
-
+ p = packfile_store_load_pack(r->objects->packfiles,
+ pack_name.buf, m->local);
strbuf_release(&pack_name);
- strbuf_release(&key);
if (!p) {
m->packs[pack_int_id] = MIDX_PACK_ERROR;
diff --git a/packfile.c b/packfile.c
index a79d0fc1fa..f7a9967c9d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
list_add_tail(&pack->mru, &store->mru);
}
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
+ const char *idx_path, int local)
+{
+ struct strbuf key = STRBUF_INIT;
+ struct packed_git *p;
+
+ /*
+ * We're being called with the path to the index file, but `pack_map`
+ * holds the path to the packfile itself.
+ */
+ strbuf_addstr(&key, idx_path);
+ strbuf_strip_suffix(&key, ".idx");
+ strbuf_addstr(&key, ".pack");
+
+ p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
+ struct packed_git, packmap_ent);
+ if (!p) {
+ p = add_packed_git(store->odb->repo, idx_path,
+ strlen(idx_path), local);
+ if (p)
+ packfile_store_add_pack(store, p);
+ }
+
+ strbuf_release(&key);
+ return p;
+}
+
void (*report_garbage)(unsigned seen_bits, const char *path);
static void report_helper(const struct string_list *list,
@@ -892,23 +919,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
const char *file_name, void *_data)
{
struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
- struct packed_git *p;
size_t base_len = full_name_len;
if (strip_suffix_mem(full_name, &base_len, ".idx") &&
!(data->m && midx_contains_pack(data->m, file_name))) {
- struct hashmap_entry hent;
- char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
- unsigned int hash = strhash(pack_name);
- hashmap_entry_init(&hent, hash);
-
- /* Don't reopen a pack we already have. */
- if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
- p = add_packed_git(data->r, full_name, full_name_len, data->local);
- if (p)
- packfile_store_add_pack(data->r->objects->packfiles, p);
- }
- free(pack_name);
+ char *trimmed_path = xstrndup(full_name, full_name_len);
+ packfile_store_load_pack(data->r->objects->packfiles,
+ trimmed_path, data->local);
+ free(trimmed_path);
}
if (!report_garbage)
diff --git a/packfile.h b/packfile.h
index 6641238796..c4e5516f9e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -118,6 +118,14 @@ void packfile_store_reprepare(struct packfile_store *store);
void packfile_store_add_pack(struct packfile_store *store,
struct packed_git *pack);
+/*
+ * Open the packfile and add it to the store if it isn't yet known. Returns
+ * either the newly opened packfile or the preexisting packfile. Returns a
+ * `NULL` pointer in case the packfile could not be opened.
+ */
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
+ const char *idx_path, int local);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.51.0.261.g7ce5a0a67e.dirty
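A minimal sketch of the "upsert" described in the patch above, under simplifying assumptions: derive the ".pack" key from the ".idx" path, look it up, and only create and register a new entry when the lookup misses. A linear scan replaces the hashmap here, no file is actually opened, and every name in the block is a hypothetical stand-in.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for `struct packed_git` and the packfile store. */
struct pack {
	char *pack_name; /* path ending in ".pack" */
	struct pack *next;
};

struct store {
	struct pack *packs;
};

/* Turn "foo.idx" into a newly allocated "foo.pack"; the caller frees it. */
static char *idx_to_pack_path(const char *idx_path)
{
	size_t len = strlen(idx_path);
	char *out;

	if (len >= 4 && !strcmp(idx_path + len - 4, ".idx"))
		len -= 4;
	out = malloc(len + strlen(".pack") + 1);
	if (!out)
		return NULL;
	memcpy(out, idx_path, len);
	strcpy(out + len, ".pack");
	return out;
}

/* Return the known pack for this index, or create and register one. */
struct pack *store_load_pack(struct store *store, const char *idx_path)
{
	char *key = idx_to_pack_path(idx_path);
	struct pack *p;

	if (!key)
		return NULL;

	for (p = store->packs; p; p = p->next)
		if (!strcmp(p->pack_name, key))
			break;

	if (!p) {
		p = malloc(sizeof(*p));
		if (p) {
			p->pack_name = key;
			key = NULL; /* ownership moved into the pack */
			p->next = store->packs;
			store->packs = p;
		}
	}

	free(key); /* no-op when the key was handed to the new pack */
	return p;
}
```

Calling the function twice with the same index path yields the same pointer, which is the property the converted call sites rely on to avoid reopening a pack that is already known.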
* [PATCH v2 13/16] packfile: move `get_multi_pack_index()` into "midx.c"
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (11 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-27 1:20 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
` (2 subsequent siblings)
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The `get_multi_pack_index()` function is declared and implemented in the
packfile subsystem, even though it really belongs in the multi-pack
index subsystem. The reason for this is likely that it needs to call
`packfile_store_prepare()`, which is not exposed by the packfile subsystem.
In a subsequent commit we're about to add another caller outside of the
packfile subsystem though, so we'll have to expose the function anyway.
Do so now already and move `get_multi_pack_index()` into the MIDX
subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
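As a rough illustration of the "prepare, then read" pattern that
`get_multi_pack_index()` follows, here is a minimal sketch. All `toy_*`
names are hypothetical stand-ins and not part of Git; the sketch only
assumes that preparation is idempotent, as the new header comment for
`packfile_store_prepare()` states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structs in Git carry far more state. */
struct toy_midx {
	int num_packs;
};

struct toy_store {
	unsigned initialized:1;  /* set once the store has been prepared */
	int prepare_calls;       /* counts how often real work was done */
	struct toy_midx *midx;   /* only valid after preparation */
	struct toy_midx storage; /* backing storage for the toy index */
};

/* Idempotent: the expensive scan runs only on the first call. */
void toy_store_prepare(struct toy_store *store)
{
	assert(store);
	if (store->initialized)
		return;
	store->storage.num_packs = 3; /* pretend we scanned the pack directory */
	store->midx = &store->storage;
	store->prepare_calls++;
	store->initialized = 1;
}

/* Accessor living in another subsystem: prepare first, then read. */
struct toy_midx *toy_get_midx(struct toy_store *store)
{
	toy_store_prepare(store);
	return store->midx;
}
```

Because the accessor lives outside the store's own translation unit in
this model, the prepare function has to be visible to it, which mirrors
why the patch drops `static` from `packfile_store_prepare()`.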
midx.c | 6 ++++++
midx.h | 2 ++
packfile.c | 8 +-------
packfile.h | 10 +++++++++-
4 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/midx.c b/midx.c
index d30feda019..c1b2f141fa 100644
--- a/midx.c
+++ b/midx.c
@@ -95,6 +95,12 @@ static int midx_read_object_offsets(const unsigned char *chunk_start,
return 0;
}
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
+{
+ packfile_store_prepare(source->odb->packfiles);
+ return source->midx;
+}
+
static struct multi_pack_index *load_multi_pack_index_one(struct repository *r,
const char *object_dir,
const char *midx_name,
diff --git a/midx.h b/midx.h
index 076382de8a..8d6ea28682 100644
--- a/midx.h
+++ b/midx.h
@@ -100,6 +100,8 @@ void get_split_midx_filename_ext(const struct git_hash_algo *hash_algo,
struct strbuf *buf, const char *object_dir,
const unsigned char *hash, const char *ext);
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
+
struct multi_pack_index *load_multi_pack_index(struct repository *r,
const char *object_dir,
int local);
diff --git a/packfile.c b/packfile.c
index f7a9967c9d..16384e0865 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1004,7 +1004,7 @@ static void packfile_store_prepare_mru(struct packfile_store *store)
list_add_tail(&p->mru, &store->mru);
}
-static void packfile_store_prepare(struct packfile_store *store)
+void packfile_store_prepare(struct packfile_store *store)
{
struct odb_source *source;
@@ -1035,12 +1035,6 @@ struct packed_git *get_packed_git(struct repository *r)
return r->objects->packfiles->packs;
}
-struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
-{
- packfile_store_prepare(source->odb->packfiles);
- return source->midx;
-}
-
struct packed_git *get_all_packs(struct repository *r)
{
packfile_store_prepare(r->objects->packfiles);
diff --git a/packfile.h b/packfile.h
index c4e5516f9e..816b762770 100644
--- a/packfile.h
+++ b/packfile.h
@@ -103,6 +103,15 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
+/*
+ * Prepare the packfile store by loading packfiles and multi-pack indices for
+ * all alternates. This becomes a no-op if the store is already prepared.
+ *
+ * It shouldn't typically be necessary to call this function directly, as
+ * functions that access the store know to prepare it.
+ */
+void packfile_store_prepare(struct packfile_store *store);
+
/*
* Clear the packfile caches and try to look up any new packfiles that have
* appeared since last preparing the packfiles store.
@@ -204,7 +213,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
-struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
struct packed_git *get_all_packs(struct repository *r);
/*
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH v2 14/16] packfile: remove `get_packed_git()`
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (12 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-27 1:38 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 16/16] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
We have two different functions to retrieve packfiles for a packfile
store:
- `get_packed_git()` returns the list of packfiles after having called
`prepare_packed_git()`.
- `get_all_packs()` calls `prepare_packed_git()`, as well, but also
calls `prepare_midx_pack()` for each pack.
This means that the latter function also properly loads the info of
whether or not a packfile is part of a multi-pack index. Preparing this
extra information also shouldn't be significantly more expensive:
- We have already loaded all packfiles via `prepare_packed_git_one()`.
So given that multi-pack indices may only refer to packfiles in the
same object directory we know that we already loaded each packfile.
- The multi-pack index was prepared via `packfile_store_prepare()`
already, which calls `prepare_multi_pack_index_one()`.
- So all that remains to be done is to look up the index of the pack
in its multi-pack index so that we can store that info in both the
pack itself and the MIDX.
So it is somewhat confusing to readers that one of these two functions
claims to load "all" packfiles while the other one doesn't, even though
the ultimate difference is way more nuanced.
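Convert all callers of `get_packed_git()` to use `get_all_packs()`
instead (or `packfile_store_prepare()` where only the preparation side
effect is wanted) and remove `get_packed_git()`. There doesn't seem to
be a good reason to distinguish between these two functions.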
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
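To make the difference between the two getters concrete, the following
is a simplified model, not Git's actual code. The `toy_*` types and the
`in_midx` flag are assumptions for illustration; in Git the membership
info is recorded by `prepare_midx_pack()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_pack {
	const char *name;
	struct toy_pack *next;
	unsigned in_midx:1; /* stands in for the pack's MIDX back-pointer */
};

struct toy_midx {
	const char **pack_names; /* packs covered by the index */
	size_t nr;
};

struct toy_store {
	struct toy_pack *packs;
	struct toy_midx *midx; /* may be NULL */
};

/* Models the removed getter: hands out the list as-is. */
struct toy_pack *toy_get_packs_plain(struct toy_store *store)
{
	assert(store);
	return store->packs;
}

/* Models the surviving getter: also records MIDX membership per pack. */
struct toy_pack *toy_get_all_packs(struct toy_store *store)
{
	assert(store);
	if (store->midx) {
		for (size_t i = 0; i < store->midx->nr; i++) {
			for (struct toy_pack *p = store->packs; p; p = p->next) {
				if (!strcmp(p->name, store->midx->pack_names[i]))
					p->in_midx = 1;
			}
		}
	}
	return store->packs;
}

/* A caller that depends on the membership info being loaded. */
size_t toy_count_packs_outside_midx(struct toy_pack *p)
{
	size_t count = 0;
	for (; p; p = p->next)
		if (!p->in_midx)
			count++;
	return count;
}
```

In this model a caller that checks MIDX membership gets a misleading
answer from the plain getter, which is the kind of subtle difference the
commit message argues is not worth keeping two functions for.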
builtin/gc.c | 2 +-
builtin/grep.c | 2 +-
object-name.c | 4 ++--
packfile.c | 6 ------
packfile.h | 1 -
5 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index 1d30d1af2c..565afda51f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1422,7 +1422,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
if (incremental_repack_auto_limit < 0)
return 1;
- for (p = get_packed_git(the_repository);
+ for (p = get_all_packs(the_repository);
count < incremental_repack_auto_limit && p;
p = p->next) {
if (!p->multi_pack_index)
diff --git a/builtin/grep.c b/builtin/grep.c
index 5df6537333..8f0e21bd70 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1214,7 +1214,7 @@ int cmd_grep(int argc,
if (recurse_submodules)
repo_read_gitmodules(the_repository, 1);
if (startup_info->have_repository)
- (void)get_packed_git(the_repository);
+ packfile_store_prepare(the_repository->objects->packfiles);
start_threads(&opt);
} else {
diff --git a/object-name.c b/object-name.c
index 44b0d416ac..c87995cc1e 100644
--- a/object-name.c
+++ b/object-name.c
@@ -213,7 +213,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
unique_in_midx(m, ds);
}
- for (p = get_packed_git(ds->repo); p && !ds->ambiguous;
+ for (p = get_all_packs(ds->repo); p && !ds->ambiguous;
p = p->next)
unique_in_pack(p, ds);
}
@@ -806,7 +806,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
find_abbrev_len_for_midx(m, mad);
}
- for (p = get_packed_git(mad->repo); p; p = p->next)
+ for (p = get_all_packs(mad->repo); p; p = p->next)
find_abbrev_len_for_pack(p, mad);
}
diff --git a/packfile.c b/packfile.c
index 16384e0865..523c30c5a7 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1029,12 +1029,6 @@ void packfile_store_reprepare(struct packfile_store *store)
packfile_store_prepare(store);
}
-struct packed_git *get_packed_git(struct repository *r)
-{
- packfile_store_prepare(r->objects->packfiles);
- return r->objects->packfiles->packs;
-}
-
struct packed_git *get_all_packs(struct repository *r)
{
packfile_store_prepare(r->objects->packfiles);
diff --git a/packfile.h b/packfile.h
index 816b762770..15cb378781 100644
--- a/packfile.h
+++ b/packfile.h
@@ -211,7 +211,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
struct packed_git *get_all_packs(struct repository *r);
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (13 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
2025-08-27 1:45 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 16/16] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
15 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The `get_all_packs()` function prepares the packfile store and then
returns its packfiles. Refactor it to accept a packfile store instead of
a repository to clarify its scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
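The call-site conversion below is mechanical. As a small sketch of the
idea, assuming hypothetical `toy_*` types that only mimic the
`repo->objects->packfiles` nesting, the getter takes the narrowest
object that holds the data and the caller uses the repository merely to
reach it:

```c
#include <assert.h>
#include <stddef.h>

struct toy_pack {
	struct toy_pack *next;
	unsigned pack_local:1;
};

/* Hypothetical nesting that mirrors repo->objects->packfiles. */
struct toy_packfile_store { struct toy_pack *packs; };
struct toy_odb { struct toy_packfile_store *packfiles; };
struct toy_repository { struct toy_odb *objects; };

/* Takes the store directly, so its scope is visible in the signature. */
struct toy_pack *toy_store_get_packs(struct toy_packfile_store *store)
{
	assert(store);
	return store->packs;
}

/* Caller side: the repository is only used to reach the store. */
size_t toy_count_local_packs(struct toy_repository *repo)
{
	size_t count = 0;
	for (struct toy_pack *p = toy_store_get_packs(repo->objects->packfiles);
	     p; p = p->next)
		if (p->pack_local)
			count++;
	return count;
}
```

The longer call expression is the price for a signature that no longer
implies access to the whole repository.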
builtin/cat-file.c | 2 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 4 ++--
builtin/fsck.c | 8 ++++----
builtin/gc.c | 8 ++++----
builtin/pack-objects.c | 18 +++++++++---------
builtin/pack-redundant.c | 4 ++--
builtin/repack.c | 6 +++---
connected.c | 2 +-
http-backend.c | 4 ++--
http.c | 2 +-
object-name.c | 4 ++--
pack-bitmap.c | 4 ++--
pack-objects.c | 2 +-
packfile.c | 14 +++++++-------
packfile.h | 7 ++++++-
server-info.c | 2 +-
t/helper/test-find-pack.c | 2 +-
t/helper/test-pack-mtimes.c | 2 +-
19 files changed, 51 insertions(+), 46 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index fce0b06451c..7124c43fb14 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -854,7 +854,7 @@ static void batch_each_object(struct batch_options *opt,
batch_one_object_bitmapped, &payload)) {
struct packed_git *pack;
- for (pack = get_all_packs(the_repository); pack; pack = pack->next) {
+ for (pack = packfile_store_get_packs(the_repository->objects->packfiles); pack; pack = pack->next) {
if (bitmap_index_contains_pack(bitmap, pack) ||
open_pack_index(pack))
continue;
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a61d3b46aac..471d96a3089 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -129,7 +129,7 @@ int cmd_count_objects(int argc,
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local)
continue;
if (open_pack_index(p))
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index a26e79689d5..4f355118a10 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -975,7 +975,7 @@ static int store_object(
if (e->idx.offset) {
duplicate_count_by_type[type]++;
return 1;
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
+ } else if (find_oid_pack(&oid, packfile_store_get_packs(the_repository->objects->packfiles))) {
e->type = type;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
@@ -1175,7 +1175,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
duplicate_count_by_type[OBJ_BLOB]++;
truncate_pack(&checkpoint);
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
+ } else if (find_oid_pack(&oid, packfile_store_get_packs(the_repository->objects->packfiles))) {
e->type = OBJ_BLOB;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 543a2cdb5cd..e867fd510a3 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -873,14 +873,14 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress)
int res = 0;
if (show_progress) {
- for (struct packed_git *p = get_all_packs(r); p; p = p->next)
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next)
pack_count++;
progress = start_delayed_progress(the_repository,
"Verifying reverse pack-indexes", pack_count);
pack_count = 0;
}
- for (struct packed_git *p = get_all_packs(r); p; p = p->next) {
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
int load_error = load_pack_revindex_from_disk(p);
if (load_error < 0) {
@@ -1010,7 +1010,7 @@ int cmd_fsck(int argc,
struct progress *progress = NULL;
if (show_progress) {
- for (p = get_all_packs(the_repository); p;
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p;
p = p->next) {
if (open_pack_index(p))
continue;
@@ -1020,7 +1020,7 @@ int cmd_fsck(int argc,
progress = start_progress(the_repository,
_("Checking objects"), total);
}
- for (p = get_all_packs(the_repository); p;
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p;
p = p->next) {
/* verify gives error messages itself */
if (verify_pack(the_repository,
diff --git a/builtin/gc.c b/builtin/gc.c
index 565afda51fe..030d0b0c774 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -488,7 +488,7 @@ static struct packed_git *find_base_packs(struct string_list *packs,
{
struct packed_git *p, *base = NULL;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local || p->is_cruft)
continue;
if (limit) {
@@ -513,7 +513,7 @@ static int too_many_packs(struct gc_config *cfg)
if (cfg->gc_auto_pack_limit <= 0)
return 0;
- for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
+ for (cnt = 0, p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local)
continue;
if (p->pack_keep)
@@ -1422,7 +1422,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
if (incremental_repack_auto_limit < 0)
return 1;
- for (p = get_all_packs(the_repository);
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles);
count < incremental_repack_auto_limit && p;
p = p->next) {
if (!p->multi_pack_index)
@@ -1491,7 +1491,7 @@ static off_t get_auto_pack_size(void)
struct repository *r = the_repository;
odb_reprepare(r->objects);
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if (p->pack_size > max_size) {
second_largest_size = max_size;
max_size = p->pack_size;
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 53a22562503..1c24b84510e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3855,7 +3855,7 @@ static void read_packs_list_from_stdin(struct rev_info *revs)
string_list_sort(&exclude_packs);
string_list_remove_duplicates(&exclude_packs, 0);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
const char *pack_name = pack_basename(p);
if ((item = string_list_lookup(&include_packs, pack_name)))
@@ -4105,7 +4105,7 @@ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs
* Re-mark only the fresh packs as kept so that objects in
* unknown packs do not halt the reachability traversal early.
*/
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
p->pack_keep_in_core = 0;
mark_pack_kept_in_core(fresh_packs, 1);
@@ -4142,7 +4142,7 @@ static void read_cruft_objects(void)
string_list_sort(&discard_packs);
string_list_sort(&fresh_packs);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
const char *pack_name = pack_basename(p);
struct string_list_item *item;
@@ -4394,7 +4394,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
struct packed_git *p;
p = (last_found != (void *)1) ? last_found :
- get_all_packs(the_repository);
+ packfile_store_get_packs(the_repository->objects->packfiles);
while (p) {
if ((!p->pack_local || p->pack_keep ||
@@ -4404,7 +4404,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
return 1;
}
if (p == last_found)
- p = get_all_packs(the_repository);
+ p = packfile_store_get_packs(the_repository->objects->packfiles);
else
p = p->next;
if (p == last_found)
@@ -4441,7 +4441,7 @@ static void loosen_unused_packed_objects(void)
uint32_t loosened_objects_nr = 0;
struct object_id oid;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local || p->pack_keep || p->pack_keep_in_core)
continue;
@@ -4747,7 +4747,7 @@ static void add_extra_kept_packs(const struct string_list *names)
if (!names->nr)
return;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
const char *name = basename(p->pack_name);
int i;
@@ -5186,7 +5186,7 @@ int cmd_pack_objects(int argc,
add_extra_kept_packs(&keep_pack_list);
if (ignore_packed_keep_on_disk) {
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (p->pack_local && p->pack_keep)
break;
if (!p) /* no keep-able packs found */
@@ -5199,7 +5199,7 @@ int cmd_pack_objects(int argc,
* it also covers non-local objects
*/
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_local) {
have_non_local_packs = 1;
break;
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index fe81c293e3a..7b2cb3ef1e2 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -566,7 +566,7 @@ static struct pack_list * add_pack(struct packed_git *p)
static struct pack_list * add_pack_file(const char *filename)
{
- struct packed_git *p = get_all_packs(the_repository);
+ struct packed_git *p = packfile_store_get_packs(the_repository->objects->packfiles);
if (strlen(filename) < 40)
die("Bad pack filename: %s", filename);
@@ -581,7 +581,7 @@ static struct pack_list * add_pack_file(const char *filename)
static void load_all(void)
{
- struct packed_git *p = get_all_packs(the_repository);
+ struct packed_git *p = packfile_store_get_packs(the_repository->objects->packfiles);
while (p) {
add_pack(p);
diff --git a/builtin/repack.c b/builtin/repack.c
index ee8c80cd95c..6119e236512 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -267,7 +267,7 @@ static void collect_pack_filenames(struct existing_packs *existing,
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
int i;
const char *base;
@@ -499,7 +499,7 @@ static void init_pack_geometry(struct pack_geometry *geometry,
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (args->local && !p->pack_local)
/*
* When asked to only repack local packfiles we skip
@@ -1140,7 +1140,7 @@ static void combine_small_cruft_packs(FILE *in, size_t combine_cruft_below_size,
struct strbuf buf = STRBUF_INIT;
size_t i;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!(p->is_cruft && p->pack_local))
continue;
diff --git a/connected.c b/connected.c
index d6e9682fd93..d7e07fa6b0d 100644
--- a/connected.c
+++ b/connected.c
@@ -76,7 +76,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
do {
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (!p->pack_promisor)
continue;
if (find_pack_entry_one(oid, p))
diff --git a/http-backend.c b/http-backend.c
index d5dfe762bb5..be4d8263a58 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -608,13 +608,13 @@ static void get_info_packs(struct strbuf *hdr, char *arg UNUSED)
size_t cnt = 0;
select_getanyfile(hdr);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (p->pack_local)
cnt++;
}
strbuf_grow(&buf, cnt * 53 + 2);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (p->pack_local)
strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
}
diff --git a/http.c b/http.c
index af2120b64c7..16a1ab54f34 100644
--- a/http.c
+++ b/http.c
@@ -2416,7 +2416,7 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
* If we already have the pack locally, no need to fetch its index or
* even add it to list; we already have all of its objects.
*/
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
if (hasheq(p->hash, sha1, the_repository->hash_algo))
return 0;
}
diff --git a/object-name.c b/object-name.c
index c87995cc1e6..e346075394d 100644
--- a/object-name.c
+++ b/object-name.c
@@ -213,7 +213,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
unique_in_midx(m, ds);
}
- for (p = get_all_packs(ds->repo); p && !ds->ambiguous;
+ for (p = packfile_store_get_packs(ds->repo->objects->packfiles); p && !ds->ambiguous;
p = p->next)
unique_in_pack(p, ds);
}
@@ -806,7 +806,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
find_abbrev_len_for_midx(m, mad);
}
- for (p = get_all_packs(mad->repo); p; p = p->next)
+ for (p = packfile_store_get_packs(mad->repo->objects->packfiles); p; p = p->next)
find_abbrev_len_for_pack(p, mad);
}
diff --git a/pack-bitmap.c b/pack-bitmap.c
index d14421ee204..67f9e92ec18 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -665,7 +665,7 @@ static int open_pack_bitmap(struct repository *r,
struct packed_git *p;
int ret = -1;
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if (open_pack_bitmap_1(bitmap_git, p) == 0) {
ret = 0;
/*
@@ -3363,7 +3363,7 @@ int verify_bitmap_files(struct repository *r)
free(midx_bitmap_name);
}
- for (struct packed_git *p = get_all_packs(r);
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles);
p; p = p->next) {
char *pack_bitmap_name = pack_bitmap_filename(p);
res |= verify_bitmap_file(r->hash_algo, pack_bitmap_name);
diff --git a/pack-objects.c b/pack-objects.c
index a9d9855063a..5506f12293c 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -95,7 +95,7 @@ static void prepare_in_pack_by_idx(struct packing_data *pdata)
* (i.e. in_pack_idx also zero) should return NULL.
*/
mapping[cnt++] = NULL;
- for (p = get_all_packs(pdata->repo); p; p = p->next, cnt++) {
+ for (p = packfile_store_get_packs(pdata->repo->objects->packfiles); p; p = p->next, cnt++) {
if (cnt == nr) {
free(mapping);
return;
diff --git a/packfile.c b/packfile.c
index 523c30c5a71..19227ea0b3c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1029,19 +1029,19 @@ void packfile_store_reprepare(struct packfile_store *store)
packfile_store_prepare(store);
}
-struct packed_git *get_all_packs(struct repository *r)
+struct packed_git *packfile_store_get_packs(struct packfile_store *store)
{
- packfile_store_prepare(r->objects->packfiles);
+ packfile_store_prepare(store);
- for (struct odb_source *source = r->objects->sources; source; source = source->next) {
+ for (struct odb_source *source = store->odb->sources; source; source = source->next) {
struct multi_pack_index *m = source->midx;
if (!m)
continue;
for (uint32_t i = 0; i < m->num_packs + m->num_packs_in_base; i++)
- prepare_midx_pack(r, m, i);
+ prepare_midx_pack(store->odb->repo, m, i);
}
- return r->objects->packfiles->packs;
+ return store->packs;
}
struct list_head *get_packed_git_mru(struct repository *r)
@@ -2101,7 +2101,7 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
* covers, one kept and one not kept, but the midx returns only
* the non-kept version.
*/
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if ((p->pack_keep && (flags & ON_DISK_KEEP_PACKS)) ||
(p->pack_keep_in_core && (flags & IN_CORE_KEEP_PACKS))) {
ALLOC_GROW(packs, nr + 1, alloc);
@@ -2198,7 +2198,7 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
int r = 0;
int pack_errors = 0;
- for (p = get_all_packs(repo); p; p = p->next) {
+ for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next) {
if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
continue;
if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
diff --git a/packfile.h b/packfile.h
index 15cb378781d..86ab70eef9e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -127,6 +127,12 @@ void packfile_store_reprepare(struct packfile_store *store);
void packfile_store_add_pack(struct packfile_store *store,
struct packed_git *pack);
+/*
+ * Get all packs managed by the given store, including packfiles that are
+ * referenced by multi-pack indices.
+ */
+struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+
/*
* Open the packfile and add it to the store if it isn't yet known. Returns
* either the newly opened packfile or the preexisting packfile. Returns a
@@ -212,7 +218,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
extern void (*report_garbage)(unsigned seen_bits, const char *path);
struct list_head *get_packed_git_mru(struct repository *r);
-struct packed_git *get_all_packs(struct repository *r);
/*
* Give a rough count of objects in the repository. This sacrifices accuracy
diff --git a/server-info.c b/server-info.c
index 9bb30d9ab71..79234c7fed3 100644
--- a/server-info.c
+++ b/server-info.c
@@ -292,7 +292,7 @@ static void init_pack_info(struct repository *r, const char *infofile, int force
int i;
size_t alloc = 0;
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
/* we ignore things on alternate path since they are
* not available to the pullers in general.
*/
diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c
index 611a13a3261..183a777fc54 100644
--- a/t/helper/test-find-pack.c
+++ b/t/helper/test-find-pack.c
@@ -39,7 +39,7 @@ int cmd__find_pack(int argc, const char **argv)
if (repo_get_oid(the_repository, argv[0], &oid))
die("cannot parse %s as an object name", argv[0]);
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (find_pack_entry_one(&oid, p)) {
printf("%s\n", p->pack_name);
actual_count++;
diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c
index d51aaa3dc40..cfdfae77a6c 100644
--- a/t/helper/test-pack-mtimes.c
+++ b/t/helper/test-pack-mtimes.c
@@ -37,7 +37,7 @@ int cmd__pack_mtimes(int argc, const char **argv)
if (argc != 2)
usage(pack_mtimes_usage);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
strbuf_addstr(&buf, basename(p->pack_name));
strbuf_strip_suffix(&buf, ".pack");
strbuf_addstr(&buf, ".mtimes");
--
2.51.0.261.g7ce5a0a67e.dirty
* [PATCH v2 16/16] packfile: refactor `get_packed_git_mru()` to work on packfile store
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
` (14 preceding siblings ...)
2025-08-21 7:39 ` [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
@ 2025-08-21 7:39 ` Patrick Steinhardt
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-08-21 7:39 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King
The `get_packed_git_mru()` function prepares the packfile store and then
returns its packfiles in most-recently-used order. Refactor it to accept
a packfile store instead of a repository to clarify its scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
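For readers unfamiliar with the most-recently-used list, here is a
self-contained sketch of the lookup-and-promote pattern used by the
caller in builtin/pack-objects.c. The `toy_*` list helpers are a
simplified reimplementation written for this note, not Git's list.h,
and the integer `id` is a made-up stand-in for an object lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list with a sentinel head. */
struct toy_list { struct toy_list *prev, *next; };

struct toy_pack {
	int id;
	struct toy_list mru; /* intrusive link, like `struct packed_git::mru` */
};

void toy_list_init(struct toy_list *head)
{
	head->prev = head->next = head;
}

void toy_list_del(struct toy_list *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
}

/* Insert right after the head, i.e. at the front of the list. */
void toy_list_add(struct toy_list *item, struct toy_list *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

/* Insert right before the head, i.e. at the back of the list. */
void toy_list_add_tail(struct toy_list *item, struct toy_list *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Unlink the item and reinsert it at the front. */
void toy_list_move(struct toy_list *item, struct toy_list *head)
{
	toy_list_del(item);
	toy_list_add(item, head);
}

/* Recover the containing pack from a pointer to its list link. */
#define toy_pack_from_mru(ptr) \
	((struct toy_pack *)((char *)(ptr) - offsetof(struct toy_pack, mru)))

/* Search in MRU order and promote the pack that produced the hit. */
struct toy_pack *toy_find_pack(struct toy_list *head, int wanted_id)
{
	assert(head);
	for (struct toy_list *pos = head->next; pos != head; pos = pos->next) {
		struct toy_pack *p = toy_pack_from_mru(pos);
		if (p->id == wanted_id) {
			toy_list_move(&p->mru, head);
			return p;
		}
	}
	return NULL;
}
```

The sketch returns immediately after promoting the hit, so the list is
never modified while iteration continues.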
builtin/pack-objects.c | 4 ++--
packfile.c | 6 +++---
packfile.h | 7 +++++--
3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1c24b84510..4e75f14df1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1748,12 +1748,12 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
}
}
- list_for_each(pos, get_packed_git_mru(the_repository)) {
+ list_for_each(pos, packfile_store_get_packs_mru(the_repository->objects->packfiles)) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset, found_mtime);
if (!exclude && want > 0)
list_move(&p->mru,
- get_packed_git_mru(the_repository));
+ packfile_store_get_packs_mru(the_repository->objects->packfiles));
if (want != -1)
return want;
}
diff --git a/packfile.c b/packfile.c
index 19227ea0b3..8451467daf 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1044,10 +1044,10 @@ struct packed_git *packfile_store_get_packs(struct packfile_store *store)
return store->packs;
}
-struct list_head *get_packed_git_mru(struct repository *r)
+struct list_head *packfile_store_get_packs_mru(struct packfile_store *store)
{
- packfile_store_prepare(r->objects->packfiles);
- return &r->objects->packfiles->mru;
+ packfile_store_prepare(store);
+ return &store->mru;
}
/*
diff --git a/packfile.h b/packfile.h
index 86ab70eef9..b87fa2df59 100644
--- a/packfile.h
+++ b/packfile.h
@@ -133,6 +133,11 @@ void packfile_store_add_pack(struct packfile_store *store,
*/
struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+/*
+ * Get all packs in most-recently-used order.
+ */
+struct list_head *packfile_store_get_packs_mru(struct packfile_store *store);
+
/*
* Open the packfile and add it to the store if it isn't yet known. Returns
* either the newly opened packfile or the preexisting packfile. Returns a
@@ -217,8 +222,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-struct list_head *get_packed_git_mru(struct repository *r);
-
/*
* Give a rough count of objects in the repository. This sacrifices accuracy
* for speed.
--
2.51.0.261.g7ce5a0a67e.dirty
* Re: [PATCH 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-20 8:04 ` Karthik Nayak
@ 2025-08-22 23:50 ` Junio C Hamano
2025-08-26 12:19 ` [PATCH] Documentation: note styling for bit fields Karthik Nayak
0 siblings, 1 reply; 102+ messages in thread
From: Junio C Hamano @ 2025-08-22 23:50 UTC (permalink / raw)
To: Karthik Nayak; +Cc: Patrick Steinhardt, git
Karthik Nayak <karthik.188@gmail.com> writes:
>> I have personal preferences, and usually I'd like to hear from
>> others first before mentioning my preference, but for something this
>> small and does not affect readability very much, perhaps I can just
>> pick and dictate? I dunno ;-).
>
> I wouldn't mind if you picked one over the other, like I mentioned, I
> care more that we make it consistent and that the formatter can notify
> or fix it for us.
Then let's declare that these shall be written like so:
unsigned my_field:1;
unsigned other_field:1;
unsigned field_with_longer_name:1;
without a space around the colon. It would allow us not to modify
the clang-format file, and more importantly, discourage people from
doing ugly alignment with spaces, i.e.
unsigned my_field : 1;
unsigned other_field : 1;
unsigned field_with_longer_name : 1;
* Re: [PATCH v2 02/16] odb: move list of packfiles into `struct packfile_store`
2025-08-21 7:39 ` [PATCH v2 02/16] odb: move list of packfiles into " Patrick Steinhardt
@ 2025-08-25 23:42 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-25 23:42 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:00AM +0200, Patrick Steinhardt wrote:
> The object database tracks the list of packfiles it currently knows
> about. With the introduction of the `struct packfile_store` we have a
> better place to host this list though.
>
> Move the list accordingly. Extract the logic from `odb_clear()` that
> knows to close all such packfiles and move it into the new subsystem, as
> well.
Not a comment on this patch itself, but as a meta-comment, I really
appreciate you taking such an incremental approach here. The packfile
machinery is quite fragile in my experience, so breaking it up into (what
are so far) easily reviewable chunks makes it much easier to build
confidence in the correctness of these changes.
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> odb.c | 11 +----------
> odb.h | 1 -
> packfile.c | 47 ++++++++++++++++++++++++++++++-----------------
> packfile.h | 15 ++++++++++++++-
> 4 files changed, 45 insertions(+), 29 deletions(-)
>
> diff --git a/odb.c b/odb.c
> index 34b70d0074..17a9135cbd 100644
> --- a/odb.c
> +++ b/odb.c
> @@ -1038,16 +1038,7 @@ void odb_clear(struct object_database *o)
>
> INIT_LIST_HEAD(&o->packed_git_mru);
> close_object_store(o);
> -
> - /*
> - * `close_object_store()` only closes the packfiles, but doesn't free
> - * them. We thus have to do this manually.
> - */
> - for (struct packed_git *p = o->packed_git, *next; p; p = next) {
> - next = p->next;
> - free(p);
> - }
> - o->packed_git = NULL;
> + packfile_store_free(o->packfiles);
Interesting. The movement of the for-loop here all looks correct to me.
But I think the new packfile_store is creating a new awkwardness here
that we should consider.
In the existing implementation, all of the ->next pointers here point to
heap locations that have already been free()'d. But that's OK, since
they aren't reachable at the moment that we do "o->packed_git = NULL".
Having a separate packfile_store changes that, since (from my reading of
the code) o->packfiles will still be non-NULL even after calling
odb_clear(), *and* those pointers will refer to free'd heap locations.
That seems like a potential footgun to me. I think that we could either:
* Change packfile_store_free() to take in an object_database pointer,
and NULL out the ->packfiles pointer after free'ing all of the packfiles.
That would make it more similar to the existing behavior.
* Leave packfile_store_free() as-is, document that it does NOT clear
out the top-level pointer, and so callers are encouraged to NULL it
out themselves after calling it. Likewise, we should change
odb_clear() to do:
packfile_store_free(o->packfiles);
o->packfiles = NULL;
Let me know what you think.
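A minimal sketch of the second option, using hypothetical stand-in types
(`struct pack`, `struct store`, `struct database`) rather than Git's real
structures. It only illustrates the "free, then NULL out the owner's
pointer" pattern so that a repeated clear cannot touch freed memory:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not Git's real structures. */
struct pack {
	struct pack *next;
};

struct store {
	struct pack *packs;
};

struct database {
	struct store *packfiles;
};

/* Frees every pack in the list, then the store itself. */
static void store_free(struct store *store)
{
	if (!store)
		return;

	for (struct pack *p = store->packs, *next; p; p = next) {
		next = p->next;
		free(p);
	}

	free(store);
}

/*
 * The owner resets its pointer right after freeing, so no dangling
 * reference to the store survives the call.
 */
static void database_clear(struct database *db)
{
	assert(db);
	store_free(db->packfiles);
	db->packfiles = NULL;
}
```

With the pointer reset in place, calling `database_clear()` a second time
degrades to a no-op instead of a use-after-free.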
> hashmap_clear(&o->pack_map);
> string_list_clear(&o->submodule_source_paths, 0);
> diff --git a/odb.h b/odb.h
> index 08c3a01f3b..6f901c5ac0 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -130,7 +130,6 @@ struct object_database {
> * should only be accessed directly by packfile.c
> */
> struct packfile_store *packfiles;
> - struct packed_git *packed_git;
Makes sense.
> /* A most-recently-used ordered version of the packed_git list. */
> struct list_head packed_git_mru;
>
> diff --git a/packfile.c b/packfile.c
> index 8fbf1cfc2d..6478e4cc30 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -278,7 +278,7 @@ static int unuse_one_window(struct packed_git *current)
>
> if (current)
> scan_windows(current, &lru_p, &lru_w, &lru_l);
> - for (p = current->repo->objects->packed_git; p; p = p->next)
> + for (p = current->repo->objects->packfiles->packs; p; p = p->next)
Not a huge deal, but I do find "current->repo->objects->packfiles->packs"
to be a bit unfortunate. I wonder if we should rename "packs" to "head"
or "list_head" or similar since it's clear from
"current->repo->objects->packfiles" that this is a list of packfiles.
> scan_windows(p, &lru_p, &lru_w, &lru_l);
> if (lru_p) {
> munmap(lru_w->base, lru_w->len);
> @@ -362,13 +362,8 @@ void close_pack(struct packed_git *p)
> void close_object_store(struct object_database *o)
> {
> struct odb_source *source;
> - struct packed_git *p;
>
> - for (p = o->packed_git; p; p = p->next)
> - if (p->do_not_close)
> - BUG("want to close pack marked 'do-not-close'");
> - else
> - close_pack(p);
> + packfile_store_close(o->packfiles);
Looks good.
> @@ -468,7 +463,7 @@ static int close_one_pack(struct repository *r)
> struct pack_window *mru_w = NULL;
> int accept_windows_inuse = 1;
>
> - for (p = r->objects->packed_git; p; p = p->next) {
> + for (p = r->objects->packfiles->packs; p; p = p->next) {
Likewise.
> @@ -2344,5 +2339,23 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
>
> void packfile_store_free(struct packfile_store *store)
> {
> + packfile_store_close(store);
Seeing a call to packfile_store_close() here was a little surprising to
me. The code that you are moving has a comment that says:
* `close_object_store()` only closes the packfiles, but doesn't free
* them. We thus have to do this manually.
, so I would have expected to preserve that behavior. I *think* that
this happens to be OK, since close_pack() is a noop if it is called more
than once (though I had to double check through all of its leaf
functions that that was indeed the case).
I would probably strike this from the new function, since the sole
caller above already calls close_object_store() before calling
packfile_store_free().
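To illustrate why a second close can be harmless, here is a minimal sketch
of an idempotent close. `struct pack_handle` and `pack_handle_close()` are
hypothetical names, and this is not how close_pack() is actually written;
it only shows the general "reset the handle after closing" idea:

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical handle; a negative fd means "already closed". */
struct pack_handle {
	int fd;
};

/*
 * Returns 1 if a descriptor was closed by this call and 0 if the
 * handle had already been closed, which makes repeated calls safe.
 */
static int pack_handle_close(struct pack_handle *p)
{
	assert(p);

	if (p->fd < 0)
		return 0;

	close(p->fd);
	p->fd = -1;
	return 1;
}
```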
> +
> + for (struct packed_git *p = store->packs, *next; p; p = next) {
> + next = p->next;
> + free(p);
> + }
> +
> free(store);
> }
This part looks good.
> +void packfile_store_close(struct packfile_store *store)
> +{
> + struct packed_git *p;
> +
> + for (p = store->packs; p; p = p->next)
> + if (p->do_not_close)
> + BUG("want to close pack marked 'do-not-close'");
> + else
> + close_pack(p);
> +}
And likewise this looks good to me. I do find the braceless for-loop a
little hard to read, but it's (a) correct, and (b) consistent with the
original implementation, so I don't feel strongly about changing it.
As a side-note, you could inline the declaration of "p" here into the
for-loop, but I can understand not wanting to, in order to keep the diff
more readable with --color-moved.
> diff --git a/packfile.h b/packfile.h
> index 8d31fd619a..d7ac8d24b4 100644
> --- a/packfile.h
> +++ b/packfile.h
The rest looks good to me.
Thanks,
Taylor
* Re: [PATCH v2 03/16] odb: move initialization bit into `struct packfile_store`
2025-08-21 7:39 ` [PATCH v2 03/16] odb: move initialization bit " Patrick Steinhardt
@ 2025-08-26 1:40 ` Taylor Blau
0 siblings, 0 replies; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 1:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:01AM +0200, Patrick Steinhardt wrote:
> diff --git a/packfile.h b/packfile.h
> index d7ac8d24b4..cf81091175 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -63,6 +63,12 @@ struct packfile_store {
> * the store.
> */
> struct packed_git *packs;
> +
> + /*
> + * Whether packfiles have already been populated with this store's
> + * packs.
> + */
> + bool initialized;
Exciting ;-).
Thanks,
Taylor
* Re: [PATCH v2 04/16] odb: move packfile map into `struct packfile_store`
2025-08-21 7:39 ` [PATCH v2 04/16] odb: move packfile map " Patrick Steinhardt
@ 2025-08-26 1:41 ` Taylor Blau
0 siblings, 0 replies; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 1:41 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:02AM +0200, Patrick Steinhardt wrote:
> The object database tracks a map of packfiles by their respective paths,
> which is used to figure out whether a given packfile has already been
> loaded. With the introduction of the `struct packfile_store` we have a
> better place to host this list though.
>
> Move the map accordingly. `pack_map_entry_cmp()` isn't used anywhere but
> in "packfile.c" anymore after this change, so we convert it to a static
> function, as well.
This patch looks exactly as expected, let's keep reading...
Thanks,
Taylor
* Re: [PATCH v2 06/16] odb: move kept cache into `struct packfile_store`
2025-08-21 7:39 ` [PATCH v2 06/16] odb: move kept cache " Patrick Steinhardt
@ 2025-08-26 1:46 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 1:46 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:04AM +0200, Patrick Steinhardt wrote:
> The object database tracks a cache of "kept" packfiles, which is used by
> git-pack-objects(1) to handle cruft objects. With the introduction of
> the `struct packfile_store` we have a better place to host this cache
> though.
>
> Move the cache accordingly.
This all looks good to me, thanks for taking care to preserve the
kept-pack cache's behavior.
> This moves the last bit of packfile-related state from the object
> database into the packfile store. Adapt the comment for the `packfiles`
> pointer in `struct object_database` to reflect this.
Thanks for keeping the comment up-to-date :-).
> diff --git a/packfile.h b/packfile.h
> index d48d46cc1b..74cea1a4a9 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -64,6 +64,11 @@ struct packfile_store {
> */
> struct packed_git *packs;
>
> + struct {
> + struct packed_git **packs;
> + unsigned flags;
> + } kept_cache;
> +
This wouldn't be a bad time to add a comment here explaining what the
kept_cache is for and what each of the struct's members represent. We
can blame (at least one of) the author(s) of 20b031fede (packfile: add
kept-pack cache for find_kept_pack_entry(), 2021-02-22) for omitting it
in the first place ;-).
Thanks,
Taylor
* Re: [PATCH v2 07/16] packfile: reorder functions to avoid function declaration
2025-08-21 7:39 ` [PATCH v2 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
@ 2025-08-26 1:47 ` Taylor Blau
0 siblings, 0 replies; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 1:47 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:05AM +0200, Patrick Steinhardt wrote:
> ---
> packfile.c | 67 +++++++++++++++++++++++++++++++-------------------------------
> 1 file changed, 33 insertions(+), 34 deletions(-)
Inspecting the diff locally with --color-moved shows that the changes
are faithful here; thanks.
Thanks,
Taylor
* Re: [PATCH v2 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store
2025-08-21 7:39 ` [PATCH v2 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-08-26 1:58 ` Taylor Blau
0 siblings, 0 replies; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 1:58 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:06AM +0200, Patrick Steinhardt wrote:
> The `prepare_packed_git()` function and its friends are responsible for
> loading packfiles as well as the multi-pack index for a given object
> database. Refactor these functions to accept a packfile store instead of
> a repository to clarify their scope.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> packfile.c | 43 +++++++++++++++++++------------------------
> 1 file changed, 19 insertions(+), 24 deletions(-)
>
> diff --git a/packfile.c b/packfile.c
> index 90f15b0c20..2e45a3a05f 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -974,38 +974,33 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
> return -1;
> }
>
> -static void rearrange_packed_git(struct repository *r)
> -{
> - sort_packs(&r->objects->packfiles->packs, sort_pack);
> -}
OK, makes sense -- it looks like you inlined rearrange_packed_git() in
its sole caller packfile_store_prepare() below. I think that could have
been done as a separate step, but it's equally fine to include it here,
too.
> -static void prepare_packed_git_mru(struct repository *r)
> +static void packfile_store_prepare_mru(struct packfile_store *store)
> {
> struct packed_git *p;
>
> - INIT_LIST_HEAD(&r->objects->packfiles->mru);
> + INIT_LIST_HEAD(&store->mru);
>
> - for (p = r->objects->packfiles->packs; p; p = p->next)
> - list_add_tail(&p->mru, &r->objects->packfiles->mru);
> + for (p = store->packs; p; p = p->next)
> + list_add_tail(&p->mru, &store->mru);
Looks all good.
> -static void prepare_packed_git(struct repository *r)
> +static void packfile_store_prepare(struct packfile_store *store)
> {
> struct odb_source *source;
>
> - if (r->objects->packfiles->initialized)
> + if (store->initialized)
> return;
>
> - odb_prepare_alternates(r->objects);
> - for (source = r->objects->sources; source; source = source->next) {
> - int local = (source == r->objects->sources);
> + odb_prepare_alternates(store->odb);
Hmmm. I admit that I don't love that the packfile_store knows about the
object_store that it belongs to. This feels like a layering violation
and makes me worry that we pushed too much down into the new
packfile_store. I'm not sure I have a better idea off the top of my
head, though.
Everything down from here looks good to me.
Thanks,
Taylor
* Re: [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()`
2025-08-21 7:39 ` [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
@ 2025-08-26 2:10 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 2:10 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:07AM +0200, Patrick Steinhardt wrote:
> While the logic is hosted in "packfile.c", it clearly reaches into other
> subsystems that aren't related to packfiles.
>
> Split up the responsibility and introduce `odb_reprepare()` which now
> becomes responsible for repreparing the whole object database. The
> existing `reprepare_packed_git()` function is refactored accordingly and
> only cares about reloading the packfile store now.
Makes sense.
> diff --git a/odb.c b/odb.c
> index 80ec6fc1fa..37ed21f53b 100644
> --- a/odb.c
> +++ b/odb.c
> @@ -694,7 +694,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
>
> /* Not a loose object; someone else may have just packed it. */
> if (!(flags & OBJECT_INFO_QUICK)) {
> - reprepare_packed_git(odb->repo);
> + odb_reprepare(odb->repo->objects);
> if (find_pack_entry(odb->repo, real, &e))
> break;
> }
> @@ -1039,3 +1039,26 @@ void odb_clear(struct object_database *o)
>
> string_list_clear(&o->submodule_source_paths, 0);
> }
> +
> +void odb_reprepare(struct object_database *o)
OK; so here is the new location for the non-packfile related portions of
the former reprepare_packed_git() function. That makes sense, but...
> +{
> + struct odb_source *source;
> +
> + /*
> + * Reprepare alt odbs, in case the alternates file was modified
> + * during the course of this process. This only _adds_ odbs to
> + * the linked list, so existing odbs will continue to exist for
> + * the lifetime of the process.
> + */
> + o->loaded_alternates = 0;
> + odb_prepare_alternates(o);
> +
> + for (source = o->sources; source; source = source->next)
> + odb_clear_loose_cache(source);
> +
> + o->approximate_object_count_valid = 0;
> +
> + packfile_store_reprepare(o->packfiles);
> +
> + obj_read_unlock();
...I think I am missing where we call obj_read_lock(). The function
packfile_store_reprepare() has a comment that it must be called under
the obj_read_lock(), but I don't see where we acquire that lock here.
Are the callers of odb_reprepare() supposed to acquire that lock? If so,
it seems a little awkward that the caller is supposed to acquire the
lock, but the callee is the one to release it. Is this function missing
an obj_read_lock() at the top?
I looked at a few callers here and none of them seem to be holding this
lock. pthread_mutex_unlock() is supposed to check that the mutex lock is
held for recursive and error-checking mutexes. IIRC we initialize the
odb mutex with PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, so I am a
little surprised that this did not cause a runtime error.
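As a standalone illustration of the check mentioned above (not Git's code):
POSIX specifies that unlocking an error-checking mutex which the calling
thread does not own fails with EPERM. The helper name below is
hypothetical, and the sketch says nothing about how Git's own lock
wrappers behave:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Creates an error-checking mutex, unlocks it without ever locking it,
 * and returns whatever pthread_mutex_unlock() reported.
 */
static int unlock_without_lock(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t mutex;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	assert(ret == 0);
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(ret == 0);
	ret = pthread_mutex_init(&mutex, &attr);
	assert(ret == 0);
	pthread_mutexattr_destroy(&attr);

	/* The mutex was never locked, so this unlock is unbalanced. */
	ret = pthread_mutex_unlock(&mutex);

	pthread_mutex_destroy(&mutex);
	return ret;
}
```

A caller that ignores the return value of the unlock would not notice the
mismatch, which is one way an unbalanced unlock can go unreported.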
The rest of the patch looks good.
Thanks,
Taylor
* Re: [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store
2025-08-21 7:39 ` [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-08-26 2:11 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-26 2:11 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:08AM +0200, Patrick Steinhardt wrote:
> The `install_packed_git()` function adds a packfile to a specific
> object store. Refactor it to accept a packfile store instead of a
> repository to clarify its scope.
All of the refactoring here looks straightforward and correct to me. I
admit that I have a vague preference towards keeping the word "install"
in the function name here, since it (to me) suggests that the packfile
in question is going to be used for lookups, whereas "add" is a bit
more generic.
I don't feel strongly about it, though, so if you have a preference
towards "add" then I'm fine with that.
Thanks,
Taylor
* [PATCH] Documentation: note styling for bit fields
2025-08-22 23:50 ` Junio C Hamano
@ 2025-08-26 12:19 ` Karthik Nayak
0 siblings, 0 replies; 102+ messages in thread
From: Karthik Nayak @ 2025-08-26 12:19 UTC (permalink / raw)
To: gitster; +Cc: git, karthik.188, ps
Our codebase uses a lot of bit field variables, generally to mark
boolean type variables. While there is a formatting rule in the
'.clang-format', there is no guideline specified in the
'CodingGuidelines'.
Since the '.clang-format' is not yet enforced, let's also add a
guideline with the same rule as mentioned in the '.clang-format', which
is to not use any spaces around the colon, like so:
unsigned my_field:1;
unsigned other_field:1;
unsigned field_with_longer_name:1;
This would allow us not to modify the clang-format file, and more
importantly, discourage people from doing ugly alignment with spaces,
i.e.
unsigned my_field               : 1;
unsigned other_field            : 1;
unsigned field_with_longer_name : 1;
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
I think it would be worthwhile to also add this decision to the
'CodingGuidelines', but if you feel it is unnecessary, feel free to drop it.
Documentation/CodingGuidelines | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 224f0978a8..df72fe0177 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -650,6 +650,12 @@ For C programs:
cases. However, it is recommended to find a more descriptive name wherever
possible to improve the readability and maintainability of the code.
+ - Bit fields should be defined without a space around the colon. E.g.
+
+ unsigned my_field:1;
+ unsigned other_field:1;
+ unsigned field_with_longer_name:1;
+
For Perl programs:
- Most of the C guidelines above apply.
--
2.50.1
* Re: [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack
2025-08-21 7:39 ` [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
@ 2025-08-27 1:04 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-27 1:04 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:09AM +0200, Patrick Steinhardt wrote:
> When adding a packfile to its store we add it both to the list and map of
> packfiles, but we don't append it to the most-recently-used list of
> packs. We do know to add the packfile to the MRU list as soon as we
> access any of its objects, but in between we're being inconsistent. It
> doesn't help that there are some subsystems that _do_ add the packfile
> to the MRU after having added it, which only adds to the confusion.
>
> Refactor the code so that we unconditionally add packfiles to the MRU
> when adding them to a packfile store.
I am a little confused why prepare_midx_pack() wants to add packs to the
MRU cache so eagerly, and the commit which introduced that behavior
(commit af96fe3392 (midx: add packs to packed_git linked list,
2019-04-29)) doesn't focus on that area in detail.
(Note that commit af96fe3392 *does* discuss a separate cache's behavior
regarding the open file descriptor limit, but that LRU cache is a
different one than the MRU cache we're discussing here.)
What I do wonder about is why af96fe3392 adds packs to the MRU cache in
the first place. As far as I can tell, we never move MIDX'd packs to
the front of the MRU cache at all. There are two spots that call
list_move() on the MRU cache, which are:
- packfile.c::find_pack_entry(), which enumerates MIDX'd
packs in a separate loop earlier on in the function, and ignores
packs in the MRU cache whose p->multi_pack_index bit is set.
- builtin/pack-objects.c::want_object_in_pack_mtime(), which also
enumerates MIDX'd packs in a separate loop, though it does not
explicitly ignore packs in the MRU cache with the multi_pack_index
bit set.
In practice, though, I think these two are equivalent, since
want_object_in_pack_mtime() will return before it gets to the MRU cache
if it found the object in a MIDX'd pack.
So I don't think we need to be adding MIDX'd packs to the MRU cache in
the first place.
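For reference, the move-to-front behavior under discussion can be sketched
as follows. This is a simplified, hypothetical stand-in for Git's
list.h-based MRU; all names here are invented, and the lookup key is a
plain string rather than an object ID:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical circular doubly-linked list node; head has no name. */
struct mru_entry {
	struct mru_entry *prev, *next;
	const char *name;
};

static void mru_init(struct mru_entry *head)
{
	head->prev = head->next = head;
	head->name = NULL;
}

static void mru_unlink(struct mru_entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void mru_add_front(struct mru_entry *head, struct mru_entry *e)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

static void mru_add_tail(struct mru_entry *head, struct mru_entry *e)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/*
 * Scans from most to least recently used. On a hit, the entry moves to
 * the front so that the next lookup tries it first.
 */
static struct mru_entry *mru_lookup(struct mru_entry *head, const char *name)
{
	assert(head && name);

	for (struct mru_entry *e = head->next; e != head; e = e->next) {
		if (!strcmp(e->name, name)) {
			mru_unlink(e);
			mru_add_front(head, e);
			return e;
		}
	}
	return NULL;
}
```

An entry that is never the target of a successful lookup simply stays
where it was appended, which is the situation described above for packs
that are only ever reached through the MIDX.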
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> midx.c | 4 +---
> packfile.c | 1 +
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index 95e74c79c1..3cfe7884ad 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -476,10 +476,8 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
> struct packed_git, packmap_ent);
> if (!p) {
> p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
> - if (p) {
> + if (p)
> packfile_store_add_pack(r->objects->packfiles, p);
> - list_add_tail(&p->mru, &r->objects->packfiles->mru);
> - }
All of that aside, this portion of the diff is preserving the existing
behavior, since it inlines the list_add_tail() call within
packfile_store_add_pack(). OK.
> diff --git a/packfile.c b/packfile.c
> index c885046d9f..a79d0fc1fa 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -790,6 +790,7 @@ void packfile_store_add_pack(struct packfile_store *store,
>
> hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
> hashmap_add(&store->map, &pack->packmap_ent);
> + list_add_tail(&pack->mru, &store->mru);
But this spot makes me a little unsure. There are callers of what is now
packfile_store_add_pack() that *don't* immediately add the pack to the
MRU cache (e.g., packfile.c::prepare_pack()). I think that behavior is
intentional, since we don't necessarily want to have packs in the MRU
cache which haven't actually received an object lookup yet.
So I am not sure I understand the full extent of this change. I think it
might be worth investigating the eagerness of prepare_midx_pack() to add
packs to the MRU cache, and perhaps drop that behavior altogether, which
I think would obviate the need for this patch in the series.
Thanks,
Taylor
* Re: [PATCH v2 12/16] packfile: introduce function to load and add packfiles
2025-08-21 7:39 ` [PATCH v2 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
@ 2025-08-27 1:12 ` Taylor Blau
0 siblings, 0 replies; 102+ messages in thread
From: Taylor Blau @ 2025-08-27 1:12 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:10AM +0200, Patrick Steinhardt wrote:
> We have a recurring pattern where we essentially perform an upsert of a
> packfile in case it isn't yet known by the packfile store. The logic to
> do so is non-trivial as we have to reconstruct the packfile's key, check
> the map of packfiles, then create the new packfile and finally add it to
> the store.
>
> Introduce a new function that does this dance for us. Refactor callsites
> to use it.
Nice, I have definitely noticed this pattern before and thought it would
be nice to DRY it up a bit, but never got around to doing so ;-).
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/fast-import.c | 4 ++--
> builtin/index-pack.c | 10 +++-------
> midx.c | 18 ++----------------
> packfile.c | 44 +++++++++++++++++++++++++++++++-------------
> packfile.h | 8 ++++++++
> 5 files changed, 46 insertions(+), 38 deletions(-)
>
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index e9d82b31c3..a26e79689d 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -897,11 +897,11 @@ static void end_packfile(void)
> idx_name = keep_pack(create_index());
>
> /* Register the packfile with core git's machinery. */
> - new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
> + new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
> + idx_name, 1);
> if (!new_p)
> die("core git rejected index %s", idx_name);
> all_packs[pack_id] = new_p;
> - packfile_store_add_pack(the_repository->objects->packfiles, new_p);
OK, we can now avoid calling packfile_store_add_pack() explicitly here,
since that is part of the new packfile_store_load_pack() function which
is called a few lines up. That does change the order of operations a
little bit (previously the new pack would end up in 'all_packs' first
before being installed, now it's the other way around), but not in a way
that I think matters.
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index ed490dfad4..2b78ba7fe4 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
> rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
> hash, "idx", 1);
>
> - if (do_fsck_object) {
> - struct packed_git *p;
> - p = add_packed_git(the_repository, final_index_name,
> - strlen(final_index_name), 0);
> - if (p)
> - packfile_store_add_pack(the_repository->objects->packfiles, p);
> - }
> + if (do_fsck_object)
> + packfile_store_load_pack(the_repository->objects->packfiles,
> + final_index_name, 0);
Looks obviously correct to me.
> diff --git a/midx.c b/midx.c
> index 3cfe7884ad..d30feda019 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -454,7 +454,6 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
> uint32_t pack_int_id)
> {
> struct strbuf pack_name = STRBUF_INIT;
> - struct strbuf key = STRBUF_INIT;
> struct packed_git *p;
>
> pack_int_id = midx_for_pack(&m, pack_int_id);
> @@ -466,22 +465,9 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
>
> strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
> m->pack_names[pack_int_id]);
> -
> - /* pack_map holds the ".pack" name, but we have the .idx */
> - strbuf_addbuf(&key, &pack_name);
> - strbuf_strip_suffix(&key, ".idx");
> - strbuf_addstr(&key, ".pack");
> - p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
> - strhash(key.buf), key.buf,
> - struct packed_git, packmap_ent);
> - if (!p) {
> - p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
> - if (p)
> - packfile_store_add_pack(r->objects->packfiles, p);
> - }
> -
> + p = packfile_store_load_pack(r->objects->packfiles,
> + pack_name.buf, m->local);
Nice. This all looks like it preserves the right behavior, and it's nice
to see the "we have a thing that ends in '.idx', but we need one that
ends in '.pack'" logic move into the new helper, too.
> diff --git a/packfile.c b/packfile.c
> index a79d0fc1fa..f7a9967c9d 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
> list_add_tail(&pack->mru, &store->mru);
> }
>
> +struct packed_git *packfile_store_load_pack(struct packfile_store *store,
> + const char *idx_path, int local)
> +{
> + struct strbuf key = STRBUF_INIT;
> + struct packed_git *p;
> +
> + /*
> + * We're being called with the path to the index file, but `pack_map`
> + * holds the path to the packfile itself.
> + */
> + strbuf_addstr(&key, idx_path);
> + strbuf_strip_suffix(&key, ".idx");
> + strbuf_addstr(&key, ".pack");
> +
> + p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
> + struct packed_git, packmap_ent);
> + if (!p) {
> + p = add_packed_git(store->odb->repo, idx_path,
> + strlen(idx_path), local);
> + if (p)
> + packfile_store_add_pack(store, p);
> + }
> +
> + strbuf_release(&key);
> + return p;
> +}
> +
This all looks good too, and matches the behavior of the callers which
are being refactored.
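The key derivation at the top of the helper can be sketched without strbuf
as below. `pack_key_from_idx_path()` is a hypothetical name and the plain
malloc() handling is an assumption for the sake of a self-contained
example; the real code uses Git's strbuf API as shown in the diff:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Maps "<base>.idx" to "<base>.pack". If the input does not end in
 * ".idx", ".pack" is appended to the unmodified path, mirroring a
 * strip-suffix-if-present followed by an append.
 *
 * Returns a newly allocated string that the caller must free(), or
 * NULL on allocation failure.
 */
static char *pack_key_from_idx_path(const char *idx_path)
{
	static const char idx_suffix[] = ".idx";
	static const char pack_suffix[] = ".pack";
	const size_t idx_len = sizeof(idx_suffix) - 1;
	size_t len, base_len;
	char *key;

	assert(idx_path);

	len = strlen(idx_path);
	base_len = len;
	if (len >= idx_len && !strcmp(idx_path + len - idx_len, idx_suffix))
		base_len = len - idx_len;

	key = malloc(base_len + sizeof(pack_suffix));
	if (!key)
		return NULL;

	memcpy(key, idx_path, base_len);
	/* sizeof(pack_suffix) includes the terminating NUL. */
	memcpy(key + base_len, pack_suffix, sizeof(pack_suffix));
	return key;
}
```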
> void (*report_garbage)(unsigned seen_bits, const char *path);
>
> static void report_helper(const struct string_list *list,
> @@ -892,23 +919,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
> const char *file_name, void *_data)
> {
> struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
> - struct packed_git *p;
> size_t base_len = full_name_len;
>
> if (strip_suffix_mem(full_name, &base_len, ".idx") &&
> !(data->m && midx_contains_pack(data->m, file_name))) {
> - struct hashmap_entry hent;
> - char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
> - unsigned int hash = strhash(pack_name);
> - hashmap_entry_init(&hent, hash);
> -
> - /* Don't reopen a pack we already have. */
> - if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
> - p = add_packed_git(data->r, full_name, full_name_len, data->local);
> - if (p)
> - packfile_store_add_pack(data->r->objects->packfiles, p);
> - }
> - free(pack_name);
> + char *trimmed_path = xstrndup(full_name, full_name_len);
> + packfile_store_load_pack(data->r->objects->packfiles,
> + trimmed_path, data->local);
I think we could avoid the allocation here by passing along the length
of the string we want to use, as in:
packfile_store_load_pack(data->r->objects->packfiles,
full_name, full_name_len,
data->local);
, but I prefer the way it is written here.
Thanks,
Taylor
* Re: [PATCH v2 13/16] packfile: move `get_multi_pack_index()` into "midx.c"
2025-08-21 7:39 ` [PATCH v2 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
@ 2025-08-27 1:20 ` Taylor Blau
0 siblings, 0 replies; 102+ messages in thread
From: Taylor Blau @ 2025-08-27 1:20 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:11AM +0200, Patrick Steinhardt wrote:
> The `get_multi_pack_index()` function is declared and implemented in the
> packfile subsystem, even though it really belongs into the multi-pack
> index subsystem. The reason for this is likely that it needs to call
> `packfile_store_prepare()`, which is not exposed by the packfile system.
> In a subsequent commit we're about to add another caller outside of the
> packfile system though, so we'll have to expose the function anyway.
>
> Do so now already and move `get_multi_pack_index()` into the MIDX
> subsystem.
Makes sense.
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> midx.c | 6 ++++++
> midx.h | 2 ++
> packfile.c | 8 +-------
> packfile.h | 10 +++++++++-
> 4 files changed, 18 insertions(+), 8 deletions(-)
And all looks good here.
Thanks,
Taylor
* Re: [PATCH v2 14/16] packfile: remove `get_packed_git()`
2025-08-21 7:39 ` [PATCH v2 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
@ 2025-08-27 1:38 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-27 1:38 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:12AM +0200, Patrick Steinhardt wrote:
> We have two different functions to retrieve packfiles for a packfile
> store:
>
> - `get_packed_git()` returns the list of packfiles after having called
> `prepare_packed_git()`.
>
> - `get_all_packs()` calls `prepare_packed_git()`, as well, but also
> calls `prepare_midx_pack()` for each pack.
Yeah, having two of these functions that are named so similarly as to
suggest they do the same thing (even though they don't) is unfortunate,
and I am glad that we are looking at it here.
> This means that the latter function also properly loads the info of
> whether or not a packfile is part of a multi-pack index. Preparing this
> extra information also shouldn't be significantly more expensive:
Right; get_packed_git() only loads the non-MIDX'd packs, and
get_all_packs() loads everything (regardless of whether or not a pack is
part of the MIDX).
Are all of the get_packed_git() callers prepared to handle packs that
are in the MIDX? Looking through them:
- builtin/gc.c::incremental_repack_auto_condition() skips over
'p->multi_pack_index', so this one is fine to convert.
- builtin/grep.c::cmd_grep() calls get_packed_git() but doesn't
actually use the result, so this should be fine to convert, though I
think there is some subtlety here.
- builtin/pack-objects.c::want_object_in_pack_mtime() takes a separate
pass over the MIDX'd packs before calling get_packed_git_mru() (which
itself calls prepare_packed_git()). I think in practice this is OK,
since we will have already handled the MIDX'd packs, but this
function is now iterating over packs in the MIDX twice, so it may be
worth adding a "if (p->multi_pack_index) continue;" in there.
- object-name.c::find_short_packed_object() handles MIDX'd packs
separately, and unique_in_pack() is a noop for MIDX'd packs, so this
one is fine.
- object-name.c::find_abbrev_len_packed() is OK for the same reasons.
So I think that want_object_in_pack_mtime() may need a small tweak, and
I am not 100% certain that cmd_grep() is OK to convert.
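As a rough sketch of the tweak suggested for want_object_in_pack_mtime(), the following toy model handles MIDX'd packs in a first pass and skips them in the second. The `toy_pack` struct and `visit_all_once()` are illustrative assumptions, not the real pack-objects code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pack {
	struct toy_pack *next;
	int multi_pack_index; /* non-zero if the pack is covered by the MIDX */
	int visits;           /* how often this pack has been looked at */
};

/* Visit every pack exactly once and return the number of visits. */
static int visit_all_once(struct toy_pack *packs)
{
	int total = 0;

	/* First pass: packs that are reachable via the MIDX. */
	for (struct toy_pack *p = packs; p; p = p->next) {
		if (p->multi_pack_index) {
			p->visits++;
			total++;
		}
	}

	/* Second pass: skip anything the first pass already handled. */
	for (struct toy_pack *p = packs; p; p = p->next) {
		if (p->multi_pack_index)
			continue;
		assert(p->visits == 0);
		p->visits++;
		total++;
	}

	return total;
}
```

Without the `continue`, the MIDX'd pack would be counted in both passes.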
> - We have already loaded all packfiles via `prepare_packed_git_one()`.
> So given that multi-pack indices may only refer to packfiles in the
> same object directory we know that we already loaded each packfile.
>
> - The multi-pack index was prepared via `packfile_store_prepare()`
> already, which calls `prepare_multi_pack_index_one()`.
>
> - So all that remains to be done is to look up the index of the pack
> in its multi-pack index so that we can store that info in both the
> pack itself and the MIDX.
This clearly shows that get_all_packs() is not meaningfully more
expensive than get_packed_git(), but I think that may be obscuring some
of the details above on why it is OK (or not) to transition these calls
over.
> So it is somewhat confusing to readers that one of these two functions
> claims to load "all" packfiles while the other one doesn't, even though
> the ultimate difference is way more nuanced.
>
> Convert all of these sites to use `get_all_packs()` instead and remove
> `get_packed_git()`. There doesn't seem to be a good reason to discern
> these two functions.
The last sentence here threw me off a bit. I think the patch message
would benefit from some of the reasoning above about why it is OK to
transition callers over from one function to the other.
As an aside, I am not convinced that having one caller (in pack-objects)
that cares about whether or not it sees a MIDX'd pack is that great of a
reason to have separate functions. But I am also not convinced that it
isn't either. If we kept both, I think they would benefit from having
more distinct names, like get_all_packs() and get_non_midx_packs() or
something.
Thanks,
Taylor
* Re: [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store
2025-08-21 7:39 ` [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
@ 2025-08-27 1:45 ` Taylor Blau
2025-09-02 8:51 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-08-27 1:45 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Thu, Aug 21, 2025 at 09:39:13AM +0200, Patrick Steinhardt wrote:
> The `get_all_packs()` function prepares the packfile store and then
> returns its packfiles. Refactor it to accept a packfile store instead of
> a repository to clarify its scope.
I think that clarifying the scope here is a good idea. But I am a little
sad to see this patch proposing that we drop get_all_packs(), which IMHO
is a useful convenience function.
In effect this is pushing out the implementation details of the
packfile_store out to every caller that wants to use get_all_packs(),
which I am not sure is a win. Should those callers care where the array
of packs is found, or have to write
packfile_store_get_packs(the_repository->objects->packfiles)
each time they want to get the set of packs in a repository?
I could see an argument in the future where we have object stores that
aren't packfile-based and thus calling "get_all_packs()" is not
meaningful. But I don't think we are there yet, so I think that this
patch is pushing the burden of that future hypothetical on all existing
callers of get_all_packs().
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/cat-file.c | 2 +-
> builtin/count-objects.c | 2 +-
> builtin/fast-import.c | 4 ++--
> builtin/fsck.c | 8 ++++----
> builtin/gc.c | 8 ++++----
> builtin/pack-objects.c | 18 +++++++++---------
> builtin/pack-redundant.c | 4 ++--
> builtin/repack.c | 6 +++---
> connected.c | 2 +-
> http-backend.c | 4 ++--
> http.c | 2 +-
> object-name.c | 4 ++--
> pack-bitmap.c | 4 ++--
> pack-objects.c | 2 +-
> packfile.c | 14 +++++++-------
> packfile.h | 7 ++++++-
> server-info.c | 2 +-
> t/helper/test-find-pack.c | 2 +-
> t/helper/test-pack-mtimes.c | 2 +-
> 19 files changed, 51 insertions(+), 46 deletions(-)
>
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index fce0b06451c..7124c43fb14 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -854,7 +854,7 @@ static void batch_each_object(struct batch_options *opt,
> batch_one_object_bitmapped, &payload)) {
> struct packed_git *pack;
>
> - for (pack = get_all_packs(the_repository); pack; pack = pack->next) {
> + for (pack = packfile_store_get_packs(the_repository->objects->packfiles); pack; pack = pack->next) {
If we do go this route, it might be nice to introduce the pattern of
having a stack variable to hold the packfile_store pointer, since the
line above here is getting a little long at >100 characters just to
enumerate packs.
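Something along these lines is what I have in mind. This is only a sketch with made-up `toy_*` types that mirror the repo->objects->packfiles chain; it is not the actual cat-file code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pack { struct toy_pack *next; };
struct toy_store { struct toy_pack *packs; };
struct toy_odb { struct toy_store *packfiles; };
struct toy_repo { struct toy_odb *objects; };

/* Hypothetical accessor standing in for packfile_store_get_packs(). */
static struct toy_pack *toy_store_get_packs(struct toy_store *store)
{
	return store->packs;
}

static size_t count_packs(struct toy_repo *repo)
{
	/* Resolve the long pointer chain once, on the stack. */
	struct toy_store *store;
	size_t n = 0;

	assert(repo && repo->objects && repo->objects->packfiles);
	store = repo->objects->packfiles;

	for (struct toy_pack *p = toy_store_get_packs(store); p; p = p->next)
		n++;
	return n;
}
```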
Thanks,
Taylor
* Re: [PATCH v2 06/16] odb: move kept cache into `struct packfile_store`
2025-08-26 1:46 ` Taylor Blau
@ 2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:50 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Mon, Aug 25, 2025 at 09:46:07PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:04AM +0200, Patrick Steinhardt wrote:
> > diff --git a/packfile.h b/packfile.h
> > index d48d46cc1b..74cea1a4a9 100644
> > --- a/packfile.h
> > +++ b/packfile.h
> > @@ -64,6 +64,11 @@ struct packfile_store {
> > */
> > struct packed_git *packs;
> >
> > + struct {
> > + struct packed_git **packs;
> > + unsigned flags;
> > + } kept_cache;
> > +
>
> This wouldn't be a bad time to add a comment here explaining what the
> kept_cache is for and what each of the struct's members represent. We
> can blame (at least one of) the author(s) of 20b031fede (packfile: add
> kept-pack cache for find_kept_pack_entry(), 2021-02-22) for omitting it
> in the first place ;-).
Good idea. I'll try to puzzle something together.
Patrick
* Re: [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()`
2025-08-26 2:10 ` Taylor Blau
@ 2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:50 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Mon, Aug 25, 2025 at 10:10:02PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:07AM +0200, Patrick Steinhardt wrote:
> > diff --git a/odb.c b/odb.c
> > index 80ec6fc1fa..37ed21f53b 100644
> > --- a/odb.c
> > +++ b/odb.c
> > @@ -694,7 +694,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
> >
> > /* Not a loose object; someone else may have just packed it. */
> > if (!(flags & OBJECT_INFO_QUICK)) {
> > - reprepare_packed_git(odb->repo);
> > + odb_reprepare(odb->repo->objects);
> > if (find_pack_entry(odb->repo, real, &e))
> > break;
> > }
> > @@ -1039,3 +1039,26 @@ void odb_clear(struct object_database *o)
> >
> > string_list_clear(&o->submodule_source_paths, 0);
> > }
> > +
> > +void odb_reprepare(struct object_database *o)
>
> OK; so here is the new location for the non-packfile related portions of
> the former reprepare_packed_git() function. That makes sense, but...
>
> > +{
> > + struct odb_source *source;
> > +
> > + /*
> > + * Reprepare alt odbs, in case the alternates file was modified
> > + * during the course of this process. This only _adds_ odbs to
> > + * the linked list, so existing odbs will continue to exist for
> > + * the lifetime of the process.
> > + */
> > + o->loaded_alternates = 0;
> > + odb_prepare_alternates(o);
> > +
> > + for (source = o->sources; source; source = source->next)
> > + odb_clear_loose_cache(source);
> > +
> > + o->approximate_object_count_valid = 0;
> > +
> > + packfile_store_reprepare(o->packfiles);
> > +
> > + obj_read_unlock();
>
> ...I think I am missing where we call obj_read_lock(). The function
> packfile_store_reprepare() has a comment that it must be called under
> the obj_read_lock(), but I don't see where we acquire that lock here.
>
> Are the callers of odb_reprepare() supposed to acquire that lock? If so,
> it seems a little awkward that the caller is supposed to acquire the
> lock, but the callee is the one to release it. Is this function missing
> an obj_read_lock() at the top?
>
> I looked at a few callers here and none of them seem to be holding this
> lock. pthread_mutex_unlock() is supposed to check that the mutex lock is
> held for recursive and error-checking mutexes. IIRC we initialize the
> odb mutex with PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, so I am a
> little surprised that this did not cause a runtime error.
Oh, that's a very good catch. And yes, surprising indeed that this does
not cause an error anywhere. But if you look at the code it ultimately
isn't all that surprising: there is only a single command in our code
base that performs concurrent reads via the ODB, namely git-grep(1).
That command explicitly opts into `obj_read_use_lock` by calling
`enable_obj_read_lock()`. All the other commands never perform any
locking whatsoever.
In any case, I think I merely forgot to add the locking call to
`odb_reprepare()`. It shouldn't be the responsibility of the caller to
care about locking, it should be handled automatically.
Will fix.
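Roughly, the shape I am aiming for is that the callee both acquires and releases the lock. The following is only a sketch: the `toy_*` names are made up, and it uses a plain default pthread mutex instead of our actual locking helpers.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t toy_read_mutex = PTHREAD_MUTEX_INITIALIZER;
static int toy_lock_depth; /* bookkeeping to catch unbalanced unlocks */
static int toy_generation; /* state that must only change under the lock */

static void toy_read_lock(void)
{
	pthread_mutex_lock(&toy_read_mutex);
	toy_lock_depth++;
}

static void toy_read_unlock(void)
{
	/* An unlock without a matching lock is a programming error. */
	assert(toy_lock_depth > 0);
	toy_lock_depth--;
	pthread_mutex_unlock(&toy_read_mutex);
}

/* Callers do not need to know about the lock at all. */
static void toy_reprepare(void)
{
	toy_read_lock();
	toy_generation++;
	toy_read_unlock();
}
```

With the lock call missing at the top, the assertion in `toy_read_unlock()` would trigger on the first call.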
Patrick
* Re: [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store
2025-08-26 2:11 ` Taylor Blau
@ 2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:50 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Mon, Aug 25, 2025 at 10:11:52PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:08AM +0200, Patrick Steinhardt wrote:
> > The `install_packed_git()` function adds a packfile to a specific
> > object store. Refactor it to accept a packfile store instead of a
> > repository to clarify its scope.
>
> All of the refactoring here looks straightforward and correct to me. I
> admit that I have a vague preference towards keeping the word "install"
> in the function name here, since it (to me) suggests that the packfile
> in question is going to be used for lookups, whereas "add" is a bit
> more generic.
>
> I don't feel strongly about it, though, so if you have a preference
> towards "add" then I'm fine with that.
I think "install" would make sense if there was a mode where you can set
up a packfile via the store that is _inactive_, but that's not the case
anymore. It was before the changes to introduce the packfile store,
where "add" only instantiated the packfile whereas "install" added it to
the store. But that difference is basically going away now, so as soon
as you add a packfile to the store it immediately becomes active.
So I picked "add" because it's a bit shorter compared to "install", and
because from my point of view the previous distinction isn't necessary
anymore.
I'll keep this as-is for now, but same as you: I don't really feel all
that strongly about it.
Patrick
* Re: [PATCH v2 02/16] odb: move list of packfiles into `struct packfile_store`
2025-08-25 23:42 ` Taylor Blau
@ 2025-09-02 8:50 ` Patrick Steinhardt
2025-09-02 17:21 ` Taylor Blau
0 siblings, 1 reply; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:50 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Mon, Aug 25, 2025 at 07:42:53PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:00AM +0200, Patrick Steinhardt wrote:
> > The object database tracks the list of packfiles it currently knows
> > about. With the introduction of the `struct packfile_store` we have a
> > better place to host this list though.
> >
> > Move the list accordingly. Extract the logic from `odb_clear()` that
> > knows to close all such packfiles and move it into the new subsystem, as
> > well.
>
> Not a comment on this patch itself, but as a meta-comment, I really
> appreciate you taking such an incremental approach here. The packfile
> machinery is quite fragile in my experience, so breaking it up into (what
> are so far) easily review-able chunks makes it much easier to build
> confidence in the correctness of these changes.
It certainly is fragile overall. I stared at code for way longer than I
really want to admit in some cases.
> > diff --git a/odb.c b/odb.c
> > index 34b70d0074..17a9135cbd 100644
> > --- a/odb.c
> > +++ b/odb.c
> > @@ -1038,16 +1038,7 @@ void odb_clear(struct object_database *o)
> >
> > INIT_LIST_HEAD(&o->packed_git_mru);
> > close_object_store(o);
> > -
> > - /*
> > - * `close_object_store()` only closes the packfiles, but doesn't free
> > - * them. We thus have to do this manually.
> > - */
> > - for (struct packed_git *p = o->packed_git, *next; p; p = next) {
> > - next = p->next;
> > - free(p);
> > - }
> > - o->packed_git = NULL;
> > + packfile_store_free(o->packfiles);
>
> Interesting. The movement of the for-loop here all looks correct to me.
> But I think the new packfile_store is creating a new awkwardness here
> that we should consider.
>
> In the existing implementation, all of the ->next pointers here point to
> heap locations that have already been free()'d. But that's OK, since
> they aren't reachable at the moment that we do "o->packed_git = NULL".
>
> Having a separate packfile_store changes that, since (from my reading of
> the code) o->packfiles will still be non-NULL even after calling
> odb_clear(), *and* those pointers will refer to free'd heap locations.
>
> That seems like a potential footgun to me. I think that we could either:
>
> * Change packfile_store_free() to take in an object_database pointer,
> and NULL out the ->packs pointer after free'ing all of the packfiles.
> That would make it more similar to the existing behavior.
>
> * Leave packfile_store_free() as-is, document that it does NOT clear
> out the top-level pointer, and so callers are encouraged to NULL it
> out themselves after calling it. Likewise, we should change
> odb_clear() to do:
>
> packfile_store_free(o->packfiles);
> o->packfiles = NULL;
>
> Let me know what you think.
Good point. I think it's unlikely to ever become a problem, but I don't
see a reason why we shouldn't NULL out `o->packfiles`, either. So I'll
do the second approach.
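For reference, a minimal sketch of that second approach. The `toy_*` types and functions are simplified stand-ins, and whether the real `packfile_store_free()` also releases the store itself is an assumption made here for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_pack { struct toy_pack *next; };
struct toy_store { struct toy_pack *packs; };
struct toy_odb { struct toy_store *packfiles; };

/* Frees every pack and the store; does NOT touch the owner's pointer. */
static void toy_store_free(struct toy_store *store)
{
	if (!store)
		return;
	for (struct toy_pack *p = store->packs, *next; p; p = next) {
		next = p->next;
		free(p);
	}
	free(store);
}

static void toy_odb_clear(struct toy_odb *o)
{
	assert(o);
	toy_store_free(o->packfiles);
	/* Drop the dangling pointer so later users see NULL, not freed memory. */
	o->packfiles = NULL;
}
```

A second `toy_odb_clear()` on the same object is then harmless, because the free function tolerates NULL.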
> > diff --git a/packfile.c b/packfile.c
> > index 8fbf1cfc2d..6478e4cc30 100644
> > --- a/packfile.c
> > +++ b/packfile.c
> > @@ -278,7 +278,7 @@ static int unuse_one_window(struct packed_git *current)
> >
> > if (current)
> > scan_windows(current, &lru_p, &lru_w, &lru_l);
> > - for (p = current->repo->objects->packed_git; p; p = p->next)
> > + for (p = current->repo->objects->packfiles->packs; p; p = p->next)
>
> Not a huge deal, but I do find "current->repo->objects->packfiles->packs"
> to be a bit unfortunate. I wonder if we should rename "packs" to "head"
> or "list_head" or similar since it's clear from
> "current->repo->objects->packfiles" that this is a list of packfiles.
I'd like to keep this part as-is for now if you don't mind. This is
mostly because I've got a follow-up patch series that _does_ introduce
`head` as part of making the `->next` pointer go away.
> > @@ -2344,5 +2339,23 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
> >
> > void packfile_store_free(struct packfile_store *store)
> > {
> > + packfile_store_close(store);
>
> Seeing a call to packfile_store_close() here was a little surprising to
> me. The code that you are moving has a comment that says:
>
> * `close_object_store()` only closes the packfiles, but doesn't free
> * them. We thus have to do this manually.
>
> , so I would have expected to preserve that behavior.
This behaviour is preserved though. Calling `packfile_store_close()`
does not free the packfiles, it only closes them. And we continue to
call `packfile_store_close()` in `close_object_store()`, so nothing
changes.
The only change in behaviour is that we now also know to close packfiles
when freeing the packfile store.
> I *think* that
> this happens to be OK, since close_pack() is a noop if it is called more
> than once (though I had to double check through all of its leaf
> functions that that was indeed the case).
>
> I would probably strike this from the new function, since the sole
> caller above already calls close_object_store() before calling
> packfile_store_free().
Calling `packfile_store_close()` is idempotent indeed, so it shouldn't
be an issue to call this function twice. To me the question is whether
there's ever a use case where you want to free the packfile store, but
don't want to close the packfiles stored in it.
From all I've seen that is never the case, so I think it's sensible to
ensure that we always close the packfile store before we free it to make
things a tiny bit more self-contained.
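To make the idempotency argument concrete, here is a toy model in which closing twice is a no-op and freeing always closes first. All names are hypothetical, and the "file descriptor" is just an integer marker that is never passed to close(2).

```c
#include <assert.h>
#include <stdlib.h>

struct toy_pack {
	struct toy_pack *next;
	int fd; /* -1 once the pack has been closed */
};
struct toy_store { struct toy_pack *packs; };

static int toy_close_calls; /* counts closes that actually did something */

static void toy_close_pack(struct toy_pack *p)
{
	if (p->fd < 0)
		return; /* already closed, nothing to do */
	p->fd = -1;
	toy_close_calls++;
}

static void toy_store_close(struct toy_store *store)
{
	for (struct toy_pack *p = store->packs; p; p = p->next)
		toy_close_pack(p);
}

static void toy_store_free(struct toy_store *store)
{
	assert(store);
	/* Always close before freeing; a no-op if already closed. */
	toy_store_close(store);
	for (struct toy_pack *p = store->packs, *next; p; p = next) {
		next = p->next;
		free(p);
	}
	free(store);
}
```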
> > +void packfile_store_close(struct packfile_store *store)
> > +{
> > + struct packed_git *p;
> > +
> > + for (p = store->packs; p; p = p->next)
> > + if (p->do_not_close)
> > + BUG("want to close pack marked 'do-not-close'");
> > + else
> > + close_pack(p);
> > +}
>
> And likewise this looks good to me. I do find the braceless for-loop a
> little hard to read, but it's (a) correct, and (b) consistent with the
> original implementation, so I don't feel strongly about changing it.
Agreed, it is a bit awkward. I feel like our coding style should be
amended to say that we only do braceless bodies in case the body is a
single statement.
> As a side-note, you could inline the declaration of "p" here into the
> for-loop, but I can understand not wanting to, so as to keep the diff
> more readable with --color-moved.
I wouldn't mind adapting this while at it, too.
Patrick
* Re: [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack
2025-08-27 1:04 ` Taylor Blau
@ 2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:50 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Tue, Aug 26, 2025 at 09:04:24PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:09AM +0200, Patrick Steinhardt wrote:
> > When adding a packfile to its store we add it both to the list and map of
> > packfiles, but we don't append it to the most-recently-used list of
> > packs. We do know to add the packfile to the MRU list as soon as we
> > access any of its objects, but in between we're being inconsistent. It
> > doesn't help that there are some subsystems that _do_ add the packfile
> > to the MRU after having added it, which only adds to the confusion.
> >
> > Refactor the code so that we unconditionally add packfiles to the MRU
> > when adding them to a packfile store.
>
> I am a little confused why prepare_midx_pack() wants to add packs to the
> MRU cache so eagerly, and the commit which introduced that behavior
> (commit af96fe3392 (midx: add packs to packed_git linked list,
> 2019-04-29)) doesn't focus on that area in detail.
>
> (Note that commit af96fe3392 *does* discuss a separate cache's behavior
> regarding the open file descriptor limit, but that LRU cache is a
> different one than the MRU cache we're discussing here.)
>
> What I do wonder about is why af96fe3392 adds packs to the MRU cache in
> the first place. As far as I can tell, we never move MIDX'd packs to
> the front of the MRU cache at all. There are two spots that call
> list_move() on the MRU cache, which are:
>
> - packfile.c::find_pack_entry(), which enumerates MIDX'd
> packs in a separate loop earlier on in the function, and ignores
> packs in the MRU cache whose p->multi_pack_index bit is set.
>
> - builtin/pack-objects.c::want_object_in_pack_mtime(), which also
> enumerates MIDX'd packs in a separate loop, though it does not
> explicitly ignore packs in the MRU cache with the multi_pack_index
> bit set.
>
> In practice, though, I think these two are equivalent, since
> want_object_in_pack_mtime() will return before it gets to the MRU cache
> if it found the object in a MIDX'd pack.
>
> So I don't think we need to be adding MIDX'd packs to the MRU cache in
> the first place.
I think the status quo is quite confusing. There are callers which
directly iterate through the list of packfiles in MRU order, and that
list is not guaranteed right now to even contain all packfiles that are
tracked in the packfile store. The list is complete when we only load
packfiles from disk, but if we ever manually add a packfile to the store
in-memory the list is not up-to-date anymore. I also couldn't find a
reason for that distinction.
Besides resolving that confusion, there's another motivation here as
discussed with Peff in [1]: we can drop the distinction between the MRU
list and the "normal" list altogether. Ensuring that all packfiles are
always stored in the MRU is a prerequisite for that subsequent change. I
already have a patch series pending that does this refactoring.
That being said, I wouldn't mind moving this change into that subsequent
patch series, either. It doesn't really have a strong reason to exist
yet, but once we remove the distinction between the two packfile lists
we have a much stronger argument.
[1]: <20250820192008.GA1662788@coredump.intra.peff.net>
Patrick
* Re: [PATCH v2 14/16] packfile: remove `get_packed_git()`
2025-08-27 1:38 ` Taylor Blau
@ 2025-09-02 8:50 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:50 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Tue, Aug 26, 2025 at 09:38:47PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:12AM +0200, Patrick Steinhardt wrote:
> > We have two different functions to retrieve packfiles for a packfile
> > store:
> >
> > - `get_packed_git()` returns the list of packfiles after having called
> > `prepare_packed_git()`.
> >
> > - `get_all_packs()` calls `prepare_packed_git()`, as well, but also
> > calls `prepare_midx_pack()` for each pack.
>
> Yeah, having two of these functions that are named so similarly as to
> suggest they do the same thing (even though they don't) is unfortunate,
> and I am glad that we are looking at it here.
>
> > This means that the latter function also properly loads the info of
> > whether or not a packfile is part of a multi-pack index. Preparing this
> > extra information also shouldn't be significantly more expensive:
>
> Right; get_packed_git() only loads the non-MIDX'd packs, and
> get_all_packs() loads everything (regardless of whether or not a pack is
> part of the MIDX).
I initially understood the distinction of these functions to be exactly
this. But after looking further I don't think this is the actual
distinction: both functions end up loading all packfiles in the repo,
with the only distinction being that `get_all_packs()` also prepares the
MIDX for each MIDX'd packfile.
The important thing to note is that an MIDX may only ever refer to
packfiles in the same object directory. And as we have already loaded
all packfiles in that object directory via `prepare_packed_git_one()`
there isn't really any difference in the returned list of packfiles
whatsoever. Both functions end up loading all packfiles, regardless of
whether or not they have an MIDX.
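Put differently, my current mental model looks roughly like the toy code below: both getters hand back the very same list, and the only extra work in the second one is flagging the packs that the MIDX covers. The `toy_*` names and the hard-coded MIDX contents are assumptions for illustration only, not how the real functions are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_pack {
	struct toy_pack *next;
	const char *name;
	int in_midx; /* set once we know the MIDX covers this pack */
};

/* Pack names covered by the toy MIDX of this object directory. */
static const char *toy_midx_names[] = { "pack-b" };

/* Models get_packed_git(): return what was already loaded. */
static struct toy_pack *toy_get_packed_git(struct toy_pack *loaded)
{
	return loaded;
}

/* Models get_all_packs(): same list, plus MIDX membership info. */
static struct toy_pack *toy_get_all_packs(struct toy_pack *loaded)
{
	size_t nr = sizeof(toy_midx_names) / sizeof(toy_midx_names[0]);

	for (struct toy_pack *p = loaded; p; p = p->next) {
		assert(p->name);
		for (size_t i = 0; i < nr; i++)
			if (!strcmp(p->name, toy_midx_names[i]))
				p->in_midx = 1;
	}
	return loaded;
}
```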
Please double check my understanding though, I have been staring at this
code for quite a while to figure out what the actual differences are,
and whether or not `get_all_packs()` may actually end up loading more
packs than `get_packed_git()`.
> Are all of the get_packed_git() callers prepared to handle packs that
> are in the MIDX? Looking through them:
>
> - builtin/gc.c::incremental_repack_auto_condition() skips over
> 'p->multi_pack_index', so this one is fine to convert.
>
> - builtin/grep.c::cmd_grep() calls get_packed_git() but doesn't
> actually use the result, so this should be fine to convert, though I
> think there is some subtlety here.
>
> - builtin/pack-objects.c::want_object_in_pack_mtime() takes a separate
> pass over the MIDX'd packs before calling get_packed_git_mru() (which
> itself calls prepare_packed_git()). I think in practice this is OK,
> since we will have already handled the MIDX'd packs, but this
> function is now iterating over packs in the MIDX twice, so it may be
> worth adding a "if (p->multi_pack_index) continue;" in there.
>
> - object-name.c::find_short_packed_object() handles MIDX'd packs
> separately, and unique_in_pack() is a noop for MIDX'd packs, so this
> one is fine.
>
> - object-name.c::find_abbrev_len_packed() is OK for the same reasons.
>
> So I think that want_object_in_pack_mtime() may need a small tweak, and
> I am not 100% certain that cmd_grep() is OK to convert.
I initially misunderstood the distinction between these two functions
the same as you did, and had a similar list to the above in the initial
commit message. But with the adjusted understanding of the actual
difference between these functions I think it shouldn't be necessary
anymore to go through each caller one by one.
I'll adapt the commit message a bit.
Patrick
* Re: [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store
2025-08-27 1:45 ` Taylor Blau
@ 2025-09-02 8:51 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 8:51 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Karthik Nayak, Jeff King
On Tue, Aug 26, 2025 at 09:45:15PM -0400, Taylor Blau wrote:
> On Thu, Aug 21, 2025 at 09:39:13AM +0200, Patrick Steinhardt wrote:
> > The `get_all_packs()` function prepares the packfile store and then
> > returns its packfiles. Refactor it to accept a packfile store instead of
> > a repository to clarify its scope.
>
> I think that clarifying the scope here is a good idea. But I am a little
> sad to see this patch proposing that we drop get_all_packs(), which IMHO
> is a useful convenience function.
>
> In effect this is pushing out the implementation details of the
> packfile_store out to every caller that wants to use get_all_packs(),
> which I am not sure is a win. Should those callers care where the array
> of packs is found, or have to write
>
> packfile_store_get_packs(the_repository->objects->packfiles)
>
> each time they want to get the set of packs in a repository?
>
> I could see an argument in the future where we have object stores that
> aren't packfile-based and thus calling "get_all_packs()" is not
> meaningful. But I don't think we are there yet, so I think that this
> patch is pushing the burden of that future hypothetical on all existing
> callers of get_all_packs().
We aren't there yet indeed, but the entire goal of this patch series is
to prepare for that future. So we have to take some steps in that
direction that might not yet be entirely sensible by themselves, but
that are necessary prerequisites.
In the end there ideally shouldn't be that many callers that want to
access packfiles directly; most of them should instead go via the ODB.
But many of the callers that we're adapting in this
patch are callers that are deeply tied to the actual ODB on-disk layout,
and we'll have to tear down the abstraction layer between ODB and the
actual backend used to store objects. It's unfortunate, but we cannot
really avoid that in a bunch of situations.
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index fce0b06451c..7124c43fb14 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -854,7 +854,7 @@ static void batch_each_object(struct batch_options *opt,
> > batch_one_object_bitmapped, &payload)) {
> > struct packed_git *pack;
> >
> > - for (pack = get_all_packs(the_repository); pack; pack = pack->next) {
> > + for (pack = packfile_store_get_packs(the_repository->objects->packfiles); pack; pack = pack->next) {
>
> If we do go this route, it might be nice to introduce the pattern of
> having a stack variable to hold the packfile_store pointer, since the
> line above here is getting a little long at >100 characters just to
> enumerate packs.
Okay, will do for now. In a subsequent patch series I'm going to
introduce a helper `packfile_store_for_each_pack()` that'll make this a
bit less verbose, too.
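The final shape of that helper is not settled in this thread, so the following is only a guess at what it could look like: a `toy_store_for_each_pack()` macro over made-up `toy_*` types, purely to show how call sites might shrink.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pack {
	struct toy_pack *next;
	int id;
};
struct toy_store { struct toy_pack *packs; };

/* Hypothetical iteration helper; the real one may well differ. */
#define toy_store_for_each_pack(store, p) \
	for ((p) = (store)->packs; (p); (p) = (p)->next)

static int sum_pack_ids(struct toy_store *store)
{
	struct toy_pack *p;
	int sum = 0;

	assert(store);
	toy_store_for_each_pack(store, p)
		sum += p->id;
	return sum;
}
```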
Patrick
* [PATCH v3 00/15] packfile: carve out a new packfile store
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
` (18 preceding siblings ...)
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 01/15] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
` (15 more replies)
19 siblings, 16 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
Hi,
information about an object database's packfiles is currently distributed
across two different structures:
- `struct packed_git` contains the `next` pointer as well as the
`mru_head`, both of which serve to store the list of packfiles.
- `struct object_database` contains several fields that relate to the
packfiles.
So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.
This patch series introduces a new `struct packfile_store`, which is
about to become the single source of truth for managing packfiles, and
carves out the packfile store subsystem.
This is the first step to make packfiles work with pluggable object
databases. Next steps will be to:
- Move the `struct packed_git::next` and `struct packed_git::mru_head`
pointers into the packfile store so that `struct packed_git` only
tracks a single packfile.
- Push the `struct packfile_store` down one level so that it's not
hosted by the object database anymore, but instead by the object
database source.
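
As a rough illustration of the ownership model described above, the
following sketch has the object database host a single packfile store
that it creates and tears down. It is a reduced model under stated
assumptions: the real `struct packfile_store` carries more state (the
packfile map, the MRU list, the kept cache), and the function bodies
here are simplified guesses, not the code from the series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct object_database;

/* Reduced stand-in: a pack only knows its successor in the list. */
struct packed_git {
	struct packed_git *next;
};

/* Reduced model of the store; the real struct has more members. */
struct packfile_store {
	struct object_database *odb; /* back pointer to the owning database */
	struct packed_git *packs;    /* list of packfiles */
	bool initialized;            /* whether packs have been prepared */
};

struct object_database {
	struct packfile_store *packfiles;
};

/* Allocate an empty store that belongs to the given database. */
static struct packfile_store *packfile_store_new(struct object_database *odb)
{
	struct packfile_store *store;

	assert(odb);
	store = calloc(1, sizeof(*store));
	if (!store)
		abort();
	store->odb = odb;
	return store;
}

/* Release the store together with the packs it tracks. */
static void packfile_store_free(struct packfile_store *store)
{
	struct packed_git *p, *next;

	if (!store)
		return;
	for (p = store->packs; p; p = next) {
		next = p->next;
		free(p);
	}
	free(store);
}

/* Tear down the database; resetting the pointer avoids a dangling store. */
static void odb_clear(struct object_database *o)
{
	packfile_store_free(o->packfiles);
	o->packfiles = NULL;
}
```

With all packfile state behind one pointer, moving the store from the
object database to an object database source later on mostly becomes a
question of which structure holds that pointer.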
Changes in v2:
- Convert the `initialized` flag into a boolean.
- Polish some commit messages.
- Some smaller formatting changes to the layout of `struct
object_database`.
- Link to v1: https://lore.kernel.org/r/20250819-b4-pks-packfiles-store-v1-0-1660842e125a@pks.im
Changes in v3:
- Rebased on top of master at 6ad8021821 (The fifth batch, 2025-08-29)
with ps/object-store-midx-dedup-info at 13296ac909 (midx: compute
paths via their source, 2025-08-11) merged into it. This fixes
various conflicts with "seen". There are still two conflicts: a
trivial one with jt/de-global-bulk-checkin, and a more complex one
with tb/prepare-midx-pack-cleanup. I don't think it's necessary to
really address the first one, but I'm unsure how to proceed with the
second one given that the patch series still seems to be cooking.
- Set `struct object_database::packfiles` to `NULL` after freeing it.
- Add a comment to explain the kept cache.
- Fix a missing `obj_read_lock()` call.
- Drop the commit that always adds packfiles to the MRU. I've moved
this into a subsequent patch series.
- Avoid some overly long lines by storing the pointer to the packfile
store on the stack.
- Point out the difference between `get_all_packs()` and
`get_packed_git()`.
- Link to v2: https://lore.kernel.org/r/20250821-b4-pks-packfiles-store-v2-0-d10623355e9f@pks.im
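
For readers unfamiliar with the kept cache mentioned in the list above,
here is a small sketch of the invalidation rule: the cached list is
reused only while the requested flags match the stored ones. The flag
names, the `kept_cache` member name, the `rebuilds` counter and the
exact signature of `kept_pack_cache()` are assumptions made for
illustration; only the general behaviour is taken from the new comment
in "packfile.h":

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag names, chosen for this sketch only. */
#define KEPT_ON_DISK (1u << 0)
#define KEPT_IN_CORE (1u << 1)

/* Reduced stand-in for a pack with its two "kept" markers. */
struct packed_git {
	struct packed_git *next;
	unsigned pack_keep:1;         /* has an on-disk ".keep" file */
	unsigned pack_keep_in_core:1; /* marked as kept in memory */
};

struct packfile_store {
	struct packed_git *packs;
	struct {
		struct packed_git **packs; /* NULL-terminated, NULL if unset */
		unsigned flags;            /* flags the list was built with */
	} kept_cache;
	unsigned rebuilds; /* illustration only: counts cache rebuilds */
};

/* Return the kept packs for `flags`, rebuilding on a flags mismatch. */
static struct packed_git **kept_pack_cache(struct packfile_store *store,
					   unsigned flags)
{
	struct packed_git **kept;
	struct packed_git *p;
	size_t nr = 0, total = 0;

	assert(store);
	if (store->kept_cache.packs && store->kept_cache.flags == flags)
		return store->kept_cache.packs;

	/* Mismatch or empty cache: drop the old list and rebuild it. */
	free(store->kept_cache.packs);
	for (p = store->packs; p; p = p->next)
		total++;
	kept = calloc(total + 1, sizeof(*kept));
	if (!kept)
		abort();
	for (p = store->packs; p; p = p->next)
		if ((p->pack_keep && (flags & KEPT_ON_DISK)) ||
		    (p->pack_keep_in_core && (flags & KEPT_IN_CORE)))
			kept[nr++] = p;

	store->kept_cache.packs = kept;
	store->kept_cache.flags = flags;
	store->rebuilds++;
	return kept;
}
```

Because the list is rebuilt behind the accessor, callers that poke at
the cached array directly could end up with a stale pointer, which is
why the comment asks for access via `kept_pack_cache()` only.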
Thanks!
Patrick
---
Patrick Steinhardt (15):
packfile: introduce a new `struct packfile_store`
odb: move list of packfiles into `struct packfile_store`
odb: move initialization bit into `struct packfile_store`
odb: move packfile map into `struct packfile_store`
odb: move MRU list of packfiles into `struct packfile_store`
odb: move kept cache into `struct packfile_store`
packfile: reorder functions to avoid function declaration
packfile: refactor `prepare_packed_git()` to work on packfile store
packfile: split up responsibilities of `reprepare_packed_git()`
packfile: refactor `install_packed_git()` to work on packfile store
packfile: introduce function to load and add packfiles
packfile: move `get_multi_pack_index()` into "midx.c"
packfile: remove `get_packed_git()`
packfile: refactor `get_all_packs()` to work on packfile store
packfile: refactor `get_packed_git_mru()` to work on packfile store
builtin/backfill.c | 2 +-
builtin/cat-file.c | 3 +-
builtin/count-objects.c | 3 +-
builtin/fast-import.c | 10 +-
builtin/fsck.c | 11 +-
builtin/gc.c | 14 ++-
builtin/grep.c | 2 +-
builtin/index-pack.c | 10 +-
builtin/pack-objects.c | 32 +++--
builtin/pack-redundant.c | 6 +-
builtin/receive-pack.c | 2 +-
builtin/repack.c | 11 +-
bulk-checkin.c | 2 +-
connected.c | 5 +-
fetch-pack.c | 4 +-
http-backend.c | 5 +-
http.c | 5 +-
http.h | 2 +-
midx.c | 29 ++---
midx.h | 1 +
object-name.c | 6 +-
odb.c | 40 ++++--
odb.h | 34 ++----
pack-bitmap.c | 4 +-
pack-objects.c | 3 +-
packfile.c | 287 ++++++++++++++++++++++++--------------------
packfile.h | 119 +++++++++++++++---
server-info.c | 3 +-
t/helper/test-find-pack.c | 2 +-
t/helper/test-pack-mtimes.c | 2 +-
transport-helper.c | 2 +-
31 files changed, 391 insertions(+), 270 deletions(-)
Range-diff versus v2:
1: 9f205abd1e = 1: efa41d9634 packfile: introduce a new `struct packfile_store`
2: a48391a618 ! 2: c4a45fdaf1 odb: move list of packfiles into `struct packfile_store`
@@ odb.c: void odb_clear(struct object_database *o)
- }
- o->packed_git = NULL;
+ packfile_store_free(o->packfiles);
++ o->packfiles = NULL;
hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
@@ packfile.c: void reprepare_packed_git(struct repository *r)
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
@@ packfile.c: struct packed_git *get_all_packs(struct repository *r)
- prepare_midx_pack(r, m, i);
+ prepare_midx_pack(m, i);
}
- return r->objects->packed_git;
@@ packfile.c: const struct packed_git *has_packed_and_bad(struct repository *r,
return p;
return NULL;
@@ packfile.c: int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
- if (source->midx && fill_midx_entry(r, oid, e, source->midx))
+ if (source->midx && fill_midx_entry(source->midx, oid, e))
return 1;
- if (!r->objects->packed_git)
@@ packfile.c: struct packfile_store *packfile_store_new(struct object_database *od
+
+void packfile_store_close(struct packfile_store *store)
+{
-+ struct packed_git *p;
-+
-+ for (p = store->packs; p; p = p->next)
++ for (struct packed_git *p = store->packs; p; p = p->next) {
+ if (p->do_not_close)
+ BUG("want to close pack marked 'do-not-close'");
+ else
+ close_pack(p);
++ }
+}
## packfile.h ##
3: 4a8cdfd936 = 3: 22125b27eb odb: move initialization bit into `struct packfile_store`
4: ba32eb35c4 ! 4: b927b57c6d odb: move packfile map into `struct packfile_store`
@@ Commit message
Signed-off-by: Patrick Steinhardt <ps@pks.im>
## midx.c ##
-@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
+@@ midx.c: int prepare_midx_pack(struct multi_pack_index *m,
strbuf_addbuf(&key, &pack_name);
strbuf_strip_suffix(&key, ".idx");
strbuf_addstr(&key, ".pack");
@@ odb.c: struct object_database *odb_new(struct repository *repo)
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ odb.c: void odb_clear(struct object_database *o)
- close_object_store(o);
packfile_store_free(o->packfiles);
+ o->packfiles = NULL;
- hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
5: a87f102193 ! 5: 030e34967d odb: move MRU list of packfiles into `struct packfile_store`
@@ Commit message
Signed-off-by: Patrick Steinhardt <ps@pks.im>
## midx.c ##
-@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
- p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
+@@ midx.c: int prepare_midx_pack(struct multi_pack_index *m,
+ m->source->local);
if (p) {
install_packed_git(r, p);
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
@@ odb.c: void odb_clear(struct object_database *o)
- INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
packfile_store_free(o->packfiles);
-
+ o->packfiles = NULL;
## odb.h ##
@@
6: 7b43149530 ! 6: bda7ae0259 odb: move kept cache into `struct packfile_store`
@@ packfile.h: struct packfile_store {
*/
struct packed_git *packs;
++ /*
++ * Cache of packfiles which are marked as "kept", either because there
++ * is an on-disk ".keep" file or because they are marked as "kept" in
++ * memory.
++ *
++ * Should not be accessed directly, but via `kept_pack_cache()`. The
++ * list of packs gets invalidated when the stored flags and the flags
++ * passed to `kept_pack_cache()` mismatch.
++ */
+ struct {
+ struct packed_git **packs;
+ unsigned flags;
7: 23d8d87330 ! 7: 0ed3c2f690 packfile: reorder functions to avoid function declaration
@@ Commit message
Signed-off-by: Patrick Steinhardt <ps@pks.im>
## packfile.c ##
-@@ packfile.c: static void prepare_packed_git_one(struct odb_source *source, int local)
+@@ packfile.c: static void prepare_packed_git_one(struct odb_source *source)
string_list_clear(data.garbage, 0);
}
8: 1623a5682e ! 8: c39180e7d2 packfile: refactor `prepare_packed_git()` to work on packfile store
@@ packfile.c: static int sort_pack(const struct packed_git *a, const struct packed
- odb_prepare_alternates(r->objects);
- for (source = r->objects->sources; source; source = source->next) {
-- int local = (source == r->objects->sources);
+ odb_prepare_alternates(store->odb);
+ for (source = store->odb->sources; source; source = source->next) {
-+ int local = (source == store->odb->sources);
- prepare_multi_pack_index_one(source, local);
- prepare_packed_git_one(source, local);
+ prepare_multi_pack_index_one(source);
+ prepare_packed_git_one(source);
}
- rearrange_packed_git(r);
+ sort_packs(&store->packs, sort_pack);
@@ packfile.c: int find_pack_entry(struct repository *r, const struct object_id *oi
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next)
- if (source->midx && fill_midx_entry(r, oid, e, source->midx))
+ if (source->midx && fill_midx_entry(source->midx, oid, e))
9: 1ef4f6b1da ! 9: c39986a035 packfile: split up responsibilities of `reprepare_packed_git()`
@@ odb.c: void odb_clear(struct object_database *o)
+{
+ struct odb_source *source;
+
++ obj_read_lock();
++
+ /*
+ * Reprepare alt odbs, in case the alternates file was modified
+ * during the course of this process. This only _adds_ odbs to
@@ odb.h: struct object_database {
+void odb_reprepare(struct object_database *o);
+
/*
- * Find source by its object directory path. Dies in case the source couldn't
- * be found.
+ * Find source by its object directory path. Returns a `NULL` pointer in case
+ * the source could not be found.
## packfile.c ##
@@ packfile.c: static void packfile_store_prepare(struct packfile_store *store)
10: 3da8d69d11 ! 10: 8bc0e68c2f packfile: refactor `install_packed_git()` to work on packfile store
@@ http.h: int finish_http_pack_request(struct http_pack_request *preq);
* from http_get_info_packs() and have chosen a specific pack to fetch.
## midx.c ##
-@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
- if (!p) {
- p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
+@@ midx.c: int prepare_midx_pack(struct multi_pack_index *m,
+ p = add_packed_git(r, pack_name.buf, pack_name.len,
+ m->source->local);
if (p) {
- install_packed_git(r, p);
+ packfile_store_add_pack(r->objects->packfiles, p);
11: 5c98d84581 < -: ---------- packfile: always add packfiles to MRU when adding a pack
12: d9bca4e7cf ! 11: 0ea8488379 packfile: introduce function to load and add packfiles
@@ builtin/index-pack.c: static void final(const char *final_pack_name, const char
printf("%s\n", hash_to_hex(hash));
## midx.c ##
-@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
- uint32_t pack_int_id)
+@@ midx.c: int prepare_midx_pack(struct multi_pack_index *m,
{
+ struct repository *r = m->source->odb->repo;
struct strbuf pack_name = STRBUF_INIT;
- struct strbuf key = STRBUF_INIT;
struct packed_git *p;
pack_int_id = midx_for_pack(&m, pack_int_id);
-@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
+@@ midx.c: int prepare_midx_pack(struct multi_pack_index *m,
- strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
+ strbuf_addf(&pack_name, "%s/pack/%s", m->source->path,
m->pack_names[pack_int_id]);
-
- /* pack_map holds the ".pack" name, but we have the .idx */
@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
- strhash(key.buf), key.buf,
- struct packed_git, packmap_ent);
- if (!p) {
-- p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
-- if (p)
+- p = add_packed_git(r, pack_name.buf, pack_name.len,
+- m->source->local);
+- if (p) {
- packfile_store_add_pack(r->objects->packfiles, p);
+- list_add_tail(&p->mru, &r->objects->packfiles->mru);
+- }
- }
-
+ p = packfile_store_load_pack(r->objects->packfiles,
-+ pack_name.buf, m->local);
++ pack_name.buf, m->source->local);
++ if (p)
++ list_add_tail(&p->mru, &r->objects->packfiles->mru);
strbuf_release(&pack_name);
- strbuf_release(&key);
@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
## packfile.c ##
@@ packfile.c: void packfile_store_add_pack(struct packfile_store *store,
- list_add_tail(&pack->mru, &store->mru);
+ hashmap_add(&store->map, &pack->packmap_ent);
}
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
13: bfda0a35f8 ! 12: 34a1b17424 packfile: move `get_multi_pack_index()` into "midx.c"
@@ midx.c: static int midx_read_object_offsets(const unsigned char *chunk_start,
+ return source->midx;
+}
+
- static struct multi_pack_index *load_multi_pack_index_one(struct repository *r,
- const char *object_dir,
- const char *midx_name,
+ static struct multi_pack_index *load_multi_pack_index_one(struct odb_source *source,
+ const char *midx_name)
+ {
## midx.h ##
-@@ midx.h: void get_split_midx_filename_ext(const struct git_hash_algo *hash_algo,
- struct strbuf *buf, const char *object_dir,
+@@ midx.h: void get_midx_chain_filename(struct odb_source *source, struct strbuf *out);
+ void get_split_midx_filename_ext(struct odb_source *source, struct strbuf *buf,
const unsigned char *hash, const char *ext);
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
-+
- struct multi_pack_index *load_multi_pack_index(struct repository *r,
- const char *object_dir,
- int local);
+ struct multi_pack_index *load_multi_pack_index(struct odb_source *source);
+ int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id);
+ struct packed_git *nth_midxed_pack(struct multi_pack_index *m,
## packfile.c ##
@@ packfile.c: static void packfile_store_prepare_mru(struct packfile_store *store)
14: 8d54b4034b ! 13: e039ab7770 packfile: remove `get_packed_git()`
@@ Commit message
- `get_all_packs()` calls `prepare_packed_git()`, as well, but also
calls `prepare_midx_pack()` for each pack.
- This means that the latter function also properly loads the info of
- whether or not a packfile is part of a multi-pack index. Preparing this
- extra information also shouldn't be significantly more expensive:
+ Based on the naming alone one might think that `get_all_packs()` would
+ return more packs than `get_packed_git()`. But that's not the case: both
+ functions end up returning the exact same list of packfiles. The real
+ difference between those functions is that `get_all_packs()` also loads
+ the info of whether or not a packfile is part of a multi-pack index.
+
+ Preparing this extra information also shouldn't be significantly more
+ expensive:
- We have already loaded all packfiles via `prepare_packed_git_one()`.
So given that multi-pack indices may only refer to packfiles in the
15: b04c2e4f7d ! 14: 65690526c2 packfile: refactor `get_all_packs()` to work on packfile store
@@ Commit message
## builtin/cat-file.c ##
@@ builtin/cat-file.c: static void batch_each_object(struct batch_options *opt,
+
+ if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter,
batch_one_object_bitmapped, &payload)) {
++ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *pack;
- for (pack = get_all_packs(the_repository); pack; pack = pack->next) {
-+ for (pack = packfile_store_get_packs(the_repository->objects->packfiles); pack; pack = pack->next) {
++ for (pack = packfile_store_get_packs(packs); pack; pack = pack->next) {
if (bitmap_index_contains_pack(bitmap, pack) ||
open_pack_index(pack))
continue;
## builtin/count-objects.c ##
+@@ builtin/count-objects.c: int cmd_count_objects(int argc,
+ count_loose, count_cruft, NULL, NULL);
+
+ if (verbose) {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p;
+ unsigned long num_pack = 0;
+ off_t size_pack = 0;
@@ builtin/count-objects.c: int cmd_count_objects(int argc,
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local)
continue;
if (open_pack_index(p))
## builtin/fast-import.c ##
+@@ builtin/fast-import.c: static int store_object(
+ struct object_id *oidout,
+ uintmax_t mark)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ void *out, *delta;
+ struct object_entry *e;
+ unsigned char hdr[96];
@@ builtin/fast-import.c: static int store_object(
if (e->idx.offset) {
duplicate_count_by_type[type]++;
return 1;
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
-+ } else if (find_oid_pack(&oid, packfile_store_get_packs(the_repository->objects->packfiles))) {
++ } else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) {
e->type = type;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
+@@ builtin/fast-import.c: static void truncate_pack(struct hashfile_checkpoint *checkpoint)
+
+ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ size_t in_sz = 64 * 1024, out_sz = 64 * 1024;
+ unsigned char *in_buf = xmalloc(in_sz);
+ unsigned char *out_buf = xmalloc(out_sz);
@@ builtin/fast-import.c: static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
duplicate_count_by_type[OBJ_BLOB]++;
truncate_pack(&checkpoint);
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
-+ } else if (find_oid_pack(&oid, packfile_store_get_packs(the_repository->objects->packfiles))) {
++ } else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) {
e->type = OBJ_BLOB;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
## builtin/fsck.c ##
-@@ builtin/fsck.c: static int check_pack_rev_indexes(struct repository *r, int show_progress)
+@@ builtin/fsck.c: static int mark_packed_for_connectivity(const struct object_id *oid,
+
+ static int check_pack_rev_indexes(struct repository *r, int show_progress)
+ {
++ struct packfile_store *packs = r->objects->packfiles;
+ struct progress *progress = NULL;
+ uint32_t pack_count = 0;
int res = 0;
if (show_progress) {
- for (struct packed_git *p = get_all_packs(r); p; p = p->next)
-+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next)
++ for (struct packed_git *p = packfile_store_get_packs(packs); p; p = p->next)
pack_count++;
progress = start_delayed_progress(the_repository,
"Verifying reverse pack-indexes", pack_count);
@@ builtin/fsck.c: static int check_pack_rev_indexes(struct repository *r, int show
}
- for (struct packed_git *p = get_all_packs(r); p; p = p->next) {
-+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
++ for (struct packed_git *p = packfile_store_get_packs(packs); p; p = p->next) {
int load_error = load_pack_revindex_from_disk(p);
if (load_error < 0) {
+@@ builtin/fsck.c: int cmd_fsck(int argc,
+ for_each_packed_object(the_repository,
+ mark_packed_for_connectivity, NULL, 0);
+ } else {
++ struct packfile_store *packs = the_repository->objects->packfiles;
++
+ odb_prepare_alternates(the_repository->objects);
+ for (source = the_repository->objects->sources; source; source = source->next)
+ fsck_source(source);
@@ builtin/fsck.c: int cmd_fsck(int argc,
struct progress *progress = NULL;
if (show_progress) {
- for (p = get_all_packs(the_repository); p;
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p;
++ for (p = packfile_store_get_packs(packs); p;
p = p->next) {
if (open_pack_index(p))
continue;
@@ builtin/fsck.c: int cmd_fsck(int argc,
_("Checking objects"), total);
}
- for (p = get_all_packs(the_repository); p;
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p;
++ for (p = packfile_store_get_packs(packs); p;
p = p->next) {
/* verify gives error messages itself */
if (verify_pack(the_repository,
## builtin/gc.c ##
-@@ builtin/gc.c: static struct packed_git *find_base_packs(struct string_list *packs,
+@@ builtin/gc.c: static int too_many_loose_objects(struct gc_config *cfg)
+ static struct packed_git *find_base_packs(struct string_list *packs,
+ unsigned long limit)
{
++ struct packfile_store *packfiles = the_repository->objects->packfiles;
struct packed_git *p, *base = NULL;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packfiles); p; p = p->next) {
if (!p->pack_local || p->is_cruft)
continue;
if (limit) {
-@@ builtin/gc.c: static int too_many_packs(struct gc_config *cfg)
+@@ builtin/gc.c: static struct packed_git *find_base_packs(struct string_list *packs,
+
+ static int too_many_packs(struct gc_config *cfg)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p;
+ int cnt;
+
if (cfg->gc_auto_pack_limit <= 0)
return 0;
- for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
-+ for (cnt = 0, p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (cnt = 0, p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local)
continue;
if (p->pack_keep)
@@ builtin/gc.c: static off_t get_auto_pack_size(void)
max_size = p->pack_size;
## builtin/pack-objects.c ##
+@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b)
+
+ static void read_packs_list_from_stdin(struct rev_info *revs)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct strbuf buf = STRBUF_INIT;
+ struct string_list include_packs = STRING_LIST_INIT_DUP;
+ struct string_list exclude_packs = STRING_LIST_INIT_DUP;
@@ builtin/pack-objects.c: static void read_packs_list_from_stdin(struct rev_info *revs)
string_list_sort(&exclude_packs);
string_list_remove_duplicates(&exclude_packs, 0);
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
const char *pack_name = pack_basename(p);
if ((item = string_list_lookup(&include_packs, pack_name)))
+@@ builtin/pack-objects.c: static void enumerate_cruft_objects(void)
+
+ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p;
+ struct rev_info revs;
+ int ret;
@@ builtin/pack-objects.c: static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs
* Re-mark only the fresh packs as kept so that objects in
* unknown packs do not halt the reachability traversal early.
*/
- for (p = get_all_packs(the_repository); p; p = p->next)
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
++ for (p = packfile_store_get_packs(packs); p; p = p->next)
p->pack_keep_in_core = 0;
mark_pack_kept_in_core(fresh_packs, 1);
+@@ builtin/pack-objects.c: static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs
+
+ static void read_cruft_objects(void)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct strbuf buf = STRBUF_INIT;
+ struct string_list discard_packs = STRING_LIST_INIT_DUP;
+ struct string_list fresh_packs = STRING_LIST_INIT_DUP;
@@ builtin/pack-objects.c: static void read_cruft_objects(void)
string_list_sort(&discard_packs);
string_list_sort(&fresh_packs);
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
const char *pack_name = pack_basename(p);
struct string_list_item *item;
-@@ builtin/pack-objects.c: static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
+@@ builtin/pack-objects.c: static void add_unreachable_loose_objects(struct rev_info *revs)
+
+ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ static struct packed_git *last_found = (void *)1;
struct packed_git *p;
p = (last_found != (void *)1) ? last_found :
- get_all_packs(the_repository);
-+ packfile_store_get_packs(the_repository->objects->packfiles);
++ packfile_store_get_packs(packs);
while (p) {
if ((!p->pack_local || p->pack_keep ||
@@ builtin/pack-objects.c: static int has_sha1_pack_kept_or_nonlocal(const struct o
}
if (p == last_found)
- p = get_all_packs(the_repository);
-+ p = packfile_store_get_packs(the_repository->objects->packfiles);
++ p = packfile_store_get_packs(packs);
else
p = p->next;
if (p == last_found)
-@@ builtin/pack-objects.c: static void loosen_unused_packed_objects(void)
+@@ builtin/pack-objects.c: static int loosened_object_can_be_discarded(const struct object_id *oid,
+
+ static void loosen_unused_packed_objects(void)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p;
+ uint32_t i;
uint32_t loosened_objects_nr = 0;
struct object_id oid;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local || p->pack_keep || p->pack_keep_in_core)
continue;
-@@ builtin/pack-objects.c: static void add_extra_kept_packs(const struct string_list *names)
+@@ builtin/pack-objects.c: static void get_object_list(struct rev_info *revs, int ac, const char **av)
+
+ static void add_extra_kept_packs(const struct string_list *names)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p;
+
if (!names->nr)
return;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
const char *name = basename(p->pack_name);
int i;
@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
+
add_extra_kept_packs(&keep_pack_list);
if (ignore_packed_keep_on_disk) {
++ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next)
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
++
++ for (p = packfile_store_get_packs(packs); p; p = p->next)
if (p->pack_local && p->pack_keep)
break;
if (!p) /* no keep-able packs found */
@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
+ * want to unset "local" based on looking at packs, as
* it also covers non-local objects
*/
++ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local) {
have_non_local_packs = 1;
break;
@@ builtin/pack-redundant.c: static struct pack_list * add_pack(struct packed_git *
static struct pack_list * add_pack_file(const char *filename)
{
- struct packed_git *p = get_all_packs(the_repository);
-+ struct packed_git *p = packfile_store_get_packs(the_repository->objects->packfiles);
++ struct packfile_store *packs = the_repository->objects->packfiles;
++ struct packed_git *p = packfile_store_get_packs(packs);
if (strlen(filename) < 40)
die("Bad pack filename: %s", filename);
@@ builtin/pack-redundant.c: static struct pack_list * add_pack_file(const char *fi
static void load_all(void)
{
- struct packed_git *p = get_all_packs(the_repository);
-+ struct packed_git *p = packfile_store_get_packs(the_repository->objects->packfiles);
++ struct packfile_store *packs = the_repository->objects->packfiles;
++ struct packed_git *p = packfile_store_get_packs(packs);
while (p) {
add_pack(p);
## builtin/repack.c ##
-@@ builtin/repack.c: static void collect_pack_filenames(struct existing_packs *existing,
+@@ builtin/repack.c: static void existing_packs_release(struct existing_packs *existing)
+ static void collect_pack_filenames(struct existing_packs *existing,
+ const struct string_list *extra_keep)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
int i;
const char *base;
@@ builtin/repack.c: static void init_pack_geometry(struct pack_geometry *geometry,
+ struct existing_packs *existing,
+ const struct pack_objects_args *args)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (args->local && !p->pack_local)
/*
* When asked to only repack local packfiles we skip
-@@ builtin/repack.c: static void combine_small_cruft_packs(FILE *in, size_t combine_cruft_below_size,
+@@ builtin/repack.c: static int write_filtered_pack(const struct pack_objects_args *args,
+ static void combine_small_cruft_packs(FILE *in, size_t combine_cruft_below_size,
+ struct existing_packs *existing)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
size_t i;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!(p->is_cruft && p->pack_local))
continue;
## connected.c ##
@@ connected.c: int check_connected(oid_iterate_fn fn, void *cb_data,
+ */
+ odb_reprepare(the_repository->objects);
do {
++ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_promisor)
continue;
if (find_pack_entry_one(oid, p))
## http-backend.c ##
-@@ http-backend.c: static void get_info_packs(struct strbuf *hdr, char *arg UNUSED)
+@@ http-backend.c: static void get_head(struct strbuf *hdr, char *arg UNUSED)
+ static void get_info_packs(struct strbuf *hdr, char *arg UNUSED)
+ {
+ size_t objdirlen = strlen(repo_get_object_directory(the_repository));
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct strbuf buf = STRBUF_INIT;
+ struct packed_git *p;
size_t cnt = 0;
select_getanyfile(hdr);
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (p->pack_local)
cnt++;
}
strbuf_grow(&buf, cnt * 53 + 2);
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (p->pack_local)
strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
}
## http.c ##
+@@ http.c: static char *fetch_pack_index(unsigned char *hash, const char *base_url)
+ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
+ unsigned char *sha1, const char *base_url)
+ {
++ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *new_pack, *p;
+ char *tmp_idx = NULL;
+ int ret;
@@ http.c: static int fetch_and_setup_pack_index(struct packed_git **packs_head,
* If we already have the pack locally, no need to fetch its index or
* even add it to list; we already have all of its objects.
*/
- for (p = get_all_packs(the_repository); p; p = p->next) {
-+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (hasheq(p->hash, sha1, the_repository->hash_algo))
return 0;
}
@@ pack-bitmap.c: int verify_bitmap_files(struct repository *r)
res |= verify_bitmap_file(r->hash_algo, pack_bitmap_name);
## pack-objects.c ##
+@@ pack-objects.c: struct object_entry *packlist_find(struct packing_data *pdata,
+
+ static void prepare_in_pack_by_idx(struct packing_data *pdata)
+ {
++ struct packfile_store *packs = pdata->repo->objects->packfiles;
+ struct packed_git **mapping, *p;
+ int cnt = 0, nr = 1U << OE_IN_PACK_BITS;
+
@@ pack-objects.c: static void prepare_in_pack_by_idx(struct packing_data *pdata)
* (i.e. in_pack_idx also zero) should return NULL.
*/
mapping[cnt++] = NULL;
- for (p = get_all_packs(pdata->repo); p; p = p->next, cnt++) {
-+ for (p = packfile_store_get_packs(pdata->repo->objects->packfiles); p; p = p->next, cnt++) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next, cnt++) {
if (cnt == nr) {
free(mapping);
return;
@@ packfile.c: void packfile_store_reprepare(struct packfile_store *store)
struct multi_pack_index *m = source->midx;
if (!m)
continue;
- for (uint32_t i = 0; i < m->num_packs + m->num_packs_in_base; i++)
-- prepare_midx_pack(r, m, i);
-+ prepare_midx_pack(store->odb->repo, m, i);
+@@ packfile.c: struct packed_git *get_all_packs(struct repository *r)
+ prepare_midx_pack(m, i);
}
- return r->objects->packfiles->packs;
@@ packfile.h: int for_each_packed_object(struct repository *repo, each_packed_obje
* Give a rough count of objects in the repository. This sacrifices accuracy
## server-info.c ##
-@@ server-info.c: static void init_pack_info(struct repository *r, const char *infofile, int force
+@@ server-info.c: static int compare_info(const void *a_, const void *b_)
+
+ static void init_pack_info(struct repository *r, const char *infofile, int force)
+ {
++ struct packfile_store *packs = r->objects->packfiles;
+ struct packed_git *p;
+ int stale;
int i;
size_t alloc = 0;
- for (p = get_all_packs(r); p; p = p->next) {
-+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
++ for (p = packfile_store_get_packs(packs); p; p = p->next) {
/* we ignore things on alternate path since they are
* not available to the pullers in general.
*/
16: 594d14487e = 15: 1315247fd2 packfile: refactor `get_packed_git_mru()` to work on packfile store
---
base-commit: 337c7a0bbcf228ce11c87d066ecee352b3e52467
change-id: 20250806-b4-pks-packfiles-store-a44a608ca396
* [PATCH v3 01/15] packfile: introduce a new `struct packfile_store`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 02/15] odb: move list of packfiles into " Patrick Steinhardt
` (14 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
Information about an object database's packfiles is currently distributed
across two different structures:
- `struct packed_git` contains the `next` pointer as well as the
`mru_head`, both of which serve to store the list of packfiles.
- `struct object_database` contains several fields that relate to the
packfiles.
So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.
Introduce a new `struct packfile_store` which is about to become the
single source of truth for managing packfiles. Right now this data
structure doesn't yet contain anything, but in subsequent patches we
will move all data structures that relate to packfiles and that are
currently contained in `struct object_database` into this new home.
Note that this is only a first step: most importantly, we won't (yet)
move the `struct packed_git::next` pointer around. This will happen in a
subsequent patch series though so that `struct packed_git` will really
only host information about the specific packfile it represents.
Further note that the new structure still sits at the wrong level at the
end of this patch series: as mentioned, it should eventually sit at the
level of the object database source, not at the object database level.
But introducing the packfile store now already makes it much easier to
eventually push down the now-self-contained data structure by one level.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.c | 1 +
odb.h | 3 ++-
packfile.c | 13 +++++++++++++
packfile.h | 18 ++++++++++++++++++
4 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/odb.c b/odb.c
index 75c443fe66..a2289ea97d 100644
--- a/odb.c
+++ b/odb.c
@@ -996,6 +996,7 @@ struct object_database *odb_new(struct repository *repo)
memset(o, 0, sizeof(*o));
o->repo = repo;
+ o->packfiles = packfile_store_new(o);
INIT_LIST_HEAD(&o->packed_git_mru);
hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
pthread_mutex_init(&o->replace_mutex, NULL);
diff --git a/odb.h b/odb.h
index 51fe8a5a92..33034eaf2f 100644
--- a/odb.h
+++ b/odb.h
@@ -91,6 +91,7 @@ struct odb_source {
};
struct packed_git;
+struct packfile_store;
struct cached_object_entry;
/*
@@ -136,7 +137,7 @@ struct object_database {
*
* should only be accessed directly by packfile.c
*/
-
+ struct packfile_store *packfiles;
struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
struct list_head packed_git_mru;
diff --git a/packfile.c b/packfile.c
index acb680966d..130d3e2507 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2332,3 +2332,16 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
*len = hdr - out;
return 0;
}
+
+struct packfile_store *packfile_store_new(struct object_database *odb)
+{
+ struct packfile_store *store;
+ CALLOC_ARRAY(store, 1);
+ store->odb = odb;
+ return store;
+}
+
+void packfile_store_free(struct packfile_store *store)
+{
+ free(store);
+}
diff --git a/packfile.h b/packfile.h
index f16753f2a9..8d31fd619a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -52,6 +52,24 @@ struct packed_git {
char pack_name[FLEX_ARRAY]; /* more */
};
+/*
+ * A store that manages packfiles for a given object database.
+ */
+struct packfile_store {
+ struct object_database *odb;
+};
+
+/*
+ * Allocate and initialize a new empty packfile store for the given object
+ * database.
+ */
+struct packfile_store *packfile_store_new(struct object_database *odb);
+
+/*
+ * Free the packfile store and all its associated state.
+ */
+void packfile_store_free(struct packfile_store *store);
+
static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
const struct hashmap_entry *entry,
const struct hashmap_entry *entry2,
--
2.51.0.384.g4c02a37b29.dirty
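
For readers following along, the ownership relation introduced by this patch can be sketched in isolation. This is a minimal standalone sketch, not the actual Git code: the structs are stripped-down stand-ins, plain `calloc()` replaces `CALLOC_ARRAY()`, and `odb_init()` and `odb_release()` are hypothetical helpers that only mimic what `odb_new()` and (after later patches in the series) `odb_clear()` do with the store.

```c
#include <stdlib.h>

/* Simplified stand-ins; the real structs carry many more fields. */
struct packfile_store;

struct object_database {
	struct packfile_store *packfiles;
};

struct packfile_store {
	struct object_database *odb; /* back-pointer to the owning database */
};

/* Allocate a zero-initialized store that belongs to `odb`. */
static struct packfile_store *packfile_store_new(struct object_database *odb)
{
	struct packfile_store *store = calloc(1, sizeof(*store));
	if (!store)
		abort(); /* Git's CALLOC_ARRAY() dies on allocation failure */
	store->odb = odb;
	return store;
}

static void packfile_store_free(struct packfile_store *store)
{
	free(store);
}

/* Hypothetical helper: create the store alongside the database. */
static void odb_init(struct object_database *o)
{
	o->packfiles = packfile_store_new(o);
}

/* Hypothetical helper: tear the store down and clear the pointer. */
static void odb_release(struct object_database *o)
{
	packfile_store_free(o->packfiles);
	o->packfiles = NULL;
}
```

The back-pointer is what lets later patches reach the repository from a store via `store->odb->repo` instead of passing a `struct repository` around.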
* [PATCH v3 02/15] odb: move list of packfiles into `struct packfile_store`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 01/15] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 03/15] odb: move initialization bit " Patrick Steinhardt
` (13 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The object database tracks the list of packfiles it currently knows
about. With the introduction of the `struct packfile_store` we have a
better place to host this list though.
Move the list accordingly. Extract the logic from `odb_clear()` that
knows to close all such packfiles and move it into the new subsystem, as
well.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.c | 12 ++----------
odb.h | 1 -
packfile.c | 46 +++++++++++++++++++++++++++++-----------------
packfile.h | 15 ++++++++++++++-
4 files changed, 45 insertions(+), 29 deletions(-)
diff --git a/odb.c b/odb.c
index a2289ea97d..7201d01406 100644
--- a/odb.c
+++ b/odb.c
@@ -1038,16 +1038,8 @@ void odb_clear(struct object_database *o)
INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
-
- /*
- * `close_object_store()` only closes the packfiles, but doesn't free
- * them. We thus have to do this manually.
- */
- for (struct packed_git *p = o->packed_git, *next; p; p = next) {
- next = p->next;
- free(p);
- }
- o->packed_git = NULL;
+ packfile_store_free(o->packfiles);
+ o->packfiles = NULL;
hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
diff --git a/odb.h b/odb.h
index 33034eaf2f..22a170b434 100644
--- a/odb.h
+++ b/odb.h
@@ -138,7 +138,6 @@ struct object_database {
* should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- struct packed_git *packed_git;
/* A most-recently-used ordered version of the packed_git list. */
struct list_head packed_git_mru;
diff --git a/packfile.c b/packfile.c
index 130d3e2507..43e9a7cb45 100644
--- a/packfile.c
+++ b/packfile.c
@@ -278,7 +278,7 @@ static int unuse_one_window(struct packed_git *current)
if (current)
scan_windows(current, &lru_p, &lru_w, &lru_l);
- for (p = current->repo->objects->packed_git; p; p = p->next)
+ for (p = current->repo->objects->packfiles->packs; p; p = p->next)
scan_windows(p, &lru_p, &lru_w, &lru_l);
if (lru_p) {
munmap(lru_w->base, lru_w->len);
@@ -362,13 +362,8 @@ void close_pack(struct packed_git *p)
void close_object_store(struct object_database *o)
{
struct odb_source *source;
- struct packed_git *p;
- for (p = o->packed_git; p; p = p->next)
- if (p->do_not_close)
- BUG("want to close pack marked 'do-not-close'");
- else
- close_pack(p);
+ packfile_store_close(o->packfiles);
for (source = o->sources; source; source = source->next) {
if (source->midx)
@@ -468,7 +463,7 @@ static int close_one_pack(struct repository *r)
struct pack_window *mru_w = NULL;
int accept_windows_inuse = 1;
- for (p = r->objects->packed_git; p; p = p->next) {
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
if (p->pack_fd == -1)
continue;
find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
@@ -789,8 +784,8 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
if (pack->pack_fd != -1)
pack_open_fds++;
- pack->next = r->objects->packed_git;
- r->objects->packed_git = pack;
+ pack->next = r->objects->packfiles->packs;
+ r->objects->packfiles->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
@@ -974,7 +969,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
count += m->num_objects;
}
- for (p = r->objects->packed_git; p; p = p->next) {
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
if (open_pack_index(p))
continue;
count += p->num_objects;
@@ -1015,7 +1010,7 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
static void rearrange_packed_git(struct repository *r)
{
- sort_packs(&r->objects->packed_git, sort_pack);
+ sort_packs(&r->objects->packfiles->packs, sort_pack);
}
static void prepare_packed_git_mru(struct repository *r)
@@ -1024,7 +1019,7 @@ static void prepare_packed_git_mru(struct repository *r)
INIT_LIST_HEAD(&r->objects->packed_git_mru);
- for (p = r->objects->packed_git; p; p = p->next)
+ for (p = r->objects->packfiles->packs; p; p = p->next)
list_add_tail(&p->mru, &r->objects->packed_git_mru);
}
@@ -1073,7 +1068,7 @@ void reprepare_packed_git(struct repository *r)
struct packed_git *get_packed_git(struct repository *r)
{
prepare_packed_git(r);
- return r->objects->packed_git;
+ return r->objects->packfiles->packs;
}
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
@@ -1094,7 +1089,7 @@ struct packed_git *get_all_packs(struct repository *r)
prepare_midx_pack(m, i);
}
- return r->objects->packed_git;
+ return r->objects->packfiles->packs;
}
struct list_head *get_packed_git_mru(struct repository *r)
@@ -1219,7 +1214,7 @@ const struct packed_git *has_packed_and_bad(struct repository *r,
{
struct packed_git *p;
- for (p = r->objects->packed_git; p; p = p->next)
+ for (p = r->objects->packfiles->packs; p; p = p->next)
if (oidset_contains(&p->bad_objects, oid))
return p;
return NULL;
@@ -2080,7 +2075,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
if (source->midx && fill_midx_entry(source->midx, oid, e))
return 1;
- if (!r->objects->packed_git)
+ if (!r->objects->packfiles->packs)
return 0;
list_for_each(pos, &r->objects->packed_git_mru) {
@@ -2343,5 +2338,22 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
void packfile_store_free(struct packfile_store *store)
{
+ packfile_store_close(store);
+
+ for (struct packed_git *p = store->packs, *next; p; p = next) {
+ next = p->next;
+ free(p);
+ }
+
free(store);
}
+
+void packfile_store_close(struct packfile_store *store)
+{
+ for (struct packed_git *p = store->packs; p; p = p->next) {
+ if (p->do_not_close)
+ BUG("want to close pack marked 'do-not-close'");
+ else
+ close_pack(p);
+ }
+}
diff --git a/packfile.h b/packfile.h
index 8d31fd619a..d7ac8d24b4 100644
--- a/packfile.h
+++ b/packfile.h
@@ -57,6 +57,12 @@ struct packed_git {
*/
struct packfile_store {
struct object_database *odb;
+
+ /*
+ * The list of packfiles in the order in which they are being added to
+ * the store.
+ */
+ struct packed_git *packs;
};
/*
@@ -66,10 +72,17 @@ struct packfile_store {
struct packfile_store *packfile_store_new(struct object_database *odb);
/*
- * Free the packfile store and all its associated state.
+ * Free the packfile store and all its associated state. All packfiles
+ * tracked by the store will be closed.
*/
void packfile_store_free(struct packfile_store *store);
+/*
+ * Close all packfiles associated with this store. The packfiles won't be
+ * free'd, so they can be re-opened at a later point in time.
+ */
+void packfile_store_close(struct packfile_store *store);
+
static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
const struct hashmap_entry *entry,
const struct hashmap_entry *entry2,
--
2.51.0.384.g4c02a37b29.dirty
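
The difference between `packfile_store_close()` and `packfile_store_free()` may be easier to see in a reduced form. The following is a sketch under simplifying assumptions: `close_pack()` is a stand-in that only resets the descriptor field, the `do_not_close` check is omitted, and `store_add()` is a hypothetical helper that prepends to the list the way `install_packed_git()` does.

```c
#include <stdlib.h>

struct packed_git {
	struct packed_git *next;
	int pack_fd; /* -1 when the pack is closed */
};

struct packfile_store {
	struct packed_git *packs;
};

/* Stand-in for close_pack(): drop the descriptor, keep the struct. */
static void close_pack(struct packed_git *p)
{
	p->pack_fd = -1;
}

/* Close every pack, but leave the list intact so packs can be reopened. */
static void packfile_store_close(struct packfile_store *store)
{
	for (struct packed_git *p = store->packs; p; p = p->next)
		close_pack(p);
}

/* Close every pack, then release the list nodes and the store itself. */
static void packfile_store_free(struct packfile_store *store)
{
	packfile_store_close(store);
	for (struct packed_git *p = store->packs, *next; p; p = next) {
		next = p->next; /* save the successor before freeing the node */
		free(p);
	}
	free(store);
}

/* Hypothetical helper: prepend a pack to the store's list. */
static struct packed_git *store_add(struct packfile_store *store, int fd)
{
	struct packed_git *p = calloc(1, sizeof(*p));
	if (!p)
		abort();
	p->pack_fd = fd;
	p->next = store->packs;
	store->packs = p;
	return p;
}
```

After `packfile_store_close()` the list is still walkable; only `packfile_store_free()` invalidates the pointers.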
* [PATCH v3 03/15] odb: move initialization bit into `struct packfile_store`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 01/15] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 02/15] odb: move list of packfiles into " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 04/15] odb: move packfile map " Patrick Steinhardt
` (12 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The object database knows to skip re-initializing the list of packfiles
in case it's already been initialized. Whether or not that is the case
is tracked via a separate `initialized` bit that is stored in the object
database. With the introduction of the `struct packfile_store` we have a
better place to host this bit though.
Move it accordingly. While at it, convert the field into a boolean now
that we're allowed to use booleans in our code base.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.h | 6 ------
packfile.c | 6 +++---
packfile.h | 6 ++++++
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/odb.h b/odb.h
index 22a170b434..bf1b4d4677 100644
--- a/odb.h
+++ b/odb.h
@@ -169,12 +169,6 @@ struct object_database {
unsigned long approximate_object_count;
unsigned approximate_object_count_valid : 1;
- /*
- * Whether packed_git has already been populated with this repository's
- * packs.
- */
- unsigned packed_git_initialized : 1;
-
/*
* Submodule source paths that will be added as additional sources to
* allow lookup of submodule objects via the main object database.
diff --git a/packfile.c b/packfile.c
index 43e9a7cb45..0cfeb68b6b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,7 +1027,7 @@ static void prepare_packed_git(struct repository *r)
{
struct odb_source *source;
- if (r->objects->packed_git_initialized)
+ if (r->objects->packfiles->initialized)
return;
odb_prepare_alternates(r->objects);
@@ -1038,7 +1038,7 @@ static void prepare_packed_git(struct repository *r)
rearrange_packed_git(r);
prepare_packed_git_mru(r);
- r->objects->packed_git_initialized = 1;
+ r->objects->packfiles->initialized = true;
}
void reprepare_packed_git(struct repository *r)
@@ -1060,7 +1060,7 @@ void reprepare_packed_git(struct repository *r)
odb_clear_loose_cache(source);
r->objects->approximate_object_count_valid = 0;
- r->objects->packed_git_initialized = 0;
+ r->objects->packfiles->initialized = false;
prepare_packed_git(r);
obj_read_unlock();
}
diff --git a/packfile.h b/packfile.h
index d7ac8d24b4..cf81091175 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,6 +63,12 @@ struct packfile_store {
* the store.
*/
struct packed_git *packs;
+
+ /*
+ * Whether packfiles have already been populated with this store's
+ * packs.
+ */
+ bool initialized;
};
/*
--
2.51.0.384.g4c02a37b29.dirty
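
The role of the `initialized` bit can be illustrated with a reduced sketch. The names `prepare()` and `reprepare()` and the `scans` counter are hypothetical and exist only for this illustration; the counter stands in for the directory scan that `prepare_packed_git()` performs.

```c
#include <stdbool.h>

struct packfile_store {
	bool initialized;
	unsigned scans; /* hypothetical: counts how often we "scanned" */
};

/* Populate the store once; later calls are no-ops. */
static void prepare(struct packfile_store *store)
{
	if (store->initialized)
		return;
	store->scans++; /* stand-in for scanning the pack directories */
	store->initialized = true;
}

/* Force a rescan by clearing the bit before preparing again. */
static void reprepare(struct packfile_store *store)
{
	store->initialized = false;
	prepare(store);
}
```

Calling `prepare()` repeatedly is cheap, while `reprepare()` always triggers one more scan.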
* [PATCH v3 04/15] odb: move packfile map into `struct packfile_store`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (2 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 03/15] odb: move initialization bit " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 05/15] odb: move MRU list of packfiles " Patrick Steinhardt
` (11 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The object database tracks a map of packfiles by their respective paths,
which is used to figure out whether a given packfile has already been
loaded. With the introduction of the `struct packfile_store` we have a
better place to host this map though.
Move the map accordingly. `pack_map_entry_cmp()` isn't used anywhere but
in "packfile.c" anymore after this change, so we convert it to a static
function, as well.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 2 +-
odb.c | 2 --
odb.h | 6 ------
packfile.c | 20 ++++++++++++++++++--
packfile.h | 20 ++++++--------------
5 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/midx.c b/midx.c
index 7726c13d7e..e96970efbf 100644
--- a/midx.c
+++ b/midx.c
@@ -460,7 +460,7 @@ int prepare_midx_pack(struct multi_pack_index *m,
strbuf_addbuf(&key, &pack_name);
strbuf_strip_suffix(&key, ".idx");
strbuf_addstr(&key, ".pack");
- p = hashmap_get_entry_from_hash(&r->objects->pack_map,
+ p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
strhash(key.buf), key.buf,
struct packed_git, packmap_ent);
if (!p) {
diff --git a/odb.c b/odb.c
index 7201d01406..737d98c911 100644
--- a/odb.c
+++ b/odb.c
@@ -998,7 +998,6 @@ struct object_database *odb_new(struct repository *repo)
o->repo = repo;
o->packfiles = packfile_store_new(o);
INIT_LIST_HEAD(&o->packed_git_mru);
- hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
pthread_mutex_init(&o->replace_mutex, NULL);
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ -1041,6 +1040,5 @@ void odb_clear(struct object_database *o)
packfile_store_free(o->packfiles);
o->packfiles = NULL;
- hashmap_clear(&o->pack_map);
string_list_clear(&o->submodule_source_paths, 0);
}
diff --git a/odb.h b/odb.h
index bf1b4d4677..73a669b993 100644
--- a/odb.h
+++ b/odb.h
@@ -155,12 +155,6 @@ struct object_database {
struct cached_object_entry *cached_objects;
size_t cached_object_nr, cached_object_alloc;
- /*
- * A map of packfiles to packed_git structs for tracking which
- * packs have been loaded already.
- */
- struct hashmap pack_map;
-
/*
* A fast, rough count of the number of objects in the repository.
* These two fields are not meant for direct access. Use
diff --git a/packfile.c b/packfile.c
index 0cfeb68b6b..60ccdfaafb 100644
--- a/packfile.c
+++ b/packfile.c
@@ -788,7 +788,7 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
r->objects->packfiles->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
- hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
+ hashmap_add(&r->objects->packfiles->map, &pack->packmap_ent);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -901,7 +901,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
hashmap_entry_init(&hent, hash);
/* Don't reopen a pack we already have. */
- if (!hashmap_get(&data->r->objects->pack_map, &hent, pack_name)) {
+ if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
p = add_packed_git(data->r, full_name, full_name_len, data->local);
if (p)
install_packed_git(data->r, p);
@@ -2328,11 +2328,26 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
return 0;
}
+static int pack_map_entry_cmp(const void *cmp_data UNUSED,
+ const struct hashmap_entry *entry,
+ const struct hashmap_entry *entry2,
+ const void *keydata)
+{
+ const char *key = keydata;
+ const struct packed_git *pg1, *pg2;
+
+ pg1 = container_of(entry, const struct packed_git, packmap_ent);
+ pg2 = container_of(entry2, const struct packed_git, packmap_ent);
+
+ return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
+}
+
struct packfile_store *packfile_store_new(struct object_database *odb)
{
struct packfile_store *store;
CALLOC_ARRAY(store, 1);
store->odb = odb;
+ hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
return store;
}
@@ -2345,6 +2360,7 @@ void packfile_store_free(struct packfile_store *store)
free(p);
}
+ hashmap_clear(&store->map);
free(store);
}
diff --git a/packfile.h b/packfile.h
index cf81091175..9bbef51164 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,12 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /*
+ * A map of packfile names to packed_git structs for tracking which
+ * packs have been loaded already.
+ */
+ struct hashmap map;
+
/*
* Whether packfiles have already been populated with this store's
* packs.
@@ -89,20 +95,6 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
-static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *entry,
- const struct hashmap_entry *entry2,
- const void *keydata)
-{
- const char *key = keydata;
- const struct packed_git *pg1, *pg2;
-
- pg1 = container_of(entry, const struct packed_git, packmap_ent);
- pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
- return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.51.0.384.g4c02a37b29.dirty
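
The comparison callback moved by this patch supports two kinds of lookup: by an explicit key string, or entry against entry when no key is given. Below is a standalone sketch of that behaviour. It assumes a stub `struct hashmap_entry`, a plain `const char *` instead of the flex-array `pack_name`, a locally defined `container_of()`, and it drops the unused `cmp_data` parameter; it does not use Git's hashmap implementation.

```c
#include <stddef.h>
#include <string.h>

struct hashmap_entry {
	unsigned hash; /* stub; the real entry has more state */
};

struct packed_git {
	struct hashmap_entry packmap_ent;
	const char *pack_name; /* the real struct uses a flex array */
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Returns 0 when the pack names match, non-zero otherwise. */
static int pack_map_entry_cmp(const struct hashmap_entry *entry,
			      const struct hashmap_entry *entry2,
			      const void *keydata)
{
	const char *key = keydata;
	const struct packed_git *pg1, *pg2;

	pg1 = container_of(entry, const struct packed_git, packmap_ent);
	pg2 = container_of(entry2, const struct packed_git, packmap_ent);

	/* Prefer the explicit key; fall back to the second entry's name. */
	return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
}
```

When a key is passed, the second entry is never dereferenced, which is why `prepare_pack()` can look up a name with only a temporary entry that carries the hash.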
* [PATCH v3 05/15] odb: move MRU list of packfiles into `struct packfile_store`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (3 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 04/15] odb: move packfile map " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 06/15] odb: move kept cache " Patrick Steinhardt
` (10 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The object database tracks the list of packfiles in most-recently-used
order, which is mostly used to favor reading from packfiles that contain
most of the objects that we're currently accessing. With the
introduction of the `struct packfile_store` we have a better place to
host this list though.
Move the list accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx.c | 2 +-
odb.c | 2 --
odb.h | 4 ----
packfile.c | 11 ++++++-----
packfile.h | 3 +++
5 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/midx.c b/midx.c
index e96970efbf..91c7b3917d 100644
--- a/midx.c
+++ b/midx.c
@@ -468,7 +468,7 @@ int prepare_midx_pack(struct multi_pack_index *m,
m->source->local);
if (p) {
install_packed_git(r, p);
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
}
diff --git a/odb.c b/odb.c
index 737d98c911..32e982bf0b 100644
--- a/odb.c
+++ b/odb.c
@@ -997,7 +997,6 @@ struct object_database *odb_new(struct repository *repo)
memset(o, 0, sizeof(*o));
o->repo = repo;
o->packfiles = packfile_store_new(o);
- INIT_LIST_HEAD(&o->packed_git_mru);
pthread_mutex_init(&o->replace_mutex, NULL);
string_list_init_dup(&o->submodule_source_paths);
return o;
@@ -1035,7 +1034,6 @@ void odb_clear(struct object_database *o)
free((char *) o->cached_objects[i].value.buf);
FREE_AND_NULL(o->cached_objects);
- INIT_LIST_HEAD(&o->packed_git_mru);
close_object_store(o);
packfile_store_free(o->packfiles);
o->packfiles = NULL;
diff --git a/odb.h b/odb.h
index 73a669b993..8ee1f8bb43 100644
--- a/odb.h
+++ b/odb.h
@@ -3,7 +3,6 @@
#include "hashmap.h"
#include "object.h"
-#include "list.h"
#include "oidset.h"
#include "oidmap.h"
#include "string-list.h"
@@ -138,9 +137,6 @@ struct object_database {
* should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- /* A most-recently-used ordered version of the packed_git list. */
- struct list_head packed_git_mru;
-
struct {
struct packed_git **packs;
unsigned flags;
diff --git a/packfile.c b/packfile.c
index 60ccdfaafb..98207aa380 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1017,10 +1017,10 @@ static void prepare_packed_git_mru(struct repository *r)
{
struct packed_git *p;
- INIT_LIST_HEAD(&r->objects->packed_git_mru);
+ INIT_LIST_HEAD(&r->objects->packfiles->mru);
for (p = r->objects->packfiles->packs; p; p = p->next)
- list_add_tail(&p->mru, &r->objects->packed_git_mru);
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
static void prepare_packed_git(struct repository *r)
@@ -1095,7 +1095,7 @@ struct packed_git *get_all_packs(struct repository *r)
struct list_head *get_packed_git_mru(struct repository *r)
{
prepare_packed_git(r);
- return &r->objects->packed_git_mru;
+ return &r->objects->packfiles->mru;
}
unsigned long unpack_object_header_buffer(const unsigned char *buf,
@@ -2078,10 +2078,10 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
if (!r->objects->packfiles->packs)
return 0;
- list_for_each(pos, &r->objects->packed_git_mru) {
+ list_for_each(pos, &r->objects->packfiles->mru) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) {
- list_move(&p->mru, &r->objects->packed_git_mru);
+ list_move(&p->mru, &r->objects->packfiles->mru);
return 1;
}
}
@@ -2347,6 +2347,7 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
struct packfile_store *store;
CALLOC_ARRAY(store, 1);
store->odb = odb;
+ INIT_LIST_HEAD(&store->mru);
hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
return store;
}
diff --git a/packfile.h b/packfile.h
index 9bbef51164..d48d46cc1b 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,9 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /* A most-recently-used ordered version of the packs list. */
+ struct list_head mru;
+
/*
* A map of packfile names to packed_git structs for tracking which
* packs have been loaded already.
--
2.51.0.384.g4c02a37b29.dirty
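
The most-recently-used behaviour relies on a circular doubly linked list where a hit moves the pack to the front, as `find_pack_entry()` does with `list_move()`. Here is a minimal sketch of such a list. It is written from scratch for illustration and only resembles the helpers in Git's "list.h"; `list_init()` and `list_add_head()` are names chosen for this sketch.

```c
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_del(struct list_head *elem)
{
	elem->prev->next = elem->next;
	elem->next->prev = elem->prev;
}

/* Insert directly after the head, i.e. at the front. */
static void list_add_head(struct list_head *elem, struct list_head *head)
{
	elem->next = head->next;
	elem->prev = head;
	head->next->prev = elem;
	head->next = elem;
}

/* Insert directly before the head, i.e. at the back. */
static void list_add_tail(struct list_head *elem, struct list_head *head)
{
	elem->prev = head->prev;
	elem->next = head;
	head->prev->next = elem;
	head->prev = elem;
}

/* Mark as most recently used: unlink and reinsert at the front. */
static void list_move(struct list_head *elem, struct list_head *head)
{
	list_del(elem);
	list_add_head(elem, head);
}
```

Because each pack embeds its own list node, moving it to the front needs no allocation and no search.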
* [PATCH v3 06/15] odb: move kept cache into `struct packfile_store`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (4 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 05/15] odb: move MRU list of packfiles " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 07/15] packfile: reorder functions to avoid function declaration Patrick Steinhardt
` (9 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The object database tracks a cache of "kept" packfiles, which is used by
git-pack-objects(1) to handle cruft objects. With the introduction of
the `struct packfile_store` we have a better place to host this cache
though.
Move the cache accordingly.
This moves the last bit of packfile-related state from the object
database into the packfile store. Adapt the comment for the `packfiles`
pointer in `struct object_database` to reflect this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
odb.h | 8 +-------
packfile.c | 16 ++++++++--------
packfile.h | 14 ++++++++++++++
3 files changed, 23 insertions(+), 15 deletions(-)
diff --git a/odb.h b/odb.h
index 8ee1f8bb43..1c998a2478 100644
--- a/odb.h
+++ b/odb.h
@@ -132,15 +132,9 @@ struct object_database {
unsigned commit_graph_attempted : 1; /* if loading has been attempted */
/*
- * private data
- *
- * should only be accessed directly by packfile.c
+ * Should only be accessed directly by packfile.c
*/
struct packfile_store *packfiles;
- struct {
- struct packed_git **packs;
- unsigned flags;
- } kept_pack_cache;
/*
* This is meant to hold a *small* number of objects that you would
diff --git a/packfile.c b/packfile.c
index 98207aa380..6ae7f22d65 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2091,19 +2091,19 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
static void maybe_invalidate_kept_pack_cache(struct repository *r,
unsigned flags)
{
- if (!r->objects->kept_pack_cache.packs)
+ if (!r->objects->packfiles->kept_cache.packs)
return;
- if (r->objects->kept_pack_cache.flags == flags)
+ if (r->objects->packfiles->kept_cache.flags == flags)
return;
- FREE_AND_NULL(r->objects->kept_pack_cache.packs);
- r->objects->kept_pack_cache.flags = 0;
+ FREE_AND_NULL(r->objects->packfiles->kept_cache.packs);
+ r->objects->packfiles->kept_cache.flags = 0;
}
struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
{
maybe_invalidate_kept_pack_cache(r, flags);
- if (!r->objects->kept_pack_cache.packs) {
+ if (!r->objects->packfiles->kept_cache.packs) {
struct packed_git **packs = NULL;
size_t nr = 0, alloc = 0;
struct packed_git *p;
@@ -2126,11 +2126,11 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
ALLOC_GROW(packs, nr + 1, alloc);
packs[nr] = NULL;
- r->objects->kept_pack_cache.packs = packs;
- r->objects->kept_pack_cache.flags = flags;
+ r->objects->packfiles->kept_cache.packs = packs;
+ r->objects->packfiles->kept_cache.flags = flags;
}
- return r->objects->kept_pack_cache.packs;
+ return r->objects->packfiles->kept_cache.packs;
}
int find_kept_pack_entry(struct repository *r,
diff --git a/packfile.h b/packfile.h
index d48d46cc1b..bf66211986 100644
--- a/packfile.h
+++ b/packfile.h
@@ -64,6 +64,20 @@ struct packfile_store {
*/
struct packed_git *packs;
+ /*
+ * Cache of packfiles which are marked as "kept", either because there
+ * is an on-disk ".keep" file or because they are marked as "kept" in
+ * memory.
+ *
+ * Should not be accessed directly, but via `kept_pack_cache()`. The
+ * list of packs gets invalidated when the stored flags and the flags
+ * passed to `kept_pack_cache()` mismatch.
+ */
+ struct {
+ struct packed_git **packs;
+ unsigned flags;
+ } kept_cache;
+
/* A most-recently-used ordered version of the packs list. */
struct list_head mru;
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 07/15] packfile: reorder functions to avoid function declaration
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (5 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 06/15] odb: move kept cache " Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 08/15] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
` (8 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
Reorder functions so that we can avoid a forward declaration of
`prepare_packed_git()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
packfile.c | 67 +++++++++++++++++++++++++++++++-------------------------------
1 file changed, 33 insertions(+), 34 deletions(-)
diff --git a/packfile.c b/packfile.c
index 6ae7f22d65..771b58df8b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -946,40 +946,6 @@ static void prepare_packed_git_one(struct odb_source *source)
string_list_clear(data.garbage, 0);
}
-static void prepare_packed_git(struct repository *r);
-/*
- * Give a fast, rough count of the number of objects in the repository. This
- * ignores loose objects completely. If you have a lot of them, then either
- * you should repack because your performance will be awful, or they are
- * all unreachable objects about to be pruned, in which case they're not really
- * interesting as a measure of repo size in the first place.
- */
-unsigned long repo_approximate_object_count(struct repository *r)
-{
- if (!r->objects->approximate_object_count_valid) {
- struct odb_source *source;
- unsigned long count = 0;
- struct packed_git *p;
-
- prepare_packed_git(r);
-
- for (source = r->objects->sources; source; source = source->next) {
- struct multi_pack_index *m = get_multi_pack_index(source);
- if (m)
- count += m->num_objects;
- }
-
- for (p = r->objects->packfiles->packs; p; p = p->next) {
- if (open_pack_index(p))
- continue;
- count += p->num_objects;
- }
- r->objects->approximate_object_count = count;
- r->objects->approximate_object_count_valid = 1;
- }
- return r->objects->approximate_object_count;
-}
-
DEFINE_LIST_SORT(static, sort_packs, struct packed_git, next);
static int sort_pack(const struct packed_git *a, const struct packed_git *b)
@@ -1098,6 +1064,39 @@ struct list_head *get_packed_git_mru(struct repository *r)
return &r->objects->packfiles->mru;
}
+/*
+ * Give a fast, rough count of the number of objects in the repository. This
+ * ignores loose objects completely. If you have a lot of them, then either
+ * you should repack because your performance will be awful, or they are
+ * all unreachable objects about to be pruned, in which case they're not really
+ * interesting as a measure of repo size in the first place.
+ */
+unsigned long repo_approximate_object_count(struct repository *r)
+{
+ if (!r->objects->approximate_object_count_valid) {
+ struct odb_source *source;
+ unsigned long count = 0;
+ struct packed_git *p;
+
+ prepare_packed_git(r);
+
+ for (source = r->objects->sources; source; source = source->next) {
+ struct multi_pack_index *m = get_multi_pack_index(source);
+ if (m)
+ count += m->num_objects;
+ }
+
+ for (p = r->objects->packfiles->packs; p; p = p->next) {
+ if (open_pack_index(p))
+ continue;
+ count += p->num_objects;
+ }
+ r->objects->approximate_object_count = count;
+ r->objects->approximate_object_count_valid = 1;
+ }
+ return r->objects->approximate_object_count;
+}
+
unsigned long unpack_object_header_buffer(const unsigned char *buf,
unsigned long len, enum object_type *type, unsigned long *sizep)
{
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 08/15] packfile: refactor `prepare_packed_git()` to work on packfile store
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (6 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 07/15] packfile: reorder functions to avoid function declaration Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 09/15] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
` (7 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The `prepare_packed_git()` function and its friends are responsible for
loading packfiles as well as the multi-pack index for a given object
database. Refactor these functions to accept a packfile store instead of
a repository to clarify their scope.
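
For readers skimming the diff, the lazy "prepare once" pattern that these
functions follow can be sketched as below. This is only an illustration:
the `toy_*` names and the `scans` counter are hypothetical stand-ins and
not part of git, and the real function additionally loads multi-pack
indices, sorts the packs and builds the MRU list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not git's real structures. */
struct toy_pack {
	struct toy_pack *next;
};

struct toy_store {
	struct toy_pack *packs;
	bool initialized;
	int scans; /* counts how often the expensive scan ran */
};

/* Stand-in for scanning the pack directories of all sources. */
static void toy_scan(struct toy_store *store)
{
	store->scans++;
}

/* Idempotent: only the first call after (re)initialization scans. */
void toy_store_prepare(struct toy_store *store)
{
	if (store->initialized)
		return;
	toy_scan(store);
	store->initialized = true;
}

/* Accessors take the store itself and prepare lazily before returning. */
struct toy_pack *toy_store_get_packs(struct toy_store *store)
{
	toy_store_prepare(store);
	return store->packs;
}
```

Because the helper only needs the store, callers that already hold one do
not have to go back through the repository to reach it.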
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
packfile.c | 41 ++++++++++++++++++-----------------------
1 file changed, 18 insertions(+), 23 deletions(-)
diff --git a/packfile.c b/packfile.c
index 771b58df8b..4564026658 100644
--- a/packfile.c
+++ b/packfile.c
@@ -974,37 +974,32 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
return -1;
}
-static void rearrange_packed_git(struct repository *r)
-{
- sort_packs(&r->objects->packfiles->packs, sort_pack);
-}
-
-static void prepare_packed_git_mru(struct repository *r)
+static void packfile_store_prepare_mru(struct packfile_store *store)
{
struct packed_git *p;
- INIT_LIST_HEAD(&r->objects->packfiles->mru);
+ INIT_LIST_HEAD(&store->mru);
- for (p = r->objects->packfiles->packs; p; p = p->next)
- list_add_tail(&p->mru, &r->objects->packfiles->mru);
+ for (p = store->packs; p; p = p->next)
+ list_add_tail(&p->mru, &store->mru);
}
-static void prepare_packed_git(struct repository *r)
+static void packfile_store_prepare(struct packfile_store *store)
{
struct odb_source *source;
- if (r->objects->packfiles->initialized)
+ if (store->initialized)
return;
- odb_prepare_alternates(r->objects);
- for (source = r->objects->sources; source; source = source->next) {
+ odb_prepare_alternates(store->odb);
+ for (source = store->odb->sources; source; source = source->next) {
prepare_multi_pack_index_one(source);
prepare_packed_git_one(source);
}
- rearrange_packed_git(r);
+ sort_packs(&store->packs, sort_pack);
- prepare_packed_git_mru(r);
- r->objects->packfiles->initialized = true;
+ packfile_store_prepare_mru(store);
+ store->initialized = true;
}
void reprepare_packed_git(struct repository *r)
@@ -1027,25 +1022,25 @@ void reprepare_packed_git(struct repository *r)
r->objects->approximate_object_count_valid = 0;
r->objects->packfiles->initialized = false;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
obj_read_unlock();
}
struct packed_git *get_packed_git(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
return r->objects->packfiles->packs;
}
struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
{
- prepare_packed_git(source->odb->repo);
+ packfile_store_prepare(source->odb->packfiles);
return source->midx;
}
struct packed_git *get_all_packs(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next) {
struct multi_pack_index *m = source->midx;
@@ -1060,7 +1055,7 @@ struct packed_git *get_all_packs(struct repository *r)
struct list_head *get_packed_git_mru(struct repository *r)
{
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
return &r->objects->packfiles->mru;
}
@@ -1078,7 +1073,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
unsigned long count = 0;
struct packed_git *p;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (source = r->objects->sources; source; source = source->next) {
struct multi_pack_index *m = get_multi_pack_index(source);
@@ -2068,7 +2063,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
{
struct list_head *pos;
- prepare_packed_git(r);
+ packfile_store_prepare(r->objects->packfiles);
for (struct odb_source *source = r->objects->sources; source; source = source->next)
if (source->midx && fill_midx_entry(source->midx, oid, e))
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 09/15] packfile: split up responsibilities of `reprepare_packed_git()`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (7 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 08/15] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 10/15] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
` (6 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
In `reprepare_packed_git()` we perform a couple of operations:
- We reload alternate object directories.
- We clear the loose object cache.
- We reprepare packfiles.
While the logic is hosted in "packfile.c", it clearly reaches into other
subsystems that aren't related to packfiles.
Split up the responsibility and introduce `odb_reprepare()` which now
becomes responsible for repreparing the whole object database. The
existing `reprepare_packed_git()` function is refactored accordingly and
only cares about reloading the packfile store now.
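
The resulting layering can be sketched roughly as follows. The `toy_*`
types, fields and counters are hypothetical and only model the control
flow; locking via `obj_read_lock()` is left out of the sketch on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the packfile store and the object database. */
struct toy_store {
	bool initialized;
	int prepares; /* how often packfiles were (re)loaded */
};

struct toy_odb {
	int loaded_alternates;
	int loose_cache_entries;
	int count_valid;
	struct toy_store packfiles;
};

static void toy_store_prepare(struct toy_store *store)
{
	if (store->initialized)
		return;
	store->prepares++;
	store->initialized = true;
}

/* Packfile layer: only knows how to reload its own state. */
void toy_store_reprepare(struct toy_store *store)
{
	store->initialized = false;
	toy_store_prepare(store);
}

static void toy_prepare_alternates(struct toy_odb *odb)
{
	if (odb->loaded_alternates)
		return;
	odb->loaded_alternates = 1;
}

/* Object database layer: resets its own caches, then delegates. */
void toy_odb_reprepare(struct toy_odb *odb)
{
	odb->loaded_alternates = 0; /* force alternates to be re-read */
	toy_prepare_alternates(odb);
	odb->loose_cache_entries = 0; /* drop the loose object cache */
	odb->count_valid = 0;
	toy_store_reprepare(&odb->packfiles);
}
```

The point of the split is that the packfile layer no longer touches
alternates or loose-object state; only the outer function does.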
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/backfill.c | 2 +-
builtin/gc.c | 4 ++--
builtin/receive-pack.c | 2 +-
builtin/repack.c | 2 +-
bulk-checkin.c | 2 +-
connected.c | 2 +-
fetch-pack.c | 4 ++--
object-name.c | 2 +-
odb.c | 27 ++++++++++++++++++++++++++-
odb.h | 6 ++++++
packfile.c | 26 ++++----------------------
packfile.h | 9 ++++++++-
transport-helper.c | 2 +-
13 files changed, 55 insertions(+), 35 deletions(-)
diff --git a/builtin/backfill.c b/builtin/backfill.c
index 80056abe47..e80fc1b694 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -53,7 +53,7 @@ static void download_batch(struct backfill_context *ctx)
* We likely have a new packfile. Add it to the packed list to
* avoid possible duplicate downloads of the same objects.
*/
- reprepare_packed_git(ctx->repo);
+ odb_reprepare(ctx->repo->objects);
}
static int fill_missing_blobs(const char *path UNUSED,
diff --git a/builtin/gc.c b/builtin/gc.c
index 03ae4926b2..aeca06a08b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1042,7 +1042,7 @@ int cmd_gc(int argc,
die(FAILED_RUN, "rerere");
report_garbage = report_pack_garbage;
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (pack_garbage.nr > 0) {
close_object_store(the_repository->objects);
clean_pack_garbage();
@@ -1491,7 +1491,7 @@ static off_t get_auto_pack_size(void)
struct packed_git *p;
struct repository *r = the_repository;
- reprepare_packed_git(r);
+ odb_reprepare(r->objects);
for (p = get_all_packs(r); p; p = p->next) {
if (p->pack_size > max_size) {
second_largest_size = max_size;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1113137a6f..c9288a9c7e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2389,7 +2389,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
status = finish_command(&child);
if (status)
return "index-pack abnormal exit";
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
}
return NULL;
}
diff --git a/builtin/repack.c b/builtin/repack.c
index c490a51e91..5ff27fc8e2 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1685,7 +1685,7 @@ int cmd_repack(int argc,
goto cleanup;
}
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (delete_redundant) {
int opts = 0;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index b2809ab039..f65439a748 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -90,7 +90,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
strbuf_release(&packname);
/* Make objects we just wrote available to ourselves */
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
}
/*
diff --git a/connected.c b/connected.c
index 18c13245d8..d6e9682fd9 100644
--- a/connected.c
+++ b/connected.c
@@ -72,7 +72,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
* Before checking for promisor packs, be sure we have the
* latest pack-files loaded into memory.
*/
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
do {
struct packed_git *p;
diff --git a/fetch-pack.c b/fetch-pack.c
index 6ed5662951..fe7a84bf2f 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1983,7 +1983,7 @@ static void update_shallow(struct fetch_pack_args *args,
* remote is shallow, but this is a clone, there are
* no objects in repo to worry about. Accept any
* shallow points that exist in the pack (iow in repo
- * after get_pack() and reprepare_packed_git())
+ * after get_pack() and odb_reprepare())
*/
struct oid_array extra = OID_ARRAY_INIT;
struct object_id *oid = si->shallow->oid;
@@ -2108,7 +2108,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
&si, pack_lockfiles);
}
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
if (!args->cloning && args->deepen) {
struct check_connected_options opt = CHECK_CONNECTED_INIT;
diff --git a/object-name.c b/object-name.c
index 732056ff5e..df9e0c5f02 100644
--- a/object-name.c
+++ b/object-name.c
@@ -596,7 +596,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
* or migrated from loose to packed.
*/
if (status == MISSING_OBJECT) {
- reprepare_packed_git(r);
+ odb_reprepare(r->objects);
find_short_object_filename(&ds);
find_short_packed_object(&ds);
status = finish_object_disambiguation(&ds, oid);
diff --git a/odb.c b/odb.c
index 32e982bf0b..65a6cc67b6 100644
--- a/odb.c
+++ b/odb.c
@@ -694,7 +694,7 @@ static int do_oid_object_info_extended(struct object_database *odb,
/* Not a loose object; someone else may have just packed it. */
if (!(flags & OBJECT_INFO_QUICK)) {
- reprepare_packed_git(odb->repo);
+ odb_reprepare(odb->repo->objects);
if (find_pack_entry(odb->repo, real, &e))
break;
}
@@ -1040,3 +1040,28 @@ void odb_clear(struct object_database *o)
string_list_clear(&o->submodule_source_paths, 0);
}
+
+void odb_reprepare(struct object_database *o)
+{
+ struct odb_source *source;
+
+ obj_read_lock();
+
+ /*
+ * Reprepare alt odbs, in case the alternates file was modified
+ * during the course of this process. This only _adds_ odbs to
+ * the linked list, so existing odbs will continue to exist for
+ * the lifetime of the process.
+ */
+ o->loaded_alternates = 0;
+ odb_prepare_alternates(o);
+
+ for (source = o->sources; source; source = source->next)
+ odb_clear_loose_cache(source);
+
+ o->approximate_object_count_valid = 0;
+
+ packfile_store_reprepare(o->packfiles);
+
+ obj_read_unlock();
+}
diff --git a/odb.h b/odb.h
index 1c998a2478..ef34132c58 100644
--- a/odb.h
+++ b/odb.h
@@ -163,6 +163,12 @@ struct object_database {
struct object_database *odb_new(struct repository *repo);
void odb_clear(struct object_database *o);
+/*
+ * Clear caches, reload alternates and then reload object sources so that new
+ * objects may become accessible.
+ */
+void odb_reprepare(struct object_database *o);
+
/*
* Find source by its object directory path. Returns a `NULL` pointer in case
* the source could not be found.
diff --git a/packfile.c b/packfile.c
index 4564026658..edd5ecc9cf 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1002,28 +1002,10 @@ static void packfile_store_prepare(struct packfile_store *store)
store->initialized = true;
}
-void reprepare_packed_git(struct repository *r)
+void packfile_store_reprepare(struct packfile_store *store)
{
- struct odb_source *source;
-
- obj_read_lock();
-
- /*
- * Reprepare alt odbs, in case the alternates file was modified
- * during the course of this process. This only _adds_ odbs to
- * the linked list, so existing odbs will continue to exist for
- * the lifetime of the process.
- */
- r->objects->loaded_alternates = 0;
- odb_prepare_alternates(r->objects);
-
- for (source = r->objects->sources; source; source = source->next)
- odb_clear_loose_cache(source);
-
- r->objects->approximate_object_count_valid = 0;
- r->objects->packfiles->initialized = false;
- packfile_store_prepare(r->objects->packfiles);
- obj_read_unlock();
+ store->initialized = false;
+ packfile_store_prepare(store);
}
struct packed_git *get_packed_git(struct repository *r)
@@ -1144,7 +1126,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
*
* Other worrying sections could be the call to close_pack_fd(),
* which can close packs even with in-use windows, and to
- * reprepare_packed_git(). Regarding the former, mmap doc says:
+ * odb_reprepare(). Regarding the former, mmap doc says:
* "closing the file descriptor does not unmap the region". And
* for the latter, it won't re-open already available packs.
*/
diff --git a/packfile.h b/packfile.h
index bf66211986..a85ff607fe 100644
--- a/packfile.h
+++ b/packfile.h
@@ -112,6 +112,14 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
+/*
+ * Clear the packfile caches and try to look up any new packfiles that have
+ * appeared since last preparing the packfiles store.
+ *
+ * This function must be called under the `obj_read_lock()`.
+ */
+void packfile_store_reprepare(struct packfile_store *store);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
@@ -188,7 +196,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-void reprepare_packed_git(struct repository *r);
void install_packed_git(struct repository *r, struct packed_git *pack);
struct packed_git *get_packed_git(struct repository *r);
diff --git a/transport-helper.c b/transport-helper.c
index 0789e5bca5..4d95d84f9e 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -450,7 +450,7 @@ static int fetch_with_fetch(struct transport *transport,
}
strbuf_release(&buf);
- reprepare_packed_git(the_repository);
+ odb_reprepare(the_repository->objects);
return 0;
}
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 10/15] packfile: refactor `install_packed_git()` to work on packfile store
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (8 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 09/15] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 11/15] packfile: introduce function to load and add packfiles Patrick Steinhardt
` (5 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The `install_packed_git()` function adds a packfile to a specific
object store. Refactor it to accept a packfile store instead of a
repository to clarify its scope.
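
As a rough illustration of what "adding to the store" means for the list
of packs, consider the sketch below. The `toy_*` names and the `nr`
counter are hypothetical; the real function also registers the pack in
the store's hashmap and accounts for open file descriptors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not git's real structures. */
struct toy_pack {
	struct toy_pack *next;
	const char *name;
};

struct toy_store {
	struct toy_pack *packs;
	size_t nr;
};

/* Prepend: the most recently added pack becomes the list head. */
void toy_store_add_pack(struct toy_store *store, struct toy_pack *pack)
{
	pack->next = store->packs;
	store->packs = pack;
	store->nr++;
}
```

Note that the function needs nothing but the store and the pack, which is
what makes the narrower signature possible.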
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 2 +-
builtin/index-pack.c | 2 +-
http.c | 2 +-
http.h | 2 +-
midx.c | 2 +-
packfile.c | 11 ++++++-----
packfile.h | 9 +++++++--
7 files changed, 18 insertions(+), 12 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 2c35f9345d..e9d82b31c3 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -901,7 +901,7 @@ static void end_packfile(void)
if (!new_p)
die("core git rejected index %s", idx_name);
all_packs[pack_id] = new_p;
- install_packed_git(the_repository, new_p);
+ packfile_store_add_pack(the_repository->objects->packfiles, new_p);
free(idx_name);
/* Print the boundary */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f91c301bba..ed490dfad4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1645,7 +1645,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
p = add_packed_git(the_repository, final_index_name,
strlen(final_index_name), 0);
if (p)
- install_packed_git(the_repository, p);
+ packfile_store_add_pack(the_repository->objects->packfiles, p);
}
if (!from_stdin) {
diff --git a/http.c b/http.c
index 98853d6483..af2120b64c 100644
--- a/http.c
+++ b/http.c
@@ -2541,7 +2541,7 @@ void http_install_packfile(struct packed_git *p,
lst = &((*lst)->next);
*lst = (*lst)->next;
- install_packed_git(the_repository, p);
+ packfile_store_add_pack(the_repository->objects->packfiles, p);
}
struct http_pack_request *new_http_pack_request(
diff --git a/http.h b/http.h
index 36202139f4..e5a5380c6c 100644
--- a/http.h
+++ b/http.h
@@ -210,7 +210,7 @@ int finish_http_pack_request(struct http_pack_request *preq);
void release_http_pack_request(struct http_pack_request *preq);
/*
- * Remove p from the given list, and invoke install_packed_git() on it.
+ * Remove p from the given list, and invoke packfile_store_add_pack() on it.
*
* This is a convenience function for users that have obtained a list of packs
* from http_get_info_packs() and have chosen a specific pack to fetch.
diff --git a/midx.c b/midx.c
index 91c7b3917d..69c44be71c 100644
--- a/midx.c
+++ b/midx.c
@@ -467,7 +467,7 @@ int prepare_midx_pack(struct multi_pack_index *m,
p = add_packed_git(r, pack_name.buf, pack_name.len,
m->source->local);
if (p) {
- install_packed_git(r, p);
+ packfile_store_add_pack(r->objects->packfiles, p);
list_add_tail(&p->mru, &r->objects->packfiles->mru);
}
}
diff --git a/packfile.c b/packfile.c
index edd5ecc9cf..e1a3c0487c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -779,16 +779,17 @@ struct packed_git *add_packed_git(struct repository *r, const char *path,
return p;
}
-void install_packed_git(struct repository *r, struct packed_git *pack)
+void packfile_store_add_pack(struct packfile_store *store,
+ struct packed_git *pack)
{
if (pack->pack_fd != -1)
pack_open_fds++;
- pack->next = r->objects->packfiles->packs;
- r->objects->packfiles->packs = pack;
+ pack->next = store->packs;
+ store->packs = pack;
hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
- hashmap_add(&r->objects->packfiles->map, &pack->packmap_ent);
+ hashmap_add(&store->map, &pack->packmap_ent);
}
void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -904,7 +905,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
p = add_packed_git(data->r, full_name, full_name_len, data->local);
if (p)
- install_packed_git(data->r, p);
+ packfile_store_add_pack(data->r->objects->packfiles, p);
}
free(pack_name);
}
diff --git a/packfile.h b/packfile.h
index a85ff607fe..ba4b0cef9c 100644
--- a/packfile.h
+++ b/packfile.h
@@ -120,6 +120,13 @@ void packfile_store_close(struct packfile_store *store);
*/
void packfile_store_reprepare(struct packfile_store *store);
+/*
+ * Add the pack to the store so that contained objects become accessible via
+ * the store. This moves ownership into the store.
+ */
+void packfile_store_add_pack(struct packfile_store *store,
+ struct packed_git *pack);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
@@ -196,8 +203,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-void install_packed_git(struct repository *r, struct packed_git *pack);
-
struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 11/15] packfile: introduce function to load and add packfiles
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (9 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 10/15] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 12/15] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
` (4 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
We have a recurring pattern where we essentially perform an upsert of a
packfile in case it isn't yet known by the packfile store. The logic to
do so is non-trivial as we have to reconstruct the packfile's key, check
the map of packfiles, then create the new packfile and finally add it to
the store.
Introduce a new function that does this dance for us. Refactor callsites
to use it.
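
The "dance" described above can be sketched in isolation as follows. This
is a simplified model under stated assumptions: the `toy_*` names are
hypothetical, a linear list search stands in for the hashmap lookup, and
"creating" a pack merely allocates a record instead of opening files.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not git's real structures. */
struct toy_pack {
	struct toy_pack *next;
	char *pack_name;
};

struct toy_store {
	struct toy_pack *packs;
};

/* Derive "<base>.pack" from "<base>.idx"; the caller frees the result. */
static char *toy_pack_key(const char *idx_path)
{
	static const char suffix[] = ".idx";
	size_t len = strlen(idx_path);
	size_t slen = strlen(suffix);
	char *key;

	if (len >= slen && !strcmp(idx_path + len - slen, suffix))
		len -= slen;
	key = malloc(len + strlen(".pack") + 1);
	if (!key)
		return NULL;
	memcpy(key, idx_path, len);
	strcpy(key + len, ".pack");
	return key;
}

/* Upsert: return the known pack for this index, or create and add one. */
struct toy_pack *toy_store_load_pack(struct toy_store *store,
				     const char *idx_path)
{
	char *key = toy_pack_key(idx_path);
	struct toy_pack *p;

	if (!key)
		return NULL;

	/* Linear search stands in for the hashmap lookup (assumption). */
	for (p = store->packs; p; p = p->next)
		if (!strcmp(p->pack_name, key))
			break;

	if (!p) {
		p = calloc(1, sizeof(*p));
		if (p) {
			p->pack_name = key;
			key = NULL; /* ownership moved into the pack */
			p->next = store->packs;
			store->packs = p;
		}
	}

	free(key);
	return p;
}
```

Calling the helper twice with the same index path yields the same pack,
which is the property the refactored callsites rely on.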
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 4 ++--
builtin/index-pack.c | 10 +++-------
midx.c | 23 ++++-------------------
packfile.c | 44 +++++++++++++++++++++++++++++++-------------
packfile.h | 8 ++++++++
5 files changed, 48 insertions(+), 41 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index e9d82b31c3..a26e79689d 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -897,11 +897,11 @@ static void end_packfile(void)
idx_name = keep_pack(create_index());
/* Register the packfile with core git's machinery. */
- new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
+ new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
+ idx_name, 1);
if (!new_p)
die("core git rejected index %s", idx_name);
all_packs[pack_id] = new_p;
- packfile_store_add_pack(the_repository->objects->packfiles, new_p);
free(idx_name);
/* Print the boundary */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ed490dfad4..2b78ba7fe4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
hash, "idx", 1);
- if (do_fsck_object) {
- struct packed_git *p;
- p = add_packed_git(the_repository, final_index_name,
- strlen(final_index_name), 0);
- if (p)
- packfile_store_add_pack(the_repository->objects->packfiles, p);
- }
+ if (do_fsck_object)
+ packfile_store_load_pack(the_repository->objects->packfiles,
+ final_index_name, 0);
if (!from_stdin) {
printf("%s\n", hash_to_hex(hash));
diff --git a/midx.c b/midx.c
index 69c44be71c..3faeaf2f8f 100644
--- a/midx.c
+++ b/midx.c
@@ -443,7 +443,6 @@ int prepare_midx_pack(struct multi_pack_index *m,
{
struct repository *r = m->source->odb->repo;
struct strbuf pack_name = STRBUF_INIT;
- struct strbuf key = STRBUF_INIT;
struct packed_git *p;
pack_int_id = midx_for_pack(&m, pack_int_id);
@@ -455,25 +454,11 @@ int prepare_midx_pack(struct multi_pack_index *m,
strbuf_addf(&pack_name, "%s/pack/%s", m->source->path,
m->pack_names[pack_int_id]);
-
- /* pack_map holds the ".pack" name, but we have the .idx */
- strbuf_addbuf(&key, &pack_name);
- strbuf_strip_suffix(&key, ".idx");
- strbuf_addstr(&key, ".pack");
- p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
- strhash(key.buf), key.buf,
- struct packed_git, packmap_ent);
- if (!p) {
- p = add_packed_git(r, pack_name.buf, pack_name.len,
- m->source->local);
- if (p) {
- packfile_store_add_pack(r->objects->packfiles, p);
- list_add_tail(&p->mru, &r->objects->packfiles->mru);
- }
- }
-
+ p = packfile_store_load_pack(r->objects->packfiles,
+ pack_name.buf, m->source->local);
+ if (p)
+ list_add_tail(&p->mru, &r->objects->packfiles->mru);
strbuf_release(&pack_name);
- strbuf_release(&key);
if (!p) {
m->packs[pack_int_id] = MIDX_PACK_ERROR;
diff --git a/packfile.c b/packfile.c
index e1a3c0487c..e8b5be645c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -792,6 +792,33 @@ void packfile_store_add_pack(struct packfile_store *store,
hashmap_add(&store->map, &pack->packmap_ent);
}
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
+ const char *idx_path, int local)
+{
+ struct strbuf key = STRBUF_INIT;
+ struct packed_git *p;
+
+ /*
+ * We're being called with the path to the index file, but `pack_map`
+ * holds the path to the packfile itself.
+ */
+ strbuf_addstr(&key, idx_path);
+ strbuf_strip_suffix(&key, ".idx");
+ strbuf_addstr(&key, ".pack");
+
+ p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
+ struct packed_git, packmap_ent);
+ if (!p) {
+ p = add_packed_git(store->odb->repo, idx_path,
+ strlen(idx_path), local);
+ if (p)
+ packfile_store_add_pack(store, p);
+ }
+
+ strbuf_release(&key);
+ return p;
+}
+
void (*report_garbage)(unsigned seen_bits, const char *path);
static void report_helper(const struct string_list *list,
@@ -891,23 +918,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
const char *file_name, void *_data)
{
struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
- struct packed_git *p;
size_t base_len = full_name_len;
if (strip_suffix_mem(full_name, &base_len, ".idx") &&
!(data->m && midx_contains_pack(data->m, file_name))) {
- struct hashmap_entry hent;
- char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
- unsigned int hash = strhash(pack_name);
- hashmap_entry_init(&hent, hash);
-
- /* Don't reopen a pack we already have. */
- if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
- p = add_packed_git(data->r, full_name, full_name_len, data->local);
- if (p)
- packfile_store_add_pack(data->r->objects->packfiles, p);
- }
- free(pack_name);
+ char *trimmed_path = xstrndup(full_name, full_name_len);
+ packfile_store_load_pack(data->r->objects->packfiles,
+ trimmed_path, data->local);
+ free(trimmed_path);
}
if (!report_garbage)
diff --git a/packfile.h b/packfile.h
index ba4b0cef9c..fcefcbbef6 100644
--- a/packfile.h
+++ b/packfile.h
@@ -127,6 +127,14 @@ void packfile_store_reprepare(struct packfile_store *store);
void packfile_store_add_pack(struct packfile_store *store,
struct packed_git *pack);
+/*
+ * Open the packfile and add it to the store if it isn't yet known. Returns
+ * either the newly opened packfile or the preexisting packfile. Returns a
+ * `NULL` pointer in case the packfile could not be opened.
+ */
+struct packed_git *packfile_store_load_pack(struct packfile_store *store,
+ const char *idx_path, int local);
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 12/15] packfile: move `get_multi_pack_index()` into "midx.c"
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (10 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 11/15] packfile: introduce function to load and add packfiles Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 13/15] packfile: remove `get_packed_git()` Patrick Steinhardt
` (3 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The `get_multi_pack_index()` function is declared and implemented in the
packfile subsystem, even though it really belongs in the multi-pack
index subsystem. The reason for this is likely that it needs to call
`packfile_store_prepare()`, which is not exposed by the packfile system.
In a subsequent commit we're about to add another caller outside of the
packfile system though, so we'll have to expose the function anyway.
Do so now already and move `get_multi_pack_index()` into the MIDX
subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
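As a reviewer-side illustration of the pattern described above, here is a
minimal sketch of an accessor that lazily prepares its store before handing
out cached state. All `toy_*` names are hypothetical stand-ins and not Git's
real types; the only behaviour assumed from the patch is that the prepare
step is a no-op once the store has been prepared.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not Git's real structures. */
struct toy_midx {
	int id;
};

struct toy_source;

struct toy_store {
	struct toy_source *source; /* the single source this toy store serves */
	int initialized;           /* set once the store has been prepared */
	int prepare_runs;          /* how often the expensive load really ran */
};

struct toy_source {
	struct toy_store *store;
	struct toy_midx *midx;     /* populated by toy_store_prepare() */
};

static struct toy_midx toy_the_midx = { 1 };

/* Load state on first use; later calls return early. */
void toy_store_prepare(struct toy_store *store)
{
	if (store->initialized)
		return;
	store->source->midx = &toy_the_midx;
	store->prepare_runs++;
	store->initialized = 1;
}

/* Accessor in the style of get_multi_pack_index(): prepare, then return. */
struct toy_midx *toy_get_multi_pack_index(struct toy_source *source)
{
	toy_store_prepare(source->store);
	return source->midx;
}
```

Because the accessor triggers preparation itself, callers never observe an
unprepared store, which is why exposing the prepare function is rarely
needed outside of accessors like this one.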
midx.c | 6 ++++++
midx.h | 1 +
packfile.c | 8 +-------
packfile.h | 10 +++++++++-
4 files changed, 17 insertions(+), 8 deletions(-)
diff --git a/midx.c b/midx.c
index 3faeaf2f8f..1d6269f957 100644
--- a/midx.c
+++ b/midx.c
@@ -93,6 +93,12 @@ static int midx_read_object_offsets(const unsigned char *chunk_start,
return 0;
}
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
+{
+ packfile_store_prepare(source->odb->packfiles);
+ return source->midx;
+}
+
static struct multi_pack_index *load_multi_pack_index_one(struct odb_source *source,
const char *midx_name)
{
diff --git a/midx.h b/midx.h
index e241d2d690..6e54d73503 100644
--- a/midx.h
+++ b/midx.h
@@ -94,6 +94,7 @@ void get_midx_chain_filename(struct odb_source *source, struct strbuf *out);
void get_split_midx_filename_ext(struct odb_source *source, struct strbuf *buf,
const unsigned char *hash, const char *ext);
+struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
struct multi_pack_index *load_multi_pack_index(struct odb_source *source);
int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id);
struct packed_git *nth_midxed_pack(struct multi_pack_index *m,
diff --git a/packfile.c b/packfile.c
index e8b5be645c..70355ae92b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1003,7 +1003,7 @@ static void packfile_store_prepare_mru(struct packfile_store *store)
list_add_tail(&p->mru, &store->mru);
}
-static void packfile_store_prepare(struct packfile_store *store)
+void packfile_store_prepare(struct packfile_store *store)
{
struct odb_source *source;
@@ -1033,12 +1033,6 @@ struct packed_git *get_packed_git(struct repository *r)
return r->objects->packfiles->packs;
}
-struct multi_pack_index *get_multi_pack_index(struct odb_source *source)
-{
- packfile_store_prepare(source->odb->packfiles);
- return source->midx;
-}
-
struct packed_git *get_all_packs(struct repository *r)
{
packfile_store_prepare(r->objects->packfiles);
diff --git a/packfile.h b/packfile.h
index fcefcbbef6..a9e561ac39 100644
--- a/packfile.h
+++ b/packfile.h
@@ -112,6 +112,15 @@ void packfile_store_free(struct packfile_store *store);
*/
void packfile_store_close(struct packfile_store *store);
+/*
+ * Prepare the packfile store by loading packfiles and multi-pack indices for
+ * all alternates. This becomes a no-op if the store is already prepared.
+ *
+ * It shouldn't typically be necessary to call this function directly, as
+ * functions that access the store know to prepare it.
+ */
+void packfile_store_prepare(struct packfile_store *store);
+
/*
* Clear the packfile caches and try to look up any new packfiles that have
* appeared since last preparing the packfiles store.
@@ -213,7 +222,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
-struct multi_pack_index *get_multi_pack_index(struct odb_source *source);
struct packed_git *get_all_packs(struct repository *r);
/*
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 13/15] packfile: remove `get_packed_git()`
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (11 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 12/15] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 14/15] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
` (2 subsequent siblings)
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
We have two different functions to retrieve packfiles for a packfile
store:
- `get_packed_git()` returns the list of packfiles after having called
`packfile_store_prepare()`.
- `get_all_packs()` calls `packfile_store_prepare()` as well, but also
calls `prepare_midx_pack()` for each pack referenced by a multi-pack index.
Based on the naming alone one might think that `get_all_packs()` would
return more packs than `get_packed_git()`. But that's not the case: both
functions end up returning the exact same list of packfiles. The real
difference between those functions is that `get_all_packs()` also loads
the info of whether or not a packfile is part of a multi-pack index.
Preparing this extra information also shouldn't be significantly more
expensive:
- We have already loaded all packfiles via `prepare_packed_git_one()`.
So given that multi-pack indices may only refer to packfiles in the
same object directory we know that we already loaded each packfile.
- The multi-pack index was prepared via `packfile_store_prepare()`
already, which calls `prepare_multi_pack_index_one()`.
- So all that remains to be done is to look up the index of the pack
in its multi-pack index so that we can store that info in both the
pack itself and the MIDX.
So it is somewhat confusing to readers that one of these two functions
claims to load "all" packfiles while the other one doesn't, even though
the actual difference is far more subtle.
Convert all callers of `get_packed_git()` to use `get_all_packs()`
instead and remove `get_packed_git()`. The one caller in "builtin/grep.c"
discards the return value, so have it call `packfile_store_prepare()`
directly. There doesn't seem to be a good reason to distinguish between
these two functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
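To make the argument above concrete, here is a small sketch of the two
getters on toy data. The `toy_*` types and the name-based MIDX lookup are
assumptions made purely for illustration; they are not how Git stores or
resolves this information. The point is only that both getters return the
same list head and that the second one additionally records MIDX membership.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; not Git's real structures. */
struct toy_pack {
	struct toy_pack *next;
	const char *name;
	int in_midx; /* loosely mirrors the MIDX info kept on a pack */
};

struct toy_store {
	struct toy_pack *packs;  /* every pack that was loaded */
	const char **midx_names; /* pack names the toy MIDX refers to */
	size_t midx_nr;
};

/* Like the removed getter: hand out the list as-is. */
struct toy_pack *toy_get_packed_git(struct toy_store *store)
{
	return store->packs;
}

/* Like the surviving getter: annotate MIDX membership, return the same list. */
struct toy_pack *toy_get_all_packs(struct toy_store *store)
{
	for (size_t i = 0; i < store->midx_nr; i++)
		for (struct toy_pack *p = store->packs; p; p = p->next)
			if (!strcmp(p->name, store->midx_names[i]))
				p->in_midx = 1;
	return store->packs;
}

/* Count the entries of a pack list. */
size_t toy_count_packs(const struct toy_pack *p)
{
	size_t n = 0;
	for (; p; p = p->next)
		n++;
	return n;
}
```

In this sketch the extra work is a lookup per MIDX entry over packs that are
already in memory, which matches the claim that no additional packfiles need
to be loaded.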
builtin/gc.c | 2 +-
builtin/grep.c | 2 +-
object-name.c | 4 ++--
packfile.c | 6 ------
packfile.h | 1 -
5 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index aeca06a08b..b3eec213d2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1423,7 +1423,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
if (incremental_repack_auto_limit < 0)
return 1;
- for (p = get_packed_git(the_repository);
+ for (p = get_all_packs(the_repository);
count < incremental_repack_auto_limit && p;
p = p->next) {
if (!p->multi_pack_index)
diff --git a/builtin/grep.c b/builtin/grep.c
index 5df6537333..8f0e21bd70 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1214,7 +1214,7 @@ int cmd_grep(int argc,
if (recurse_submodules)
repo_read_gitmodules(the_repository, 1);
if (startup_info->have_repository)
- (void)get_packed_git(the_repository);
+ packfile_store_prepare(the_repository->objects->packfiles);
start_threads(&opt);
} else {
diff --git a/object-name.c b/object-name.c
index df9e0c5f02..ecffd2d5b1 100644
--- a/object-name.c
+++ b/object-name.c
@@ -213,7 +213,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
unique_in_midx(m, ds);
}
- for (p = get_packed_git(ds->repo); p && !ds->ambiguous;
+ for (p = get_all_packs(ds->repo); p && !ds->ambiguous;
p = p->next)
unique_in_pack(p, ds);
}
@@ -806,7 +806,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
find_abbrev_len_for_midx(m, mad);
}
- for (p = get_packed_git(mad->repo); p; p = p->next)
+ for (p = get_all_packs(mad->repo); p; p = p->next)
find_abbrev_len_for_pack(p, mad);
}
diff --git a/packfile.c b/packfile.c
index 70355ae92b..faa796f2a3 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,12 +1027,6 @@ void packfile_store_reprepare(struct packfile_store *store)
packfile_store_prepare(store);
}
-struct packed_git *get_packed_git(struct repository *r)
-{
- packfile_store_prepare(r->objects->packfiles);
- return r->objects->packfiles->packs;
-}
-
struct packed_git *get_all_packs(struct repository *r)
{
packfile_store_prepare(r->objects->packfiles);
diff --git a/packfile.h b/packfile.h
index a9e561ac39..34c2132863 100644
--- a/packfile.h
+++ b/packfile.h
@@ -220,7 +220,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-struct packed_git *get_packed_git(struct repository *r);
struct list_head *get_packed_git_mru(struct repository *r);
struct packed_git *get_all_packs(struct repository *r);
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 14/15] packfile: refactor `get_all_packs()` to work on packfile store
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (12 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 13/15] packfile: remove `get_packed_git()` Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 15/15] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
2025-09-02 16:40 ` [PATCH v3 00/15] packfile: carve out a new " Junio C Hamano
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The `get_all_packs()` function prepares the packfile store and then
returns its packfiles. Refactor it to accept a packfile store instead of
a repository to clarify its scope, and rename it to
`packfile_store_get_packs()` to match the other store functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
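The calling convention that results from this refactoring can be sketched
as follows. The `toy_*` types are hypothetical and only mimic the
repository -> object database -> packfile store nesting seen in the diff;
the `prepared` flag stands in for the real lazy prepare step.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not Git's real structures. */
struct toy_pack {
	struct toy_pack *next;
	int pack_local;
};

struct toy_store {
	struct toy_pack *packs;
	int prepared;
};

struct toy_odb {
	struct toy_store *packfiles;
};

struct toy_repository {
	struct toy_odb *objects;
};

/* Store-scoped getter: needs nothing from the repository itself. */
struct toy_pack *toy_store_get_packs(struct toy_store *store)
{
	store->prepared = 1; /* stands in for the lazy prepare step */
	return store->packs;
}

/* Typical caller: resolve the store once, then walk the pack list. */
size_t toy_count_local_packs(struct toy_repository *repo)
{
	struct toy_store *packs = repo->objects->packfiles;
	size_t nr = 0;

	for (struct toy_pack *p = toy_store_get_packs(packs); p; p = p->next)
		if (p->pack_local)
			nr++;
	return nr;
}
```

Resolving the store into a local variable first keeps the loop header short
and makes it explicit that the iteration depends only on the store.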
builtin/cat-file.c | 3 ++-
builtin/count-objects.c | 3 ++-
builtin/fast-import.c | 6 ++++--
builtin/fsck.c | 11 +++++++----
builtin/gc.c | 10 ++++++----
builtin/pack-objects.c | 28 +++++++++++++++++++---------
builtin/pack-redundant.c | 6 ++++--
builtin/repack.c | 9 ++++++---
connected.c | 3 ++-
http-backend.c | 5 +++--
http.c | 3 ++-
object-name.c | 4 ++--
pack-bitmap.c | 4 ++--
pack-objects.c | 3 ++-
packfile.c | 12 ++++++------
packfile.h | 7 ++++++-
server-info.c | 3 ++-
t/helper/test-find-pack.c | 2 +-
t/helper/test-pack-mtimes.c | 2 +-
19 files changed, 79 insertions(+), 45 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index fce0b06451c..461eb2a6e09 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -852,9 +852,10 @@ static void batch_each_object(struct batch_options *opt,
if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter,
batch_one_object_bitmapped, &payload)) {
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *pack;
- for (pack = get_all_packs(the_repository); pack; pack = pack->next) {
+ for (pack = packfile_store_get_packs(packs); pack; pack = pack->next) {
if (bitmap_index_contains_pack(bitmap, pack) ||
open_pack_index(pack))
continue;
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a61d3b46aac..cb03c580553 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -122,6 +122,7 @@ int cmd_count_objects(int argc,
count_loose, count_cruft, NULL, NULL);
if (verbose) {
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
unsigned long num_pack = 0;
off_t size_pack = 0;
@@ -129,7 +130,7 @@ int cmd_count_objects(int argc,
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local)
continue;
if (open_pack_index(p))
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index a26e79689d5..fea914cf9eb 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -952,6 +952,7 @@ static int store_object(
struct object_id *oidout,
uintmax_t mark)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
void *out, *delta;
struct object_entry *e;
unsigned char hdr[96];
@@ -975,7 +976,7 @@ static int store_object(
if (e->idx.offset) {
duplicate_count_by_type[type]++;
return 1;
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
+ } else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) {
e->type = type;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
@@ -1092,6 +1093,7 @@ static void truncate_pack(struct hashfile_checkpoint *checkpoint)
static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
size_t in_sz = 64 * 1024, out_sz = 64 * 1024;
unsigned char *in_buf = xmalloc(in_sz);
unsigned char *out_buf = xmalloc(out_sz);
@@ -1175,7 +1177,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
duplicate_count_by_type[OBJ_BLOB]++;
truncate_pack(&checkpoint);
- } else if (find_oid_pack(&oid, get_all_packs(the_repository))) {
+ } else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) {
e->type = OBJ_BLOB;
e->pack_id = MAX_PACK_ID;
e->idx.offset = 1; /* just not zero! */
diff --git a/builtin/fsck.c b/builtin/fsck.c
index d2eb9d4fbe9..cabf5401c9f 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -867,19 +867,20 @@ static int mark_packed_for_connectivity(const struct object_id *oid,
static int check_pack_rev_indexes(struct repository *r, int show_progress)
{
+ struct packfile_store *packs = r->objects->packfiles;
struct progress *progress = NULL;
uint32_t pack_count = 0;
int res = 0;
if (show_progress) {
- for (struct packed_git *p = get_all_packs(r); p; p = p->next)
+ for (struct packed_git *p = packfile_store_get_packs(packs); p; p = p->next)
pack_count++;
progress = start_delayed_progress(the_repository,
"Verifying reverse pack-indexes", pack_count);
pack_count = 0;
}
- for (struct packed_git *p = get_all_packs(r); p; p = p->next) {
+ for (struct packed_git *p = packfile_store_get_packs(packs); p; p = p->next) {
int load_error = load_pack_revindex_from_disk(p);
if (load_error < 0) {
@@ -999,6 +1000,8 @@ int cmd_fsck(int argc,
for_each_packed_object(the_repository,
mark_packed_for_connectivity, NULL, 0);
} else {
+ struct packfile_store *packs = the_repository->objects->packfiles;
+
odb_prepare_alternates(the_repository->objects);
for (source = the_repository->objects->sources; source; source = source->next)
fsck_source(source);
@@ -1009,7 +1012,7 @@ int cmd_fsck(int argc,
struct progress *progress = NULL;
if (show_progress) {
- for (p = get_all_packs(the_repository); p;
+ for (p = packfile_store_get_packs(packs); p;
p = p->next) {
if (open_pack_index(p))
continue;
@@ -1019,7 +1022,7 @@ int cmd_fsck(int argc,
progress = start_progress(the_repository,
_("Checking objects"), total);
}
- for (p = get_all_packs(the_repository); p;
+ for (p = packfile_store_get_packs(packs); p;
p = p->next) {
/* verify gives error messages itself */
if (verify_pack(the_repository,
diff --git a/builtin/gc.c b/builtin/gc.c
index b3eec213d25..c0a3b54ce74 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -487,9 +487,10 @@ static int too_many_loose_objects(struct gc_config *cfg)
static struct packed_git *find_base_packs(struct string_list *packs,
unsigned long limit)
{
+ struct packfile_store *packfiles = the_repository->objects->packfiles;
struct packed_git *p, *base = NULL;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packfiles); p; p = p->next) {
if (!p->pack_local || p->is_cruft)
continue;
if (limit) {
@@ -508,13 +509,14 @@ static struct packed_git *find_base_packs(struct string_list *packs,
static int too_many_packs(struct gc_config *cfg)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
int cnt;
if (cfg->gc_auto_pack_limit <= 0)
return 0;
- for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
+ for (cnt = 0, p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local)
continue;
if (p->pack_keep)
@@ -1423,7 +1425,7 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED)
if (incremental_repack_auto_limit < 0)
return 1;
- for (p = get_all_packs(the_repository);
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles);
count < incremental_repack_auto_limit && p;
p = p->next) {
if (!p->multi_pack_index)
@@ -1492,7 +1494,7 @@ static off_t get_auto_pack_size(void)
struct repository *r = the_repository;
odb_reprepare(r->objects);
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if (p->pack_size > max_size) {
second_largest_size = max_size;
max_size = p->pack_size;
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1494afcf3df..914c6e641d0 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3831,6 +3831,7 @@ static int pack_mtime_cmp(const void *_a, const void *_b)
static void read_packs_list_from_stdin(struct rev_info *revs)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct strbuf buf = STRBUF_INIT;
struct string_list include_packs = STRING_LIST_INIT_DUP;
struct string_list exclude_packs = STRING_LIST_INIT_DUP;
@@ -3855,7 +3856,7 @@ static void read_packs_list_from_stdin(struct rev_info *revs)
string_list_sort(&exclude_packs);
string_list_remove_duplicates(&exclude_packs, 0);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
const char *pack_name = pack_basename(p);
if ((item = string_list_lookup(&include_packs, pack_name)))
@@ -4076,6 +4077,7 @@ static void enumerate_cruft_objects(void)
static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
struct rev_info revs;
int ret;
@@ -4105,7 +4107,7 @@ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs
* Re-mark only the fresh packs as kept so that objects in
* unknown packs do not halt the reachability traversal early.
*/
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(packs); p; p = p->next)
p->pack_keep_in_core = 0;
mark_pack_kept_in_core(fresh_packs, 1);
@@ -4122,6 +4124,7 @@ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs
static void read_cruft_objects(void)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct strbuf buf = STRBUF_INIT;
struct string_list discard_packs = STRING_LIST_INIT_DUP;
struct string_list fresh_packs = STRING_LIST_INIT_DUP;
@@ -4142,7 +4145,7 @@ static void read_cruft_objects(void)
string_list_sort(&discard_packs);
string_list_sort(&fresh_packs);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
const char *pack_name = pack_basename(p);
struct string_list_item *item;
@@ -4390,11 +4393,12 @@ static void add_unreachable_loose_objects(struct rev_info *revs)
static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
static struct packed_git *last_found = (void *)1;
struct packed_git *p;
p = (last_found != (void *)1) ? last_found :
- get_all_packs(the_repository);
+ packfile_store_get_packs(packs);
while (p) {
if ((!p->pack_local || p->pack_keep ||
@@ -4404,7 +4408,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
return 1;
}
if (p == last_found)
- p = get_all_packs(the_repository);
+ p = packfile_store_get_packs(packs);
else
p = p->next;
if (p == last_found)
@@ -4436,12 +4440,13 @@ static int loosened_object_can_be_discarded(const struct object_id *oid,
static void loosen_unused_packed_objects(void)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
uint32_t i;
uint32_t loosened_objects_nr = 0;
struct object_id oid;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local || p->pack_keep || p->pack_keep_in_core)
continue;
@@ -4742,12 +4747,13 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
static void add_extra_kept_packs(const struct string_list *names)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
if (!names->nr)
return;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
const char *name = basename(p->pack_name);
int i;
@@ -5185,8 +5191,10 @@ int cmd_pack_objects(int argc,
add_extra_kept_packs(&keep_pack_list);
if (ignore_packed_keep_on_disk) {
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next)
+
+ for (p = packfile_store_get_packs(packs); p; p = p->next)
if (p->pack_local && p->pack_keep)
break;
if (!p) /* no keep-able packs found */
@@ -5198,8 +5206,10 @@ int cmd_pack_objects(int argc,
* want to unset "local" based on looking at packs, as
* it also covers non-local objects
*/
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_local) {
have_non_local_packs = 1;
break;
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index fe81c293e3a..721c4e04d5c 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -566,7 +566,8 @@ static struct pack_list * add_pack(struct packed_git *p)
static struct pack_list * add_pack_file(const char *filename)
{
- struct packed_git *p = get_all_packs(the_repository);
+ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p = packfile_store_get_packs(packs);
if (strlen(filename) < 40)
die("Bad pack filename: %s", filename);
@@ -581,7 +582,8 @@ static struct pack_list * add_pack_file(const char *filename)
static void load_all(void)
{
- struct packed_git *p = get_all_packs(the_repository);
+ struct packfile_store *packs = the_repository->objects->packfiles;
+ struct packed_git *p = packfile_store_get_packs(packs);
while (p) {
add_pack(p);
diff --git a/builtin/repack.c b/builtin/repack.c
index 5ff27fc8e29..2579090abbd 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -265,10 +265,11 @@ static void existing_packs_release(struct existing_packs *existing)
static void collect_pack_filenames(struct existing_packs *existing,
const struct string_list *extra_keep)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
int i;
const char *base;
@@ -497,10 +498,11 @@ static void init_pack_geometry(struct pack_geometry *geometry,
struct existing_packs *existing,
const struct pack_objects_args *args)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (args->local && !p->pack_local)
/*
* When asked to only repack local packfiles we skip
@@ -1137,11 +1139,12 @@ static int write_filtered_pack(const struct pack_objects_args *args,
static void combine_small_cruft_packs(FILE *in, size_t combine_cruft_below_size,
struct existing_packs *existing)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
struct strbuf buf = STRBUF_INIT;
size_t i;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!(p->is_cruft && p->pack_local))
continue;
diff --git a/connected.c b/connected.c
index d6e9682fd93..56fedb923d5 100644
--- a/connected.c
+++ b/connected.c
@@ -74,9 +74,10 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
*/
odb_reprepare(the_repository->objects);
do {
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *p;
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (!p->pack_promisor)
continue;
if (find_pack_entry_one(oid, p))
diff --git a/http-backend.c b/http-backend.c
index d5dfe762bb5..c95aa51b45c 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -603,18 +603,19 @@ static void get_head(struct strbuf *hdr, char *arg UNUSED)
static void get_info_packs(struct strbuf *hdr, char *arg UNUSED)
{
size_t objdirlen = strlen(repo_get_object_directory(the_repository));
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct strbuf buf = STRBUF_INIT;
struct packed_git *p;
size_t cnt = 0;
select_getanyfile(hdr);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (p->pack_local)
cnt++;
}
strbuf_grow(&buf, cnt * 53 + 2);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (p->pack_local)
strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
}
diff --git a/http.c b/http.c
index af2120b64c7..321648a81e6 100644
--- a/http.c
+++ b/http.c
@@ -2408,6 +2408,7 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url)
static int fetch_and_setup_pack_index(struct packed_git **packs_head,
unsigned char *sha1, const char *base_url)
{
+ struct packfile_store *packs = the_repository->objects->packfiles;
struct packed_git *new_pack, *p;
char *tmp_idx = NULL;
int ret;
@@ -2416,7 +2417,7 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
* If we already have the pack locally, no need to fetch its index or
* even add it to list; we already have all of its objects.
*/
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
if (hasheq(p->hash, sha1, the_repository->hash_algo))
return 0;
}
diff --git a/object-name.c b/object-name.c
index ecffd2d5b12..53356819a3d 100644
--- a/object-name.c
+++ b/object-name.c
@@ -213,7 +213,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
unique_in_midx(m, ds);
}
- for (p = get_all_packs(ds->repo); p && !ds->ambiguous;
+ for (p = packfile_store_get_packs(ds->repo->objects->packfiles); p && !ds->ambiguous;
p = p->next)
unique_in_pack(p, ds);
}
@@ -806,7 +806,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
find_abbrev_len_for_midx(m, mad);
}
- for (p = get_all_packs(mad->repo); p; p = p->next)
+ for (p = packfile_store_get_packs(mad->repo->objects->packfiles); p; p = p->next)
find_abbrev_len_for_pack(p, mad);
}
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 058bdb5d7de..834c300aab7 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -664,7 +664,7 @@ static int open_pack_bitmap(struct repository *r,
struct packed_git *p;
int ret = -1;
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if (open_pack_bitmap_1(bitmap_git, p) == 0) {
ret = 0;
/*
@@ -3362,7 +3362,7 @@ int verify_bitmap_files(struct repository *r)
free(midx_bitmap_name);
}
- for (struct packed_git *p = get_all_packs(r);
+ for (struct packed_git *p = packfile_store_get_packs(r->objects->packfiles);
p; p = p->next) {
char *pack_bitmap_name = pack_bitmap_filename(p);
res |= verify_bitmap_file(r->hash_algo, pack_bitmap_name);
diff --git a/pack-objects.c b/pack-objects.c
index a9d9855063a..668c1136673 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -86,6 +86,7 @@ struct object_entry *packlist_find(struct packing_data *pdata,
static void prepare_in_pack_by_idx(struct packing_data *pdata)
{
+ struct packfile_store *packs = pdata->repo->objects->packfiles;
struct packed_git **mapping, *p;
int cnt = 0, nr = 1U << OE_IN_PACK_BITS;
@@ -95,7 +96,7 @@ static void prepare_in_pack_by_idx(struct packing_data *pdata)
* (i.e. in_pack_idx also zero) should return NULL.
*/
mapping[cnt++] = NULL;
- for (p = get_all_packs(pdata->repo); p; p = p->next, cnt++) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next, cnt++) {
if (cnt == nr) {
free(mapping);
return;
diff --git a/packfile.c b/packfile.c
index faa796f2a3f..3e0d2a8f41e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,11 +1027,11 @@ void packfile_store_reprepare(struct packfile_store *store)
packfile_store_prepare(store);
}
-struct packed_git *get_all_packs(struct repository *r)
+struct packed_git *packfile_store_get_packs(struct packfile_store *store)
{
- packfile_store_prepare(r->objects->packfiles);
+ packfile_store_prepare(store);
- for (struct odb_source *source = r->objects->sources; source; source = source->next) {
+ for (struct odb_source *source = store->odb->sources; source; source = source->next) {
struct multi_pack_index *m = source->midx;
if (!m)
continue;
@@ -1039,7 +1039,7 @@ struct packed_git *get_all_packs(struct repository *r)
prepare_midx_pack(m, i);
}
- return r->objects->packfiles->packs;
+ return store->packs;
}
struct list_head *get_packed_git_mru(struct repository *r)
@@ -2099,7 +2099,7 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags)
* covers, one kept and one not kept, but the midx returns only
* the non-kept version.
*/
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(r->objects->packfiles); p; p = p->next) {
if ((p->pack_keep && (flags & ON_DISK_KEEP_PACKS)) ||
(p->pack_keep_in_core && (flags & IN_CORE_KEEP_PACKS))) {
ALLOC_GROW(packs, nr + 1, alloc);
@@ -2196,7 +2196,7 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
int r = 0;
int pack_errors = 0;
- for (p = get_all_packs(repo); p; p = p->next) {
+ for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next) {
if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
continue;
if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
diff --git a/packfile.h b/packfile.h
index 34c2132863a..86f2c07101f 100644
--- a/packfile.h
+++ b/packfile.h
@@ -136,6 +136,12 @@ void packfile_store_reprepare(struct packfile_store *store);
void packfile_store_add_pack(struct packfile_store *store,
struct packed_git *pack);
+/*
+ * Get all packs managed by the given store, including packfiles that are
+ * referenced by multi-pack indices.
+ */
+struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+
/*
* Open the packfile and add it to the store if it isn't yet known. Returns
* either the newly opened packfile or the preexisting packfile. Returns a
@@ -221,7 +227,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
extern void (*report_garbage)(unsigned seen_bits, const char *path);
struct list_head *get_packed_git_mru(struct repository *r);
-struct packed_git *get_all_packs(struct repository *r);
/*
* Give a rough count of objects in the repository. This sacrifices accuracy
diff --git a/server-info.c b/server-info.c
index 9bb30d9ab71..d949dea3094 100644
--- a/server-info.c
+++ b/server-info.c
@@ -287,12 +287,13 @@ static int compare_info(const void *a_, const void *b_)
static void init_pack_info(struct repository *r, const char *infofile, int force)
{
+ struct packfile_store *packs = r->objects->packfiles;
struct packed_git *p;
int stale;
int i;
size_t alloc = 0;
- for (p = get_all_packs(r); p; p = p->next) {
+ for (p = packfile_store_get_packs(packs); p; p = p->next) {
/* we ignore things on alternate path since they are
* not available to the pullers in general.
*/
diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c
index 611a13a3261..183a777fc54 100644
--- a/t/helper/test-find-pack.c
+++ b/t/helper/test-find-pack.c
@@ -39,7 +39,7 @@ int cmd__find_pack(int argc, const char **argv)
if (repo_get_oid(the_repository, argv[0], &oid))
die("cannot parse %s as an object name", argv[0]);
- for (p = get_all_packs(the_repository); p; p = p->next)
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next)
if (find_pack_entry_one(&oid, p)) {
printf("%s\n", p->pack_name);
actual_count++;
diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c
index d51aaa3dc40..cfdfae77a6c 100644
--- a/t/helper/test-pack-mtimes.c
+++ b/t/helper/test-pack-mtimes.c
@@ -37,7 +37,7 @@ int cmd__pack_mtimes(int argc, const char **argv)
if (argc != 2)
usage(pack_mtimes_usage);
- for (p = get_all_packs(the_repository); p; p = p->next) {
+ for (p = packfile_store_get_packs(the_repository->objects->packfiles); p; p = p->next) {
strbuf_addstr(&buf, basename(p->pack_name));
strbuf_strip_suffix(&buf, ".pack");
strbuf_addstr(&buf, ".mtimes");
--
2.51.0.384.g4c02a37b29.dirty
* [PATCH v3 15/15] packfile: refactor `get_packed_git_mru()` to work on packfile store
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (13 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 14/15] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
@ 2025-09-02 10:48 ` Patrick Steinhardt
2025-09-02 16:40 ` [PATCH v3 00/15] packfile: carve out a new " Junio C Hamano
15 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 10:48 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Taylor Blau, Junio C Hamano
The `get_packed_git_mru()` function prepares the packfile store and then
returns its packfiles in most-recently-used order. Refactor it to accept
a packfile store instead of a repository to clarify its scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/pack-objects.c | 4 ++--
packfile.c | 6 +++---
packfile.h | 7 +++++--
3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 914c6e641d..9558ab883e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1748,12 +1748,12 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
}
}
- list_for_each(pos, get_packed_git_mru(the_repository)) {
+ list_for_each(pos, packfile_store_get_packs_mru(the_repository->objects->packfiles)) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset, found_mtime);
if (!exclude && want > 0)
list_move(&p->mru,
- get_packed_git_mru(the_repository));
+ packfile_store_get_packs_mru(the_repository->objects->packfiles));
if (want != -1)
return want;
}
diff --git a/packfile.c b/packfile.c
index 3e0d2a8f41..91d11e0562 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1042,10 +1042,10 @@ struct packed_git *packfile_store_get_packs(struct packfile_store *store)
return store->packs;
}
-struct list_head *get_packed_git_mru(struct repository *r)
+struct list_head *packfile_store_get_packs_mru(struct packfile_store *store)
{
- packfile_store_prepare(r->objects->packfiles);
- return &r->objects->packfiles->mru;
+ packfile_store_prepare(store);
+ return &store->mru;
}
/*
diff --git a/packfile.h b/packfile.h
index 86f2c07101..e21ebd75d4 100644
--- a/packfile.h
+++ b/packfile.h
@@ -142,6 +142,11 @@ void packfile_store_add_pack(struct packfile_store *store,
*/
struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+/*
+ * Get all packs in most-recently-used order.
+ */
+struct list_head *packfile_store_get_packs_mru(struct packfile_store *store);
+
/*
* Open the packfile and add it to the store if it isn't yet known. Returns
* either the newly opened packfile or the preexisting packfile. Returns a
@@ -226,8 +231,6 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
#define PACKDIR_FILE_GARBAGE 4
extern void (*report_garbage)(unsigned seen_bits, const char *path);
-struct list_head *get_packed_git_mru(struct repository *r);
-
/*
* Give a rough count of objects in the repository. This sacrifices accuracy
* for speed.
--
2.51.0.384.g4c02a37b29.dirty
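For readers unfamiliar with the MRU walk touched by this patch, here is a minimal, self-contained sketch of the "find a pack, then move it to the front" pattern. It does not use git's headers: the `list_head` helpers below are simplified re-implementations written for illustration, and `struct pack`, `pack_of()` and `find_and_promote()` are hypothetical names, not part of the series.

```c
#include <stddef.h>

/* Simplified circular doubly-linked list, modelled loosely on list.h. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Insert `entry` directly after `head`, i.e. at the front of the list. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink `entry` and re-insert it at the front. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del(entry);
	list_add(entry, head);
}

/* Hypothetical stand-in for `struct packed_git`. */
struct pack {
	int id;
	struct list_head mru;
};

/* Recover the containing `struct pack` from its embedded list node. */
#define pack_of(ptr) \
	((struct pack *)((char *)(ptr) - offsetof(struct pack, mru)))

/*
 * Walk the list in most-recently-used order. On a hit, promote the
 * pack to the front so that the next lookup tries it first.
 */
static struct pack *find_and_promote(struct list_head *mru, int id)
{
	struct list_head *pos;

	for (pos = mru->next; pos != mru; pos = pos->next) {
		struct pack *p = pack_of(pos);

		if (p->id == id) {
			list_move(&p->mru, mru);
			return p;
		}
	}
	return NULL;
}
```

The sketch returns immediately after `list_move()`, so the iteration never continues from a node that has just been relinked.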
* Re: [PATCH v3 00/15] packfile: carve out a new packfile store
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
` (14 preceding siblings ...)
2025-09-02 10:48 ` [PATCH v3 15/15] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
@ 2025-09-02 16:40 ` Junio C Hamano
15 siblings, 0 replies; 102+ messages in thread
From: Junio C Hamano @ 2025-09-02 16:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King, Taylor Blau
Patrick Steinhardt <ps@pks.im> writes:
> - Rebased on top of master at 6ad8021821 (The fifth batch, 2025-08-29)
> with ps/object-store-midx-dedup-info at 13296ac909 (midx: compute
> paths via their source, 2025-08-11) merged into it. This fixes
> various conflicts with "seen". There's still two conflicts: a
> trivial one with jt/de-global-bulk-checkin. And a more complex one
> with tb/prepare-midx-pack-cleanup. I don't think it's necessary to
> really address the first one, but I'm unsure how to proceed with the
> second one given that the patch series still seems to be cooking.
I think the second topic is not really cooking, but is expecting a
reroll, so I'd say it is perfectly fine to drop it and expect it to
come back, if it is still relevant, in future, in a shape that is
friendlier to other topics in 'seen' when it happens.
* Re: [PATCH v2 02/16] odb: move list of packfiles into `struct packfile_store`
2025-09-02 8:50 ` Patrick Steinhardt
@ 2025-09-02 17:21 ` Taylor Blau
2025-09-02 17:42 ` Junio C Hamano
0 siblings, 1 reply; 102+ messages in thread
From: Taylor Blau @ 2025-09-02 17:21 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Jeff King
On Tue, Sep 02, 2025 at 10:50:41AM +0200, Patrick Steinhardt wrote:
> > > +void packfile_store_close(struct packfile_store *store)
> > > +{
> > > + struct packed_git *p;
> > > +
> > > + for (p = store->packs; p; p = p->next)
> > > + if (p->do_not_close)
> > > + BUG("want to close pack marked 'do-not-close'");
> > > + else
> > > + close_pack(p);
> > > +}
> >
> > And likewise this looks good to me. I do find the braceless for-loop a
> > little hard to read, but it's (a) correct, and (b) consistent with the
> > original implementation, so I don't feel strongly about changing it.
>
> Agreed, it is a bit awkward. I feel like our coding style should be
> amended to say that we only do braceless bodies in case the body is a
> single statement.
I think that our CodingGuidelines cover this as of 1797dc5176
(CodingGuidelines: clarify multi-line brace style, 2017-01-17), which
frowns upon statements that extend multiple lines.
So I think in this case, the CodingGuidelines would suggest that we
write this as:
for (p = store->packs; p; p = p->next) {
if (p->do_not_close)
BUG("want to close pack marked 'do-not-close'");
else
close_pack(p);
}
, which from our discussion here seems like something that we both find
more readable than the original.
Thanks,
Taylor
* Re: [PATCH v2 02/16] odb: move list of packfiles into `struct packfile_store`
2025-09-02 17:21 ` Taylor Blau
@ 2025-09-02 17:42 ` Junio C Hamano
2025-09-03 5:58 ` Patrick Steinhardt
0 siblings, 1 reply; 102+ messages in thread
From: Junio C Hamano @ 2025-09-02 17:42 UTC (permalink / raw)
To: Taylor Blau; +Cc: Patrick Steinhardt, git, Karthik Nayak, Jeff King
Taylor Blau <me@ttaylorr.com> writes:
> So I think in this case, the CodingGuidelines would suggest that we
> write this as:
>
> for (p = store->packs; p; p = p->next) {
> if (p->do_not_close)
> BUG("want to close pack marked 'do-not-close'");
> else
> close_pack(p);
> }
>
> , which from our discussion here seems like something that we both find
> more readable than the original.
Yes. Technically the "if...else..." is still a single statement, so
a rule like "do not use {} only if you would place a single
statement in it" would not require braces here, though.
I would actually write it more like this, though.
for (p = store->packs; p; p = p->next) {
if (p->do_not_close)
BUG("want to close pack marked 'do-not-close'");
close_pack(p);
}
The first two lines in that block are a glorified assert(), and
without a programming bug, what the loop wants to do is only to call
close_pack() on each and every pack on the list. Not using "else"
conveys that much more clearly.
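As a rough, self-contained illustration of the loop shape suggested above (not code from the patch), the sketch below uses stand-ins: `struct pack`, the simplified `BUG()` macro, the stub `close_pack()` and the `close_all()` wrapper are all assumptions made so that the example compiles outside of git.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for `struct packed_git`; only what the loop needs. */
struct pack {
	struct pack *next;
	unsigned do_not_close:1;
	int closed;
};

/* Simplified stand-in for git's BUG(): report the problem and abort. */
#define BUG(msg) \
	do { \
		fprintf(stderr, "BUG: %s\n", msg); \
		abort(); \
	} while (0)

/* Stub for close_pack(); it only records that the pack was closed. */
static void close_pack(struct pack *p)
{
	p->closed = 1;
}

/*
 * Close every pack on the list and return how many were closed.
 * The guard acts like an assertion, so the normal path (closing the
 * pack) is not nested under an "else".
 */
static int close_all(struct pack *head)
{
	struct pack *p;
	int nr = 0;

	for (p = head; p; p = p->next) {
		if (p->do_not_close)
			BUG("want to close pack marked 'do-not-close'");

		close_pack(p);
		nr++;
	}
	return nr;
}
```

Because the guard never returns when it fires, dropping the "else" does not change behavior; it only makes the common path easier to spot.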
* Re: [PATCH v2 02/16] odb: move list of packfiles into `struct packfile_store`
2025-09-02 17:42 ` Junio C Hamano
@ 2025-09-03 5:58 ` Patrick Steinhardt
0 siblings, 0 replies; 102+ messages in thread
From: Patrick Steinhardt @ 2025-09-03 5:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Taylor Blau, git, Karthik Nayak, Jeff King
On Tue, Sep 02, 2025 at 10:42:25AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > So I think in this case, the CodingGuidelines would suggest that we
> > write this as:
> >
> > for (p = store->packs; p; p = p->next) {
> > if (p->do_not_close)
> > BUG("want to close pack marked 'do-not-close'");
> > else
> > close_pack(p);
> > }
> >
> > , which from our discussion here seems like something that we both find
> > more readable than the original.
>
> Yes. Technically the "if...else..." is still a single statement, so
> a rule like "do not use {} only if you would place a single
> statement in it" would not require braces here, though.
>
> I would actually write it more like this, though.
>
> for (p = store->packs; p; p = p->next) {
> if (p->do_not_close)
> BUG("want to close pack marked 'do-not-close'");
>
> close_pack(p);
> }
>
> The first two lines in that block are a glorified assert(), and
> without a programming bug, what the loop wants to do is only to call
> close_pack() on each and every pack on the list. Not using "else"
> conveys that much more clearly.
That reads even better, agreed. Will queue this change locally and send
it out with the next revision of this patch series.
Patrick
Thread overview: 102+ messages
2025-08-19 8:19 [PATCH 00/16] packfile: carve out a new packfile store Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-08-19 9:47 ` Karthik Nayak
2025-08-20 4:58 ` Patrick Steinhardt
2025-08-19 17:32 ` Junio C Hamano
2025-08-20 4:58 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 02/16] odb: move list of packfiles into " Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 03/16] odb: move initialization bit " Patrick Steinhardt
2025-08-19 9:57 ` Karthik Nayak
2025-08-19 16:24 ` Junio C Hamano
2025-08-20 8:04 ` Karthik Nayak
2025-08-22 23:50 ` Junio C Hamano
2025-08-26 12:19 ` [PATCH] Documentation: note styling for bit fields Karthik Nayak
2025-08-20 4:58 ` [PATCH 03/16] odb: move initialization bit into `struct packfile_store` Patrick Steinhardt
2025-08-20 6:24 ` Junio C Hamano
2025-08-19 8:19 ` [PATCH 04/16] odb: move packfile map " Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
2025-08-20 12:44 ` Karthik Nayak
2025-08-20 19:20 ` Jeff King
2025-08-21 6:40 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 06/16] odb: move kept cache " Patrick Steinhardt
2025-08-19 18:56 ` Junio C Hamano
2025-08-20 4:58 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
2025-08-19 19:18 ` Junio C Hamano
2025-08-19 8:19 ` [PATCH 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
2025-08-20 13:17 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
2025-08-20 13:35 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
2025-08-20 13:41 ` Karthik Nayak
2025-08-21 6:40 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
2025-08-20 13:50 ` Karthik Nayak
2025-08-21 6:40 ` Patrick Steinhardt
2025-08-20 13:51 ` Karthik Nayak
2025-08-19 8:19 ` [PATCH 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
2025-08-20 13:53 ` Karthik Nayak
2025-08-21 6:40 ` Patrick Steinhardt
2025-08-19 8:19 ` [PATCH 16/16] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
2025-08-19 17:13 ` [PATCH 00/16] packfile: carve out a new " Junio C Hamano
2025-08-20 13:55 ` Karthik Nayak
2025-08-21 7:38 ` [PATCH v2 " Patrick Steinhardt
2025-08-21 7:38 ` [PATCH v2 01/16] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 02/16] odb: move list of packfiles into " Patrick Steinhardt
2025-08-25 23:42 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
2025-09-02 17:21 ` Taylor Blau
2025-09-02 17:42 ` Junio C Hamano
2025-09-03 5:58 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 03/16] odb: move initialization bit " Patrick Steinhardt
2025-08-26 1:40 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 04/16] odb: move packfile map " Patrick Steinhardt
2025-08-26 1:41 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 05/16] odb: move MRU list of packfiles " Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 06/16] odb: move kept cache " Patrick Steinhardt
2025-08-26 1:46 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 07/16] packfile: reorder functions to avoid function declaration Patrick Steinhardt
2025-08-26 1:47 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 08/16] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
2025-08-26 1:58 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 09/16] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
2025-08-26 2:10 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 10/16] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
2025-08-26 2:11 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 11/16] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
2025-08-27 1:04 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 12/16] packfile: introduce function to load and add packfiles Patrick Steinhardt
2025-08-27 1:12 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 13/16] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
2025-08-27 1:20 ` Taylor Blau
2025-08-21 7:39 ` [PATCH v2 14/16] packfile: remove `get_packed_git()` Patrick Steinhardt
2025-08-27 1:38 ` Taylor Blau
2025-09-02 8:50 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 15/16] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
2025-08-27 1:45 ` Taylor Blau
2025-09-02 8:51 ` Patrick Steinhardt
2025-08-21 7:39 ` [PATCH v2 16/16] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 00/15] packfile: carve out a new " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 01/15] packfile: introduce a new `struct packfile_store` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 02/15] odb: move list of packfiles into " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 03/15] odb: move initialization bit " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 04/15] odb: move packfile map " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 05/15] odb: move MRU list of packfiles " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 06/15] odb: move kept cache " Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 07/15] packfile: reorder functions to avoid function declaration Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 08/15] packfile: refactor `prepare_packed_git()` to work on packfile store Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 09/15] packfile: split up responsibilities of `reprepare_packed_git()` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 10/15] packfile: refactor `install_packed_git()` to work on packfile store Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 11/15] packfile: introduce function to load and add packfiles Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 12/15] packfile: move `get_multi_pack_index()` into "midx.c" Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 13/15] packfile: remove `get_packed_git()` Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 14/15] packfile: refactor `get_all_packs()` to work on packfile store Patrick Steinhardt
2025-09-02 10:48 ` [PATCH v3 15/15] packfile: refactor `get_packed_git_mru()` " Patrick Steinhardt
2025-09-02 16:40 ` [PATCH v3 00/15] packfile: carve out a new " Junio C Hamano