* [PATCH 00/13] object-store: a handful of cleanups
@ 2025-04-23 7:48 Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
` (15 more replies)
0 siblings, 16 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
Hi,
this patch series contains a handful of cleanups to the object store
subsystem:
- A couple of definitions are moved out of "object-store.h" as they
belong to other subsystems.
- Some functions are dropped and/or renamed.
- The biggest part is the removal of `repo_has_object_file()`. This
function and its `_with_flags()` variant are marked as deprecated,
with the replacement being `has_object()`. The benefit of that
function is that it doesn't reload packfiles and doesn't fetch
promisor objects by default so that it becomes more explicit when
one really wants to do so.
These cleanups are in preparation for getting rid of `the_repository` in
"object-store.c".
The patch series is built on top of 4bbb303af69 (The seventh batch,
2025-04-17) with ps/object-file-cleanup at 68cd492a3e6 (object-store:
merge "object-store-ll.h" and "object-store.h", 2025-04-15) merged into
it.
Thanks!
Patrick
---
Patrick Steinhardt (13):
object-store: move `struct packed_git` into "packfile.h"
object-store: drop `loose_object_path()`
object-store: move and rename `odb_pack_keep()`
object-store: move function declarations to their respective subsystems
object-store: allow fetching objects via `has_object()`
treewide: trivial conversions of `repo_has_object_file()`
builtin/index-pack: don't fetch promised objects for collision check
builtin/show-ref: don't fetch objects when printing refs
refs: don't fetch promisor objects in `ref_resolves_to_object()`
http-walker: don't fetch objects via promisor remotes
list-objects: clarify how promised blobs are excluded
bulk-checkin: don't fetch promised objects on write
object-store: drop `repo_has_object_file()`
builtin/cat-file.c | 3 +-
builtin/clone.c | 4 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 3 +-
builtin/fetch.c | 15 ++--
builtin/gc.c | 2 +-
builtin/index-pack.c | 6 +-
builtin/receive-pack.c | 4 +-
builtin/remote.c | 3 +-
builtin/show-ref.c | 2 +-
builtin/unpack-objects.c | 3 +-
bulk-checkin.c | 2 +-
cache-tree.c | 13 +++-
fetch-pack.c | 7 +-
http-push.c | 11 ++-
http-walker.c | 7 +-
http.c | 4 +-
list-objects.c | 3 +-
notes.c | 3 +-
object-file.c | 4 +-
object-file.h | 77 +++++++++++++++++++
object-name.c | 2 +-
object-store.c | 44 ++---------
object-store.h | 191 +++--------------------------------------------
pack-objects.h | 1 +
packfile.h | 78 ++++++++++++++++++-
path.c | 14 ++++
path.h | 7 ++
prune-packed.c | 2 +-
reachable.c | 2 +-
reflog.c | 3 +-
refs.c | 2 +-
remote.c | 2 +-
send-pack.c | 5 +-
shallow.c | 9 ++-
upload-pack.c | 3 +-
walker.c | 3 +-
37 files changed, 265 insertions(+), 281 deletions(-)
---
base-commit: ca819c0751cedd1713334882e4c83687f8478a54
change-id: 20250422-pks-object-store-cleanups-5a6077014155
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH 01/13] object-store: move `struct packed_git` into "packfile.h"
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 02/13] object-store: drop `loose_object_path()` Patrick Steinhardt
` (14 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
The "object-store.h" header contains the definition of `struct
packed_git`. As this structure hosts all kind of information about a
specific packfile it is arguably a bit out of place in a generic place
like "object-store.h".
Move the structure as well as `pack_map_entry_cmp()` into "packfile.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.h | 59 +---------------------------------------------------------
pack-objects.h | 1 +
packfile.h | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 60 insertions(+), 59 deletions(-)
diff --git a/object-store.h b/object-store.h
index 46961dc9542..e04469a85fb 100644
--- a/object-store.h
+++ b/object-store.h
@@ -92,65 +92,8 @@ struct oidtree *odb_loose_cache(struct object_directory *odb,
/* Empty the loose object cache for the specified object directory. */
void odb_clear_loose_cache(struct object_directory *odb);
-struct packed_git {
- struct hashmap_entry packmap_ent;
- struct packed_git *next;
- struct list_head mru;
- struct pack_window *windows;
- off_t pack_size;
- const void *index_data;
- size_t index_size;
- uint32_t num_objects;
- size_t crc_offset;
- struct oidset bad_objects;
- int index_version;
- time_t mtime;
- int pack_fd;
- int index; /* for builtin/pack-objects.c */
- unsigned pack_local:1,
- pack_keep:1,
- pack_keep_in_core:1,
- freshened:1,
- do_not_close:1,
- pack_promisor:1,
- multi_pack_index:1,
- is_cruft:1;
- unsigned char hash[GIT_MAX_RAWSZ];
- struct revindex_entry *revindex;
- const uint32_t *revindex_data;
- const uint32_t *revindex_map;
- size_t revindex_size;
- /*
- * mtimes_map points at the beginning of the memory mapped region of
- * this pack's corresponding .mtimes file, and mtimes_size is the size
- * of that .mtimes file
- */
- const uint32_t *mtimes_map;
- size_t mtimes_size;
-
- /* repo denotes the repository this packfile belongs to */
- struct repository *repo;
-
- /* something like ".git/objects/pack/xxxxx.pack" */
- char pack_name[FLEX_ARRAY]; /* more */
-};
-
+struct packed_git;
struct multi_pack_index;
-
-static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *entry,
- const struct hashmap_entry *entry2,
- const void *keydata)
-{
- const char *key = keydata;
- const struct packed_git *pg1, *pg2;
-
- pg1 = container_of(entry, const struct packed_git, packmap_ent);
- pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
- return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
struct cached_object_entry;
struct raw_object_store {
diff --git a/pack-objects.h b/pack-objects.h
index d1c4ae7f9b6..475a2d67ce3 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -4,6 +4,7 @@
#include "object-store.h"
#include "thread-utils.h"
#include "pack.h"
+#include "packfile.h"
struct repository;
diff --git a/packfile.h b/packfile.h
index 25097213d06..05499382397 100644
--- a/packfile.h
+++ b/packfile.h
@@ -1,13 +1,70 @@
#ifndef PACKFILE_H
#define PACKFILE_H
+#include "list.h"
#include "object.h"
#include "oidset.h"
/* in object-store.h */
-struct packed_git;
struct object_info;
+struct packed_git {
+ struct hashmap_entry packmap_ent;
+ struct packed_git *next;
+ struct list_head mru;
+ struct pack_window *windows;
+ off_t pack_size;
+ const void *index_data;
+ size_t index_size;
+ uint32_t num_objects;
+ size_t crc_offset;
+ struct oidset bad_objects;
+ int index_version;
+ time_t mtime;
+ int pack_fd;
+ int index; /* for builtin/pack-objects.c */
+ unsigned pack_local:1,
+ pack_keep:1,
+ pack_keep_in_core:1,
+ freshened:1,
+ do_not_close:1,
+ pack_promisor:1,
+ multi_pack_index:1,
+ is_cruft:1;
+ unsigned char hash[GIT_MAX_RAWSZ];
+ struct revindex_entry *revindex;
+ const uint32_t *revindex_data;
+ const uint32_t *revindex_map;
+ size_t revindex_size;
+ /*
+ * mtimes_map points at the beginning of the memory mapped region of
+ * this pack's corresponding .mtimes file, and mtimes_size is the size
+ * of that .mtimes file
+ */
+ const uint32_t *mtimes_map;
+ size_t mtimes_size;
+
+ /* repo denotes the repository this packfile belongs to */
+ struct repository *repo;
+
+ /* something like ".git/objects/pack/xxxxx.pack" */
+ char pack_name[FLEX_ARRAY]; /* more */
+};
+
+static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
+ const struct hashmap_entry *entry,
+ const struct hashmap_entry *entry2,
+ const void *keydata)
+{
+ const char *key = keydata;
+ const struct packed_git *pg1, *pg2;
+
+ pg1 = container_of(entry, const struct packed_git, packmap_ent);
+ pg2 = container_of(entry2, const struct packed_git, packmap_ent);
+
+ return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
+}
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 02/13] object-store: drop `loose_object_path()`
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
` (13 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
The function `loose_object_path()` is a trivial wrapper around
`odb_loose_path()`, with the only exception that it always uses the
primary object database of the given repository. This doesn't really add
a ton of value though, so let's drop the function and inline it at every
callsite.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
http-walker.c | 3 ++-
http.c | 4 ++--
object-file.c | 4 ++--
object-file.h | 4 ++++
object-store.c | 6 ------
object-store.h | 7 -------
6 files changed, 10 insertions(+), 18 deletions(-)
diff --git a/http-walker.c b/http-walker.c
index 882cae19c24..95458e2f638 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -9,6 +9,7 @@
#include "list.h"
#include "transport.h"
#include "packfile.h"
+#include "object-file.h"
#include "object-store.h"
struct alt_base {
@@ -540,7 +541,7 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
- loose_object_path(the_repository, &buf, &req->oid);
+ odb_loose_path(the_repository->objects->odb, &buf, &req->oid);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release(&buf);
}
diff --git a/http.c b/http.c
index 0c411380425..3c029cf8947 100644
--- a/http.c
+++ b/http.c
@@ -2662,7 +2662,7 @@ struct http_object_request *new_http_object_request(const char *base_url,
oidcpy(&freq->oid, oid);
freq->localfile = -1;
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
strbuf_addf(&freq->tmpfile, "%s.temp", filename.buf);
strbuf_addf(&prevfile, "%s.prev", filename.buf);
@@ -2814,7 +2814,7 @@ int finish_http_object_request(struct http_object_request *freq)
unlink_or_warn(freq->tmpfile.buf);
return -1;
}
- loose_object_path(the_repository, &filename, &freq->oid);
+ odb_loose_path(the_repository->objects->odb, &filename, &freq->oid);
freq->rename = finalize_object_file(freq->tmpfile.buf, filename.buf);
strbuf_release(&filename);
diff --git a/object-file.c b/object-file.c
index 9cc3a24a40d..dc56a4766df 100644
--- a/object-file.c
+++ b/object-file.c
@@ -932,7 +932,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
prepare_loose_object_bulk_checkin();
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
fd = start_loose_object_common(&tmp_file, filename.buf, flags,
&stream, compressed, sizeof(compressed),
@@ -1079,7 +1079,7 @@ int stream_loose_object(struct input_stream *in_stream, size_t len,
goto cleanup;
}
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
/* We finally know the object path, and create the missing dir. */
dirlen = directory_size(filename.buf);
diff --git a/object-file.h b/object-file.h
index c002fbe2345..0a7b6b9f9d9 100644
--- a/object-file.h
+++ b/object-file.h
@@ -25,6 +25,10 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
struct object_directory;
+/*
+ * Put in `buf` the name of the file in the local object database that
+ * would be used to store a loose object with the specified oid.
+ */
const char *odb_loose_path(struct object_directory *odb,
struct strbuf *buf,
const struct object_id *oid);
diff --git a/object-store.c b/object-store.c
index 6ab50d25d3e..e5cfb8c0079 100644
--- a/object-store.c
+++ b/object-store.c
@@ -96,12 +96,6 @@ int odb_pack_keep(const char *name)
return open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
}
-const char *loose_object_path(struct repository *r, struct strbuf *buf,
- const struct object_id *oid)
-{
- return odb_loose_path(r->objects->odb, buf, oid);
-}
-
/*
* Return non-zero iff the path is usable as an alternate object database.
*/
diff --git a/object-store.h b/object-store.h
index e04469a85fb..5668de62d01 100644
--- a/object-store.h
+++ b/object-store.h
@@ -196,13 +196,6 @@ int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
*/
int odb_pack_keep(const char *name);
-/*
- * Put in `buf` the name of the file in the local object database that
- * would be used to store a loose object with the specified oid.
- */
-const char *loose_object_path(struct repository *r, struct strbuf *buf,
- const struct object_id *oid);
-
void *map_loose_object(struct repository *r, const struct object_id *oid,
unsigned long *size);
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 03/13] object-store: move and rename `odb_pack_keep()`
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 02/13] object-store: drop `loose_object_path()` Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 10:03 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 04/13] object-store: move function declarations to their respective subsystems Patrick Steinhardt
` (12 subsequent siblings)
15 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
The function `odb_pack_keep()` creates a file at the passed-in path. If
this fails, then the function re-tries by first creating any potentially
missing leading directoriesk and then trying to create the file once
again. As such, this function doesn't host any kind of logic that is
specific to the object store, but is rather a generic helper function.
Rename the function to `safe_create_file_with_leading_directories()` and
move it into "path.c". While at it, refactor it so that it loses its
dependency on `the_repository`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 3 ++-
builtin/index-pack.c | 2 +-
object-store.c | 13 -------------
object-store.h | 7 -------
path.c | 14 ++++++++++++++
path.h | 7 +++++++
6 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index c1e198f4e34..b2839c5f439 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -811,7 +811,8 @@ static char *keep_pack(const char *curr_index_name)
int keep_fd;
odb_pack_name(pack_data->repo, &name, pack_data->hash, "keep");
- keep_fd = odb_pack_keep(name.buf);
+ keep_fd = safe_create_file_with_leading_directories(pack_data->repo,
+ name.buf);
if (keep_fd < 0)
die_errno("cannot create keep file");
write_or_die(keep_fd, keep_msg, strlen(keep_msg));
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 60a8ee05dbc..f49431d626b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,7 +1565,7 @@ static void write_special_file(const char *suffix, const char *msg,
else
filename = odb_pack_name(the_repository, &name_buf, hash, suffix);
- fd = odb_pack_keep(filename);
+ fd = safe_create_file_with_leading_directories(the_repository, filename);
if (fd < 0) {
if (errno != EEXIST)
die_errno(_("cannot write %s file '%s'"),
diff --git a/object-store.c b/object-store.c
index e5cfb8c0079..0cbad5a19a0 100644
--- a/object-store.c
+++ b/object-store.c
@@ -83,19 +83,6 @@ int odb_mkstemp(struct strbuf *temp_filename, const char *pattern)
return xmkstemp_mode(temp_filename->buf, mode);
}
-int odb_pack_keep(const char *name)
-{
- int fd;
-
- fd = open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
- if (0 <= fd)
- return fd;
-
- /* slow path */
- safe_create_leading_directories_const(the_repository, name);
- return open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
-}
-
/*
* Return non-zero iff the path is usable as an alternate object database.
*/
diff --git a/object-store.h b/object-store.h
index 5668de62d01..aa8fc63043e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -189,13 +189,6 @@ void raw_object_store_clear(struct raw_object_store *o);
*/
int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
-/*
- * Create a pack .keep file named "name" (which should generally be the output
- * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on
- * error.
- */
-int odb_pack_keep(const char *name);
-
void *map_loose_object(struct repository *r, const struct object_id *oid,
unsigned long *size);
diff --git a/path.c b/path.c
index 4505bb78e8b..3b598b2847f 100644
--- a/path.c
+++ b/path.c
@@ -1011,6 +1011,20 @@ enum scld_error safe_create_leading_directories_const(struct repository *repo,
return result;
}
+int safe_create_file_with_leading_directories(struct repository *repo,
+ const char *path)
+{
+ int fd;
+
+ fd = open(path, O_RDWR|O_CREAT|O_EXCL, 0600);
+ if (0 <= fd)
+ return fd;
+
+ /* slow path */
+ safe_create_leading_directories_const(repo, path);
+ return open(path, O_RDWR|O_CREAT|O_EXCL, 0600);
+}
+
static int have_same_root(const char *path1, const char *path2)
{
int is_abs1, is_abs2;
diff --git a/path.h b/path.h
index fd1a194b060..e67348f2539 100644
--- a/path.h
+++ b/path.h
@@ -266,6 +266,13 @@ enum scld_error safe_create_leading_directories_const(struct repository *repo,
const char *path);
enum scld_error safe_create_leading_directories_no_share(char *path);
+/*
+ * Create a file, potentially creating its leading directories in case they
+ * don't exist. Returns the return value of the open(3p) call.
+ */
+int safe_create_file_with_leading_directories(struct repository *repo,
+ const char *path);
+
# ifdef USE_THE_REPOSITORY_VARIABLE
# include "strbuf.h"
# include "repository.h"
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 04/13] object-store: move function declarations to their respective subsystems
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (2 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
` (11 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
We carry declarations for a couple of functions in "object-store.h" that
are not defined in "object-store.c", but in a different subsystem. Move
these declarations to the respective headers whose matching code files
carry the corresponding definition.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/count-objects.c | 2 +-
builtin/gc.c | 2 +-
object-file.h | 73 +++++++++++++++++++++++++++++++++++++++
object-name.c | 2 +-
object-store.h | 91 +------------------------------------------------
packfile.h | 19 +++++++++++
prune-packed.c | 2 +-
reachable.c | 2 +-
8 files changed, 98 insertions(+), 95 deletions(-)
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 0bb5360b2f2..a88c0c9c09a 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -12,7 +12,7 @@
#include "parse-options.h"
#include "quote.h"
#include "packfile.h"
-#include "object-store.h"
+#include "object-file.h"
static unsigned long garbage;
static off_t size_garbage;
diff --git a/builtin/gc.c b/builtin/gc.c
index b5ce1d32766..4d428f3253d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -28,7 +28,7 @@
#include "commit.h"
#include "commit-graph.h"
#include "packfile.h"
-#include "object-store.h"
+#include "object-file.h"
#include "pack.h"
#include "pack-objects.h"
#include "path.h"
diff --git a/object-file.h b/object-file.h
index 0a7b6b9f9d9..de6dd205ed8 100644
--- a/object-file.h
+++ b/object-file.h
@@ -3,6 +3,7 @@
#include "git-zlib.h"
#include "object.h"
+#include "object-store.h"
struct index_state;
@@ -25,6 +26,16 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
struct object_directory;
+/*
+ * Populate and return the loose object cache array corresponding to the
+ * given object ID.
+ */
+struct oidtree *odb_loose_cache(struct object_directory *odb,
+ const struct object_id *oid);
+
+/* Empty the loose object cache for the specified object directory. */
+void odb_clear_loose_cache(struct object_directory *odb);
+
/*
* Put in `buf` the name of the file in the local object database that
* would be used to store a loose object with the specified oid.
@@ -42,6 +53,68 @@ int has_loose_object_nonlocal(const struct object_id *);
int has_loose_object(const struct object_id *);
+void *map_loose_object(struct repository *r, const struct object_id *oid,
+ unsigned long *size);
+
+/*
+ * Iterate over the files in the loose-object parts of the object
+ * directory "path", triggering the following callbacks:
+ *
+ * - loose_object is called for each loose object we find.
+ *
+ * - loose_cruft is called for any files that do not appear to be
+ * loose objects. Note that we only look in the loose object
+ * directories "objects/[0-9a-f]{2}/", so we will not report
+ * "objects/foobar" as cruft.
+ *
+ * - loose_subdir is called for each top-level hashed subdirectory
+ * of the object directory (e.g., "$OBJDIR/f0"). It is called
+ * after the objects in the directory are processed.
+ *
+ * Any callback that is NULL will be ignored. Callbacks returning non-zero
+ * will end the iteration.
+ *
+ * In the "buf" variant, "path" is a strbuf which will also be used as a
+ * scratch buffer, but restored to its original contents before
+ * the function returns.
+ */
+typedef int each_loose_object_fn(const struct object_id *oid,
+ const char *path,
+ void *data);
+typedef int each_loose_cruft_fn(const char *basename,
+ const char *path,
+ void *data);
+typedef int each_loose_subdir_fn(unsigned int nr,
+ const char *path,
+ void *data);
+int for_each_file_in_obj_subdir(unsigned int subdir_nr,
+ struct strbuf *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+int for_each_loose_file_in_objdir(const char *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+int for_each_loose_file_in_objdir_buf(struct strbuf *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+
+/*
+ * Iterate over all accessible loose objects without respect to
+ * reachability. By default, this includes both local and alternate objects.
+ * The order in which objects are visited is unspecified.
+ *
+ * Any flags specific to packs are ignored.
+ */
+int for_each_loose_object(each_loose_object_fn, void *,
+ enum for_each_object_flags flags);
+
+
/**
* format_object_header() is a thin wrapper around s xsnprintf() that
* writes the initial "<type> <obj-len>" part of the loose object
diff --git a/object-name.c b/object-name.c
index 2c751a5352a..9288b2dd245 100644
--- a/object-name.c
+++ b/object-name.c
@@ -19,7 +19,7 @@
#include "oidtree.h"
#include "packfile.h"
#include "pretty.h"
-#include "object-store.h"
+#include "object-file.h"
#include "read-cache-ll.h"
#include "repo-settings.h"
#include "repository.h"
diff --git a/object-store.h b/object-store.h
index aa8fc63043e..5bbdaba92d1 100644
--- a/object-store.h
+++ b/object-store.h
@@ -82,16 +82,6 @@ struct object_directory *set_temporary_primary_odb(const char *dir, int will_des
*/
void restore_primary_odb(struct object_directory *restore_odb, const char *old_path);
-/*
- * Populate and return the loose object cache array corresponding to the
- * given object ID.
- */
-struct oidtree *odb_loose_cache(struct object_directory *odb,
- const struct object_id *oid);
-
-/* Empty the loose object cache for the specified object directory. */
-void odb_clear_loose_cache(struct object_directory *odb);
-
struct packed_git;
struct multi_pack_index;
struct cached_object_entry;
@@ -189,9 +179,6 @@ void raw_object_store_clear(struct raw_object_store *o);
*/
int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
-void *map_loose_object(struct repository *r, const struct object_id *oid,
- unsigned long *size);
-
void *repo_read_object_file(struct repository *r,
const struct object_id *oid,
enum object_type *type,
@@ -340,56 +327,7 @@ static inline void obj_read_unlock(void)
if(obj_read_use_lock)
pthread_mutex_unlock(&obj_read_mutex);
}
-
-/*
- * Iterate over the files in the loose-object parts of the object
- * directory "path", triggering the following callbacks:
- *
- * - loose_object is called for each loose object we find.
- *
- * - loose_cruft is called for any files that do not appear to be
- * loose objects. Note that we only look in the loose object
- * directories "objects/[0-9a-f]{2}/", so we will not report
- * "objects/foobar" as cruft.
- *
- * - loose_subdir is called for each top-level hashed subdirectory
- * of the object directory (e.g., "$OBJDIR/f0"). It is called
- * after the objects in the directory are processed.
- *
- * Any callback that is NULL will be ignored. Callbacks returning non-zero
- * will end the iteration.
- *
- * In the "buf" variant, "path" is a strbuf which will also be used as a
- * scratch buffer, but restored to its original contents before
- * the function returns.
- */
-typedef int each_loose_object_fn(const struct object_id *oid,
- const char *path,
- void *data);
-typedef int each_loose_cruft_fn(const char *basename,
- const char *path,
- void *data);
-typedef int each_loose_subdir_fn(unsigned int nr,
- const char *path,
- void *data);
-int for_each_file_in_obj_subdir(unsigned int subdir_nr,
- struct strbuf *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-int for_each_loose_file_in_objdir(const char *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-int for_each_loose_file_in_objdir_buf(struct strbuf *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-
-/* Flags for for_each_*_object() below. */
+/* Flags for for_each_*_object(). */
enum for_each_object_flags {
/* Iterate only over local objects, not alternates. */
FOR_EACH_OBJECT_LOCAL_ONLY = (1<<0),
@@ -409,33 +347,6 @@ enum for_each_object_flags {
FOR_EACH_OBJECT_SKIP_ON_DISK_KEPT_PACKS = (1<<4),
};
-/*
- * Iterate over all accessible loose objects without respect to
- * reachability. By default, this includes both local and alternate objects.
- * The order in which objects are visited is unspecified.
- *
- * Any flags specific to packs are ignored.
- */
-int for_each_loose_object(each_loose_object_fn, void *,
- enum for_each_object_flags flags);
-
-/*
- * Iterate over all accessible packed objects without respect to reachability.
- * By default, this includes both local and alternate packs.
- *
- * Note that some objects may appear twice if they are found in multiple packs.
- * Each pack is visited in an unspecified order. By default, objects within a
- * pack are visited in pack-idx order (i.e., sorted by oid).
- */
-typedef int each_packed_object_fn(const struct object_id *oid,
- struct packed_git *pack,
- uint32_t pos,
- void *data);
-int for_each_object_in_pack(struct packed_git *p,
- each_packed_object_fn, void *data,
- enum for_each_object_flags flags);
-int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
- void *data, enum for_each_object_flags flags);
void *read_object_with_reference(struct repository *r,
const struct object_id *oid,
diff --git a/packfile.h b/packfile.h
index 05499382397..3a3c77cf05a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -3,6 +3,7 @@
#include "list.h"
#include "object.h"
+#include "object-store.h"
#include "oidset.h"
/* in object-store.h */
@@ -117,6 +118,24 @@ void for_each_file_in_pack_dir(const char *objdir,
each_file_in_pack_dir_fn fn,
void *data);
+/*
+ * Iterate over all accessible packed objects without respect to reachability.
+ * By default, this includes both local and alternate packs.
+ *
+ * Note that some objects may appear twice if they are found in multiple packs.
+ * Each pack is visited in an unspecified order. By default, objects within a
+ * pack are visited in pack-idx order (i.e., sorted by oid).
+ */
+typedef int each_packed_object_fn(const struct object_id *oid,
+ struct packed_git *pack,
+ uint32_t pos,
+ void *data);
+int for_each_object_in_pack(struct packed_git *p,
+ each_packed_object_fn, void *data,
+ enum for_each_object_flags flags);
+int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
+ void *data, enum for_each_object_flags flags);
+
/* A hook to report invalid files in pack directory */
#define PACKDIR_FILE_PACK 1
#define PACKDIR_FILE_IDX 2
diff --git a/prune-packed.c b/prune-packed.c
index c1d95a519d7..92fb4fbb0ed 100644
--- a/prune-packed.c
+++ b/prune-packed.c
@@ -2,7 +2,7 @@
#include "git-compat-util.h"
#include "gettext.h"
-#include "object-store.h"
+#include "object-file.h"
#include "packfile.h"
#include "progress.h"
#include "prune-packed.h"
diff --git a/reachable.c b/reachable.c
index e5f56f40181..9dc748f0b9a 100644
--- a/reachable.c
+++ b/reachable.c
@@ -14,7 +14,7 @@
#include "list-objects.h"
#include "packfile.h"
#include "worktree.h"
-#include "object-store.h"
+#include "object-file.h"
#include "pack-bitmap.h"
#include "pack-mtimes.h"
#include "config.h"
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 05/13] object-store: allow fetching objects via `has_object()`
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (3 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 04/13] object-store: move function declarations to their respective subsystems Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 10:07 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
` (10 subsequent siblings)
15 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
We're about to fully remove `repo_has_object_file()` in favor of
`has_object()` given that the latter has better defaults: it neither
reloads packfiles by default nor does it fetch any promised objects in
case they are missing.
The latter usecase keeps us from converting a couple of callsites that
currently do fetch objects though. It is not really clear whether _all_
of those callsites should be fetching objects, but for a subset of them
it is the desired behaviour indeed.
Introduce a new flag `HAS_OBJECT_FETCH_PROMISOR` that causes the
function to optionally fetch missing objects which are part of a
promisor pack.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.c | 9 ++++++---
object-store.h | 10 +++++++---
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/object-store.c b/object-store.c
index 0cbad5a19a0..0d873868a6d 100644
--- a/object-store.c
+++ b/object-store.c
@@ -937,12 +937,15 @@ void *read_object_with_reference(struct repository *r,
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags)
{
- int quick = !(flags & HAS_OBJECT_RECHECK_PACKED);
- unsigned object_info_flags = OBJECT_INFO_SKIP_FETCH_OBJECT |
- (quick ? OBJECT_INFO_QUICK : 0);
+ unsigned object_info_flags = 0;
if (!startup_info->have_repository)
return 0;
+ if (!(flags & HAS_OBJECT_RECHECK_PACKED))
+ object_info_flags |= OBJECT_INFO_QUICK;
+ if (!(flags & HAS_OBJECT_FETCH_PROMISOR))
+ object_info_flags |= OBJECT_INFO_SKIP_FETCH_OBJECT;
+
return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
}
diff --git a/object-store.h b/object-store.h
index 5bbdaba92d1..c6055376f49 100644
--- a/object-store.h
+++ b/object-store.h
@@ -266,12 +266,16 @@ int oid_object_info_extended(struct repository *r,
const struct object_id *,
struct object_info *, unsigned flags);
-/* Retry packed storage after checking packed and loose storage */
-#define HAS_OBJECT_RECHECK_PACKED 1
+enum {
+ /* Retry packed storage after checking packed and loose storage */
+ HAS_OBJECT_RECHECK_PACKED = (1 << 0),
+ /* Allow fetching the object in case the repository has a promisor remote. */
+ HAS_OBJECT_FETCH_PROMISOR = (1 << 1),
+};
/*
* Returns 1 if the object exists. This function will not lazily fetch objects
- * in a partial clone.
+ * in a partial clone by default.
*/
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags);
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 06/13] treewide: trivial conversions of `repo_has_object_file()`
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (4 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
` (9 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
favor of `has_object()`. The benefit of the replacement is better
defaults: it doesn't fetch missing objects via promisor remotes, and
neither does it reload packfiles if an object wasn't found by default.
Start sunsetting the functions by replacing trivial callsites of it with
`has_object()`:
- `repo_has_object_file(...)` is equivalent to
`has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
While converting callsites to use `HAS_OBJECT_FETCH_PROMISOR` retains
the current behaviour, it is not entirely clear whether all of them
really should be fetching promised objects. A subset of such callsites
where it is obvious that they really shouldn't fetch objects has been
left out of this commit and will be adapted over the next couple of
commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/cat-file.c | 3 ++-
builtin/clone.c | 4 +---
builtin/fetch.c | 15 +++++++--------
builtin/receive-pack.c | 4 +++-
builtin/remote.c | 3 ++-
builtin/unpack-objects.c | 3 ++-
cache-tree.c | 13 +++++++++----
fetch-pack.c | 7 +++----
http-push.c | 11 +++++++----
notes.c | 3 ++-
object-store.c | 2 +-
reflog.c | 3 ++-
remote.c | 2 +-
send-pack.c | 5 +----
shallow.c | 9 ++++++---
upload-pack.c | 3 +--
walker.c | 3 ++-
17 files changed, 52 insertions(+), 41 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 0e3f10a9467..3914a2a3f61 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -169,7 +169,8 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
goto cleanup;
case 'e':
- ret = !repo_has_object_file(the_repository, &oid);
+ ret = !has_object(the_repository, &oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR);
goto cleanup;
case 'w':
diff --git a/builtin/clone.c b/builtin/clone.c
index 6b1d11a3ed2..b498b81a043 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -504,9 +504,7 @@ static void write_followtags(const struct ref *refs, const char *msg)
continue;
if (ends_with(ref->name, "^{}"))
continue;
- if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid,
- OBJECT_INFO_QUICK |
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &ref->old_oid, 0))
continue;
refs_update_ref(get_main_ref_store(the_repository), msg,
ref->name, &ref->old_oid, NULL, 0,
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 95589b49948..aadcf49a5b4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -337,7 +337,6 @@ static void find_non_local_tags(const struct ref *refs,
struct string_list_item *remote_ref_item;
const struct ref *ref;
struct refname_hash_entry *item = NULL;
- const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
refname_hash_init(&existing_refs);
refname_hash_init(&remote_refs);
@@ -367,9 +366,9 @@ static void find_non_local_tags(const struct ref *refs,
*/
if (ends_with(ref->name, "^{}")) {
if (item &&
- !repo_has_object_file_with_flags(the_repository, &ref->old_oid, quick_flags) &&
+ !has_object(the_repository, &ref->old_oid, 0) &&
!oidset_contains(&fetch_oids, &ref->old_oid) &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
item = NULL;
@@ -383,7 +382,7 @@ static void find_non_local_tags(const struct ref *refs,
* fetch.
*/
if (item &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
@@ -404,7 +403,7 @@ static void find_non_local_tags(const struct ref *refs,
* checked to see if it needs fetching.
*/
if (item &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
@@ -911,7 +910,8 @@ static int update_local_ref(struct ref *ref,
struct commit *current = NULL, *updated;
int fast_forward = 0;
- if (!repo_has_object_file(the_repository, &ref->new_oid))
+ if (!has_object(the_repository, &ref->new_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
die(_("object %s not found"), oid_to_hex(&ref->new_oid));
if (oideq(&ref->old_oid, &ref->new_oid)) {
@@ -1330,8 +1330,7 @@ static int check_exist_and_connected(struct ref *ref_map)
* we need all direct targets to exist.
*/
for (r = rm; r; r = r->next) {
- if (!repo_has_object_file_with_flags(the_repository, &r->old_oid,
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &r->old_oid, HAS_OBJECT_RECHECK_PACKED))
return -1;
}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index be314879e82..c92e57ba188 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1506,7 +1506,9 @@ static const char *update(struct command *cmd, struct shallow_info *si)
}
}
- if (!is_null_oid(new_oid) && !repo_has_object_file(the_repository, new_oid)) {
+ if (!is_null_oid(new_oid) &&
+ !has_object(the_repository, new_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
error("unpack should have generated %s, "
"but I can't find it!", oid_to_hex(new_oid));
ret = "bad pack";
diff --git a/builtin/remote.c b/builtin/remote.c
index b4baa34e665..0d6755bcb71 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -454,7 +454,8 @@ static int get_push_ref_states(const struct ref *remote_refs,
info->status = PUSH_STATUS_UPTODATE;
else if (is_null_oid(&ref->old_oid))
info->status = PUSH_STATUS_CREATE;
- else if (repo_has_object_file(the_repository, &ref->old_oid) &&
+ else if (has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) &&
ref_newer(&ref->new_oid, &ref->old_oid))
info->status = PUSH_STATUS_FASTFORWARD;
else
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 661be789f13..e905d5f4e19 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -449,7 +449,8 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
delta_data = get_data(delta_size);
if (!delta_data)
return;
- if (repo_has_object_file(the_repository, &base_oid))
+ if (has_object(the_repository, &base_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
; /* Ok we have this one */
else if (resolve_against_held(nr, &base_oid,
delta_data, delta_size))
diff --git a/cache-tree.c b/cache-tree.c
index c0e1e9ee1d4..fa3858e2829 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -238,7 +238,9 @@ int cache_tree_fully_valid(struct cache_tree *it)
int i;
if (!it)
return 0;
- if (it->entry_count < 0 || !repo_has_object_file(the_repository, &it->oid))
+ if (it->entry_count < 0 ||
+ has_object(the_repository, &it->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0;
for (i = 0; i < it->subtree_nr; i++) {
if (!cache_tree_fully_valid(it->down[i]->cache_tree))
@@ -289,7 +291,9 @@ static int update_one(struct cache_tree *it,
}
}
- if (0 <= it->entry_count && repo_has_object_file(the_repository, &it->oid))
+ if (0 <= it->entry_count &&
+ has_object(the_repository, &it->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return it->entry_count;
/*
@@ -395,7 +399,8 @@ static int update_one(struct cache_tree *it,
ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
!must_check_existence(ce);
if (is_null_oid(oid) ||
- (!ce_missing_ok && !repo_has_object_file(the_repository, oid))) {
+ (!ce_missing_ok && !has_object(the_repository, oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))) {
strbuf_release(&buffer);
if (expected_missing)
return -1;
@@ -443,7 +448,7 @@ static int update_one(struct cache_tree *it,
struct object_id oid;
hash_object_file(the_hash_algo, buffer.buf, buffer.len,
OBJ_TREE, &oid);
- if (repo_has_object_file_with_flags(the_repository, &oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (has_object(the_repository, &oid, HAS_OBJECT_RECHECK_PACKED))
oidcpy(&it->oid, &oid);
else
to_invalidate = 1;
diff --git a/fetch-pack.c b/fetch-pack.c
index 210dc30d50f..fa4231fee74 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -769,9 +769,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
if (!commit) {
struct object *o;
- if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid,
- OBJECT_INFO_QUICK |
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &ref->old_oid, 0))
continue;
o = parse_object(the_repository, &ref->old_oid);
if (!o || o->type != OBJ_COMMIT)
@@ -1985,7 +1983,8 @@ static void update_shallow(struct fetch_pack_args *args,
struct oid_array extra = OID_ARRAY_INIT;
struct object_id *oid = si->shallow->oid;
for (i = 0; i < si->shallow->nr; i++)
- if (repo_has_object_file(the_repository, &oid[i]))
+ if (has_object(the_repository, &oid[i],
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
oid_array_append(&extra, &oid[i]);
if (extra.nr) {
setup_alternate_shallow(&shallow_lock,
diff --git a/http-push.c b/http-push.c
index 32e37565f4e..f9e67cabd4b 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1446,7 +1446,9 @@ static void one_remote_ref(const char *refname)
* Fetch a copy of the object if it doesn't exist locally - it
* may be required for updating server info later.
*/
- if (repo->can_update_info_refs && !repo_has_object_file(the_repository, &ref->old_oid)) {
+ if (repo->can_update_info_refs &&
+ !has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
obj = lookup_unknown_object(the_repository, &ref->old_oid);
fprintf(stderr, " fetch %s for %s\n",
oid_to_hex(&ref->old_oid), refname);
@@ -1651,14 +1653,14 @@ static int delete_remote_branch(const char *pattern, int force)
return error("Remote HEAD symrefs too deep");
if (is_null_oid(&head_oid))
return error("Unable to resolve remote HEAD");
- if (!repo_has_object_file(the_repository, &head_oid))
+ if (!has_object(the_repository, &head_oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return error("Remote HEAD resolves to object %s\nwhich does not exist locally, perhaps you need to fetch?", oid_to_hex(&head_oid));
/* Remote branch must resolve to a known object */
if (is_null_oid(&remote_ref->old_oid))
return error("Unable to resolve remote branch %s",
remote_ref->name);
- if (!repo_has_object_file(the_repository, &remote_ref->old_oid))
+ if (!has_object(the_repository, &remote_ref->old_oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return error("Remote branch %s resolves to object %s\nwhich does not exist locally, perhaps you need to fetch?", remote_ref->name, oid_to_hex(&remote_ref->old_oid));
/* Remote branch must be an ancestor of remote HEAD */
@@ -1879,7 +1881,8 @@ int cmd_main(int argc, const char **argv)
if (!force_all &&
!is_null_oid(&ref->old_oid) &&
!ref->force) {
- if (!repo_has_object_file(the_repository, &ref->old_oid) ||
+ if (!has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) ||
!ref_newer(&ref->peer_ref->new_oid,
&ref->old_oid)) {
/*
diff --git a/notes.c b/notes.c
index d9645c4b5dc..0a128f1de98 100644
--- a/notes.c
+++ b/notes.c
@@ -794,7 +794,8 @@ static int prune_notes_helper(const struct object_id *object_oid,
struct note_delete_list **l = (struct note_delete_list **) cb_data;
struct note_delete_list *n;
- if (repo_has_object_file(the_repository, object_oid))
+ if (has_object(the_repository, object_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0; /* nothing to do for this note */
/* failed to find object => prune this note */
diff --git a/object-store.c b/object-store.c
index 0d873868a6d..2db34804e8f 100644
--- a/object-store.c
+++ b/object-store.c
@@ -847,7 +847,7 @@ int pretend_object_file(struct repository *repo,
char *co_buf;
hash_object_file(repo->hash_algo, buf, len, type, oid);
- if (repo_has_object_file_with_flags(repo, oid, OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT) ||
+ if (has_object(repo, oid, 0) ||
find_cached_object(repo->objects, oid))
return 0;
diff --git a/reflog.c b/reflog.c
index c6c534dfad4..2db1cc81d01 100644
--- a/reflog.c
+++ b/reflog.c
@@ -152,7 +152,8 @@ static int tree_is_complete(const struct object_id *oid)
init_tree_desc(&desc, &tree->object.oid, tree->buffer, tree->size);
complete = 1;
while (tree_entry(&desc, &entry)) {
- if (!repo_has_object_file(the_repository, &entry.oid) ||
+ if (!has_object(the_repository, &entry.oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) ||
(S_ISDIR(entry.mode) && !tree_is_complete(&entry.oid))) {
tree->object.flags |= INCOMPLETE;
complete = 0;
diff --git a/remote.c b/remote.c
index 9fa3614e7a3..4099183cacd 100644
--- a/remote.c
+++ b/remote.c
@@ -1702,7 +1702,7 @@ void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
if (!reject_reason && !ref->deletion && !is_null_oid(&ref->old_oid)) {
if (starts_with(ref->name, "refs/tags/"))
reject_reason = REF_STATUS_REJECT_ALREADY_EXISTS;
- else if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
+ else if (!has_object(the_repository, &ref->old_oid, HAS_OBJECT_RECHECK_PACKED))
reject_reason = REF_STATUS_REJECT_FETCH_FIRST;
else if (!lookup_commit_reference_gently(the_repository, &ref->old_oid, 1) ||
!lookup_commit_reference_gently(the_repository, &ref->new_oid, 1))
diff --git a/send-pack.c b/send-pack.c
index 5005689cb55..86592ce526d 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -45,10 +45,7 @@ int option_parse_push_signed(const struct option *opt,
static void feed_object(struct repository *r,
const struct object_id *oid, FILE *fh, int negative)
{
- if (negative &&
- !repo_has_object_file_with_flags(r, oid,
- OBJECT_INFO_SKIP_FETCH_OBJECT |
- OBJECT_INFO_QUICK))
+ if (negative && !has_object(r, oid, 0))
return;
if (negative)
diff --git a/shallow.c b/shallow.c
index 2f82ebd6e3f..faeeeb45f98 100644
--- a/shallow.c
+++ b/shallow.c
@@ -310,7 +310,8 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
if (graft->nr_parent != -1)
return 0;
if (data->flags & QUICK) {
- if (!repo_has_object_file(the_repository, &graft->oid))
+ if (!has_object(the_repository, &graft->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0;
} else if (data->flags & SEEN_ONLY) {
struct commit *c = lookup_commit(the_repository, &graft->oid);
@@ -476,7 +477,8 @@ void prepare_shallow_info(struct shallow_info *info, struct oid_array *sa)
ALLOC_ARRAY(info->ours, sa->nr);
ALLOC_ARRAY(info->theirs, sa->nr);
for (size_t i = 0; i < sa->nr; i++) {
- if (repo_has_object_file(the_repository, sa->oid + i)) {
+ if (has_object(the_repository, sa->oid + i,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
struct commit_graft *graft;
graft = lookup_commit_graft(the_repository,
&sa->oid[i]);
@@ -513,7 +515,8 @@ void remove_nonexistent_theirs_shallow(struct shallow_info *info)
for (i = dst = 0; i < info->nr_theirs; i++) {
if (i != dst)
info->theirs[dst] = info->theirs[i];
- if (repo_has_object_file(the_repository, oid + info->theirs[i]))
+ if (has_object(the_repository, oid + info->theirs[i],
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
dst++;
}
info->nr_theirs = dst;
diff --git a/upload-pack.c b/upload-pack.c
index 30e4630f3a1..956da5b061a 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -509,8 +509,7 @@ static int got_oid(struct upload_pack_data *data,
{
if (get_oid_hex(hex, oid))
die("git upload-pack: expected SHA1 object, got '%s'", hex);
- if (!repo_has_object_file_with_flags(the_repository, oid,
- OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, oid, 0))
return -1;
return do_got_oid(data, oid);
}
diff --git a/walker.c b/walker.c
index 4fedc19f346..b470d43e54d 100644
--- a/walker.c
+++ b/walker.c
@@ -150,7 +150,8 @@ static int process(struct walker *walker, struct object *obj)
return 0;
obj->flags |= SEEN;
- if (repo_has_object_file(the_repository, &obj->oid)) {
+ if (has_object(the_repository, &obj->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
/* We already have it, so we should scan it now. */
obj->flags |= TO_SCAN;
}
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (5 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 17:08 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
` (8 subsequent siblings)
15 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
Any packed objects indexed via git-index-pack(1) are subject to a
collision check. This collision check has the intent to determine
whether we already have an object with the same object ID, but different
contents in the repository.
The check whether the collision check is really needed is performed via
`repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`. But unless
explicitly told otherwise via `OBJECT_INFO_SKIP_FETCH_OBJECT`, this
function will also cause us to fetch the object ID in case it is part of
a promisor pack. As such, we may end up fetching the object only to
check whether the fetched object and the object that we're indexing have
the same content.
This behaviour is highly dubious and more likely than not unintended.
Fix it by converting to `has_object()`, which knows to neither reload
packfiles nor to fetch promisor objects by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/index-pack.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f49431d626b..805b7aa1e28 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -892,9 +892,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
if (startup_info->have_repository) {
read_lock();
- collision_test_needed =
- repo_has_object_file_with_flags(the_repository, oid,
- OBJECT_INFO_QUICK);
+ collision_test_needed = has_object(the_repository, oid, 0);
read_unlock();
}
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 08/13] builtin/show-ref: don't fetch objects when printing refs
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (6 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
` (7 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
When printing references via git-show-ref(1) we double check that their
respective object IDs point to an existing object. This check is
performed via `repo_has_object_file()`, which knows to fetch missing
promised objects in the background. We shouldn't have a need to fetch
such objects though as no reference should ever point to a missing
object at all.
Convert the callsite to `has_object()`, which doesn't fetch promisor
objects by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/show-ref.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index f81209f23c3..fe592b4c202 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -35,7 +35,7 @@ static void show_one(const struct show_one_options *opts,
const char *hex;
struct object_id peeled;
- if (!repo_has_object_file(the_repository, oid))
+ if (!has_object(the_repository, oid, HAS_OBJECT_RECHECK_PACKED))
die("git show-ref: bad ref %s (%s)", refname,
oid_to_hex(oid));
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()`
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (7 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 17:11 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
` (6 subsequent siblings)
15 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
Similar to the preceding commit, don't try to fetch objects pointed to
by references. Any reference whose object does not exist is broken by
definition an, so we should report it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs.c b/refs.c
index 6559db37890..0492cf0d13a 100644
--- a/refs.c
+++ b/refs.c
@@ -376,7 +376,7 @@ int ref_resolves_to_object(const char *refname,
{
if (flags & REF_ISBROKEN)
return 0;
- if (!repo_has_object_file(repo, oid)) {
+ if (!has_object(repo, oid, HAS_OBJECT_RECHECK_PACKED)) {
error(_("%s does not point to a valid object!"), refname);
return 0;
}
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 10/13] http-walker: don't fetch objects via promisor remotes
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (8 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 11/13] list-objects: clarify how promised blobs are excluded Patrick Steinhardt
` (5 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
The HTTP walker is responsible for fetching objects via the dumb HTTP
protocol. To avoid re-fetching objects that we already have locally we
first check whether the object already exists in the local repository
before issuing the requests.
This existence check is done by calling `repo_has_object_file()`, which
will fetch the object via a promisor remote in case it is in a promisor
pack. This fetch does not make any sense for us though: we're already in
the process of fetching the object anyway, so fetching it via a separate
connection is wasteful, but should otherwise be harmless.
Fix the issue by converting to `has_object()`, which knows to not fetch
objects via promisor remotes by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
http-walker.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/http-walker.c b/http-walker.c
index 95458e2f638..5ad2eae9a11 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -138,7 +138,7 @@ static int fill_active_slot(void *data UNUSED)
list_for_each_safe(pos, tmp, head) {
obj_req = list_entry(pos, struct object_request, node);
if (obj_req->state == WAITING) {
- if (repo_has_object_file(the_repository, &obj_req->oid))
+ if (has_object(the_repository, &obj_req->oid, HAS_OBJECT_RECHECK_PACKED))
obj_req->state = COMPLETE;
else {
start_object_request(obj_req);
@@ -496,7 +496,7 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
if (!obj_req)
return error("Couldn't find request for %s in the queue", hex);
- if (repo_has_object_file(the_repository, &obj_req->oid)) {
+ if (has_object(the_repository, &obj_req->oid, HAS_OBJECT_RECHECK_PACKED)) {
if (obj_req->req)
abort_http_object_request(&obj_req->req);
abort_object_request(obj_req);
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 11/13] list-objects: clarify how promised blobs are excluded
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (9 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
` (4 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
When "--exclude-promisor-objects" is passed by the user, commands like
git-rev-list(1) will exclude any objects part of a promisor pack. For
blobs this logic is handled in `process_blob()` with the following
logic:
if (ctx->revs->exclude_promisor_objects &&
!repo_has_object_file(the_repository, &obj->oid) &&
is_promisor_object(ctx->revs->repo, &obj->oid))
return;
It is somewhat puzzling that we use `repo_has_object_file()` to check
for existence of the blob because this function will cause us to fetch
missing objects in case they are part of a promisor pack. As such, one
may wonder whether the logic to exclude promised blobs is completely
broken.
As it turns out it's not broken: when "--exclude-promisor-objects" is
set we also unset the global `fetch_if_missing` variable, which causes
`do_oid_object_info_extended()` to not fetch any promised objects at
all.
Clarify this logic by using `has_object()`, which doesn't fetch promised
objects by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
list-objects.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/list-objects.c b/list-objects.c
index 1e5512e1318..cae4f7aff8c 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -74,7 +74,8 @@ static void process_blob(struct traversal_context *ctx,
* of missing objects.
*/
if (ctx->revs->exclude_promisor_objects &&
- !repo_has_object_file(the_repository, &obj->oid) &&
+ !has_object(the_repository, &obj->oid,
+ HAS_OBJECT_RECHECK_PACKED) &&
is_promisor_object(ctx->revs->repo, &obj->oid))
return;
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 12/13] bulk-checkin: don't fetch promised objects on write
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (10 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 11/13] list-objects: clarify how promised blobs are excluded Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 13/13] object-store: drop `repo_has_object_file()` Patrick Steinhardt
` (3 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
When writing objects via the bulk-checkin subsystem we first try to
figure out whether an object already exists in the repository before we
append it to the packfile. This check uses `repo_has_object_file()`,
which knows to fetch promised objects by default. As such, if we were
about to write an object that is promised, we'd fetch the object via the
promisor and then skip writing it. This behaviour doesn't seem sensible:
it should be significantly faster to take the locally-written object
instead of faulting in objects from the promisor remote.
There is one counter-argument here: it could be that the bulk-checkin
mechanism will end up writing an object to disk whose content collides
with the object in the promisor remote. The local repository and its
promisor remote would now have two objects with different contents but
the same name. But the resulting behaviour would be wrong both when we
prefer the fetched object, and also when prefering the written object:
- When we prefer the written object we will now see a different world
compared to everyone else who has the promised object.
- When we prefer the fetched object we will end up with an object that
is different compared to what the user just asked us to write. This
seems even worse compared to the first scenario.
In an ideal world, we would protect against this by fetching the
promised object and then performing a collision check. But this feels
exceedingly expensive and ultimately rather pointless, as more common
writing paths like `write_loose_object()` don't protect against this
scenario either. And in any case we're talking about a local user that
has write access to the repository anyway, so if they want to do any
kind of mischieve they already can.
Change the behaviour so that we don't fault in the object via the
promisor remote. We shouldn't have to worry about hash collisions too
much (yet) as the mechanism is only used during local writes anyway. And
even if there was a collision, prefering local data that we were just
asked to write over data controlled by a potentially untrusted remote
feels like the better failure mode.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
bulk-checkin.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/bulk-checkin.c b/bulk-checkin.c
index c31c31b18d8..b182c456d69 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -130,7 +130,7 @@ static void flush_batch_fsync(void)
static int already_written(struct bulk_checkin_packfile *state, struct object_id *oid)
{
/* The object may already exist in the repository */
- if (repo_has_object_file(the_repository, oid))
+ if (has_object(the_repository, oid, HAS_OBJECT_RECHECK_PACKED))
return 1;
/* Might want to keep the list sorted */
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH 13/13] object-store: drop `repo_has_object_file()`
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (11 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
@ 2025-04-23 7:48 ` Patrick Steinhardt
2025-04-23 17:20 ` [PATCH 00/13] object-store: a handful of cleanups Karthik Nayak
` (2 subsequent siblings)
15 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-23 7:48 UTC (permalink / raw)
To: git
In the preceding commits we have converted all users of
`repo_has_object_file()` and its `_with_flags()` variant to instead use
`has_object()`. Drop these functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.c | 14 --------------
object-store.h | 17 -----------------
2 files changed, 31 deletions(-)
diff --git a/object-store.c b/object-store.c
index 2db34804e8f..2f51d0e3b03 100644
--- a/object-store.c
+++ b/object-store.c
@@ -949,20 +949,6 @@ int has_object(struct repository *r, const struct object_id *oid,
return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
}
-int repo_has_object_file_with_flags(struct repository *r,
- const struct object_id *oid, int flags)
-{
- if (!startup_info->have_repository)
- return 0;
- return oid_object_info_extended(r, oid, NULL, flags) >= 0;
-}
-
-int repo_has_object_file(struct repository *r,
- const struct object_id *oid)
-{
- return repo_has_object_file_with_flags(r, oid, 0);
-}
-
void assert_oid_type(const struct object_id *oid, enum object_type expect)
{
enum object_type type = oid_object_info(the_repository, oid, NULL);
diff --git a/object-store.h b/object-store.h
index c6055376f49..2330374990b 100644
--- a/object-store.h
+++ b/object-store.h
@@ -280,23 +280,6 @@ enum {
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags);
-/*
- * These macros and functions are deprecated. If checking existence for an
- * object that is likely to be missing and/or whose absence is relatively
- * inconsequential (or is consequential but the caller is prepared to handle
- * it), use has_object(), which has better defaults (no lazy fetch in a partial
- * clone and no rechecking of packed storage). In the unlikely event that a
- * caller needs to assert existence of an object that it fully expects to
- * exist, and wants to trigger a lazy fetch in a partial clone, use
- * oid_object_info_extended() with a NULL struct object_info.
- *
- * These functions can be removed once all callers have migrated to
- * has_object() and/or oid_object_info_extended().
- */
-int repo_has_object_file(struct repository *r, const struct object_id *oid);
-int repo_has_object_file_with_flags(struct repository *r,
- const struct object_id *oid, int flags);
-
void assert_oid_type(const struct object_id *oid, enum object_type expect);
/*
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [PATCH 03/13] object-store: move and rename `odb_pack_keep()`
2025-04-23 7:48 ` [PATCH 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
@ 2025-04-23 10:03 ` Karthik Nayak
0 siblings, 0 replies; 55+ messages in thread
From: Karthik Nayak @ 2025-04-23 10:03 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 648 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> The function `odb_pack_keep()` creates a file at the passed-in path. If
> this fails, then the function re-tries by first creating any potentially
> missing leading directoriesk and then trying to create the file once
s/directoriesk/directories
> again. As such, this function doesn't host any kind of logic that is
> specific to the object store, but is rather a generic helper function.
>
> Rename the function to `safe_create_file_with_leading_directories()` and
> move it into "path.c". While at it, refactor it so that it loses its
> dependency on `the_repository`.
>
Rest of the patch looks good.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH 05/13] object-store: allow fetching objects via `has_object()`
2025-04-23 7:48 ` [PATCH 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
@ 2025-04-23 10:07 ` Karthik Nayak
0 siblings, 0 replies; 55+ messages in thread
From: Karthik Nayak @ 2025-04-23 10:07 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 2793 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> We're about to fully remove `repo_has_object_file()` in favor of
> `has_object()` given that the latter has better defaults: it neither
> reloads packfiles by default nor does it fetch any promised objects in
> case they are missing.
>
> The latter usecase keeps us from converting a couple of callsites that
> currently do fetch objects though. It is not really clear whether _all_
> of those callsites should be fetching objects, but for a subset of them
> it is the desired behaviour indeed.
>
> Introduce a new flag `HAS_OBJECT_FETCH_PROMISOR` that causes the
> function to optionally fetch missing objects which are part of a
> promisor pack.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> object-store.c | 9 ++++++---
> object-store.h | 10 +++++++---
> 2 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/object-store.c b/object-store.c
> index 0cbad5a19a0..0d873868a6d 100644
> --- a/object-store.c
> +++ b/object-store.c
> @@ -937,12 +937,15 @@ void *read_object_with_reference(struct repository *r,
> int has_object(struct repository *r, const struct object_id *oid,
> unsigned flags)
> {
> - int quick = !(flags & HAS_OBJECT_RECHECK_PACKED);
> - unsigned object_info_flags = OBJECT_INFO_SKIP_FETCH_OBJECT |
> - (quick ? OBJECT_INFO_QUICK : 0);
> + unsigned object_info_flags = 0;
>
> if (!startup_info->have_repository)
> return 0;
> + if (!(flags & HAS_OBJECT_RECHECK_PACKED))
> + object_info_flags |= OBJECT_INFO_QUICK;
> + if (!(flags & HAS_OBJECT_FETCH_PROMISOR))
> + object_info_flags |= OBJECT_INFO_SKIP_FETCH_OBJECT;
> +
This already is much easier to read.
> return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
> }
>
> diff --git a/object-store.h b/object-store.h
> index 5bbdaba92d1..c6055376f49 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -266,12 +266,16 @@ int oid_object_info_extended(struct repository *r,
> const struct object_id *,
> struct object_info *, unsigned flags);
>
> -/* Retry packed storage after checking packed and loose storage */
> -#define HAS_OBJECT_RECHECK_PACKED 1
> +enum {
> + /* Retry packed storage after checking packed and loose storage */
> + HAS_OBJECT_RECHECK_PACKED = (1 << 0),
> + /* Allow fetching the object in case the repository has a promisor remote. */
> + HAS_OBJECT_FETCH_PROMISOR = (1 << 1),
> +};
>
> /*
> * Returns 1 if the object exists. This function will not lazily fetch objects
> - * in a partial clone.
> + * in a partial clone by default.
> */
> int has_object(struct repository *r, const struct object_id *oid,
> unsigned flags);
>
> --
> 2.49.0.901.g37484f566f.dirty
Okay so we add the functionality here, and possibly use it in the future
patches. Good.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-23 7:48 ` [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
@ 2025-04-23 17:08 ` Karthik Nayak
2025-04-25 7:04 ` Patrick Steinhardt
0 siblings, 1 reply; 55+ messages in thread
From: Karthik Nayak @ 2025-04-23 17:08 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 2239 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> Any packed objects indexed via git-index-pack(1) are subject to a
> collision check. This collision check has the intent to determine
> whether we already have an object with the same object ID, but different
> contents in the repository.
>
> The check whether the collision check is really needed is performed via
> `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`. \
>
Nit: this was a little confusing at first, until I saw the code. So what
this means is that the collision check is only performed, iff
`repo_has_object_file_with_flags(...)` returns true.
I think the confusing part was 'is performed via', perhaps:
The collision check is only performed, if
repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK) returns a
truthy value.
But it is okay as is too!
> But unless
> explicitly told otherwise via `OBJECT_INFO_SKIP_FETCH_OBJECT`, this
> function will also cause us to fetch the object ID in case it is part of
> a promisor pack. As such, we may end up fetching the object only to
> check whether the fetched object and the object that we're indexing have
> the same content.
>
So us fetching the object is pointless, since we only care about the
'does it exist' part and not really what it contains. In that case,
shouldn't this be s/same content/same oid/?
> This behaviour is highly dubious and more likely than not unintended.
> Fix it by converting to `has_object()`, which knows to neither reload
> packfiles nor to fetch promisor objects by default.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/index-pack.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index f49431d626b..805b7aa1e28 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -892,9 +892,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
>
> if (startup_info->have_repository) {
> read_lock();
> - collision_test_needed =
> - repo_has_object_file_with_flags(the_repository, oid,
> - OBJECT_INFO_QUICK);
> + collision_test_needed = has_object(the_repository, oid, 0);
> read_unlock();
> }
>
>
> --
> 2.49.0.901.g37484f566f.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()`
2025-04-23 7:48 ` [PATCH 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
@ 2025-04-23 17:11 ` Karthik Nayak
0 siblings, 0 replies; 55+ messages in thread
From: Karthik Nayak @ 2025-04-23 17:11 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 820 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> Similar to the preceding commit, don't try to fetch objects pointed to
> by references. Any reference whose object does not exist is broken by
> definition an, so we should report it accordingly.
>
s/an//
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> refs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/refs.c b/refs.c
> index 6559db37890..0492cf0d13a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -376,7 +376,7 @@ int ref_resolves_to_object(const char *refname,
> {
> if (flags & REF_ISBROKEN)
> return 0;
> - if (!repo_has_object_file(repo, oid)) {
> + if (!has_object(repo, oid, HAS_OBJECT_RECHECK_PACKED)) {
> error(_("%s does not point to a valid object!"), refname);
> return 0;
> }
>
> --
> 2.49.0.901.g37484f566f.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH 00/13] object-store: a handful of cleanups
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (12 preceding siblings ...)
2025-04-23 7:48 ` [PATCH 13/13] object-store: drop `repo_has_object_file()` Patrick Steinhardt
@ 2025-04-23 17:20 ` Karthik Nayak
2025-04-25 7:07 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
15 siblings, 1 reply; 55+ messages in thread
From: Karthik Nayak @ 2025-04-23 17:20 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 915 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> Hi,
>
> this patch series contains a handful of cleanups to the object store
> subsystem:
>
> - A couple of definitions are moved out of "object-store.h" as they
> belong to other subsystems.
>
> - Some functions are dropped and/or renamed.
>
> - The biggest part is the removal of `repo_has_object_file()`. This
> function and its `_with_flags()` variant are marked as deprecated,
> with the replacement being `has_object()`. The benefit of that
> function is that it doesn't reload packfiles and doesn't fetch
> promisor objects by default so that it becomes more explicit when
> one really wants to do so.
>
> These cleanups are in preparation for getting rid of `the_repository` in
> "object-store.c".
>
Apart from the few nits I mentioned, the series looks great! I must say
the split of commits was really nice to go through :)
[snip]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-23 17:08 ` Karthik Nayak
@ 2025-04-25 7:04 ` Patrick Steinhardt
2025-04-28 19:48 ` Karthik Nayak
0 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:04 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Wed, Apr 23, 2025 at 10:08:05AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Any packed objects indexed via git-index-pack(1) are subject to a
> > collision check. This collision check has the intent to determine
> > whether we already have an object with the same object ID, but different
> > contents in the repository.
> >
> > The check whether the collision check is really needed is performed via
> > `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`. \
> >
>
> Nit: this was a little confusing at first, until I saw the code. So what
> this means is that the collision check is only performed, iff
> `repo_has_object_file_with_flags(...)` returns true.
>
> I think the confusing part was 'is performed via', perhaps:
>
> The collision check is only performed, if
> repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK) returns a
> truthy value.
>
> But it is okay as is too!
Will rephrase.
> > But unless
> > explicitly told otherwise via `OBJECT_INFO_SKIP_FETCH_OBJECT`, this
> > function will also cause us to fetch the object ID in case it is part of
> > a promisor pack. As such, we may end up fetching the object only to
> > check whether the fetched object and the object that we're indexing have
> > the same content.
> >
>
> So us fetching the object is pointless, since we only care about the
> 'does it exist' part and not really what it contains. In that case,
> shouldn't this be s/same content/same oid/?
No, it really checks for the same content. It basically verifies that
any pair of objects that:
- Exist in the packfile that we're currently indexing.
- And preexists in the local repository.
Actually have the same content.
The weird part is that we also do this for objects that don't yet exist
in the repository, but which are promised to us. This causes us to fetch
them first only to verify that the fetched promised object has the same
content as the packfile. And given that git-index-pack(1) would usually
run after a fetch, we end up verifying that the fetched object obtained
from the promisor is the same as the fetched object obtained from the
packfile. Which ultimately seems rather dubious to me.
Patrick
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH 00/13] object-store: a handful of cleanups
2025-04-23 17:20 ` [PATCH 00/13] object-store: a handful of cleanups Karthik Nayak
@ 2025-04-25 7:07 ` Patrick Steinhardt
0 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:07 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Wed, Apr 23, 2025 at 10:20:18AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Hi,
> >
> > this patch series contains a handful of cleanups to the object store
> > subsystem:
> >
> > - A couple of definitions are moved out of "object-store.h" as they
> > belong to other subsystems.
> >
> > - Some functions are dropped and/or renamed.
> >
> > - The biggest part is the removal of `repo_has_object_file()`. This
> > function and its `_with_flags()` variant are marked as deprecated,
> > with the replacement being `has_object()`. The benefit of that
> > function is that it doesn't reload packfiles and doesn't fetch
> > promisor objects by default so that it becomes more explicit when
> > one really wants to do so.
> >
> > These cleanups are in preparation for getting rid of `the_repository` in
> > "object-store.c".
> >
>
> Apart from the few nits I mentioned, the series looks great! I must say
> the split of commits was really nice to go through :)
Thanks for your review!
Patrick
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH v2 00/13] object-store: a handful of cleanups
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (13 preceding siblings ...)
2025-04-23 17:20 ` [PATCH 00/13] object-store: a handful of cleanups Karthik Nayak
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
` (13 more replies)
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
15 siblings, 14 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
Hi,
this patch series contains a handful of cleanups to the object store
subsystem:
- A couple of definitions are moved out of "object-store.h" as they
belong to other subsystems.
- Some functions are dropped and/or renamed.
- The biggest part is the removal of `repo_has_object_file()`. This
function and its `_with_flags()` variant are marked as deprecated,
with the replacement being `has_object()`. The benefit of that
function is that it doesn't reload packfiles and doesn't fetch
promisor objects by default so that it becomes more explicit when
one really wants to do so.
These cleanups are in preparation for getting rid of `the_repository` in
"object-store.c".
The patch series is built on top of 4bbb303af69 (The seventh batch,
2025-04-17) with ps/object-file-cleanup at 68cd492a3e6 (object-store:
merge "object-store-ll.h" and "object-store.h", 2025-04-15) merged into
it.
Changes in v2:
- A handful of improvements for commit messages.
- Link to v1: https://lore.kernel.org/r/20250423-pks-object-store-cleanups-v1-0-81f8411a5d08@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (13):
object-store: move `struct packed_git` into "packfile.h"
object-store: drop `loose_object_path()`
object-store: move and rename `odb_pack_keep()`
object-store: move function declarations to their respective subsystems
object-store: allow fetching objects via `has_object()`
treewide: trivial conversions of `repo_has_object_file()`
builtin/index-pack: don't fetch promised objects for collision check
builtin/show-ref: don't fetch objects when printing refs
refs: don't fetch promisor objects in `ref_resolves_to_object()`
http-walker: don't fetch objects via promisor remotes
list-objects: clarify how promised blobs are excluded
bulk-checkin: don't fetch promised objects on write
object-store: drop `repo_has_object_file()`
builtin/cat-file.c | 3 +-
builtin/clone.c | 4 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 3 +-
builtin/fetch.c | 15 ++--
builtin/gc.c | 2 +-
builtin/index-pack.c | 6 +-
builtin/receive-pack.c | 4 +-
builtin/remote.c | 3 +-
builtin/show-ref.c | 2 +-
builtin/unpack-objects.c | 3 +-
bulk-checkin.c | 2 +-
cache-tree.c | 13 +++-
fetch-pack.c | 7 +-
http-push.c | 11 ++-
http-walker.c | 7 +-
http.c | 4 +-
list-objects.c | 3 +-
notes.c | 3 +-
object-file.c | 4 +-
object-file.h | 77 +++++++++++++++++++
object-name.c | 2 +-
object-store.c | 44 ++---------
object-store.h | 191 +++--------------------------------------------
pack-objects.h | 1 +
packfile.h | 78 ++++++++++++++++++-
path.c | 14 ++++
path.h | 7 ++
prune-packed.c | 2 +-
reachable.c | 2 +-
reflog.c | 3 +-
refs.c | 2 +-
remote.c | 2 +-
send-pack.c | 5 +-
shallow.c | 9 ++-
upload-pack.c | 3 +-
walker.c | 3 +-
37 files changed, 265 insertions(+), 281 deletions(-)
Range-diff versus v1:
1: b4f8a00f4c4 = 1: 019c27227dc object-store: move `struct packed_git` into "packfile.h"
2: 8684c481949 = 2: b372e8214de object-store: drop `loose_object_path()`
3: f4f5127f44f ! 3: fa51af2ee24 object-store: move and rename `odb_pack_keep()`
@@ Commit message
The function `odb_pack_keep()` creates a file at the passed-in path. If
this fails, then the function re-tries by first creating any potentially
- missing leading directoriesk and then trying to create the file once
+ missing leading directories and then trying to create the file once
again. As such, this function doesn't host any kind of logic that is
specific to the object store, but is rather a generic helper function.
4: 9f94d3c4780 = 4: 04ad9a1b228 object-store: move function declarations to their respective subsystems
5: 0a187fe90db = 5: 3d45b334f4b object-store: allow fetching objects via `has_object()`
6: 4ebdf7510d2 = 6: 6101dfc393a treewide: trivial conversions of `repo_has_object_file()`
7: 739de6f8c67 ! 7: 35eca639ba4 builtin/index-pack: don't fetch promised objects for collision check
@@ Commit message
whether we already have an object with the same object ID, but different
contents in the repository.
- The check whether the collision check is really needed is performed via
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`. But unless
- explicitly told otherwise via `OBJECT_INFO_SKIP_FETCH_OBJECT`, this
- function will also cause us to fetch the object ID in case it is part of
- a promisor pack. As such, we may end up fetching the object only to
- check whether the fetched object and the object that we're indexing have
- the same content.
+ The check whether the collision check is really needed is only performed
+ in case `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` tells
+ us that the object exists. But unless explicitly told otherwise by
+ passing `OBJECT_INFO_SKIP_FETCH_OBJECT`, this function will also cause
+ us to fetch the object in case it is part of a promisor pack. As such,
+ we may end up fetching the object only to check whether the fetched
+ object and the object that we're indexing have the same content.
This behaviour is highly dubious and more likely than not unintended.
Fix it by converting to `has_object()`, which knows to neither reload
8: 0a79bfdbf14 = 8: 9975d86c59d builtin/show-ref: don't fetch objects when printing refs
9: 3a1ebffcdb0 ! 9: 41ad3e7ede2 refs: don't fetch promisor objects in `ref_resolves_to_object()`
@@ Commit message
Similar to the preceding commit, don't try to fetch objects pointed to
by references. Any reference whose object does not exist is broken by
- definition an, so we should report it accordingly.
+ definition, so we should report it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
10: a21cfe3dc91 = 10: e7bb54ba5bd http-walker: don't fetch objects via promisor remotes
11: b939734e9e6 = 11: bef052ff785 list-objects: clarify how promised blobs are excluded
12: 8e8a5041af3 = 12: 45023c6e96f bulk-checkin: don't fetch promised objects on write
13: 68deca60383 = 13: c263ae0f4cf object-store: drop `repo_has_object_file()`
---
base-commit: ca819c0751cedd1713334882e4c83687f8478a54
change-id: 20250422-pks-object-store-cleanups-5a6077014155
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH v2 01/13] object-store: move `struct packed_git` into "packfile.h"
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 02/13] object-store: drop `loose_object_path()` Patrick Steinhardt
` (12 subsequent siblings)
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
The "object-store.h" header contains the definition of `struct
packed_git`. As this structure hosts all kind of information about a
specific packfile it is arguably a bit out of place in a generic place
like "object-store.h".
Move the structure as well as `pack_map_entry_cmp()` into "packfile.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.h | 59 +---------------------------------------------------------
pack-objects.h | 1 +
packfile.h | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 60 insertions(+), 59 deletions(-)
diff --git a/object-store.h b/object-store.h
index 46961dc9542..e04469a85fb 100644
--- a/object-store.h
+++ b/object-store.h
@@ -92,65 +92,8 @@ struct oidtree *odb_loose_cache(struct object_directory *odb,
/* Empty the loose object cache for the specified object directory. */
void odb_clear_loose_cache(struct object_directory *odb);
-struct packed_git {
- struct hashmap_entry packmap_ent;
- struct packed_git *next;
- struct list_head mru;
- struct pack_window *windows;
- off_t pack_size;
- const void *index_data;
- size_t index_size;
- uint32_t num_objects;
- size_t crc_offset;
- struct oidset bad_objects;
- int index_version;
- time_t mtime;
- int pack_fd;
- int index; /* for builtin/pack-objects.c */
- unsigned pack_local:1,
- pack_keep:1,
- pack_keep_in_core:1,
- freshened:1,
- do_not_close:1,
- pack_promisor:1,
- multi_pack_index:1,
- is_cruft:1;
- unsigned char hash[GIT_MAX_RAWSZ];
- struct revindex_entry *revindex;
- const uint32_t *revindex_data;
- const uint32_t *revindex_map;
- size_t revindex_size;
- /*
- * mtimes_map points at the beginning of the memory mapped region of
- * this pack's corresponding .mtimes file, and mtimes_size is the size
- * of that .mtimes file
- */
- const uint32_t *mtimes_map;
- size_t mtimes_size;
-
- /* repo denotes the repository this packfile belongs to */
- struct repository *repo;
-
- /* something like ".git/objects/pack/xxxxx.pack" */
- char pack_name[FLEX_ARRAY]; /* more */
-};
-
+struct packed_git;
struct multi_pack_index;
-
-static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *entry,
- const struct hashmap_entry *entry2,
- const void *keydata)
-{
- const char *key = keydata;
- const struct packed_git *pg1, *pg2;
-
- pg1 = container_of(entry, const struct packed_git, packmap_ent);
- pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
- return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
struct cached_object_entry;
struct raw_object_store {
diff --git a/pack-objects.h b/pack-objects.h
index d1c4ae7f9b6..475a2d67ce3 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -4,6 +4,7 @@
#include "object-store.h"
#include "thread-utils.h"
#include "pack.h"
+#include "packfile.h"
struct repository;
diff --git a/packfile.h b/packfile.h
index 25097213d06..05499382397 100644
--- a/packfile.h
+++ b/packfile.h
@@ -1,13 +1,70 @@
#ifndef PACKFILE_H
#define PACKFILE_H
+#include "list.h"
#include "object.h"
#include "oidset.h"
/* in object-store.h */
-struct packed_git;
struct object_info;
+struct packed_git {
+ struct hashmap_entry packmap_ent;
+ struct packed_git *next;
+ struct list_head mru;
+ struct pack_window *windows;
+ off_t pack_size;
+ const void *index_data;
+ size_t index_size;
+ uint32_t num_objects;
+ size_t crc_offset;
+ struct oidset bad_objects;
+ int index_version;
+ time_t mtime;
+ int pack_fd;
+ int index; /* for builtin/pack-objects.c */
+ unsigned pack_local:1,
+ pack_keep:1,
+ pack_keep_in_core:1,
+ freshened:1,
+ do_not_close:1,
+ pack_promisor:1,
+ multi_pack_index:1,
+ is_cruft:1;
+ unsigned char hash[GIT_MAX_RAWSZ];
+ struct revindex_entry *revindex;
+ const uint32_t *revindex_data;
+ const uint32_t *revindex_map;
+ size_t revindex_size;
+ /*
+ * mtimes_map points at the beginning of the memory mapped region of
+ * this pack's corresponding .mtimes file, and mtimes_size is the size
+ * of that .mtimes file
+ */
+ const uint32_t *mtimes_map;
+ size_t mtimes_size;
+
+ /* repo denotes the repository this packfile belongs to */
+ struct repository *repo;
+
+ /* something like ".git/objects/pack/xxxxx.pack" */
+ char pack_name[FLEX_ARRAY]; /* more */
+};
+
+static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
+ const struct hashmap_entry *entry,
+ const struct hashmap_entry *entry2,
+ const void *keydata)
+{
+ const char *key = keydata;
+ const struct packed_git *pg1, *pg2;
+
+ pg1 = container_of(entry, const struct packed_git, packmap_ent);
+ pg2 = container_of(entry2, const struct packed_git, packmap_ent);
+
+ return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
+}
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 02/13] object-store: drop `loose_object_path()`
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
` (11 subsequent siblings)
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
The function `loose_object_path()` is a trivial wrapper around
`odb_loose_path()`, with the only exception that it always uses the
primary object database of the given repository. This doesn't really add
a ton of value though, so let's drop the function and inline it at every
callsite.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
http-walker.c | 3 ++-
http.c | 4 ++--
object-file.c | 4 ++--
object-file.h | 4 ++++
object-store.c | 6 ------
object-store.h | 7 -------
6 files changed, 10 insertions(+), 18 deletions(-)
diff --git a/http-walker.c b/http-walker.c
index 882cae19c24..95458e2f638 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -9,6 +9,7 @@
#include "list.h"
#include "transport.h"
#include "packfile.h"
+#include "object-file.h"
#include "object-store.h"
struct alt_base {
@@ -540,7 +541,7 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
- loose_object_path(the_repository, &buf, &req->oid);
+ odb_loose_path(the_repository->objects->odb, &buf, &req->oid);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release(&buf);
}
diff --git a/http.c b/http.c
index 0c411380425..3c029cf8947 100644
--- a/http.c
+++ b/http.c
@@ -2662,7 +2662,7 @@ struct http_object_request *new_http_object_request(const char *base_url,
oidcpy(&freq->oid, oid);
freq->localfile = -1;
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
strbuf_addf(&freq->tmpfile, "%s.temp", filename.buf);
strbuf_addf(&prevfile, "%s.prev", filename.buf);
@@ -2814,7 +2814,7 @@ int finish_http_object_request(struct http_object_request *freq)
unlink_or_warn(freq->tmpfile.buf);
return -1;
}
- loose_object_path(the_repository, &filename, &freq->oid);
+ odb_loose_path(the_repository->objects->odb, &filename, &freq->oid);
freq->rename = finalize_object_file(freq->tmpfile.buf, filename.buf);
strbuf_release(&filename);
diff --git a/object-file.c b/object-file.c
index 9cc3a24a40d..dc56a4766df 100644
--- a/object-file.c
+++ b/object-file.c
@@ -932,7 +932,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
prepare_loose_object_bulk_checkin();
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
fd = start_loose_object_common(&tmp_file, filename.buf, flags,
&stream, compressed, sizeof(compressed),
@@ -1079,7 +1079,7 @@ int stream_loose_object(struct input_stream *in_stream, size_t len,
goto cleanup;
}
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
/* We finally know the object path, and create the missing dir. */
dirlen = directory_size(filename.buf);
diff --git a/object-file.h b/object-file.h
index c002fbe2345..0a7b6b9f9d9 100644
--- a/object-file.h
+++ b/object-file.h
@@ -25,6 +25,10 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
struct object_directory;
+/*
+ * Put in `buf` the name of the file in the local object database that
+ * would be used to store a loose object with the specified oid.
+ */
const char *odb_loose_path(struct object_directory *odb,
struct strbuf *buf,
const struct object_id *oid);
diff --git a/object-store.c b/object-store.c
index 6ab50d25d3e..e5cfb8c0079 100644
--- a/object-store.c
+++ b/object-store.c
@@ -96,12 +96,6 @@ int odb_pack_keep(const char *name)
return open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
}
-const char *loose_object_path(struct repository *r, struct strbuf *buf,
- const struct object_id *oid)
-{
- return odb_loose_path(r->objects->odb, buf, oid);
-}
-
/*
* Return non-zero iff the path is usable as an alternate object database.
*/
diff --git a/object-store.h b/object-store.h
index e04469a85fb..5668de62d01 100644
--- a/object-store.h
+++ b/object-store.h
@@ -196,13 +196,6 @@ int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
*/
int odb_pack_keep(const char *name);
-/*
- * Put in `buf` the name of the file in the local object database that
- * would be used to store a loose object with the specified oid.
- */
-const char *loose_object_path(struct repository *r, struct strbuf *buf,
- const struct object_id *oid);
-
void *map_loose_object(struct repository *r, const struct object_id *oid,
unsigned long *size);
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 03/13] object-store: move and rename `odb_pack_keep()`
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 02/13] object-store: drop `loose_object_path()` Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 04/13] object-store: move function declarations to their respective subsystems Patrick Steinhardt
` (10 subsequent siblings)
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
The function `odb_pack_keep()` creates a file at the passed-in path. If
this fails, then the function re-tries by first creating any potentially
missing leading directories and then trying to create the file once
again. As such, this function doesn't host any kind of logic that is
specific to the object store, but is rather a generic helper function.
Rename the function to `safe_create_file_with_leading_directories()` and
move it into "path.c". While at it, refactor it so that it loses its
dependency on `the_repository`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 3 ++-
builtin/index-pack.c | 2 +-
object-store.c | 13 -------------
object-store.h | 7 -------
path.c | 14 ++++++++++++++
path.h | 7 +++++++
6 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index c1e198f4e34..b2839c5f439 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -811,7 +811,8 @@ static char *keep_pack(const char *curr_index_name)
int keep_fd;
odb_pack_name(pack_data->repo, &name, pack_data->hash, "keep");
- keep_fd = odb_pack_keep(name.buf);
+ keep_fd = safe_create_file_with_leading_directories(pack_data->repo,
+ name.buf);
if (keep_fd < 0)
die_errno("cannot create keep file");
write_or_die(keep_fd, keep_msg, strlen(keep_msg));
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 60a8ee05dbc..f49431d626b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,7 +1565,7 @@ static void write_special_file(const char *suffix, const char *msg,
else
filename = odb_pack_name(the_repository, &name_buf, hash, suffix);
- fd = odb_pack_keep(filename);
+ fd = safe_create_file_with_leading_directories(the_repository, filename);
if (fd < 0) {
if (errno != EEXIST)
die_errno(_("cannot write %s file '%s'"),
diff --git a/object-store.c b/object-store.c
index e5cfb8c0079..0cbad5a19a0 100644
--- a/object-store.c
+++ b/object-store.c
@@ -83,19 +83,6 @@ int odb_mkstemp(struct strbuf *temp_filename, const char *pattern)
return xmkstemp_mode(temp_filename->buf, mode);
}
-int odb_pack_keep(const char *name)
-{
- int fd;
-
- fd = open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
- if (0 <= fd)
- return fd;
-
- /* slow path */
- safe_create_leading_directories_const(the_repository, name);
- return open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
-}
-
/*
* Return non-zero iff the path is usable as an alternate object database.
*/
diff --git a/object-store.h b/object-store.h
index 5668de62d01..aa8fc63043e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -189,13 +189,6 @@ void raw_object_store_clear(struct raw_object_store *o);
*/
int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
-/*
- * Create a pack .keep file named "name" (which should generally be the output
- * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on
- * error.
- */
-int odb_pack_keep(const char *name);
-
void *map_loose_object(struct repository *r, const struct object_id *oid,
unsigned long *size);
diff --git a/path.c b/path.c
index 4505bb78e8b..3b598b2847f 100644
--- a/path.c
+++ b/path.c
@@ -1011,6 +1011,20 @@ enum scld_error safe_create_leading_directories_const(struct repository *repo,
return result;
}
+int safe_create_file_with_leading_directories(struct repository *repo,
+ const char *path)
+{
+ int fd;
+
+ fd = open(path, O_RDWR|O_CREAT|O_EXCL, 0600);
+ if (0 <= fd)
+ return fd;
+
+ /* slow path */
+ safe_create_leading_directories_const(repo, path);
+ return open(path, O_RDWR|O_CREAT|O_EXCL, 0600);
+}
+
static int have_same_root(const char *path1, const char *path2)
{
int is_abs1, is_abs2;
diff --git a/path.h b/path.h
index fd1a194b060..e67348f2539 100644
--- a/path.h
+++ b/path.h
@@ -266,6 +266,13 @@ enum scld_error safe_create_leading_directories_const(struct repository *repo,
const char *path);
enum scld_error safe_create_leading_directories_no_share(char *path);
+/*
+ * Create a file, potentially creating its leading directories in case they
+ * don't exist. Returns the return value of the open(3p) call.
+ */
+int safe_create_file_with_leading_directories(struct repository *repo,
+ const char *path);
+
# ifdef USE_THE_REPOSITORY_VARIABLE
# include "strbuf.h"
# include "repository.h"
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 04/13] object-store: move function declarations to their respective subsystems
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (2 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
` (9 subsequent siblings)
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
We carry declarations for a couple of functions in "object-store.h" that
are not defined in "object-store.c", but in a different subsystem. Move
these declarations to the respective headers whose matching code files
carry the corresponding definition.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/count-objects.c | 2 +-
builtin/gc.c | 2 +-
object-file.h | 73 +++++++++++++++++++++++++++++++++++++++
object-name.c | 2 +-
object-store.h | 91 +------------------------------------------------
packfile.h | 19 +++++++++++
prune-packed.c | 2 +-
reachable.c | 2 +-
8 files changed, 98 insertions(+), 95 deletions(-)
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 0bb5360b2f2..a88c0c9c09a 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -12,7 +12,7 @@
#include "parse-options.h"
#include "quote.h"
#include "packfile.h"
-#include "object-store.h"
+#include "object-file.h"
static unsigned long garbage;
static off_t size_garbage;
diff --git a/builtin/gc.c b/builtin/gc.c
index b5ce1d32766..4d428f3253d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -28,7 +28,7 @@
#include "commit.h"
#include "commit-graph.h"
#include "packfile.h"
-#include "object-store.h"
+#include "object-file.h"
#include "pack.h"
#include "pack-objects.h"
#include "path.h"
diff --git a/object-file.h b/object-file.h
index 0a7b6b9f9d9..de6dd205ed8 100644
--- a/object-file.h
+++ b/object-file.h
@@ -3,6 +3,7 @@
#include "git-zlib.h"
#include "object.h"
+#include "object-store.h"
struct index_state;
@@ -25,6 +26,16 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
struct object_directory;
+/*
+ * Populate and return the loose object cache array corresponding to the
+ * given object ID.
+ */
+struct oidtree *odb_loose_cache(struct object_directory *odb,
+ const struct object_id *oid);
+
+/* Empty the loose object cache for the specified object directory. */
+void odb_clear_loose_cache(struct object_directory *odb);
+
/*
* Put in `buf` the name of the file in the local object database that
* would be used to store a loose object with the specified oid.
@@ -42,6 +53,68 @@ int has_loose_object_nonlocal(const struct object_id *);
int has_loose_object(const struct object_id *);
+void *map_loose_object(struct repository *r, const struct object_id *oid,
+ unsigned long *size);
+
+/*
+ * Iterate over the files in the loose-object parts of the object
+ * directory "path", triggering the following callbacks:
+ *
+ * - loose_object is called for each loose object we find.
+ *
+ * - loose_cruft is called for any files that do not appear to be
+ * loose objects. Note that we only look in the loose object
+ * directories "objects/[0-9a-f]{2}/", so we will not report
+ * "objects/foobar" as cruft.
+ *
+ * - loose_subdir is called for each top-level hashed subdirectory
+ * of the object directory (e.g., "$OBJDIR/f0"). It is called
+ * after the objects in the directory are processed.
+ *
+ * Any callback that is NULL will be ignored. Callbacks returning non-zero
+ * will end the iteration.
+ *
+ * In the "buf" variant, "path" is a strbuf which will also be used as a
+ * scratch buffer, but restored to its original contents before
+ * the function returns.
+ */
+typedef int each_loose_object_fn(const struct object_id *oid,
+ const char *path,
+ void *data);
+typedef int each_loose_cruft_fn(const char *basename,
+ const char *path,
+ void *data);
+typedef int each_loose_subdir_fn(unsigned int nr,
+ const char *path,
+ void *data);
+int for_each_file_in_obj_subdir(unsigned int subdir_nr,
+ struct strbuf *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+int for_each_loose_file_in_objdir(const char *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+int for_each_loose_file_in_objdir_buf(struct strbuf *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+
+/*
+ * Iterate over all accessible loose objects without respect to
+ * reachability. By default, this includes both local and alternate objects.
+ * The order in which objects are visited is unspecified.
+ *
+ * Any flags specific to packs are ignored.
+ */
+int for_each_loose_object(each_loose_object_fn, void *,
+ enum for_each_object_flags flags);
+
+
/**
* format_object_header() is a thin wrapper around s xsnprintf() that
* writes the initial "<type> <obj-len>" part of the loose object
diff --git a/object-name.c b/object-name.c
index 2c751a5352a..9288b2dd245 100644
--- a/object-name.c
+++ b/object-name.c
@@ -19,7 +19,7 @@
#include "oidtree.h"
#include "packfile.h"
#include "pretty.h"
-#include "object-store.h"
+#include "object-file.h"
#include "read-cache-ll.h"
#include "repo-settings.h"
#include "repository.h"
diff --git a/object-store.h b/object-store.h
index aa8fc63043e..5bbdaba92d1 100644
--- a/object-store.h
+++ b/object-store.h
@@ -82,16 +82,6 @@ struct object_directory *set_temporary_primary_odb(const char *dir, int will_des
*/
void restore_primary_odb(struct object_directory *restore_odb, const char *old_path);
-/*
- * Populate and return the loose object cache array corresponding to the
- * given object ID.
- */
-struct oidtree *odb_loose_cache(struct object_directory *odb,
- const struct object_id *oid);
-
-/* Empty the loose object cache for the specified object directory. */
-void odb_clear_loose_cache(struct object_directory *odb);
-
struct packed_git;
struct multi_pack_index;
struct cached_object_entry;
@@ -189,9 +179,6 @@ void raw_object_store_clear(struct raw_object_store *o);
*/
int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
-void *map_loose_object(struct repository *r, const struct object_id *oid,
- unsigned long *size);
-
void *repo_read_object_file(struct repository *r,
const struct object_id *oid,
enum object_type *type,
@@ -340,56 +327,7 @@ static inline void obj_read_unlock(void)
if(obj_read_use_lock)
pthread_mutex_unlock(&obj_read_mutex);
}
-
-/*
- * Iterate over the files in the loose-object parts of the object
- * directory "path", triggering the following callbacks:
- *
- * - loose_object is called for each loose object we find.
- *
- * - loose_cruft is called for any files that do not appear to be
- * loose objects. Note that we only look in the loose object
- * directories "objects/[0-9a-f]{2}/", so we will not report
- * "objects/foobar" as cruft.
- *
- * - loose_subdir is called for each top-level hashed subdirectory
- * of the object directory (e.g., "$OBJDIR/f0"). It is called
- * after the objects in the directory are processed.
- *
- * Any callback that is NULL will be ignored. Callbacks returning non-zero
- * will end the iteration.
- *
- * In the "buf" variant, "path" is a strbuf which will also be used as a
- * scratch buffer, but restored to its original contents before
- * the function returns.
- */
-typedef int each_loose_object_fn(const struct object_id *oid,
- const char *path,
- void *data);
-typedef int each_loose_cruft_fn(const char *basename,
- const char *path,
- void *data);
-typedef int each_loose_subdir_fn(unsigned int nr,
- const char *path,
- void *data);
-int for_each_file_in_obj_subdir(unsigned int subdir_nr,
- struct strbuf *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-int for_each_loose_file_in_objdir(const char *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-int for_each_loose_file_in_objdir_buf(struct strbuf *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-
-/* Flags for for_each_*_object() below. */
+/* Flags for for_each_*_object(). */
enum for_each_object_flags {
/* Iterate only over local objects, not alternates. */
FOR_EACH_OBJECT_LOCAL_ONLY = (1<<0),
@@ -409,33 +347,6 @@ enum for_each_object_flags {
FOR_EACH_OBJECT_SKIP_ON_DISK_KEPT_PACKS = (1<<4),
};
-/*
- * Iterate over all accessible loose objects without respect to
- * reachability. By default, this includes both local and alternate objects.
- * The order in which objects are visited is unspecified.
- *
- * Any flags specific to packs are ignored.
- */
-int for_each_loose_object(each_loose_object_fn, void *,
- enum for_each_object_flags flags);
-
-/*
- * Iterate over all accessible packed objects without respect to reachability.
- * By default, this includes both local and alternate packs.
- *
- * Note that some objects may appear twice if they are found in multiple packs.
- * Each pack is visited in an unspecified order. By default, objects within a
- * pack are visited in pack-idx order (i.e., sorted by oid).
- */
-typedef int each_packed_object_fn(const struct object_id *oid,
- struct packed_git *pack,
- uint32_t pos,
- void *data);
-int for_each_object_in_pack(struct packed_git *p,
- each_packed_object_fn, void *data,
- enum for_each_object_flags flags);
-int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
- void *data, enum for_each_object_flags flags);
void *read_object_with_reference(struct repository *r,
const struct object_id *oid,
diff --git a/packfile.h b/packfile.h
index 05499382397..3a3c77cf05a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -3,6 +3,7 @@
#include "list.h"
#include "object.h"
+#include "object-store.h"
#include "oidset.h"
/* in object-store.h */
@@ -117,6 +118,24 @@ void for_each_file_in_pack_dir(const char *objdir,
each_file_in_pack_dir_fn fn,
void *data);
+/*
+ * Iterate over all accessible packed objects without respect to reachability.
+ * By default, this includes both local and alternate packs.
+ *
+ * Note that some objects may appear twice if they are found in multiple packs.
+ * Each pack is visited in an unspecified order. By default, objects within a
+ * pack are visited in pack-idx order (i.e., sorted by oid).
+ */
+typedef int each_packed_object_fn(const struct object_id *oid,
+ struct packed_git *pack,
+ uint32_t pos,
+ void *data);
+int for_each_object_in_pack(struct packed_git *p,
+ each_packed_object_fn, void *data,
+ enum for_each_object_flags flags);
+int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
+ void *data, enum for_each_object_flags flags);
+
/* A hook to report invalid files in pack directory */
#define PACKDIR_FILE_PACK 1
#define PACKDIR_FILE_IDX 2
diff --git a/prune-packed.c b/prune-packed.c
index c1d95a519d7..92fb4fbb0ed 100644
--- a/prune-packed.c
+++ b/prune-packed.c
@@ -2,7 +2,7 @@
#include "git-compat-util.h"
#include "gettext.h"
-#include "object-store.h"
+#include "object-file.h"
#include "packfile.h"
#include "progress.h"
#include "prune-packed.h"
diff --git a/reachable.c b/reachable.c
index e5f56f40181..9dc748f0b9a 100644
--- a/reachable.c
+++ b/reachable.c
@@ -14,7 +14,7 @@
#include "list-objects.h"
#include "packfile.h"
#include "worktree.h"
-#include "object-store.h"
+#include "object-file.h"
#include "pack-bitmap.h"
#include "pack-mtimes.h"
#include "config.h"
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 05/13] object-store: allow fetching objects via `has_object()`
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (3 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 04/13] object-store: move function declarations to their respective subsystems Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
` (8 subsequent siblings)
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
We're about to fully remove `repo_has_object_file()` in favor of
`has_object()` given that the latter has better defaults: it neither
reloads packfiles by default nor does it fetch any promised objects in
case they are missing.
The latter usecase keeps us from converting a couple of callsites that
currently do fetch objects though. It is not really clear whether _all_
of those callsites should be fetching objects, but for a subset of them
it is the desired behaviour indeed.
Introduce a new flag `HAS_OBJECT_FETCH_PROMISOR` that causes the
function to optionally fetch missing objects which are part of a
promisor pack.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.c | 9 ++++++---
object-store.h | 10 +++++++---
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/object-store.c b/object-store.c
index 0cbad5a19a0..0d873868a6d 100644
--- a/object-store.c
+++ b/object-store.c
@@ -937,12 +937,15 @@ void *read_object_with_reference(struct repository *r,
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags)
{
- int quick = !(flags & HAS_OBJECT_RECHECK_PACKED);
- unsigned object_info_flags = OBJECT_INFO_SKIP_FETCH_OBJECT |
- (quick ? OBJECT_INFO_QUICK : 0);
+ unsigned object_info_flags = 0;
if (!startup_info->have_repository)
return 0;
+ if (!(flags & HAS_OBJECT_RECHECK_PACKED))
+ object_info_flags |= OBJECT_INFO_QUICK;
+ if (!(flags & HAS_OBJECT_FETCH_PROMISOR))
+ object_info_flags |= OBJECT_INFO_SKIP_FETCH_OBJECT;
+
return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
}
diff --git a/object-store.h b/object-store.h
index 5bbdaba92d1..c6055376f49 100644
--- a/object-store.h
+++ b/object-store.h
@@ -266,12 +266,16 @@ int oid_object_info_extended(struct repository *r,
const struct object_id *,
struct object_info *, unsigned flags);
-/* Retry packed storage after checking packed and loose storage */
-#define HAS_OBJECT_RECHECK_PACKED 1
+enum {
+ /* Retry packed storage after checking packed and loose storage */
+ HAS_OBJECT_RECHECK_PACKED = (1 << 0),
+ /* Allow fetching the object in case the repository has a promisor remote. */
+ HAS_OBJECT_FETCH_PROMISOR = (1 << 1),
+};
/*
* Returns 1 if the object exists. This function will not lazily fetch objects
- * in a partial clone.
+ * in a partial clone by default.
*/
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags);
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 06/13] treewide: trivial conversions of `repo_has_object_file()`
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (4 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-28 21:40 ` Junio C Hamano
2025-04-25 7:08 ` [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
` (7 subsequent siblings)
13 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
favor of `has_object()`. The benefit of the replacement is better
defaults: it doesn't fetch missing objects via promisor remotes, and
neither does it reload packfiles if an object wasn't found by default.
Start sunsetting the functions by replacing trivial callsites of it with
`has_object()`:
- `repo_has_object_file(...)` is equivalent to
`has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
While converting callsites to use `HAS_OBJECT_FETCH_PROMISOR` retains
the current behaviour, it is not entirely clear whether all of them
really should be fetching promised objects. A subset of such callsites
where it is obvious that they really shouldn't fetch objects has been
left out of this commit and will be adapted over the next couple of
commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/cat-file.c | 3 ++-
builtin/clone.c | 4 +---
builtin/fetch.c | 15 +++++++--------
builtin/receive-pack.c | 4 +++-
builtin/remote.c | 3 ++-
builtin/unpack-objects.c | 3 ++-
cache-tree.c | 13 +++++++++----
fetch-pack.c | 7 +++----
http-push.c | 11 +++++++----
notes.c | 3 ++-
object-store.c | 2 +-
reflog.c | 3 ++-
remote.c | 2 +-
send-pack.c | 5 +----
shallow.c | 9 ++++++---
upload-pack.c | 3 +--
walker.c | 3 ++-
17 files changed, 52 insertions(+), 41 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 0e3f10a9467..3914a2a3f61 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -169,7 +169,8 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
goto cleanup;
case 'e':
- ret = !repo_has_object_file(the_repository, &oid);
+ ret = !has_object(the_repository, &oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR);
goto cleanup;
case 'w':
diff --git a/builtin/clone.c b/builtin/clone.c
index 6b1d11a3ed2..b498b81a043 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -504,9 +504,7 @@ static void write_followtags(const struct ref *refs, const char *msg)
continue;
if (ends_with(ref->name, "^{}"))
continue;
- if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid,
- OBJECT_INFO_QUICK |
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &ref->old_oid, 0))
continue;
refs_update_ref(get_main_ref_store(the_repository), msg,
ref->name, &ref->old_oid, NULL, 0,
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 95589b49948..aadcf49a5b4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -337,7 +337,6 @@ static void find_non_local_tags(const struct ref *refs,
struct string_list_item *remote_ref_item;
const struct ref *ref;
struct refname_hash_entry *item = NULL;
- const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
refname_hash_init(&existing_refs);
refname_hash_init(&remote_refs);
@@ -367,9 +366,9 @@ static void find_non_local_tags(const struct ref *refs,
*/
if (ends_with(ref->name, "^{}")) {
if (item &&
- !repo_has_object_file_with_flags(the_repository, &ref->old_oid, quick_flags) &&
+ !has_object(the_repository, &ref->old_oid, 0) &&
!oidset_contains(&fetch_oids, &ref->old_oid) &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
item = NULL;
@@ -383,7 +382,7 @@ static void find_non_local_tags(const struct ref *refs,
* fetch.
*/
if (item &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
@@ -404,7 +403,7 @@ static void find_non_local_tags(const struct ref *refs,
* checked to see if it needs fetching.
*/
if (item &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
@@ -911,7 +910,8 @@ static int update_local_ref(struct ref *ref,
struct commit *current = NULL, *updated;
int fast_forward = 0;
- if (!repo_has_object_file(the_repository, &ref->new_oid))
+ if (!has_object(the_repository, &ref->new_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
die(_("object %s not found"), oid_to_hex(&ref->new_oid));
if (oideq(&ref->old_oid, &ref->new_oid)) {
@@ -1330,8 +1330,7 @@ static int check_exist_and_connected(struct ref *ref_map)
* we need all direct targets to exist.
*/
for (r = rm; r; r = r->next) {
- if (!repo_has_object_file_with_flags(the_repository, &r->old_oid,
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &r->old_oid, HAS_OBJECT_RECHECK_PACKED))
return -1;
}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index be314879e82..c92e57ba188 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1506,7 +1506,9 @@ static const char *update(struct command *cmd, struct shallow_info *si)
}
}
- if (!is_null_oid(new_oid) && !repo_has_object_file(the_repository, new_oid)) {
+ if (!is_null_oid(new_oid) &&
+ !has_object(the_repository, new_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
error("unpack should have generated %s, "
"but I can't find it!", oid_to_hex(new_oid));
ret = "bad pack";
diff --git a/builtin/remote.c b/builtin/remote.c
index b4baa34e665..0d6755bcb71 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -454,7 +454,8 @@ static int get_push_ref_states(const struct ref *remote_refs,
info->status = PUSH_STATUS_UPTODATE;
else if (is_null_oid(&ref->old_oid))
info->status = PUSH_STATUS_CREATE;
- else if (repo_has_object_file(the_repository, &ref->old_oid) &&
+ else if (has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) &&
ref_newer(&ref->new_oid, &ref->old_oid))
info->status = PUSH_STATUS_FASTFORWARD;
else
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 661be789f13..e905d5f4e19 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -449,7 +449,8 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
delta_data = get_data(delta_size);
if (!delta_data)
return;
- if (repo_has_object_file(the_repository, &base_oid))
+ if (has_object(the_repository, &base_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
; /* Ok we have this one */
else if (resolve_against_held(nr, &base_oid,
delta_data, delta_size))
diff --git a/cache-tree.c b/cache-tree.c
index c0e1e9ee1d4..fa3858e2829 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -238,7 +238,9 @@ int cache_tree_fully_valid(struct cache_tree *it)
int i;
if (!it)
return 0;
- if (it->entry_count < 0 || !repo_has_object_file(the_repository, &it->oid))
+ if (it->entry_count < 0 ||
+ has_object(the_repository, &it->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0;
for (i = 0; i < it->subtree_nr; i++) {
if (!cache_tree_fully_valid(it->down[i]->cache_tree))
@@ -289,7 +291,9 @@ static int update_one(struct cache_tree *it,
}
}
- if (0 <= it->entry_count && repo_has_object_file(the_repository, &it->oid))
+ if (0 <= it->entry_count &&
+ has_object(the_repository, &it->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return it->entry_count;
/*
@@ -395,7 +399,8 @@ static int update_one(struct cache_tree *it,
ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
!must_check_existence(ce);
if (is_null_oid(oid) ||
- (!ce_missing_ok && !repo_has_object_file(the_repository, oid))) {
+ (!ce_missing_ok && !has_object(the_repository, oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))) {
strbuf_release(&buffer);
if (expected_missing)
return -1;
@@ -443,7 +448,7 @@ static int update_one(struct cache_tree *it,
struct object_id oid;
hash_object_file(the_hash_algo, buffer.buf, buffer.len,
OBJ_TREE, &oid);
- if (repo_has_object_file_with_flags(the_repository, &oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (has_object(the_repository, &oid, HAS_OBJECT_RECHECK_PACKED))
oidcpy(&it->oid, &oid);
else
to_invalidate = 1;
diff --git a/fetch-pack.c b/fetch-pack.c
index 210dc30d50f..fa4231fee74 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -769,9 +769,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
if (!commit) {
struct object *o;
- if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid,
- OBJECT_INFO_QUICK |
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &ref->old_oid, 0))
continue;
o = parse_object(the_repository, &ref->old_oid);
if (!o || o->type != OBJ_COMMIT)
@@ -1985,7 +1983,8 @@ static void update_shallow(struct fetch_pack_args *args,
struct oid_array extra = OID_ARRAY_INIT;
struct object_id *oid = si->shallow->oid;
for (i = 0; i < si->shallow->nr; i++)
- if (repo_has_object_file(the_repository, &oid[i]))
+ if (has_object(the_repository, &oid[i],
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
oid_array_append(&extra, &oid[i]);
if (extra.nr) {
setup_alternate_shallow(&shallow_lock,
diff --git a/http-push.c b/http-push.c
index 32e37565f4e..f9e67cabd4b 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1446,7 +1446,9 @@ static void one_remote_ref(const char *refname)
* Fetch a copy of the object if it doesn't exist locally - it
* may be required for updating server info later.
*/
- if (repo->can_update_info_refs && !repo_has_object_file(the_repository, &ref->old_oid)) {
+ if (repo->can_update_info_refs &&
+ !has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
obj = lookup_unknown_object(the_repository, &ref->old_oid);
fprintf(stderr, " fetch %s for %s\n",
oid_to_hex(&ref->old_oid), refname);
@@ -1651,14 +1653,14 @@ static int delete_remote_branch(const char *pattern, int force)
return error("Remote HEAD symrefs too deep");
if (is_null_oid(&head_oid))
return error("Unable to resolve remote HEAD");
- if (!repo_has_object_file(the_repository, &head_oid))
+ if (!has_object(the_repository, &head_oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return error("Remote HEAD resolves to object %s\nwhich does not exist locally, perhaps you need to fetch?", oid_to_hex(&head_oid));
/* Remote branch must resolve to a known object */
if (is_null_oid(&remote_ref->old_oid))
return error("Unable to resolve remote branch %s",
remote_ref->name);
- if (!repo_has_object_file(the_repository, &remote_ref->old_oid))
+ if (!has_object(the_repository, &remote_ref->old_oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return error("Remote branch %s resolves to object %s\nwhich does not exist locally, perhaps you need to fetch?", remote_ref->name, oid_to_hex(&remote_ref->old_oid));
/* Remote branch must be an ancestor of remote HEAD */
@@ -1879,7 +1881,8 @@ int cmd_main(int argc, const char **argv)
if (!force_all &&
!is_null_oid(&ref->old_oid) &&
!ref->force) {
- if (!repo_has_object_file(the_repository, &ref->old_oid) ||
+ if (!has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) ||
!ref_newer(&ref->peer_ref->new_oid,
&ref->old_oid)) {
/*
diff --git a/notes.c b/notes.c
index d9645c4b5dc..0a128f1de98 100644
--- a/notes.c
+++ b/notes.c
@@ -794,7 +794,8 @@ static int prune_notes_helper(const struct object_id *object_oid,
struct note_delete_list **l = (struct note_delete_list **) cb_data;
struct note_delete_list *n;
- if (repo_has_object_file(the_repository, object_oid))
+ if (has_object(the_repository, object_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0; /* nothing to do for this note */
/* failed to find object => prune this note */
diff --git a/object-store.c b/object-store.c
index 0d873868a6d..2db34804e8f 100644
--- a/object-store.c
+++ b/object-store.c
@@ -847,7 +847,7 @@ int pretend_object_file(struct repository *repo,
char *co_buf;
hash_object_file(repo->hash_algo, buf, len, type, oid);
- if (repo_has_object_file_with_flags(repo, oid, OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT) ||
+ if (has_object(repo, oid, 0) ||
find_cached_object(repo->objects, oid))
return 0;
diff --git a/reflog.c b/reflog.c
index c6c534dfad4..2db1cc81d01 100644
--- a/reflog.c
+++ b/reflog.c
@@ -152,7 +152,8 @@ static int tree_is_complete(const struct object_id *oid)
init_tree_desc(&desc, &tree->object.oid, tree->buffer, tree->size);
complete = 1;
while (tree_entry(&desc, &entry)) {
- if (!repo_has_object_file(the_repository, &entry.oid) ||
+ if (!has_object(the_repository, &entry.oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) ||
(S_ISDIR(entry.mode) && !tree_is_complete(&entry.oid))) {
tree->object.flags |= INCOMPLETE;
complete = 0;
diff --git a/remote.c b/remote.c
index 9fa3614e7a3..4099183cacd 100644
--- a/remote.c
+++ b/remote.c
@@ -1702,7 +1702,7 @@ void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
if (!reject_reason && !ref->deletion && !is_null_oid(&ref->old_oid)) {
if (starts_with(ref->name, "refs/tags/"))
reject_reason = REF_STATUS_REJECT_ALREADY_EXISTS;
- else if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
+ else if (!has_object(the_repository, &ref->old_oid, HAS_OBJECT_RECHECK_PACKED))
reject_reason = REF_STATUS_REJECT_FETCH_FIRST;
else if (!lookup_commit_reference_gently(the_repository, &ref->old_oid, 1) ||
!lookup_commit_reference_gently(the_repository, &ref->new_oid, 1))
diff --git a/send-pack.c b/send-pack.c
index 5005689cb55..86592ce526d 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -45,10 +45,7 @@ int option_parse_push_signed(const struct option *opt,
static void feed_object(struct repository *r,
const struct object_id *oid, FILE *fh, int negative)
{
- if (negative &&
- !repo_has_object_file_with_flags(r, oid,
- OBJECT_INFO_SKIP_FETCH_OBJECT |
- OBJECT_INFO_QUICK))
+ if (negative && !has_object(r, oid, 0))
return;
if (negative)
diff --git a/shallow.c b/shallow.c
index 2f82ebd6e3f..faeeeb45f98 100644
--- a/shallow.c
+++ b/shallow.c
@@ -310,7 +310,8 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
if (graft->nr_parent != -1)
return 0;
if (data->flags & QUICK) {
- if (!repo_has_object_file(the_repository, &graft->oid))
+ if (!has_object(the_repository, &graft->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0;
} else if (data->flags & SEEN_ONLY) {
struct commit *c = lookup_commit(the_repository, &graft->oid);
@@ -476,7 +477,8 @@ void prepare_shallow_info(struct shallow_info *info, struct oid_array *sa)
ALLOC_ARRAY(info->ours, sa->nr);
ALLOC_ARRAY(info->theirs, sa->nr);
for (size_t i = 0; i < sa->nr; i++) {
- if (repo_has_object_file(the_repository, sa->oid + i)) {
+ if (has_object(the_repository, sa->oid + i,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
struct commit_graft *graft;
graft = lookup_commit_graft(the_repository,
&sa->oid[i]);
@@ -513,7 +515,8 @@ void remove_nonexistent_theirs_shallow(struct shallow_info *info)
for (i = dst = 0; i < info->nr_theirs; i++) {
if (i != dst)
info->theirs[dst] = info->theirs[i];
- if (repo_has_object_file(the_repository, oid + info->theirs[i]))
+ if (has_object(the_repository, oid + info->theirs[i],
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
dst++;
}
info->nr_theirs = dst;
diff --git a/upload-pack.c b/upload-pack.c
index 30e4630f3a1..956da5b061a 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -509,8 +509,7 @@ static int got_oid(struct upload_pack_data *data,
{
if (get_oid_hex(hex, oid))
die("git upload-pack: expected SHA1 object, got '%s'", hex);
- if (!repo_has_object_file_with_flags(the_repository, oid,
- OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, oid, 0))
return -1;
return do_got_oid(data, oid);
}
diff --git a/walker.c b/walker.c
index 4fedc19f346..b470d43e54d 100644
--- a/walker.c
+++ b/walker.c
@@ -150,7 +150,8 @@ static int process(struct walker *walker, struct object *obj)
return 0;
obj->flags |= SEEN;
- if (repo_has_object_file(the_repository, &obj->oid)) {
+ if (has_object(the_repository, &obj->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
/* We already have it, so we should scan it now. */
obj->flags |= TO_SCAN;
}
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (5 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-28 21:46 ` Junio C Hamano
2025-04-25 7:08 ` [PATCH v2 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
` (6 subsequent siblings)
13 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
Any packed objects indexed via git-index-pack(1) are subject to a
collision check. This collision check has the intent to determine
whether we already have an object with the same object ID, but different
contents in the repository.
The check whether the collision check is really needed is only performed
in case `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` tells
us that the object exists. But unless explicitly told otherwise by
passing `OBJECT_INFO_SKIP_FETCH_OBJECT`, this function will also cause
us to fetch the object in case it is part of a promisor pack. As such,
we may end up fetching the object only to check whether the fetched
object and the object that we're indexing have the same content.
This behaviour is highly dubious and more likely than not unintended.
Fix it by converting to `has_object()`, which knows to neither reload
packfiles nor to fetch promisor objects by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/index-pack.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f49431d626b..805b7aa1e28 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -892,9 +892,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
if (startup_info->have_repository) {
read_lock();
- collision_test_needed =
- repo_has_object_file_with_flags(the_repository, oid,
- OBJECT_INFO_QUICK);
+ collision_test_needed = has_object(the_repository, oid, 0);
read_unlock();
}
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 08/13] builtin/show-ref: don't fetch objects when printing refs
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (6 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
@ 2025-04-25 7:08 ` Patrick Steinhardt
2025-04-28 21:50 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
` (5 subsequent siblings)
13 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:08 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
When printing references via git-show-ref(1) we double check that their
respective object IDs point to an existing object. This check is
performed via `repo_has_object_file()`, which knows to fetch missing
promised objects in the background. We shouldn't have a need to fetch
such objects though as no reference should ever point to a missing
object at all.
Convert the callsite to `has_object()`, which doesn't fetch promisor
objects by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/show-ref.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index f81209f23c3..fe592b4c202 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -35,7 +35,7 @@ static void show_one(const struct show_one_options *opts,
const char *hex;
struct object_id peeled;
- if (!repo_has_object_file(the_repository, oid))
+ if (!has_object(the_repository, oid, HAS_OBJECT_RECHECK_PACKED))
die("git show-ref: bad ref %s (%s)", refname,
oid_to_hex(oid));
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()`
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (7 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
@ 2025-04-25 7:09 ` Patrick Steinhardt
2025-04-28 21:53 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
` (4 subsequent siblings)
13 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:09 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
Similar to the preceding commit, don't try to fetch objects pointed to
by references. Any reference whose object does not exist is broken by
definition, so we should report it accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/refs.c b/refs.c
index 6559db37890..0492cf0d13a 100644
--- a/refs.c
+++ b/refs.c
@@ -376,7 +376,7 @@ int ref_resolves_to_object(const char *refname,
{
if (flags & REF_ISBROKEN)
return 0;
- if (!repo_has_object_file(repo, oid)) {
+ if (!has_object(repo, oid, HAS_OBJECT_RECHECK_PACKED)) {
error(_("%s does not point to a valid object!"), refname);
return 0;
}
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 10/13] http-walker: don't fetch objects via promisor remotes
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (8 preceding siblings ...)
2025-04-25 7:09 ` [PATCH v2 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
@ 2025-04-25 7:09 ` Patrick Steinhardt
2025-04-28 21:56 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 11/13] list-objects: clarify how promised blobs are excluded Patrick Steinhardt
` (3 subsequent siblings)
13 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:09 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
The HTTP walker is responsible for fetching objects via the dumb HTTP
protocol. To avoid re-fetching objects that we already have locally we
first check whether the object already exists in the local repository
before issuing the requests.
This existence check is done by calling `repo_has_object_file()`, which
will fetch the object via a promisor remote in case it is in a promisor
pack. This fetch does not make any sense for us though: we're already in
the process of fetching the object anyway, so fetching it via a separate
connection is wasteful, but should otherwise be harmless.
Fix the issue by converting to `has_object()`, which knows to not fetch
objects via promisor remotes by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
http-walker.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/http-walker.c b/http-walker.c
index 95458e2f638..5ad2eae9a11 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -138,7 +138,7 @@ static int fill_active_slot(void *data UNUSED)
list_for_each_safe(pos, tmp, head) {
obj_req = list_entry(pos, struct object_request, node);
if (obj_req->state == WAITING) {
- if (repo_has_object_file(the_repository, &obj_req->oid))
+ if (has_object(the_repository, &obj_req->oid, HAS_OBJECT_RECHECK_PACKED))
obj_req->state = COMPLETE;
else {
start_object_request(obj_req);
@@ -496,7 +496,7 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
if (!obj_req)
return error("Couldn't find request for %s in the queue", hex);
- if (repo_has_object_file(the_repository, &obj_req->oid)) {
+ if (has_object(the_repository, &obj_req->oid, HAS_OBJECT_RECHECK_PACKED)) {
if (obj_req->req)
abort_http_object_request(&obj_req->req);
abort_object_request(obj_req);
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 11/13] list-objects: clarify how promised blobs are excluded
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (9 preceding siblings ...)
2025-04-25 7:09 ` [PATCH v2 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
@ 2025-04-25 7:09 ` Patrick Steinhardt
2025-04-25 7:09 ` [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
` (2 subsequent siblings)
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:09 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
When "--exclude-promisor-objects" is passed by the user, commands like
git-rev-list(1) will exclude any objects part of a promisor pack. For
blobs this logic is handled in `process_blob()` with the following
logic:
if (ctx->revs->exclude_promisor_objects &&
!repo_has_object_file(the_repository, &obj->oid) &&
is_promisor_object(ctx->revs->repo, &obj->oid))
return;
It is somewhat puzzling that we use `repo_has_object_file()` to check
for existence of the blob because this function will cause us to fetch
missing objects in case they are part of a promisor pack. As such, one
may wonder whether the logic to exclude promised blobs is completely
broken.
As it turns out it's not broken: when "--exclude-promisor-objects" is
set we also unset the global `fetch_if_missing` variable, which causes
`do_oid_object_info_extended()` to not fetch any promised objects at
all.
Clarify this logic by using `has_object()`, which doesn't fetch promised
objects by default.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
list-objects.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/list-objects.c b/list-objects.c
index 1e5512e1318..cae4f7aff8c 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -74,7 +74,8 @@ static void process_blob(struct traversal_context *ctx,
* of missing objects.
*/
if (ctx->revs->exclude_promisor_objects &&
- !repo_has_object_file(the_repository, &obj->oid) &&
+ !has_object(the_repository, &obj->oid,
+ HAS_OBJECT_RECHECK_PACKED) &&
is_promisor_object(ctx->revs->repo, &obj->oid))
return;
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (10 preceding siblings ...)
2025-04-25 7:09 ` [PATCH v2 11/13] list-objects: clarify how promised blobs are excluded Patrick Steinhardt
@ 2025-04-25 7:09 ` Patrick Steinhardt
2025-04-28 22:07 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 13/13] object-store: drop `repo_has_object_file()` Patrick Steinhardt
2025-04-28 19:49 ` [PATCH v2 00/13] object-store: a handful of cleanups Karthik Nayak
13 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:09 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
When writing objects via the bulk-checkin subsystem we first try to
figure out whether an object already exists in the repository before we
append it to the packfile. This check uses `repo_has_object_file()`,
which knows to fetch promised objects by default. As such, if we were
about to write an object that is promised, we'd fetch the object via the
promisor and then skip writing it. This behaviour doesn't seem sensible:
it should be significantly faster to take the locally-written object
instead of faulting in objects from the promisor remote.
There is one counter-argument here: it could be that the bulk-checkin
mechanism will end up writing an object to disk whose content collides
with the object in the promisor remote. The local repository and its
promisor remote would now have two objects with different contents but
the same name. But the resulting behaviour would be wrong both when we
prefer the fetched object, and also when prefering the written object:
- When we prefer the written object we will now see a different world
compared to everyone else who has the promised object.
- When we prefer the fetched object we will end up with an object that
is different compared to what the user just asked us to write. This
seems even worse compared to the first scenario.
In an ideal world, we would protect against this by fetching the
promised object and then performing a collision check. But this feels
exceedingly expensive and ultimately rather pointless, as more common
writing paths like `write_loose_object()` don't protect against this
scenario either. And in any case we're talking about a local user that
has write access to the repository anyway, so if they want to do any
kind of mischieve they already can.
Change the behaviour so that we don't fault in the object via the
promisor remote. We shouldn't have to worry about hash collisions too
much (yet) as the mechanism is only used during local writes anyway. And
even if there was a collision, prefering local data that we were just
asked to write over data controlled by a potentially untrusted remote
feels like the better failure mode.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
bulk-checkin.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/bulk-checkin.c b/bulk-checkin.c
index c31c31b18d8..b182c456d69 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -130,7 +130,7 @@ static void flush_batch_fsync(void)
static int already_written(struct bulk_checkin_packfile *state, struct object_id *oid)
{
/* The object may already exist in the repository */
- if (repo_has_object_file(the_repository, oid))
+ if (has_object(the_repository, oid, HAS_OBJECT_RECHECK_PACKED))
return 1;
/* Might want to keep the list sorted */
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v2 13/13] object-store: drop `repo_has_object_file()`
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (11 preceding siblings ...)
2025-04-25 7:09 ` [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
@ 2025-04-25 7:09 ` Patrick Steinhardt
2025-04-28 19:49 ` [PATCH v2 00/13] object-store: a handful of cleanups Karthik Nayak
13 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-25 7:09 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
In the preceding commits we have converted all users of
`repo_has_object_file()` and its `_with_flags()` variant to instead use
`has_object()`. Drop these functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.c | 14 --------------
object-store.h | 17 -----------------
2 files changed, 31 deletions(-)
diff --git a/object-store.c b/object-store.c
index 2db34804e8f..2f51d0e3b03 100644
--- a/object-store.c
+++ b/object-store.c
@@ -949,20 +949,6 @@ int has_object(struct repository *r, const struct object_id *oid,
return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
}
-int repo_has_object_file_with_flags(struct repository *r,
- const struct object_id *oid, int flags)
-{
- if (!startup_info->have_repository)
- return 0;
- return oid_object_info_extended(r, oid, NULL, flags) >= 0;
-}
-
-int repo_has_object_file(struct repository *r,
- const struct object_id *oid)
-{
- return repo_has_object_file_with_flags(r, oid, 0);
-}
-
void assert_oid_type(const struct object_id *oid, enum object_type expect)
{
enum object_type type = oid_object_info(the_repository, oid, NULL);
diff --git a/object-store.h b/object-store.h
index c6055376f49..2330374990b 100644
--- a/object-store.h
+++ b/object-store.h
@@ -280,23 +280,6 @@ enum {
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags);
-/*
- * These macros and functions are deprecated. If checking existence for an
- * object that is likely to be missing and/or whose absence is relatively
- * inconsequential (or is consequential but the caller is prepared to handle
- * it), use has_object(), which has better defaults (no lazy fetch in a partial
- * clone and no rechecking of packed storage). In the unlikely event that a
- * caller needs to assert existence of an object that it fully expects to
- * exist, and wants to trigger a lazy fetch in a partial clone, use
- * oid_object_info_extended() with a NULL struct object_info.
- *
- * These functions can be removed once all callers have migrated to
- * has_object() and/or oid_object_info_extended().
- */
-int repo_has_object_file(struct repository *r, const struct object_id *oid);
-int repo_has_object_file_with_flags(struct repository *r,
- const struct object_id *oid, int flags);
-
void assert_oid_type(const struct object_id *oid, enum object_type expect);
/*
--
2.49.0.901.g37484f566f.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-25 7:04 ` Patrick Steinhardt
@ 2025-04-28 19:48 ` Karthik Nayak
0 siblings, 0 replies; 55+ messages in thread
From: Karthik Nayak @ 2025-04-28 19:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 3021 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> On Wed, Apr 23, 2025 at 10:08:05AM -0700, Karthik Nayak wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> > Any packed objects indexed via git-index-pack(1) are subject to a
>> > collision check. This collision check has the intent to determine
>> > whether we already have an object with the same object ID, but different
>> > contents in the repository.
>> >
>> > The check whether the collision check is really needed is performed via
>> > `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`. \
>> >
>>
>> Nit: this was a little confusing at first, until I saw the code. So what
>> this means is that the collision check is only performed, iff
>> `repo_has_object_file_with_flags(...)` returns true.
>>
>> I think the confusing part was 'is performed via', perhaps:
>>
>> The collision check is only performed, if
>> repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK) returns a
>> truthy value.
>>
>> But it is okay as is too!
>
> Will rephrase.
>
>> > But unless
>> > explicitly told otherwise via `OBJECT_INFO_SKIP_FETCH_OBJECT`, this
>> > function will also cause us to fetch the object ID in case it is part of
>> > a promisor pack. As such, we may end up fetching the object only to
>> > check whether the fetched object and the object that we're indexing have
>> > the same content.
>> >
>>
>> So us fetching the object is pointless, since we only care about the
>> 'does it exist' part and not really what it contains. In that case,
>> shouldn't this be s/same content/same oid/?
>
> No, it really checks for the same content. It basically verifies that
> any pair of objects that:
>
> - Exist in the packfile that we're currently indexing.
> - And preexists in the local repository.
>
> Actually have the same content.
>
Okay this makes sense, if they do have the same content, this is not
a collision. It is simply a duplicate.
> The weird part is that we also do this for objects that don't yet exist
> in the repository, but which are promised to us. This causes us to fetch
> them first only to verify that the fetched promised object has the same
> content as the packfile. And given that git-index-pack(1) would usually
> run after a fetch, we end up verifying that the fetched object obtained
> from the promisor is the same as the fetched object obtained from the
> packfile. Which ultimately seems rather dubious to me.
>
To clarify, the flow currently (simplified) is:
1. We check if a collision test is required, by checking if the new OID
already exists in the repository.
2. If collision test is required.
a. Fetch and check the object type.
b. Read the old object data.
c. Compare the new object data and the old object data.
d. Collision detected if there is a mismatch.
Currently, we fetch for promisor objects in #1, which is unnecessary
because we simply want to know if the object exists in the repository.
The actual check in #2.b would still fetch the promisor object (if that
flow is taken).
> Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 00/13] object-store: a handful of cleanups
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
` (12 preceding siblings ...)
2025-04-25 7:09 ` [PATCH v2 13/13] object-store: drop `repo_has_object_file()` Patrick Steinhardt
@ 2025-04-28 19:49 ` Karthik Nayak
13 siblings, 0 replies; 55+ messages in thread
From: Karthik Nayak @ 2025-04-28 19:49 UTC (permalink / raw)
To: Patrick Steinhardt, git
[-- Attachment #1: Type: text/plain, Size: 1253 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> Hi,
>
> this patch series contains a handful of cleanups to the object store
> subsystem:
>
> - A couple of definitions are moved out of "object-store.h" as they
> belong to other subsystems.
>
> - Some functions are dropped and/or renamed.
>
> - The biggest part is the removal of `repo_has_object_file()`. This
> function and its `_with_flags()` variant are marked as deprecated,
> with the replacement being `has_object()`. The benefit of that
> function is that it doesn't reload packfiles and doesn't fetch
> promisor objects by default so that it becomes more explicit when
> one really wants to do so.
>
> These cleanups are in preparation for getting rid of `the_repository` in
> "object-store.c".
>
> The patch series is built on top of 4bbb303af69 (The seventh batch,
> 2025-04-17) with ps/object-file-cleanup at 68cd492a3e6 (object-store:
> merge "object-store-ll.h" and "object-store.h", 2025-04-15) merged into
> it.
>
> Changes in v2:
> - A handful of improvements for commit messages.
> - Link to v1: https://lore.kernel.org/r/20250423-pks-object-store-cleanups-v1-0-81f8411a5d08@pks.im
>
> Thanks!
>
> Patrick
[snip]
The range-diff looked good to me. Thanks!
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 06/13] treewide: trivial conversions of `repo_has_object_file()`
2025-04-25 7:08 ` [PATCH v2 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
@ 2025-04-28 21:40 ` Junio C Hamano
0 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2025-04-28 21:40 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> As the comment of `repo_has_object_file()` and its `_with_flags()`
> variant tells us, these functions are considered to be deprecated in
> favor of `has_object()`. The benefit of the replacement is better
> defaults: it doesn't fetch missing objects via promisor remotes, and
> neither does it reload packfiles if an object wasn't found by default.
The reason why the "better default" being "better" is not clear to
readers of this paragraph.
It would be better if too many places need to pass an extra flag to
disable lazy fetching in the current code, for example, in which
case we would instead have fewer very selected places that would
pass an extra flag to lazily fetch (or even better, see has_object()
fail and then invoke a lazy fetch themselves) while majority of
places would just work with the plain vanilla local repository state
without having to worry about lazy fetching at all. Without
explaining why it is a "better default", it is a bit hard to justify
"the benefit of the replacement".
This is common to this and the previous step.
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-25 7:08 ` [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
@ 2025-04-28 21:46 ` Junio C Hamano
2025-04-29 6:15 ` Patrick Steinhardt
0 siblings, 1 reply; 55+ messages in thread
From: Junio C Hamano @ 2025-04-28 21:46 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> Any packed objects indexed via git-index-pack(1) are subject to a
> collision check. This collision check has the intent to determine
> whether we already have an object with the same object ID, but different
> contents in the repository.
>
> The check whether the collision check is really needed is only performed
> in case `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` tells
> us that the object exists. But unless explicitly told otherwise by
> passing `OBJECT_INFO_SKIP_FETCH_OBJECT`, this function will also cause
> us to fetch the object in case it is part of a promisor pack. As such,
> we may end up fetching the object only to check whether the fetched
> object and the object that we're indexing have the same content.
>
> This behaviour is highly dubious and more likely than not unintended.
> Fix it by converting to `has_object()`, which knows to neither reload
> packfiles nor to fetch promisor objects by default.
It is unclear why you thing it is highly dubious from reading the
above paragraph three times, though.
Is it that if we are suspicious of the incoming pack data we are
indexing, we should also not be too trusting of the object that our
promisor remote would be giving us? To put it in reverse, our
attitude being "we trust the first copy of object we saw", which
translates to "we trust where we explicitly clone and fetch from" in
the traditional world without lazy fetching, if somebody else we are
explicitly fetching from offers us an object that the promisor
remote would give us, we just do not bother if they are the same
because it is not like we trust our promisor more than we trust the
current counterpart we are fetching from?
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/index-pack.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index f49431d626b..805b7aa1e28 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -892,9 +892,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
>
> if (startup_info->have_repository) {
> read_lock();
> - collision_test_needed =
> - repo_has_object_file_with_flags(the_repository, oid,
> - OBJECT_INFO_QUICK);
> + collision_test_needed = has_object(the_repository, oid, 0);
> read_unlock();
> }
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 08/13] builtin/show-ref: don't fetch objects when printing refs
2025-04-25 7:08 ` [PATCH v2 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
@ 2025-04-28 21:50 ` Junio C Hamano
0 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2025-04-28 21:50 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> When printing references via git-show-ref(1) we double check that their
> respective object IDs point to an existing object. This check is
> performed via `repo_has_object_file()`, which knows to fetch missing
> promised objects in the background. We shouldn't have a need to fetch
> such objects though as no reference should ever point to a missing
> object at all.
But hasn't the earlier part of this series made sure that asking
has_object() without an extra flag would make it say "no" if the
object that we know we can obtain is kept in our promisor remote
to be lazily fetched? Or am I missing some other mechanism to make
sure that somebody does any necessary lazy fetching of an object
before we point our refs at it?
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()`
2025-04-25 7:09 ` [PATCH v2 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
@ 2025-04-28 21:53 ` Junio C Hamano
0 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2025-04-28 21:53 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> Similar to the preceding commit, don't try to fetch objects pointed to
> by references. Any reference whose object does not exist is broken by
> definition, so we should report it accordingly.
It has always been correct that any reference whose object does not
exist is broken by definition.
But didn't "does not exist" use to mean "not in this repository,
and cannot be obtained from our promisor remotes", but with this
series, its meaning has changed to mean "not in this repository
right now---it does not matter if our promisor has it"?
So with this change, aren't we changing that statement to "any
reference whose object does not exist may be broken, but we do not
know it until we consult our promisor remotes, if any"?
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> refs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/refs.c b/refs.c
> index 6559db37890..0492cf0d13a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -376,7 +376,7 @@ int ref_resolves_to_object(const char *refname,
> {
> if (flags & REF_ISBROKEN)
> return 0;
> - if (!repo_has_object_file(repo, oid)) {
> + if (!has_object(repo, oid, HAS_OBJECT_RECHECK_PACKED)) {
> error(_("%s does not point to a valid object!"), refname);
> return 0;
> }
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 10/13] http-walker: don't fetch objects via promisor remotes
2025-04-25 7:09 ` [PATCH v2 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
@ 2025-04-28 21:56 ` Junio C Hamano
0 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2025-04-28 21:56 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> The HTTP walker is responsible for fetching objects via the dumb HTTP
> protocol. To avoid re-fetching objects that we already have locally we
> first check whether the object already exists in the local repository
> before issuing the requests.
>
> This existence check is done by calling `repo_has_object_file()`, which
> will fetch the object via a promisor remote in case it is in a promisor
> pack. This fetch does not make any sense for us though: we're already in
> the process of fetching the object anyway, so fetching it via a separate
> connection is wasteful, but should otherwise be harmless.
This rationale does make sense for this code path. It half matches
the rationale of omitting the collision check when you see an object
in the pack stream being indexed that it can also be obtained from
your promisor remotes, which was of dubious validity, but in this
case it makes perfect sense.
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write
2025-04-25 7:09 ` [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
@ 2025-04-28 22:07 ` Junio C Hamano
2025-04-29 6:15 ` Patrick Steinhardt
0 siblings, 1 reply; 55+ messages in thread
From: Junio C Hamano @ 2025-04-28 22:07 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> In an ideal world, we would protect against this by fetching the
> promised object and then performing a collision check. But this feels
> exceedingly expensive and ultimately rather pointless, as more common
> writing paths like `write_loose_object()` don't protect against this
> scenario either.
When writing loose object, wouldn't collision check kick in, and
didn't we compare "existing (not here but virtually here due to
promisor)" object and what write_loose_object() tried to create, at
least before this series which may (or may not; I lost track) have
disabled that check?
I think the overall goal of deprecating the function with long name
with another function with a short-and-sweet name with different
default is a worthy thing, and while I do agree with "as we are
replacing function with another with different default, we need to
pass different flags to keep the same behaviour" early parts of the
series, I am not sure about these latter steps.
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write
2025-04-28 22:07 ` Junio C Hamano
@ 2025-04-29 6:15 ` Patrick Steinhardt
2025-04-29 15:25 ` Junio C Hamano
0 siblings, 1 reply; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 6:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Karthik Nayak
On Mon, Apr 28, 2025 at 03:07:19PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > In an ideal world, we would protect against this by fetching the
> > promised object and then performing a collision check. But this feels
> > exceedingly expensive and ultimately rather pointless, as more common
> > writing paths like `write_loose_object()` don't protect against this
> > scenario either.
>
> When writing loose object, wouldn't collision check kick in, and
> didn't we compare "existing (not here but virtually here due to
> promisor)" object and what write_loose_object() tried to create, at
> least before this series which may (or may not; I lost track) have
> disabled that check?
>
> I think the overall goal of deprecating the function with long name
> with another function with a short-and-sweet name with different
> default is a worthy thing, and while I do agree with "as we are
> replacing function with another with different default, we need to
> pass different flags to keep the same behaviour" early parts of the
> series, I am not sure about these latter steps.
Yeah, to be honest I wasn't totally sure whether to include these steps
myself as I anticipated that they will lead to discussions that derail
my original goal, which is to clean up the interfaces in the object
subsystem. I decided to go with these where I thought that my train of
thought is reasonable, but given your comments I'll probably just drop
those patches.
We can still adapt these callsites in the future as needed.
Patrick
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check
2025-04-28 21:46 ` Junio C Hamano
@ 2025-04-29 6:15 ` Patrick Steinhardt
0 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 6:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Karthik Nayak
On Mon, Apr 28, 2025 at 02:46:52PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Any packed objects indexed via git-index-pack(1) are subject to a
> > collision check. This collision check has the intent to determine
> > whether we already have an object with the same object ID, but different
> > contents in the repository.
> >
> > The check whether the collision check is really needed is only performed
> > in case `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` tells
> > us that the object exists. But unless explicitly told otherwise by
> > passing `OBJECT_INFO_SKIP_FETCH_OBJECT`, this function will also cause
> > us to fetch the object in case it is part of a promisor pack. As such,
> > we may end up fetching the object only to check whether the fetched
> > object and the object that we're indexing have the same content.
> >
> > This behaviour is highly dubious and more likely than not unintended.
> > Fix it by converting to `has_object()`, which knows to neither reload
> > packfiles nor to fetch promisor objects by default.
>
> It is unclear why you thing it is highly dubious from reading the
> above paragraph three times, though.
>
> Is it that if we are suspicious of the incoming pack data we are
> indexing, we should also not be too trusting of the object that our
> promisor remote would be giving us? To put it in reverse, our
> attitude being "we trust the first copy of object we saw", which
> translates to "we trust where we explicitly clone and fetch from" in
> the traditional world without lazy fetching, if somebody else we are
> explicitly fetching from offers us an object that the promisor
> remote would give us, we just do not bother if they are the same
> because it is not like we trust our promisor more than we trust the
> current counterpart we are fetching from?
Yes, exactly. When we don't have an object locally we don't have a trust
anchor for verifying that contents of the object look as expected. So
there are only two ways to do this:
- Use a trust-on-first-use model. We trust the object we obtain
initially and from thereon we start to treat it as the "correct"
object and verify incoming objects with the same ID against it.
- We only trust what everyone agrees one. In that case though we
really should be cross-verifying with _all_ remotes, not only with
the promisor remote.
Right now we do neither, but we end up treating the promisor as "more
trusted" than any of the other remotes.
I think it's completely unintentional that we end up fetching the object
from the promisor to perform a collision check against the packfile we
are about to index. It is highly likely that the promisor remote and the
remote that we're fetching from are the same anyway, so all this does is
to waste resources.
Anyway, I'll evict these patches from this series. I think a couple of
the sites are broken, but for now I care more about the bigger picture.
Patrick
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH v3 0/7] object-store: a handful of cleanups
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
` (14 preceding siblings ...)
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 1/7] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
` (7 more replies)
15 siblings, 8 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
Hi,
this patch series contains a handful of cleanups to the object store
subsystem:
- A couple of definitions are moved out of "object-store.h" as they
belong to other subsystems.
- Some functions are dropped and/or renamed.
- The biggest part is the removal of `repo_has_object_file()`. This
function and its `_with_flags()` variant are marked as deprecated,
with the replacement being `has_object()`. The benefit of that
function is that it doesn't reload packfiles and doesn't fetch
promisor objects by default so that it becomes more explicit when
one really wants to do so.
These cleanups are in preparation for getting rid of `the_repository` in
"object-store.c".
The patch series is built on top of 4bbb303af69 (The seventh batch,
2025-04-17) with ps/object-file-cleanup at 68cd492a3e6 (object-store:
merge "object-store-ll.h" and "object-store.h", 2025-04-15) merged into
it.
Changes in v2:
- A handful of improvements for commit messages.
- Link to v1: https://lore.kernel.org/r/20250423-pks-object-store-cleanups-v1-0-81f8411a5d08@pks.im
Changes in v3:
- Move around `hash_object_file()`, which I missed in previous
iterations.
- Don't try to fix up callsites where we end up fetching promised
objects. This patch series now only does a trivial 1:1 conversion.
- Clarify why we're sunsetting `repo_has_object_file()`.
- Link to v2: https://lore.kernel.org/r/20250425-pks-object-store-cleanups-v2-0-63f1695b7700@pks.im
Thanks!
Patrick
---
Patrick Steinhardt (7):
object-store: move `struct packed_git` into "packfile.h"
object-store: drop `loose_object_path()`
object-store: move and rename `odb_pack_keep()`
object-store: move function declarations to their respective subsystems
object-store: allow fetching objects via `has_object()`
treewide: convert users of `repo_has_object_file()` to `has_object()`
object-store: drop `repo_has_object_file()`
builtin/cat-file.c | 3 +-
builtin/clone.c | 4 +-
builtin/count-objects.c | 2 +-
builtin/fast-import.c | 3 +-
builtin/fetch.c | 15 ++--
builtin/gc.c | 2 +-
builtin/index-pack.c | 7 +-
builtin/receive-pack.c | 4 +-
builtin/remote.c | 3 +-
builtin/show-ref.c | 3 +-
builtin/unpack-objects.c | 3 +-
bulk-checkin.c | 3 +-
cache-tree.c | 13 +++-
convert.c | 2 +-
diffcore-rename.c | 2 +-
dir.c | 2 +-
fetch-pack.c | 7 +-
http-push.c | 11 ++-
http-walker.c | 9 ++-
http.c | 4 +-
list-objects.c | 3 +-
log-tree.c | 2 +-
notes.c | 3 +-
object-file.c | 4 +-
object-file.h | 81 ++++++++++++++++++++
object-name.c | 2 +-
object-store.c | 44 ++---------
object-store.h | 195 +++--------------------------------------------
pack-objects.h | 1 +
packfile.h | 78 ++++++++++++++++++-
path.c | 14 ++++
path.h | 7 ++
prune-packed.c | 2 +-
reachable.c | 2 +-
reflog.c | 3 +-
refs.c | 2 +-
remote.c | 2 +-
send-pack.c | 5 +-
shallow.c | 9 ++-
upload-pack.c | 3 +-
walker.c | 3 +-
41 files changed, 278 insertions(+), 289 deletions(-)
Range-diff versus v2:
1: 3b0282e9621 = 1: ade4c11cb5c object-store: move `struct packed_git` into "packfile.h"
2: 60871520f14 = 2: a38ae18bf91 object-store: drop `loose_object_path()`
3: a3eb7be5cc4 = 3: cb14eafdffa object-store: move and rename `odb_pack_keep()`
4: 2fbf97b6826 ! 4: d1b3c0a23f9 object-store: move function declarations to their respective subsystems
@@ builtin/gc.c
#include "pack-objects.h"
#include "path.h"
+ ## convert.c ##
+@@
+ #include "copy.h"
+ #include "gettext.h"
+ #include "hex.h"
+-#include "object-store.h"
++#include "object-file.h"
+ #include "attr.h"
+ #include "run-command.h"
+ #include "quote.h"
+
+ ## diffcore-rename.c ##
+@@
+ #include "git-compat-util.h"
+ #include "diff.h"
+ #include "diffcore.h"
+-#include "object-store.h"
++#include "object-file.h"
+ #include "hashmap.h"
+ #include "mem-pool.h"
+ #include "oid-array.h"
+
+ ## dir.c ##
+@@
+ #include "environment.h"
+ #include "gettext.h"
+ #include "name-hash.h"
+-#include "object-store.h"
++#include "object-file.h"
+ #include "path.h"
+ #include "refs.h"
+ #include "repository.h"
+
+ ## log-tree.c ##
+@@
+ #include "environment.h"
+ #include "hex.h"
+ #include "object-name.h"
+-#include "object-store.h"
++#include "object-file.h"
+ #include "repository.h"
+ #include "tmp-objdir.h"
+ #include "commit.h"
+
## object-file.h ##
@@
@@ object-file.h: int has_loose_object_nonlocal(const struct object_id *);
/**
* format_object_header() is a thin wrapper around s xsnprintf() that
* writes the initial "<type> <obj-len>" part of the loose object
+@@ object-file.h: int finalize_object_file(const char *tmpfile, const char *filename);
+ int finalize_object_file_flags(const char *tmpfile, const char *filename,
+ enum finalize_object_file_flags flags);
+
++void hash_object_file(const struct git_hash_algo *algo, const void *buf,
++ unsigned long len, enum object_type type,
++ struct object_id *oid);
++
+ /* Helper to check and "touch" a file */
+ int check_and_freshen_file(const char *fn, int freshen);
+
## object-name.c ##
@@
@@ object-store.h: void raw_object_store_clear(struct raw_object_store *o);
void *repo_read_object_file(struct repository *r,
const struct object_id *oid,
enum object_type *type,
+@@ object-store.h: void *repo_read_object_file(struct repository *r,
+ /* Read and unpack an object file into memory, write memory to an object file */
+ int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
+
+-void hash_object_file(const struct git_hash_algo *algo, const void *buf,
+- unsigned long len, enum object_type type,
+- struct object_id *oid);
+-
+ /*
+ * Add an object file to the in-memory object store, without writing it
+ * to disk.
@@ object-store.h: static inline void obj_read_unlock(void)
if(obj_read_use_lock)
pthread_mutex_unlock(&obj_read_mutex);
5: 6e165b497e4 ! 5: 7b8a6286f6a object-store: allow fetching objects via `has_object()`
@@ Commit message
object-store: allow fetching objects via `has_object()`
We're about to fully remove `repo_has_object_file()` in favor of
- `has_object()` given that the latter has better defaults: it neither
- reloads packfiles by default nor does it fetch any promised objects in
- case they are missing.
-
- The latter usecase keeps us from converting a couple of callsites that
- currently do fetch objects though. It is not really clear whether _all_
- of those callsites should be fetching objects, but for a subset of them
- it is the desired behaviour indeed.
+ `has_object()`. The latter function does not yet have a way to fetch
+ missing objects via a promisor remote though, which means that it cannot
+ fully replace all usecases of `repo_has_object_file()`.
Introduce a new flag `HAS_OBJECT_FETCH_PROMISOR` that causes the
function to optionally fetch missing objects which are part of a
- promisor pack.
+ promisor pack. This flag will be used in the subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
6: 325178adf42 ! 6: a7733b66fd6 treewide: trivial conversions of `repo_has_object_file()`
@@ Metadata
Author: Patrick Steinhardt <ps@pks.im>
## Commit message ##
- treewide: trivial conversions of `repo_has_object_file()`
+ treewide: convert users of `repo_has_object_file()` to `has_object()`
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
- favor of `has_object()`. The benefit of the replacement is better
- defaults: it doesn't fetch missing objects via promisor remotes, and
- neither does it reload packfiles if an object wasn't found by default.
+ favor of `has_object()`. There are a couple of slight benefits in favor
+ of the replacement:
- Start sunsetting the functions by replacing trivial callsites of it with
- `has_object()`:
+ - The new function has a short-and-sweet name.
+
+ - More explicit defaults: `has_object()` doesn't fetch missing objects
+ via promisor remotes, and neither does it reload packfiles if an
+ object wasn't found by default. This ensures that it becomes
+ immediately obvious when a simple object existence check may result
+ in expensive actions.
+
+ Most importantly though, it is confusing that we have two sets of
+ functions that ultimately do the same thing, but with different
+ defaults.
+
+ Start sunsetting `repo_has_object_file()` and its `_with_flags()`
+ sibling by replacing all callsites with `has_object()`:
- `repo_has_object_file(...)` is equivalent to
`has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
@@ Commit message
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
- While converting callsites to use `HAS_OBJECT_FETCH_PROMISOR` retains
- the current behaviour, it is not entirely clear whether all of them
- really should be fetching promised objects. A subset of such callsites
- where it is obvious that they really shouldn't fetch objects has been
- left out of this commit and will be adapted over the next couple of
- commits.
+ - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`
+ is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.
+
+ The replacements should be functionally equivalent.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
@@ builtin/fetch.c: static int check_exist_and_connected(struct ref *ref_map)
}
+ ## builtin/index-pack.c ##
+@@ builtin/index-pack.c: static void sha1_object(const void *data, struct object_entry *obj_entry,
+
+ if (startup_info->have_repository) {
+ read_lock();
+- collision_test_needed =
+- repo_has_object_file_with_flags(the_repository, oid,
+- OBJECT_INFO_QUICK);
++ collision_test_needed = has_object(the_repository, oid,
++ HAS_OBJECT_FETCH_PROMISOR);
+ read_unlock();
+ }
+
+
## builtin/receive-pack.c ##
@@ builtin/receive-pack.c: static const char *update(struct command *cmd, struct shallow_info *si)
}
@@ builtin/remote.c: static int get_push_ref_states(const struct ref *remote_refs,
info->status = PUSH_STATUS_FASTFORWARD;
else
+ ## builtin/show-ref.c ##
+@@ builtin/show-ref.c: static void show_one(const struct show_one_options *opts,
+ const char *hex;
+ struct object_id peeled;
+
+- if (!repo_has_object_file(the_repository, oid))
++ if (!has_object(the_repository, oid,
++ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
+ die("git show-ref: bad ref %s (%s)", refname,
+ oid_to_hex(oid));
+
+
## builtin/unpack-objects.c ##
@@ builtin/unpack-objects.c: static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
delta_data = get_data(delta_size);
@@ builtin/unpack-objects.c: static void unpack_delta_entry(enum object_type type,
else if (resolve_against_held(nr, &base_oid,
delta_data, delta_size))
+ ## bulk-checkin.c ##
+@@ bulk-checkin.c: static void flush_batch_fsync(void)
+ static int already_written(struct bulk_checkin_packfile *state, struct object_id *oid)
+ {
+ /* The object may already exist in the repository */
+- if (repo_has_object_file(the_repository, oid))
++ if (has_object(the_repository, oid,
++ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
+ return 1;
+
+ /* Might want to keep the list sorted */
+
## cache-tree.c ##
@@ cache-tree.c: int cache_tree_fully_valid(struct cache_tree *it)
int i;
@@ http-push.c: int cmd_main(int argc, const char **argv)
&ref->old_oid)) {
/*
+ ## http-walker.c ##
+@@ http-walker.c: static int fill_active_slot(void *data UNUSED)
+ list_for_each_safe(pos, tmp, head) {
+ obj_req = list_entry(pos, struct object_request, node);
+ if (obj_req->state == WAITING) {
+- if (repo_has_object_file(the_repository, &obj_req->oid))
++ if (has_object(the_repository, &obj_req->oid,
++ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
+ obj_req->state = COMPLETE;
+ else {
+ start_object_request(obj_req);
+@@ http-walker.c: static int fetch_object(struct walker *walker, const struct object_id *oid)
+ if (!obj_req)
+ return error("Couldn't find request for %s in the queue", hex);
+
+- if (repo_has_object_file(the_repository, &obj_req->oid)) {
++ if (has_object(the_repository, &obj_req->oid,
++ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
+ if (obj_req->req)
+ abort_http_object_request(&obj_req->req);
+ abort_object_request(obj_req);
+
+ ## list-objects.c ##
+@@ list-objects.c: static void process_blob(struct traversal_context *ctx,
+ * of missing objects.
+ */
+ if (ctx->revs->exclude_promisor_objects &&
+- !repo_has_object_file(the_repository, &obj->oid) &&
++ !has_object(the_repository, &obj->oid,
++ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) &&
+ is_promisor_object(ctx->revs->repo, &obj->oid))
+ return;
+
+
## notes.c ##
@@ notes.c: static int prune_notes_helper(const struct object_id *object_oid,
struct note_delete_list **l = (struct note_delete_list **) cb_data;
@@ reflog.c: static int tree_is_complete(const struct object_id *oid)
tree->object.flags |= INCOMPLETE;
complete = 0;
+ ## refs.c ##
+@@ refs.c: int ref_resolves_to_object(const char *refname,
+ {
+ if (flags & REF_ISBROKEN)
+ return 0;
+- if (!repo_has_object_file(repo, oid)) {
++ if (!has_object(repo, oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
+ error(_("%s does not point to a valid object!"), refname);
+ return 0;
+ }
+
## remote.c ##
@@ remote.c: void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
if (!reject_reason && !ref->deletion && !is_null_oid(&ref->old_oid)) {
7: ab0dc1c5ce3 < -: ----------- builtin/index-pack: don't fetch promised objects for collision check
8: 89c314a7dfd < -: ----------- builtin/show-ref: don't fetch objects when printing refs
9: a1f647f264f < -: ----------- refs: don't fetch promisor objects in `ref_resolves_to_object()`
10: 624aa140720 < -: ----------- http-walker: don't fetch objects via promisor remotes
11: a2604942569 < -: ----------- list-objects: clarify how promised blobs are excluded
12: d08d6092580 < -: ----------- bulk-checkin: don't fetch promised objects on write
13: 84af5509257 = 7: 60aeb060bd5 object-store: drop `repo_has_object_file()`
---
base-commit: ca819c0751cedd1713334882e4c83687f8478a54
change-id: 20250422-pks-object-store-cleanups-5a6077014155
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH v3 1/7] object-store: move `struct packed_git` into "packfile.h"
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 2/7] object-store: drop `loose_object_path()` Patrick Steinhardt
` (6 subsequent siblings)
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
The "object-store.h" header contains the definition of `struct
packed_git`. As this structure hosts all kind of information about a
specific packfile it is arguably a bit out of place in a generic place
like "object-store.h".
Move the structure as well as `pack_map_entry_cmp()` into "packfile.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.h | 59 +---------------------------------------------------------
pack-objects.h | 1 +
packfile.h | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 60 insertions(+), 59 deletions(-)
diff --git a/object-store.h b/object-store.h
index 46961dc9542..e04469a85fb 100644
--- a/object-store.h
+++ b/object-store.h
@@ -92,65 +92,8 @@ struct oidtree *odb_loose_cache(struct object_directory *odb,
/* Empty the loose object cache for the specified object directory. */
void odb_clear_loose_cache(struct object_directory *odb);
-struct packed_git {
- struct hashmap_entry packmap_ent;
- struct packed_git *next;
- struct list_head mru;
- struct pack_window *windows;
- off_t pack_size;
- const void *index_data;
- size_t index_size;
- uint32_t num_objects;
- size_t crc_offset;
- struct oidset bad_objects;
- int index_version;
- time_t mtime;
- int pack_fd;
- int index; /* for builtin/pack-objects.c */
- unsigned pack_local:1,
- pack_keep:1,
- pack_keep_in_core:1,
- freshened:1,
- do_not_close:1,
- pack_promisor:1,
- multi_pack_index:1,
- is_cruft:1;
- unsigned char hash[GIT_MAX_RAWSZ];
- struct revindex_entry *revindex;
- const uint32_t *revindex_data;
- const uint32_t *revindex_map;
- size_t revindex_size;
- /*
- * mtimes_map points at the beginning of the memory mapped region of
- * this pack's corresponding .mtimes file, and mtimes_size is the size
- * of that .mtimes file
- */
- const uint32_t *mtimes_map;
- size_t mtimes_size;
-
- /* repo denotes the repository this packfile belongs to */
- struct repository *repo;
-
- /* something like ".git/objects/pack/xxxxx.pack" */
- char pack_name[FLEX_ARRAY]; /* more */
-};
-
+struct packed_git;
struct multi_pack_index;
-
-static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *entry,
- const struct hashmap_entry *entry2,
- const void *keydata)
-{
- const char *key = keydata;
- const struct packed_git *pg1, *pg2;
-
- pg1 = container_of(entry, const struct packed_git, packmap_ent);
- pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
- return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
struct cached_object_entry;
struct raw_object_store {
diff --git a/pack-objects.h b/pack-objects.h
index d1c4ae7f9b6..475a2d67ce3 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -4,6 +4,7 @@
#include "object-store.h"
#include "thread-utils.h"
#include "pack.h"
+#include "packfile.h"
struct repository;
diff --git a/packfile.h b/packfile.h
index 25097213d06..05499382397 100644
--- a/packfile.h
+++ b/packfile.h
@@ -1,13 +1,70 @@
#ifndef PACKFILE_H
#define PACKFILE_H
+#include "list.h"
#include "object.h"
#include "oidset.h"
/* in object-store.h */
-struct packed_git;
struct object_info;
+struct packed_git {
+ struct hashmap_entry packmap_ent;
+ struct packed_git *next;
+ struct list_head mru;
+ struct pack_window *windows;
+ off_t pack_size;
+ const void *index_data;
+ size_t index_size;
+ uint32_t num_objects;
+ size_t crc_offset;
+ struct oidset bad_objects;
+ int index_version;
+ time_t mtime;
+ int pack_fd;
+ int index; /* for builtin/pack-objects.c */
+ unsigned pack_local:1,
+ pack_keep:1,
+ pack_keep_in_core:1,
+ freshened:1,
+ do_not_close:1,
+ pack_promisor:1,
+ multi_pack_index:1,
+ is_cruft:1;
+ unsigned char hash[GIT_MAX_RAWSZ];
+ struct revindex_entry *revindex;
+ const uint32_t *revindex_data;
+ const uint32_t *revindex_map;
+ size_t revindex_size;
+ /*
+ * mtimes_map points at the beginning of the memory mapped region of
+ * this pack's corresponding .mtimes file, and mtimes_size is the size
+ * of that .mtimes file
+ */
+ const uint32_t *mtimes_map;
+ size_t mtimes_size;
+
+ /* repo denotes the repository this packfile belongs to */
+ struct repository *repo;
+
+ /* something like ".git/objects/pack/xxxxx.pack" */
+ char pack_name[FLEX_ARRAY]; /* more */
+};
+
+static inline int pack_map_entry_cmp(const void *cmp_data UNUSED,
+ const struct hashmap_entry *entry,
+ const struct hashmap_entry *entry2,
+ const void *keydata)
+{
+ const char *key = keydata;
+ const struct packed_git *pg1, *pg2;
+
+ pg1 = container_of(entry, const struct packed_git, packmap_ent);
+ pg2 = container_of(entry2, const struct packed_git, packmap_ent);
+
+ return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
+}
+
struct pack_window {
struct pack_window *next;
unsigned char *base;
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 2/7] object-store: drop `loose_object_path()`
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 1/7] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 3/7] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
` (5 subsequent siblings)
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
The function `loose_object_path()` is a trivial wrapper around
`odb_loose_path()`, with the only exception that it always uses the
primary object database of the given repository. This doesn't really add
a ton of value though, so let's drop the function and inline it at every
callsite.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
http-walker.c | 3 ++-
http.c | 4 ++--
object-file.c | 4 ++--
object-file.h | 4 ++++
object-store.c | 6 ------
object-store.h | 7 -------
6 files changed, 10 insertions(+), 18 deletions(-)
diff --git a/http-walker.c b/http-walker.c
index 882cae19c24..95458e2f638 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -9,6 +9,7 @@
#include "list.h"
#include "transport.h"
#include "packfile.h"
+#include "object-file.h"
#include "object-store.h"
struct alt_base {
@@ -540,7 +541,7 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
- loose_object_path(the_repository, &buf, &req->oid);
+ odb_loose_path(the_repository->objects->odb, &buf, &req->oid);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release(&buf);
}
diff --git a/http.c b/http.c
index 0c411380425..3c029cf8947 100644
--- a/http.c
+++ b/http.c
@@ -2662,7 +2662,7 @@ struct http_object_request *new_http_object_request(const char *base_url,
oidcpy(&freq->oid, oid);
freq->localfile = -1;
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
strbuf_addf(&freq->tmpfile, "%s.temp", filename.buf);
strbuf_addf(&prevfile, "%s.prev", filename.buf);
@@ -2814,7 +2814,7 @@ int finish_http_object_request(struct http_object_request *freq)
unlink_or_warn(freq->tmpfile.buf);
return -1;
}
- loose_object_path(the_repository, &filename, &freq->oid);
+ odb_loose_path(the_repository->objects->odb, &filename, &freq->oid);
freq->rename = finalize_object_file(freq->tmpfile.buf, filename.buf);
strbuf_release(&filename);
diff --git a/object-file.c b/object-file.c
index 9cc3a24a40d..dc56a4766df 100644
--- a/object-file.c
+++ b/object-file.c
@@ -932,7 +932,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
prepare_loose_object_bulk_checkin();
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
fd = start_loose_object_common(&tmp_file, filename.buf, flags,
&stream, compressed, sizeof(compressed),
@@ -1079,7 +1079,7 @@ int stream_loose_object(struct input_stream *in_stream, size_t len,
goto cleanup;
}
- loose_object_path(the_repository, &filename, oid);
+ odb_loose_path(the_repository->objects->odb, &filename, oid);
/* We finally know the object path, and create the missing dir. */
dirlen = directory_size(filename.buf);
diff --git a/object-file.h b/object-file.h
index c002fbe2345..0a7b6b9f9d9 100644
--- a/object-file.h
+++ b/object-file.h
@@ -25,6 +25,10 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
struct object_directory;
+/*
+ * Put in `buf` the name of the file in the local object database that
+ * would be used to store a loose object with the specified oid.
+ */
const char *odb_loose_path(struct object_directory *odb,
struct strbuf *buf,
const struct object_id *oid);
diff --git a/object-store.c b/object-store.c
index 6ab50d25d3e..e5cfb8c0079 100644
--- a/object-store.c
+++ b/object-store.c
@@ -96,12 +96,6 @@ int odb_pack_keep(const char *name)
return open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
}
-const char *loose_object_path(struct repository *r, struct strbuf *buf,
- const struct object_id *oid)
-{
- return odb_loose_path(r->objects->odb, buf, oid);
-}
-
/*
* Return non-zero iff the path is usable as an alternate object database.
*/
diff --git a/object-store.h b/object-store.h
index e04469a85fb..5668de62d01 100644
--- a/object-store.h
+++ b/object-store.h
@@ -196,13 +196,6 @@ int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
*/
int odb_pack_keep(const char *name);
-/*
- * Put in `buf` the name of the file in the local object database that
- * would be used to store a loose object with the specified oid.
- */
-const char *loose_object_path(struct repository *r, struct strbuf *buf,
- const struct object_id *oid);
-
void *map_loose_object(struct repository *r, const struct object_id *oid,
unsigned long *size);
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 3/7] object-store: move and rename `odb_pack_keep()`
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 1/7] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 2/7] object-store: drop `loose_object_path()` Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 4/7] object-store: move function declarations to their respective subsystems Patrick Steinhardt
` (4 subsequent siblings)
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
The function `odb_pack_keep()` creates a file at the passed-in path. If
this fails, then the function re-tries by first creating any potentially
missing leading directories and then trying to create the file once
again. As such, this function doesn't host any kind of logic that is
specific to the object store, but is rather a generic helper function.
Rename the function to `safe_create_file_with_leading_directories()` and
move it into "path.c". While at it, refactor it so that it loses its
dependency on `the_repository`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/fast-import.c | 3 ++-
builtin/index-pack.c | 2 +-
object-store.c | 13 -------------
object-store.h | 7 -------
path.c | 14 ++++++++++++++
path.h | 7 +++++++
6 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index c1e198f4e34..b2839c5f439 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -811,7 +811,8 @@ static char *keep_pack(const char *curr_index_name)
int keep_fd;
odb_pack_name(pack_data->repo, &name, pack_data->hash, "keep");
- keep_fd = odb_pack_keep(name.buf);
+ keep_fd = safe_create_file_with_leading_directories(pack_data->repo,
+ name.buf);
if (keep_fd < 0)
die_errno("cannot create keep file");
write_or_die(keep_fd, keep_msg, strlen(keep_msg));
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 60a8ee05dbc..f49431d626b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,7 +1565,7 @@ static void write_special_file(const char *suffix, const char *msg,
else
filename = odb_pack_name(the_repository, &name_buf, hash, suffix);
- fd = odb_pack_keep(filename);
+ fd = safe_create_file_with_leading_directories(the_repository, filename);
if (fd < 0) {
if (errno != EEXIST)
die_errno(_("cannot write %s file '%s'"),
diff --git a/object-store.c b/object-store.c
index e5cfb8c0079..0cbad5a19a0 100644
--- a/object-store.c
+++ b/object-store.c
@@ -83,19 +83,6 @@ int odb_mkstemp(struct strbuf *temp_filename, const char *pattern)
return xmkstemp_mode(temp_filename->buf, mode);
}
-int odb_pack_keep(const char *name)
-{
- int fd;
-
- fd = open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
- if (0 <= fd)
- return fd;
-
- /* slow path */
- safe_create_leading_directories_const(the_repository, name);
- return open(name, O_RDWR|O_CREAT|O_EXCL, 0600);
-}
-
/*
* Return non-zero iff the path is usable as an alternate object database.
*/
diff --git a/object-store.h b/object-store.h
index 5668de62d01..aa8fc63043e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -189,13 +189,6 @@ void raw_object_store_clear(struct raw_object_store *o);
*/
int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
-/*
- * Create a pack .keep file named "name" (which should generally be the output
- * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on
- * error.
- */
-int odb_pack_keep(const char *name);
-
void *map_loose_object(struct repository *r, const struct object_id *oid,
unsigned long *size);
diff --git a/path.c b/path.c
index 4505bb78e8b..3b598b2847f 100644
--- a/path.c
+++ b/path.c
@@ -1011,6 +1011,20 @@ enum scld_error safe_create_leading_directories_const(struct repository *repo,
return result;
}
+int safe_create_file_with_leading_directories(struct repository *repo,
+ const char *path)
+{
+ int fd;
+
+ fd = open(path, O_RDWR|O_CREAT|O_EXCL, 0600);
+ if (0 <= fd)
+ return fd;
+
+ /* slow path */
+ safe_create_leading_directories_const(repo, path);
+ return open(path, O_RDWR|O_CREAT|O_EXCL, 0600);
+}
+
static int have_same_root(const char *path1, const char *path2)
{
int is_abs1, is_abs2;
diff --git a/path.h b/path.h
index fd1a194b060..e67348f2539 100644
--- a/path.h
+++ b/path.h
@@ -266,6 +266,13 @@ enum scld_error safe_create_leading_directories_const(struct repository *repo,
const char *path);
enum scld_error safe_create_leading_directories_no_share(char *path);
+/*
+ * Create a file, potentially creating its leading directories in case they
+ * don't exist. Returns the return value of the open(3p) call.
+ */
+int safe_create_file_with_leading_directories(struct repository *repo,
+ const char *path);
+
# ifdef USE_THE_REPOSITORY_VARIABLE
# include "strbuf.h"
# include "repository.h"
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 4/7] object-store: move function declarations to their respective subsystems
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
` (2 preceding siblings ...)
2025-04-29 7:52 ` [PATCH v3 3/7] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 5/7] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
` (3 subsequent siblings)
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
We carry declarations for a couple of functions in "object-store.h" that
are not defined in "object-store.c", but in a different subsystem. Move
these declarations to the respective headers whose matching code files
carry the corresponding definition.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/count-objects.c | 2 +-
builtin/gc.c | 2 +-
convert.c | 2 +-
diffcore-rename.c | 2 +-
dir.c | 2 +-
log-tree.c | 2 +-
object-file.h | 77 +++++++++++++++++++++++++++++++++++++++
object-name.c | 2 +-
object-store.h | 95 +------------------------------------------------
packfile.h | 19 ++++++++++
prune-packed.c | 2 +-
reachable.c | 2 +-
12 files changed, 106 insertions(+), 103 deletions(-)
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 0bb5360b2f2..a88c0c9c09a 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -12,7 +12,7 @@
#include "parse-options.h"
#include "quote.h"
#include "packfile.h"
-#include "object-store.h"
+#include "object-file.h"
static unsigned long garbage;
static off_t size_garbage;
diff --git a/builtin/gc.c b/builtin/gc.c
index b5ce1d32766..4d428f3253d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -28,7 +28,7 @@
#include "commit.h"
#include "commit-graph.h"
#include "packfile.h"
-#include "object-store.h"
+#include "object-file.h"
#include "pack.h"
#include "pack-objects.h"
#include "path.h"
diff --git a/convert.c b/convert.c
index 8783e17941f..b5f7cf6306c 100644
--- a/convert.c
+++ b/convert.c
@@ -8,7 +8,7 @@
#include "copy.h"
#include "gettext.h"
#include "hex.h"
-#include "object-store.h"
+#include "object-file.h"
#include "attr.h"
#include "run-command.h"
#include "quote.h"
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 179731462b5..7723bc3334e 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -8,7 +8,7 @@
#include "git-compat-util.h"
#include "diff.h"
#include "diffcore.h"
-#include "object-store.h"
+#include "object-file.h"
#include "hashmap.h"
#include "mem-pool.h"
#include "oid-array.h"
diff --git a/dir.c b/dir.c
index 5c4675b4ac4..e11342e1366 100644
--- a/dir.c
+++ b/dir.c
@@ -17,7 +17,7 @@
#include "environment.h"
#include "gettext.h"
#include "name-hash.h"
-#include "object-store.h"
+#include "object-file.h"
#include "path.h"
#include "refs.h"
#include "repository.h"
diff --git a/log-tree.c b/log-tree.c
index a4d4ab59ca0..1d05dc1c701 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -9,7 +9,7 @@
#include "environment.h"
#include "hex.h"
#include "object-name.h"
-#include "object-store.h"
+#include "object-file.h"
#include "repository.h"
#include "tmp-objdir.h"
#include "commit.h"
diff --git a/object-file.h b/object-file.h
index 0a7b6b9f9d9..a85b2e5b494 100644
--- a/object-file.h
+++ b/object-file.h
@@ -3,6 +3,7 @@
#include "git-zlib.h"
#include "object.h"
+#include "object-store.h"
struct index_state;
@@ -25,6 +26,16 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
struct object_directory;
+/*
+ * Populate and return the loose object cache array corresponding to the
+ * given object ID.
+ */
+struct oidtree *odb_loose_cache(struct object_directory *odb,
+ const struct object_id *oid);
+
+/* Empty the loose object cache for the specified object directory. */
+void odb_clear_loose_cache(struct object_directory *odb);
+
/*
* Put in `buf` the name of the file in the local object database that
* would be used to store a loose object with the specified oid.
@@ -42,6 +53,68 @@ int has_loose_object_nonlocal(const struct object_id *);
int has_loose_object(const struct object_id *);
+void *map_loose_object(struct repository *r, const struct object_id *oid,
+ unsigned long *size);
+
+/*
+ * Iterate over the files in the loose-object parts of the object
+ * directory "path", triggering the following callbacks:
+ *
+ * - loose_object is called for each loose object we find.
+ *
+ * - loose_cruft is called for any files that do not appear to be
+ * loose objects. Note that we only look in the loose object
+ * directories "objects/[0-9a-f]{2}/", so we will not report
+ * "objects/foobar" as cruft.
+ *
+ * - loose_subdir is called for each top-level hashed subdirectory
+ * of the object directory (e.g., "$OBJDIR/f0"). It is called
+ * after the objects in the directory are processed.
+ *
+ * Any callback that is NULL will be ignored. Callbacks returning non-zero
+ * will end the iteration.
+ *
+ * In the "buf" variant, "path" is a strbuf which will also be used as a
+ * scratch buffer, but restored to its original contents before
+ * the function returns.
+ */
+typedef int each_loose_object_fn(const struct object_id *oid,
+ const char *path,
+ void *data);
+typedef int each_loose_cruft_fn(const char *basename,
+ const char *path,
+ void *data);
+typedef int each_loose_subdir_fn(unsigned int nr,
+ const char *path,
+ void *data);
+int for_each_file_in_obj_subdir(unsigned int subdir_nr,
+ struct strbuf *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+int for_each_loose_file_in_objdir(const char *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+int for_each_loose_file_in_objdir_buf(struct strbuf *path,
+ each_loose_object_fn obj_cb,
+ each_loose_cruft_fn cruft_cb,
+ each_loose_subdir_fn subdir_cb,
+ void *data);
+
+/*
+ * Iterate over all accessible loose objects without respect to
+ * reachability. By default, this includes both local and alternate objects.
+ * The order in which objects are visited is unspecified.
+ *
+ * Any flags specific to packs are ignored.
+ */
+int for_each_loose_object(each_loose_object_fn, void *,
+ enum for_each_object_flags flags);
+
+
/**
* format_object_header() is a thin wrapper around s xsnprintf() that
* writes the initial "<type> <obj-len>" part of the loose object
@@ -158,6 +231,10 @@ int finalize_object_file(const char *tmpfile, const char *filename);
int finalize_object_file_flags(const char *tmpfile, const char *filename,
enum finalize_object_file_flags flags);
+void hash_object_file(const struct git_hash_algo *algo, const void *buf,
+ unsigned long len, enum object_type type,
+ struct object_id *oid);
+
/* Helper to check and "touch" a file */
int check_and_freshen_file(const char *fn, int freshen);
diff --git a/object-name.c b/object-name.c
index 2c751a5352a..9288b2dd245 100644
--- a/object-name.c
+++ b/object-name.c
@@ -19,7 +19,7 @@
#include "oidtree.h"
#include "packfile.h"
#include "pretty.h"
-#include "object-store.h"
+#include "object-file.h"
#include "read-cache-ll.h"
#include "repo-settings.h"
#include "repository.h"
diff --git a/object-store.h b/object-store.h
index aa8fc63043e..9dc39a7c91e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -82,16 +82,6 @@ struct object_directory *set_temporary_primary_odb(const char *dir, int will_des
*/
void restore_primary_odb(struct object_directory *restore_odb, const char *old_path);
-/*
- * Populate and return the loose object cache array corresponding to the
- * given object ID.
- */
-struct oidtree *odb_loose_cache(struct object_directory *odb,
- const struct object_id *oid);
-
-/* Empty the loose object cache for the specified object directory. */
-void odb_clear_loose_cache(struct object_directory *odb);
-
struct packed_git;
struct multi_pack_index;
struct cached_object_entry;
@@ -189,9 +179,6 @@ void raw_object_store_clear(struct raw_object_store *o);
*/
int odb_mkstemp(struct strbuf *temp_filename, const char *pattern);
-void *map_loose_object(struct repository *r, const struct object_id *oid,
- unsigned long *size);
-
void *repo_read_object_file(struct repository *r,
const struct object_id *oid,
enum object_type *type,
@@ -200,10 +187,6 @@ void *repo_read_object_file(struct repository *r,
/* Read and unpack an object file into memory, write memory to an object file */
int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
-void hash_object_file(const struct git_hash_algo *algo, const void *buf,
- unsigned long len, enum object_type type,
- struct object_id *oid);
-
/*
* Add an object file to the in-memory object store, without writing it
* to disk.
@@ -340,56 +323,7 @@ static inline void obj_read_unlock(void)
if(obj_read_use_lock)
pthread_mutex_unlock(&obj_read_mutex);
}
-
-/*
- * Iterate over the files in the loose-object parts of the object
- * directory "path", triggering the following callbacks:
- *
- * - loose_object is called for each loose object we find.
- *
- * - loose_cruft is called for any files that do not appear to be
- * loose objects. Note that we only look in the loose object
- * directories "objects/[0-9a-f]{2}/", so we will not report
- * "objects/foobar" as cruft.
- *
- * - loose_subdir is called for each top-level hashed subdirectory
- * of the object directory (e.g., "$OBJDIR/f0"). It is called
- * after the objects in the directory are processed.
- *
- * Any callback that is NULL will be ignored. Callbacks returning non-zero
- * will end the iteration.
- *
- * In the "buf" variant, "path" is a strbuf which will also be used as a
- * scratch buffer, but restored to its original contents before
- * the function returns.
- */
-typedef int each_loose_object_fn(const struct object_id *oid,
- const char *path,
- void *data);
-typedef int each_loose_cruft_fn(const char *basename,
- const char *path,
- void *data);
-typedef int each_loose_subdir_fn(unsigned int nr,
- const char *path,
- void *data);
-int for_each_file_in_obj_subdir(unsigned int subdir_nr,
- struct strbuf *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-int for_each_loose_file_in_objdir(const char *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-int for_each_loose_file_in_objdir_buf(struct strbuf *path,
- each_loose_object_fn obj_cb,
- each_loose_cruft_fn cruft_cb,
- each_loose_subdir_fn subdir_cb,
- void *data);
-
-/* Flags for for_each_*_object() below. */
+/* Flags for for_each_*_object(). */
enum for_each_object_flags {
/* Iterate only over local objects, not alternates. */
FOR_EACH_OBJECT_LOCAL_ONLY = (1<<0),
@@ -409,33 +343,6 @@ enum for_each_object_flags {
FOR_EACH_OBJECT_SKIP_ON_DISK_KEPT_PACKS = (1<<4),
};
-/*
- * Iterate over all accessible loose objects without respect to
- * reachability. By default, this includes both local and alternate objects.
- * The order in which objects are visited is unspecified.
- *
- * Any flags specific to packs are ignored.
- */
-int for_each_loose_object(each_loose_object_fn, void *,
- enum for_each_object_flags flags);
-
-/*
- * Iterate over all accessible packed objects without respect to reachability.
- * By default, this includes both local and alternate packs.
- *
- * Note that some objects may appear twice if they are found in multiple packs.
- * Each pack is visited in an unspecified order. By default, objects within a
- * pack are visited in pack-idx order (i.e., sorted by oid).
- */
-typedef int each_packed_object_fn(const struct object_id *oid,
- struct packed_git *pack,
- uint32_t pos,
- void *data);
-int for_each_object_in_pack(struct packed_git *p,
- each_packed_object_fn, void *data,
- enum for_each_object_flags flags);
-int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
- void *data, enum for_each_object_flags flags);
void *read_object_with_reference(struct repository *r,
const struct object_id *oid,
diff --git a/packfile.h b/packfile.h
index 05499382397..3a3c77cf05a 100644
--- a/packfile.h
+++ b/packfile.h
@@ -3,6 +3,7 @@
#include "list.h"
#include "object.h"
+#include "object-store.h"
#include "oidset.h"
/* in object-store.h */
@@ -117,6 +118,24 @@ void for_each_file_in_pack_dir(const char *objdir,
each_file_in_pack_dir_fn fn,
void *data);
+/*
+ * Iterate over all accessible packed objects without respect to reachability.
+ * By default, this includes both local and alternate packs.
+ *
+ * Note that some objects may appear twice if they are found in multiple packs.
+ * Each pack is visited in an unspecified order. By default, objects within a
+ * pack are visited in pack-idx order (i.e., sorted by oid).
+ */
+typedef int each_packed_object_fn(const struct object_id *oid,
+ struct packed_git *pack,
+ uint32_t pos,
+ void *data);
+int for_each_object_in_pack(struct packed_git *p,
+ each_packed_object_fn, void *data,
+ enum for_each_object_flags flags);
+int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
+ void *data, enum for_each_object_flags flags);
+
/* A hook to report invalid files in pack directory */
#define PACKDIR_FILE_PACK 1
#define PACKDIR_FILE_IDX 2
diff --git a/prune-packed.c b/prune-packed.c
index c1d95a519d7..92fb4fbb0ed 100644
--- a/prune-packed.c
+++ b/prune-packed.c
@@ -2,7 +2,7 @@
#include "git-compat-util.h"
#include "gettext.h"
-#include "object-store.h"
+#include "object-file.h"
#include "packfile.h"
#include "progress.h"
#include "prune-packed.h"
diff --git a/reachable.c b/reachable.c
index e5f56f40181..9dc748f0b9a 100644
--- a/reachable.c
+++ b/reachable.c
@@ -14,7 +14,7 @@
#include "list-objects.h"
#include "packfile.h"
#include "worktree.h"
-#include "object-store.h"
+#include "object-file.h"
#include "pack-bitmap.h"
#include "pack-mtimes.h"
#include "config.h"
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 5/7] object-store: allow fetching objects via `has_object()`
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
` (3 preceding siblings ...)
2025-04-29 7:52 ` [PATCH v3 4/7] object-store: move function declarations to their respective subsystems Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 6/7] treewide: convert users of `repo_has_object_file()` to `has_object()` Patrick Steinhardt
` (2 subsequent siblings)
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
We're about to fully remove `repo_has_object_file()` in favor of
`has_object()`. The latter function does not yet have a way to fetch
missing objects via a promisor remote though, which means that it cannot
fully replace all usecases of `repo_has_object_file()`.
Introduce a new flag `HAS_OBJECT_FETCH_PROMISOR` that causes the
function to optionally fetch missing objects which are part of a
promisor pack. This flag will be used in the subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.c | 9 ++++++---
object-store.h | 10 +++++++---
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/object-store.c b/object-store.c
index 0cbad5a19a0..0d873868a6d 100644
--- a/object-store.c
+++ b/object-store.c
@@ -937,12 +937,15 @@ void *read_object_with_reference(struct repository *r,
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags)
{
- int quick = !(flags & HAS_OBJECT_RECHECK_PACKED);
- unsigned object_info_flags = OBJECT_INFO_SKIP_FETCH_OBJECT |
- (quick ? OBJECT_INFO_QUICK : 0);
+ unsigned object_info_flags = 0;
if (!startup_info->have_repository)
return 0;
+ if (!(flags & HAS_OBJECT_RECHECK_PACKED))
+ object_info_flags |= OBJECT_INFO_QUICK;
+ if (!(flags & HAS_OBJECT_FETCH_PROMISOR))
+ object_info_flags |= OBJECT_INFO_SKIP_FETCH_OBJECT;
+
return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
}
diff --git a/object-store.h b/object-store.h
index 9dc39a7c91e..f0e111464c2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -262,12 +262,16 @@ int oid_object_info_extended(struct repository *r,
const struct object_id *,
struct object_info *, unsigned flags);
-/* Retry packed storage after checking packed and loose storage */
-#define HAS_OBJECT_RECHECK_PACKED 1
+enum {
+ /* Retry packed storage after checking packed and loose storage */
+ HAS_OBJECT_RECHECK_PACKED = (1 << 0),
+ /* Allow fetching the object in case the repository has a promisor remote. */
+ HAS_OBJECT_FETCH_PROMISOR = (1 << 1),
+};
/*
* Returns 1 if the object exists. This function will not lazily fetch objects
- * in a partial clone.
+ * in a partial clone by default.
*/
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags);
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 6/7] treewide: convert users of `repo_has_object_file()` to `has_object()`
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
` (4 preceding siblings ...)
2025-04-29 7:52 ` [PATCH v3 5/7] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 7/7] object-store: drop `repo_has_object_file()` Patrick Steinhardt
2025-04-29 20:07 ` [PATCH v3 0/7] object-store: a handful of cleanups Junio C Hamano
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
favor of `has_object()`. There are a couple of slight benefits in favor
of the replacement:
- The new function has a short-and-sweet name.
- More explicit defaults: `has_object()` doesn't fetch missing objects
via promisor remotes, and neither does it reload packfiles if an
object wasn't found by default. This ensures that it becomes
immediately obvious when a simple object existence check may result
in expensive actions.
Most importantly though, it is confusing that we have two sets of
functions that ultimately do the same thing, but with different
defaults.
Start sunsetting `repo_has_object_file()` and its `_with_flags()`
sibling by replacing all callsites with `has_object()`:
- `repo_has_object_file(...)` is equivalent to
`has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`
is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.
The replacements should be functionally equivalent.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/cat-file.c | 3 ++-
builtin/clone.c | 4 +---
builtin/fetch.c | 15 +++++++--------
builtin/index-pack.c | 5 ++---
builtin/receive-pack.c | 4 +++-
builtin/remote.c | 3 ++-
builtin/show-ref.c | 3 ++-
builtin/unpack-objects.c | 3 ++-
bulk-checkin.c | 3 ++-
cache-tree.c | 13 +++++++++----
fetch-pack.c | 7 +++----
http-push.c | 11 +++++++----
http-walker.c | 6 ++++--
list-objects.c | 3 ++-
notes.c | 3 ++-
object-store.c | 2 +-
reflog.c | 3 ++-
refs.c | 2 +-
remote.c | 2 +-
send-pack.c | 5 +----
shallow.c | 9 ++++++---
upload-pack.c | 3 +--
walker.c | 3 ++-
23 files changed, 65 insertions(+), 50 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 0e3f10a9467..3914a2a3f61 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -169,7 +169,8 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
goto cleanup;
case 'e':
- ret = !repo_has_object_file(the_repository, &oid);
+ ret = !has_object(the_repository, &oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR);
goto cleanup;
case 'w':
diff --git a/builtin/clone.c b/builtin/clone.c
index 6b1d11a3ed2..b498b81a043 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -504,9 +504,7 @@ static void write_followtags(const struct ref *refs, const char *msg)
continue;
if (ends_with(ref->name, "^{}"))
continue;
- if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid,
- OBJECT_INFO_QUICK |
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &ref->old_oid, 0))
continue;
refs_update_ref(get_main_ref_store(the_repository), msg,
ref->name, &ref->old_oid, NULL, 0,
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 95589b49948..aadcf49a5b4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -337,7 +337,6 @@ static void find_non_local_tags(const struct ref *refs,
struct string_list_item *remote_ref_item;
const struct ref *ref;
struct refname_hash_entry *item = NULL;
- const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
refname_hash_init(&existing_refs);
refname_hash_init(&remote_refs);
@@ -367,9 +366,9 @@ static void find_non_local_tags(const struct ref *refs,
*/
if (ends_with(ref->name, "^{}")) {
if (item &&
- !repo_has_object_file_with_flags(the_repository, &ref->old_oid, quick_flags) &&
+ !has_object(the_repository, &ref->old_oid, 0) &&
!oidset_contains(&fetch_oids, &ref->old_oid) &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
item = NULL;
@@ -383,7 +382,7 @@ static void find_non_local_tags(const struct ref *refs,
* fetch.
*/
if (item &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
@@ -404,7 +403,7 @@ static void find_non_local_tags(const struct ref *refs,
* checked to see if it needs fetching.
*/
if (item &&
- !repo_has_object_file_with_flags(the_repository, &item->oid, quick_flags) &&
+ !has_object(the_repository, &item->oid, 0) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
@@ -911,7 +910,8 @@ static int update_local_ref(struct ref *ref,
struct commit *current = NULL, *updated;
int fast_forward = 0;
- if (!repo_has_object_file(the_repository, &ref->new_oid))
+ if (!has_object(the_repository, &ref->new_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
die(_("object %s not found"), oid_to_hex(&ref->new_oid));
if (oideq(&ref->old_oid, &ref->new_oid)) {
@@ -1330,8 +1330,7 @@ static int check_exist_and_connected(struct ref *ref_map)
* we need all direct targets to exist.
*/
for (r = rm; r; r = r->next) {
- if (!repo_has_object_file_with_flags(the_repository, &r->old_oid,
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &r->old_oid, HAS_OBJECT_RECHECK_PACKED))
return -1;
}
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f49431d626b..147e9b8b479 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -892,9 +892,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
if (startup_info->have_repository) {
read_lock();
- collision_test_needed =
- repo_has_object_file_with_flags(the_repository, oid,
- OBJECT_INFO_QUICK);
+ collision_test_needed = has_object(the_repository, oid,
+ HAS_OBJECT_FETCH_PROMISOR);
read_unlock();
}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index be314879e82..c92e57ba188 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1506,7 +1506,9 @@ static const char *update(struct command *cmd, struct shallow_info *si)
}
}
- if (!is_null_oid(new_oid) && !repo_has_object_file(the_repository, new_oid)) {
+ if (!is_null_oid(new_oid) &&
+ !has_object(the_repository, new_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
error("unpack should have generated %s, "
"but I can't find it!", oid_to_hex(new_oid));
ret = "bad pack";
diff --git a/builtin/remote.c b/builtin/remote.c
index b4baa34e665..0d6755bcb71 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -454,7 +454,8 @@ static int get_push_ref_states(const struct ref *remote_refs,
info->status = PUSH_STATUS_UPTODATE;
else if (is_null_oid(&ref->old_oid))
info->status = PUSH_STATUS_CREATE;
- else if (repo_has_object_file(the_repository, &ref->old_oid) &&
+ else if (has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) &&
ref_newer(&ref->new_oid, &ref->old_oid))
info->status = PUSH_STATUS_FASTFORWARD;
else
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index f81209f23c3..623a52a45f8 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -35,7 +35,8 @@ static void show_one(const struct show_one_options *opts,
const char *hex;
struct object_id peeled;
- if (!repo_has_object_file(the_repository, oid))
+ if (!has_object(the_repository, oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
die("git show-ref: bad ref %s (%s)", refname,
oid_to_hex(oid));
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 661be789f13..e905d5f4e19 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -449,7 +449,8 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
delta_data = get_data(delta_size);
if (!delta_data)
return;
- if (repo_has_object_file(the_repository, &base_oid))
+ if (has_object(the_repository, &base_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
; /* Ok we have this one */
else if (resolve_against_held(nr, &base_oid,
delta_data, delta_size))
diff --git a/bulk-checkin.c b/bulk-checkin.c
index c31c31b18d8..678e2ecc2c2 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -130,7 +130,8 @@ static void flush_batch_fsync(void)
static int already_written(struct bulk_checkin_packfile *state, struct object_id *oid)
{
/* The object may already exist in the repository */
- if (repo_has_object_file(the_repository, oid))
+ if (has_object(the_repository, oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 1;
/* Might want to keep the list sorted */
diff --git a/cache-tree.c b/cache-tree.c
index c0e1e9ee1d4..fa3858e2829 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -238,7 +238,9 @@ int cache_tree_fully_valid(struct cache_tree *it)
int i;
if (!it)
return 0;
- if (it->entry_count < 0 || !repo_has_object_file(the_repository, &it->oid))
+ if (it->entry_count < 0 ||
+ has_object(the_repository, &it->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0;
for (i = 0; i < it->subtree_nr; i++) {
if (!cache_tree_fully_valid(it->down[i]->cache_tree))
@@ -289,7 +291,9 @@ static int update_one(struct cache_tree *it,
}
}
- if (0 <= it->entry_count && repo_has_object_file(the_repository, &it->oid))
+ if (0 <= it->entry_count &&
+ has_object(the_repository, &it->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return it->entry_count;
/*
@@ -395,7 +399,8 @@ static int update_one(struct cache_tree *it,
ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
!must_check_existence(ce);
if (is_null_oid(oid) ||
- (!ce_missing_ok && !repo_has_object_file(the_repository, oid))) {
+ (!ce_missing_ok && !has_object(the_repository, oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))) {
strbuf_release(&buffer);
if (expected_missing)
return -1;
@@ -443,7 +448,7 @@ static int update_one(struct cache_tree *it,
struct object_id oid;
hash_object_file(the_hash_algo, buffer.buf, buffer.len,
OBJ_TREE, &oid);
- if (repo_has_object_file_with_flags(the_repository, &oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (has_object(the_repository, &oid, HAS_OBJECT_RECHECK_PACKED))
oidcpy(&it->oid, &oid);
else
to_invalidate = 1;
diff --git a/fetch-pack.c b/fetch-pack.c
index 210dc30d50f..fa4231fee74 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -769,9 +769,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
if (!commit) {
struct object *o;
- if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid,
- OBJECT_INFO_QUICK |
- OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, &ref->old_oid, 0))
continue;
o = parse_object(the_repository, &ref->old_oid);
if (!o || o->type != OBJ_COMMIT)
@@ -1985,7 +1983,8 @@ static void update_shallow(struct fetch_pack_args *args,
struct oid_array extra = OID_ARRAY_INIT;
struct object_id *oid = si->shallow->oid;
for (i = 0; i < si->shallow->nr; i++)
- if (repo_has_object_file(the_repository, &oid[i]))
+ if (has_object(the_repository, &oid[i],
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
oid_array_append(&extra, &oid[i]);
if (extra.nr) {
setup_alternate_shallow(&shallow_lock,
diff --git a/http-push.c b/http-push.c
index 32e37565f4e..f9e67cabd4b 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1446,7 +1446,9 @@ static void one_remote_ref(const char *refname)
* Fetch a copy of the object if it doesn't exist locally - it
* may be required for updating server info later.
*/
- if (repo->can_update_info_refs && !repo_has_object_file(the_repository, &ref->old_oid)) {
+ if (repo->can_update_info_refs &&
+ !has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
obj = lookup_unknown_object(the_repository, &ref->old_oid);
fprintf(stderr, " fetch %s for %s\n",
oid_to_hex(&ref->old_oid), refname);
@@ -1651,14 +1653,14 @@ static int delete_remote_branch(const char *pattern, int force)
return error("Remote HEAD symrefs too deep");
if (is_null_oid(&head_oid))
return error("Unable to resolve remote HEAD");
- if (!repo_has_object_file(the_repository, &head_oid))
+ if (!has_object(the_repository, &head_oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return error("Remote HEAD resolves to object %s\nwhich does not exist locally, perhaps you need to fetch?", oid_to_hex(&head_oid));
/* Remote branch must resolve to a known object */
if (is_null_oid(&remote_ref->old_oid))
return error("Unable to resolve remote branch %s",
remote_ref->name);
- if (!repo_has_object_file(the_repository, &remote_ref->old_oid))
+ if (!has_object(the_repository, &remote_ref->old_oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return error("Remote branch %s resolves to object %s\nwhich does not exist locally, perhaps you need to fetch?", remote_ref->name, oid_to_hex(&remote_ref->old_oid));
/* Remote branch must be an ancestor of remote HEAD */
@@ -1879,7 +1881,8 @@ int cmd_main(int argc, const char **argv)
if (!force_all &&
!is_null_oid(&ref->old_oid) &&
!ref->force) {
- if (!repo_has_object_file(the_repository, &ref->old_oid) ||
+ if (!has_object(the_repository, &ref->old_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) ||
!ref_newer(&ref->peer_ref->new_oid,
&ref->old_oid)) {
/*
diff --git a/http-walker.c b/http-walker.c
index 95458e2f638..463f7b119ad 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -138,7 +138,8 @@ static int fill_active_slot(void *data UNUSED)
list_for_each_safe(pos, tmp, head) {
obj_req = list_entry(pos, struct object_request, node);
if (obj_req->state == WAITING) {
- if (repo_has_object_file(the_repository, &obj_req->oid))
+ if (has_object(the_repository, &obj_req->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
obj_req->state = COMPLETE;
else {
start_object_request(obj_req);
@@ -496,7 +497,8 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
if (!obj_req)
return error("Couldn't find request for %s in the queue", hex);
- if (repo_has_object_file(the_repository, &obj_req->oid)) {
+ if (has_object(the_repository, &obj_req->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
if (obj_req->req)
abort_http_object_request(&obj_req->req);
abort_object_request(obj_req);
diff --git a/list-objects.c b/list-objects.c
index 1e5512e1318..597114281f6 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -74,7 +74,8 @@ static void process_blob(struct traversal_context *ctx,
* of missing objects.
*/
if (ctx->revs->exclude_promisor_objects &&
- !repo_has_object_file(the_repository, &obj->oid) &&
+ !has_object(the_repository, &obj->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) &&
is_promisor_object(ctx->revs->repo, &obj->oid))
return;
diff --git a/notes.c b/notes.c
index d9645c4b5dc..0a128f1de98 100644
--- a/notes.c
+++ b/notes.c
@@ -794,7 +794,8 @@ static int prune_notes_helper(const struct object_id *object_oid,
struct note_delete_list **l = (struct note_delete_list **) cb_data;
struct note_delete_list *n;
- if (repo_has_object_file(the_repository, object_oid))
+ if (has_object(the_repository, object_oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0; /* nothing to do for this note */
/* failed to find object => prune this note */
diff --git a/object-store.c b/object-store.c
index 0d873868a6d..2db34804e8f 100644
--- a/object-store.c
+++ b/object-store.c
@@ -847,7 +847,7 @@ int pretend_object_file(struct repository *repo,
char *co_buf;
hash_object_file(repo->hash_algo, buf, len, type, oid);
- if (repo_has_object_file_with_flags(repo, oid, OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT) ||
+ if (has_object(repo, oid, 0) ||
find_cached_object(repo->objects, oid))
return 0;
diff --git a/reflog.c b/reflog.c
index c6c534dfad4..2db1cc81d01 100644
--- a/reflog.c
+++ b/reflog.c
@@ -152,7 +152,8 @@ static int tree_is_complete(const struct object_id *oid)
init_tree_desc(&desc, &tree->object.oid, tree->buffer, tree->size);
complete = 1;
while (tree_entry(&desc, &entry)) {
- if (!repo_has_object_file(the_repository, &entry.oid) ||
+ if (!has_object(the_repository, &entry.oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR) ||
(S_ISDIR(entry.mode) && !tree_is_complete(&entry.oid))) {
tree->object.flags |= INCOMPLETE;
complete = 0;
diff --git a/refs.c b/refs.c
index 6559db37890..dce5c49ca2b 100644
--- a/refs.c
+++ b/refs.c
@@ -376,7 +376,7 @@ int ref_resolves_to_object(const char *refname,
{
if (flags & REF_ISBROKEN)
return 0;
- if (!repo_has_object_file(repo, oid)) {
+ if (!has_object(repo, oid, HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
error(_("%s does not point to a valid object!"), refname);
return 0;
}
diff --git a/remote.c b/remote.c
index 9fa3614e7a3..4099183cacd 100644
--- a/remote.c
+++ b/remote.c
@@ -1702,7 +1702,7 @@ void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
if (!reject_reason && !ref->deletion && !is_null_oid(&ref->old_oid)) {
if (starts_with(ref->name, "refs/tags/"))
reject_reason = REF_STATUS_REJECT_ALREADY_EXISTS;
- else if (!repo_has_object_file_with_flags(the_repository, &ref->old_oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
+ else if (!has_object(the_repository, &ref->old_oid, HAS_OBJECT_RECHECK_PACKED))
reject_reason = REF_STATUS_REJECT_FETCH_FIRST;
else if (!lookup_commit_reference_gently(the_repository, &ref->old_oid, 1) ||
!lookup_commit_reference_gently(the_repository, &ref->new_oid, 1))
diff --git a/send-pack.c b/send-pack.c
index 5005689cb55..86592ce526d 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -45,10 +45,7 @@ int option_parse_push_signed(const struct option *opt,
static void feed_object(struct repository *r,
const struct object_id *oid, FILE *fh, int negative)
{
- if (negative &&
- !repo_has_object_file_with_flags(r, oid,
- OBJECT_INFO_SKIP_FETCH_OBJECT |
- OBJECT_INFO_QUICK))
+ if (negative && !has_object(r, oid, 0))
return;
if (negative)
diff --git a/shallow.c b/shallow.c
index 2f82ebd6e3f..faeeeb45f98 100644
--- a/shallow.c
+++ b/shallow.c
@@ -310,7 +310,8 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
if (graft->nr_parent != -1)
return 0;
if (data->flags & QUICK) {
- if (!repo_has_object_file(the_repository, &graft->oid))
+ if (!has_object(the_repository, &graft->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
return 0;
} else if (data->flags & SEEN_ONLY) {
struct commit *c = lookup_commit(the_repository, &graft->oid);
@@ -476,7 +477,8 @@ void prepare_shallow_info(struct shallow_info *info, struct oid_array *sa)
ALLOC_ARRAY(info->ours, sa->nr);
ALLOC_ARRAY(info->theirs, sa->nr);
for (size_t i = 0; i < sa->nr; i++) {
- if (repo_has_object_file(the_repository, sa->oid + i)) {
+ if (has_object(the_repository, sa->oid + i,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
struct commit_graft *graft;
graft = lookup_commit_graft(the_repository,
&sa->oid[i]);
@@ -513,7 +515,8 @@ void remove_nonexistent_theirs_shallow(struct shallow_info *info)
for (i = dst = 0; i < info->nr_theirs; i++) {
if (i != dst)
info->theirs[dst] = info->theirs[i];
- if (repo_has_object_file(the_repository, oid + info->theirs[i]))
+ if (has_object(the_repository, oid + info->theirs[i],
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
dst++;
}
info->nr_theirs = dst;
diff --git a/upload-pack.c b/upload-pack.c
index 30e4630f3a1..956da5b061a 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -509,8 +509,7 @@ static int got_oid(struct upload_pack_data *data,
{
if (get_oid_hex(hex, oid))
die("git upload-pack: expected SHA1 object, got '%s'", hex);
- if (!repo_has_object_file_with_flags(the_repository, oid,
- OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT))
+ if (!has_object(the_repository, oid, 0))
return -1;
return do_got_oid(data, oid);
}
diff --git a/walker.c b/walker.c
index 4fedc19f346..b470d43e54d 100644
--- a/walker.c
+++ b/walker.c
@@ -150,7 +150,8 @@ static int process(struct walker *walker, struct object *obj)
return 0;
obj->flags |= SEEN;
- if (repo_has_object_file(the_repository, &obj->oid)) {
+ if (has_object(the_repository, &obj->oid,
+ HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)) {
/* We already have it, so we should scan it now. */
obj->flags |= TO_SCAN;
}
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 7/7] object-store: drop `repo_has_object_file()`
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
` (5 preceding siblings ...)
2025-04-29 7:52 ` [PATCH v3 6/7] treewide: convert users of `repo_has_object_file()` to `has_object()` Patrick Steinhardt
@ 2025-04-29 7:52 ` Patrick Steinhardt
2025-04-29 20:07 ` [PATCH v3 0/7] object-store: a handful of cleanups Junio C Hamano
7 siblings, 0 replies; 55+ messages in thread
From: Patrick Steinhardt @ 2025-04-29 7:52 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Junio C Hamano
In the preceding commits we have converted all users of
`repo_has_object_file()` and its `_with_flags()` variant to instead use
`has_object()`. Drop these functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
object-store.c | 14 --------------
object-store.h | 17 -----------------
2 files changed, 31 deletions(-)
diff --git a/object-store.c b/object-store.c
index 2db34804e8f..2f51d0e3b03 100644
--- a/object-store.c
+++ b/object-store.c
@@ -949,20 +949,6 @@ int has_object(struct repository *r, const struct object_id *oid,
return oid_object_info_extended(r, oid, NULL, object_info_flags) >= 0;
}
-int repo_has_object_file_with_flags(struct repository *r,
- const struct object_id *oid, int flags)
-{
- if (!startup_info->have_repository)
- return 0;
- return oid_object_info_extended(r, oid, NULL, flags) >= 0;
-}
-
-int repo_has_object_file(struct repository *r,
- const struct object_id *oid)
-{
- return repo_has_object_file_with_flags(r, oid, 0);
-}
-
void assert_oid_type(const struct object_id *oid, enum object_type expect)
{
enum object_type type = oid_object_info(the_repository, oid, NULL);
diff --git a/object-store.h b/object-store.h
index f0e111464c2..c2fe5a19605 100644
--- a/object-store.h
+++ b/object-store.h
@@ -276,23 +276,6 @@ enum {
int has_object(struct repository *r, const struct object_id *oid,
unsigned flags);
-/*
- * These macros and functions are deprecated. If checking existence for an
- * object that is likely to be missing and/or whose absence is relatively
- * inconsequential (or is consequential but the caller is prepared to handle
- * it), use has_object(), which has better defaults (no lazy fetch in a partial
- * clone and no rechecking of packed storage). In the unlikely event that a
- * caller needs to assert existence of an object that it fully expects to
- * exist, and wants to trigger a lazy fetch in a partial clone, use
- * oid_object_info_extended() with a NULL struct object_info.
- *
- * These functions can be removed once all callers have migrated to
- * has_object() and/or oid_object_info_extended().
- */
-int repo_has_object_file(struct repository *r, const struct object_id *oid);
-int repo_has_object_file_with_flags(struct repository *r,
- const struct object_id *oid, int flags);
-
void assert_oid_type(const struct object_id *oid, enum object_type expect);
/*
--
2.49.0.967.g6a0df3ecc3.dirty
^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write
2025-04-29 6:15 ` Patrick Steinhardt
@ 2025-04-29 15:25 ` Junio C Hamano
0 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2025-04-29 15:25 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> Yeah, to be honest I wasn't totally sure whether to include these steps
> myself as I anticipated that they will lead to discussions that derail
> my original goal, which is to clean up the interfaces in the object
> subsystem. I decided to go with these where I thought that my train of
> thought is reasonable, but given your comments I'll probably just drop
> those patches.
>
> We can still adapt these callsites in the future as needed.
It is probably why among our past rewrites and refactors, successful
ones started with a rewrite faithful to the original, treating
semantic improvements as a set of separate topics on top. I think
some semantic-improvement steps like the http walker one make sense,
but even for them, others may give us some reasons why it is not a
good idea to cause me change my mind. On the other hand, steps I
thought whose semantic changes were undesirable may turn out to be
benign (they unquestionably will improve the performance of the call
path; it was quesitonable to me if they still gave us correct
results). So let's give people some time to evaluate. I'd consider
these "improved behaviour" changes as "to be discarded by default,
unless there are strong supporting arguments why the specific
semantic changes are good" material.
Thanks.
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 0/7] object-store: a handful of cleanups
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
` (6 preceding siblings ...)
2025-04-29 7:52 ` [PATCH v3 7/7] object-store: drop `repo_has_object_file()` Patrick Steinhardt
@ 2025-04-29 20:07 ` Junio C Hamano
7 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2025-04-29 20:07 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
Patrick Steinhardt <ps@pks.im> writes:
> Changes in v3:
> - Move around `hash_object_file()`, which I missed in previous
> iterations.
> - Don't try to fix up callsites where we end up fetching promised
> objects. This patch series now only does a trivial 1:1 conversion.
> - Clarify why we're sunsetting `repo_has_object_file()`.
> - Link to v2: https://lore.kernel.org/r/20250425-pks-object-store-cleanups-v2-0-63f1695b7700@pks.im
Thanks for a quick reroll. I think these are all good.
^ permalink raw reply [flat|nested] 55+ messages in thread
end of thread, other threads:[~2025-04-29 20:07 UTC | newest]
Thread overview: 55+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-23 7:48 [PATCH 00/13] object-store: a handful of cleanups Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 02/13] object-store: drop `loose_object_path()` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
2025-04-23 10:03 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 04/13] object-store: move function declarations to their respective subsystems Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
2025-04-23 10:07 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
2025-04-23 17:08 ` Karthik Nayak
2025-04-25 7:04 ` Patrick Steinhardt
2025-04-28 19:48 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
2025-04-23 17:11 ` Karthik Nayak
2025-04-23 7:48 ` [PATCH 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 11/13] list-objects: clarify how promised blobs are excluded Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
2025-04-23 7:48 ` [PATCH 13/13] object-store: drop `repo_has_object_file()` Patrick Steinhardt
2025-04-23 17:20 ` [PATCH 00/13] object-store: a handful of cleanups Karthik Nayak
2025-04-25 7:07 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 " Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 01/13] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 02/13] object-store: drop `loose_object_path()` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 03/13] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 04/13] object-store: move function declarations to their respective subsystems Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 05/13] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 06/13] treewide: trivial conversions of `repo_has_object_file()` Patrick Steinhardt
2025-04-28 21:40 ` Junio C Hamano
2025-04-25 7:08 ` [PATCH v2 07/13] builtin/index-pack: don't fetch promised objects for collision check Patrick Steinhardt
2025-04-28 21:46 ` Junio C Hamano
2025-04-29 6:15 ` Patrick Steinhardt
2025-04-25 7:08 ` [PATCH v2 08/13] builtin/show-ref: don't fetch objects when printing refs Patrick Steinhardt
2025-04-28 21:50 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 09/13] refs: don't fetch promisor objects in `ref_resolves_to_object()` Patrick Steinhardt
2025-04-28 21:53 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 10/13] http-walker: don't fetch objects via promisor remotes Patrick Steinhardt
2025-04-28 21:56 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 11/13] list-objects: clarify how promised blobs are excluded Patrick Steinhardt
2025-04-25 7:09 ` [PATCH v2 12/13] bulk-checkin: don't fetch promised objects on write Patrick Steinhardt
2025-04-28 22:07 ` Junio C Hamano
2025-04-29 6:15 ` Patrick Steinhardt
2025-04-29 15:25 ` Junio C Hamano
2025-04-25 7:09 ` [PATCH v2 13/13] object-store: drop `repo_has_object_file()` Patrick Steinhardt
2025-04-28 19:49 ` [PATCH v2 00/13] object-store: a handful of cleanups Karthik Nayak
2025-04-29 7:52 ` [PATCH v3 0/7] " Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 1/7] object-store: move `struct packed_git` into "packfile.h" Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 2/7] object-store: drop `loose_object_path()` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 3/7] object-store: move and rename `odb_pack_keep()` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 4/7] object-store: move function declarations to their respective subsystems Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 5/7] object-store: allow fetching objects via `has_object()` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 6/7] treewide: convert users of `repo_has_object_file()` to `has_object()` Patrick Steinhardt
2025-04-29 7:52 ` [PATCH v3 7/7] object-store: drop `repo_has_object_file()` Patrick Steinhardt
2025-04-29 20:07 ` [PATCH v3 0/7] object-store: a handful of cleanups Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).