public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Small fixups for `OBJECT_INFO` flags
@ 2026-01-26 12:17 Patrick Steinhardt
  2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-01-26 12:17 UTC (permalink / raw)
  To: git

Hi,

I was kind of curious why there were gaps in the `OBJECT_INFO_*` flags,
but eventually found out that these gaps are of historic nature: there
used to be more flags, but their respective values got removed at one
point in time. So naturally, I wanted to clean this up a bit so that the
next reader wouldn't have the same question.

Surprisingly though I found out that this breaks tests, which of course
puzzled me. As it turns out though, we were incorrectly using a couple
of these flags for `odb_has_object()`, and the changed definitions had
overlap with the existing meaning of other `HAS_OBJECT_*` flags. There
isn't really any bug here as far as I can see, but this is only really
by chance.

In any case, the first two commits fix calls to `odb_has_object()` that
used invalid flags. The last commit then removes the gaps and converts
the flags to use an enum instead.

Thanks!

Patrick

---
Patrick Steinhardt (3):
      builtin/backfill: fix flags passed to `odb_has_object()`
      builtin/fsck: fix flags passed to `odb_has_object()`
      odb: drop gaps in object info flag values

 builtin/backfill.c |  3 +--
 builtin/fsck.c     |  3 ++-
 odb.h              | 38 ++++++++++++++++++++++----------------
 3 files changed, 25 insertions(+), 19 deletions(-)


---
base-commit: ea24e2c55433012a0a6c4ae947a87bc66404e484
change-id: 20260126-b4-pks-read-object-info-flags-236c4437cfc5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-01-26 12:17 [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Patrick Steinhardt
@ 2026-01-26 12:17 ` Patrick Steinhardt
  2026-01-26 20:17   ` Derrick Stolee
                     ` (2 more replies)
  2026-01-26 12:17 ` [PATCH 2/3] builtin/fsck: " Patrick Steinhardt
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-01-26 12:17 UTC (permalink / raw)
  To: git

The function `fill_missing_blobs()` receives an array of object IDs and
verifies for each of them whether the corresponding object exists. If it
doesn't exist, we add it to a set of objects and then batch-fetch all of
the objects at once.

The check for whether or not we already have the object is broken
though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
expects us to pass `HAS_OBJECT_*` flags. The flag expands to:

  - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
    in case the object wasn't found. This makes sense, as we'd otherwise
    reprepare the object database as many times as we have missing
    objects.

  - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
    not fetch the object in case it's missing. Again, this makes sense,
    as we want to batch-fetch the objects.

This shows that we indeed want the equivalent of this flag, but of
course represented as `HAS_OBJECT_*` flags.

Luckily, the code is already working correctly. The `OBJECT_INFO` flag
expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
flags. And if no flags are passed, `odb_has_object()` ends up calling
`odb_read_object_info_extended()` with exactly the above two flags that
we wanted to set in the first place.

Of course, this is pure luck, and this can break any moment. So let's
fix this and correct the code to not pass any flags at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/backfill.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/builtin/backfill.c b/builtin/backfill.c
index e80fc1b694..d8cb3b0eba 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -67,8 +67,7 @@ static int fill_missing_blobs(const char *path UNUSED,
 		return 0;
 
 	for (size_t i = 0; i < list->nr; i++) {
-		if (!odb_has_object(ctx->repo->objects, &list->oid[i],
-				    OBJECT_INFO_FOR_PREFETCH))
+		if (!odb_has_object(ctx->repo->objects, &list->oid[i], 0))
 			oid_array_append(&ctx->current_batch, &list->oid[i]);
 	}
 

-- 
2.53.0.rc1.267.g6e3a78c723.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/3] builtin/fsck: fix flags passed to `odb_has_object()`
  2026-01-26 12:17 [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Patrick Steinhardt
  2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
@ 2026-01-26 12:17 ` Patrick Steinhardt
  2026-02-09 20:04   ` Justin Tobler
  2026-01-26 12:17 ` [PATCH 3/3] odb: drop gaps in object info flag values Patrick Steinhardt
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Patrick Steinhardt @ 2026-01-26 12:17 UTC (permalink / raw)
  To: git

In `mark_object()` we invoke `has_object()` with a value of 1. This is
somewhat fishy given that the function expects a bitset of flags, so any
behaviour that this results in is purely coincidental and may break at
any point in time.

The call to `has_object()` was originally introduced in 9eb86f41de
(fsck: do not lazy fetch known non-promisor object, 2020-08-05). The
intent here was to skip lazy fetches of promisor objects: we have
already verified that the object is not a promisor object, so if the
object is missing it indicates a corrupt repository.

The hardcoded value that we pass maps to `HAS_OBJECT_RECHECK_PACKED`,
which is probably the intended behaviour: `odb_has_object()` will not
fetch promisor objects unless `HAS_OBJECT_FETCH_PROMISOR` is passed, but
we may want to verify that no concurrent process has written the object
that we're trying to read.

Convert the code to use the named flag instead of the the hardcoded
value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0512f78a87..1d059dd6c2 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -162,7 +162,8 @@ static int mark_object(struct object *obj, enum object_type type,
 		return 0;
 
 	if (!(obj->flags & HAS_OBJ)) {
-		if (parent && !odb_has_object(the_repository->objects, &obj->oid, 1)) {
+		if (parent && !odb_has_object(the_repository->objects, &obj->oid,
+					      HAS_OBJECT_RECHECK_PACKED)) {
 			printf_ln(_("broken link from %7s %s\n"
 				    "              to %7s %s"),
 				  printable_type(&parent->oid, parent->type),

-- 
2.53.0.rc1.267.g6e3a78c723.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-26 12:17 [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Patrick Steinhardt
  2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
  2026-01-26 12:17 ` [PATCH 2/3] builtin/fsck: " Patrick Steinhardt
@ 2026-01-26 12:17 ` Patrick Steinhardt
  2026-01-26 16:58   ` Junio C Hamano
  2026-02-09 20:18   ` Justin Tobler
  2026-01-26 16:28 ` [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Junio C Hamano
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
  4 siblings, 2 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-01-26 12:17 UTC (permalink / raw)
  To: git

The object info flag values have a two gaps in their definitions, where
some bits are skipped over. These gaps don't really hurt, but it makes
one wonder whether anything is going on and whether a subset of flags
might be defined somewhere else.

That's not the case though. Instead, this is a case of flags that have
been dropped in the past:

  - The value 4 was used by `OBJECT_INFO_SKIP_CACHED`, removed in
    9c8a294a1a (sha1-file: remove OBJECT_INFO_SKIP_CACHED, 2020-01-02).

  - The value 8 was used by `OBJECT_INFO_ALLOW_UNKNOWN_TYPE`, removed in
    ae24b032a0 (object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag,
    2025-05-16).

Close those gaps to avoid any more confusion. While at it, convert the
flags to be declared as an enum and use bit shifts to follow modern best
practices.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.h | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/odb.h b/odb.h
index bab07755f4..1e4326b7f4 100644
--- a/odb.h
+++ b/odb.h
@@ -352,23 +352,29 @@ struct object_info {
  */
 #define OBJECT_INFO_INIT { 0 }
 
-/* Invoke lookup_replace_object() on the given hash */
-#define OBJECT_INFO_LOOKUP_REPLACE 1
-/* Do not retry packed storage after checking packed and loose storage */
-#define OBJECT_INFO_QUICK 8
-/*
- * Do not attempt to fetch the object if missing (even if fetch_is_missing is
- * nonzero).
- */
-#define OBJECT_INFO_SKIP_FETCH_OBJECT 16
-/*
- * This is meant for bulk prefetching of missing blobs in a partial
- * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
- */
-#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
+/* Flags that can be passed to `odb_read_object_info_extended()`. */
+enum object_info_flags {
+	/* Invoke lookup_replace_object() on the given hash. */
+	OBJECT_INFO_LOOKUP_REPLACE = (1 << 0),
+
+	/* Do not reprepare object sources when the first lookup has failed. */
+	OBJECT_INFO_QUICK = (1 << 1),
+
+	/*
+	 * Do not attempt to fetch the object if missing (even if fetch_is_missing is
+	 * nonzero).
+	 */
+	OBJECT_INFO_SKIP_FETCH_OBJECT = (1 << 2),
+
+	/* Die if object corruption (not just an object being missing) was detected. */
+	OBJECT_INFO_DIE_IF_CORRUPT = (1 << 3),
 
-/* Die if object corruption (not just an object being missing) was detected. */
-#define OBJECT_INFO_DIE_IF_CORRUPT 32
+	/*
+	 * This is meant for bulk prefetching of missing blobs in a partial
+	 * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK.
+	 */
+	OBJECT_INFO_FOR_PREFETCH = (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK),
+};
 
 /*
  * Read object info from the object database and populate the `object_info`

-- 
2.53.0.rc1.267.g6e3a78c723.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/3] Small fixups for `OBJECT_INFO` flags
  2026-01-26 12:17 [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2026-01-26 12:17 ` [PATCH 3/3] odb: drop gaps in object info flag values Patrick Steinhardt
@ 2026-01-26 16:28 ` Junio C Hamano
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
  4 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2026-01-26 16:28 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> Surprisingly though I found out that this breaks tests, which of course
> puzzled me. As it turns out though, we were incorrectly using a couple
> of these flags for `odb_has_object()`, and the changed definitions had
> overlap with the existing meaning of other `HAS_OBJECT_*` flags. There
> isn't really any bug here as far as I can see, but this is only really
> by chance.

Great findings.

> In any case, the first two commits fix calls to `odb_has_object()` that
> used invalid flags. The last commit then removes the gaps and converts
> the flags to use an enum instead.

Nice.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-26 12:17 ` [PATCH 3/3] odb: drop gaps in object info flag values Patrick Steinhardt
@ 2026-01-26 16:58   ` Junio C Hamano
  2026-01-26 18:02     ` René Scharfe
  2026-02-09 20:18   ` Justin Tobler
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2026-01-26 16:58 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> +enum object_info_flags {
> +	/* Invoke lookup_replace_object() on the given hash. */
> +	OBJECT_INFO_LOOKUP_REPLACE = (1 << 0),
> +
> +	/* Do not reprepare object sources when the first lookup has failed. */
> +	OBJECT_INFO_QUICK = (1 << 1),
> +
> +	/*
> +	 * Do not attempt to fetch the object if missing (even if fetch_is_missing is
> +	 * nonzero).
> +	 */
> +	OBJECT_INFO_SKIP_FETCH_OBJECT = (1 << 2),
> +
> +	/* Die if object corruption (not just an object being missing) was detected. */
> +	OBJECT_INFO_DIE_IF_CORRUPT = (1 << 3),
>  
> -/* Die if object corruption (not just an object being missing) was detected. */
> -#define OBJECT_INFO_DIE_IF_CORRUPT 32
> +	/*
> +	 * This is meant for bulk prefetching of missing blobs in a partial
> +	 * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK.
> +	 */
> +	OBJECT_INFO_FOR_PREFETCH = (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK),
> +};
>  
>  /*
>   * Read object info from the object database and populate the `object_info`

I wonder if this series can be restructured a bit to demonstrate the
benefit of moving to enum a bit more prominently.  For example, even
at the end of the three patches, odb_read_object_info_extended()
still takes an "unsigned flags" parameter, but it is meant to take
this new enum, isn't it?  If we do the "#define to enum" conversion
(without renumbering) first, then "unsigned to enum", would it, with
appropriate compiler warning flags, already reveal the existing bugs
that happened to be working OK as potential problems?  And with that,
fixes in 1/3 and 2/3 would demonstrate why #define to enum" is worth
doing very well.  And after all that, we can renumber the enums in a
separate and final step.

Exactly the same comment applies to odb_has_object() that still
takes "unsigned flags", even though HAS_OBJECT_* constants have
already gone through the "#define to enum" conversion with an earier
f8fc4cac (object-store: allow fetching objects via `has_object()`,
2025-04-29).

In any case, well spotted and nicely done.
Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-26 16:58   ` Junio C Hamano
@ 2026-01-26 18:02     ` René Scharfe
  2026-01-26 18:13       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2026-01-26 18:02 UTC (permalink / raw)
  To: Junio C Hamano, Patrick Steinhardt; +Cc: git

On 1/26/26 5:58 PM, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
>> +enum object_info_flags {
>> +	/* Invoke lookup_replace_object() on the given hash. */
>> +	OBJECT_INFO_LOOKUP_REPLACE = (1 << 0),
>> +
>> +	/* Do not reprepare object sources when the first lookup has failed. */
>> +	OBJECT_INFO_QUICK = (1 << 1),
>> +
>> +	/*
>> +	 * Do not attempt to fetch the object if missing (even if fetch_is_missing is
>> +	 * nonzero).
>> +	 */
>> +	OBJECT_INFO_SKIP_FETCH_OBJECT = (1 << 2),
>> +
>> +	/* Die if object corruption (not just an object being missing) was detected. */
>> +	OBJECT_INFO_DIE_IF_CORRUPT = (1 << 3),
>>  
>> -/* Die if object corruption (not just an object being missing) was detected. */
>> -#define OBJECT_INFO_DIE_IF_CORRUPT 32
>> +	/*
>> +	 * This is meant for bulk prefetching of missing blobs in a partial
>> +	 * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK.
>> +	 */
>> +	OBJECT_INFO_FOR_PREFETCH = (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK),
>> +};
>>  
>>  /*
>>   * Read object info from the object database and populate the `object_info`
> 
> I wonder if this series can be restructured a bit to demonstrate the
> benefit of moving to enum a bit more prominently.  For example, even
> at the end of the three patches, odb_read_object_info_extended()
> still takes an "unsigned flags" parameter, but it is meant to take
> this new enum, isn't it?  If we do the "#define to enum" conversion
> (without renumbering) first, then "unsigned to enum", would it, with
> appropriate compiler warning flags, already reveal the existing bugs
> that happened to be working OK as potential problems?  And with that,
> fixes in 1/3 and 2/3 would demonstrate why #define to enum" is worth
> doing very well.  And after all that, we can renumber the enums in a
> separate and final step.
With -Wenum-conversion you can get GCC to report implicit conversions
between different enum types (like in the backfill case), but I don't
see a way to warn about conversions from int (the fsck case).

https://stackoverflow.com/questions/4669454/how-to-make-gcc-warn-about-passing-wrong-enum-to-a-function
suggests using -Wenum-compare and macros to sneak in a comparison, but
that doesn't seem to catch more than -Wenum-conversion, which doesn't
need any macros.

https://godbolt.org/z/Whvc7Mf1n

Perhaps sparse can do that?

René


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-26 18:02     ` René Scharfe
@ 2026-01-26 18:13       ` Junio C Hamano
  2026-01-27  6:29         ` Patrick Steinhardt
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2026-01-26 18:13 UTC (permalink / raw)
  To: René Scharfe; +Cc: Patrick Steinhardt, git

René Scharfe <l.s.r@web.de> writes:

>> I wonder if this series can be restructured a bit to demonstrate the
>> benefit of moving to enum a bit more prominently.  For example, even
>> at the end of the three patches, odb_read_object_info_extended()
>> still takes an "unsigned flags" parameter, but it is meant to take
>> this new enum, isn't it?  If we do the "#define to enum" conversion
>> (without renumbering) first, then "unsigned to enum", would it, with
>> appropriate compiler warning flags, already reveal the existing bugs
>> that happened to be working OK as potential problems?  And with that,
>> fixes in 1/3 and 2/3 would demonstrate why #define to enum" is worth
>> doing very well.  And after all that, we can renumber the enums in a
>> separate and final step.
> With -Wenum-conversion you can get GCC to report implicit conversions
> between different enum types (like in the backfill case), but I don't
> see a way to warn about conversions from int (the fsck case).

Yes, that is why I suggested "unsigned to enum" change after doing
"#define to enum" conversion.  If a caller passes an enum with
HAS_OBJECT_* to odb_read_object_info_extended() that expects
"unsigned flags", it would not be warned, but if the callee expects
"enum object_info_flags", passing HAS_OBJECT_* enum to it would be
flagged, right?  We may need to give the currently-unnamed enum with
HAS_OBJECT_* a name first.

> https://stackoverflow.com/questions/4669454/how-to-make-gcc-warn-about-passing-wrong-enum-to-a-function
> suggests using -Wenum-compare and macros to sneak in a comparison, but
> that doesn't seem to catch more than -Wenum-conversion, which doesn't
> need any macros.
>
> https://godbolt.org/z/Whvc7Mf1n
>
> Perhaps sparse can do that?
>
> René

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
@ 2026-01-26 20:17   ` Derrick Stolee
  2026-01-26 21:07     ` Junio C Hamano
  2026-02-09 19:57   ` Justin Tobler
  2026-02-10  9:24   ` Karthik Nayak
  2 siblings, 1 reply; 23+ messages in thread
From: Derrick Stolee @ 2026-01-26 20:17 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On 1/26/2026 7:17 AM, Patrick Steinhardt wrote:
> The function `fill_missing_blobs()` receives an array of object IDs and
> verifies for each of them whether the corresponding object exists. If it
> doesn't exist, we add it to a set of objects and then batch-fetch all of
> the objects at once.
> 
> The check for whether or not we already have the object is broken
> though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
> expects us to pass `HAS_OBJECT_*` flags. The flag expands to:
> 
>   - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
>     in case the object wasn't found. This makes sense, as we'd otherwise
>     reprepare the object database as many times as we have missing
>     objects.
> 
>   - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
>     not fetch the object in case it's missing. Again, this makes sense,
>     as we want to batch-fetch the objects.
> 
> This shows that we indeed want the equivalent of this flag, but of
> course represented as `HAS_OBJECT_*` flags.
> 
> Luckily, the code is already working correctly. The `OBJECT_INFO` flag
> expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
> flags. And if no flags are passed, `odb_has_object()` ends up calling
> `odb_read_object_info_extended()` with exactly the above two flags that
> we wanted to set in the first place.
> 
> Of course, this is pure luck, and this can break any moment. So let's
> fix this and correct the code to not pass any flags at all.

Absolutely the right fix for this case. Thanks!

-Stolee

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-01-26 20:17   ` Derrick Stolee
@ 2026-01-26 21:07     ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2026-01-26 21:07 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Patrick Steinhardt, git

Derrick Stolee <stolee@gmail.com> writes:

> On 1/26/2026 7:17 AM, Patrick Steinhardt wrote:
>> The function `fill_missing_blobs()` receives an array of object IDs and
>> verifies for each of them whether the corresponding object exists. If it
>> doesn't exist, we add it to a set of objects and then batch-fetch all of
>> the objects at once.
>> 
>> The check for whether or not we already have the object is broken
>> though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
>> expects us to pass `HAS_OBJECT_*` flags. The flag expands to:
>> 
>>   - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
>>     in case the object wasn't found. This makes sense, as we'd otherwise
>>     reprepare the object database as many times as we have missing
>>     objects.
>> 
>>   - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
>>     not fetch the object in case it's missing. Again, this makes sense,
>>     as we want to batch-fetch the objects.
>> 
>> This shows that we indeed want the equivalent of this flag, but of
>> course represented as `HAS_OBJECT_*` flags.
>> 
>> Luckily, the code is already working correctly. The `OBJECT_INFO` flag
>> expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
>> flags. And if no flags are passed, `odb_has_object()` ends up calling
>> `odb_read_object_info_extended()` with exactly the above two flags that
>> we wanted to set in the first place.
>> 
>> Of course, this is pure luck, and this can break any moment. So let's
>> fix this and correct the code to not pass any flags at all.
>
> Absolutely the right fix for this case. Thanks!
>
> -Stolee

Thanks, both.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-26 18:13       ` Junio C Hamano
@ 2026-01-27  6:29         ` Patrick Steinhardt
  2026-02-09 20:32           ` Justin Tobler
  0 siblings, 1 reply; 23+ messages in thread
From: Patrick Steinhardt @ 2026-01-27  6:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, git

On Mon, Jan 26, 2026 at 10:13:51AM -0800, Junio C Hamano wrote:
> René Scharfe <l.s.r@web.de> writes:
> 
> >> I wonder if this series can be restructured a bit to demonstrate the
> >> benefit of moving to enum a bit more prominently.  For example, even
> >> at the end of the three patches, odb_read_object_info_extended()
> >> still takes an "unsigned flags" parameter, but it is meant to take
> >> this new enum, isn't it?  If we do the "#define to enum" conversion
> >> (without renumbering) first, then "unsigned to enum", would it, with
> >> appropriate compiler warning flags, already reveal the existing bugs
> >> that happened to be working OK as potential problems?  And with that,
> >> fixes in 1/3 and 2/3 would demonstrate why #define to enum" is worth
> >> doing very well.  And after all that, we can renumber the enums in a
> >> separate and final step.
> > With -Wenum-conversion you can get GCC to report implicit conversions
> > between different enum types (like in the backfill case), but I don't
> > see a way to warn about conversions from int (the fsck case).
> 
> Yes, that is why I suggested "unsigned to enum" change after doing
> "#define to enum" conversion.  If a caller passes an enum with
> HAS_OBJECT_* to odb_read_object_info_extended() that expects
> "unsigned flags", it would not be warned, but if the callee expects
> "enum object_info_flags", passing HAS_OBJECT_* enum to it would be
> flagged, right?  We may need to give the currently-unnamed enum with
> HAS_OBJECT_* a name first.

You can get it to generate a warning for one of the callsites:

    ../builtin/backfill.c:71:9: error: implicit conversion from enumeration type 'enum odb_object_info_flag' to different enumeration type 'enum odb_has_object_flag' [-Werror,-Wenum-conversion]
       70 |                 if (!odb_has_object(ctx->repo->objects, &list->oid[i],
          |                      ~~~~~~~~~~~~~~
       71 |                                     OBJECT_INFO_FOR_PREFETCH))
          |                                     ^~~~~~~~~~~~~~~~~~~~~~~~
    1 error generated.

Unfortunately, the other callsite wouldn't see a warning because we pass
an integer constant, and the compiler doesn't complain about that at
all. It also falls apart once you start to OR multiple flags together.

It would be great if there was a way to tell the compiler that a given
flags field expects only enum values so that it could always warn about
misuse. But I'm not aware of any way to do this.

We could of course start to take a more heavy-handed approach and always
accept an options struct instead. E.g.

    struct odb_read_object_info_options {
            unsigned lookup_replace : 1,
                     quick : 1,
                     skip_fetch_object : 1,
                     for_prefetch : 1,
                     die_if_corrupt : 1;
    };

That would give us full type safety, and it would be impossible to
misuse without getting a compiler warning. Furthermore, with designated
initializers it wouldn't be _that_ awful to use:

	if (!odb_read_object_extended(ctx->repo->objects, &list->oid[i],
				      (struct odb_read_object_info_options) {
		.skip_fetch_object = 1,
	}) < 0) {
		die("...");
	}

But I wouldn't exactly call it ergonomic, either.

So I'm not sure whether this partial protection would be worth it, but
if you think it is I'm happy to reroll.

Thanks!

Patrick

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
  2026-01-26 20:17   ` Derrick Stolee
@ 2026-02-09 19:57   ` Justin Tobler
  2026-02-10  9:24   ` Karthik Nayak
  2 siblings, 0 replies; 23+ messages in thread
From: Justin Tobler @ 2026-02-09 19:57 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/01/26 01:17PM, Patrick Steinhardt wrote:
> The function `fill_missing_blobs()` receives an array of object IDs and
> verifies for each of them whether the corresponding object exists. If it
> doesn't exist, we add it to a set of objects and then batch-fetch all of
> the objects at once.
> 
> The check for whether or not we already have the object is broken
> though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
> expects us to pass `HAS_OBJECT_*` flags.

Ok so the flag we are passing to `odb_has_object()` here is not from the
expected set.

> The flag expands to:
> 
>   - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
>     in case the object wasn't found. This makes sense, as we'd otherwise
>     reprepare the object database as many times as we have missing
>     objects.
> 
>   - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
>     not fetch the object in case it's missing. Again, this makes sense,
>     as we want to batch-fetch the objects.
> 
> This shows that we indeed want the equivalent of this flag, but of
> course represented as `HAS_OBJECT_*` flags.
> 
> Luckily, the code is already working correctly. The `OBJECT_INFO` flag
> expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
> flags. And if no flags are passed, `odb_has_object()` ends up calling
> `odb_read_object_info_extended()` with exactly the above two flags that
> we wanted to set in the first place.

Lucky indeed.

> Of course, this is pure luck, and this can break any moment. So let's
> fix this and correct the code to not pass any flags at all.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  builtin/backfill.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/builtin/backfill.c b/builtin/backfill.c
> index e80fc1b694..d8cb3b0eba 100644
> --- a/builtin/backfill.c
> +++ b/builtin/backfill.c
> @@ -67,8 +67,7 @@ static int fill_missing_blobs(const char *path UNUSED,
>  		return 0;
>  
>  	for (size_t i = 0; i < list->nr; i++) {
> -		if (!odb_has_object(ctx->repo->objects, &list->oid[i],
> -				    OBJECT_INFO_FOR_PREFETCH))
> +		if (!odb_has_object(ctx->repo->objects, &list->oid[i], 0))

By passing 0 as the flag value here, the underlying
`odb_read_object_info_extended()` gets both the `OBJECT_INFO_QUICK` and
`OBJECT_INFO_SKIP_FETCH_OBJECT` flags set. This is exactly what we want
so there is no need to add an additional `HAS_OBJECT_*` flag. Looks
good.

-Justin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3] builtin/fsck: fix flags passed to `odb_has_object()`
  2026-01-26 12:17 ` [PATCH 2/3] builtin/fsck: " Patrick Steinhardt
@ 2026-02-09 20:04   ` Justin Tobler
  0 siblings, 0 replies; 23+ messages in thread
From: Justin Tobler @ 2026-02-09 20:04 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/01/26 01:17PM, Patrick Steinhardt wrote:
> In `mark_object()` we invoke `has_object()` with a value of 1. This is
> somewhat fishy given that the function expects a bitset of flags, so any
> behaviour that this results in is purely coincidental and may break at
> any point in time.
> 
> The call to `has_object()` was originally introduced in 9eb86f41de
> (fsck: do not lazy fetch known non-promisor object, 2020-08-05). The
> intent here was to skip lazy fetches of promisor objects: we have
> already verified that the object is not a promisor object, so if the
> object is missing it indicates a corrupt repository.
> 
> The hardcoded value that we pass maps to `HAS_OBJECT_RECHECK_PACKED`,
> which is probably the intended behaviour: `odb_has_object()` will not
> fetch promisor objects unless `HAS_OBJECT_FETCH_PROMISOR` is passed, but
> we may want to verify that no concurrent process has written the object
> that we're trying to read.

As you mentioned, promisor objects are not fetched unless
`HAS_OBJECT_FETCH_PROMISOR` is passed and in this case a flag value of 1
maps only to the `HAS_OBJECT_RECHECK_PACKED` flag. This certainly seems
like the intended option.

> Convert the code to use the named flag instead of the the hardcoded
> value.

Makes sense, this patch looks good to me.

-Justin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-26 12:17 ` [PATCH 3/3] odb: drop gaps in object info flag values Patrick Steinhardt
  2026-01-26 16:58   ` Junio C Hamano
@ 2026-02-09 20:18   ` Justin Tobler
  1 sibling, 0 replies; 23+ messages in thread
From: Justin Tobler @ 2026-02-09 20:18 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 26/01/26 01:17PM, Patrick Steinhardt wrote:
> The object info flag values have a two gaps in their definitions, where
> some bits are skipped over. These gaps don't really hurt, but it makes
> one wonder whether anything is going on and whether a subset of flags
> might be defined somewhere else.
> 
> That's not the case though. Instead, this is a case of flags that have
> been dropped in the past:
> 
>   - The value 4 was used by `OBJECT_INFO_SKIP_CACHED`, removed in
>     9c8a294a1a (sha1-file: remove OBJECT_INFO_SKIP_CACHED, 2020-01-02).
> 
>   - The value 8 was used by `OBJECT_INFO_ALLOW_UNKNOWN_TYPE`, removed in
>     ae24b032a0 (object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag,
>     2025-05-16).
> 
> Close those gaps to avoid any more confusion. While at it, convert the
> flags to be declared as an enum and use bit shifts to follow modern best
> practices.

Makes sense.

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb.h | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/odb.h b/odb.h
> index bab07755f4..1e4326b7f4 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -352,23 +352,29 @@ struct object_info {
>   */
>  #define OBJECT_INFO_INIT { 0 }
>  
> -/* Invoke lookup_replace_object() on the given hash */
> -#define OBJECT_INFO_LOOKUP_REPLACE 1
> -/* Do not retry packed storage after checking packed and loose storage */
> -#define OBJECT_INFO_QUICK 8
> -/*
> - * Do not attempt to fetch the object if missing (even if fetch_is_missing is
> - * nonzero).
> - */
> -#define OBJECT_INFO_SKIP_FETCH_OBJECT 16
> -/*
> - * This is meant for bulk prefetching of missing blobs in a partial
> - * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
> - */
> -#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
> +/* Flags that can be passed to `odb_read_object_info_extended()`. */
> +enum object_info_flags {
> +	/* Invoke lookup_replace_object() on the given hash. */
> +	OBJECT_INFO_LOOKUP_REPLACE = (1 << 0),
> +
> +	/* Do not reprepare object sources when the first lookup has failed. */
> +	OBJECT_INFO_QUICK = (1 << 1),
> +
> +	/*
> +	 * Do not attempt to fetch the object if missing (even if fetch_is_missing is
> +	 * nonzero).
> +	 */
> +	OBJECT_INFO_SKIP_FETCH_OBJECT = (1 << 2),
> +
> +	/* Die if object corruption (not just an object being missing) was detected. */
> +	OBJECT_INFO_DIE_IF_CORRUPT = (1 << 3),
>  
> -/* Die if object corruption (not just an object being missing) was detected. */
> -#define OBJECT_INFO_DIE_IF_CORRUPT 32
> +	/*
> +	 * This is meant for bulk prefetching of missing blobs in a partial
> +	 * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK.
> +	 */
> +	OBJECT_INFO_FOR_PREFETCH = (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK),

The changes here all look obviously correct to me. Looks good.

-Justin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/3] odb: drop gaps in object info flag values
  2026-01-27  6:29         ` Patrick Steinhardt
@ 2026-02-09 20:32           ` Justin Tobler
  0 siblings, 0 replies; 23+ messages in thread
From: Justin Tobler @ 2026-02-09 20:32 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, René Scharfe, git

On 26/01/27 07:29AM, Patrick Steinhardt wrote:
> Unfortunately, the other callsite wouldn't see a warning because we pass
> an integer constant, and the compiler doesn't complain about that at
> all. It also falls apart once you start to OR multiple flags together.

I guess we could have enum values to each of the combinations that get
used together, but that problably isn't a great idea if we use many
different combinations and may not be good in the long term.

> It would be great if there was a way to tell the compiler that a given
> flags field expects only enum values so that it could always warn about
> misuse. But I'm not aware of any way to do this.

I agree with the sentiment. Passing a combination of enum values as
unsigned flags is a bit fragile and in some ways feels like it defeats
the point of using enums to begin with. Also when using a function that
accepts flags, it is not always immediately obvious which set of flags
are expected.

> We could of course start to take a more heavy-handed approach and always
> accept an options struct instead. E.g.
> 
>     struct odb_read_object_info_options {
>             unsigned lookup_replace : 1,
>                      quick : 1,
>                      skip_fetch_object : 1,
>                      for_prefetch : 1,
>                      die_if_corrupt : 1;
>     };
> 
> That would give us full type safety, and it would be impossible to
> misuse without getting a compiler warning. Furthermore, with designated
> initializers it wouldn't be _that_ awful to use:
> 
> 	if (!odb_read_object_extended(ctx->repo->objects, &list->oid[i],
> 				      (struct odb_read_object_info_options) {
> 		.skip_fetch_object = 1,
> 	}) < 0) {
> 		die("...");
> 	}
> 
> But I wouldn't exactly call it ergonomic, either.

This would certainly be the most safe option, but I also agree that it
not particually ergonomic. I'm not sure it's worth going this far as
long as it's documented/easy-to-find the corresponding set of flags.

> So I'm not sure whether this partial protection would be worth it, but
> if you think it is I'm happy to reroll.

I think this version of series is good and doesn't need a reroll.

Thanks,
-Justin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
  2026-01-26 20:17   ` Derrick Stolee
  2026-02-09 19:57   ` Justin Tobler
@ 2026-02-10  9:24   ` Karthik Nayak
  2026-02-10  9:32     ` Karthik Nayak
  2 siblings, 1 reply; 23+ messages in thread
From: Karthik Nayak @ 2026-02-10  9:24 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 2048 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> The function `fill_missing_blobs()` receives an array of object IDs and
> verifies for each of them whether the corresponding object exists. If it
> doesn't exist, we add it to a set of objects and then batch-fetch all of
> the objects at once.
>
> The check for whether or not we already have the object is broken
> though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
> expects us to pass `HAS_OBJECT_*` flags. The flag expands to:
>
>   - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
>     in case the object wasn't found. This makes sense, as we'd otherwise
>     reprepare the object database as many times as we have missing
>     objects.
>
>   - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
>     not fetch the object in case it's missing. Again, this makes sense,
>     as we want to batch-fetch the objects.
>
> This shows that we indeed want the equivalent of this flag, but of
> course represented as `HAS_OBJECT_*` flags.
>
> Luckily, the code is already working correctly. The `OBJECT_INFO` flag
> expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
> flags. And if no flags are passed, `odb_has_object()` ends up calling
> `odb_read_object_info_extended()` with exactly the above two flags that
> we wanted to set in the first place.
>
> Of course, this is pure luck, and this can break any moment. So let's
> fix this and correct the code to not pass any flags at all.
>

We do pass the same equivalent, no? I mean `OBJECT_INFO_FOR_PREFETCH`
does resolve to `OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK` and
calling `odb_has_object(... , 0)` would also eventually set the same
flags.

I understand the issue, `odb_has_object()` should only take in
`HAS_OBJECT_*` flags, even though internally it converts them to
`OBJECT_INFO_*` flags.

Wouldn't it also be nicer to convert the enum for `HAS_OBJECT_*` to no
longer be anonymous and use that in `odb_has_object()`?

The patch itself looks good!

- Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-02-10  9:24   ` Karthik Nayak
@ 2026-02-10  9:32     ` Karthik Nayak
  0 siblings, 0 replies; 23+ messages in thread
From: Karthik Nayak @ 2026-02-10  9:32 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 2273 bytes --]

Karthik Nayak <karthik.188@gmail.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> The function `fill_missing_blobs()` receives an array of object IDs and
>> verifies for each of them whether the corresponding object exists. If it
>> doesn't exist, we add it to a set of objects and then batch-fetch all of
>> the objects at once.
>>
>> The check for whether or not we already have the object is broken
>> though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
>> expects us to pass `HAS_OBJECT_*` flags. The flag expands to:
>>
>>   - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
>>     in case the object wasn't found. This makes sense, as we'd otherwise
>>     reprepare the object database as many times as we have missing
>>     objects.
>>
>>   - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
>>     not fetch the object in case it's missing. Again, this makes sense,
>>     as we want to batch-fetch the objects.
>>
>> This shows that we indeed want the equivalent of this flag, but of
>> course represented as `HAS_OBJECT_*` flags.
>>
>> Luckily, the code is already working correctly. The `OBJECT_INFO` flag
>> expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
>> flags. And if no flags are passed, `odb_has_object()` ends up calling
>> `odb_read_object_info_extended()` with exactly the above two flags that
>> we wanted to set in the first place.
>>
>> Of course, this is pure luck, and this can break any moment. So let's
>> fix this and correct the code to not pass any flags at all.
>>
>
> We do pass the same equivalent, no? I mean `OBJECT_INFO_FOR_PREFETCH`
> does resolve to `OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK` and
> calling `odb_has_object(... , 0)` would also eventually set the same
> flags.
>
> I understand the issue, `odb_has_object()` should only take in
> `HAS_OBJECT_*` flags, even though internally it converts them to
> `OBJECT_INFO_*` flags.
>
> Wouldn't it also be nicer to convert the enum for `HAS_OBJECT_*` to no
> longer be anonymous and use that in `odb_has_object()`?
>
> The patch itself looks good!
>
> - Karthik

Just noticed that there is a similar discussion on the other patch in
this series. So will drop the discussion here.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 0/5] Small fixups for `OBJECT_INFO` flags
  2026-01-26 12:17 [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2026-01-26 16:28 ` [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Junio C Hamano
@ 2026-02-12  6:59 ` Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 1/5] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
                     ` (4 more replies)
  4 siblings, 5 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-02-12  6:59 UTC (permalink / raw)
  To: git

Hi,

I was kind of curious why there were gaps in the `OBJECT_INFO_*` flags,
but eventually found out that these gaps are of historic nature: there
used to be more flags, but their respective values got removed at one
point in time. So naturally, I wanted to clean this up a bit so that the
next reader wouldn't have the same question.

Surprisingly though I found out that this breaks tests, which of course
puzzled me. As it turns out though, we were incorrectly using a couple
of these flags for `odb_has_object()`, and the changed definitions had
overlap with the existing meaning of other `HAS_OBJECT_*` flags. There
isn't really any bug here as far as I can see, but this is only really
by chance.

In any case, the first two commits fix calls to `odb_has_object()` that
used invalid flags. The last commit then removes the gaps and converts
the flags to use an enum instead.

Changes in v2:
  - Add two patches on top that convert the object info and
    `odb_has_object()` flags into enums.
  - Link to v1: https://lore.kernel.org/r/20260126-b4-pks-read-object-info-flags-v1-0-e682a003b17c@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (5):
      builtin/backfill: fix flags passed to `odb_has_object()`
      builtin/fsck: fix flags passed to `odb_has_object()`
      odb: drop gaps in object info flag values
      odb: convert object info flags into an enum
      odb: convert `odb_has_object()` flags into an enum

 builtin/backfill.c |  3 +--
 builtin/fsck.c     |  3 ++-
 object-file.c      |  3 ++-
 object-file.h      |  3 ++-
 odb.c              |  4 ++--
 odb.h              | 44 +++++++++++++++++++++++++-------------------
 packfile.c         |  2 +-
 packfile.h         |  2 +-
 8 files changed, 36 insertions(+), 28 deletions(-)

Range-diff versus v1:

1:  eec721a55b = 1:  308963f244 builtin/backfill: fix flags passed to `odb_has_object()`
2:  317893853b = 2:  7a69a648bb builtin/fsck: fix flags passed to `odb_has_object()`
3:  c785043a72 < -:  ---------- odb: drop gaps in object info flag values
-:  ---------- > 3:  0cea7f03f3 odb: drop gaps in object info flag values
-:  ---------- > 4:  ab98547370 odb: convert object info flags into an enum
-:  ---------- > 5:  414dd30e14 odb: convert `odb_has_object()` flags into an enum

---
base-commit: ea24e2c55433012a0a6c4ae947a87bc66404e484
change-id: 20260126-b4-pks-read-object-info-flags-236c4437cfc5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/5] builtin/backfill: fix flags passed to `odb_has_object()`
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
@ 2026-02-12  6:59   ` Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 2/5] builtin/fsck: " Patrick Steinhardt
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-02-12  6:59 UTC (permalink / raw)
  To: git

The function `fill_missing_blobs()` receives an array of object IDs and
verifies for each of them whether the corresponding object exists. If it
doesn't exist, we add it to a set of objects and then batch-fetch all of
the objects at once.

The check for whether or not we already have the object is broken
though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
expects us to pass `HAS_OBJECT_*` flags. The flag expands to:

  - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
    in case the object wasn't found. This makes sense, as we'd otherwise
    reprepare the object database as many times as we have missing
    objects.

  - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
    not fetch the object in case it's missing. Again, this makes sense,
    as we want to batch-fetch the objects.

This shows that we indeed want the equivalent of this flag, but of
course represented as `HAS_OBJECT_*` flags.

Luckily, the code is already working correctly. The `OBJECT_INFO` flag
expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
flags. And if no flags are passed, `odb_has_object()` ends up calling
`odb_read_object_info_extended()` with exactly the above two flags that
we wanted to set in the first place.

Of course, this is pure luck, and this can break any moment. So let's
fix this and correct the code to not pass any flags at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/backfill.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/builtin/backfill.c b/builtin/backfill.c
index e80fc1b694..d8cb3b0eba 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -67,8 +67,7 @@ static int fill_missing_blobs(const char *path UNUSED,
 		return 0;
 
 	for (size_t i = 0; i < list->nr; i++) {
-		if (!odb_has_object(ctx->repo->objects, &list->oid[i],
-				    OBJECT_INFO_FOR_PREFETCH))
+		if (!odb_has_object(ctx->repo->objects, &list->oid[i], 0))
 			oid_array_append(&ctx->current_batch, &list->oid[i]);
 	}
 

-- 
2.53.0.295.g64333814d3.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/5] builtin/fsck: fix flags passed to `odb_has_object()`
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 1/5] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
@ 2026-02-12  6:59   ` Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 3/5] odb: drop gaps in object info flag values Patrick Steinhardt
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-02-12  6:59 UTC (permalink / raw)
  To: git

In `mark_object()` we invoke `has_object()` with a value of 1. This is
somewhat fishy given that the function expects a bitset of flags, so any
behaviour that this results in is purely coincidental and may break at
any point in time.

The call to `has_object()` was originally introduced in 9eb86f41de
(fsck: do not lazy fetch known non-promisor object, 2020-08-05). The
intent here was to skip lazy fetches of promisor objects: we have
already verified that the object is not a promisor object, so if the
object is missing it indicates a corrupt repository.

The hardcoded value that we pass maps to `HAS_OBJECT_RECHECK_PACKED`,
which is probably the intended behaviour: `odb_has_object()` will not
fetch promisor objects unless `HAS_OBJECT_FETCH_PROMISOR` is passed, but
we may want to verify that no concurrent process has written the object
that we're trying to read.

Convert the code to use the named flag instead of the the hardcoded
value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0512f78a87..1d059dd6c2 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -162,7 +162,8 @@ static int mark_object(struct object *obj, enum object_type type,
 		return 0;
 
 	if (!(obj->flags & HAS_OBJ)) {
-		if (parent && !odb_has_object(the_repository->objects, &obj->oid, 1)) {
+		if (parent && !odb_has_object(the_repository->objects, &obj->oid,
+					      HAS_OBJECT_RECHECK_PACKED)) {
 			printf_ln(_("broken link from %7s %s\n"
 				    "              to %7s %s"),
 				  printable_type(&parent->oid, parent->type),

-- 
2.53.0.295.g64333814d3.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/5] odb: drop gaps in object info flag values
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 1/5] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 2/5] builtin/fsck: " Patrick Steinhardt
@ 2026-02-12  6:59   ` Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 4/5] odb: convert object info flags into an enum Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 5/5] odb: convert `odb_has_object()` " Patrick Steinhardt
  4 siblings, 0 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-02-12  6:59 UTC (permalink / raw)
  To: git

The object info flag values have a two gaps in their definitions, where
some bits are skipped over. These gaps don't really hurt, but it makes
one wonder whether anything is going on and whether a subset of flags
might be defined somewhere else.

That's not the case though. Instead, this is a case of flags that have
been dropped in the past:

  - The value 4 was used by `OBJECT_INFO_SKIP_CACHED`, removed in
    9c8a294a1a (sha1-file: remove OBJECT_INFO_SKIP_CACHED, 2020-01-02).

  - The value 8 was used by `OBJECT_INFO_ALLOW_UNKNOWN_TYPE`, removed in
    ae24b032a0 (object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag,
    2025-05-16).

Close those gaps to avoid any more confusion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/odb.h b/odb.h
index bab07755f4..8e1fca7755 100644
--- a/odb.h
+++ b/odb.h
@@ -353,14 +353,14 @@ struct object_info {
 #define OBJECT_INFO_INIT { 0 }
 
 /* Invoke lookup_replace_object() on the given hash */
-#define OBJECT_INFO_LOOKUP_REPLACE 1
+#define OBJECT_INFO_LOOKUP_REPLACE (1 << 0)
 /* Do not retry packed storage after checking packed and loose storage */
-#define OBJECT_INFO_QUICK 8
+#define OBJECT_INFO_QUICK (1 << 1)
 /*
  * Do not attempt to fetch the object if missing (even if fetch_is_missing is
  * nonzero).
  */
-#define OBJECT_INFO_SKIP_FETCH_OBJECT 16
+#define OBJECT_INFO_SKIP_FETCH_OBJECT (1 << 2)
 /*
  * This is meant for bulk prefetching of missing blobs in a partial
  * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
@@ -368,7 +368,7 @@ struct object_info {
 #define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
 
 /* Die if object corruption (not just an object being missing) was detected. */
-#define OBJECT_INFO_DIE_IF_CORRUPT 32
+#define OBJECT_INFO_DIE_IF_CORRUPT (1 << 3)
 
 /*
  * Read object info from the object database and populate the `object_info`

-- 
2.53.0.295.g64333814d3.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/5] odb: convert object info flags into an enum
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2026-02-12  6:59   ` [PATCH v2 3/5] odb: drop gaps in object info flag values Patrick Steinhardt
@ 2026-02-12  6:59   ` Patrick Steinhardt
  2026-02-12  6:59   ` [PATCH v2 5/5] odb: convert `odb_has_object()` " Patrick Steinhardt
  4 siblings, 0 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-02-12  6:59 UTC (permalink / raw)
  To: git

Convert the object info flags into an enum and adapt all functions that
receive these flags as parameters to use the enum instead of an integer.
This serves two purposes:

  - The function signatures become more self-documenting, as callers
    don't have to wonder which flags they expect.

  - The compiler can warn when a wrong flag type is passed.

Note that the second benefit is somewhat limited. For example, when
or-ing multiple enum flags together the result will be an integer, and
the compiler will not warn about such use cases. But where it does help
is when a single flag of the wrong type is passed, as the compiler would
generate a warning in that case.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 object-file.c |  3 ++-
 object-file.h |  3 ++-
 odb.c         |  2 +-
 odb.h         | 40 +++++++++++++++++++++++-----------------
 packfile.c    |  2 +-
 packfile.h    |  2 +-
 6 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/object-file.c b/object-file.c
index e7e4c3348f..0ab6c4d4f3 100644
--- a/object-file.c
+++ b/object-file.c
@@ -414,7 +414,8 @@ static int parse_loose_header(const char *hdr, struct object_info *oi)
 
 int odb_source_loose_read_object_info(struct odb_source *source,
 				      const struct object_id *oid,
-				      struct object_info *oi, int flags)
+				      struct object_info *oi,
+				      enum object_info_flags flags)
 {
 	int ret;
 	int fd;
diff --git a/object-file.h b/object-file.h
index 1229d5f675..cdb54b5218 100644
--- a/object-file.h
+++ b/object-file.h
@@ -47,7 +47,8 @@ void odb_source_loose_reprepare(struct odb_source *source);
 
 int odb_source_loose_read_object_info(struct odb_source *source,
 				      const struct object_id *oid,
-				      struct object_info *oi, int flags);
+				      struct object_info *oi,
+				      enum object_info_flags flags);
 
 int odb_source_loose_read_object_stream(struct odb_read_stream **out,
 					struct odb_source *source,
diff --git a/odb.c b/odb.c
index ac70b6a099..d437aa8b06 100644
--- a/odb.c
+++ b/odb.c
@@ -842,7 +842,7 @@ static int oid_object_info_convert(struct repository *r,
 int odb_read_object_info_extended(struct object_database *odb,
 				  const struct object_id *oid,
 				  struct object_info *oi,
-				  unsigned flags)
+				  enum object_info_flags flags)
 {
 	int ret;
 
diff --git a/odb.h b/odb.h
index 8e1fca7755..e94cdc3665 100644
--- a/odb.h
+++ b/odb.h
@@ -352,23 +352,29 @@ struct object_info {
  */
 #define OBJECT_INFO_INIT { 0 }
 
-/* Invoke lookup_replace_object() on the given hash */
-#define OBJECT_INFO_LOOKUP_REPLACE (1 << 0)
-/* Do not retry packed storage after checking packed and loose storage */
-#define OBJECT_INFO_QUICK (1 << 1)
-/*
- * Do not attempt to fetch the object if missing (even if fetch_is_missing is
- * nonzero).
- */
-#define OBJECT_INFO_SKIP_FETCH_OBJECT (1 << 2)
-/*
- * This is meant for bulk prefetching of missing blobs in a partial
- * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
- */
-#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
+/* Flags that can be passed to `odb_read_object_info_extended()`. */
+enum object_info_flags {
+	/* Invoke lookup_replace_object() on the given hash. */
+	OBJECT_INFO_LOOKUP_REPLACE = (1 << 0),
+
+	/* Do not reprepare object sources when the first lookup has failed. */
+	OBJECT_INFO_QUICK = (1 << 1),
+
+	/*
+	 * Do not attempt to fetch the object if missing (even if fetch_is_missing is
+	 * nonzero).
+	 */
+	OBJECT_INFO_SKIP_FETCH_OBJECT = (1 << 2),
+
+	/* Die if object corruption (not just an object being missing) was detected. */
+	OBJECT_INFO_DIE_IF_CORRUPT = (1 << 3),
 
-/* Die if object corruption (not just an object being missing) was detected. */
-#define OBJECT_INFO_DIE_IF_CORRUPT (1 << 3)
+	/*
+	 * This is meant for bulk prefetching of missing blobs in a partial
+	 * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK.
+	 */
+	OBJECT_INFO_FOR_PREFETCH = (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK),
+};
 
 /*
  * Read object info from the object database and populate the `object_info`
@@ -377,7 +383,7 @@ struct object_info {
 int odb_read_object_info_extended(struct object_database *odb,
 				  const struct object_id *oid,
 				  struct object_info *oi,
-				  unsigned flags);
+				  enum object_info_flags flags);
 
 /*
  * Read a subset of object info for the given object ID. Returns an `enum
diff --git a/packfile.c b/packfile.c
index 402c3b5dc7..cb418846ae 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2149,7 +2149,7 @@ int packfile_store_freshen_object(struct packfile_store *store,
 int packfile_store_read_object_info(struct packfile_store *store,
 				    const struct object_id *oid,
 				    struct object_info *oi,
-				    unsigned flags UNUSED)
+				    enum object_info_flags flags UNUSED)
 {
 	struct pack_entry e;
 	int ret;
diff --git a/packfile.h b/packfile.h
index acc5c55ad5..989fd10cb6 100644
--- a/packfile.h
+++ b/packfile.h
@@ -247,7 +247,7 @@ int packfile_store_read_object_stream(struct odb_read_stream **out,
 int packfile_store_read_object_info(struct packfile_store *store,
 				    const struct object_id *oid,
 				    struct object_info *oi,
-				    unsigned flags);
+				    enum object_info_flags flags);
 
 /*
  * Open the packfile and add it to the store if it isn't yet known. Returns

-- 
2.53.0.295.g64333814d3.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/5] odb: convert `odb_has_object()` flags into an enum
  2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2026-02-12  6:59   ` [PATCH v2 4/5] odb: convert object info flags into an enum Patrick Steinhardt
@ 2026-02-12  6:59   ` Patrick Steinhardt
  4 siblings, 0 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2026-02-12  6:59 UTC (permalink / raw)
  To: git

Following the reason in the preceding commit, convert the
`odb_has_object()` flags into an enum.

With this change, we would have catched the misuse of `odb_has_object()`
that was fixed in a preceding commit as the compiler would have
generated a warning:

  ../builtin/backfill.c:71:9: error: implicit conversion from enumeration type 'enum odb_object_info_flag' to different enumeration type 'enum odb_has_object_flag' [-Werror,-Wenum-conversion]
     70 |                 if (!odb_has_object(ctx->repo->objects, &list->oid[i],
        |                      ~~~~~~~~~~~~~~
     71 |                                     OBJECT_INFO_FOR_PREFETCH))
        |                                     ^~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb.c | 2 +-
 odb.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/odb.c b/odb.c
index d437aa8b06..2bbbfb344a 100644
--- a/odb.c
+++ b/odb.c
@@ -964,7 +964,7 @@ void *odb_read_object_peeled(struct object_database *odb,
 }
 
 int odb_has_object(struct object_database *odb, const struct object_id *oid,
-	       unsigned flags)
+		   enum has_object_flags flags)
 {
 	unsigned object_info_flags = 0;
 
diff --git a/odb.h b/odb.h
index e94cdc3665..f7368827ac 100644
--- a/odb.h
+++ b/odb.h
@@ -395,7 +395,7 @@ int odb_read_object_info(struct object_database *odb,
 			 const struct object_id *oid,
 			 unsigned long *sizep);
 
-enum {
+enum has_object_flags {
 	/* Retry packed storage after checking packed and loose storage */
 	HAS_OBJECT_RECHECK_PACKED = (1 << 0),
 	/* Allow fetching the object in case the repository has a promisor remote. */
@@ -408,7 +408,7 @@ enum {
  */
 int odb_has_object(struct object_database *odb,
 		   const struct object_id *oid,
-		   unsigned flags);
+		   enum has_object_flags flags);
 
 int odb_freshen_object(struct object_database *odb,
 		       const struct object_id *oid);

-- 
2.53.0.295.g64333814d3.dirty


^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2026-02-12  7:00 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-26 12:17 [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Patrick Steinhardt
2026-01-26 12:17 ` [PATCH 1/3] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
2026-01-26 20:17   ` Derrick Stolee
2026-01-26 21:07     ` Junio C Hamano
2026-02-09 19:57   ` Justin Tobler
2026-02-10  9:24   ` Karthik Nayak
2026-02-10  9:32     ` Karthik Nayak
2026-01-26 12:17 ` [PATCH 2/3] builtin/fsck: " Patrick Steinhardt
2026-02-09 20:04   ` Justin Tobler
2026-01-26 12:17 ` [PATCH 3/3] odb: drop gaps in object info flag values Patrick Steinhardt
2026-01-26 16:58   ` Junio C Hamano
2026-01-26 18:02     ` René Scharfe
2026-01-26 18:13       ` Junio C Hamano
2026-01-27  6:29         ` Patrick Steinhardt
2026-02-09 20:32           ` Justin Tobler
2026-02-09 20:18   ` Justin Tobler
2026-01-26 16:28 ` [PATCH 0/3] Small fixups for `OBJECT_INFO` flags Junio C Hamano
2026-02-12  6:59 ` [PATCH v2 0/5] " Patrick Steinhardt
2026-02-12  6:59   ` [PATCH v2 1/5] builtin/backfill: fix flags passed to `odb_has_object()` Patrick Steinhardt
2026-02-12  6:59   ` [PATCH v2 2/5] builtin/fsck: " Patrick Steinhardt
2026-02-12  6:59   ` [PATCH v2 3/5] odb: drop gaps in object info flag values Patrick Steinhardt
2026-02-12  6:59   ` [PATCH v2 4/5] odb: convert object info flags into an enum Patrick Steinhardt
2026-02-12  6:59   ` [PATCH v2 5/5] odb: convert `odb_has_object()` " Patrick Steinhardt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox