* [PATCH] patch-ids: achieve const correctness in patch_id_neq()
@ 2026-03-08 4:31 Tian Yuchen
2026-03-08 6:26 ` Junio C Hamano
2026-03-08 15:02 ` [PATCH v2] patch-ids: document intentional const-casting " Tian Yuchen
0 siblings, 2 replies; 7+ messages in thread
From: Tian Yuchen @ 2026-03-08 4:31 UTC (permalink / raw)
To: git; +Cc: gitster, Tian Yuchen
The implementation of the 'contain_of' macro in 'patch_id_neq()' is:
#define container_of(ptr, type, member) \
((type *) ((char *)(ptr) - offsetof(type, member)))
Here, 'type' is passed as a raw type with no const information.
Consequently, const correctness cannot be guaranteed here, resulting
in an eight-year-long NEEDSWORK comment.
Use explicit casting (struct object_id *) to ensure const correctness.
Signed-off-by: Tian Yuchen <a3205153416@gmail.com>
---
patch-ids.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/patch-ids.c b/patch-ids.c
index a5683b462c..e2d29e9dbb 100644
--- a/patch-ids.c
+++ b/patch-ids.c
@@ -41,19 +41,18 @@ static int patch_id_neq(const void *cmpfn_data,
const struct hashmap_entry *entry_or_key,
const void *keydata UNUSED)
{
- /* NEEDSWORK: const correctness? */
- struct diff_options *opt = (void *)cmpfn_data;
- struct patch_id *a, *b;
+ struct diff_options *opt = (struct diff_options *)cmpfn_data;
+ const struct patch_id *a, *b;
- a = container_of(eptr, struct patch_id, ent);
- b = container_of(entry_or_key, struct patch_id, ent);
+ a = container_of(eptr, const struct patch_id, ent);
+ b = container_of(entry_or_key, const struct patch_id, ent);
if (is_null_oid(&a->patch_id) &&
- commit_patch_id(a->commit, opt, &a->patch_id, 0))
+ commit_patch_id(a->commit, opt, (struct object_id *)&a->patch_id, 0))
return error("Could not get patch ID for %s",
oid_to_hex(&a->commit->object.oid));
if (is_null_oid(&b->patch_id) &&
- commit_patch_id(b->commit, opt, &b->patch_id, 0))
+ commit_patch_id(b->commit, opt, (struct object_id *)&b->patch_id, 0))
return error("Could not get patch ID for %s",
oid_to_hex(&b->commit->object.oid));
return !oideq(&a->patch_id, &b->patch_id);
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] patch-ids: achieve const correctness in patch_id_neq()
2026-03-08 4:31 [PATCH] patch-ids: achieve const correctness in patch_id_neq() Tian Yuchen
@ 2026-03-08 6:26 ` Junio C Hamano
2026-03-08 14:42 ` Tian Yuchen
2026-03-08 15:02 ` [PATCH v2] patch-ids: document intentional const-casting " Tian Yuchen
1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2026-03-08 6:26 UTC (permalink / raw)
To: Tian Yuchen; +Cc: git
Tian Yuchen <a3205153416@gmail.com> writes:
> The implementation of the 'contain_of' macro in 'patch_id_neq()' is:
>
> #define container_of(ptr, type, member) \
> ((type *) ((char *)(ptr) - offsetof(type, member)))
>
> Here, 'type' is passed as a raw type with no const information.
> Consequently, const correctness cannot be guaranteed here, resulting
> in an eight-year-long NEEDSWORK comment.
>
> Use explicit casting (struct object_id *) to ensure const correctness.
The "NEEDSWORK: const correctness?" comment in patch-ids.c:patch_id_neq()
is indeed about the fact that this function modifies its arguments `eptr`
and `entry_or_key` (via the `patch_id` pointers `a` and `b`) to lazily
compute the full patch ID.
I am afraid, however, that this patch may not be moving in the right
direction. By marking `a` and `b` as `const struct patch_id *`, you are
telling the compiler and the reader that the objects they point to will
not be modified. But then you immediately cast that constness away when
calling `commit_patch_id()`:
> - commit_patch_id(a->commit, opt, &a->patch_id, 0))
> + commit_patch_id(a->commit, opt, (struct object_id *)&a->patch_id, 0))
I wonder if this is actually any better than the original code. If an
object is marked `const`, it is generally supposed to be immutable.
Casting away the constness of a member to pass it to a function that is
expected to write its findings into that memory region sounds wrong to me.
The fact that `patch_id_neq` receives `const struct hashmap_entry *`
parameters--which is a requirement of the `hashmap` API--is what is
clashing with our lazy initialization here. The original code
handled this by casting to a non-const `struct patch_id *` right at
the beginning (via `container_of`). While this also "drops" the
constness, the original is at least more honest about the fact that
`a` and `b` will be modified.
If we truly wanted to achieve const-correctness here, we would
likely need to avoid lazy initialization within the comparison
function altogether, pre-calculating the full patch ID before it's
needed for comparison. The NEEDSWORK comment should remain until a
more fundamental solution is found, or if we should just admit that
the current lazy evaluation pattern is what we want and document
that (i.e., add a comment to justify why we strip away the constness
here). As it stands, this patch doesn't "ensure" const correctness;
it just masks the violation with an explicit cast at the location
the pointer is used.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] patch-ids: achieve const correctness in patch_id_neq()
2026-03-08 6:26 ` Junio C Hamano
@ 2026-03-08 14:42 ` Tian Yuchen
0 siblings, 0 replies; 7+ messages in thread
From: Tian Yuchen @ 2026-03-08 14:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Hi Junio,
On 3/8/26 14:26, Junio C Hamano wrote:
> The fact that `patch_id_neq` receives `const struct hashmap_entry *`
> parameters--which is a requirement of the `hashmap` API--is what is
> clashing with our lazy initialization here. The original code
> handled this by casting to a non-const `struct patch_id *` right at
> the beginning (via `container_of`). While this also "drops" the
> constness, the original is at least more honest about the fact that
> `a` and `b` will be modified.
Oops, This is something I hadn't considered before. I admit I didn't
think things through carefully enough.
So, we actually find ourselves in this dilemma:
- The Hashmap API specification mandates that input parameters should
be const *in principle*.
- The lazy loading mechanism requires us to write the results into
memory; otherwise, there will be significant performance loss.
Am I correct?
I find that maintaining the current approach seems the most reasonable
option. Computing all patch ids before putting objects into the hashmap
appears to be a move that affects everything else and is not worth the
effort. On the contrary, slightly breaking the Hashmap API conventions
seems to be a more *cost-effective* approach...
Or perhaps it would be more reasonable to slightly modify this NEEDSWORK
flag here?
Will send the next patch shortly.
> If we truly wanted to achieve const-correctness here, we would
> likely need to avoid lazy initialization within the comparison
> function altogether, pre-calculating the full patch ID before it's
> needed for comparison. The NEEDSWORK comment should remain until a
> more fundamental solution is found, or if we should just admit that
> the current lazy evaluation pattern is what we want and document
> that (i.e., add a comment to justify why we strip away the constness
> here). As it stands, this patch doesn't "ensure" const correctness;
> it just masks the violation with an explicit cast at the location
> the pointer is used.
Regards,
Yuchen
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2] patch-ids: document intentional const-casting in patch_id_neq()
2026-03-08 4:31 [PATCH] patch-ids: achieve const correctness in patch_id_neq() Tian Yuchen
2026-03-08 6:26 ` Junio C Hamano
@ 2026-03-08 15:02 ` Tian Yuchen
2026-03-09 0:26 ` Junio C Hamano
2026-03-09 6:51 ` [PATCH v3] " Tian Yuchen
1 sibling, 2 replies; 7+ messages in thread
From: Tian Yuchen @ 2026-03-08 15:02 UTC (permalink / raw)
To: git; +Cc: gitster, Tian Yuchen
The hashmap API requires the comparison function to take const pointers.
However, patch_id_neq() uses lazy evaluation to compute patch IDs on
demand.
Pre-calculating all patch IDs to achieve true const correctness would
introduce an unacceptable performance penalty.
Remove the eight-year-old "NEEDSWORK" comment and formally document
this intentional design trade-off.
Signed-off-by: Tian Yuchen <cat@malon.dev>
---
patch-ids.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/patch-ids.c b/patch-ids.c
index a5683b462c..35e6a974f1 100644
--- a/patch-ids.c
+++ b/patch-ids.c
@@ -41,7 +41,15 @@ static int patch_id_neq(const void *cmpfn_data,
const struct hashmap_entry *entry_or_key,
const void *keydata UNUSED)
{
- /* NEEDSWORK: const correctness? */
+ /*
+ * We drop the 'const' modifier here intentionally.
+ *
+ * The hashmap API requires us to treat the entries as const.
+ * However, to avoid performance regression, we lazily compute
+ * the patch IDs inside this comparison function. This fundamentally
+ * requires us to mutate the 'struct patch_id'. Therefore, we use
+ * container_of() to cast away the constness from the hashmap_entry.
+ */
struct diff_options *opt = (void *)cmpfn_data;
struct patch_id *a, *b;
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2] patch-ids: document intentional const-casting in patch_id_neq()
2026-03-08 15:02 ` [PATCH v2] patch-ids: document intentional const-casting " Tian Yuchen
@ 2026-03-09 0:26 ` Junio C Hamano
2026-03-09 6:39 ` cat
2026-03-09 6:51 ` [PATCH v3] " Tian Yuchen
1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2026-03-09 0:26 UTC (permalink / raw)
To: Tian Yuchen; +Cc: git, Kevin Willford
Tian Yuchen <cat@malon.dev> writes:
> + /*
> + * We drop the 'const' modifier here intentionally.
> + *
> + * The hashmap API requires us to treat the entries as const.
> + * However, to avoid performance regression, we lazily compute
> + * the patch IDs inside this comparison function. This fundamentally
> + * requires us to mutate the 'struct patch_id'. Therefore, we use
> + * container_of() to cast away the constness from the hashmap_entry.
> + */
Is that a "performance regression", I have to wonder? We would
regress relative to what by doing what?
Is the lazy evaluation avoiding unnecessary work?
If we are going to pass _all_ the objects in the hashmap to this
comparator function eventually _anyway_, then the total cost of
computing patch IDs to all of them in the hashmap would not change
with or without lazy computation, but if we are currently getting
away without having to compute for all, but only computing for the
ones we pass to this function, then lazy evaluation is clearly a
win. I do not offhand know which of the above two is the case, but
we need to know that before we can touch the NEEDSWORK comment, I
think.
The lazy computation comes from b3dfeebb (rebase: avoid computing
unnecessary patch IDs, 2016-07-29), even though the "const
correctness?" comment is a bit newer than that.
So it seems that we indeed are avoiding unnecessary work without
this patch. We'd encounter "performance regression" only if we stop
avoiding unnecessary work, so I am afraid that the phrasing used in
the patch is somewhat confusing.
Even though eptr and entry_or_key are const, we want to lazily
compute their .patch_id members; see b3dfeebb (rebase: avoid
computing unnecessary patch IDs, 2016-07-29), so cast the
constness away with container_of().
or something, perhaps?
> struct diff_options *opt = (void *)cmpfn_data;
> struct patch_id *a, *b;
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] patch-ids: document intentional const-casting in patch_id_neq()
2026-03-09 0:26 ` Junio C Hamano
@ 2026-03-09 6:39 ` cat
0 siblings, 0 replies; 7+ messages in thread
From: cat @ 2026-03-09 6:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Kevin Willford
Hi Junio,
> Is that a "performance regression", I have to wonder? We would
> regress relative to what by doing what?
>
> Is the lazy evaluation avoiding unnecessary work?
>
> If we are going to pass _all_ the objects in the hashmap to this
> comparator function eventually _anyway_, then the total cost of
> computing patch IDs to all of them in the hashmap would not change
> with or without lazy computation, but if we are currently getting
> away without having to compute for all, but only computing for the
> ones we pass to this function, then lazy evaluation is clearly a
> win. I do not offhand know which of the above two is the case, but
> we need to know that before we can touch the NEEDSWORK comment, I
> think.
>
> The lazy computation comes from b3dfeebb (rebase: avoid computing
> unnecessary patch IDs, 2016-07-29), even though the "const
> correctness?" comment is a bit newer than that.
>
> So it seems that we indeed are avoiding unnecessary work without
> this patch. We'd encounter "performance regression" only if we stop
> avoiding unnecessary work, so I am afraid that the phrasing used in
> the patch is somewhat confusing.
You're right. Avoiding unnecessary work is indeed a more fundamental
reason than preventing performance regression.
> Even though eptr and entry_or_key are const, we want to lazily
> compute their .patch_id members; see b3dfeebb (rebase: avoid
> computing unnecessary patch IDs, 2016-07-29), so cast the
> constness away with container_of().
>
> or something, perhaps?
>
>> struct diff_options *opt = (void *)cmpfn_data;
>> struct patch_id *a, *b;
I will incorporate your suggested phrasing and reference to the
historical
commit in v3.
Regards,
Yuchen
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3] patch-ids: document intentional const-casting in patch_id_neq()
2026-03-08 15:02 ` [PATCH v2] patch-ids: document intentional const-casting " Tian Yuchen
2026-03-09 0:26 ` Junio C Hamano
@ 2026-03-09 6:51 ` Tian Yuchen
1 sibling, 0 replies; 7+ messages in thread
From: Tian Yuchen @ 2026-03-09 6:51 UTC (permalink / raw)
To: git; +Cc: gitster, Tian Yuchen
The hashmap API requires the comparison function to take const pointers.
However, patch_id_neq() uses lazy evaluation to compute patch IDs on
demand. As established in b3dfeebb (rebase: avoid computing unnecessary
patch IDs, 2016-07-29), this avoids unnecessary work since not all
objects in the hashmap will eventually be compared.
Remove the ten-year-old "NEEDSWORK" comment and formally document
this intentional design trade-off.
Signed-off-by: Tian Yuchen <cat@malon.dev>
---
patch-ids.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/patch-ids.c b/patch-ids.c
index a5683b462c..1fbc88cbec 100644
--- a/patch-ids.c
+++ b/patch-ids.c
@@ -41,7 +41,14 @@ static int patch_id_neq(const void *cmpfn_data,
const struct hashmap_entry *entry_or_key,
const void *keydata UNUSED)
{
- /* NEEDSWORK: const correctness? */
+ /*
+ * We drop the 'const' modifier here intentionally.
+ *
+ * Even though eptr and entry_or_key are const, we want to
+ * lazily compute their .patch_id members; see b3dfeebb (rebase:
+ * avoid computing unnecessary patch IDs, 2016-07-29). So we cast
+ * the constness away with container_of().
+ */
struct diff_options *opt = (void *)cmpfn_data;
struct patch_id *a, *b;
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-09 6:52 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-08 4:31 [PATCH] patch-ids: achieve const correctness in patch_id_neq() Tian Yuchen
2026-03-08 6:26 ` Junio C Hamano
2026-03-08 14:42 ` Tian Yuchen
2026-03-08 15:02 ` [PATCH v2] patch-ids: document intentional const-casting " Tian Yuchen
2026-03-09 0:26 ` Junio C Hamano
2026-03-09 6:39 ` cat
2026-03-09 6:51 ` [PATCH v3] " Tian Yuchen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox