* [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
@ 2026-03-11 18:17 Tian Yuchen
2026-03-12 6:55 ` Patrick Steinhardt
0 siblings, 1 reply; 10+ messages in thread
From: Tian Yuchen @ 2026-03-11 18:17 UTC (permalink / raw)
To: git; +Cc: gitster, karthik.188, phillip.wood, jltobler, ps, Tian Yuchen
The 'cmd_mktree' already receives a 'struct repository *repo' but was
previously marked as UNUSED.
Pass the 'repo' down the 'mktree-line()' and 'write_tree()'.
Consequently, remove the 'USE_THE_REPOSITORY_VARIABLE' macro, and
replace 'parse_oid_hex()' with its context-aware version
'parse_oid_hex_algop()'.
Signed-off-by: Tian Yuchen <cat@malon.dev>
---
I originally intended to attempt the #FIXME in t1006-cat-file.sh.
I followed the clues all the way here, only to discover that the
FIXME required a level of expertise far beyond my capabilities,
so I gave up. However, I spot the global variable here, so I went
ahead and fixed it ;)
Two questions:
The 'oid_to_hex' function appears to use 'the_hash_algo' internally.
Seems that it also implicitly relying on global state. Is there
anything we should be aware of?
I've always been unsure about who to CC on domain-specific patches,
so I've only been sending them to Junio and the mailing list. Could
this be why my previous patch for a global variable refactor didn't
receive any review feedback? Here is the link:
https://lore.kernel.org/git/20260302085738.2510514-1-a3205153416@gmail.com/
I would be most grateful if you could provide any feedback. Thank you.
builtin/mktree.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/builtin/mktree.c b/builtin/mktree.c
index 12772303f5..4084e32476 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -3,7 +3,6 @@
*
* Copyright (c) Junio C Hamano, 2006, 2009
*/
-#define USE_THE_REPOSITORY_VARIABLE
#include "builtin.h"
#include "gettext.h"
#include "hex.h"
@@ -46,7 +45,7 @@ static int ent_compare(const void *a_, const void *b_)
b->name, b->len, b->mode);
}
-static void write_tree(struct object_id *oid)
+static void write_tree(struct repository *repo, struct object_id *oid)
{
struct strbuf buf;
size_t size;
@@ -60,10 +59,10 @@ static void write_tree(struct object_id *oid)
for (i = 0; i < used; i++) {
struct treeent *ent = entries[i];
strbuf_addf(&buf, "%o %s%c", ent->mode, ent->name, '\0');
- strbuf_add(&buf, ent->oid.hash, the_hash_algo->rawsz);
+ strbuf_add(&buf, ent->oid.hash, repo->hash_algo->rawsz);
}
- odb_write_object(the_repository->objects, buf.buf, buf.len, OBJ_TREE, oid);
+ odb_write_object(repo->objects, buf.buf, buf.len, OBJ_TREE, oid);
strbuf_release(&buf);
}
@@ -72,7 +71,7 @@ static const char *const mktree_usage[] = {
NULL
};
-static void mktree_line(char *buf, int nul_term_line, int allow_missing)
+static void mktree_line(struct repository *repo, char *buf, int nul_term_line, int allow_missing)
{
char *ptr, *ntr;
const char *p;
@@ -93,7 +92,7 @@ static void mktree_line(char *buf, int nul_term_line, int allow_missing)
die("input format error: %s", buf);
ptr = ntr + 1; /* type */
ntr = strchr(ptr, ' ');
- if (!ntr || parse_oid_hex(ntr + 1, &oid, &p) ||
+ if (!ntr || parse_oid_hex_algop(ntr + 1, &oid, &p, repo->hash_algo) ||
*p != '\t')
die("input format error: %s", buf);
@@ -124,7 +123,7 @@ static void mktree_line(char *buf, int nul_term_line, int allow_missing)
/* Check the type of object identified by oid without fetching objects */
oi.typep = &obj_type;
- if (odb_read_object_info_extended(the_repository->objects, &oid, &oi,
+ if (odb_read_object_info_extended(repo->objects, &oid, &oi,
OBJECT_INFO_LOOKUP_REPLACE |
OBJECT_INFO_QUICK |
OBJECT_INFO_SKIP_FETCH_OBJECT) < 0)
@@ -155,7 +154,7 @@ static void mktree_line(char *buf, int nul_term_line, int allow_missing)
int cmd_mktree(int ac,
const char **av,
const char *prefix,
- struct repository *repo UNUSED)
+ struct repository *repo)
{
struct strbuf sb = STRBUF_INIT;
struct object_id oid;
@@ -187,7 +186,7 @@ int cmd_mktree(int ac,
break;
die("input format error: (blank line only valid in batch mode)");
}
- mktree_line(sb.buf, nul_term_line, allow_missing);
+ mktree_line(repo, sb.buf, nul_term_line, allow_missing);
}
if (is_batch_mode && got_eof && used < 1) {
/*
@@ -197,7 +196,7 @@ int cmd_mktree(int ac,
*/
; /* skip creating an empty tree */
} else {
- write_tree(&oid);
+ write_tree(repo, &oid);
puts(oid_to_hex(&oid));
fflush(stdout);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-11 18:17 [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE Tian Yuchen
@ 2026-03-12 6:55 ` Patrick Steinhardt
2026-03-12 16:21 ` Tian Yuchen
0 siblings, 1 reply; 10+ messages in thread
From: Patrick Steinhardt @ 2026-03-12 6:55 UTC (permalink / raw)
To: Tian Yuchen; +Cc: git, gitster, karthik.188, phillip.wood, jltobler
On Thu, Mar 12, 2026 at 02:17:03AM +0800, Tian Yuchen wrote:
> The 'cmd_mktree' already receives a 'struct repository *repo' but was
> previously marked as UNUSED.
>
> Pass the 'repo' down the 'mktree-line()' and 'write_tree()'.
I guess s/the/to/? Also, it's `mktree_line()`, not `mktree-line()`.
One thing that commit messages should also explain is why a certain
refactoring is safe to do. That is, can `repo` ever be `NULL`? For that
you have to look at "git.c" and figure out whether or not the command
requires a repository to exist.
> The 'oid_to_hex' function appears to use 'the_hash_algo' internally.
> Seems that it also implicitly relying on global state. Is there
> anything we should be aware of?
`oid_to_hex()` falls back to using `the_hash_algo` in case the object ID
you have doesn't have a proper hash specified. So this depends on how
exactly you construct the object IDs: if you parse them with a proper
hash algorithm, then you're fine.
> I've always been unsure about who to CC on domain-specific patches,
> so I've only been sending them to Junio and the mailing list. Could
> this be why my previous patch for a global variable refactor didn't
> receive any review feedback? Here is the link:
It's typically fine to just send to the mailing list, so you wouldn't
even Cc Junio. Sometimes it's just a matter of capacity, and it's fine
to eventually send a ping after a week or two have passed without any
feedback.
The patch itself looks good to me, thanks!
Patrick
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-12 6:55 ` Patrick Steinhardt
@ 2026-03-12 16:21 ` Tian Yuchen
2026-03-13 7:02 ` Patrick Steinhardt
2026-03-13 16:03 ` Junio C Hamano
0 siblings, 2 replies; 10+ messages in thread
From: Tian Yuchen @ 2026-03-12 16:21 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, gitster, karthik.188, phillip.wood, jltobler
Hi Patrick,
Thanks for the review!
On 3/12/26 14:55, Patrick Steinhardt wrote:
> I guess s/the/to/? Also, it's `mktree_line()`, not `mktree-line()`.
Thank you for pointing that out. I'll correct it right away.
> One thing that commit messages should also explain is why a certain
> refactoring is safe to do.
I'll add it.
> That is, can `repo` ever be `NULL`? For that
> you have to look at "git.c" and figure out whether or not the command
> requires a repository to exist.
I checked git.c and found that there is:
{ "mktree", cmd_mktree, RUN_SETUP }
in commands[]. If my understanding is correct, before cmd_mktree is
called, setup_git_directory() must have been fully executed. In that
case, if the current directory isn't a valid repository (NULL), it
should have already exited at an earlier stage, right?
> `oid_to_hex()` falls back to using `the_hash_algo` in case the object ID
> you have doesn't have a proper hash specified. So this depends on how
> exactly you construct the object IDs: if you parse them with a proper
> hash algorithm, then you're fine.
I see. That's pretty much what I had in mind.
> It's typically fine to just send to the mailing list, so you wouldn't
> even Cc Junio. Sometimes it's just a matter of capacity, and it's fine
> to eventually send a ping after a week or two have passed without any
> feedback.
Oh, I see. I thought “repeatedly bringing up a patch no one cares about”
would be considered kinda *impolite*. Now I understand. Thank you.
Regards,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-12 16:21 ` Tian Yuchen
@ 2026-03-13 7:02 ` Patrick Steinhardt
2026-03-13 16:03 ` Junio C Hamano
1 sibling, 0 replies; 10+ messages in thread
From: Patrick Steinhardt @ 2026-03-13 7:02 UTC (permalink / raw)
To: Tian Yuchen; +Cc: git, gitster, karthik.188, phillip.wood, jltobler
On Fri, Mar 13, 2026 at 12:21:41AM +0800, Tian Yuchen wrote:
> > It's typically fine to just send to the mailing list, so you wouldn't
> > even Cc Junio. Sometimes it's just a matter of capacity, and it's fine
> > to eventually send a ping after a week or two have passed without any
> > feedback.
>
> Oh, I see. I thought “repeatedly bringing up a patch no one cares about”
> would be considered kinda *impolite*. Now I understand. Thank you.
I mean if you nudge every second day that would certainly be considered
impolite. But nuding after a week or two, and then maybe nudging again
after a month is certainly fine. It just happens that patches fall
through the cracks.
These aren't strict numbers by the way, it's mostly pulled out of thin
air with some (hopefully) common sense applied to it by me.
Patrick
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-12 16:21 ` Tian Yuchen
2026-03-13 7:02 ` Patrick Steinhardt
@ 2026-03-13 16:03 ` Junio C Hamano
2026-03-13 17:15 ` Tian Yuchen
1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2026-03-13 16:03 UTC (permalink / raw)
To: Tian Yuchen; +Cc: Patrick Steinhardt, git, karthik.188, phillip.wood, jltobler
Tian Yuchen <cat@malon.dev> writes:
>> That is, can `repo` ever be `NULL`? For that
>> you have to look at "git.c" and figure out whether or not the command
>> requires a repository to exist.
> I checked git.c and found that there is:
>
> { "mktree", cmd_mktree, RUN_SETUP }
>
> in commands[]. If my understanding is correct, before cmd_mktree is
> called, setup_git_directory() must have been fully executed. In that
> case, if the current directory isn't a valid repository (NULL), it
> should have already exited at an earlier stage, right?
There is one corner case; upon "git foo -h", your cmd_foo() will get
repo==NULL when the command is run outside a repository. As long as
your cmd_foo() asks parse_options() to react to "-h" (which gives
the help message and then exits) before it uses repo assuming it
cannot be NULL, you are safe.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-13 16:03 ` Junio C Hamano
@ 2026-03-13 17:15 ` Tian Yuchen
2026-03-13 17:54 ` Junio C Hamano
0 siblings, 1 reply; 10+ messages in thread
From: Tian Yuchen @ 2026-03-13 17:15 UTC (permalink / raw)
To: Junio C Hamano
Cc: Patrick Steinhardt, git, karthik.188, phillip.wood, jltobler
On 3/14/26 00:03, Junio C Hamano wrote:
> There is one corner case; upon "git foo -h", your cmd_foo() will get
> repo==NULL when the command is run outside a repository. As long as
> your cmd_foo() asks parse_options() to react to "-h" (which gives
> the help message and then exits) before it uses repo assuming it
> cannot be NULL, you are safe.
I was completely blown away Σ( ° △ °)
Thanks for pointing that out. Otherwise, I wouldn’t have been able to
figure it out no matter how hard I try.
I just took a quick look at the code:
> const struct option option[] = {
> OPT_BOOL('z', NULL, &nul_term_line, N_("input is NUL terminated")),
> OPT_SET_INT( 0 , "missing", &allow_missing, N_("allow missing objects"), 1),
> OPT_SET_INT( 0 , "batch", &is_batch_mode, N_("allow creation of more than one tree"), 1),
> OPT_END()
> };
>
> ac = parse_options(ac, av, prefix, option, mktree_usage, 0);
> getline_fn = nul_term_line ? strbuf_getline_nul : strbuf_getline_lf;
>
> while (!got_eof) {
> while (1) { ...
I think if there's a '-h' parameter, it gets intercepted in
parse_options() and the process exits before repo is called. So there’s
nothing to worry about, right?
By the way, I find it a bit confusing that the'`-h' parameter — which is
solely used for documentation query — is parsed and intercepted within a
function that handles actual business logic. Wouldn’t it be more
appropriate to intercept it at a higher level? Is this a technical debt?
I don’t intend to write a patch to fix this, but this parameter has made
me realize that this odd way of coding (at least in my understanding) is
highly unpredictable for someone with limited experience like me. I
don't know if there are any troubleshooting methods other than asking
someone with experience.
Regards,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-13 17:15 ` Tian Yuchen
@ 2026-03-13 17:54 ` Junio C Hamano
2026-03-13 18:12 ` Tian Yuchen
0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2026-03-13 17:54 UTC (permalink / raw)
To: Tian Yuchen; +Cc: Patrick Steinhardt, git, karthik.188, phillip.wood, jltobler
Tian Yuchen <cat@malon.dev> writes:
> On 3/14/26 00:03, Junio C Hamano wrote:
>
>> There is one corner case; upon "git foo -h", your cmd_foo() will get
>> repo==NULL when the command is run outside a repository. As long as
>> your cmd_foo() asks parse_options() to react to "-h" (which gives
>> the help message and then exits) before it uses repo assuming it
>> cannot be NULL, you are safe.
>
> I was completely blown away Σ( ° △ °)
>
> Thanks for pointing that out. Otherwise, I wouldn’t have been able to
> figure it out no matter how hard I try.
>
> I just took a quick look at the code:
>
>> const struct option option[] = {
>> OPT_BOOL('z', NULL, &nul_term_line, N_("input is NUL terminated")),
>> OPT_SET_INT( 0 , "missing", &allow_missing, N_("allow missing objects"), 1),
>> OPT_SET_INT( 0 , "batch", &is_batch_mode, N_("allow creation of more than one tree"), 1),
>> OPT_END()
>> };
>>
>> ac = parse_options(ac, av, prefix, option, mktree_usage, 0);
>> getline_fn = nul_term_line ? strbuf_getline_nul : strbuf_getline_lf;
>>
>> while (!got_eof) {
>> while (1) { ...
>
> I think if there's a '-h' parameter, it gets intercepted in
> parse_options() and the process exits before repo is called. So there’s
> nothing to worry about, right?
Correct. The function calls parse_options() before it looks at "repo".
> By the way, I find it a bit confusing that the'`-h' parameter — which is
> solely used for documentation query — is parsed and intercepted within a
> function that handles actual business logic.
I strongly disagree your idea that 'z' is more business logic than
'h' is. Both are equally relevant.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-13 17:54 ` Junio C Hamano
@ 2026-03-13 18:12 ` Tian Yuchen
2026-03-13 20:20 ` Junio C Hamano
0 siblings, 1 reply; 10+ messages in thread
From: Tian Yuchen @ 2026-03-13 18:12 UTC (permalink / raw)
To: Junio C Hamano
Cc: Patrick Steinhardt, git, karthik.188, phillip.wood, jltobler
On 3/14/26 01:54, Junio C Hamano wrote:
> I strongly disagree your idea that 'z' is more business logic than
> 'h' is. Both are equally relevant.
Perhaps I didn't explain myself clearly :(
I do understand that *currently* both are part of the business logic.
However, what puzzles me is: why is it written this way? Why isn't -h
intercepted at the outer global level, but instead handed off to a
function like parse_options() for interception?
Is this due to historical reasons?
Please forgive my slowness. I would appreciate it if you could offer
some guidance!
Thanks,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-13 18:12 ` Tian Yuchen
@ 2026-03-13 20:20 ` Junio C Hamano
2026-03-14 3:17 ` Tian Yuchen
0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2026-03-13 20:20 UTC (permalink / raw)
To: Tian Yuchen; +Cc: Patrick Steinhardt, git, karthik.188, phillip.wood, jltobler
Tian Yuchen <cat@malon.dev> writes:
> On 3/14/26 01:54, Junio C Hamano wrote:
>
>> I strongly disagree your idea that 'z' is more business logic than
>> 'h' is. Both are equally relevant.
>
> Perhaps I didn't explain myself clearly :(
>
> I do understand that *currently* both are part of the business logic.
> However, what puzzles me is: why is it written this way? Why isn't -h
> intercepted at the outer global level, but instead handed off to a
> function like parse_options() for interception?
>
> Is this due to historical reasons?
>
> Please forgive my slowness. I would appreciate it if you could offer
> some guidance!
It is perfectly OK to be slow. Spend enough time to study the code
so that you do not have to ask for forgiveness ;-)
In order to make a useful response to "-h", that business logic
needs to know what options are available and what argument they take
etc., which is already given to parse_options API. What makes it
make any sense to split it to separate codepath?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE
2026-03-13 20:20 ` Junio C Hamano
@ 2026-03-14 3:17 ` Tian Yuchen
0 siblings, 0 replies; 10+ messages in thread
From: Tian Yuchen @ 2026-03-14 3:17 UTC (permalink / raw)
To: Junio C Hamano
Cc: Patrick Steinhardt, git, karthik.188, phillip.wood, jltobler
Hi Junio,
> Tian Yuchen <cat@malon.dev> writes:
>
>> On 3/14/26 01:54, Junio C Hamano wrote:
>>
>>> I strongly disagree your idea that 'z' is more business logic than
>>> 'h' is. Both are equally relevant.
>>
>> Perhaps I didn't explain myself clearly :(
>>
>> I do understand that *currently* both are part of the business logic.
>> However, what puzzles me is: why is it written this way? Why isn't -h
>> intercepted at the outer global level, but instead handed off to a
>> function like parse_options() for interception?
>>
>> Is this due to historical reasons?
>>
>> Please forgive my slowness. I would appreciate it if you could offer
>> some guidance!
>
> It is perfectly OK to be slow. Spend enough time to study the code
> so that you do not have to ask for forgiveness ;-)
>
> In order to make a useful response to "-h", that business logic
> needs to know what options are available and what argument they take
> etc., which is already given to parse_options API. What makes it
> make any sense to split it to separate codepath?
I see.
I’ve been thinking about it, and moving it to an external file seems
to break encapsulation also. If 'option[]' is no longer static, then the
code
that originally handled this logic would have to be moved to a file like
'git.c', and the codebase would increase significantly. That’s probably
not what we want, right?
This is not only semantically confusing, but it also doesn't work any
better in practice.
I kept thinking about maintaining a separate framework to intercept
all of this — I guess I’ve fallen into a certain mindset when it comes
to
modern CLIs. These changes, which do not offer any significant
advantages,
seem to actually undermine performance and local clarity. It really
isn’t
worth the effort.
I hadn't actually planned to migrate anything. I was just a bit confused
as
to why it was written this way.
Thank you,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-03-14 3:18 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-11 18:17 [PATCH v1] builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE Tian Yuchen
2026-03-12 6:55 ` Patrick Steinhardt
2026-03-12 16:21 ` Tian Yuchen
2026-03-13 7:02 ` Patrick Steinhardt
2026-03-13 16:03 ` Junio C Hamano
2026-03-13 17:15 ` Tian Yuchen
2026-03-13 17:54 ` Junio C Hamano
2026-03-13 18:12 ` Tian Yuchen
2026-03-13 20:20 ` Junio C Hamano
2026-03-14 3:17 ` Tian Yuchen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox