* [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
@ 2026-02-07 11:40 Ayush Jha
2026-02-07 19:12 ` Tian Yuchen
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Ayush Jha @ 2026-02-07 11:40 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana, Lucas Seiki Oshiro, Ayush Jha
read_attr() currently relies on is_bare_repository(), which
implicitly depends on the global the_repository.
As attr.c is a reusable library component used by multiple
commands, this prevents correct behavior when operating on
secondary repositories (e.g. submodules or in-process repos)
whose bareness may differ from the_repository.
Update read_attr() to determine bareness using the repository
associated with istate->repo, based on repository configuration
and worktree presence, instead of relying on global state.
No functional change is intended for the primary repository case.
Signed-off-by: Ayush Jha <kumarayushjha123@gmail.com>
---
attr.c | 36 ++++++++++++++++++++++--------------
1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/attr.c b/attr.c
index 4999b7e09d..f2d25b1863 100644
--- a/attr.c
+++ b/attr.c
@@ -848,21 +848,29 @@ static struct attr_stack *read_attr(struct index_state *istate,
res = read_attr_from_index(istate, path, flags);
} else if (tree_oid) {
res = read_attr_from_blob(istate, tree_oid, path, flags);
- } else if (!is_bare_repository()) {
- if (direction == GIT_ATTR_CHECKOUT) {
- res = read_attr_from_index(istate, path, flags);
- if (!res)
- res = read_attr_from_file(path, flags);
- } else if (direction == GIT_ATTR_CHECKIN) {
- res = read_attr_from_file(path, flags);
- if (!res)
- /*
- * There is no checked out .gitattributes file
- * there, but we might have it in the index.
- * We allow operation in a sparsely checked out
- * work tree, so read from it.
- */
+ } else {
+ int is_bare;
+ int is_bare_cfg = -1;
+
+ repo_config_get_bool(istate->repo, "core.bare", &is_bare_cfg);
+ is_bare = is_bare_cfg && !repo_get_work_tree(istate->repo);
+
+ if (!is_bare) {
+ if (direction == GIT_ATTR_CHECKOUT) {
res = read_attr_from_index(istate, path, flags);
+ if (!res)
+ res = read_attr_from_file(path, flags);
+ } else if (direction == GIT_ATTR_CHECKIN) {
+ res = read_attr_from_file(path, flags);
+ if (!res)
+ /*
+ * There is no checked out .gitattributes file
+ * there, but we might have it in the index.
+ * We allow operation in a sparsely checked out
+ * work tree, so read from it.
+ */
+ res = read_attr_from_index(istate, path, flags);
+ }
}
}
--
2.53.0.windows.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-07 11:40 [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr Ayush Jha
@ 2026-02-07 19:12 ` Tian Yuchen
2026-02-07 21:02 ` Lucas Seiki Oshiro
2026-02-08 4:42 ` Tian Yuchen
2 siblings, 0 replies; 10+ messages in thread
From: Tian Yuchen @ 2026-02-07 19:12 UTC (permalink / raw)
To: Ayush Jha, git
Cc: Junio C Hamano, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana, Lucas Seiki Oshiro
On 2/7/26 19:40, Ayush Jha wrote:
> read_attr() currently relies on is_bare_repository(), which
> implicitly depends on the global the_repository.
>
> As attr.c is a reusable library component used by multiple
> commands, this prevents correct behavior when operating on
> secondary repositories (e.g. submodules or in-process repos)
> whose bareness may differ from the_repository.
>
> Update read_attr() to determine bareness using the repository
> associated with istate->repo, based on repository configuration
> and worktree presence, instead of relying on global state.
>
> No functional change is intended for the primary repository case.
>
> Signed-off-by: Ayush Jha <kumarayushjha123@gmail.com>
> ---
> attr.c | 36 ++++++++++++++++++++++--------------
> 1 file changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index 4999b7e09d..f2d25b1863 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -848,21 +848,29 @@ static struct attr_stack *read_attr(struct index_state *istate,
> res = read_attr_from_index(istate, path, flags);
> } else if (tree_oid) {
> res = read_attr_from_blob(istate, tree_oid, path, flags);
> - } else if (!is_bare_repository()) {
> - if (direction == GIT_ATTR_CHECKOUT) {
> - res = read_attr_from_index(istate, path, flags);
> - if (!res)
> - res = read_attr_from_file(path, flags);
> - } else if (direction == GIT_ATTR_CHECKIN) {
> - res = read_attr_from_file(path, flags);
> - if (!res)
> - /*
> - * There is no checked out .gitattributes file
> - * there, but we might have it in the index.
> - * We allow operation in a sparsely checked out
> - * work tree, so read from it.
> - */
> + } else {
> + int is_bare;
> + int is_bare_cfg = -1;
> +
> + repo_config_get_bool(istate->repo, "core.bare", &is_bare_cfg);
> + is_bare = is_bare_cfg && !repo_get_work_tree(istate->repo);
> +
> + if (!is_bare) {
> + if (direction == GIT_ATTR_CHECKOUT) {
> res = read_attr_from_index(istate, path, flags);
> + if (!res)
> + res = read_attr_from_file(path, flags);
> + } else if (direction == GIT_ATTR_CHECKIN) {
> + res = read_attr_from_file(path, flags);
> + if (!res)
> + /*
> + * There is no checked out .gitattributes file
> + * there, but we might have it in the index.
> + * We allow operation in a sparsely checked out
> + * work tree, so read from it.
> + */
> + res = read_attr_from_index(istate, path, flags);
> + }
> }
> }
>
Hi Ayush,
I'm new to the community ;)
Initially, the concern that replacing 'is_bare_repository' with
'repo_config_get_bool()' inside 'read_attr()' might introduce a
performance regression came to my mind immediately. To verify this, I
ran a benchmark using hyperfine, and the script was like:
>#!/bin/bash
>mkdir -p perf-test && cd perf-test
>git init
>
>git config user.email "malon7782@yahoo.com"
>git config user.name "ILOVEGIT"
>
>for i in {1..10000}; do
> echo "content" > "file_$i.txt"
>done
>
>git add .
>git commit -m "initial commit"
>
>echo "* test=auto" > .gitattributes
>git add .gitattributes
>git commit -m "add attr"
Then
>'git ls-files > files_list.txt'
( I wrote it this way primarily because I didn't anticipate that too
many files would return a 126 error code. So I improvised my switching
to reading from standard input:/
It didn't result in a noticeable performance difference (Even if the
number of loop was increased to 300000). Then I started to create a
stricter benchmark environment to defeat the attribute stack caching:
>mkdir -p dir_A dir_B
>
>echo "* text=auto" > dir_A/.gitattributes
>echo "* text=auto" > dir_B/.gitattributes
>
>for i in {1..5000}; do
> echo "dir_A/file"
> echo "dir_B/file"
>done > thrashing_list.txt
Still, no difference in performance. Has anyone else conducted any
similar performance test? Were the results the same as mine? Is it
normal, or there is something wrong with my test?
Regards,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-07 11:40 [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr Ayush Jha
2026-02-07 19:12 ` Tian Yuchen
@ 2026-02-07 21:02 ` Lucas Seiki Oshiro
2026-02-07 21:41 ` Junio C Hamano
2026-02-08 4:42 ` Tian Yuchen
2 siblings, 1 reply; 10+ messages in thread
From: Lucas Seiki Oshiro @ 2026-02-07 21:02 UTC (permalink / raw)
To: Ayush Jha
Cc: git, Junio C Hamano, Christian Couder, Karthik Nayak,
Justin Tobler, Ayush Chandekar, Siddharth Asthana
Hi Ayush!
Thanks for your interest in helping git-repo-info, even though
this second patch changes the focus. It was my GSoC project last
year.
Nitpick: there are two [PATCH] in the subject, and usually we use
only one subject prefix. I also think that this could be a v2 of
your first patch [1]. This way, this subject should be:
"[RFC GSoC PATCH v2] attr: use local repository state in read_attr".
Tip: use `--subject-prefix='RFC GSoC PATCH'` in this case, and
set `format.subjectPrefix='GSoC PATCH'` for your future GSoC
patches.
> read_attr() currently relies on is_bare_repository(), which
> implicitly depends on the global the_repository.
So, wouldn't it be better to make is_bare_repository depend
on a `struct repository *repo` instead of `the_repository`?
I don't know how feasible it is to do that, but it seems to
me that your change could benefit other places that depend
on that function.
> + int is_bare;
> + int is_bare_cfg = -1;
> +
> + repo_config_get_bool(istate->repo, "core.bare", &is_bare_cfg);
This function returns 0 when the config key is found. If the key
can't be found, it returns 1 and doesn't touch `is_bare_cfg`.
This means that if "core.bare" doesn't exist (which is unlikely,
but, who knows...) we'll proceed with is_bare_cfg = -1, and...
> + is_bare = is_bare_cfg && !repo_get_work_tree(istate->repo);
since -1 is a truthy value in C, then in this case we're
deciding it based on having or not a work tree. I don't know if
someone with more experience than me see something wrong with
this but it seems ok to me. However, I found this way a little
confusing to read. Perhaps it would be clearer if it was written
like this:
int is_bare;
if (repo_config_get_bool(istate->repo, "core.bare", &is_bare_cfg))
is_bare = !repo_get_work_tree(istate->repo);
[1] 20260206152002.1244-1-kumarayushjha123@gmail.com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-07 21:02 ` Lucas Seiki Oshiro
@ 2026-02-07 21:41 ` Junio C Hamano
0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2026-02-07 21:41 UTC (permalink / raw)
To: Lucas Seiki Oshiro
Cc: Ayush Jha, git, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana
Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> writes:
>> read_attr() currently relies on is_bare_repository(), which
>> implicitly depends on the global the_repository.
>
> So, wouldn't it be better to make is_bare_repository depend
> on a `struct repository *repo` instead of `the_repository`?
The codepath read_attr() is in is usually not that hot but it is not
cheap.
The repository object should have a boolean that says "I am bare",
perhaps initialized lazily, and your version of is_bare_repository
that takes a repository object would be a good entry point to it.
Also, IIRC, there is another releated effort to allow attribute data
source to become per repository. This change may want to coordinate
with it.
Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-07 11:40 [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr Ayush Jha
2026-02-07 19:12 ` Tian Yuchen
2026-02-07 21:02 ` Lucas Seiki Oshiro
@ 2026-02-08 4:42 ` Tian Yuchen
2026-02-10 14:05 ` Ayush Jha
2026-02-10 23:37 ` Junio C Hamano
2 siblings, 2 replies; 10+ messages in thread
From: Tian Yuchen @ 2026-02-08 4:42 UTC (permalink / raw)
To: Ayush Jha, git
Cc: Junio C Hamano, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana, Lucas Seiki Oshiro
Junio C Hamano <gitster@pobox.com> writes:
>The codepath read_attr() is in is usually not that hot but it is not
>cheap.
I'm a bit curious—under what circumstances would calling this method
result in significant performance regression?
Regards,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-08 4:42 ` Tian Yuchen
@ 2026-02-10 14:05 ` Ayush Jha
2026-02-14 0:04 ` Lucas Seiki Oshiro
2026-02-10 23:37 ` Junio C Hamano
1 sibling, 1 reply; 10+ messages in thread
From: Ayush Jha @ 2026-02-10 14:05 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana, Lucas Seiki Oshiro,
Chandra Pratap
Hello everyone,
I’ve incorporated the feedback provided on this patch and sent an
updated version as a follow-up patch series:
[RFC GSoC PATCH v3 0/2] Make read_attr() repository-aware by
introducing a lazy bare state
[RFC GSoC PATCH v3 1/2] repo-settings: add repo_settings_get_is_bare
[RFC GSoC PATCH v3 2/2] attr: use local repository state in read_attr
Since I’m still new to the Git community and the mailing-list
workflow, I wanted to check whether I might have missed anything in
the process (such as CCs, subject tags, or proper threading), as I
haven’t received feedback on the updated patches yet.
Please let me know if there’s anything I should fix or do differently.
I’d really appreciate any guidance.
Thank you for your time and for the earlier feedback.
Best regards,
Ayush Jha
On Sun, Feb 8, 2026 at 10:12 AM Tian Yuchen <a3205153416@gmail.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> >The codepath read_attr() is in is usually not that hot but it is not
> >cheap.
>
> I'm a bit curious—under what circumstances would calling this method
> result in significant performance regression?
>
> Regards,
>
> Yuchen
>
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-08 4:42 ` Tian Yuchen
2026-02-10 14:05 ` Ayush Jha
@ 2026-02-10 23:37 ` Junio C Hamano
2026-02-11 16:43 ` Tian Yuchen
1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2026-02-10 23:37 UTC (permalink / raw)
To: Tian Yuchen
Cc: Ayush Jha, git, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana, Lucas Seiki Oshiro
Tian Yuchen <a3205153416@gmail.com> writes:
> Junio C Hamano <gitster@pobox.com> writes:
>
> >The codepath read_attr() is in is usually not that hot but it is not
> >cheap.
>
> I'm a bit curious—under what circumstances would calling this method
> result in significant performance regression?
Significant? I dunno.
And quite honestly, I do not care about significance very much in a
case like this. Doing things that do not make sense, like checking
the same configuration variable again and again when you _know_ that
you never switched to a different repository since you last checked,
is simply wrong. It burdens the readers with unnecessary cognitive
load by making them wonder why you do such a nonsensical thing.
The read_attr() is called during an attr stack construction, which
traverses the directory hierarchy of a single repositry's working
tree (we do not traverse across submodule boundaries), and the same
istate (i.e., index contents) structure is passed around throughout
the callchain. The repository instance at istate->repo may be a
good place to store "am I bare?" bit that is computed just once and
reused whenever we need to know, like in the funcion under
discussion.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-10 23:37 ` Junio C Hamano
@ 2026-02-11 16:43 ` Tian Yuchen
0 siblings, 0 replies; 10+ messages in thread
From: Tian Yuchen @ 2026-02-11 16:43 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ayush Jha, git, Christian Couder, Karthik Nayak, Justin Tobler,
Ayush Chandekar, Siddharth Asthana, Lucas Seiki Oshiro
On 2/11/26 07:37, Junio C Hamano wrote:
> Tian Yuchen<a3205153416@gmail.com> writes:
>
>> Junio C Hamano<gitster@pobox.com> writes:
>>
>> >The codepath read_attr() is in is usually not that hot but it is not
>> >cheap.
>>
>> I'm a bit curious—under what circumstances would calling this method
>> result in significant performance regression?
> Significant? I dunno.
>
> And quite honestly, I do not care about significance very much in a
> case like this. Doing things that do not make sense, like checking
> the same configuration variable again and again when you_know_ that
> you never switched to a different repository since you last checked,
> is simply wrong. It burdens the readers with unnecessary cognitive
> load by making them wonder why you do such a nonsensical thing.
>
> The read_attr() is called during an attr stack construction, which
> traverses the directory hierarchy of a single repositry's working
> tree (we do not traverse across submodule boundaries), and the same
> istate (i.e., index contents) structure is passed around throughout
> the callchain. The repository instance at istate->repo may be a
> good place to store "am I bare?" bit that is computed just once and
> reused whenever we need to know, like in the funcion under
> discussion.
Hi Junio,
I completely agree that computing it once and storing it in
'istate->repo' is the right fix. Optimizing for logical clarity and
reducing load is indeed more important than micro-benchmarking here.
Thanks for the insight!
Regards,
Yuchen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-10 14:05 ` Ayush Jha
@ 2026-02-14 0:04 ` Lucas Seiki Oshiro
2026-02-14 6:47 ` Ayush Jha
0 siblings, 1 reply; 10+ messages in thread
From: Lucas Seiki Oshiro @ 2026-02-14 0:04 UTC (permalink / raw)
To: Ayush Jha
Cc: git, Junio C Hamano, Christian Couder, Karthik Nayak,
Justin Tobler, Ayush Chandekar, Siddharth Asthana, Chandra Pratap
> Hello everyone,
Hello again, Ayush!
> I’ve incorporated the feedback provided on this patch and sent an
> updated version as a follow-up patch series:
>
> [RFC GSoC PATCH v3 0/2] Make read_attr() repository-aware by
> introducing a lazy bare state
> [RFC GSoC PATCH v3 1/2] repo-settings: add repo_settings_get_is_bare
> [RFC GSoC PATCH v3 2/2] attr: use local repository state in read_attr
Two tips about working with our mailing lists:
1. Normally here we sent next versions replying to the cover letter
of the first version. You can use the flag --in-reply-to in
git-send-email(1) to do that. (This is a Git thing, some other
patch-based FLOSS projects don't do that).
2. When we reference other messages, we use the message id instead of
the subject. For example, I'm referencing your message [1] in this
sentence.
> Best regards,
> Ayush Jha
Thanks and welcome!
[1] CAFNBzOckR2yfGvLMHm0VZW+iKJTgFxzfxQAskdBV2HQ_3yXggA@mail.gmail.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr
2026-02-14 0:04 ` Lucas Seiki Oshiro
@ 2026-02-14 6:47 ` Ayush Jha
0 siblings, 0 replies; 10+ messages in thread
From: Ayush Jha @ 2026-02-14 6:47 UTC (permalink / raw)
To: Lucas Seiki Oshiro
Cc: git, Junio C Hamano, Christian Couder, Karthik Nayak,
Justin Tobler, Ayush Chandekar, Siddharth Asthana, Chandra Pratap
Hi Lucas,
Thank you for the clarification and for explaining the mailing list conventions.
I understand now that future versions should be sent in reply to the
original cover letter using --in-reply-to, and that referencing
messages by their message-id is preferred. I’ll follow this approach
in the next revision.
Thanks again for the guidance.
Best regards,
Ayush Jha
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-02-14 6:47 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-07 11:40 [PATCH] [RFC][GSoC][PATCH] attr: use local repository state in read_attr Ayush Jha
2026-02-07 19:12 ` Tian Yuchen
2026-02-07 21:02 ` Lucas Seiki Oshiro
2026-02-07 21:41 ` Junio C Hamano
2026-02-08 4:42 ` Tian Yuchen
2026-02-10 14:05 ` Ayush Jha
2026-02-14 0:04 ` Lucas Seiki Oshiro
2026-02-14 6:47 ` Ayush Jha
2026-02-10 23:37 ` Junio C Hamano
2026-02-11 16:43 ` Tian Yuchen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox