* [GSoC PATCH] backfill: auto-detect sparse-checkout from config
@ 2026-03-31 11:25 Trieu Huynh
2026-03-31 16:59 ` Junio C Hamano
0 siblings, 1 reply; 5+ messages in thread
From: Trieu Huynh @ 2026-03-31 11:25 UTC (permalink / raw)
To: git; +Cc: Trieu Huynh
git backfill currently initializes the `sparse` field in
backfill_context to 0. This causes the command to always perform a
full backfill by default, even when the repository has sparse-checkout
enabled in its configuration (core.sparseCheckout).
Because 'sparse' is explicitly set to 0 at initialization, any later
logic intended to auto-detect the setting from the repository
configuration becomes dead code, as it only triggers if the value
is negative (sentinel).
Change the initial value of .sparse to -1. This allows the command
to correctly fallback to the repository's sparse-checkout settings
when the '--sparse' or '--no-sparse' options are not provided on the
command line.
Add a test case in t5620-backfill.sh to verify that 'git backfill'
automatically respects the sparse-checkout configuration without
explicit flags.
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
---
builtin/backfill.c | 2 +-
t/t5620-backfill.sh | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/builtin/backfill.c b/builtin/backfill.c
index 4b2db94173..0f31844ce7 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -124,7 +124,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
.repo = repo,
.current_batch = OID_ARRAY_INIT,
.min_batch_size = 50000,
- .sparse = 0,
+ .sparse = -1,
.show_progress = -1,
};
struct option options[] = {
diff --git a/t/t5620-backfill.sh b/t/t5620-backfill.sh
index 91b5115732..a1a8d736db 100755
--- a/t/t5620-backfill.sh
+++ b/t/t5620-backfill.sh
@@ -149,6 +149,21 @@ test_expect_success 'backfill --sparse' '
test_line_count = 0 missing
'
+test_expect_success 'backfill auto-detects sparse-checkout from config' '
+ git clone --sparse --filter=blob:none \
+ --single-branch --branch=main \
+ "file://$(pwd)/srv.bare" backfill-auto-sparse &&
+
+ git -C backfill-auto-sparse rev-list --quiet --objects --missing=print HEAD >missing &&
+ test_line_count = 44 missing &&
+
+ GIT_TRACE2_EVENT="$(pwd)/auto-sparse-trace" git \
+ -C backfill-auto-sparse backfill &&
+
+ test_trace2_data promisor fetch_count 4 <auto-sparse-trace &&
+ test_trace2_data path-walk paths 5 <auto-sparse-trace
+'
+
test_expect_success 'backfill --sparse without cone mode (positive)' '
git clone --no-checkout --filter=blob:none \
--single-branch --branch=main \
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [GSoC PATCH] backfill: auto-detect sparse-checkout from config
2026-03-31 11:25 [GSoC PATCH] backfill: auto-detect sparse-checkout from config Trieu Huynh
@ 2026-03-31 16:59 ` Junio C Hamano
2026-04-01 19:31 ` Trieu Huynh
0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2026-03-31 16:59 UTC (permalink / raw)
To: Trieu Huynh; +Cc: git, Derrick Stolee
Trieu Huynh <vikingtc4@gmail.com> writes:
> git backfill currently initializes the `sparse` field in
> backfill_context to 0. This causes the command to always perform a
> full backfill by default, even when the repository has sparse-checkout
> enabled in its configuration (core.sparseCheckout).
>
> Because 'sparse' is explicitly set to 0 at initialization, any later
> logic intended to auto-detect the setting from the repository
> configuration becomes dead code, as it only triggers if the value
> is negative (sentinel).
>
> Change the initial value of .sparse to -1. This allows the command
> to correctly fallback to the repository's sparse-checkout settings
> when the '--sparse' or '--no-sparse' options are not provided on the
> command line.
The author of bff45557 (backfill: add --sparse option, 2025-02-03),
where this .sparse member originates, CC'ed for more intelligent
input than my review can offer ;-)
> Add a test case in t5620-backfill.sh to verify that 'git backfill'
> automatically respects the sparse-checkout configuration without
> explicit flags.
>
> Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
> ---
> builtin/backfill.c | 2 +-
> t/t5620-backfill.sh | 15 +++++++++++++++
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/backfill.c b/builtin/backfill.c
> index 4b2db94173..0f31844ce7 100644
> --- a/builtin/backfill.c
> +++ b/builtin/backfill.c
> @@ -124,7 +124,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
> .repo = repo,
> .current_batch = OID_ARRAY_INIT,
> .min_batch_size = 50000,
> - .sparse = 0,
> + .sparse = -1,
> .show_progress = -1,
> };
> struct option options[] = {
I am a bit confused by this change. What's the difference between
using -1 (which you picked) and 1 as the initial value for this
member? From the proposed log message, I would have expected a new
code that says "ah, we notice, from this member being -1, that the
user did not specify --no-sparse or --sparse, so let's figure out if
our working tree is sparsely checked out ourselves and set it either
to 0 or to 1", but there is nothing like that in the code. It seems
that the updated code relies on the fact that this part of
do_backfill() only cares if .sparse is zero or not, and ...
if (ctx->sparse) {
CALLOC_ARRAY(info.pl, 1);
if (get_sparse_checkout_patterns(info.pl)) {
path_walk_info_clear(&info);
return error(_("problem loading sparse-checkout"));
}
}
... relies on get_sparse_checkout_patterns() not to do any harm when
the working tree is not sparsely checked out.
I am not sure if we want to call it "auto-detction". It looks more
like "default to --sparse, relying that --sparse is a no-op in a
non-sparse working tree" at least to me. Not that it is necessarily
wrong, and when people do "backfill" knowing that the working tree
is sparse, I am sympathetic if they prefer to keep the sparseness,
so such a change of default may be beneficial.
Derrick, what do you think?
> diff --git a/t/t5620-backfill.sh b/t/t5620-backfill.sh
> index 91b5115732..a1a8d736db 100755
> --- a/t/t5620-backfill.sh
> +++ b/t/t5620-backfill.sh
> @@ -149,6 +149,21 @@ test_expect_success 'backfill --sparse' '
> test_line_count = 0 missing
> '
>
> +test_expect_success 'backfill auto-detects sparse-checkout from config' '
> + git clone --sparse --filter=blob:none \
> + --single-branch --branch=main \
> + "file://$(pwd)/srv.bare" backfill-auto-sparse &&
> +
> + git -C backfill-auto-sparse rev-list --quiet --objects --missing=print HEAD >missing &&
> + test_line_count = 44 missing &&
> +
> + GIT_TRACE2_EVENT="$(pwd)/auto-sparse-trace" git \
> + -C backfill-auto-sparse backfill &&
> +
> + test_trace2_data promisor fetch_count 4 <auto-sparse-trace &&
> + test_trace2_data path-walk paths 5 <auto-sparse-trace
> +'
> +
> test_expect_success 'backfill --sparse without cone mode (positive)' '
> git clone --no-checkout --filter=blob:none \
> --single-branch --branch=main \
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GSoC PATCH] backfill: auto-detect sparse-checkout from config
2026-03-31 16:59 ` Junio C Hamano
@ 2026-04-01 19:31 ` Trieu Huynh
2026-04-02 14:24 ` Derrick Stolee
0 siblings, 1 reply; 5+ messages in thread
From: Trieu Huynh @ 2026-04-01 19:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Derrick Stolee
On Tue, Mar 31, 2026 at 09:59:10AM -0700, Junio C Hamano wrote:
> Trieu Huynh <vikingtc4@gmail.com> writes:
>
> > git backfill currently initializes the `sparse` field in
> > backfill_context to 0. This causes the command to always perform a
> > full backfill by default, even when the repository has sparse-checkout
> > enabled in its configuration (core.sparseCheckout).
> >
> > Because 'sparse' is explicitly set to 0 at initialization, any later
> > logic intended to auto-detect the setting from the repository
> > configuration becomes dead code, as it only triggers if the value
> > is negative (sentinel).
> >
> > Change the initial value of .sparse to -1. This allows the command
> > to correctly fallback to the repository's sparse-checkout settings
> > when the '--sparse' or '--no-sparse' options are not provided on the
> > command line.
>
> The author of bff45557 (backfill: add --sparse option, 2025-02-03),
> where this .sparse member originates, CC'ed for more intelligent
> input than my review can offer ;-)
>
>
> > Add a test case in t5620-backfill.sh to verify that 'git backfill'
> > automatically respects the sparse-checkout configuration without
> > explicit flags.
> >
> > Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
> > ---
> > builtin/backfill.c | 2 +-
> > t/t5620-backfill.sh | 15 +++++++++++++++
> > 2 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/builtin/backfill.c b/builtin/backfill.c
> > index 4b2db94173..0f31844ce7 100644
> > --- a/builtin/backfill.c
> > +++ b/builtin/backfill.c
> > @@ -124,7 +124,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
> > .repo = repo,
> > .current_batch = OID_ARRAY_INIT,
> > .min_batch_size = 50000,
> > - .sparse = 0,
> > + .sparse = -1,
> > .show_progress = -1,
> > };
> > struct option options[] = {
>
> I am a bit confused by this change. What's the difference between
> using -1 (which you picked) and 1 as the initial value for this
> member? From the proposed log message, I would have expected a new
> code that says "ah, we notice, from this member being -1, that the
> user did not specify --no-sparse or --sparse, so let's figure out if
> our working tree is sparsely checked out ourselves and set it either
> to 0 or to 1", but there is nothing like that in the code. It seems
> that the updated code relies on the fact that this part of
> do_backfill() only cares if .sparse is zero or not, and ...
>
> if (ctx->sparse) {
> CALLOC_ARRAY(info.pl, 1);
> if (get_sparse_checkout_patterns(info.pl)) {
> path_walk_info_clear(&info);
> return error(_("problem loading sparse-checkout"));
> }
> }
>
> ... relies on get_sparse_checkout_patterns() not to do any harm when
> the working tree is not sparsely checked out.
>
> I am not sure if we want to call it "auto-detction". It looks more
> like "default to --sparse, relying that --sparse is a no-op in a
> non-sparse working tree" at least to me. Not that it is necessarily
> wrong, and when people do "backfill" knowing that the working tree
> is sparse, I am sympathetic if they prefer to keep the sparseness,
> so such a change of default may be beneficial.
>
actually, the logic IIUC here is:
- first, ctx.sprase originally is set to 0.
- then, it check user's options. Assume, there is no option passed,
still 0.
- then, it check repo's config (core.sparseCheckout (default to 0 in
enviroment.c) but it doesn't since the guard:
if (ctx.sparse < 0)
ctx.sparse = cfg->apply_sparse_checkout;
- evenly. ctx.sparse still 0 eventhough in case the
core.sparseCheckout = 1 (git config core.sparseCheckout true)
IMHO, this change set the default value to -1, then it can fallback to
repo's config value if user has no-op passing (default to 0 (full
backfill if user doesnt intent to config previously either).
> Derrick, what do you think?
>
> > diff --git a/t/t5620-backfill.sh b/t/t5620-backfill.sh
> > index 91b5115732..a1a8d736db 100755
> > --- a/t/t5620-backfill.sh
> > +++ b/t/t5620-backfill.sh
> > @@ -149,6 +149,21 @@ test_expect_success 'backfill --sparse' '
> > test_line_count = 0 missing
> > '
> >
> > +test_expect_success 'backfill auto-detects sparse-checkout from config' '
> > + git clone --sparse --filter=blob:none \
> > + --single-branch --branch=main \
> > + "file://$(pwd)/srv.bare" backfill-auto-sparse &&
> > +
> > + git -C backfill-auto-sparse rev-list --quiet --objects --missing=print HEAD >missing &&
> > + test_line_count = 44 missing &&
> > +
> > + GIT_TRACE2_EVENT="$(pwd)/auto-sparse-trace" git \
> > + -C backfill-auto-sparse backfill &&
> > +
> > + test_trace2_data promisor fetch_count 4 <auto-sparse-trace &&
> > + test_trace2_data path-walk paths 5 <auto-sparse-trace
> > +'
> > +
> > test_expect_success 'backfill --sparse without cone mode (positive)' '
> > git clone --no-checkout --filter=blob:none \
> > --single-branch --branch=main \
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GSoC PATCH] backfill: auto-detect sparse-checkout from config
2026-04-01 19:31 ` Trieu Huynh
@ 2026-04-02 14:24 ` Derrick Stolee
2026-04-02 16:05 ` Junio C Hamano
0 siblings, 1 reply; 5+ messages in thread
From: Derrick Stolee @ 2026-04-02 14:24 UTC (permalink / raw)
To: Trieu Huynh, Junio C Hamano; +Cc: git
On 4/1/26 3:31 PM, Trieu Huynh wrote:
> On Tue, Mar 31, 2026 at 09:59:10AM -0700, Junio C Hamano wrote:
>>> diff --git a/builtin/backfill.c b/builtin/backfill.c
>>> index 4b2db94173..0f31844ce7 100644
>>> --- a/builtin/backfill.c
>>> +++ b/builtin/backfill.c
>>> @@ -124,7 +124,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
>>> .repo = repo,
>>> .current_batch = OID_ARRAY_INIT,
>>> .min_batch_size = 50000,
>>> - .sparse = 0,
>>> + .sparse = -1,
>>> .show_progress = -1,
>>> };
>>> struct option options[] = {
>>
>> I am a bit confused by this change. What's the difference between
>> using -1 (which you picked) and 1 as the initial value for this
>> member? From the proposed log message, I would have expected a new
>> code that says "ah, we notice, from this member being -1, that the
>> user did not specify --no-sparse or --sparse, so let's figure out if
>> our working tree is sparsely checked out ourselves and set it either
>> to 0 or to 1", but there is nothing like that in the code. It seems
>> that the updated code relies on the fact that this part of
>> do_backfill() only cares if .sparse is zero or not, and ...
>>
>> if (ctx->sparse) {
>> CALLOC_ARRAY(info.pl, 1);
>> if (get_sparse_checkout_patterns(info.pl)) {
>> path_walk_info_clear(&info);
>> return error(_("problem loading sparse-checkout"));
>> }
>> }
>>
>> ... relies on get_sparse_checkout_patterns() not to do any harm when
>> the working tree is not sparsely checked out.
>>
>> I am not sure if we want to call it "auto-detction". It looks more
>> like "default to --sparse, relying that --sparse is a no-op in a
>> non-sparse working tree" at least to me. Not that it is necessarily
>> wrong, and when people do "backfill" knowing that the working tree
>> is sparse, I am sympathetic if they prefer to keep the sparseness,
>> so such a change of default may be beneficial.
>>
> actually, the logic IIUC here is:
> - first, ctx.sprase originally is set to 0.
> - then, it check user's options. Assume, there is no option passed,
> still 0.
> - then, it check repo's config (core.sparseCheckout (default to 0 in
> enviroment.c) but it doesn't since the guard:
> if (ctx.sparse < 0)
> ctx.sparse = cfg->apply_sparse_checkout;
> - evenly. ctx.sparse still 0 eventhough in case the
> core.sparseCheckout = 1 (git config core.sparseCheckout true)
>
> IMHO, this change set the default value to -1, then it can fallback to
> repo's config value if user has no-op passing (default to 0 (full
> backfill if user doesnt intent to config previously either).
>> Derrick, what do you think?
Indeed, I thought this was how it already worked, as 85127bcdea
(backfill: assume --sparse when sparse-checkout is enabled,
2025-02-03) (introduced in [1]) should have covered.
[1]
https://lore.kernel.org/git/f22cf8b34851a3ba4cd6ab1d31f0835579143c40.1738602667.git.gitgitgadget@gmail.com/
However, that change did not include this test:
>>> +test_expect_success 'backfill auto-detects sparse-checkout from config' '
>>> + git clone --sparse --filter=blob:none \
>>> + --single-branch --branch=main \
>>> + "file://$(pwd)/srv.bare" backfill-auto-sparse &&
>>> +
>>> + git -C backfill-auto-sparse rev-list --quiet --objects --missing=print HEAD >missing &&
>>> + test_line_count = 44 missing &&
>>> +
>>> + GIT_TRACE2_EVENT="$(pwd)/auto-sparse-trace" git \
>>> + -C backfill-auto-sparse backfill &&
>>> +
>>> + test_trace2_data promisor fetch_count 4 <auto-sparse-trace &&
>>> + test_trace2_data path-walk paths 5 <auto-sparse-trace
>>> +'
This actually demonstrates the behavior that I was intending
to introduce in that change, but made an error in (1) not testing
it like I thought I had, and (2) not doing exactly what this change
does by having the .sparse option initialized as -1 (unset).
The code and commit message do a good job of identifying this bug
and difference in behavior. The only suggestion I have is to update
the commit message to point to my original commit that failed to
implement this behavior in the expected way.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GSoC PATCH] backfill: auto-detect sparse-checkout from config
2026-04-02 14:24 ` Derrick Stolee
@ 2026-04-02 16:05 ` Junio C Hamano
0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2026-04-02 16:05 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Trieu Huynh, git
Derrick Stolee <stolee@gmail.com> writes:
> On 4/1/26 3:31 PM, Trieu Huynh wrote:
>> On Tue, Mar 31, 2026 at 09:59:10AM -0700, Junio C Hamano wrote:
> ...
>>> I am a bit confused by this change. What's the difference between
>>> using -1 (which you picked) and 1 as the initial value for this
>>> member? From the proposed log message, I would have expected a new
>>> code that says "ah, we notice, from this member being -1, that the
>>> user did not specify --no-sparse or --sparse, so let's figure out if
>>> our working tree is sparsely checked out ourselves and set it either
>>> to 0 or to 1", but there is nothing like that in the code.
>> ...
>> IMHO, this change set the default value to -1, then it can fallback to
>> repo's config value if user has no-op passing (default to 0 (full
>> backfill if user doesnt intent to config previously either).
>>> Derrick, what do you think?
>
> Indeed, I thought this was how it already worked, as 85127bcdea
> (backfill: assume --sparse when sparse-checkout is enabled,
> 2025-02-03) (introduced in [1]) should have covered.
Ah, OK, in other words, the code to use how the repository is
configured when the user does not override from the command line was
already there; it was just the way the code checked if the user gave
something from the command line was wrong (i.e., initialized to 0,
pretending that '--no-foo" was given even when there isn't), and
that is why we do not see "ok, there is nothing on the command line,
so let's check the repository" _added_ by the patch---because it has
always been there. Makes sense.
> The code and commit message do a good job of identifying this bug
> and difference in behavior. The only suggestion I have is to update
> the commit message to point to my original commit that failed to
> implement this behavior in the expected way.
Sounds sensible and makes sense.
Thanks, both. Let's see a (hopefully small and final) reroll.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-02 16:05 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-31 11:25 [GSoC PATCH] backfill: auto-detect sparse-checkout from config Trieu Huynh
2026-03-31 16:59 ` Junio C Hamano
2026-04-01 19:31 ` Trieu Huynh
2026-04-02 14:24 ` Derrick Stolee
2026-04-02 16:05 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox