Git development
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Trieu Huynh <vikingtc4@gmail.com>, Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [GSoC PATCH] backfill: auto-detect sparse-checkout from config
Date: Thu, 2 Apr 2026 10:24:45 -0400	[thread overview]
Message-ID: <b7164e46-0521-4c0c-984e-35fc1891e4bd@gmail.com> (raw)
In-Reply-To: <buisigjsw3zrcy6bqaic2zefypq37kimju32eufquppsvkgkvx@cqd3cwj6an6t>

On 4/1/26 3:31 PM, Trieu Huynh wrote:
> On Tue, Mar 31, 2026 at 09:59:10AM -0700, Junio C Hamano wrote:

>>> diff --git a/builtin/backfill.c b/builtin/backfill.c
>>> index 4b2db94173..0f31844ce7 100644
>>> --- a/builtin/backfill.c
>>> +++ b/builtin/backfill.c
>>> @@ -124,7 +124,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
>>>   		.repo = repo,
>>>   		.current_batch = OID_ARRAY_INIT,
>>>   		.min_batch_size = 50000,
>>> -		.sparse = 0,
>>> +		.sparse = -1,
>>>   		.show_progress = -1,
>>>   	};
>>>   	struct option options[] = {
>>
>> I am a bit confused by this change.  What's the difference between
>> using -1 (which you picked) and 1 as the initial value for this
>> member?  From the proposed log message, I would have expected a new
>> code that says "ah, we notice, from this member being -1, that the
>> user did not specify --no-sparse or --sparse, so let's figure out if
>> our working tree is sparsely checked out ourselves and set it either
>> to 0 or to 1", but there is nothing like that in the code.  It seems
>> that the updated code relies on the fact that this part of
>> do_backfill() only cares if .sparse is zero or not, and ...
>>
>> 	if (ctx->sparse) {
>> 		CALLOC_ARRAY(info.pl, 1);
>> 		if (get_sparse_checkout_patterns(info.pl)) {
>> 			path_walk_info_clear(&info);
>> 			return error(_("problem loading sparse-checkout"));
>> 		}
>> 	}
>>
>> ... relies on get_sparse_checkout_patterns() not to do any harm when
>> the working tree is not sparsely checked out.
>>
>> I am not sure if we want to call it "auto-detction".  It looks more
>> like "default to --sparse, relying that --sparse is a no-op in a
>> non-sparse working tree" at least to me.  Not that it is necessarily
>> wrong, and when people do "backfill" knowing that the working tree
>> is sparse, I am sympathetic if they prefer to keep the sparseness,
>> so such a change of default may be beneficial.
>>
> actually, the logic IIUC here is:
> - first, ctx.sprase originally is set to 0.
> - then, it check user's options. Assume, there is no option passed,
> still 0.
> - then, it check repo's config (core.sparseCheckout (default to 0 in
> enviroment.c) but it doesn't since the guard:
> 	if (ctx.sparse < 0)
> 		ctx.sparse = cfg->apply_sparse_checkout;
> - evenly. ctx.sparse still 0 eventhough in case the
> core.sparseCheckout = 1 (git config core.sparseCheckout true)
> 
> IMHO, this change set the default value to -1, then it can fallback to
> repo's config value if user has no-op passing (default to 0 (full
> backfill if user doesnt intent to config previously either).
>> Derrick, what do you think?

Indeed, I thought this was how it already worked, as 85127bcdea
(backfill: assume --sparse when sparse-checkout is enabled,
2025-02-03) (introduced in [1]) should have covered.

[1] 
https://lore.kernel.org/git/f22cf8b34851a3ba4cd6ab1d31f0835579143c40.1738602667.git.gitgitgadget@gmail.com/

However, that change did not include this test:

>>> +test_expect_success 'backfill auto-detects sparse-checkout from config' '
>>> +	git clone --sparse --filter=blob:none \
>>> +		--single-branch --branch=main \
>>> +		"file://$(pwd)/srv.bare" backfill-auto-sparse &&
>>> +
>>> +	git -C backfill-auto-sparse rev-list --quiet --objects --missing=print HEAD >missing &&
>>> +	test_line_count = 44 missing &&
>>> +
>>> +	GIT_TRACE2_EVENT="$(pwd)/auto-sparse-trace" git \
>>> +		-C backfill-auto-sparse backfill &&
>>> +
>>> +	test_trace2_data promisor fetch_count 4 <auto-sparse-trace &&
>>> +	test_trace2_data path-walk paths 5 <auto-sparse-trace
>>> +'

This actually demonstrates the behavior that I was intending
to introduce in that change, but made an error in (1) not testing
it like I thought I had, and (2) not doing exactly what this change
does by having the .sparse option initialized as -1 (unset).

The code and commit message do a good job of identifying this bug
and difference in behavior. The only suggestion I have is to update
the commit message to point to my original commit that failed to
implement this behavior in the expected way.

Thanks,
-Stolee




  reply	other threads:[~2026-04-02 14:24 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-31 11:25 [GSoC PATCH] backfill: auto-detect sparse-checkout from config Trieu Huynh
2026-03-31 16:59 ` Junio C Hamano
2026-04-01 19:31   ` Trieu Huynh
2026-04-02 14:24     ` Derrick Stolee [this message]
2026-04-02 16:05       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b7164e46-0521-4c0c-984e-35fc1891e4bd@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=vikingtc4@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox