* [GSoC PATCH v2] backfill: add --[no-]progress option
@ 2026-04-12 19:36 Trieu Huynh
2026-04-12 19:46 ` Derrick Stolee
2026-04-15 17:04 ` Tian Yuchen
0 siblings, 2 replies; 5+ messages in thread
From: Trieu Huynh @ 2026-04-12 19:36 UTC (permalink / raw)
To: stolee, gitster; +Cc: git, Trieu Huynh
From: Trieu Huynh <vikingtc4@gmail.com>
'git backfill' does not show an overall progress bar across
batches, giving no cross-batch feedback during potentially
long-running operations on large repositories. By contrast,
'git fetch', 'git gc', and 'git index-pack' all support
--[no-]progress.
Add a --[no-]progress option that tracks the total number of
missing blobs downloaded across all batches, defaulting to
showing progress when stderr is a terminal (matching the
behaviour of 'git fetch').
Add tests to verify that:
- progress is shown by default on a TTY
- --progress forces output regardless of TTY
- --no-progress suppresses output
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
---
builtin/backfill.c | 18 +++++++++++++++++-
t/t5620-backfill.sh | 24 ++++++++++++++++++++++++
2 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/builtin/backfill.c b/builtin/backfill.c
index d794dd842f..e90c899071 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -26,7 +26,7 @@
#include "path-walk.h"
static const char * const builtin_backfill_usage[] = {
- N_("git backfill [--min-batch-size=<n>] [--[no-]sparse]"),
+ N_("git backfill [--min-batch-size=<n>] [--[no-]sparse] [--[no-]progress]"),
NULL
};
@@ -36,6 +36,9 @@ struct backfill_context {
size_t min_batch_size;
int sparse;
struct rev_info revs;
+ int show_progress;
+ size_t nr_downloaded;
+ struct progress *progress;
};
static void backfill_context_clear(struct backfill_context *ctx)
@@ -48,6 +51,7 @@ static void download_batch(struct backfill_context *ctx)
promisor_remote_get_direct(ctx->repo,
ctx->current_batch.oid,
ctx->current_batch.nr);
+ ctx->nr_downloaded += ctx->current_batch.nr;
oid_array_clear(&ctx->current_batch);
/*
@@ -55,6 +59,7 @@ static void download_batch(struct backfill_context *ctx)
* avoid possible duplicate downloads of the same objects.
*/
odb_reprepare(ctx->repo->objects);
+ display_progress(ctx->progress, ctx->nr_downloaded);
}
static int fill_missing_blobs(const char *path UNUSED,
@@ -121,12 +126,16 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
.min_batch_size = 50000,
.sparse = -1,
.revs = REV_INFO_INIT,
+ .nr_downloaded = 0,
+ .show_progress = -1,
};
struct option options[] = {
OPT_UNSIGNED(0, "min-batch-size", &ctx.min_batch_size,
N_("Minimum number of objects to request at a time")),
OPT_BOOL(0, "sparse", &ctx.sparse,
N_("Restrict the missing objects to the current sparse-checkout")),
+ OPT_BOOL(0, "progress", &ctx.show_progress,
+ N_("show progress while downloading missing objects")),
OPT_END(),
};
struct repo_config_values *cfg = repo_config_values(the_repository);
@@ -150,7 +159,14 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
if (ctx.sparse < 0)
ctx.sparse = cfg->apply_sparse_checkout;
+ if (ctx.show_progress < 0)
+ ctx.show_progress = isatty(2);
+
+ if (ctx.show_progress)
+ ctx.progress = start_progress(ctx.repo,
+ _("Downloading missing blobs"), 0);
result = do_backfill(&ctx);
+ stop_progress(&ctx.progress);
backfill_context_clear(&ctx);
release_revisions(&ctx.revs);
return result;
diff --git a/t/t5620-backfill.sh b/t/t5620-backfill.sh
index f3b5e39493..a75b84d8ac 100755
--- a/t/t5620-backfill.sh
+++ b/t/t5620-backfill.sh
@@ -133,6 +133,30 @@ test_expect_success 'do partial clone 2, backfill min batch size' '
test_line_count = 0 revs2
'
+test_expect_success TTY 'backfill shows progress on tty by default' '
+ git clone --no-checkout --filter=blob:none \
+ --single-branch --branch=main \
+ "file://$(pwd)/srv.bare" clone-tty &&
+ test_terminal env GIT_PROGRESS_DELAY=0 git -C clone-tty backfill 2>err &&
+ test_grep "Downloading missing blobs" err
+'
+
+test_expect_success 'backfill --progress shows progress' '
+ git clone --no-checkout --filter=blob:none \
+ --single-branch --branch=main \
+ "file://$(pwd)/srv.bare" clone-progress &&
+ git -C clone-progress backfill --progress 2>err &&
+ test_grep "Downloading missing blobs" err
+'
+
+test_expect_success 'backfill --no-progress suppresses progress' '
+ git clone --no-checkout --filter=blob:none \
+ --single-branch --branch=main \
+ "file://$(pwd)/srv.bare" clone-no-progress &&
+ git -C clone-no-progress backfill --no-progress 2>err &&
+ test_grep ! "Downloading missing blobs" err
+'
+
test_expect_success 'backfill --sparse without sparse-checkout fails' '
git init not-sparse &&
test_must_fail git -C not-sparse backfill --sparse 2>err &&
--
2.43.0
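The default described in the commit message (show progress only when stderr is a terminal, unless the user chose explicitly) can be sketched in plain shell. This is a minimal illustration, not the patch's code: `resolve_show_progress` is a hypothetical helper, and the values -1, 0 and 1 mirror the patch's `show_progress` field (unset, --no-progress, --progress).

```shell
#!/bin/sh
# Hypothetical helper (not part of Git) mirroring the patch's tri-state:
#   -1 = option not given, 0 = --no-progress, 1 = --progress.
resolve_show_progress () {
	case "$1" in
	-1)
		# Not specified: show progress only when stderr is a terminal,
		# the same idea as the patch's isatty(2) check.
		if test -t 2
		then
			echo 1
		else
			echo 0
		fi
		;;
	0|1)
		# An explicit choice from the user always wins.
		echo "$1"
		;;
	*)
		echo "invalid value: $1" >&2
		return 1
		;;
	esac
}
```

For example, `resolve_show_progress -1 2>/dev/null` takes the non-terminal branch because stderr is redirected away from the terminal.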
* Re: [GSoC PATCH v2] backfill: add --[no-]progress option
From: Derrick Stolee @ 2026-04-12 19:46 UTC (permalink / raw)
To: Trieu Huynh, gitster; +Cc: git
On 4/12/26 3:36 PM, Trieu Huynh wrote:
> From: Trieu Huynh <vikingtc4@gmail.com>
>
> 'git backfill' does not show an overall progress bar across
> batches, giving no cross-batch feedback during potentially
> long-running operations on large repositories. By contrast,
> 'git fetch', 'git gc', and 'git index-pack' all support
> --[no-]progress.
>
> Add a --[no-]progress option that tracks the total number of
> missing blobs downloaded across all batches, defaulting to
> showing progress when stderr is a terminal (matching the
> behaviour of 'git fetch').
>
> Add tests to verify that:
> - progress is shown by default on a TTY
> - --progress forces output regardless of TTY
> - --no-progress suppresses output
I think the tests do show an improvement, but we're missing
the interaction with the underlying fetch's progress
indicators. I don't see any mention of how your backfill
progress indicators will work with or against the fetch's
progress from the remote and index-pack steps.
Further, if a user supplies 'git backfill --no-progress'
then they are probably saying "I don't want any progress
indicators" and that would signal also that the fetch should
be quiet. This is perhaps the key detail that makes your
current version unable to move forward. It creates an
implication that it doesn't follow through on.
One way to go about this is to hide the 'git fetch' output
entirely by passing '--quiet' unconditionally from the
backfill command. But this may also be too much for users
who want to watch the download statistics from the remote.
Perhaps a robust set of options that lets everything
interact cleanly would be the following:
1. Add a --[no-]verbose option that is off by default. The
implementation sends the --quiet flag to 'git fetch' if
--verbose isn't provided by the user. This reduces the
noise for the default user.
2. Add a --[no-]progress option as you've provided here.
The complexity at the end is about what happens when the
user provides both --verbose and --progress, which is the
situation that this patch is currently in. How do the
progress indicators mingle with the verbose fetch output?
Thanks,
-Stolee
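One possible reading of the two-option proposal above, written as a small decision function. This is only a sketch: `plan_output` is a hypothetical name, its arguments are already-resolved 0/1 values, and the rule that --no-progress silences the fetch even when --verbose is given is an assumption based on the remark that --no-progress signals that the fetch should be quiet too.

```shell
#!/bin/sh
# plan_output VERBOSE PROGRESS   (hypothetical helper, not part of Git)
#   VERBOSE:  1 if --verbose was given, 0 otherwise (off by default)
#   PROGRESS: 1 if backfill progress is enabled, 0 if --no-progress
plan_output () {
	verbose=$1
	progress=$2
	# Default: pass --quiet to the underlying fetch.
	fetch_quiet=1
	# Assumption: fetch output is shown only when the user asked for
	# --verbose and did not disable progress.
	if test "$verbose" = 1 && test "$progress" = 1
	then
		fetch_quiet=0
	fi
	echo "fetch_quiet=$fetch_quiet backfill_progress=$progress"
}
```

Under these assumptions, `plan_output 1 1` is the only combination in which fetch output and the backfill counter appear together, which is the case the open question above is about.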
* Re: [GSoC PATCH v2] backfill: add --[no-]progress option
From: Trieu Huynh @ 2026-04-13 19:02 UTC (permalink / raw)
To: Derrick Stolee; +Cc: gitster, git
On Sun, Apr 12, 2026 at 03:46:17PM -0400, Derrick Stolee wrote:
> On 4/12/26 3:36 PM, Trieu Huynh wrote:
> > From: Trieu Huynh <vikingtc4@gmail.com>
> >
> > 'git backfill' does not show an overall progress bar across
> > batches, giving no cross-batch feedback during potentially
> > long-running operations on large repositories. By contrast,
> > 'git fetch', 'git gc', and 'git index-pack' all support
> > --[no-]progress.
> >
> > Add a --[no-]progress option that tracks the total number of
> > missing blobs downloaded across all batches, defaulting to
> > showing progress when stderr is a terminal (matching the
> > behaviour of 'git fetch').
> >
> > Add tests to verify that:
> > - progress is shown by default on a TTY
> > - --progress forces output regardless of TTY
> > - --no-progress suppresses output
>
> I think the tests do show an improvement, but we're missing
> the interaction with the underlying fetch's progress
> indicators. I don't see any mention of how your backfill
> progress indicators will work with or against the fetch's
> progress from the remote and index-pack steps.
Actually, I forgot to include this in the changelog; see below:
As-is:
remote: Enumerating objects: 7391, done.
remote: Counting objects: 100% (293/293), done.
remote: Compressing objects: 100% (162/162), done.
remote: Total 7391 (delta 249), reused 131 (delta 131), pack-reused 7098 (from 1)
Receiving objects: 100% (7391/7391), 4.09 MiB | 10.20 MiB/s, done.
Resolving deltas: 100% (5617/5617), done.
To-be:
remote: Enumerating objects: 7391, done.
remote: Counting objects: 100% (293/293), done.
remote: Compressing objects: 100% (162/162), done.
remote: Total 7391 (delta 249), reused 131 (delta 131), pack-reused 7098 (from 1)
Receiving objects: 100% (7391/7391), 4.09 MiB | 6.46 MiB/s, done.
Resolving deltas: 100% (5618/5618), done.
Downloading missing blobs: 157594, done.
>
> Further, if a user supplies 'git backfill --no-progress'
> then they are probably saying "I don't want any progress
> indicators" and that would signal also that the fetch should
> be quiet. This is perhaps the key detail that makes your
> current version unable to move forward. It creates an
> implication that it doesn't follow through on.
Thank you for pointing this out.
You are right that the current patch does not address the interaction
between the backfill progress bar and the underlying fetch's own output
(remote counting/compressing objects, index-pack, etc.). Leaving both
active at the same time would produce interleaved and confusing output,
which is worse than no progress at all.
>
> One way to go about this is to hide the 'git fetch' output
> entirely by passing '--quiet' unconditionally from the
> backfill command. But this may also be too much for users
> who want to watch the download statistics from the remote.
>
> Perhaps a robust set of options that lets everything
> interact cleanly would be the following:
>
> 1. Add a --[no-]verbose option that is off by default. The
> implementation sends the --quiet flag to 'git fetch' if
> --verbose isn't provided by the user. This reduces the
> noise for the default user.
>
> 2. Add a --[no-]progress option as you've provided here.
Makes sense to me; I will implement it that way in v3.
>
> The complexity at the end is about what happens when the
> user provides both --verbose and --progress, which is the
> situation that this patch is currently in. How do the
> progress indicators mingle with the verbose fetch output?
IIUC, the fetch output for each batch completes before the progress
counter updates, so the two do not actually interleave. The
"Downloading missing blobs" counter updates in place via carriage return
during the run, showing the partial count until the run is done, and
prints the final "done." line only at the end. For example:
remote: Enumerating objects: 50106, done.
remote: Counting objects: 100% (780/780), done.
...
Receiving objects: 100% (50106/50106), done.
remote: Enumerating objects: 50096, done.
...
Receiving objects: 100% (50096/50096), done.
Downloading missing blobs: 157594, done.
So --verbose and --progress together produce readable output without
any special handling needed.
Does that direction sound reasonable to you?
>
> Thanks,
> -Stolee
>
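The ordering described in this reply (each batch's fetch output finishes first, then the cumulative counter advances) can be illustrated with a pure simulation. Nothing here calls Git: `simulate_backfill_progress` is a hypothetical function, the batch sizes are passed in by hand, and intermediate counts are printed on separate lines rather than rewritten in place with a carriage return.

```shell
#!/bin/sh
# simulate_backfill_progress BATCH_SIZE...   (hypothetical, no Git calls)
# Prints the running total after each batch, then a final "done." line.
simulate_backfill_progress () {
	total=0
	for batch_size in "$@"
	do
		# A real run would print this batch's fetch output here,
		# before the counter is updated.
		total=$((total + batch_size))
		echo "Downloading missing blobs: $total"
	done
	echo "Downloading missing blobs: $total, done."
}
```

Calling it with the two batch sizes from the example, `simulate_backfill_progress 50106 50096`, prints two intermediate totals followed by the final line.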
* Re: [GSoC PATCH v2] backfill: add --[no-]progress option
From: Derrick Stolee @ 2026-04-15 18:28 UTC (permalink / raw)
To: Trieu Huynh; +Cc: gitster, git
On 4/13/2026 3:02 PM, Trieu Huynh wrote:
> On Sun, Apr 12, 2026 at 03:46:17PM -0400, Derrick Stolee wrote:
>> On 4/12/26 3:36 PM, Trieu Huynh wrote:
>>> From: Trieu Huynh <vikingtc4@gmail.com>
>>>
>>> 'git backfill' does not show an overall progress bar across
>>> batches, giving no cross-batch feedback during potentially
>>> long-running operations on large repositories. By contrast,
>>> 'git fetch', 'git gc', and 'git index-pack' all support
>>> --[no-]progress.
>>>
>>> Add a --[no-]progress option that tracks the total number of
>>> missing blobs downloaded across all batches, defaulting to
>>> showing progress when stderr is a terminal (matching the
>>> behaviour of 'git fetch').
>>>
>>> Add tests to verify that:
>>> - progress is shown by default on a TTY
>>> - --progress forces output regardless of TTY
>>> - --no-progress suppresses output
>>
>> I think the tests do show an improvement, but we're missing
>> the interaction with the underlying fetch's progress
>> indicators. I don't see any mention of how your backfill
>> progress indicators will work with or against the fetch's
>> progress from the remote and index-pack steps.
> Actually, I forgot to include this in the changelog; see below:
> As-is:
> remote: Enumerating objects: 7391, done.
> remote: Counting objects: 100% (293/293), done.
> remote: Compressing objects: 100% (162/162), done.
> remote: Total 7391 (delta 249), reused 131 (delta 131), pack-reused 7098 (from 1)
> Receiving objects: 100% (7391/7391), 4.09 MiB | 10.20 MiB/s, done.
> Resolving deltas: 100% (5617/5617), done.
>
> To-be:
> remote: Enumerating objects: 7391, done.
> remote: Counting objects: 100% (293/293), done.
> remote: Compressing objects: 100% (162/162), done.
> remote: Total 7391 (delta 249), reused 131 (delta 131), pack-reused 7098 (from 1)
> Receiving objects: 100% (7391/7391), 4.09 MiB | 6.46 MiB/s, done.
> Resolving deltas: 100% (5618/5618), done.
> Downloading missing blobs: 157594, done.
These examples are nice, but only for one batch of objects.
You'll need to test with a smaller batch size or a larger
repo to get the output I'm looking for.
>> The complexity at the end is about what happens when the
>> user provides both --verbose and --progress, which is the
>> situation that this patch is currently in. How do the
>> progress indicators mingle with the verbose fetch output?
> IIUC, the fetch output for each batch completes before the progress
> counter updates, so the two do not actually interleave. The
> "Downloading missing blobs" counter updates in place via carriage return
> during the run, showing the partial count until the run is done, and
> prints the final "done." line only at the end. For example:
>
> remote: Enumerating objects: 50106, done.
> remote: Counting objects: 100% (780/780), done.
> ...
> Receiving objects: 100% (50106/50106), done.
> remote: Enumerating objects: 50096, done.
> ...
> Receiving objects: 100% (50096/50096), done.
> Downloading missing blobs: 157594, done.
>
> So --verbose and --progress together produce readable output without
> any special handling needed.
> Does that direction sound reasonable to you?
I think I'd like to see the full output for multiple batches,
and then I can decide if the progress indicators make sense
together or if they look confusing.
Thanks,
-Stolee
* Re: [GSoC PATCH v2] backfill: add --[no-]progress option
From: Tian Yuchen @ 2026-04-15 17:04 UTC (permalink / raw)
To: Trieu Huynh, stolee, gitster; +Cc: git
On 4/13/26 03:36, Trieu Huynh wrote:
> @@ -133,6 +133,30 @@ test_expect_success 'do partial clone 2, backfill min batch size' '
> test_line_count = 0 revs2
> '
>
> +test_expect_success TTY 'backfill shows progress on tty by default' '
> + git clone --no-checkout --filter=blob:none \
> + --single-branch --branch=main \
[1]
> + "file://$(pwd)/srv.bare" clone-tty &&
> + test_terminal env GIT_PROGRESS_DELAY=0 git -C clone-tty backfill 2>err &&
> + test_grep "Downloading missing blobs" err
> +'
> +
> +test_expect_success 'backfill --progress shows progress' '
> + git clone --no-checkout --filter=blob:none \
> + --single-branch --branch=main \
[1]
> + "file://$(pwd)/srv.bare" clone-progress &&
> + git -C clone-progress backfill --progress 2>err &&
> + test_grep "Downloading missing blobs" err
> +'
> +
> +test_expect_success 'backfill --no-progress suppresses progress' '
> + git clone --no-checkout --filter=blob:none \
> + --single-branch --branch=main \
[1]
> + "file://$(pwd)/srv.bare" clone-no-progress &&
> + git -C clone-no-progress backfill --no-progress 2>err &&
> + test_grep ! "Downloading missing blobs" err
[2]
> +'
> +
> test_expect_success 'backfill --sparse without sparse-checkout fails' '
> git init not-sparse &&
> test_must_fail git -C not-sparse backfill --sparse 2>err &&
[1] I reckon you can reuse the git-cloned repository; there’s no need to
clone in every test. It's up to you ;-)
[2] You mentioned that you want the test script to verify that
'--no-progress suppresses output', but are you referring to the output
produced by the '--progress' option itself, or *all* output?
I believe the second scenario is a bit more meaningful. If that is the
case, then matching on 'Downloading missing blobs' is a necessary
but insufficient check. The output from the internal
call to 'git fetch' within 'git backfill' will not be matched, so the
test can pass even when fetch output is still printed.
Regards, Yuchen
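A stricter check along the lines of point [2] could look for fetch-style progress lines as well as the backfill counter. This is a sketch in plain POSIX shell rather than the test suite's helpers: `stderr_is_quiet` is a hypothetical function, and the patterns are taken from the sample output quoted earlier in the thread, so they are illustrative rather than exhaustive.

```shell
#!/bin/sh
# stderr_is_quiet FILE   (hypothetical helper, not part of Git's test-lib)
# Succeeds only if FILE contains neither the backfill counter nor any of
# the fetch progress lines seen in the thread's sample output.
stderr_is_quiet () {
	! grep -E -q \
		'Downloading missing blobs|^remote: |Receiving objects|Resolving deltas' \
		"$1"
}
```

If the intent really is that --no-progress produces no stderr output at all, asserting that the captured file is empty (for instance with the test suite's `test_must_be_empty`) would be simpler and stricter than any pattern list.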