From: Trieu Huynh <vikingtc4@gmail.com>
To: git@vger.kernel.org
Cc: Trieu Huynh <vikingtc4@gmail.com>
Subject: [RFC GSoC PATCH] backfill: skip downloading for empty batches
Date: Tue, 31 Mar 2026 21:12:04 +0900
Message-ID: <20260331121204.787826-1-vikingtc4@gmail.com>
When git backfill finishes its object walk, it unconditionally calls
download_batch() to flush any remaining objects. If no missing objects
were found (the repository is already up to date), the batch is empty,
yet the call still performs an unnecessary directory scan via
odb_reprepare().

Fix this by calling download_batch() from do_backfill() only when the
current batch actually contains objects (nr > 0).

To make this testable and to provide better telemetry, emit a trace2
data event for batches_requested each time a batch is downloaded. Its
absence lets us verify that no batches are processed when the command
runs on an up-to-date repository.

Add a test case to t5620-backfill.sh that runs backfill twice and checks
that the second run requests no batches.

Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
---
Needs discussion:
1. Is adding trace2_data_intmax() the preferred way to verify this
behavior in our test suite, or should the test instead redirect
stderr and check for progress messages when the progress option
is supported?
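
For reviewers who want to reason about the change in isolation, here is
a minimal standalone sketch of the flush logic. It is a simplified
model, not the code in builtin/backfill.c: struct model_ctx,
model_download_batch(), model_backfill() and the guard parameter are
hypothetical names made up for illustration, and the per-object
batching is an assumption about how batches fill up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct backfill_context; not git's definition. */
struct model_ctx {
	size_t batch_nr;          /* objects queued in the current batch */
	size_t min_batch_size;    /* flush threshold */
	size_t batches_requested; /* number of simulated downloads */
};

/* Models download_batch(): empties the batch and counts the request. */
static void model_download_batch(struct model_ctx *ctx)
{
	ctx->batch_nr = 0;
	ctx->batches_requested++;
}

/*
 * Queue n_missing objects, flushing whenever the batch is full, then
 * handle the leftover batch. With guard == 0 the final flush is
 * unconditional (behavior before the patch); otherwise it is skipped
 * when the batch is empty (behavior with the patch).
 */
static size_t model_backfill(size_t n_missing, size_t min_batch_size,
			     int guard)
{
	struct model_ctx ctx = { 0, min_batch_size, 0 };
	size_t i;

	assert(min_batch_size > 0);
	for (i = 0; i < n_missing; i++) {
		if (++ctx.batch_nr >= ctx.min_batch_size)
			model_download_batch(&ctx);
	}
	if (!guard || ctx.batch_nr > 0)
		model_download_batch(&ctx);
	return ctx.batches_requested;
}
```

In this model, model_backfill(0, 4, 0) counts one request while
model_backfill(0, 4, 1) counts none, which is the difference the new
test looks for in the trace output.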
builtin/backfill.c | 3 ++-
t/t5620-backfill.sh | 16 ++++++++++++++++
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/builtin/backfill.c b/builtin/backfill.c
index 0f31844ce7..67f9f28daf 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -58,6 +58,7 @@ static void download_batch(struct backfill_context *ctx)
*/
odb_reprepare(ctx->repo->objects);
display_progress(ctx->progress, ++ctx->batches_requested);
+ trace2_data_intmax("backfill", ctx->repo, "batches_requested", ctx->batches_requested);
}
static int fill_missing_blobs(const char *path UNUSED,
@@ -109,7 +110,7 @@ static int do_backfill(struct backfill_context *ctx)
ret = walk_objects_by_path(&info);
/* Download the objects that did not fill a batch. */
- if (!ret)
+ if (!ret && ctx->current_batch.nr > 0)
download_batch(ctx);
path_walk_info_clear(&info);
diff --git a/t/t5620-backfill.sh b/t/t5620-backfill.sh
index a1a8d736db..d3cc4022bf 100755
--- a/t/t5620-backfill.sh
+++ b/t/t5620-backfill.sh
@@ -221,6 +221,22 @@ test_expect_success 'backfill --sparse without cone mode (negative)' '
test_line_count = 12 missing
'
+test_expect_success 'backfill does not request batches when up-to-date' '
+ git clone --no-checkout --filter=blob:none \
+ --single-branch --branch=main \
+ "file://$(pwd)/srv.bare" backfill-up-to-date &&
+
+ # First run: download all missing blobs
+ git -C backfill-up-to-date backfill &&
+
+ # Second run: nothing is missing, so nothing should be requested
+ GIT_TRACE2_EVENT="$(pwd)/up-to-date-trace" git \
+ -C backfill-up-to-date backfill &&
+
+ # Verify that no batches_requested event was emitted
+ test_grep ! "batches_requested" up-to-date-trace
+'
+
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
--
2.43.0