Git development
 help / color / mirror / Atom feed
From: Trieu Huynh <vikingtc4@gmail.com>
To: git@vger.kernel.org
Cc: Trieu Huynh <vikingtc4@gmail.com>
Subject: [RFC GSoC PATCH] backfill: skip downloading for empty batches
Date: Tue, 31 Mar 2026 21:12:04 +0900	[thread overview]
Message-ID: <20260331121204.787826-1-vikingtc4@gmail.com> (raw)

When git backfill finishes its object walk, it unconditionally calls
download_batch to process any remaining objects. If the repository
is already up-to-date (no missing objects found), this call still
performs an unnecessary directory scan via odb_reprepare.

Fix it by adding a check in do_backfill to ensure download_batch is only
called if the current batch actually contains objects (nr > 0).

To facilitate testing and provide better telemetry, add a trace2 data
event for batches_requested. This allows us to verify that no batches
are processed when the command is run on an up-to-date repository.

Add a test case in t5620-backfill.sh to ensure silence and efficiency
when no objects are missing.

Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
---
Need discussion:
1. Is adding trace2_data_intmax() the preferred way to verify this 
   behavior in our test suite, or should we rely on redirection of 
   stderr to check for progress messages when the progress option
   is supported?

 builtin/backfill.c  |  3 ++-
 t/t5620-backfill.sh | 16 ++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/builtin/backfill.c b/builtin/backfill.c
index 0f31844ce7..67f9f28daf 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -58,6 +58,7 @@ static void download_batch(struct backfill_context *ctx)
 	 */
 	odb_reprepare(ctx->repo->objects);
 	display_progress(ctx->progress, ++ctx->batches_requested);
+	trace2_data_intmax("backfill", ctx->repo, "batches_requested", ctx->batches_requested);
 }
 
 static int fill_missing_blobs(const char *path UNUSED,
@@ -109,7 +110,7 @@ static int do_backfill(struct backfill_context *ctx)
 	ret = walk_objects_by_path(&info);
 
 	/* Download the objects that did not fill a batch. */
-	if (!ret)
+	if ( (!ret) && (ctx->current_batch.nr > 0) )
 		download_batch(ctx);
 
 	path_walk_info_clear(&info);
diff --git a/t/t5620-backfill.sh b/t/t5620-backfill.sh
index a1a8d736db..d3cc4022bf 100755
--- a/t/t5620-backfill.sh
+++ b/t/t5620-backfill.sh
@@ -221,6 +221,22 @@ test_expect_success 'backfill --sparse without cone mode (negative)' '
 	test_line_count = 12 missing
 '
 
+test_expect_success 'backfill does not request batches when up-to-date' '
+	git clone --no-checkout --filter=blob:none \
+		--single-branch --branch=main \
+		"file://$(pwd)/srv.bare" backfill-up-to-date &&
+
+	# First trigger to have a full download
+	git -C backfill-up-to-date backfill &&
+
+	# Second trigger to verify when already have a full download previously
+	GIT_TRACE2_EVENT="$(pwd)/up-to-date-trace" git \
+		-C backfill-up-to-date backfill &&
+
+	# Verify no  batches_request occurr
+	test_grep ! "batches_requested" up-to-date-trace
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.43.0


             reply	other threads:[~2026-03-31 12:12 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-31 12:12 Trieu Huynh [this message]
2026-04-01 11:50 ` [RFC GSoC PATCH] backfill: skip downloading for empty batches Patrick Steinhardt
2026-04-01 19:44   ` Trieu Huynh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260331121204.787826-1-vikingtc4@gmail.com \
    --to=vikingtc4@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox