From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>,
Phillip Wood <phillip.wood123@gmail.com>,
Derrick Stolee <stolee@gmail.com>,
Elijah Newren <newren@gmail.com>,
Elijah Newren <newren@gmail.com>
Subject: [PATCH v3 1/4] promisor-remote: document caller filtering contract
Date: Thu, 14 May 2026 16:25:25 +0000 [thread overview]
Message-ID: <6ad11e2c28d9c3b3d5dcb8986b726d2d2db18cc3.1778775928.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2089.v3.git.1778775928.gitgitgadget@gmail.com>
From: Elijah Newren <newren@gmail.com>
promisor_remote_get_direct() does not, on its happy path, filter out
OIDs that are already present in the local object store: every OID
the caller supplies is written to the fetch subprocess's stdin and
ends up in the response pack. The only filtering it performs is in
remove_fetched_oids(), and that only runs after a fetch failure when
falling back to a different configured promisor remote.
Almost every existing caller already filters locally-present OIDs out
itself (typically with odb_read_object_info_extended() and
OBJECT_INFO_FOR_PREFETCH, or odb_has_object() with no fetch flag). But
the existing API comment does not state this expectation, so a new
caller is easy to write incorrectly (I missed this originally and wrote
two problematic callers). Omitting the filter still "works" in the
sense that the desired objects end up local, but it silently makes the
fetch request -- and the response pack -- larger than necessary,
defeating part of the point of batching.
Spell the contract out so future callers know to filter (and
deduplicate) themselves, and point them at the helpers they should
use to check local presence without accidentally triggering a lazy
fetch.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
promisor-remote.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/promisor-remote.h b/promisor-remote.h
index 3d4d2de018..301f5ac5cb 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -29,6 +29,17 @@ int repo_has_promisor_remote(struct repository *r);
* Fetches all requested objects from all promisor remotes, trying them one at
* a time until all objects are fetched.
*
+ * Callers are responsible for filtering out OIDs that are already present
+ * locally before calling this function: every supplied OID is sent in the
+ * fetch request, even if the object already exists in the local object
+ * store. (Only after a fetch failure does this function fall back to
+ * stripping already-present OIDs from the list before trying the next
+ * configured promisor remote.) Callers should also deduplicate the OIDs.
+ *
+ * To test for local presence without triggering a lazy fetch (which would
+ * defeat the purpose of batching), use odb_has_object(..., 0) or
+ * odb_read_object_info_extended() with OBJECT_INFO_FOR_PREFETCH.
+ *
* If oid_nr is 0, this function returns immediately.
*/
void promisor_remote_get_direct(struct repository *repo,
--
gitgitgadget
next prev parent reply other threads:[~2026-05-14 16:25 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-16 22:48 [PATCH 0/3] Batch prefetching Elijah Newren via GitGitGadget
2026-04-16 22:48 ` [PATCH 1/3] patch-ids.h: add missing trailing parenthesis in documentation comment Elijah Newren via GitGitGadget
2026-04-16 22:48 ` [PATCH 2/3] builtin/log: prefetch necessary blobs for `git cherry` Elijah Newren via GitGitGadget
2026-04-17 21:42 ` Junio C Hamano
2026-04-17 22:02 ` Elijah Newren
2026-04-16 22:48 ` [PATCH 3/3] grep: prefetch necessary blobs Elijah Newren via GitGitGadget
2026-04-18 0:32 ` [PATCH v2 0/3] Batch prefetching Elijah Newren via GitGitGadget
2026-04-18 0:32 ` [PATCH v2 1/3] patch-ids.h: add missing trailing parenthesis in documentation comment Elijah Newren via GitGitGadget
2026-04-18 0:32 ` [PATCH v2 2/3] builtin/log: prefetch necessary blobs for `git cherry` Elijah Newren via GitGitGadget
2026-04-19 14:04 ` Phillip Wood
2026-04-21 21:28 ` Elijah Newren
2026-04-23 15:15 ` Phillip Wood
2026-04-23 17:38 ` Elijah Newren
2026-04-27 13:16 ` Derrick Stolee
2026-05-11 2:51 ` Junio C Hamano
2026-05-11 17:45 ` Elijah Newren
2026-05-13 23:17 ` Elijah Newren
2026-04-18 0:32 ` [PATCH v2 3/3] grep: prefetch necessary blobs Elijah Newren via GitGitGadget
2026-04-27 12:59 ` Derrick Stolee
2026-05-13 19:21 ` Elijah Newren
2026-05-14 16:25 ` [PATCH v3 0/4] Batch prefetching Elijah Newren via GitGitGadget
2026-05-14 16:25 ` Elijah Newren via GitGitGadget [this message]
2026-05-14 16:25 ` [PATCH v3 2/4] patch-ids.h: add missing trailing parenthesis in documentation comment Elijah Newren via GitGitGadget
2026-05-14 16:25 ` [PATCH v3 3/4] builtin/log: prefetch necessary blobs for `git cherry` Elijah Newren via GitGitGadget
2026-05-14 16:25 ` [PATCH v3 4/4] grep: prefetch necessary blobs Elijah Newren via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6ad11e2c28d9c3b3d5dcb8986b726d2d2db18cc3.1778775928.git.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
--cc=phillip.wood123@gmail.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox