From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>
Subject: [PATCH 0/3] Batch prefetching
Date: Thu, 16 Apr 2026 22:48:11 +0000
Message-ID: <pull.2089.git.1776379694.gitgitgadget@gmail.com>

Partial clones provide a trade-off for users: avoid downloading blobs
upfront, at the expense of needing to download them later as they run other
commands. This trade-off can sometimes incur a more severe cost than
expected, particularly if needed blobs are discovered as they are accessed,
resulting in downloading blobs one at a time. Some commands, like checkout,
diff, and merge, do batch prefetches of necessary blobs, since that can
dramatically reduce the pain of on-demand loading. Extend this ability to
two more commands: cherry and grep.

This series was spurred by a report where git cherry jobs were each doing
hundreds of single-blob fetches, at a cost of 3s each. Batching those
downloads should dramatically speed up their jobs. (And I decided to fix up
git grep similarly while at it.)
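
To illustrate the difference in round trips, here is a toy model in C. It
is only a sketch and not the code from this series: fake_fetch(),
struct fetch_stats, and the string IDs are hypothetical stand-ins for the
real transport and object IDs, and it assumes each request to the remote
counts as one round trip regardless of how many blobs it carries.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters; not a Git data structure. */
struct fetch_stats {
	size_t requests; /* round trips to the remote */
	size_t objects;  /* blobs transferred */
};

/* Hypothetical transport: one call models one round trip. */
static void fake_fetch(struct fetch_stats *st, const char **ids, size_t nr)
{
	(void)ids;
	if (!nr)
		return;
	st->requests++;
	st->objects += nr;
}

/*
 * On-demand loading: every missing blob triggers its own request.
 * (Toy model: it does not track blobs that became local after a fetch.)
 */
static void fetch_one_at_a_time(struct fetch_stats *st,
				const char **ids, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		fake_fetch(st, &ids[i], 1);
}

/*
 * Batch prefetch: collect the unique IDs first, then issue one request.
 * Returns 0 on success, -1 if memory allocation fails.
 */
static int fetch_batched(struct fetch_stats *st,
			 const char **ids, size_t nr)
{
	const char **batch;
	size_t batch_nr = 0;

	if (!nr)
		return 0;
	batch = malloc(nr * sizeof(*batch));
	if (!batch)
		return -1;

	for (size_t i = 0; i < nr; i++) {
		size_t j;

		/* Linear duplicate check; fine for a sketch. */
		for (j = 0; j < batch_nr; j++)
			if (!strcmp(batch[j], ids[i]))
				break;
		if (j == batch_nr)
			batch[batch_nr++] = ids[i];
	}

	fake_fetch(st, batch, batch_nr);
	free(batch);
	return 0;
}
```

Under this model, N missing blobs cost N requests when loaded on demand
and a single request when batched, so at roughly 3s per request the
per-request overhead shrinks from about 3N seconds to about 3 seconds
(ignoring the time to transfer the blob contents themselves).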

I'll note that git backfill with revisions and/or pathspecs could also
improve things for these users. However, backfill is a command users would
have to run manually, and it requires them to figure out which data is
needed (a challenge in the case of cherry), so it still makes sense to
provide smarter behavior for folks who don't choose to run backfill.

Also, correct a documentation typo I noticed in patch-ids.h (related to code
I was using for the git cherry fixes) as a preparatory fixup.

Elijah Newren (3):
  patch-ids.h: add missing trailing parenthesis in documentation comment
  builtin/log: prefetch necessary blobs for `git cherry`
  grep: prefetch necessary blobs

 builtin/grep.c                                | 142 ++++++++++++
 builtin/log.c                                 | 125 +++++++++++
 investigations/cherry-prefetch-design-spec.md | 210 ++++++++++++++++++
 patch-ids.h                                   |   2 +-
 t/t3500-cherry.sh                             |  18 ++
 t/t7810-grep.sh                               |  35 +++
 6 files changed, 531 insertions(+), 1 deletion(-)
 create mode 100644 investigations/cherry-prefetch-design-spec.md

base-commit: 9f223ef1c026d91c7ac68cc0211bde255dda6199
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2089%2Fnewren%2Fbatch-prefetching-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2089/newren/batch-prefetching-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2089
--
gitgitgadget

Thread overview: 25+ messages
2026-04-16 22:48 Elijah Newren via GitGitGadget [this message]
2026-04-16 22:48 ` [PATCH 1/3] patch-ids.h: add missing trailing parenthesis in documentation comment Elijah Newren via GitGitGadget
2026-04-16 22:48 ` [PATCH 2/3] builtin/log: prefetch necessary blobs for `git cherry` Elijah Newren via GitGitGadget
2026-04-17 21:42 ` Junio C Hamano
2026-04-17 22:02 ` Elijah Newren
2026-04-16 22:48 ` [PATCH 3/3] grep: prefetch necessary blobs Elijah Newren via GitGitGadget
2026-04-18 0:32 ` [PATCH v2 0/3] Batch prefetching Elijah Newren via GitGitGadget
2026-04-18 0:32 ` [PATCH v2 1/3] patch-ids.h: add missing trailing parenthesis in documentation comment Elijah Newren via GitGitGadget
2026-04-18 0:32 ` [PATCH v2 2/3] builtin/log: prefetch necessary blobs for `git cherry` Elijah Newren via GitGitGadget
2026-04-19 14:04 ` Phillip Wood
2026-04-21 21:28 ` Elijah Newren
2026-04-23 15:15 ` Phillip Wood
2026-04-23 17:38 ` Elijah Newren
2026-04-27 13:16 ` Derrick Stolee
2026-05-11 2:51 ` Junio C Hamano
2026-05-11 17:45 ` Elijah Newren
2026-05-13 23:17 ` Elijah Newren
2026-04-18 0:32 ` [PATCH v2 3/3] grep: prefetch necessary blobs Elijah Newren via GitGitGadget
2026-04-27 12:59 ` Derrick Stolee
2026-05-13 19:21 ` Elijah Newren
2026-05-14 16:25 ` [PATCH v3 0/4] Batch prefetching Elijah Newren via GitGitGadget
2026-05-14 16:25 ` [PATCH v3 1/4] promisor-remote: document caller filtering contract Elijah Newren via GitGitGadget
2026-05-14 16:25 ` [PATCH v3 2/4] patch-ids.h: add missing trailing parenthesis in documentation comment Elijah Newren via GitGitGadget
2026-05-14 16:25 ` [PATCH v3 3/4] builtin/log: prefetch necessary blobs for `git cherry` Elijah Newren via GitGitGadget
2026-05-14 16:25 ` [PATCH v3 4/4] grep: prefetch necessary blobs Elijah Newren via GitGitGadget