From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: tom.saeger@oracle.com, gitster@pobox.com,
sunshine@sunshineco.com, Derrick Stolee <stolee@gmail.com>,
Josh Steadmon <steadmon@google.com>,
Emily Shaffer <emilyshaffer@google.com>,
Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v3 0/3] Maintenance: adapt custom refspecs
Date: Sat, 10 Apr 2021 02:03:42 +0000 [thread overview]
Message-ID: <pull.924.v3.git.1618020225.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.924.v2.git.1617734870.gitgitgadget@gmail.com>
Tom Saeger rightly pointed out [1] that the prefetch task ignores custom
refspecs. This can lead to downloading more data than requested, and it
doesn't even help the future foreground fetches that use that custom
refspec.
[1]
https://lore.kernel.org/git/20210401184914.qmr7jhjbhp2mt3h6@dhcp-10-154-148-175.vpn.oracle.com/
This series fixes this problem by carefully replacing the start of each
refspec's destination with "refs/prefetch/". If the destination already
starts with "refs/", then that is replaced. Otherwise "refs/prefetch/" is
just prepended.
This happens inside of git fetch when a --prefetch option is given. This
allows us to maniuplate a struct refspec_item instead of a full refspec
string. It also simplifies our logic in testing the prefetch task.
Update in V3
============
* The fix is almost completely rewritten as an update to 'git fetch'. See
the new PATCH 2 for this update.
* There was some discussion of rewriting test_subcommand, but that can be
delayed until a proper solution is found to complications around softer
matches.
Updates in V2
=============
Thanks for the close eye on this series. I appreciate the recommendations,
which I believe I have responded to them all:
* Fixed typos.
* Made refspec_item_format() re-entrant. Consumers must free the buffer.
* Cleaned up style (quoting and tabbing).
Thanks, -Stolee
Derrick Stolee (3):
maintenance: simplify prefetch logic
fetch: add --prefetch option
maintenance: use 'git fetch --prefetch'
Documentation/fetch-options.txt | 5 +++
Documentation/git-maintenance.txt | 6 ++--
builtin/fetch.c | 56 +++++++++++++++++++++++++++++++
builtin/gc.c | 36 +++++---------------
t/t5582-fetch-negative-refspec.sh | 30 +++++++++++++++++
t/t7900-maintenance.sh | 14 ++++----
6 files changed, 109 insertions(+), 38 deletions(-)
base-commit: 89b43f80a514aee58b662ad606e6352e03eaeee4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-924%2Fderrickstolee%2Fmaintenance%2Frefspec-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-924/derrickstolee/maintenance/refspec-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/924
Range-diff vs v2:
1: 5aa0cb06c3f2 = 1: 4c0e983ba56f maintenance: simplify prefetch logic
2: d58a3e042ee8 < -: ------------ test-lib: use exact match for test_subcommand
3: 96388d949b98 < -: ------------ refspec: output a refspec item
4: bf296282323a < -: ------------ test-tool: test refspec input/output
-: ------------ > 2: 7f488eea6dbd fetch: add --prefetch option
5: 9592224e3d42 ! 3: ed055d772452 maintenance: allow custom refspecs during prefetch
@@ Metadata
Author: Derrick Stolee <dstolee@microsoft.com>
## Commit message ##
- maintenance: allow custom refspecs during prefetch
+ maintenance: use 'git fetch --prefetch'
- The prefetch task previously used the default refspec source plus a
- custom refspec destination to avoid colliding with remote refs:
+ The 'prefetch' maintenance task previously forced the following refspec
+ for each remote:
+refs/heads/*:refs/prefetch/<remote>/*
- However, some users customize their refspec to reduce how much data they
- download from specific remotes. This can involve restrictive patterns
- for fetching or negative patterns to avoid downloading some refs.
+ If a user has specified a more strict refspec for the remote, then this
+ prefetch task downloads more objects than necessary.
- Modify fetch_remote() to iterate over the remote's refspec list and
- translate that into the appropriate prefetch scenario. Specifically,
- re-parse the raw form of the refspec into a new 'struct refspec' and
- modify the 'dst' member to replace a leading "refs/" substring with
- "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
- Negative refspecs do not have a 'dst' so they can be transferred to the
- 'git fetch' command unmodified.
-
- This prefix change provides the benefit of keeping whatever collisions
- may exist in the custom refspecs, if that is a desirable outcome.
-
- This changes the names of the refs that would be fetched by the default
- refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
- to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
- is not a seriously breaking one: these refs are intended to be hidden
- and not used.
+ The previous change introduced the '--prefetch' option to 'git fetch'
+ which manipulates the remote's refspec to place all resulting refs into
+ refs/prefetch/, with further partitioning based on the destinations of
+ those refspecs.
Update the documentation to be more generic about the destination refs.
- Do not mention custom refpecs explicitly, as that does not need to be
+ Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
- refs/prefetch remains.
+ refs/prefetch/ remains.
Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
## Documentation/git-maintenance.txt ##
-@@ Documentation/git-maintenance.txt: prefetch::
+@@ Documentation/git-maintenance.txt: commit-graph::
+ prefetch::
+ The `prefetch` task updates the object directory with the latest
objects from all registered remotes. For each remote, a `git fetch`
- command is run. The refmap is custom to avoid updating local or remote
- branches (those in `refs/heads` or `refs/remotes`). Instead, the
+- command is run. The refmap is custom to avoid updating local or remote
+- branches (those in `refs/heads` or `refs/remotes`). Instead, the
- remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
- not updated.
-+ refs are stored in `refs/prefetch/`. Also, tags are not updated.
++ command is run. The configured refspec is modified to place all
++ requested refs within `refs/prefetch/`. Also, tags are not updated.
+
This is done to avoid disrupting the remote-tracking branches. The end users
expect these refs to stay unmoved unless they initiate a fetch. With prefetch
## builtin/gc.c ##
-@@
- #include "remote.h"
- #include "object-store.h"
- #include "exec-cmd.h"
-+#include "refspec.h"
-
- #define FAILED_RUN "failed to run %s"
-
@@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
- {
- struct maintenance_run_opts *opts = cbdata;
struct child_process child = CHILD_PROCESS_INIT;
-+ int i;
child.git_cmd = 1;
- strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
-@@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
+- strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
++ strvec_pushl(&child.args, "fetch", remote->name,
++ "--prefetch", "--prune", "--no-tags",
+ "--no-write-fetch-head", "--recurse-submodules=no",
+- "--refmap=", NULL);
++ NULL);
+
if (opts->quiet)
strvec_push(&child.args, "--quiet");
- strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
-+ for (i = 0; i < remote->fetch.nr; i++) {
-+ struct refspec_item replace;
-+ struct refspec_item *rsi = &remote->fetch.items[i];
-+ struct strbuf new_dst = STRBUF_INIT;
-+ size_t ignore_len = 0;
-+ char *replace_string;
-+
-+ if (rsi->negative) {
-+ strvec_push(&child.args, remote->fetch.raw[i]);
-+ continue;
-+ }
-+
-+ refspec_item_init(&replace, remote->fetch.raw[i], 1);
-+
-+ /*
-+ * If a refspec dst starts with "refs/" at the start,
-+ * then we will replace "refs/" with "refs/prefetch/".
-+ * Otherwise, we will prepend the dst string with
-+ * "refs/prefetch/".
-+ */
-+ if (!strncmp(replace.dst, "refs/", 5))
-+ ignore_len = 5;
-+
-+ strbuf_addstr(&new_dst, "refs/prefetch/");
-+ strbuf_addstr(&new_dst, replace.dst + ignore_len);
-+ free(replace.dst);
-+ replace.dst = strbuf_detach(&new_dst, NULL);
-+
-+ replace_string = refspec_item_format(&replace);
-+ strvec_push(&child.args, replace_string);
-+ free(replace_string);
-+
-+ refspec_item_clear(&replace);
-+ }
-
+-
return !!run_command(&child);
}
+
## t/t7900-maintenance.sh ##
@@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
+ test_commit -C clone1 one &&
test_commit -C clone2 two &&
GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
- fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-- test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
-- test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
-+ test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote1/*" <run-prefetch.txt &&
-+ test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <run-prefetch.txt &&
+- fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
+- test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
+- test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
++ fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
++ test_subcommand git fetch remote1 $fetchargs <run-prefetch.txt &&
++ test_subcommand git fetch remote2 $fetchargs <run-prefetch.txt &&
test_path_is_missing .git/refs/remotes &&
- git log prefetch/remote1/one &&
- git log prefetch/remote2/two &&
@@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
test_cmp_config refs/prefetch/ log.excludedecoration &&
git log --oneline --decorate --all >log &&
- ! grep "prefetch" log
- '
-
-+test_expect_success 'prefetch custom refspecs' '
-+ git -C clone1 branch -f special/fetched HEAD &&
-+ git -C clone1 branch -f special/secret/not-fetched HEAD &&
-+
-+ # create multiple refspecs for remote1
-+ git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
-+ git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
-+
-+ GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
-+
-+ fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-+
-+ # skips second refspec because it is not a pattern type
-+ rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
-+ rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
-+ rs3="^refs/heads/special/secret/not-fetched" &&
-+
-+ test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
-+ test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&
-+
-+ # first refspec is overridden by second
-+ test_must_fail git rev-parse refs/prefetch/special/fetched &&
-+ git rev-parse refs/prefetch/heads/fetched &&
-+
-+ # possible incorrect places for the non-fetched ref
-+ test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
-+ test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
-+ test_must_fail git rev-parse refs/heads/secret/not-fetched &&
-+ test_must_fail git rev-parse refs/heads/not-fetched
-+'
-+
- test_expect_success 'prefetch and existing log.excludeDecoration values' '
- git config --unset-all log.excludeDecoration &&
- git config log.excludeDecoration refs/remotes/remote1/ &&
--
gitgitgadget
next prev parent reply other threads:[~2021-04-10 2:04 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-05 17:01 ` Tom Saeger
2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-05 17:31 ` Eric Sunshine
2021-04-05 17:43 ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-05 16:57 ` Tom Saeger
2021-04-05 17:40 ` Eric Sunshine
2021-04-05 17:44 ` Junio C Hamano
2021-04-06 11:21 ` Derrick Stolee
2021-04-06 15:23 ` Eric Sunshine
2021-04-06 16:51 ` Derrick Stolee
2021-04-07 8:46 ` Ævar Arnfjörð Bjarmason
2021-04-07 20:53 ` Derrick Stolee
2021-04-07 22:05 ` Ævar Arnfjörð Bjarmason
2021-04-07 22:49 ` Junio C Hamano
2021-04-07 23:01 ` Ævar Arnfjörð Bjarmason
2021-04-08 7:33 ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-05 17:52 ` Eric Sunshine
2021-04-06 11:13 ` Derrick Stolee
2021-04-07 8:54 ` Ævar Arnfjörð Bjarmason
2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-05 17:16 ` Tom Saeger
2021-04-06 11:15 ` Derrick Stolee
2021-04-07 8:53 ` Ævar Arnfjörð Bjarmason
2021-04-07 10:26 ` Ævar Arnfjörð Bjarmason
2021-04-09 11:48 ` Derrick Stolee
2021-04-09 19:28 ` Ævar Arnfjörð Bjarmason
2021-04-10 0:56 ` Derrick Stolee
2021-04-10 11:37 ` Ævar Arnfjörð Bjarmason
2021-04-07 13:47 ` Ævar Arnfjörð Bjarmason
2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-06 18:47 ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-07 23:23 ` Emily Shaffer
2021-04-09 19:00 ` Derrick Stolee
2021-04-06 18:47 ` [PATCH v2 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-06 18:47 ` [PATCH v2 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-06 18:47 ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-07 23:08 ` Josh Steadmon
2021-04-07 23:26 ` Emily Shaffer
2021-04-06 18:47 ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-06 19:36 ` Tom Saeger
2021-04-06 19:45 ` Derrick Stolee
2021-04-07 23:09 ` Josh Steadmon
2021-04-07 23:37 ` Emily Shaffer
2021-04-08 0:23 ` Jonathan Tan
2021-04-10 2:03 ` Derrick Stolee via GitGitGadget [this message]
2021-04-10 2:03 ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-12 20:13 ` Tom Saeger
2021-04-12 20:27 ` Derrick Stolee
2021-04-10 2:03 ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-11 21:09 ` Ramsay Jones
2021-04-12 20:23 ` Derrick Stolee
2021-04-10 2:03 ` [PATCH v3 3/3] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-11 1:35 ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
2021-04-12 16:48 ` Tom Saeger
2021-04-12 17:24 ` Tom Saeger
2021-04-12 17:41 ` Tom Saeger
2021-04-12 20:25 ` Derrick Stolee
2021-04-16 12:49 ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
2021-04-16 12:49 ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-16 18:02 ` Tom Saeger
2021-04-16 12:49 ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-16 17:52 ` Tom Saeger
2021-04-16 18:26 ` Tom Saeger
2021-04-16 12:49 ` [PATCH v4 3/4] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-16 12:49 ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
2021-04-16 13:54 ` Ævar Arnfjörð Bjarmason
2021-04-16 14:33 ` Tom Saeger
2021-04-16 18:31 ` Tom Saeger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.924.v3.git.1618020225.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=derrickstolee@github.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=steadmon@google.com \
--cc=stolee@gmail.com \
--cc=sunshine@sunshineco.com \
--cc=tom.saeger@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.