git.vger.kernel.org archive mirror
* [PATCH] remote: prefetch config
@ 2024-09-04 17:59 Shubham Kanodia via GitGitGadget
  2024-09-04 20:03 ` Derrick Stolee
  2024-09-05  2:52 ` [PATCH v2] " Shubham Kanodia via GitGitGadget
  0 siblings, 2 replies; 16+ messages in thread
From: Shubham Kanodia via GitGitGadget @ 2024-09-04 17:59 UTC
  To: git
  Cc: Patrick Steinhardt [ ], Junio C Hamano [ ], Shubham Kanodia,
	Shubham Kanodia

From: Shubham Kanodia <shubham.kanodia10@gmail.com>

Large repositories often contain numerous branches and refs, many of
which individual users may not need. This commit introduces a new
configuration option (`remote.<remote>.prefetch`) to allow
users to specify which remotes to prefetch during
the maintenance task.

Key behaviors:
1. If `remote.<remote>.prefetch` is unset or true, running
   `git-maintenance` will prefetch all refs for the remote.
2. If `remote.<remote>.prefetch` is set to false, the remote
   will be ignored for prefetching.

In a future change, we could also allow restricting the refs that are
prefetched per remote using the `prefetchref` config option per remote.

Both of these options in unison would allow users to optimize their
prefetch operations, reducing network traffic and disk usage.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
    remote: prefetch config

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1779%2Fpastelsky%2Fsk%2Fmaintenance-prefetch-remote-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1779/pastelsky/sk/maintenance-prefetch-remote-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1779

 Documentation/config/remote.txt   |  5 +++
 Documentation/git-maintenance.txt |  7 +++--
 builtin/gc.c                      |  3 ++
 remote.c                          |  3 ++
 remote.h                          |  9 ++++++
 t/t7900-maintenance.sh            | 51 +++++++++++++++++++++++++++++++
 6 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 8efc53e836d..c2b3876192c 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -33,6 +33,11 @@ remote.<name>.fetch::
 	The default set of "refspec" for linkgit:git-fetch[1]. See
 	linkgit:git-fetch[1].
 
+remote.<name>.prefetch::
+	If false, refs from the remote would not be prefetched for
+	the prefetch task in linkgit:git-maintenance[1]. If not set,
+	the value is assumed to be true.
+
 remote.<name>.push::
 	The default set of "refspec" for linkgit:git-push[1]. See
 	linkgit:git-push[1].
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 51d0f7e94b6..2fd38706ea2 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -97,9 +97,10 @@ commit-graph::
 
 prefetch::
 	The `prefetch` task updates the object directory with the latest
-	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The configured refspec is modified to place all
-	requested refs within `refs/prefetch/`. Also, tags are not updated.
+	objects from all registered remotes unless they've disabled prefetch
+	using `remote.<remote>.prefetch` set to `false`. For each such remote,
+	a `git fetch` command is run. The configured refspec is modified to place
+	all requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  However,
diff --git a/builtin/gc.c b/builtin/gc.c
index 427faf1cfe1..88b8d80aff6 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1027,6 +1027,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	if (remote->skip_default_update)
 		return 0;
 
+	if (remote->prefetch == 0)
+		return 0;
+
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name,
 		     "--prefetch", "--prune", "--no-tags",
diff --git a/remote.c b/remote.c
index 8f3dee13186..05edb3a5f40 100644
--- a/remote.c
+++ b/remote.c
@@ -140,6 +140,7 @@ static struct remote *make_remote(struct remote_state *remote_state,
 	CALLOC_ARRAY(ret, 1);
 	ret->prune = -1;  /* unspecified */
 	ret->prune_tags = -1;  /* unspecified */
+	ret->prefetch = -1;  /* unspecified */
 	ret->name = xstrndup(name, len);
 	refspec_init(&ret->push, REFSPEC_PUSH);
 	refspec_init(&ret->fetch, REFSPEC_FETCH);
@@ -456,6 +457,8 @@ static int handle_config(const char *key, const char *value,
 		remote->prune = git_config_bool(key, value);
 	else if (!strcmp(subkey, "prunetags"))
 		remote->prune_tags = git_config_bool(key, value);
+	else if (!strcmp(subkey, "prefetch"))
+		remote->prefetch = git_config_bool(key, value);
 	else if (!strcmp(subkey, "url")) {
 		if (!value)
 			return config_error_nonbool(key);
diff --git a/remote.h b/remote.h
index b901b56746d..57d21a7bfe7 100644
--- a/remote.h
+++ b/remote.h
@@ -77,6 +77,15 @@ struct remote {
 
 	struct refspec fetch;
 
+	/*
+	 * This setting for whether to prefetch from a remote
+	 * when a fetch is invoked with a prefetch flag.
+	 *  -1 = unset
+	 *   0 = don't prefetch from this remote
+	 *   1 = prefetch from this remote
+	 */
+	int prefetch;
+
 	/*
 	 * The setting for whether to fetch tags (as a separate rule from the
 	 * configured refspecs);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index abae7a97546..7bc349ec546 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -245,6 +245,57 @@ test_expect_success 'prefetch multiple remotes' '
 	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
 '
 
+test_expect_success 'prefetch respects remote.*.prefetch config' '
+	test_create_repo prefetch-test-config &&
+	(
+		cd prefetch-test-config &&
+		test_commit initial &&
+		test_create_repo clone1 &&
+		test_create_repo clone2 &&
+		test_create_repo clone3 &&
+
+		git remote add remote1 "file://$(pwd)/clone1" &&
+		git remote add remote2 "file://$(pwd)/clone2" &&
+		git remote add remote3 "file://$(pwd)/clone3" &&
+
+		git config remote.remote1.prefetch false &&
+		git config remote.remote2.prefetch true &&
+		# remote3 is left unset
+
+		# Make changes in all clones
+		git -C clone1 switch -c one &&
+		git -C clone2 switch -c two &&
+		git -C clone3 switch -c three &&
+		test_commit -C clone1 one &&
+		test_commit -C clone2 two &&
+		test_commit -C clone3 three &&
+
+		# Run maintenance prefetch task
+		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+		# Check that remote1 was not fetched (prefetch=false)
+		test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
+			--no-write-fetch-head --recurse-submodules=no --quiet \
+			<prefetch.txt &&
+
+		# Check that remote2 was fetched (prefetch=true)
+		test_subcommand git fetch remote2 --prefetch --prune --no-tags \
+			--no-write-fetch-head --recurse-submodules=no --quiet \
+			<prefetch.txt &&
+
+		# Check that remote3 was fetched (prefetch unset, default to true)
+		test_subcommand git fetch remote3 --prefetch --prune --no-tags \
+			--no-write-fetch-head --recurse-submodules=no --quiet \
+			<prefetch.txt &&
+
+		# Verify that changes are in the prefetch refs for remote2 and remote3, but not remote1
+		test_must_fail git rev-parse refs/prefetch/remotes/remote1/one &&
+		git fetch --all &&
+		test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
+		test_cmp_rev refs/remotes/remote3/three refs/prefetch/remotes/remote3/three
+	)
+'
+
 test_expect_success 'loose-objects task' '
 	# Repack everything so we know the state of the object dir
 	git repack -adk &&

base-commit: 2e7b89e038c0c888acf61f1b4ee5a43d4dd5e94c
-- 
gitgitgadget


* Re: [PATCH] remote: prefetch config
  2024-09-04 17:59 [PATCH] remote: prefetch config Shubham Kanodia via GitGitGadget
@ 2024-09-04 20:03 ` Derrick Stolee
  2024-09-04 20:55   ` Junio C Hamano
  2024-09-05  2:52 ` [PATCH v2] " Shubham Kanodia via GitGitGadget
  1 sibling, 1 reply; 16+ messages in thread
From: Derrick Stolee @ 2024-09-04 20:03 UTC
  To: Shubham Kanodia via GitGitGadget, git
  Cc: Patrick Steinhardt [ ], Junio C Hamano [ ], Shubham Kanodia

On 9/4/24 1:59 PM, Shubham Kanodia via GitGitGadget wrote:
> From: Shubham Kanodia <shubham.kanodia10@gmail.com>
> 
> Large repositories often contain numerous branches and refs, many of
> which individual users may not need. This commit introduces a new
> configuration option (`remote.<remote>.prefetch`) to allow
> users to specify which remotes to prefetch during
> the maintenance task.
> 
> Key behaviors:
> 1. If `remote.<remote>.prefetch` is unset or true, running
>     `git-maintenance` will prefetch all refs for the remote.
> 2. If `remote.<remote>.prefetch` is set to false, the remote
>     will be ignored for prefetching.

Thanks for this contribution. I think this is a good idea for extra
flexibility of the prefetch task.

> In a future change, we could also allow restricting the refs that are
> prefetched per remote using the `prefetchref` config option per remote.

I agree that this would also be of interest, but more complicated. Thanks
for starting with this simpler modification.

> +	if (remote->prefetch == 0)
> +		return 0;

In the Git codebase, this would normally be written as

	if (!remote->prefetch)
		return 0;

> +		# Run maintenance prefetch task
> +		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
> +
> +		# Check that remote1 was not fetched (prefetch=false)
> +		test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
> +			--no-write-fetch-head --recurse-submodules=no --quiet \
> +			<prefetch.txt &&

I'm happy to see this use of test_subcommand to validate the behavior
of this patch!

This is a very good patch and I only have the one style nit.

Thanks,
-Stolee


* Re: [PATCH] remote: prefetch config
  2024-09-04 20:03 ` Derrick Stolee
@ 2024-09-04 20:55   ` Junio C Hamano
  2024-09-05  2:08     ` Derrick Stolee
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2024-09-04 20:55 UTC
  To: Derrick Stolee
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Shubham Kanodia

Derrick Stolee <stolee@gmail.com> writes:

> I agree that this would also be of interest, but more complicated. Thanks
> for starting with this simpler modification.
>
>> +	if (remote->prefetch == 0)
>> +		return 0;
>
> In the Git codebase, this would normally be written as
>
> 	if (!remote->prefetch)
> 		return 0;

;-)

>> +		# Run maintenance prefetch task
>> +		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>> +
>> +		# Check that remote1 was not fetched (prefetch=false)
>> +		test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
>> +			--no-write-fetch-head --recurse-submodules=no --quiet \
>> +			<prefetch.txt &&
>
> I'm happy to see this use of test_subcommand to validate the behavior
> of this patch!

I found it a bit disturbing that the pattern is overly specific.

The only thing we are interested in is that we are not fetching from
remote1, so it _should_ suffice if we could write

	test_subcommand ! git fetch remote1 <prefetch.txt &&

to avoid being tied to how the current version of Git happens to
pass these command line option flags and the order it does so.

Looking at the implementation of test_subcommand, it seems that we
cannot quite do that (it assumes that the pattern it assembles out
of the parameters are to match the full argument list used in
invocation, enclosing them in a single [] pair and without giving
the caller an easy way to sneak wildcards like ".*" in), which is
sad.


So, the expected command line being too strict is *not* a fault of
this patch, and with the style fix, I think this half of the
solution is a good one.

Thanks.


* Re: [PATCH] remote: prefetch config
  2024-09-04 20:55   ` Junio C Hamano
@ 2024-09-05  2:08     ` Derrick Stolee
  2024-09-05  2:51       ` Shubham Kanodia
  2024-09-05 15:00       ` Junio C Hamano
  0 siblings, 2 replies; 16+ messages in thread
From: Derrick Stolee @ 2024-09-05  2:08 UTC
  To: Junio C Hamano
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Shubham Kanodia

On 9/4/24 4:55 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:

>>> +		# Run maintenance prefetch task
>>> +		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>>> +
>>> +		# Check that remote1 was not fetched (prefetch=false)
>>> +		test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
>>> +			--no-write-fetch-head --recurse-submodules=no --quiet \
>>> +			<prefetch.txt &&
>>
>> I'm happy to see this use of test_subcommand to validate the behavior
>> of this patch!
> 
> I found it a bit disturbing that the pattern is overly specific.
> 
> The only thing we are interested in is that we are not fetching from
> remote1, so it _should_ suffice if we could write
> 
> 	test_subcommand ! git fetch remote1 <prefetch.txt &&
> 
> to avoid being tied to how the current version of Git happens to
> pass these command line option flags and the order it does so.
> 
> Looking at the implementation of test_subcommand, it seems that we
> cannot quite do that (it assumes that the pattern it assembles out
> of the parameters are to match the full argument list used in
> invocation, enclosing them in a single [] pair and without giving
> the caller an easy way to sneak wildcards like ".*" in), which is
> sad.

I agree the ergonomics of the test_subcommand helper is a bit poor
(and not this patch author's fault). The trickiest part is the
negative case, as in this highlighted one. It's hard to read from
this if the subcommand wasn't found because the argument list is
too specific and doesn't match the exact arguments.

It helps that the same options are given for the other, positive
tests. But maybe that could be a hint as to how to make this test
a bit cleaner: make a variable describing the "uninteresting"
arguments. Something like...

	args="--prefetch --prune --no-tags \
  	      --no-write-fetch-head --recurse-submodules=no --quiet" &&

	test_subcommand ! git fetch remote1 $args <prefetch.txt &&
	test_subcommand git fetch remote2 $args <prefetch.txt &&
	test_subcommand git fetch remote3 $args <prefetch.txt &&

Thanks,
-Stolee



* Re: [PATCH] remote: prefetch config
  2024-09-05  2:08     ` Derrick Stolee
@ 2024-09-05  2:51       ` Shubham Kanodia
  2024-09-05 15:00       ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Shubham Kanodia @ 2024-09-05  2:51 UTC
  To: Derrick Stolee
  Cc: Junio C Hamano, Shubham Kanodia via GitGitGadget, git,
	Patrick Steinhardt [ ]

On Thu, Sep 5, 2024 at 7:38 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 9/4/24 4:55 PM, Junio C Hamano wrote:
> > Derrick Stolee <stolee@gmail.com> writes:
>
> >>> +           # Run maintenance prefetch task
> >>> +           GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
> >>> +
> >>> +           # Check that remote1 was not fetched (prefetch=false)
> >>> +           test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
> >>> +                   --no-write-fetch-head --recurse-submodules=no --quiet \
> >>> +                   <prefetch.txt &&
> >>
> >> I'm happy to see this use of test_subcommand to validate the behavior
> >> of this patch!
> >
> > I found it a bit disturbing that the pattern is overly specific.
> >
> > The only thing we are interested in is that we are not fetching from
> > remote1, so it _should_ suffice if we could write
> >
> >       test_subcommand ! git fetch remote1 <prefetch.txt &&
> >
> > to avoid being tied to how the current version of Git happens to
> > pass these command line option flags and the order it does so.
> >
> > Looking at the implementation of test_subcommand, it seems that we
> > cannot quite do that (it assumes that the pattern it assembles out
> > of the parameters are to match the full argument list used in
> > invocation, enclosing them in a single [] pair and without giving
> > the caller an easy way to sneak wildcards like ".*" in), which is
> > sad.
> I agree the ergonomics of the test_subcommand helper is a bit poor
> (and not this patch author's fault). The trickiest part is the
> negative case, as in this highlighted one. It's hard to read from
> this if the subcommand wasn't found because the argument list is
> too specific and doesn't match the exact arguments.
>
> It helps that the same options are given for the other, positive
> tests. But maybe that could be a hint as to how to make this test
> a bit cleaner: make a variable describing the "uninteresting"
> arguments. Something like...
>
>         args="--prefetch --prune --no-tags \
>               --no-write-fetch-head --recurse-submodules=no --quiet" &&
>
>         test_subcommand ! git fetch remote1 $args <prefetch.txt &&
>         test_subcommand git fetch remote2 $args <prefetch.txt &&
>         test_subcommand git fetch remote3 $args <prefetch.txt &&
>
> Thanks,
> -Stolee
>

Agree with both the suggestions here. Updated my patch.


* [PATCH v2] remote: prefetch config
  2024-09-04 17:59 [PATCH] remote: prefetch config Shubham Kanodia via GitGitGadget
  2024-09-04 20:03 ` Derrick Stolee
@ 2024-09-05  2:52 ` Shubham Kanodia via GitGitGadget
  2024-09-05 16:06   ` Junio C Hamano
  2024-09-05 16:45   ` [PATCH v3] " Shubham Kanodia via GitGitGadget
  1 sibling, 2 replies; 16+ messages in thread
From: Shubham Kanodia via GitGitGadget @ 2024-09-05  2:52 UTC
  To: git
  Cc: Patrick Steinhardt [ ], Junio C Hamano [ ], Derrick Stolee,
	Shubham Kanodia, Shubham Kanodia

From: Shubham Kanodia <shubham.kanodia10@gmail.com>

Large repositories often contain numerous branches and refs, many of
which individual users may not need. This commit introduces a new
configuration option (`remote.<remote>.prefetch`) to allow
users to specify which remotes to prefetch during
the maintenance task.

Key behaviors:
1. If `remote.<remote>.prefetch` is unset or true, running
   `git-maintenance` will prefetch all refs for the remote.
2. If `remote.<remote>.prefetch` is set to false, the remote
   will be ignored for prefetching.

In a future change, we could also allow restricting the refs that are
prefetched per remote using the `prefetchref` config option per remote.

Both of these options in unison would allow users to optimize their
prefetch operations, reducing network traffic and disk usage.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
    remote: prefetch config

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1779%2Fpastelsky%2Fsk%2Fmaintenance-prefetch-remote-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1779/pastelsky/sk/maintenance-prefetch-remote-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1779

Range-diff vs v1:

 1:  667553b114b ! 1:  1d58b782e22 remote: prefetch config
     @@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
       	if (remote->skip_default_update)
       		return 0;
       
     -+	if (remote->prefetch == 0)
     ++	if (!remote->prefetch)
      +		return 0;
      +
       	child.git_cmd = 1;
     @@ remote.h: struct remote {
       	struct refspec fetch;
       
      +	/*
     -+	 * This setting for whether to prefetch from a remote
     ++	 * The setting for whether to prefetch from a remote
      +	 * when a fetch is invoked with a prefetch flag.
      +	 *  -1 = unset
      +	 *   0 = don't prefetch from this remote
     @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
      +		# Run maintenance prefetch task
      +		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
      +
     -+		# Check that remote1 was not fetched (prefetch=false)
     -+		test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
     -+			--no-write-fetch-head --recurse-submodules=no --quiet \
     -+			<prefetch.txt &&
     -+
     -+		# Check that remote2 was fetched (prefetch=true)
     -+		test_subcommand git fetch remote2 --prefetch --prune --no-tags \
     -+			--no-write-fetch-head --recurse-submodules=no --quiet \
     -+			<prefetch.txt &&
     -+
     -+		# Check that remote3 was fetched (prefetch unset, default to true)
     -+		test_subcommand git fetch remote3 --prefetch --prune --no-tags \
     -+			--no-write-fetch-head --recurse-submodules=no --quiet \
     -+			<prefetch.txt &&
     ++		# Check that if remotes were prefetched properly
     ++		fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
     ++		test_subcommand ! git fetch remote1 $fetchargs <prefetch.txt &&
     ++		test_subcommand git fetch remote2 $fetchargs <prefetch.txt &&
     ++		test_subcommand git fetch remote3 $fetchargs <prefetch.txt &&
      +
      +		# Verify that changes are in the prefetch refs for remote2 and remote3, but not remote1
      +		test_must_fail git rev-parse refs/prefetch/remotes/remote1/one &&


 Documentation/config/remote.txt   |  5 ++++
 Documentation/git-maintenance.txt |  7 +++---
 builtin/gc.c                      |  3 +++
 remote.c                          |  3 +++
 remote.h                          |  9 +++++++
 t/t7900-maintenance.sh            | 42 +++++++++++++++++++++++++++++++
 6 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 8efc53e836d..c2b3876192c 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -33,6 +33,11 @@ remote.<name>.fetch::
 	The default set of "refspec" for linkgit:git-fetch[1]. See
 	linkgit:git-fetch[1].
 
+remote.<name>.prefetch::
+	If false, refs from the remote would not be prefetched for
+	the prefetch task in linkgit:git-maintenance[1]. If not set,
+	the value is assumed to be true.
+
 remote.<name>.push::
 	The default set of "refspec" for linkgit:git-push[1]. See
 	linkgit:git-push[1].
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 51d0f7e94b6..2fd38706ea2 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -97,9 +97,10 @@ commit-graph::
 
 prefetch::
 	The `prefetch` task updates the object directory with the latest
-	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The configured refspec is modified to place all
-	requested refs within `refs/prefetch/`. Also, tags are not updated.
+	objects from all registered remotes unless they've disabled prefetch
+	using `remote.<remote>.prefetch` set to `false`. For each such remote,
+	a `git fetch` command is run. The configured refspec is modified to place
+	all requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  However,
diff --git a/builtin/gc.c b/builtin/gc.c
index 427faf1cfe1..2ca3a3e7d6a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1027,6 +1027,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	if (remote->skip_default_update)
 		return 0;
 
+	if (!remote->prefetch)
+		return 0;
+
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name,
 		     "--prefetch", "--prune", "--no-tags",
diff --git a/remote.c b/remote.c
index 8f3dee13186..05edb3a5f40 100644
--- a/remote.c
+++ b/remote.c
@@ -140,6 +140,7 @@ static struct remote *make_remote(struct remote_state *remote_state,
 	CALLOC_ARRAY(ret, 1);
 	ret->prune = -1;  /* unspecified */
 	ret->prune_tags = -1;  /* unspecified */
+	ret->prefetch = -1;  /* unspecified */
 	ret->name = xstrndup(name, len);
 	refspec_init(&ret->push, REFSPEC_PUSH);
 	refspec_init(&ret->fetch, REFSPEC_FETCH);
@@ -456,6 +457,8 @@ static int handle_config(const char *key, const char *value,
 		remote->prune = git_config_bool(key, value);
 	else if (!strcmp(subkey, "prunetags"))
 		remote->prune_tags = git_config_bool(key, value);
+	else if (!strcmp(subkey, "prefetch"))
+		remote->prefetch = git_config_bool(key, value);
 	else if (!strcmp(subkey, "url")) {
 		if (!value)
 			return config_error_nonbool(key);
diff --git a/remote.h b/remote.h
index b901b56746d..4522fdec354 100644
--- a/remote.h
+++ b/remote.h
@@ -77,6 +77,15 @@ struct remote {
 
 	struct refspec fetch;
 
+	/*
+	 * The setting for whether to prefetch from a remote
+	 * when a fetch is invoked with a prefetch flag.
+	 *  -1 = unset
+	 *   0 = don't prefetch from this remote
+	 *   1 = prefetch from this remote
+	 */
+	int prefetch;
+
 	/*
 	 * The setting for whether to fetch tags (as a separate rule from the
 	 * configured refspecs);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index abae7a97546..7484e1f1d46 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -245,6 +245,48 @@ test_expect_success 'prefetch multiple remotes' '
 	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
 '
 
+test_expect_success 'prefetch respects remote.*.prefetch config' '
+	test_create_repo prefetch-test-config &&
+	(
+		cd prefetch-test-config &&
+		test_commit initial &&
+		test_create_repo clone1 &&
+		test_create_repo clone2 &&
+		test_create_repo clone3 &&
+
+		git remote add remote1 "file://$(pwd)/clone1" &&
+		git remote add remote2 "file://$(pwd)/clone2" &&
+		git remote add remote3 "file://$(pwd)/clone3" &&
+
+		git config remote.remote1.prefetch false &&
+		git config remote.remote2.prefetch true &&
+		# remote3 is left unset
+
+		# Make changes in all clones
+		git -C clone1 switch -c one &&
+		git -C clone2 switch -c two &&
+		git -C clone3 switch -c three &&
+		test_commit -C clone1 one &&
+		test_commit -C clone2 two &&
+		test_commit -C clone3 three &&
+
+		# Run maintenance prefetch task
+		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+		# Check that if remotes were prefetched properly
+		fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
+		test_subcommand ! git fetch remote1 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote2 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote3 $fetchargs <prefetch.txt &&
+
+		# Verify that changes are in the prefetch refs for remote2 and remote3, but not remote1
+		test_must_fail git rev-parse refs/prefetch/remotes/remote1/one &&
+		git fetch --all &&
+		test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
+		test_cmp_rev refs/remotes/remote3/three refs/prefetch/remotes/remote3/three
+	)
+'
+
 test_expect_success 'loose-objects task' '
 	# Repack everything so we know the state of the object dir
 	git repack -adk &&

base-commit: 2e7b89e038c0c888acf61f1b4ee5a43d4dd5e94c
-- 
gitgitgadget


* Re: [PATCH] remote: prefetch config
  2024-09-05  2:08     ` Derrick Stolee
  2024-09-05  2:51       ` Shubham Kanodia
@ 2024-09-05 15:00       ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2024-09-05 15:00 UTC
  To: Derrick Stolee
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Shubham Kanodia

Derrick Stolee <stolee@gmail.com> writes:

> On 9/4/24 4:55 PM, Junio C Hamano wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>
>>>> +		# Run maintenance prefetch task
>>>> +		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>>>> +
>>>> +		# Check that remote1 was not fetched (prefetch=false)
>>>> +		test_subcommand ! git fetch remote1 --prefetch --prune --no-tags \
>>>> +			--no-write-fetch-head --recurse-submodules=no --quiet \
>>>> +			<prefetch.txt &&
>>>
>>> I'm happy to see this use of test_subcommand to validate the behavior
>>> of this patch!
>> I found it a bit disturbing that the pattern is overly specific.
>> The only thing we are interested in is that we are not fetching from
>> remote1, so it _should_ suffice if we could write
>> 	test_subcommand ! git fetch remote1 <prefetch.txt &&
>> to avoid being tied to how the current version of Git happens to
>> pass these command line option flags and the order it does so.
>> Looking at the implementation of test_subcommand, it seems that we
>> cannot quite do that (it assumes that the pattern it assembles out
>> of the parameters are to match the full argument list used in
>> invocation, enclosing them in a single [] pair and without giving
>> the caller an easy way to sneak wildcards like ".*" in), which is
>> sad.
> I agree the ergonomics of the test_subcommand helper is a bit poor
> (and not this patch author's fault).

I suspect that we could do

    test_subcommand ! git fetch remote1 --prefetch '.*' <prefetch.txt

which would be rewritten to this pattern

	\["git", "fetch", "remote1", "--prefetch", ".*"\]

if I am reading correctly how the expr given to grep is built by the
test_subcommand implementation.  As long as there is at least one
actual argument after the "--prefetch" one, .* would slurp
everything.

But it is ugly.  In any case, this is a tangent unrelated to the
topic of the patch on this thread.


* Re: [PATCH v2] remote: prefetch config
  2024-09-05  2:52 ` [PATCH v2] " Shubham Kanodia via GitGitGadget
@ 2024-09-05 16:06   ` Junio C Hamano
  2024-09-05 16:43     ` Shubham Kanodia
  2024-09-05 16:45   ` [PATCH v3] " Shubham Kanodia via GitGitGadget
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2024-09-05 16:06 UTC
  To: Shubham Kanodia via GitGitGadget
  Cc: git, Patrick Steinhardt [ ], Derrick Stolee, Shubham Kanodia

"Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 427faf1cfe1..2ca3a3e7d6a 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1027,6 +1027,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	if (remote->skip_default_update)
>  		return 0;
>  
> +	if (!remote->prefetch)
> +		return 0;

This, while better than an explicit comparison with "== 0", is a bit
tricky in this patch, as it is not saying "if we are told to prefetch,
fall through to the rest of the function".  It is saying "leave if
and only if we are explicitly configured not to prefetch".

It might warrant a comment.
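
To make the subtlety concrete, here is a minimal standalone sketch (not
code from the patch; `prefetch_is_skipped` is a hypothetical helper) of
how the `!remote->prefetch` guard treats the three states while the
member is initialized to -1:

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the `if (!remote->prefetch) return 0;`
 * guard. The argument uses the encoding from the patch:
 *   -1 = unset, 0 = disabled, 1 = enabled.
 * Returns 1 when the remote would be skipped, 0 otherwise.
 */
int prefetch_is_skipped(int prefetch)
{
	assert(prefetch >= -1 && prefetch <= 1);
	/* In C only 0 is false; -1 is truthy, so "unset" is not skipped. */
	return !prefetch;
}
```

Only an explicit 0 causes the early return; both -1 and 1 fall through
to the fetch, which is the behavior a comment at the guard would need to
spell out.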

> diff --git a/remote.c b/remote.c
> index 8f3dee13186..05edb3a5f40 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -140,6 +140,7 @@ static struct remote *make_remote(struct remote_state *remote_state,
>  	CALLOC_ARRAY(ret, 1);
>  	ret->prune = -1;  /* unspecified */
>  	ret->prune_tags = -1;  /* unspecified */
> +	ret->prefetch = -1;  /* unspecified */

Or, we can just assign "1" (and drop "unspecified" comment).

	ret->prefetch = 1; /* enabled by default */

If I understand it correctly, we want this to default to true...

>  	ret->name = xstrndup(name, len);
>  	refspec_init(&ret->push, REFSPEC_PUSH);
>  	refspec_init(&ret->fetch, REFSPEC_FETCH);
> @@ -456,6 +457,8 @@ static int handle_config(const char *key, const char *value,
>  		remote->prune = git_config_bool(key, value);
>  	else if (!strcmp(subkey, "prunetags"))
>  		remote->prune_tags = git_config_bool(key, value);
> +	else if (!strcmp(subkey, "prefetch"))
> +		remote->prefetch = git_config_bool(key, value);

... with a way for the user to turn it off.

> diff --git a/remote.h b/remote.h
> index b901b56746d..4522fdec354 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -77,6 +77,15 @@ struct remote {
>  
>  	struct refspec fetch;
>  
> +	/*
> +	 * The setting for whether to prefetch from a remote
> +	 * when a fetch is invoked with a prefetch flag.
> +	 *  -1 = unset
> +	 *   0 = don't prefetch from this remote
> +	 *   1 = prefetch from this remote
> +	 */
> +	int prefetch;

And then we can get rid of "-1 unset" from this list.  The comment
can become a lot more brief, as such a change would make it a simple
Boolean flag that everybody would understand immediately.

"prefetch" in the comment is superfluous, as that is the name of the
member anyway.  "from this remote" is superfluous, as that is the
point of having the member in "struct remote" that gives settings
that are per-remote.

	int prefetch; /* is prefetch enabled? */

If we really want to have "unspecified yet" state, what we commonly
do is

 * to initialize the variable to -1 to signal "unspecified yet",
   which you did in this patch.

 * after the configuration reader returns, check if the variable is
   still -1, and then explicitly reset it to the default value,
   which your patch does not do.

 * the code that uses the variable assumes it is either 0 or 1 and
   there should be no "unspecified yet" value.  It indeed is a bug
   that the variable is left unspecified as it is a sign that the
   code to do the previous step was somehow skipped.

But I do not think it is needed in this case; initializing the
.prefetch member to whichever is the default should be sufficient.

Thanks.


* Re: [PATCH v2] remote: prefetch config
  2024-09-05 16:06   ` Junio C Hamano
@ 2024-09-05 16:43     ` Shubham Kanodia
  2024-09-05 16:52       ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Shubham Kanodia @ 2024-09-05 16:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Derrick Stolee

On Thu, Sep 5, 2024 at 9:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 427faf1cfe1..2ca3a3e7d6a 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -1027,6 +1027,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
> >       if (remote->skip_default_update)
> >               return 0;
> >
> > +     if (!remote->prefetch)
> > +             return 0;
>
> This, while better than an explicit comparison with "== 0", is a bit
> tricky in this patch, as it is not saying "if we are told to prefetch,
> fall through to the rest of the function".  It is saying "leave if
> and only if we are explicitly configured not to prefetch".
>
> It might warrant a comment.
>
> > diff --git a/remote.c b/remote.c
> > index 8f3dee13186..05edb3a5f40 100644
> > --- a/remote.c
> > +++ b/remote.c
> > @@ -140,6 +140,7 @@ static struct remote *make_remote(struct remote_state *remote_state,
> >       CALLOC_ARRAY(ret, 1);
> >       ret->prune = -1;  /* unspecified */
> >       ret->prune_tags = -1;  /* unspecified */
> > +     ret->prefetch = -1;  /* unspecified */
>
> Or, we can just assign "1" (and drop "unspecified" comment).
>
>         ret->prefetch = 1; /* enabled by default */
>
> If I understand it correctly, we want this to default to true...
>
> >       ret->name = xstrndup(name, len);
> >       refspec_init(&ret->push, REFSPEC_PUSH);
> >       refspec_init(&ret->fetch, REFSPEC_FETCH);
> > @@ -456,6 +457,8 @@ static int handle_config(const char *key, const char *value,
> >               remote->prune = git_config_bool(key, value);
> >       else if (!strcmp(subkey, "prunetags"))
> >               remote->prune_tags = git_config_bool(key, value);
> > +     else if (!strcmp(subkey, "prefetch"))
> > +             remote->prefetch = git_config_bool(key, value);
>
> ... with a way for the user to turn it off.
>
> > diff --git a/remote.h b/remote.h
> > index b901b56746d..4522fdec354 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -77,6 +77,15 @@ struct remote {
> >
> >       struct refspec fetch;
> >
> > +     /*
> > +      * The setting for whether to prefetch from a remote
> > +      * when a fetch is invoked with a prefetch flag.
> > +      *  -1 = unset
> > +      *   0 = don't prefetch from this remote
> > +      *   1 = prefetch from this remote
> > +      */
> > +     int prefetch;
>
> And then we can get rid of "-1 unset" from this list.  The comment
> can become a lot more brief, as such a change would make it a simple
> Boolean flag that everybody would understand immediately.
>
> "prefetch" in the comment is superfluous, as that is the name of the
> member anyway.  "from this remote" is superfluous, as that is the
> point of having the member in "struct remote" that gives settings
> that are per-remote.
>
>         int prefetch; /* is prefetch enabled? */
>
> If we really want to have "unspecified yet" state, what we commonly
> do is
>
>  * to initialize the variable to -1 to signal "unspecified yet",
>    which you did in this patch.
>
>  * after the configuration reader returns, check if the variable is
>    still -1, and then explicitly reset it to the default value,
>    which your patch does not do.
>
>  * the code that uses the variable assumes it is either 0 or 1 and
>    there should be no "unspecified yet" value.  It indeed is a bug
>    that the variable is left unspecified as it is a sign that the
>    code to do the previous step was somehow skipped.
>
> But I do not think it is needed in this case; initializing the
> .prefetch member to whichever is the default should be sufficient.
>
> Thanks.

Fair. I kept the initial value as `unset` as that could be interpreted
as a special case to do something else in the future — but I agree that keeping
it initialized to default keeps things clearer for now since such a
case doesn't arise.

Updating my patch — please let me know if there's anything else I can
improve here.


* [PATCH v3] remote: prefetch config
  2024-09-05  2:52 ` [PATCH v2] " Shubham Kanodia via GitGitGadget
  2024-09-05 16:06   ` Junio C Hamano
@ 2024-09-05 16:45   ` Shubham Kanodia via GitGitGadget
  2024-09-05 19:43     ` [PATCH v4] " Shubham Kanodia via GitGitGadget
  1 sibling, 1 reply; 16+ messages in thread
From: Shubham Kanodia via GitGitGadget @ 2024-09-05 16:45 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt [ ], Junio C Hamano [ ], Derrick Stolee,
	Shubham Kanodia, Shubham Kanodia

From: Shubham Kanodia <shubham.kanodia10@gmail.com>

Large repositories often contain numerous branches and refs, many of
which individual users may not need. This commit introduces a new
configuration option (`remote.<remote>.prefetch`) to allow
users to specify which remotes to prefetch during
the maintenance task.

Key behaviors:
1. If `remote.<remote>.prefetch` is unset or true, running
   `git-maintenance` will prefetch all refs for the remote.
2. If `remote.<remote>.prefetch` is set to false, the remote
   will be ignored for prefetching.

In a future change, we could also allow restricting the refs that are
prefetched per remote using the `prefetchref` config option per remote.

Both of these options in unison would allow users to optimize their
prefetch operations, reducing network traffic and disk usage.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
    remote: prefetch config

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1779%2Fpastelsky%2Fsk%2Fmaintenance-prefetch-remote-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1779/pastelsky/sk/maintenance-prefetch-remote-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1779

Range-diff vs v2:

 1:  1d58b782e22 ! 1:  c348f8efd33 remote: prefetch config
     @@ remote.c: static struct remote *make_remote(struct remote_state *remote_state,
       	CALLOC_ARRAY(ret, 1);
       	ret->prune = -1;  /* unspecified */
       	ret->prune_tags = -1;  /* unspecified */
     -+	ret->prefetch = -1;  /* unspecified */
     ++	ret->prefetch = 1;
       	ret->name = xstrndup(name, len);
       	refspec_init(&ret->push, REFSPEC_PUSH);
       	refspec_init(&ret->fetch, REFSPEC_FETCH);
     @@ remote.h: struct remote {
       
       	struct refspec fetch;
       
     -+	/*
     -+	 * The setting for whether to prefetch from a remote
     -+	 * when a fetch is invoked with a prefetch flag.
     -+	 *  -1 = unset
     -+	 *   0 = don't prefetch from this remote
     -+	 *   1 = prefetch from this remote
     -+	 */
      +	int prefetch;
      +
       	/*


 Documentation/config/remote.txt   |  5 ++++
 Documentation/git-maintenance.txt |  7 +++---
 builtin/gc.c                      |  3 +++
 remote.c                          |  3 +++
 remote.h                          |  2 ++
 t/t7900-maintenance.sh            | 42 +++++++++++++++++++++++++++++++
 6 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 8efc53e836d..c2b3876192c 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -33,6 +33,11 @@ remote.<name>.fetch::
 	The default set of "refspec" for linkgit:git-fetch[1]. See
 	linkgit:git-fetch[1].
 
+remote.<name>.prefetch::
+	If false, refs from the remote would not be prefetched for
+	the prefetch task in linkgit:git-maintenance[1]. If not set,
+	the value is assumed to be true.
+
 remote.<name>.push::
 	The default set of "refspec" for linkgit:git-push[1]. See
 	linkgit:git-push[1].
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 51d0f7e94b6..2fd38706ea2 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -97,9 +97,10 @@ commit-graph::
 
 prefetch::
 	The `prefetch` task updates the object directory with the latest
-	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The configured refspec is modified to place all
-	requested refs within `refs/prefetch/`. Also, tags are not updated.
+	objects from all registered remotes unless they've disabled prefetch
+	using `remote.<remote>.prefetch` set to `false`. For each such remote,
+	a `git fetch` command is run. The configured refspec is modified to place
+	all requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  However,
diff --git a/builtin/gc.c b/builtin/gc.c
index 427faf1cfe1..2ca3a3e7d6a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1027,6 +1027,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	if (remote->skip_default_update)
 		return 0;
 
+	if (!remote->prefetch)
+		return 0;
+
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name,
 		     "--prefetch", "--prune", "--no-tags",
diff --git a/remote.c b/remote.c
index 8f3dee13186..4024018b1d6 100644
--- a/remote.c
+++ b/remote.c
@@ -140,6 +140,7 @@ static struct remote *make_remote(struct remote_state *remote_state,
 	CALLOC_ARRAY(ret, 1);
 	ret->prune = -1;  /* unspecified */
 	ret->prune_tags = -1;  /* unspecified */
+	ret->prefetch = 1;
 	ret->name = xstrndup(name, len);
 	refspec_init(&ret->push, REFSPEC_PUSH);
 	refspec_init(&ret->fetch, REFSPEC_FETCH);
@@ -456,6 +457,8 @@ static int handle_config(const char *key, const char *value,
 		remote->prune = git_config_bool(key, value);
 	else if (!strcmp(subkey, "prunetags"))
 		remote->prune_tags = git_config_bool(key, value);
+	else if (!strcmp(subkey, "prefetch"))
+		remote->prefetch = git_config_bool(key, value);
 	else if (!strcmp(subkey, "url")) {
 		if (!value)
 			return config_error_nonbool(key);
diff --git a/remote.h b/remote.h
index b901b56746d..182dc105a5a 100644
--- a/remote.h
+++ b/remote.h
@@ -77,6 +77,8 @@ struct remote {
 
 	struct refspec fetch;
 
+	int prefetch;
+
 	/*
 	 * The setting for whether to fetch tags (as a separate rule from the
 	 * configured refspecs);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index abae7a97546..7484e1f1d46 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -245,6 +245,48 @@ test_expect_success 'prefetch multiple remotes' '
 	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
 '
 
+test_expect_success 'prefetch respects remote.*.prefetch config' '
+	test_create_repo prefetch-test-config &&
+	(
+		cd prefetch-test-config &&
+		test_commit initial &&
+		test_create_repo clone1 &&
+		test_create_repo clone2 &&
+		test_create_repo clone3 &&
+
+		git remote add remote1 "file://$(pwd)/clone1" &&
+		git remote add remote2 "file://$(pwd)/clone2" &&
+		git remote add remote3 "file://$(pwd)/clone3" &&
+
+		git config remote.remote1.prefetch false &&
+		git config remote.remote2.prefetch true &&
+		# remote3 is left unset
+
+		# Make changes in all clones
+		git -C clone1 switch -c one &&
+		git -C clone2 switch -c two &&
+		git -C clone3 switch -c three &&
+		test_commit -C clone1 one &&
+		test_commit -C clone2 two &&
+		test_commit -C clone3 three &&
+
+		# Run maintenance prefetch task
+		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+		# Check that if remotes were prefetched properly
+		fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
+		test_subcommand ! git fetch remote1 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote2 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote3 $fetchargs <prefetch.txt &&
+
+		# Verify that changes are in the prefetch refs for remote2 and remote3, but not remote1
+		test_must_fail git rev-parse refs/prefetch/remotes/remote1/one &&
+		git fetch --all &&
+		test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
+		test_cmp_rev refs/remotes/remote3/three refs/prefetch/remotes/remote3/three
+	)
+'
+
 test_expect_success 'loose-objects task' '
 	# Repack everything so we know the state of the object dir
 	git repack -adk &&

base-commit: 2e7b89e038c0c888acf61f1b4ee5a43d4dd5e94c
-- 
gitgitgadget


* Re: [PATCH v2] remote: prefetch config
  2024-09-05 16:43     ` Shubham Kanodia
@ 2024-09-05 16:52       ` Junio C Hamano
  2024-09-05 17:19         ` Shubham Kanodia
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2024-09-05 16:52 UTC (permalink / raw)
  To: Shubham Kanodia
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Derrick Stolee

Shubham Kanodia <shubham.kanodia10@gmail.com> writes:

>>
>>         int prefetch; /* is prefetch enabled? */
>>
> ...
> Updating my patch — please let me know if there's anything else I can
> improve here.

Renaming the .prefetch member to .prefetch_enabled would eliminate
the need to add any comment on the member in the header file.

Thanks.


* Re: [PATCH v2] remote: prefetch config
  2024-09-05 16:52       ` Junio C Hamano
@ 2024-09-05 17:19         ` Shubham Kanodia
  2024-09-05 17:54           ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Shubham Kanodia @ 2024-09-05 17:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Derrick Stolee

On Thu, Sep 5, 2024 at 10:22 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Shubham Kanodia <shubham.kanodia10@gmail.com> writes:
>
> >>
> >>         int prefetch; /* is prefetch enabled? */
> >>
> > ...
> > Updating my patch — please let me know if there's anything else I can
> > improve here.
>
> Renaming the .prefetch member to .prefetch_enabled would eliminate
> the need to add any comment on the member in the header file.

Do you mean for the struct member here or also the config? For the
config, it'll probably be clearer to keep `prefetch` still as it aligns
nicely with the boolean `--prefetch` command line flag.

I can name the struct member `prefetch_enabled`, though I don't see
other boolean remote properties (`prune`, `prune_tags`) add suffixes
to indicate they are booleans.

> Thanks.


* Re: [PATCH v2] remote: prefetch config
  2024-09-05 17:19         ` Shubham Kanodia
@ 2024-09-05 17:54           ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2024-09-05 17:54 UTC (permalink / raw)
  To: Shubham Kanodia
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Derrick Stolee

Shubham Kanodia <shubham.kanodia10@gmail.com> writes:

> On Thu, Sep 5, 2024 at 10:22 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Shubham Kanodia <shubham.kanodia10@gmail.com> writes:
>>
>> >>
>> >>         int prefetch; /* is prefetch enabled? */
>> >>
>> > ...
>> > Updating my patch — please let me know if there's anything else I can
>> > improve here.
>>
>> Renaming the .prefetch member to .prefetch_enabled would eliminate
>> the need to add any comment on the member in the header file.
>
> Do you mean for the struct member here or also the config?

I do not think I mentioned anything about the name of the
configuration variable, but if I did that was a mistake.

End-user facing configuration variables are often named after a
feature that it enables or disables, so it can use the name without
"enable".  An int variable on the other hand can mean many other
things, ranging from "how many times have we prefetched from here"
to "does this remote allow prefetching?", so a more explicit name
would often help.

Thanks.



* [PATCH v4] remote: prefetch config
  2024-09-05 16:45   ` [PATCH v3] " Shubham Kanodia via GitGitGadget
@ 2024-09-05 19:43     ` Shubham Kanodia via GitGitGadget
  2024-09-05 20:57       ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Shubham Kanodia via GitGitGadget @ 2024-09-05 19:43 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt [ ], Junio C Hamano [ ], Derrick Stolee,
	Shubham Kanodia, Shubham Kanodia

From: Shubham Kanodia <shubham.kanodia10@gmail.com>

Large repositories often contain numerous branches and refs, many of
which individual users may not need. This commit introduces a new
configuration option (`remote.<remote>.prefetch`) to allow
users to specify which remotes to prefetch during
the maintenance task.

Key behaviors:
1. If `remote.<remote>.prefetch` is unset or true, running
   `git-maintenance` will prefetch all refs for the remote.
2. If `remote.<remote>.prefetch` is set to false, the remote
   will be ignored for prefetching.

In a future change, we could also allow restricting the refs that are
prefetched per remote using the `prefetchref` config option per remote.

Both of these options in unison would allow users to optimize their
prefetch operations, reducing network traffic and disk usage.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
    remote: prefetch config

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1779%2Fpastelsky%2Fsk%2Fmaintenance-prefetch-remote-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1779/pastelsky/sk/maintenance-prefetch-remote-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1779

Range-diff vs v3:

 1:  c348f8efd33 ! 1:  80af121f835 remote: prefetch config
     @@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
       	if (remote->skip_default_update)
       		return 0;
       
     -+	if (!remote->prefetch)
     ++	if (!remote->prefetch_enabled)
      +		return 0;
      +
       	child.git_cmd = 1;
     @@ remote.c: static struct remote *make_remote(struct remote_state *remote_state,
       	CALLOC_ARRAY(ret, 1);
       	ret->prune = -1;  /* unspecified */
       	ret->prune_tags = -1;  /* unspecified */
     -+	ret->prefetch = 1;
     ++	ret->prefetch_enabled = 1;
       	ret->name = xstrndup(name, len);
       	refspec_init(&ret->push, REFSPEC_PUSH);
       	refspec_init(&ret->fetch, REFSPEC_FETCH);
     @@ remote.c: static int handle_config(const char *key, const char *value,
       	else if (!strcmp(subkey, "prunetags"))
       		remote->prune_tags = git_config_bool(key, value);
      +	else if (!strcmp(subkey, "prefetch"))
     -+		remote->prefetch = git_config_bool(key, value);
     ++		remote->prefetch_enabled = git_config_bool(key, value);
       	else if (!strcmp(subkey, "url")) {
       		if (!value)
       			return config_error_nonbool(key);
     @@ remote.h: struct remote {
       
       	struct refspec fetch;
       
     -+	int prefetch;
     ++	int prefetch_enabled;
      +
       	/*
       	 * The setting for whether to fetch tags (as a separate rule from the


 Documentation/config/remote.txt   |  5 ++++
 Documentation/git-maintenance.txt |  7 +++---
 builtin/gc.c                      |  3 +++
 remote.c                          |  3 +++
 remote.h                          |  2 ++
 t/t7900-maintenance.sh            | 42 +++++++++++++++++++++++++++++++
 6 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 8efc53e836d..c2b3876192c 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -33,6 +33,11 @@ remote.<name>.fetch::
 	The default set of "refspec" for linkgit:git-fetch[1]. See
 	linkgit:git-fetch[1].
 
+remote.<name>.prefetch::
+	If false, refs from the remote would not be prefetched for
+	the prefetch task in linkgit:git-maintenance[1]. If not set,
+	the value is assumed to be true.
+
 remote.<name>.push::
 	The default set of "refspec" for linkgit:git-push[1]. See
 	linkgit:git-push[1].
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 51d0f7e94b6..2fd38706ea2 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -97,9 +97,10 @@ commit-graph::
 
 prefetch::
 	The `prefetch` task updates the object directory with the latest
-	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The configured refspec is modified to place all
-	requested refs within `refs/prefetch/`. Also, tags are not updated.
+	objects from all registered remotes unless they've disabled prefetch
+	using `remote.<remote>.prefetch` set to `false`. For each such remote,
+	a `git fetch` command is run. The configured refspec is modified to place
+	all requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  However,
diff --git a/builtin/gc.c b/builtin/gc.c
index 427faf1cfe1..8da78290929 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1027,6 +1027,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	if (remote->skip_default_update)
 		return 0;
 
+	if (!remote->prefetch_enabled)
+		return 0;
+
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name,
 		     "--prefetch", "--prune", "--no-tags",
diff --git a/remote.c b/remote.c
index 8f3dee13186..fc6eee21408 100644
--- a/remote.c
+++ b/remote.c
@@ -140,6 +140,7 @@ static struct remote *make_remote(struct remote_state *remote_state,
 	CALLOC_ARRAY(ret, 1);
 	ret->prune = -1;  /* unspecified */
 	ret->prune_tags = -1;  /* unspecified */
+	ret->prefetch_enabled = 1;
 	ret->name = xstrndup(name, len);
 	refspec_init(&ret->push, REFSPEC_PUSH);
 	refspec_init(&ret->fetch, REFSPEC_FETCH);
@@ -456,6 +457,8 @@ static int handle_config(const char *key, const char *value,
 		remote->prune = git_config_bool(key, value);
 	else if (!strcmp(subkey, "prunetags"))
 		remote->prune_tags = git_config_bool(key, value);
+	else if (!strcmp(subkey, "prefetch"))
+		remote->prefetch_enabled = git_config_bool(key, value);
 	else if (!strcmp(subkey, "url")) {
 		if (!value)
 			return config_error_nonbool(key);
diff --git a/remote.h b/remote.h
index b901b56746d..c448e5e6f9d 100644
--- a/remote.h
+++ b/remote.h
@@ -77,6 +77,8 @@ struct remote {
 
 	struct refspec fetch;
 
+	int prefetch_enabled;
+
 	/*
 	 * The setting for whether to fetch tags (as a separate rule from the
 	 * configured refspecs);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index abae7a97546..7484e1f1d46 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -245,6 +245,48 @@ test_expect_success 'prefetch multiple remotes' '
 	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
 '
 
+test_expect_success 'prefetch respects remote.*.prefetch config' '
+	test_create_repo prefetch-test-config &&
+	(
+		cd prefetch-test-config &&
+		test_commit initial &&
+		test_create_repo clone1 &&
+		test_create_repo clone2 &&
+		test_create_repo clone3 &&
+
+		git remote add remote1 "file://$(pwd)/clone1" &&
+		git remote add remote2 "file://$(pwd)/clone2" &&
+		git remote add remote3 "file://$(pwd)/clone3" &&
+
+		git config remote.remote1.prefetch false &&
+		git config remote.remote2.prefetch true &&
+		# remote3 is left unset
+
+		# Make changes in all clones
+		git -C clone1 switch -c one &&
+		git -C clone2 switch -c two &&
+		git -C clone3 switch -c three &&
+		test_commit -C clone1 one &&
+		test_commit -C clone2 two &&
+		test_commit -C clone3 three &&
+
+		# Run maintenance prefetch task
+		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+		# Check that if remotes were prefetched properly
+		fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
+		test_subcommand ! git fetch remote1 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote2 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote3 $fetchargs <prefetch.txt &&
+
+		# Verify that changes are in the prefetch refs for remote2 and remote3, but not remote1
+		test_must_fail git rev-parse refs/prefetch/remotes/remote1/one &&
+		git fetch --all &&
+		test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
+		test_cmp_rev refs/remotes/remote3/three refs/prefetch/remotes/remote3/three
+	)
+'
+
 test_expect_success 'loose-objects task' '
 	# Repack everything so we know the state of the object dir
 	git repack -adk &&

base-commit: 2e7b89e038c0c888acf61f1b4ee5a43d4dd5e94c
-- 
gitgitgadget


* Re: [PATCH v4] remote: prefetch config
  2024-09-05 19:43     ` [PATCH v4] " Shubham Kanodia via GitGitGadget
@ 2024-09-05 20:57       ` Junio C Hamano
  2024-09-06  9:42         ` Shubham Kanodia
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2024-09-05 20:57 UTC (permalink / raw)
  To: Shubham Kanodia via GitGitGadget
  Cc: git, Patrick Steinhardt [ ], Derrick Stolee, Shubham Kanodia

"Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Shubham Kanodia <shubham.kanodia10@gmail.com>
> ...
> In a future change, we could also allow restricting the refs that are
> prefetched per remote using the `prefetchref` config option per remote.
>
> Both of these options in unison would allow users to optimize their
> prefetch operations, reducing network traffic and disk usage.
>
> Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
> ---

Looking good.  Thanks.


* Re: [PATCH v4] remote: prefetch config
  2024-09-05 20:57       ` Junio C Hamano
@ 2024-09-06  9:42         ` Shubham Kanodia
  0 siblings, 0 replies; 16+ messages in thread
From: Shubham Kanodia @ 2024-09-06  9:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Shubham Kanodia via GitGitGadget, git, Patrick Steinhardt [ ],
	Derrick Stolee

On Fri, Sep 6, 2024 at 2:28 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Shubham Kanodia <shubham.kanodia10@gmail.com>
> > ...
> > In a future change, we could also allow restricting the refs that are
> > prefetched per remote using the `prefetchref` config option per remote.
> >
> > Both of these options in unison would allow users to optimize their
> > prefetch operations, reducing network traffic and disk usage.
> >
> > Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
> > ---
>
> Looking good.  Thanks.

How long do you reckon changes like this typically remain in "seen"
until merged upstream?
I'm preparing part-2 of this change separately — so would be good to
know when I can submit that.


end of thread, other threads:[~2024-09-06  9:42 UTC | newest]

Thread overview: 16+ messages
2024-09-04 17:59 [PATCH] remote: prefetch config Shubham Kanodia via GitGitGadget
2024-09-04 20:03 ` Derrick Stolee
2024-09-04 20:55   ` Junio C Hamano
2024-09-05  2:08     ` Derrick Stolee
2024-09-05  2:51       ` Shubham Kanodia
2024-09-05 15:00       ` Junio C Hamano
2024-09-05  2:52 ` [PATCH v2] " Shubham Kanodia via GitGitGadget
2024-09-05 16:06   ` Junio C Hamano
2024-09-05 16:43     ` Shubham Kanodia
2024-09-05 16:52       ` Junio C Hamano
2024-09-05 17:19         ` Shubham Kanodia
2024-09-05 17:54           ` Junio C Hamano
2024-09-05 16:45   ` [PATCH v3] " Shubham Kanodia via GitGitGadget
2024-09-05 19:43     ` [PATCH v4] " Shubham Kanodia via GitGitGadget
2024-09-05 20:57       ` Junio C Hamano
2024-09-06  9:42         ` Shubham Kanodia
