From: "Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
Patrick Steinhardt <ps@pks.im>,
Shubham Kanodia <shubham.kanodia10@gmail.com>,
Shubham Kanodia <shubham.kanodia10@gmail.com>
Subject: [PATCH v2] maintenance: add prune-remote-refs task
Date: Sat, 28 Dec 2024 10:07:41 +0000 [thread overview]
Message-ID: <pull.1838.v2.git.1735380461980.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1838.git.1734946566885.gitgitgadget@gmail.com>
From: Shubham Kanodia <shubham.kanodia10@gmail.com>
Remote-tracking refs can accumulate in local repositories even as branches
are deleted on remotes, impacting git performance negatively. Existing
alternatives to keep refs pruned have a few issues —
1. Running `git fetch` with either `--prune` or `fetch.prune=true` set will
prune stale refs, but requires a manual operation and also pulls in new
refs from remote which can be an undesirable side-effect.
2.`git remote prune` cleans up refs without adding to the existing list
but requires periodic user intervention.
This adds a new maintenance task 'prune-remote-refs' that runs
'git remote prune' for each configured remote daily. This provides an
automated way to clean up stale remote-tracking refs — especially when
users may not do a full fetch.
This task is disabled by default.
Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
maintenance: add prune-remote-refs task
As discussed previously on:
https://lore.kernel.org/git/xmqqwmfr112w.fsf@gitster.g/T/#t
Remote-tracking refs can accumulate in local repositories even as
branches are deleted on remotes, impacting git performance negatively.
Existing alternatives to keep refs pruned have a few issues —
1. The fetch.prune config automatically cleans up remote ref on fetch,
but also pulls in new ref from remote which is an undesirable
side-effect.
2.git remote prune cleans up refs without adding to the existing list
but requires periodic user intervention.
This adds a new maintenance task 'prune-remote-refs' that runs 'git
remote prune' for each configured remote daily. This provides an
automated way to clean up stale remote-tracking refs — especially when
users may not do a full fetch.
This task is disabled by default.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1838%2Fpastelsky%2Fsk%2Fadd-remote-prune-maintenance-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1838/pastelsky/sk/add-remote-prune-maintenance-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1838
Range-diff vs v1:
1: 72e27d3ebe6 ! 1: 4d6c143c970 maintenance: add prune-remote-refs task
@@ Commit message
are deleted on remotes, impacting git performance negatively. Existing
alternatives to keep refs pruned have a few issues —
- 1. The `fetch.prune` config automatically cleans up remote ref on fetch,
- but also pulls in new ref from remote which is an undesirable side-effect.
+ 1. Running `git fetch` with either `--prune` or `fetch.prune=true` set will
+ prune stale refs, but requires a manual operation and also pulls in new
+ refs from remote which can be an undesirable side-effect.
2.`git remote prune` cleans up refs without adding to the existing list
but requires periodic user intervention.
@@ Documentation/git-maintenance.txt: pack-refs::
++
+--
+* When using selective fetching (e.g., `git fetch origin +foo:refs/remotes/origin/foo`)
-+ where `fetch.prune` would not affect refs outside the fetched hierarchy
++ where `fetch.prune` would only affect refs that are explicitly fetched.
+* When third-party tools might perform unexpected full fetches, and you want
-+ periodic cleanup independently of fetch operations
++ periodic cleanup independently of fetch operations.
+--
+
OPTIONS
@@ builtin/gc.c: static int maintenance_opt_schedule(const struct option *opt, cons
return 0;
}
-+static int collect_remote(struct remote *remote, void *cb_data)
++static int prune_remote(struct remote *remote, void *cb_data UNUSED)
+{
-+ struct string_list *list = cb_data;
++ struct child_process child = CHILD_PROCESS_INIT;
+
+ if (!remote->url.nr)
+ return 0;
+
-+ string_list_append(list, remote->name);
-+ return 0;
++ child.git_cmd = 1;
++ strvec_pushl(&child.args, "remote", "prune", remote->name, NULL);
++
++ return !!run_command(&child);
+}
+
-+static int maintenance_task_prune_remote(struct maintenance_run_opts *opts UNUSED,
++static int maintenance_task_prune_remote(struct maintenance_run_opts *opts,
+ struct gc_config *cfg UNUSED)
+{
-+ struct string_list_item *item;
-+ struct string_list remotes_list = STRING_LIST_INIT_NODUP;
-+ struct child_process child = CHILD_PROCESS_INIT;
-+ int result = 0;
-+
-+ for_each_remote(collect_remote, &remotes_list);
-+
-+ for_each_string_list_item (item, &remotes_list) {
-+ const char *remote_name = item->string;
-+ child.git_cmd = 1;
-+ strvec_pushl(&child.args, "remote", "prune", remote_name, NULL);
-+
-+ if (run_command(&child))
-+ result = error(_("failed to prune '%s'"), remote_name);
++ if (for_each_remote(prune_remote, opts)) {
++ error(_("failed to prune remotes"));
++ return 1;
+ }
+
-+ string_list_clear(&remotes_list, 0);
-+ return result;
++ return 0;
+}
+
/* Remember to update object flag allocation in object.h */
Documentation/git-maintenance.txt | 20 ++++++++++++++
builtin/gc.c | 32 ++++++++++++++++++++++
t/t7900-maintenance.sh | 44 +++++++++++++++++++++++++++++++
3 files changed, 96 insertions(+)
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 6e6651309d3..8b3e496c8ef 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -158,6 +158,26 @@ pack-refs::
need to iterate across many references. See linkgit:git-pack-refs[1]
for more information.
+prune-remote-refs::
+ The `prune-remote-refs` task runs `git remote prune` on each remote
+ repository registered in the local repository. This task helps clean
+ up deleted remote branches, improving the performance of operations
+ that iterate through the refs. See linkgit:git-remote[1] for more
+ information. This task is disabled by default.
++
+NOTE: This task is opt-in to prevent unexpected removal of remote refs
+for users of git-maintenance. For most users, configuring `fetch.prune=true`
+is a acceptable solution, as it will automatically clean up stale remote-tracking
+branches during normal fetch operations. However, this task can be useful in
+specific scenarios:
++
+--
+* When using selective fetching (e.g., `git fetch origin +foo:refs/remotes/origin/foo`)
+ where `fetch.prune` would only affect refs that are explicitly fetched.
+* When third-party tools might perform unexpected full fetches, and you want
+ periodic cleanup independently of fetch operations.
+--
+
OPTIONS
-------
--auto::
diff --git a/builtin/gc.c b/builtin/gc.c
index 4ae5196aedf..329c764f300 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -20,6 +20,7 @@
#include "lockfile.h"
#include "parse-options.h"
#include "run-command.h"
+#include "remote.h"
#include "sigchain.h"
#include "strvec.h"
#include "commit.h"
@@ -913,6 +914,30 @@ static int maintenance_opt_schedule(const struct option *opt, const char *arg,
return 0;
}
+static int prune_remote(struct remote *remote, void *cb_data UNUSED)
+{
+ struct child_process child = CHILD_PROCESS_INIT;
+
+ if (!remote->url.nr)
+ return 0;
+
+ child.git_cmd = 1;
+ strvec_pushl(&child.args, "remote", "prune", remote->name, NULL);
+
+ return !!run_command(&child);
+}
+
+static int maintenance_task_prune_remote(struct maintenance_run_opts *opts,
+ struct gc_config *cfg UNUSED)
+{
+ if (for_each_remote(prune_remote, opts)) {
+ error(_("failed to prune remotes"));
+ return 1;
+ }
+
+ return 0;
+}
+
/* Remember to update object flag allocation in object.h */
#define SEEN (1u<<0)
@@ -1375,6 +1400,7 @@ enum maintenance_task_label {
TASK_GC,
TASK_COMMIT_GRAPH,
TASK_PACK_REFS,
+ TASK_PRUNE_REMOTE_REFS,
/* Leave as final value */
TASK__COUNT
@@ -1411,6 +1437,10 @@ static struct maintenance_task tasks[] = {
maintenance_task_pack_refs,
pack_refs_condition,
},
+ [TASK_PRUNE_REMOTE_REFS] = {
+ "prune-remote-refs",
+ maintenance_task_prune_remote,
+ },
};
static int compare_tasks_by_selection(const void *a_, const void *b_)
@@ -1505,6 +1535,8 @@ static void initialize_maintenance_strategy(void)
tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
tasks[TASK_PACK_REFS].enabled = 1;
tasks[TASK_PACK_REFS].schedule = SCHEDULE_WEEKLY;
+ tasks[TASK_PRUNE_REMOTE_REFS].enabled = 0;
+ tasks[TASK_PRUNE_REMOTE_REFS].schedule = SCHEDULE_DAILY;
}
}
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 0ce4ba1cbef..60a0c3f8353 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -446,6 +446,50 @@ test_expect_success 'pack-refs task' '
test_subcommand git pack-refs --all --prune <pack-refs.txt
'
+test_expect_success 'prune-remote-refs task not enabled by default' '
+ git clone . prune-test &&
+ (
+ cd prune-test &&
+ GIT_TRACE2_EVENT="$(pwd)/prune.txt" git maintenance run 2>err &&
+ test_subcommand ! git remote prune origin <prune.txt
+ )
+'
+
+test_expect_success 'prune-remote-refs task cleans stale remote refs' '
+ test_commit initial &&
+
+ # Create two separate remote repos
+ git clone . remote1 &&
+ git clone . remote2 &&
+
+ git clone . prune-test-clean &&
+ (
+ cd prune-test-clean &&
+ git config maintenance.prune-remote-refs.enabled true &&
+
+ # Add both remotes
+ git remote add remote1 "../remote1" &&
+ git remote add remote2 "../remote2" &&
+
+ # Create and push branches to both remotes
+ git branch -f side2 HEAD &&
+ git push remote1 side2 &&
+ git push remote2 side2 &&
+
+ # Rename branches in each remote to simulate a stale branch
+ git -C ../remote1 branch -m side2 side3 &&
+ git -C ../remote2 branch -m side2 side4 &&
+
+ GIT_TRACE2_EVENT="$(pwd)/prune.txt" git maintenance run --task=prune-remote-refs &&
+
+ # Verify pruning happened for both remotes
+ test_subcommand git remote prune remote1 <prune.txt &&
+ test_subcommand git remote prune remote2 <prune.txt &&
+ test_must_fail git rev-parse refs/remotes/remote1/side2 &&
+ test_must_fail git rev-parse refs/remotes/remote2/side2
+ )
+'
+
test_expect_success '--auto and --schedule incompatible' '
test_must_fail git maintenance run --auto --schedule=daily 2>err &&
test_grep "at most one" err
base-commit: 063bcebf0c917140ca0e705cbe0fdea127e90086
--
gitgitgadget
next prev parent reply other threads:[~2024-12-28 10:07 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-23 9:36 [PATCH] maintenance: add prune-remote-refs task Shubham Kanodia via GitGitGadget
2024-12-27 9:07 ` Junio C Hamano
2024-12-28 9:58 ` Shubham Kanodia
2024-12-28 16:05 ` Junio C Hamano
2024-12-28 16:24 ` Shubham Kanodia
2024-12-28 10:07 ` Shubham Kanodia via GitGitGadget [this message]
2024-12-28 16:25 ` [PATCH v2] " Junio C Hamano
2024-12-30 7:15 ` Patrick Steinhardt
2024-12-30 14:05 ` Junio C Hamano
2025-01-03 6:50 ` Shubham Kanodia
2025-01-03 7:38 ` Patrick Steinhardt
2025-01-03 18:13 ` [PATCH v3] " Shubham Kanodia via GitGitGadget
2025-01-03 19:02 ` Junio C Hamano
[not found] ` <CAG=Um+1ch1sKC0H8MJoFv=6iSK3pvA=03AKXmvhm5DG=H8T1rw@mail.gmail.com>
2025-01-07 17:29 ` Shubham Kanodia
2025-01-07 18:48 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1838.v2.git.1735380461980.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=ps@pks.im \
--cc=shubham.kanodia10@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).