From: Patrick Steinhardt <ps@pks.im>
To: Shubham Kanodia <shubham.kanodia10@gmail.com>
Cc: git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>
Subject: Re: Consider adding pruning of refs to git maintenance
Date: Tue, 17 Dec 2024 08:21:54 +0100
Message-ID: <Z2Emh42DJkHFGWq7@pks.im>
In-Reply-To: <CAG=Um+0v=BmmYjvBAXs4r4My6zYvpJvcE+0U6SAnxKUcd1-A4w@mail.gmail.com>
On Mon, Dec 16, 2024 at 06:03:03PM +0530, Shubham Kanodia wrote:
> Remote-tracking refs accumulate quickly in large repositories as users
> merge and delete their branches. While these branches are cleaned up
> on the remote, local repositories may retain stale references to
> deleted branches unless explicitly pruned. The number of local refs
> can have an impact on git performance of several commands.
>
> Git currently provides two ways for orphan local refs to be cleaned up —
> 1. Automated: `fetch.prune` and `fetch.pruneTags` configurations with
> `git fetch/pull`
> 2. Manual: `git remote prune`
>
> However, both approaches have issues:
> - Full `git fetch/pull` operations are expensive on large
> repositories, pulling thousands of irrelevant refs
> - Manual `git remote prune` requires user intervention
Fair. Neither of those issues feels insurmountable, but I can see why it
could make our users' lives easier.
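For reference, the situation described above can be reproduced in a
throwaway sandbox. This is only a sketch: the directory names, the branch
name "topic" and the demo identity are made up for illustration. It
creates a stale remote-tracking ref and then lets `fetch.prune` clean it
up during an ordinary fetch.

```shell
#!/bin/sh
# Sketch: a stale remote-tracking ref and its cleanup via fetch.prune.
# All names here (server.git, work, topic) are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# A bare "server" repository and a clone of it.
git init -q --bare server.git
git clone -q server.git work 2>/dev/null
cd work
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m initial
git push -q origin HEAD:refs/heads/main HEAD:refs/heads/topic
git fetch -q origin

# The branch is deleted on the server; the local tracking ref stays behind.
git --git-dir=../server.git update-ref -d refs/heads/topic
stale_before=$(git for-each-ref --format='%(refname)' refs/remotes/origin/topic)

# Automated route: with fetch.prune set, the next fetch removes the stale ref.
git config fetch.prune true
git fetch -q origin
stale_after=$(git for-each-ref --format='%(refname)' refs/remotes/origin/topic)

echo "before: ${stale_before:-<none>}"
echo "after:  ${stale_after:-<none>}"
```

The manual route, `git remote prune origin`, would remove the same ref
without fetching any new objects.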
> Proposal:
> Add remote pruning to the daily `git-maintenance` task. This would
> clean stale refs automatically without requiring full fetches or
> manual intervention.
>
> This is especially useful for users who historically pulled all
> refs/tags but now use targeted fetches. Moreover, it decouples the
> cleanup action (pruning) from the action to fetch more refs.
I think we need to consider a couple of things:
  - It's somewhat awkward to have maintenance jobs that interact with a
    remote, as that may not work in contexts where you actually need to
    authenticate. But there is precedent with the "prefetch" task, so we
    have already opened that can of worms.
  - Maintenance tries to be as non-destructive as reasonably possible,
    and deleting refs certainly is a destructive operation.
  - We try to avoid bad interactions with a user that works concurrently
    in the repo that git-maintenance(1) runs in. This is the reason why
    the "prefetch" task does not fetch into `refs/remotes`, but into a
    separate ref namespace.
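To illustrate the last point, here is a rough sketch of what the
"prefetch" task does to the ref namespace. It assumes a Git version that
ships `git maintenance`; the sandbox layout and the branch name "topic"
are hypothetical. The exact path below `refs/prefetch/` has differed
between Git versions, so the sketch only lists whatever ends up there.

```shell
#!/bin/sh
# Sketch: the prefetch task stores fetched refs under refs/prefetch/
# and leaves refs/remotes/ alone. Names are made up for illustration.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q --bare server.git
git clone -q server.git work 2>/dev/null
cd work
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m initial
git push -q origin HEAD:refs/heads/main

# A new branch appears on the server after our last fetch.
git --git-dir=../server.git update-ref refs/heads/topic refs/heads/main

# Run only the prefetch task, in the foreground.
git maintenance run --task=prefetch

prefetched=$(git for-each-ref --format='%(refname)' refs/prefetch/)
tracking=$(git for-each-ref --format='%(refname)' refs/remotes/origin/topic)

echo "under refs/prefetch/:"
echo "$prefetched"
echo "refs/remotes/origin/topic: ${tracking:-<not created>}"
```

The user-visible `refs/remotes/origin/*` refs only change on the next
regular fetch, which is why prefetching cannot surprise somebody working
in the repository at the same time. Pruning `refs/remotes` from a
maintenance task would not have that property.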
If we want to have such a feature I'd thus claim that it would be most
sensible to make it opt-in rather than opt-out. I wouldn't want to be
surprised by remote refs vanishing after going to bed, but may be okay
with it when I explicitly ask for it.
At that point one has to raise the question whether it is still all that
useful compared to running `git remote prune` manually every now and
then. Mostly because explicitly configuring maintenance is probably
something that only power users would do, and those power users would
likely know to prune manually.
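For completeness, a sketch of the manual route with a preview step. It
uses the `--dry-run` option of `git remote prune`, which reports what
would be deleted without touching any ref. The sandbox and the branch
name "topic" are again made up for illustration.

```shell
#!/bin/sh
# Sketch: preview a prune before actually deleting anything.
# All names (server.git, work, topic) are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q --bare server.git
git clone -q server.git work 2>/dev/null
cd work
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m initial
git push -q origin HEAD:refs/heads/main HEAD:refs/heads/topic
git fetch -q origin
git --git-dir=../server.git update-ref -d refs/heads/topic

# Step 1: only report which remote-tracking refs are stale.
preview=$(git remote prune --dry-run origin)
after_preview=$(git for-each-ref --format='%(refname)' refs/remotes/origin/topic)

# Step 2: the destructive part, run once the preview looks right.
git remote prune origin >/dev/null
after_prune=$(git for-each-ref --format='%(refname)' refs/remotes/origin/topic)

echo "$preview"
echo "after dry run: ${after_preview:-<none>}"
echo "after prune:   ${after_prune:-<none>}"
```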
In any case, that's just my 2c. I can see a use case for your feature,
but think we should be careful with how it is introduced.
> Happy to submit on a patch for the same unless there's something
> obvious that I've missed here.
I'm happy to have a look in case you decide to implement this feature.
Patrick