From: Junio C Hamano <gitster@pobox.com>
To: "Orgad Shaneh via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Orgad Shaneh <orgads@gmail.com>
Subject: Re: [PATCH v2] fetch: limit shared symref check only for local branches
Date: Mon, 16 May 2022 09:00:57 -0700
Message-ID: <xmqqv8u54gcm.fsf@gitster.g>
In-Reply-To: <pull.1266.v2.git.git.1652690501963.gitgitgadget@gmail.com> (Orgad Shaneh via GitGitGadget's message of "Mon, 16 May 2022 08:41:41 +0000")

"Orgad Shaneh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Orgad Shaneh <orgads@gmail.com>
>
> This check was introduced in 8ee5d73137f (Fix fetch/pull when run without
> --update-head-ok, 2008-10-13) in order to protect against replacing the ref
> of the active branch by mistake, for example by running git fetch origin
> master:master.
>
> It was later extended in 8bc1f39f411 (fetch: protect branches checked out
> in all worktrees, 2021-12-01) to scan all worktrees.
>
> This operation is very expensive (takes about 30s in my repository) when
> there are many tags or branches, and it is executed on every fetch, even if
> no local heads are updated at all.
>
> Limit it to protect only refs/heads/* to improve fetch performance.

The point of the check is to prevent the index+working tree in the
worktrees from going out of sync with HEAD.  Since HEAD by definition
can point only into refs/heads/*, this change should be OK.

It is surprising find_shared_symref() is so expensive, though.  If
you have a dozen worktrees linked to the current repository, there
are at most a dozen HEADs that point at various refs in the
refs/heads/ namespace.  Even if you need to check a thousand ref_map
elements,
it should cost almost nothing if you build a hashmap to find matches
with these dozen HEADs upfront, no?
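
A minimal sketch of the "build a lookup table upfront" idea, in
self-contained C.  It is an illustration only, not the code in
builtin/fetch.c: struct wt_entry, struct head_set and the head_set_*()
functions are hypothetical names, the hash table is a tiny
open-addressing set rather than Git's own hashmap, and it only looks
at where each worktree's HEAD points, ignoring any other per-worktree
state the real find_shared_symref() may consult.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a linked worktree; not Git's struct worktree. */
struct wt_entry {
	const char *head_ref;	/* ref that HEAD points at, NULL if detached */
	int is_bare;
};

/* Small open-addressing hash set keyed by head_ref. */
struct head_set {
	const struct wt_entry **slots;
	size_t nr_slots;	/* always a power of two */
};

static size_t hash_str(const char *s)
{
	size_t h = 5381;	/* djb2 string hash */

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/* Build the set once; cost is proportional to the number of worktrees. */
static int head_set_init(struct head_set *set,
			 const struct wt_entry *wts, size_t nr)
{
	size_t i, pos, mask;

	/* Keep the table at most half full so probing always terminates. */
	set->nr_slots = 8;
	while (set->nr_slots < nr * 2)
		set->nr_slots *= 2;
	set->slots = calloc(set->nr_slots, sizeof(*set->slots));
	if (!set->slots)
		return -1;
	mask = set->nr_slots - 1;

	for (i = 0; i < nr; i++) {
		/* Assumption: detached and bare worktrees need no protection. */
		if (!wts[i].head_ref || wts[i].is_bare)
			continue;
		pos = hash_str(wts[i].head_ref) & mask;
		while (set->slots[pos] &&
		       strcmp(set->slots[pos]->head_ref, wts[i].head_ref))
			pos = (pos + 1) & mask;
		if (!set->slots[pos])
			set->slots[pos] = &wts[i];	/* first match wins */
	}
	return 0;
}

/* Expected constant-time lookup for each ref_map element. */
static const struct wt_entry *head_set_find(const struct head_set *set,
					    const char *refname)
{
	size_t mask = set->nr_slots - 1;
	size_t pos = hash_str(refname) & mask;

	while (set->slots[pos]) {
		if (!strcmp(set->slots[pos]->head_ref, refname))
			return set->slots[pos];
		pos = (pos + 1) & mask;
	}
	return NULL;
}

static void head_set_release(struct head_set *set)
{
	free(set->slots);
	set->slots = NULL;
	set->nr_slots = 0;
}
```

With a table like this, the loop over ref_map would call
head_set_find() once per peer_ref name, so the total work is roughly
(number of worktrees) + (number of ref_map elements) instead of their
product.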

Another thing that is surprising is that you say this loop is
expensive when there are many tags or branches.  Do you mean it is
expensive when there are many tags and branches that are updated, or
it is expensive to merely have thousands of dormant tags and
branches?  If the latter, I wonder if it is sensible to limit the
check only to the refs that are going to be updated.

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index e3791f09ed5..eeee5ac8f15 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1440,6 +1440,7 @@ static void check_not_current_branch(struct ref *ref_map,
>  	const struct worktree *wt;
>  	for (; ref_map; ref_map = ref_map->next)
>  		if (ref_map->peer_ref &&
> +		    starts_with(ref_map->peer_ref->name, "refs/heads/") &&
>  		    (wt = find_shared_symref(worktrees, "HEAD",
>  					     ref_map->peer_ref->name)) &&
>  		    !wt->is_bare)
>
> base-commit: 277cf0bc36094f6dc4297d8c9cef79df045b735d

