From: "Liu Zhongbo via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Liu Zhongbo <liuzhongbo.gg@gmail.com>,
	Liu Zhongbo <liuzhongbo.6666@bytedance.com>
Subject: [PATCH] builtin/fetch: iterate symrefs instead of all when checking dangling refs
Date: Tue, 15 Oct 2024 03:27:58 +0000
Message-ID: <pull.1812.git.git.1728962878717.gitgitgadget@gmail.com>

From: Liu Zhongbo <liuzhongbo.6666@bytedance.com>

refs_warn_dangling_symref() traverses all references to check whether any
symbolic reference is left dangling. prune_refs() calls it once per deleted
reference, so the complexity is
O(number of deleted references * total number of references).
This takes a long time in a monorepo with tens of thousands of branches.

So first collect all symbolic references in a single pass, and then
traverse only those references for each deleted reference. The complexity
of the check becomes
O(number of deleted references * number of symbolic references).

Because symbolic references are used infrequently, this is a significant
performance improvement. In my case, the time spent in prune_refs()
dropped from 20 seconds to 4 seconds.
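
The difference between the two strategies described above can be sketched
with a toy model. This is only an illustration under stated assumptions:
`struct toy_ref`, `dangling_scan_all()` and `dangling_scan_symrefs()` are
hypothetical names that do not exist in git, and a plain array stands in
for the ref store. Each function reports how many refs it had to visit.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ref; `target` is non-NULL only for symrefs. */
struct toy_ref {
	const char *name;
	const char *target;
};

/* Old shape: rescan every ref once per deleted ref. */
size_t dangling_scan_all(const struct toy_ref *refs, size_t nr_refs,
			 const char *const *deleted, size_t nr_deleted,
			 size_t *visited)
{
	size_t found = 0;

	*visited = 0;
	for (size_t i = 0; i < nr_deleted; i++) {
		for (size_t j = 0; j < nr_refs; j++) {
			(*visited)++;
			if (refs[j].target && !strcmp(refs[j].target, deleted[i]))
				found++;
		}
	}
	return found;
}

/* New shape: collect symrefs in one pass, then check only those. */
size_t dangling_scan_symrefs(const struct toy_ref *refs, size_t nr_refs,
			     const char *const *deleted, size_t nr_deleted,
			     size_t *visited)
{
	const struct toy_ref **symrefs;
	size_t nr_symrefs = 0, found = 0;

	symrefs = malloc((nr_refs ? nr_refs : 1) * sizeof(*symrefs));
	if (!symrefs) /* out of memory: fall back to the full scan */
		return dangling_scan_all(refs, nr_refs, deleted, nr_deleted,
					 visited);

	*visited = 0;
	/* Phase 1: a single pass over all refs. */
	for (size_t j = 0; j < nr_refs; j++) {
		(*visited)++;
		if (refs[j].target)
			symrefs[nr_symrefs++] = &refs[j];
	}
	/* Phase 2: each deleted ref is compared against symrefs only. */
	for (size_t i = 0; i < nr_deleted; i++) {
		for (size_t k = 0; k < nr_symrefs; k++) {
			(*visited)++;
			if (!strcmp(symrefs[k]->target, deleted[i]))
				found++;
		}
	}
	free(symrefs);
	return found;
}
```

With N refs, S symrefs and D deleted refs, the first function visits D * N
entries and the second visits N + D * S, which is where the saving comes
from when S is much smaller than N.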

Signed-off-by: Liu Zhongbo <liuzhongbo.6666@bytedance.com>
---
    builtin/fetch: iterate symrefs instead of all refs

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1812%2Flzb6666%2Fspeed_up_prune_refs-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1812/lzb6666/speed_up_prune_refs-v1
Pull-Request: https://github.com/git/git/pull/1812

 builtin/fetch.c |  7 +++++--
 refs.c          | 35 ++++++++++++++++++++++++++---------
 refs.h          |  4 +++-
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 80a64d0d269..ec4be60cfeb 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1412,15 +1412,18 @@ static int prune_refs(struct display_state *display_state,
 
 	if (verbosity >= 0) {
 		int summary_width = transport_summary_width(stale_refs);
+		struct string_list symrefs = STRING_LIST_INIT_DUP;
+		refs_get_symrefs(get_main_ref_store(the_repository), &symrefs);
 
 		for (ref = stale_refs; ref; ref = ref->next) {
 			display_ref_update(display_state, '-', _("[deleted]"), NULL,
 					   _("(none)"), ref->name,
 					   &ref->new_oid, &ref->old_oid,
 					   summary_width);
-			refs_warn_dangling_symref(get_main_ref_store(the_repository),
-						  stderr, dangling_msg, ref->name);
+			refs_warn_dangling_symref(get_main_ref_store(the_repository),
+						  stderr, dangling_msg, ref->name, &symrefs);
 		}
+		string_list_clear(&symrefs, 0);
 	}
 
 cleanup:
diff --git a/refs.c b/refs.c
index 5f729ed4124..8dd480a7a91 100644
--- a/refs.c
+++ b/refs.c
@@ -463,16 +463,33 @@ static int warn_if_dangling_symref(const char *refname, const char *referent UNU
 	return 0;
 }
 
-void refs_warn_dangling_symref(struct ref_store *refs, FILE *fp,
-			       const char *msg_fmt, const char *refname)
+static int append_symref(const char *refname, const char *referent UNUSED,
+			 const struct object_id *oid UNUSED,
+			 int flags, void *cb_data) {
+	struct string_list *d = cb_data;
+	if (flags & REF_ISSYMREF) {
+		string_list_append(d, refname);
+	}
+	return 0;
+}
+
+void refs_get_symrefs(struct ref_store *refs, struct string_list *refnames)
 {
-	struct warn_if_dangling_data data = {
-		.refs = refs,
-		.fp = fp,
-		.refname = refname,
-		.msg_fmt = msg_fmt,
-	};
-	refs_for_each_rawref(refs, warn_if_dangling_symref, &data);
+	refs_for_each_rawref(refs, append_symref, refnames);
+}
+
+void refs_warn_dangling_symref(struct ref_store *refs, FILE *fp,
+			       const char *msg_fmt, const char *refname, struct string_list *symrefs) {
+	const char *resolves_to;
+	struct string_list_item *symref;
+	for_each_string_list_item(symref, symrefs) {
+		resolves_to = refs_resolve_ref_unsafe(refs, symref->string,
+						      0, NULL, NULL);
+		if (resolves_to && strcmp(resolves_to, refname) == 0) {
+			fprintf(fp, msg_fmt, symref->string);
+			fputc('\n', fp);
+		}
+	}
 }
 
 void refs_warn_dangling_symrefs(struct ref_store *refs, FILE *fp,
diff --git a/refs.h b/refs.h
index 108dfc93b34..d3b65564561 100644
--- a/refs.h
+++ b/refs.h
@@ -394,8 +394,10 @@ static inline const char *has_glob_specials(const char *pattern)
 	return strpbrk(pattern, "?*[");
 }
 
+void refs_get_symrefs(struct ref_store *refs, struct string_list *refnames);
+
 void refs_warn_dangling_symref(struct ref_store *refs, FILE *fp,
-			       const char *msg_fmt, const char *refname);
+			       const char *msg_fmt, const char *refname, struct string_list *symrefs);
 void refs_warn_dangling_symrefs(struct ref_store *refs, FILE *fp,
 				const char *msg_fmt, const struct string_list *refnames);
 

base-commit: ef8ce8f3d4344fd3af049c17eeba5cd20d98b69f
-- 
gitgitgadget

Thread overview: 3+ messages
2024-10-15  3:27 Liu Zhongbo via GitGitGadget [this message]
2024-10-15 19:08 ` [PATCH] builtin/fetch: iterate symrefs instead of all when checking dangling refs Taylor Blau
2024-10-16  7:13 ` Patrick Steinhardt