git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] fetch --prune performance problem
@ 2025-06-18 21:08 Phil Hord
  2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Phil Hord @ 2025-06-18 21:08 UTC (permalink / raw)
  To: git; +Cc: Phil Hord

From: Phil Hord <phil.hord@gmail.com>

`git fetch --prune` runs in O(N^2) time normally. This happens because the code
iterates over each ref to be pruned to display its status. In a repo with
174,000 refs, where I was pruning 15,000 refs, the current code made 2.6 billion
calls to strcmp and consumed 470 seconds of CPU. After this change, the same
operation completes in under 1 second.

The loop looks like this:

    for p in prune_refs { for ref in all_refs { if p == ref { ... }}}

That loop runs only to check for and report newly dangling refs. A workaround to
avoid this slowness is to run with `-q` to bypass this check.

There is similar check/report functionality in `git remote prune`, but it uses a
more efficient method to check for dangling refs. prune_refs is first sorted, so
it can be searched in O(logN), so this loop is O(N*logN).

    for ref in all_refs { if ref in prune_refs { ... }}

My patch fixes this for fetch, but it affects the command's output order.
Currently the results look like this:

     - [deleted]     (none) -> origin/bar
       (origin/bar has become dangling)
     - [deleted]     (none) -> origin/baz
     - [deleted]     (none) -> origin/foo
       (origin/foo has become dangling)
     - [deleted]     (none) -> origin/frotz

After my change, the order will change so the danglers are reported at the end.

     - [deleted]     (none) -> origin/bar
     - [deleted]     (none) -> origin/baz
     - [deleted]     (none) -> origin/foo
     - [deleted]     (none) -> origin/frotz
       (origin/bar has become dangling)
       (origin/foo has become dangling)

The latter format is close to how `git remote prune` works, but the formatting
is a bit different. I can coerce my change into something that preserves the
original order, but it will be quite a bit messier.

Q: Does anyone care enough about the command output ordering that they think
   it's worth the extra code complexity?

Phil Hord (2):
  fetch-prune: optimize dangling-ref reporting
  refs: remove old refs_warn_dangling_symref

 builtin/fetch.c | 16 ++++++++--------
 refs.c          | 17 +----------------
 2 files changed, 9 insertions(+), 24 deletions(-)

-- 
2.50.0.1.gf2ab606906.dirty


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting
  2025-06-18 21:08 [RFC PATCH 0/2] fetch --prune performance problem Phil Hord
@ 2025-06-18 21:08 ` Phil Hord
  2025-06-18 21:50   ` Junio C Hamano
                     ` (2 more replies)
  2025-06-18 21:08 ` [RFC PATCH 2/2] refs: remove old refs_warn_dangling_symref Phil Hord
  2025-06-18 23:15 ` [RFC PATCH 0/2] fetch --prune performance problem Jacob Keller
  2 siblings, 3 replies; 14+ messages in thread
From: Phil Hord @ 2025-06-18 21:08 UTC (permalink / raw)
  To: git; +Cc: Phil Hord

From: Phil Hord <phil.hord@gmail.com>

When pruning during `git fetch` we check each pruned ref against the
ref_store one at a time to decide whether to report it as dangling.
This causes every local ref to be scanned for each ref being pruned.

If there are N refs in the repo and M refs being pruned, this code is
O(M*N). However, `git remote prune` uses a very similar function that
is only O(N*log(M)).

Remove the wasteful ref scanning for each pruned ref and use the faster
version already available in refs_warn_dangling_symrefs.

In a repo with 126,000 refs, where I was pruning 28,000 refs, this
code made about 3.6 billion calls to strcmp and consumed 410 seconds
of CPU. (Invariably in that time, my remote would timeout and the
fetch would fail anyway.)

After this change, the same operation completes in under 4 seconds.

I considered further optimizing this function to be O(N), but this
requires ref_store iterators to be sorted, too. I found some suggestions
that this is always the case, but I'm not certain it is.

The current speedup is enough for our needs at the moment.

This change causes a reordering of the output for any reported dangling
refs. Previously they would be reported inline with the "fetch: prune"
messages.  Now they will be reported after all the original prune
messages are complete.

Signed-off-by: Phil Hord <phil.hord@gmail.com>
---
 builtin/fetch.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 40a0e8d24434..11ce51da780a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1383,10 +1383,14 @@ static int prune_refs(struct display_state *display_state,
 	int result = 0;
 	struct ref *ref, *stale_refs = get_stale_heads(rs, ref_map);
 	struct strbuf err = STRBUF_INIT;
+	struct string_list refnames = STRING_LIST_INIT_NODUP;
 	const char *dangling_msg = dry_run
 		? _("   (%s will become dangling)")
 		: _("   (%s has become dangling)");
 
+	for (ref = stale_refs; ref; ref = ref->next)
+		string_list_append(&refnames, ref->name);
+
 	if (!dry_run) {
 		if (transaction) {
 			for (ref = stale_refs; ref; ref = ref->next) {
@@ -1396,15 +1400,9 @@ static int prune_refs(struct display_state *display_state,
 					goto cleanup;
 			}
 		} else {
-			struct string_list refnames = STRING_LIST_INIT_NODUP;
-
-			for (ref = stale_refs; ref; ref = ref->next)
-				string_list_append(&refnames, ref->name);
-
 			result = refs_delete_refs(get_main_ref_store(the_repository),
 						  "fetch: prune", &refnames,
 						  0);
-			string_list_clear(&refnames, 0);
 		}
 	}
 
@@ -1416,12 +1414,14 @@ static int prune_refs(struct display_state *display_state,
 					   _("(none)"), ref->name,
 					   &ref->new_oid, &ref->old_oid,
 					   summary_width);
-			refs_warn_dangling_symref(get_main_ref_store(the_repository),
-						  stderr, dangling_msg, ref->name);
 		}
+		string_list_sort(&refnames);
+		refs_warn_dangling_symrefs(get_main_ref_store(the_repository),
+					   stderr, dangling_msg, &refnames);
 	}
 
 cleanup:
+	string_list_clear(&refnames, 0);
 	strbuf_release(&err);
 	free_refs(stale_refs);
 	return result;
-- 
2.50.0.1.gf2ab606906.dirty


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 2/2] refs: remove old refs_warn_dangling_symref
  2025-06-18 21:08 [RFC PATCH 0/2] fetch --prune performance problem Phil Hord
  2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
@ 2025-06-18 21:08 ` Phil Hord
  2025-06-18 23:15 ` [RFC PATCH 0/2] fetch --prune performance problem Jacob Keller
  2 siblings, 0 replies; 14+ messages in thread
From: Phil Hord @ 2025-06-18 21:08 UTC (permalink / raw)
  To: git; +Cc: Phil Hord

From: Phil Hord <phil.hord@gmail.com>

The dangling warning function that takes a single ref to search for
is no longer used.  Remove it.

Signed-off-by: Phil Hord <phil.hord@gmail.com>
---
 refs.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/refs.c b/refs.c
index dce5c49ca2ba..0669d8e07072 100644
--- a/refs.c
+++ b/refs.c
@@ -438,7 +438,6 @@ static int for_each_filter_refs(const char *refname, const char *referent,
 struct warn_if_dangling_data {
 	struct ref_store *refs;
 	FILE *fp;
-	const char *refname;
 	const struct string_list *refnames;
 	const char *msg_fmt;
 };
@@ -455,9 +454,7 @@ static int warn_if_dangling_symref(const char *refname, const char *referent UNU
 
 	resolves_to = refs_resolve_ref_unsafe(d->refs, refname, 0, NULL, NULL);
 	if (!resolves_to
-	    || (d->refname
-		? strcmp(resolves_to, d->refname)
-		: !string_list_has_string(d->refnames, resolves_to))) {
+	    || !string_list_has_string(d->refnames, resolves_to)) {
 		return 0;
 	}
 
@@ -466,18 +463,6 @@ static int warn_if_dangling_symref(const char *refname, const char *referent UNU
 	return 0;
 }
 
-void refs_warn_dangling_symref(struct ref_store *refs, FILE *fp,
-			       const char *msg_fmt, const char *refname)
-{
-	struct warn_if_dangling_data data = {
-		.refs = refs,
-		.fp = fp,
-		.refname = refname,
-		.msg_fmt = msg_fmt,
-	};
-	refs_for_each_rawref(refs, warn_if_dangling_symref, &data);
-}
-
 void refs_warn_dangling_symrefs(struct ref_store *refs, FILE *fp,
 				const char *msg_fmt, const struct string_list *refnames)
 {
-- 
2.50.0.1.gf2ab606906.dirty


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting
  2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
@ 2025-06-18 21:50   ` Junio C Hamano
  2025-06-18 23:18   ` Jacob Keller
  2025-06-19  4:00   ` Jeff King
  2 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2025-06-18 21:50 UTC (permalink / raw)
  To: Phil Hord; +Cc: git

Phil Hord <phil.hord@gmail.com> writes:

> From: Phil Hord <phil.hord@gmail.com>
>
> When pruning during `git fetch` we check each pruned ref against the
> ref_store one at a time to decide whether to report it as dangling.
> This causes every local ref to be scanned for each ref being pruned.
>
> If there are N refs in the repo and M refs being pruned, this code is
> O(M*N). However, `git remote prune` uses a very similar function that
> is only O(N*log(M)).
>
> Remove the wasteful ref scanning for each pruned ref and use the faster
> version already available in refs_warn_dangling_symrefs.
>
> In a repo with 126,000 refs, where I was pruning 28,000 refs, this
> code made about 3.6 billion calls to strcmp and consumed 410 seconds
> of CPU. (Invariably in that time, my remote would timeout and the
> fetch would fail anyway.)
>
> After this change, the same operation completes in under 4 seconds.

Nice.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/2] fetch --prune performance problem
  2025-06-18 21:08 [RFC PATCH 0/2] fetch --prune performance problem Phil Hord
  2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
  2025-06-18 21:08 ` [RFC PATCH 2/2] refs: remove old refs_warn_dangling_symref Phil Hord
@ 2025-06-18 23:15 ` Jacob Keller
  2025-06-19  3:37   ` Jeff King
  2 siblings, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2025-06-18 23:15 UTC (permalink / raw)
  To: Phil Hord, git



On 6/18/2025 2:08 PM, Phil Hord wrote:
> My patch fixes this for fetch, but it affects the command's output order.
> Currently the results look like this:
> 
>      - [deleted]     (none) -> origin/bar
>        (origin/bar has become dangling)
>      - [deleted]     (none) -> origin/baz
>      - [deleted]     (none) -> origin/foo
>        (origin/foo has become dangling)
>      - [deleted]     (none) -> origin/frotz
> 
> After my change, the order will change so the danglers are reported at the end.
> 
>      - [deleted]     (none) -> origin/bar
>      - [deleted]     (none) -> origin/baz
>      - [deleted]     (none) -> origin/foo
>      - [deleted]     (none) -> origin/frotz
>        (origin/bar has become dangling)
>        (origin/foo has become dangling)

Personally, I like the later output. I have no idea why anyone would be
specifically scripting something that depends on the ordering being such
that dangling messages are printed immediately.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting
  2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
  2025-06-18 21:50   ` Junio C Hamano
@ 2025-06-18 23:18   ` Jacob Keller
  2025-06-19  4:00   ` Jeff King
  2 siblings, 0 replies; 14+ messages in thread
From: Jacob Keller @ 2025-06-18 23:18 UTC (permalink / raw)
  To: Phil Hord, git



On 6/18/2025 2:08 PM, Phil Hord wrote:
> From: Phil Hord <phil.hord@gmail.com>
> 
> When pruning during `git fetch` we check each pruned ref against the
> ref_store one at a time to decide whether to report it as dangling.
> This causes every local ref to be scanned for each ref being pruned.
> 
> If there are N refs in the repo and M refs being pruned, this code is
> O(M*N). However, `git remote prune` uses a very similar function that
> is only O(N*log(M)).
> 
> Remove the wasteful ref scanning for each pruned ref and use the faster
> version already available in refs_warn_dangling_symrefs.
> 
> In a repo with 126,000 refs, where I was pruning 28,000 refs, this
> code made about 3.6 billion calls to strcmp and consumed 410 seconds
> of CPU. (Invariably in that time, my remote would timeout and the
> fetch would fail anyway.)
> 
> After this change, the same operation completes in under 4 seconds.
> 

The cover letter said "under a second". Is this a different example?

> I considered further optimizing this function to be O(N), but this
> requires ref_store iterators to be sorted, too. I found some suggestions
> that this is always the case, but I'm not certain it is.
> 
> The current speedup is enough for our needs at the moment.
> 

Yep. Logarithmic scaling grows slow enough that this is probably
reasonable unless someone wants to put the remaining effort in.

> This change causes a reordering of the output for any reported dangling
> refs. Previously they would be reported inline with the "fetch: prune"
> messages.  Now they will be reported after all the original prune
> messages are complete.
> 

I think this is reasonable especially for the speedup.

> Signed-off-by: Phil Hord <phil.hord@gmail.com>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

>  builtin/fetch.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 40a0e8d24434..11ce51da780a 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1383,10 +1383,14 @@ static int prune_refs(struct display_state *display_state,
>  	int result = 0;
>  	struct ref *ref, *stale_refs = get_stale_heads(rs, ref_map);
>  	struct strbuf err = STRBUF_INIT;
> +	struct string_list refnames = STRING_LIST_INIT_NODUP;
>  	const char *dangling_msg = dry_run
>  		? _("   (%s will become dangling)")
>  		: _("   (%s has become dangling)");
>  
> +	for (ref = stale_refs; ref; ref = ref->next)
> +		string_list_append(&refnames, ref->name);
> +
>  	if (!dry_run) {
>  		if (transaction) {
>  			for (ref = stale_refs; ref; ref = ref->next) {
> @@ -1396,15 +1400,9 @@ static int prune_refs(struct display_state *display_state,
>  					goto cleanup;
>  			}
>  		} else {
> -			struct string_list refnames = STRING_LIST_INIT_NODUP;
> -
> -			for (ref = stale_refs; ref; ref = ref->next)
> -				string_list_append(&refnames, ref->name);
> -
>  			result = refs_delete_refs(get_main_ref_store(the_repository),
>  						  "fetch: prune", &refnames,
>  						  0);
> -			string_list_clear(&refnames, 0);
>  		}
>  	}
>  
> @@ -1416,12 +1414,14 @@ static int prune_refs(struct display_state *display_state,
>  					   _("(none)"), ref->name,
>  					   &ref->new_oid, &ref->old_oid,
>  					   summary_width);
> -			refs_warn_dangling_symref(get_main_ref_store(the_repository),
> -						  stderr, dangling_msg, ref->name);
>  		}
> +		string_list_sort(&refnames);
> +		refs_warn_dangling_symrefs(get_main_ref_store(the_repository),
> +					   stderr, dangling_msg, &refnames);
>  	}
>  
>  cleanup:
> +	string_list_clear(&refnames, 0);
>  	strbuf_release(&err);
>  	free_refs(stale_refs);
>  	return result;


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/2] fetch --prune performance problem
  2025-06-18 23:15 ` [RFC PATCH 0/2] fetch --prune performance problem Jacob Keller
@ 2025-06-19  3:37   ` Jeff King
  2025-06-19 17:18     ` Junio C Hamano
       [not found]     ` <CABURp0p4d0JPg=-cW1OZdFQJ+vNT_0PDd9Rv3oz6toFGqGv5=g@mail.gmail.com>
  0 siblings, 2 replies; 14+ messages in thread
From: Jeff King @ 2025-06-19  3:37 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Phil Hord, git

On Wed, Jun 18, 2025 at 04:15:03PM -0700, Jacob Keller wrote:

> On 6/18/2025 2:08 PM, Phil Hord wrote:
> > My patch fixes this for fetch, but it affects the command's output order.
> > Currently the results look like this:
> > 
> >      - [deleted]     (none) -> origin/bar
> >        (origin/bar has become dangling)
> >      - [deleted]     (none) -> origin/baz
> >      - [deleted]     (none) -> origin/foo
> >        (origin/foo has become dangling)
> >      - [deleted]     (none) -> origin/frotz
> > 
> > After my change, the order will change so the danglers are reported at the end.
> > 
> >      - [deleted]     (none) -> origin/bar
> >      - [deleted]     (none) -> origin/baz
> >      - [deleted]     (none) -> origin/foo
> >      - [deleted]     (none) -> origin/frotz
> >        (origin/bar has become dangling)
> >        (origin/foo has become dangling)
> 
> Personally, I like the later output. I have no idea why anyone would be
> specifically scripting something that depends on the ordering being such
> that dangling messages are printed immediately.

I think the original ordering tells you which deletion caused the ref to
become dangling. Phil's example is a little confusing here:

    - [deleted]     (none) -> origin/bar
      (origin/bar has become dangling)

because the name is the same in both cases. A more likely output is that
origin/HEAD becomes dangling (since it's the only symref Git ever
automatically points at a tracking ref). E.g., in this:

  git init repo
  cd repo
  
  git commit --allow-empty -m foo
  git branch some
  git branch other
  git branch branches
  
  git clone . child
  cur=$(git symbolic-ref --short HEAD)
  git checkout some
  git branch -d other branches $cur
  
  cd child
  git fetch --prune

The final fetch output looks like:

   - [deleted]         (none)     -> origin/branches
   - [deleted]         (none)     -> origin/main
     (refs/remotes/origin/HEAD has become dangling)
   - [deleted]         (none)     -> origin/other

and we can see that the deletion of "main" is what caused the dangling.

That said, I'm not sure I care that much. I didn't even know we had this
dangling message, and it's been around for over 15 years!

If we did want to preserve the ordering, it could be done by taking two
passes (the first to create a reverse map of deletions to danglers, and
then the second to print each ref).

Alternatively, the dangling message could just mention where it the
now-dangling symref points at, something like:

   - [deleted]         (none)     -> origin/branches
   - [deleted]         (none)     -> origin/main
   - [deleted]         (none)     -> origin/other
     (refs/remotes/origin/HEAD points to the now-deleted origin/main)

I dunno. I guess anybody who really cares can run "git symbolic-ref
origin/HEAD" themselves to get that information.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting
  2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
  2025-06-18 21:50   ` Junio C Hamano
  2025-06-18 23:18   ` Jacob Keller
@ 2025-06-19  4:00   ` Jeff King
  2025-06-19 11:01     ` Lidong Yan
  2 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2025-06-19  4:00 UTC (permalink / raw)
  To: Phil Hord; +Cc: git

On Wed, Jun 18, 2025 at 02:08:39PM -0700, Phil Hord wrote:

> From: Phil Hord <phil.hord@gmail.com>
> 
> When pruning during `git fetch` we check each pruned ref against the
> ref_store one at a time to decide whether to report it as dangling.
> This causes every local ref to be scanned for each ref being pruned.
> 
> If there are N refs in the repo and M refs being pruned, this code is
> O(M*N). However, `git remote prune` uses a very similar function that
> is only O(N*log(M)).
> 
> Remove the wasteful ref scanning for each pruned ref and use the faster
> version already available in refs_warn_dangling_symrefs.
> 
> In a repo with 126,000 refs, where I was pruning 28,000 refs, this
> code made about 3.6 billion calls to strcmp and consumed 410 seconds
> of CPU. (Invariably in that time, my remote would timeout and the
> fetch would fail anyway.)
> 
> After this change, the same operation completes in under 4 seconds.

Very nice. I left some thoughts on the ordering question elsewhere, but
I'd be OK with this approach, too.

> I considered further optimizing this function to be O(N), but this
> requires ref_store iterators to be sorted, too. I found some suggestions
> that this is always the case, but I'm not certain it is.
> 
> The current speedup is enough for our needs at the moment.

I think we do guarantee the output order, and for-each-ref at least
takes advantage of this since 2e7c6d2f41 (ref-filter: format iteratively
with lexicographic refname sorting, 2024-10-21).

That said, I wouldn't be surprised if there are other n-log-n bits of
the code, and we usually consider that "good enough". So stopping here
is probably fine.

> +	struct string_list refnames = STRING_LIST_INIT_NODUP;
>  	const char *dangling_msg = dry_run
>  		? _("   (%s will become dangling)")
>  		: _("   (%s has become dangling)");
>  
> +	for (ref = stale_refs; ref; ref = ref->next)
> +		string_list_append(&refnames, ref->name);

I was going to suggest using strset over string_list, since I think we
prefer that these days for a simple set-inclusion check. But...

> +		string_list_sort(&refnames);
> +		refs_warn_dangling_symrefs(get_main_ref_store(the_repository),
> +					   stderr, dangling_msg, &refnames);

...we are ultimately relying on refs_warn_dangling_symrefs(), so we'd
have to update its interface. And we also reuse the list (here, after
your patch, but already in remote.c) to pass to refs_delete_refs(). So
probably not worth it.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting
  2025-06-19  4:00   ` Jeff King
@ 2025-06-19 11:01     ` Lidong Yan
  2025-06-19 14:41       ` Lidong Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Lidong Yan @ 2025-06-19 11:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Phil Hord, git

Jeff King <peff@peff.net> writes:
> ...we are ultimately relying on refs_warn_dangling_symrefs(), so we'd
> have to update its interface. And we also reuse the list (here, after
> your patch, but already in remote.c) to pass to refs_delete_refs(). So
> probably not worth it.

This patch only adds sorting code to prune_refs(), and as far as I can tell,
prune_refs() is only called once during git fetch. So I was just wondering,
would it be problematic if we moved the string_list_sort() into
refs_warn_dangling_symref() instead? And if it turns out to be safe, could
we perhaps even use strset in refs_warn_dangling_symref()?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting
  2025-06-19 11:01     ` Lidong Yan
@ 2025-06-19 14:41       ` Lidong Yan
  0 siblings, 0 replies; 14+ messages in thread
From: Lidong Yan @ 2025-06-19 14:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Phil Hord, git

Lidong Yan <yldhome2d2@gmail.com> writes:
> 
> This patch only adds sorting code to prune_refs(), and as far as I can tell,
> prune_refs() is only called once during git fetch. So I was just wondering,
> would it be problematic if we moved the string_list_sort() into
> refs_warn_dangling_symref() instead? And if it turns out to be safe, could
> we perhaps even use strset in refs_warn_dangling_symref()?

Ah, sorry I make a mistake. We can’t sort string_list in refs_warn_dangling_symref().

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/2] fetch --prune performance problem
  2025-06-19  3:37   ` Jeff King
@ 2025-06-19 17:18     ` Junio C Hamano
       [not found]     ` <CABURp0p4d0JPg=-cW1OZdFQJ+vNT_0PDd9Rv3oz6toFGqGv5=g@mail.gmail.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2025-06-19 17:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Jacob Keller, Phil Hord, git

Jeff King <peff@peff.net> writes:

> The final fetch output looks like:
>
>    - [deleted]         (none)     -> origin/branches
>    - [deleted]         (none)     -> origin/main
>      (refs/remotes/origin/HEAD has become dangling)
>    - [deleted]         (none)     -> origin/other
>
> and we can see that the deletion of "main" is what caused the dangling.
>
> That said, I'm not sure I care that much. I didn't even know we had this
> dangling message, and it's been around for over 15 years!

Same here.  I agree that the new output, while it may look prettier,
loses information.  I agree with your conclusion that the user who
really cares can check with symbolic-ref themselves.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/2] fetch --prune performance problem
       [not found]     ` <CABURp0p4d0JPg=-cW1OZdFQJ+vNT_0PDd9Rv3oz6toFGqGv5=g@mail.gmail.com>
@ 2025-06-23 23:32       ` Jacob Keller
  2025-06-23 23:41         ` Junio C Hamano
       [not found]         ` <CABURp0q-1FGmD+PJeSQ=xvyDN6ZYn1O7Fh8i1OojfD2WQCqgcw@mail.gmail.com>
  0 siblings, 2 replies; 14+ messages in thread
From: Jacob Keller @ 2025-06-23 23:32 UTC (permalink / raw)
  To: Phil Hord, Jeff King; +Cc: git



On 6/23/2025 4:11 PM, Phil Hord wrote:
> On Wed, Jun 18, 2025 at 8:37 PM Jeff King <peff@peff.net> wrote:
>> On Wed, Jun 18, 2025 at 04:15:03PM -0700, Jacob Keller wrote:
>>> On 6/18/2025 2:08 PM, Phil Hord wrote:
>>>> My patch fixes this for fetch, but it affects the command's output
> order.
>>>> Currently the results look like this:
>>>>
>>>>      - [deleted]     (none) -> origin/bar
>>>>        (origin/bar has become dangling)
>>>>      - [deleted]     (none) -> origin/baz
>>>>      - [deleted]     (none) -> origin/foo
>>>>        (origin/foo has become dangling)
>>>>      - [deleted]     (none) -> origin/frotz
>>>>
>>>> After my change, the order will change so the danglers are reported
> at the end.
>>>>
>>>>      - [deleted]     (none) -> origin/bar
>>>>      - [deleted]     (none) -> origin/baz
>>>>      - [deleted]     (none) -> origin/foo
>>>>      - [deleted]     (none) -> origin/frotz
>>>>        (origin/bar has become dangling)
>>>>        (origin/foo has become dangling)
>>>
>>> Personally, I like the later output. I have no idea why anyone would be
>>> specifically scripting something that depends on the ordering being such
>>> that dangling messages are printed immediately.
>>
>> I think the original ordering tells you which deletion caused the ref to
>> become dangling. Phil's example is a little confusing here:
>>
>>     - [deleted]     (none) -> origin/bar
>>       (origin/bar has become dangling)
>>
>> because the name is the same in both cases. A more likely output is that
>> origin/HEAD becomes dangling (since it's the only symref Git ever
>> automatically points at a tracking ref). E.g., in this:
>>
>>   git init repo
>>   cd repo
>>
>>   git commit --allow-empty -m foo
>>   git branch some
>>   git branch other
>>   git branch branches
>>
>>   git clone . child
>>   cur=$(git symbolic-ref --short HEAD)
>>   git checkout some
>>   git branch -d other branches $cur
>>
>>   cd child
>>   git fetch --prune
> 
> Thanks for the helpful demo and clarification of the real output.
> 
>> The final fetch output looks like:
>>
>>    - [deleted]         (none)     -> origin/branches
>>    - [deleted]         (none)     -> origin/main
>>      (refs/remotes/origin/HEAD has become dangling)
>>    - [deleted]         (none)     -> origin/other
>>
>> and we can see that the deletion of "main" is what caused the dangling.
>>
>> That said, I'm not sure I care that much. I didn't even know we had this
>> dangling message, and it's been around for over 15 years!
>>
>> If we did want to preserve the ordering, it could be done by taking two
>> passes (the first to create a reverse map of deletions to danglers, and
>> then the second to print each ref).
>>
>> Alternatively, the dangling message could just mention where it the
>> now-dangling symref points at, something like:
>>
>>    - [deleted]         (none)     -> origin/branches
>>    - [deleted]         (none)     -> origin/main
>>    - [deleted]         (none)     -> origin/other
>>      (refs/remotes/origin/HEAD points to the now-deleted origin/main)
> 
> I have a new patch that produces this:
> 
>     + git fetch --prune --dry-run
>     From /tmp/repo/.
>      - [deleted]                   (none)     -> origin/branches
>      - [deleted]                   (none)     -> origin/master
>      - [deleted]                   (none)     -> origin/other
>        origin/HEAD will become dangling after origin/master is deleted
> 


It is a bit weird that this says "will become dangling after <ref> is
deleted" because the deletion already happened.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/2] fetch --prune performance problem
  2025-06-23 23:32       ` Jacob Keller
@ 2025-06-23 23:41         ` Junio C Hamano
       [not found]         ` <CABURp0q-1FGmD+PJeSQ=xvyDN6ZYn1O7Fh8i1OojfD2WQCqgcw@mail.gmail.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2025-06-23 23:41 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Phil Hord, Jeff King, git

Jacob Keller <jacob.e.keller@intel.com> writes:

>>> Alternatively, the dangling message could just mention where it the
>>> now-dangling symref points at, something like:
>>>
>>>    - [deleted]         (none)     -> origin/branches
>>>    - [deleted]         (none)     -> origin/main
>>>    - [deleted]         (none)     -> origin/other
>>>      (refs/remotes/origin/HEAD points to the now-deleted origin/main)
>> 
>> I have a new patch that produces this:
>> 
>>     + git fetch --prune --dry-run
>>     From /tmp/repo/.
>>      - [deleted]                   (none)     -> origin/branches
>>      - [deleted]                   (none)     -> origin/master
>>      - [deleted]                   (none)     -> origin/other
>>        origin/HEAD will become dangling after origin/master is deleted
>> 
>
>
> It is a bit weird that this says "will become dangling after <ref> is
> deleted" because the deletion already happened.

But that is with "--dry-run".  Without it, presumably 

    origin/HEAD is now dangling since origin/master was deleted

or something, probably.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/2] fetch --prune performance problem
       [not found]         ` <CABURp0q-1FGmD+PJeSQ=xvyDN6ZYn1O7Fh8i1OojfD2WQCqgcw@mail.gmail.com>
@ 2025-06-23 23:46           ` Jacob Keller
  0 siblings, 0 replies; 14+ messages in thread
From: Jacob Keller @ 2025-06-23 23:46 UTC (permalink / raw)
  To: Phil Hord; +Cc: Jeff King, git



On 6/23/2025 4:40 PM, Phil Hord wrote:
> On Mon, Jun 23, 2025 at 4:32 PM Jacob Keller <jacob.e.keller@intel.com>
>> On 6/23/2025 4:11 PM, Phil Hord wrote:
>>> I have a new patch that produces this:
>>>
>>>     + git fetch --prune --dry-run
>>>     From /tmp/repo/.
>>>      - [deleted]                   (none)     -> origin/branches
>>>      - [deleted]                   (none)     -> origin/master
>>>      - [deleted]                   (none)     -> origin/other
>>>        origin/HEAD will become dangling after origin/master is deleted
>>>
>>
>>
>> It is a bit weird that this says "will become dangling after <ref> is
>> deleted" because the deletion already happened.
> 
> That's because I used the `--dry-run` switch.  Sorry for the confusion.
> 
>     + git fetch --prune
>     From /tmp/repo/.
>      - [deleted]                   (none)     -> origin/branches
>      - [deleted]                   (none)     -> origin/master
>      - [deleted]                   (none)     -> origin/other
>        origin/HEAD has become dangling after origin/master was deleted
> 

Aha! That is even better that it properly adjusts the text based on
--dry-run.

I like it.

Regards,
Jake

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-06-23 23:46 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-18 21:08 [RFC PATCH 0/2] fetch --prune performance problem Phil Hord
2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
2025-06-18 21:50   ` Junio C Hamano
2025-06-18 23:18   ` Jacob Keller
2025-06-19  4:00   ` Jeff King
2025-06-19 11:01     ` Lidong Yan
2025-06-19 14:41       ` Lidong Yan
2025-06-18 21:08 ` [RFC PATCH 2/2] refs: remove old refs_warn_dangling_symref Phil Hord
2025-06-18 23:15 ` [RFC PATCH 0/2] fetch --prune performance problem Jacob Keller
2025-06-19  3:37   ` Jeff King
2025-06-19 17:18     ` Junio C Hamano
     [not found]     ` <CABURp0p4d0JPg=-cW1OZdFQJ+vNT_0PDd9Rv3oz6toFGqGv5=g@mail.gmail.com>
2025-06-23 23:32       ` Jacob Keller
2025-06-23 23:41         ` Junio C Hamano
     [not found]         ` <CABURp0q-1FGmD+PJeSQ=xvyDN6ZYn1O7Fh8i1OojfD2WQCqgcw@mail.gmail.com>
2025-06-23 23:46           ` Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).