* `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) @ 2025-07-24 9:59 Han Jiang 2025-07-24 10:45 ` Jeff King 0 siblings, 1 reply; 10+ messages in thread From: Han Jiang @ 2025-07-24 9:59 UTC (permalink / raw) To: Git Mailing List What did you do before the bug happened? (Steps to reproduce your issue) mkdir --parents -- './server' './client'; git -C './server' init --bare './repo.git' branch_default_name="$(git -C './server/repo.git' branch --show-current)"; echo "$branch_default_name" git --git-dir='./server/repo.git' --work-tree='.' commit --message="$((++number))" --allow-empty git -C './client' init './repo' cd './client/repo' git remote add -m "$branch_default_name" server 'file://'"$(realpath '../../server/repo.git')" git remote --verbose git config list --local --show-scope --show-origin git symbolic-ref 'refs/remotes/server/HEAD' git remote rename server server2 git config list --local --show-scope --show-origin git symbolic-ref 'refs/remotes/server2/HEAD' git symbolic-ref 'refs/remotes/server/HEAD' What did you expect to happen? (Expected behavior) `git symbolic-ref 'refs/remotes/server/HEAD'` outputs "refs/remotes/server/master"; `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs "refs/remotes/server2/master". What happened instead? (Actual behavior) `git symbolic-ref 'refs/remotes/server/HEAD'` outputs "refs/remotes/server/master"; `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs "fatal: ref refs/remotes/server2/HEAD is not a symbolic ref". `git symbolic-ref 'refs/remotes/server/HEAD'` outputs "refs/remotes/server/master". What's different between what you expected and what actually happened? Anything else you want to add: Please review the rest of the bug report below. You can delete any lines you don't wish to share. [System Info] git version: git version 2.50.1.windows.1 cpu: x86_64 built from commit: 4d32d83913170b86f9753fca10e75cdb2223d1cc sizeof-long: 4 sizeof-size_t: 8 shell-path: D:/git-sdk-64/usr/bin/sh feature: fsmonitor--daemon libcurl: 8.14.1 OpenSSL: OpenSSL 3.2.4 11 Feb 2025 zlib: 1.3.1 SHA-1: SHA1_DC SHA-256: SHA256_BLK uname: Windows 10.0 26100 compiler info: gnuc: 15.1 libc info: no libc information available $SHELL (typically, interactive shell): C:\Program Files\Git\usr\bin\bash.exe [Enabled Hooks] not run from a git repository - no hooks to show ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-24 9:59 `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) Han Jiang @ 2025-07-24 10:45 ` Jeff King 2025-07-24 11:58 ` Patrick Steinhardt 0 siblings, 1 reply; 10+ messages in thread From: Jeff King @ 2025-07-24 10:45 UTC (permalink / raw) To: Han Jiang; +Cc: Patrick Steinhardt, Git Mailing List On Thu, Jul 24, 2025 at 09:59:45PM +1200, Han Jiang wrote: > What did you expect to happen? (Expected behavior) > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > "refs/remotes/server/master"; > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs > "refs/remotes/server2/master". > > What happened instead? (Actual behavior) > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > "refs/remotes/server/master"; > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs "fatal: ref > refs/remotes/server2/HEAD is not a symbolic ref". > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > "refs/remotes/server/master". Thanks for the report. I can reproduce the issue easily here. Probably a simpler reproduction is just: git init git remote add -m whatever server1 /does/not/need/to/exist git remote rename server1 server2 git symbolic-ref refs/remotes/server2/HEAD The problem is that the branch-renaming code in git-remote is not prepared to handle symrefs that don't resolve. This seems to make it work: diff --git a/builtin/remote.c b/builtin/remote.c index 5dd6cbbaee..478ea3a80c 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -630,7 +630,9 @@ static int read_remote_branches(const char *refname, const char *referent UNUSED if (starts_with(refname, buf.buf)) { item = string_list_append(rename->remote_branches, refname); symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), - refname, RESOLVE_REF_READING, + refname, + RESOLVE_REF_READING | + RESOLVE_REF_NO_RECURSE, NULL, &flag); if (symref && (flag & REF_ISSYMREF)) { item->util = xstrdup(symref); @@ -835,8 +837,8 @@ static int mv(int argc, const char **argv, const char *prefix, * First remove symrefs, then rename the rest, finally create * the new symrefs. */ - refs_for_each_ref(get_main_ref_store(the_repository), - read_remote_branches, &rename); + refs_for_each_rawref(get_main_ref_store(the_repository), + read_remote_branches, &rename); if (show_progress) { /* * Count symrefs twice, since "renaming" them is done by That is, we need two fixes: 1. When iterating over the refs, we need to cover _all_ refs, not just those that fully resolve (there's a related bug here: we'll silently ignore an actual broken or corrupt ref, whereas I think the right thing would probably be to try copying it and then complain loudly if we don't have the object). 2. When resolving each one, we shouldn't recurse. We're doing a shallow copy, not a deep one. Reading this code, though, I can't help but think that the recent "git refs migrate" command had to deal with all of these problems. I wonder if we could reuse its code. +cc pks for wisdom. -Peff PS I think there's a related bug. If we have a real upstream repo and try to "git fetch" with HEAD pointing to the unborn state, it will be overwritten! I think this is the remote.*.followRemoteHEAD logic being overly zealous in "create" mode. It should probably leave an existing value alone, even if it points to an unborn branch. ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-24 10:45 ` Jeff King @ 2025-07-24 11:58 ` Patrick Steinhardt 2025-07-24 13:03 ` Patrick Steinhardt 0 siblings, 1 reply; 10+ messages in thread From: Patrick Steinhardt @ 2025-07-24 11:58 UTC (permalink / raw) To: Jeff King; +Cc: Han Jiang, Git Mailing List On Thu, Jul 24, 2025 at 06:45:36AM -0400, Jeff King wrote: > On Thu, Jul 24, 2025 at 09:59:45PM +1200, Han Jiang wrote: > > > What did you expect to happen? (Expected behavior) > > > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > > "refs/remotes/server/master"; > > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs > > "refs/remotes/server2/master". > > > > What happened instead? (Actual behavior) > > > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > > "refs/remotes/server/master"; > > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs "fatal: ref > > refs/remotes/server2/HEAD is not a symbolic ref". > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > > "refs/remotes/server/master". > > Thanks for the report. I can reproduce the issue easily here. Probably a > simpler reproduction is just: > > git init > git remote add -m whatever server1 /does/not/need/to/exist > git remote rename server1 server2 > git symbolic-ref refs/remotes/server2/HEAD > > The problem is that the branch-renaming code in git-remote is not > prepared to handle symrefs that don't resolve. This seems to make it > work: > > diff --git a/builtin/remote.c b/builtin/remote.c > index 5dd6cbbaee..478ea3a80c 100644 > --- a/builtin/remote.c > +++ b/builtin/remote.c > @@ -630,7 +630,9 @@ static int read_remote_branches(const char *refname, const char *referent UNUSED > if (starts_with(refname, buf.buf)) { > item = string_list_append(rename->remote_branches, refname); > symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), > - refname, RESOLVE_REF_READING, > + refname, > + RESOLVE_REF_READING | > + RESOLVE_REF_NO_RECURSE, > NULL, &flag); > if (symref && (flag & REF_ISSYMREF)) { > item->util = xstrdup(symref); > @@ -835,8 +837,8 @@ static int mv(int argc, const char **argv, const char *prefix, > * First remove symrefs, then rename the rest, finally create > * the new symrefs. > */ > - refs_for_each_ref(get_main_ref_store(the_repository), > - read_remote_branches, &rename); > + refs_for_each_rawref(get_main_ref_store(the_repository), > + read_remote_branches, &rename); > if (show_progress) { > /* > * Count symrefs twice, since "renaming" them is done by > > That is, we need two fixes: > > 1. When iterating over the refs, we need to cover _all_ refs, not just > those that fully resolve (there's a related bug here: we'll > silently ignore an actual broken or corrupt ref, whereas I think > the right thing would probably be to try copying it and then > complain loudly if we don't have the object). > > 2. When resolving each one, we shouldn't recurse. We're doing a > shallow copy, not a deep one. > > Reading this code, though, I can't help but think that the recent "git > refs migrate" command had to deal with all of these problems. I wonder > if we could reuse its code. +cc pks for wisdom. I'm not sure whether we can easily reuse the code -- the use case is quite different, as the migration works across two totally independent refdbs. So all refs are recreated 1:1, without any renaming involved. But it certainly seems to me like this whole logic could use quite some love: - We create N+M*2 separate ref transactions, where N is the number of direct remote refs we need to migrate and M is the number of symbolic refs. This is bad with the "reftable" backend, but given that the N transactions are all renames that have to delete the old ref it's even quadratic in the worst case for the "files" backend because we may have to rewrite the packed-refs file for each such transaction. - It is way too brittle, as the update isn't even pretending to be atomic. We first delete everything, and then we recreate it. So if any of these updates fails we'll be left in an in-between state. - We shouldn't have to even call `refs_resolve_ref_unsafe()` at all, as the `read_remote_branches()` nowadays gets the referent as parameter. To demonstrate: $ git init --ref-format=files repo Initialized empty Git repository in /tmp/repo/.git/ $ cd repo/ /tmp/repo:HEAD $ git commit --allow-empty -m initial [main (root-commit) 00c2622] x $ git remote add origin /dev/null /tmp/repo:main $ for i in $(seq 100000); do printf "create refs/remotes/origin/branch-%d HEAD\n" $i; done | git update-ref --stdin /tmp/repo:main $ git pack-refs --all /tmp/repo:main $ time git remote rename origin renamed Renaming remote references: 0% (2216/100000) I stopped after a minute -- this will take hours to complete. So I think we should adapt this logic to use a single transaction. There's one catch, as refs_rename_ref()` also migrates any reflogs that exist. But with the recent infra that Karthik has added we can now also migrate reflogs, so that's all doable. Patrick ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-24 11:58 ` Patrick Steinhardt @ 2025-07-24 13:03 ` Patrick Steinhardt 2025-07-24 18:38 ` Junio C Hamano 2025-07-25 11:02 ` Jeff King 0 siblings, 2 replies; 10+ messages in thread From: Patrick Steinhardt @ 2025-07-24 13:03 UTC (permalink / raw) To: Jeff King; +Cc: Han Jiang, Git Mailing List On Thu, Jul 24, 2025 at 01:58:37PM +0200, Patrick Steinhardt wrote: > On Thu, Jul 24, 2025 at 06:45:36AM -0400, Jeff King wrote: > > On Thu, Jul 24, 2025 at 09:59:45PM +1200, Han Jiang wrote: > > > > > What did you expect to happen? (Expected behavior) > > > > > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > > > "refs/remotes/server/master"; > > > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs > > > "refs/remotes/server2/master". > > > > > > What happened instead? (Actual behavior) > > > > > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > > > "refs/remotes/server/master"; > > > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs "fatal: ref > > > refs/remotes/server2/HEAD is not a symbolic ref". > > > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs > > > "refs/remotes/server/master". > > > > Thanks for the report. I can reproduce the issue easily here. Probably a > > simpler reproduction is just: > > > > git init > > git remote add -m whatever server1 /does/not/need/to/exist > > git remote rename server1 server2 > > git symbolic-ref refs/remotes/server2/HEAD > > > > The problem is that the branch-renaming code in git-remote is not > > prepared to handle symrefs that don't resolve. This seems to make it > > work: > > > > diff --git a/builtin/remote.c b/builtin/remote.c > > index 5dd6cbbaee..478ea3a80c 100644 > > --- a/builtin/remote.c > > +++ b/builtin/remote.c > > @@ -630,7 +630,9 @@ static int read_remote_branches(const char *refname, const char *referent UNUSED > > if (starts_with(refname, buf.buf)) { > > item = string_list_append(rename->remote_branches, refname); > > symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), > > - refname, RESOLVE_REF_READING, > > + refname, > > + RESOLVE_REF_READING | > > + RESOLVE_REF_NO_RECURSE, > > NULL, &flag); > > if (symref && (flag & REF_ISSYMREF)) { > > item->util = xstrdup(symref); > > @@ -835,8 +837,8 @@ static int mv(int argc, const char **argv, const char *prefix, > > * First remove symrefs, then rename the rest, finally create > > * the new symrefs. > > */ > > - refs_for_each_ref(get_main_ref_store(the_repository), > > - read_remote_branches, &rename); > > + refs_for_each_rawref(get_main_ref_store(the_repository), > > + read_remote_branches, &rename); > > if (show_progress) { > > /* > > * Count symrefs twice, since "renaming" them is done by > > > > That is, we need two fixes: > > > > 1. When iterating over the refs, we need to cover _all_ refs, not just > > those that fully resolve (there's a related bug here: we'll > > silently ignore an actual broken or corrupt ref, whereas I think > > the right thing would probably be to try copying it and then > > complain loudly if we don't have the object). > > > > 2. When resolving each one, we shouldn't recurse. We're doing a > > shallow copy, not a deep one. > > > > Reading this code, though, I can't help but think that the recent "git > > refs migrate" command had to deal with all of these problems. I wonder > > if we could reuse its code. +cc pks for wisdom. > > I'm not sure whether we can easily reuse the code -- the use case is > quite different, as the migration works across two totally independent > refdbs. So all refs are recreated 1:1, without any renaming involved. > > But it certainly seems to me like this whole logic could use quite some > love: > > - We create N+M*2 separate ref transactions, where N is the number of > direct remote refs we need to migrate and M is the number of > symbolic refs. This is bad with the "reftable" backend, but given > that the N transactions are all renames that have to delete the old > ref it's even quadratic in the worst case for the "files" backend > because we may have to rewrite the packed-refs file for each such > transaction. > > - It is way too brittle, as the update isn't even pretending to be > atomic. We first delete everything, and then we recreate it. So if > any of these updates fails we'll be left in an in-between state. > > - We shouldn't have to even call `refs_resolve_ref_unsafe()` at all, > as the `read_remote_branches()` nowadays gets the referent as > parameter. > > To demonstrate: > > $ git init --ref-format=files repo > Initialized empty Git repository in /tmp/repo/.git/ > $ cd repo/ > /tmp/repo:HEAD $ git commit --allow-empty -m initial > [main (root-commit) 00c2622] x > $ git remote add origin /dev/null > /tmp/repo:main $ for i in $(seq 100000); do printf "create refs/remotes/origin/branch-%d HEAD\n" $i; done | git update-ref --stdin > /tmp/repo:main $ git pack-refs --all > /tmp/repo:main $ time git remote rename origin renamed > Renaming remote references: 0% (2216/100000) > > I stopped after a minute -- this will take hours to complete. > > So I think we should adapt this logic to use a single transaction. > There's one catch, as refs_rename_ref()` also migrates any reflogs that > exist. But with the recent infra that Karthik has added we can now also > migrate reflogs, so that's all doable. > > Patrick I've quickly hacked something together now, see the work-in-progress patch below. The patch does not yet handle reflogs, but that isn't too hard to implement. And these changes indeed speed up things by quite a lot: instead of hours it now takes 7 seconds :) I'll polish this patch series and will likely send it in tomorrow. Patrick diff --git a/builtin/remote.c b/builtin/remote.c index 5dd6cbbaeed..072a70e6b45 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -612,36 +612,53 @@ static int add_branch_for_removal(const char *refname, struct rename_info { const char *old_name; const char *new_name; - struct string_list *remote_branches; uint32_t symrefs_nr; + struct ref_transaction *transaction; + struct strbuf *err; }; -static int read_remote_branches(const char *refname, const char *referent UNUSED, - const struct object_id *oid UNUSED, - int flags UNUSED, void *cb_data) +static int queue_one_rename(const char *refname, const char *referent, + const struct object_id *oid, + int flags, void *cb_data) { + struct strbuf new_refname = STRBUF_INIT, new_referent = STRBUF_INIT; struct rename_info *rename = cb_data; - struct strbuf buf = STRBUF_INIT; - struct string_list_item *item; - int flag; - const char *symref; - - strbuf_addf(&buf, "refs/remotes/%s/", rename->old_name); - if (starts_with(refname, buf.buf)) { - item = string_list_append(rename->remote_branches, refname); - symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), - refname, RESOLVE_REF_READING, - NULL, &flag); - if (symref && (flag & REF_ISSYMREF)) { - item->util = xstrdup(symref); - rename->symrefs_nr++; - } else { - item->util = NULL; - } + int error; + + strbuf_addf(&new_refname, "refs/remotes/%s/", rename->old_name); + if (!starts_with(refname, new_refname.buf)) { + error = 0; + goto out; } - strbuf_release(&buf); - return 0; + if (flags & REF_ISSYMREF) { + strbuf_addstr(&new_referent, referent); + strbuf_splice(&new_referent, strlen("refs/remotes/"), strlen(rename->old_name), + rename->new_name, strlen(rename->new_name)); + oid = NULL; + } + + error = ref_transaction_delete(rename->transaction, refname, + oid, referent, REF_NO_DEREF, NULL, rename->err); + if (error < 0) + goto out; + + strbuf_reset(&new_refname); + strbuf_addstr(&new_refname, refname); + strbuf_splice(&new_refname, strlen("refs/remotes/"), strlen(rename->old_name), + rename->new_name, strlen(rename->new_name)); + + error = ref_transaction_update(rename->transaction, new_refname.buf, oid, + null_oid(the_hash_algo), (flags & REF_ISSYMREF) ? new_referent.buf : NULL, NULL, + REF_SKIP_CREATE_REFLOG | REF_NO_DEREF | REF_SKIP_OID_VERIFICATION, + NULL, rename->err); + if (error < 0) + goto out; + +out: + strbuf_release(&new_refname); + strbuf_release(&new_referent); + return error; } static int migrate_file(struct remote *remote) @@ -740,12 +757,10 @@ static int mv(int argc, const char **argv, const char *prefix, OPT_END() }; struct remote *oldremote, *newremote; - struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, buf3 = STRBUF_INIT, - old_remote_context = STRBUF_INIT; - struct string_list remote_branches = STRING_LIST_INIT_DUP; + struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, + old_remote_context = STRBUF_INIT, err = STRBUF_INIT; struct rename_info rename; - int i, refs_renamed_nr = 0, refspec_updated = 0; - struct progress *progress = NULL; + int i, refspec_updated = 0; int result = 0; argc = parse_options(argc, argv, prefix, options, @@ -756,8 +771,8 @@ static int mv(int argc, const char **argv, const char *prefix, rename.old_name = argv[0]; rename.new_name = argv[1]; - rename.remote_branches = &remote_branches; rename.symrefs_nr = 0; + rename.err = &err; oldremote = remote_get(rename.old_name); if (!remote_is_configured(oldremote, 1)) { @@ -831,80 +846,28 @@ static int mv(int argc, const char **argv, const char *prefix, if (!refspec_updated) goto out; - /* - * First remove symrefs, then rename the rest, finally create - * the new symrefs. - */ - refs_for_each_ref(get_main_ref_store(the_repository), - read_remote_branches, &rename); - if (show_progress) { - /* - * Count symrefs twice, since "renaming" them is done by - * deleting and recreating them in two separate passes. - */ - progress = start_progress(the_repository, - _("Renaming remote references"), - rename.remote_branches->nr + rename.symrefs_nr); - } - for (i = 0; i < remote_branches.nr; i++) { - struct string_list_item *item = remote_branches.items + i; - struct strbuf referent = STRBUF_INIT; - - if (refs_read_symbolic_ref(get_main_ref_store(the_repository), item->string, - &referent)) - continue; - if (refs_delete_ref(get_main_ref_store(the_repository), NULL, item->string, NULL, REF_NO_DEREF)) - die(_("deleting '%s' failed"), item->string); - - strbuf_release(&referent); - display_progress(progress, ++refs_renamed_nr); - } - for (i = 0; i < remote_branches.nr; i++) { - struct string_list_item *item = remote_branches.items + i; + rename.transaction = ref_store_transaction_begin(get_main_ref_store(the_repository), + 0, &err); + if (!rename.transaction) + goto out; - if (item->util) - continue; - strbuf_reset(&buf); - strbuf_addstr(&buf, item->string); - strbuf_splice(&buf, strlen("refs/remotes/"), strlen(rename.old_name), - rename.new_name, strlen(rename.new_name)); - strbuf_reset(&buf2); - strbuf_addf(&buf2, "remote: renamed %s to %s", - item->string, buf.buf); - if (refs_rename_ref(get_main_ref_store(the_repository), item->string, buf.buf, buf2.buf)) - die(_("renaming '%s' failed"), item->string); - display_progress(progress, ++refs_renamed_nr); - } - for (i = 0; i < remote_branches.nr; i++) { - struct string_list_item *item = remote_branches.items + i; + result = refs_for_each_rawref(get_main_ref_store(the_repository), + queue_one_rename, &rename); + if (result < 0) + die(_("renaming references failed: %s"), rename.err->buf); - if (!item->util) - continue; - strbuf_reset(&buf); - strbuf_addstr(&buf, item->string); - strbuf_splice(&buf, strlen("refs/remotes/"), strlen(rename.old_name), - rename.new_name, strlen(rename.new_name)); - strbuf_reset(&buf2); - strbuf_addstr(&buf2, item->util); - strbuf_splice(&buf2, strlen("refs/remotes/"), strlen(rename.old_name), - rename.new_name, strlen(rename.new_name)); - strbuf_reset(&buf3); - strbuf_addf(&buf3, "remote: renamed %s to %s", - item->string, buf.buf); - if (refs_update_symref(get_main_ref_store(the_repository), buf.buf, buf2.buf, buf3.buf)) - die(_("creating '%s' failed"), buf.buf); - display_progress(progress, ++refs_renamed_nr); - } - stop_progress(&progress); + result = ref_transaction_commit(rename.transaction, &err); + if (result < 0) + die(_("committing renamed references failed: %s"), rename.err->buf); handle_push_default(rename.old_name, rename.new_name); out: - string_list_clear(&remote_branches, 1); + ref_transaction_free(rename.transaction); strbuf_release(&old_remote_context); strbuf_release(&buf); strbuf_release(&buf2); - strbuf_release(&buf3); + strbuf_release(&err); return result; } ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-24 13:03 ` Patrick Steinhardt @ 2025-07-24 18:38 ` Junio C Hamano 2025-07-25 8:00 ` Patrick Steinhardt 2025-07-25 11:02 ` Jeff King 1 sibling, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2025-07-24 18:38 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Jeff King, Han Jiang, Git Mailing List Patrick Steinhardt <ps@pks.im> writes: > I've quickly hacked something together now, see the work-in-progress > patch below. The patch does not yet handle reflogs, but that isn't too > hard to implement. > > And these changes indeed speed up things by quite a lot: instead of > hours it now takes 7 seconds :) I'll polish this patch series and will > likely send it in tomorrow. Nice. Not just the "oops, we shouldn't assume symrefs always point at an existing ref" fix, performance fix comes at the same time ;-) > diff --git a/builtin/remote.c b/builtin/remote.c > index 5dd6cbbaeed..072a70e6b45 100644 > --- a/builtin/remote.c > +++ b/builtin/remote.c > @@ -612,36 +612,53 @@ static int add_branch_for_removal(const char *refname, > struct rename_info { > const char *old_name; > const char *new_name; > - struct string_list *remote_branches; > uint32_t symrefs_nr; > + struct ref_transaction *transaction; > + struct strbuf *err; > }; OK, as a place to hook the transaction into, rename_info is a convenient place, as it already is passed around throughout the relevant code paths. > -static int read_remote_branches(const char *refname, const char *referent UNUSED, > - const struct object_id *oid UNUSED, > - int flags UNUSED, void *cb_data) > +static int queue_one_rename(const char *refname, const char *referent, > + const struct object_id *oid, > + int flags, void *cb_data) > { > + struct strbuf new_refname = STRBUF_INIT, new_referent = STRBUF_INIT; > struct rename_info *rename = cb_data; > - struct strbuf buf = STRBUF_INIT; > - struct string_list_item *item; > - int flag; > - const char *symref; > - > - strbuf_addf(&buf, "refs/remotes/%s/", rename->old_name); > - if (starts_with(refname, buf.buf)) { > - item = string_list_append(rename->remote_branches, refname); > - symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), > - refname, RESOLVE_REF_READING, > - NULL, &flag); > - if (symref && (flag & REF_ISSYMREF)) { > - item->util = xstrdup(symref); > - rename->symrefs_nr++; > - } else { > - item->util = NULL; > - } > + int error; > + > + strbuf_addf(&new_refname, "refs/remotes/%s/", rename->old_name); > + if (!starts_with(refname, new_refname.buf)) { > + error = 0; > + goto out; > } > - strbuf_release(&buf); > > - return 0; > + if (flags & REF_ISSYMREF) { > + strbuf_addstr(&new_referent, referent); > + strbuf_splice(&new_referent, strlen("refs/remotes/"), strlen(rename->old_name), > + rename->new_name, strlen(rename->new_name)); > + oid = NULL; > + } > + > + error = ref_transaction_delete(rename->transaction, refname, > + oid, referent, REF_NO_DEREF, NULL, rename->err); Remove old ... > + if (error < 0) > + goto out; > + > + strbuf_reset(&new_refname); > + strbuf_addstr(&new_refname, refname); > + strbuf_splice(&new_refname, strlen("refs/remotes/"), strlen(rename->old_name), > + rename->new_name, strlen(rename->new_name)); > + > + error = ref_transaction_update(rename->transaction, new_refname.buf, oid, > + null_oid(the_hash_algo), (flags & REF_ISSYMREF) ? new_referent.buf : NULL, NULL, > + REF_SKIP_CREATE_REFLOG | REF_NO_DEREF | REF_SKIP_OID_VERIFICATION, > + NULL, rename->err); ... and create new. Would we be hit with the same "while renaming A to A/B, there is a D/F conflict which the ref transaction does not handle by itself" issue we saw recently here? > + rename.transaction = ref_store_transaction_begin(get_main_ref_store(the_repository), > + 0, &err); > + if (!rename.transaction) > + goto out; > > + result = refs_for_each_rawref(get_main_ref_store(the_repository), > + queue_one_rename, &rename); > + if (result < 0) > + die(_("renaming references failed: %s"), rename.err->buf); > > + result = ref_transaction_commit(rename.transaction, &err); > + if (result < 0) > + die(_("committing renamed references failed: %s"), rename.err->buf); > > handle_push_default(rename.old_name, rename.new_name); > > out: > - string_list_clear(&remote_branches, 1); > + ref_transaction_free(rename.transaction); Very nice. > strbuf_release(&old_remote_context); > strbuf_release(&buf); > strbuf_release(&buf2); > - strbuf_release(&buf3); > + strbuf_release(&err); > return result; > } > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-24 18:38 ` Junio C Hamano @ 2025-07-25 8:00 ` Patrick Steinhardt 0 siblings, 0 replies; 10+ messages in thread From: Patrick Steinhardt @ 2025-07-25 8:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Han Jiang, Git Mailing List On Thu, Jul 24, 2025 at 11:38:35AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: [snip] > > + if (flags & REF_ISSYMREF) { > > + strbuf_addstr(&new_referent, referent); > > + strbuf_splice(&new_referent, strlen("refs/remotes/"), strlen(rename->old_name), > > + rename->new_name, strlen(rename->new_name)); > > + oid = NULL; > > + } > > + > > + error = ref_transaction_delete(rename->transaction, refname, > > + oid, referent, REF_NO_DEREF, NULL, rename->err); > > Remove old ... > > > + if (error < 0) > > + goto out; > > + > > + strbuf_reset(&new_refname); > > + strbuf_addstr(&new_refname, refname); > > + strbuf_splice(&new_refname, strlen("refs/remotes/"), strlen(rename->old_name), > > + rename->new_name, strlen(rename->new_name)); > > + > > + error = ref_transaction_update(rename->transaction, new_refname.buf, oid, > > + null_oid(the_hash_algo), (flags & REF_ISSYMREF) ? new_referent.buf : NULL, NULL, > > + REF_SKIP_CREATE_REFLOG | REF_NO_DEREF | REF_SKIP_OID_VERIFICATION, > > + NULL, rename->err); > > ... and create new. Would we be hit with the same "while renaming A > to A/B, there is a D/F conflict which the ref transaction does not > handle by itself" issue we saw recently here? Yes, we do indeed. I've now split this up into two transactions that get populated concurrently. Patrick ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-24 13:03 ` Patrick Steinhardt 2025-07-24 18:38 ` Junio C Hamano @ 2025-07-25 11:02 ` Jeff King 2025-07-25 11:13 ` Jeff King 2025-07-28 6:05 ` Patrick Steinhardt 1 sibling, 2 replies; 10+ messages in thread From: Jeff King @ 2025-07-25 11:02 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Han Jiang, Git Mailing List On Thu, Jul 24, 2025 at 03:03:27PM +0200, Patrick Steinhardt wrote: > I've quickly hacked something together now, see the work-in-progress > patch below. The patch does not yet handle reflogs, but that isn't too > hard to implement. > > And these changes indeed speed up things by quite a lot: instead of > hours it now takes 7 seconds :) I'll polish this patch series and will > likely send it in tomorrow. Cool. I agree with all of the pain points you outlined, and the general direction. There was one other sub-optimal thing I noticed, which was... > - refs_for_each_ref(get_main_ref_store(the_repository), > - read_remote_branches, &rename); > [...] > + result = refs_for_each_rawref(get_main_ref_store(the_repository), > + queue_one_rename, &rename); Both before and after your patch, we're iterating over _all_ refs and skipping ones that aren't in "refs/remotes/<remote>/". If we just ask to iterate over that subset of refs, then we save the effort of iterating over the others that we don't care about. But: 1. We have refs_for_each_ref_in() and refs_for_each_rawref(), but no refs_for_each_rawref_in(). Feels like it should be easy to add it, though. 2. It's an obvious small optimization, but it doesn't help us in a big-O way. Iterating the refs is obviously O(n), and in the worst case rewriting the packed-refs file is likewise O(n). So I wouldn't expect to see the dramatic improvements you found by removing the quadratic bits. But I'd bet it's still measurable in a repo with a lot of refs (and maybe with reftables it actually would be bigger, since the goal there is to amortize the rewrites). -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-25 11:02 ` Jeff King @ 2025-07-25 11:13 ` Jeff King 2025-07-28 6:05 ` Patrick Steinhardt 2025-07-28 6:05 ` Patrick Steinhardt 1 sibling, 1 reply; 10+ messages in thread From: Jeff King @ 2025-07-25 11:13 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Han Jiang, Git Mailing List On Fri, Jul 25, 2025 at 07:02:43AM -0400, Jeff King wrote: > On Thu, Jul 24, 2025 at 03:03:27PM +0200, Patrick Steinhardt wrote: > > > I've quickly hacked something together now, see the work-in-progress > > patch below. The patch does not yet handle reflogs, but that isn't too > > hard to implement. > > > > And these changes indeed speed up things by quite a lot: instead of > > hours it now takes 7 seconds :) I'll polish this patch series and will > > likely send it in tomorrow. > > Cool. I agree with all of the pain points you outlined, and the general > direction. There was one other sub-optimal thing I noticed, which was... Oh, and I meant to say: I am very happy if you want to pick up this bug and fix it. In the original I mentioned also that the new remote.*.followRemoteHEAD=create logic was kicking in for an unborn branch. And I've verified that this is the case and am working on a fix. But I think the two are orthogonal and we can fix them independently. -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-25 11:13 ` Jeff King @ 2025-07-28 6:05 ` Patrick Steinhardt 0 siblings, 0 replies; 10+ messages in thread From: Patrick Steinhardt @ 2025-07-28 6:05 UTC (permalink / raw) To: Jeff King; +Cc: Han Jiang, Git Mailing List On Fri, Jul 25, 2025 at 07:13:08AM -0400, Jeff King wrote: > On Fri, Jul 25, 2025 at 07:02:43AM -0400, Jeff King wrote: > > > On Thu, Jul 24, 2025 at 03:03:27PM +0200, Patrick Steinhardt wrote: > > > > > I've quickly hacked something together now, see the work-in-progress > > > patch below. The patch does not yet handle reflogs, but that isn't too > > > hard to implement. > > > > > > And these changes indeed speed up things by quite a lot: instead of > > > hours it now takes 7 seconds :) I'll polish this patch series and will > > > likely send it in tomorrow. > > > > Cool. I agree with all of the pain points you outlined, and the general > > direction. There was one other sub-optimal thing I noticed, which was... > > Oh, and I meant to say: I am very happy if you want to pick up this bug > and fix it. In the original I mentioned also that the new > remote.*.followRemoteHEAD=create logic was kicking in for an unborn > branch. And I've verified that this is the case and am working on a fix. > But I think the two are orthogonal and we can fix them independently. I actually missed this one, so I'm happy to leave it to you. Thanks! Patrick ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) 2025-07-25 11:02 ` Jeff King 2025-07-25 11:13 ` Jeff King @ 2025-07-28 6:05 ` Patrick Steinhardt 1 sibling, 0 replies; 10+ messages in thread From: Patrick Steinhardt @ 2025-07-28 6:05 UTC (permalink / raw) To: Jeff King; +Cc: Han Jiang, Git Mailing List On Fri, Jul 25, 2025 at 07:02:43AM -0400, Jeff King wrote: > On Thu, Jul 24, 2025 at 03:03:27PM +0200, Patrick Steinhardt wrote: > > > I've quickly hacked something together now, see the work-in-progress > > patch below. The patch does not yet handle reflogs, but that isn't too > > hard to implement. > > > > And these changes indeed speed up things by quite a lot: instead of > > hours it now takes 7 seconds :) I'll polish this patch series and will > > likely send it in tomorrow. > > Cool. I agree with all of the pain points you outlined, and the general > direction. There was one other sub-optimal thing I noticed, which was... > > > - refs_for_each_ref(get_main_ref_store(the_repository), > > - read_remote_branches, &rename); > > [...] > > + result = refs_for_each_rawref(get_main_ref_store(the_repository), > > + queue_one_rename, &rename); > > Both before and after your patch, we're iterating over _all_ refs and > skipping ones that aren't in "refs/remotes/<remote>/". If we just ask to > iterate over that subset of refs, then we save the effort of iterating > over the others that we don't care about. > > But: > > 1. We have refs_for_each_ref_in() and refs_for_each_rawref(), but no > refs_for_each_rawref_in(). Feels like it should be easy to add it, > though. > > 2. It's an obvious small optimization, but it doesn't help us in a > big-O way. Iterating the refs is obviously O(n), and in the worst > case rewriting the packed-refs file is likewise O(n). So I wouldn't > expect to see the dramatic improvements you found by removing the > quadratic bits. But I'd bet it's still measurable in a repo with a > lot of refs (and maybe with reftables it actually would be bigger, > since the goal there is to amortize the rewrites). Yeah, I was wondering whether to re-do this part while at it. I initially decided to not do so, but I guess that was just me being lazy. Patrick ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-07-28 6:05 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-24 9:59 `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`) Han Jiang 2025-07-24 10:45 ` Jeff King 2025-07-24 11:58 ` Patrick Steinhardt 2025-07-24 13:03 ` Patrick Steinhardt 2025-07-24 18:38 ` Junio C Hamano 2025-07-25 8:00 ` Patrick Steinhardt 2025-07-25 11:02 ` Jeff King 2025-07-25 11:13 ` Jeff King 2025-07-28 6:05 ` Patrick Steinhardt 2025-07-28 6:05 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).