From: Junio C Hamano <gitster@pobox.com>
To: "Scott Chacon via GitGitGadget" <gitgitgadget@gmail.com>,
Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, Scott Chacon <schacon@gmail.com>
Subject: Re: [PATCH] bundle-uri: copy all bundle references ino the refs/bundle space
Date: Tue, 25 Feb 2025 10:14:05 -0800 [thread overview]
Message-ID: <xmqqv7sxki36.fsf@gitster.g> (raw)
In-Reply-To: <pull.1897.git.git.1740489585344.gitgitgadget@gmail.com> (Scott Chacon via GitGitGadget's message of "Tue, 25 Feb 2025 13:19:45 +0000")
"Scott Chacon via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Furthermore, when I bundled just a tag (thinking it would have most
> reachable objects) it completely failed to work because there were no
> refs/heads/ available for negotiation - so it downloaded a huge file and
> then still started from scratch on the fetch.
Nice finding.
> Now I'm only getting an extra 43k objects, so 1% of the original total,
> and the entire operation is a bit faster as well.
That makes all sense.
> I'm not sure if there is a downside here, it seems clearly how you would
> want the negotiation to go. It ends up with way more refs under
> refs/bundle (now there is refs/bundle/origin/master, etc) but that's
> being polluted by the head refs anyhow, right?
I am not sure what you mean by "being polluted by the head refs
anyhow", but we should be equipped to deal with a repository with
tons of local branches, so having the comparable number of
remote-tracking branches instead of local branches now exposed in
refs/bundle/remotes/* hierarchy should work equally as well, or we
would have something we need to fix. So in principle I do not see
a problem with this approach.
The mapping used to be "refs/heads/foo" to "refs/bundles/foo", but
now "refs/heads/foo" is mapped to "refs/bundles/heads/foo", so that
you would presumably be mapping "refs/remotes/origin/master" to
"refs/bundles/remotes/origin/master", right? I hope existing users
are *not* looking at their resulting refs/bundles/ hierarchy and
somehow assuming the original mapping.
This is not something this "fix" changes, but unbundle_all_bundles()
apparently is prepared to handle more than one bundles. I wonder
what happens when multiple bundles expose the same branch pointing
at different objects? The way I read unbundle_from_file() is that
a new one overwrites the previous value, so even though we may have
all unbundled many bundles, the objects from earlier bundles may
lose their anchors, subject to be garbage-collected.
Imagine creating a bundle with two refs, refs/heads and
refs/remotes, and append that bundle as the last bundle of a bunch
of bundles full of local and remote-tracking branches, that have
populated refs/bundles/ hierarchy with tons of refs. Now the last
bundle is unbundled, and these two phoney refs would nuke everything
that used to be under refs/bundles/heads/* and refs/bundles/remotes/*
left by unpacking previous bundles, right?
> Is this a reasonable change?
This is mostly Stolee's design, IIRC, so I have CC'ed; the work is
mostly from 53a50892 (bundle-uri: create basic file-copy logic,
2022-08-09) that is more than 2 years ago.
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1897%2Fschacon%2Fsc-more-bundle-refs-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1897/schacon/sc-more-bundle-refs-v1
> Pull-Request: https://github.com/git/git/pull/1897
>
> bundle-uri.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/bundle-uri.c b/bundle-uri.c
> index 744257c49c1..3371d56f4ce 100644
> --- a/bundle-uri.c
> +++ b/bundle-uri.c
> @@ -403,7 +403,7 @@ static int unbundle_from_file(struct repository *r, const char *file)
> const char *branch_name;
> int has_old;
>
> - if (!skip_prefix(refname->string, "refs/heads/", &branch_name))
> + if (!skip_prefix(refname->string, "refs/", &branch_name))
> continue;
By skipping only "refs/", "branch_name" is no longer a branch name,
which may be something we may want to fix, but if we were to address
the "names from later bundles seem to overwrite names used by
previous boundles, and unanchor objects obtained from them" problem,
I suspect we'd rather want to create new and unique names there,
without assuming or relying on that the names used by these bundles
are reasonably unique, so this part of the code may have to change
anyway, so we may not care too deeply at this point.
next prev parent reply other threads:[~2025-02-25 18:14 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-25 13:19 [PATCH] bundle-uri: copy all bundle references ino the refs/bundle space Scott Chacon via GitGitGadget
2025-02-25 18:14 ` Junio C Hamano [this message]
2025-02-25 23:36 ` Derrick Stolee
2025-03-01 10:23 ` Scott Chacon
2025-03-03 17:12 ` Junio C Hamano
2025-03-03 18:46 ` Derrick Stolee
2025-03-01 10:33 ` [PATCH v2 0/3] " Scott Chacon via GitGitGadget
2025-03-01 10:33 ` [PATCH v2 1/3] " Scott Chacon via GitGitGadget
2025-03-01 10:33 ` [PATCH v2 2/3] bundle-uri: update bundle clone tests with new refspec path Scott Chacon via GitGitGadget
2025-03-01 10:33 ` [PATCH v2 3/3] bundle-uri: add test for bundle-uri clones with tags Scott Chacon via GitGitGadget
2025-03-03 18:49 ` [PATCH v2 0/3] bundle-uri: copy all bundle references ino the refs/bundle space Derrick Stolee
2025-03-18 15:36 ` [PATCH v3 0/2] " Scott Chacon via GitGitGadget
2025-03-18 15:36 ` [PATCH v3 1/2] " Scott Chacon via GitGitGadget
2025-03-19 10:24 ` Phillip Wood
2025-03-18 15:36 ` [PATCH v3 2/2] bundle-uri: add test for bundle-uri clones with tags Scott Chacon via GitGitGadget
2025-03-19 10:33 ` Phillip Wood
2025-03-19 17:50 ` Taylor Blau
2025-04-14 12:19 ` Toon Claes
2025-04-25 13:14 ` Scott Chacon
2025-03-21 6:31 ` Junio C Hamano
2025-04-25 13:17 ` [PATCH v4 0/2] bundle-uri: copy all bundle references ino the refs/bundle space Scott Chacon via GitGitGadget
2025-04-25 13:17 ` [PATCH v4 1/2] " Scott Chacon via GitGitGadget
2025-04-25 13:17 ` [PATCH v4 2/2] bundle-uri: add test for bundle-uri clones with tags Scott Chacon via GitGitGadget
2025-04-25 16:32 ` Scott Chacon
2025-04-25 13:53 ` [PATCH v4 0/2] bundle-uri: copy all bundle references ino the refs/bundle space Phillip Wood
2025-04-25 16:53 ` Junio C Hamano
2025-04-25 19:06 ` [PATCH v5 " Scott Chacon via GitGitGadget
2025-04-25 19:06 ` [PATCH v5 1/2] " Scott Chacon via GitGitGadget
2025-04-25 19:06 ` [PATCH v5 2/2] bundle-uri: add test for bundle-uri clones with tags Scott Chacon via GitGitGadget
2025-04-25 19:27 ` [PATCH v6 0/2] bundle-uri: copy all bundle references ino the refs/bundle space Scott Chacon via GitGitGadget
2025-04-25 19:27 ` [PATCH v6 1/2] " Scott Chacon via GitGitGadget
2025-04-25 19:27 ` [PATCH v6 2/2] bundle-uri: add test for bundle-uri clones with tags Scott Chacon via GitGitGadget
2025-04-25 19:33 ` [PATCH v7 0/2] bundle-uri: copy all bundle references ino the refs/bundle space Scott Chacon via GitGitGadget
2025-04-25 19:33 ` [PATCH v7 1/2] " Scott Chacon via GitGitGadget
2025-04-25 19:33 ` [PATCH v7 2/2] bundle-uri: add test for bundle-uri clones with tags Scott Chacon via GitGitGadget
2025-04-25 20:42 ` [PATCH v7 0/2] bundle-uri: copy all bundle references ino the refs/bundle space Junio C Hamano
2025-04-29 9:00 ` Phillip Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqv7sxki36.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=schacon@gmail.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).