From: Jeff King <peff@peff.net>
To: Simon Richter <Simon.Richter@hogyros.de>
Cc: Junio C Hamano <gitster@pobox.com>,
Benson Muite <benson_muite@emailplus.org>,
git@vger.kernel.org
Subject: Re: Mirror repositories for submodules
Date: Mon, 8 Jun 2026 19:41:16 -0400
Message-ID: <20260608234116.GA358144@coredump.intra.peff.net>
In-Reply-To: <fa075b7a-96f6-4fd9-ae94-30ddf323f759@hogyros.de>
On Thu, Jun 04, 2026 at 06:27:31PM +0900, Simon Richter wrote:
> Hi,
>
> On 6/4/26 3:16 PM, Jeff King wrote:
>
> > Here's a thought experiment. What if you put the UUID into a URL, like:
> > repoid://123456789.git
>
> Yes, that's the idea, except I would want to use a relative URL, like
>
> ../123456789.git
>
> This could solve the "naive cloning" problem, because it creates an
> expectation that the submodules can be found on the same server, or in a
> nearby path.
I see. I forgot that we allowed relative submodule URLs.
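
To make the relative case concrete, here is a rough sketch of how a relative submodule URL could be resolved against the superproject's remote URL. It is a simplification and not Git's actual resolution code: the function name and the example.org URLs are made up for illustration, and it only handles leading "../" and "./" components without guarding against climbing past the host.

```python
def resolve_relative(remote_url: str, relative: str) -> str:
    """Resolve a relative submodule URL against the superproject's remote.

    Simplified illustration only; real Git handles more cases
    (scp-style URLs, local paths, too many "../" components).
    """
    base = remote_url.rstrip("/")
    rest = relative
    while True:
        if rest.startswith("../"):
            # Each "../" drops the last path component of the base URL.
            rest = rest[3:]
            base = base.rsplit("/", 1)[0]
        elif rest.startswith("./"):
            rest = rest[2:]
        else:
            break
    return base + "/" + rest


# Hypothetical superproject location and submodule ID from the thread.
print(resolve_relative("https://git.example.org/mirror/super.git",
                       "../123456789.git"))
```

Under these assumptions, a superproject cloned from a mirror would look for the submodule next to itself on that same mirror, which is the "nearby path" expectation described above.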
> > Now, all of that said, do we still need uuids at all? If the canonical
> > submodule name is https://github.com/git/git.git, then anybody can just
> > rewrite that locally in the same way using url.*.insteadOf config.
>
> Yes, but we'd then need a mechanism for a server to indicate "for cloning,
> you should use these 'insteadOf' settings", which is a massive can of worms
> from a security standpoint.
>
> I also don't think these canonical URLs can ever be stable if they refer to
> infrastructure that is not under the control of the maintainer -- it would
> tie the project identity to the hosting provider, and increase the inertia
> to overcome for moves (such as the current exodus from github and gitlab
> towards codeberg).
From your description I was assuming the cloner always had to specify
insteadOf (which they find out about "somehow").

If they don't, then your choice of canonical URL is effectively trading
off some cases for others. In the scenario you care about, you assume
that the submodules are hosted relative to the superproject, so clients
can usually get what they need without further config. The server
operator and the superproject repo coordinate on the names.

But in many decentralized cases, there's no URL or administrative
relationship between the superproject and the submodules. They might
happen to be on the same server, but even that falls down if the
superproject is mirrored elsewhere. So using some canonical name which
works in practice _now_ is usually the best we can do.
> The common goal is that a naive clone should get submodules from a local
> server, ideally without us having to write some tool to make an initial
> checkout, enumerate submodules, create insteadOf settings, clone first layer
> of submodules, enumerate second layer, ...
You shouldn't need to do the recursive enumeration if you set up the
insteadOf ahead of time. You don't know which insteadOf settings you'll
want, but you can feed the whole possible mapping. How you get that
mapping is unspecified, but if you are mirroring the submodules already
on your local infrastructure, then whatever process does that can also
output the mapping.
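
As a minimal sketch of what "output the mapping" might look like, the snippet below turns a canonical-to-mirror table into url.<base>.insteadOf config text, and also shows the prefix rewrite that such config asks Git to perform (the longest matching insteadOf prefix wins). The function names and the mirror.example.org URL are hypothetical, and the config writer assumes URLs without quotes, backslashes, or comment characters that would need escaping.

```python
def insteadof_config(mapping: dict[str, str]) -> str:
    """Render {canonical_url: mirror_url} as git-config text.

    In url.<base>.insteadOf, <base> is the URL to use (the mirror) and
    the value is the prefix to replace (the canonical URL).
    Assumes URLs need no escaping in config syntax.
    """
    lines = []
    for canonical, mirror in sorted(mapping.items()):
        lines.append(f'[url "{mirror}"]')
        lines.append(f"\tinsteadOf = {canonical}")
    return "\n".join(lines) + "\n"


def rewrite(url: str, mapping: dict[str, str]) -> str:
    """Illustrate the rewrite: the longest matching canonical prefix wins."""
    best = max((c for c in mapping if url.startswith(c)), key=len, default=None)
    if best is None:
        return url  # no mirror known; keep the canonical URL
    return mapping[best] + url[len(best):]


# Hypothetical mapping emitted by whatever process maintains the mirrors.
mirrors = {
    "https://github.com/git/git.git": "https://mirror.example.org/git/git.git",
}
print(insteadof_config(mirrors), end="")
print(rewrite("https://github.com/git/git.git", mirrors))
```

The generated text could be written to a file and pulled into the cloner's config with include.path, so that every submodule layer is covered before the first clone starts. How the mapping gets distributed is still the open question.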
Just to be clear, I'm not trying to dismiss what you're going for. I'm
looking at this from the lens of Git developers: how do existing Git
features fit into this space, and which features are missing that might
assist in a generalized way.
-Peff