From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Patrick Steinhardt <ps@pks.im>,
Per Cederqvist <ceder@lysator.liu.se>,
git@vger.kernel.org
Subject: Re: [PATCH] remote: detect collisions in remote names
Date: Tue, 08 Jul 2025 16:28:43 -0700
Message-ID: <xmqq5xg2s1n8.fsf@gitster.g>
In-Reply-To: <20250708225946.GC1180568@coredump.intra.peff.net> (Jeff King's message of "Tue, 8 Jul 2025 18:59:46 -0400")

Jeff King <peff@peff.net> writes:

> On Mon, Jul 07, 2025 at 02:04:19PM -0700, Junio C Hamano wrote:
>
>> > IMHO we should probably avoid the underscore pattern. It's OK here, but
>> > it runs close to violating the reserved names rules (a global variable
>> > _name is bad, and _Name anywhere is bad).
>>
>> "name_" is available. In fact I think it is a very common pattern
>> in this codebase to name an incoming parameter with trailing "_",
>> and assign it to a local variable with the right name and with the
>> right type at the top of the function.
>
> Yeah, that is legal and is a pattern we use (though I admit that I find
> any underscores kind of ugly and easy to miss). I was curious how often
> each pattern appeared:
>
> ["v" prefix: vdata, va, etc]
> $ git grep 'void \*v' '*.c' | wc -l
> 51
>
> [leading underscore: _data, _a, etc]
> $ git grep 'void \*_' '*.c' | wc -l
> 52
>
> [trailing underscore: mostly a_, b_ in comparators]
> $ git grep 'void \*[a-zA-Z0-9]_' '*.c' | wc -l
> 30

Only a single letter followed by an underscore, which may be
followed by more letters legal in names (like a_bcde)?

A more fair pattern may be something like

    $ git grep 'void \*[A-Za-z_0-9]*_[^A-Za-z_0-9]' \*.c | wc -l
    52

> [just calling it "data"]
> $ git grep 'void \*data' '*.c' | wc -l
> 314
>
> The last one is cheating a little because it catches function pointer
> declarations, too, but grepping for "= data;" returns over a hundred
> hits, too.

Also "void *cb_data" is fairly common, I think, as we have some
callback API functions.

    $ git grep 'void \*cb_data' \*.c | wc -l
    234

> So that was mostly for fun, and I think any is OK. ;) But here is the
> patch again with the void pointer just called "data".

Yeah, I think any would be fine.  I was a bit surprised that
v-something was so widely used, though, as I find that it makes the
least sense among all possibilities (and "data" is the distant
second, as it would become awkward when you have to have more than
one, like my_custom_cmp(void *left, void *right), if your rule says
that you must say "data").

> Although I think we're all a bit lukewarm on the concept, I feel like it
> won't hurt anything, isn't too much code, and disables a potential (if
> somewhat rare) footgun. So probably worth doing?

Even though it does not cover all cases, at least those coming from
"git remote" will be able to avoid surprises, so let me replace with
this version, wait for a few days for more inputs from others and
then mark it for 'next' if nobody sees any downsides.

> -- >8 --
> Subject: [PATCH] remote: detect collisions in remote names
>
> When two remotes collide in the destinations of their fetch refspecs,
> the results can be confusing. For example, in this silly example:
>
> git config remote.one.url [...]
> git config remote.one.fetch +refs/heads/*:refs/remotes/collide/*
> git config remote.two.url [...]
> git config remote.two.fetch +refs/heads/*:refs/remotes/collide/*
> git fetch --all
>
> we may try to write to the same ref twice (once for each remote we're
> fetching). There's also a more subtle version of this. If you have
> remotes "outer/inner" and "outer", then the ref "inner/branch" on the
> second remote will conflict with just "branch" on the former (they both
> want to write to "refs/remotes/outer/inner/branch").
>
> We probably don't want to forbid this kind of overlap completely. While
> the results can be confusing, there are legitimate reasons to have
> multiple refs write into the same namespace (e.g., if one is a "backup"
> of the other that is rarely fetched from).
>
> But it may be worth limiting the porcelain "git remote" command to avoid
> this confusion. The example above cannot be done with "git remote",
> because it always[1] matches the refspecs to the remote name, and you
> can only have one instance of each remote name. But you can still
> trigger the more subtle variant like this:
>
> git remote add outer [...]
> git remote add outer/inner [...]
>
> So let's detect that kind of name collision (in both directions) and
> forbid it. You can still do whatever you like by manipulating the config
> directly, but this should prevent the most obvious foot-gun.
>
> [1] Almost always. With the --mirror option, the resulting refspec will
> just write into "refs/*"; the remote name does not appear in the ref
> namespace at all.
>
> Our new "names must not overlap" rule is not necessary for that
> case, but it seems reasonable to enforce it consistently. We already
> require all remote names to be valid in the ref namespace, even
> though we won't ever use them in that context for --mirror remotes.
>
> Likewise, our new rule doesn't help with overlap here. Any two
> mirror remotes will always overlap (in fact, any mirror remote along
> with any other single one, since refs/remotes/ is a subset of the
> mirrored refs). I'm not sure this is worth worrying about, but if it
> is, we'd want an additional rule like "mirror remotes must be the
> only remote".
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> builtin/remote.c | 17 +++++++++++++++++
> t/t5505-remote.sh | 14 ++++++++++++++
> 2 files changed, 31 insertions(+)
>
> diff --git a/builtin/remote.c b/builtin/remote.c
> index 0d6755bcb7..a770df669c 100644
> --- a/builtin/remote.c
> +++ b/builtin/remote.c
> @@ -157,6 +157,21 @@ static int parse_mirror_opt(const struct option *opt, const char *arg, int not)
>  	return 0;
>  }
>
> +static int check_remote_collision(struct remote *remote, void *data)
> +{
> +	const char *name = data;
> +	const char *p;
> +
> +	if (skip_prefix(name, remote->name, &p) && *p == '/')
> +		die(_("remote name '%s' is a subset of existing remote '%s'"),
> +		    name, remote->name);
> +	if (skip_prefix(remote->name, name, &p) && *p == '/')
> +		die(_("remote name '%s' is a superset of existing remote '%s'"),
> +		    name, remote->name);
> +
> +	return 0;
> +}
> +
>  static int add(int argc, const char **argv, const char *prefix,
>  	       struct repository *repo UNUSED)
>  {
> @@ -208,6 +223,8 @@ static int add(int argc, const char **argv, const char *prefix,
>  	if (!valid_remote_name(name))
>  		die(_("'%s' is not a valid remote name"), name);
>
> +	for_each_remote(check_remote_collision, (void *)name);
> +
>  	strbuf_addf(&buf, "remote.%s.url", name);
>  	git_config_set(buf.buf, url);
>
> diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
> index bef0250e89..2701eef85e 100755
> --- a/t/t5505-remote.sh
> +++ b/t/t5505-remote.sh
> @@ -1644,4 +1644,18 @@ test_expect_success 'empty config clears remote.*.pushurl list' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'forbid adding subset of existing remote' '
> +	test_when_finished "git remote rm outer" &&
> +	git remote add outer url &&
> +	test_must_fail git remote add outer/inner url 2>err &&
> +	test_grep ".outer/inner. is a subset of existing remote .outer." err
> +'
> +
> +test_expect_success 'forbid adding superset of existing remote' '
> +	test_when_finished "git remote rm outer/inner" &&
> +	git remote add outer/inner url &&
> +	test_must_fail git remote add outer url 2>err &&
> +	test_grep ".outer. is a superset of existing remote .outer/inner." err
> +'
> +
>  test_done