From: Adrian Ratiu <adrian.ratiu@collabora.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Aaron Schrab <aaron@schrab.com>,
git@vger.kernel.org, Emily Shaffer <emilyshaffer@google.com>,
Rodrigo Damazio Bovendorp <rdamazio@google.com>,
Jeff King <peff@peff.net>, Jonathan Nieder <jrnieder@gmail.com>,
Patrick Steinhardt <ps@pks.im>,
Josh Steadmon <steadmon@google.com>,
Ben Knoble <ben.knoble@gmail.com>,
Phillip Wood <phillip.wood123@gmail.com>
Subject: Re: [PATCH v4 4/4] submodule: fix case-folding gitdir filesystem colisions
Date: Tue, 11 Nov 2025 01:01:27 +0200
Message-ID: <878qgdjxvc.fsf@collabora.com>
In-Reply-To: <xmqqwm3xzots.fsf@gitster.g>
Hi Junio,
On Mon, 10 Nov 2025, Junio C Hamano <gitster@pobox.com> wrote:
> Adrian Ratiu <adrian.ratiu@collabora.com> writes:
>
>> On Sat, 08 Nov 2025, Aaron Schrab <aaron@schrab.com> wrote:
>>> At 17:05 +0200 07 Nov 2025, Adrian Ratiu
>>> <adrian.ratiu@collabora.com> wrote:
>>>>Add a new check in validate_submodule_git_dir() to detect and
>>>>prevent case-folding filesystem collisions. When this new check
>>>>is triggered, a stricter case-folding-aware URI encoding is
>>>>used to percent-encode uppercase characters, e.g. Foo becomes
>>>>%46oo. By using this check/retry mechanism the uppercase
>>>>encoding is only applied when necessary, so case-sensitive
>>>>filesystems are not affected.
>
> The .gitdir name munging is a local thing, so it makes sense to
> do the casefold mitigation only when the filesystem is a
> case-folding one,
That is correct. Case-sensitive filesystems will stop before
reaching the "Case 2.2" attempt, because their gitdirs will pass
validation in the earlier steps.
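To make the stricter encoding concrete, here is a minimal sketch of
percent-encoding ASCII uppercase letters. The function name
encode_uppercase() is hypothetical and the code is not taken from the
patch; it only assumes the behavior described in the commit message,
where Foo becomes %46oo.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: percent-encode every
 * ASCII uppercase letter so that two names differing only in case map
 * to different byte sequences even after case folding.
 * Returns a malloc'ed string the caller must free, or NULL on failure.
 */
char *encode_uppercase(const char *name)
{
	/* Worst case: every byte expands to "%XX" (3 bytes), plus NUL. */
	char *out = malloc(strlen(name) * 3 + 1);
	char *p = out;

	if (!out)
		return NULL;
	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if (c >= 'A' && c <= 'Z')
			p += sprintf(p, "%%%02X", (unsigned int)c);
		else
			*p++ = (char)c;
	}
	*p = '\0';
	return out;
}
```

This sketch leaves '%' untouched, so a submodule literally named
"%46oo" would produce the same string as an encoded "Foo". A real
implementation has to detect that case separately.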
>
> Your code seems to compare directory names textually,
> downcasing the proposed name for some reason, but I am not sure
> why we need any of this complexity. Wouldn't it be a matter
> of actually trying to mkdir(2) the name presented (either "foo"
> or "Foo") and see if that fails? If it fails (most likely with
> EEXIST if case folding is getting in the way, but for any
> reason), the name is unusable and we need to "tweak" the name to
> a usable one at that point by retrying. Once we find a usable
> name, we can remember the fact that we already created a
> directory for it and reuse that empty directory in the code
> where we used to do mkdir(2), no?
I tried the mkdir approach. Unfortunately it does not work because
submodule_name_to_gitdir() is called on all submodule paths:
existing and new, valid and invalid alike.
So if a dir already exists, we do not know if there is a
conflict. It might just be a normal valid path verification of an
existing module.
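The ambiguity can be shown with a small probe around mkdir(2). This is
only an illustration: probe_mkdir() is a hypothetical name, not a
function in git.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical probe, for illustration only: try to create `path` and
 * report what happened.
 *   0  -> the directory was created
 *   1  -> mkdir(2) failed with EEXIST
 *  -1  -> mkdir(2) failed for another reason
 */
int probe_mkdir(const char *path)
{
	if (!mkdir(path, 0777))
		return 0;
	return errno == EEXIST ? 1 : -1;
}
```

Calling probe_mkdir() twice on the same "foo" returns 1 the second
time. On a case-folding filesystem, calling it on "Foo" while "foo"
exists would be expected to return 1 as well, so the return value alone
cannot tell a re-validated existing module from a colliding new one.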
This is why I had to come up with the double test: use to_lower()
to compute a common path, canonicalize it and, when it differs
from the original, check whether it exists via is_git_directory().
In that case there is always a conflict, and this is the only
reliable way I could think of.
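A simplified model of that check, assuming the set of existing gitdir
names is passed in as an array instead of being read from disk. Both
function names are hypothetical, and only ASCII case folding is
modelled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compare two names byte-wise after ASCII lowercasing; 1 if equal. */
static int same_when_folded(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return *a == *b;
}

/*
 * Hypothetical stand-in for the double test: `existing` models the
 * gitdirs already on disk (real code would ask the filesystem instead,
 * e.g. via is_git_directory()). A conflict is an existing name that
 * folds to the same string as `name` but is spelled differently.
 */
int has_casefold_conflict(const char *name,
			  const char *const *existing, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (strcmp(name, existing[i]) &&
		    same_when_folded(name, existing[i]))
			return 1;
	return 0;
}
```

An exact match is deliberately not reported as a conflict, because that
is the ordinary case of validating a module that already exists.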
I am very much in favor of finding a simpler check, but we need
something at least as reliable as the current double test, which
passes all cases I could think of and all CI tests.
>
>> Maybe we could derive a new path automatically (eg foo2 or
>> foo_, suggestions welcome) and use it if valid. This way,
>> there is no user intervention.
>>
>> Do you have any preference?
>
> If adding 'foo' and then an attempt to add 'Foo' will
> automatically assign a name that does not conflict with 'foo' to
> the newly added submodule, then the users would expect the same
> to happen if the order of adding them is swapped, wouldn't they?
Yes and it's a valid expectation. :)
I just missed this corner-case which Aaron brought up (many thanks
again Aaron!).
It's a simple fix which I'll do in v5: if validation fails, we
just need to come up with another name and retry.
The probability of conflicts decreases exponentially with each
retry. By the fourth or fifth retry the probability is so small
that, if we are that desperate, we might just hash the name or
use a random string.
>
> IOW, I do not see why the code wants to treat uppercase and
> lowercase letters any differently, and suspect that it might be
> the source of additional complication. Also, if there is an
> existing module with a funny path "%46oo", you cannot just
> encode "Foo" into "%46oo" to avoid clashes with 'foo' and be
> done anyway, so it feels like we are inviting more bugs by
> special casing certain paths (and not encoding or checking
> others). Don't we have an issue similar to "case folding" in
> macOS wrt UTF-8 canonicalization, too? An identical Unicode
> string may be canonicalized in two ways, so in the presence of a
> submodule named one way, another submodule named in the other
> canonicalization, even though their names are different byte
> sequences, cannot co-exist in the same directory next to each
> other. "Try to mkdir(2) the new name, and see if it succeeds,
> and if so use the resulting empty directory" approach would
> cover that case with the same mechanism as you need to use for
> case folding filesystems, I would imagine.
"Foo" and "%46oo" is another possible conflict, yes, but only when
both "foo" and "%46oo" already exist. The deeper you go in the
retry cycles, the more unlikely conflicts become.
We just need to detect this conflict, come up with another name
and retry. Retry until one sticks.
Again, the probability of conflicts decreases exponentially with
each retry. :) If we're desperate: hash the name or try a random
string before giving up and throwing an error, which is "Case 3"
in the current patch.
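The retry idea can be sketched as a loop over candidate names. This is
a rough model under stated assumptions, not the patch's code: the names
pick_gitdir_name(), build_candidate() and name_taken() are hypothetical,
the filesystem is modelled as an array compared with strcasecmp(), and
FNV-1a merely stands in for "hash the name".

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Model of a case-folding filesystem: is `name` already in use? */
static int name_taken(const char *name, const char *const *used, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcasecmp(name, used[i]))
			return 1;
	return 0;
}

/* 32-bit FNV-1a, used here only as a stand-in hash. */
static unsigned long fnv1a(const char *s)
{
	unsigned long h = 2166136261UL;

	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 16777619UL;
	}
	return h & 0xffffffffUL;
}

/* Attempt 0: name as-is, 1: uppercase percent-encoded, 2: hashed. */
static int build_candidate(int attempt, const char *name,
			   char *out, size_t outsz)
{
	size_t n = 0;

	if (!outsz)
		return -1;
	if (attempt == 2) {
		int len = snprintf(out, outsz, "%08lx", fnv1a(name));
		return (len >= 0 && (size_t)len < outsz) ? 0 : -1;
	}
	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if (n + 4 > outsz)	/* room for "%XX" plus NUL */
			return -1;
		if (attempt == 1 && c >= 'A' && c <= 'Z')
			n += (size_t)sprintf(out + n, "%%%02X", (unsigned int)c);
		else
			out[n++] = (char)c;
	}
	out[n] = '\0';
	return 0;
}

/*
 * Write the first non-conflicting candidate into `out` and return the
 * attempt index that produced it, or -1 if every candidate is taken
 * or does not fit.
 */
int pick_gitdir_name(const char *name, const char *const *used, size_t nr,
		     char *out, size_t outsz)
{
	int attempt;

	for (attempt = 0; attempt < 3; attempt++)
		if (!build_candidate(attempt, name, out, outsz) &&
		    !name_taken(out, used, nr))
			return attempt;
	return -1;
}
```

With only "foo" in use, "Foo" resolves on attempt 1 to "%46oo". With
both "foo" and "%46oo" in use, it falls through to the hashed name on
attempt 2, and a return of -1 corresponds to giving up with an error.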
(I wrote above why mkdir cannot solve the detection problem, hope
it makes sense).
I think it's also worth mentioning a key idea of this design: we
don't need to detect & fix all corner cases. We can iterate and
improve both the conflict detection and path generation over time,
since they are opaque to users, who rely only on the resulting
submodule.name.gitdir config we publish.
I know it seems like a game of "whack-a-mole", but even if we fix
just half of the possible conflicts, the likeliest ones, we still
are in a much better situation than before. At some point we're
approaching diminishing returns, so extremely rare conflicts might
not even be worth fixing.
I will add a test for this "Foo" + "foo" + "%46oo" in v5 as
well. :)