From: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Jeff King <peff@peff.net>,
Victoria Dye <vdye@github.com>
Subject: [PATCH v2 0/4] Strengthen fsck checks for submodule URLs
Date: Thu, 18 Jan 2024 01:55:14 +0000 [thread overview]
Message-ID: <pull.1635.v2.git.1705542918.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1635.git.1704822817.gitgitgadget@gmail.com>
While testing 'git fsck' checks on .gitmodules URLs, I noticed that some
invalid URLs were passing the checks. Digging into it a bit more, the issue
turned out to be that 'credential_from_url_gently()' parses certain URLs
(like "http://example.com:something/deeper/path") incorrectly, in a way that
appeared to return a valid result.
Fortunately, these URLs are rejected in fetches/clones/pushes anyway because
'url_normalize()' (called in 'validate_remote_url()') correctly identifies
them as invalid. So, to bring 'git fsck' in line with other (stronger)
validation done on remote URLs, this series replaces the
'credential_from_url_gently()' check with one that uses 'url_normalize()'.
* Patch 1 moves 'check_submodule_url()' to a public location so that it can
be used outside of 'fsck.c'.
* Patch 2 removes the obsolete/never-used code in 'test-tool submodule
check-name' handling names provided on the command line.
* Patch 3 adds a 'check-url' mode to 'test-tool submodule', calling the
now-public 'check_submodule_url()' method on a given URL, and adds new
tests checking valid and invalid submodule URLs.
* Patch 4 replaces the 'credential_from_url_gently()' check with
'url_normalize()' followed by 'url_decode()' and an explicit check for
newlines (to preserve the newline handling added in 07259e74ec1 (fsck:
detect gitmodules URLs with embedded newlines, 2020-03-11)).
Changes since V1
================
* Added 'TEST_TOOL_CHECK_URL_USAGE' to 'submodule_usage'.
* Removed unused/unreachable code related to command line inputs in
'test-tool submodule check-name' and 'test-tool submodule check-url'.
* Split the new t7450 test case into two tests (the first contains URLs
that are validated successfully, the second demonstrates a URL
incorrectly marked valid) to clearly show which pattern is handled
improperly. The tests are merged in the final patch once the validation
is corrected.
Thanks!
* Victoria
Victoria Dye (4):
submodule-config.h: move check_submodule_url
test-submodule: remove command line handling for check-name
t7450: test submodule urls
submodule-config.c: strengthen URL fsck check
fsck.c | 133 ----------------------------------
submodule-config.c | 140 ++++++++++++++++++++++++++++++++++++
submodule-config.h | 3 +
t/helper/test-submodule.c | 52 +++++++++-----
t/t7450-bad-git-dotfiles.sh | 26 +++++++
5 files changed, 203 insertions(+), 151 deletions(-)
base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1635%2Fvdye%2Fvdye%2Fstrengthen-fsck-url-checks-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1635/vdye/vdye/strengthen-fsck-url-checks-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1635
Range-diff vs v1:
1: 588de3022d7 = 1: ce1de0406ef submodule-config.h: move check_submodule_url
-: ----------- > 2: 14e8834c38b test-submodule: remove command line handling for check-name
2: cf7848edffc ! 3: b6843a58389 t7450: test submodule urls
@@ Metadata
## Commit message ##
t7450: test submodule urls
- Add a test to 't7450-bad-git-dotfiles.sh' to check the validity of different
- submodule URLs. To test this directly (without setting up test repositories
- & submodules), add a 'check-url' subcommand to 'test-tool submodule' that
- calls 'check_submodule_url' in the same way that 'check-name' calls
- 'check_submodule_name'.
+ Add tests to 't7450-bad-git-dotfiles.sh' to check the validity of different
+ submodule URLs. To verify this directly (without setting up test
+ repositories & submodules), add a 'check-url' subcommand to 'test-tool
+ submodule' that calls 'check_submodule_url' in the same way that
+ 'check-name' calls 'check_submodule_name'.
- Mark the test with 'test_expect_failure' because, as it stands,
- 'check_submodule_url' marks certain invalid URLs valid. Specifically, the
- invalid URL "http://example.com:test/foo.git" is incorrectly marked valid in
- the test.
+ Add two tests to separately address cases where the URL check correctly
+ filters out invalid URLs and cases where the check misses invalid URLs. Mark
+ the latter ("url check misses invalid cases") with 'test_expect_failure' to
+ indicate that this not the undesired behavior.
Signed-off-by: Victoria Dye <vdye@github.com>
@@ t/helper/test-submodule.c: static const char *submodule_check_name_usage[] = {
};
+#define TEST_TOOL_CHECK_URL_USAGE \
-+ "test-tool submodule check-url <url>"
++ "test-tool submodule check-url"
+static const char *submodule_check_url_usage[] = {
+ TEST_TOOL_CHECK_URL_USAGE,
+ NULL
@@ t/helper/test-submodule.c: static const char *submodule_check_name_usage[] = {
#define TEST_TOOL_IS_ACTIVE_USAGE \
"test-tool submodule is-active <name>"
static const char *submodule_is_active_usage[] = {
-@@ t/helper/test-submodule.c: static const char *submodule_usage[] = {
+@@ t/helper/test-submodule.c: static const char *submodule_resolve_relative_url_usage[] = {
+
+ static const char *submodule_usage[] = {
+ TEST_TOOL_CHECK_NAME_USAGE,
++ TEST_TOOL_CHECK_URL_USAGE,
+ TEST_TOOL_IS_ACTIVE_USAGE,
+ TEST_TOOL_RESOLVE_RELATIVE_URL_USAGE,
NULL
};
+-/* Filter stdin to print only valid names. */
+-static int check_name(void)
+typedef int (*check_fn_t)(const char *);
+
- /*
- * Exit non-zero if any of the submodule names given on the command line is
- * invalid. If no names are given, filter stdin to print only valid names
- * (which is primarily intended for testing).
- */
--static int check_name(int argc, const char **argv)
-+static int check_submodule(int argc, const char **argv, check_fn_t check_fn)
++/*
++ * Apply 'check_fn' to each line of stdin, printing values that pass the check
++ * to stdout.
++ */
++static int check_submodule(check_fn_t check_fn)
{
- if (argc > 1) {
- while (*++argv) {
-- if (check_submodule_name(*argv) < 0)
-+ if (check_fn(*argv) < 0)
- return 1;
- }
- } else {
- struct strbuf buf = STRBUF_INIT;
- while (strbuf_getline(&buf, stdin) != EOF) {
-- if (!check_submodule_name(buf.buf))
-+ if (!check_fn(buf.buf))
- printf("%s\n", buf.buf);
- }
- strbuf_release(&buf);
+ struct strbuf buf = STRBUF_INIT;
+ while (strbuf_getline(&buf, stdin) != EOF) {
+- if (!check_submodule_name(buf.buf))
++ if (!check_fn(buf.buf))
+ printf("%s\n", buf.buf);
+ }
+ strbuf_release(&buf);
@@ t/helper/test-submodule.c: static int cmd__submodule_check_name(int argc, const char **argv)
if (argc)
usage_with_options(submodule_check_name_usage, options);
-- return check_name(argc, argv);
-+ return check_submodule(argc, argv, check_submodule_name);
+- return check_name();
++ return check_submodule(check_submodule_name);
+}
+
+static int cmd__submodule_check_url(int argc, const char **argv)
@@ t/helper/test-submodule.c: static int cmd__submodule_check_name(int argc, const
+ if (argc)
+ usage_with_options(submodule_check_url_usage, options);
+
-+ return check_submodule(argc, argv, check_submodule_url);
++ return check_submodule(check_submodule_url);
}
static int cmd__submodule_is_active(int argc, const char **argv)
@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
test_cmp expect actual
'
-+test_expect_failure 'check urls' '
++test_expect_success 'check urls' '
+ cat >expect <<-\EOF &&
+ ./bar/baz/foo.git
+ https://example.com/foo.git
@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
+ ./%0ahost=example.com/foo.git
+ https://one.example.com/evil?%0ahost=two.example.com
+ https:///example.com/foo.git
-+ http://example.com:test/foo.git
+ https::example.com/foo.git
+ http:::example.com/foo.git
+ EOF
+
+ test_cmp expect actual
+'
++
++# NEEDSWORK: the URL checked here is not valid (and will not work as a remote if
++# a user attempts to clone it), but the fsck check passes.
++test_expect_failure 'url check misses invalid cases' '
++ test-tool submodule check-url >actual <<-\EOF &&
++ http://example.com:test/foo.git
++ EOF
++
++ test_must_be_empty actual
++'
+
test_expect_success 'create innocent subrepo' '
git init innocent &&
3: 893071530d3 ! 4: b79b1a71780 submodule-config.c: strengthen URL fsck check
@@ submodule-config.c: int check_submodule_url(const char *url)
## t/t7450-bad-git-dotfiles.sh ##
-@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
+@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check urls' '
+ ./%0ahost=example.com/foo.git
+ https://one.example.com/evil?%0ahost=two.example.com
+ https:///example.com/foo.git
++ http://example.com:test/foo.git
+ https::example.com/foo.git
+ http:::example.com/foo.git
+ EOF
+@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check urls' '
test_cmp expect actual
'
--test_expect_failure 'check urls' '
-+test_expect_success 'check urls' '
- cat >expect <<-\EOF &&
- ./bar/baz/foo.git
- https://example.com/foo.git
+-# NEEDSWORK: the URL checked here is not valid (and will not work as a remote if
+-# a user attempts to clone it), but the fsck check passes.
+-test_expect_failure 'url check misses invalid cases' '
+- test-tool submodule check-url >actual <<-\EOF &&
+- http://example.com:test/foo.git
+- EOF
+-
+- test_must_be_empty actual
+-'
+-
+ test_expect_success 'create innocent subrepo' '
+ git init innocent &&
+ git -C innocent commit --allow-empty -m foo
--
gitgitgadget
next prev parent reply other threads:[~2024-01-18 1:55 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-09 17:53 [PATCH 0/3] Strengthen fsck checks for submodule URLs Victoria Dye via GitGitGadget
2024-01-09 17:53 ` [PATCH 1/3] submodule-config.h: move check_submodule_url Victoria Dye via GitGitGadget
2024-01-09 17:53 ` [PATCH 2/3] t7450: test submodule urls Victoria Dye via GitGitGadget
2024-01-09 21:38 ` Junio C Hamano
2024-01-11 17:23 ` Victoria Dye
2024-01-10 10:38 ` Jeff King
2024-01-11 16:54 ` Victoria Dye
2024-01-12 6:57 ` Jeff King
2024-01-09 17:53 ` [PATCH 3/3] submodule-config.c: strengthen URL fsck check Victoria Dye via GitGitGadget
2024-01-09 21:57 ` Junio C Hamano
2024-01-10 6:33 ` Patrick Steinhardt
2024-01-17 21:19 ` Victoria Dye
2024-01-10 10:23 ` [PATCH 0/3] Strengthen fsck checks for submodule URLs Jeff King
2024-11-13 19:24 ` Neil Mayhew
2024-11-13 19:44 ` Neil Mayhew
2024-11-13 22:40 ` Junio C Hamano
2024-11-14 0:10 ` Jeff King
2024-11-14 0:51 ` Neil Mayhew
2024-11-14 2:20 ` Junio C Hamano
2024-11-14 19:11 ` Neil Mayhew
2024-11-14 0:27 ` Neil Mayhew
2024-01-18 1:55 ` Victoria Dye via GitGitGadget [this message]
2024-01-18 1:55 ` [PATCH v2 1/4] submodule-config.h: move check_submodule_url Victoria Dye via GitGitGadget
2024-01-18 1:55 ` [PATCH v2 2/4] test-submodule: remove command line handling for check-name Victoria Dye via GitGitGadget
2024-01-18 20:44 ` Junio C Hamano
2024-01-18 1:55 ` [PATCH v2 3/4] t7450: test submodule urls Victoria Dye via GitGitGadget
2024-01-19 6:05 ` Patrick Steinhardt
2024-01-19 18:16 ` Junio C Hamano
2024-01-18 1:55 ` [PATCH v2 4/4] submodule-config.c: strengthen URL fsck check Victoria Dye via GitGitGadget
2024-01-18 18:24 ` [PATCH v2 0/4] Strengthen fsck checks for submodule URLs Junio C Hamano
2024-01-20 0:51 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1635.v2.git.1705542918.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).