From: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Jeff King <peff@peff.net>,
Victoria Dye <vdye@github.com>
Subject: [PATCH v2 0/4] Strengthen fsck checks for submodule URLs
Date: Thu, 18 Jan 2024 01:55:14 +0000 [thread overview]
Message-ID: <pull.1635.v2.git.1705542918.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1635.git.1704822817.gitgitgadget@gmail.com>
While testing 'git fsck' checks on .gitmodules URLs, I noticed that some
invalid URLs were passing the checks. Digging into it a bit more, the issue
turned out to be that 'credential_from_url_gently()' parses certain URLs
(like "http://example.com:something/deeper/path") incorrectly, in a way that
appeared to return a valid result.
Fortunately, these URLs are rejected in fetches/clones/pushes anyway because
'url_normalize()' (called in 'validate_remote_url()') correctly identifies
them as invalid. So, to bring 'git fsck' in line with other (stronger)
validation done on remote URLs, this series replaces the
'credential_from_url_gently()' check with one that uses 'url_normalize()'.
* Patch 1 moves 'check_submodule_url()' to a public location so that it can
be used outside of 'fsck.c'.
* Patch 2 removes the obsolete/never-used code in 'test-tool submodule
check-name' handling names provided on the command line.
* Patch 3 adds a 'check-url' mode to 'test-tool submodule', calling the
now-public 'check_submodule_url()' method on a given URL, and adds new
tests checking valid and invalid submodule URLs.
* Patch 4 replaces the 'credential_from_url_gently()' check with
'url_normalize()' followed by 'url_decode()' and an explicit check for
newlines (to preserve the newline handling added in 07259e74ec1 (fsck:
detect gitmodules URLs with embedded newlines, 2020-03-11)).
Changes since V1
================
* Added 'TEST_TOOL_CHECK_URL_USAGE' to 'submodule_usage'.
* Removed unused/unreachable code related to command line inputs in
'test-tool submodule check-name' and 'test-tool submodule check-url'.
* Split the new t7450 test case into two tests (the first contains URLs
that are validated successfully, the second demonstrates a URL
incorrectly marked valid) to clearly show which pattern is handled
improperly. The tests are merged in the final patch once the validation
is corrected.
Thanks!
* Victoria
Victoria Dye (4):
submodule-config.h: move check_submodule_url
test-submodule: remove command line handling for check-name
t7450: test submodule urls
submodule-config.c: strengthen URL fsck check
fsck.c | 133 ----------------------------------
submodule-config.c | 140 ++++++++++++++++++++++++++++++++++++
submodule-config.h | 3 +
t/helper/test-submodule.c | 52 +++++++++-----
t/t7450-bad-git-dotfiles.sh | 26 +++++++
5 files changed, 203 insertions(+), 151 deletions(-)
base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1635%2Fvdye%2Fvdye%2Fstrengthen-fsck-url-checks-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1635/vdye/vdye/strengthen-fsck-url-checks-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1635
Range-diff vs v1:
1: 588de3022d7 = 1: ce1de0406ef submodule-config.h: move check_submodule_url
-: ----------- > 2: 14e8834c38b test-submodule: remove command line handling for check-name
2: cf7848edffc ! 3: b6843a58389 t7450: test submodule urls
@@ Metadata
## Commit message ##
t7450: test submodule urls
- Add a test to 't7450-bad-git-dotfiles.sh' to check the validity of different
- submodule URLs. To test this directly (without setting up test repositories
- & submodules), add a 'check-url' subcommand to 'test-tool submodule' that
- calls 'check_submodule_url' in the same way that 'check-name' calls
- 'check_submodule_name'.
+ Add tests to 't7450-bad-git-dotfiles.sh' to check the validity of different
+ submodule URLs. To verify this directly (without setting up test
+ repositories & submodules), add a 'check-url' subcommand to 'test-tool
+ submodule' that calls 'check_submodule_url' in the same way that
+ 'check-name' calls 'check_submodule_name'.
- Mark the test with 'test_expect_failure' because, as it stands,
- 'check_submodule_url' marks certain invalid URLs valid. Specifically, the
- invalid URL "http://example.com:test/foo.git" is incorrectly marked valid in
- the test.
+ Add two tests to separately address cases where the URL check correctly
+ filters out invalid URLs and cases where the check misses invalid URLs. Mark
+ the latter ("url check misses invalid cases") with 'test_expect_failure' to
+ indicate that this not the undesired behavior.
Signed-off-by: Victoria Dye <vdye@github.com>
@@ t/helper/test-submodule.c: static const char *submodule_check_name_usage[] = {
};
+#define TEST_TOOL_CHECK_URL_USAGE \
-+ "test-tool submodule check-url <url>"
++ "test-tool submodule check-url"
+static const char *submodule_check_url_usage[] = {
+ TEST_TOOL_CHECK_URL_USAGE,
+ NULL
@@ t/helper/test-submodule.c: static const char *submodule_check_name_usage[] = {
#define TEST_TOOL_IS_ACTIVE_USAGE \
"test-tool submodule is-active <name>"
static const char *submodule_is_active_usage[] = {
-@@ t/helper/test-submodule.c: static const char *submodule_usage[] = {
+@@ t/helper/test-submodule.c: static const char *submodule_resolve_relative_url_usage[] = {
+
+ static const char *submodule_usage[] = {
+ TEST_TOOL_CHECK_NAME_USAGE,
++ TEST_TOOL_CHECK_URL_USAGE,
+ TEST_TOOL_IS_ACTIVE_USAGE,
+ TEST_TOOL_RESOLVE_RELATIVE_URL_USAGE,
NULL
};
+-/* Filter stdin to print only valid names. */
+-static int check_name(void)
+typedef int (*check_fn_t)(const char *);
+
- /*
- * Exit non-zero if any of the submodule names given on the command line is
- * invalid. If no names are given, filter stdin to print only valid names
- * (which is primarily intended for testing).
- */
--static int check_name(int argc, const char **argv)
-+static int check_submodule(int argc, const char **argv, check_fn_t check_fn)
++/*
++ * Apply 'check_fn' to each line of stdin, printing values that pass the check
++ * to stdout.
++ */
++static int check_submodule(check_fn_t check_fn)
{
- if (argc > 1) {
- while (*++argv) {
-- if (check_submodule_name(*argv) < 0)
-+ if (check_fn(*argv) < 0)
- return 1;
- }
- } else {
- struct strbuf buf = STRBUF_INIT;
- while (strbuf_getline(&buf, stdin) != EOF) {
-- if (!check_submodule_name(buf.buf))
-+ if (!check_fn(buf.buf))
- printf("%s\n", buf.buf);
- }
- strbuf_release(&buf);
+ struct strbuf buf = STRBUF_INIT;
+ while (strbuf_getline(&buf, stdin) != EOF) {
+- if (!check_submodule_name(buf.buf))
++ if (!check_fn(buf.buf))
+ printf("%s\n", buf.buf);
+ }
+ strbuf_release(&buf);
@@ t/helper/test-submodule.c: static int cmd__submodule_check_name(int argc, const char **argv)
if (argc)
usage_with_options(submodule_check_name_usage, options);
-- return check_name(argc, argv);
-+ return check_submodule(argc, argv, check_submodule_name);
+- return check_name();
++ return check_submodule(check_submodule_name);
+}
+
+static int cmd__submodule_check_url(int argc, const char **argv)
@@ t/helper/test-submodule.c: static int cmd__submodule_check_name(int argc, const
+ if (argc)
+ usage_with_options(submodule_check_url_usage, options);
+
-+ return check_submodule(argc, argv, check_submodule_url);
++ return check_submodule(check_submodule_url);
}
static int cmd__submodule_is_active(int argc, const char **argv)
@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
test_cmp expect actual
'
-+test_expect_failure 'check urls' '
++test_expect_success 'check urls' '
+ cat >expect <<-\EOF &&
+ ./bar/baz/foo.git
+ https://example.com/foo.git
@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
+ ./%0ahost=example.com/foo.git
+ https://one.example.com/evil?%0ahost=two.example.com
+ https:///example.com/foo.git
-+ http://example.com:test/foo.git
+ https::example.com/foo.git
+ http:::example.com/foo.git
+ EOF
+
+ test_cmp expect actual
+'
++
++# NEEDSWORK: the URL checked here is not valid (and will not work as a remote if
++# a user attempts to clone it), but the fsck check passes.
++test_expect_failure 'url check misses invalid cases' '
++ test-tool submodule check-url >actual <<-\EOF &&
++ http://example.com:test/foo.git
++ EOF
++
++ test_must_be_empty actual
++'
+
test_expect_success 'create innocent subrepo' '
git init innocent &&
3: 893071530d3 ! 4: b79b1a71780 submodule-config.c: strengthen URL fsck check
@@ submodule-config.c: int check_submodule_url(const char *url)
## t/t7450-bad-git-dotfiles.sh ##
-@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
+@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check urls' '
+ ./%0ahost=example.com/foo.git
+ https://one.example.com/evil?%0ahost=two.example.com
+ https:///example.com/foo.git
++ http://example.com:test/foo.git
+ https::example.com/foo.git
+ http:::example.com/foo.git
+ EOF
+@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check urls' '
test_cmp expect actual
'
--test_expect_failure 'check urls' '
-+test_expect_success 'check urls' '
- cat >expect <<-\EOF &&
- ./bar/baz/foo.git
- https://example.com/foo.git
+-# NEEDSWORK: the URL checked here is not valid (and will not work as a remote if
+-# a user attempts to clone it), but the fsck check passes.
+-test_expect_failure 'url check misses invalid cases' '
+- test-tool submodule check-url >actual <<-\EOF &&
+- http://example.com:test/foo.git
+- EOF
+-
+- test_must_be_empty actual
+-'
+-
+ test_expect_success 'create innocent subrepo' '
+ git init innocent &&
+ git -C innocent commit --allow-empty -m foo
--
gitgitgadget
next prev parent reply other threads:[~2024-01-18 1:55 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-09 17:53 [PATCH 0/3] Strengthen fsck checks for submodule URLs Victoria Dye via GitGitGadget
2024-01-09 17:53 ` [PATCH 1/3] submodule-config.h: move check_submodule_url Victoria Dye via GitGitGadget
2024-01-09 17:53 ` [PATCH 2/3] t7450: test submodule urls Victoria Dye via GitGitGadget
2024-01-09 21:38 ` Junio C Hamano
2024-01-11 17:23 ` Victoria Dye
2024-01-10 10:38 ` Jeff King
2024-01-11 16:54 ` Victoria Dye
2024-01-12 6:57 ` Jeff King
2024-01-09 17:53 ` [PATCH 3/3] submodule-config.c: strengthen URL fsck check Victoria Dye via GitGitGadget
2024-01-09 21:57 ` Junio C Hamano
2024-01-10 6:33 ` Patrick Steinhardt
2024-01-17 21:19 ` Victoria Dye
2024-01-10 10:23 ` [PATCH 0/3] Strengthen fsck checks for submodule URLs Jeff King
2024-11-13 19:24 ` Neil Mayhew
2024-11-13 19:44 ` Neil Mayhew
2024-11-13 22:40 ` Junio C Hamano
2024-11-14 0:10 ` Jeff King
2024-11-14 0:51 ` Neil Mayhew
2024-11-14 2:20 ` Junio C Hamano
2024-11-14 19:11 ` Neil Mayhew
2024-11-14 0:27 ` Neil Mayhew
2024-01-18 1:55 ` Victoria Dye via GitGitGadget [this message]
2024-01-18 1:55 ` [PATCH v2 1/4] submodule-config.h: move check_submodule_url Victoria Dye via GitGitGadget
2024-01-18 1:55 ` [PATCH v2 2/4] test-submodule: remove command line handling for check-name Victoria Dye via GitGitGadget
2024-01-18 20:44 ` Junio C Hamano
2024-01-18 1:55 ` [PATCH v2 3/4] t7450: test submodule urls Victoria Dye via GitGitGadget
2024-01-19 6:05 ` Patrick Steinhardt
2024-01-19 18:16 ` Junio C Hamano
2024-01-18 1:55 ` [PATCH v2 4/4] submodule-config.c: strengthen URL fsck check Victoria Dye via GitGitGadget
2024-01-18 18:24 ` [PATCH v2 0/4] Strengthen fsck checks for submodule URLs Junio C Hamano
2024-01-20 0:51 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1635.v2.git.1705542918.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.