git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Jeff King <peff@peff.net>,
	Victoria Dye <vdye@github.com>
Subject: [PATCH v2 0/4] Strengthen fsck checks for submodule URLs
Date: Thu, 18 Jan 2024 01:55:14 +0000	[thread overview]
Message-ID: <pull.1635.v2.git.1705542918.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1635.git.1704822817.gitgitgadget@gmail.com>

While testing 'git fsck' checks on .gitmodules URLs, I noticed that some
invalid URLs were passing the checks. Digging into it a bit more, the issue
turned out to be that 'credential_from_url_gently()' parses certain URLs
(like "http://example.com:something/deeper/path") incorrectly, in a way that
appeared to return a valid result.

Fortunately, these URLs are rejected in fetches/clones/pushes anyway because
'url_normalize()' (called in 'validate_remote_url()') correctly identifies
them as invalid. So, to bring 'git fsck' in line with other (stronger)
validation done on remote URLs, this series replaces the
'credential_from_url_gently()' check with one that uses 'url_normalize()'.

 * Patch 1 moves 'check_submodule_url()' to a public location so that it can
   be used outside of 'fsck.c'.
 * Patch 2 removes the obsolete/never-used code in 'test-tool submodule
   check-name' handling names provided on the command line.
 * Patch 3 adds a 'check-url' mode to 'test-tool submodule', calling the
   now-public 'check_submodule_url()' method on a given URL, and adds new
   tests checking valid and invalid submodule URLs.
 * Patch 4 replaces the 'credential_from_url_gently()' check with
   'url_normalize()' followed by 'url_decode()' and an explicit check for
   newlines (to preserve the newline handling added in 07259e74ec1 (fsck:
   detect gitmodules URLs with embedded newlines, 2020-03-11)).


Changes since V1
================

 * Added 'TEST_TOOL_CHECK_URL_USAGE' to 'submodule_usage'.
 * Removed unused/unreachable code related to command line inputs in
   'test-tool submodule check-name' and 'test-tool submodule check-url'.
 * Split the new t7450 test case into two tests (the first contains URLs
   that are validated successfully, the second demonstrates a URL
   incorrectly marked valid) to clearly show which pattern is handled
   improperly. The tests are merged in the final patch once the validation
   is corrected.

Thanks!

 * Victoria

Victoria Dye (4):
  submodule-config.h: move check_submodule_url
  test-submodule: remove command line handling for check-name
  t7450: test submodule urls
  submodule-config.c: strengthen URL fsck check

 fsck.c                      | 133 ----------------------------------
 submodule-config.c          | 140 ++++++++++++++++++++++++++++++++++++
 submodule-config.h          |   3 +
 t/helper/test-submodule.c   |  52 +++++++++-----
 t/t7450-bad-git-dotfiles.sh |  26 +++++++
 5 files changed, 203 insertions(+), 151 deletions(-)


base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1635%2Fvdye%2Fvdye%2Fstrengthen-fsck-url-checks-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1635/vdye/vdye/strengthen-fsck-url-checks-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1635

Range-diff vs v1:

 1:  588de3022d7 = 1:  ce1de0406ef submodule-config.h: move check_submodule_url
 -:  ----------- > 2:  14e8834c38b test-submodule: remove command line handling for check-name
 2:  cf7848edffc ! 3:  b6843a58389 t7450: test submodule urls
     @@ Metadata
       ## Commit message ##
          t7450: test submodule urls
      
     -    Add a test to 't7450-bad-git-dotfiles.sh' to check the validity of different
     -    submodule URLs. To test this directly (without setting up test repositories
     -    & submodules), add a 'check-url' subcommand to 'test-tool submodule' that
     -    calls 'check_submodule_url' in the same way that 'check-name' calls
     -    'check_submodule_name'.
     +    Add tests to 't7450-bad-git-dotfiles.sh' to check the validity of different
     +    submodule URLs. To verify this directly (without setting up test
     +    repositories & submodules), add a 'check-url' subcommand to 'test-tool
     +    submodule' that calls 'check_submodule_url' in the same way that
     +    'check-name' calls 'check_submodule_name'.
      
     -    Mark the test with 'test_expect_failure' because, as it stands,
     -    'check_submodule_url' marks certain invalid URLs valid. Specifically, the
     -    invalid URL "http://example.com:test/foo.git" is incorrectly marked valid in
     -    the test.
     +    Add two tests to separately address cases where the URL check correctly
     +    filters out invalid URLs and cases where the check misses invalid URLs. Mark
     +    the latter ("url check misses invalid cases") with 'test_expect_failure' to
     +    indicate that this not the undesired behavior.
      
          Signed-off-by: Victoria Dye <vdye@github.com>
      
     @@ t/helper/test-submodule.c: static const char *submodule_check_name_usage[] = {
       };
       
      +#define TEST_TOOL_CHECK_URL_USAGE \
     -+	"test-tool submodule check-url <url>"
     ++	"test-tool submodule check-url"
      +static const char *submodule_check_url_usage[] = {
      +	TEST_TOOL_CHECK_URL_USAGE,
      +	NULL
     @@ t/helper/test-submodule.c: static const char *submodule_check_name_usage[] = {
       #define TEST_TOOL_IS_ACTIVE_USAGE \
       	"test-tool submodule is-active <name>"
       static const char *submodule_is_active_usage[] = {
     -@@ t/helper/test-submodule.c: static const char *submodule_usage[] = {
     +@@ t/helper/test-submodule.c: static const char *submodule_resolve_relative_url_usage[] = {
     + 
     + static const char *submodule_usage[] = {
     + 	TEST_TOOL_CHECK_NAME_USAGE,
     ++	TEST_TOOL_CHECK_URL_USAGE,
     + 	TEST_TOOL_IS_ACTIVE_USAGE,
     + 	TEST_TOOL_RESOLVE_RELATIVE_URL_USAGE,
       	NULL
       };
       
     +-/* Filter stdin to print only valid names. */
     +-static int check_name(void)
      +typedef int (*check_fn_t)(const char *);
      +
     - /*
     -  * Exit non-zero if any of the submodule names given on the command line is
     -  * invalid. If no names are given, filter stdin to print only valid names
     -  * (which is primarily intended for testing).
     -  */
     --static int check_name(int argc, const char **argv)
     -+static int check_submodule(int argc, const char **argv, check_fn_t check_fn)
     ++/*
     ++ * Apply 'check_fn' to each line of stdin, printing values that pass the check
     ++ * to stdout.
     ++ */
     ++static int check_submodule(check_fn_t check_fn)
       {
     - 	if (argc > 1) {
     - 		while (*++argv) {
     --			if (check_submodule_name(*argv) < 0)
     -+			if (check_fn(*argv) < 0)
     - 				return 1;
     - 		}
     - 	} else {
     - 		struct strbuf buf = STRBUF_INIT;
     - 		while (strbuf_getline(&buf, stdin) != EOF) {
     --			if (!check_submodule_name(buf.buf))
     -+			if (!check_fn(buf.buf))
     - 				printf("%s\n", buf.buf);
     - 		}
     - 		strbuf_release(&buf);
     + 	struct strbuf buf = STRBUF_INIT;
     + 	while (strbuf_getline(&buf, stdin) != EOF) {
     +-		if (!check_submodule_name(buf.buf))
     ++		if (!check_fn(buf.buf))
     + 			printf("%s\n", buf.buf);
     + 	}
     + 	strbuf_release(&buf);
      @@ t/helper/test-submodule.c: static int cmd__submodule_check_name(int argc, const char **argv)
       	if (argc)
       		usage_with_options(submodule_check_name_usage, options);
       
     --	return check_name(argc, argv);
     -+	return check_submodule(argc, argv, check_submodule_name);
     +-	return check_name();
     ++	return check_submodule(check_submodule_name);
      +}
      +
      +static int cmd__submodule_check_url(int argc, const char **argv)
     @@ t/helper/test-submodule.c: static int cmd__submodule_check_name(int argc, const
      +	if (argc)
      +		usage_with_options(submodule_check_url_usage, options);
      +
     -+	return check_submodule(argc, argv, check_submodule_url);
     ++	return check_submodule(check_submodule_url);
       }
       
       static int cmd__submodule_is_active(int argc, const char **argv)
     @@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
       	test_cmp expect actual
       '
       
     -+test_expect_failure 'check urls' '
     ++test_expect_success 'check urls' '
      +	cat >expect <<-\EOF &&
      +	./bar/baz/foo.git
      +	https://example.com/foo.git
     @@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
      +	./%0ahost=example.com/foo.git
      +	https://one.example.com/evil?%0ahost=two.example.com
      +	https:///example.com/foo.git
     -+	http://example.com:test/foo.git
      +	https::example.com/foo.git
      +	http:::example.com/foo.git
      +	EOF
      +
      +	test_cmp expect actual
      +'
     ++
     ++# NEEDSWORK: the URL checked here is not valid (and will not work as a remote if
     ++# a user attempts to clone it), but the fsck check passes.
     ++test_expect_failure 'url check misses invalid cases' '
     ++	test-tool submodule check-url >actual <<-\EOF &&
     ++	http://example.com:test/foo.git
     ++	EOF
     ++
     ++	test_must_be_empty actual
     ++'
      +
       test_expect_success 'create innocent subrepo' '
       	git init innocent &&
 3:  893071530d3 ! 4:  b79b1a71780 submodule-config.c: strengthen URL fsck check
     @@ submodule-config.c: int check_submodule_url(const char *url)
       
      
       ## t/t7450-bad-git-dotfiles.sh ##
     -@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check names' '
     +@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check urls' '
     + 	./%0ahost=example.com/foo.git
     + 	https://one.example.com/evil?%0ahost=two.example.com
     + 	https:///example.com/foo.git
     ++	http://example.com:test/foo.git
     + 	https::example.com/foo.git
     + 	http:::example.com/foo.git
     + 	EOF
     +@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'check urls' '
       	test_cmp expect actual
       '
       
     --test_expect_failure 'check urls' '
     -+test_expect_success 'check urls' '
     - 	cat >expect <<-\EOF &&
     - 	./bar/baz/foo.git
     - 	https://example.com/foo.git
     +-# NEEDSWORK: the URL checked here is not valid (and will not work as a remote if
     +-# a user attempts to clone it), but the fsck check passes.
     +-test_expect_failure 'url check misses invalid cases' '
     +-	test-tool submodule check-url >actual <<-\EOF &&
     +-	http://example.com:test/foo.git
     +-	EOF
     +-
     +-	test_must_be_empty actual
     +-'
     +-
     + test_expect_success 'create innocent subrepo' '
     + 	git init innocent &&
     + 	git -C innocent commit --allow-empty -m foo

-- 
gitgitgadget

  parent reply	other threads:[~2024-01-18  1:55 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-09 17:53 [PATCH 0/3] Strengthen fsck checks for submodule URLs Victoria Dye via GitGitGadget
2024-01-09 17:53 ` [PATCH 1/3] submodule-config.h: move check_submodule_url Victoria Dye via GitGitGadget
2024-01-09 17:53 ` [PATCH 2/3] t7450: test submodule urls Victoria Dye via GitGitGadget
2024-01-09 21:38   ` Junio C Hamano
2024-01-11 17:23     ` Victoria Dye
2024-01-10 10:38   ` Jeff King
2024-01-11 16:54     ` Victoria Dye
2024-01-12  6:57       ` Jeff King
2024-01-09 17:53 ` [PATCH 3/3] submodule-config.c: strengthen URL fsck check Victoria Dye via GitGitGadget
2024-01-09 21:57   ` Junio C Hamano
2024-01-10  6:33   ` Patrick Steinhardt
2024-01-17 21:19     ` Victoria Dye
2024-01-10 10:23 ` [PATCH 0/3] Strengthen fsck checks for submodule URLs Jeff King
2024-11-13 19:24   ` Neil Mayhew
2024-11-13 19:44     ` Neil Mayhew
2024-11-13 22:40     ` Junio C Hamano
2024-11-14  0:10       ` Jeff King
2024-11-14  0:51         ` Neil Mayhew
2024-11-14  2:20         ` Junio C Hamano
2024-11-14 19:11           ` Neil Mayhew
2024-11-14  0:27       ` Neil Mayhew
2024-01-18  1:55 ` Victoria Dye via GitGitGadget [this message]
2024-01-18  1:55   ` [PATCH v2 1/4] submodule-config.h: move check_submodule_url Victoria Dye via GitGitGadget
2024-01-18  1:55   ` [PATCH v2 2/4] test-submodule: remove command line handling for check-name Victoria Dye via GitGitGadget
2024-01-18 20:44     ` Junio C Hamano
2024-01-18  1:55   ` [PATCH v2 3/4] t7450: test submodule urls Victoria Dye via GitGitGadget
2024-01-19  6:05     ` Patrick Steinhardt
2024-01-19 18:16       ` Junio C Hamano
2024-01-18  1:55   ` [PATCH v2 4/4] submodule-config.c: strengthen URL fsck check Victoria Dye via GitGitGadget
2024-01-18 18:24   ` [PATCH v2 0/4] Strengthen fsck checks for submodule URLs Junio C Hamano
2024-01-20  0:51     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1635.v2.git.1705542918.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).