* [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
@ 2024-06-28 12:56 Ghanshyam Thakkar
2024-07-09 0:42 ` Ghanshyam Thakkar
` (3 more replies)
0 siblings, 4 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-06-28 12:56 UTC
To: git
Cc: Christian Couder, Phillip Wood, Ghanshyam Thakkar,
Christian Couder, Kaartic Sivaraam
helper/test-urlmatch-normalization, along with
t0110-urlmatch-normalization, tests the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance, and add test_msg() calls that print the offending input
so that failures are easier to debug.
In the migration, the last two checks in `t_url_general_escape()`
differ slightly from the shell script. The change is
'\'' -> '
'\!' -> !
in the URLs of those checks, because a C string literal does not need
"'" or "!" to be escaped. Apart from these two, all the URLs were
copied verbatim from the shell script.
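As a minimal sketch of the quoting point above (the names
`url_with_quote`, `url_with_bang` and `verify_literals` are illustrative
and not part of the patch): once the shell layer is gone, both URLs can
be written directly as C string literals. Only a character literal still
needs the quote escaped.

```c
#include <assert.h>
#include <string.h>

/* The two URLs as they can be spelled in C, with no shell quoting. */
static const char url_with_quote[] = "X://W/'";
static const char url_with_bang[] = "X://W?!";

/* Illustrative self-check: no backslash survives in either literal. */
static void verify_literals(void)
{
	assert(strlen(url_with_quote) == 7);
	assert(url_with_quote[6] == '\''); /* char literal needs the escape */
	assert(!strchr(url_with_quote, '\\'));
	assert(!strchr(url_with_bang, '\\'));
}
```

In the shell script the same two characters had to be written as '\''
and '\!' because the whole test body sits inside single quotes.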
Another change is the removal of the MINGW prerequisite from one of the
tests. It was there because[1], on Windows, the command line is a
Unicode string, so it is not possible to pass arbitrary bytes to a
program. The unit tests do not have this limitation, because they read
the URLs from files instead of passing them on the command line.
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
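To illustrate why the limitation goes away, here is a rough sketch,
assuming a hypothetical helper `read_url_bytes()` that is not part of
the patch (the patch itself uses strbuf_read_file() and
strbuf_trim_trailing_newline()). It reads the URL bytes straight from a
file in binary mode, so they never pass through a command line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: read up to cap - 1 bytes
 * from `path` into `buf`, strip one trailing "\n" and NUL-terminate.
 * Returns the resulting length, or -1 on error.
 */
static long read_url_bytes(const char *path, char *buf, size_t cap)
{
	FILE *fp;
	size_t n;

	assert(cap > 0);
	fp = fopen(path, "rb"); /* binary mode: bytes are not translated */
	if (!fp)
		return -1;
	n = fread(buf, 1, cap - 1, fp);
	if (ferror(fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);
	if (n > 0 && buf[n - 1] == '\n')
		n--;
	buf[n] = '\0';
	return (long)n;
}
```

High-bit bytes such as 0x80 or 0xFF arrive in `buf` unchanged, which is
what the "url high-bit escapes" checks rely on. This sketch strips only
a single trailing "\n".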
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
Makefile | 2 +-
t/helper/test-tool.c | 1 -
t/helper/test-tool.h | 1 -
t/helper/test-urlmatch-normalization.c | 56 ----
t/t0110-urlmatch-normalization.sh | 182 -----------
t/unit-tests/t-urlmatch-normalization.c | 284 ++++++++++++++++++
.../t-urlmatch-normalization}/README | 0
.../t-urlmatch-normalization}/url-1 | Bin
.../t-urlmatch-normalization}/url-10 | Bin
.../t-urlmatch-normalization}/url-11 | Bin
.../t-urlmatch-normalization}/url-2 | Bin
.../t-urlmatch-normalization}/url-3 | Bin
.../t-urlmatch-normalization}/url-4 | Bin
.../t-urlmatch-normalization}/url-5 | Bin
.../t-urlmatch-normalization}/url-6 | Bin
.../t-urlmatch-normalization}/url-7 | Bin
.../t-urlmatch-normalization}/url-8 | Bin
.../t-urlmatch-normalization}/url-9 | Bin
18 files changed, 285 insertions(+), 241 deletions(-)
delete mode 100644 t/helper/test-urlmatch-normalization.c
delete mode 100755 t/t0110-urlmatch-normalization.sh
create mode 100644 t/unit-tests/t-urlmatch-normalization.c
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/README (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-1 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-10 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-11 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-2 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-3 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-4 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-5 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-6 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-7 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-8 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-9 (100%)
diff --git a/Makefile b/Makefile
index 83bd9d13af..0fc0ee8c3e 100644
--- a/Makefile
+++ b/Makefile
@@ -844,7 +844,6 @@ TEST_BUILTINS_OBJS += test-submodule.o
TEST_BUILTINS_OBJS += test-subprocess.o
TEST_BUILTINS_OBJS += test-trace2.o
TEST_BUILTINS_OBJS += test-truncate.o
-TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
TEST_BUILTINS_OBJS += test-userdiff.o
TEST_BUILTINS_OBJS += test-wildmatch.o
TEST_BUILTINS_OBJS += test-windows-named-pipe.o
@@ -1343,6 +1342,7 @@ UNIT_TEST_PROGRAMS += t-strbuf
UNIT_TEST_PROGRAMS += t-strcmp-offset
UNIT_TEST_PROGRAMS += t-strvec
UNIT_TEST_PROGRAMS += t-trailer
+UNIT_TEST_PROGRAMS += t-urlmatch-normalization
UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 93436a82ae..feed419cdd 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -84,7 +84,6 @@ static struct test_cmd cmds[] = {
{ "trace2", cmd__trace2 },
{ "truncate", cmd__truncate },
{ "userdiff", cmd__userdiff },
- { "urlmatch-normalization", cmd__urlmatch_normalization },
{ "xml-encode", cmd__xml_encode },
{ "wildmatch", cmd__wildmatch },
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index d9033d14e1..0c80529604 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -77,7 +77,6 @@ int cmd__subprocess(int argc, const char **argv);
int cmd__trace2(int argc, const char **argv);
int cmd__truncate(int argc, const char **argv);
int cmd__userdiff(int argc, const char **argv);
-int cmd__urlmatch_normalization(int argc, const char **argv);
int cmd__xml_encode(int argc, const char **argv);
int cmd__wildmatch(int argc, const char **argv);
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-urlmatch-normalization.c b/t/helper/test-urlmatch-normalization.c
deleted file mode 100644
index 86edd454f5..0000000000
--- a/t/helper/test-urlmatch-normalization.c
+++ /dev/null
@@ -1,56 +0,0 @@
-#include "test-tool.h"
-#include "git-compat-util.h"
-#include "urlmatch.h"
-
-int cmd__urlmatch_normalization(int argc, const char **argv)
-{
- const char usage[] = "test-tool urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
- char *url1 = NULL, *url2 = NULL;
- int opt_p = 0, opt_l = 0;
- int ret = 0;
-
- /*
- * For one url, succeed if url_normalize succeeds on it, fail otherwise.
- * For two urls, succeed only if url_normalize succeeds on both and
- * the results compare equal with strcmp. If -p is given (one url only)
- * and url_normalize succeeds, print the result followed by "\n". If
- * -l is given (one url only) and url_normalize succeeds, print the
- * returned length in decimal followed by "\n".
- */
-
- if (argc > 1 && !strcmp(argv[1], "-p")) {
- opt_p = 1;
- argc--;
- argv++;
- } else if (argc > 1 && !strcmp(argv[1], "-l")) {
- opt_l = 1;
- argc--;
- argv++;
- }
-
- if (argc < 2 || argc > 3)
- die("%s", usage);
-
- if (argc == 2) {
- struct url_info info;
- url1 = url_normalize(argv[1], &info);
- if (!url1)
- return 1;
- if (opt_p)
- printf("%s\n", url1);
- if (opt_l)
- printf("%u\n", (unsigned)info.url_len);
- goto cleanup;
- }
-
- if (opt_p || opt_l)
- die("%s", usage);
-
- url1 = url_normalize(argv[1], NULL);
- url2 = url_normalize(argv[2], NULL);
- ret = (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
-cleanup:
- free(url1);
- free(url2);
- return ret;
-}
diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
deleted file mode 100755
index 12d817fbd3..0000000000
--- a/t/t0110-urlmatch-normalization.sh
+++ /dev/null
@@ -1,182 +0,0 @@
-#!/bin/sh
-
-test_description='urlmatch URL normalization'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-# The base name of the test url files
-tu="$TEST_DIRECTORY/t0110/url"
-
-# Note that only file: URLs should be allowed without a host
-
-test_expect_success 'url scheme' '
- ! test-tool urlmatch-normalization "" &&
- ! test-tool urlmatch-normalization "_" &&
- ! test-tool urlmatch-normalization "scheme" &&
- ! test-tool urlmatch-normalization "scheme:" &&
- ! test-tool urlmatch-normalization "scheme:/" &&
- ! test-tool urlmatch-normalization "scheme://" &&
- ! test-tool urlmatch-normalization "file" &&
- ! test-tool urlmatch-normalization "file:" &&
- ! test-tool urlmatch-normalization "file:/" &&
- test-tool urlmatch-normalization "file://" &&
- ! test-tool urlmatch-normalization "://acme.co" &&
- ! test-tool urlmatch-normalization "x_test://acme.co" &&
- ! test-tool urlmatch-normalization "-test://acme.co" &&
- ! test-tool urlmatch-normalization "0test://acme.co" &&
- ! test-tool urlmatch-normalization "+test://acme.co" &&
- ! test-tool urlmatch-normalization ".test://acme.co" &&
- ! test-tool urlmatch-normalization "schem%6e://" &&
- test-tool urlmatch-normalization "x-Test+v1.0://acme.co" &&
- test "$(test-tool urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
-'
-
-test_expect_success 'url authority' '
- ! test-tool urlmatch-normalization "scheme://user:pass@" &&
- ! test-tool urlmatch-normalization "scheme://?" &&
- ! test-tool urlmatch-normalization "scheme://#" &&
- ! test-tool urlmatch-normalization "scheme:///" &&
- ! test-tool urlmatch-normalization "scheme://:" &&
- ! test-tool urlmatch-normalization "scheme://:555" &&
- test-tool urlmatch-normalization "file://user:pass@" &&
- test-tool urlmatch-normalization "file://?" &&
- test-tool urlmatch-normalization "file://#" &&
- test-tool urlmatch-normalization "file:///" &&
- test-tool urlmatch-normalization "file://:" &&
- ! test-tool urlmatch-normalization "file://:555" &&
- test-tool urlmatch-normalization "scheme://user:pass@host" &&
- test-tool urlmatch-normalization "scheme://@host" &&
- test-tool urlmatch-normalization "scheme://%00@host" &&
- ! test-tool urlmatch-normalization "scheme://%%@host" &&
- test-tool urlmatch-normalization "scheme://host_" &&
- test-tool urlmatch-normalization "scheme://user:pass@host/" &&
- test-tool urlmatch-normalization "scheme://@host/" &&
- test-tool urlmatch-normalization "scheme://host/" &&
- test-tool urlmatch-normalization "scheme://host?x" &&
- test-tool urlmatch-normalization "scheme://host#x" &&
- test-tool urlmatch-normalization "scheme://host/@" &&
- test-tool urlmatch-normalization "scheme://host?@x" &&
- test-tool urlmatch-normalization "scheme://host#@x" &&
- test-tool urlmatch-normalization "scheme://[::1]" &&
- test-tool urlmatch-normalization "scheme://[::1]/" &&
- ! test-tool urlmatch-normalization "scheme://hos%41/" &&
- test-tool urlmatch-normalization "scheme://[invalid....:/" &&
- test-tool urlmatch-normalization "scheme://invalid....:]/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:[/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:["
-'
-
-test_expect_success 'url port checks' '
- test-tool urlmatch-normalization "xyz://q@some.host:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0000000" &&
- test-tool urlmatch-normalization "xyz://q@some.host:0000001?" &&
- test-tool urlmatch-normalization "xyz://q@some.host:065535#" &&
- test-tool urlmatch-normalization "xyz://q@some.host:65535" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:65536" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:99999" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100000" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100001" &&
- test-tool urlmatch-normalization "http://q@some.host:80" &&
- test-tool urlmatch-normalization "https://q@some.host:443" &&
- test-tool urlmatch-normalization "http://q@some.host:80/" &&
- test-tool urlmatch-normalization "https://q@some.host:443?" &&
- ! test-tool urlmatch-normalization "http://q@:8008" &&
- ! test-tool urlmatch-normalization "http://:8080" &&
- ! test-tool urlmatch-normalization "http://:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:000/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0%300/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0x80/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:4294967297/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:030f/"
-'
-
-test_expect_success 'url port normalization' '
- test "$(test-tool urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:80")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:000000443")" = "https://x/"
-'
-
-test_expect_success 'url general escapes' '
- ! test-tool urlmatch-normalization "http://x.y?%fg" &&
- test "$(test-tool urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
- test "$(test-tool urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
- test "$(test-tool urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
- test "$(test-tool urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
- test "$(test-tool urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
-'
-
-test_expect_success !MINGW 'url high-bit escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF"
-'
-
-test_expect_success 'url utf-8 escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
-'
-
-test_expect_success 'url username/password escapes' '
- test "$(test-tool urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
-'
-
-test_expect_success 'url normalized lengths' '
- test "$(test-tool urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
- test "$(test-tool urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
- test "$(test-tool urlmatch-normalization -l "http://@x.y/^")" = 15
-'
-
-test_expect_success 'url . and .. segments' '
- test "$(test-tool urlmatch-normalization -p "x://y/.")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
- ! test-tool urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
-'
-
-# http://@foo specifies an empty user name but does not specify a password
-# http://foo specifies neither a user name nor a password
-# So they should not be equivalent
-test_expect_success 'url equivalents' '
- test-tool urlmatch-normalization "httP://x" "Http://X/" &&
- test-tool urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
- ! test-tool urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
- test-tool urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
-'
-
-test_done
diff --git a/t/unit-tests/t-urlmatch-normalization.c b/t/unit-tests/t-urlmatch-normalization.c
new file mode 100644
index 0000000000..4f225802b0
--- /dev/null
+++ b/t/unit-tests/t-urlmatch-normalization.c
@@ -0,0 +1,284 @@
+#include "test-lib.h"
+#include "urlmatch.h"
+#include "strbuf.h"
+
+static void check_url_normalizable(const char *url, int normalizable)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_int(normalizable, ==, url_norm ? 1 : 0))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void check_normalized_url(const char *url, const char *expect)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_str(url_norm, expect))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void compare_normalized_urls(const char *url1, const char *url2,
+ int equal)
+{
+ char *url1_norm = url_normalize(url1, NULL);
+ char *url2_norm = url_normalize(url2, NULL);
+
+ if (equal) {
+ if (!check_str(url1_norm, url2_norm))
+ test_msg("input url1: %s\n input url2: %s", url1,
+ url2);
+ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
+ test_msg(" url1_norm: %s\n url2_norm: %s\n"
+ " input url1: %s\n input url2: %s",
+ url1_norm, url2_norm, url1, url2);
+ free(url1_norm);
+ free(url2_norm);
+}
+
+static void check_normalized_url_from_file(const char *file, const char *expect)
+{
+ struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
+
+ strbuf_getcwd(&path);
+ strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
+ strbuf_addf(&path, "/unit-tests/t-urlmatch-normalization/%s", file);
+
+ if (!check_int(strbuf_read_file(&content, path.buf, 0), >, 0)) {
+ test_msg("failed to read from file '%s': %s", file, strerror(errno));
+ } else {
+ char *url_norm;
+
+ strbuf_trim_trailing_newline(&content);
+ url_norm = url_normalize(content.buf, NULL);
+ if (!check_str(url_norm, expect))
+ test_msg("input file: %s", file);
+ free(url_norm);
+ }
+
+ strbuf_release(&content);
+ strbuf_release(&path);
+}
+
+static void check_normalized_url_length(const char *url, size_t len)
+{
+ struct url_info info;
+ char *url_norm = url_normalize(url, &info);
+
+ if (!check_int(info.url_len, ==, len))
+ test_msg(" input url: %s\n normalized url: %s", url,
+ url_norm);
+ free(url_norm);
+}
+
+/* Note that only file: URLs should be allowed without a host */
+static void t_url_scheme(void)
+{
+ check_url_normalizable("", 0);
+ check_url_normalizable("_", 0);
+ check_url_normalizable("scheme", 0);
+ check_url_normalizable("scheme:", 0);
+ check_url_normalizable("scheme:/", 0);
+ check_url_normalizable("scheme://", 0);
+ check_url_normalizable("file", 0);
+ check_url_normalizable("file:", 0);
+ check_url_normalizable("file:/", 0);
+ check_url_normalizable("file://", 1);
+ check_url_normalizable("://acme.co", 0);
+ check_url_normalizable("x_test://acme.co", 0);
+ check_url_normalizable("-test://acme.co", 0);
+ check_url_normalizable("0test://acme.co", 0);
+ check_url_normalizable("+test://acme.co", 0);
+ check_url_normalizable(".test://acme.co", 0);
+ check_url_normalizable("schem%6e://", 0);
+ check_url_normalizable("x-Test+v1.0://acme.co", 1);
+ check_normalized_url("AbCdeF://x.Y", "abcdef://x.y/");
+}
+
+static void t_url_authority(void)
+{
+ check_url_normalizable("scheme://user:pass@", 0);
+ check_url_normalizable("scheme://?", 0);
+ check_url_normalizable("scheme://#", 0);
+ check_url_normalizable("scheme:///", 0);
+ check_url_normalizable("scheme://:", 0);
+ check_url_normalizable("scheme://:555", 0);
+ check_url_normalizable("file://user:pass@", 1);
+ check_url_normalizable("file://?", 1);
+ check_url_normalizable("file://#", 1);
+ check_url_normalizable("file:///", 1);
+ check_url_normalizable("file://:", 1);
+ check_url_normalizable("file://:555", 0);
+ check_url_normalizable("scheme://user:pass@host", 1);
+ check_url_normalizable("scheme://@host", 1);
+ check_url_normalizable("scheme://%00@host", 1);
+ check_url_normalizable("scheme://%%@host", 0);
+ check_url_normalizable("scheme://host_", 1);
+ check_url_normalizable("scheme://user:pass@host/", 1);
+ check_url_normalizable("scheme://@host/", 1);
+ check_url_normalizable("scheme://host/", 1);
+ check_url_normalizable("scheme://host?x", 1);
+ check_url_normalizable("scheme://host#x", 1);
+ check_url_normalizable("scheme://host/@", 1);
+ check_url_normalizable("scheme://host?@x", 1);
+ check_url_normalizable("scheme://host#@x", 1);
+ check_url_normalizable("scheme://[::1]", 1);
+ check_url_normalizable("scheme://[::1]/", 1);
+ check_url_normalizable("scheme://hos%41/", 0);
+ check_url_normalizable("scheme://[invalid....:/", 1);
+ check_url_normalizable("scheme://invalid....:]/", 1);
+ check_url_normalizable("scheme://invalid....:[/", 0);
+ check_url_normalizable("scheme://invalid....:[", 0);
+}
+
+static void t_url_port(void)
+{
+ check_url_normalizable("xyz://q@some.host:", 1);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://q@some.host:0", 0);
+ check_url_normalizable("xyz://q@some.host:0000000", 0);
+ check_url_normalizable("xyz://q@some.host:0000001?", 1);
+ check_url_normalizable("xyz://q@some.host:065535#", 1);
+ check_url_normalizable("xyz://q@some.host:65535", 1);
+ check_url_normalizable("xyz://q@some.host:65536", 0);
+ check_url_normalizable("xyz://q@some.host:99999", 0);
+ check_url_normalizable("xyz://q@some.host:100000", 0);
+ check_url_normalizable("xyz://q@some.host:100001", 0);
+ check_url_normalizable("http://q@some.host:80", 1);
+ check_url_normalizable("https://q@some.host:443", 1);
+ check_url_normalizable("http://q@some.host:80/", 1);
+ check_url_normalizable("https://q@some.host:443?", 1);
+ check_url_normalizable("http://q@:8008", 0);
+ check_url_normalizable("http://:8080", 0);
+ check_url_normalizable("http://:", 0);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://[::1]:456/", 1);
+ check_url_normalizable("xyz://[::1]:/", 1);
+ check_url_normalizable("xyz://[::1]:000/", 0);
+ check_url_normalizable("xyz://[::1]:0%300/", 0);
+ check_url_normalizable("xyz://[::1]:0x80/", 0);
+ check_url_normalizable("xyz://[::1]:4294967297/", 0);
+ check_url_normalizable("xyz://[::1]:030f/", 0);
+}
+
+static void t_url_port_normalization(void)
+{
+ check_normalized_url("http://x:800", "http://x:800/");
+ check_normalized_url("http://x:0800", "http://x:800/");
+ check_normalized_url("http://x:00000800", "http://x:800/");
+ check_normalized_url("http://x:065535", "http://x:65535/");
+ check_normalized_url("http://x:1", "http://x:1/");
+ check_normalized_url("http://x:80", "http://x/");
+ check_normalized_url("http://x:080", "http://x/");
+ check_normalized_url("http://x:000000080", "http://x/");
+ check_normalized_url("https://x:443", "https://x/");
+ check_normalized_url("https://x:0443", "https://x/");
+ check_normalized_url("https://x:000000443", "https://x/");
+}
+
+static void t_url_general_escape(void)
+{
+ check_url_normalizable("http://x.y?%fg", 0);
+ check_normalized_url("X://W/%7e%41^%3a", "x://w/~A%5E%3A");
+ check_normalized_url("X://W/:/?#[]@", "x://w/:/?#[]@");
+ check_normalized_url("X://W/$&()*+,;=", "x://w/$&()*+,;=");
+ check_normalized_url("X://W/'", "x://w/'");
+ check_normalized_url("X://W?!", "x://w/?!");
+}
+
+static void t_url_high_bit(void)
+{
+ check_normalized_url_from_file("url-1",
+ "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
+ check_normalized_url_from_file("url-2",
+ "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
+ check_normalized_url_from_file("url-3",
+ "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
+ check_normalized_url_from_file("url-4",
+ "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
+ check_normalized_url_from_file("url-5",
+ "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
+ check_normalized_url_from_file("url-6",
+ "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
+ check_normalized_url_from_file("url-7",
+ "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
+ check_normalized_url_from_file("url-8",
+ "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
+ check_normalized_url_from_file("url-9",
+ "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
+ check_normalized_url_from_file("url-10",
+ "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
+}
+
+static void t_url_utf8_escape(void)
+{
+ check_normalized_url_from_file("url-11",
+ "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
+}
+
+static void t_url_username_pass(void)
+{
+ check_normalized_url("x://%41%62(^):%70+d@foo", "x://Ab(%5E):p+d@foo/");
+}
+
+static void t_url_length(void)
+{
+ check_normalized_url_length("Http://%4d%65:%4d^%70@The.Host", 25);
+ check_normalized_url_length("http://%41:%42@x.y/%61/", 17);
+ check_normalized_url_length("http://@x.y/^", 15);
+}
+
+static void t_url_dots(void)
+{
+ check_normalized_url("x://y/.", "x://y/");
+ check_normalized_url("x://y/./", "x://y/");
+ check_normalized_url("x://y/a/.", "x://y/a");
+ check_normalized_url("x://y/a/./", "x://y/a/");
+ check_normalized_url("x://y/.?", "x://y/?");
+ check_normalized_url("x://y/./?", "x://y/?");
+ check_normalized_url("x://y/a/.?", "x://y/a?");
+ check_normalized_url("x://y/a/./?", "x://y/a/?");
+ check_normalized_url("x://y/a/./b/.././../c", "x://y/c");
+ check_normalized_url("x://y/a/./b/../.././c/", "x://y/c/");
+ check_normalized_url("x://y/a/./b/.././../c/././.././.", "x://y/");
+ check_url_normalizable("x://y/a/./b/.././../c/././.././..", 0);
+ check_normalized_url("x://y/a/./?/././..", "x://y/a/?/././..");
+ check_normalized_url("x://y/%2e/", "x://y/");
+ check_normalized_url("x://y/%2E/", "x://y/");
+ check_normalized_url("x://y/a/%2e./", "x://y/");
+ check_normalized_url("x://y/b/.%2E/", "x://y/");
+ check_normalized_url("x://y/c/%2e%2E/", "x://y/");
+}
+
+/*
+ * http://@foo specifies an empty user name but does not specify a password
+ * http://foo specifies neither a user name nor a password
+ * So they should not be equivalent
+ */
+static void t_url_equivalents(void)
+{
+ compare_normalized_urls("httP://x", "Http://X/", 1);
+ compare_normalized_urls("Http://%4d%65:%4d^%70@The.Host", "hTTP://Me:%4D^p@the.HOST:80/", 1);
+ compare_normalized_urls("https://@x.y/^", "httpS://x.y:443/^", 0);
+ compare_normalized_urls("https://@x.y/^", "httpS://@x.y:0443/^", 1);
+ compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1);
+ compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1);
+}
+
+int cmd_main(int argc UNUSED, const char **argv UNUSED)
+{
+ TEST(t_url_scheme(), "url scheme");
+ TEST(t_url_authority(), "url authority");
+ TEST(t_url_port(), "url port checks");
+ TEST(t_url_port_normalization(), "url port normalization");
+ TEST(t_url_general_escape(), "url general escapes");
+ TEST(t_url_high_bit(), "url high-bit escapes");
+ TEST(t_url_utf8_escape(), "url utf-8 escapes");
+ TEST(t_url_username_pass(), "url username/password escapes");
+ TEST(t_url_length(), "url normalized lengths");
+ TEST(t_url_dots(), "url . and .. segments");
+ TEST(t_url_equivalents(), "url equivalents");
+ return test_done();
+}
diff --git a/t/t0110/README b/t/unit-tests/t-urlmatch-normalization/README
similarity index 100%
rename from t/t0110/README
rename to t/unit-tests/t-urlmatch-normalization/README
diff --git a/t/t0110/url-1 b/t/unit-tests/t-urlmatch-normalization/url-1
similarity index 100%
rename from t/t0110/url-1
rename to t/unit-tests/t-urlmatch-normalization/url-1
diff --git a/t/t0110/url-10 b/t/unit-tests/t-urlmatch-normalization/url-10
similarity index 100%
rename from t/t0110/url-10
rename to t/unit-tests/t-urlmatch-normalization/url-10
diff --git a/t/t0110/url-11 b/t/unit-tests/t-urlmatch-normalization/url-11
similarity index 100%
rename from t/t0110/url-11
rename to t/unit-tests/t-urlmatch-normalization/url-11
diff --git a/t/t0110/url-2 b/t/unit-tests/t-urlmatch-normalization/url-2
similarity index 100%
rename from t/t0110/url-2
rename to t/unit-tests/t-urlmatch-normalization/url-2
diff --git a/t/t0110/url-3 b/t/unit-tests/t-urlmatch-normalization/url-3
similarity index 100%
rename from t/t0110/url-3
rename to t/unit-tests/t-urlmatch-normalization/url-3
diff --git a/t/t0110/url-4 b/t/unit-tests/t-urlmatch-normalization/url-4
similarity index 100%
rename from t/t0110/url-4
rename to t/unit-tests/t-urlmatch-normalization/url-4
diff --git a/t/t0110/url-5 b/t/unit-tests/t-urlmatch-normalization/url-5
similarity index 100%
rename from t/t0110/url-5
rename to t/unit-tests/t-urlmatch-normalization/url-5
diff --git a/t/t0110/url-6 b/t/unit-tests/t-urlmatch-normalization/url-6
similarity index 100%
rename from t/t0110/url-6
rename to t/unit-tests/t-urlmatch-normalization/url-6
diff --git a/t/t0110/url-7 b/t/unit-tests/t-urlmatch-normalization/url-7
similarity index 100%
rename from t/t0110/url-7
rename to t/unit-tests/t-urlmatch-normalization/url-7
diff --git a/t/t0110/url-8 b/t/unit-tests/t-urlmatch-normalization/url-8
similarity index 100%
rename from t/t0110/url-8
rename to t/unit-tests/t-urlmatch-normalization/url-8
diff --git a/t/t0110/url-9 b/t/unit-tests/t-urlmatch-normalization/url-9
similarity index 100%
rename from t/t0110/url-9
rename to t/unit-tests/t-urlmatch-normalization/url-9
--
2.45.2
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-06-28 12:56 [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests Ghanshyam Thakkar
@ 2024-07-09 0:42 ` Ghanshyam Thakkar
2024-07-22 12:53 ` Karthik Nayak
` (2 subsequent siblings)
3 siblings, 0 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-07-09 0:42 UTC
To: Ghanshyam Thakkar, git
Cc: Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam, Junio C Hamano
Ghanshyam Thakkar <shyamthakkar001@gmail.com> wrote:
> helper/test-urlmatch-normalization along with
> t0110-urlmatch-normalization test the `url_normalize()` function from
> 'urlmatch.h'. Migrate them to the unit testing framework for better
> performance. And also add different test_msg()s for better debugging.
>
> In the migration, last two of the checks from `t_url_general_escape()`
> were slightly changed compared to the shellscript. This involves
> changing
>
> '\'' -> '
> '\!' -> !
>
> in the urls of those checks. This is because in C strings, we don't
> need to escape "'" and "!". Other than these two, all the urls were
> pasted verbatim from the shellscript.
>
> Another change is the removal of MINGW prerequisite from one of the
> test. It was there because[1] on Windows, the command line is a Unicode
> string, it is not possible to pass arbitrary bytes to a program. But
> in unit tests we don't have this limitation.
>
> [1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
> ---
Friendly reminder for reviews/acks. :)
Thanks.
> Makefile | 2 +-
> t/helper/test-tool.c | 1 -
> t/helper/test-tool.h | 1 -
> t/helper/test-urlmatch-normalization.c | 56 ----
> t/t0110-urlmatch-normalization.sh | 182 -----------
> t/unit-tests/t-urlmatch-normalization.c | 284 ++++++++++++++++++
> .../t-urlmatch-normalization}/README | 0
> .../t-urlmatch-normalization}/url-1 | Bin
> .../t-urlmatch-normalization}/url-10 | Bin
> .../t-urlmatch-normalization}/url-11 | Bin
> .../t-urlmatch-normalization}/url-2 | Bin
> .../t-urlmatch-normalization}/url-3 | Bin
> .../t-urlmatch-normalization}/url-4 | Bin
> .../t-urlmatch-normalization}/url-5 | Bin
> .../t-urlmatch-normalization}/url-6 | Bin
> .../t-urlmatch-normalization}/url-7 | Bin
> .../t-urlmatch-normalization}/url-8 | Bin
> .../t-urlmatch-normalization}/url-9 | Bin
> 18 files changed, 285 insertions(+), 241 deletions(-)
> delete mode 100644 t/helper/test-urlmatch-normalization.c
> delete mode 100755 t/t0110-urlmatch-normalization.sh
> create mode 100644 t/unit-tests/t-urlmatch-normalization.c
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/README (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-1 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-10 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-11 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-2 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-3 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-4 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-5 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-6 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-7 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-8 (100%)
> rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-9 (100%)
>
> diff --git a/Makefile b/Makefile
> index 83bd9d13af..0fc0ee8c3e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -844,7 +844,6 @@ TEST_BUILTINS_OBJS += test-submodule.o
> TEST_BUILTINS_OBJS += test-subprocess.o
> TEST_BUILTINS_OBJS += test-trace2.o
> TEST_BUILTINS_OBJS += test-truncate.o
> -TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
> TEST_BUILTINS_OBJS += test-userdiff.o
> TEST_BUILTINS_OBJS += test-wildmatch.o
> TEST_BUILTINS_OBJS += test-windows-named-pipe.o
> @@ -1343,6 +1342,7 @@ UNIT_TEST_PROGRAMS += t-strbuf
> UNIT_TEST_PROGRAMS += t-strcmp-offset
> UNIT_TEST_PROGRAMS += t-strvec
> UNIT_TEST_PROGRAMS += t-trailer
> +UNIT_TEST_PROGRAMS += t-urlmatch-normalization
> UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
> UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
> UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index 93436a82ae..feed419cdd 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -84,7 +84,6 @@ static struct test_cmd cmds[] = {
> { "trace2", cmd__trace2 },
> { "truncate", cmd__truncate },
> { "userdiff", cmd__userdiff },
> - { "urlmatch-normalization", cmd__urlmatch_normalization },
> { "xml-encode", cmd__xml_encode },
> { "wildmatch", cmd__wildmatch },
> #ifdef GIT_WINDOWS_NATIVE
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index d9033d14e1..0c80529604 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -77,7 +77,6 @@ int cmd__subprocess(int argc, const char **argv);
> int cmd__trace2(int argc, const char **argv);
> int cmd__truncate(int argc, const char **argv);
> int cmd__userdiff(int argc, const char **argv);
> -int cmd__urlmatch_normalization(int argc, const char **argv);
> int cmd__xml_encode(int argc, const char **argv);
> int cmd__wildmatch(int argc, const char **argv);
> #ifdef GIT_WINDOWS_NATIVE
> diff --git a/t/helper/test-urlmatch-normalization.c b/t/helper/test-urlmatch-normalization.c
> deleted file mode 100644
> index 86edd454f5..0000000000
> --- a/t/helper/test-urlmatch-normalization.c
> +++ /dev/null
> @@ -1,56 +0,0 @@
> -#include "test-tool.h"
> -#include "git-compat-util.h"
> -#include "urlmatch.h"
> -
> -int cmd__urlmatch_normalization(int argc, const char **argv)
> -{
> - const char usage[] = "test-tool urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
> - char *url1 = NULL, *url2 = NULL;
> - int opt_p = 0, opt_l = 0;
> - int ret = 0;
> -
> - /*
> - * For one url, succeed if url_normalize succeeds on it, fail otherwise.
> - * For two urls, succeed only if url_normalize succeeds on both and
> - * the results compare equal with strcmp. If -p is given (one url only)
> - * and url_normalize succeeds, print the result followed by "\n". If
> - * -l is given (one url only) and url_normalize succeeds, print the
> - * returned length in decimal followed by "\n".
> - */
> -
> - if (argc > 1 && !strcmp(argv[1], "-p")) {
> - opt_p = 1;
> - argc--;
> - argv++;
> - } else if (argc > 1 && !strcmp(argv[1], "-l")) {
> - opt_l = 1;
> - argc--;
> - argv++;
> - }
> -
> - if (argc < 2 || argc > 3)
> - die("%s", usage);
> -
> - if (argc == 2) {
> - struct url_info info;
> - url1 = url_normalize(argv[1], &info);
> - if (!url1)
> - return 1;
> - if (opt_p)
> - printf("%s\n", url1);
> - if (opt_l)
> - printf("%u\n", (unsigned)info.url_len);
> - goto cleanup;
> - }
> -
> - if (opt_p || opt_l)
> - die("%s", usage);
> -
> - url1 = url_normalize(argv[1], NULL);
> - url2 = url_normalize(argv[2], NULL);
> - ret = (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
> -cleanup:
> - free(url1);
> - free(url2);
> - return ret;
> -}
> diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
> deleted file mode 100755
> index 12d817fbd3..0000000000
> --- a/t/t0110-urlmatch-normalization.sh
> +++ /dev/null
> @@ -1,182 +0,0 @@
> -#!/bin/sh
> -
> -test_description='urlmatch URL normalization'
> -
> -TEST_PASSES_SANITIZE_LEAK=true
> -. ./test-lib.sh
> -
> -# The base name of the test url files
> -tu="$TEST_DIRECTORY/t0110/url"
> -
> -# Note that only file: URLs should be allowed without a host
> -
> -test_expect_success 'url scheme' '
> - ! test-tool urlmatch-normalization "" &&
> - ! test-tool urlmatch-normalization "_" &&
> - ! test-tool urlmatch-normalization "scheme" &&
> - ! test-tool urlmatch-normalization "scheme:" &&
> - ! test-tool urlmatch-normalization "scheme:/" &&
> - ! test-tool urlmatch-normalization "scheme://" &&
> - ! test-tool urlmatch-normalization "file" &&
> - ! test-tool urlmatch-normalization "file:" &&
> - ! test-tool urlmatch-normalization "file:/" &&
> - test-tool urlmatch-normalization "file://" &&
> - ! test-tool urlmatch-normalization "://acme.co" &&
> - ! test-tool urlmatch-normalization "x_test://acme.co" &&
> - ! test-tool urlmatch-normalization "-test://acme.co" &&
> - ! test-tool urlmatch-normalization "0test://acme.co" &&
> - ! test-tool urlmatch-normalization "+test://acme.co" &&
> - ! test-tool urlmatch-normalization ".test://acme.co" &&
> - ! test-tool urlmatch-normalization "schem%6e://" &&
> - test-tool urlmatch-normalization "x-Test+v1.0://acme.co" &&
> - test "$(test-tool urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
> -'
> -
> -test_expect_success 'url authority' '
> - ! test-tool urlmatch-normalization "scheme://user:pass@" &&
> - ! test-tool urlmatch-normalization "scheme://?" &&
> - ! test-tool urlmatch-normalization "scheme://#" &&
> - ! test-tool urlmatch-normalization "scheme:///" &&
> - ! test-tool urlmatch-normalization "scheme://:" &&
> - ! test-tool urlmatch-normalization "scheme://:555" &&
> - test-tool urlmatch-normalization "file://user:pass@" &&
> - test-tool urlmatch-normalization "file://?" &&
> - test-tool urlmatch-normalization "file://#" &&
> - test-tool urlmatch-normalization "file:///" &&
> - test-tool urlmatch-normalization "file://:" &&
> - ! test-tool urlmatch-normalization "file://:555" &&
> - test-tool urlmatch-normalization "scheme://user:pass@host" &&
> - test-tool urlmatch-normalization "scheme://@host" &&
> - test-tool urlmatch-normalization "scheme://%00@host" &&
> - ! test-tool urlmatch-normalization "scheme://%%@host" &&
> - test-tool urlmatch-normalization "scheme://host_" &&
> - test-tool urlmatch-normalization "scheme://user:pass@host/" &&
> - test-tool urlmatch-normalization "scheme://@host/" &&
> - test-tool urlmatch-normalization "scheme://host/" &&
> - test-tool urlmatch-normalization "scheme://host?x" &&
> - test-tool urlmatch-normalization "scheme://host#x" &&
> - test-tool urlmatch-normalization "scheme://host/@" &&
> - test-tool urlmatch-normalization "scheme://host?@x" &&
> - test-tool urlmatch-normalization "scheme://host#@x" &&
> - test-tool urlmatch-normalization "scheme://[::1]" &&
> - test-tool urlmatch-normalization "scheme://[::1]/" &&
> - ! test-tool urlmatch-normalization "scheme://hos%41/" &&
> - test-tool urlmatch-normalization "scheme://[invalid....:/" &&
> - test-tool urlmatch-normalization "scheme://invalid....:]/" &&
> - ! test-tool urlmatch-normalization "scheme://invalid....:[/" &&
> - ! test-tool urlmatch-normalization "scheme://invalid....:["
> -'
> -
> -test_expect_success 'url port checks' '
> - test-tool urlmatch-normalization "xyz://q@some.host:" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:0" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:0000000" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:0000001?" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:065535#" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:65535" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:65536" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:99999" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:100000" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:100001" &&
> - test-tool urlmatch-normalization "http://q@some.host:80" &&
> - test-tool urlmatch-normalization "https://q@some.host:443" &&
> - test-tool urlmatch-normalization "http://q@some.host:80/" &&
> - test-tool urlmatch-normalization "https://q@some.host:443?" &&
> - ! test-tool urlmatch-normalization "http://q@:8008" &&
> - ! test-tool urlmatch-normalization "http://:8080" &&
> - ! test-tool urlmatch-normalization "http://:" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
> - test-tool urlmatch-normalization "xyz://[::1]:456/" &&
> - test-tool urlmatch-normalization "xyz://[::1]:/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:000/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:0%300/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:0x80/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:4294967297/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:030f/"
> -'
> -
> -test_expect_success 'url port normalization' '
> - test "$(test-tool urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:80")" = "http://x/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:080")" = "http://x/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
> - test "$(test-tool urlmatch-normalization -p "https://x:443")" = "https://x/" &&
> - test "$(test-tool urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
> - test "$(test-tool urlmatch-normalization -p "https://x:000000443")" = "https://x/"
> -'
> -
> -test_expect_success 'url general escapes' '
> - ! test-tool urlmatch-normalization "http://x.y?%fg" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
> - test "$(test-tool urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
> -'
> -
> -test_expect_success !MINGW 'url high-bit escapes' '
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF"
> -'
> -
> -test_expect_success 'url utf-8 escapes' '
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
> -'
> -
> -test_expect_success 'url username/password escapes' '
> - test "$(test-tool urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
> -'
> -
> -test_expect_success 'url normalized lengths' '
> - test "$(test-tool urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
> - test "$(test-tool urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
> - test "$(test-tool urlmatch-normalization -l "http://@x.y/^")" = 15
> -'
> -
> -test_expect_success 'url . and .. segments' '
> - test "$(test-tool urlmatch-normalization -p "x://y/.")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/./")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
> - ! test-tool urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
> - test "$(test-tool urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
> -'
> -
> -# http://@foo specifies an empty user name but does not specify a password
> -# http://foo specifies neither a user name nor a password
> -# So they should not be equivalent
> -test_expect_success 'url equivalents' '
> - test-tool urlmatch-normalization "httP://x" "Http://X/" &&
> - test-tool urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
> - ! test-tool urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
> - test-tool urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
> - test-tool urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
> - test-tool urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
> -'
> -
> -test_done
> diff --git a/t/unit-tests/t-urlmatch-normalization.c b/t/unit-tests/t-urlmatch-normalization.c
> new file mode 100644
> index 0000000000..4f225802b0
> --- /dev/null
> +++ b/t/unit-tests/t-urlmatch-normalization.c
> @@ -0,0 +1,284 @@
> +#include "test-lib.h"
> +#include "urlmatch.h"
> +#include "strbuf.h"
> +
> +static void check_url_normalizable(const char *url, int normalizable)
> +{
> + char *url_norm = url_normalize(url, NULL);
> +
> + if (!check_int(normalizable, ==, url_norm ? 1 : 0))
> + test_msg("input url: %s", url);
> + free(url_norm);
> +}
> +
> +static void check_normalized_url(const char *url, const char *expect)
> +{
> + char *url_norm = url_normalize(url, NULL);
> +
> + if (!check_str(url_norm, expect))
> + test_msg("input url: %s", url);
> + free(url_norm);
> +}
> +
> +static void compare_normalized_urls(const char *url1, const char *url2,
> + size_t equal)
> +{
> + char *url1_norm = url_normalize(url1, NULL);
> + char *url2_norm = url_normalize(url2, NULL);
> +
> + if (equal) {
> + if (!check_str(url1_norm, url2_norm))
> + test_msg("input url1: %s\n input url2: %s", url1,
> + url2);
> + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
> + test_msg(" url1_norm: %s\n url2_norm: %s\n"
> + " input url1: %s\n input url2: %s",
> + url1_norm, url2_norm, url1, url2);
> + free(url1_norm);
> + free(url2_norm);
> +}
> +
> +static void check_normalized_url_from_file(const char *file, const char *expect)
> +{
> + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> +
> + strbuf_getcwd(&path);
> + strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
> + strbuf_addf(&path, "/unit-tests/t-urlmatch-normalization/%s", file);
> +
> + if (!check_int(strbuf_read_file(&content, path.buf, 0), >, 0)) {
> + test_msg("failed to read from file '%s': %s", file, strerror(errno));
> + } else {
> + char *url_norm;
> +
> + strbuf_trim_trailing_newline(&content);
> + url_norm = url_normalize(content.buf, NULL);
> + if (!check_str(url_norm, expect))
> + test_msg("input file: %s", file);
> + free(url_norm);
> + }
> +
> + strbuf_release(&content);
> + strbuf_release(&path);
> +}
> +
> +static void check_normalized_url_length(const char *url, size_t len)
> +{
> + struct url_info info;
> + char *url_norm = url_normalize(url, &info);
> +
> + if (!check_int(info.url_len, ==, len))
> + test_msg(" input url: %s\n normalized url: %s", url,
> + url_norm);
> + free(url_norm);
> +}
> +
> +/* Note that only file: URLs should be allowed without a host */
> +static void t_url_scheme(void)
> +{
> + check_url_normalizable("", 0);
> + check_url_normalizable("_", 0);
> + check_url_normalizable("scheme", 0);
> + check_url_normalizable("scheme:", 0);
> + check_url_normalizable("scheme:/", 0);
> + check_url_normalizable("scheme://", 0);
> + check_url_normalizable("file", 0);
> + check_url_normalizable("file:", 0);
> + check_url_normalizable("file:/", 0);
> + check_url_normalizable("file://", 1);
> + check_url_normalizable("://acme.co", 0);
> + check_url_normalizable("x_test://acme.co", 0);
> + check_url_normalizable("-test://acme.co", 0);
> + check_url_normalizable("0test://acme.co", 0);
> + check_url_normalizable("+test://acme.co", 0);
> + check_url_normalizable(".test://acme.co", 0);
> + check_url_normalizable("schem%6e://", 0);
> + check_url_normalizable("x-Test+v1.0://acme.co", 1);
> + check_normalized_url("AbCdeF://x.Y", "abcdef://x.y/");
> +}
> +
> +static void t_url_authority(void)
> +{
> + check_url_normalizable("scheme://user:pass@", 0);
> + check_url_normalizable("scheme://?", 0);
> + check_url_normalizable("scheme://#", 0);
> + check_url_normalizable("scheme:///", 0);
> + check_url_normalizable("scheme://:", 0);
> + check_url_normalizable("scheme://:555", 0);
> + check_url_normalizable("file://user:pass@", 1);
> + check_url_normalizable("file://?", 1);
> + check_url_normalizable("file://#", 1);
> + check_url_normalizable("file:///", 1);
> + check_url_normalizable("file://:", 1);
> + check_url_normalizable("file://:555", 0);
> + check_url_normalizable("scheme://user:pass@host", 1);
> + check_url_normalizable("scheme://@host", 1);
> + check_url_normalizable("scheme://%00@host", 1);
> + check_url_normalizable("scheme://%%@host", 0);
> + check_url_normalizable("scheme://host_", 1);
> + check_url_normalizable("scheme://user:pass@host/", 1);
> + check_url_normalizable("scheme://@host/", 1);
> + check_url_normalizable("scheme://host/", 1);
> + check_url_normalizable("scheme://host?x", 1);
> + check_url_normalizable("scheme://host#x", 1);
> + check_url_normalizable("scheme://host/@", 1);
> + check_url_normalizable("scheme://host?@x", 1);
> + check_url_normalizable("scheme://host#@x", 1);
> + check_url_normalizable("scheme://[::1]", 1);
> + check_url_normalizable("scheme://[::1]/", 1);
> + check_url_normalizable("scheme://hos%41/", 0);
> + check_url_normalizable("scheme://[invalid....:/", 1);
> + check_url_normalizable("scheme://invalid....:]/", 1);
> + check_url_normalizable("scheme://invalid....:[/", 0);
> + check_url_normalizable("scheme://invalid....:[", 0);
> +}
> +
> +static void t_url_port(void)
> +{
> + check_url_normalizable("xyz://q@some.host:", 1);
> + check_url_normalizable("xyz://q@some.host:456/", 1);
> + check_url_normalizable("xyz://q@some.host:0", 0);
> + check_url_normalizable("xyz://q@some.host:0000000", 0);
> + check_url_normalizable("xyz://q@some.host:0000001?", 1);
> + check_url_normalizable("xyz://q@some.host:065535#", 1);
> + check_url_normalizable("xyz://q@some.host:65535", 1);
> + check_url_normalizable("xyz://q@some.host:65536", 0);
> + check_url_normalizable("xyz://q@some.host:99999", 0);
> + check_url_normalizable("xyz://q@some.host:100000", 0);
> + check_url_normalizable("xyz://q@some.host:100001", 0);
> + check_url_normalizable("http://q@some.host:80", 1);
> + check_url_normalizable("https://q@some.host:443", 1);
> + check_url_normalizable("http://q@some.host:80/", 1);
> + check_url_normalizable("https://q@some.host:443?", 1);
> + check_url_normalizable("http://q@:8008", 0);
> + check_url_normalizable("http://:8080", 0);
> + check_url_normalizable("http://:", 0);
> + check_url_normalizable("xyz://q@some.host:456/", 1);
> + check_url_normalizable("xyz://[::1]:456/", 1);
> + check_url_normalizable("xyz://[::1]:/", 1);
> + check_url_normalizable("xyz://[::1]:000/", 0);
> + check_url_normalizable("xyz://[::1]:0%300/", 0);
> + check_url_normalizable("xyz://[::1]:0x80/", 0);
> + check_url_normalizable("xyz://[::1]:4294967297/", 0);
> + check_url_normalizable("xyz://[::1]:030f/", 0);
> +}
> +
> +static void t_url_port_normalization(void)
> +{
> + check_normalized_url("http://x:800", "http://x:800/");
> + check_normalized_url("http://x:0800", "http://x:800/");
> + check_normalized_url("http://x:00000800", "http://x:800/");
> + check_normalized_url("http://x:065535", "http://x:65535/");
> + check_normalized_url("http://x:1", "http://x:1/");
> + check_normalized_url("http://x:80", "http://x/");
> + check_normalized_url("http://x:080", "http://x/");
> + check_normalized_url("http://x:000000080", "http://x/");
> + check_normalized_url("https://x:443", "https://x/");
> + check_normalized_url("https://x:0443", "https://x/");
> + check_normalized_url("https://x:000000443", "https://x/");
> +}
> +
> +static void t_url_general_escape(void)
> +{
> + check_url_normalizable("http://x.y?%fg", 0);
> + check_normalized_url("X://W/%7e%41^%3a", "x://w/~A%5E%3A");
> + check_normalized_url("X://W/:/?#[]@", "x://w/:/?#[]@");
> + check_normalized_url("X://W/$&()*+,;=", "x://w/$&()*+,;=");
> + check_normalized_url("X://W/'", "x://w/'");
> + check_normalized_url("X://W?!", "x://w/?!");
> +}
> +
> +static void t_url_high_bit(void)
> +{
> + check_normalized_url_from_file("url-1",
> + "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
> + check_normalized_url_from_file("url-2",
> + "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
> + check_normalized_url_from_file("url-3",
> + "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
> + check_normalized_url_from_file("url-4",
> + "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
> + check_normalized_url_from_file("url-5",
> + "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
> + check_normalized_url_from_file("url-6",
> + "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
> + check_normalized_url_from_file("url-7",
> + "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
> + check_normalized_url_from_file("url-8",
> + "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
> + check_normalized_url_from_file("url-9",
> + "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
> + check_normalized_url_from_file("url-10",
> + "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
> +}
> +
> +static void t_url_utf8_escape(void)
> +{
> + check_normalized_url_from_file("url-11",
> + "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
> +}
> +
> +static void t_url_username_pass(void)
> +{
> + check_normalized_url("x://%41%62(^):%70+d@foo", "x://Ab(%5E):p+d@foo/");
> +}
> +
> +static void t_url_length(void)
> +{
> + check_normalized_url_length("Http://%4d%65:%4d^%70@The.Host", 25);
> + check_normalized_url_length("http://%41:%42@x.y/%61/", 17);
> + check_normalized_url_length("http://@x.y/^", 15);
> +}
> +
> +static void t_url_dots(void)
> +{
> + check_normalized_url("x://y/.", "x://y/");
> + check_normalized_url("x://y/./", "x://y/");
> + check_normalized_url("x://y/a/.", "x://y/a");
> + check_normalized_url("x://y/a/./", "x://y/a/");
> + check_normalized_url("x://y/.?", "x://y/?");
> + check_normalized_url("x://y/./?", "x://y/?");
> + check_normalized_url("x://y/a/.?", "x://y/a?");
> + check_normalized_url("x://y/a/./?", "x://y/a/?");
> + check_normalized_url("x://y/a/./b/.././../c", "x://y/c");
> + check_normalized_url("x://y/a/./b/../.././c/", "x://y/c/");
> + check_normalized_url("x://y/a/./b/.././../c/././.././.", "x://y/");
> + check_url_normalizable("x://y/a/./b/.././../c/././.././..", 0);
> + check_normalized_url("x://y/a/./?/././..", "x://y/a/?/././..");
> + check_normalized_url("x://y/%2e/", "x://y/");
> + check_normalized_url("x://y/%2E/", "x://y/");
> + check_normalized_url("x://y/a/%2e./", "x://y/");
> + check_normalized_url("x://y/b/.%2E/", "x://y/");
> + check_normalized_url("x://y/c/%2e%2E/", "x://y/");
> +}
> +
> +/*
> + * http://@foo specifies an empty user name but does not specify a password
> + * http://foo specifies neither a user name nor a password
> + * So they should not be equivalent
> + */
> +static void t_url_equivalents(void)
> +{
> + compare_normalized_urls("httP://x", "Http://X/", 1);
> + compare_normalized_urls("Http://%4d%65:%4d^%70@The.Host", "hTTP://Me:%4D^p@the.HOST:80/", 1);
> + compare_normalized_urls("https://@x.y/^", "httpS://x.y:443/^", 0);
> + compare_normalized_urls("https://@x.y/^", "httpS://@x.y:0443/^", 1);
> + compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1);
> + compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1);
> +}
> +
> +int cmd_main(int argc UNUSED, const char **argv UNUSED)
> +{
> + TEST(t_url_scheme(), "url scheme");
> + TEST(t_url_authority(), "url authority");
> + TEST(t_url_port(), "url port checks");
> + TEST(t_url_port_normalization(), "url port normalization");
> + TEST(t_url_general_escape(), "url general escapes");
> + TEST(t_url_high_bit(), "url high-bit escapes");
> + TEST(t_url_utf8_escape(), "url utf8 escapes");
> + TEST(t_url_username_pass(), "url username/password escapes");
> + TEST(t_url_length(), "url normalized lengths");
> + TEST(t_url_dots(), "url . and .. segments");
> + TEST(t_url_equivalents(), "url equivalents");
> + return test_done();
> +}
> diff --git a/t/t0110/README b/t/unit-tests/t-urlmatch-normalization/README
> similarity index 100%
> rename from t/t0110/README
> rename to t/unit-tests/t-urlmatch-normalization/README
> diff --git a/t/t0110/url-1 b/t/unit-tests/t-urlmatch-normalization/url-1
> similarity index 100%
> rename from t/t0110/url-1
> rename to t/unit-tests/t-urlmatch-normalization/url-1
> diff --git a/t/t0110/url-10 b/t/unit-tests/t-urlmatch-normalization/url-10
> similarity index 100%
> rename from t/t0110/url-10
> rename to t/unit-tests/t-urlmatch-normalization/url-10
> diff --git a/t/t0110/url-11 b/t/unit-tests/t-urlmatch-normalization/url-11
> similarity index 100%
> rename from t/t0110/url-11
> rename to t/unit-tests/t-urlmatch-normalization/url-11
> diff --git a/t/t0110/url-2 b/t/unit-tests/t-urlmatch-normalization/url-2
> similarity index 100%
> rename from t/t0110/url-2
> rename to t/unit-tests/t-urlmatch-normalization/url-2
> diff --git a/t/t0110/url-3 b/t/unit-tests/t-urlmatch-normalization/url-3
> similarity index 100%
> rename from t/t0110/url-3
> rename to t/unit-tests/t-urlmatch-normalization/url-3
> diff --git a/t/t0110/url-4 b/t/unit-tests/t-urlmatch-normalization/url-4
> similarity index 100%
> rename from t/t0110/url-4
> rename to t/unit-tests/t-urlmatch-normalization/url-4
> diff --git a/t/t0110/url-5 b/t/unit-tests/t-urlmatch-normalization/url-5
> similarity index 100%
> rename from t/t0110/url-5
> rename to t/unit-tests/t-urlmatch-normalization/url-5
> diff --git a/t/t0110/url-6 b/t/unit-tests/t-urlmatch-normalization/url-6
> similarity index 100%
> rename from t/t0110/url-6
> rename to t/unit-tests/t-urlmatch-normalization/url-6
> diff --git a/t/t0110/url-7 b/t/unit-tests/t-urlmatch-normalization/url-7
> similarity index 100%
> rename from t/t0110/url-7
> rename to t/unit-tests/t-urlmatch-normalization/url-7
> diff --git a/t/t0110/url-8 b/t/unit-tests/t-urlmatch-normalization/url-8
> similarity index 100%
> rename from t/t0110/url-8
> rename to t/unit-tests/t-urlmatch-normalization/url-8
> diff --git a/t/t0110/url-9 b/t/unit-tests/t-urlmatch-normalization/url-9
> similarity index 100%
> rename from t/t0110/url-9
> rename to t/unit-tests/t-urlmatch-normalization/url-9
> --
> 2.45.2
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-06-28 12:56 [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests Ghanshyam Thakkar
2024-07-09 0:42 ` Ghanshyam Thakkar
@ 2024-07-22 12:53 ` Karthik Nayak
2024-07-22 12:54 ` Ghanshyam Thakkar
2024-07-23 14:00 ` Patrick Steinhardt
2024-08-13 17:24 ` [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework Ghanshyam Thakkar
3 siblings, 1 reply; 23+ messages in thread
From: Karthik Nayak @ 2024-07-22 12:53 UTC (permalink / raw)
To: Ghanshyam Thakkar, git
Cc: Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
[snip]
> +static void compare_normalized_urls(const char *url1, const char *url2,
> + size_t equal)
> +{
> + char *url1_norm = url_normalize(url1, NULL);
> + char *url2_norm = url_normalize(url2, NULL);
> +
> + if (equal) {
> + if (!check_str(url1_norm, url2_norm))
> + test_msg("input url1: %s\n input url2: %s", url1,
> + url2);
check_str() checks and prints the values if they don't match, so here
since the normalized urls will be printed by check_str(), we print the
input urls. Makes sense.
> + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
> + test_msg(" url1_norm: %s\n url2_norm: %s\n"
> + " input url1: %s\n input url2: %s",
> + url1_norm, url2_norm, url1, url2);
Here we use strcmp and hence, it won't print the normalized urls, so we
also print them. This is because we want to make sure they are not
equal.
I don't understand why there is inconsistent spacing in this message
though.
Apart from this small question, the patch looks great!
Thanks
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-07-22 12:53 ` Karthik Nayak
@ 2024-07-22 12:54 ` Ghanshyam Thakkar
2024-07-23 8:26 ` Karthik Nayak
0 siblings, 1 reply; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-07-22 12:54 UTC (permalink / raw)
To: Karthik Nayak, git
Cc: Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
Karthik Nayak <karthik.188@gmail.com> wrote:
> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
>
> [snip]
>
> > +static void compare_normalized_urls(const char *url1, const char *url2,
> > + size_t equal)
> > +{
> > + char *url1_norm = url_normalize(url1, NULL);
> > + char *url2_norm = url_normalize(url2, NULL);
> > +
> > + if (equal) {
> > + if (!check_str(url1_norm, url2_norm))
> > + test_msg("input url1: %s\n input url2: %s", url1,
> > + url2);
>
> check_str() checks and prints the values if they don't match, so here
> since the normalized urls will be printed by check_str(), we print the
> input urls. Makes sense.
>
> > + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
> > + test_msg(" url1_norm: %s\n url2_norm: %s\n"
> > + " input url1: %s\n input url2: %s",
> > + url1_norm, url2_norm, url1, url2);
>
> Here we use strcmp and hence, it won't print the normalized urls, so we
> also print them. This is because we want to make sure they are not
> equal.
>
> I don't understand why there is inconsistent spacing in this message
> though.
That is for alignment purposes, so the ':' matches vertically between
them. I.e.
# url1_norm: https://@x.y/%5E
url2_norm: https://x.y/%5E
input url1: https://@x.y/^
input url2: httpS://x.y:443/^
Thanks.
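
As a rough illustration of the alignment idea (not the code from the
patch), the sketch below right-aligns each label in a fixed-width field
so that every ':' lands in the same column. The helper names
`format_url_report()` and `colon_column()` are hypothetical, and the
field width of 10 is simply the length of the longest label,
"input url1".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: format the four URLs with
 * right-aligned labels so that every ':' separator shares one column.
 * Returns what snprintf() returns.
 */
static int format_url_report(char *buf, size_t size,
			     const char *url1_norm, const char *url2_norm,
			     const char *url1, const char *url2)
{
	/* "%10s" right-aligns each label in a 10-character field. */
	return snprintf(buf, size,
			"%10s: %s\n%10s: %s\n%10s: %s\n%10s: %s",
			"url1_norm", url1_norm, "url2_norm", url2_norm,
			"input url1", url1, "input url2", url2);
}

/* Offset of the first ':' on the line that starts at `line`. */
static size_t colon_column(const char *line)
{
	return strcspn(line, ":\n");
}
```

The patch hard-codes the leading spaces in the format string instead;
a field width gives the same layout without counting spaces by hand.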
> Apart from this small question, the patch looks great!
>
> Thanks
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-07-22 12:54 ` Ghanshyam Thakkar
@ 2024-07-23 8:26 ` Karthik Nayak
0 siblings, 0 replies; 23+ messages in thread
From: Karthik Nayak @ 2024-07-23 8:26 UTC (permalink / raw)
To: Ghanshyam Thakkar, git
Cc: Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
"Ghanshyam Thakkar" <shyamthakkar001@gmail.com> writes:
> Karthik Nayak <karthik.188@gmail.com> wrote:
>> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
>>
>> [snip]
>>
>> > +static void compare_normalized_urls(const char *url1, const char *url2,
>> > + size_t equal)
>> > +{
>> > + char *url1_norm = url_normalize(url1, NULL);
>> > + char *url2_norm = url_normalize(url2, NULL);
>> > +
>> > + if (equal) {
>> > + if (!check_str(url1_norm, url2_norm))
>> > + test_msg("input url1: %s\n input url2: %s", url1,
>> > + url2);
>>
>> check_str() checks and prints the values if they don't match, so here
>> since the normalized urls will be printed by check_str(), we print the
>> input urls. Makes sense.
>>
>> > + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
>> > + test_msg(" url1_norm: %s\n url2_norm: %s\n"
>> > + " input url1: %s\n input url2: %s",
>> > + url1_norm, url2_norm, url1, url2);
>>
>> Here we use strcmp and hence, it won't print the normalized urls, so we
>> also print them. This is because we want to make sure they are not
>> equal.
>>
>> I don't understand why there is inconsistent spacing in this message
>> though.
>
> That is for alignment purposes, so the ':' matches vertically between
> them. I.e.
>
> # url1_norm: https://@x.y/%5E
> url2_norm: https://x.y/%5E
> input url1: https://@x.y/^
> input url2: httpS://x.y:443/^
>
> Thanks.
>
Yeah makes sense, I was expecting it to be left aligned, but this looks
good! Thanks.
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-06-28 12:56 [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests Ghanshyam Thakkar
2024-07-09 0:42 ` Ghanshyam Thakkar
2024-07-22 12:53 ` Karthik Nayak
@ 2024-07-23 14:00 ` Patrick Steinhardt
2024-07-24 0:24 ` Ghanshyam Thakkar
2024-08-13 17:24 ` [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework Ghanshyam Thakkar
3 siblings, 1 reply; 23+ messages in thread
From: Patrick Steinhardt @ 2024-07-23 14:00 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
On Fri, Jun 28, 2024 at 06:26:24PM +0530, Ghanshyam Thakkar wrote:
> +static void compare_normalized_urls(const char *url1, const char *url2,
> + size_t equal)
> +{
> + char *url1_norm = url_normalize(url1, NULL);
> + char *url2_norm = url_normalize(url2, NULL);
> +
> + if (equal) {
> + if (!check_str(url1_norm, url2_norm))
> + test_msg("input url1: %s\n input url2: %s", url1,
> + url2);
> + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
> + test_msg(" url1_norm: %s\n url2_norm: %s\n"
> + " input url1: %s\n input url2: %s",
> + url1_norm, url2_norm, url1, url2);
Nit: this is missing braces around the `else if` branch. If one of the
conditional bodies has braces, then all should have according to our
style guide.
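
To make the brace rule concrete, here is a minimal libc-only sketch. It
is a stand-in for the comparison in the patch, not the patch itself:
`count_unexpected()` is a hypothetical name, and plain strcmp() replaces
check_str()/check_int() so the snippet compiles on its own.

```c
#include <string.h>

/*
 * Hypothetical stand-in for compare_normalized_urls(): returns 1 when
 * the two normalized URLs violate the expectation in `equal`, else 0.
 */
static int count_unexpected(const char *norm1, const char *norm2, int equal)
{
	int failures = 0;

	if (equal) {
		if (strcmp(norm1, norm2))
			failures++;
	} else if (!strcmp(norm1, norm2)) {
		/*
		 * A single statement, but braced anyway because the
		 * sibling branch of the same conditional needs braces.
		 */
		failures++;
	}
	return failures;
}
```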
> + free(url1_norm);
> + free(url2_norm);
> +}
> +
> +static void check_normalized_url_from_file(const char *file, const char *expect)
> +{
> + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> +
> + strbuf_getcwd(&path);
> + strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
Curious: is this a new requirement or do other tests have the same
requirement? I was under the impression that I could execute the
resulting unit test binaries from whatever directory I wanted to, but
didn't verify.
In any case, the line should probably be wrapped as it is overly long.
Other than that this looks good to me. I've given a cursory read to the
testcases themselves and they do look like a faithful conversion to me.
Thanks!
Patrick
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-07-23 14:00 ` Patrick Steinhardt
@ 2024-07-24 0:24 ` Ghanshyam Thakkar
2024-07-24 5:19 ` Patrick Steinhardt
0 siblings, 1 reply; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-07-24 0:24 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
Patrick Steinhardt <ps@pks.im> wrote:
> On Fri, Jun 28, 2024 at 06:26:24PM +0530, Ghanshyam Thakkar wrote:
> > +static void compare_normalized_urls(const char *url1, const char *url2,
> > + size_t equal)
> > +{
> > + char *url1_norm = url_normalize(url1, NULL);
> > + char *url2_norm = url_normalize(url2, NULL);
> > +
> > + if (equal) {
> > + if (!check_str(url1_norm, url2_norm))
> > + test_msg("input url1: %s\n input url2: %s", url1,
> > + url2);
> > + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
> > + test_msg(" url1_norm: %s\n url2_norm: %s\n"
> > + " input url1: %s\n input url2: %s",
> > + url1_norm, url2_norm, url1, url2);
>
> Nit: this is missing braces around the `else if` branch. If one of the
> conditional bodies has braces, then all should have according to our
> style guide.
Will update.
>
> > + free(url1_norm);
> > + free(url2_norm);
> > +}
> > +
> > +static void check_normalized_url_from_file(const char *file, const char *expect)
> > +{
> > + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> > +
> > + strbuf_getcwd(&path);
> > + strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
>
> Curious: is this a new requirement or do other tests have the same
> requirement? I was under the impression that I could execute the
> resulting unit test binaries from whatever directory I wanted to, but
> didn't verify.
I am not aware of any requirements, but if we want to interact with
other files like in this case (and where we potentially have to
interact with a test repository), we'd need to have some requirement
to construct the path to these data files (and the test repository),
similar to end-to-end tests where they can be run in only t/
directory. Do you think calling `setup_git_directory()` and then using
`the_repository->worktree` to get the root of the worktree of Git source
and then construct the path relative to that, would be useful? That way
we can at least call the binaries from anywhere within the tree.
(P.S. I know we want to avoid using `the_repository`, but I don't know
any other way yet.)
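
For reference, the cwd-based path construction under discussion can be
sketched with libc only. This is an assumption-laden illustration, not
the patch's code: `data_file_path()` is a hypothetical name, the caller
passes the working directory in (e.g. from getcwd()), and the data
directory is assumed to be 'unit-tests/t-urlmatch-normalization' below
't/', as in the rename list.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the path to a data file from the current
 * working directory, which is assumed to be either 't/' or
 * 't/unit-tests/bin'. Returns 0 on success, -1 if `out` is too small.
 */
static int data_file_path(char *out, size_t size,
			  const char *cwd, const char *file)
{
	static const char suffix[] = "/unit-tests/bin";
	size_t len = strlen(cwd);
	size_t suffix_len = sizeof(suffix) - 1;
	int n;

	/* Drop a trailing "/unit-tests/bin" so that we end up at 't/'. */
	if (len >= suffix_len && !strcmp(cwd + len - suffix_len, suffix))
		len -= suffix_len;

	n = snprintf(out, size, "%.*s/unit-tests/t-urlmatch-normalization/%s",
		     (int)len, cwd, file);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Run from any other directory, this silently produces a path that does
not exist, which is exactly the restriction being discussed.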
>
> In any case, the line should probably be wrapped as it is overly long.
Will update.
Thank you.
>
> Other than that this looks good to me. I've given a cursory read to the
> testcases themselves and they do look like a faithful conversion to me.
>
> Thanks!
>
> Patrick
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-07-24 0:24 ` Ghanshyam Thakkar
@ 2024-07-24 5:19 ` Patrick Steinhardt
2024-07-24 7:06 ` Ghanshyam Thakkar
0 siblings, 1 reply; 23+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 5:19 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
On Wed, Jul 24, 2024 at 05:54:33AM +0530, Ghanshyam Thakkar wrote:
> Patrick Steinhardt <ps@pks.im> wrote:
> > On Fri, Jun 28, 2024 at 06:26:24PM +0530, Ghanshyam Thakkar wrote:
> > > + free(url1_norm);
> > > + free(url2_norm);
> > > +}
> > > +
> > > +static void check_normalized_url_from_file(const char *file, const char *expect)
> > > +{
> > > + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> > > +
> > > + strbuf_getcwd(&path);
> > > + strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
> >
> > Curious: is this a new requirement or do other tests have the same
> > requirement? I was under the impression that I could execute the
> > resulting unit test binaries from whatever directory I wanted to, but
> > didn't verify.
>
> I am not aware of any requirements, but if we want to interact with
> other files like in this case (and where we potentially have to
> interact with a test repository), we'd need to have some requirement
> to construct the path to these data files (and the test repository),
> similar to end-to-end tests where they can be run in only t/
> directory. Do you think calling `setup_git_directory()` and then using
> `the_repository->worktree` to get the root of the worktree of Git source
> and then construct the path relative to that, would be useful? That way
> we can at least call the binaries from anywhere within the tree.
Instead of using the working directory, you can also use the `__FILE__`
preprocessor macro to access the files relative to the directory of the
original source file. That at least makes it possible to execute the
result from all directories, but still obviously ties us to the location
of the source directory.
Whether that's ultimately much better.. dunno. But I guess this should
at least be discussed in the commit message.
> (P.S. I know we want to avoid using `the_repository`, but I don't know
> any other way yet.)
You can use e.g. "t/helper/test-repository.c" as an example, where we
use `repo_init()` to initialize a local repository variable.
Patrick
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-07-24 5:19 ` Patrick Steinhardt
@ 2024-07-24 7:06 ` Ghanshyam Thakkar
2024-07-24 7:45 ` Patrick Steinhardt
0 siblings, 1 reply; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-07-24 7:06 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
Patrick Steinhardt <ps@pks.im> wrote:
> On Wed, Jul 24, 2024 at 05:54:33AM +0530, Ghanshyam Thakkar wrote:
> > Patrick Steinhardt <ps@pks.im> wrote:
> > > On Fri, Jun 28, 2024 at 06:26:24PM +0530, Ghanshyam Thakkar wrote:
> > > > + free(url1_norm);
> > > > + free(url2_norm);
> > > > +}
> > > > +
> > > > +static void check_normalized_url_from_file(const char *file, const char *expect)
> > > > +{
> > > > + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> > > > +
> > > > + strbuf_getcwd(&path);
> > > > + strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
> > >
> > > Curious: is this a new requirement or do other tests have the same
> > > requirement? I was under the impression that I could execute the
> > > resulting unit test binaries from whatever directory I wanted to, but
> > > didn't verify.
> >
> > I am not aware of any requirements, but if we want to interact with
> > other files like in this case (and where we potentially have to
> > interact with a test repository), we'd need to have some requirement
> > to construct the path to these data files (and the test repository),
> > similar to end-to-end tests where they can be run in only t/
> > directory. Do you think calling `setup_git_directory()` and then using
> > `the_repository->worktree` to get the root of the worktree of Git source
> > and then construct the path relative to that, would be useful? That way
> > we can at least call the binaries from anywhere within the tree.
>
> Instead of using the working directory, you can also use the `__FILE__`
> preprocessor macro to access the files relative to the directory of the
> original source file. That at least makes it possible to execute the
> result from all directories, but still obviously ties us to the location
> of the source directory.
But doesn't '__FILE__' give relative path instead of absolute? A quick
test_msg() tells me that '__FILE__' gives the path
't/unit-tests/t-urlmatch-normalization.c' for me. So, I don't know
how we would be able to execute from _all_ directories. Although, I
think the restriction of running from only 't/' would be fine as
end-to-end tests have similar restrictions.
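
A small sketch of why a relative `__FILE__` does not help, assuming
POSIX-style '/' separators. Both helper names are hypothetical;
`__FILE__` expands to whatever path the compiler was invoked with, so
its directory part is only usable from every working directory if that
path was absolute.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the directory part of `path` into `out`
 * ("." if there is no '/'). Returns 0 on success, -1 if `out` is too
 * small.
 */
static int source_dir(char *out, size_t size, const char *path)
{
	const char *slash = strrchr(path, '/');
	int n;

	if (!slash)
		n = snprintf(out, size, ".");
	else if (slash == path)
		n = snprintf(out, size, "/");
	else
		n = snprintf(out, size, "%.*s", (int)(slash - path), path);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Only an absolute path resolves the same way from every directory. */
static int is_absolute(const char *path)
{
	return path[0] == '/';
}
```

Calling `source_dir(buf, sizeof(buf), __FILE__)` with the value
reported above, 't/unit-tests/t-urlmatch-normalization.c', yields
't/unit-tests', which is still resolved against the current working
directory.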
> Whether that's ultimately much better.. dunno. But I guess this should
> at least be discussed in the commit message.
Will update.
>
> > (P.S. I know we want to avoid using `the_repository`, but I don't know
> > any other way yet.)
>
> You can use e.g. "t/helper/test-repository.c" as an example, where we
> use `repo_init()` to initialize a local repository variable.
But it requires us to know the path to the repo and worktree in
advance, which kinda defeats the purpose of using 'repository->worktree'.
setup_git_directory() sets up 'the_repository' from any subdirectory
of the worktree, so we can get the root without us having to know
which sub-directory (of the worktree) we are in.
Thanks.
* Re: [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests
2024-07-24 7:06 ` Ghanshyam Thakkar
@ 2024-07-24 7:45 ` Patrick Steinhardt
0 siblings, 0 replies; 23+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 7:45 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Christian Couder, Phillip Wood, Christian Couder,
Kaartic Sivaraam
On Wed, Jul 24, 2024 at 12:36:32PM +0530, Ghanshyam Thakkar wrote:
> Patrick Steinhardt <ps@pks.im> wrote:
> > On Wed, Jul 24, 2024 at 05:54:33AM +0530, Ghanshyam Thakkar wrote:
> > > Patrick Steinhardt <ps@pks.im> wrote:
> > > > On Fri, Jun 28, 2024 at 06:26:24PM +0530, Ghanshyam Thakkar wrote:
> > > > > + free(url1_norm);
> > > > > + free(url2_norm);
> > > > > +}
> > > > > +
> > > > > +static void check_normalized_url_from_file(const char *file, const char *expect)
> > > > > +{
> > > > > + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> > > > > +
> > > > > + strbuf_getcwd(&path);
> > > > > + strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
> > > >
> > > > Curious: is this a new requirement or do other tests have the same
> > > > requirement? I was under the impression that I could execute the
> > > > resulting unit test binaries from whatever directory I wanted to, but
> > > > didn't verify.
> > >
> > > I am not aware of any requirements, but if we want to interact with
> > > other files like in this case (and where we potentially have to
> > > interact with a test repository), we'd need to have some requirement
> > > to construct the path to these data files (and the test repository),
> > > similar to end-to-end tests where they can be run in only t/
> > > directory. Do you think calling `setup_git_directory()` and then using
> > > `the_repository->worktree` to get the root of the worktree of Git source
> > > and then construct the path relative to that, would be useful? That way
> > > we can at least call the binaries from anywhere within the tree.
> >
> > Instead of using the working directory, you can also use the `__FILE__`
> > preprocessor macro to access the files relative to the directory of the
> > original source file. That at least makes it possible to execute the
> > result from all directories, but still obviously ties us to the location
> > of the source directory.
>
> But doesn't '__FILE__' give relative path instead of absolute? A quick
> test_msg() tells me that '__FILE__' gives the path
> 't/unit-tests/t-urlmatch-normalization.c' for me. So, I don't know
> how we would be able to execute from _all_ directories. Although, I
> think the restriction of running from only 't/' would be fine as
> end-to-end tests have similar restrictions.
Ah, you're right of course. Scratch that then.
Patrick
* [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework
2024-06-28 12:56 [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests Ghanshyam Thakkar
` (2 preceding siblings ...)
2024-07-23 14:00 ` Patrick Steinhardt
@ 2024-08-13 17:24 ` Ghanshyam Thakkar
2024-08-13 19:22 ` Junio C Hamano
` (2 more replies)
3 siblings, 3 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-08-13 17:24 UTC (permalink / raw)
To: git
Cc: Patrick Steinhardt, Karthik Nayak, Phillip Wood, Christian Couder,
Ghanshyam Thakkar, Christian Couder, Kaartic Sivaraam
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.
In the migration, last two of the checks from `t_url_general_escape()`
were slightly changed compared to the shellscript. This involves changing
'\'' -> '
'\!' -> !
in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shellscript.
Another change is the removal of the MINGW prerequisite from one of the
tests. It was there because[1] on Windows the command line is a Unicode
string, so it is not possible to pass arbitrary bytes to a program. But
in unit tests we don't have this limitation.
With the addition of this unit test, we impose a new restriction of
running the unit tests from either the 't/' or the 't/unit-tests/bin/'
directory. This is needed to construct the path to the files which
contain some input urls under the 't/unit-tests/t-urlmatch-normalization'
directory. This restriction is similar to the one we have for end-to-end
tests, which can be run only from 't/'. 't/unit-tests/bin/' is allowed
as well so that individual tests can be run directly, which is not
currently possible via any 'make' target, and because the
'unit-tests-test-tool' target also runs from the 't/unit-tests/bin'
directory.
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
Makefile | 2 +-
t/helper/test-tool.c | 1 -
t/helper/test-tool.h | 1 -
t/helper/test-urlmatch-normalization.c | 56 ----
t/t0110-urlmatch-normalization.sh | 182 -----------
t/unit-tests/t-urlmatch-normalization.c | 294 ++++++++++++++++++
.../t-urlmatch-normalization}/README | 0
.../t-urlmatch-normalization}/url-1 | Bin
.../t-urlmatch-normalization}/url-10 | Bin
.../t-urlmatch-normalization}/url-11 | Bin
.../t-urlmatch-normalization}/url-2 | Bin
.../t-urlmatch-normalization}/url-3 | Bin
.../t-urlmatch-normalization}/url-4 | Bin
.../t-urlmatch-normalization}/url-5 | Bin
.../t-urlmatch-normalization}/url-6 | Bin
.../t-urlmatch-normalization}/url-7 | Bin
.../t-urlmatch-normalization}/url-8 | Bin
.../t-urlmatch-normalization}/url-9 | Bin
18 files changed, 295 insertions(+), 241 deletions(-)
delete mode 100644 t/helper/test-urlmatch-normalization.c
delete mode 100755 t/t0110-urlmatch-normalization.sh
create mode 100644 t/unit-tests/t-urlmatch-normalization.c
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/README (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-1 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-10 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-11 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-2 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-3 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-4 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-5 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-6 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-7 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-8 (100%)
rename t/{t0110 => unit-tests/t-urlmatch-normalization}/url-9 (100%)
diff --git a/Makefile b/Makefile
index 3863e60b66..d7bc19e823 100644
--- a/Makefile
+++ b/Makefile
@@ -843,7 +843,6 @@ TEST_BUILTINS_OBJS += test-submodule.o
TEST_BUILTINS_OBJS += test-subprocess.o
TEST_BUILTINS_OBJS += test-trace2.o
TEST_BUILTINS_OBJS += test-truncate.o
-TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
TEST_BUILTINS_OBJS += test-userdiff.o
TEST_BUILTINS_OBJS += test-wildmatch.o
TEST_BUILTINS_OBJS += test-windows-named-pipe.o
@@ -1346,6 +1345,7 @@ UNIT_TEST_PROGRAMS += t-strbuf
UNIT_TEST_PROGRAMS += t-strcmp-offset
UNIT_TEST_PROGRAMS += t-strvec
UNIT_TEST_PROGRAMS += t-trailer
+UNIT_TEST_PROGRAMS += t-urlmatch-normalization
UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index da3e69128a..f8a67df7de 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -83,7 +83,6 @@ static struct test_cmd cmds[] = {
{ "trace2", cmd__trace2 },
{ "truncate", cmd__truncate },
{ "userdiff", cmd__userdiff },
- { "urlmatch-normalization", cmd__urlmatch_normalization },
{ "xml-encode", cmd__xml_encode },
{ "wildmatch", cmd__wildmatch },
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 642a34578c..e74bc0ffd4 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -76,7 +76,6 @@ int cmd__subprocess(int argc, const char **argv);
int cmd__trace2(int argc, const char **argv);
int cmd__truncate(int argc, const char **argv);
int cmd__userdiff(int argc, const char **argv);
-int cmd__urlmatch_normalization(int argc, const char **argv);
int cmd__xml_encode(int argc, const char **argv);
int cmd__wildmatch(int argc, const char **argv);
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-urlmatch-normalization.c b/t/helper/test-urlmatch-normalization.c
deleted file mode 100644
index 86edd454f5..0000000000
--- a/t/helper/test-urlmatch-normalization.c
+++ /dev/null
@@ -1,56 +0,0 @@
-#include "test-tool.h"
-#include "git-compat-util.h"
-#include "urlmatch.h"
-
-int cmd__urlmatch_normalization(int argc, const char **argv)
-{
- const char usage[] = "test-tool urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
- char *url1 = NULL, *url2 = NULL;
- int opt_p = 0, opt_l = 0;
- int ret = 0;
-
- /*
- * For one url, succeed if url_normalize succeeds on it, fail otherwise.
- * For two urls, succeed only if url_normalize succeeds on both and
- * the results compare equal with strcmp. If -p is given (one url only)
- * and url_normalize succeeds, print the result followed by "\n". If
- * -l is given (one url only) and url_normalize succeeds, print the
- * returned length in decimal followed by "\n".
- */
-
- if (argc > 1 && !strcmp(argv[1], "-p")) {
- opt_p = 1;
- argc--;
- argv++;
- } else if (argc > 1 && !strcmp(argv[1], "-l")) {
- opt_l = 1;
- argc--;
- argv++;
- }
-
- if (argc < 2 || argc > 3)
- die("%s", usage);
-
- if (argc == 2) {
- struct url_info info;
- url1 = url_normalize(argv[1], &info);
- if (!url1)
- return 1;
- if (opt_p)
- printf("%s\n", url1);
- if (opt_l)
- printf("%u\n", (unsigned)info.url_len);
- goto cleanup;
- }
-
- if (opt_p || opt_l)
- die("%s", usage);
-
- url1 = url_normalize(argv[1], NULL);
- url2 = url_normalize(argv[2], NULL);
- ret = (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
-cleanup:
- free(url1);
- free(url2);
- return ret;
-}
diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
deleted file mode 100755
index 12d817fbd3..0000000000
--- a/t/t0110-urlmatch-normalization.sh
+++ /dev/null
@@ -1,182 +0,0 @@
-#!/bin/sh
-
-test_description='urlmatch URL normalization'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-# The base name of the test url files
-tu="$TEST_DIRECTORY/t0110/url"
-
-# Note that only file: URLs should be allowed without a host
-
-test_expect_success 'url scheme' '
- ! test-tool urlmatch-normalization "" &&
- ! test-tool urlmatch-normalization "_" &&
- ! test-tool urlmatch-normalization "scheme" &&
- ! test-tool urlmatch-normalization "scheme:" &&
- ! test-tool urlmatch-normalization "scheme:/" &&
- ! test-tool urlmatch-normalization "scheme://" &&
- ! test-tool urlmatch-normalization "file" &&
- ! test-tool urlmatch-normalization "file:" &&
- ! test-tool urlmatch-normalization "file:/" &&
- test-tool urlmatch-normalization "file://" &&
- ! test-tool urlmatch-normalization "://acme.co" &&
- ! test-tool urlmatch-normalization "x_test://acme.co" &&
- ! test-tool urlmatch-normalization "-test://acme.co" &&
- ! test-tool urlmatch-normalization "0test://acme.co" &&
- ! test-tool urlmatch-normalization "+test://acme.co" &&
- ! test-tool urlmatch-normalization ".test://acme.co" &&
- ! test-tool urlmatch-normalization "schem%6e://" &&
- test-tool urlmatch-normalization "x-Test+v1.0://acme.co" &&
- test "$(test-tool urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
-'
-
-test_expect_success 'url authority' '
- ! test-tool urlmatch-normalization "scheme://user:pass@" &&
- ! test-tool urlmatch-normalization "scheme://?" &&
- ! test-tool urlmatch-normalization "scheme://#" &&
- ! test-tool urlmatch-normalization "scheme:///" &&
- ! test-tool urlmatch-normalization "scheme://:" &&
- ! test-tool urlmatch-normalization "scheme://:555" &&
- test-tool urlmatch-normalization "file://user:pass@" &&
- test-tool urlmatch-normalization "file://?" &&
- test-tool urlmatch-normalization "file://#" &&
- test-tool urlmatch-normalization "file:///" &&
- test-tool urlmatch-normalization "file://:" &&
- ! test-tool urlmatch-normalization "file://:555" &&
- test-tool urlmatch-normalization "scheme://user:pass@host" &&
- test-tool urlmatch-normalization "scheme://@host" &&
- test-tool urlmatch-normalization "scheme://%00@host" &&
- ! test-tool urlmatch-normalization "scheme://%%@host" &&
- test-tool urlmatch-normalization "scheme://host_" &&
- test-tool urlmatch-normalization "scheme://user:pass@host/" &&
- test-tool urlmatch-normalization "scheme://@host/" &&
- test-tool urlmatch-normalization "scheme://host/" &&
- test-tool urlmatch-normalization "scheme://host?x" &&
- test-tool urlmatch-normalization "scheme://host#x" &&
- test-tool urlmatch-normalization "scheme://host/@" &&
- test-tool urlmatch-normalization "scheme://host?@x" &&
- test-tool urlmatch-normalization "scheme://host#@x" &&
- test-tool urlmatch-normalization "scheme://[::1]" &&
- test-tool urlmatch-normalization "scheme://[::1]/" &&
- ! test-tool urlmatch-normalization "scheme://hos%41/" &&
- test-tool urlmatch-normalization "scheme://[invalid....:/" &&
- test-tool urlmatch-normalization "scheme://invalid....:]/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:[/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:["
-'
-
-test_expect_success 'url port checks' '
- test-tool urlmatch-normalization "xyz://q@some.host:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0000000" &&
- test-tool urlmatch-normalization "xyz://q@some.host:0000001?" &&
- test-tool urlmatch-normalization "xyz://q@some.host:065535#" &&
- test-tool urlmatch-normalization "xyz://q@some.host:65535" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:65536" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:99999" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100000" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100001" &&
- test-tool urlmatch-normalization "http://q@some.host:80" &&
- test-tool urlmatch-normalization "https://q@some.host:443" &&
- test-tool urlmatch-normalization "http://q@some.host:80/" &&
- test-tool urlmatch-normalization "https://q@some.host:443?" &&
- ! test-tool urlmatch-normalization "http://q@:8008" &&
- ! test-tool urlmatch-normalization "http://:8080" &&
- ! test-tool urlmatch-normalization "http://:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:000/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0%300/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0x80/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:4294967297/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:030f/"
-'
-
-test_expect_success 'url port normalization' '
- test "$(test-tool urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:80")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:000000443")" = "https://x/"
-'
-
-test_expect_success 'url general escapes' '
- ! test-tool urlmatch-normalization "http://x.y?%fg" &&
- test "$(test-tool urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
- test "$(test-tool urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
- test "$(test-tool urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
- test "$(test-tool urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
- test "$(test-tool urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
-'
-
-test_expect_success !MINGW 'url high-bit escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF"
-'
-
-test_expect_success 'url utf-8 escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
-'
-
-test_expect_success 'url username/password escapes' '
- test "$(test-tool urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
-'
-
-test_expect_success 'url normalized lengths' '
- test "$(test-tool urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
- test "$(test-tool urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
- test "$(test-tool urlmatch-normalization -l "http://@x.y/^")" = 15
-'
-
-test_expect_success 'url . and .. segments' '
- test "$(test-tool urlmatch-normalization -p "x://y/.")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
- ! test-tool urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
-'
-
-# http://@foo specifies an empty user name but does not specify a password
-# http://foo specifies neither a user name nor a password
-# So they should not be equivalent
-test_expect_success 'url equivalents' '
- test-tool urlmatch-normalization "httP://x" "Http://X/" &&
- test-tool urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
- ! test-tool urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
- test-tool urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
-'
-
-test_done
diff --git a/t/unit-tests/t-urlmatch-normalization.c b/t/unit-tests/t-urlmatch-normalization.c
new file mode 100644
index 0000000000..e0dd50dc11
--- /dev/null
+++ b/t/unit-tests/t-urlmatch-normalization.c
@@ -0,0 +1,294 @@
+#include "test-lib.h"
+#include "urlmatch.h"
+#include "strbuf.h"
+
+static void check_url_normalizable(const char *url, int normalizable)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_int(normalizable, ==, url_norm ? 1 : 0))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void check_normalized_url(const char *url, const char *expect)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_str(url_norm, expect))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void compare_normalized_urls(const char *url1, const char *url2,
+ size_t equal)
+{
+ char *url1_norm = url_normalize(url1, NULL);
+ char *url2_norm = url_normalize(url2, NULL);
+
+ if (equal) {
+ if (!check_str(url1_norm, url2_norm))
+ test_msg("input url1: %s\n input url2: %s", url1,
+ url2);
+ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
+ test_msg(" url1_norm: %s\n url2_norm: %s\n"
+ " input url1: %s\n input url2: %s",
+ url1_norm, url2_norm, url1, url2);
+ }
+ free(url1_norm);
+ free(url2_norm);
+}
+
+static void check_normalized_url_from_file(const char *file, const char *expect)
+{
+ struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
+ char *cwd_basename;
+
+ if (!check_int(strbuf_getcwd(&path), ==, 0))
+ return;
+
+ cwd_basename = basename(path.buf);
+ if (!check(!strcmp(cwd_basename, "t") || !strcmp(cwd_basename, "bin"))) {
+ test_msg("BUG: unit-tests should be run from either 't/' or 't/unit-tests/bin' directory");
+ return;
+ }
+
+ strbuf_strip_suffix(&path, "/unit-tests/bin");
+ strbuf_addf(&path, "/unit-tests/t-urlmatch-normalization/%s", file);
+
+ if (!check_int(strbuf_read_file(&content, path.buf, 0), >, 0)) {
+ test_msg("failed to read from file '%s': %s", file, strerror(errno));
+ } else {
+ char *url_norm;
+
+ strbuf_trim_trailing_newline(&content);
+ url_norm = url_normalize(content.buf, NULL);
+ if (!check_str(url_norm, expect))
+ test_msg("input file: %s", file);
+ free(url_norm);
+ }
+
+ strbuf_release(&content);
+ strbuf_release(&path);
+}
+
+static void check_normalized_url_length(const char *url, size_t len)
+{
+ struct url_info info;
+ char *url_norm = url_normalize(url, &info);
+
+ if (!check_int(info.url_len, ==, len))
+ test_msg(" input url: %s\n normalized url: %s", url,
+ url_norm);
+ free(url_norm);
+}
+
+/* Note that only file: URLs should be allowed without a host */
+static void t_url_scheme(void)
+{
+ check_url_normalizable("", 0);
+ check_url_normalizable("_", 0);
+ check_url_normalizable("scheme", 0);
+ check_url_normalizable("scheme:", 0);
+ check_url_normalizable("scheme:/", 0);
+ check_url_normalizable("scheme://", 0);
+ check_url_normalizable("file", 0);
+ check_url_normalizable("file:", 0);
+ check_url_normalizable("file:/", 0);
+ check_url_normalizable("file://", 1);
+ check_url_normalizable("://acme.co", 0);
+ check_url_normalizable("x_test://acme.co", 0);
+ check_url_normalizable("-test://acme.co", 0);
+ check_url_normalizable("0test://acme.co", 0);
+ check_url_normalizable("+test://acme.co", 0);
+ check_url_normalizable(".test://acme.co", 0);
+ check_url_normalizable("schem%6e://", 0);
+ check_url_normalizable("x-Test+v1.0://acme.co", 1);
+ check_normalized_url("AbCdeF://x.Y", "abcdef://x.y/");
+}
+
+static void t_url_authority(void)
+{
+ check_url_normalizable("scheme://user:pass@", 0);
+ check_url_normalizable("scheme://?", 0);
+ check_url_normalizable("scheme://#", 0);
+ check_url_normalizable("scheme:///", 0);
+ check_url_normalizable("scheme://:", 0);
+ check_url_normalizable("scheme://:555", 0);
+ check_url_normalizable("file://user:pass@", 1);
+ check_url_normalizable("file://?", 1);
+ check_url_normalizable("file://#", 1);
+ check_url_normalizable("file:///", 1);
+ check_url_normalizable("file://:", 1);
+ check_url_normalizable("file://:555", 0);
+ check_url_normalizable("scheme://user:pass@host", 1);
+ check_url_normalizable("scheme://@host", 1);
+ check_url_normalizable("scheme://%00@host", 1);
+ check_url_normalizable("scheme://%%@host", 0);
+ check_url_normalizable("scheme://host_", 1);
+ check_url_normalizable("scheme://user:pass@host/", 1);
+ check_url_normalizable("scheme://@host/", 1);
+ check_url_normalizable("scheme://host/", 1);
+ check_url_normalizable("scheme://host?x", 1);
+ check_url_normalizable("scheme://host#x", 1);
+ check_url_normalizable("scheme://host/@", 1);
+ check_url_normalizable("scheme://host?@x", 1);
+ check_url_normalizable("scheme://host#@x", 1);
+ check_url_normalizable("scheme://[::1]", 1);
+ check_url_normalizable("scheme://[::1]/", 1);
+ check_url_normalizable("scheme://hos%41/", 0);
+ check_url_normalizable("scheme://[invalid....:/", 1);
+ check_url_normalizable("scheme://invalid....:]/", 1);
+ check_url_normalizable("scheme://invalid....:[/", 0);
+ check_url_normalizable("scheme://invalid....:[", 0);
+}
+
+static void t_url_port(void)
+{
+ check_url_normalizable("xyz://q@some.host:", 1);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://q@some.host:0", 0);
+ check_url_normalizable("xyz://q@some.host:0000000", 0);
+ check_url_normalizable("xyz://q@some.host:0000001?", 1);
+ check_url_normalizable("xyz://q@some.host:065535#", 1);
+ check_url_normalizable("xyz://q@some.host:65535", 1);
+ check_url_normalizable("xyz://q@some.host:65536", 0);
+ check_url_normalizable("xyz://q@some.host:99999", 0);
+ check_url_normalizable("xyz://q@some.host:100000", 0);
+ check_url_normalizable("xyz://q@some.host:100001", 0);
+ check_url_normalizable("http://q@some.host:80", 1);
+ check_url_normalizable("https://q@some.host:443", 1);
+ check_url_normalizable("http://q@some.host:80/", 1);
+ check_url_normalizable("https://q@some.host:443?", 1);
+ check_url_normalizable("http://q@:8008", 0);
+ check_url_normalizable("http://:8080", 0);
+ check_url_normalizable("http://:", 0);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://[::1]:456/", 1);
+ check_url_normalizable("xyz://[::1]:/", 1);
+ check_url_normalizable("xyz://[::1]:000/", 0);
+ check_url_normalizable("xyz://[::1]:0%300/", 0);
+ check_url_normalizable("xyz://[::1]:0x80/", 0);
+ check_url_normalizable("xyz://[::1]:4294967297/", 0);
+ check_url_normalizable("xyz://[::1]:030f/", 0);
+}
+
+static void t_url_port_normalization(void)
+{
+ check_normalized_url("http://x:800", "http://x:800/");
+ check_normalized_url("http://x:0800", "http://x:800/");
+ check_normalized_url("http://x:00000800", "http://x:800/");
+ check_normalized_url("http://x:065535", "http://x:65535/");
+ check_normalized_url("http://x:1", "http://x:1/");
+ check_normalized_url("http://x:80", "http://x/");
+ check_normalized_url("http://x:080", "http://x/");
+ check_normalized_url("http://x:000000080", "http://x/");
+ check_normalized_url("https://x:443", "https://x/");
+ check_normalized_url("https://x:0443", "https://x/");
+ check_normalized_url("https://x:000000443", "https://x/");
+}
+
+static void t_url_general_escape(void)
+{
+ check_url_normalizable("http://x.y?%fg", 0);
+ check_normalized_url("X://W/%7e%41^%3a", "x://w/~A%5E%3A");
+ check_normalized_url("X://W/:/?#[]@", "x://w/:/?#[]@");
+ check_normalized_url("X://W/$&()*+,;=", "x://w/$&()*+,;=");
+ check_normalized_url("X://W/'", "x://w/'");
+ check_normalized_url("X://W?!", "x://w/?!");
+}
+
+static void t_url_high_bit(void)
+{
+ check_normalized_url_from_file("url-1",
+ "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
+ check_normalized_url_from_file("url-2",
+ "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
+ check_normalized_url_from_file("url-3",
+ "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
+ check_normalized_url_from_file("url-4",
+ "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
+ check_normalized_url_from_file("url-5",
+ "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
+ check_normalized_url_from_file("url-6",
+ "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
+ check_normalized_url_from_file("url-7",
+ "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
+ check_normalized_url_from_file("url-8",
+ "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
+ check_normalized_url_from_file("url-9",
+ "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
+ check_normalized_url_from_file("url-10",
+ "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
+}
+
+static void t_url_utf8_escape(void)
+{
+ check_normalized_url_from_file("url-11",
+ "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
+}
+
+static void t_url_username_pass(void)
+{
+ check_normalized_url("x://%41%62(^):%70+d@foo", "x://Ab(%5E):p+d@foo/");
+}
+
+static void t_url_length(void)
+{
+ check_normalized_url_length("Http://%4d%65:%4d^%70@The.Host", 25);
+ check_normalized_url_length("http://%41:%42@x.y/%61/", 17);
+ check_normalized_url_length("http://@x.y/^", 15);
+}
+
+static void t_url_dots(void)
+{
+ check_normalized_url("x://y/.", "x://y/");
+ check_normalized_url("x://y/./", "x://y/");
+ check_normalized_url("x://y/a/.", "x://y/a");
+ check_normalized_url("x://y/a/./", "x://y/a/");
+ check_normalized_url("x://y/.?", "x://y/?");
+ check_normalized_url("x://y/./?", "x://y/?");
+ check_normalized_url("x://y/a/.?", "x://y/a?");
+ check_normalized_url("x://y/a/./?", "x://y/a/?");
+ check_normalized_url("x://y/a/./b/.././../c", "x://y/c");
+ check_normalized_url("x://y/a/./b/../.././c/", "x://y/c/");
+ check_normalized_url("x://y/a/./b/.././../c/././.././.", "x://y/");
+ check_url_normalizable("x://y/a/./b/.././../c/././.././..", 0);
+ check_normalized_url("x://y/a/./?/././..", "x://y/a/?/././..");
+ check_normalized_url("x://y/%2e/", "x://y/");
+ check_normalized_url("x://y/%2E/", "x://y/");
+ check_normalized_url("x://y/a/%2e./", "x://y/");
+ check_normalized_url("x://y/b/.%2E/", "x://y/");
+ check_normalized_url("x://y/c/%2e%2E/", "x://y/");
+}
+
+/*
+ * http://@foo specifies an empty user name but does not specify a password
+ * http://foo specifies neither a user name nor a password
+ * So they should not be equivalent
+ */
+static void t_url_equivalents(void)
+{
+ compare_normalized_urls("httP://x", "Http://X/", 1);
+ compare_normalized_urls("Http://%4d%65:%4d^%70@The.Host", "hTTP://Me:%4D^p@the.HOST:80/", 1);
+ compare_normalized_urls("https://@x.y/^", "httpS://x.y:443/^", 0);
+ compare_normalized_urls("https://@x.y/^", "httpS://@x.y:0443/^", 1);
+ compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1);
+ compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1);
+}
+
+int cmd_main(int argc UNUSED, const char **argv UNUSED)
+{
+ TEST(t_url_scheme(), "url scheme");
+ TEST(t_url_authority(), "url authority");
+ TEST(t_url_port(), "url port checks");
+ TEST(t_url_port_normalization(), "url port normalization");
+ TEST(t_url_general_escape(), "url general escapes");
+ TEST(t_url_high_bit(), "url high-bit escapes");
+ TEST(t_url_utf8_escape(), "url utf8 escapes");
+ TEST(t_url_username_pass(), "url username/password escapes");
+ TEST(t_url_length(), "url normalized lengths");
+ TEST(t_url_dots(), "url . and .. segments");
+ TEST(t_url_equivalents(), "url equivalents");
+ return test_done();
+}
diff --git a/t/t0110/README b/t/unit-tests/t-urlmatch-normalization/README
similarity index 100%
rename from t/t0110/README
rename to t/unit-tests/t-urlmatch-normalization/README
diff --git a/t/t0110/url-1 b/t/unit-tests/t-urlmatch-normalization/url-1
similarity index 100%
rename from t/t0110/url-1
rename to t/unit-tests/t-urlmatch-normalization/url-1
diff --git a/t/t0110/url-10 b/t/unit-tests/t-urlmatch-normalization/url-10
similarity index 100%
rename from t/t0110/url-10
rename to t/unit-tests/t-urlmatch-normalization/url-10
diff --git a/t/t0110/url-11 b/t/unit-tests/t-urlmatch-normalization/url-11
similarity index 100%
rename from t/t0110/url-11
rename to t/unit-tests/t-urlmatch-normalization/url-11
diff --git a/t/t0110/url-2 b/t/unit-tests/t-urlmatch-normalization/url-2
similarity index 100%
rename from t/t0110/url-2
rename to t/unit-tests/t-urlmatch-normalization/url-2
diff --git a/t/t0110/url-3 b/t/unit-tests/t-urlmatch-normalization/url-3
similarity index 100%
rename from t/t0110/url-3
rename to t/unit-tests/t-urlmatch-normalization/url-3
diff --git a/t/t0110/url-4 b/t/unit-tests/t-urlmatch-normalization/url-4
similarity index 100%
rename from t/t0110/url-4
rename to t/unit-tests/t-urlmatch-normalization/url-4
diff --git a/t/t0110/url-5 b/t/unit-tests/t-urlmatch-normalization/url-5
similarity index 100%
rename from t/t0110/url-5
rename to t/unit-tests/t-urlmatch-normalization/url-5
diff --git a/t/t0110/url-6 b/t/unit-tests/t-urlmatch-normalization/url-6
similarity index 100%
rename from t/t0110/url-6
rename to t/unit-tests/t-urlmatch-normalization/url-6
diff --git a/t/t0110/url-7 b/t/unit-tests/t-urlmatch-normalization/url-7
similarity index 100%
rename from t/t0110/url-7
rename to t/unit-tests/t-urlmatch-normalization/url-7
diff --git a/t/t0110/url-8 b/t/unit-tests/t-urlmatch-normalization/url-8
similarity index 100%
rename from t/t0110/url-8
rename to t/unit-tests/t-urlmatch-normalization/url-8
diff --git a/t/t0110/url-9 b/t/unit-tests/t-urlmatch-normalization/url-9
similarity index 100%
rename from t/t0110/url-9
rename to t/unit-tests/t-urlmatch-normalization/url-9
Range-diff against v1:
1: 6a44d676cd ! 1: 3f4e4be1a6 t: migrate helper/test-urlmatch-normalization to unit tests
@@ Commit message
string, it is not possible to pass arbitrary bytes to a program. But
in unit tests we don't have this limitation.
+ With the addition of this unit test, we impose a new restriction of
+ running the unit tests from either 't/' or 't/unit-tests/bin/'
+ directories. This is to construct the path to files which contain some
+ input urls under the 't/t-urlmatch-normalization' directory. This
+ restriction is similar to one we have for end-to-end tests, where they
+ can be run only from 't/'. Addition of 't/unit-tests/bin/' is to allow
+ for running individual tests, which is not currently possible via any
+ 'make' target; the 'unit-tests-test-tool' target is also run from
+ the 't/unit-tests/bin' directory.
+
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+ if (!check_str(url1_norm, url2_norm))
+ test_msg("input url1: %s\n input url2: %s", url1,
+ url2);
-+ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0))
++ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
+ test_msg(" url1_norm: %s\n url2_norm: %s\n"
+ " input url1: %s\n input url2: %s",
+ url1_norm, url2_norm, url1, url2);
++ }
+ free(url1_norm);
+ free(url2_norm);
+}
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+static void check_normalized_url_from_file(const char *file, const char *expect)
+{
+ struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
++ char *cwd_basename;
++
++ if (!check_int(strbuf_getcwd(&path), ==, 0))
++ return;
++
++ cwd_basename = basename(path.buf);
++ if (!check(!strcmp(cwd_basename, "t") || !strcmp(cwd_basename, "bin"))) {
++ test_msg("BUG: unit-tests should be run from either 't/' or 't/unit-tests/bin' directory");
++ return;
++ }
+
-+ strbuf_getcwd(&path);
-+ strbuf_strip_suffix(&path, "/unit-tests/bin"); /* because 'unit-tests-test-tool' is run from 'bin' directory */
++ strbuf_strip_suffix(&path, "/unit-tests/bin");
+ strbuf_addf(&path, "/unit-tests/t-urlmatch-normalization/%s", file);
+
+ if (!check_int(strbuf_read_file(&content, path.buf, 0), >, 0)) {
--
2.46.0
* Re: [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-13 17:24 ` [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework Ghanshyam Thakkar
@ 2024-08-13 19:22 ` Junio C Hamano
2024-08-14 1:35 ` Kaartic Sivaraam
2024-08-14 14:24 ` Ghanshyam Thakkar
2024-08-14 5:17 ` Kaartic Sivaraam
2024-08-14 14:20 ` [GSoC][PATCH v3] " Ghanshyam Thakkar
2 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-08-13 19:22 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Patrick Steinhardt, Karthik Nayak, Phillip Wood,
Christian Couder, Christian Couder, Kaartic Sivaraam
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
> With the addition of this unit test, we impose a new restriction of
> running the unit tests from either 't/' or 't/unit-tests/bin/'
> directories. This is to construct the path to files which contain some
> input urls under the 't/t-urlmatch-normalization' directory. This
> restriction is similar to one we have for end-to-end tests, where they
> can be run only from 't/'.
>
> Addition of 't/unit-tests/bin/' is to allow
> for running individual tests, which is not currently possible via any
> 'make' target; the 'unit-tests-test-tool' target is also run from
> the 't/unit-tests/bin' directory.
Sorry, but I do not quite follow. The above makes it sound as if
the 'bin' subdirectory is something that never existed before this
patch and this patch introduces the use of that directory, but that
is hardly the case. What does that "Addition of" really refer to?
Do you mean "we cannot run the tests from arbitrary places, and we
allow them to be run from t/, just like the normal tests" followed
by "in addition, we also allow them to be run from t/unit-tests/bin
directory because ..."?
I wonder if we should get rid of the t/t-urlmatch-normalization/ directory
and instead hold these test data in the form of string constants in
the program. After all, you have the expected normalization result
hardcoded in the binary (e.g. t_url_high_bit() asks the checker
function to read from the "url-1" file and then compares the result of
normalization with a hardcoded string constant), so having the test
data in separate files only risks letting the input and the output
drift apart.
As a side effect, it would make it easily possible to run the tests
anywhere, because you no longer depend on these url-$n input files.
It of course depends on how burdensome the limitation that we can
run the tests only from a fixed place really is, but it generally is
not a good idea to have these random sequences of bytes in small
files that nobody looks at in a repository in the first place.
Thanks.
* Re: [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-13 19:22 ` Junio C Hamano
@ 2024-08-14 1:35 ` Kaartic Sivaraam
2024-08-14 4:58 ` Junio C Hamano
2024-08-14 14:24 ` Ghanshyam Thakkar
1 sibling, 1 reply; 23+ messages in thread
From: Kaartic Sivaraam @ 2024-08-14 1:35 UTC (permalink / raw)
To: Junio C Hamano, Ghanshyam Thakkar
Cc: git, Patrick Steinhardt, Karthik Nayak, Phillip Wood,
Christian Couder, Christian Couder
On 14/08/24 00:52, Junio C Hamano wrote:
> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
>
> I wonder if we should get rid of the t/t-urlmatch-normalization/ directory
> and instead hold these test data in the form of string constants in
> the program. After all, you have the expected normalization result
> hardcoded in the binary (e.g. t_url_high_bit() asks the checker
> function to read from the "url-1" file and then compares the result of
> normalization with a hardcoded string constant), so having the test
> data in separate files only risks letting the input and the output
> drift apart.
>
> As a side effect, it would make it easily possible to run the tests
> anywhere, because you no longer depend on these url-$n input files.
> It of course depends on how burdensome the limitation that we can
> run the tests only from a fixed place really is, but it generally is
> not a good idea to have these random sequences of bytes in small
> files that nobody looks at in a repository in the first place.
>
I think the reason these inputs are kept in files is solely
because they are random sequences of characters that contain Unicode
and even some control characters. This makes it tricky to hold the
input strings in the source itself.
I'm not sure there is a straightforward way to have these inputs
in the C source file. There may be some way to represent them in an
alternate form, but I suppose that would sacrifice the readability of
these inputs, which I believe is also a significant factor for test cases.
Feel free to enlighten us if we're missing some straightforward
way of having these input URLs in the source files.
--
Sivaraam
* Re: [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-14 1:35 ` Kaartic Sivaraam
@ 2024-08-14 4:58 ` Junio C Hamano
0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-08-14 4:58 UTC (permalink / raw)
To: Kaartic Sivaraam
Cc: Ghanshyam Thakkar, git, Patrick Steinhardt, Karthik Nayak,
Phillip Wood, Christian Couder, Christian Couder
Kaartic Sivaraam <kaartic.sivaraam@gmail.com> writes:
> I'm not sure there would be a straightforward way to have these inputs
> in the C source file. There may be some way to represent them in an
> alternate form but I suppose that would sacrifice the readability of
> these inputs which I believe is also a significant factor for test
> cases.
Just as a byte string, with every byte expressed as an octal escape such as \012?
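A minimal sketch of that idea, not taken from the patch. The contents of
url_1 are an assumption inferred from the expected output that
t_url_high_bit() hardcodes for "url-1", and escape_bytes() is a
hypothetical stand-in used only to make the bytes visible; it is not
url_normalize(). Octal escapes take at most three digits, so writing each
one with exactly three digits keeps it from absorbing the characters that
follow it, which a \x hex escape could do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed contents of "url-1": 13 control bytes after "x://q/". */
static const char url_1[] =
	"x://q/\001\002\003\004\005\006\007\010\016\017\020\021\022";

/*
 * Hypothetical helper: copy `in` to `out`, percent-encoding control
 * bytes, DEL and high-bit bytes. Stops early if `out` would overflow.
 */
static void escape_bytes(const char *in, char *out, size_t out_size)
{
	size_t pos = 0;

	assert(out_size >= 1);
	/* Each step needs room for up to "%XX" plus the final NUL. */
	for (; *in && pos + 4 <= out_size; in++) {
		unsigned char c = (unsigned char)*in;

		if (c < 0x20 || c >= 0x7f)
			pos += (size_t)snprintf(out + pos, out_size - pos,
						"%%%02X", c);
		else
			out[pos++] = (char)c;
	}
	out[pos] = '\0';
}
```

With the input and the expected result sitting next to each other as
string constants, nothing has to be read from disk, so the test no longer
cares which directory it is run from.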
* Re: [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-13 17:24 ` [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework Ghanshyam Thakkar
2024-08-13 19:22 ` Junio C Hamano
@ 2024-08-14 5:17 ` Kaartic Sivaraam
2024-08-14 14:20 ` [GSoC][PATCH v3] " Ghanshyam Thakkar
2 siblings, 0 replies; 23+ messages in thread
From: Kaartic Sivaraam @ 2024-08-14 5:17 UTC (permalink / raw)
To: Ghanshyam Thakkar, git
Cc: Patrick Steinhardt, Karthik Nayak, Phillip Wood, Christian Couder,
Christian Couder
Hi Ghanshyam,
I just wanted to share two comments based on what I observed from the
recent changes.
On 13/08/24 22:54, Ghanshyam Thakkar wrote:
>
> index 0000000000..e0dd50dc11
> --- /dev/null
> +++ b/t/unit-tests/t-urlmatch-normalization.c
> @@ -0,0 +1,294 @@
> +
> +static void check_normalized_url_from_file(const char *file, const char *expect)
> +{
> + struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
> + char *cwd_basename;
> +
> + if (!check_int(strbuf_getcwd(&path), ==, 0))
> + return;
> +
> + cwd_basename = basename(path.buf);
> + if (!check(!strcmp(cwd_basename, "t") || !strcmp(cwd_basename, "bin"))) {
I think blindly comparing against "bin" is not ideal, as it lets any
other location whose basename is "bin" slip through. For instance,
running the test from "perl/blib/bin" would pass this check. I suppose
we need a more specific check that ensures the test is indeed running
from "t/unit-tests/bin".
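A rough sketch of what a more specific check could look like, in plain C
rather than with Git's strbuf helpers. Both function names are
hypothetical, and it assumes the working directory uses '/' separators
and has no trailing slash.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: does `path` end with `suffix`? */
static int path_ends_with(const char *path, const char *suffix)
{
	size_t plen, slen;

	assert(path && suffix);
	plen = strlen(path);
	slen = strlen(suffix);
	if (plen < slen)
		return 0;
	return !strcmp(path + plen - slen, suffix);
}

/*
 * The suffixes start with '/', so they only match whole path
 * components: ".../perl/blib/bin" is rejected, ".../t/unit-tests/bin"
 * is accepted.
 */
static int is_allowed_test_dir(const char *cwd)
{
	return path_ends_with(cwd, "/t") ||
	       path_ends_with(cwd, "/t/unit-tests/bin");
}
```

This is still only a heuristic: any unrelated directory that happens to
be named "t" would pass as well.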
> ... snip ...
>
> +static void t_url_high_bit(void)
> +{
> + check_normalized_url_from_file("url-1",
> + "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
> + check_normalized_url_from_file("url-2",
> + "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
When we run the unit-test binary from other directories, the error
message is printed as expected. But it is printed for every test case,
which seems a bit spammy. I suppose it might be helpful to do the check
once, before actually running the test cases, and skip them all when
we realize the binary is being run from a different directory.
--
Sivaraam
* [GSoC][PATCH v3] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-13 17:24 ` [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework Ghanshyam Thakkar
2024-08-13 19:22 ` Junio C Hamano
2024-08-14 5:17 ` Kaartic Sivaraam
@ 2024-08-14 14:20 ` Ghanshyam Thakkar
2024-08-14 16:52 ` Junio C Hamano
` (2 more replies)
2 siblings, 3 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-08-14 14:20 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Karthik Nayak, Patrick Steinhardt,
Christian Couder, Ghanshyam Thakkar, Christian Couder,
Kaartic Sivaraam
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. Also add different test_msg()s for better debugging.
In the migration, the last two checks in `t_url_general_escape()`
were slightly changed compared to the shell script. This involves changing
'\'' -> '
'\!' -> !
in the URLs of those checks, because in C strings we don't
need to escape "'" and "!". Other than these two, all the URLs were
pasted verbatim from the shell script.
Another change is the removal of the MINGW prerequisite from one of the
tests. It was there because[1] on Windows the command line is a
Unicode string, so it is not possible to pass arbitrary bytes to a
program. In unit tests we don't have this limitation.
And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
This version addresses Junio's review. By embedding the URLs in the
code itself, it removes the restriction, introduced in v2, that the
unit tests be run from 't/' or 't/unit-tests/bin'.
Makefile | 2 +-
t/helper/test-tool.c | 1 -
t/helper/test-tool.h | 1 -
t/helper/test-urlmatch-normalization.c | 56 -----
t/t0110-urlmatch-normalization.sh | 182 ----------------
t/t0110/README | 9 -
t/t0110/url-1 | Bin 20 -> 0 bytes
t/t0110/url-10 | Bin 23 -> 0 bytes
t/t0110/url-11 | Bin 25 -> 0 bytes
t/t0110/url-2 | Bin 20 -> 0 bytes
t/t0110/url-3 | Bin 23 -> 0 bytes
t/t0110/url-4 | Bin 23 -> 0 bytes
t/t0110/url-5 | Bin 23 -> 0 bytes
t/t0110/url-6 | Bin 23 -> 0 bytes
t/t0110/url-7 | Bin 23 -> 0 bytes
t/t0110/url-8 | Bin 23 -> 0 bytes
t/t0110/url-9 | Bin 23 -> 0 bytes
t/unit-tests/t-urlmatch-normalization.c | 271 ++++++++++++++++++++++++
18 files changed, 272 insertions(+), 250 deletions(-)
delete mode 100644 t/helper/test-urlmatch-normalization.c
delete mode 100755 t/t0110-urlmatch-normalization.sh
delete mode 100644 t/t0110/README
delete mode 100644 t/t0110/url-1
delete mode 100644 t/t0110/url-10
delete mode 100644 t/t0110/url-11
delete mode 100644 t/t0110/url-2
delete mode 100644 t/t0110/url-3
delete mode 100644 t/t0110/url-4
delete mode 100644 t/t0110/url-5
delete mode 100644 t/t0110/url-6
delete mode 100644 t/t0110/url-7
delete mode 100644 t/t0110/url-8
delete mode 100644 t/t0110/url-9
create mode 100644 t/unit-tests/t-urlmatch-normalization.c
diff --git a/Makefile b/Makefile
index 3863e60b66..d7bc19e823 100644
--- a/Makefile
+++ b/Makefile
@@ -843,7 +843,6 @@ TEST_BUILTINS_OBJS += test-submodule.o
TEST_BUILTINS_OBJS += test-subprocess.o
TEST_BUILTINS_OBJS += test-trace2.o
TEST_BUILTINS_OBJS += test-truncate.o
-TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
TEST_BUILTINS_OBJS += test-userdiff.o
TEST_BUILTINS_OBJS += test-wildmatch.o
TEST_BUILTINS_OBJS += test-windows-named-pipe.o
@@ -1346,6 +1345,7 @@ UNIT_TEST_PROGRAMS += t-strbuf
UNIT_TEST_PROGRAMS += t-strcmp-offset
UNIT_TEST_PROGRAMS += t-strvec
UNIT_TEST_PROGRAMS += t-trailer
+UNIT_TEST_PROGRAMS += t-urlmatch-normalization
UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index da3e69128a..f8a67df7de 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -83,7 +83,6 @@ static struct test_cmd cmds[] = {
{ "trace2", cmd__trace2 },
{ "truncate", cmd__truncate },
{ "userdiff", cmd__userdiff },
- { "urlmatch-normalization", cmd__urlmatch_normalization },
{ "xml-encode", cmd__xml_encode },
{ "wildmatch", cmd__wildmatch },
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 642a34578c..e74bc0ffd4 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -76,7 +76,6 @@ int cmd__subprocess(int argc, const char **argv);
int cmd__trace2(int argc, const char **argv);
int cmd__truncate(int argc, const char **argv);
int cmd__userdiff(int argc, const char **argv);
-int cmd__urlmatch_normalization(int argc, const char **argv);
int cmd__xml_encode(int argc, const char **argv);
int cmd__wildmatch(int argc, const char **argv);
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-urlmatch-normalization.c b/t/helper/test-urlmatch-normalization.c
deleted file mode 100644
index 86edd454f5..0000000000
--- a/t/helper/test-urlmatch-normalization.c
+++ /dev/null
@@ -1,56 +0,0 @@
-#include "test-tool.h"
-#include "git-compat-util.h"
-#include "urlmatch.h"
-
-int cmd__urlmatch_normalization(int argc, const char **argv)
-{
- const char usage[] = "test-tool urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
- char *url1 = NULL, *url2 = NULL;
- int opt_p = 0, opt_l = 0;
- int ret = 0;
-
- /*
- * For one url, succeed if url_normalize succeeds on it, fail otherwise.
- * For two urls, succeed only if url_normalize succeeds on both and
- * the results compare equal with strcmp. If -p is given (one url only)
- * and url_normalize succeeds, print the result followed by "\n". If
- * -l is given (one url only) and url_normalize succeeds, print the
- * returned length in decimal followed by "\n".
- */
-
- if (argc > 1 && !strcmp(argv[1], "-p")) {
- opt_p = 1;
- argc--;
- argv++;
- } else if (argc > 1 && !strcmp(argv[1], "-l")) {
- opt_l = 1;
- argc--;
- argv++;
- }
-
- if (argc < 2 || argc > 3)
- die("%s", usage);
-
- if (argc == 2) {
- struct url_info info;
- url1 = url_normalize(argv[1], &info);
- if (!url1)
- return 1;
- if (opt_p)
- printf("%s\n", url1);
- if (opt_l)
- printf("%u\n", (unsigned)info.url_len);
- goto cleanup;
- }
-
- if (opt_p || opt_l)
- die("%s", usage);
-
- url1 = url_normalize(argv[1], NULL);
- url2 = url_normalize(argv[2], NULL);
- ret = (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
-cleanup:
- free(url1);
- free(url2);
- return ret;
-}
diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
deleted file mode 100755
index 12d817fbd3..0000000000
--- a/t/t0110-urlmatch-normalization.sh
+++ /dev/null
@@ -1,182 +0,0 @@
-#!/bin/sh
-
-test_description='urlmatch URL normalization'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-# The base name of the test url files
-tu="$TEST_DIRECTORY/t0110/url"
-
-# Note that only file: URLs should be allowed without a host
-
-test_expect_success 'url scheme' '
- ! test-tool urlmatch-normalization "" &&
- ! test-tool urlmatch-normalization "_" &&
- ! test-tool urlmatch-normalization "scheme" &&
- ! test-tool urlmatch-normalization "scheme:" &&
- ! test-tool urlmatch-normalization "scheme:/" &&
- ! test-tool urlmatch-normalization "scheme://" &&
- ! test-tool urlmatch-normalization "file" &&
- ! test-tool urlmatch-normalization "file:" &&
- ! test-tool urlmatch-normalization "file:/" &&
- test-tool urlmatch-normalization "file://" &&
- ! test-tool urlmatch-normalization "://acme.co" &&
- ! test-tool urlmatch-normalization "x_test://acme.co" &&
- ! test-tool urlmatch-normalization "-test://acme.co" &&
- ! test-tool urlmatch-normalization "0test://acme.co" &&
- ! test-tool urlmatch-normalization "+test://acme.co" &&
- ! test-tool urlmatch-normalization ".test://acme.co" &&
- ! test-tool urlmatch-normalization "schem%6e://" &&
- test-tool urlmatch-normalization "x-Test+v1.0://acme.co" &&
- test "$(test-tool urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
-'
-
-test_expect_success 'url authority' '
- ! test-tool urlmatch-normalization "scheme://user:pass@" &&
- ! test-tool urlmatch-normalization "scheme://?" &&
- ! test-tool urlmatch-normalization "scheme://#" &&
- ! test-tool urlmatch-normalization "scheme:///" &&
- ! test-tool urlmatch-normalization "scheme://:" &&
- ! test-tool urlmatch-normalization "scheme://:555" &&
- test-tool urlmatch-normalization "file://user:pass@" &&
- test-tool urlmatch-normalization "file://?" &&
- test-tool urlmatch-normalization "file://#" &&
- test-tool urlmatch-normalization "file:///" &&
- test-tool urlmatch-normalization "file://:" &&
- ! test-tool urlmatch-normalization "file://:555" &&
- test-tool urlmatch-normalization "scheme://user:pass@host" &&
- test-tool urlmatch-normalization "scheme://@host" &&
- test-tool urlmatch-normalization "scheme://%00@host" &&
- ! test-tool urlmatch-normalization "scheme://%%@host" &&
- test-tool urlmatch-normalization "scheme://host_" &&
- test-tool urlmatch-normalization "scheme://user:pass@host/" &&
- test-tool urlmatch-normalization "scheme://@host/" &&
- test-tool urlmatch-normalization "scheme://host/" &&
- test-tool urlmatch-normalization "scheme://host?x" &&
- test-tool urlmatch-normalization "scheme://host#x" &&
- test-tool urlmatch-normalization "scheme://host/@" &&
- test-tool urlmatch-normalization "scheme://host?@x" &&
- test-tool urlmatch-normalization "scheme://host#@x" &&
- test-tool urlmatch-normalization "scheme://[::1]" &&
- test-tool urlmatch-normalization "scheme://[::1]/" &&
- ! test-tool urlmatch-normalization "scheme://hos%41/" &&
- test-tool urlmatch-normalization "scheme://[invalid....:/" &&
- test-tool urlmatch-normalization "scheme://invalid....:]/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:[/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:["
-'
-
-test_expect_success 'url port checks' '
- test-tool urlmatch-normalization "xyz://q@some.host:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0000000" &&
- test-tool urlmatch-normalization "xyz://q@some.host:0000001?" &&
- test-tool urlmatch-normalization "xyz://q@some.host:065535#" &&
- test-tool urlmatch-normalization "xyz://q@some.host:65535" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:65536" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:99999" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100000" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100001" &&
- test-tool urlmatch-normalization "http://q@some.host:80" &&
- test-tool urlmatch-normalization "https://q@some.host:443" &&
- test-tool urlmatch-normalization "http://q@some.host:80/" &&
- test-tool urlmatch-normalization "https://q@some.host:443?" &&
- ! test-tool urlmatch-normalization "http://q@:8008" &&
- ! test-tool urlmatch-normalization "http://:8080" &&
- ! test-tool urlmatch-normalization "http://:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:000/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0%300/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0x80/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:4294967297/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:030f/"
-'
-
-test_expect_success 'url port normalization' '
- test "$(test-tool urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:80")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:000000443")" = "https://x/"
-'
-
-test_expect_success 'url general escapes' '
- ! test-tool urlmatch-normalization "http://x.y?%fg" &&
- test "$(test-tool urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
- test "$(test-tool urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
- test "$(test-tool urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
- test "$(test-tool urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
- test "$(test-tool urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
-'
-
-test_expect_success !MINGW 'url high-bit escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF"
-'
-
-test_expect_success 'url utf-8 escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
-'
-
-test_expect_success 'url username/password escapes' '
- test "$(test-tool urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
-'
-
-test_expect_success 'url normalized lengths' '
- test "$(test-tool urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
- test "$(test-tool urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
- test "$(test-tool urlmatch-normalization -l "http://@x.y/^")" = 15
-'
-
-test_expect_success 'url . and .. segments' '
- test "$(test-tool urlmatch-normalization -p "x://y/.")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
- ! test-tool urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
-'
-
-# http://@foo specifies an empty user name but does not specify a password
-# http://foo specifies neither a user name nor a password
-# So they should not be equivalent
-test_expect_success 'url equivalents' '
- test-tool urlmatch-normalization "httP://x" "Http://X/" &&
- test-tool urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
- ! test-tool urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
- test-tool urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
-'
-
-test_done
diff --git a/t/t0110/README b/t/t0110/README
deleted file mode 100644
index ad4a50ecd8..0000000000
--- a/t/t0110/README
+++ /dev/null
@@ -1,9 +0,0 @@
-The url data files in this directory contain URLs with characters
-in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
-of unprintable characters.
-
-A select few characters in the 0x01-0x1f range are skipped to help
-avoid problems running the test itself.
-
-The urls are in test files in this directory rather than being
-embedded in the test script for portability.
diff --git a/t/t0110/url-1 b/t/t0110/url-1
deleted file mode 100644
index 519019c5ce6c58478f048a2f39e2321370d318c6..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt
diff --git a/t/t0110/url-10 b/t/t0110/url-10
deleted file mode 100644
index b9965de6a5d74b122179821212b2c27c8ae03e80..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)
diff --git a/t/t0110/url-11 b/t/t0110/url-11
deleted file mode 100644
index f0a50f10096a20d597f40c775f09a71276e0050a..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5
diff --git a/t/t0110/url-2 b/t/t0110/url-2
deleted file mode 100644
index 43334b05b2de3794d6020abd96e634a4e9e49cb0..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx
diff --git a/t/t0110/url-3 b/t/t0110/url-3
deleted file mode 100644
index 7378c7bec247b996bc67b00a05ed89cf47d4b7a7..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b
diff --git a/t/t0110/url-4 b/t/t0110/url-4
deleted file mode 100644
index 220b198c97f942fea4960f51a2105cc42261061a..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!
diff --git a/t/t0110/url-5 b/t/t0110/url-5
deleted file mode 100644
index 1ccd9277792840955bb124bdde21f4b08bcccb63..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#
diff --git a/t/t0110/url-6 b/t/t0110/url-6
deleted file mode 100644
index e8283aac6dff049d3e02454db6e684c5790a5996..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$
diff --git a/t/t0110/url-7 b/t/t0110/url-7
deleted file mode 100644
index fa7c10b615259deefd15b638b021da7c60eba1b2..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%
diff --git a/t/t0110/url-8 b/t/t0110/url-8
deleted file mode 100644
index 79a0ba836f5b8886b0a73f161eb292af2b105e65..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&
diff --git a/t/t0110/url-9 b/t/t0110/url-9
deleted file mode 100644
index 8b44bec48b94467c63e8e1ad18162e465da6d6dd..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(
diff --git a/t/unit-tests/t-urlmatch-normalization.c b/t/unit-tests/t-urlmatch-normalization.c
new file mode 100644
index 0000000000..8e62b423cb
--- /dev/null
+++ b/t/unit-tests/t-urlmatch-normalization.c
@@ -0,0 +1,271 @@
+#include "test-lib.h"
+#include "urlmatch.h"
+
+static void check_url_normalizable(const char *url, int normalizable)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_int(normalizable, ==, url_norm ? 1 : 0))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void check_normalized_url(const char *url, const char *expect)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_str(url_norm, expect))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void compare_normalized_urls(const char *url1, const char *url2,
+ size_t equal)
+{
+ char *url1_norm = url_normalize(url1, NULL);
+ char *url2_norm = url_normalize(url2, NULL);
+
+ if (equal) {
+ if (!check_str(url1_norm, url2_norm))
+ test_msg("input url1: %s\n input url2: %s", url1,
+ url2);
+ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
+ test_msg(" url1_norm: %s\n url2_norm: %s\n"
+ " input url1: %s\n input url2: %s",
+ url1_norm, url2_norm, url1, url2);
+ }
+ free(url1_norm);
+ free(url2_norm);
+}
+
+static void check_normalized_url_length(const char *url, size_t len)
+{
+ struct url_info info;
+ char *url_norm = url_normalize(url, &info);
+
+ if (!check_int(info.url_len, ==, len))
+ test_msg(" input url: %s\n normalized url: %s", url,
+ url_norm);
+ free(url_norm);
+}
+
+/* Note that only file: URLs should be allowed without a host */
+static void t_url_scheme(void)
+{
+ check_url_normalizable("", 0);
+ check_url_normalizable("_", 0);
+ check_url_normalizable("scheme", 0);
+ check_url_normalizable("scheme:", 0);
+ check_url_normalizable("scheme:/", 0);
+ check_url_normalizable("scheme://", 0);
+ check_url_normalizable("file", 0);
+ check_url_normalizable("file:", 0);
+ check_url_normalizable("file:/", 0);
+ check_url_normalizable("file://", 1);
+ check_url_normalizable("://acme.co", 0);
+ check_url_normalizable("x_test://acme.co", 0);
+ check_url_normalizable("-test://acme.co", 0);
+ check_url_normalizable("0test://acme.co", 0);
+ check_url_normalizable("+test://acme.co", 0);
+ check_url_normalizable(".test://acme.co", 0);
+ check_url_normalizable("schem%6e://", 0);
+ check_url_normalizable("x-Test+v1.0://acme.co", 1);
+ check_normalized_url("AbCdeF://x.Y", "abcdef://x.y/");
+}
+
+static void t_url_authority(void)
+{
+ check_url_normalizable("scheme://user:pass@", 0);
+ check_url_normalizable("scheme://?", 0);
+ check_url_normalizable("scheme://#", 0);
+ check_url_normalizable("scheme:///", 0);
+ check_url_normalizable("scheme://:", 0);
+ check_url_normalizable("scheme://:555", 0);
+ check_url_normalizable("file://user:pass@", 1);
+ check_url_normalizable("file://?", 1);
+ check_url_normalizable("file://#", 1);
+ check_url_normalizable("file:///", 1);
+ check_url_normalizable("file://:", 1);
+ check_url_normalizable("file://:555", 0);
+ check_url_normalizable("scheme://user:pass@host", 1);
+ check_url_normalizable("scheme://@host", 1);
+ check_url_normalizable("scheme://%00@host", 1);
+ check_url_normalizable("scheme://%%@host", 0);
+ check_url_normalizable("scheme://host_", 1);
+ check_url_normalizable("scheme://user:pass@host/", 1);
+ check_url_normalizable("scheme://@host/", 1);
+ check_url_normalizable("scheme://host/", 1);
+ check_url_normalizable("scheme://host?x", 1);
+ check_url_normalizable("scheme://host#x", 1);
+ check_url_normalizable("scheme://host/@", 1);
+ check_url_normalizable("scheme://host?@x", 1);
+ check_url_normalizable("scheme://host#@x", 1);
+ check_url_normalizable("scheme://[::1]", 1);
+ check_url_normalizable("scheme://[::1]/", 1);
+ check_url_normalizable("scheme://hos%41/", 0);
+ check_url_normalizable("scheme://[invalid....:/", 1);
+ check_url_normalizable("scheme://invalid....:]/", 1);
+ check_url_normalizable("scheme://invalid....:[/", 0);
+ check_url_normalizable("scheme://invalid....:[", 0);
+}
+
+static void t_url_port(void)
+{
+ check_url_normalizable("xyz://q@some.host:", 1);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://q@some.host:0", 0);
+ check_url_normalizable("xyz://q@some.host:0000000", 0);
+ check_url_normalizable("xyz://q@some.host:0000001?", 1);
+ check_url_normalizable("xyz://q@some.host:065535#", 1);
+ check_url_normalizable("xyz://q@some.host:65535", 1);
+ check_url_normalizable("xyz://q@some.host:65536", 0);
+ check_url_normalizable("xyz://q@some.host:99999", 0);
+ check_url_normalizable("xyz://q@some.host:100000", 0);
+ check_url_normalizable("xyz://q@some.host:100001", 0);
+ check_url_normalizable("http://q@some.host:80", 1);
+ check_url_normalizable("https://q@some.host:443", 1);
+ check_url_normalizable("http://q@some.host:80/", 1);
+ check_url_normalizable("https://q@some.host:443?", 1);
+ check_url_normalizable("http://q@:8008", 0);
+ check_url_normalizable("http://:8080", 0);
+ check_url_normalizable("http://:", 0);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://[::1]:456/", 1);
+ check_url_normalizable("xyz://[::1]:/", 1);
+ check_url_normalizable("xyz://[::1]:000/", 0);
+ check_url_normalizable("xyz://[::1]:0%300/", 0);
+ check_url_normalizable("xyz://[::1]:0x80/", 0);
+ check_url_normalizable("xyz://[::1]:4294967297/", 0);
+ check_url_normalizable("xyz://[::1]:030f/", 0);
+}
+
+static void t_url_port_normalization(void)
+{
+ check_normalized_url("http://x:800", "http://x:800/");
+ check_normalized_url("http://x:0800", "http://x:800/");
+ check_normalized_url("http://x:00000800", "http://x:800/");
+ check_normalized_url("http://x:065535", "http://x:65535/");
+ check_normalized_url("http://x:1", "http://x:1/");
+ check_normalized_url("http://x:80", "http://x/");
+ check_normalized_url("http://x:080", "http://x/");
+ check_normalized_url("http://x:000000080", "http://x/");
+ check_normalized_url("https://x:443", "https://x/");
+ check_normalized_url("https://x:0443", "https://x/");
+ check_normalized_url("https://x:000000443", "https://x/");
+}
+
+static void t_url_general_escape(void)
+{
+ check_url_normalizable("http://x.y?%fg", 0);
+ check_normalized_url("X://W/%7e%41^%3a", "x://w/~A%5E%3A");
+ check_normalized_url("X://W/:/?#[]@", "x://w/:/?#[]@");
+ check_normalized_url("X://W/$&()*+,;=", "x://w/$&()*+,;=");
+ check_normalized_url("X://W/'", "x://w/'");
+ check_normalized_url("X://W?!", "x://w/?!");
+}
+
+static void t_url_high_bit(void)
+{
+ check_normalized_url(
+ "x://q/\x01\x02\x03\x04\x05\x06\x07\x08\x0e\x0f\x10\x11\x12",
+ "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
+ check_normalized_url(
+ "x://q/\x13\x14\x15\x16\x17\x18\x19\x1b\x1c\x1d\x1e\x1f\x7f",
+ "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
+ check_normalized_url(
+ "x://q/\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f",
+ "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
+ check_normalized_url(
+ "x://q/\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f",
+ "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
+ check_normalized_url(
+ "x://q/\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf",
+ "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
+ check_normalized_url(
+ "x://q/\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf",
+ "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
+ check_normalized_url(
+ "x://q/\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf",
+ "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
+ check_normalized_url(
+ "x://q/\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf",
+ "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
+ check_normalized_url(
+ "x://q/\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef",
+ "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
+ check_normalized_url(
+ "x://q/\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff",
+ "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
+}
+
+static void t_url_utf8_escape(void)
+{
+ check_normalized_url(
+ "x://q/\xc2\x80\xdf\xbf\xe0\xa0\x80\xef\xbf\xbd\xf0\x90\x80\x80\xf0\xaf\xbf\xbd",
+ "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
+}
+
+static void t_url_username_pass(void)
+{
+ check_normalized_url("x://%41%62(^):%70+d@foo", "x://Ab(%5E):p+d@foo/");
+}
+
+static void t_url_length(void)
+{
+ check_normalized_url_length("Http://%4d%65:%4d^%70@The.Host", 25);
+ check_normalized_url_length("http://%41:%42@x.y/%61/", 17);
+ check_normalized_url_length("http://@x.y/^", 15);
+}
+
+static void t_url_dots(void)
+{
+ check_normalized_url("x://y/.", "x://y/");
+ check_normalized_url("x://y/./", "x://y/");
+ check_normalized_url("x://y/a/.", "x://y/a");
+ check_normalized_url("x://y/a/./", "x://y/a/");
+ check_normalized_url("x://y/.?", "x://y/?");
+ check_normalized_url("x://y/./?", "x://y/?");
+ check_normalized_url("x://y/a/.?", "x://y/a?");
+ check_normalized_url("x://y/a/./?", "x://y/a/?");
+ check_normalized_url("x://y/a/./b/.././../c", "x://y/c");
+ check_normalized_url("x://y/a/./b/../.././c/", "x://y/c/");
+ check_normalized_url("x://y/a/./b/.././../c/././.././.", "x://y/");
+ check_url_normalizable("x://y/a/./b/.././../c/././.././..", 0);
+ check_normalized_url("x://y/a/./?/././..", "x://y/a/?/././..");
+ check_normalized_url("x://y/%2e/", "x://y/");
+ check_normalized_url("x://y/%2E/", "x://y/");
+ check_normalized_url("x://y/a/%2e./", "x://y/");
+ check_normalized_url("x://y/b/.%2E/", "x://y/");
+ check_normalized_url("x://y/c/%2e%2E/", "x://y/");
+}
+
+/*
+ * http://@foo specifies an empty user name but does not specify a password
+ * http://foo specifies neither a user name nor a password
+ * So they should not be equivalent
+ */
+static void t_url_equivalents(void)
+{
+ compare_normalized_urls("httP://x", "Http://X/", 1);
+ compare_normalized_urls("Http://%4d%65:%4d^%70@The.Host", "hTTP://Me:%4D^p@the.HOST:80/", 1);
+ compare_normalized_urls("https://@x.y/^", "httpS://x.y:443/^", 0);
+ compare_normalized_urls("https://@x.y/^", "httpS://@x.y:0443/^", 1);
+ compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1);
+ compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1);
+}
+
+int cmd_main(int argc UNUSED, const char **argv UNUSED)
+{
+ TEST(t_url_scheme(), "url scheme");
+ TEST(t_url_authority(), "url authority");
+ TEST(t_url_port(), "url port checks");
+ TEST(t_url_port_normalization(), "url port normalization");
+ TEST(t_url_general_escape(), "url general escapes");
+ TEST(t_url_high_bit(), "url high-bit escapes");
+ TEST(t_url_utf8_escape(), "url utf8 escapes");
+ TEST(t_url_username_pass(), "url username/password escapes");
+ TEST(t_url_length(), "url normalized lengths");
+ TEST(t_url_dots(), "url . and .. segments");
+ TEST(t_url_equivalents(), "url equivalents");
+ return test_done();
+}
Range-diff against v2:
1: cf47b17a33 ! 1: a73b89c8e0 t: migrate t0110-urlmatch-normalization to the new framework
@@ Commit message
pasted verbatim from the shellscript.
Another change is the removal of MINGW prerequisite from one of the
- test. It was there because[1] on Windows, the command line is a Unicode
- string, it is not possible to pass arbitrary bytes to a program. But
- in unit tests we don't have this limitation.
+ test. It was there because[1] on Windows, the command line is a
+ Unicode string, it is not possible to pass arbitrary bytes to a
+ program. But in unit tests we don't have this limitation.
- With the addition of this unit test, we impose a new restriction of
- running the unit tests from either 't/' or 't/unit-tests/bin/'
- directories. This is to construct the path to files which contain some
- input urls under the 't/t-urlmatch-normalization' directory. This
- restriction is similar to one we have for end-to-end tests, where they
- can be ran from only 't/'. Addition of 't/unit-tests/bin/' is to allow
- for running individual tests which is not currently possible via any
- 'make' targets and also 'unit-tests-test-tool' target is also ran from
- the 't/unit-tests/bin' directory.
+ And since we can construct strings with arbitrary bytes in C, let's
+ also remove the test files which contain URLs with arbitrary bytes in
+ the 't/t0110' directory and instead embed those URLs in the unit test
+ code itself.
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
@@ t/t0110-urlmatch-normalization.sh (deleted)
-
-test_done
+ ## t/t0110/README (deleted) ##
+@@
+-The url data files in this directory contain URLs with characters
+-in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
+-of unprintable characters.
+-
+-A select few characters in the 0x01-0x1f range are skipped to help
+-avoid problems running the test itself.
+-
+-The urls are in test files in this directory rather than being
+-embedded in the test script for portability.
+
+ ## t/t0110/url-1 (deleted) ##
+ Binary files t/t0110/url-1 and /dev/null differ
+
+ ## t/t0110/url-10 (deleted) ##
+ Binary files t/t0110/url-10 and /dev/null differ
+
+ ## t/t0110/url-11 (deleted) ##
+ Binary files t/t0110/url-11 and /dev/null differ
+
+ ## t/t0110/url-2 (deleted) ##
+ Binary files t/t0110/url-2 and /dev/null differ
+
+ ## t/t0110/url-3 (deleted) ##
+ Binary files t/t0110/url-3 and /dev/null differ
+
+ ## t/t0110/url-4 (deleted) ##
+ Binary files t/t0110/url-4 and /dev/null differ
+
+ ## t/t0110/url-5 (deleted) ##
+ Binary files t/t0110/url-5 and /dev/null differ
+
+ ## t/t0110/url-6 (deleted) ##
+ Binary files t/t0110/url-6 and /dev/null differ
+
+ ## t/t0110/url-7 (deleted) ##
+ Binary files t/t0110/url-7 and /dev/null differ
+
+ ## t/t0110/url-8 (deleted) ##
+ Binary files t/t0110/url-8 and /dev/null differ
+
+ ## t/t0110/url-9 (deleted) ##
+ Binary files t/t0110/url-9 and /dev/null differ
+
## t/unit-tests/t-urlmatch-normalization.c (new) ##
@@
+#include "test-lib.h"
+#include "urlmatch.h"
-+#include "strbuf.h"
+
+static void check_url_normalizable(const char *url, int normalizable)
+{
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+ free(url2_norm);
+}
+
-+static void check_normalized_url_from_file(const char *file, const char *expect)
-+{
-+ struct strbuf content = STRBUF_INIT, path = STRBUF_INIT;
-+ char *cwd_basename;
-+
-+ if (!check_int(strbuf_getcwd(&path), ==, 0))
-+ return;
-+
-+ cwd_basename = basename(path.buf);
-+ if (!check(!strcmp(cwd_basename, "t") || !strcmp(cwd_basename, "bin"))) {
-+ test_msg("BUG: unit-tests should be run from either 't/' or 't/unit-tests/bin' directory");
-+ return;
-+ }
-+
-+ strbuf_strip_suffix(&path, "/unit-tests/bin");
-+ strbuf_addf(&path, "/unit-tests/t-urlmatch-normalization/%s", file);
-+
-+ if (!check_int(strbuf_read_file(&content, path.buf, 0), >, 0)) {
-+ test_msg("failed to read from file '%s': %s", file, strerror(errno));
-+ } else {
-+ char *url_norm;
-+
-+ strbuf_trim_trailing_newline(&content);
-+ url_norm = url_normalize(content.buf, NULL);
-+ if (!check_str(url_norm, expect))
-+ test_msg("input file: %s", file);
-+ free(url_norm);
-+ }
-+
-+ strbuf_release(&content);
-+ strbuf_release(&path);
-+}
-+
+static void check_normalized_url_length(const char *url, size_t len)
+{
+ struct url_info info;
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+
+static void t_url_high_bit(void)
+{
-+ check_normalized_url_from_file("url-1",
-+ "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
-+ check_normalized_url_from_file("url-2",
-+ "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
-+ check_normalized_url_from_file("url-3",
-+ "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
-+ check_normalized_url_from_file("url-4",
-+ "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
-+ check_normalized_url_from_file("url-5",
-+ "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
-+ check_normalized_url_from_file("url-6",
-+ "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
-+ check_normalized_url_from_file("url-7",
-+ "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
-+ check_normalized_url_from_file("url-8",
-+ "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
-+ check_normalized_url_from_file("url-9",
-+ "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
-+ check_normalized_url_from_file("url-10",
-+ "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
++ check_normalized_url(
++ "x://q/\x01\x02\x03\x04\x05\x06\x07\x08\x0e\x0f\x10\x11\x12",
++ "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
++ check_normalized_url(
++ "x://q/\x13\x14\x15\x16\x17\x18\x19\x1b\x1c\x1d\x1e\x1f\x7f",
++ "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
++ check_normalized_url(
++ "x://q/\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f",
++ "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
++ check_normalized_url(
++ "x://q/\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f",
++ "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
++ check_normalized_url(
++ "x://q/\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf",
++ "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
++ check_normalized_url(
++ "x://q/\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf",
++ "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
++ check_normalized_url(
++ "x://q/\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf",
++ "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
++ check_normalized_url(
++ "x://q/\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf",
++ "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
++ check_normalized_url(
++ "x://q/\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef",
++ "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
++ check_normalized_url(
++ "x://q/\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff",
++ "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
+}
+
+static void t_url_utf8_escape(void)
+{
-+ check_normalized_url_from_file("url-11",
-+ "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
++ check_normalized_url(
++ "x://q/\xc2\x80\xdf\xbf\xe0\xa0\x80\xef\xbf\xbd\xf0\x90\x80\x80\xf0\xaf\xbf\xbd",
++ "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
+}
+
+static void t_url_username_pass(void)
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+ TEST(t_url_equivalents(), "url equivalents");
+ return test_done();
+}
-
- ## t/t0110/README => t/unit-tests/t-urlmatch-normalization/README ##
-
- ## t/t0110/url-1 => t/unit-tests/t-urlmatch-normalization/url-1 ##
-
- ## t/t0110/url-10 => t/unit-tests/t-urlmatch-normalization/url-10 ##
-
- ## t/t0110/url-11 => t/unit-tests/t-urlmatch-normalization/url-11 ##
-
- ## t/t0110/url-2 => t/unit-tests/t-urlmatch-normalization/url-2 ##
-
- ## t/t0110/url-3 => t/unit-tests/t-urlmatch-normalization/url-3 ##
-
- ## t/t0110/url-4 => t/unit-tests/t-urlmatch-normalization/url-4 ##
-
- ## t/t0110/url-5 => t/unit-tests/t-urlmatch-normalization/url-5 ##
-
- ## t/t0110/url-6 => t/unit-tests/t-urlmatch-normalization/url-6 ##
-
- ## t/t0110/url-7 => t/unit-tests/t-urlmatch-normalization/url-7 ##
-
- ## t/t0110/url-8 => t/unit-tests/t-urlmatch-normalization/url-8 ##
-
- ## t/t0110/url-9 => t/unit-tests/t-urlmatch-normalization/url-9 ##
--
2.46.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-13 19:22 ` Junio C Hamano
2024-08-14 1:35 ` Kaartic Sivaraam
@ 2024-08-14 14:24 ` Ghanshyam Thakkar
1 sibling, 0 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-08-14 14:24 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Patrick Steinhardt, Karthik Nayak, Phillip Wood,
Christian Couder, Christian Couder, Kaartic Sivaraam
Junio C Hamano <gitster@pobox.com> wrote:
> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
>
> > With the addition of this unit test, we impose a new restriction of
> > running the unit tests from either 't/' or 't/unit-tests/bin/'
> > directories. This is to construct the path to files which contain some
> > input urls under the 't/t-urlmatch-normalization' directory. This
> > restriction is similar to one we have for end-to-end tests, where they
> > can be ran from only 't/'.
> >
> > Addition of 't/unit-tests/bin/' is to allow
> > for running individual tests which is not currently possible via any
> > 'make' targets and also 'unit-tests-test-tool' target is also ran from
> > the 't/unit-tests/bin' directory.
>
> Sorry, but I do not quite follow. The above makes it sound as if
> the 'bin' subdirectory is something that never existed before this
> patch and this patch introduces the use of that directory, but that
> is hardly the case. What does that "Addition of" really refer to?
>
> Do you mean "we cannot run the tests from arbitrary places, and we
> allow them to be run from t/, just like the normal tests" followed
> by "in addition, we also allow them to be run from t/unit-tests/bin
> directory because ..."?
Yes, that is what I meant. I have sent a v3 that embeds the URLs from
those files in the code itself, so the restriction is no longer needed.
Link: https://lore.kernel.org/git/20240814142057.94671-1-shyamthakkar001@gmail.com/
Thanks.
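
For readers following along, here is a rough sketch of what embedding such
bytes in a C string literal looks like. The helper escape_unprintable() is
a hypothetical stand-in written only for this illustration; it is NOT
url_normalize(), whose rules are more involved. It only shows that bytes
written as "\x.." escapes reach the function intact and come out as
uppercase "%XX".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in, NOT git's url_normalize(): copy printable
 * ASCII (0x21-0x7e) as-is and write every other byte as an uppercase
 * "%XX". Returns the number of bytes written, excluding the final NUL.
 */
static size_t escape_unprintable(const char *in, char *out, size_t outsz)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t n = 0;

	assert(outsz > 0);
	for (; *in; in++) {
		/* plain char may be signed; work on the unsigned value */
		unsigned char c = (unsigned char)*in;

		if (c > 0x20 && c < 0x7f) {
			assert(n + 1 < outsz);
			out[n++] = (char)c;
		} else {
			assert(n + 3 < outsz);
			out[n++] = '%';
			out[n++] = hex[c >> 4];
			out[n++] = hex[c & 0x0f];
		}
	}
	out[n] = '\0';
	return n;
}

/*
 * A "\x" escape consumes every following hex digit, so a byte that is
 * followed by a literal hex digit needs the literal to be split, e.g.
 * "x://q/\x7f" "ab" rather than "x://q/\x7fab".
 */
```

Calling it with "x://q/\x01\x0e\x7f\x80\xff" and a 64-byte buffer should
leave "x://q/%01%0E%7F%80%FF" in the buffer. The URLs in the patch only
ever follow a "\x.." escape with another escape or the end of the string,
so they do not need the split shown in the last comment.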
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [GSoC][PATCH v3] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-14 14:20 ` [GSoC][PATCH v3] " Ghanshyam Thakkar
@ 2024-08-14 16:52 ` Junio C Hamano
2024-08-19 12:46 ` Christian Couder
2024-08-20 15:19 ` [GSoC][PATCH v4] " Ghanshyam Thakkar
2 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-08-14 16:52 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Karthik Nayak, Patrick Steinhardt, Christian Couder,
Christian Couder, Kaartic Sivaraam
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
> And since we can construct strings with arbitrary bytes in C, let's
> also remove the test files which contain URLs with arbitrary bytes in
> the 't/t0110' directory and instead embed those URLs in the unit test
> code itself.
Yeah, that was what I meant to say. As long as we do not have an
embedded NUL in these test pattern strings, "const char *" literals
are fine (and if we need to use embedded NUL, we'd need <ptr,len>).
Thanks. Will queue.
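
A small sketch of the <ptr,len> point, in case it helps. The struct and
macro names below are made up for this illustration and are not existing
git types; the idea is only that strlen() stops at the first NUL while an
explicit length keeps the bytes after it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical <ptr,len> pair; the name is illustrative only. */
struct byte_span {
	const char *ptr;
	size_t len;
};

/* sizeof() counts the literal's implicit trailing NUL, so subtract one. */
#define SPAN_LITERAL(s) { s, sizeof(s) - 1 }

/* Bounds-checked access to one byte of the span. */
static unsigned char span_byte(struct byte_span s, size_t i)
{
	assert(i < s.len);
	return (unsigned char)s.ptr[i];
}

/* Compare all bytes, including any that follow an embedded NUL. */
static int span_equal(struct byte_span a, struct byte_span b)
{
	return a.len == b.len && !memcmp(a.ptr, b.ptr, a.len);
}
```

With SPAN_LITERAL("x://q/a\0b") the span has length 9 while strlen() on
the same pointer reports 7, and two spans built from "a\0b" and "a\0c"
compare as different here even though strcmp() would call them equal.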
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [GSoC][PATCH v3] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-14 14:20 ` [GSoC][PATCH v3] " Ghanshyam Thakkar
2024-08-14 16:52 ` Junio C Hamano
@ 2024-08-19 12:46 ` Christian Couder
2024-08-20 15:19 ` [GSoC][PATCH v4] " Ghanshyam Thakkar
2 siblings, 0 replies; 23+ messages in thread
From: Christian Couder @ 2024-08-19 12:46 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Junio C Hamano, Karthik Nayak, Patrick Steinhardt,
Christian Couder, Kaartic Sivaraam
On Wed, Aug 14, 2024 at 4:21 PM Ghanshyam Thakkar
<shyamthakkar001@gmail.com> wrote:
>
> helper/test-urlmatch-normalization along with
> t0110-urlmatch-normalization test the `url_normalize()` function from
> 'urlmatch.h'. Migrate them to the unit testing framework for better
> performance. And also add different test_msg()s for better debugging.
>
> In the migration, last two of the checks from `t_url_general_escape()`
> were slightly changed compared to the shellscript. This involves changing
Nit: s/shellscript/shell script/
>
> '\'' -> '
> '\!' -> !
>
> in the urls of those checks. This is because in C strings, we don't
> need to escape "'" and "!". Other than these two, all the urls were
> pasted verbatim from the shellscript.
Nit: s/shellscript/shell script/
> Another change is the removal of MINGW prerequisite from one of the
Nit: s/of MINGW prerequisite/of a MINGW prerequisite/
> test. It was there because[1] on Windows, the command line is a
> Unicode string, it is not possible to pass arbitrary bytes to a
> program. But in unit tests we don't have this limitation.
>
> And since we can construct strings with arbitrary bytes in C, let's
> also remove the test files which contain URLs with arbitrary bytes in
> the 't/t0110' directory and instead embed those URLs in the unit test
> code itself.
>
> [1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
> ---
> This version addresses Junio's review and removes the restriction
> of running the unit tests in the 't/' and 't/unit-tests/bin'
> introduced in v2 by embedding the URLs in the code itself.
Nice change.
[...]
> +static void compare_normalized_urls(const char *url1, const char *url2,
> + size_t equal)
Nit: it's better to use 'unsigned int' or just 'int' for bool flags
like "equal". Or is there a reason to use 'size_t' instead?
> +{
> + char *url1_norm = url_normalize(url1, NULL);
> + char *url2_norm = url_normalize(url2, NULL);
> +
> + if (equal) {
> + if (!check_str(url1_norm, url2_norm))
> + test_msg("input url1: %s\n input url2: %s", url1,
> + url2);
> + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
> + test_msg(" url1_norm: %s\n url2_norm: %s\n"
> + " input url1: %s\n input url2: %s",
> + url1_norm, url2_norm, url1, url2);
Nit: something like "normalized url1" might be a bit better than
"url1_norm" as for the 'url1' variable we use "input url1" instead of
just "url1".
> + }
> + free(url1_norm);
> + free(url2_norm);
> +}
> +
> +static void check_normalized_url_length(const char *url, size_t len)
> +{
> + struct url_info info;
> + char *url_norm = url_normalize(url, &info);
> +
> + if (!check_int(info.url_len, ==, len))
> + test_msg(" input url: %s\n normalized url: %s", url,
> + url_norm);
Above "normalized url" is used for "url_norm" which is good.
> + free(url_norm);
> +}
> +
> +/* Note that only file: URLs should be allowed without a host */
Nit: maybe s/file:/"file:"/ would make things a bit clearer.
[...]
> +/*
> + * http://@foo specifies an empty user name but does not specify a password
> + * http://foo specifies neither a user name nor a password
> + * So they should not be equivalent
> + */
Nit: the above comment would be a bit better with URLs inside double
quotes, with a full stop (period) at the end of each sentence and with
only one space character between "http://foo" and "specifies".
Except for the above nits, I am happy with this version. Thanks.
^ permalink raw reply [flat|nested] 23+ messages in thread
* [GSoC][PATCH v4] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-14 14:20 ` [GSoC][PATCH v3] " Ghanshyam Thakkar
2024-08-14 16:52 ` Junio C Hamano
2024-08-19 12:46 ` Christian Couder
@ 2024-08-20 15:19 ` Ghanshyam Thakkar
2024-08-20 15:24 ` Ghanshyam Thakkar
2024-08-21 10:06 ` Christian Couder
2 siblings, 2 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-08-20 15:19 UTC (permalink / raw)
To: git
Cc: Ghanshyam Thakkar, Junio C Hamano, Karthik Nayak,
Patrick Steinhardt, Christian Couder, Christian Couder,
Kaartic Sivaraam
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.
In the migration, the last two checks from `t_url_general_escape()`
were changed slightly compared to the shell script. This involves
changing
'\'' -> '
'\!' -> !
in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shell script.
Another change is the removal of a MINGW prerequisite from one of the
tests. It was there because[1] on Windows the command line is a
Unicode string, so it is not possible to pass arbitrary bytes to a
program. Unit tests do not have this limitation.
And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
Makefile | 2 +-
t/helper/test-tool.c | 1 -
t/helper/test-tool.h | 1 -
t/helper/test-urlmatch-normalization.c | 56 -----
t/t0110-urlmatch-normalization.sh | 182 ----------------
t/t0110/README | 9 -
t/t0110/url-1 | Bin 20 -> 0 bytes
t/t0110/url-10 | Bin 23 -> 0 bytes
t/t0110/url-11 | Bin 25 -> 0 bytes
t/t0110/url-2 | Bin 20 -> 0 bytes
t/t0110/url-3 | Bin 23 -> 0 bytes
t/t0110/url-4 | Bin 23 -> 0 bytes
t/t0110/url-5 | Bin 23 -> 0 bytes
t/t0110/url-6 | Bin 23 -> 0 bytes
t/t0110/url-7 | Bin 23 -> 0 bytes
t/t0110/url-8 | Bin 23 -> 0 bytes
t/t0110/url-9 | Bin 23 -> 0 bytes
t/unit-tests/t-urlmatch-normalization.c | 271 ++++++++++++++++++++++++
18 files changed, 272 insertions(+), 250 deletions(-)
delete mode 100644 t/helper/test-urlmatch-normalization.c
delete mode 100755 t/t0110-urlmatch-normalization.sh
delete mode 100644 t/t0110/README
delete mode 100644 t/t0110/url-1
delete mode 100644 t/t0110/url-10
delete mode 100644 t/t0110/url-11
delete mode 100644 t/t0110/url-2
delete mode 100644 t/t0110/url-3
delete mode 100644 t/t0110/url-4
delete mode 100644 t/t0110/url-5
delete mode 100644 t/t0110/url-6
delete mode 100644 t/t0110/url-7
delete mode 100644 t/t0110/url-8
delete mode 100644 t/t0110/url-9
create mode 100644 t/unit-tests/t-urlmatch-normalization.c
diff --git a/Makefile b/Makefile
index 3863e60b66..d7bc19e823 100644
--- a/Makefile
+++ b/Makefile
@@ -843,7 +843,6 @@ TEST_BUILTINS_OBJS += test-submodule.o
TEST_BUILTINS_OBJS += test-subprocess.o
TEST_BUILTINS_OBJS += test-trace2.o
TEST_BUILTINS_OBJS += test-truncate.o
-TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
TEST_BUILTINS_OBJS += test-userdiff.o
TEST_BUILTINS_OBJS += test-wildmatch.o
TEST_BUILTINS_OBJS += test-windows-named-pipe.o
@@ -1346,6 +1345,7 @@ UNIT_TEST_PROGRAMS += t-strbuf
UNIT_TEST_PROGRAMS += t-strcmp-offset
UNIT_TEST_PROGRAMS += t-strvec
UNIT_TEST_PROGRAMS += t-trailer
+UNIT_TEST_PROGRAMS += t-urlmatch-normalization
UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index da3e69128a..f8a67df7de 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -83,7 +83,6 @@ static struct test_cmd cmds[] = {
{ "trace2", cmd__trace2 },
{ "truncate", cmd__truncate },
{ "userdiff", cmd__userdiff },
- { "urlmatch-normalization", cmd__urlmatch_normalization },
{ "xml-encode", cmd__xml_encode },
{ "wildmatch", cmd__wildmatch },
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 642a34578c..e74bc0ffd4 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -76,7 +76,6 @@ int cmd__subprocess(int argc, const char **argv);
int cmd__trace2(int argc, const char **argv);
int cmd__truncate(int argc, const char **argv);
int cmd__userdiff(int argc, const char **argv);
-int cmd__urlmatch_normalization(int argc, const char **argv);
int cmd__xml_encode(int argc, const char **argv);
int cmd__wildmatch(int argc, const char **argv);
#ifdef GIT_WINDOWS_NATIVE
diff --git a/t/helper/test-urlmatch-normalization.c b/t/helper/test-urlmatch-normalization.c
deleted file mode 100644
index 86edd454f5..0000000000
--- a/t/helper/test-urlmatch-normalization.c
+++ /dev/null
@@ -1,56 +0,0 @@
-#include "test-tool.h"
-#include "git-compat-util.h"
-#include "urlmatch.h"
-
-int cmd__urlmatch_normalization(int argc, const char **argv)
-{
- const char usage[] = "test-tool urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
- char *url1 = NULL, *url2 = NULL;
- int opt_p = 0, opt_l = 0;
- int ret = 0;
-
- /*
- * For one url, succeed if url_normalize succeeds on it, fail otherwise.
- * For two urls, succeed only if url_normalize succeeds on both and
- * the results compare equal with strcmp. If -p is given (one url only)
- * and url_normalize succeeds, print the result followed by "\n". If
- * -l is given (one url only) and url_normalize succeeds, print the
- * returned length in decimal followed by "\n".
- */
-
- if (argc > 1 && !strcmp(argv[1], "-p")) {
- opt_p = 1;
- argc--;
- argv++;
- } else if (argc > 1 && !strcmp(argv[1], "-l")) {
- opt_l = 1;
- argc--;
- argv++;
- }
-
- if (argc < 2 || argc > 3)
- die("%s", usage);
-
- if (argc == 2) {
- struct url_info info;
- url1 = url_normalize(argv[1], &info);
- if (!url1)
- return 1;
- if (opt_p)
- printf("%s\n", url1);
- if (opt_l)
- printf("%u\n", (unsigned)info.url_len);
- goto cleanup;
- }
-
- if (opt_p || opt_l)
- die("%s", usage);
-
- url1 = url_normalize(argv[1], NULL);
- url2 = url_normalize(argv[2], NULL);
- ret = (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
-cleanup:
- free(url1);
- free(url2);
- return ret;
-}
diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
deleted file mode 100755
index 12d817fbd3..0000000000
--- a/t/t0110-urlmatch-normalization.sh
+++ /dev/null
@@ -1,182 +0,0 @@
-#!/bin/sh
-
-test_description='urlmatch URL normalization'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-# The base name of the test url files
-tu="$TEST_DIRECTORY/t0110/url"
-
-# Note that only file: URLs should be allowed without a host
-
-test_expect_success 'url scheme' '
- ! test-tool urlmatch-normalization "" &&
- ! test-tool urlmatch-normalization "_" &&
- ! test-tool urlmatch-normalization "scheme" &&
- ! test-tool urlmatch-normalization "scheme:" &&
- ! test-tool urlmatch-normalization "scheme:/" &&
- ! test-tool urlmatch-normalization "scheme://" &&
- ! test-tool urlmatch-normalization "file" &&
- ! test-tool urlmatch-normalization "file:" &&
- ! test-tool urlmatch-normalization "file:/" &&
- test-tool urlmatch-normalization "file://" &&
- ! test-tool urlmatch-normalization "://acme.co" &&
- ! test-tool urlmatch-normalization "x_test://acme.co" &&
- ! test-tool urlmatch-normalization "-test://acme.co" &&
- ! test-tool urlmatch-normalization "0test://acme.co" &&
- ! test-tool urlmatch-normalization "+test://acme.co" &&
- ! test-tool urlmatch-normalization ".test://acme.co" &&
- ! test-tool urlmatch-normalization "schem%6e://" &&
- test-tool urlmatch-normalization "x-Test+v1.0://acme.co" &&
- test "$(test-tool urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
-'
-
-test_expect_success 'url authority' '
- ! test-tool urlmatch-normalization "scheme://user:pass@" &&
- ! test-tool urlmatch-normalization "scheme://?" &&
- ! test-tool urlmatch-normalization "scheme://#" &&
- ! test-tool urlmatch-normalization "scheme:///" &&
- ! test-tool urlmatch-normalization "scheme://:" &&
- ! test-tool urlmatch-normalization "scheme://:555" &&
- test-tool urlmatch-normalization "file://user:pass@" &&
- test-tool urlmatch-normalization "file://?" &&
- test-tool urlmatch-normalization "file://#" &&
- test-tool urlmatch-normalization "file:///" &&
- test-tool urlmatch-normalization "file://:" &&
- ! test-tool urlmatch-normalization "file://:555" &&
- test-tool urlmatch-normalization "scheme://user:pass@host" &&
- test-tool urlmatch-normalization "scheme://@host" &&
- test-tool urlmatch-normalization "scheme://%00@host" &&
- ! test-tool urlmatch-normalization "scheme://%%@host" &&
- test-tool urlmatch-normalization "scheme://host_" &&
- test-tool urlmatch-normalization "scheme://user:pass@host/" &&
- test-tool urlmatch-normalization "scheme://@host/" &&
- test-tool urlmatch-normalization "scheme://host/" &&
- test-tool urlmatch-normalization "scheme://host?x" &&
- test-tool urlmatch-normalization "scheme://host#x" &&
- test-tool urlmatch-normalization "scheme://host/@" &&
- test-tool urlmatch-normalization "scheme://host?@x" &&
- test-tool urlmatch-normalization "scheme://host#@x" &&
- test-tool urlmatch-normalization "scheme://[::1]" &&
- test-tool urlmatch-normalization "scheme://[::1]/" &&
- ! test-tool urlmatch-normalization "scheme://hos%41/" &&
- test-tool urlmatch-normalization "scheme://[invalid....:/" &&
- test-tool urlmatch-normalization "scheme://invalid....:]/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:[/" &&
- ! test-tool urlmatch-normalization "scheme://invalid....:["
-'
-
-test_expect_success 'url port checks' '
- test-tool urlmatch-normalization "xyz://q@some.host:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:0000000" &&
- test-tool urlmatch-normalization "xyz://q@some.host:0000001?" &&
- test-tool urlmatch-normalization "xyz://q@some.host:065535#" &&
- test-tool urlmatch-normalization "xyz://q@some.host:65535" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:65536" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:99999" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100000" &&
- ! test-tool urlmatch-normalization "xyz://q@some.host:100001" &&
- test-tool urlmatch-normalization "http://q@some.host:80" &&
- test-tool urlmatch-normalization "https://q@some.host:443" &&
- test-tool urlmatch-normalization "http://q@some.host:80/" &&
- test-tool urlmatch-normalization "https://q@some.host:443?" &&
- ! test-tool urlmatch-normalization "http://q@:8008" &&
- ! test-tool urlmatch-normalization "http://:8080" &&
- ! test-tool urlmatch-normalization "http://:" &&
- test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:456/" &&
- test-tool urlmatch-normalization "xyz://[::1]:/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:000/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0%300/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:0x80/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:4294967297/" &&
- ! test-tool urlmatch-normalization "xyz://[::1]:030f/"
-'
-
-test_expect_success 'url port normalization' '
- test "$(test-tool urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:80")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
- test "$(test-tool urlmatch-normalization -p "https://x:000000443")" = "https://x/"
-'
-
-test_expect_success 'url general escapes' '
- ! test-tool urlmatch-normalization "http://x.y?%fg" &&
- test "$(test-tool urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
- test "$(test-tool urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
- test "$(test-tool urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
- test "$(test-tool urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
- test "$(test-tool urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
-'
-
-test_expect_success !MINGW 'url high-bit escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF"
-'
-
-test_expect_success 'url utf-8 escapes' '
- test "$(test-tool urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
-'
-
-test_expect_success 'url username/password escapes' '
- test "$(test-tool urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
-'
-
-test_expect_success 'url normalized lengths' '
- test "$(test-tool urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
- test "$(test-tool urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
- test "$(test-tool urlmatch-normalization -l "http://@x.y/^")" = 15
-'
-
-test_expect_success 'url . and .. segments' '
- test "$(test-tool urlmatch-normalization -p "x://y/.")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
- ! test-tool urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
- test "$(test-tool urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
-'
-
-# http://@foo specifies an empty user name but does not specify a password
-# http://foo specifies neither a user name nor a password
-# So they should not be equivalent
-test_expect_success 'url equivalents' '
- test-tool urlmatch-normalization "httP://x" "Http://X/" &&
- test-tool urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
- ! test-tool urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
- test-tool urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
- test-tool urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
-'
-
-test_done
diff --git a/t/t0110/README b/t/t0110/README
deleted file mode 100644
index ad4a50ecd8..0000000000
--- a/t/t0110/README
+++ /dev/null
@@ -1,9 +0,0 @@
-The url data files in this directory contain URLs with characters
-in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
-of unprintable characters.
-
-A select few characters in the 0x01-0x1f range are skipped to help
-avoid problems running the test itself.
-
-The urls are in test files in this directory rather than being
-embedded in the test script for portability.
diff --git a/t/t0110/url-1 b/t/t0110/url-1
deleted file mode 100644
index 519019c5ce6c58478f048a2f39e2321370d318c6..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt
diff --git a/t/t0110/url-10 b/t/t0110/url-10
deleted file mode 100644
index b9965de6a5d74b122179821212b2c27c8ae03e80..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)
diff --git a/t/t0110/url-11 b/t/t0110/url-11
deleted file mode 100644
index f0a50f10096a20d597f40c775f09a71276e0050a..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5
diff --git a/t/t0110/url-2 b/t/t0110/url-2
deleted file mode 100644
index 43334b05b2de3794d6020abd96e634a4e9e49cb0..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx
diff --git a/t/t0110/url-3 b/t/t0110/url-3
deleted file mode 100644
index 7378c7bec247b996bc67b00a05ed89cf47d4b7a7..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b
diff --git a/t/t0110/url-4 b/t/t0110/url-4
deleted file mode 100644
index 220b198c97f942fea4960f51a2105cc42261061a..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!
diff --git a/t/t0110/url-5 b/t/t0110/url-5
deleted file mode 100644
index 1ccd9277792840955bb124bdde21f4b08bcccb63..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#
diff --git a/t/t0110/url-6 b/t/t0110/url-6
deleted file mode 100644
index e8283aac6dff049d3e02454db6e684c5790a5996..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$
diff --git a/t/t0110/url-7 b/t/t0110/url-7
deleted file mode 100644
index fa7c10b615259deefd15b638b021da7c60eba1b2..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%
diff --git a/t/t0110/url-8 b/t/t0110/url-8
deleted file mode 100644
index 79a0ba836f5b8886b0a73f161eb292af2b105e65..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&
diff --git a/t/t0110/url-9 b/t/t0110/url-9
deleted file mode 100644
index 8b44bec48b94467c63e8e1ad18162e465da6d6dd..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(
diff --git a/t/unit-tests/t-urlmatch-normalization.c b/t/unit-tests/t-urlmatch-normalization.c
new file mode 100644
index 0000000000..1769c357b9
--- /dev/null
+++ b/t/unit-tests/t-urlmatch-normalization.c
@@ -0,0 +1,271 @@
+#include "test-lib.h"
+#include "urlmatch.h"
+
+static void check_url_normalizable(const char *url, unsigned int normalizable)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_int(normalizable, ==, url_norm ? 1 : 0))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void check_normalized_url(const char *url, const char *expect)
+{
+ char *url_norm = url_normalize(url, NULL);
+
+ if (!check_str(url_norm, expect))
+ test_msg("input url: %s", url);
+ free(url_norm);
+}
+
+static void compare_normalized_urls(const char *url1, const char *url2,
+ unsigned int equal)
+{
+ char *url1_norm = url_normalize(url1, NULL);
+ char *url2_norm = url_normalize(url2, NULL);
+
+ if (equal) {
+ if (!check_str(url1_norm, url2_norm))
+ test_msg("input url1: %s\n input url2: %s", url1,
+ url2);
+ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
+ test_msg(" normalized url1: %s\n normalized url2: %s\n"
+ " input url1: %s\n input url2: %s",
+ url1_norm, url2_norm, url1, url2);
+ }
+ free(url1_norm);
+ free(url2_norm);
+}
+
+static void check_normalized_url_length(const char *url, size_t len)
+{
+ struct url_info info;
+ char *url_norm = url_normalize(url, &info);
+
+ if (!check_int(info.url_len, ==, len))
+ test_msg(" input url: %s\n normalized url: %s", url,
+ url_norm);
+ free(url_norm);
+}
+
+/* Note that only "file:" URLs should be allowed without a host */
+static void t_url_scheme(void)
+{
+ check_url_normalizable("", 0);
+ check_url_normalizable("_", 0);
+ check_url_normalizable("scheme", 0);
+ check_url_normalizable("scheme:", 0);
+ check_url_normalizable("scheme:/", 0);
+ check_url_normalizable("scheme://", 0);
+ check_url_normalizable("file", 0);
+ check_url_normalizable("file:", 0);
+ check_url_normalizable("file:/", 0);
+ check_url_normalizable("file://", 1);
+ check_url_normalizable("://acme.co", 0);
+ check_url_normalizable("x_test://acme.co", 0);
+ check_url_normalizable("-test://acme.co", 0);
+ check_url_normalizable("0test://acme.co", 0);
+ check_url_normalizable("+test://acme.co", 0);
+ check_url_normalizable(".test://acme.co", 0);
+ check_url_normalizable("schem%6e://", 0);
+ check_url_normalizable("x-Test+v1.0://acme.co", 1);
+ check_normalized_url("AbCdeF://x.Y", "abcdef://x.y/");
+}
+
+static void t_url_authority(void)
+{
+ check_url_normalizable("scheme://user:pass@", 0);
+ check_url_normalizable("scheme://?", 0);
+ check_url_normalizable("scheme://#", 0);
+ check_url_normalizable("scheme:///", 0);
+ check_url_normalizable("scheme://:", 0);
+ check_url_normalizable("scheme://:555", 0);
+ check_url_normalizable("file://user:pass@", 1);
+ check_url_normalizable("file://?", 1);
+ check_url_normalizable("file://#", 1);
+ check_url_normalizable("file:///", 1);
+ check_url_normalizable("file://:", 1);
+ check_url_normalizable("file://:555", 0);
+ check_url_normalizable("scheme://user:pass@host", 1);
+ check_url_normalizable("scheme://@host", 1);
+ check_url_normalizable("scheme://%00@host", 1);
+ check_url_normalizable("scheme://%%@host", 0);
+ check_url_normalizable("scheme://host_", 1);
+ check_url_normalizable("scheme://user:pass@host/", 1);
+ check_url_normalizable("scheme://@host/", 1);
+ check_url_normalizable("scheme://host/", 1);
+ check_url_normalizable("scheme://host?x", 1);
+ check_url_normalizable("scheme://host#x", 1);
+ check_url_normalizable("scheme://host/@", 1);
+ check_url_normalizable("scheme://host?@x", 1);
+ check_url_normalizable("scheme://host#@x", 1);
+ check_url_normalizable("scheme://[::1]", 1);
+ check_url_normalizable("scheme://[::1]/", 1);
+ check_url_normalizable("scheme://hos%41/", 0);
+ check_url_normalizable("scheme://[invalid....:/", 1);
+ check_url_normalizable("scheme://invalid....:]/", 1);
+ check_url_normalizable("scheme://invalid....:[/", 0);
+ check_url_normalizable("scheme://invalid....:[", 0);
+}
+
+static void t_url_port(void)
+{
+ check_url_normalizable("xyz://q@some.host:", 1);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://q@some.host:0", 0);
+ check_url_normalizable("xyz://q@some.host:0000000", 0);
+ check_url_normalizable("xyz://q@some.host:0000001?", 1);
+ check_url_normalizable("xyz://q@some.host:065535#", 1);
+ check_url_normalizable("xyz://q@some.host:65535", 1);
+ check_url_normalizable("xyz://q@some.host:65536", 0);
+ check_url_normalizable("xyz://q@some.host:99999", 0);
+ check_url_normalizable("xyz://q@some.host:100000", 0);
+ check_url_normalizable("xyz://q@some.host:100001", 0);
+ check_url_normalizable("http://q@some.host:80", 1);
+ check_url_normalizable("https://q@some.host:443", 1);
+ check_url_normalizable("http://q@some.host:80/", 1);
+ check_url_normalizable("https://q@some.host:443?", 1);
+ check_url_normalizable("http://q@:8008", 0);
+ check_url_normalizable("http://:8080", 0);
+ check_url_normalizable("http://:", 0);
+ check_url_normalizable("xyz://q@some.host:456/", 1);
+ check_url_normalizable("xyz://[::1]:456/", 1);
+ check_url_normalizable("xyz://[::1]:/", 1);
+ check_url_normalizable("xyz://[::1]:000/", 0);
+ check_url_normalizable("xyz://[::1]:0%300/", 0);
+ check_url_normalizable("xyz://[::1]:0x80/", 0);
+ check_url_normalizable("xyz://[::1]:4294967297/", 0);
+ check_url_normalizable("xyz://[::1]:030f/", 0);
+}
+
+static void t_url_port_normalization(void)
+{
+ check_normalized_url("http://x:800", "http://x:800/");
+ check_normalized_url("http://x:0800", "http://x:800/");
+ check_normalized_url("http://x:00000800", "http://x:800/");
+ check_normalized_url("http://x:065535", "http://x:65535/");
+ check_normalized_url("http://x:1", "http://x:1/");
+ check_normalized_url("http://x:80", "http://x/");
+ check_normalized_url("http://x:080", "http://x/");
+ check_normalized_url("http://x:000000080", "http://x/");
+ check_normalized_url("https://x:443", "https://x/");
+ check_normalized_url("https://x:0443", "https://x/");
+ check_normalized_url("https://x:000000443", "https://x/");
+}
+
+static void t_url_general_escape(void)
+{
+ check_url_normalizable("http://x.y?%fg", 0);
+ check_normalized_url("X://W/%7e%41^%3a", "x://w/~A%5E%3A");
+ check_normalized_url("X://W/:/?#[]@", "x://w/:/?#[]@");
+ check_normalized_url("X://W/$&()*+,;=", "x://w/$&()*+,;=");
+ check_normalized_url("X://W/'", "x://w/'");
+ check_normalized_url("X://W?!", "x://w/?!");
+}
+
+static void t_url_high_bit(void)
+{
+ check_normalized_url(
+ "x://q/\x01\x02\x03\x04\x05\x06\x07\x08\x0e\x0f\x10\x11\x12",
+ "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
+ check_normalized_url(
+ "x://q/\x13\x14\x15\x16\x17\x18\x19\x1b\x1c\x1d\x1e\x1f\x7f",
+ "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
+ check_normalized_url(
+ "x://q/\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f",
+ "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
+ check_normalized_url(
+ "x://q/\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f",
+ "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
+ check_normalized_url(
+ "x://q/\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf",
+ "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
+ check_normalized_url(
+ "x://q/\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf",
+ "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
+ check_normalized_url(
+ "x://q/\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf",
+ "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
+ check_normalized_url(
+ "x://q/\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf",
+ "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
+ check_normalized_url(
+ "x://q/\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef",
+ "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
+ check_normalized_url(
+ "x://q/\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff",
+ "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
+}
+
+static void t_url_utf8_escape(void)
+{
+ check_normalized_url(
+ "x://q/\xc2\x80\xdf\xbf\xe0\xa0\x80\xef\xbf\xbd\xf0\x90\x80\x80\xf0\xaf\xbf\xbd",
+ "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
+}
+
+static void t_url_username_pass(void)
+{
+ check_normalized_url("x://%41%62(^):%70+d@foo", "x://Ab(%5E):p+d@foo/");
+}
+
+static void t_url_length(void)
+{
+ check_normalized_url_length("Http://%4d%65:%4d^%70@The.Host", 25);
+ check_normalized_url_length("http://%41:%42@x.y/%61/", 17);
+ check_normalized_url_length("http://@x.y/^", 15);
+}
+
+static void t_url_dots(void)
+{
+ check_normalized_url("x://y/.", "x://y/");
+ check_normalized_url("x://y/./", "x://y/");
+ check_normalized_url("x://y/a/.", "x://y/a");
+ check_normalized_url("x://y/a/./", "x://y/a/");
+ check_normalized_url("x://y/.?", "x://y/?");
+ check_normalized_url("x://y/./?", "x://y/?");
+ check_normalized_url("x://y/a/.?", "x://y/a?");
+ check_normalized_url("x://y/a/./?", "x://y/a/?");
+ check_normalized_url("x://y/a/./b/.././../c", "x://y/c");
+ check_normalized_url("x://y/a/./b/../.././c/", "x://y/c/");
+ check_normalized_url("x://y/a/./b/.././../c/././.././.", "x://y/");
+ check_url_normalizable("x://y/a/./b/.././../c/././.././..", 0);
+ check_normalized_url("x://y/a/./?/././..", "x://y/a/?/././..");
+ check_normalized_url("x://y/%2e/", "x://y/");
+ check_normalized_url("x://y/%2E/", "x://y/");
+ check_normalized_url("x://y/a/%2e./", "x://y/");
+ check_normalized_url("x://y/b/.%2E/", "x://y/");
+ check_normalized_url("x://y/c/%2e%2E/", "x://y/");
+}
+
+/*
+ * "http://@foo" specifies an empty user name but does not specify a password.
+ * "http://foo" specifies neither a user name nor a password.
+ * So they should not be equivalent.
+ */
+static void t_url_equivalents(void)
+{
+ compare_normalized_urls("httP://x", "Http://X/", 1);
+ compare_normalized_urls("Http://%4d%65:%4d^%70@The.Host", "hTTP://Me:%4D^p@the.HOST:80/", 1);
+ compare_normalized_urls("https://@x.y/^", "httpS://x.y:443/^", 0);
+ compare_normalized_urls("https://@x.y/^", "httpS://@x.y:0443/^", 1);
+ compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1);
+ compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1);
+}
+
+int cmd_main(int argc UNUSED, const char **argv UNUSED)
+{
+ TEST(t_url_scheme(), "url scheme");
+ TEST(t_url_authority(), "url authority");
+ TEST(t_url_port(), "url port checks");
+ TEST(t_url_port_normalization(), "url port normalization");
+ TEST(t_url_general_escape(), "url general escapes");
+ TEST(t_url_high_bit(), "url high-bit escapes");
+ TEST(t_url_utf8_escape(), "url utf8 escapes");
+ TEST(t_url_username_pass(), "url username/password escapes");
+ TEST(t_url_length(), "url normalized lengths");
+ TEST(t_url_dots(), "url . and .. segments");
+ TEST(t_url_equivalents(), "url equivalents");
+ return test_done();
+}
Range-diff against v3:
1: a73b89c8e0 ! 1: ef25954bf8 t: migrate t0110-urlmatch-normalization to the new framework
@@ Commit message
performance. And also add different test_msg()s for better debugging.
In the migration, last two of the checks from `t_url_general_escape()`
- were slightly changed compared to the shellscript. This involves changing
+ were slightly changed compared to the shell script. This involves
+ changing
'\'' -> '
'\!' -> !
in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
- pasted verbatim from the shellscript.
+ pasted verbatim from the shell script.
- Another change is the removal of MINGW prerequisite from one of the
+ Another change is the removal of a MINGW prerequisite from one of the
test. It was there because[1] on Windows, the command line is a
Unicode string, it is not possible to pass arbitrary bytes to a
program. But in unit tests we don't have this limitation.
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+#include "test-lib.h"
+#include "urlmatch.h"
+
-+static void check_url_normalizable(const char *url, int normalizable)
++static void check_url_normalizable(const char *url, unsigned int normalizable)
+{
+ char *url_norm = url_normalize(url, NULL);
+
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+}
+
+static void compare_normalized_urls(const char *url1, const char *url2,
-+ size_t equal)
++ unsigned int equal)
+{
+ char *url1_norm = url_normalize(url1, NULL);
+ char *url2_norm = url_normalize(url2, NULL);
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+ test_msg("input url1: %s\n input url2: %s", url1,
+ url2);
+ } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
-+ test_msg(" url1_norm: %s\n url2_norm: %s\n"
++ test_msg(" normalized url1: %s\n normalized url2: %s\n"
+ " input url1: %s\n input url2: %s",
+ url1_norm, url2_norm, url1, url2);
+ }
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+ free(url_norm);
+}
+
-+/* Note that only file: URLs should be allowed without a host */
++/* Note that only "file:" URLs should be allowed without a host */
+static void t_url_scheme(void)
+{
+ check_url_normalizable("", 0);
@@ t/unit-tests/t-urlmatch-normalization.c (new)
+}
+
+/*
-+ * http://@foo specifies an empty user name but does not specify a password
-+ * http://foo specifies neither a user name nor a password
-+ * So they should not be equivalent
++ * "http://@foo" specifies an empty user name but does not specify a password.
++ * "http://foo" specifies neither a user name nor a password.
++ * So they should not be equivalent.
+ */
+static void t_url_equivalents(void)
+{
--
2.46.0
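
One detail in compare_normalized_urls() may be worth a second look. In the
"not equal" branch the results of url_normalize() are handed straight to
strcmp(), and url_normalize() returns NULL for input it cannot normalize
(check_url_normalizable() relies on exactly that). None of the current
callers pass such input, so this is only a sketch of a NULL-tolerant
comparison; strcmp_null_safe() is a hypothetical name and is not part of
git or of test-lib:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a git API): compare two strings where either
 * one may be NULL. Two NULLs compare equal, and NULL orders before any
 * non-NULL string. Otherwise the result is that of strcmp().
 */
int strcmp_null_safe(const char *a, const char *b)
{
	if (!a || !b)
		return (a != NULL) - (b != NULL);
	return strcmp(a, b);
}
```

Whether two failed normalizations should count as "equal" is a design
choice; a test helper might prefer to report a failure outright when
either side is NULL.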
* Re: [GSoC][PATCH v4] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-20 15:19 ` [GSoC][PATCH v4] " Ghanshyam Thakkar
@ 2024-08-20 15:24 ` Ghanshyam Thakkar
2024-08-21 10:06 ` Christian Couder
1 sibling, 0 replies; 23+ messages in thread
From: Ghanshyam Thakkar @ 2024-08-20 15:24 UTC (permalink / raw)
To: Ghanshyam Thakkar, git
Cc: Junio C Hamano, Karthik Nayak, Patrick Steinhardt,
Christian Couder, Christian Couder, Kaartic Sivaraam
Ghanshyam Thakkar <shyamthakkar001@gmail.com> wrote:
> helper/test-urlmatch-normalization along with
> t0110-urlmatch-normalization test the `url_normalize()` function from
> 'urlmatch.h'. Migrate them to the unit testing framework for better
> performance. And also add different test_msg()s for better debugging.
>
> In the migration, the last two checks in `t_url_general_escape()`
> were slightly changed compared to the shell script. This involves
> changing
>
> '\'' -> '
> '\!' -> !
>
> in the urls of those checks. This is because in C strings, we don't
> need to escape "'" and "!". Other than these two, all the urls were
> pasted verbatim from the shell script.
>
> Another change is the removal of a MINGW prerequisite from one of the
> tests. It was there because[1] on Windows, the command line is a
> Unicode string, so it is not possible to pass arbitrary bytes to a
> program. But in unit tests we don't have this limitation.
>
> And since we can construct strings with arbitrary bytes in C, let's
> also remove the test files which contain URLs with arbitrary bytes in
> the 't/t0110' directory and instead embed those URLs in the unit test
> code itself.
>
> [1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
> ---
Changes in v4:
- fix typos and grammatical inconsistencies in the commit
message, test_msg()s and comments.
- change int and size_t to unsigned int, wherever applicable.
Thanks.
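
As a rough illustration of why the data files are no longer needed: a C
string literal can carry control and high-bit bytes directly through \x
escapes, so each input URL can sit right next to its expected form. The
sketch below is self-contained and does not call url_normalize().
escape_unprintable() is a hypothetical stand-in that only mimics the
uppercase %XX escaping the high-bit tests expect, under the assumption
that every byte below 0x20 or at or above 0x7f is escaped and everything
else is copied through:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, NOT git's url_normalize(): percent-encode every
 * byte that is a control character, DEL, or has the high bit set, using
 * uppercase hex digits. All other bytes are copied unchanged.
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * allocation fails.
 */
char *escape_unprintable(const char *in)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t len, i;
	char *out, *p;

	assert(in != NULL);
	len = strlen(in);
	out = malloc(len * 3 + 1); /* worst case: every byte becomes %XX */
	if (!out)
		return NULL;

	p = out;
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)in[i];

		if (c < 0x20 || c >= 0x7f) {
			*p++ = '%';
			*p++ = hex[c >> 4];
			*p++ = hex[c & 0x0f];
		} else {
			*p++ = (char)c;
		}
	}
	*p = '\0';
	return out;
}
```

One thing to keep in mind with such literals is that a \x escape consumes
every following hex digit, so a byte that is followed by a literal hex
digit needs the string split into two adjacent literals.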
> Makefile | 2 +-
> t/helper/test-tool.c | 1 -
> t/helper/test-tool.h | 1 -
> t/helper/test-urlmatch-normalization.c | 56 -----
> t/t0110-urlmatch-normalization.sh | 182 ----------------
> t/t0110/README | 9 -
> t/t0110/url-1 | Bin 20 -> 0 bytes
> t/t0110/url-10 | Bin 23 -> 0 bytes
> t/t0110/url-11 | Bin 25 -> 0 bytes
> t/t0110/url-2 | Bin 20 -> 0 bytes
> t/t0110/url-3 | Bin 23 -> 0 bytes
> t/t0110/url-4 | Bin 23 -> 0 bytes
> t/t0110/url-5 | Bin 23 -> 0 bytes
> t/t0110/url-6 | Bin 23 -> 0 bytes
> t/t0110/url-7 | Bin 23 -> 0 bytes
> t/t0110/url-8 | Bin 23 -> 0 bytes
> t/t0110/url-9 | Bin 23 -> 0 bytes
> t/unit-tests/t-urlmatch-normalization.c | 271 ++++++++++++++++++++++++
> 18 files changed, 272 insertions(+), 250 deletions(-)
> delete mode 100644 t/helper/test-urlmatch-normalization.c
> delete mode 100755 t/t0110-urlmatch-normalization.sh
> delete mode 100644 t/t0110/README
> delete mode 100644 t/t0110/url-1
> delete mode 100644 t/t0110/url-10
> delete mode 100644 t/t0110/url-11
> delete mode 100644 t/t0110/url-2
> delete mode 100644 t/t0110/url-3
> delete mode 100644 t/t0110/url-4
> delete mode 100644 t/t0110/url-5
> delete mode 100644 t/t0110/url-6
> delete mode 100644 t/t0110/url-7
> delete mode 100644 t/t0110/url-8
> delete mode 100644 t/t0110/url-9
> create mode 100644 t/unit-tests/t-urlmatch-normalization.c
>
> diff --git a/Makefile b/Makefile
> index 3863e60b66..d7bc19e823 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -843,7 +843,6 @@ TEST_BUILTINS_OBJS += test-submodule.o
> TEST_BUILTINS_OBJS += test-subprocess.o
> TEST_BUILTINS_OBJS += test-trace2.o
> TEST_BUILTINS_OBJS += test-truncate.o
> -TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
> TEST_BUILTINS_OBJS += test-userdiff.o
> TEST_BUILTINS_OBJS += test-wildmatch.o
> TEST_BUILTINS_OBJS += test-windows-named-pipe.o
> @@ -1346,6 +1345,7 @@ UNIT_TEST_PROGRAMS += t-strbuf
> UNIT_TEST_PROGRAMS += t-strcmp-offset
> UNIT_TEST_PROGRAMS += t-strvec
> UNIT_TEST_PROGRAMS += t-trailer
> +UNIT_TEST_PROGRAMS += t-urlmatch-normalization
> UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
> UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
> UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index da3e69128a..f8a67df7de 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -83,7 +83,6 @@ static struct test_cmd cmds[] = {
> { "trace2", cmd__trace2 },
> { "truncate", cmd__truncate },
> { "userdiff", cmd__userdiff },
> - { "urlmatch-normalization", cmd__urlmatch_normalization },
> { "xml-encode", cmd__xml_encode },
> { "wildmatch", cmd__wildmatch },
> #ifdef GIT_WINDOWS_NATIVE
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index 642a34578c..e74bc0ffd4 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -76,7 +76,6 @@ int cmd__subprocess(int argc, const char **argv);
> int cmd__trace2(int argc, const char **argv);
> int cmd__truncate(int argc, const char **argv);
> int cmd__userdiff(int argc, const char **argv);
> -int cmd__urlmatch_normalization(int argc, const char **argv);
> int cmd__xml_encode(int argc, const char **argv);
> int cmd__wildmatch(int argc, const char **argv);
> #ifdef GIT_WINDOWS_NATIVE
> diff --git a/t/helper/test-urlmatch-normalization.c b/t/helper/test-urlmatch-normalization.c
> deleted file mode 100644
> index 86edd454f5..0000000000
> --- a/t/helper/test-urlmatch-normalization.c
> +++ /dev/null
> @@ -1,56 +0,0 @@
> -#include "test-tool.h"
> -#include "git-compat-util.h"
> -#include "urlmatch.h"
> -
> -int cmd__urlmatch_normalization(int argc, const char **argv)
> -{
> - const char usage[] = "test-tool urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
> - char *url1 = NULL, *url2 = NULL;
> - int opt_p = 0, opt_l = 0;
> - int ret = 0;
> -
> - /*
> - * For one url, succeed if url_normalize succeeds on it, fail otherwise.
> - * For two urls, succeed only if url_normalize succeeds on both and
> - * the results compare equal with strcmp. If -p is given (one url only)
> - * and url_normalize succeeds, print the result followed by "\n". If
> - * -l is given (one url only) and url_normalize succeeds, print the
> - * returned length in decimal followed by "\n".
> - */
> -
> - if (argc > 1 && !strcmp(argv[1], "-p")) {
> - opt_p = 1;
> - argc--;
> - argv++;
> - } else if (argc > 1 && !strcmp(argv[1], "-l")) {
> - opt_l = 1;
> - argc--;
> - argv++;
> - }
> -
> - if (argc < 2 || argc > 3)
> - die("%s", usage);
> -
> - if (argc == 2) {
> - struct url_info info;
> - url1 = url_normalize(argv[1], &info);
> - if (!url1)
> - return 1;
> - if (opt_p)
> - printf("%s\n", url1);
> - if (opt_l)
> - printf("%u\n", (unsigned)info.url_len);
> - goto cleanup;
> - }
> -
> - if (opt_p || opt_l)
> - die("%s", usage);
> -
> - url1 = url_normalize(argv[1], NULL);
> - url2 = url_normalize(argv[2], NULL);
> - ret = (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
> -cleanup:
> - free(url1);
> - free(url2);
> - return ret;
> -}
> diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
> deleted file mode 100755
> index 12d817fbd3..0000000000
> --- a/t/t0110-urlmatch-normalization.sh
> +++ /dev/null
> @@ -1,182 +0,0 @@
> -#!/bin/sh
> -
> -test_description='urlmatch URL normalization'
> -
> -TEST_PASSES_SANITIZE_LEAK=true
> -. ./test-lib.sh
> -
> -# The base name of the test url files
> -tu="$TEST_DIRECTORY/t0110/url"
> -
> -# Note that only file: URLs should be allowed without a host
> -
> -test_expect_success 'url scheme' '
> - ! test-tool urlmatch-normalization "" &&
> - ! test-tool urlmatch-normalization "_" &&
> - ! test-tool urlmatch-normalization "scheme" &&
> - ! test-tool urlmatch-normalization "scheme:" &&
> - ! test-tool urlmatch-normalization "scheme:/" &&
> - ! test-tool urlmatch-normalization "scheme://" &&
> - ! test-tool urlmatch-normalization "file" &&
> - ! test-tool urlmatch-normalization "file:" &&
> - ! test-tool urlmatch-normalization "file:/" &&
> - test-tool urlmatch-normalization "file://" &&
> - ! test-tool urlmatch-normalization "://acme.co" &&
> - ! test-tool urlmatch-normalization "x_test://acme.co" &&
> - ! test-tool urlmatch-normalization "-test://acme.co" &&
> - ! test-tool urlmatch-normalization "0test://acme.co" &&
> - ! test-tool urlmatch-normalization "+test://acme.co" &&
> - ! test-tool urlmatch-normalization ".test://acme.co" &&
> - ! test-tool urlmatch-normalization "schem%6e://" &&
> - test-tool urlmatch-normalization "x-Test+v1.0://acme.co" &&
> - test "$(test-tool urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
> -'
> -
> -test_expect_success 'url authority' '
> - ! test-tool urlmatch-normalization "scheme://user:pass@" &&
> - ! test-tool urlmatch-normalization "scheme://?" &&
> - ! test-tool urlmatch-normalization "scheme://#" &&
> - ! test-tool urlmatch-normalization "scheme:///" &&
> - ! test-tool urlmatch-normalization "scheme://:" &&
> - ! test-tool urlmatch-normalization "scheme://:555" &&
> - test-tool urlmatch-normalization "file://user:pass@" &&
> - test-tool urlmatch-normalization "file://?" &&
> - test-tool urlmatch-normalization "file://#" &&
> - test-tool urlmatch-normalization "file:///" &&
> - test-tool urlmatch-normalization "file://:" &&
> - ! test-tool urlmatch-normalization "file://:555" &&
> - test-tool urlmatch-normalization "scheme://user:pass@host" &&
> - test-tool urlmatch-normalization "scheme://@host" &&
> - test-tool urlmatch-normalization "scheme://%00@host" &&
> - ! test-tool urlmatch-normalization "scheme://%%@host" &&
> - test-tool urlmatch-normalization "scheme://host_" &&
> - test-tool urlmatch-normalization "scheme://user:pass@host/" &&
> - test-tool urlmatch-normalization "scheme://@host/" &&
> - test-tool urlmatch-normalization "scheme://host/" &&
> - test-tool urlmatch-normalization "scheme://host?x" &&
> - test-tool urlmatch-normalization "scheme://host#x" &&
> - test-tool urlmatch-normalization "scheme://host/@" &&
> - test-tool urlmatch-normalization "scheme://host?@x" &&
> - test-tool urlmatch-normalization "scheme://host#@x" &&
> - test-tool urlmatch-normalization "scheme://[::1]" &&
> - test-tool urlmatch-normalization "scheme://[::1]/" &&
> - ! test-tool urlmatch-normalization "scheme://hos%41/" &&
> - test-tool urlmatch-normalization "scheme://[invalid....:/" &&
> - test-tool urlmatch-normalization "scheme://invalid....:]/" &&
> - ! test-tool urlmatch-normalization "scheme://invalid....:[/" &&
> - ! test-tool urlmatch-normalization "scheme://invalid....:["
> -'
> -
> -test_expect_success 'url port checks' '
> - test-tool urlmatch-normalization "xyz://q@some.host:" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:0" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:0000000" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:0000001?" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:065535#" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:65535" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:65536" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:99999" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:100000" &&
> - ! test-tool urlmatch-normalization "xyz://q@some.host:100001" &&
> - test-tool urlmatch-normalization "http://q@some.host:80" &&
> - test-tool urlmatch-normalization "https://q@some.host:443" &&
> - test-tool urlmatch-normalization "http://q@some.host:80/" &&
> - test-tool urlmatch-normalization "https://q@some.host:443?" &&
> - ! test-tool urlmatch-normalization "http://q@:8008" &&
> - ! test-tool urlmatch-normalization "http://:8080" &&
> - ! test-tool urlmatch-normalization "http://:" &&
> - test-tool urlmatch-normalization "xyz://q@some.host:456/" &&
> - test-tool urlmatch-normalization "xyz://[::1]:456/" &&
> - test-tool urlmatch-normalization "xyz://[::1]:/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:000/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:0%300/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:0x80/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:4294967297/" &&
> - ! test-tool urlmatch-normalization "xyz://[::1]:030f/"
> -'
> -
> -test_expect_success 'url port normalization' '
> - test "$(test-tool urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:80")" = "http://x/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:080")" = "http://x/" &&
> - test "$(test-tool urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
> - test "$(test-tool urlmatch-normalization -p "https://x:443")" = "https://x/" &&
> - test "$(test-tool urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
> - test "$(test-tool urlmatch-normalization -p "https://x:000000443")" = "https://x/"
> -'
> -
> -test_expect_success 'url general escapes' '
> - ! test-tool urlmatch-normalization "http://x.y?%fg" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
> - test "$(test-tool urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
> - test "$(test-tool urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
> -'
> -
> -test_expect_success !MINGW 'url high-bit escapes' '
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF"
> -'
> -
> -test_expect_success 'url utf-8 escapes' '
> - test "$(test-tool urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
> -'
> -
> -test_expect_success 'url username/password escapes' '
> - test "$(test-tool urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
> -'
> -
> -test_expect_success 'url normalized lengths' '
> - test "$(test-tool urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
> - test "$(test-tool urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
> - test "$(test-tool urlmatch-normalization -l "http://@x.y/^")" = 15
> -'
> -
> -test_expect_success 'url . and .. segments' '
> - test "$(test-tool urlmatch-normalization -p "x://y/.")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/./")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
> - ! test-tool urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
> - test "$(test-tool urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
> - test "$(test-tool urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
> -'
> -
> -# http://@foo specifies an empty user name but does not specify a password
> -# http://foo specifies neither a user name nor a password
> -# So they should not be equivalent
> -test_expect_success 'url equivalents' '
> - test-tool urlmatch-normalization "httP://x" "Http://X/" &&
> - test-tool urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
> - ! test-tool urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
> - test-tool urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
> - test-tool urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
> - test-tool urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
> -'
> -
> -test_done
> diff --git a/t/t0110/README b/t/t0110/README
> deleted file mode 100644
> index ad4a50ecd8..0000000000
> --- a/t/t0110/README
> +++ /dev/null
> @@ -1,9 +0,0 @@
> -The url data files in this directory contain URLs with characters
> -in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
> -of unprintable characters.
> -
> -A select few characters in the 0x01-0x1f range are skipped to help
> -avoid problems running the test itself.
> -
> -The urls are in test files in this directory rather than being
> -embedded in the test script for portability.
> diff --git a/t/t0110/url-1 b/t/t0110/url-1
> deleted file mode 100644
> index 519019c5ce6c58478f048a2f39e2321370d318c6..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 20
> bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt
>
> diff --git a/t/t0110/url-10 b/t/t0110/url-10
> deleted file mode 100644
> index b9965de6a5d74b122179821212b2c27c8ae03e80..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)
>
> diff --git a/t/t0110/url-11 b/t/t0110/url-11
> deleted file mode 100644
> index f0a50f10096a20d597f40c775f09a71276e0050a..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 25
> hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5
>
> diff --git a/t/t0110/url-2 b/t/t0110/url-2
> deleted file mode 100644
> index 43334b05b2de3794d6020abd96e634a4e9e49cb0..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 20
> bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx
>
> diff --git a/t/t0110/url-3 b/t/t0110/url-3
> deleted file mode 100644
> index 7378c7bec247b996bc67b00a05ed89cf47d4b7a7..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b
>
> diff --git a/t/t0110/url-4 b/t/t0110/url-4
> deleted file mode 100644
> index 220b198c97f942fea4960f51a2105cc42261061a..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!
>
> diff --git a/t/t0110/url-5 b/t/t0110/url-5
> deleted file mode 100644
> index 1ccd9277792840955bb124bdde21f4b08bcccb63..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#
>
> diff --git a/t/t0110/url-6 b/t/t0110/url-6
> deleted file mode 100644
> index e8283aac6dff049d3e02454db6e684c5790a5996..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$
>
> diff --git a/t/t0110/url-7 b/t/t0110/url-7
> deleted file mode 100644
> index fa7c10b615259deefd15b638b021da7c60eba1b2..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%
>
> diff --git a/t/t0110/url-8 b/t/t0110/url-8
> deleted file mode 100644
> index 79a0ba836f5b8886b0a73f161eb292af2b105e65..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&
>
> diff --git a/t/t0110/url-9 b/t/t0110/url-9
> deleted file mode 100644
> index 8b44bec48b94467c63e8e1ad18162e465da6d6dd..0000000000000000000000000000000000000000
> GIT binary patch
> literal 0
> HcmV?d00001
>
> literal 23
> hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(
>
> diff --git a/t/unit-tests/t-urlmatch-normalization.c b/t/unit-tests/t-urlmatch-normalization.c
> new file mode 100644
> index 0000000000..1769c357b9
> --- /dev/null
> +++ b/t/unit-tests/t-urlmatch-normalization.c
> @@ -0,0 +1,271 @@
> +#include "test-lib.h"
> +#include "urlmatch.h"
> +
> +static void check_url_normalizable(const char *url, unsigned int normalizable)
> +{
> + char *url_norm = url_normalize(url, NULL);
> +
> + if (!check_int(normalizable, ==, url_norm ? 1 : 0))
> + test_msg("input url: %s", url);
> + free(url_norm);
> +}
> +
> +static void check_normalized_url(const char *url, const char *expect)
> +{
> + char *url_norm = url_normalize(url, NULL);
> +
> + if (!check_str(url_norm, expect))
> + test_msg("input url: %s", url);
> + free(url_norm);
> +}
> +
> +static void compare_normalized_urls(const char *url1, const char *url2,
> + unsigned int equal)
> +{
> + char *url1_norm = url_normalize(url1, NULL);
> + char *url2_norm = url_normalize(url2, NULL);
> +
> + if (equal) {
> + if (!check_str(url1_norm, url2_norm))
> + test_msg("input url1: %s\n input url2: %s", url1,
> + url2);
> + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
> + test_msg(" normalized url1: %s\n normalized url2: %s\n"
> + " input url1: %s\n input url2: %s",
> + url1_norm, url2_norm, url1, url2);
> + }
> + free(url1_norm);
> + free(url2_norm);
> +}
> +
> +static void check_normalized_url_length(const char *url, size_t len)
> +{
> + struct url_info info;
> + char *url_norm = url_normalize(url, &info);
> +
> + if (!check_int(info.url_len, ==, len))
> + test_msg(" input url: %s\n normalized url: %s", url,
> + url_norm);
> + free(url_norm);
> +}
> +
> +/* Note that only "file:" URLs should be allowed without a host */
> +static void t_url_scheme(void)
> +{
> + check_url_normalizable("", 0);
> + check_url_normalizable("_", 0);
> + check_url_normalizable("scheme", 0);
> + check_url_normalizable("scheme:", 0);
> + check_url_normalizable("scheme:/", 0);
> + check_url_normalizable("scheme://", 0);
> + check_url_normalizable("file", 0);
> + check_url_normalizable("file:", 0);
> + check_url_normalizable("file:/", 0);
> + check_url_normalizable("file://", 1);
> + check_url_normalizable("://acme.co", 0);
> + check_url_normalizable("x_test://acme.co", 0);
> + check_url_normalizable("-test://acme.co", 0);
> + check_url_normalizable("0test://acme.co", 0);
> + check_url_normalizable("+test://acme.co", 0);
> + check_url_normalizable(".test://acme.co", 0);
> + check_url_normalizable("schem%6e://", 0);
> + check_url_normalizable("x-Test+v1.0://acme.co", 1);
> + check_normalized_url("AbCdeF://x.Y", "abcdef://x.y/");
> +}
> +
> +static void t_url_authority(void)
> +{
> + check_url_normalizable("scheme://user:pass@", 0);
> + check_url_normalizable("scheme://?", 0);
> + check_url_normalizable("scheme://#", 0);
> + check_url_normalizable("scheme:///", 0);
> + check_url_normalizable("scheme://:", 0);
> + check_url_normalizable("scheme://:555", 0);
> + check_url_normalizable("file://user:pass@", 1);
> + check_url_normalizable("file://?", 1);
> + check_url_normalizable("file://#", 1);
> + check_url_normalizable("file:///", 1);
> + check_url_normalizable("file://:", 1);
> + check_url_normalizable("file://:555", 0);
> + check_url_normalizable("scheme://user:pass@host", 1);
> + check_url_normalizable("scheme://@host", 1);
> + check_url_normalizable("scheme://%00@host", 1);
> + check_url_normalizable("scheme://%%@host", 0);
> + check_url_normalizable("scheme://host_", 1);
> + check_url_normalizable("scheme://user:pass@host/", 1);
> + check_url_normalizable("scheme://@host/", 1);
> + check_url_normalizable("scheme://host/", 1);
> + check_url_normalizable("scheme://host?x", 1);
> + check_url_normalizable("scheme://host#x", 1);
> + check_url_normalizable("scheme://host/@", 1);
> + check_url_normalizable("scheme://host?@x", 1);
> + check_url_normalizable("scheme://host#@x", 1);
> + check_url_normalizable("scheme://[::1]", 1);
> + check_url_normalizable("scheme://[::1]/", 1);
> + check_url_normalizable("scheme://hos%41/", 0);
> + check_url_normalizable("scheme://[invalid....:/", 1);
> + check_url_normalizable("scheme://invalid....:]/", 1);
> + check_url_normalizable("scheme://invalid....:[/", 0);
> + check_url_normalizable("scheme://invalid....:[", 0);
> +}
> +
> +static void t_url_port(void)
> +{
> + check_url_normalizable("xyz://q@some.host:", 1);
> + check_url_normalizable("xyz://q@some.host:456/", 1);
> + check_url_normalizable("xyz://q@some.host:0", 0);
> + check_url_normalizable("xyz://q@some.host:0000000", 0);
> + check_url_normalizable("xyz://q@some.host:0000001?", 1);
> + check_url_normalizable("xyz://q@some.host:065535#", 1);
> + check_url_normalizable("xyz://q@some.host:65535", 1);
> + check_url_normalizable("xyz://q@some.host:65536", 0);
> + check_url_normalizable("xyz://q@some.host:99999", 0);
> + check_url_normalizable("xyz://q@some.host:100000", 0);
> + check_url_normalizable("xyz://q@some.host:100001", 0);
> + check_url_normalizable("http://q@some.host:80", 1);
> + check_url_normalizable("https://q@some.host:443", 1);
> + check_url_normalizable("http://q@some.host:80/", 1);
> + check_url_normalizable("https://q@some.host:443?", 1);
> + check_url_normalizable("http://q@:8008", 0);
> + check_url_normalizable("http://:8080", 0);
> + check_url_normalizable("http://:", 0);
> + check_url_normalizable("xyz://q@some.host:456/", 1);
> + check_url_normalizable("xyz://[::1]:456/", 1);
> + check_url_normalizable("xyz://[::1]:/", 1);
> + check_url_normalizable("xyz://[::1]:000/", 0);
> + check_url_normalizable("xyz://[::1]:0%300/", 0);
> + check_url_normalizable("xyz://[::1]:0x80/", 0);
> + check_url_normalizable("xyz://[::1]:4294967297/", 0);
> + check_url_normalizable("xyz://[::1]:030f/", 0);
> +}
> +
> +static void t_url_port_normalization(void)
> +{
> + check_normalized_url("http://x:800", "http://x:800/");
> + check_normalized_url("http://x:0800", "http://x:800/");
> + check_normalized_url("http://x:00000800", "http://x:800/");
> + check_normalized_url("http://x:065535", "http://x:65535/");
> + check_normalized_url("http://x:1", "http://x:1/");
> + check_normalized_url("http://x:80", "http://x/");
> + check_normalized_url("http://x:080", "http://x/");
> + check_normalized_url("http://x:000000080", "http://x/");
> + check_normalized_url("https://x:443", "https://x/");
> + check_normalized_url("https://x:0443", "https://x/");
> + check_normalized_url("https://x:000000443", "https://x/");
> +}
> +
> +static void t_url_general_escape(void)
> +{
> + check_url_normalizable("http://x.y?%fg", 0);
> + check_normalized_url("X://W/%7e%41^%3a", "x://w/~A%5E%3A");
> + check_normalized_url("X://W/:/?#[]@", "x://w/:/?#[]@");
> + check_normalized_url("X://W/$&()*+,;=", "x://w/$&()*+,;=");
> + check_normalized_url("X://W/'", "x://w/'");
> + check_normalized_url("X://W?!", "x://w/?!");
> +}
> +
> +static void t_url_high_bit(void)
> +{
> + check_normalized_url(
> + "x://q/\x01\x02\x03\x04\x05\x06\x07\x08\x0e\x0f\x10\x11\x12",
> + "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12");
> + check_normalized_url(
> + "x://q/\x13\x14\x15\x16\x17\x18\x19\x1b\x1c\x1d\x1e\x1f\x7f",
> + "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F");
> + check_normalized_url(
> + "x://q/\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f",
> + "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F");
> + check_normalized_url(
> + "x://q/\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f",
> + "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F");
> + check_normalized_url(
> + "x://q/\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf",
> + "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF");
> + check_normalized_url(
> + "x://q/\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf",
> + "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF");
> + check_normalized_url(
> + "x://q/\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf",
> + "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF");
> + check_normalized_url(
> + "x://q/\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf",
> + "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF");
> + check_normalized_url(
> + "x://q/\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef",
> + "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF");
> + check_normalized_url(
> + "x://q/\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff",
> + "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF");
> +}
> +
> +static void t_url_utf8_escape(void)
> +{
> + check_normalized_url(
> + "x://q/\xc2\x80\xdf\xbf\xe0\xa0\x80\xef\xbf\xbd\xf0\x90\x80\x80\xf0\xaf\xbf\xbd",
> + "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD");
> +}
> +
> +static void t_url_username_pass(void)
> +{
> + check_normalized_url("x://%41%62(^):%70+d@foo", "x://Ab(%5E):p+d@foo/");
> +}
> +
> +static void t_url_length(void)
> +{
> + check_normalized_url_length("Http://%4d%65:%4d^%70@The.Host", 25);
> + check_normalized_url_length("http://%41:%42@x.y/%61/", 17);
> + check_normalized_url_length("http://@x.y/^", 15);
> +}
> +
> +static void t_url_dots(void)
> +{
> + check_normalized_url("x://y/.", "x://y/");
> + check_normalized_url("x://y/./", "x://y/");
> + check_normalized_url("x://y/a/.", "x://y/a");
> + check_normalized_url("x://y/a/./", "x://y/a/");
> + check_normalized_url("x://y/.?", "x://y/?");
> + check_normalized_url("x://y/./?", "x://y/?");
> + check_normalized_url("x://y/a/.?", "x://y/a?");
> + check_normalized_url("x://y/a/./?", "x://y/a/?");
> + check_normalized_url("x://y/a/./b/.././../c", "x://y/c");
> + check_normalized_url("x://y/a/./b/../.././c/", "x://y/c/");
> + check_normalized_url("x://y/a/./b/.././../c/././.././.", "x://y/");
> + check_url_normalizable("x://y/a/./b/.././../c/././.././..", 0);
> + check_normalized_url("x://y/a/./?/././..", "x://y/a/?/././..");
> + check_normalized_url("x://y/%2e/", "x://y/");
> + check_normalized_url("x://y/%2E/", "x://y/");
> + check_normalized_url("x://y/a/%2e./", "x://y/");
> + check_normalized_url("x://y/b/.%2E/", "x://y/");
> + check_normalized_url("x://y/c/%2e%2E/", "x://y/");
> +}
> +
> +/*
> + * "http://@foo" specifies an empty user name but does not specify a password.
> + * "http://foo" specifies neither a user name nor a password.
> + * So they should not be equivalent.
> + */
> +static void t_url_equivalents(void)
> +{
> + compare_normalized_urls("httP://x", "Http://X/", 1);
> + compare_normalized_urls("Http://%4d%65:%4d^%70@The.Host", "hTTP://Me:%4D^p@the.HOST:80/", 1);
> + compare_normalized_urls("https://@x.y/^", "httpS://x.y:443/^", 0);
> + compare_normalized_urls("https://@x.y/^", "httpS://@x.y:0443/^", 1);
> + compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1);
> + compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1);
> +}
> +
> +int cmd_main(int argc UNUSED, const char **argv UNUSED)
> +{
> + TEST(t_url_scheme(), "url scheme");
> + TEST(t_url_authority(), "url authority");
> + TEST(t_url_port(), "url port checks");
> + TEST(t_url_port_normalization(), "url port normalization");
> + TEST(t_url_general_escape(), "url general escapes");
> + TEST(t_url_high_bit(), "url high-bit escapes");
> + TEST(t_url_utf8_escape(), "url utf8 escapes");
> + TEST(t_url_username_pass(), "url username/password escapes");
> + TEST(t_url_length(), "url normalized lengths");
> + TEST(t_url_dots(), "url . and .. segments");
> + TEST(t_url_equivalents(), "url equivalents");
> + return test_done();
> +}
>
> Range-diff against v3:
> 1: a73b89c8e0 ! 1: ef25954bf8 t: migrate t0110-urlmatch-normalization to the new framework
> @@ Commit message
> performance. And also add different test_msg()s for better debugging.
>
> In the migration, last two of the checks from `t_url_general_escape()`
> - were slightly changed compared to the shellscript. This involves changing
> + were slightly changed compared to the shell script. This involves
> + changing
>
> '\'' -> '
> '\!' -> !
>
> in the urls of those checks. This is because in C strings, we don't
> need to escape "'" and "!". Other than these two, all the urls were
> - pasted verbatim from the shellscript.
> + pasted verbatim from the shell script.
>
> - Another change is the removal of MINGW prerequisite from one of the
> + Another change is the removal of a MINGW prerequisite from one of the
> test. It was there because[1] on Windows, the command line is a
> Unicode string, it is not possible to pass arbitrary bytes to a
> program. But in unit tests we don't have this limitation.
> @@ t/unit-tests/t-urlmatch-normalization.c (new)
> +#include "test-lib.h"
> +#include "urlmatch.h"
> +
> -+static void check_url_normalizable(const char *url, int normalizable)
> ++static void check_url_normalizable(const char *url, unsigned int normalizable)
> +{
> + char *url_norm = url_normalize(url, NULL);
> +
> @@ t/unit-tests/t-urlmatch-normalization.c (new)
> +}
> +
> +static void compare_normalized_urls(const char *url1, const char *url2,
> -+ size_t equal)
> ++ unsigned int equal)
> +{
> + char *url1_norm = url_normalize(url1, NULL);
> + char *url2_norm = url_normalize(url2, NULL);
> @@ t/unit-tests/t-urlmatch-normalization.c (new)
> + test_msg("input url1: %s\n input url2: %s", url1,
> + url2);
> + } else if (!check_int(strcmp(url1_norm, url2_norm), !=, 0)) {
> -+ test_msg(" url1_norm: %s\n url2_norm: %s\n"
> ++ test_msg(" normalized url1: %s\n normalized url2: %s\n"
> + " input url1: %s\n input url2: %s",
> + url1_norm, url2_norm, url1, url2);
> + }
> @@ t/unit-tests/t-urlmatch-normalization.c (new)
> + free(url_norm);
> +}
> +
> -+/* Note that only file: URLs should be allowed without a host */
> ++/* Note that only "file:" URLs should be allowed without a host */
> +static void t_url_scheme(void)
> +{
> + check_url_normalizable("", 0);
> @@ t/unit-tests/t-urlmatch-normalization.c (new)
> +}
> +
> +/*
> -+ * http://@foo specifies an empty user name but does not specify a password
> -+ * http://foo specifies neither a user name nor a password
> -+ * So they should not be equivalent
> ++ * "http://@foo" specifies an empty user name but does not specify a password.
> ++ * "http://foo" specifies neither a user name nor a password.
> ++ * So they should not be equivalent.
> + */
> +static void t_url_equivalents(void)
> +{
> --
> 2.46.0
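
The high-bit and general-escape checks in the patch lean on two properties of C
string literals that the commit message mentions: arbitrary bytes can be spelled
with \x escapes, and "'" and "!" need no quoting. A minimal standalone sketch of
that idea follows. It is not git's url_normalize(); escape_unprintable() is a
hypothetical helper, and the rule it applies (percent-encode bytes <= 0x20 and
>= 0x7f as uppercase hex, copy everything else unchanged) is an assumption
chosen only to mirror the expected strings in those checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of git: percent-encode control bytes,
 * space, DEL and high-bit bytes; copy every other byte through as-is.
 * The caller owns the returned buffer and must free() it.
 */
static char *escape_unprintable(const char *in)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t len;
	char *out, *p;

	assert(in != NULL);
	len = strlen(in);
	/* Worst case: every input byte becomes "%XX", plus the NUL. */
	out = malloc(3 * len + 1);
	if (!out)
		return NULL;

	p = out;
	for (; *in; in++) {
		/* Go through unsigned char so bytes >= 0x80 compare correctly. */
		unsigned char c = (unsigned char)*in;

		if (c <= 0x20 || c >= 0x7f) {
			*p++ = '%';
			*p++ = hex[c >> 4];
			*p++ = hex[c & 0x0f];
		} else {
			*p++ = (char)c;
		}
	}
	*p = '\0';
	return out;
}
```

Calling escape_unprintable("x://q/\x80\x81") from a test needs no data file and
no shell quoting, which is the reason the unit test can drop both the t0110
url-N fixtures and the MINGW prerequisite.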
* Re: [GSoC][PATCH v4] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-20 15:19 ` [GSoC][PATCH v4] " Ghanshyam Thakkar
2024-08-20 15:24 ` Ghanshyam Thakkar
@ 2024-08-21 10:06 ` Christian Couder
2024-08-21 16:08 ` Junio C Hamano
1 sibling, 1 reply; 23+ messages in thread
From: Christian Couder @ 2024-08-21 10:06 UTC (permalink / raw)
To: Ghanshyam Thakkar
Cc: git, Junio C Hamano, Karthik Nayak, Patrick Steinhardt,
Christian Couder, Kaartic Sivaraam
On Tue, Aug 20, 2024 at 5:20 PM Ghanshyam Thakkar
<shyamthakkar001@gmail.com> wrote:
>
> helper/test-urlmatch-normalization along with
> t0110-urlmatch-normalization test the `url_normalize()` function from
> 'urlmatch.h'. Migrate them to the unit testing framework for better
> performance. And also add different test_msg()s for better debugging.
This version addresses all the suggestions (nits actually) to improve
the previous version, so it seems to me that it is good to go.
Thanks.
* Re: [GSoC][PATCH v4] t: migrate t0110-urlmatch-normalization to the new framework
2024-08-21 10:06 ` Christian Couder
@ 2024-08-21 16:08 ` Junio C Hamano
0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-08-21 16:08 UTC (permalink / raw)
To: Christian Couder
Cc: Ghanshyam Thakkar, git, Karthik Nayak, Patrick Steinhardt,
Christian Couder, Kaartic Sivaraam
Christian Couder <christian.couder@gmail.com> writes:
> On Tue, Aug 20, 2024 at 5:20 PM Ghanshyam Thakkar
> <shyamthakkar001@gmail.com> wrote:
>>
>> helper/test-urlmatch-normalization along with
>> t0110-urlmatch-normalization test the `url_normalize()` function from
>> 'urlmatch.h'. Migrate them to the unit testing framework for better
>> performance. And also add different test_msg()s for better debugging.
>
> This version addresses all the suggestions (nits actually) to improve
> the previous version, so it seems to me that it is good to go.
Looking good to me, too. Thanks, both.
Will queue. Let me mark it for 'next' in my "What's cooking" draft.
Thread overview: 23+ messages
2024-06-28 12:56 [GSoC][PATCH] t: migrate helper/test-urlmatch-normalization to unit tests Ghanshyam Thakkar
2024-07-09 0:42 ` Ghanshyam Thakkar
2024-07-22 12:53 ` Karthik Nayak
2024-07-22 12:54 ` Ghanshyam Thakkar
2024-07-23 8:26 ` Karthik Nayak
2024-07-23 14:00 ` Patrick Steinhardt
2024-07-24 0:24 ` Ghanshyam Thakkar
2024-07-24 5:19 ` Patrick Steinhardt
2024-07-24 7:06 ` Ghanshyam Thakkar
2024-07-24 7:45 ` Patrick Steinhardt
2024-08-13 17:24 ` [GSoC][PATCH v2] t: migrate t0110-urlmatch-normalization to the new framework Ghanshyam Thakkar
2024-08-13 19:22 ` Junio C Hamano
2024-08-14 1:35 ` Kaartic Sivaraam
2024-08-14 4:58 ` Junio C Hamano
2024-08-14 14:24 ` Ghanshyam Thakkar
2024-08-14 5:17 ` Kaartic Sivaraam
2024-08-14 14:20 ` [GSoC][PATCH v3] " Ghanshyam Thakkar
2024-08-14 16:52 ` Junio C Hamano
2024-08-19 12:46 ` Christian Couder
2024-08-20 15:19 ` [GSoC][PATCH v4] " Ghanshyam Thakkar
2024-08-20 15:24 ` Ghanshyam Thakkar
2024-08-21 10:06 ` Christian Couder
2024-08-21 16:08 ` Junio C Hamano