git.vger.kernel.org archive mirror
From: "Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Ivan Frade <ifrade@google.com>
Subject: [PATCH v3] fetch-pack: redact packfile urls in traces
Date: Tue, 19 Oct 2021 22:57:39 +0000
Message-ID: <pull.1052.v3.git.1634684260142.gitgitgadget@gmail.com>
In-Reply-To: <pull.1052.v2.git.1633746024175.gitgitgadget@gmail.com>

From: Ivan Frade <ifrade@google.com>

In some setups, packfile URIs act as bearer tokens. It is not
recommended to expose them plainly in logs, although in special
circumstances (e.g. debug) it makes sense to write them.

Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
    fetch-pack: redact packfile urls in traces
    
    Changes since v2:
    
     * Redact only the path of the URL
     * Tests are now strict, validating the exact line expected in the log
    
    Changes since v1:
    
     * Removed non-POSIX flags in tests
     * More accurate regex for the non-encrypted packfile line
     * Dropped documentation change
     * Dropped redacting the die message in http-fetch
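
The "redact only the path of the URL" change listed above can be tried in isolation with a minimal standalone sketch. It assumes the same rule as the patch (keep everything up to and including the first '/' after "://", replace the rest), but it is not the code in pkt-line.c: the helper names url_path_offset() and redact_url_path() and the fixed-size output buffer are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return the offset just past the
 * first '/' that follows "://", or -1 if the line has no non-empty URL path.
 */
static int url_path_offset(const char *line)
{
	const char *mark = "://";
	const char *p = strstr(line, mark);

	if (!p)
		return -1;

	/* Skip the scheme separator, then the host part. */
	p += strlen(mark);
	while (*p && *p != '/')
		p++;

	/* Only report a path if something follows the '/'. */
	if (*p && p[1])
		return (int)(p + 1 - line);

	return -1;
}

/*
 * Hypothetical helper: copy `line` into `out`, replacing the URL path with
 * "<redacted>". Lines without a URL path are copied unchanged.
 */
static void redact_url_path(const char *line, char *out, size_t outsz)
{
	int off = url_path_offset(line);

	assert(outsz > 0);
	if (off < 0)
		snprintf(out, outsz, "%s", line);
	else
		snprintf(out, outsz, "%.*s<redacted>", off, line);
}
```

Under these assumptions, a line such as "abc123 https://example.com/dumb/mypack-abc123.pack" would be traced as "abc123 https://example.com/<redacted>", while a line with no "://" or with nothing after the host's '/' passes through unchanged. Keeping the hash and host visible is what lets the tests match the exact expected log line.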

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v2:

 1:  701cb7a6ab9 ! 1:  9afe0093af4 fetch-pack: redact packfile urls in traces
     @@ Commit message
          recommended to expose them plainly in logs, although in special
          circunstances (e.g. debug) it makes sense to write them.
      
     -    Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
     -    variable is set to false. This mimics the redacting of the
     -    Authorization header in HTTP.
     -
     -    Changes since v1:
     -    - Removed non-POSIX flags in tests
     -    - More accurate regex for the non-encrypted packfile line
     -    - Dropped documentation change
     -    - Dropped redacting the die message in http-fetch
     +    Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
     +    variable is set to false. This mimics the redacting of the Authorization
     +    header in HTTP.
      
          Signed-off-by: Ivan Frade <ifrade@google.com>
      
     @@ fetch-pack.c: static void receive_wanted_refs(struct packet_reader *reader,
       static void receive_packfile_uris(struct packet_reader *reader,
       				  struct string_list *uris)
       {
     -+	int original_options;
     ++	int saved_options;
       	process_section_header(reader, "packfile-uris", 0);
      +	/*
      +	 * In some setups, packfile-uris act as bearer tokens,
      +	 * redact them by default.
      +	 */
     -+	original_options = reader->options;
     ++	saved_options = reader->options;
      +	if (git_env_bool("GIT_TRACE_REDACT", 1))
     -+		reader->options |= PACKET_READ_REDACT_ON_TRACE;
     ++		reader->options |= PACKET_READ_REDACT_URL_PATH;
      +
       	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
       		if (reader->pktlen < the_hash_algo->hexsz ||
     @@ fetch-pack.c: static void receive_packfile_uris(struct packet_reader *reader,
       
       		string_list_append(uris, reader->line);
       	}
     -+	reader->options = original_options;
     ++	reader->options = saved_options;
      +
       	if (reader->status != PACKET_READ_DELIM)
       		die("expected DELIM");
       }
      
       ## pkt-line.c ##
     +@@ pkt-line.c: int packet_length(const char lenbuf_hex[4])
     + 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
     + }
     + 
     ++static int find_url_path_start(const char *buffer)
     ++{
     ++	const char *URL_MARK = "://";
     ++	const char *p = strstr(buffer, URL_MARK);
     ++	if (!p) {
     ++		return -1;
     ++	}
     ++
     ++	p += strlen(URL_MARK);
     ++	while (*p && *p != '/')
     ++		p++;
     ++
     ++	/* Position after '/' */
     ++	if (*p && *(p + 1))
     ++		return (p + 1) - buffer;
     ++
     ++	return -1;
     ++}
     ++
     + enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     + 						size_t *src_len, char *buffer,
     + 						unsigned size, int *pktlen,
     +@@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     + {
     + 	int len;
     + 	char linelen[4];
     ++	int url_path_start;
     + 
     + 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
     + 		*pktlen = -1;
      @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
       		len--;
       
       	buffer[len] = 0;
      -	packet_trace(buffer, len, 0);
     -+	if (options & PACKET_READ_REDACT_ON_TRACE) {
     ++	if (options & PACKET_READ_REDACT_URL_PATH &&
     ++	    (url_path_start = find_url_path_start(buffer)) != -1) {
      +		const char *redacted = "<redacted>";
     -+		packet_trace(redacted, strlen(redacted), 0);
     ++		struct strbuf tracebuf = STRBUF_INIT;
     ++		strbuf_insert(&tracebuf, 0, buffer, len);
     ++		strbuf_splice(&tracebuf, url_path_start,
     ++			      len - url_path_start, redacted, strlen(redacted));
     ++		packet_trace(tracebuf.buf, tracebuf.len, 0);
     ++		strbuf_release(&tracebuf);
      +	} else {
      +		packet_trace(buffer, len, 0);
      +	}
     @@ pkt-line.h: void packet_fflush(FILE *f);
       #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
       #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
       #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
     -+#define PACKET_READ_REDACT_ON_TRACE      (1u<<4)
     ++#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
       int packet_read(int fd, char **src_buffer, size_t *src_len, char
       		*buffer, unsigned size, int options);
       
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
       	test_i18ngrep "disallowed submodule name" err
       '
       
     -+test_expect_success 'packfile-uri redacted in trace' '
     ++test_expect_success 'packfile-uri path redacted in trace' '
      +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
      +	rm -rf "$P" http_child log &&
      +
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +	git -C "$P" add my-blob &&
      +	git -C "$P" commit -m x &&
      +
     -+	configure_exclusion "$P" my-blob >h &&
     ++	git -C "$P" hash-object my-blob >objh &&
     ++	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
     ++	git -C "$P" config --add \
     ++		"uploadpack.blobpackfileuri" \
     ++		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
      +
      +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
      +	git -c protocol.version=2 \
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
      +
     -+	grep "clone< <redacted>" log
     ++	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
      +'
      +
     -+test_expect_success 'packfile-uri not redacted in trace when GIT_TRACE_REDACT=0' '
     ++test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
      +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
      +	rm -rf "$P" http_child log &&
      +
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +	git -C "$P" add my-blob &&
      +	git -C "$P" commit -m x &&
      +
     -+	configure_exclusion "$P" my-blob >h &&
     ++	git -C "$P" hash-object my-blob >objh &&
     ++	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
     ++	git -C "$P" config --add \
     ++		"uploadpack.blobpackfileuri" \
     ++		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
      +
      +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
      +	GIT_TRACE_REDACT=0 \
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
      +
     -+	grep -E "clone< ..[0-9a-f]{40,64} http://" log
     ++	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
      +'
      +
       test_expect_success 'http:// --negotiate-only' '


 fetch-pack.c           | 11 +++++++++
 pkt-line.c             | 33 ++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..1587b9ae662 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1518,7 +1518,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
 static void receive_packfile_uris(struct packet_reader *reader,
 				  struct string_list *uris)
 {
+	int saved_options;
 	process_section_header(reader, "packfile-uris", 0);
+	/*
+	 * In some setups, packfile-uris act as bearer tokens,
+	 * redact them by default.
+	 */
+	saved_options = reader->options;
+	if (git_env_bool("GIT_TRACE_REDACT", 1))
+		reader->options |= PACKET_READ_REDACT_URL_PATH;
+
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
 		if (reader->pktlen < the_hash_algo->hexsz ||
 		    reader->line[the_hash_algo->hexsz] != ' ')
@@ -1526,6 +1535,8 @@ static void receive_packfile_uris(struct packet_reader *reader,
 
 		string_list_append(uris, reader->line);
 	}
+	reader->options = saved_options;
+
 	if (reader->status != PACKET_READ_DELIM)
 		die("expected DELIM");
 }
diff --git a/pkt-line.c b/pkt-line.c
index de4a94b437e..1a9e6870559 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -386,6 +386,25 @@ int packet_length(const char lenbuf_hex[4])
 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
+static int find_url_path_start(const char *buffer)
+{
+	const char *URL_MARK = "://";
+	const char *p = strstr(buffer, URL_MARK);
+	if (!p) {
+		return -1;
+	}
+
+	p += strlen(URL_MARK);
+	while (*p && *p != '/')
+		p++;
+
+	/* Position after '/' */
+	if (*p && *(p + 1))
+		return (p + 1) - buffer;
+
+	return -1;
+}
+
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
 						unsigned size, int *pktlen,
@@ -393,6 +412,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 {
 	int len;
 	char linelen[4];
+	int url_path_start;
 
 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
 		*pktlen = -1;
@@ -443,7 +463,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_URL_PATH &&
+	    (url_path_start = find_url_path_start(buffer)) != -1) {
+		const char *redacted = "<redacted>";
+		struct strbuf tracebuf = STRBUF_INIT;
+		strbuf_insert(&tracebuf, 0, buffer, len);
+		strbuf_splice(&tracebuf, url_path_start,
+			      len - url_path_start, redacted, strlen(redacted));
+		packet_trace(tracebuf.buf, tracebuf.len, 0);
+		strbuf_release(&tracebuf);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 82b95e4bdd3..853d20688c8 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -88,6 +88,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..f01af2f2ed3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri path redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
+'
+
+test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&

base-commit: 9d530dc0024503ab4218fe6c4395b8a0aa245478
-- 
gitgitgadget

Thread overview: 43+ messages
2021-10-08 16:03 [PATCH 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-08 16:03 ` [PATCH 1/2] " Ivan Frade via GitGitGadget
2021-10-08 19:36   ` Ævar Arnfjörð Bjarmason
2021-10-08 23:15     ` Ivan Frade
2021-10-08 16:03 ` [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars Ivan Frade via GitGitGadget
2021-10-08 19:43   ` Ævar Arnfjörð Bjarmason
2021-10-09  2:20 ` [PATCH v2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-11 20:39   ` Junio C Hamano
2021-10-26 19:32     ` Ivan Frade
2021-10-19 22:57   ` Ivan Frade via GitGitGadget [this message]
2021-10-20 11:41     ` [PATCH v3] " Ævar Arnfjörð Bjarmason
2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
2021-10-26 22:49       ` [PATCH v4 1/2] " Ivan Frade via GitGitGadget
2021-10-28  1:01         ` Junio C Hamano
2021-10-28 22:15           ` Ivan Frade
2021-10-28 22:46             ` Junio C Hamano
2021-10-26 22:49       ` [PATCH v4 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
2021-10-28 17:25           ` Eric Sunshine
2021-10-28 22:44             ` Ivan Frade
2021-10-28 22:41           ` Ivan Frade
2021-10-29 23:18           ` Junio C Hamano
2021-11-09  1:54             ` Ævar Arnfjörð Bjarmason
2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-28 22:51         ` [PATCH v5 1/2] " Ivan Frade via GitGitGadget
2021-10-28 23:21           ` Junio C Hamano
2021-10-29 18:42             ` Ivan Frade
2021-10-29 19:59               ` Junio C Hamano
2021-11-08 22:43                 ` Jonathan Tan
2021-10-28 22:51         ` [PATCH v5 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-29 18:42           ` [PATCH v6 1/2] " Ivan Frade via GitGitGadget
2021-11-08 23:01             ` Jonathan Tan
2021-11-09  1:36               ` Ævar Arnfjörð Bjarmason
2021-11-10 23:44                 ` Ivan Frade
2021-11-11  0:01                   ` Ævar Arnfjörð Bjarmason
2021-11-10 21:18               ` Ivan Frade
2021-10-29 18:42           ` [PATCH v6 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-11-08 23:06             ` Jonathan Tan
2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-11-10 23:51             ` [PATCH v7 1/2] " Ivan Frade via GitGitGadget
2021-11-10 23:51             ` [PATCH v7 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-11-12  4:43             ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Junio C Hamano
