git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "Kyle J. McKay" <mackyle@gmail.com>,
	Peter Krefting <peter@softwolves.pp.se>
Subject: [PATCH v2 4/8] http: extract type/subtype portion of content-type
Date: Thu, 22 May 2014 05:29:47 -0400	[thread overview]
Message-ID: <20140522092947.GD15032@sigill.intra.peff.net> (raw)
In-Reply-To: <20140522092824.GA14530@sigill.intra.peff.net>

When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.

This could cause two visible behaviors:

  1. We might fail to recognize a smart-http server by its
     content-type.

  2. We might fail to relay text/plain error messages to
     users (especially if they contain a charset parameter).

This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c                     | 32 +++++++++++++++++++++++++++++---
 remote-curl.c              |  2 +-
 t/lib-httpd/error.sh       |  8 +++++++-
 t/t5550-http-fetch-dumb.sh |  5 +++++
 4 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/http.c b/http.c
index 94e1afd..4edf5b9 100644
--- a/http.c
+++ b/http.c
@@ -906,6 +906,29 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
 	return ret;
 }
 
+/*
+ * Extract a normalized version of the content type, with any
+ * spaces suppressed, all letters lowercased, and no trailing ";"
+ * or parameters.
+ *
+ * Example:
+ *   "TEXT/PLAIN; charset=utf-8" -> "text/plain"
+ */
+static void extract_content_type(struct strbuf *raw, struct strbuf *type)
+{
+	const char *p;
+
+	strbuf_reset(type);
+	strbuf_grow(type, raw->len);
+	for (p = raw->buf; *p; p++) {
+		if (isspace(*p))
+			continue;
+		if (*p == ';')
+			break;
+		strbuf_addch(type, tolower(*p));
+	}
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -957,9 +980,12 @@ static int http_request(const char *url,
 
 	ret = run_one_slot(slot, &results);
 
-	if (options && options->content_type)
-		curlinfo_strbuf(slot->curl, CURLINFO_CONTENT_TYPE,
-				options->content_type);
+	if (options && options->content_type) {
+		struct strbuf raw = STRBUF_INIT;
+		curlinfo_strbuf(slot->curl, CURLINFO_CONTENT_TYPE, &raw);
+		extract_content_type(&raw, options->content_type);
+		strbuf_release(&raw);
+	}
 
 	if (options && options->effective_url)
 		curlinfo_strbuf(slot->curl, CURLINFO_EFFECTIVE_URL,
diff --git a/remote-curl.c b/remote-curl.c
index 52c2d96..a5ab977 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -205,7 +205,7 @@ static int show_http_message(struct strbuf *type, struct strbuf *msg)
 	 * TODO should handle "; charset=XXX", and re-encode into
 	 * logoutputencoding
 	 */
-	if (strcasecmp(type->buf, "text/plain"))
+	if (strcmp(type->buf, "text/plain"))
 		return -1;
 
 	strbuf_trim(msg);
diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh
index 786f281..23cec97 100755
--- a/t/lib-httpd/error.sh
+++ b/t/lib-httpd/error.sh
@@ -3,6 +3,7 @@
 printf "Status: 500 Intentional Breakage\n"
 
 printf "Content-Type: "
+charset=iso-8859-1
 case "$PATH_INFO" in
 *html*)
 	printf "text/html"
@@ -10,8 +11,13 @@ case "$PATH_INFO" in
 *text*)
 	printf "text/plain"
 	;;
+*charset*)
+	printf "text/plain; charset=utf-8"
+	charset=utf-8
+	;;
 esac
 printf "\n"
 
 printf "\n"
-printf "this is the error message\n"
+printf "this is the error message\n" |
+iconv -f us-ascii -t $charset
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 13defd3..b35b261 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -181,5 +181,10 @@ test_expect_success 'git client does not show html errors' '
 	! grep "this is the error message" stderr
 '
 
+test_expect_success 'git client shows text/plain with a charset' '
+	test_must_fail git clone "$HTTPD_URL/error/charset" 2>stderr &&
+	grep "this is the error message" stderr
+'
+
 stop_httpd
 test_done
-- 
2.0.0.rc1.436.g03cb729

  parent reply	other threads:[~2014-05-22  9:29 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-21 10:25 [PATCH 0/9] handle alternate charsets for remote http errors Jeff King
2014-05-21 10:27 ` [PATCH 1/9] test-lib: preserve GIT_CURL_VERBOSE from the environment Jeff King
2014-05-21 10:27 ` [PATCH 2/9] strbuf: add strbuf_tolower function Jeff King
2014-05-22  0:07   ` Kyle J. McKay
2014-05-22  5:58     ` Jeff King
2014-05-22 18:36       ` Junio C Hamano
2014-05-22 18:41         ` Jeff King
2014-05-22 21:04           ` Junio C Hamano
2014-05-23 20:03             ` Jeff King
2014-05-22 22:52           ` Kyle J. McKay
2014-05-23 20:05             ` Jeff King
2014-05-23 22:34               ` Kyle J. McKay
2014-05-21 10:28 ` [PATCH 3/9] daemon/config: factor out duplicate xstrdup_tolower Jeff King
2014-05-21 10:29 ` [PATCH 4/9] http: normalize case of returned content-type Jeff King
2014-05-21 10:29 ` [PATCH 5/9] t/lib-httpd: use write_script to copy CGI scripts Jeff King
2014-05-21 10:29 ` [PATCH 6/9] t5550: test display of remote http error messages Jeff King
2014-05-21 10:33 ` [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter Jeff King
2014-05-22  0:07   ` Kyle J. McKay
2014-05-22  6:05     ` Jeff King
2014-05-22  7:27       ` Kyle J. McKay
2014-05-22  9:02         ` Jeff King
2014-05-22  7:12     ` Peter Krefting
2014-05-22  9:05       ` Jeff King
2014-05-22 10:19         ` Peter Krefting
2014-05-21 10:33 ` [PATCH 8/9] strbuf: add strbuf_reencode helper Jeff King
2014-05-21 10:33 ` [PATCH 9/9] remote-curl: reencode http error messages Jeff King
2014-05-22  0:07   ` Kyle J. McKay
2014-05-22  6:05     ` Jeff King
2014-05-22  7:26     ` Peter Krefting
2014-05-22  9:28 ` [PATCH v2 0/9] handle alternate charsets for remote http errors Jeff King
2014-05-22  9:28   ` [PATCH v2 1/8] test-lib: preserve GIT_CURL_VERBOSE from the environment Jeff King
2014-05-22  9:28   ` [PATCH v2 2/8] t/lib-httpd: use write_script to copy CGI scripts Jeff King
2014-05-22  9:29   ` [PATCH v2 3/8] t5550: test display of remote http error messages Jeff King
2014-05-22  9:29   ` Jeff King [this message]
2014-05-22 22:52     ` [PATCH v2 4/8] http: extract type/subtype portion of content-type Kyle J. McKay
2014-05-23 20:12       ` Jeff King
2014-05-23 22:00         ` Kyle J. McKay
2014-05-22  9:30   ` [PATCH v2 5/8] http: optionally extract charset parameter from content-type Jeff King
2014-05-22  9:30   ` [PATCH v2 6/8] strbuf: add strbuf_reencode helper Jeff King
2014-05-22  9:30   ` [PATCH v2 7/8] remote-curl: reencode http error messages Jeff King
2014-05-22  9:36   ` [PATCH v2 8/8] http: default text charset to iso-8859-1 Jeff King
2014-05-23  2:02     ` brian m. carlson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140522092947.GD15032@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=mackyle@gmail.com \
    --cc=peter@softwolves.pp.se \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).