From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "Kyle J. McKay" <mackyle@gmail.com>,
Peter Krefting <peter@softwolves.pp.se>
Subject: [PATCH v2 5/8] http: optionally extract charset parameter from content-type
Date: Thu, 22 May 2014 05:30:05 -0400
Message-ID: <20140522093005.GE15032@sigill.intra.peff.net>
In-Reply-To: <20140522092824.GA14530@sigill.intra.peff.net>

Since the previous commit, we now give a sanitized,
shortened version of the content-type header to any callers
who ask for it.

This patch adds back a way for them to cleanly access
specific parameters to the type. We could easily extract all
parameters and make them available via a string_list, but:

  1. That complicates the interface and memory management.

  2. In practice, no planned callers care about anything
     except the charset.

This patch therefore goes with the simplest thing, and we
can expand or change the interface later if it becomes
necessary.

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 http.h |  7 +++++++
 2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/http.c b/http.c
index 4edf5b9..e26ee8b 100644
--- a/http.c
+++ b/http.c
@@ -907,14 +907,44 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
 }
 
 /*
+ * Check for and extract a content-type parameter. "raw"
+ * should be positioned at the start of the potential
+ * parameter, with any whitespace already removed.
+ *
+ * "name" is the name of the parameter. The value is appended
+ * to "out".
+ */
+static int extract_param(const char *raw, const char *name,
+			 struct strbuf *out)
+{
+	size_t len = strlen(name);
+
+	if (strncasecmp(raw, name, len))
+		return -1;
+	raw += len;
+
+	if (*raw != '=')
+		return -1;
+	raw++;
+
+	while (*raw && !isspace(*raw))
+		strbuf_addch(out, *raw++);
+	return 0;
+}
+
+/*
  * Extract a normalized version of the content type, with any
  * spaces suppressed, all letters lowercased, and no trailing ";"
  * or parameters.
  *
+ * If the "charset" argument is not NULL, store the value of any
+ * charset parameter there.
+ *
  * Example:
- *   "TEXT/PLAIN; charset=utf-8" -> "text/plain"
+ *   "TEXT/PLAIN; charset=utf-8" -> "text/plain", "utf-8"
  */
-static void extract_content_type(struct strbuf *raw, struct strbuf *type)
+static void extract_content_type(struct strbuf *raw, struct strbuf *type,
+				 struct strbuf *charset)
 {
 	const char *p;
 
@@ -923,10 +953,25 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type)
 	for (p = raw->buf; *p; p++) {
 		if (isspace(*p))
 			continue;
-		if (*p == ';')
+		if (*p == ';') {
+			p++;
 			break;
+		}
 		strbuf_addch(type, tolower(*p));
 	}
+
+	if (!charset)
+		return;
+
+	strbuf_reset(charset);
+	while (*p) {
+		while (isspace(*p))
+			p++;
+		if (!extract_param(p, "charset", charset))
+			return;
+		while (*p && !isspace(*p))
+			p++;
+	}
 }
 
 /* http_request() targets */
@@ -983,7 +1028,8 @@ static int http_request(const char *url,
 	if (options && options->content_type) {
 		struct strbuf raw = STRBUF_INIT;
 		curlinfo_strbuf(slot->curl, CURLINFO_CONTENT_TYPE, &raw);
-		extract_content_type(&raw, options->content_type);
+		extract_content_type(&raw, options->content_type,
+				     options->charset);
 		strbuf_release(&raw);
 	}
 
diff --git a/http.h b/http.h
index e64084f..473179b 100644
--- a/http.h
+++ b/http.h
@@ -144,6 +144,13 @@ struct http_get_options {
 	struct strbuf *content_type;
 
 	/*
+	 * If non-NULL, and content_type above is non-NULL, returns
+	 * the charset parameter from the content-type. If none is
+	 * present, returns an empty string.
+	 */
+	struct strbuf *charset;
+
+	/*
 	 * If non-NULL, returns the URL we ended up at, including any
 	 * redirects we followed.
 	 */
--
2.0.0.rc1.436.g03cb729
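
For readers who want to try the parsing logic outside of git, here is a
minimal standalone sketch. It is not the code from the patch: it
assumes fixed-size char buffers in place of struct strbuf, and the names
sketch_extract_param() and sketch_content_type() are hypothetical
stand-ins for extract_param() and extract_content_type(). It also casts
to unsigned char before calling the standard ctype functions, which
code outside git needs to do.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/*
 * Hypothetical stand-in for the patch's extract_param(): if "raw"
 * starts with "<name>=", copy the value into the fixed-size buffer
 * "out" (always NUL-terminated, silently truncated) and return 0.
 * Otherwise leave "out" empty and return -1.
 */
static int sketch_extract_param(const char *raw, const char *name,
				char *out, size_t cap)
{
	size_t len = strlen(name);
	size_t n = 0;

	assert(out && cap > 0);
	out[0] = '\0';

	if (strncasecmp(raw, name, len))
		return -1;
	raw += len;

	if (*raw != '=')
		return -1;
	raw++;

	/* As in the patch, the value runs up to the next whitespace. */
	while (*raw && !isspace((unsigned char)*raw)) {
		if (n + 1 < cap)
			out[n++] = *raw;
		raw++;
	}
	out[n] = '\0';
	return 0;
}

/*
 * Hypothetical stand-in for extract_content_type(): store the
 * lowercased type/subtype in "type" and, if "charset" is non-NULL,
 * store the charset parameter there (empty string when absent).
 */
static void sketch_content_type(const char *raw, char *type, size_t type_cap,
				char *charset, size_t charset_cap)
{
	const char *p;
	size_t n = 0;

	assert(type && type_cap > 0);
	for (p = raw; *p; p++) {
		if (isspace((unsigned char)*p))
			continue;
		if (*p == ';') {
			p++;
			break;
		}
		if (n + 1 < type_cap)
			type[n++] = (char)tolower((unsigned char)*p);
	}
	type[n] = '\0';

	if (!charset)
		return;

	assert(charset_cap > 0);
	charset[0] = '\0';
	while (*p) {
		/* Skip to the start of the next whitespace-separated token. */
		while (isspace((unsigned char)*p))
			p++;
		if (!sketch_extract_param(p, "charset", charset, charset_cap))
			return;
		while (*p && !isspace((unsigned char)*p))
			p++;
	}
}
```

Calling sketch_content_type() on "TEXT/PLAIN; charset=utf-8" yields
"text/plain" and "utf-8", matching the example in the comment above
extract_content_type(). Note that the value scan stops only at
whitespace, so with this logic an input such as
"text/plain; charset=utf-8; format=flowed" yields "utf-8;" with the
trailing semicolon attached; a caller that needs to handle several
parameters would probably want to stop at ';' as well.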