From: Jeff King <peff@peff.net>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
Date: Fri, 30 Oct 2015 20:08:37 -0400 [thread overview]
Message-ID: <20151031000837.GA25849@sigill.intra.peff.net> (raw)
In-Reply-To: <1446245682-18087-1-git-send-email-dturner@twopensource.com>
On Fri, Oct 30, 2015 at 06:54:42PM -0400, David Turner wrote:
> A HTTP server is permitted to return a non-range response to a HTTP
> range request (and Apache httpd in fact does this in some cases).
> While libcurl knows how to correctly handle this (by skipping bytes
> before and after the requested range), it only turns on this handling
> if it is aware that a range request is being made. By manually
> setting the range header instead of using CURLOPT_RANGE, we were
> hiding the fact that this was a range request from libcurl. This
> could cause corruption.
The goal makes sense. Why weren't we using CURLOPT_RANGE before? Did it
not exist (or otherwise have limitations) in 2005, and if so, when did
it become usable? Do we need to protect this with an #ifdef for the curl
version?
> @@ -1213,8 +1213,9 @@ static int http_request(const char *url,
> curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
> fwrite);
> if (posn > 0) {
> - strbuf_addf(&buf, "Range: bytes=%ld-", posn);
> - headers = curl_slist_append(headers, buf.buf);
> + strbuf_addf(&buf, "%ld-", posn);
> + curl_easy_setopt(slot->curl, CURLOPT_RANGE,
> + &buf.buf);
> strbuf_reset(&buf);
> }
> } else
We reuse curl slots from request to request, and curl options may
persist. The old code appended to the header list, which then gets
added to the curl object with CURLOPT_HTTPHEADER. Subsequent requests,
even if they are not using ranges, would also set CURLOPT_HTTPHEADER,
possibly to NULL, so the range header would not accidentally creep into
the next request.
But with CURLOPT_RANGE, I think the value would persist to the next
request (unless curl automagically forgets it after making the request,
but I don't think it generally does so). I think the cleanest thing
would be to reset it back to NULL in get_active_slot (there are several
other fields that we reset there, too).
> @@ -1685,7 +1680,6 @@ struct http_object_request *new_http_object_request(const char *base_url,
> ssize_t prev_read = 0;
> long prev_posn = 0;
> char range[RANGE_HEADER_SIZE];
> - struct curl_slist *range_header = NULL;
> struct http_object_request *freq;
Junio asked elsewhere if we really need to tweak RANGE_HEADER_SIZE. But
I think we should go even further, and bump the definition much closer
to the point of use[1], and not worry about an arbitrary constant. After
all, we already use sizeof(), so we do not even end up repeating the
constant.
We could even hide the whole thing away with something like:
void http_set_range(CURL *curl, long lo, long hi)
{
char buf[128];
int len = 0;
if (lo >= 0)
len += xsnprintf(buf + len, "%ld", lo);
len += xsnprintf(buf + len, "-");
if (hi >= 0)
len += xsnprintf(buf + len, "%ld", hi);
curl_easy_setopt(curl, CURLOPT_RANGE, buf);
}
That would also make it easier to replace if we do need to keep an
#ifdef for older versions of curl. But maybe it is just
over-engineering.
-Peff
[1] Antique versions of curl needed the buffer to remain valid through
the whole request, but these days it makes its own copy (and even
under the old regime, I am not sure if the existing code would work
anyway, as we return the request from this function, and the buffer
is function-local).
next prev parent reply other threads:[~2015-10-31 0:08 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-30 22:54 [PATCH v2] http.c: use CURLOPT_RANGE for range requests David Turner
2015-10-30 23:13 ` Junio C Hamano
2015-10-31 0:08 ` Jeff King [this message]
2015-10-31 17:40 ` Junio C Hamano
2015-10-31 17:43 ` Jeff King
2015-11-01 23:00 ` Daniel Stenberg
2015-11-02 1:51 ` Jeff King
2015-11-02 7:05 ` Daniel Stenberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151031000837.GA25849@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=dturner@twopensource.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).