git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v3] http.c: use CURLOPT_RANGE for range requests
Date: Mon, 2 Nov 2015 15:18:32 -0500	[thread overview]
Message-ID: <20151102201831.GA10722@sigill.intra.peff.net> (raw)
In-Reply-To: <1446492986-32350-1-git-send-email-dturner@twopensource.com>

On Mon, Nov 02, 2015 at 02:36:26PM -0500, David Turner wrote:

> A HTTP server is permitted to return a non-range response to a HTTP
> range request (and Apache httpd in fact does this in some cases).
> While libcurl knows how to correctly handle this (by skipping bytes
> before and after the requested range), it only turns on this handling
> if it is aware that a range request is being made.  By manually
> setting the range header instead of using CURLOPT_RANGE, we were
> hiding the fact that this was a range request from libcurl.  This
> could cause corruption.
> 
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
> 
> This version breaks the range option formatting/setting out to a
> helper function, as suggested by Junio and Jeff.
> 
> In addition, it clears the range option when curl slots are cleared
> before reuse, also as suggested

Thanks, this looks much nicer to me.

A few minor comments:

> +static void http_opt_request_remainder(CURL *curl, ssize_t lo)

I notice you used ssize_t here. If we are going to deal with large files, I
would think off_t would make more sense (i.e., to allow >2GB on a 32-bit
system).

But much worse than that, the value we are passing typically comes from
ftell(), which only returns a long. So we're truncated anyway in that
case.

I certainly don't think we are making anything _worse_ here; the problem
is in the existing code. But I don't think ssize_t is making anything
better (it's generally the same size as a long anyway). So I think I'd
prefer one of the following:

  1. Leave it as "long". At least then we are matching ftell(), which is
     clear (and works fine on 64-bit machines).

  2. Use off_t here instead. It doesn't fix the problem, but at least
     fixes our one component, so it's working towards a better solution
     in the long run.

  3. Detect and complain when we overflow the long. Hopefully ftell()
     returns -1 on a 32-bit system when the file is larger than 2GB, so
     this Just Works already, and we don't create a broken output.

  4. Fix all of the callers. I suspect this would involve calling
     fstat(fileno(fh)) to get a real off_t.

Options (3) and (4) are obviously more work, and I don't necessarily
expect you to do them. But I think I'd prefer (2) to what you have now.
Using off_t has an issue with being unsigned, but...

> +{
> +	char buf[128];
> +	int len = 0;
> +
> +	if (lo >= 0)
> +		len += xsnprintf(buf + len, 128 - len, "%"PRIiMAX,
> +				 (intmax_t)lo);
> +	len += xsnprintf(buf + len, 128 - len, "-");

I think we could just drop this "lo >= 0". Now that there is no "hi"
(which I think is fine), there's no reason to call the function at all
if you do not have a limit.

Also, we should prefer "sizeof(buf)" to repeating the "128", as the two
getting out of sync would be disastrous. So altogether something like:


  static void http_opt_request_remainder(CURL *curl, off_t pos)
  {
	char buf[128];
	xsnprintf(buf, sizeof(buf), "%"PRIuMAX", (uintmax_t)pos);
	curl_easy_setopt(curl, CURLOPT_RANGE, buf);
  }

would be enough, I think.

-Peff

  reply	other threads:[~2015-11-02 20:18 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-02 19:36 [PATCH v3] http.c: use CURLOPT_RANGE for range requests David Turner
2015-11-02 20:18 ` Jeff King [this message]
2015-11-02 21:50   ` Andreas Schwab
2015-11-02 22:10     ` Jeff King
2015-11-02 22:41       ` David Turner
2015-11-02 22:45         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151102201831.GA10722@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).