git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH] http: use --stdin and --keep when downloading pack
Date: Thu, 21 Feb 2019 09:08:09 -0500	[thread overview]
Message-ID: <20190221140809.GA21759@sigill.intra.peff.net> (raw)
In-Reply-To: <20190221001447.124088-1-jonathantanmy@google.com>

On Wed, Feb 20, 2019 at 04:14:47PM -0800, Jonathan Tan wrote:

> This is part of the work of CDN offloading of fetch responses.
> 
> I have plans to use the http_pack_request suite of functions to
> implement the part where we download from CDN over HTTP(S), but need
> this change to be able to do so. I think it's better from the code
> quality perspective to reuse these functions, but this necessitates a
> behavior change in that we no longer use the filename as declared by the
> server, so I'm sending this as RFC to see what the community thinks.

I think it makes sense. We don't use the server names for any of the
other protocols, and I'm happy to see one less place where we may
inherit a stupid or malicious item of data from the server.

> diff --git a/http-push.c b/http-push.c
> index b22c7caea0..409b266b0c 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -586,11 +586,16 @@ static void finish_request(struct transfer_request *request)
>  			fprintf(stderr, "Unable to get pack file %s\n%s",
>  				request->url, curl_errorstr);
>  		} else {
> +			char *lockfile;
> +
>  			preq = (struct http_pack_request *)request->userData;
>  
>  			if (preq) {
> -				if (finish_http_pack_request(preq) == 0)
> +				if (finish_http_pack_request(preq,
> +							     &lockfile) == 0) {
> +					unlink(lockfile);
>  					fail = 0;
> +				}
>  				release_http_pack_request(preq);
>  			}
>  		}

I was puzzled that you had to touch http-push.c. But indeed, it seems to
have some fetching code in it, too? I'm willing to throw up my hands in
disgust at the http-push code without looking further at this point. :)

>  	argv_array_push(&ip.args, "index-pack");
> -	argv_array_pushl(&ip.args, "-o", tmp_idx, NULL);
> -	argv_array_push(&ip.args, preq->tmpfile.buf);
> +	argv_array_push(&ip.args, "--stdin");
> +	argv_array_pushf(&ip.args, "--keep=git %"PRIuMAX, (uintmax_t)getpid());
>  	ip.git_cmd = 1;
> -	ip.no_stdin = 1;
> -	ip.no_stdout = 1;
> +	ip.in = tmpfile_fd;
> +	ip.out = -1;
>  
> -	if (run_command(&ip)) {
> -		unlink(preq->tmpfile.buf);
> -		unlink(tmp_idx);
> -		free(tmp_idx);
> -		return -1;
> +	if (start_command(&ip)) {
> +		ret = -1;
> +		goto cleanup;
>  	}
>  
> -	unlink(sha1_pack_index_name(p->sha1));
> +	*lockfile = index_pack_lockfile(ip.out);
> +	close(ip.out);

We're now doing bi-directional I/O with index-pack. But it should be
deadlock-free, because we know the output is small and will only come at
the end after we've closed its stdin.

> -	if (finalize_object_file(preq->tmpfile.buf, sha1_pack_name(p->sha1))
> -	 || finalize_object_file(tmp_idx, sha1_pack_index_name(p->sha1))) {
> -		free(tmp_idx);
> -		return -1;
> +	if (finish_command(&ip)) {
> +		ret = -1;
> +		goto cleanup;
>  	}

If the command fails but we got something in *lockfile, should we clean
it up? Likewise, do we need to be installing a signal handler to clean
it up in case we die in other code paths (or by a signal)?

-Peff

  reply	other threads:[~2019-02-21 14:08 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-21  0:14 [RFC PATCH] http: use --stdin and --keep when downloading pack Jonathan Tan
2019-02-21 14:08 ` Jeff King [this message]
2019-02-21 18:49   ` Jonathan Tan
2019-02-22  5:04     ` Jeff King
2019-02-21 21:26 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190221140809.GA21759@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).