git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Brandon Casey <drafnel@gmail.com>
Cc: Brandon Casey <bcasey@nvidia.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	daniel@haxx.se
Subject: Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
Date: Tue, 18 Jun 2013 18:13:28 -0400
Message-ID: <20130618221327.GA14234@sigill.intra.peff.net>
In-Reply-To: <CA+sFfMdEvwzmnEBeO+_pwdmN3m5rkJvUCVFFJU8mtmyN+WxH6w@mail.gmail.com>

On Tue, Jun 18, 2013 at 12:29:03PM -0700, Brandon Casey wrote:

> >   1. Older versions of curl (and I do not recall which version off-hand,
> >      but it is not important) stored just the pointer. Calling code was
> >      required to manage the string lifetime itself.
> 
> Daniel mentions that the change happened in libcurl 7.17.  RHEL 4.X
> (yes, ancient, dead, I realize) provides 7.12 and RHEL 5.X (yes,
> ancient, but still widely in use) provides 7.15.  Just pointing it
> out.

Yeah, I didn't mean to imply "we don't care about these versions", only
that our analysis is different between the two sets. We have #ifdefs for
curl going back to 7.7.4. That's probably excessive, but AFAIK, we would
still work with such old versions.
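
To make the two sets concrete, here is a minimal sketch of the version
split. libcurl encodes its version number as 0xXXYYZZ, so 7.17.0 is
0x071100. The helper name curl_copies_strings() is made up for
illustration and is not something in http.c; real code would compare
LIBCURL_VERSION_NUM from <curl/curl.h> at compile time instead.

```c
#include <stdbool.h>

/* libcurl encodes its version as 0xXXYYZZ; 7.17.0 is 0x071100. */
#define CURL_STRINGS_COPIED_SINCE 0x071100UL

/*
 * Hypothetical helper (not part of git or libcurl): returns true if a
 * libcurl with the given version number copies string options passed to
 * curl_easy_setopt(), false if the caller has to keep the string alive
 * for as long as the handle uses it.
 */
static bool curl_copies_strings(unsigned long version_num)
{
	return version_num >= CURL_STRINGS_COPIED_SINCE;
}
```

With this, 7.12.0 (0x070c00) and 7.15.5 (0x070f05) land in the "caller
manages the lifetime" set, and 7.17.0 and later in the "curl copies"
set.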

> > It could be a problem when we have multiple handles in play
> > simultaneously (we invalidate the pointer that another simultaneous
> > handle is using, but do not immediately reset its pointer).
> 
> Don't we have multiple handles in play at the same time?  What's going
> on in get_active_slot() when USE_CURL_MULTI is defined?  It appears to
> be maintaining a list of "slot" 's, each with its own curl handle
> initialized either by curl_easy_duphandle() or get_curl_handle().

Yes, we do; the dumb http walker will pipeline loose pack and object
requests (which makes a big difference when fetching small files). The
smart http code may use the curl-multi interface under the hood, but it
should only have a single handle, and the use of the multi interface is
just for sharing code with the dumb fetch.

> So, yeah, this is what I was referring to when I mentioned
> "potentially dangerous".  Since the current code does not change the
> size of the string, the pointer will never change, so we won't ever
> invalidate a pointer that another handle is using.

Agreed. I did not so much mean to dispute your "potentially dangerous"
claim as clarify exactly what the potential is. :)

> The other thing I thought was potentially dangerous, was just
> truncating the string.  Again, if there are multiple curl handles in
> use (which I thought was a possibility), then merely truncating the
> string that contains the username/password could potentially cause a
> problem for another handle that could be in the middle of
> authenticating using the string.  But, I don't know if there is any
> multi-processing happening within the curl library.

I don't think curl does any threading; when we are not inside
curl_*_perform, there is no curl code running at all (Daniel can correct
me if I'm wrong on that).

So I think from curl's perspective a truncation and exact rewrite is
atomic, and it sees only the final content.  I don't know what would
happen if you truncated and put in _different_ contents. For example,
curl might write out half of the username/password, block, and return
from curl_multi_perform; you then update the buffer, and curl resumes
writing from the new contents.

IOW, I believe the current code is safe (though in a very subtle way),
but if you were to allow password update, I'm not sure if it would be or
not (and if not, you would need a per-handle buffer to make it safe).
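
A rough sketch of what I mean by a per-handle buffer, assuming each slot
owns its own copy of the string. The names struct slot_cred and
slot_set_cred() are hypothetical; this is not the layout of the slot
structure in http.c, and the curl call is left out so the sketch stands
on its own.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-handle state; not git's actual slot structure. */
struct slot_cred {
	char *user_pass; /* "user:pass", owned by this slot alone */
};

/*
 * Build a fresh "user:pass" string for one slot. The new buffer is fully
 * written before the old one is freed, and no other slot ever points at
 * either of them. Returns 0 on success, -1 on allocation failure.
 */
static int slot_set_cred(struct slot_cred *slot,
			 const char *user, const char *pass)
{
	size_t len = strlen(user) + 1 + strlen(pass) + 1;
	char *fresh = malloc(len);

	if (!fresh)
		return -1;
	snprintf(fresh, len, "%s:%s", user, pass);

	free(slot->user_pass);
	slot->user_pass = fresh;
	/*
	 * A real caller would now hand slot->user_pass to this slot's own
	 * curl handle (e.g. via CURLOPT_USERPWD), and only to that one.
	 */
	return 0;
}
```

Updating the password on one slot then cannot disturb a string that
another in-flight handle is still reading, at the cost of one small
allocation per slot.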

I'm fine with making the safety less subtle (e.g., your patch, with a
comment added).
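
Something along these lines is the shape I have in mind; it is only a
sketch and not necessarily what your patch does. The fixed-size array
and the names user_pass, rewrite_count, and get_user_pass() are
assumptions for illustration (http.c keeps the string in a strbuf).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical shared buffer; git's http.c uses a strbuf instead. */
static char user_pass[256];
static int rewrite_count; /* counts writes, for illustration only */

/*
 * Fill the shared "user:pass" buffer the first time it is needed.
 *
 * Later calls must leave the buffer alone: other curl handles may still
 * hold a pointer into it, and older libcurl versions do not copy the
 * string. Writing it exactly once means those handles never see it
 * change underneath them.
 */
static const char *get_user_pass(const char *user, const char *pass)
{
	if (!user_pass[0]) {
		snprintf(user_pass, sizeof(user_pass), "%s:%s", user, pass);
		rewrite_count++;
	}
	return user_pass;
}
```

Every caller gets the same pointer back, and the contents are written
once no matter how many requests are made.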

> If we _don't_ ever use multiple curl handles, and/or if there is no
> threading going on in the background within libcurl, then I don't
> think there is really any danger in what the current code does.  It
> would just be an issue of needlessly rewriting the same string over
> and over again, which is probably not a big deal depending on how
> often that happens.

It should be once per http request. But copying a dozen bytes is
probably nothing compared to the actual request.
