All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brandon Williams <bmwill@google.com>
To: Ivan Vyshnevskyi <sainaen@gmail.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: [PATCH/RFC] push: anonymize URL in error output
Date: Fri, 25 Aug 2017 12:37:43 -0700	[thread overview]
Message-ID: <20170825193743.GD103659@google.com> (raw)
In-Reply-To: <82741094-19a6-e071-227d-f92b3b077a69@gmail.com>

On 08/24, Ivan Vyshnevskyi wrote:
> On 23/08/17 18:58, Jeff King wrote:
> > On Wed, Aug 23, 2017 at 12:49:29PM +0300, Ivan Vyshnevskyi wrote:
> > 
> >> Commits 47abd85 (fetch: Strip usernames from url's before storing them,
> >> 2009-04-17) and later 882d49c (push: anonymize URL in status output,
> >> 2016-07-14) made fetch and push strip the authentication part of the
> >> remote URLs when used in the merge-commit messages or status outputs.
> >> The URLs that are part of the error messages were not anonymized.
> >>
> >> A commonly used pattern for storing artifacts from a build server in a
> >> remote repository utilizes a "secure" environment variable with
> >> credentials to embed them in the URL and execute a push. Given enough
> >> runs, an intermittent network failure will cause a push to fail, leaving
> >> a non-anonymized URL in the build log.
> >>
> >> To prevent that, reuse the same anonymizing function to scrub
> >> credentials from URL in the push error output.
> > 
> > This makes sense. I suspect that most errors we output should be using
> > the anonymized URL. Did you poke around for other calls?
> Yes, I tried to check and unfortunately there are couple of places with
> possible leaks:
> * 'discover_refs()' in remote-curl.c when there's a HTTP error (see a
> real-life scenario with an authz error in my response to Lars) -- is it
> ok to include transport.h just to use one function or is there a cleaner
> way?
> * 'setup_push_upstream()' in push.c when a command doesn't have a branch
> names (haven't saw problems with this in the wild, but could occur
> during the CI setup) -- for this one, probably anonymization should
> happen when the 'remote->name' field is set in the 'make_remote()'; same
> question though, is it ok to include transport.h here?
> 
> Also there's an case of verbose output: I'm not sure I should change it,
> but it does print out the non-anonymized URLs at least during push.
> > 
> > The general structure of the patch looks good, but I have a few minor
> > comments below.
> > 
> >> Not sure how much of the background should be included in the commit message.
> >> The "commonly used pattern" I mention could be found in the myriad of
> >> online tutorials and looks something like this:
> > 
> > My knee-jerk reaction is if it's worth writing after the dashes, it's
> > worth putting in the commit message.
> > 
> > However, in the case I think it is OK as-is (the motivation of "we
> > already avoid leaking auth info to stdout, so we should do the same for
> > error messages" seems self-contained and reasonable)
> Well, I tend to be wordy, and most of the commit messages I saw were
> rather short, so decided to split. Wonder, if maybe example command
> should be included without the rest of it. Would it be useful?

I'm guilty of writing short commit messages (something I need to work
on) but when looking through logs I much prefer to see longer messages
explaining rationals and trade-offs.

> > 
> >> diff --git a/builtin/push.c b/builtin/push.c
> >> index 03846e837..59f3bc975 100644
> >> --- a/builtin/push.c
> >> +++ b/builtin/push.c
> >> @@ -336,7 +336,7 @@ static int push_with_options(struct transport *transport, int flags)
> >>  	err = transport_push(transport, refspec_nr, refspec, flags,
> >>  			     &reject_reasons);
> >>  	if (err != 0)
> >> -		error(_("failed to push some refs to '%s'"), transport->url);
> >> +		error(_("failed to push some refs to '%s'"), transport_anonymize_url(transport->url));
> > 
> > This leaks the return value. That's probably not a _huge_ deal since the
> > program is likely to exit, but it's a bad pattern. I wonder if we should
> > be setting up transport->anonymous_url preemptively, and just let its
> > memory belong to the transport struct.
> Ah. Thanks! I knew I'd fail in the memory management even with the
> one-line patch. :)
> 
> About 'transport->anonymous_url': not sure if it's worth it. There are
> four calls to 'transport_anonymize_url' right now and it looks like the
> new one in my patch is the first that has a transport struct instance
> near by. The next likely candidate for update 'discover_refs()' also
> gets the url as an argument.
> > 
> >> diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
> >> index d38bf3247..0b6fb6252 100755
> >> --- a/t/t5541-http-push-smart.sh
> >> +++ b/t/t5541-http-push-smart.sh
> >> @@ -377,5 +377,23 @@ test_expect_success 'push status output scrubs password' '
> >>  	grep "^To $HTTPD_URL/smart/test_repo.git" status
> >>  '
> >>  
> >> +cat >"$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update" <<EOF
> >> +#!/bin/sh
> >> +exit 1
> >> +EOF
> >> +chmod a+x "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update"
> >> +
> >> +cat >exp <<EOF
> >> +error: failed to push some refs to '$HTTPD_URL/smart/test_repo.git'
> >> +EOF
> > 
> > I know the t5541 script, which is old and messy, led you into these bad
> > constructs. But usually in modern tests we:
> > 
> >  1. Try to keep all commands inside test_expect blocks to catch
> >     unexpected failures or unwanted output.
> > 
> >  2. Use write_script for writing scripts, like:
> > 
> >       write_script "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update" <<-\EOF
> >       exit 1
> >       EOF
> > 
> >  3. Backslash our here-doc delimiter to suppress interpolation.
> > 
> >> +test_expect_success 'failed push status output scrubs password' '
> >> +	cd "$ROOT_PATH"/test_repo_clone &&
> >> +	test_must_fail git push "$HTTPD_URL_USER_PASS/smart/test_repo.git" +HEAD:scrub_err 2>stderr &&
> >> +	grep "^error: failed to push some refs" stderr >act &&
> >> +	test_i18ncmp exp act
> >> +'
> >> +rm -f "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update"
> > 
> > Similarly, this "rm" should probably be a test_when_finished in the
> > block with the write_script (unless you really need to carry it over
> > several test_expect blocks, in which case there should be an explicit
> > test_expect cleaning it up).
> Thanks! You're right. I just followed examples in the file.
> Updated [1], will send with the next patch version.
> 
> > 
> > Instead of grepping for the exact error, should we instead grep for the
> > password to make sure it is not present on _any_ line?
> > 
> > -Peff
> > 
> One possible issue I see is that this will make it overlap with the
> 'push status output scrubs password' case above. But if it's not a
> problem, I can replace last two lines with just a 'test_i18ngrep !'
> 
> [1]:
> https://github.com/sainaen/git/blob/af17713/t/t5541-http-push-smart.sh#L380-L392

-- 
Brandon Williams

  reply	other threads:[~2017-08-25 19:37 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-23  9:49 [PATCH/RFC] push: anonymize URL in error output Ivan Vyshnevskyi
2017-08-23 10:57 ` Lars Schneider
2017-08-24 18:58   ` Ivan Vyshnevskyi
2017-08-23 15:58 ` Jeff King
2017-08-24 19:01   ` Ivan Vyshnevskyi
2017-08-25 19:37     ` Brandon Williams [this message]
2017-08-26 19:00       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170825193743.GD103659@google.com \
    --to=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=sainaen@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.