Git development
 help / color / mirror / Atom feed
* Re: [PATCH] difftool.c: mark a file-local symbol with static
From: Junio C Hamano @ 2016-12-01 18:20 UTC (permalink / raw)
  To: Jeff King; +Cc: Ramsay Jones, Johannes Schindelin, GIT Mailing-list
In-Reply-To: <20161201040234.3rnuttitneweedn5@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> I don't have a preference on which direction we go, but yes, right now
> we are in an awkward middle ground. We should do one of:
>
>   1. Drop -Wno-format-zero-length from DEVELOPER_CFLAGS and make sure
>      future patches to do not violate it.
>
>   2. Declare warning("") as OK.
>
> I still think the warning is silly, but (1) has value in that it
> produces the least surprise and annoyance to various people building
> Git.

What is most awkward is that "make", with no customization out of
box, suggests to use -Wall in CFLAGS and triggers the misguided
warning, and DEVELOPER_CFLAGS, partly because it adds -Werror to
turn warnings (including this misguided one) into errors, disables
it with -Wno-format-zero-length.  As a result:

 - Casual builders that follow our suggestion get warnings.

 - Developers do not notice their new code may make the "casual
   builder" experience worse.

That is totally backwards.  That is probably what you meant by "the
least surprise and annoyance"?

I also still think that any_printf_like_function("%s", "") looks
silly.  I know that we've already started moving in that direction
and we stopped at a place we do not want to be in, but perhaps it
was a mistake to move in that direction in the first place.  I am
tempted to say we should do something like the attached, but that
may not fly well, as I suspect that -Wno-format-zero-length may be a
lot more exotic than the -Wall compiler option.  An obvious second
best option would be to drop -Wall from the "everybody" CFLAGS and
move it to DEVELOPER_CFLAGS instead.

diff --git a/Makefile b/Makefile
index a379738db2..137c10e257 100644
--- a/Makefile
+++ b/Makefile
@@ -391,10 +391,9 @@ GIT-VERSION-FILE: FORCE
 
 # CFLAGS and LDFLAGS are for the users to override from the command line.
 
-CFLAGS = -g -O2 -Wall
+CFLAGS = -g -O2 -Wall -Wno-format-zero-length
 DEVELOPER_CFLAGS = -Werror \
 	-Wdeclaration-after-statement \
-	-Wno-format-zero-length \
 	-Wold-style-definition \
 	-Woverflow \
 	-Wpointer-arith \



^ permalink raw reply related

* Re: Re* git pull --rebase should use fast forward merge if possible
From: Stefan Beller @ 2016-12-01 18:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org, neuling
In-Reply-To: <xmqqa8cfbkeq.fsf_-_@gitster.mtv.corp.google.com>

On Thu, Dec 1, 2016 at 9:59 AM, Junio C Hamano <gitster@pobox.com> wrote:

> +test_expect_success '--rebase fast forward' '
> +       git reset --hard before-rebase &&
> +       git checkout -b ff &&
> +       echo another modification >file &&
> +       git commit -m third file &&
> +
> +       git checkout to-rebase &&
> +       git pull --rebase . ff &&
> +       test "$(git rev-parse HEAD)" = "$(git rev-parse ff)" &&
> +
> +       # The above only validates the result.  Did we actually bypass rebase?

Good catch for the test, but I think we can make the sed regexp simpler, as we
can leave out the second "[0-9a-f]"? (git reflog |sed
"s/^[0-9a-f]*/OBJID/" works here)

The implication of that we'd also match if there is no object id at
all at the beginning,
which sounds fine.

I shortly debated the idea to just cut off anything before the first
space and then expect
"HEAD@{0}: pull --rebase . ff: Fast-forward" only, but why cut off
what we produce
in the first place?

    git reflog --format="%s" -1 >actual &&
    echo "pull --rebase . ff: Fast-forward" >expect &&
    test_cmp expect actual

maybe?

^ permalink raw reply

* Re: [PATCH v2 1/1] convert: git cherry-pick -Xrenormalize did not work
From: Junio C Hamano @ 2016-12-01 18:27 UTC (permalink / raw)
  To: tboegi; +Cc: git, eevee.reply
In-Reply-To: <20161130170232.19685-1-tboegi@web.de>

tboegi@web.de writes:

> From: Torsten Bögershausen <tboegi@web.de>
>
> Working with a repo that used to be all CRLF. At some point it
> was changed to all LF, with `text=auto` in .gitattributes.
> Trying to cherry-pick a commit from before the switchover fails:
>
> $ git cherry-pick -Xrenormalize <commit>
>     fatal: CRLF would be replaced by LF in [path]
>
> Commit 65237284 "unify the "auto" handling of CRLF" introduced
> a regression:
>
> Whenever crlf_action is CRLF_TEXT_XXX and not CRLF_AUTO_XXX,
> SAFE_CRLF_RENORMALIZE was feed into check_safe_crlf().
> This is wrong because here everything else than SAFE_CRLF_WARN is
> treated as SAFE_CRLF_FAIL.
>
> Call check_safe_crlf() only if checksafe is SAFE_CRLF_WARN or SAFE_CRLF_FAIL.
>
> Reported-by: Eevee (Lexy Munroe) <eevee@veekun.com>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---

I think the description and the code match with each other much
better and the resulting code explains what is going on more
clearly.  Thanks---will queue.

>  convert.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/convert.c b/convert.c
> index be91358..f8e4dfe 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -281,13 +281,13 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
>  		/*
>  		 * If the file in the index has any CR in it, do not convert.
>  		 * This is the new safer autocrlf handling.
> +		   - unless we want to renormalize in a merge or cherry-pick
>  		 */
> -		if (checksafe == SAFE_CRLF_RENORMALIZE)
> -			checksafe = SAFE_CRLF_FALSE;
> -		else if (has_cr_in_index(path))
> +		if ((checksafe != SAFE_CRLF_RENORMALIZE) && has_cr_in_index(path))
>  			convert_crlf_into_lf = 0;
>  	}
> -	if (checksafe && len) {
> +	if ((checksafe == SAFE_CRLF_WARN ||
> +	    (checksafe == SAFE_CRLF_FAIL)) && len) {
>  		struct text_stat new_stats;
>  		memcpy(&new_stats, &stats, sizeof(new_stats));
>  		/* simulate "git add" */

^ permalink raw reply

* Re: [PATCH v6 1/6] submodules: add helper functions to determine presence of submodules
From: Stefan Beller @ 2016-12-01 18:31 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, git@vger.kernel.org, Jonathan Tan,
	Junio C Hamano
In-Reply-To: <20161201042926.mr2qdta7hviizcya@sigill.intra.peff.net>

On Wed, Nov 30, 2016 at 8:29 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Nov 30, 2016 at 05:28:29PM -0800, Brandon Williams wrote:
>
>> +/*
>> + * Determine if a submodule has been populated at a given 'path'
>> + */
>> +int is_submodule_populated(const char *path)
>> +{
>> +     int ret = 0;
>> +     struct stat st;
>> +     char *gitdir = xstrfmt("%s/.git", path);
>> +
>> +     if (!stat(gitdir, &st))
>> +             ret = 1;
>> +
>> +     free(gitdir);
>> +     return ret;
>> +}
>
> I don't know if it's worth changing or not, but this could be a bit
> shorter:
>
>   int is_submodule_populated(const char *path)
>   {
>         return !access(mkpath("%s/.git", path), F_OK);
>   }
>
> There is a file_exists() helper, but it uses lstat(), which I think you
> don't want (because you'd prefer to bail on a broken .git symlink). But
> access(F_OK) does what you want, I think.
>
> mkpath() is generally an unsafe function because it uses a static
> buffer, but it's handy and safe for handing values to syscalls like
> this.
>
> I say "I don't know if it's worth it" because what you've written is
> fine, and while more lines, it's fairly obvious and safe.

OK, chiming in here as well. :)

I plan on making use of the is_submodule_populated method in
the checkout --recurse-submodules series, and for that I am
undecided whether a cheap stat is the right approach or if we want
to have the result of resolve_gitdir as that fails in weird corner cases.

Anyway, I'd propose to change the name when going with either the
code as is or what Jeff proposes to be one of

    is_submodule_populated_cheaply
    is_submodule_populated_with_no_sanity_check
    is_submodule_dot_git_present
    have_submodule_dot_git

I think I'd prefer the last one as that describes what the function
actually does in a concise way?

Thanks,
Stefan

^ permalink raw reply

* Re: [PATCH v6 1/6] submodules: add helper functions to determine presence of submodules
From: Junio C Hamano @ 2016-12-01 18:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, sbeller, jonathantanmy
In-Reply-To: <20161201042926.mr2qdta7hviizcya@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Wed, Nov 30, 2016 at 05:28:29PM -0800, Brandon Williams wrote:
>
>> +/*
>> + * Determine if a submodule has been populated at a given 'path'
>> + */
>> +int is_submodule_populated(const char *path)
>> +{
>> +	int ret = 0;
>> +	struct stat st;
>> +	char *gitdir = xstrfmt("%s/.git", path);
>> +
>> +	if (!stat(gitdir, &st))
>> +		ret = 1;
>> +
>> +	free(gitdir);
>> +	return ret;
>> +}
>
> I don't know if it's worth changing or not, but this could be a bit
> shorter:
>
>   int is_submodule_populated(const char *path)
>   {
> 	return !access(mkpath("%s/.git", path), F_OK);
>   }
>
> There is a file_exists() helper, but it uses lstat(), which I think you
> don't want (because you'd prefer to bail on a broken .git symlink). But
> access(F_OK) does what you want, I think.
>
> mkpath() is generally an unsafe function because it uses a static
> buffer, but it's handy and safe for handing values to syscalls like
> this.

I think your "unsafe" is not about thread-safety but about "the
caller cannot rely on returned value staying valid for long haul".
If this change since v5 is about thread-safety, I am not sure if it
is safe to use mkpath here.

I am a bit wary of making the check too sketchy like this, but this
is not about determining if a random "path" that has ".git" in a
superproject working tree is a submodule or not (that information
primarily comes from the superproject index), so I tend to agree
with the patch that it is sufficient to check presence of ".git"
alone.

^ permalink raw reply

* Re: [PATCH v6 5/6] grep: enable recurse-submodules to work on <tree> objects
From: Junio C Hamano @ 2016-12-01 18:49 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Johannes Sixt, git, peff, sbeller, jonathantanmy
In-Reply-To: <20161201175107.GB51406@google.com>

Brandon Williams <bmwill@google.com> writes:

> On 12/01, Johannes Sixt wrote:
>> Am 01.12.2016 um 02:28 schrieb Brandon Williams:
>> >+	git init "su:b" &&
>> 
>> Don't do that. Colons in file names won't work on Windows.
>> 
>> -- Hannes
>> 
>
> This test is needed to see if the code still works with filenames that
> contain colons.  Is there a way to mark the test to not run on windows?

Something like:

test_expect_success !MINGW 'a test' '
	git init s:u:b
'


^ permalink raw reply

* Re: Re* git pull --rebase should use fast forward merge if possible
From: Junio C Hamano @ 2016-12-01 18:50 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, neuling
In-Reply-To: <CAGZ79kZSEan5uXCUn4iVCWEc9zohMSr+UDyHDyQUHz84H=tR8w@mail.gmail.com>

Stefan Beller <sbeller@google.com> writes:

> On Thu, Dec 1, 2016 at 9:59 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> +test_expect_success '--rebase fast forward' '
>> +       git reset --hard before-rebase &&
>> +       git checkout -b ff &&
>> +       echo another modification >file &&
>> +       git commit -m third file &&
>> +
>> +       git checkout to-rebase &&
>> +       git pull --rebase . ff &&
>> +       test "$(git rev-parse HEAD)" = "$(git rev-parse ff)" &&
>> +
>> +       # The above only validates the result.  Did we actually bypass rebase?
>
> Good catch for the test, but I think we can make the sed regexp simpler, as we
> can leave out the second "[0-9a-f]"? (git reflog |sed
> "s/^[0-9a-f]*/OBJID/" works here)

This mimics the existing tests around there for consistency.
Simplifying or cleaning of this test script as a whole is outside
the scope.

^ permalink raw reply

* Re: [PATCH] difftool.c: mark a file-local symbol with static
From: Jeff King @ 2016-12-01 18:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ramsay Jones, Johannes Schindelin, GIT Mailing-list
In-Reply-To: <xmqq60n3bjel.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 01, 2016 at 10:20:50AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I don't have a preference on which direction we go, but yes, right now
> > we are in an awkward middle ground. We should do one of:
> >
> >   1. Drop -Wno-format-zero-length from DEVELOPER_CFLAGS and make sure
> >      future patches to do not violate it.
> >
> >   2. Declare warning("") as OK.
> >
> > I still think the warning is silly, but (1) has value in that it
> > produces the least surprise and annoyance to various people building
> > Git.
> 
> What is most awkward is that "make", with no customization out of
> box, suggests to use -Wall in CFLAGS and triggers the misguided
> warning, and DEVELOPER_CFLAGS, partly because it adds -Werror to
> turn warnings (including this misguided one) into errors, disables
> it with -Wno-format-zero-length.  As a result:
> 
>  - Casual builders that follow our suggestion get warnings.
> 
>  - Developers do not notice their new code may make the "casual
>    builder" experience worse.
> 
> That is totally backwards.  That is probably what you meant by "the
> least surprise and annoyance"?

Yes, exactly. :)

The surprise and annoyance for (1) is that people who get caught up by
the warning while writing new patches say "this warning is stupid, why
don't we disable it". But that is a much smaller number of people to be
annoyed.

> I also still think that any_printf_like_function("%s", "") looks
> silly.  I know that we've already started moving in that direction
> and we stopped at a place we do not want to be in, but perhaps it
> was a mistake to move in that direction in the first place.  I am
> tempted to say we should do something like the attached, but that
> may not fly well, as I suspect that -Wno-format-zero-length may be a
> lot more exotic than the -Wall compiler option.

Yeah, I think portability concerns are what caused us not to go down
that road in the first place. But I think it's also a mistake to put too
much into CFLAGS, because somebody who wants to override one flag has to
repeat the rest. So, e.g., right now I might do:

  make CFLAGS="-O3 -Wall"

to bump the optimization level. But if our codebase relies on
-Wno-format-zero-length being there, then I have to know to put it in
there, too.

You can work around that with an extra make variable, but that also
makes it harder to disable if your compiler doesn't like it.

> An obvious second
> best option would be to drop -Wall from the "everybody" CFLAGS and
> move it to DEVELOPER_CFLAGS instead.

Yeah, though that doesn't help the example above.

As ugly as warning("%s", "") is, I think it may be the thing that annoys
the smallest number of people.

-Peff

^ permalink raw reply

* Re: Re* git pull --rebase should use fast forward merge if possible
From: Stefan Beller @ 2016-12-01 18:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org, neuling
In-Reply-To: <xmqqoa0va3fw.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 1, 2016 at 10:50 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Thu, Dec 1, 2016 at 9:59 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> +test_expect_success '--rebase fast forward' '
>>> +       git reset --hard before-rebase &&
>>> +       git checkout -b ff &&
>>> +       echo another modification >file &&
>>> +       git commit -m third file &&
>>> +
>>> +       git checkout to-rebase &&
>>> +       git pull --rebase . ff &&
>>> +       test "$(git rev-parse HEAD)" = "$(git rev-parse ff)" &&
>>> +
>>> +       # The above only validates the result.  Did we actually bypass rebase?
>>
>> Good catch for the test, but I think we can make the sed regexp simpler, as we
>> can leave out the second "[0-9a-f]"? (git reflog |sed
>> "s/^[0-9a-f]*/OBJID/" works here)
>
> This mimics the existing tests around there for consistency.
> Simplifying or cleaning of this test script as a whole is outside
> the scope.

Ok, in that case the patch looks fine.

^ permalink raw reply

* Re: [PATCH v6 5/6] grep: enable recurse-submodules to work on <tree> objects
From: Jeff King @ 2016-12-01 18:52 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Johannes Sixt, git, sbeller, jonathantanmy, gitster
In-Reply-To: <20161201175107.GB51406@google.com>

On Thu, Dec 01, 2016 at 09:51:07AM -0800, Brandon Williams wrote:

> On 12/01, Johannes Sixt wrote:
> > Am 01.12.2016 um 02:28 schrieb Brandon Williams:
> > >+	git init "su:b" &&
> > 
> > Don't do that. Colons in file names won't work on Windows.
> > 
> > -- Hannes
> > 
> 
> This test is needed to see if the code still works with filenames that
> contain colons.  Is there a way to mark the test to not run on windows?

Junio suggested !MINGW, which seems sensible. Earlier I mentioned doing
the whole thing in-index, but I think that might get tricky because we
try to find the submodule as ".git/modules/<path>". So it probably isn't
worth the trouble.

-Peff

^ permalink raw reply

* Re: [PATCH v6 0/6] recursively grep across submodules
From: Jeff King @ 2016-12-01 19:03 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, jonathantanmy, gitster
In-Reply-To: <20161201174547.GA51406@google.com>

On Thu, Dec 01, 2016 at 09:45:47AM -0800, Brandon Williams wrote:

> Yeah I was trying to think through these scenarios myself last night.
> And like you found it seemed alright to let the child process deal with
> the .git file/dir as long as once actually exists at that path.  If one
> didn't then there would be the possibility that we ended up back at the
> superproject, which would result in an infinite loop.  And yeah if the
> .git file doesn't resolve to anything sensible then the user probably
> mangled their repository somehow anyways.

I hadn't considered the infinite loop. I thought the worst case is that
we might just generate bogus results by going back to the superproject.
But of course there is nothing to stop it from just recursing again.

However, it looks like there is a circuit-breaker; we end up back in the
superproject, but inside a subdirectory, which causes --super-prefix to
complain.

You can test it with just:

  rm submodule/.git
  mkdir submodule/.git

which says:

  fatal: can't use --super-prefix from a subdirectory
  fatal: process for submodule 'foo' failed with exit code: 128

It might be worth including a test to make sure that behavior remains.
I think it's more of an emergent behavior than something planned. :)

-Peff

^ permalink raw reply

* Re: [PATCH v6 1/6] submodules: add helper functions to determine presence of submodules
From: Jeff King @ 2016-12-01 19:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, sbeller, jonathantanmy
In-Reply-To: <xmqqwpfja3nk.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 01, 2016 at 10:46:23AM -0800, Junio C Hamano wrote:

> > mkpath() is generally an unsafe function because it uses a static
> > buffer, but it's handy and safe for handing values to syscalls like
> > this.
> 
> I think your "unsafe" is not about thread-safety but about "the
> caller cannot rely on returned value staying valid for long haul".
> If this change since v5 is about thread-safety, I am not sure if it
> is safe to use mkpath here.

Oh, good point. I meant "staying valid", but somehow totally forgot that
we cared about thread reentrancy here. As if I hadn't just spent an hour
debugging a thread problem.

My suggestion is clearly nonsense.

> I am a bit wary of making the check too sketchy like this, but this
> is not about determining if a random "path" that has ".git" in a
> superproject working tree is a submodule or not (that information
> primarily comes from the superproject index), so I tend to agree
> with the patch that it is sufficient to check presence of ".git"
> alone.

The real danger is that it is a different check than the child process
is going to use, so they may disagree (see the almost-infinite-loop
discussion elsewhere).

It feels quite hacky, but checking:

  if (is_git_directory(suspect))
	return 1; /* actual git dir */
  if (!stat(suspect, &st) && S_ISREG(st.st_mode))
	return 1; /* gitfile; may or may not be valid */
  return 0;

is a little more robust, because the child process will happily skip a
non-repo ".git" and keep walking back up to the superproject. Whereas if
it sees any ".git" file, even if it is bogus, it will barf then and
there.

I'm actually not sure if that latter behavior is a bug or not. I don't
think it was really planned out, and it obviously is inconsistent with
the other repo-discovery cases. But it is a convenient side effect for
submodules, and I doubt anybody is bothered by it in practice.

-Peff

^ permalink raw reply

* Re: [PATCH v6 1/6] submodules: add helper functions to determine presence of submodules
From: Brandon Williams @ 2016-12-01 19:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, sbeller, jonathantanmy
In-Reply-To: <20161201190925.xi2z7vauxyf3yxyc@sigill.intra.peff.net>

On 12/01, Jeff King wrote:
> On Thu, Dec 01, 2016 at 10:46:23AM -0800, Junio C Hamano wrote:
> 
> > > mkpath() is generally an unsafe function because it uses a static
> > > buffer, but it's handy and safe for handing values to syscalls like
> > > this.
> > 
> > I think your "unsafe" is not about thread-safety but about "the
> > caller cannot rely on returned value staying valid for long haul".
> > If this change since v5 is about thread-safety, I am not sure if it
> > is safe to use mkpath here.
> 
> Oh, good point. I meant "staying valid", but somehow totally forgot that
> we cared about thread reentrancy here. As if I hadn't just spent an hour
> debugging a thread problem.
> 
> My suggestion is clearly nonsense.
> 
> > I am a bit wary of making the check too sketchy like this, but this
> > is not about determining if a random "path" that has ".git" in a
> > superproject working tree is a submodule or not (that information
> > primarily comes from the superproject index), so I tend to agree
> > with the patch that it is sufficient to check presence of ".git"
> > alone.
> 
> The real danger is that it is a different check than the child process
> is going to use, so they may disagree (see the almost-infinite-loop
> discussion elsewhere).
> 
> It feels quite hacky, but checking:
> 
>   if (is_git_directory(suspect))
> 	return 1; /* actual git dir */
>   if (!stat(suspect, &st) && S_ISREG(st.st_mode))
> 	return 1; /* gitfile; may or may not be valid */
>   return 0;
> 
> is a little more robust, because the child process will happily skip a
> non-repo ".git" and keep walking back up to the superproject. Whereas if
> it sees any ".git" file, even if it is bogus, it will barf then and
> there.
> 
> I'm actually not sure if that latter behavior is a bug or not. I don't
> think it was really planned out, and it obviously is inconsistent with
> the other repo-discovery cases. But it is a convenient side effect for
> submodules, and I doubt anybody is bothered by it in practice.
> 
> -Peff

I think this more robust check is probably a good idea, that way we
don't step into a submodule with a .git directory that isn't really a
.git dir.

-- 
Brandon Williams

^ permalink raw reply

* Re: bw/transport-protocol-policy
From: Jeff King @ 2016-12-01 19:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Junio C Hamano, git
In-Reply-To: <20161201181415.GA54082@google.com>

On Thu, Dec 01, 2016 at 10:14:15AM -0800, Brandon Williams wrote:

> >   1. The new policy config lets you say "only allow this protocol when
> >      the user specifies it". But when http.c calls is_transport_allowed(),
> >      the latter has no idea that we are asking it about potential
> >      redirects (which obviously do _not_ come from the user), and would
> >      erroneously allow them.
> > 
> >      I think this needs fixed before the topic is merged. It's not a
> >      regression, as it only comes into play if you use the new policy
> >      config. But it is a minor security hole in the new feature.
> 
> I agree and it should be an easy fix.  We can just add a parameter like
> so:
> 
> diff --git a/transport.c b/transport.c
> index 2c0ec76..d38d50f 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -723,7 +723,7 @@ static enum protocol_allow_config get_protocol_config(const char *type)
>  	return PROTOCOL_ALLOW_USER_ONLY;
>  }
>  
> -int is_transport_allowed(const char *type)
> +int is_transport_allowed(const char *type, int redirect)
>  {
>  	const struct string_list *whitelist = protocol_whitelist();
>  	if (whitelist)
> @@ -735,7 +735,7 @@ int is_transport_allowed(const char *type)
>  	case PROTOCOL_ALLOW_NEVER:
>  		return 0;
>  	case PROTOCOL_ALLOW_USER_ONLY:
> -		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> +		return git_env_bool("GIT_PROTOCOL_FROM_USER", !redirect);
>  	}
>  
>  	die("BUG: invalid protocol_allow_config type");
> 
> That way the libcurl code can say it is asking if it is ok to redirect
> to that protocol.

I wouldn't expect anyone to ever set GIT_PROTOCOL_FROM_USER=1, but it
does behave in a funny way here, overriding the "redirect" flag. I think
we'd want something more like:

  if (redirect < 0)
	redirect = git_env_bool("GIT_PROTOCOL_FROM_USER", 1);

and then pass in "-1" from transport_check_allowed().

I think that's sufficient to fix the topic as-is. However, the http
redirect series adds an extra complication, because with http-alternates
we resolve some of the redirects ourselves. So in those cases we'd want
to restrict CURLOPT_PROTOCOLS as if they were redirects. We may need to
set up two CURLOPT values: ones from the user and ones from redirects,
and then feed them to CURLOPT_PROTOCOLS and CURLOPT_REDIR_PROTOCOLS as
appropriate depending on the request context.

> We should switch to warning all the time since this series adds in
> default whitelisted/blacklisted protocols anyways.

Yeah, good point. As a bonus it makes the code simpler.

-Peff

^ permalink raw reply

* Re: bw/transport-protocol-policy
From: Brandon Williams @ 2016-12-01 19:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git
In-Reply-To: <20161201192055.44mtvtacyhpbqbqu@sigill.intra.peff.net>

On 12/01, Jeff King wrote:
> On Thu, Dec 01, 2016 at 10:14:15AM -0800, Brandon Williams wrote:
> 
> > >   1. The new policy config lets you say "only allow this protocol when
> > >      the user specifies it". But when http.c calls is_transport_allowed(),
> > >      the latter has no idea that we are asking it about potential
> > >      redirects (which obviously do _not_ come from the user), and would
> > >      erroneously allow them.
> > > 
> > >      I think this needs fixed before the topic is merged. It's not a
> > >      regression, as it only comes into play if you use the new policy
> > >      config. But it is a minor security hole in the new feature.
> > 
> > I agree and it should be an easy fix.  We can just add a parameter like
> > so:
> > 
> > diff --git a/transport.c b/transport.c
> > index 2c0ec76..d38d50f 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -723,7 +723,7 @@ static enum protocol_allow_config get_protocol_config(const char *type)
> >  	return PROTOCOL_ALLOW_USER_ONLY;
> >  }
> >  
> > -int is_transport_allowed(const char *type)
> > +int is_transport_allowed(const char *type, int redirect)
> >  {
> >  	const struct string_list *whitelist = protocol_whitelist();
> >  	if (whitelist)
> > @@ -735,7 +735,7 @@ int is_transport_allowed(const char *type)
> >  	case PROTOCOL_ALLOW_NEVER:
> >  		return 0;
> >  	case PROTOCOL_ALLOW_USER_ONLY:
> > -		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> > +		return git_env_bool("GIT_PROTOCOL_FROM_USER", !redirect);
> >  	}
> >  
> >  	die("BUG: invalid protocol_allow_config type");
> > 
> > That way the libcurl code can say it is asking if it is ok to redirect
> > to that protocol.
> 
> I wouldn't expect anyone to ever set GIT_PROTOCOL_FROM_USER=1, but it
> does behave in a funny way here, overriding the "redirect" flag. I think
> we'd want something more like:
> 
>   if (redirect < 0)
> 	redirect = git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> 
> and then pass in "-1" from transport_check_allowed().

I don't think I quite follow your solution but I came up with this:

  case PROTOCOL_ALLOW_USER_ONLY:
    return redirect ? 0 : git_env_bool("GIT_PROTOCOL_FROM_USER", 1);

Which should address the same issue.

-- 
Brandon Williams

^ permalink raw reply

* Re: [PATCH 1/3] compat: add qsort_s()
From: Jeff King @ 2016-12-01 19:35 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Junio C Hamano, Johannes Schindelin
In-Reply-To: <fc602a66-a06c-203e-b50b-55fd7b258b54@web.de>

On Thu, Dec 01, 2016 at 05:26:43PM +0100, René Scharfe wrote:

> The function qsort_s() was introduced with C11 Annex K; it provides the
> ability to pass a context pointer to the comparison function, supports
> the convention of using a NULL pointer for an empty array and performs a
> few safety checks.
> 
> Add an implementation based on compat/qsort.c for platforms that lack a
> native standards-compliant qsort_s() (i.e. basically everyone).  It
> doesn't perform the full range of possible checks: It uses size_t
> instead of rsize_t and doesn't check nmemb and size against RSIZE_MAX
> because we probably don't have the restricted size type defined.  For
> the same reason it returns int instead of errno_t.

Hmm. So it sounds like qsort_r(), but with the NULL-is-empty magic. But
we already are OK without the latter (and can emulate it easily). Would
it make sense to do:

  #if defined(HAVE_QSORT_S)
  /* huzzah, use the system-native qsort_s */

  #elif defined(HAVE_QSORT_R)
  int git_qsort_s(void *b, size_t n, size_t s,
		   int (*cmp)(const void *, const void *, void *), void *ctx)
  {
	if (!n)
		return 0;
	if (!b || !cmp)
		return -1;
	qsort_r(b, n, s, cmp, ctx);
	return 0;
  }

  #else
  /* fallback implementation as your patch does */
  #endif

To make matters more fun, apparently[1] there are multiple variants of
qsort_r with different argument orders. _And_ apparently Microsoft
defines qsort_s, but it's not quite the same thing. But all of that can
be dealt with by having more specific flags (HAVE_GNU_QSORT_R, etc).

It just seems like we should be able to do a better job of using the
system qsort in many cases.

-Peff

[1] https://stackoverflow.com/questions/39560773/different-declarations-of-qsort-r-on-mac-and-linux/39561369

^ permalink raw reply

* [PATCH v6 0/4] transport protocol policy configuration
From: Brandon Williams @ 2016-12-01 19:44 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, peff, sbeller, bburky, jrnieder
In-Reply-To: <1478555462-132573-1-git-send-email-bmwill@google.com>

v6 introduces 2 additional patches which address problems with protocols that
libcurl is allowed to use for redirection.

Brandon Williams (4):
  lib-proto-disable: variable name fix
  transport: add protocol policy config option
  http: always warn if libcurl version is too old
  transport: check if protocol can be used on a redirect

 Documentation/config.txt         |  46 +++++++++++++
 Documentation/git.txt            |  38 ++++-------
 git-submodule.sh                 |  12 ++--
 http.c                           |  13 ++--
 t/lib-proto-disable.sh           | 142 ++++++++++++++++++++++++++++++++++++---
 t/t5509-fetch-push-namespaces.sh |   1 +
 t/t5802-connect-helper.sh        |   1 +
 transport.c                      |  82 +++++++++++++++++++---
 transport.h                      |  13 ++--
 9 files changed, 281 insertions(+), 67 deletions(-)

-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply

* [PATCH v6 4/4] transport: check if protocol can be used on a redirect
From: Brandon Williams @ 2016-12-01 19:44 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, peff, sbeller, bburky, jrnieder
In-Reply-To: <1480621447-52399-1-git-send-email-bmwill@google.com>

Add a the 'redirect' parameter to 'is_transport_allowed' which allows
callers to query if a transport protocol can be used on a redirect.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c      | 8 ++++----
 transport.c | 6 +++---
 transport.h | 7 ++++---
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/http.c b/http.c
index fee128b..d31ded8 100644
--- a/http.c
+++ b/http.c
@@ -725,13 +725,13 @@ static CURL *get_curl_handle(void)
 	curl_easy_setopt(result, CURLOPT_POST301, 1);
 #endif
 #if LIBCURL_VERSION_NUM >= 0x071304
-	if (is_transport_allowed("http"))
+	if (is_transport_allowed("http", 1))
 		allowed_protocols |= CURLPROTO_HTTP;
-	if (is_transport_allowed("https"))
+	if (is_transport_allowed("https", 1))
 		allowed_protocols |= CURLPROTO_HTTPS;
-	if (is_transport_allowed("ftp"))
+	if (is_transport_allowed("ftp", 1))
 		allowed_protocols |= CURLPROTO_FTP;
-	if (is_transport_allowed("ftps"))
+	if (is_transport_allowed("ftps", 1))
 		allowed_protocols |= CURLPROTO_FTPS;
 	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, allowed_protocols);
 #else
diff --git a/transport.c b/transport.c
index 186de9a..9fee241 100644
--- a/transport.c
+++ b/transport.c
@@ -723,7 +723,7 @@ static enum protocol_allow_config get_protocol_config(const char *type)
 	return PROTOCOL_ALLOW_USER_ONLY;
 }
 
-int is_transport_allowed(const char *type)
+int is_transport_allowed(const char *type, int redirect)
 {
 	const struct string_list *whitelist = protocol_whitelist();
 	if (whitelist)
@@ -735,7 +735,7 @@ int is_transport_allowed(const char *type)
 	case PROTOCOL_ALLOW_NEVER:
 		return 0;
 	case PROTOCOL_ALLOW_USER_ONLY:
-		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+		return git_env_bool("GIT_PROTOCOL_FROM_USER", !redirect);
 	}
 
 	die("BUG: invalid protocol_allow_config type");
@@ -743,7 +743,7 @@ int is_transport_allowed(const char *type)
 
 void transport_check_allowed(const char *type)
 {
-	if (!is_transport_allowed(type))
+	if (!is_transport_allowed(type, 0))
 		die("transport '%s' not allowed", type);
 }
 
diff --git a/transport.h b/transport.h
index f4998bc..72971ad 100644
--- a/transport.h
+++ b/transport.h
@@ -153,10 +153,11 @@ extern int transport_summary_width(const struct ref *refs);
 struct transport *transport_get(struct remote *, const char *);
 
 /*
- * Check whether a transport is allowed by the environment. Type should
- * generally be the URL scheme, as described in Documentation/git.txt
+ * Check whether a transport is allowed by the environment. Setting 'redirect'
+ * can be used to querry if the transport can be used in a redirect.  Type
+ * should generally be the URL scheme, as described in Documentation/git.txt
  */
-int is_transport_allowed(const char *type);
+int is_transport_allowed(const char *type, int redirect);
 
 /*
  * Check whether a transport is allowed by the environment,
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v6 2/4] transport: add protocol policy config option
From: Brandon Williams @ 2016-12-01 19:44 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, peff, sbeller, bburky, jrnieder
In-Reply-To: <1480621447-52399-1-git-send-email-bmwill@google.com>

Previously the `GIT_ALLOW_PROTOCOL` environment variable was used to
specify a whitelist of protocols to be used in clone/fetch/push
commands.  This patch introduces new configuration options for more
fine-grained control for allowing/disallowing protocols.  This also has
the added benefit of allowing easier construction of a protocol
whitelist on systems where setting an environment variable is
non-trivial.

Now users can specify a policy to be used for each type of protocol via
the 'protocol.<name>.allow' config option.  A default policy for all
unconfigured protocols can be set with the 'protocol.allow' config
option.  If no user configured default is made git will allow known-safe
protocols (http, https, git, ssh, file), disallow known-dangerous
protocols (ext), and have a default policy of `user` for all other
protocols.

The supported policies are `always`, `never`, and `user`.  The `user`
policy can be used to configure a protocol to be usable when explicitly
used by a user, while disallowing it for commands which run
clone/fetch/push commands without direct user intervention (e.g.
recursive initialization of submodules).  Commands which can potentially
clone/fetch/push from untrusted repositories without user intervention
can export `GIT_PROTOCOL_FROM_USER` with a value of '0' to prevent
protocols configured to the `user` policy from being used.

Fix remote-ext tests to use the new config to allow the ext
protocol to be tested.

Based on a patch by Jeff King <peff@peff.net>

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt         |  46 ++++++++++++++
 Documentation/git.txt            |  38 +++++-------
 git-submodule.sh                 |  12 ++--
 t/lib-proto-disable.sh           | 130 +++++++++++++++++++++++++++++++++++++--
 t/t5509-fetch-push-namespaces.sh |   1 +
 t/t5802-connect-helper.sh        |   1 +
 transport.c                      |  75 +++++++++++++++++++++-
 7 files changed, 264 insertions(+), 39 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 27069ac..5fe50bc 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2308,6 +2308,52 @@ pretty.<name>::
 	Note that an alias with the same name as a built-in format
 	will be silently ignored.
 
+protocol.allow::
+	If set, provide a user defined default policy for all protocols which
+	don't explicitly have a policy (`protocol.<name>.allow`).  By default,
+	if unset, known-safe protocols (http, https, git, ssh, file) have a
+	default policy of `always`, known-dangerous protocols (ext) have a
+	default policy of `never`, and all other protocols have a default
+	policy of `user`.  Supported policies:
++
+--
+
+* `always` - protocol is always able to be used.
+
+* `never` - protocol is never able to be used.
+
+* `user` - protocol is only able to be used when `GIT_PROTOCOL_FROM_USER` is
+  either unset or has a value of 1.  This policy should be used when you want a
+  protocol to be directly usable by the user but don't want it used by commands which
+  execute clone/fetch/push commands without user input, e.g. recursive
+  submodule initialization.
+
+--
+
+protocol.<name>.allow::
+	Set a policy to be used by protocol `<name>` with clone/fetch/push
+	commands. See `protocol.allow` above for the available policies.
++
+The protocol names currently used by git are:
++
+--
+  - `file`: any local file-based path (including `file://` URLs,
+    or local paths)
+
+  - `git`: the anonymous git protocol over a direct TCP
+    connection (or proxy, if configured)
+
+  - `ssh`: git over ssh (including `host:path` syntax,
+    `ssh://`, etc).
+
+  - `http`: git over http, both "smart http" and "dumb http".
+    Note that this does _not_ include `https`; if you want to configure
+    both, you must do so individually.
+
+  - any external helpers are named by their protocol (e.g., use
+    `hg` to allow the `git-remote-hg` helper)
+--
+
 pull.ff::
 	By default, Git does not create an extra merge commit when merging
 	a commit that is a descendant of the current commit. Instead, the
diff --git a/Documentation/git.txt b/Documentation/git.txt
index ab7215e..c52cec8 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1150,30 +1150,20 @@ of clones and fetches.
 	cloning a repository to make a backup).
 
 `GIT_ALLOW_PROTOCOL`::
-	If set, provide a colon-separated list of protocols which are
-	allowed to be used with fetch/push/clone. This is useful to
-	restrict recursive submodule initialization from an untrusted
-	repository. Any protocol not mentioned will be disallowed (i.e.,
-	this is a whitelist, not a blacklist). If the variable is not
-	set at all, all protocols are enabled.  The protocol names
-	currently used by git are:
-
-	  - `file`: any local file-based path (including `file://` URLs,
-	    or local paths)
-
-	  - `git`: the anonymous git protocol over a direct TCP
-	    connection (or proxy, if configured)
-
-	  - `ssh`: git over ssh (including `host:path` syntax,
-	    `ssh://`, etc).
-
-	  - `http`: git over http, both "smart http" and "dumb http".
-	    Note that this does _not_ include `https`; if you want both,
-	    you should specify both as `http:https`.
-
-	  - any external helpers are named by their protocol (e.g., use
-	    `hg` to allow the `git-remote-hg` helper)
-
+	If set to a colon-separated list of protocols, behave as if
+	`protocol.allow` is set to `never`, and each of the listed
+	protocols has `protocol.<name>.allow` set to `always`
+	(overriding any existing configuration). In other words, any
+	protocol not mentioned will be disallowed (i.e., this is a
+	whitelist, not a blacklist). See the description of
+	`protocol.allow` in linkgit:git-config[1] for more details.
+
+`GIT_PROTOCOL_FROM_USER`::
+	Set to 0 to prevent protocols used by fetch/push/clone which are
+	configured to the `user` state.  This is useful to restrict recursive
+	submodule initialization from an untrusted repository or for programs
+	which feed potentially-untrusted URLS to git commands.  See
+	linkgit:git-config[1] for more details.
 
 Discussion[[Discussion]]
 ------------------------
diff --git a/git-submodule.sh b/git-submodule.sh
index a024a13..0a477b4 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -21,14 +21,10 @@ require_work_tree
 wt_prefix=$(git rev-parse --show-prefix)
 cd_to_toplevel
 
-# Restrict ourselves to a vanilla subset of protocols; the URLs
-# we get are under control of a remote repository, and we do not
-# want them kicking off arbitrary git-remote-* programs.
-#
-# If the user has already specified a set of allowed protocols,
-# we assume they know what they're doing and use that instead.
-: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
-export GIT_ALLOW_PROTOCOL
+# Tell the rest of git that any URLs we get don't come
+# directly from the user, so it can apply policy as appropriate.
+GIT_PROTOCOL_FROM_USER=0
+export GIT_PROTOCOL_FROM_USER
 
 command=
 branch=
diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index be88e9a..02f49cb 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -1,10 +1,7 @@
 # Test routines for checking protocol disabling.
 
-# test cloning a particular protocol
-#   $1 - description of the protocol
-#   $2 - machine-readable name of the protocol
-#   $3 - the URL to try cloning
-test_proto () {
+# Test clone/fetch/push with GIT_ALLOW_PROTOCOL whitelist
+test_whitelist () {
 	desc=$1
 	proto=$2
 	url=$3
@@ -62,6 +59,129 @@ test_proto () {
 			test_must_fail git clone --bare "$url" tmp.git
 		)
 	'
+
+	test_expect_success "clone $desc (env var has precedence)" '
+		rm -rf tmp.git &&
+		(
+			GIT_ALLOW_PROTOCOL=none &&
+			export GIT_ALLOW_PROTOCOL &&
+			test_must_fail git -c protocol.allow=always clone --bare "$url" tmp.git &&
+			test_must_fail git -c protocol.$proto.allow=always clone --bare "$url" tmp.git
+		)
+	'
+}
+
+test_config () {
+	desc=$1
+	proto=$2
+	url=$3
+
+	# Test clone/fetch/push with protocol.<type>.allow config
+	test_expect_success "clone $desc (enabled with config)" '
+		rm -rf tmp.git &&
+		git -c protocol.$proto.allow=always clone --bare "$url" tmp.git
+	'
+
+	test_expect_success "fetch $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=always fetch
+	'
+
+	test_expect_success "push $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=always  push origin HEAD:pushed
+	'
+
+	test_expect_success "push $desc (disabled)" '
+		test_must_fail git -C tmp.git -c protocol.$proto.allow=never push origin HEAD:pushed
+	'
+
+	test_expect_success "fetch $desc (disabled)" '
+		test_must_fail git -C tmp.git -c protocol.$proto.allow=never fetch
+	'
+
+	test_expect_success "clone $desc (disabled)" '
+		rm -rf tmp.git &&
+		test_must_fail git -c protocol.$proto.allow=never clone --bare "$url" tmp.git
+	'
+
+	# Test clone/fetch/push with protocol.user.allow and its env var
+	test_expect_success "clone $desc (enabled)" '
+		rm -rf tmp.git &&
+		git -c protocol.$proto.allow=user clone --bare "$url" tmp.git
+	'
+
+	test_expect_success "fetch $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=user fetch
+	'
+
+	test_expect_success "push $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=user push origin HEAD:pushed
+	'
+
+	test_expect_success "push $desc (disabled)" '
+		(
+			cd tmp.git &&
+			GIT_PROTOCOL_FROM_USER=0 &&
+			export GIT_PROTOCOL_FROM_USER &&
+			test_must_fail git -c protocol.$proto.allow=user push origin HEAD:pushed
+		)
+	'
+
+	test_expect_success "fetch $desc (disabled)" '
+		(
+			cd tmp.git &&
+			GIT_PROTOCOL_FROM_USER=0 &&
+			export GIT_PROTOCOL_FROM_USER &&
+			test_must_fail git -c protocol.$proto.allow=user fetch
+		)
+	'
+
+	test_expect_success "clone $desc (disabled)" '
+		rm -rf tmp.git &&
+		(
+			GIT_PROTOCOL_FROM_USER=0 &&
+			export GIT_PROTOCOL_FROM_USER &&
+			test_must_fail git -c protocol.$proto.allow=user clone --bare "$url" tmp.git
+		)
+	'
+
+	# Test clone/fetch/push with protocol.allow user defined default
+	test_expect_success "clone $desc (enabled)" '
+		rm -rf tmp.git &&
+		git config --global protocol.allow always &&
+		git clone --bare "$url" tmp.git
+	'
+
+	test_expect_success "fetch $desc (enabled)" '
+		git -C tmp.git fetch
+	'
+
+	test_expect_success "push $desc (enabled)" '
+		git -C tmp.git push origin HEAD:pushed
+	'
+
+	test_expect_success "push $desc (disabled)" '
+		git config --global protocol.allow never &&
+		test_must_fail git -C tmp.git push origin HEAD:pushed
+	'
+
+	test_expect_success "fetch $desc (disabled)" '
+		test_must_fail git -C tmp.git fetch
+	'
+
+	test_expect_success "clone $desc (disabled)" '
+		rm -rf tmp.git &&
+		test_must_fail git clone --bare "$url" tmp.git
+	'
+}
+
+# test cloning a particular protocol
+#   $1 - description of the protocol
+#   $2 - machine-readable name of the protocol
+#   $3 - the URL to try cloning
+test_proto () {
+	test_whitelist "$@"
+
+	test_config "$@"
 }
 
 # set up an ssh wrapper that will access $host/$repo in the
diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
index bc44ac3..75c570a 100755
--- a/t/t5509-fetch-push-namespaces.sh
+++ b/t/t5509-fetch-push-namespaces.sh
@@ -4,6 +4,7 @@ test_description='fetch/push involving ref namespaces'
 . ./test-lib.sh
 
 test_expect_success setup '
+	git config --global protocol.ext.allow user &&
 	test_tick &&
 	git init original &&
 	(
diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
index b7a7f9d..c6c2661 100755
--- a/t/t5802-connect-helper.sh
+++ b/t/t5802-connect-helper.sh
@@ -4,6 +4,7 @@ test_description='ext::cmd remote "connect" helper'
 . ./test-lib.sh
 
 test_expect_success setup '
+	git config --global protocol.ext.allow user &&
 	test_tick &&
 	git commit --allow-empty -m initial &&
 	test_tick &&
diff --git a/transport.c b/transport.c
index d57e8de..2c0ec76 100644
--- a/transport.c
+++ b/transport.c
@@ -664,10 +664,81 @@ static const struct string_list *protocol_whitelist(void)
 	return enabled ? &allowed : NULL;
 }
 
+enum protocol_allow_config {
+	PROTOCOL_ALLOW_NEVER = 0,
+	PROTOCOL_ALLOW_USER_ONLY,
+	PROTOCOL_ALLOW_ALWAYS
+};
+
+static enum protocol_allow_config parse_protocol_config(const char *key,
+							const char *value)
+{
+	if (!strcasecmp(value, "always"))
+		return PROTOCOL_ALLOW_ALWAYS;
+	else if (!strcasecmp(value, "never"))
+		return PROTOCOL_ALLOW_NEVER;
+	else if (!strcasecmp(value, "user"))
+		return PROTOCOL_ALLOW_USER_ONLY;
+
+	die("unknown value for config '%s': %s", key, value);
+}
+
+static enum protocol_allow_config get_protocol_config(const char *type)
+{
+	char *key = xstrfmt("protocol.%s.allow", type);
+	char *value;
+
+	/* first check the per-protocol config */
+	if (!git_config_get_string(key, &value)) {
+		enum protocol_allow_config ret =
+			parse_protocol_config(key, value);
+		free(key);
+		free(value);
+		return ret;
+	}
+	free(key);
+
+	/* if defined, fallback to user-defined default for unknown protocols */
+	if (!git_config_get_string("protocol.allow", &value)) {
+		enum protocol_allow_config ret =
+			parse_protocol_config("protocol.allow", value);
+		free(value);
+		return ret;
+	}
+
+	/* fallback to built-in defaults */
+	/* known safe */
+	if (!strcmp(type, "http") ||
+	    !strcmp(type, "https") ||
+	    !strcmp(type, "git") ||
+	    !strcmp(type, "ssh") ||
+	    !strcmp(type, "file"))
+		return PROTOCOL_ALLOW_ALWAYS;
+
+	/* known scary; err on the side of caution */
+	if (!strcmp(type, "ext"))
+		return PROTOCOL_ALLOW_NEVER;
+
+	/* unknown; by default let them be used only directly by the user */
+	return PROTOCOL_ALLOW_USER_ONLY;
+}
+
 int is_transport_allowed(const char *type)
 {
-	const struct string_list *allowed = protocol_whitelist();
-	return !allowed || string_list_has_string(allowed, type);
+	const struct string_list *whitelist = protocol_whitelist();
+	if (whitelist)
+		return string_list_has_string(whitelist, type);
+
+	switch (get_protocol_config(type)) {
+	case PROTOCOL_ALLOW_ALWAYS:
+		return 1;
+	case PROTOCOL_ALLOW_NEVER:
+		return 0;
+	case PROTOCOL_ALLOW_USER_ONLY:
+		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+	}
+
+	die("BUG: invalid protocol_allow_config type");
 }
 
 void transport_check_allowed(const char *type)
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v6 1/4] lib-proto-disable: variable name fix
From: Brandon Williams @ 2016-12-01 19:44 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, peff, sbeller, bburky, jrnieder
In-Reply-To: <1480621447-52399-1-git-send-email-bmwill@google.com>

The test_proto function assigns the positional parameters to named
variables, but then still refers to "$desc" as "$1". Using $desc is
more readable and less error-prone.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/lib-proto-disable.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index b0917d9..be88e9a 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -9,7 +9,7 @@ test_proto () {
 	proto=$2
 	url=$3
 
-	test_expect_success "clone $1 (enabled)" '
+	test_expect_success "clone $desc (enabled)" '
 		rm -rf tmp.git &&
 		(
 			GIT_ALLOW_PROTOCOL=$proto &&
@@ -18,7 +18,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "fetch $1 (enabled)" '
+	test_expect_success "fetch $desc (enabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=$proto &&
@@ -27,7 +27,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "push $1 (enabled)" '
+	test_expect_success "push $desc (enabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=$proto &&
@@ -36,7 +36,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "push $1 (disabled)" '
+	test_expect_success "push $desc (disabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=none &&
@@ -45,7 +45,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "fetch $1 (disabled)" '
+	test_expect_success "fetch $desc (disabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=none &&
@@ -54,7 +54,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "clone $1 (disabled)" '
+	test_expect_success "clone $desc (disabled)" '
 		rm -rf tmp.git &&
 		(
 			GIT_ALLOW_PROTOCOL=none &&
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v6 3/4] http: always warn if libcurl version is too old
From: Brandon Williams @ 2016-12-01 19:44 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, peff, sbeller, bburky, jrnieder
In-Reply-To: <1480621447-52399-1-git-send-email-bmwill@google.com>

Now that there are default "known-good" and "known-bad" protocols which
are allowed/disallowed by 'is_transport_allowed' we should always warn
the user that older versions of libcurl can't respect the allowed
protocols for redirects.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c      | 5 ++---
 transport.c | 5 -----
 transport.h | 6 ------
 3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/http.c b/http.c
index 4c4a812..fee128b 100644
--- a/http.c
+++ b/http.c
@@ -735,9 +735,8 @@ static CURL *get_curl_handle(void)
 		allowed_protocols |= CURLPROTO_FTPS;
 	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, allowed_protocols);
 #else
-	if (transport_restrict_protocols())
-		warning("protocol restrictions not applied to curl redirects because\n"
-			"your curl version is too old (>= 7.19.4)");
+	warning("protocol restrictions not applied to curl redirects because\n"
+		"your curl version is too old (>= 7.19.4)");
 #endif
 	if (getenv("GIT_CURL_VERBOSE"))
 		curl_easy_setopt(result, CURLOPT_VERBOSE, 1L);
diff --git a/transport.c b/transport.c
index 2c0ec76..186de9a 100644
--- a/transport.c
+++ b/transport.c
@@ -747,11 +747,6 @@ void transport_check_allowed(const char *type)
 		die("transport '%s' not allowed", type);
 }
 
-int transport_restrict_protocols(void)
-{
-	return !!protocol_whitelist();
-}
-
 struct transport *transport_get(struct remote *remote, const char *url)
 {
 	const char *helper;
diff --git a/transport.h b/transport.h
index b8e4ee8..f4998bc 100644
--- a/transport.h
+++ b/transport.h
@@ -164,12 +164,6 @@ int is_transport_allowed(const char *type);
  */
 void transport_check_allowed(const char *type);
 
-/*
- * Returns true if the user has attempted to turn on protocol
- * restrictions at all.
- */
-int transport_restrict_protocols(void);
-
 /* Transport options which apply to git:// and scp-style URLs */
 
 /* The program to use on the remote side to send a pack */
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* Re: bw/transport-protocol-policy
From: Jeff King @ 2016-12-01 19:46 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Junio C Hamano, git
In-Reply-To: <20161201193524.GC54082@google.com>

On Thu, Dec 01, 2016 at 11:35:24AM -0800, Brandon Williams wrote:

> > I wouldn't expect anyone to ever set GIT_PROTOCOL_FROM_USER=1, but it
> > does behave in a funny way here, overriding the "redirect" flag. I think
> > we'd want something more like:
> > 
> >   if (redirect < 0)
> > 	redirect = git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> > 
> > and then pass in "-1" from transport_check_allowed().
> 
> I don't think I quite follow your solution but I came up with this:
> 
>   case PROTOCOL_ALLOW_USER_ONLY:
>     return redirect ? 0 : git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> 
> Which should address the same issue.

I think mine was confused a bit by using the word "redirect". It was
really meant to be "from_user", and could take three values: definitely
yes, definitely no, and unknown (-1). And in the unknown case, we pull
the value from the environment.

Yours combines "definitely no" and "unknown" into a single value ("1" in
your case, but that is because "redirect" and "from_user" have inverted
logic from each other).

I think that is OK, as there isn't any case where a caller would want to
say "definitely no". The most they would say is "_I_ am not doing
anything to make you think this value is not from the user", but we
would still want to check the environment to see that nobody _else_ had
put in such a restriction.

-Peff

^ permalink raw reply

* [PATCH v6 4/4] transport: check if protocol can be used on a redirect
From: Brandon Williams @ 2016-12-01 19:48 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, peff, sbeller, bburky, jrnieder
In-Reply-To: <1480621447-52399-5-git-send-email-bmwill@google.com>

Add a the 'redirect' parameter to 'is_transport_allowed' which allows
callers to query if a transport protocol can be used on a redirect.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c      | 8 ++++----
 transport.c | 6 +++---
 transport.h | 7 ++++---
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/http.c b/http.c
index fee128b..d31ded8 100644
--- a/http.c
+++ b/http.c
@@ -725,13 +725,13 @@ static CURL *get_curl_handle(void)
 	curl_easy_setopt(result, CURLOPT_POST301, 1);
 #endif
 #if LIBCURL_VERSION_NUM >= 0x071304
-	if (is_transport_allowed("http"))
+	if (is_transport_allowed("http", 1))
 		allowed_protocols |= CURLPROTO_HTTP;
-	if (is_transport_allowed("https"))
+	if (is_transport_allowed("https", 1))
 		allowed_protocols |= CURLPROTO_HTTPS;
-	if (is_transport_allowed("ftp"))
+	if (is_transport_allowed("ftp", 1))
 		allowed_protocols |= CURLPROTO_FTP;
-	if (is_transport_allowed("ftps"))
+	if (is_transport_allowed("ftps", 1))
 		allowed_protocols |= CURLPROTO_FTPS;
 	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, allowed_protocols);
 #else
diff --git a/transport.c b/transport.c
index 186de9a..7c4a757 100644
--- a/transport.c
+++ b/transport.c
@@ -723,7 +723,7 @@ static enum protocol_allow_config get_protocol_config(const char *type)
 	return PROTOCOL_ALLOW_USER_ONLY;
 }
 
-int is_transport_allowed(const char *type)
+int is_transport_allowed(const char *type, int redirect)
 {
 	const struct string_list *whitelist = protocol_whitelist();
 	if (whitelist)
@@ -735,7 +735,7 @@ int is_transport_allowed(const char *type)
 	case PROTOCOL_ALLOW_NEVER:
 		return 0;
 	case PROTOCOL_ALLOW_USER_ONLY:
-		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+		return redirect ? 0 : git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
 	}
 
 	die("BUG: invalid protocol_allow_config type");
@@ -743,7 +743,7 @@ int is_transport_allowed(const char *type)
 
 void transport_check_allowed(const char *type)
 {
-	if (!is_transport_allowed(type))
+	if (!is_transport_allowed(type, 0))
 		die("transport '%s' not allowed", type);
 }
 
diff --git a/transport.h b/transport.h
index f4998bc..4bcf5d3 100644
--- a/transport.h
+++ b/transport.h
@@ -153,10 +153,11 @@ extern int transport_summary_width(const struct ref *refs);
 struct transport *transport_get(struct remote *, const char *);
 
 /*
- * Check whether a transport is allowed by the environment. Type should
- * generally be the URL scheme, as described in Documentation/git.txt
+ * Check whether a transport is allowed by the environment. Setting 'redirect'
+ * can be used to query if the transport can be used in a redirect.  Type
+ * should generally be the URL scheme, as described in Documentation/git.txt
  */
-int is_transport_allowed(const char *type);
+int is_transport_allowed(const char *type, int redirect);
 
 /*
  * Check whether a transport is allowed by the environment,
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* Re: [PATCH v6 4/4] transport: check if protocol can be used on a redirect
From: Brandon Williams @ 2016-12-01 19:49 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, bburky, jrnieder
In-Reply-To: <1480621700-53222-1-git-send-email-bmwill@google.com>

On 12/01, Brandon Williams wrote:
> Add a the 'redirect' parameter to 'is_transport_allowed' which allows
> callers to query if a transport protocol can be used on a redirect.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  http.c      | 8 ++++----
>  transport.c | 6 +++---
>  transport.h | 7 ++++---
>  3 files changed, 11 insertions(+), 10 deletions(-)
> 

Accidentally sent out an old version of just this patch.  Here is the
updated one.

-- 
Brandon Williams

^ permalink raw reply

* Re: [PATCH v6 4/4] transport: check if protocol can be used on a redirect
From: Jeff King @ 2016-12-01 19:50 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, bburky, jrnieder
In-Reply-To: <1480621447-52399-5-git-send-email-bmwill@google.com>

On Thu, Dec 01, 2016 at 11:44:07AM -0800, Brandon Williams wrote:

> Add a the 'redirect' parameter to 'is_transport_allowed' which allows
> callers to query if a transport protocol can be used on a redirect.

s/a the/a/

> -int is_transport_allowed(const char *type)
> +int is_transport_allowed(const char *type, int redirect)
>  {
>  	const struct string_list *whitelist = protocol_whitelist();
>  	if (whitelist)
> @@ -735,7 +735,7 @@ int is_transport_allowed(const char *type)
>  	case PROTOCOL_ALLOW_NEVER:
>  		return 0;
>  	case PROTOCOL_ALLOW_USER_ONLY:
> -		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> +		return git_env_bool("GIT_PROTOCOL_FROM_USER", !redirect);
>  	}

This has the older logic still.

I'm not sure if we should call this "redirect" here. That's how it's
used by the curl code, but I think from the perspective of the transport
whitelist, it is really "are you overriding the from_user environment".

Calling it "from_user" may be confusing though, as the default value
would become "1", even though it means only "as far as I know this is
from the user, but maybe the environment says otherwise". So bizarrely,
I think calling it "not_from_user" is the clearest value.

-Peff

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox