Git development
 help / color / mirror / Atom feed
* Cherry-pick applied X times
From: Delanoe, Yann @ 2016-12-15  8:12 UTC (permalink / raw)
  To: git@vger.kernel.org

Hi Git community, 
 
I'm new to GIT and responsible of a project to migrate our SVN repo to GIT.
I've made the migration with the git-svn tools ... it was long, but everything seems fine ; source code is correct and all its history is there.
 
It happen with our delivery workflow that we will have to use cherry-picks to prepare our patches, so I made some test on it.
 
During those tests I saw a strange behaviour: I tried to cherry-pick onto a release branch a commit from the master that had been previously already merged onto this branch with SVN. GIT did not detect it and added the code a second time in the source file modified. I supposed this was du the fact the first merge had been made with SVN. I tried to cherry pick the same commit again ... and GIT add one more time the code of the commit. It appears I could cherry-pick this commit X times with GIT, and each time he added the code again.
 
I looked in SVN, the merge property of the first commit from master to the release branch is ok.
Our SVN repo has more than 22K revisions
 
I found out that some other commits had the same behaviour.
 
Is there a direction onto which I should investigate to determine where the problem comes from ?
 

Here is an example of the multiple cherry pick.

Branches:
 
> git branch
  master
* release/15.0.0
  release/15.3.0
 
Check commit already exist on release (cb8c480) and get master hash (bee110c):
 
> git log --oneline |grep GTX-20264
cb8c480 GTX-20264 : Missing end of string in field hostReference for custom network ack
> git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
> git log --oneline |grep GTX-20264
bee110c GTX-20264 : Missing end of string in field hostReference for custom network ack
> git checkout release/15.0.0
Switched to branch 'release/15.0.0'
Your branch is up-to-date with 'origin/release/15.0.0'.
 
 
And now I cherry-pick the master commit onto the release/15.0.0 a multiple of times:

> git cherry-pick bee110c
[release/15.0.0 e8bfc33] GTX-20264 : Missing end of string in field hostReference for custom network ack
Author: Yann Delanoe <ydelanoe@bottomline.com>
1 file changed, 3 insertions(+)
> git cherry-pick bee110c
[release/15.0.0 2b98c8d] GTX-20264 : Missing end of string in field hostReference for custom network ack
Author: Yann Delanoe <ydelanoe@bottomline.com>
1 file changed, 3 insertions(+)
> git cherry-pick bee110c
[release/15.0.0 820cd8a] GTX-20264 : Missing end of string in field hostReference for custom network ack
Author: Yann Delanoe <ydelanoe@bottomline.com>
1 file changed, 3 insertions(+)
 

 
Best regards
Yann DELANOE - Product engineer - Bottomline technologies

^ permalink raw reply

* Re: [PATCH v2] fix pushing to //server/share/dir on Windows
From: Torsten Bögershausen @ 2016-12-15  7:30 UTC (permalink / raw)
  To: Johannes Sixt, Jeff King; +Cc: Johannes Schindelin, Git Mailing List
In-Reply-To: <787a421b-8b7a-14c5-768f-06c3dc183cf4@kdbg.org>



On 14/12/16 20:37, Johannes Sixt wrote:
> normalize_path_copy() is not prepared to keep the double-slash of a
> //server/share/dir kind of path, but treats it like a regular POSIX
> style path and transforms it to /server/share/dir.
>
> The bug manifests when 'git push //server/share/dir master' is run,
> because tmp_objdir_add_as_alternate() uses the path in normalized
> form when it registers the quarantine object database via
> link_alt_odb_entries(). Needless to say that the directory cannot be
> accessed using the wrongly normalized path.
>
> Fix it by skipping all of the root part, not just a potential drive
> prefix. offset_1st_component takes care of this, see the
> implementation in compat/mingw.c::mingw_offset_1st_component().
>
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
> Am 14.12.2016 um 18:30 schrieb Jeff King:
>> Would it be reasonable to
>> write:
>>
>>    /* Copy initial part of absolute path, converting separators on Windows */
>>    const char *end = src + offset_1st_component(src);
>>    while (src < end) {
>> 	  char c = *src++;
>> 	  if (c == '\\')
>> 		  c = '/';
>> 	  *dst++ = c;
>>    }
> Makes a lot of sense! I haven't had an opportunity, though, to test
> on Windows.
I'm not sure, if a conversion should be done here, in this part of code.
To my knowledge,

C:\dir1\file
is the same
as
C:/dir1/file
and that is handled by windows.

The \\server\share\dir1\file is native to windows,
and I can't see good reasons to change '\' into '/' somewhere in Git,
when UNC is used.

Cygwin does a translation from
//server/share/dir1/file
into
\\server\share\dir1\file

In other words:
The patch looks good as is, and once I get a Windows machine,
may be able to do some testing and come up with test cases


<https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx>
[]


^ permalink raw reply

* Re: [PATCHv3 1/3] merge: Add '--continue' option as a synonym for 'git commit'
From: Chris Packham @ 2016-12-15  7:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: GIT, Markus Hitter, Jeff King, Jacob Keller
In-Reply-To: <xmqqk2b2xu81.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 15, 2016 at 7:04 AM, Junio C Hamano <gitster@pobox.com> wrote:
> The last one 3/3 is a nice touch that makes sure that we do not
> forget what we discovered during the discussion.  Very much
> appreciated.
>
> Will queue.  Thanks.

Did you want me to send a v4 to mark the strings for translation or
will you apply a fixup your end?

^ permalink raw reply

* git bug - merging JS / Node.js code with "git merge --squash"
From: Alexander Mills @ 2016-12-15  7:12 UTC (permalink / raw)
  To: git

@git-community

I am on Git git version 2.7.4

This problem is happening on Ubuntu 16.04, but the same problem was
also happening when I was running on MacOS.

I am consistently seeing merge bugs, when merging between branches of
a mostly Node.js project.
I am seeing fairly bad merges that mangle the code in ways that seem
to clearly show some sort of Git bug. Some of these merges were for
files where there was likely not even a diff between the files.
TBH I am no Git expert, but maybe I will learn something from this
investigation.

The latest example of a mangled file is here:
https://github.com/ORESoftware/suman/blob/staging/suman.conf.js

You can see some repeat code, and also there is a dangling brace which
means that the file won't even "compile" with Node.js, using "$ node
-c". Notice that this file was *not* a file where I recently  had to
manually merge code or manually fix a conflict, so I am hoping this is
not obvious operator error.

Here is the script I am using to merge between branches:
https://github.com/ORESoftware/suman/blob/dev/publish-suman.sh

basically it is doing the merge with this line:

git merge --squash -Xtheirs dev -m "squashing" &&

This is obviously very concerning because I can get very strange bugs
that I wouldn't expect, because I just assume that merges go well if
they succeed and it's hard to check for failure after that; even in a
compile statically typed language it could still prove difficult.

I am doing a check to make sure all my files compile with "node -c"
after the merge, but even then Git could create mangled code that
would still pass a "node -c" check.

Please let me know if this is a known bug and if there is a good
strategy to avoid it.

thanks!

-- 
Alexander D. Mills
(650)269-9502
alexander.d.mills@gmail.com

www.linkedin.com/pub/alexander-mills/b/7a5/418/

^ permalink raw reply

* Re: [PATCH v10 0/6] transport protocol policy configuration
From: Jeff King @ 2016-12-15  0:22 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

On Wed, Dec 14, 2016 at 02:39:49PM -0800, Brandon Williams wrote:

> v10 of this series fixes the following:
> * A few updates to the commit messages in order to better convey the reasoning
>   behind the a few of the patches.
> * Additional test to verify that curl redirects respect configured protocol
>   policies.
> * Patch added by Jeff King to make http alternates respect configured
>   protocol policies.

Thanks, this one looks fine to me.

-Peff

^ permalink raw reply

* Re: [PATCH v10 2/6] http: always warn if libcurl version is too old
From: Jeff King @ 2016-12-15  0:21 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, gitster, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-3-git-send-email-bmwill@google.com>

On Wed, Dec 14, 2016 at 02:39:51PM -0800, Brandon Williams wrote:

> Always warn if libcurl version is too old because:
> 
> 1. Even without a protocol whitelist, newer versions of curl have all
>    non-http protocols disabled by default.

Technically, non-http and non-ftp. Maybe just "non-standard" would be
more accurate.

Not worth a re-roll, but if Junio hasn't applied yet, maybe worth fixing
up while applying.

-Peff

^ permalink raw reply

* Re: [PATCHv3 4/4] rm: add absorb a submodules git dir before deletion
From: Junio C Hamano @ 2016-12-14 23:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, David.Turner, bmwill, sandals
In-Reply-To: <20161214224101.6211-5-sbeller@google.com>

Stefan Beller <sbeller@google.com> writes:

>  			if (list.entry[i].is_submodule) {
>  				if (is_empty_dir(path)) {
> +					if (rmdir(path))
> +						die_errno("git rm: '%s'", path);
> +				} else if (file_exists(path))
> +					/* non empty directory: */

Lose colon?

> +					remove_directory_or_die(path);

... otherwise?  I.e.

				else
					???

If we are running "git rm -f <path>", then the path could be a
submodule in the index and on the filesystem, it could be (1)
already missing, as the user removed an empty submodule directory
she is not interested in, (2) a non-directory, e.g. a file or a
symbolic link, perhaps because she was trying to reorganize the
superproject working tree but decided against it, or (3) something
else?

(1) is perfectly OK; we end up with a result without the path, which
is what "git rm -f" wanted to do anyway.  I am not sure what should
happen in (2), and what other corner cases there are for (3), though.

And use of file_exists(path) in the above patch may trigger a
strange error message in case (2), as remove_directory_or_die()
would say "path is not a directory", to which the user will say "Yes
I know, I wanted you to remove it with 'git rm -f'".




^ permalink raw reply

* Re: [PATCHv3 3/4] submodule: add flags to ok_to_remove_submodule
From: Junio C Hamano @ 2016-12-14 23:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, David.Turner, bmwill, sandals
In-Reply-To: <20161214224101.6211-4-sbeller@google.com>

Stefan Beller <sbeller@google.com> writes:

> diff --git a/submodule.c b/submodule.c
> index 9f0b544ebe..2d13744b06 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1019,7 +1019,7 @@ int submodule_uses_gitfile(const char *path)
>  	return 1;
>  }
>  
> -int ok_to_remove_submodule(const char *path)
> +int ok_to_remove_submodule(const char *path, unsigned flags)
>  {
>  	ssize_t len;
>  	struct child_process cp = CHILD_PROCESS_INIT;
> @@ -1032,15 +1032,27 @@ int ok_to_remove_submodule(const char *path)
>  	if (!submodule_uses_gitfile(path))
>  		return 0;
>  
> -	argv_array_pushl(&cp.args, "status", "--porcelain", "-u",
> +	argv_array_pushl(&cp.args, "status", "--porcelain",
>  				   "--ignore-submodules=none", NULL);
> +
> +	if (flags & SUBMODULE_REMOVAL_IGNORE_UNTRACKED)
> +		argv_array_push(&cp.args, "-uno");
> +	else
> +		argv_array_push(&cp.args, "-uall");
> +
> +	if (!(flags & SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED))
> +		argv_array_push(&cp.args, "--ignored");
> +

These "internal values to assemble command line" operations we
cannot avoid.  But things like this ...

>  	if (start_command(&cp))
> -		die(_("could not run 'git status --porcelain -u --ignore-submodules=none' in submodule %s"), path);
> +		die(_("could not run 'git status --porcelain --ignore-submodules=none %s %s' in submodule '%s'"),
> +			(flags & SUBMODULE_REMOVAL_IGNORE_UNTRACKED) ? "-uno" : "-uall",
> +			(!(flags & SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED)) ? "--ignored" : "",
> +			path);

and this ...

>  	if (finish_command(&cp))
> -		die(_("'git status --porcelain -u --ignore-submodules=none' failed in submodule %s"), path);
> +		die(_("'git status --porcelain --ignore-submodules=none %s %s' failed in submodule '%s'"),
> +			(flags & SUBMODULE_REMOVAL_IGNORE_UNTRACKED) ? "-uno" : "-uall",
> +			(!(flags & SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED)) ? "--ignored" : "",
> +			path);

makes me wonder if we want a helper that builds the string out of an
already assembled cp.args[] array, so that we won't have to do the
same thing twice/thrice and more importantly we won't have to worry
about these three going out of sync.


^ permalink raw reply

* Re: [PATCH v10 0/6] transport protocol policy configuration
From: Junio C Hamano @ 2016-12-14 23:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

Brandon Williams <bmwill@google.com> writes:

> v10 of this series fixes the following:
> * A few updates to the commit messages in order to better convey the reasoning
>   behind the a few of the patches.
> * Additional test to verify that curl redirects respect configured protocol
>   policies.
> * Patch added by Jeff King to make http alternates respect configured
>   protocol policies.

Thanks.  Will replace the previous one.

^ permalink raw reply

* Re: [PATCHv3 3/4] submodule: add flags to ok_to_remove_submodule
From: Brandon Williams @ 2016-12-14 23:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git, David.Turner, sandals
In-Reply-To: <20161214224101.6211-4-sbeller@google.com>

On 12/14, Stefan Beller wrote:
> In different contexts the question whether deleting a submodule is ok
> to remove may be answered differently.

This sentence is oddly worded.  Maybe this:

  In different context the question "Is it ok to delete a submodule?"
  may be answered differently.

-- 
Brandon Williams

^ permalink raw reply

* Re: [PATCH v9 5/5] transport: add from_user parameter to is_transport_allowed
From: Jeff King @ 2016-12-14 22:52 UTC (permalink / raw)
  To: Blake Burkhart; +Cc: git, jrnieder, Brandon Williams, gitster, sbeller
In-Reply-To: <CAP3OtXhH++szRws20MaHt-ftLBMUJuYiTmfL50mOFP4FA4Mn6Q@mail.gmail.com>

On Wed, Dec 14, 2016 at 04:29:52PM -0600, Blake Burkhart wrote:

> You may want to set CURLOPT_DEFAULT_PROTOCOL if we don't already. Apparently
> the default value of NULL causes it to make a guess based on the host if no
> protocol is present. But you are discussing a situation where "http://" is
> present, so that doesn't apply.

Cute. I agree it doesn't matter here, where we're sure there's a
protocol specifier at the beginning. It might matter if you instruct git
to use a specific remote-helper, like:

  $ echo 127.0.0.1 ftp.example.com >>/etc/hosts
  $ git clone http::ftp.example.com

which will try to connect via ftp. Of course that's no different than:

  $ git clone http::ftp://example.com

which you can already do.  git-clone sees "http::" and hands it off to
git-remote-http, which then processes the rest of the arguments as it
sees fit (in this case, handing it off to curl).

Prior to these more recent patches I suspect you could do:

  $ git clone http::file://whatever

but now we set CURLOPT_PROTOCOL to restrict it to just http/ftp (so you
can be confusing by asking for http and ending up in curl to do ftp, but
in any such case you could also have just asked for ftp in the first
place).

> Also, I thought we left out ftp because it was deprecated, but I don't
> remember exactly.

I couldn't find anything interesting in the archives (neither the public
list nor git-security). Given how unlikely it is to be used, it does
seem like a good idea to keep it in the "maybe" category, if only
because it decreases the attack surface (and for those following along,
we're just talking about not-from-user uses here, so people sticking a
funny ftp URL in .gitmodules, or redirecting http to ftp, etc).

-Peff

^ permalink raw reply

* [PATCHv3 4/4] rm: add absorb a submodules git dir before deletion
From: Stefan Beller @ 2016-12-14 22:41 UTC (permalink / raw)
  To: gitster; +Cc: git, David.Turner, bmwill, sandals, Stefan Beller
In-Reply-To: <20161214224101.6211-1-sbeller@google.com>

When deleting a submodule, we need to keep the actual git directory around,
such that we do not lose local changes in there and at a later checkout
of the submodule we don't need to clone it again.

Now that the functionality is available to absorb the git directory of a
submodule, rewrite the checking in git-rm to not complain, but rather
relocate the git directories inside the superproject.

An alternative solution was discussed to have a function
`depopulate_submodule`, that couples the check for its git directory and
possible relocation before the the removal, such that it is less likely to
miss the check in the future, but the indirection with such a function
added seemed also complex. The reason for that was that this possible
move of the git directory was also implemented in `ok_to_remove_submodule`,
such that this function could truthfully answer whether it is ok to remove
the submodule.

The solution proposed here defers all these checks to the caller.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/rm.c  | 75 ++++++++++++++---------------------------------------------
 cache.h       |  2 ++
 entry.c       |  8 +++++++
 t/t3600-rm.sh | 39 ++++++++++++-------------------
 4 files changed, 42 insertions(+), 82 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index fdd7183f61..025ef4c735 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -59,27 +59,9 @@ static void print_error_files(struct string_list *files_list,
 	}
 }
 
-static void error_removing_concrete_submodules(struct string_list *files, int *errs)
-{
-	print_error_files(files,
-			  Q_("the following submodule (or one of its nested "
-			     "submodules)\n"
-			     "uses a .git directory:",
-			     "the following submodules (or one of their nested "
-			     "submodules)\n"
-			     "use a .git directory:", files->nr),
-			  _("\n(use 'rm -rf' if you really want to remove "
-			    "it including all of its history)"),
-			  errs);
-	string_list_clear(files, 0);
-}
-
-static int check_submodules_use_gitfiles(void)
+static void submodules_absorb_gitdir_if_needed(const char *prefix)
 {
 	int i;
-	int errs = 0;
-	struct string_list files = STRING_LIST_INIT_NODUP;
-
 	for (i = 0; i < list.nr; i++) {
 		const char *name = list.entry[i].name;
 		int pos;
@@ -99,12 +81,9 @@ static int check_submodules_use_gitfiles(void)
 			continue;
 
 		if (!submodule_uses_gitfile(name))
-			string_list_append(&files, name);
+			absorb_git_dir_into_superproject(prefix, name,
+				ABSORB_GITDIR_RECURSE_SUBMODULES);
 	}
-
-	error_removing_concrete_submodules(&files, &errs);
-
-	return errs;
 }
 
 static int check_local_mod(struct object_id *head, int index_only)
@@ -120,7 +99,6 @@ static int check_local_mod(struct object_id *head, int index_only)
 	int errs = 0;
 	struct string_list files_staged = STRING_LIST_INIT_NODUP;
 	struct string_list files_cached = STRING_LIST_INIT_NODUP;
-	struct string_list files_submodule = STRING_LIST_INIT_NODUP;
 	struct string_list files_local = STRING_LIST_INIT_NODUP;
 
 	no_head = is_null_oid(head);
@@ -218,13 +196,8 @@ static int check_local_mod(struct object_id *head, int index_only)
 		else if (!index_only) {
 			if (staged_changes)
 				string_list_append(&files_cached, name);
-			if (local_changes) {
-				if (S_ISGITLINK(ce->ce_mode) &&
-				    !submodule_uses_gitfile(name))
-					string_list_append(&files_submodule, name);
-				else
-					string_list_append(&files_local, name);
-			}
+			if (local_changes)
+				string_list_append(&files_local, name);
 		}
 	}
 	print_error_files(&files_staged,
@@ -246,8 +219,6 @@ static int check_local_mod(struct object_id *head, int index_only)
 			  &errs);
 	string_list_clear(&files_cached, 0);
 
-	error_removing_concrete_submodules(&files_submodule, &errs);
-
 	print_error_files(&files_local,
 			  Q_("the following file has local modifications:",
 			     "the following files have local modifications:",
@@ -341,6 +312,8 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 			exit(0);
 	}
 
+	submodules_absorb_gitdir_if_needed(prefix);
+
 	/*
 	 * If not forced, the file, the index and the HEAD (if exists)
 	 * must match; but the file can already been removed, since
@@ -357,9 +330,6 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 			oidclr(&oid);
 		if (check_local_mod(&oid, index_only))
 			exit(1);
-	} else if (!index_only) {
-		if (check_submodules_use_gitfiles())
-			exit(1);
 	}
 
 	/*
@@ -393,27 +363,16 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 			const char *path = list.entry[i].name;
 			if (list.entry[i].is_submodule) {
 				if (is_empty_dir(path)) {
-					if (!rmdir(path)) {
-						removed = 1;
-						if (!remove_path_from_gitmodules(path))
-							gitmodules_modified = 1;
-						continue;
-					}
-				} else {
-					strbuf_reset(&buf);
-					strbuf_addstr(&buf, path);
-					if (!remove_dir_recursively(&buf, 0)) {
-						removed = 1;
-						if (!remove_path_from_gitmodules(path))
-							gitmodules_modified = 1;
-						strbuf_release(&buf);
-						continue;
-					} else if (!file_exists(path))
-						/* Submodule was removed by user */
-						if (!remove_path_from_gitmodules(path))
-							gitmodules_modified = 1;
-					/* Fallthrough and let remove_path() fail. */
-				}
+					if (rmdir(path))
+						die_errno("git rm: '%s'", path);
+				} else if (file_exists(path))
+					/* non empty directory: */
+					remove_directory_or_die(path);
+
+				removed = 1;
+				if (!remove_path_from_gitmodules(path))
+					gitmodules_modified = 1;
+				continue;
 			}
 			if (!remove_path(path)) {
 				removed = 1;
diff --git a/cache.h b/cache.h
index a50a61a197..3a423af59b 100644
--- a/cache.h
+++ b/cache.h
@@ -2018,4 +2018,6 @@ void sleep_millisec(int millisec);
  */
 void safe_create_dir(const char *dir, int share);
 
+extern void remove_directory_or_die(const char *path);
+
 #endif /* CACHE_H */
diff --git a/entry.c b/entry.c
index c6eea240b6..ddd4cfb2bf 100644
--- a/entry.c
+++ b/entry.c
@@ -73,6 +73,14 @@ static void remove_subtree(struct strbuf *path)
 		die_errno("cannot rmdir '%s'", path->buf);
 }
 
+void remove_directory_or_die(const char *path)
+{
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_addstr(&sb, path);
+	remove_subtree(&sb);
+	strbuf_release(&sb);
+}
+
 static int create_file(const char *path, unsigned int mode)
 {
 	mode = (mode & 0100) ? 0777 : 0666;
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index bcbb680651..5aa6db584c 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -569,26 +569,22 @@ test_expect_success 'rm of a conflicted unpopulated submodule succeeds' '
 	test_cmp expect actual
 '
 
-test_expect_success 'rm of a populated submodule with a .git directory fails even when forced' '
+test_expect_success 'rm of a populated submodule with a .git directory migrates git dir' '
 	git checkout -f master &&
 	git reset --hard &&
 	git submodule update &&
 	(cd submod &&
 		rm .git &&
 		cp -R ../.git/modules/sub .git &&
-		GIT_WORK_TREE=. git config --unset core.worktree
+		GIT_WORK_TREE=. git config --unset core.worktree &&
+		rm -r ../.git/modules/sub
 	) &&
-	test_must_fail git rm submod &&
-	test -d submod &&
-	test -d submod/.git &&
-	git status -s -uno --ignore-submodules=none >actual &&
-	! test -s actual &&
-	test_must_fail git rm -f submod &&
-	test -d submod &&
-	test -d submod/.git &&
+	git rm submod 2>output.err &&
+	! test -d submod &&
+	! test -d submod/.git &&
 	git status -s -uno --ignore-submodules=none >actual &&
-	! test -s actual &&
-	rm -rf submod
+	test -s actual &&
+	test_i18ngrep Migrating output.err
 '
 
 cat >expect.deepmodified <<EOF
@@ -667,24 +663,19 @@ test_expect_success 'rm of a populated nested submodule with a nested .git direc
 	git submodule update --recursive &&
 	(cd submod/subsubmod &&
 		rm .git &&
-		cp -R ../../.git/modules/sub/modules/sub .git &&
+		mv ../../.git/modules/sub/modules/sub .git &&
 		GIT_WORK_TREE=. git config --unset core.worktree
 	) &&
-	test_must_fail git rm submod &&
-	test -d submod &&
-	test -d submod/subsubmod/.git &&
-	git status -s -uno --ignore-submodules=none >actual &&
-	! test -s actual &&
-	test_must_fail git rm -f submod &&
-	test -d submod &&
-	test -d submod/subsubmod/.git &&
+	git rm submod 2>output.err &&
+	! test -d submod &&
+	! test -d submod/subsubmod/.git &&
 	git status -s -uno --ignore-submodules=none >actual &&
-	! test -s actual &&
-	rm -rf submod
+	test -s actual &&
+	test_i18ngrep Migrating output.err
 '
 
 test_expect_success 'checking out a commit after submodule removal needs manual updates' '
-	git commit -m "submodule removal" submod &&
+	git commit -m "submodule removal" submod .gitmodules &&
 	git checkout HEAD^ &&
 	git submodule update &&
 	git checkout -q HEAD^ &&
-- 
2.11.0.rc2.35.g26e18c9


^ permalink raw reply related

* [PATCHv3 3/4] submodule: add flags to ok_to_remove_submodule
From: Stefan Beller @ 2016-12-14 22:41 UTC (permalink / raw)
  To: gitster; +Cc: git, David.Turner, bmwill, sandals, Stefan Beller
In-Reply-To: <20161214224101.6211-1-sbeller@google.com>

In different contexts the question whether deleting a submodule is ok
to remove may be answered differently.

In 293ab15eea (submodule: teach rm to remove submodules unless they
contain a git directory, 2012-09-26) a case was made that we can safely
ignore ignored untracked files for removal as we explicitely ask for the
removal of the submodule.

In a later patch we want to remove submodules even when the user doesn't
explicitly ask for it (e.g. checking out a tree-ish in which the submodule
doesn't exist).  In that case we want to be more careful when it comes
to deletion of untracked files. As of this patch it is unclear how this
will be implemented exactly, so we'll offer flags in which the caller
can specify how the different untracked files ought to be handled.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/rm.c |  3 ++-
 submodule.c  | 23 +++++++++++++++++++----
 submodule.h  |  5 ++++-
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 3f3e24eb36..fdd7183f61 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -187,7 +187,8 @@ static int check_local_mod(struct object_id *head, int index_only)
 		 */
 		if (ce_match_stat(ce, &st, 0) ||
 		    (S_ISGITLINK(ce->ce_mode) &&
-		     !ok_to_remove_submodule(ce->name)))
+		     !ok_to_remove_submodule(ce->name,
+				SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED)))
 			local_changes = 1;
 
 		/*
diff --git a/submodule.c b/submodule.c
index 9f0b544ebe..2d13744b06 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1019,7 +1019,7 @@ int submodule_uses_gitfile(const char *path)
 	return 1;
 }
 
-int ok_to_remove_submodule(const char *path)
+int ok_to_remove_submodule(const char *path, unsigned flags)
 {
 	ssize_t len;
 	struct child_process cp = CHILD_PROCESS_INIT;
@@ -1032,15 +1032,27 @@ int ok_to_remove_submodule(const char *path)
 	if (!submodule_uses_gitfile(path))
 		return 0;
 
-	argv_array_pushl(&cp.args, "status", "--porcelain", "-u",
+	argv_array_pushl(&cp.args, "status", "--porcelain",
 				   "--ignore-submodules=none", NULL);
+
+	if (flags & SUBMODULE_REMOVAL_IGNORE_UNTRACKED)
+		argv_array_push(&cp.args, "-uno");
+	else
+		argv_array_push(&cp.args, "-uall");
+
+	if (!(flags & SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED))
+		argv_array_push(&cp.args, "--ignored");
+
 	prepare_submodule_repo_env(&cp.env_array);
 	cp.git_cmd = 1;
 	cp.no_stdin = 1;
 	cp.out = -1;
 	cp.dir = path;
 	if (start_command(&cp))
-		die(_("could not run 'git status --porcelain -u --ignore-submodules=none' in submodule %s"), path);
+		die(_("could not run 'git status --porcelain --ignore-submodules=none %s %s' in submodule '%s'"),
+			(flags & SUBMODULE_REMOVAL_IGNORE_UNTRACKED) ? "-uno" : "-uall",
+			(!(flags & SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED)) ? "--ignored" : "",
+			path);
 
 	len = strbuf_read(&buf, cp.out, 1024);
 	if (len > 2)
@@ -1048,7 +1060,10 @@ int ok_to_remove_submodule(const char *path)
 	close(cp.out);
 
 	if (finish_command(&cp))
-		die(_("'git status --porcelain -u --ignore-submodules=none' failed in submodule %s"), path);
+		die(_("'git status --porcelain --ignore-submodules=none %s %s' failed in submodule '%s'"),
+			(flags & SUBMODULE_REMOVAL_IGNORE_UNTRACKED) ? "-uno" : "-uall",
+			(!(flags & SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED)) ? "--ignored" : "",
+			path);
 
 	strbuf_release(&buf);
 	return ok_to_remove;
diff --git a/submodule.h b/submodule.h
index 61fb610749..3ed3aa479a 100644
--- a/submodule.h
+++ b/submodule.h
@@ -59,7 +59,10 @@ extern int fetch_populated_submodules(const struct argv_array *options,
 			       int quiet, int max_parallel_jobs);
 extern unsigned is_submodule_modified(const char *path, int ignore_untracked);
 extern int submodule_uses_gitfile(const char *path);
-extern int ok_to_remove_submodule(const char *path);
+
+#define SUBMODULE_REMOVAL_IGNORE_UNTRACKED (1<<0)
+#define SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED (1<<1)
+extern int ok_to_remove_submodule(const char *path, unsigned flags);
 extern int merge_submodule(unsigned char result[20], const char *path,
 			   const unsigned char base[20],
 			   const unsigned char a[20],
-- 
2.11.0.rc2.35.g26e18c9


^ permalink raw reply related

* [PATCHv3 1/4] submodule.h: add extern keyword to functions
From: Stefan Beller @ 2016-12-14 22:40 UTC (permalink / raw)
  To: gitster; +Cc: git, David.Turner, bmwill, sandals, Stefan Beller
In-Reply-To: <20161214224101.6211-1-sbeller@google.com>

As the upcoming series will add a lot of functions to the submodule
header, let's first make the header consistent to the rest of the project
by adding the extern keyword to functions.

As per the CodingGuidelines we try to stay below 80 characters per line,
so adapt all those functions to stay below 80 characters that are already
using more than one line.  Those function using just one line are better
kept in one line than breaking them up into multiple lines just for the
goal of staying below the character limit as it makes grepping
for functions easier if they are one liners.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.h | 55 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/submodule.h b/submodule.h
index 6229054b99..61fb610749 100644
--- a/submodule.h
+++ b/submodule.h
@@ -29,50 +29,55 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
-int is_staging_gitmodules_ok(void);
-int update_path_in_gitmodules(const char *oldpath, const char *newpath);
-int remove_path_from_gitmodules(const char *path);
-void stage_updated_gitmodules(void);
-void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
+extern int is_staging_gitmodules_ok(void);
+extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
+extern int remove_path_from_gitmodules(const char *path);
+extern void stage_updated_gitmodules(void);
+extern void set_diffopt_flags_from_submodule_config(struct diff_options *,
 		const char *path);
-int submodule_config(const char *var, const char *value, void *cb);
-void gitmodules_config(void);
-int parse_submodule_update_strategy(const char *value,
+extern int submodule_config(const char *var, const char *value, void *cb);
+extern void gitmodules_config(void);
+extern int parse_submodule_update_strategy(const char *value,
 		struct submodule_update_strategy *dst);
-const char *submodule_strategy_to_string(const struct submodule_update_strategy *s);
-void handle_ignore_submodules_arg(struct diff_options *diffopt, const char *);
-void show_submodule_summary(FILE *f, const char *path,
+extern const char *submodule_strategy_to_string(const struct submodule_update_strategy *s);
+extern void handle_ignore_submodules_arg(struct diff_options *, const char *);
+extern void show_submodule_summary(FILE *f, const char *path,
 		const char *line_prefix,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset);
-void show_submodule_inline_diff(FILE *f, const char *path,
+extern void show_submodule_inline_diff(FILE *f, const char *path,
 		const char *line_prefix,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset,
 		const struct diff_options *opt);
-void set_config_fetch_recurse_submodules(int value);
-void check_for_new_submodule_commits(unsigned char new_sha1[20]);
-int fetch_populated_submodules(const struct argv_array *options,
+extern void set_config_fetch_recurse_submodules(int value);
+extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
+extern int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
 			       int quiet, int max_parallel_jobs);
-unsigned is_submodule_modified(const char *path, int ignore_untracked);
-int submodule_uses_gitfile(const char *path);
-int ok_to_remove_submodule(const char *path);
-int merge_submodule(unsigned char result[20], const char *path, const unsigned char base[20],
-		    const unsigned char a[20], const unsigned char b[20], int search);
-int find_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_name,
-		struct string_list *needs_pushing);
-int push_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_name);
-int parallel_submodules(void);
+extern unsigned is_submodule_modified(const char *path, int ignore_untracked);
+extern int submodule_uses_gitfile(const char *path);
+extern int ok_to_remove_submodule(const char *path);
+extern int merge_submodule(unsigned char result[20], const char *path,
+			   const unsigned char base[20],
+			   const unsigned char a[20],
+			   const unsigned char b[20], int search);
+extern int find_unpushed_submodules(unsigned char new_sha1[20],
+				    const char *remotes_name,
+				    struct string_list *needs_pushing);
+extern int push_unpushed_submodules(unsigned char new_sha1[20],
+				    const char *remotes_name);
+extern void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
+extern int parallel_submodules(void);
 
 /*
  * Prepare the "env_array" parameter of a "struct child_process" for executing
  * a submodule by clearing any repo-specific envirionment variables, but
  * retaining any config in the environment.
  */
-void prepare_submodule_repo_env(struct argv_array *out);
+extern void prepare_submodule_repo_env(struct argv_array *out);
 
 #define ABSORB_GITDIR_RECURSE_SUBMODULES (1<<0)
 extern void absorb_git_dir_into_superproject(const char *prefix,
-- 
2.11.0.rc2.35.g26e18c9


^ permalink raw reply related

* [PATCHv3 2/4] submodule: modernize ok_to_remove_submodule to use argv_array
From: Stefan Beller @ 2016-12-14 22:40 UTC (permalink / raw)
  To: gitster; +Cc: git, David.Turner, bmwill, sandals, Stefan Beller
In-Reply-To: <20161214224101.6211-1-sbeller@google.com>

Instead of constructing the NULL terminated array ourselves, we
should make use of the argv_array infrastructure.

While at it, adapt the error messages to reflect the actual invocation.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/submodule.c b/submodule.c
index 45ccfb7ab4..9f0b544ebe 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1023,13 +1023,6 @@ int ok_to_remove_submodule(const char *path)
 {
 	ssize_t len;
 	struct child_process cp = CHILD_PROCESS_INIT;
-	const char *argv[] = {
-		"status",
-		"--porcelain",
-		"-u",
-		"--ignore-submodules=none",
-		NULL,
-	};
 	struct strbuf buf = STRBUF_INIT;
 	int ok_to_remove = 1;
 
@@ -1039,14 +1032,15 @@ int ok_to_remove_submodule(const char *path)
 	if (!submodule_uses_gitfile(path))
 		return 0;
 
-	cp.argv = argv;
+	argv_array_pushl(&cp.args, "status", "--porcelain", "-u",
+				   "--ignore-submodules=none", NULL);
 	prepare_submodule_repo_env(&cp.env_array);
 	cp.git_cmd = 1;
 	cp.no_stdin = 1;
 	cp.out = -1;
 	cp.dir = path;
 	if (start_command(&cp))
-		die("Could not run 'git status --porcelain -uall --ignore-submodules=none' in submodule %s", path);
+		die(_("could not run 'git status --porcelain -u --ignore-submodules=none' in submodule %s"), path);
 
 	len = strbuf_read(&buf, cp.out, 1024);
 	if (len > 2)
@@ -1054,7 +1048,7 @@ int ok_to_remove_submodule(const char *path)
 	close(cp.out);
 
 	if (finish_command(&cp))
-		die("'git status --porcelain -uall --ignore-submodules=none' failed in submodule %s", path);
+		die(_("'git status --porcelain -u --ignore-submodules=none' failed in submodule %s"), path);
 
 	strbuf_release(&buf);
 	return ok_to_remove;
-- 
2.11.0.rc2.35.g26e18c9


^ permalink raw reply related

* [PATCHv3 0/4] git-rm absorbs submodule git directory before deletion
From: Stefan Beller @ 2016-12-14 22:40 UTC (permalink / raw)
  To: gitster; +Cc: git, David.Turner, bmwill, sandals, Stefan Beller

v3:
* removed the patch to enhance ok_to_remove_submodule to absorb the submodule
  if needed
* Removed all the error reporting from git-rm that was related to submodule
  git directories not absorbed.
* instead just absorb the git repositories or let the absorb function die
  with an appropriate error message.

v2:
* new base where to apply the patch:
  sb/submodule-embed-gitdir merged with sb/t3600-cleanup.
  I got merge conflicts and resolved them this way:
#@@@ -709,9 -687,10 +687,9 @@@ test_expect_success 'checking out a com
#          git commit -m "submodule removal" submod &&
#          git checkout HEAD^ &&
#          git submodule update &&
#-         git checkout -q HEAD^ 2>actual &&
#+         git checkout -q HEAD^ &&
#          git checkout -q master 2>actual &&
# -        echo "warning: unable to rmdir submod: Directory not empty" >expected &&
# -        test_i18ncmp expected actual &&
# +        test_i18ngrep "^warning: unable to rmdir submod:" actual &&
#          git status -s submod >actual &&
#          echo "?? submod/" >expected &&
#          test_cmp expected actual &&
#

* improved commit message in "ok_to_remove_submodule: absorb the submodule git dir"
  (David Turner offered me some advice on how to write better English off list)
* simplified code in last patch:
  -> dropped wrong comment for fallthrough
  -> moved redundant code out of both bodies of an if-clause.
* Fixed last patchs commit message to have "or_die" instead of or_dir.

v1:
The "checkout --recurse-submodules" series got too large to comfortably send
it out for review, so I had to break it up into smaller series'; this is the
first subseries, but it makes sense on its own.

This series teaches git-rm to absorb the git directory of a submodule instead
of failing and complaining about the git directory preventing deletion.

It applies on origin/sb/submodule-embed-gitdir.

Any feedback welcome!

Thanks,
Stefan


Stefan Beller (4):
  submodule.h: add extern keyword to functions
  submodule: modernize ok_to_remove_submodule to use argv_array
  submodule: add flags to ok_to_remove_submodule
  rm: add absorb a submodules git dir before deletion

 builtin/rm.c  | 78 +++++++++++++++--------------------------------------------
 cache.h       |  2 ++
 entry.c       |  8 ++++++
 submodule.c   | 31 +++++++++++++++---------
 submodule.h   | 58 +++++++++++++++++++++++++-------------------
 t/t3600-rm.sh | 39 ++++++++++++------------------
 6 files changed, 97 insertions(+), 119 deletions(-)

-- 
2.11.0.rc2.35.g26e18c9


^ permalink raw reply

* [PATCH v10 1/6] lib-proto-disable: variable name fix
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

The test_proto function assigns the positional parameters to named
variables, but then still refers to "$desc" as "$1". Using $desc is
more readable and less error-prone.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/lib-proto-disable.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index b0917d9..be88e9a 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -9,7 +9,7 @@ test_proto () {
 	proto=$2
 	url=$3
 
-	test_expect_success "clone $1 (enabled)" '
+	test_expect_success "clone $desc (enabled)" '
 		rm -rf tmp.git &&
 		(
 			GIT_ALLOW_PROTOCOL=$proto &&
@@ -18,7 +18,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "fetch $1 (enabled)" '
+	test_expect_success "fetch $desc (enabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=$proto &&
@@ -27,7 +27,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "push $1 (enabled)" '
+	test_expect_success "push $desc (enabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=$proto &&
@@ -36,7 +36,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "push $1 (disabled)" '
+	test_expect_success "push $desc (disabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=none &&
@@ -45,7 +45,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "fetch $1 (disabled)" '
+	test_expect_success "fetch $desc (disabled)" '
 		(
 			cd tmp.git &&
 			GIT_ALLOW_PROTOCOL=none &&
@@ -54,7 +54,7 @@ test_proto () {
 		)
 	'
 
-	test_expect_success "clone $1 (disabled)" '
+	test_expect_success "clone $desc (disabled)" '
 		rm -rf tmp.git &&
 		(
 			GIT_ALLOW_PROTOCOL=none &&
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v10 4/6] http: create function to get curl allowed protocols
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

Move the creation of an allowed protocols whitelist to a helper
function. This will be useful when we need to compute the set of
allowed protocols differently for normal and redirect cases.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/http.c b/http.c
index 034426e..f7c488a 100644
--- a/http.c
+++ b/http.c
@@ -489,10 +489,25 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
+static long get_curl_allowed_protocols(void)
+{
+	long allowed_protocols = 0;
+
+	if (is_transport_allowed("http"))
+		allowed_protocols |= CURLPROTO_HTTP;
+	if (is_transport_allowed("https"))
+		allowed_protocols |= CURLPROTO_HTTPS;
+	if (is_transport_allowed("ftp"))
+		allowed_protocols |= CURLPROTO_FTP;
+	if (is_transport_allowed("ftps"))
+		allowed_protocols |= CURLPROTO_FTPS;
+
+	return allowed_protocols;
+}
+
 static CURL *get_curl_handle(void)
 {
 	CURL *result = curl_easy_init();
-	long allowed_protocols = 0;
 
 	if (!result)
 		die("curl_easy_init failed");
@@ -572,16 +587,10 @@ static CURL *get_curl_handle(void)
 	curl_easy_setopt(result, CURLOPT_POST301, 1);
 #endif
 #if LIBCURL_VERSION_NUM >= 0x071304
-	if (is_transport_allowed("http"))
-		allowed_protocols |= CURLPROTO_HTTP;
-	if (is_transport_allowed("https"))
-		allowed_protocols |= CURLPROTO_HTTPS;
-	if (is_transport_allowed("ftp"))
-		allowed_protocols |= CURLPROTO_FTP;
-	if (is_transport_allowed("ftps"))
-		allowed_protocols |= CURLPROTO_FTPS;
-	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, allowed_protocols);
-	curl_easy_setopt(result, CURLOPT_PROTOCOLS, allowed_protocols);
+	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
+			 get_curl_allowed_protocols());
+	curl_easy_setopt(result, CURLOPT_PROTOCOLS,
+			 get_curl_allowed_protocols());
 #else
 	warning("protocol restrictions not applied to curl redirects because\n"
 		"your curl version is too old (>= 7.19.4)");
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v10 6/6] http: respect protocol.*.allow=user for http-alternates
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Jeff King, gitster, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

From: Jeff King <peff@peff.net>

The http-walker may fetch the http-alternates (or
alternates) file from a remote in order to find more
objects. This should count as a "not from the user" use of
the protocol. But because we implement the redirection
ourselves and feed the new URL to curl, it will use the
CURLOPT_PROTOCOLS rules, not the more restrictive
CURLOPT_REDIR_PROTOCOLS.

The ideal solution would be for each curl request we make to
know whether or not is directly from the user or part of an
alternates redirect, and then set CURLOPT_PROTOCOLS as
appropriate. However, that would require plumbing that
information through all of the various layers of the http
code.

Instead, let's check the protocol at the source: when we are
parsing the remote http-alternates file. The only downside
is that if there's any mismatch between what protocol we
think it is versus what curl thinks it is, it could violate
the policy.

To address this, we'll make the parsing err on the picky
side, and only allow protocols that it can parse
definitively. So for example, you can't elude the "http"
policy by asking for "HTTP://", even though curl might
handle it; we would reject it as unknown. The only unsafe
case would be if you have a URL that starts with "http://"
but curl interprets as another protocol. That seems like an
unlikely failure mode (and we are still protected by our
base CURLOPT_PROTOCOL setting, so the worst you could do is
trigger one of https, ftp, or ftps).

Signed-off-by: Jeff King <peff@peff.net>
---
 http-walker.c              | 52 ++++++++++++++++++++++++++++++++++++----------
 t/t5550-http-fetch-dumb.sh | 10 +++++++++
 2 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index c2f81cd..b34b6ac 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -3,6 +3,7 @@
 #include "walker.h"
 #include "http.h"
 #include "list.h"
+#include "transport.h"
 
 struct alt_base {
 	char *base;
@@ -160,6 +161,32 @@ static void prefetch(struct walker *walker, unsigned char *sha1)
 #endif
 }
 
+static int is_alternate_allowed(const char *url)
+{
+	const char *protocols[] = {
+		"http", "https", "ftp", "ftps"
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(protocols); i++) {
+		const char *end;
+		if (skip_prefix(url, protocols[i], &end) &&
+		    starts_with(end, "://"))
+			break;
+	}
+
+	if (i >= ARRAY_SIZE(protocols)) {
+		warning("ignoring alternate with unknown protocol: %s", url);
+		return 0;
+	}
+	if (!is_transport_allowed(protocols[i], 0)) {
+		warning("ignoring alternate with restricted protocol: %s", url);
+		return 0;
+	}
+
+	return 1;
+}
+
 static void process_alternates_response(void *callback_data)
 {
 	struct alternates_request *alt_req =
@@ -274,17 +301,20 @@ static void process_alternates_response(void *callback_data)
 				struct strbuf target = STRBUF_INIT;
 				strbuf_add(&target, base, serverlen);
 				strbuf_add(&target, data + i, posn - i - 7);
-				warning("adding alternate object store: %s",
-					target.buf);
-				newalt = xmalloc(sizeof(*newalt));
-				newalt->next = NULL;
-				newalt->base = strbuf_detach(&target, NULL);
-				newalt->got_indices = 0;
-				newalt->packs = NULL;
-
-				while (tail->next != NULL)
-					tail = tail->next;
-				tail->next = newalt;
+
+				if (is_alternate_allowed(target.buf)) {
+					warning("adding alternate object store: %s",
+						target.buf);
+					newalt = xmalloc(sizeof(*newalt));
+					newalt->next = NULL;
+					newalt->base = strbuf_detach(&target, NULL);
+					newalt->got_indices = 0;
+					newalt->packs = NULL;
+
+					while (tail->next != NULL)
+						tail = tail->next;
+					tail->next = newalt;
+				}
 			}
 		}
 		i = posn + 1;
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 22011f0..c0ee29c 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -360,5 +360,15 @@ test_expect_success 'http-alternates cannot point at funny protocols' '
 		clone "$HTTPD_URL/dumb/evil.git" evil-file
 '
 
+test_expect_success 'http-alternates triggers not-from-user protocol check' '
+	echo "$HTTPD_URL/dumb/victim.git/objects" \
+		>"$evil/objects/info/http-alternates" &&
+	test_config_global http.followRedirects true &&
+	test_must_fail git -c protocol.http.allow=user \
+		clone $HTTPD_URL/dumb/evil.git evil-user &&
+	git -c protocol.http.allow=always \
+		clone $HTTPD_URL/dumb/evil.git evil-user
+'
+
 stop_httpd
 test_done
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v10 5/6] transport: add from_user parameter to is_transport_allowed
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

Add a from_user parameter to is_transport_allowed() to allow http to be
able to distinguish between protocol restrictions for redirects versus
initial requests.  CURLOPT_REDIR_PROTOCOLS can now be set differently
from CURLOPT_PROTOCOLS to disallow use of protocols with the "user"
policy in redirects.

This change allows callers to query if a transport protocol is allowed,
given that the caller knows that the protocol is coming from the user
(1) or not from the user (0) such as redirects in libcurl.  If unknown a
-1 should be provided which falls back to reading
`GIT_PROTOCOL_FROM_USER` to determine if the protocol came from the
user.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c                        | 14 +++++++-------
 t/t5812-proto-disable-http.sh |  7 +++++++
 transport.c                   |  8 +++++---
 transport.h                   | 13 ++++++++++---
 4 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/http.c b/http.c
index f7c488a..2208269 100644
--- a/http.c
+++ b/http.c
@@ -489,17 +489,17 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
-static long get_curl_allowed_protocols(void)
+static long get_curl_allowed_protocols(int from_user)
 {
 	long allowed_protocols = 0;
 
-	if (is_transport_allowed("http"))
+	if (is_transport_allowed("http", from_user))
 		allowed_protocols |= CURLPROTO_HTTP;
-	if (is_transport_allowed("https"))
+	if (is_transport_allowed("https", from_user))
 		allowed_protocols |= CURLPROTO_HTTPS;
-	if (is_transport_allowed("ftp"))
+	if (is_transport_allowed("ftp", from_user))
 		allowed_protocols |= CURLPROTO_FTP;
-	if (is_transport_allowed("ftps"))
+	if (is_transport_allowed("ftps", from_user))
 		allowed_protocols |= CURLPROTO_FTPS;
 
 	return allowed_protocols;
@@ -588,9 +588,9 @@ static CURL *get_curl_handle(void)
 #endif
 #if LIBCURL_VERSION_NUM >= 0x071304
 	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
-			 get_curl_allowed_protocols());
+			 get_curl_allowed_protocols(0));
 	curl_easy_setopt(result, CURLOPT_PROTOCOLS,
-			 get_curl_allowed_protocols());
+			 get_curl_allowed_protocols(-1));
 #else
 	warning("protocol restrictions not applied to curl redirects because\n"
 		"your curl version is too old (>= 7.19.4)");
diff --git a/t/t5812-proto-disable-http.sh b/t/t5812-proto-disable-http.sh
index 044cc15..d911afd 100755
--- a/t/t5812-proto-disable-http.sh
+++ b/t/t5812-proto-disable-http.sh
@@ -30,5 +30,12 @@ test_expect_success 'curl limits redirects' '
 	test_must_fail git clone "$HTTPD_URL/loop-redir/smart/repo.git"
 '
 
+test_expect_success 'http can be limited to from-user' '
+	git -c protocol.http.allow=user \
+		clone "$HTTPD_URL/smart/repo.git" plain.git &&
+	test_must_fail git -c protocol.http.allow=user \
+		clone "$HTTPD_URL/smart-redir-perm/repo.git" redir.git
+'
+
 stop_httpd
 test_done
diff --git a/transport.c b/transport.c
index fbd799d..f50c31a 100644
--- a/transport.c
+++ b/transport.c
@@ -676,7 +676,7 @@ static enum protocol_allow_config get_protocol_config(const char *type)
 	return PROTOCOL_ALLOW_USER_ONLY;
 }
 
-int is_transport_allowed(const char *type)
+int is_transport_allowed(const char *type, int from_user)
 {
 	const struct string_list *whitelist = protocol_whitelist();
 	if (whitelist)
@@ -688,7 +688,9 @@ int is_transport_allowed(const char *type)
 	case PROTOCOL_ALLOW_NEVER:
 		return 0;
 	case PROTOCOL_ALLOW_USER_ONLY:
-		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+		if (from_user < 0)
+			from_user = git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+		return from_user;
 	}
 
 	die("BUG: invalid protocol_allow_config type");
@@ -696,7 +698,7 @@ int is_transport_allowed(const char *type)
 
 void transport_check_allowed(const char *type)
 {
-	if (!is_transport_allowed(type))
+	if (!is_transport_allowed(type, -1))
 		die("transport '%s' not allowed", type);
 }
 
diff --git a/transport.h b/transport.h
index 3396e1d..4f1c801 100644
--- a/transport.h
+++ b/transport.h
@@ -142,10 +142,17 @@ struct transport {
 struct transport *transport_get(struct remote *, const char *);
 
 /*
- * Check whether a transport is allowed by the environment. Type should
- * generally be the URL scheme, as described in Documentation/git.txt
+ * Check whether a transport is allowed by the environment.
+ *
+ * Type should generally be the URL scheme, as described in
+ * Documentation/git.txt
+ *
+ * from_user specifies if the transport was given by the user.  If unknown pass
+ * a -1 to read from the environment to determine if the transport was given by
+ * the user.
+ *
  */
-int is_transport_allowed(const char *type);
+int is_transport_allowed(const char *type, int from_user);
 
 /*
  * Check whether a transport is allowed by the environment,
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v10 0/6] transport protocol policy configuration
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481679637-133137-1-git-send-email-bmwill@google.com>

v10 of this series fixes the following:
* A few updates to the commit messages in order to better convey the reasoning
  behind the a few of the patches.
* Additional test to verify that curl redirects respect configured protocol
  policies.
* Patch added by Jeff King to make http alternates respect configured
  protocol policies.

Brandon Williams (5):
  lib-proto-disable: variable name fix
  http: always warn if libcurl version is too old
  transport: add protocol policy config option
  http: create function to get curl allowed protocols
  transport: add from_user parameter to is_transport_allowed

Jeff King (1):
  http: respect protocol.*.allow=user for http-alternates

 Documentation/config.txt         |  46 +++++++++++++
 Documentation/git.txt            |  38 ++++-------
 git-submodule.sh                 |  12 ++--
 http-walker.c                    |  52 +++++++++++---
 http.c                           |  36 ++++++----
 t/lib-proto-disable.sh           | 142 ++++++++++++++++++++++++++++++++++++---
 t/t5509-fetch-push-namespaces.sh |   1 +
 t/t5550-http-fetch-dumb.sh       |  10 +++
 t/t5802-connect-helper.sh        |   1 +
 t/t5812-proto-disable-http.sh    |   7 ++
 transport.c                      |  84 ++++++++++++++++++++---
 transport.h                      |  19 +++---
 12 files changed, 363 insertions(+), 85 deletions(-)

-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply

* [PATCH v10 3/6] transport: add protocol policy config option
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

Previously the `GIT_ALLOW_PROTOCOL` environment variable was used to
specify a whitelist of protocols to be used in clone/fetch/push
commands.  This patch introduces new configuration options for more
fine-grained control for allowing/disallowing protocols.  This also has
the added benefit of allowing easier construction of a protocol
whitelist on systems where setting an environment variable is
non-trivial.

Now users can specify a policy to be used for each type of protocol via
the 'protocol.<name>.allow' config option.  A default policy for all
unconfigured protocols can be set with the 'protocol.allow' config
option.  If no user configured default is made git will allow known-safe
protocols (http, https, git, ssh, file), disallow known-dangerous
protocols (ext), and have a default policy of `user` for all other
protocols.

The supported policies are `always`, `never`, and `user`.  The `user`
policy can be used to configure a protocol to be usable when explicitly
used by a user, while disallowing it for commands which run
clone/fetch/push commands without direct user intervention (e.g.
recursive initialization of submodules).  Commands which can potentially
clone/fetch/push from untrusted repositories without user intervention
can export `GIT_PROTOCOL_FROM_USER` with a value of '0' to prevent
protocols configured to the `user` policy from being used.

Fix remote-ext tests to use the new config to allow the ext
protocol to be tested.

Based on a patch by Jeff King <peff@peff.net>

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt         |  46 ++++++++++++++
 Documentation/git.txt            |  38 +++++-------
 git-submodule.sh                 |  12 ++--
 t/lib-proto-disable.sh           | 130 +++++++++++++++++++++++++++++++++++++--
 t/t5509-fetch-push-namespaces.sh |   1 +
 t/t5802-connect-helper.sh        |   1 +
 transport.c                      |  75 +++++++++++++++++++++-
 7 files changed, 264 insertions(+), 39 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 8153336..50d3d06 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2260,6 +2260,52 @@ pretty.<name>::
 	Note that an alias with the same name as a built-in format
 	will be silently ignored.
 
+protocol.allow::
+	If set, provide a user defined default policy for all protocols which
+	don't explicitly have a policy (`protocol.<name>.allow`).  By default,
+	if unset, known-safe protocols (http, https, git, ssh, file) have a
+	default policy of `always`, known-dangerous protocols (ext) have a
+	default policy of `never`, and all other protocols have a default
+	policy of `user`.  Supported policies:
++
+--
+
+* `always` - protocol is always able to be used.
+
+* `never` - protocol is never able to be used.
+
+* `user` - protocol is only able to be used when `GIT_PROTOCOL_FROM_USER` is
+  either unset or has a value of 1.  This policy should be used when you want a
+  protocol to be directly usable by the user but don't want it used by commands which
+  execute clone/fetch/push commands without user input, e.g. recursive
+  submodule initialization.
+
+--
+
+protocol.<name>.allow::
+	Set a policy to be used by protocol `<name>` with clone/fetch/push
+	commands. See `protocol.allow` above for the available policies.
++
+The protocol names currently used by git are:
++
+--
+  - `file`: any local file-based path (including `file://` URLs,
+    or local paths)
+
+  - `git`: the anonymous git protocol over a direct TCP
+    connection (or proxy, if configured)
+
+  - `ssh`: git over ssh (including `host:path` syntax,
+    `ssh://`, etc).
+
+  - `http`: git over http, both "smart http" and "dumb http".
+    Note that this does _not_ include `https`; if you want to configure
+    both, you must do so individually.
+
+  - any external helpers are named by their protocol (e.g., use
+    `hg` to allow the `git-remote-hg` helper)
+--
+
 pull.ff::
 	By default, Git does not create an extra merge commit when merging
 	a commit that is a descendant of the current commit. Instead, the
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 923aa49..d9fb937 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1129,30 +1129,20 @@ of clones and fetches.
 	cloning a repository to make a backup).
 
 `GIT_ALLOW_PROTOCOL`::
-	If set, provide a colon-separated list of protocols which are
-	allowed to be used with fetch/push/clone. This is useful to
-	restrict recursive submodule initialization from an untrusted
-	repository. Any protocol not mentioned will be disallowed (i.e.,
-	this is a whitelist, not a blacklist). If the variable is not
-	set at all, all protocols are enabled.  The protocol names
-	currently used by git are:
-
-	  - `file`: any local file-based path (including `file://` URLs,
-	    or local paths)
-
-	  - `git`: the anonymous git protocol over a direct TCP
-	    connection (or proxy, if configured)
-
-	  - `ssh`: git over ssh (including `host:path` syntax,
-	    `ssh://`, etc).
-
-	  - `http`: git over http, both "smart http" and "dumb http".
-	    Note that this does _not_ include `https`; if you want both,
-	    you should specify both as `http:https`.
-
-	  - any external helpers are named by their protocol (e.g., use
-	    `hg` to allow the `git-remote-hg` helper)
-
+	If set to a colon-separated list of protocols, behave as if
+	`protocol.allow` is set to `never`, and each of the listed
+	protocols has `protocol.<name>.allow` set to `always`
+	(overriding any existing configuration). In other words, any
+	protocol not mentioned will be disallowed (i.e., this is a
+	whitelist, not a blacklist). See the description of
+	`protocol.allow` in linkgit:git-config[1] for more details.
+
+`GIT_PROTOCOL_FROM_USER`::
+	Set to 0 to prevent protocols used by fetch/push/clone which are
+	configured to the `user` state.  This is useful to restrict recursive
+	submodule initialization from an untrusted repository or for programs
+	which feed potentially-untrusted URLS to git commands.  See
+	linkgit:git-config[1] for more details.
 
 Discussion[[Discussion]]
 ------------------------
diff --git a/git-submodule.sh b/git-submodule.sh
index 78fdac9..fc44076 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -22,14 +22,10 @@ require_work_tree
 wt_prefix=$(git rev-parse --show-prefix)
 cd_to_toplevel
 
-# Restrict ourselves to a vanilla subset of protocols; the URLs
-# we get are under control of a remote repository, and we do not
-# want them kicking off arbitrary git-remote-* programs.
-#
-# If the user has already specified a set of allowed protocols,
-# we assume they know what they're doing and use that instead.
-: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
-export GIT_ALLOW_PROTOCOL
+# Tell the rest of git that any URLs we get don't come
+# directly from the user, so it can apply policy as appropriate.
+GIT_PROTOCOL_FROM_USER=0
+export GIT_PROTOCOL_FROM_USER
 
 command=
 branch=
diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index be88e9a..02f49cb 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -1,10 +1,7 @@
 # Test routines for checking protocol disabling.
 
-# test cloning a particular protocol
-#   $1 - description of the protocol
-#   $2 - machine-readable name of the protocol
-#   $3 - the URL to try cloning
-test_proto () {
+# Test clone/fetch/push with GIT_ALLOW_PROTOCOL whitelist
+test_whitelist () {
 	desc=$1
 	proto=$2
 	url=$3
@@ -62,6 +59,129 @@ test_proto () {
 			test_must_fail git clone --bare "$url" tmp.git
 		)
 	'
+
+	test_expect_success "clone $desc (env var has precedence)" '
+		rm -rf tmp.git &&
+		(
+			GIT_ALLOW_PROTOCOL=none &&
+			export GIT_ALLOW_PROTOCOL &&
+			test_must_fail git -c protocol.allow=always clone --bare "$url" tmp.git &&
+			test_must_fail git -c protocol.$proto.allow=always clone --bare "$url" tmp.git
+		)
+	'
+}
+
+test_config () {
+	desc=$1
+	proto=$2
+	url=$3
+
+	# Test clone/fetch/push with protocol.<type>.allow config
+	test_expect_success "clone $desc (enabled with config)" '
+		rm -rf tmp.git &&
+		git -c protocol.$proto.allow=always clone --bare "$url" tmp.git
+	'
+
+	test_expect_success "fetch $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=always fetch
+	'
+
+	test_expect_success "push $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=always  push origin HEAD:pushed
+	'
+
+	test_expect_success "push $desc (disabled)" '
+		test_must_fail git -C tmp.git -c protocol.$proto.allow=never push origin HEAD:pushed
+	'
+
+	test_expect_success "fetch $desc (disabled)" '
+		test_must_fail git -C tmp.git -c protocol.$proto.allow=never fetch
+	'
+
+	test_expect_success "clone $desc (disabled)" '
+		rm -rf tmp.git &&
+		test_must_fail git -c protocol.$proto.allow=never clone --bare "$url" tmp.git
+	'
+
+	# Test clone/fetch/push with protocol.user.allow and its env var
+	test_expect_success "clone $desc (enabled)" '
+		rm -rf tmp.git &&
+		git -c protocol.$proto.allow=user clone --bare "$url" tmp.git
+	'
+
+	test_expect_success "fetch $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=user fetch
+	'
+
+	test_expect_success "push $desc (enabled)" '
+		git -C tmp.git -c protocol.$proto.allow=user push origin HEAD:pushed
+	'
+
+	test_expect_success "push $desc (disabled)" '
+		(
+			cd tmp.git &&
+			GIT_PROTOCOL_FROM_USER=0 &&
+			export GIT_PROTOCOL_FROM_USER &&
+			test_must_fail git -c protocol.$proto.allow=user push origin HEAD:pushed
+		)
+	'
+
+	test_expect_success "fetch $desc (disabled)" '
+		(
+			cd tmp.git &&
+			GIT_PROTOCOL_FROM_USER=0 &&
+			export GIT_PROTOCOL_FROM_USER &&
+			test_must_fail git -c protocol.$proto.allow=user fetch
+		)
+	'
+
+	test_expect_success "clone $desc (disabled)" '
+		rm -rf tmp.git &&
+		(
+			GIT_PROTOCOL_FROM_USER=0 &&
+			export GIT_PROTOCOL_FROM_USER &&
+			test_must_fail git -c protocol.$proto.allow=user clone --bare "$url" tmp.git
+		)
+	'
+
+	# Test clone/fetch/push with protocol.allow user defined default
+	test_expect_success "clone $desc (enabled)" '
+		rm -rf tmp.git &&
+		git config --global protocol.allow always &&
+		git clone --bare "$url" tmp.git
+	'
+
+	test_expect_success "fetch $desc (enabled)" '
+		git -C tmp.git fetch
+	'
+
+	test_expect_success "push $desc (enabled)" '
+		git -C tmp.git push origin HEAD:pushed
+	'
+
+	test_expect_success "push $desc (disabled)" '
+		git config --global protocol.allow never &&
+		test_must_fail git -C tmp.git push origin HEAD:pushed
+	'
+
+	test_expect_success "fetch $desc (disabled)" '
+		test_must_fail git -C tmp.git fetch
+	'
+
+	test_expect_success "clone $desc (disabled)" '
+		rm -rf tmp.git &&
+		test_must_fail git clone --bare "$url" tmp.git
+	'
+}
+
+# test cloning a particular protocol
+#   $1 - description of the protocol
+#   $2 - machine-readable name of the protocol
+#   $3 - the URL to try cloning
+test_proto () {
+	test_whitelist "$@"
+
+	test_config "$@"
 }
 
 # set up an ssh wrapper that will access $host/$repo in the
diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
index bc44ac3..75c570a 100755
--- a/t/t5509-fetch-push-namespaces.sh
+++ b/t/t5509-fetch-push-namespaces.sh
@@ -4,6 +4,7 @@ test_description='fetch/push involving ref namespaces'
 . ./test-lib.sh
 
 test_expect_success setup '
+	git config --global protocol.ext.allow user &&
 	test_tick &&
 	git init original &&
 	(
diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
index b7a7f9d..c6c2661 100755
--- a/t/t5802-connect-helper.sh
+++ b/t/t5802-connect-helper.sh
@@ -4,6 +4,7 @@ test_description='ext::cmd remote "connect" helper'
 . ./test-lib.sh
 
 test_expect_success setup '
+	git config --global protocol.ext.allow user &&
 	test_tick &&
 	git commit --allow-empty -m initial &&
 	test_tick &&
diff --git a/transport.c b/transport.c
index dff929e..fbd799d 100644
--- a/transport.c
+++ b/transport.c
@@ -617,10 +617,81 @@ static const struct string_list *protocol_whitelist(void)
 	return enabled ? &allowed : NULL;
 }
 
+enum protocol_allow_config {
+	PROTOCOL_ALLOW_NEVER = 0,
+	PROTOCOL_ALLOW_USER_ONLY,
+	PROTOCOL_ALLOW_ALWAYS
+};
+
+static enum protocol_allow_config parse_protocol_config(const char *key,
+							const char *value)
+{
+	if (!strcasecmp(value, "always"))
+		return PROTOCOL_ALLOW_ALWAYS;
+	else if (!strcasecmp(value, "never"))
+		return PROTOCOL_ALLOW_NEVER;
+	else if (!strcasecmp(value, "user"))
+		return PROTOCOL_ALLOW_USER_ONLY;
+
+	die("unknown value for config '%s': %s", key, value);
+}
+
+static enum protocol_allow_config get_protocol_config(const char *type)
+{
+	char *key = xstrfmt("protocol.%s.allow", type);
+	char *value;
+
+	/* first check the per-protocol config */
+	if (!git_config_get_string(key, &value)) {
+		enum protocol_allow_config ret =
+			parse_protocol_config(key, value);
+		free(key);
+		free(value);
+		return ret;
+	}
+	free(key);
+
+	/* if defined, fallback to user-defined default for unknown protocols */
+	if (!git_config_get_string("protocol.allow", &value)) {
+		enum protocol_allow_config ret =
+			parse_protocol_config("protocol.allow", value);
+		free(value);
+		return ret;
+	}
+
+	/* fallback to built-in defaults */
+	/* known safe */
+	if (!strcmp(type, "http") ||
+	    !strcmp(type, "https") ||
+	    !strcmp(type, "git") ||
+	    !strcmp(type, "ssh") ||
+	    !strcmp(type, "file"))
+		return PROTOCOL_ALLOW_ALWAYS;
+
+	/* known scary; err on the side of caution */
+	if (!strcmp(type, "ext"))
+		return PROTOCOL_ALLOW_NEVER;
+
+	/* unknown; by default let them be used only directly by the user */
+	return PROTOCOL_ALLOW_USER_ONLY;
+}
+
 int is_transport_allowed(const char *type)
 {
-	const struct string_list *allowed = protocol_whitelist();
-	return !allowed || string_list_has_string(allowed, type);
+	const struct string_list *whitelist = protocol_whitelist();
+	if (whitelist)
+		return string_list_has_string(whitelist, type);
+
+	switch (get_protocol_config(type)) {
+	case PROTOCOL_ALLOW_ALWAYS:
+		return 1;
+	case PROTOCOL_ALLOW_NEVER:
+		return 0;
+	case PROTOCOL_ALLOW_USER_ONLY:
+		return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+	}
+
+	die("BUG: invalid protocol_allow_config type");
 }
 
 void transport_check_allowed(const char *type)
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* [PATCH v10 2/6] http: always warn if libcurl version is too old
From: Brandon Williams @ 2016-12-14 22:39 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, peff, sbeller, bburky, jrnieder
In-Reply-To: <1481755195-174539-1-git-send-email-bmwill@google.com>

Always warn if libcurl version is too old because:

1. Even without a protocol whitelist, newer versions of curl have all
   non-http protocols disabled by default.
2. A future patch will introduce default "known-good" and "known-bad"
   protocols which are allowed/disallowed by 'is_transport_allowed'
   which older version of libcurl can't respect.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c      | 5 ++---
 transport.c | 5 -----
 transport.h | 6 ------
 3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/http.c b/http.c
index 5cd3ffd..034426e 100644
--- a/http.c
+++ b/http.c
@@ -583,9 +583,8 @@ static CURL *get_curl_handle(void)
 	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, allowed_protocols);
 	curl_easy_setopt(result, CURLOPT_PROTOCOLS, allowed_protocols);
 #else
-	if (transport_restrict_protocols())
-		warning("protocol restrictions not applied to curl redirects because\n"
-			"your curl version is too old (>= 7.19.4)");
+	warning("protocol restrictions not applied to curl redirects because\n"
+		"your curl version is too old (>= 7.19.4)");
 #endif
 
 	if (getenv("GIT_CURL_VERBOSE"))
diff --git a/transport.c b/transport.c
index 41eb82c..dff929e 100644
--- a/transport.c
+++ b/transport.c
@@ -629,11 +629,6 @@ void transport_check_allowed(const char *type)
 		die("transport '%s' not allowed", type);
 }
 
-int transport_restrict_protocols(void)
-{
-	return !!protocol_whitelist();
-}
-
 struct transport *transport_get(struct remote *remote, const char *url)
 {
 	const char *helper;
diff --git a/transport.h b/transport.h
index c681408..3396e1d 100644
--- a/transport.h
+++ b/transport.h
@@ -153,12 +153,6 @@ int is_transport_allowed(const char *type);
  */
 void transport_check_allowed(const char *type);
 
-/*
- * Returns true if the user has attempted to turn on protocol
- * restrictions at all.
- */
-int transport_restrict_protocols(void);
-
 /* Transport options which apply to git:// and scp-style URLs */
 
 /* The program to use on the remote side to send a pack */
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* Re: [PATCH v9 5/5] transport: add from_user parameter to is_transport_allowed
From: Brandon Williams @ 2016-12-14 21:58 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller, bburky, gitster, jrnieder
In-Reply-To: <20161214211229.mpgu3hahlfbdxys6@sigill.intra.peff.net>

On 12/14, Jeff King wrote:
> On Wed, Dec 14, 2016 at 03:33:49PM -0500, Jeff King wrote:
> 
> > One other "simple" fix is at the moment we parse http-alternates, to
> > parse the URL ourself and check the policy. Like:
> > [...]
> > I may have convinced myself this is a reasonable approach.
> 
> So here it is in patch form, with a test.
> 
> I also took a look at how bad it would be to plumb through the "this is
> an alternate" flag. And it's actually a little nasty, because the
> http-walker isn't calling get_active_slot() itself. It's relying on
> http_get_file() and other wrappers. The recent http_get_options makes
> this slightly less terrible, but I'd still rather avoid infecting the
> general http code that is used for the smart-http code paths.
> 
> So I think this patch on top of your series, plus the other minor fixes
> we've discussed, the topic should be ready for 'next'.

Sounds good, I'll make those few changes, place this patch ontop of
the series and send out v10 of the series in just a bit.

-- 
Brandon Williams

^ permalink raw reply

* Re: [PATCH v9 5/5] transport: add from_user parameter to is_transport_allowed
From: Jeff King @ 2016-12-14 21:12 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, bburky, gitster, jrnieder
In-Reply-To: <20161214203349.6ym3v2ny2gonovx2@sigill.intra.peff.net>

On Wed, Dec 14, 2016 at 03:33:49PM -0500, Jeff King wrote:

> One other "simple" fix is at the moment we parse http-alternates, to
> parse the URL ourself and check the policy. Like:
> [...]
> I may have convinced myself this is a reasonable approach.

So here it is in patch form, with a test.

I also took a look at how bad it would be to plumb through the "this is
an alternate" flag. And it's actually a little nasty, because the
http-walker isn't calling get_active_slot() itself. It's relying on
http_get_file() and other wrappers. The recent http_get_options makes
this slightly less terrible, but I'd still rather avoid infecting the
general http code that is used for the smart-http code paths.

So I think this patch on top of your series, plus the other minor fixes
we've discussed, the topic should be ready for 'next'.

-- >8 --
Subject: http: respect protocol.*.allow=user for http-alternates

The http-walker may fetch the http-alternates (or
alternates) file from a remote in order to find more
objects. This should count as a "not from the user" use of
the protocol. But because we implement the redirection
ourselves and feed the new URL to curl, it will use the
CURLOPT_PROTOCOLS rules, not the more restrictive
CURLOPT_REDIR_PROTOCOLS.

The ideal solution would be for each curl request we make to
know whether or not is directly from the user or part of an
alternates redirect, and then set CURLOPT_PROTOCOLS as
appropriate. However, that would require plumbing that
information through all of the various layers of the http
code.

Instead, let's check the protocol at the source: when we are
parsing the remote http-alternates file. The only downside
is that if there's any mismatch between what protocol we
think it is versus what curl thinks it is, it could violate
the policy.

To address this, we'll make the parsing err on the picky
side, and only allow protocols that it can parse
definitively. So for example, you can't elude the "http"
policy by asking for "HTTP://", even though curl might
handle it; we would reject it as unknown. The only unsafe
case would be if you have a URL that starts with "http://"
but curl interprets as another protocol. That seems like an
unlikely failure mode (and we are still protected by our
base CURLOPT_PROTOCOL setting, so the worst you could do is
trigger one of https, ftp, or ftps).

Signed-off-by: Jeff King <peff@peff.net>
---
I actually do reject "HTTP://" here, though I suspect curl does take it.
I notice that you cannot ask to clone "HTTP://..." in the first place,
as our remote-helper interface is case sensitive (or maybe it would even
respect "HTTP://" on a case-insensitive file system!).

Possibly we should be more consistent about down-casing protocols when
comparing them, but I'm not sure if anybody actually cares in practice.

 http-walker.c              | 52 ++++++++++++++++++++++++++++++++++++----------
 t/t5550-http-fetch-dumb.sh | 10 +++++++++
 2 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index c2f81cd6a..b34b6ace7 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -3,6 +3,7 @@
 #include "walker.h"
 #include "http.h"
 #include "list.h"
+#include "transport.h"
 
 struct alt_base {
 	char *base;
@@ -160,6 +161,32 @@ static void prefetch(struct walker *walker, unsigned char *sha1)
 #endif
 }
 
+static int is_alternate_allowed(const char *url)
+{
+	const char *protocols[] = {
+		"http", "https", "ftp", "ftps"
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(protocols); i++) {
+		const char *end;
+		if (skip_prefix(url, protocols[i], &end) &&
+		    starts_with(end, "://"))
+			break;
+	}
+
+	if (i >= ARRAY_SIZE(protocols)) {
+		warning("ignoring alternate with unknown protocol: %s", url);
+		return 0;
+	}
+	if (!is_transport_allowed(protocols[i], 0)) {
+		warning("ignoring alternate with restricted protocol: %s", url);
+		return 0;
+	}
+
+	return 1;
+}
+
 static void process_alternates_response(void *callback_data)
 {
 	struct alternates_request *alt_req =
@@ -274,17 +301,20 @@ static void process_alternates_response(void *callback_data)
 				struct strbuf target = STRBUF_INIT;
 				strbuf_add(&target, base, serverlen);
 				strbuf_add(&target, data + i, posn - i - 7);
-				warning("adding alternate object store: %s",
-					target.buf);
-				newalt = xmalloc(sizeof(*newalt));
-				newalt->next = NULL;
-				newalt->base = strbuf_detach(&target, NULL);
-				newalt->got_indices = 0;
-				newalt->packs = NULL;
-
-				while (tail->next != NULL)
-					tail = tail->next;
-				tail->next = newalt;
+
+				if (is_alternate_allowed(target.buf)) {
+					warning("adding alternate object store: %s",
+						target.buf);
+					newalt = xmalloc(sizeof(*newalt));
+					newalt->next = NULL;
+					newalt->base = strbuf_detach(&target, NULL);
+					newalt->got_indices = 0;
+					newalt->packs = NULL;
+
+					while (tail->next != NULL)
+						tail = tail->next;
+					tail->next = newalt;
+				}
 			}
 		}
 		i = posn + 1;
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 22011f0b6..c0ee29c65 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -360,5 +360,15 @@ test_expect_success 'http-alternates cannot point at funny protocols' '
 		clone "$HTTPD_URL/dumb/evil.git" evil-file
 '
 
+test_expect_success 'http-alternates triggers not-from-user protocol check' '
+	echo "$HTTPD_URL/dumb/victim.git/objects" \
+		>"$evil/objects/info/http-alternates" &&
+	test_config_global http.followRedirects true &&
+	test_must_fail git -c protocol.http.allow=user \
+		clone $HTTPD_URL/dumb/evil.git evil-user &&
+	git -c protocol.http.allow=always \
+		clone $HTTPD_URL/dumb/evil.git evil-user
+'
+
 stop_httpd
 test_done
-- 
2.11.0.341.g202cd3142


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox