Git development
 help / color / mirror / Atom feed
* [PATCH v3 05/23] raceproof_create_file(): new function
From: Michael Haggerty @ 2016-12-31  3:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, David Turner, Michael Haggerty
In-Reply-To: <cover.1483153436.git.mhagger@alum.mit.edu>

Add a function that tries to create a file and any containing
directories in a way that is robust against races with other processes
that might be cleaning up empty directories at the same time.

The actual file creation is done by a callback function, which, if it
fails, should set errno to EISDIR or ENOENT according to the convention
of open(). raceproof_create_file() detects such failures, and
respectively either tries to delete empty directories that might be in
the way of the file or tries to create the containing directories. Then
it retries the callback function.

This function is not yet used.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 cache.h     | 43 ++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+)

diff --git a/cache.h b/cache.h
index 8177c3a..7bc4ea6 100644
--- a/cache.h
+++ b/cache.h
@@ -1057,6 +1057,49 @@ enum scld_error {
 enum scld_error safe_create_leading_directories(char *path);
 enum scld_error safe_create_leading_directories_const(const char *path);
 
+/*
+ * Callback function for raceproof_create_file(). This function is
+ * expected to do something that makes dirname(path) permanent despite
+ * the fact that other processes might be cleaning up empty
+ * directories at the same time. Usually it will create a file named
+ * path, but alternatively it could create another file in that
+ * directory, or even chdir() into that directory. The function should
+ * return 0 if the action was completed successfully. On error, it
+ * should return a nonzero result and set errno.
+ * raceproof_create_file() treats two errno values specially:
+ *
+ * - ENOENT -- dirname(path) does not exist. In this case,
+ *             raceproof_create_file() tries creating dirname(path)
+ *             (and any parent directories, if necessary) and calls
+ *             the function again.
+ *
+ * - EISDIR -- the file already exists and is a directory. In this
+ *             case, raceproof_create_file() deletes the directory
+ *             (recursively) if it is empty and calls the function
+ *             again.
+ *
+ * Any other errno causes raceproof_create_file() to fail with the
+ * callback's return value and errno.
+ *
+ * Obviously, this function should be OK with being called again if it
+ * fails with ENOENT or EISDIR. In other scenarios it will not be
+ * called again.
+ */
+typedef int create_file_fn(const char *path, void *cb);
+
+/*
+ * Create a file in dirname(path) by calling fn, creating leading
+ * directories if necessary. Retry a few times in case we are racing
+ * with another process that is trying to clean up the directory that
+ * contains path. See the documentation for create_file_fn for more
+ * details.
+ *
+ * Return the value and set the errno that resulted from the most
+ * recent call of fn. fn is always called at least once, and will be
+ * called more than once if it returns ENOENT or EISDIR.
+ */
+int raceproof_create_file(const char *path, create_file_fn fn, void *cb);
+
 int mkdir_in_gitdir(const char *path);
 extern char *expand_user_path(const char *path);
 const char *enter_repo(const char *path, int strict);
diff --git a/sha1_file.c b/sha1_file.c
index ae8f0b4..b08f54c 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -179,6 +179,74 @@ enum scld_error safe_create_leading_directories_const(const char *path)
 	return result;
 }
 
+int raceproof_create_file(const char *path, create_file_fn fn, void *cb)
+{
+	/*
+	 * The number of times we will try to remove empty directories
+	 * in the way of path. This is only 1 because if another
+	 * process is racily creating directories that conflict with
+	 * us, we don't want to fight against them.
+	 */
+	int remove_directories_remaining = 1;
+
+	/*
+	 * The number of times that we will try to create the
+	 * directories containing path. We are willing to attempt this
+	 * more than once, because another process could be trying to
+	 * clean up empty directories at the same time as we are
+	 * trying to create them.
+	 */
+	int create_directories_remaining = 3;
+
+	/* A scratch copy of path, filled lazily if we need it: */
+	struct strbuf path_copy = STRBUF_INIT;
+
+	int ret, save_errno;
+
+	/* Sanity check: */
+	assert(*path);
+
+retry_fn:
+	ret = fn(path, cb);
+	save_errno = errno;
+	if (!ret)
+		goto out;
+
+	if (errno == EISDIR && remove_directories_remaining-- > 0) {
+		/*
+		 * A directory is in the way. Maybe it is empty; try
+		 * to remove it:
+		 */
+		if (!path_copy.len)
+			strbuf_addstr(&path_copy, path);
+
+		if (!remove_dir_recursively(&path_copy, REMOVE_DIR_EMPTY_ONLY))
+			goto retry_fn;
+	} else if (errno == ENOENT && create_directories_remaining-- > 0) {
+		/*
+		 * Maybe the containing directory didn't exist, or
+		 * maybe it was just deleted by a process that is
+		 * racing with us to clean up empty directories. Try
+		 * to create it:
+		 */
+		enum scld_error scld_result;
+
+		if (!path_copy.len)
+			strbuf_addstr(&path_copy, path);
+
+		do {
+			scld_result = safe_create_leading_directories(path_copy.buf);
+			if (scld_result == SCLD_OK)
+				goto retry_fn;
+		} while (scld_result == SCLD_VANISHED && create_directories_remaining-- > 0);
+	}
+
+out:
+	strbuf_release(&path_copy);
+	errno = save_errno;
+	return ret;
+}
+
 static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1)
 {
 	int i;
-- 
2.9.3


^ permalink raw reply related

* [PATCH v3 03/23] safe_create_leading_directories_const(): preserve errno
From: Michael Haggerty @ 2016-12-31  3:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, David Turner, Michael Haggerty
In-Reply-To: <cover.1483153436.git.mhagger@alum.mit.edu>

Some implementations of free() change errno (even thought they
shouldn't):

  https://sourceware.org/bugzilla/show_bug.cgi?id=17924

So preserve the errno from safe_create_leading_directories() across the
call to free().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 sha1_file.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/sha1_file.c b/sha1_file.c
index 1173071..10395e7 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -166,10 +166,14 @@ enum scld_error safe_create_leading_directories(char *path)
 
 enum scld_error safe_create_leading_directories_const(const char *path)
 {
+	int save_errno;
 	/* path points to cache entries, so xstrdup before messing with it */
 	char *buf = xstrdup(path);
 	enum scld_error result = safe_create_leading_directories(buf);
+
+	save_errno = errno;
 	free(buf);
+	errno = save_errno;
 	return result;
 }
 
-- 
2.9.3


^ permalink raw reply related

* [PATCH v3 00/23] Delete directories left empty after ref deletion
From: Michael Haggerty @ 2016-12-31  3:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, David Turner, Michael Haggerty

This is a re-roll of an old patch series. v1 [1] got some feedback,
which I think was all addressed in v2 [2]. But it seems that v2 fell
on the floor, and I didn't bother following up because it was in the
same area of code that was undergoing heavy changes due to the
pluggable reference backend work. Sorry for the long delay before
getting back to it.

It turns out that this patch series is still relevant and didn't even
need all that must adjustment. While rebasing onto the current
`master`, I tidied up a bit, tightened up some code, and improved some
commit messages. But the spirit of the patch series and most of its
code are unchanged.

This patch series is also available from my GitHub account [3] as
branch delete-empty-refs-dirs.

Brief summary (see v1 [1] for more details):

Previously, we were pretty sloppy about leaving empty directories
behind (under both $GIT_DIR/refs and $GIT_DIR/logs) when deleting
references. Such directories could accumulate essentially forever.
It's true that `pack-refs` deletes directories that it empties, but if
a directory is *already* empty, then `pack-refs` doesn't remove it. It
is also true that if an empty directory gets in the way of the
creation of a *new* reference, then it is deleted. But otherwise there
is no systematic cleanup of empty directories.

A reason for the old behavior was that the code paths *creating* new
files in these hierarchies were not always robust against a race with
another process that was cleaning up empty directories. So most of
this patch series is dedicated to hardening up the creation code paths
(via a new function, `raceproof_create_file()`). The last several
patches teach `files_transaction_commit()` to delete empty directories
when deleting references and reflogs.

Michael

[1] http://public-inbox.org/git/cover.1455626201.git.mhagger@alum.mit.edu/T/#u
[2] http://public-inbox.org/git/cover.1456405698.git.mhagger@alum.mit.edu/T/#u
[3] http://github.com/mhagger/git

Michael Haggerty (23):
  files_rename_ref(): tidy up whitespace
  t5505: use "for-each-ref" to test for the non-existence of references
  safe_create_leading_directories_const(): preserve errno
  safe_create_leading_directories(): set errno on SCLD_EXISTS
  raceproof_create_file(): new function
  lock_ref_sha1_basic(): inline constant
  lock_ref_sha1_basic(): use raceproof_create_file()
  rename_tmp_log(): use raceproof_create_file()
  rename_tmp_log(): improve error reporting
  log_ref_write(): inline function
  log_ref_setup(): separate code for create vs non-create
  log_ref_setup(): improve robustness against races
  log_ref_setup(): pass the open file descriptor back to the caller
  log_ref_write_1(): don't depend on logfile argument
  log_ref_setup(): manage the name of the reflog file internally
  log_ref_write_1(): inline function
  delete_ref_loose(): derive loose reference path from lock
  delete_ref_loose(): inline function
  try_remove_empty_parents(): rename parameter "name" -> "refname"
  try_remove_empty_parents(): don't trash argument contents
  try_remove_empty_parents(): don't accommodate consecutive slashes
  try_remove_empty_parents(): teach to remove parents of reflogs, too
  files_transaction_commit(): clean up empty directories

 cache.h               |  48 ++++++-
 refs/files-backend.c  | 375 +++++++++++++++++++++++++-------------------------
 refs/refs-internal.h  |  11 +-
 sha1_file.c           |  76 +++++++++-
 t/t1400-update-ref.sh |  27 ++++
 t/t5505-remote.sh     |   2 +-
 6 files changed, 346 insertions(+), 193 deletions(-)

-- 
2.9.3


^ permalink raw reply

* [PATCH v3 02/23] t5505: use "for-each-ref" to test for the non-existence of references
From: Michael Haggerty @ 2016-12-31  3:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, David Turner, Michael Haggerty
In-Reply-To: <cover.1483153436.git.mhagger@alum.mit.edu>

Instead of looking on the filesystem inside ".git/refs/remotes/origin",
use "git for-each-ref" to check for leftover references under the
remote's old name.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 t/t5505-remote.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index 8198d8e..65030fb 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -725,7 +725,7 @@ test_expect_success 'rename a remote' '
 	(
 		cd four &&
 		git remote rename origin upstream &&
-		rmdir .git/refs/remotes/origin &&
+		test -z "$(git for-each-ref refs/remotes/origin)" &&
 		test "$(git symbolic-ref refs/remotes/upstream/HEAD)" = "refs/remotes/upstream/master" &&
 		test "$(git rev-parse upstream/master)" = "$(git rev-parse master)" &&
 		test "$(git config remote.upstream.fetch)" = "+refs/heads/*:refs/remotes/upstream/*" &&
-- 
2.9.3


^ permalink raw reply related

* [PATCH v3 01/23] files_rename_ref(): tidy up whitespace
From: Michael Haggerty @ 2016-12-31  3:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, David Turner, Michael Haggerty
In-Reply-To: <cover.1483153436.git.mhagger@alum.mit.edu>

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f902393..be4ffdc 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2631,7 +2631,7 @@ static int files_rename_ref(struct ref_store *ref_store,
 	if (!read_ref_full(newrefname, RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
 			   sha1, NULL) &&
 	    delete_ref(newrefname, NULL, REF_NODEREF)) {
-		if (errno==EISDIR) {
+		if (errno == EISDIR) {
 			struct strbuf path = STRBUF_INIT;
 			int result;
 
-- 
2.9.3


^ permalink raw reply related

* Wanted: shallow submodule clones with --no-single-branch.
From: Tor Andersson @ 2016-12-30 10:50 UTC (permalink / raw)
  To: git

Hi,

When adding submodules with --depth=1 only the master branch is
cloned. This often leaves the submodule pointing to a non-existing
commit.

It would be useful if I could pass the --no-single-branch argument to
the submodule clone process, since then a submodule can point to any
tag or branch without ending up in this situation. I've got a local
patch to hardwire the --no-single-branch argument in the
builtin/submodule--helper.c clone_submodule function, but I'm not sure
if this will have any other adverse effects?

Better yet would be for the shallow submodule clone to automatically
retrieve and graft the actual commit the submodule points to, but
that's probably wishing for too much.

-Tor

^ permalink raw reply

* Re: [PATCH] submodule.c: use GIT_DIR_ENVIRONMENT consistently
From: René Scharfe @ 2016-12-30  1:36 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, jrnieder, git
In-Reply-To: <20161230004739.19144-1-sbeller@google.com>

Am 30.12.2016 um 01:47 schrieb Stefan Beller:
> diff --git a/submodule.c b/submodule.c
> index ece17315d6..973b9f3f96 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1333,5 +1333,6 @@ void prepare_submodule_repo_env(struct argv_array *out)
>  		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
>  			argv_array_push(out, *var);
>  	}
> -	argv_array_push(out, "GIT_DIR=.git");
> +	argv_array_push(out, "%s=%s", GIT_DIR_ENVIRONMENT,
> +		DEFAULT_GIT_DIR_ENVIRONMENT);

argv_array_pushf (with added "f") instead?

And indent continued lines to align them with the left parenthesis, like 
this:

	fn(arg1,
	   arg2);

>  }
>


^ permalink raw reply

* Re: [PATCH v2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR
From: Jeff King @ 2016-12-30  0:49 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Jonathan Nieder, git@vger.kernel.org, Duy Nguyen, Junio C Hamano
In-Reply-To: <CAGZ79kZiR7grGeMrMFRuQQ0kW6eQ4m=ZizGHDbT-s14iAj8hWQ@mail.gmail.com>

On Thu, Dec 29, 2016 at 04:37:30PM -0800, Stefan Beller wrote:

> > +       mkdir lsremote-root &&
> > +       (
> > +               GIT_CEILING_DIRECTORIES=$(pwd) &&
> > +               export GIT_CEILING_DIRECTORIES &&
> > +               cd lsremote-root &&
> > +               git ls-remote "$HTTPD_URL/smart/repo.git" >../actual
> > +       ) &&
> 
> We could avoid the subshell via
> 
> GIT_CEILING_DIRECTORIES=$(pwd) \
>     git -C lsremote-root lsremote ... >actual
> 
> Not sure if it is worth to trade off a block of code (and an extra shell
> at run time) for an overly long line.
> 
> The rest looks good to me.

I mentioned elsewhere that we now have a "nongit" function to do this as
a one-liner. It might be worth applying your optimization to that
function, so it would take effect in may places.

-Peff

^ permalink raw reply

* Re: [PATCH v2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR
From: Jeff King @ 2016-12-30  0:48 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Duy Nguyen, Junio C Hamano, Stefan Beller
In-Reply-To: <20161230001114.GB7883@aiede.mtv.corp.google.com>

On Thu, Dec 29, 2016 at 04:11:14PM -0800, Jonathan Nieder wrote:

> Thanks.  Here's the patch again, now with commit messages and a test.
> Thanks for the analysis and sorry for the slow turnaround.

Thanks for following up. While working on a similar one recently, I had
the nagging feeling that there was a case we had found that was still to
be dealt with, and I think this was it. :)

The patch to the C code looks good. I have a few comments on the tests:

> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> index aeb3a63f7c..a86fca3e6c 100755
> --- a/t/t5550-http-fetch-dumb.sh
> +++ b/t/t5550-http-fetch-dumb.sh
> @@ -34,6 +34,21 @@ test_expect_success 'clone http repository' '
>  	test_cmp file clone/file
>  '
>  
> +test_expect_success 'list refs from outside any repository' '
> +	cat >expect <<-EOF &&
> +	$(git rev-parse master)	HEAD
> +	$(git rev-parse master)	refs/heads/master
> +	EOF
> +	mkdir lsremote-root &&
> +	(
> +		GIT_CEILING_DIRECTORIES=$(pwd) &&
> +		export GIT_CEILING_DIRECTORIES &&
> +		cd lsremote-root &&
> +		git ls-remote "$HTTPD_URL/dumb/repo.git" >../actual
> +	) &&
> +	test_cmp expect actual
> +'

Since my recent de95302a4c (t5000: extract nongit function to
test-lib-functions.sh, 2016-12-15), this can be shortened to:

  cat >expect <<-EOF &&
  ...
  EOF
  nongit git ls-remote "$HTTPD_URL/dumb/repo.git" >actual &&
  test_cmp expect actual

I think that commit is in 'master' now.

Without my patch to die() in setup_git_env(), I think this would pass
with or without your patch, right?  I think the current behavior _is_
buggy, but a setup to show off the improvement would be so arcane that I
don't think it is worth it. E.g., something like:

  echo '[http]extraHeader = "Foo: Bar" >nongit/.git/config

would probably trigger the use of that config when it shouldn't. But
that's really stretching.

> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index 6e5b9e42fb..7ba894f0f4 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -163,6 +163,21 @@ test_expect_success 'redirects send auth to new location' '
>  	expect_askpass both user@host auth/smart/repo.git
>  '
>  
> +test_expect_success 'list refs from outside any repository' '
> +	cat >expect <<-EOF &&
> +	$(git rev-parse master)	HEAD
> +	$(git rev-parse master)	refs/heads/master
> +	EOF
> +	mkdir lsremote-root &&
> +	(
> +		GIT_CEILING_DIRECTORIES=$(pwd) &&
> +		export GIT_CEILING_DIRECTORIES &&
> +		cd lsremote-root &&
> +		git ls-remote "$HTTPD_URL/smart/repo.git" >../actual
> +	) &&
> +	test_cmp expect actual
> +'

Is this really testing anything that the dumb one isn't? The interesting
bit is in invoking git-remote-http, not what it does under the hood. I
don't mind too much being thorough, but if we are going to go that route
we should probably come up with a standard battery of tests for dumb
and smart HTTP, and then invoke them for each case without having to
write each one twice.

> diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
> index 225a022e8a..4573d98e6c 100755
> --- a/t/t5570-git-daemon.sh
> +++ b/t/t5570-git-daemon.sh
> @@ -35,6 +35,21 @@ test_expect_success 'clone git repository' '
>  	test_cmp file clone/file
>  '
>  
> +test_expect_success 'list refs from outside any repository' '
> +	cat >expect <<-EOF &&
> +	$(git rev-parse master)	HEAD
> +	$(git rev-parse master)	refs/heads/master
> +	EOF
> +	mkdir lsremote-root &&
> +	(
> +		GIT_CEILING_DIRECTORIES=$(pwd) &&
> +		export GIT_CEILING_DIRECTORIES &&
> +		cd lsremote-root &&
> +		git ls-remote "$GIT_DAEMON_URL/repo.git" >../actual
> +	) &&
> +	test_cmp expect actual
> +'

This one isn't actually broken now, right? The test is just there to
catch future regressions?

If we are being thorough, then would we also care about file-local
repos, git-over-ssh, etc?

> diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
> index c6c2661878..7232032cd2 100755
> --- a/t/t5802-connect-helper.sh
> +++ b/t/t5802-connect-helper.sh

This one is quite a big addition. I know this falls under the "while
we're at it" line at the end of your commit message, but maybe it's
worth pulling the GIT_EXT_SERVICE bits out into their own patch (and
explaining briefly what's going on; I had to go look up what
GIT_EXT_SERVICE_NOPREFIX even was).

-Peff

^ permalink raw reply

* [PATCH] submodule.c: use GIT_DIR_ENVIRONMENT consistently
From: Stefan Beller @ 2016-12-30  0:47 UTC (permalink / raw)
  To: gitster, jrnieder; +Cc: git, Stefan Beller

In C code we have the luxury of having constants for all the important
things that are hard coded. This is the only place in C, that hard codes
the git directory environment variable, so fix it.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
  Signed-off-by-the-format-patch-config ;)
  
  This is the only occurrence for "GIT_DIR=" in C, but what about ".git"
  git grep "\.git\"" *.c finds some places, which we may want to convert
  to DEFAULT_GIT_DIR_ENVIRONMENT?
  (mainly things that are newer if I can judge the places correctly
  lots of submodules, worktrees and the no data in ".git" bug AFAICT)
  
  Thanks,
  Stefan

 submodule.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index ece17315d6..973b9f3f96 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1333,5 +1333,6 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
-	argv_array_push(out, "GIT_DIR=.git");
+	argv_array_push(out, "%s=%s", GIT_DIR_ENVIRONMENT,
+		DEFAULT_GIT_DIR_ENVIRONMENT);
 }
-- 
2.11.0.259.ga95e92af08.dirty


^ permalink raw reply related

* Re: [PATCH v2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR
From: Stefan Beller @ 2016-12-30  0:37 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jeff King, git@vger.kernel.org, Duy Nguyen, Junio C Hamano
In-Reply-To: <20161230001114.GB7883@aiede.mtv.corp.google.com>

> +       mkdir lsremote-root &&
> +       (
> +               GIT_CEILING_DIRECTORIES=$(pwd) &&
> +               export GIT_CEILING_DIRECTORIES &&
> +               cd lsremote-root &&
> +               git ls-remote "$HTTPD_URL/smart/repo.git" >../actual
> +       ) &&

We could avoid the subshell via

GIT_CEILING_DIRECTORIES=$(pwd) \
    git -C lsremote-root lsremote ... >actual

Not sure if it is worth to trade off a block of code (and an extra shell
at run time) for an overly long line.

The rest looks good to me.

Thanks,
Stefan

^ permalink raw reply

* [PATCH v2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR
From: Jonathan Nieder @ 2016-12-30  0:11 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano, Stefan Beller
In-Reply-To: <20161122024102.otlnl6jcrb3pejux@sigill.intra.peff.net>

To push from or fetch to the current repository, remote helpers need
to know what repository that is.  Accordingly, Git sets the GIT_DIR
environment variable to the path to the current repository when
invoking remote helpers.

There is a special case it does not handle: "git ls-remote" and "git
archive --remote" can be run to inspect a remote repository without
being run from any local repository.  GIT_DIR is not useful in this
scenario:

- if we are not in a repository, we don't need to set GIT_DIR to
  override an existing GIT_DIR value from the environment.  If GIT_DIR
  is present then we would be in a repository if it were valid and
  would have called die() if it weren't.

- not setting GIT_DIR may cause a helper to do the usual discovery
  walk to find the repository.  But we know we're not in one, or we
  would have found it ourselves.  So in the worst case it may expend
  a little extra effort to try to find a repository and fail (for
  example, remote-curl would do this to try to find repository-level
  configuration).

So leave GIT_DIR unset in this case.  This makes GIT_DIR easier to
understand for remote helper authors and makes transport code less of
a special case for repository discovery.

Noticed using b1ef400e (setup_git_env: avoid blind fall-back to
".git", 2016-10-20) from 'next':

 $ cd /tmp
 $ git ls-remote https://kernel.googlesource.com/pub/scm/git/git
 fatal: BUG: setup_git_env called without repository

While at it, add some tests for other variables in the environment of
commands run by git-remote-ext.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Jeff King wrote:

> So yeah, I think this is the extent of the change needed.

Thanks.  Here's the patch again, now with commit messages and a test.
Thanks for the analysis and sorry for the slow turnaround.

Sincerely,
Jonathan

 t/t5550-http-fetch-dumb.sh  | 15 ++++++++
 t/t5551-http-fetch-smart.sh | 15 ++++++++
 t/t5570-git-daemon.sh       | 15 ++++++++
 t/t5802-connect-helper.sh   | 84 ++++++++++++++++++++++++++++++++++++++++++++-
 transport-helper.c          |  5 +--
 5 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index aeb3a63f7c..a86fca3e6c 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -34,6 +34,21 @@ test_expect_success 'clone http repository' '
 	test_cmp file clone/file
 '
 
+test_expect_success 'list refs from outside any repository' '
+	cat >expect <<-EOF &&
+	$(git rev-parse master)	HEAD
+	$(git rev-parse master)	refs/heads/master
+	EOF
+	mkdir lsremote-root &&
+	(
+		GIT_CEILING_DIRECTORIES=$(pwd) &&
+		export GIT_CEILING_DIRECTORIES &&
+		cd lsremote-root &&
+		git ls-remote "$HTTPD_URL/dumb/repo.git" >../actual
+	) &&
+	test_cmp expect actual
+'
+
 test_expect_success 'create password-protected repository' '
 	mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/" &&
 	cp -Rf "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 6e5b9e42fb..7ba894f0f4 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -163,6 +163,21 @@ test_expect_success 'redirects send auth to new location' '
 	expect_askpass both user@host auth/smart/repo.git
 '
 
+test_expect_success 'list refs from outside any repository' '
+	cat >expect <<-EOF &&
+	$(git rev-parse master)	HEAD
+	$(git rev-parse master)	refs/heads/master
+	EOF
+	mkdir lsremote-root &&
+	(
+		GIT_CEILING_DIRECTORIES=$(pwd) &&
+		export GIT_CEILING_DIRECTORIES &&
+		cd lsremote-root &&
+		git ls-remote "$HTTPD_URL/smart/repo.git" >../actual
+	) &&
+	test_cmp expect actual
+'
+
 test_expect_success 'disable dumb http on server' '
 	git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
 		config http.getanyfile false
diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
index 225a022e8a..4573d98e6c 100755
--- a/t/t5570-git-daemon.sh
+++ b/t/t5570-git-daemon.sh
@@ -35,6 +35,21 @@ test_expect_success 'clone git repository' '
 	test_cmp file clone/file
 '
 
+test_expect_success 'list refs from outside any repository' '
+	cat >expect <<-EOF &&
+	$(git rev-parse master)	HEAD
+	$(git rev-parse master)	refs/heads/master
+	EOF
+	mkdir lsremote-root &&
+	(
+		GIT_CEILING_DIRECTORIES=$(pwd) &&
+		export GIT_CEILING_DIRECTORIES &&
+		cd lsremote-root &&
+		git ls-remote "$GIT_DAEMON_URL/repo.git" >../actual
+	) &&
+	test_cmp expect actual
+'
+
 test_expect_success 'fetch changes via git protocol' '
 	echo content >>file &&
 	git commit -a -m two &&
diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
index c6c2661878..7232032cd2 100755
--- a/t/t5802-connect-helper.sh
+++ b/t/t5802-connect-helper.sh
@@ -3,6 +3,10 @@
 test_description='ext::cmd remote "connect" helper'
 . ./test-lib.sh
 
+escape_spaces () {
+	echo "$*" | sed -e "s/ /% /g"
+}
+
 test_expect_success setup '
 	git config --global protocol.ext.allow user &&
 	test_tick &&
@@ -19,7 +23,7 @@ test_expect_success setup '
 '
 
 test_expect_success clone '
-	cmd=$(echo "echo >&2 ext::sh invoked && %S .." | sed -e "s/ /% /g") &&
+	cmd=$(escape_spaces "echo >&2 ext::sh invoked && %S ..") &&
 	git clone "ext::sh -c %S% ." dst &&
 	git for-each-ref refs/heads/ refs/tags/ >expect &&
 	(
@@ -70,6 +74,84 @@ test_expect_success 'update backfilled tag without primary transfer' '
 	test_cmp expect actual
 '
 
+test_expect_success 'GIT_EXT_SERVICE for clone, ls-remote, push, archive' '
+	rm -rf dst &&
+	>actual &&
+	cat >expect <<-\EOF &&
+	git-upload-pack
+	git-upload-pack
+	git-receive-pack
+	git-upload-archive
+	EOF
+	git archive HEAD >expect.tar &&
+	cmd=$(escape_spaces "echo >>actual \$GIT_EXT_SERVICE && %S .") &&
+
+	git clone "ext::sh -c $cmd" dst &&
+	git ls-remote "ext::sh -c $cmd" &&
+	test_when_finished "git update-ref -d refs/heads/newbranch" &&
+	git push "ext::sh -c $cmd" master:newbranch &&
+	git archive --remote="ext::sh -c $cmd" HEAD >actual.tar &&
+
+	test_cmp expect actual &&
+	test_cmp expect.tar actual.tar
+'
+
+test_expect_success 'GIT_EXT_SERVICE_NOPREFIX for clone, ls-remote, push, archive' '
+	rm -rf dst &&
+	>actual &&
+	cat >expect <<-\EOF &&
+	upload-pack
+	upload-pack
+	receive-pack
+	upload-archive
+	EOF
+	git archive HEAD >expect.tar &&
+	cmd=$(escape_spaces "echo >>actual \$GIT_EXT_SERVICE_NOPREFIX && %S .") &&
+
+	git clone "ext::sh -c $cmd" dst &&
+	git ls-remote "ext::sh -c $cmd" &&
+	test_when_finished "git update-ref -d refs/heads/newbranch" &&
+	git push "ext::sh -c $cmd" master:newbranch &&
+	git archive --remote="ext::sh -c $cmd" HEAD >actual.tar &&
+
+	test_cmp expect actual &&
+	test_cmp expect.tar actual.tar
+'
+
+test_expect_success 'GIT_DIR is set to the enclosing repo for ls-remote' '
+	git rev-parse --git-dir >expect &&
+	cmd=$(escape_spaces "echo \$GIT_DIR >actual && %S .") &&
+	git ls-remote "ext::sh -c $cmd" &&
+	test_cmp expect actual
+'
+
+test_expect_success 'GIT_DIR is set to the enclosing repo for archive' '
+	git rev-parse --git-dir >expect &&
+	git archive HEAD >expect.tar &&
+	cmd=$(escape_spaces "echo \$GIT_DIR >actual && %S .") &&
+	git archive --remote="ext::sh -c $cmd" HEAD >actual.tar &&
+	test_cmp expect actual &&
+	test_cmp expect.tar actual.tar
+'
+
+test_expect_success 'GIT_DIR is not set if there is no enclosing repo' '
+	rm -rf subdir &&
+	>actual &&
+	printf "%s\n" unset unset >expect &&
+	git archive HEAD >expect.tar &&
+
+	mkdir subdir &&
+	cmd=$(escape_spaces "echo \${GIT_DIR-unset} >>../actual && %S ..") &&
+	(
+		GIT_CEILING_DIRECTORIES=$(pwd) &&
+		export GIT_CEILING_DIRECTORIES &&
+		cd subdir &&
+		git ls-remote "ext::sh -c $cmd" &&
+		git archive --remote="ext::sh -c $cmd" HEAD >../actual.tar
+	) &&
+	test_cmp expect actual &&
+	test_cmp expect.tar actual.tar
+'
 
 test_expect_success 'set up fake git-daemon' '
 	mkdir remote &&
diff --git a/transport-helper.c b/transport-helper.c
index 91aed35ebb..e4fd982383 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -124,8 +124,9 @@ static struct child_process *get_helper(struct transport *transport)
 	helper->git_cmd = 0;
 	helper->silent_exec_failure = 1;
 
-	argv_array_pushf(&helper->env_array, "%s=%s", GIT_DIR_ENVIRONMENT,
-			 get_git_dir());
+	if (have_git_dir())
+		argv_array_pushf(&helper->env_array, "%s=%s",
+				 GIT_DIR_ENVIRONMENT, get_git_dir());
 
 	code = start_command(helper);
 	if (code < 0 && errno == ENOENT)
-- 
2.11.0.355.g1ede815


^ permalink raw reply related

* Re: Bug: Assertion failed: function prefix_pathspec, file pathspec.c, line 317.
From: Stefan Beller @ 2016-12-29 23:45 UTC (permalink / raw)
  To: Rafal W; +Cc: git@vger.kernel.org
In-Reply-To: <CANmdXCEGTxgj8BuQZkaBQaktOBrGmLETcQJg9xfV27Kn-0aLGA@mail.gmail.com>

On Thu, Dec 29, 2016 at 3:37 PM, Rafal W <kenorb@gmail.com> wrote:
> The following error happens when I'm running "git add ." in the submodule dir:
>

Thanks for reporting!

Please see the patch that I sent out earlier today:
https://public-inbox.org/git/20161229192908.32633-1-sbeller@google.com/

I wonder if the error message is sufficient, i.e. would you liked to have
more help reported by git?

^ permalink raw reply

* Bug: Assertion failed: function prefix_pathspec, file pathspec.c, line 317.
From: Rafal W @ 2016-12-29 23:37 UTC (permalink / raw)
  To: git@vger.kernel.org

The following error happens when I'm running "git add ." in the submodule dir:

$ GIT_TRACE=1 git add .
23:25:18.313575 git.c:350               trace: built-in: git 'add' '.'
Assertion failed: (item->nowildcard_len <= item->len && item->prefix
<= item->len), function prefix_pathspec, file pathspec.c, line 317.
Abort trap: 6

Then git crashes, so I've to remove index.lock manually
(.git/modules/Foo/index.lock).

This is what I did before it start doing this:
1. I've renamed GitHub repository to different name.
2. Created new repository in place of the old one (which consist other
submodules).
3. In previously cloned submodule dir (which was pointing to previous
repo before rename) I did: git fetch origin
4. Then I did: git reset origin/master --hard
5. 'git status' looks ok
6. Moved some of the files into submodule dir of the current submodule
and entered that dir.
7. From now on, the git add crashes with trap 6:  git add .

Probably there are some easier steps to reproduce the issue, but this
is what I did.

OS: macOS Sierra
git version 2.8.4

(lldb) bt
* thread #1: tid = 0x2338e, 0x00007fffbe6a1dda
libsystem_kernel.dylib`__pthread_kill + 10, queue =
'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fffbe6a1dda libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fffbe78c797 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fffbe607440 libsystem_c.dylib`abort + 129
    frame #3: 0x00007fffbe5ce8b3 libsystem_c.dylib`__assert_rtn + 320
    frame #4: 0x00000001000d83f9 git`parse_pathspec + 2917
    frame #5: 0x0000000100002ded git`cmd_add + 991
    frame #6: 0x0000000100001e64 git`handle_builtin + 357
    frame #7: 0x0000000100001877 git`main + 288

Other people could reproduce the same thing, also on Mac, see:
http://stackoverflow.com/q/23205961

Kind regards,
kenorb

^ permalink raw reply

* Re: [PATCH] am: add am.signoff add config variable
From: Stefan Beller @ 2016-12-29 23:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, Eduardo Habkost, git@vger.kernel.org, Paul Tan
In-Reply-To: <xmqqpoka5pb0.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 29, 2016 at 2:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> IANAL either, but we have been striving to keep output of
>>>
>>>    $ git grep '\.signoff' Documentation
>>
>>>
>>> empty to keep Sign-off meaningful.
>>
>> Try again with -i ;)
>> and you'll find format.signOff
>
> Mistakes happen.  Finding an old mistake is not an excuse for you to
> make the same one again.
> It is an opportunity to come up with a way
> to correct it without hurting existing users by designing a smooth
> transition path.
>

knee jerk reaction:

1) Document why format.signoff is bad (even after a long
  office discussion I am not fully convinced it is bad. That
  may be because I am biased as I find format.signoff *very*
  useful. The cumbersome contribution process as laid out
  by SubmittingPatches just got easier for me as I have one
  step less to worry about. I haven't made a mistake so far
  sending out crap where I'll throw a temper tantrum if you
  apply it.

  So I would expect a maintainer of a project that uses
  email based workflow to write that documentation giving
  reasons. That person is currently consuming the automatically
  signed off patches, so I'd want to know their line of thinking.

2) If the config option is set, but no explicit sign off is given,
  put a different footer, e.g.
    git config format.signoff true &&
    git format-patch HEAD^
  may produce Auto-Signed-Off-By: ...
  whereas
    git -c format.signoff format-patch
  behaves the same as
    git format-patch --signoff
  that gives the Signed-Off-By as we know it.

  It is up to the upstream project to accept these new sign offs.

3) (later) warn about the option if it is set, giving the text from 1)

4) (a long time later) remove the option.

--
For 2) I am not sure what we want there, because this
has to happen in collaboration with all the upstream projects that
use sign offs.

We could be subtle, i.e. just use all lowercase / all uppercase
letters for this differentiation. Then automated tools that check for signoff
are easily adjusted. e.g. The eclipse foundation disallows pushing for
review if any patch is missing a signoff; Gerrit can check for that.
Gerrit is not case sensitive when checking for a footer.

I am not sure if there are any other tools out there that automatically
check for that, but I would assume they are also case insensitive in
such a case, as it is unclear to me how to properly capitalize the sign off.

This is an easy way forward for upstream projects., though confusing in
court later on.
--
We could also be non-subtle, very explicit, and each tool that
can add sign offs currently, needs to be explicit about itself:

    Configured-Formatpatch-Signed-Off: (for git format.signoff with config)
    # and others:
    Explicit-Formatpatch-Signed-Off:
    Git-Gui-Button-Clicked-Signed-Off:
    Git-Gui-Button-Shortcut-Signed-Off:

Once we have that we could add much more of these:
    Configured-Commit-Signed-Off:
    etc.

You can continue to sign off via just typing it, or by
    I-typed-it-signed-Off,
--
One of the problems highlighted to me was that you could have accidentally
configured format.signoff globally, but you're only allowed/desire to sign off
in a particular repository, such that

    Repolocal-Configured-Formatpatch-Signed-Off:
    Global-Configured-Formatpatch-Signed-Off:

may be worth discussing.
--

^ permalink raw reply

* Re: [PATCH] am: add am.signoff add config variable
From: Junio C Hamano @ 2016-12-29 22:43 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Eric Wong, Eduardo Habkost, git@vger.kernel.org, Paul Tan
In-Reply-To: <CAGZ79kaAbCTsY_SddVMKMsLV0xyXNBFvxQ=J-20Cwdz31v4OwA@mail.gmail.com>

Stefan Beller <sbeller@google.com> writes:

>> IANAL either, but we have been striving to keep output of
>>
>>    $ git grep '\.signoff' Documentation
>
>>
>> empty to keep Sign-off meaningful.
>
> Try again with -i ;)
> and you'll find format.signOff

Mistakes happen.  Finding an old mistake is not an excuse for you to
make the same one again.  It is an opportunity to come up with a way
to correct it without hurting existing users by designing a smooth
transition path.


^ permalink raw reply

* Re: [PATCH] am: add am.signoff add config variable
From: Stefan Beller @ 2016-12-29 22:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, Eduardo Habkost, git@vger.kernel.org, Paul Tan
In-Reply-To: <xmqqtw9m5s5m.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 29, 2016 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <e@80x24.org> writes:
>
>> Eduardo Habkost <ehabkost@redhat.com> wrote:
>>> git-am has options to enable --message-id and --3way by default,
>>> but no option to enable --signoff by default. Add a "am.signoff"
>>> config option.
>>
>> I'm not sure this is a good idea.  IANAL, but a sign-off
>> has some sort of legal meaning for this project (DCO)
>> and that would be better decided on a patch-by-patch basis
>> rather than a blanket statement.
>
> IANAL either, but we have been striving to keep output of
>
>    $ git grep '\.signoff' Documentation

Try again with -i ;)
and you'll find format.signOff

>
> empty to keep Sign-off meaningful.
>
> Adding more publicized ways to add SoB without thinking will make it
> harder to argue against one who tells the court "that log message
> ends with a SoB by person X but it is very plausible that it was
> done by inertia without person X really intending to certify what
> DCO says, and the SoB is meaningless".

I think we should be symmetrical, am is the opposite of
format-patch in the Git <-> email conversion.

If I were to follow your arguments here, we should revert
1d1876e930 (Add configuration variable for sign-off to format-patch)

On the other hand, I would argue that thinking and typing things is
orthogonal (the more you type, doesn't imply that you think harder
or even at all).

--
However I think there is a use case for such an option
(in the short term?) as e.g. you as the Git maintainer being employed
has the legal rights to sign off on pretty much any patch in Git.
For other projects this may be different.

Which is why a per-repository configurable thing is useful to
setup a default for your environment.

IIUC long term we rather want to have easily configurable trailers
for am/format-patch/commit, such that you could configure to
remove any "ChangeId:" footers before sending out, or adding
a "Tested-by" footer by a CI system or such?

>
>> I don't add my SoB to patches (either my own or received) until
>> I'm comfortable with it;

comfortable is orthogonal to legal, specifically on the receiving side.
I can understand using format-patch to send out unsigned patches
(e.g. for heavy WIP things, "please to not apply literally")

>> and I'd rather err on the side of
>> forgetting and being prodded to resubmit rather than putting
>> an SoB on the wrong patch.
>

^ permalink raw reply

* Re: [PATCH] am: add am.signoff add config variable
From: Eduardo Habkost @ 2016-12-29 22:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, git, Paul Tan
In-Reply-To: <xmqqtw9m5s5m.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 29, 2016 at 01:42:13PM -0800, Junio C Hamano wrote:
> Eric Wong <e@80x24.org> writes:
> 
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> >> git-am has options to enable --message-id and --3way by default,
> >> but no option to enable --signoff by default. Add a "am.signoff"
> >> config option.
> >
> > I'm not sure this is a good idea.  IANAL, but a sign-off
> > has some sort of legal meaning for this project (DCO)
> > and that would be better decided on a patch-by-patch basis
> > rather than a blanket statement.
> 
> IANAL either, but we have been striving to keep output of
> 
>    $ git grep '\.signoff' Documentation
> 
> empty to keep Sign-off meaningful. 
> 
> Adding more publicized ways to add SoB without thinking will make it
> harder to argue against one who tells the court "that log message
> ends with a SoB by person X but it is very plausible that it was
> done by inertia without person X really intending to certify what
> DCO says, and the SoB is meaningless".

This sounds completely reasonable to me. I now see that the
config option was already proposed in 2011 and the same arguments
were discussed. Sorry for the noise.

> 
> > I don't add my SoB to patches (either my own or received) until
> > I'm comfortable with it; and I'd rather err on the side of
> > forgetting and being prodded to resubmit rather than putting
> > an SoB on the wrong patch.
> 

-- 
Eduardo

^ permalink raw reply

* Re: [PATCH] am: add am.signoff add config variable
From: Junio C Hamano @ 2016-12-29 21:42 UTC (permalink / raw)
  To: Eric Wong; +Cc: Eduardo Habkost, git, Paul Tan
In-Reply-To: <20161229084701.GA3643@starla>

Eric Wong <e@80x24.org> writes:

> Eduardo Habkost <ehabkost@redhat.com> wrote:
>> git-am has options to enable --message-id and --3way by default,
>> but no option to enable --signoff by default. Add a "am.signoff"
>> config option.
>
> I'm not sure this is a good idea.  IANAL, but a sign-off
> has some sort of legal meaning for this project (DCO)
> and that would be better decided on a patch-by-patch basis
> rather than a blanket statement.

IANAL either, but we have been striving to keep output of

   $ git grep '\.signoff' Documentation

empty to keep Sign-off meaningful. 

Adding more publicized ways to add SoB without thinking will make it
harder to argue against one who tells the court "that log message
ends with a SoB by person X but it is very plausible that it was
done by inertia without person X really intending to certify what
DCO says, and the SoB is meaningless".

> I don't add my SoB to patches (either my own or received) until
> I'm comfortable with it; and I'd rather err on the side of
> forgetting and being prodded to resubmit rather than putting
> an SoB on the wrong patch.


^ permalink raw reply

* [PATCHv2] unpack-trees: move checkout state into check_updates
From: Stefan Beller @ 2016-12-29 19:43 UTC (permalink / raw)
  To: gitster; +Cc: git, l.s.r, Stefan Beller

The checkout state was introduced via 16da134b1f9
(read-trees: refactor the unpack_trees() part, 2006-07-30). An attempt to
refactor the checkout state was done in b56aa5b268e (unpack-trees: pass
checkout state explicitly to check_updates(), 2016-09-13), but we can
go even further.

The `struct checkout state` is not used in unpack_trees apart from
initializing it, so move it into the function that makes use of it,
which is `check_updates`.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Rene wrote:
> we could use "index" instead of "&o->result".  Not sure if it's worth a patch, though.

done in this iteration of the patch.
I also reordered the variables at the beginning of the function for readability.

Thanks,
Stefan

 unpack-trees.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 7a6df99d10..eb44f50e65 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -218,14 +218,18 @@ static void unlink_entry(const struct cache_entry *ce)
 	schedule_dir_for_removal(ce->name, ce_namelen(ce));
 }
 
-static int check_updates(struct unpack_trees_options *o,
-			 const struct checkout *state)
+static int check_updates(struct unpack_trees_options *o)
 {
 	unsigned cnt = 0, total = 0;
 	struct progress *progress = NULL;
 	struct index_state *index = &o->result;
-	int i;
-	int errs = 0;
+	struct checkout state = CHECKOUT_INIT;
+	int i, errs = 0;
+
+	state.force = 1;
+	state.quiet = 1;
+	state.refresh_cache = 1;
+	state.istate = index;
 
 	if (o->update && o->verbose_update) {
 		for (total = cnt = 0; cnt < index->cache_nr; cnt++) {
@@ -240,7 +244,7 @@ static int check_updates(struct unpack_trees_options *o,
 	}
 
 	if (o->update)
-		git_attr_set_direction(GIT_ATTR_CHECKOUT, &o->result);
+		git_attr_set_direction(GIT_ATTR_CHECKOUT, index);
 	for (i = 0; i < index->cache_nr; i++) {
 		const struct cache_entry *ce = index->cache[i];
 
@@ -251,7 +255,7 @@ static int check_updates(struct unpack_trees_options *o,
 			continue;
 		}
 	}
-	remove_marked_cache_entries(&o->result);
+	remove_marked_cache_entries(index);
 	remove_scheduled_dirs();
 
 	for (i = 0; i < index->cache_nr; i++) {
@@ -264,7 +268,7 @@ static int check_updates(struct unpack_trees_options *o,
 			display_progress(progress, ++cnt);
 			ce->ce_flags &= ~CE_UPDATE;
 			if (o->update && !o->dry_run) {
-				errs |= checkout_entry(ce, state, NULL);
+				errs |= checkout_entry(ce, &state, NULL);
 			}
 		}
 	}
@@ -1094,14 +1098,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	int i, ret;
 	static struct cache_entry *dfc;
 	struct exclude_list el;
-	struct checkout state = CHECKOUT_INIT;
 
 	if (len > MAX_UNPACK_TREES)
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-	state.force = 1;
-	state.quiet = 1;
-	state.refresh_cache = 1;
-	state.istate = &o->result;
 
 	memset(&el, 0, sizeof(el));
 	if (!core_apply_sparse_checkout || !o->update)
@@ -1238,7 +1237,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	}
 
 	o->src_index = NULL;
-	ret = check_updates(o, &state) ? (-2) : 0;
+	ret = check_updates(o) ? (-2) : 0;
 	if (o->dst_index) {
 		if (!ret) {
 			if (!o->result.cache_tree)
-- 
2.11.0.259.ga95e92af08.dirty


^ permalink raw reply related

* [PATCH] pathspec: give better message for submodule related pathspec error
From: Stefan Beller @ 2016-12-29 19:29 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, peff, Stefan Beller

Every once in a while someone complains to the mailing list to have
run into this weird assertion[1].

The usual response from the mailing list is link to old discussions[2],
and acknowledging the problem stating it is known.

For now just improve the user visible error message.

[1] https://www.google.com/search?q=item-%3Enowildcard_len
[2] http://git.661346.n2.nabble.com/assert-failed-in-submodule-edge-case-td7628687.html
    https://www.spinics.net/lists/git/msg249473.html

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 * a more defensive check and message
 * with tests!
 
 pathspec.c                       | 19 +++++++++++++++++--
 t/t6134-pathspec-in-submodule.sh | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 2 deletions(-)
 create mode 100755 t/t6134-pathspec-in-submodule.sh

diff --git a/pathspec.c b/pathspec.c
index 22ca74a126..b446d79615 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -313,8 +313,23 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
 	}
 
 	/* sanity checks, pathspec matchers assume these are sane */
-	assert(item->nowildcard_len <= item->len &&
-	       item->prefix         <= item->len);
+	if (item->nowildcard_len > item->len ||
+	    item->prefix         > item->len) {
+		/* Historically this always was a submodule issue */
+		for (i = 0; i < active_nr; i++) {
+			struct cache_entry *ce = active_cache[i];
+			int ce_len = ce_namelen(ce);
+			int len = ce_len < item->len ? ce_len : item->len;
+			if (!S_ISGITLINK(ce->ce_mode))
+				continue;
+			if (!memcmp(ce->name, item->match, len))
+				die (_("Pathspec '%s' is in submodule '%.*s'"),
+					item->original, ce_len, ce->name);
+		}
+		/* The error is a new unknown bug */
+		die ("BUG: item->nowildcard_len > item->len || item->prefix > item->len)");
+	}
+
 	return magic;
 }
 
diff --git a/t/t6134-pathspec-in-submodule.sh b/t/t6134-pathspec-in-submodule.sh
new file mode 100755
index 0000000000..e62dbb7327
--- /dev/null
+++ b/t/t6134-pathspec-in-submodule.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description='test case exclude pathspec'
+
+. ./test-lib.sh
+
+test_expect_success 'setup a submodule' '
+	test_commit 1 &&
+	git submodule add ./ sub &&
+	git commit -a -m "add submodule" &&
+	git submodule deinit --all
+'
+
+cat <<EOF >expect
+fatal: Pathspec 'sub/a' is in submodule 'sub'
+EOF
+
+test_expect_success 'error message for path inside submodule' '
+	echo a >sub/a &&
+	test_must_fail git add sub/a 2>actual &&
+	test_cmp expect actual
+'
+
+cat <<EOF >expect
+fatal: Pathspec '.' is in submodule 'sub'
+EOF
+
+test_expect_success 'error message for path inside submodule from within submodule' '
+	test_must_fail git -C sub add . 2>actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.11.0.259.ga95e92af08.dirty


^ permalink raw reply related

* Re: [PATCH] am: add am.signoff add config variable
From: Jacob Keller @ 2016-12-29 17:49 UTC (permalink / raw)
  To: Eric Wong, Eduardo Habkost; +Cc: git, Paul Tan
In-Reply-To: <20161229084701.GA3643@starla>

On December 29, 2016 12:47:01 AM PST, Eric Wong <e@80x24.org> wrote:
>Eduardo Habkost <ehabkost@redhat.com> wrote:
>> git-am has options to enable --message-id and --3way by default,
>> but no option to enable --signoff by default. Add a "am.signoff"
>> config option.
>
>I'm not sure this is a good idea.  IANAL, but a sign-off
>has some sort of legal meaning for this project (DCO)
>and that would be better decided on a patch-by-patch basis
>rather than a blanket statement.
>
>I don't add my SoB to patches (either my own or received) until
>I'm comfortable with it; and I'd rather err on the side of
>forgetting and being prodded to resubmit rather than putting
>an SoB on the wrong patch.
>
>
>(I'm barely online today, no rush needed in responding)

I don't know what is true for all projects, but I can't believe making it configurable would cause problems.... If you don't want it you don't have to enable it.

I suppose your argument is that we shouldn't ever make it automatically but... I know that many people who receive email patches, apply them, then forward those to another group add their sign off as a mark of "yes I certify that this is legally OK" so I think this would be useful.

Thanks
Jake



^ permalink raw reply

* Re: [PATCH v3] am: add am.signoff add config variable
From: Eduardo Habkost @ 2016-12-29 15:49 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: git, Paul Tan, Stefan Beller, Eric Wong
In-Reply-To: <m2wpejb1zn.fsf@linux-m68k.org>

On Thu, Dec 29, 2016 at 08:58:36AM +0100, Andreas Schwab wrote:
> On Dez 28 2016, Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > @@ -32,10 +32,12 @@ OPTIONS
> >  	If you supply directories, they will be treated as Maildirs.
> >  
> >  -s::
> > ---signoff::
> > +--[no-]-signoff::
> 
> That's one dash too much.

Oops. I can fix it in v4, but I will first wait to see what
others think about the legal implications of setting it by
default (see point raised by Eric Wong on v1).

-- 
Eduardo

^ permalink raw reply

* Re: [PATCH] am: add am.signoff add config variable
From: Eduardo Habkost @ 2016-12-29 15:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, Paul Tan
In-Reply-To: <20161229084701.GA3643@starla>

On Thu, Dec 29, 2016 at 08:47:01AM +0000, Eric Wong wrote:
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> > git-am has options to enable --message-id and --3way by default,
> > but no option to enable --signoff by default. Add a "am.signoff"
> > config option.
> 
> I'm not sure this is a good idea.  IANAL, but a sign-off
> has some sort of legal meaning for this project (DCO)
> and that would be better decided on a patch-by-patch basis
> rather than a blanket statement.
> 
> I don't add my SoB to patches (either my own or received) until
> I'm comfortable with it; and I'd rather err on the side of
> forgetting and being prodded to resubmit rather than putting
> an SoB on the wrong patch.

This is an interesting point. I am not competent to argue about
it, so I will let the Git developers decide.

-- 
Eduardo

^ permalink raw reply

* Re: [PATCH v2] am: add am.signoff add config variable
From: Eduardo Habkost @ 2016-12-29 15:37 UTC (permalink / raw)
  To: Pranit Bauva; +Cc: Stefan Beller, git@vger.kernel.org, Paul Tan
In-Reply-To: <CAFZEwPOPMrCXTc+SMhjGSnPKLmefcde4MgJsz7n5rBApACZOug@mail.gmail.com>

On Thu, Dec 29, 2016 at 01:29:33PM +0530, Pranit Bauva wrote:
> Hey Eduardo,
> 
> On Thu, Dec 29, 2016 at 12:49 AM, Eduardo Habkost <ehabkost@redhat.com> wrote:
> >> test_expect_success '--no-signoff overrides am.signoff' '
> >>       rm -fr .git/rebase-apply &&
> >>       git reset --hard first &&
> >>       test_config am.signoff true &&
> >>       git am --no-signoff <patch2 &&
> >>       printf "%s\n" "$signoff" >expected &&
> >>       git cat-file commit HEAD^ | grep "Signed-off-by:" >actual &&
> >>       test_cmp expected actual &&
> >>       git cat-file commit HEAD | grep "Signed-off-by:" >actual &&
> >>       test_must_be_empty actual
> >> '
> >>
> >> The test fails because the second "grep" command returns a
> >> non-zero exit code. Any suggestions to avoid that problem in a
> >> more idiomatic way?
> >
> > I just found out that "test_must_fail grep ..." is a common
> > idiom, so what about:
> 
> Is there any particular reason to use "grep" instead of "test_cmp"? To
> check for non-zero error code, you can always use "! test_cmp".

The test code is checking only the "Signed-off-by" lines, not the
whole commit message. "test_cmp" would require recovering the
entire contents of the original commit message, which would add
complexity to the test code.

-- 
Eduardo

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox