Git development

Git development
 help / color / mirror / Atom feed

* [PATCH v4 02/23] refname_is_safe(): correct docstring
From: Michael Haggerty @ 2017-01-06 16:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, David Turner, Jacob Keller, Philip Oakley,
	Michael Haggerty
In-Reply-To: <cover.1483719289.git.mhagger@alum.mit.edu>

The behavior of refname_is_safe() was changed in

    e40f355 "refname_is_safe(): insist that the refname already be normalized", 2016-04-27

without a corresponding update to its docstring. The function is in fact
stricter than documented, because it now insists that the result of
normalizing the part of a refname following "refs/" is identical to that
part of the original refname. Fix the docstring.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/refs-internal.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 708b260..dc81acc 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -62,11 +62,12 @@
  * This function does not check that the reference name is legal; for
  * that, use check_refname_format().
  *
- * We consider a refname that starts with "refs/" to be safe as long
- * as any ".." components that it might contain do not escape "refs/".
- * Names that do not start with "refs/" are considered safe iff they
- * consist entirely of upper case characters and '_' (like "HEAD" and
- * "MERGE_HEAD" but not "config" or "FOO/BAR").
+ * A refname that starts with "refs/" is considered safe iff it
+ * doesn't contain any "." or ".." components or consecutive '/'
+ * characters, end with '/', or (on Windows) contain any '\'
+ * characters. Names that do not start with "refs/" are considered
+ * safe iff they consist entirely of upper case characters and '_'
+ * (like "HEAD" and "MERGE_HEAD" but not "config" or "FOO/BAR").
  */
 int refname_is_safe(const char *refname);
 
-- 
2.9.3


^ permalink raw reply related

* [PATCH v4 06/23] raceproof_create_file(): new function
From: Michael Haggerty @ 2017-01-06 16:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, David Turner, Jacob Keller, Philip Oakley,
	Michael Haggerty
In-Reply-To: <cover.1483719289.git.mhagger@alum.mit.edu>

Add a function that tries to create a file and any containing
directories in a way that is robust against races with other processes
that might be cleaning up empty directories at the same time.

The actual file creation is done by a callback function, which, if it
fails, should set errno to EISDIR or ENOENT according to the convention
of open(). raceproof_create_file() detects such failures, and
respectively either tries to delete empty directories that might be in
the way of the file or tries to create the containing directories. Then
it retries the callback function.

This function is not yet used.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 cache.h     | 43 ++++++++++++++++++++++++++++++++++++++
 sha1_file.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+)

diff --git a/cache.h b/cache.h
index 8177c3a..95cde14 100644
--- a/cache.h
+++ b/cache.h
@@ -1057,6 +1057,49 @@ enum scld_error {
 enum scld_error safe_create_leading_directories(char *path);
 enum scld_error safe_create_leading_directories_const(const char *path);
 
+/*
+ * Callback function for raceproof_create_file(). This function is
+ * expected to do something that makes dirname(path) permanent despite
+ * the fact that other processes might be cleaning up empty
+ * directories at the same time. Usually it will create a file named
+ * path, but alternatively it could create another file in that
+ * directory, or even chdir() into that directory. The function should
+ * return 0 if the action was completed successfully. On error, it
+ * should return a nonzero result and set errno.
+ * raceproof_create_file() treats two errno values specially:
+ *
+ * - ENOENT -- dirname(path) does not exist. In this case,
+ *             raceproof_create_file() tries creating dirname(path)
+ *             (and any parent directories, if necessary) and calls
+ *             the function again.
+ *
+ * - EISDIR -- the file already exists and is a directory. In this
+ *             case, raceproof_create_file() removes the directory if
+ *             it is empty (and recursively any empty directories that
+ *             it contains) and calls the function again.
+ *
+ * Any other errno causes raceproof_create_file() to fail with the
+ * callback's return value and errno.
+ *
+ * Obviously, this function should be OK with being called again if it
+ * fails with ENOENT or EISDIR. In other scenarios it will not be
+ * called again.
+ */
+typedef int create_file_fn(const char *path, void *cb);
+
+/*
+ * Create a file in dirname(path) by calling fn, creating leading
+ * directories if necessary. Retry a few times in case we are racing
+ * with another process that is trying to clean up the directory that
+ * contains path. See the documentation for create_file_fn for more
+ * details.
+ *
+ * Return the value and set the errno that resulted from the most
+ * recent call of fn. fn is always called at least once, and will be
+ * called more than once if it returns ENOENT or EISDIR.
+ */
+int raceproof_create_file(const char *path, create_file_fn fn, void *cb);
+
 int mkdir_in_gitdir(const char *path);
 extern char *expand_user_path(const char *path);
 const char *enter_repo(const char *path, int strict);
diff --git a/sha1_file.c b/sha1_file.c
index ae8f0b4..b08f54c 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -179,6 +179,74 @@ enum scld_error safe_create_leading_directories_const(const char *path)
 	return result;
 }
 
+int raceproof_create_file(const char *path, create_file_fn fn, void *cb)
+{
+	/*
+	 * The number of times we will try to remove empty directories
+	 * in the way of path. This is only 1 because if another
+	 * process is racily creating directories that conflict with
+	 * us, we don't want to fight against them.
+	 */
+	int remove_directories_remaining = 1;
+
+	/*
+	 * The number of times that we will try to create the
+	 * directories containing path. We are willing to attempt this
+	 * more than once, because another process could be trying to
+	 * clean up empty directories at the same time as we are
+	 * trying to create them.
+	 */
+	int create_directories_remaining = 3;
+
+	/* A scratch copy of path, filled lazily if we need it: */
+	struct strbuf path_copy = STRBUF_INIT;
+
+	int ret, save_errno;
+
+	/* Sanity check: */
+	assert(*path);
+
+retry_fn:
+	ret = fn(path, cb);
+	save_errno = errno;
+	if (!ret)
+		goto out;
+
+	if (errno == EISDIR && remove_directories_remaining-- > 0) {
+		/*
+		 * A directory is in the way. Maybe it is empty; try
+		 * to remove it:
+		 */
+		if (!path_copy.len)
+			strbuf_addstr(&path_copy, path);
+
+		if (!remove_dir_recursively(&path_copy, REMOVE_DIR_EMPTY_ONLY))
+			goto retry_fn;
+	} else if (errno == ENOENT && create_directories_remaining-- > 0) {
+		/*
+		 * Maybe the containing directory didn't exist, or
+		 * maybe it was just deleted by a process that is
+		 * racing with us to clean up empty directories. Try
+		 * to create it:
+		 */
+		enum scld_error scld_result;
+
+		if (!path_copy.len)
+			strbuf_addstr(&path_copy, path);
+
+		do {
+			scld_result = safe_create_leading_directories(path_copy.buf);
+			if (scld_result == SCLD_OK)
+				goto retry_fn;
+		} while (scld_result == SCLD_VANISHED && create_directories_remaining-- > 0);
+	}
+
+out:
+	strbuf_release(&path_copy);
+	errno = save_errno;
+	return ret;
+}
+
 static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1)
 {
 	int i;
-- 
2.9.3


^ permalink raw reply related

* [PATCH v4 03/23] t5505: use "for-each-ref" to test for the non-existence of references
From: Michael Haggerty @ 2017-01-06 16:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, David Turner, Jacob Keller, Philip Oakley,
	Michael Haggerty
In-Reply-To: <cover.1483719289.git.mhagger@alum.mit.edu>

Instead of looking on the filesystem inside ".git/refs/remotes/origin",
use "git for-each-ref" to check for leftover references under the
remote's old name.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 t/t5505-remote.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index 8198d8e..65030fb 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -725,7 +725,7 @@ test_expect_success 'rename a remote' '
 	(
 		cd four &&
 		git remote rename origin upstream &&
-		rmdir .git/refs/remotes/origin &&
+		test -z "$(git for-each-ref refs/remotes/origin)" &&
 		test "$(git symbolic-ref refs/remotes/upstream/HEAD)" = "refs/remotes/upstream/master" &&
 		test "$(git rev-parse upstream/master)" = "$(git rev-parse master)" &&
 		test "$(git config remote.upstream.fetch)" = "+refs/heads/*:refs/remotes/upstream/*" &&
-- 
2.9.3


^ permalink raw reply related

* [PATCH v4 01/23] files_rename_ref(): tidy up whitespace
From: Michael Haggerty @ 2017-01-06 16:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, David Turner, Jacob Keller, Philip Oakley,
	Michael Haggerty
In-Reply-To: <cover.1483719289.git.mhagger@alum.mit.edu>

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f902393..be4ffdc 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2631,7 +2631,7 @@ static int files_rename_ref(struct ref_store *ref_store,
 	if (!read_ref_full(newrefname, RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
 			   sha1, NULL) &&
 	    delete_ref(newrefname, NULL, REF_NODEREF)) {
-		if (errno==EISDIR) {
+		if (errno == EISDIR) {
 			struct strbuf path = STRBUF_INIT;
 			int result;
 
-- 
2.9.3


^ permalink raw reply related

* [PATCH v4 00/23] Delete directories left empty after ref deletion
From: Michael Haggerty @ 2017-01-06 16:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, David Turner, Jacob Keller, Philip Oakley,
	Michael Haggerty

This is v4 of this patch series. Thanks to Peff, Junio, Jake, and
Philip for their feedback about v3. I believe I have addressed all of
the comments about v1 [1], v2 [2], and v3 [3].

Changes since v3:

* Insert a new first patch correcting the docstring of
  `refname_is_safe()`.

* In "raceproof_create_file(): new function", tweak a comment for
  clarity as suggested by Peff.

* In "log_ref_setup(): separate code for create vs non-create", add a
  semicolon to an otherwise empty code block as requested by Junio.

* Drop the patch "try_remove_empty_parents(): don't accommodate
  consecutive slashes", for reasons discussed on the ML [4].

Michael

[1] http://public-inbox.org/git/cover.1455626201.git.mhagger@alum.mit.edu/T/#u
[2] http://public-inbox.org/git/cover.1456405698.git.mhagger@alum.mit.edu/T/#u
[3] http://public-inbox.org/git/cover.1483153436.git.mhagger@alum.mit.edu/T/#u
[4] http://public-inbox.org/git/5051c78e-51f9-becd-e1a6-9c0b781d6912@alum.mit.edu/
and surrounding thread.

Michael Haggerty (23):
  files_rename_ref(): tidy up whitespace
  refname_is_safe(): correct docstring
  t5505: use "for-each-ref" to test for the non-existence of references
  safe_create_leading_directories_const(): preserve errno
  safe_create_leading_directories(): set errno on SCLD_EXISTS
  raceproof_create_file(): new function
  lock_ref_sha1_basic(): inline constant
  lock_ref_sha1_basic(): use raceproof_create_file()
  rename_tmp_log(): use raceproof_create_file()
  rename_tmp_log(): improve error reporting
  log_ref_write(): inline function
  log_ref_setup(): separate code for create vs non-create
  log_ref_setup(): improve robustness against races
  log_ref_setup(): pass the open file descriptor back to the caller
  log_ref_write_1(): don't depend on logfile argument
  log_ref_setup(): manage the name of the reflog file internally
  log_ref_write_1(): inline function
  delete_ref_loose(): derive loose reference path from lock
  delete_ref_loose(): inline function
  try_remove_empty_parents(): rename parameter "name" -> "refname"
  try_remove_empty_parents(): don't trash argument contents
  try_remove_empty_parents(): teach to remove parents of reflogs, too
  files_transaction_commit(): clean up empty directories

 cache.h               |  48 ++++++-
 refs/files-backend.c  | 371 +++++++++++++++++++++++++-------------------------
 refs/refs-internal.h  |  22 ++-
 sha1_file.c           |  76 ++++++++++-
 t/t1400-update-ref.sh |  27 ++++
 t/t5505-remote.sh     |   2 +-
 6 files changed, 351 insertions(+), 195 deletions(-)

-- 
2.9.3


^ permalink raw reply

* [ANNOUNCE] git-test: run automated tests against a range of Git commits
From: Michael Haggerty @ 2017-01-06 15:52 UTC (permalink / raw)
  To: git discussion list

I just released ⁠⁠⁠⁠`git test⁠⁠⁠⁠`, a script for running automated
tests across a range of Git commits and keeping track of the results in
git notes:

    https://github.com/mhagger/git-test

This is a script that I've been using in one form or another for years
and I find it really handy [1].

`git-test` is meant for people who want *all* of their commits (not just
the branch tip) to pass their automated tests.

Tl;dr: How to use `git test`

1.  Define the test you want to run. The string can be any shell
    command:

        git test add "make -j8 && make -j16 test"

2.  Create a separate linked worktree in which to run your tests:

        git worktree add --detach ../test HEAD

3.  Create a terminal window and `cd` to the directory containing
    the testing worktree, and run the test against all of the commits
    on your branch:

        cd ../test
        git test range master..mybranch

    If any of the commits are broken, `git tree` will display the
    error then stop with that commit checked out.

4.  As you work, whenever you want to test new commits, go to the
    testing terminal window and run the same command again:

        git test range master..mybranch

    `git test` is smart enough to remember which commits (actually,
    trees) have already been tested and only run the test against
    commits that have been changed or added. And since the tests are
    run in a different worktree, you can continue working in your
    main working directory while the tests run.

It is also possible to define more than one test suite in a given
repository, retry tests, etc. Type `git test help` or read the `README`
file [2] for more information.

`git test` stores the test results in git notes (under
`refs/notes/test/<name>`), linked to the commit's tree SHA-1. This means
that test results remain valid even across some kinds of commit
rewriting, like changes to commit metadata or squashing adjacent
commits, and a subset of results even remains valid if a commit is split
or if some commits earlier in a patch series are reordered.

I don't plan to turn this into a gigantic project or anything, but I
find this script really useful so I wanted to put it out in the world.
Feedback and/or pull requests welcome!

Michael

[1] The name sucks, I know :-/
[2] https://github.com/mhagger/git-test/blob/master/README.md

^ permalink raw reply

* Re: Regression: Ctrl-c from the pager in an alias exits it
From: Johannes Sixt @ 2017-01-06 14:39 UTC (permalink / raw)
  To: Jeff King; +Cc: Trygve Aaberge, Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20170106073224.5hsrib77tx5tgx7d@sigill.intra.peff.net>

Am 06.01.2017 um 08:32 schrieb Jeff King:
> On Fri, Jan 06, 2017 at 02:26:02AM -0500, Jeff King wrote:
>
>> You'll notice that it actually calls wait() on the pager. That's due to
>> a3da882120 (pager: do wait_for_pager on signal death, 2009-01-22), which
>> IIRC was addressing a very similar problem. We want to stop feeding the
>> pager when we get a signal, but we don't want the main process to
>> actually exit, or the pager loses the controlling terminal.
>>
>> In our new scenario we have an extra process, though. The git-log child
>> will wait on the pager, but the parent process can't. It doesn't know
>> about it. I think that it in turn needs to wait on the child when it
>> dies, and then the whole chain will stand still until the pager exits.
>
> And here's a patch to do that. It seems to work.
>
> I'll sleep on it and then write up a commit message tomorrow if it still
> makes sense.
>
> diff --git a/run-command.c b/run-command.c
> index ca905a9e80..db47c429b7 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -29,6 +29,8 @@ static int installed_child_cleanup_handler;
>
>  static void cleanup_children(int sig, int in_signal)
>  {
> +	struct child_to_clean *children_to_wait_for = NULL;
> +
>  	while (children_to_clean) {
>  		struct child_to_clean *p = children_to_clean;
>  		children_to_clean = p->next;
> @@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal)
>  		}
>
>  		kill(p->pid, sig);
> +		p->next = children_to_wait_for;
> +		children_to_wait_for = p;
> +	}
> +
> +	while (children_to_wait_for) {
> +		struct child_to_clean *p = children_to_wait_for;
> +		children_to_wait_for = p->next;
> +
> +		while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR)
> +			; /* spin waiting for process exit or error */
> +
>  		if (!in_signal)
>  			free(p);
>  	}
>

This looks like the minimal change necessary. I wonder, though, whether 
the new local variable is really required. Wouldn't it be sufficient to 
walk the children_to_clean chain twice?

-- Hannes


^ permalink raw reply

* Re: Encoding issue in git gui
From: Johannes Schindelin @ 2017-01-06 13:56 UTC (permalink / raw)
  To: Anton Tsyganenko; +Cc: Pat Thoyts, git
In-Reply-To: <8dc95372-979c-e8e5-5df8-1f4025ab0db5@yandex.ru>

[-- Attachment #1: Type: text/plain, Size: 1088 bytes --]

Hi,

On Fri, 6 Jan 2017, Anton Tsyganenko wrote:

> When I run a tool from tools menu in the standard git gui, the output
> seems to have wrong encoding. For example, add a tool called "encoding
> test" that runs command "echo Проверка", then run it. In the output
> window "Ð?Ñ?Ð¾Ð²ÐµÑ?ÐºÐ°" in printed instead of "Проверка".

I can confirm that this happens on Linux:

$ uname -srv
Linux 4.4.0-53-generic #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016

$ git gui version
git-gui version 0.21.0

$ dpkg -l tcl tk
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  tcl            8.6.0+9      amd64        Tool Command Language (default ve
ii  tk             8.6.0+9      amd64        Toolkit for Tcl and X11 (default

Cc'ing the Git GUI maintainer.

Ciao,
Johannes

^ permalink raw reply

* Re: Squashing commits with git rebase -i may miscount commits in commit message
From: Johannes Schindelin @ 2017-01-06 13:46 UTC (permalink / raw)
  To: Brandon Tolsch; +Cc: git
In-Reply-To: <CAMWRQeRaQQQcJ-R8eHc7f0KqZF2eEkYJOyTb9n7ds78pTqV-AA@mail.gmail.com>

Hi,

On Fri, 6 Jan 2017, Brandon Tolsch wrote:

> git --version: 2.11.0
> 
> When using git rebase -i to squash a series of commits that includes
> more than 10 commits, the generated commit message you are given to
> edit counts the old messages incorrectly.  It will say the total
> number of commits is (actual % 10) (if they were 0-based) and it will
> also count the commits as (actual % 10).
> 
> For example, 15, 25, 35, etc. commits:
> # This is a combination of 5 commits.
> # This is the 1st commit message:
> msg
> 
> # This is the commit message #2:
> ...
> ...
> # This is commit message #10:
> msg
> 
> # This is commit message #1:
> ...
> 
> # This is commit message #5:
> msg
> 
> While not a big issue, it did make me double check what I was doing
> when I saw "a combination of 10 commits" instead of 20 in the commit
> message.

Just for the record: I verified that the rebase--helper based interactive
rebase (which is already in Git for Windows since v2.10.0) does *not* have
this bug.

Maybe a point in favor rebase--helper...

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH] branch_get_push: do not segfault when HEAD is detached
From: Johannes Schindelin @ 2017-01-06 13:41 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: git, Jeff King
In-Reply-To: <20170106045623.21118-1-kyle@kyleam.com>

Hi Kyle,

On Thu, 5 Jan 2017, Kyle Meyer wrote:

> Move the detached HEAD check from branch_get_push_1() to
> branch_get_push() to avoid setting branch->push_tracking_ref when
> branch is NULL.
> 
> Signed-off-by: Kyle Meyer <kyle@kyleam.com>

Good point.

> diff --git a/remote.c b/remote.c
> index ad6c5424e..d5eaec737 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -1716,9 +1716,6 @@ static const char *branch_get_push_1(struct branch *branch, struct strbuf *err)
>  {
>  	struct remote *remote;
>  
> -	if (!branch)
> -		return error_buf(err, _("HEAD does not point to a branch"));
> -
>  	remote = remote_get(pushremote_for_branch(branch, NULL));
>  	if (!remote)
>  		return error_buf(err,
> @@ -1778,6 +1775,9 @@ static const char *branch_get_push_1(struct branch *branch, struct strbuf *err)
>  
>  const char *branch_get_push(struct branch *branch, struct strbuf *err)
>  {
> +	if (!branch)
> +		return error_buf(err, _("HEAD does not point to a branch"));
> +
>  	if (!branch->push_tracking_ref)
>  		branch->push_tracking_ref = branch_get_push_1(branch, err);

This is the only caller of branch_get_push_1(), so all is good.

> diff --git a/t/t1514-rev-parse-push.sh b/t/t1514-rev-parse-push.sh
> index 7214f5b33..90c639ae1 100755
> --- a/t/t1514-rev-parse-push.sh
> +++ b/t/t1514-rev-parse-push.sh
> @@ -60,4 +60,10 @@ test_expect_success '@{push} with push refspecs' '
>  	resolve topic@{push} refs/remotes/origin/magic/topic
>  '
>  
> +test_expect_success 'resolving @{push} fails with a detached HEAD' '
> +	git checkout HEAD^{} &&

I seem to recall that we prefer HEAD^0 over HEAD^{} and the existing
scripts seem to agree with me:

$ git grep -c HEAD^0 junio/pu -- t/
junio/pu:t/t1450-fsck.sh:1
junio/pu:t/t1507-rev-parse-upstream.sh:2
junio/pu:t/t2020-checkout-detach.sh:5
junio/pu:t/t3200-branch.sh:1
junio/pu:t/t3203-branch-output.sh:3
junio/pu:t/t3400-rebase.sh:1
junio/pu:t/t3404-rebase-interactive.sh:1
junio/pu:t/t5407-post-rewrite-hook.sh:2
junio/pu:t/t5505-remote.sh:1
junio/pu:t/t5510-fetch.sh:1
junio/pu:t/t5533-push-cas.sh:3
junio/pu:t/t6035-merge-dir-to-symlink.sh:3
junio/pu:t/t7201-co.sh:2
junio/pu:t/t7402-submodule-rebase.sh:1
junio/pu:t/t9105-git-svn-commit-diff.sh:1
junio/pu:t/t9107-git-svn-migrate.sh:1

$ git grep -c HEAD^{} junio/pu -- t/
junio/pu:t/t3200-branch.sh:3

Maybe use HEAD^0 just for consistency?

Ciao,
Johannes

^ permalink raw reply

* Re: Regression: Ctrl-c from the pager in an alias exits it
From: Trygve Aaberge @ 2017-01-06 13:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20170106073224.5hsrib77tx5tgx7d@sigill.intra.peff.net>

On Fri, Jan 06, 2017 at 02:32:25 -0500, Jeff King wrote:
> And here's a patch to do that. It seems to work.

Nice, thanks! This works very well for me too.

-- 
Trygve Aaberge

^ permalink raw reply

* Re: git branch -D doesn't work with deleted worktree
From: Duy Nguyen @ 2017-01-06 10:05 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Roland Illig, git@vger.kernel.org
In-Reply-To: <CAGZ79kaLpf1nzSAgRJQamMGk-327LO+qQYihYVVcU+86n92ivg@mail.gmail.com>

On Thu, Jan 5, 2017 at 9:02 PM, Stefan Beller <sbeller@google.com> wrote:
> On Thu, Jan 5, 2017 at 2:06 AM, Roland Illig <rillig@novomind.com> wrote:
>> Git 2.11.0 gives a wrong error message after the following commands:
>>
>> $ git init
>> $ echo hello >file
>> $ git add file
>> $ git commit -m "message"
>> $ git worktree add ../worktree
>> $ rm -rf ../worktree
>> $ git br -D worktree
>> error: Cannot delete branch 'worktree' checked out at '../worktree'
>>
>> Since ../worktree has been deleted, there cannot be anything checked out at that location.
>>
>> In my opinion, deleting the branch should just work. Especially since I used the -D option and the "git worktree" documentation says "When you are done with a linked working tree you can simply delete it."

Since -D means "I know what I'm doing, get out of my way", maybe we
should continue if any worktree has the branch checked out by
detaching it?

(Yes I'm carefully tip toeing around the deleted worktree issue since
"git worktree remove" is coming. After that point, running "worktree
prune" before "branch -D" does not sound so bad)
-- 
Duy

^ permalink raw reply

* Re: [PATCH v2 4/4] mergetool: fix running in subdir when rerere enabled
From: Johannes Sixt @ 2017-01-06  9:42 UTC (permalink / raw)
  To: Richard Hansen; +Cc: git, davvid, sbeller, simon
In-Reply-To: <20170106010945.79382-5-hansenr@google.com>

Am 06.01.2017 um 02:09 schrieb Richard Hansen:
> If rerere is enabled and no pathnames are given, run cd_to_toplevel
> before running 'git diff --name-only' so that 'git diff --name-only'
> sees the pathnames emitted by 'git rerere remaining'.
>
> Also run cd_to_toplevel before running 'git rerere remaining' in case
> 'git rerere remaining' is ever changed to print pathnames relative to
> the current directory rather than to $GIT_WORK_TREE.
>
> This fixes a regression introduced in
> 57937f70a09c12ef484c290865dac4066d207c9c (v2.11.0).
>
> Signed-off-by: Richard Hansen <hansenr@google.com>
> ---
>  git-mergetool.sh     | 1 +
>  t/t7610-mergetool.sh | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/git-mergetool.sh b/git-mergetool.sh
> index e52b4e4f2..67ea0d6db 100755
> --- a/git-mergetool.sh
> +++ b/git-mergetool.sh
> @@ -456,6 +456,7 @@ main () {
>
>  	if test $# -eq 0 && test -e "$GIT_DIR/MERGE_RR"
>  	then
> +		cd_to_toplevel
>  		set -- $(git rerere remaining)
>  		if test $# -eq 0
>  		then

This cannot be a complete solution. Why do we have another 
cd_to_toplevel later, after `git diff --name-only -- "$@"`?

Maybe it is necessary to revert back to the flow control that we had 
before 57937f70a09c ("mergetool: honor diff.orderFile", 2016-10-07)? It 
did not have `test $# -eq 0` and `test -e "$GIT_DIR/MERGE_RR"` in a 
single condition.

-- Hannes


^ permalink raw reply

* Encoding issue in git gui
From: Anton Tsyganenko @ 2017-01-06  8:59 UTC (permalink / raw)
  To: git

[-- Attachment #1.1.1: Type: text/plain, Size: 301 bytes --]

When I run a tool from tools menu in the standard git gui, the output
seems to have wrong encoding. For example, add a tool called "encoding
test" that runs command "echo Проверка", then run it. In the output
window "ÐŸÑ€Ð¾Ð²ÐµÑ€ÐºÐ°" in printed instead of "Проверка".

[-- Attachment #1.1.2: 2017-01-06-115157_416x253_scrot.png --]
[-- Type: image/png, Size: 19154 bytes --]

[-- Attachment #1.1.3: 2017-01-06-115239_831x305_scrot.png --]
[-- Type: image/png, Size: 10701 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 870 bytes --]

^ permalink raw reply

* Squashing commits with git rebase -i may miscount commits in commit message
From: Brandon Tolsch @ 2017-01-06  9:04 UTC (permalink / raw)
  To: git

git --version: 2.11.0

When using git rebase -i to squash a series of commits that includes
more than 10 commits, the generated commit message you are given to
edit counts the old messages incorrectly.  It will say the total
number of commits is (actual % 10) (if they were 0-based) and it will
also count the commits as (actual % 10).

For example, 15, 25, 35, etc. commits:
# This is a combination of 5 commits.
# This is the 1st commit message:
msg

# This is the commit message #2:
...
...
# This is commit message #10:
msg

# This is commit message #1:
...

# This is commit message #5:
msg

While not a big issue, it did make me double check what I was doing
when I saw "a combination of 10 commits" instead of 20 in the commit
message.

-Brandon Tolsch

^ permalink raw reply

* Re: Regression: Ctrl-c from the pager in an alias exits it
From: Jeff King @ 2017-01-06  7:32 UTC (permalink / raw)
  To: Trygve Aaberge; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20170106072602.wkbzho5z3osz5hee@sigill.intra.peff.net>

On Fri, Jan 06, 2017 at 02:26:02AM -0500, Jeff King wrote:

> You'll notice that it actually calls wait() on the pager. That's due to
> a3da882120 (pager: do wait_for_pager on signal death, 2009-01-22), which
> IIRC was addressing a very similar problem. We want to stop feeding the
> pager when we get a signal, but we don't want the main process to
> actually exit, or the pager loses the controlling terminal.
> 
> In our new scenario we have an extra process, though. The git-log child
> will wait on the pager, but the parent process can't. It doesn't know
> about it. I think that it in turn needs to wait on the child when it
> dies, and then the whole chain will stand still until the pager exits.

And here's a patch to do that. It seems to work.

I'll sleep on it and then write up a commit message tomorrow if it still
makes sense.

diff --git a/run-command.c b/run-command.c
index ca905a9e80..db47c429b7 100644
--- a/run-command.c
+++ b/run-command.c
@@ -29,6 +29,8 @@ static int installed_child_cleanup_handler;
 
 static void cleanup_children(int sig, int in_signal)
 {
+	struct child_to_clean *children_to_wait_for = NULL;
+
 	while (children_to_clean) {
 		struct child_to_clean *p = children_to_clean;
 		children_to_clean = p->next;
@@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal)
 		}
 
 		kill(p->pid, sig);
+		p->next = children_to_wait_for;
+		children_to_wait_for = p;
+	}
+
+	while (children_to_wait_for) {
+		struct child_to_clean *p = children_to_wait_for;
+		children_to_wait_for = p->next;
+
+		while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR)
+			; /* spin waiting for process exit or error */
+
 		if (!in_signal)
 			free(p);
 	}

^ permalink raw reply related

* Re: Regression: Ctrl-c from the pager in an alias exits it
From: Jeff King @ 2017-01-06  7:26 UTC (permalink / raw)
  To: Trygve Aaberge; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20170106064752.iccrk656c6k2wrfy@sigill.intra.peff.net>

On Fri, Jan 06, 2017 at 01:47:52AM -0500, Jeff King wrote:

> > > To reproduce, create e.g. the alias `l = log` and run `git l`. Then press
> > > Ctrl-c. The expected behavior is that nothing happens. The actual behavior is
> > > that the pager exits.
> > 
> > That's weird. I can't reproduce at all here. But I also can't think of a
> > thing that Git could do that would impact the behavior. For example:
> 
> I take it back. I could not reproduce under my normal config, which sets
> the pager to "diff-highlight | less". But if I drop that config, I can
> reproduce easily. And bisect it to that same commit, 86d26f240f
> (setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when ..,
> 2015-12-20).

I made a little progress with strace. Prior to 86d26f240f, we would run
the "log" builtin directly inside the same executable. After it, we exec
a separate process, and the process tree looks like:

  |       `-bash,10123
  |           `-git,10125 l
  |               `-git-log,10127
  |                   `-less,10128

Now if I strace all of those, I see (I've reordered and snipped a bit
for clarity):

  $ strace -p 10125 -p 10127 -p 10128
  strace: Process 10125 attached
  strace: Process 10127 attached
  strace: Process 10128 attached
  [pid 10127] write(1, "\n\33[33mcommit bae73e80d48ace1faa3"..., 1481 <unfinished ...>
  [pid 10128] read(5,  <unfinished ...>
  [pid 10125] wait4(10127,  <unfinished ...>

The main git process is waiting for the child to finish, the child is
blocked writing to the pager, and the pager is waiting for input from
the terminal (fd 5).

So I hit ^C:

  [pid 10128] <... read resumed> 0x7ffd39153b57, 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  [pid 10127] <... write resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  [pid 10125] <... wait4 resumed> 0x7ffe88d0a560, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  [pid 10128] --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
  [pid 10127] --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
  [pid 10125] --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---

Everybody gets the signal...

  [pid 10127] write(1, "\n\33[33mcommit bae73e80d48ace1faa3"..., 1481 <unfinished ...>

The blocked writer will resume its write() until it returns. This is the
same as it would get under the older code, too (after write() returns it
will handle the signal and die).

  [pid 10125] kill(10127, SIGINT <unfinished ...>
  [pid 10125] <... kill resumed> )        = 0

The parent git process tries to propagate the signal to the child.
Unnecessary in this instance, but helpful when the signal is only
delivered to the parent. This is due to 

  [pid 10128] rt_sigaction(SIGINT, {sa_handler=0x558dd1af0300, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f03fc8c2040},  <unfinished ...>
  [pid 10128] <... rt_sigaction resumed> {sa_handler=0x558dd1af0300, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f03fc8c2040}, 8) = 0
  [pid 10128] rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
  [pid 10128] <... rt_sigprocmask resumed> NULL, 8) = 0
  [pid 10128] write(1, "\7\r\33[K:\33[K", 9 <unfinished ...>
  [pid 10128] <... write resumed> )       = 9
  [pid 10128] read(5,  <unfinished ...>

And here's the pager handling the signal, and going back to waiting for
terminal input.

  [pid 10125] rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fbe2b4c1040},  <unfinished ...>
  [pid 10125] <... rt_sigaction resumed> {sa_handler=0x55aec373a6a0, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fbe2b4c1040}, 8) = 0
  [pid 10125] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [INT], 8) = 0
  [pid 10125] getpid( <unfinished ...>
  [pid 10125] <... getpid resumed> )      = 10125
  [pid 10125] gettid()                    = 10125
  [pid 10125] tgkill(10125, 10125, SIGINT) = 0
  [pid 10125] rt_sigprocmask(SIG_SETMASK, [INT], NULL, 8) = 0
  [pid 10125] rt_sigreturn({mask=[]})     = 61
  [pid 10125] --- SIGINT {si_signo=SIGINT, si_code=SI_TKILL, si_pid=10125, si_uid=1000} ---
  [pid 10125] +++ killed by SIGINT +++

The parent process pops the signal handler and propagates to itself,
dying.

At this point things pause, and nothing happens. But as soon as I hit a
key, the pager dies:

  [pid 10128] <... read resumed> "\r", 1) = 1
  [pid 10128] write(1, "\r\33[K", 4)      = 4
  [pid 10128] write(1, "     \"Another round of MIPS fixe"..., 50) = 50
  [pid 10128] read(5, 0x7ffd39153b57, 1)  = -1 EIO (Input/output error)
  [pid 10128] write(1, "\r\33[K\33[?1l\33>", 11) = 11
  [pid 10128] fsync(5)                    = -1 EINVAL (Invalid argument)
  [pid 10128] ioctl(5, TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
  [pid 10128] ioctl(5, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = -1 EIO (Input/output error)
  [pid 10128] exit_group(1)               = ?
  [pid 10128] +++ exited with 1 +++

The key thing is the EIO we get reading from the terminal. I think
because the head of the process group died, we lost our controlling
terminal.

And then naturally the git-log process gets SIGPIPE and dies:

  <... write resumed> )                   = -1 EPIPE (Broken pipe)
  --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=10127, si_uid=1000} ---
  --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=10128, si_uid=1000, si_status=1, si_utime=0, si_stime=0} ---
  write(1, "\n\33[33mcommit bae73e80d48ace1faa3"..., 1481) = -1 EPIPE (Broken pipe)
  close(1)                                = 0
  close(2)                                = 0
  wait4(10128, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 10128
  rt_sigaction(SIGPIPE, {sa_handler=SIG_DFL, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fa2d51e2040}, {sa_handler=0x564583689d30, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fa2d51e2040}, 8) = 0
  rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [INT PIPE], 8) = 0
  getpid()                                = 10127
  gettid()                                = 10127
  tgkill(10127, 10127, SIGPIPE)           = 0
  rt_sigprocmask(SIG_SETMASK, [INT PIPE], NULL, 8) = 0
  rt_sigreturn({mask=[INT]})              = -1 EPIPE (Broken pipe)
  --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=10127, si_uid=1000} ---
  +++ killed by SIGPIPE +++

You'll notice that it actually calls wait() on the pager. That's due to
a3da882120 (pager: do wait_for_pager on signal death, 2009-01-22), which
IIRC was addressing a very similar problem. We want to stop feeding the
pager when we get a signal, but we don't want the main process to
actually exit, or the pager loses the controlling terminal.

In our new scenario we have an extra process, though. The git-log child
will wait on the pager, but the parent process can't. It doesn't know
about it. I think that it in turn needs to wait on the child when it
dies, and then the whole chain will stand still until the pager exits.

-Peff

^ permalink raw reply

* Re: Regression: Ctrl-c from the pager in an alias exits it
From: Jeff King @ 2017-01-06  6:47 UTC (permalink / raw)
  To: Trygve Aaberge; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20170106064032.eqxxer5mx5hsh2md@sigill.intra.peff.net>

On Fri, Jan 06, 2017 at 01:40:32AM -0500, Jeff King wrote:

> On Thu, Jan 05, 2017 at 03:25:29PM +0100, Trygve Aaberge wrote:
> 
> > I'm experiencing an issue when using aliases for commands that open the pager.
> > When I press Ctrl-c from the pager, it exits. This does not happen when I
> > don't use an alias and did not happen before. It causes problems because
> > Ctrl-c is also used for other things, such as canceling a search that hasn't
> > completed.
> > 
> > To reproduce, create e.g. the alias `l = log` and run `git l`. Then press
> > Ctrl-c. The expected behavior is that nothing happens. The actual behavior is
> > that the pager exits.
> 
> That's weird. I can't reproduce at all here. But I also can't think of a
> thing that Git could do that would impact the behavior. For example:

I take it back. I could not reproduce under my normal config, which sets
the pager to "diff-highlight | less". But if I drop that config, I can
reproduce easily. And bisect it to that same commit, 86d26f240f
(setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when ..,
2015-12-20).

-Peff

^ permalink raw reply

* Re: Regression: Ctrl-c from the pager in an alias exits it
From: Jeff King @ 2017-01-06  6:40 UTC (permalink / raw)
  To: Trygve Aaberge; +Cc: git
In-Reply-To: <20170105142529.GA15009@aaberge.net>

On Thu, Jan 05, 2017 at 03:25:29PM +0100, Trygve Aaberge wrote:

> I'm experiencing an issue when using aliases for commands that open the pager.
> When I press Ctrl-c from the pager, it exits. This does not happen when I
> don't use an alias and did not happen before. It causes problems because
> Ctrl-c is also used for other things, such as canceling a search that hasn't
> completed.
> 
> To reproduce, create e.g. the alias `l = log` and run `git l`. Then press
> Ctrl-c. The expected behavior is that nothing happens. The actual behavior is
> that the pager exits.

That's weird. I can't reproduce at all here. But I also can't think of a
thing that Git could do that would impact the behavior. For example:

  1. Git should never kill() the pager; in fact it waits for it to
     finish.

  2. Git can impact the pager environment and command line options, but
     those haven't changed anytime recently (and besides, I'm not sure
     there even is an option to convince less to die).

  3. Git can impact the set of blocked signals that the pager sees, but
     I think less would set up its own handler for SIGINT anyway.

I suppose it's possible that your shell or another program (tmux,
maybe?) catches the SIGINT and does a process-group kill. But then I
don't know why it would matter that you're using an alias. The process
tree without an alias:

  |-xterm,21861,peff
  |   `-bash,21863
  |       `-git,22376 log
  |           `-less,22377

and with:

  |-xterm,21861,peff
  |   `-bash,21863
  |       `-git,22391 l
  |           `-git-log,22393
  |               `-less,22394

are pretty similar.

Puzzling.

-Peff

^ permalink raw reply

* Re: [PATCH] branch_get_push: do not segfault when HEAD is detached
From: Jeff King @ 2017-01-06  6:18 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: git
In-Reply-To: <20170106045623.21118-1-kyle@kyleam.com>

On Thu, Jan 05, 2017 at 11:56:23PM -0500, Kyle Meyer wrote:

> Move the detached HEAD check from branch_get_push_1() to
> branch_get_push() to avoid setting branch->push_tracking_ref when
> branch is NULL.

Yep, I think this is the right fix.

> diff --git a/t/t1514-rev-parse-push.sh b/t/t1514-rev-parse-push.sh
> index 7214f5b33..90c639ae1 100755
> --- a/t/t1514-rev-parse-push.sh
> +++ b/t/t1514-rev-parse-push.sh
> @@ -60,4 +60,10 @@ test_expect_success '@{push} with push refspecs' '
>  	resolve topic@{push} refs/remotes/origin/magic/topic
>  '
>  
> +test_expect_success 'resolving @{push} fails with a detached HEAD' '
> +	git checkout HEAD^{} &&
> +	test_when_finished "git checkout -" &&
> +	test_must_fail git rev-parse @{push}
> +'

Looks good. Thanks.

-Peff

PS Looks like this is your first patch. Welcome. :)

^ permalink raw reply

* [PATCH] branch_get_push: do not segfault when HEAD is detached
From: Kyle Meyer @ 2017-01-06  4:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Kyle Meyer

Move the detached HEAD check from branch_get_push_1() to
branch_get_push() to avoid setting branch->push_tracking_ref when
branch is NULL.

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
---
 remote.c                  | 6 +++---
 t/t1514-rev-parse-push.sh | 6 ++++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/remote.c b/remote.c
index ad6c5424e..d5eaec737 100644
--- a/remote.c
+++ b/remote.c
@@ -1716,9 +1716,6 @@ static const char *branch_get_push_1(struct branch *branch, struct strbuf *err)
 {
 	struct remote *remote;
 
-	if (!branch)
-		return error_buf(err, _("HEAD does not point to a branch"));
-
 	remote = remote_get(pushremote_for_branch(branch, NULL));
 	if (!remote)
 		return error_buf(err,
@@ -1778,6 +1775,9 @@ static const char *branch_get_push_1(struct branch *branch, struct strbuf *err)
 
 const char *branch_get_push(struct branch *branch, struct strbuf *err)
 {
+	if (!branch)
+		return error_buf(err, _("HEAD does not point to a branch"));
+
 	if (!branch->push_tracking_ref)
 		branch->push_tracking_ref = branch_get_push_1(branch, err);
 	return branch->push_tracking_ref;
diff --git a/t/t1514-rev-parse-push.sh b/t/t1514-rev-parse-push.sh
index 7214f5b33..90c639ae1 100755
--- a/t/t1514-rev-parse-push.sh
+++ b/t/t1514-rev-parse-push.sh
@@ -60,4 +60,10 @@ test_expect_success '@{push} with push refspecs' '
 	resolve topic@{push} refs/remotes/origin/magic/topic
 '
 
+test_expect_success 'resolving @{push} fails with a detached HEAD' '
+	git checkout HEAD^{} &&
+	test_when_finished "git checkout -" &&
+	test_must_fail git rev-parse @{push}
+'
+
 test_done
-- 
2.11.0


^ permalink raw reply related

* [PATCH 3/3] blame: output porcelain "previous" header for each file
From: Jeff King @ 2017-01-06  4:20 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170106041541.rjzzofal5hscv6yi@sigill.intra.peff.net>

It's possible for content currently found in one file to
have originated in two separate files, each of which may
have been modified in some single older commit.  The
--porcelain output generates an incorrect "previous" header
in this case, whereas --line-porcelain gets it right.  The
problem is that the porcelain output tries to omit repeated
details of commits, and treats "previous" as a property of
the commit, when it is really a property of the blamed block
of lines.

Let's look at an example. In a case like this, you might see
this output from --line-porcelain:

  SOME_SHA1 1 1 1
  author ...
  committer ...
  previous SOME_SHA1^ file_one
  filename file_one
          ...some line content...
  SOME_SHA1 2 1 1
  author ...
  committer ...
  previous SOME_SHA1^ file_two
  filename file_two
          ...some different content....

The "filename" fields tell us that the two lines are from
two different files. But notice that the filename also
appears in the "previous" field, which tells us where to
start a re-blame. The second content line never appeared in
file_one at all, so we would obviously need to re-blame from
file_two (or possibly even some other file, if had just been
renamed to file_two in SOME_SHA1).

So far so good. Now here's what --porcelain looks like:

  SOME_SHA1 1 1 1
  author ...
  committer ...
  previous SOME_SHA1^ file_one
  filename file_one
          ...some line content...
  SOME_SHA1 2 1 1
  filename file_two
          ...some different content....

We've dropped the author and committer fields from the
second line, as they would just be repeats.  But we can't
omit "filename", because it depends on the actual block of
blamed lines, not just the commit. This is handled by
emit_porcelain_details(), which will show the filename
either if it is the first mention of the commit _or_ if the
commit has multiple paths in it.

But we don't give "previous" the same handling. It's written
inside emit_one_suspect_detail(), which bails early if we've
already seen that commit. And so the output above is wrong;
a reader would assume that the correct place to re-blame
line two is from file_one, but that's obviously nonsense.

Let's treat "previous" the same as "filename", and show it
fresh whenever we know we are in a confusing case like this.

Signed-off-by: Jeff King <peff@peff.net>
---
I'm assuming that the parent sha1 for a "previous" field will always be
the same for a given commit. So we really only need to care about
reprinting when we know there are multiple paths, as this patch does
(i.e., treat it exactly the same as "filename"). If I'm wrong, then
there's probably another corner case that this doesn't handle. I
couldn't think of a way to trigger such a setup, though.

 builtin/blame.c             |  23 +++++----
 t/t8011-blame-split-file.sh | 117 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 131 insertions(+), 9 deletions(-)
 create mode 100755 t/t8011-blame-split-file.sh

diff --git a/builtin/blame.c b/builtin/blame.c
index c6170fed81..3aae19a0f9 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1700,13 +1700,23 @@ static void get_commit_info(struct commit *commit,
 }
 
 /*
+ * Write out any suspect information which depends on the path. This must be
+ * handled separately from emit_one_suspect_detail(), because a given commit
+ * may have changes in multiple paths. So this needs to appear each time
+ * we mention a new group.
+ *
  * To allow LF and other nonportable characters in pathnames,
  * they are c-style quoted as needed.
  */
-static void write_filename_info(const char *path)
+static void write_filename_info(struct origin *suspect)
 {
+	if (suspect->previous) {
+		struct origin *prev = suspect->previous;
+		printf("previous %s ", oid_to_hex(&prev->commit->object.oid));
+		write_name_quoted(prev->path, stdout, '\n');
+	}
 	printf("filename ");
-	write_name_quoted(path, stdout, '\n');
+	write_name_quoted(suspect->path, stdout, '\n');
 }
 
 /*
@@ -1735,11 +1745,6 @@ static int emit_one_suspect_detail(struct origin *suspect, int repeat)
 	printf("summary %s\n", ci.summary.buf);
 	if (suspect->commit->object.flags & UNINTERESTING)
 		printf("boundary\n");
-	if (suspect->previous) {
-		struct origin *prev = suspect->previous;
-		printf("previous %s ", oid_to_hex(&prev->commit->object.oid));
-		write_name_quoted(prev->path, stdout, '\n');
-	}
 
 	commit_info_destroy(&ci);
 
@@ -1760,7 +1765,7 @@ static void found_guilty_entry(struct blame_entry *ent,
 		       oid_to_hex(&suspect->commit->object.oid),
 		       ent->s_lno + 1, ent->lno + 1, ent->num_lines);
 		emit_one_suspect_detail(suspect, 0);
-		write_filename_info(suspect->path);
+		write_filename_info(suspect);
 		maybe_flush_or_die(stdout, "stdout");
 	}
 	pi->blamed_lines += ent->num_lines;
@@ -1884,7 +1889,7 @@ static void emit_porcelain_details(struct origin *suspect, int repeat)
 {
 	if (emit_one_suspect_detail(suspect, repeat) ||
 	    (suspect->commit->object.flags & MORE_THAN_ONE_PATH))
-		write_filename_info(suspect->path);
+		write_filename_info(suspect);
 }
 
 static void emit_porcelain(struct scoreboard *sb, struct blame_entry *ent,
diff --git a/t/t8011-blame-split-file.sh b/t/t8011-blame-split-file.sh
new file mode 100755
index 0000000000..831125047b
--- /dev/null
+++ b/t/t8011-blame-split-file.sh
@@ -0,0 +1,117 @@
+#!/bin/sh
+
+test_description='
+The general idea is that we have a single file whose lines come from
+multiple other files, and those individual files were modified in the same
+commits. That means that we will see the same commit in multiple contexts,
+and each one should be attributed to the correct file.
+
+Note that we need to use "blame -C" to find the commit for all lines. We will
+not bother testing that the non-C case fails to find it. That is how blame
+behaves now, but it is not a property we want to make sure is retained.
+'
+. ./test-lib.sh
+
+# help avoid typing and reading long strings of similar lines
+# in the tests below
+generate_expect () {
+	while read nr data
+	do
+		i=0
+		while test $i -lt $nr
+		do
+			echo $data
+			i=$((i + 1))
+		done
+	done
+}
+
+test_expect_success 'setup split file case' '
+	# use lines long enough to trigger content detection
+	test_seq 1000 1010 >one &&
+	test_seq 2000 2010 >two &&
+	git add one two &&
+	test_commit base &&
+
+	sed "6s/^/modified /" <one >one.tmp &&
+	mv one.tmp one &&
+	sed "6s/^/modified /" <two >two.tmp &&
+	mv two.tmp two &&
+	git add -u &&
+	test_commit modified &&
+
+	cat one two >combined &&
+	git add combined &&
+	git rm one two &&
+	test_commit combined
+'
+
+test_expect_success 'setup simulated porcelain' '
+	# This just reads porcelain-ish output and tries
+	# to output the value of a given field for each line (either by
+	# reading the field that accompanies this line, or referencing
+	# the information found last time the commit was mentioned).
+	cat >read-porcelain.pl <<-\EOF
+	my $field = shift;
+	while (<>) {
+		if (/^[0-9a-f]{40} /) {
+			flush();
+			$hash = $&;
+		} elsif (/^$field (.*)/) {
+			$cache{$hash} = $1;
+		}
+	}
+	flush();
+
+	sub flush {
+		return unless defined $hash;
+		if (defined $cache{$hash}) {
+			print "$cache{$hash}\n";
+		} else {
+			print "NONE\n";
+		}
+	}
+	EOF
+'
+
+for output in porcelain line-porcelain
+do
+	test_expect_success "generate --$output output" '
+		git blame --root -C --$output combined >output
+	'
+
+	test_expect_success "$output output finds correct commits" '
+		generate_expect >expect <<-\EOF &&
+		5 base
+		1 modified
+		10 base
+		1 modified
+		5 base
+		EOF
+		perl read-porcelain.pl summary <output >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "$output output shows correct filenames" '
+		generate_expect >expect <<-\EOF &&
+		11 one
+		11 two
+		EOF
+		perl read-porcelain.pl filename <output >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "$output output shows correct previous pointer" '
+		generate_expect >expect <<-EOF &&
+		5 NONE
+		1 $(git rev-parse modified^) one
+		10 NONE
+		1 $(git rev-parse modified^) two
+		5 NONE
+		EOF
+		perl read-porcelain.pl previous <output >actual &&
+		test_cmp expect actual
+	'
+done
+
+test_done
-- 
2.11.0.519.g31435224cf

^ permalink raw reply related

* [PATCH 1/3] blame: fix alignment with --abbrev=40
From: Jeff King @ 2017-01-06  4:17 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170106041541.rjzzofal5hscv6yi@sigill.intra.peff.net>

The blame command internally adds 1 to any requested sha1
abbreviation length, and then subtracts it when outputting a
boundary commit. This lets regular and boundary sha1s line
up visually, but it misses one corner case.

When the requested length is 40, we bump the value to 41.
But since we only have 40 characters, that's all we can show
(fortunately the truncation is done by a printf precision
field, so it never tries to read past the end of the
buffer).  So a normal sha1 shows 40 hex characters, and a
boundary sha1 shows "^" plus 40 hex characters. The result
is misaligned.

The "-l" option to show long sha1s gets around this by
skipping the "abbrev" variable entirely and just always
using GIT_SHA1_HEXSZ.  This avoids the "+1" issue, but it
does mean that boundary commits only have 39 characters
printed.  This is somewhat odd, but it does look good
visually: the results are aligned and left-justified. The
alternative would be to allocate an extra column that would
contain either an extra space or the "^" boundary marker.

As this is by definition the human-readable view, it's
probably not that big a deal either way (and of course
--porcelain, etc, correctly produce correct 40-hex sha1s).
But for consistency, this patch teaches --abbrev=40 to
produce the same output as "-l" (always left-aligned, with
40-hex for normal sha1s, and "^" plus 39-hex for
boundaries).

Signed-off-by: Jeff King <peff@peff.net>
---
I mostly didn't explore the extra-column solution out of a sense of
inertia. The "-l" option has behaved this way for years. But it was also
out of laziness, as I doubt anybody cares too much, and it would require
a fair bit of special-casing in the printing routines.

 builtin/blame.c  |  2 +-
 t/t8002-blame.sh | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 4ddfadb71f..1d312542de 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2656,7 +2656,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 	} else if (show_progress < 0)
 		show_progress = isatty(2);
 
-	if (0 < abbrev)
+	if (0 < abbrev && abbrev < GIT_SHA1_HEXSZ)
 		/* one more abbrev length is needed for the boundary commit */
 		abbrev++;
 
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index ab79de9544..c6347ad8fd 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -86,4 +86,32 @@ test_expect_success 'blame with showEmail config true' '
 	test_cmp expected_n result
 '
 
+test_expect_success 'set up abbrev tests' '
+	test_commit abbrev &&
+	sha1=$(git rev-parse --verify HEAD) &&
+	check_abbrev () {
+		expect=$1; shift
+		echo $sha1 | cut -c 1-$expect >expect &&
+		git blame "$@" abbrev.t >actual &&
+		perl -lne "/[0-9a-f]+/ and print \$&" <actual >actual.sha &&
+		test_cmp expect actual.sha
+	}
+'
+
+test_expect_success 'blame --abbrev=<n> works' '
+	# non-boundary commits get +1 for alignment
+	check_abbrev 31 --abbrev=30 HEAD &&
+	check_abbrev 30 --abbrev=30 ^HEAD
+'
+
+test_expect_success 'blame -l aligns regular and boundary commits' '
+	check_abbrev 40 -l HEAD &&
+	check_abbrev 39 -l ^HEAD
+'
+
+test_expect_success 'blame --abbrev=40 behaves like -l' '
+	check_abbrev 40 --abbrev=40 HEAD &&
+	check_abbrev 39 --abbrev=40 ^HEAD
+'
+
 test_done
-- 
2.11.0.519.g31435224cf


^ permalink raw reply related

* [PATCH 2/3] blame: handle --no-abbrev
From: Jeff King @ 2017-01-06  4:18 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170106041541.rjzzofal5hscv6yi@sigill.intra.peff.net>

You can already ask blame for full sha1s with "-l" or with
"--abbrev=40". But for consistency with other parts of Git,
we should support "--no-abbrev".

Worse, blame already accepts --no-abbrev, but it's totally
broken. When we see --no-abbrev, the abbrev variable is set
to 0, which is then used as a printf precision. For regular
sha1s, that means we print nothing at all (which is very
wrong). For boundary commits we decrement it to "-1", which
printf interprets as "no limit" (which is almost correct,
except it misses the 39-length magic explained in the
previous commit).

Let's detect --no-abbrev and behave as if --abbrev=40 was
given.

Signed-off-by: Jeff King <peff@peff.net>
---
I also wondered if we needed to clamp this within MINIMUM_ABBREV, but
that is done for us already by the parseopt handler.

 builtin/blame.c  | 2 ++
 t/t8002-blame.sh | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/builtin/blame.c b/builtin/blame.c
index 1d312542de..c6170fed81 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2659,6 +2659,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 	if (0 < abbrev && abbrev < GIT_SHA1_HEXSZ)
 		/* one more abbrev length is needed for the boundary commit */
 		abbrev++;
+	else if (!abbrev)
+		abbrev = GIT_SHA1_HEXSZ;

 	if (revs_file && read_ancestry(revs_file))
 		die_errno("reading graft file '%s' failed", revs_file);
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index c6347ad8fd..380e1c1054 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -114,4 +114,8 @@ test_expect_success 'blame --abbrev=40 behaves like -l' '
 	check_abbrev 39 --abbrev=40 ^HEAD
 '

+test_expect_success '--no-abbrev works like --abbrev=40' '
+	check_abbrev 40 --no-abbrev
+'
+
 test_done
-- 
2.11.0.519.g31435224cf

^ permalink raw reply related

* [PATCH 0/3] fixing some random blame corner cases
From: Jeff King @ 2017-01-06  4:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Here are three fixes for some fairly obscure corner cases. I haven't
actually seen these in the wild. I came up with the final one while
discussing a hypothetical with somebody, then ran across the middle one
while trying to write a test for the third, which made me scratch my
head enough to yield the first one. Classic yak-shaving.

One other thing that surprised me while writing blame tests is that
"--root" is not the default for git-blame (though it has been for many
years in git-log). I'm not sure if it would be a good idea to change it,
or if blame is too plumbing-ish to allow that.

  [1/3]: blame: fix alignment with --abbrev=40
  [2/3]: blame: handle --no-abbrev
  [3/3]: blame: output porcelain "previous" header for each file

 builtin/blame.c             |  27 ++++++----
 t/t8002-blame.sh            |  32 ++++++++++++
 t/t8011-blame-split-file.sh | 117 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 166 insertions(+), 10 deletions(-)
 create mode 100755 t/t8011-blame-split-file.sh

-Peff

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox