Git development
 help / color / mirror / Atom feed
* Re: [PATCH v2 2/7] test-lib: parse some --options earlier
From: Jeff King @ 2019-01-03  4:53 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Junio C Hamano
In-Reply-To: <20181230190419.GB6120@szeder.dev>

On Sun, Dec 30, 2018 at 08:04:19PM +0100, SZEDER Gábor wrote:

> > (in fact, given that this is just
> > the internal tests, I am tempted to say that we should just make it
> > "-r<arg>" for the sake of simplicity and consistency. But maybe somebody
> > would be annoyed. I have never used "-r" ever myself).
> 
> I didn't even know what '-r' does...

I had to look it up, too. :)

> And I agree that changing it to '-r<arg>' would be the best, but this
> patch series is about adding '--stress', so changing how '-r' gets its
> mandatory argument (and potentially annoying someone) is beyond the
> scope, I would say.

OK, I'm fine with that (though once we've built the infrastructure to
handle its unstuck form, I don't know if there's much point in changing
it, so we can probably just let it live on forever).

-Peff

^ permalink raw reply

* Re: [PATCH] test-lib: check Bash version for '-x' without using shell arrays
From: Jeff King @ 2019-01-03  4:52 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: SZEDER Gábor, Junio C Hamano, Max Kirillov, Carlo Arenas,
	git
In-Reply-To: <a82251fa-38e1-233e-50d3-3ed4850b4e11@kdbg.org>

On Wed, Jan 02, 2019 at 01:20:47AM +0100, Johannes Sixt wrote:

> > diff --git a/t/test-lib.sh b/t/test-lib.sh
> > index 0f1faa24b2..f47a191e3b 100644
> > --- a/t/test-lib.sh
> > +++ b/t/test-lib.sh
> > @@ -324,9 +324,12 @@ do
> >   		# isn't executed with a suitable Bash version.
> >   		if test -z "$test_untraceable" || {
> >   		     test -n "$BASH_VERSION" && {
> > -		       test ${BASH_VERSINFO[0]} -gt 4 || {
> > -			 test ${BASH_VERSINFO[0]} -eq 4 &&
> > -			 test ${BASH_VERSINFO[1]} -ge 1
> > +		       bash_major=${BASH_VERSION%%.*}
> > +		       bash_minor=${BASH_VERSION#*.}
> > +		       bash_minor=${bash_minor%%.*}
> > +		       test $bash_major -gt 4 || {
> > +			 test $bash_major -eq 4 &&
> > +			 test $bash_minor -ge 1
> >   		       }
> >   		     }
> >   		   }
> > 
> 
> Would it perhaps be simpler to just hide the syntax behind eval? Like
> 
>  		if test -z "$test_untraceable" || {
>  		     test -n "$BASH_VERSION" && eval '
> 		       test ${BASH_VERSINFO[0]} -gt 4 || {
> 			 test ${BASH_VERSINFO[0]} -eq 4 &&
> 			 test ${BASH_VERSINFO[1]} -ge 1
> 		       }
>  		     '

That was my first thought, too. :)

The parsing here is simple enough that I'd be fine either with the
original patch, or an eval-based version (and otherwise, the goal and
description seem quite good to me).

-Peff

^ permalink raw reply

* Re: [PATCH] banned.h: mark strncat() as banned
From: Jeff King @ 2019-01-03  4:49 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, git
In-Reply-To: <20190102093846.6664-1-e@80x24.org>

On Wed, Jan 02, 2019 at 09:38:46AM +0000, Eric Wong wrote:

> strncat() has the same quadratic behavior as strcat() and is
> difficult-to-read and bug-prone.  While it hasn't yet been a
> problem in git iself, strncat() found it's way into 'master'
> of cgit and caused segfaults on my system.

I'm in favor of this.

It doesn't have the "oops, I didn't NUL-terminate for you" problem that
strncpy() has. But it actually has the opposite problem! It will always
place a NUL, and you have to feed it sizeof(dst)-1 to avoid an overflow.

So I think it's important for safety (though I'd be fine banning it on
the quadratic grounds alone ;) ).

-Peff

^ permalink raw reply

* Re: [PATCH 3/3] t0006-date.sh: add `human` date format tests.
From: Stephen & Linda Smith @ 2019-01-03  2:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Linus Torvalds, Ævar Arnfjörð Bjarmason
In-Reply-To: <xmqqva37j595.fsf@gitster-ct.c.googlers.com>

On Wednesday, January 2, 2019 11:15:02 AM MST Junio C Hamano wrote:
> 'date +%s' is used everywhere in this patch but has never been used
> in our test suite before.  It is not portable.
So I don't make this mistake again, Is there a reference somewhere for that is 
and is not portable?

> 
> We perhaps can use "test-tool date timestamp", like so
> 
> 	check_human_date $(test-tool date timestamp "18000 seconds ago") ...
> 
> or moving the part that munges 18000 into the above form inside
> check_human_date helper function, e.g.
> 
> 	check_human_date () {
> 		commit_date=$(test-tool date timestamp "$1 seconds ago")
> 		commit_date="$commit_date +0200"
>                 expect=$2
> 		...
> 	}
> 
> which would let us write
> 
> 	check_human_date 432000" $THIS_YEAR_REGEX  # 5 days ago
Thanks

>
> > +check_human_date() {
> > +	commit_date=$1
> > +	expect=$2
> > +	test_expect_success "$commit_date" "
> > +		echo $expect $commit_date >dates &&
> > +		git add dates &&
> > +		git commit -m 'Expect String' --date=\"$commit_date\" dates &&
> > +		git log -1 --date=human | grep \"^Date:\" >actual &&
> > +		grep \"$expect\" actual
> > +"
> 
> As the body of the test_expect_success helper is eval'ed, variables
> $commit_date and $expect should be visible to it, without turning
> them into values before executing test_expect_success function,
> i.e.
> 
> 	test_expect_success "$commit_date" '
> 		echo "$expect $commit_date" >dates &&
> 		...
> 		git commit -m "Expect String" --date="$commit_date" dates &&
> 		git show -s --date=human | grep '^Date:" >actual &&
> 		grep "$expect" actual
> 	'
> 
> which would reduce the need for unreadable backslashes.
I was worried about embedded spaces that might not be parsed correctly by the 
called function.  I will update

> 
> Instead of duplicating, perhaps move this to a more common place?
> Would it make sense to make it "check_date_format ()" helper by
> passing another argument to parameterize --date=human part
I had considered that, but then noted that for the other formats specific 
strings were being used.  The use of specific strings was possible since the 
other formats were always guarenteed to have the same string literal due to a 
singe unvarying input.

I don't mind parameterize the format and it would make the solution more 
general.


> > "Stephen P. Smith" <ischis2@cox.net> writes:
> > +# Subtract some known constant time and look for expected field format
> > +TODAY_REGEX='5 hours ago'
> > +THIS_YEAR_REGEX='[A-Z][a-z][a-z] [A-Z][a-z][a-z] [0-9]*
> > [012][0-9]:[0-6][0-9]' +MORE_THAN_A_YEAR_REGEX='[A-Z][a-z][a-z]
> > [A-Z][a-z][a-z] [0-9]* [0-9][0-9][0-9][0-9]' +check_human_date "$(($(date
> > +%s)-18000)) +0200" $TODAY_REGEX # 5 hours ago +check_human_date
> > "$(($(date +%s)-432000)) +0200" $THIS_YEAR_REGEX  # 5 days ago
> > +check_human_date() {
> > +	commit_date=$1
> > +	expect=$2
> > +	test_expect_success "$commit_date" "
> > +		echo $expect $commit_date >dates &&
> > +		git add dates &&
> > +		git commit -m 'Expect String' --date=\"$commit_date\" dates &&
> > +		git show --date=human | grep \"^Date:\" >actual &&
> 
> Using "show" here is much better than "log -1" above; using "show
> -s" would be even better.

I was attempting to test both git log and git show.  For get log the `-1` was 
to only get the latest commit.

Are you suggesting that t4202-log.sh not be updated and that only and  t7007-
show.sh and t0006-date.sh updated?  

Side note:  I found when updating that all three scripts that log and show 
returned the same formats, but date returned a different string if the delta 
date was less than 24hours

I just noted that the patch 3/3 should be re-titled since the tests are 
currently for three commands.

Hope you are better.
sps




^ permalink raw reply

* Re: [PATCH v17 0/7] git bisect: convert from shell to C
From: Ramsay Jones @ 2019-01-03  1:19 UTC (permalink / raw)
  To: Tanushree Tumane via GitGitGadget, git; +Cc: Junio C Hamano
In-Reply-To: <pull.101.v17.git.gitgitgadget@gmail.com>



On 02/01/2019 15:38, Tanushree Tumane via GitGitGadget wrote:
[snip]
> base-commit: 7f4e64169352e03476b0ea64e7e2973669e491a2
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-101%2Ftanushree27%2Fgit-bisect_part2_fixup-v17
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-101/tanushree27/git-bisect_part2_fixup-v17
> Pull-Request: https://github.com/gitgitgadget/git/pull/101

I didn't look at the patches, only the range-diff below, and the
only thing I noticed was ...

> 
> Range-diff vs v16:
> 
>  1:  f1e89ba517 ! 1:  338ebdc97a bisect--helper: `bisect_reset` shell function in C
>      @@ -16,8 +16,9 @@
>       
>           Mentored-by: Lars Schneider <larsxschneider@gmail.com>
>           Mentored-by: Christian Couder <chriscool@tuxfamily.org>
>      +    Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
>           Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
>      -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
>      +    Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
>       
>        diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>        --- a/builtin/bisect--helper.c
>      @@ -53,8 +54,10 @@
>       +	struct strbuf branch = STRBUF_INIT;
>       +
>       +	if (!commit) {
>      -+		if (strbuf_read_file(&branch, git_path_bisect_start(), 0) < 1)
>      -+			return !printf(_("We are not bisecting.\n"));
>      ++		if (strbuf_read_file(&branch, git_path_bisect_start(), 0) < 1) {
>      ++			printf(_("We are not bisecting.\n"));
>      ++			return 0;
>      ++		}
>       +		strbuf_rtrim(&branch);
>       +	} else {
>       +		struct object_id oid;
>      @@ -69,11 +72,11 @@
>       +
>       +		argv_array_pushl(&argv, "checkout", branch.buf, "--", NULL);
>       +		if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) {
>      -+			error(_("Could not check out original HEAD '%s'. Try "
>      -+				"'git bisect reset <commit>'."), branch.buf);
>       +			strbuf_release(&branch);
>       +			argv_array_clear(&argv);
>      -+			return -1;
>      ++			return error(_("could not check out original"
>      ++				       " HEAD '%s'. Try 'git bisect"
>      ++				       "reset <commit>'."), branch.buf);

... this 'branch.buf' will refer to the empty 'slopbuf', since
the call to 'strbuf_release(&branch)' now precedes this call
to error().

ATB,
Ramsay Jones

^ permalink raw reply

* Re: [PATCH v2 8/8] checkout: introduce checkout.overlayMode config
From: Junio C Hamano @ 2019-01-02 23:39 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Nguyễn Thái Ngọc Duy, Elijah Newren
In-Reply-To: <20181220134820.21810-9-t.gummerer@gmail.com>

Thomas Gummerer <t.gummerer@gmail.com> writes:

> In the previous patch we introduced a new no-overlay mode for git
> checkout.  Some users (such as the author of this commit) may want to
> have this mode turned on by default as it matches their mental model
> more closely.  Make that possible by introducing a new config option
> to that extend.
>
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  Documentation/config/checkout.txt |  7 +++++++
>  builtin/checkout.c                |  8 +++++++-
>  t/t2025-checkout-no-overlay.sh    | 10 ++++++++++
>  3 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt
> index c4118fa196..53f917e15e 100644
> --- a/Documentation/config/checkout.txt
> +++ b/Documentation/config/checkout.txt
> @@ -21,3 +21,10 @@ checkout.optimizeNewBranch::
>  	will not update the skip-worktree bit in the index nor add/remove
>  	files in the working directory to reflect the current sparse checkout
>  	settings nor will it show the local changes.
> +
> +checkout.overlayMode::
> +	In the default overlay mode files `git checkout` never
> +	removes files from the index or the working tree.

Technically the above "never removes" is incorrect.

	$ mv COPYING 1 && mkdir COPYING && mv 1 COPYING/COPYING
	$ git add COPYING
	$ git checkout HEAD COPYING

would remove COPYING/1 from the index and from the working tree to
make room.

Because I think that a bit of white lie like what you wrote would
help readers understand the key point of "overlay or not overlay"
better than an overly precise description of the reason why the
removal in the above three-liner case is the right thing to do, I
think the text in the patch is good enough at least for now, but I'd
mention it in case somebody else can think of a better phrasing to
covey the same key point without being technically incorrect.

Thanks.

^ permalink raw reply

* Re: [RFC PATCH 1/1] filter-options: Expand abbreviated numbers
From: Junio C Hamano @ 2019-01-02 23:15 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: git
In-Reply-To: <d324e7836928940a4df0b43da3ffeb8526feac61.1545261186.git.steadmon@google.com>

Josh Steadmon <steadmon@google.com> writes:

> When communicating with a remote server or a subprocess, use expanded
> numbers rather than abbreviated numbers in the object filter spec (e.g.
> "limit:blob=1k" becomes "limit:blob=1024").
>
> Update the protocol docs to note that clients should always perform this
> expansion, to allow for more compatibility between server
> implementations.
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
>  Documentation/technical/protocol-v2.txt |  5 ++++-
>  builtin/clone.c                         |  6 +++++-
>  builtin/fetch.c                         |  7 ++++++-
>  fetch-pack.c                            | 15 ++++++++++++---
>  list-objects-filter-options.c           | 20 ++++++++++++++++++--
>  list-objects-filter-options.h           | 17 +++++++++++++++--
>  t/t6112-rev-list-filters-objects.sh     | 17 +++++++++++++++++
>  transport-helper.c                      | 13 +++++++++----
>  upload-pack.c                           |  7 +++++--
>  9 files changed, 91 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 09e4e0273f..d001372404 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -296,7 +296,10 @@ included in the client's request:
>  	Request that various objects from the packfile be omitted
>  	using one of several filtering techniques. These are intended
>  	for use with partial clone and partial fetch operations. See
> -	`rev-list` for possible "filter-spec" values.
> +	`rev-list` for possible "filter-spec" values. Clients MUST
> +	translate abbreviated numbers (e.g. "1k") into fully-expanded
> +	numbers (e.g. "1024") on the client side, so that the server
> +	does not need to implement unit parsing.

I suspect that it is too late now to retroactively say "MUST" here.
The best we may be able to do is to say "The sender SHOULD send a
plain integer without unit and the receiver SHOULD be prepared to
scale an integer with unit".


^ permalink raw reply

* Re: [PATCH 1/1] commit-graph: writing missing parents is a BUG
From: Junio C Hamano @ 2019-01-02 23:01 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, Derrick Stolee
In-Reply-To: <907a24d2c45acec17607971b577782798c4adcbc.1545250444.git.gitgitgadget@gmail.com>

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> When writing a commit-graph, we write GRAPH_MISSING_PARENT if the
> parent's object id does not appear in the list of commits to be
> written into the commit-graph. This was done as the initial design
> allowed commits to have missing parents, but the final version
> requires the commit-graph to be closed under reachability. Thus,
> this GRAPH_MISSING_PARENT value should never be written.
>
> However, there are reasons why it could be written! These range
> from a bug in the reachable-closure code to a memory error causing
> the binary search into the list of object ids to fail. In either
> case, we should fail fast and avoid writing the commit-graph file
> with bad data.
>
> Remove the GRAPH_MISSING_PARENT constant in favor of the constant
> GRAPH_EDGE_LAST_MASK, which has the same value.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---

Thanks, will queue.

^ permalink raw reply

* Re: [PATCH 1/1]     Add author and committer configuration settings
From: Junio C Hamano @ 2019-01-02 22:57 UTC (permalink / raw)
  To: William Hubbs; +Cc: git, chutzpah
In-Reply-To: <20181219183939.16358-2-williamh@gentoo.org>

William Hubbs <williamh@gentoo.org> writes:

> Subject: Re: [PATCH 1/1]     Add author and committer configuration settings

Perhaps something like this

	Subject: config: allow giving separate author and committer idents

would fit better in "git shortlog --no-merges" output.

>     The author.email, author.name, committer.email and committer.name
>     settings are analogous to the GIT_AUTHOR_* and GIT_COMMITTER_*
>     environment variables, but for the git config system. This allows them
>     to be set separately for each repository.

Don't indent the whole proposed log message.

> diff --git a/Documentation/config/user.txt b/Documentation/config/user.txt
> index b5b2ba1199..6ba7002252 100644
> --- a/Documentation/config/user.txt
> +++ b/Documentation/config/user.txt
> @@ -1,3 +1,23 @@
> +author.email::
> +Your email address to be recorded on the author line of any newly
> +created commits.
> +If this is not set, we use user.email.

"author line" is a bit too technical and appropriate only to those
who are familiar with "git cat-file commit" output.  How about
phrasing it a bit differently, like

	author.email::
		The email-address used for the author of newly
		created commits.  Defaults to the value of the
		`GIT_AUTHOR_EMAIL` environment variable, or if it is
		not set, the `user.email` configuration variable.

Likewise for the other three variables.

>  user.email::
>  	Your email address to be recorded in any newly created commits.
>  	Can be overridden by the `GIT_AUTHOR_EMAIL`, `GIT_COMMITTER_EMAIL`, and

As you can see, the enumeration list in this file is formatted by
indenting the definition body.  

> diff --git a/builtin/commit.c b/builtin/commit.c
> index c021b119bb..49a97adeb8 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -607,7 +607,7 @@ static void determine_author_info(struct strbuf *author_ident)
>  		set_ident_var(&date, strbuf_detach(&date_buf, NULL));
>  	}
>  
> -	strbuf_addstr(author_ident, fmt_ident(name, email, date, IDENT_STRICT));
> +	strbuf_addstr(author_ident, fmt_ident(name, email, date, IDENT_STRICT|IDENT_AUTHOR));

That's now a bit overly long line.

>  	assert_split_ident(&author, author_ident);
>  	export_one("GIT_AUTHOR_NAME", author.name_begin, author.name_end, 0);
>  	export_one("GIT_AUTHOR_EMAIL", author.mail_begin, author.mail_end, 0);
> diff --git a/cache.h b/cache.h
> index ca36b44ee0..0ee87f22a9 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1479,10 +1479,13 @@ int date_overflows(timestamp_t date);
>  #define IDENT_STRICT	       1
>  #define IDENT_NO_DATE	       2
>  #define IDENT_NO_NAME	       4
> +#define IDENT_AUTHOR          8
> +#define IDENT_COMMITTER       16

>  extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);

It is wrong to pass "do we want the author, or the committer, name?"
information in the same flag parameter to fmt_ident(), and it is
wrong to introduce IDENT_AUTHOR/COMMITTER bits as if they belong to
the existing four.  For one thing, unlike these other bits, these
two are not independent bits.  It would be an error for a caller to
pass neither bits, or both bits at the same time.

One way to do it better may be to pass it as another parameter, e.g.

	enum want_ident {
		WANT_AUTHOR_IDENT,
		WANT_COMMITTER_IDENT
	};
	const char *fmt_ident(const char *name, const char *email,
				enum want_ident whose_ident,
				const char *date_str, int flags);

In the remainder of the review, I'd give update suggestions based on
this function signature.

> diff --git a/ident.c b/ident.c
> index 33bcf40644..3da96ebbef 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -11,6 +11,10 @@
>  static struct strbuf git_default_name = STRBUF_INIT;
>  static struct strbuf git_default_email = STRBUF_INIT;
>  static struct strbuf git_default_date = STRBUF_INIT;
> +static struct strbuf git_author_name = STRBUF_INIT;
> +static struct strbuf git_author_email = STRBUF_INIT;
> +static struct strbuf git_committer_name = STRBUF_INIT;
> +static struct strbuf git_committer_email = STRBUF_INIT;
>  static int default_email_is_bogus;
>  static int default_name_is_bogus;
>  
> @@ -361,7 +365,15 @@ const char *fmt_ident(const char *name, const char *email,
>  	int strict = (flag & IDENT_STRICT);
>  	int want_date = !(flag & IDENT_NO_DATE);
>  	int want_name = !(flag & IDENT_NO_NAME);
> +	int want_author = (flag & IDENT_AUTHOR);
> +	int want_committer = (flag & IDENT_COMMITTER);
>  
> +	if (!email) {
> +		if (want_author && git_author_email.len)
> +			email = git_author_email.buf;
> +		else if (want_committer && git_committer_email.len)
> +			email = git_committer_email.buf;
> +	}
>  	if (!email) {
>  		if (strict && ident_use_config_only
>  		    && !(ident_config_given & IDENT_MAIL_GIVEN)) {
> @@ -377,6 +389,12 @@ const char *fmt_ident(const char *name, const char *email,
>  
>  	if (want_name) {
>  		int using_default = 0;
> +		if (!name) {
> +			if (want_author && git_author_name.len)
> +				name = git_author_name.buf;
> +			else if (want_committer && git_committer_name.len)
> +				name = git_committer_name.buf;
> +		}
>  		if (!name) {
>  			if (strict && ident_use_config_only
>  			    && !(ident_config_given & IDENT_NAME_GIVEN)) {

The reviewer's interest here is to see how "author.name trumps
user.name" precedence is implemented; by mucking with "name" before
the code that deals with the ident_default_name(), which yields
git_default_name taken from user.name, the code gives precedence to
these newly introduced variables.

Which makes sense.

> @@ -425,9 +443,11 @@ const char *fmt_ident(const char *name, const char *email,
>  	return ident.buf;
>  }
>  
> -const char *fmt_name(const char *name, const char *email)
> +const char *fmt_committer_name(void)
>  {
> -	return fmt_ident(name, email, NULL, IDENT_STRICT | IDENT_NO_DATE);
> +	char *name = getenv("GIT_COMMITTER_NAME");
> +	char *email = getenv("GIT_COMMITTER_EMAIL");
> +	return fmt_ident(name, email, NULL, IDENT_STRICT | IDENT_NO_DATE|IDENT_COMMITTER);
>  }

OK, we are lucky that no existing caller use fmt_name() with author
information, I guess?  The resulting codebase does invite a question
"why don't we have fmt_author_name() at all?", which is somewhat
disturbing.

As we are going to change this function *and* all of its callers
anyway, perhaps we can generalize it with minimum effort, like so:

	const char *fmt_name(enum want_ident whose_ident)
	{
		switch (whose_ident) {
		case WANT_AUTHOR_IDENT:
			name = getenv("GIT_AUTHOR_NAME");
			email = getenv("GIT_COMMITTER_NAME");
			break;
		case WANT_COMMITTER_IDENT:
			...
		}
		return fmt_ident(name, email, whose_ident,
				  NULL, IDENT_STRICT | IDENT_NO_DATE);
	}

Thanks.

^ permalink raw reply

* Submodule log bug
From: David Turner @ 2019-01-02 22:14 UTC (permalink / raw)
  To: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 809 bytes --]

When a submodule is renamed, git log gives incorrect output:

commit 350ebece9bce8d9c495f9a51e6f5529749c5c3cc (HEAD -> master)
Author:
David Turner <novalis@novalis.org>
Date:   Wed Jan 2 17:09:56 2019 -0500

    move

diff --git a/.gitmodules b/.gitmodules
index da1a767..f4baf2a 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,3 @@
-[submodule "mymod"]
-	path = mymod
+[submodule "morx"]
+	path = morx
 	url = ../sub
Submodule mymod 86da4a4...86da4a4 (commits not present)

^-- I expect this last line to tell me that the submodule has been
renamed, rather than that it has changed SHA to the same SHA.

See the attached shell script for a demo of this.  I tested with
b21ebb671bb as well as 2.18 and 2.19.  Thanks to Adam Bliss
<abliss@twosigma.com> for helping to figure out the reproduction steps.

[-- Attachment #2: demo2.sh --]
[-- Type: application/x-shellscript, Size: 575 bytes --]

^ permalink raw reply related

* Re: [PATCH v4 1/4] transport-helper: use xread instead of read
From: Junio C Hamano @ 2019-01-02 20:55 UTC (permalink / raw)
  To: randall.s.becker; +Cc: git, Randall S. Becker
In-Reply-To: <20181228233556.5704-1-randall.s.becker@rogers.com>

randall.s.becker@rogers.com writes:

> From: "Randall S. Becker" <rsbecker@nexbridge.com>
>
> This fix was needed on HPE NonStop NSE and NSX where SSIZE_MAX is less than
> BUFFERSIZE resulting in EINVAL. The call to read in transport-helper.c
> was the only place outside of wrapper.c where it is used instead of xread.

Thanks.

> Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
> ---
>  transport-helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/transport-helper.c b/transport-helper.c
> index bf225c698f..a290695a12 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -1225,7 +1225,7 @@ static int udt_do_read(struct unidirectional_transfer *t)
>  		return 0;	/* No space for more. */
>  
>  	transfer_debug("%s is readable", t->src_name);
> -	bytes = read(t->src, t->buf + t->bufuse, BUFFERSIZE - t->bufuse);
> +	bytes = xread(t->src, t->buf + t->bufuse, BUFFERSIZE - t->bufuse);
> - 	if (bytes < 0 && errno != EWOULDBLOCK && errno != EAGAIN &&
> - 		errno != EINTR) {
> + 	if (bytes < 0 && errno != EINTR) {
>  		error_errno(_("read(%s) failed"), t->src_name);

Can't we also lose EINTR check, though?  When read() returns
negative, we check errno and if it is EINTR, continue the loop.

^ permalink raw reply

* RE: Regression in git-subtree.sh, introduced in 2.20.1, after 315a84f9aa0e2e629b0680068646b0032518ebed
From: Strain, Roger L. @ 2019-01-02 20:20 UTC (permalink / raw)
  To: Marc Balmer, Duy Nguyen; +Cc: Git Mailing List, Junio C Hamano
In-Reply-To: <0F754615-C852-49D8-8E0C-DD2A00A15ED1@msys.ch>

> -----Original Message-----
> From: Marc Balmer <marc@msys.ch>
> Sent: Monday, December 31, 2018 5:24 AM
> To: Duy Nguyen <pclouds@gmail.com>
> Cc: Git Mailing List <git@vger.kernel.org>; Strain, Roger L.
> <roger.strain@swri.org>; Junio C Hamano <gitster@pobox.com>
> Subject: Re: Regression in git-subtree.sh, introduced in 2.20.1, after
> 315a84f9aa0e2e629b0680068646b0032518ebed
> 
> 
> 
> > Am 31.12.2018 um 12:20 schrieb Duy Nguyen <pclouds@gmail.com>:
> >
> > On Mon, Dec 31, 2018 at 6:13 PM Marc Balmer <marc@msys.ch> wrote:
> >>
> >>
> >>
> >>> Am 31.12.2018 um 11:51 schrieb Duy Nguyen <pclouds@gmail.com>:
> >>>
> >>> On Mon, Dec 31, 2018 at 5:44 PM Marc Balmer <marc@msys.ch> wrote:
> >>>>
> >>>> Hi
> >>>>
> >>>> One of the last three commits in git-subtree.sh introduced a regression
> leading to a segfault.
> >>>>
> >>>> Here is the error message when I try to split out my i18n files:
> >>>>
> >>>> $ git subtree split --prefix=i18n
> >>>> cache for e39a2a0c6431773a5d831eb3cb7f1cd40d0da623 already exists!
> >>>>  (Lots of output omitted)
> >>>> 436/627 (1819) [1455]       <- Stays at 436/ while the numbers in () and []
> increase, then segfaults:
> >>>> /usr/libexec/git-core/git-subtree: line 751: 54693 Done                    eval
> "$grl"
> >>>>   54694 Segmentation fault      (core dumped) | while read rev parents;
> do
> >>>
> >>> Do you still have this core dump? Could you run it and see if it's
> >>> "git" that crashed (and where) or "sh"?
> >>
> >> It is /usr/bin/bash that segfaults.  My guess is, that it runs out of memory
> (as described above, git-subtree enters an infinite loop untils it segafults).
> >
> > Ah that's better (I was worried about "git" crashing). The problematic
> > commit should be 19ad68d95d (subtree: performance improvement for
> > finding unexpected parent commits - 2018-10-12) then, although I can't
> > see why.
> >
> > I don't think we have any release coming up soon, so maybe Roger can
> > still have some time to fix it instead of a just a revert.
> 
> In a (private) Email to me, he indicated that had no time for a fix.  Maybe he
> can speak up here?
> 
> In any case, if I can help testing, I am in.  I just don't know the inner workings
> of git-subtree.sh (I am a mere user of it...)

TL;DR: Current script uses git rev-list to retrieve all commits which are reachable from HEAD but not from <abc123>. Is there a syntax that will instead return all commits reachable from HEAD, but stop traversing when <abc123> is encountered? It's a subtle distinction, but important.

Long version:

While I don't have a lot of time to troubleshoot this as deeply as I'd like, I've at least tried to give it a little bit of thought. If we assume it's an out of memory problem, that's likely due to commit 315a84f9aa0e2e629b0680068646b0032518ebed, which introduced a recursive history search. It might be possible to refactor that logic to avoid the recursive approach, but shell scripting is not my native tongue, and I didn't immediately see the right way to do it. I can explain the problem the commit was trying to address, and if anyone has suggestions about how to better fix the problem, I'd love to hear them.

This part of the subtree script is attempting to identify all mainline commits which need to be evaluated for subtree content. When subtree rejoin commits are included, it short-circuits this process by identifying all commits which can be reached from the current HEAD, but which CANNOT be reached from any of the known subtree rejoin points. While this does simplify the number of commits that need to be evaluated, it causes the script to miss commits that do in fact need to be evaluated when merges bypass those rejoins. Consider the following:

(Apologies for the crude diagram, I'm having to use Outlook at work and have no idea if this will format properly.)

[A]--(B)--(D)--[E]---(G)
 \       /          /
  ---(C)-------(F)--

In this graph, [A] represents an initial commit, which is a subtree rejoin (presumably mapping to subtree commit A'). From this, commits B and C were created, then those two were merged as commit D, and a subtree operation was performed to push these changes to the subtree. The result was rejoined as commit [E] (mapping as subtree commit E'). We then have a commit F with a single parent of C, and a merge of E and F into commit G.

At this point, I want to push the commits through G back to the subtree. The original script used git rev-list to list all commits which can be reached from G, but which cannot be reached from known split points A and E. This results in a commit list of G, F. It then looks at commit F to determine how it fits into the subtree, sees that it has a parent C, and looks for what subtree commit C maps to. Unfortunately C wasn't included in the list of commits to evaluate, so that check failed, and the script treats F as an initial commit. The subtree generated then includes F' as a complete commit, unrelated to any previous commits in the subtree, rather than connecting it to C' as a parent.

The ideal solution to this would be a call to git rev-list (or another tool, if appropriate) that could actually generate a commit list of G, F, C in the above diagram. Rather than being "All commits I can reach from G but not from A or E", what it really needs is "All commits I can reach from G, but stop traversing the history any time I reach A or E". If someone with better knowledge can provide that command, I think we could restore the script to a non-recursive approach and give better information about the true progress of the command, rather than having to rely on the "extra count" I introduced to keep track of how many unexpected commits are being recursively evaluated because the initial rev-list didn't include them.

As an aside, there's still another problem that we've run into which is forcing us to still run a custom version of the subtree script. There's a section in the script commented "ugly.  is there no better way to tell if this is a subtree vs. a mainline commit?  Does it matter?" Unfortunately, that section isn't working properly in our repo at this point, and I've had to include a VERY ugly check that looks for the existence of a known file in our specific mainline repo to make this determination. I'm still trying to sort through a better approach that might work for others, but have come up blank to this point. If anyone else has thoughts, I'd really love to hear those as well.

Anyway, sorry for the novel, but I wanted to brain dump what I've learned in this process. If anyone can suggest a syntax for rev-list that gives the appropriate set of commits, I'd love to get that in place; otherwise we could definitely try to unwind the recursion approach, but my shell script naivety may get me in trouble.

-- 
Roger

^ permalink raw reply

* Re: [PATCH v3] sha1-name.c: Fix handling of revisions that contain paths with brackets
From: Junio C Hamano @ 2019-01-02 20:35 UTC (permalink / raw)
  To: Stan Hu; +Cc: git
In-Reply-To: <20181229054357.11716-1-stanhu@gmail.com>

Stan Hu <stanhu@gmail.com> writes:

> -	if (len < 4 || name[len-1] != '}')
> +	if (len < 4)
>  		return -1;

The original does not expect any string given after the ^{<type>}
dereferencer, like :<path>, and that is why this function returns
very early for anything when name[len-1] is not a closing brace.

We do not do that anymore, because...?

This gives me a nagging feeling that the patch is barking up a wrong
tree.  Consider what happens when a path that does not have any
funny characters, e.g. "v1.4.0^{tree}:a/b/c", is given from the top
level of the request chain (e.g. "rev-parse v1.4.0^{tree}:a/b/c")?
The caller must be feeding this function "v1.4.0^{tree}" after
finding the colon before the path "a/b/c" and setting len to point
at the colon---otherwise we won't be checking for "}" at the end
like this.

When given "master^{tree}:a/b/{c}", wouldn't the caller be doing
the same stripping to find the colon and calling this function with
len pointing at the colon before the path (i.e. "master^{tree}")?

To put it another way, it is OK if this patch wants to shift the
responsibility of finding the colon that separates the tree-ish part
and the path part from the caller to this function, but then I would
expect changes to the caller, because now the caller does not have
to find ":a/b/c" in "v1.4.0^{tree}:a/b/c" and set up the len before
calling this function.  Why can the resulting code after applying
this patch be correct without such a matching change? 

>  
> -	for (sp = name + len - 1; name <= sp; sp--) {
> +	for (sp = name; sp < name + len; sp++) {
>  		int ch = *sp;
> -		if (ch == '{' && name < sp && sp[-1] == '^')
> +		if (ch == '^' && sp < name + len && sp[1] == '{')
>  			break;
>  	}

We used to scan from the tail (as we expected that the caller gives
us a (name, len) that ends with "^{<type>}".  The updated code
instead scans from the front, looking for "^{".  I do not
particularly mind the change of strategy, as long as it is correctly
done, but I suspect the function will stay simpler if the callee is
fixed instead.

The only troublesome case is the REV^{/...} syntax.  For example,

	HEAD^{/^Git 2.0}^{tree}:t/

would want to find the commit "HEAD^{/^Git 2.0}", peel it down to a
tree object with "^{tree}" and then take its "t/" subtree.  It used
to be that the caller (get_oid_with_context_1() has a loop that
finds a colon outside "{...}" and that finds ":t/" before calling
get_oid_1()) was responsible to give us "HEAD^{/^Git 2.0}^{tree}" by
stripping ":t/", and I presume that it is still happening, but the
above loop would terminate upon seeing "^{" immediately after HEAD,
without even realizing that it has ^{tree} after it.

> -	if (sp <= name)
> +
> +	if (sp == name + len)
>  		return -1;

> -	sp++; /* beginning of type name, or closing brace for empty */
> -	if (starts_with(sp, "commit}"))
> +	sp += 2; /* beginning of type name, or closing brace for empty */

And this comment indicates that the code did not expect to see ^{/^Git 2.0}
before ^{tree}.  

I do not quite follow.  How can this patch be correct?

Puzzled.

$ git rev-parse "master^{/^Git 2.0}^{tree}"
bc6ce29d1ec757d9d036532531a1046db4da0d96
$ ./git rev-parse "master^{/^Git 2.0}^{tree}"
master^{/^Git 2.0}^{tree}
fatal: ambiguous argument 'master^{/^Git 2.0}^{tree}': unknown revision or path not in the working tree.

^ permalink raw reply

* Re: [PATCH] log: add %S option (like --source) to log --format
From: Junio C Hamano @ 2019-01-02 19:33 UTC (permalink / raw)
  To: issac.trotts; +Cc: git, noemi, Issac Trotts
In-Reply-To: <20181231045335.12883-1-issac.trotts@gmail.com>

issac.trotts@gmail.com writes:

> From: Issac Trotts <issac.trotts@gmail.com>

I think you want to have 

	From: Issac Trotts <issactrotts@google.com>

instead, so that the authorship actually matches your sign-off.

> -	if (source) {
> +	if (source || (rev->pretty_given && (rev->commit_format == CMIT_FMT_USERFORMAT) && w.source)) {

I am not sure why the above is not simply

	if (source || w.source) {

but needs to check pretty-given etc.  Care to explain why?

Thanks.

^ permalink raw reply

* Re: [PATCH] diff: add support for reading files literally with --no-index
From: Junio C Hamano @ 2019-01-02 18:56 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Duy Nguyen
In-Reply-To: <20181220002610.43832-1-sandals@crustytoothpaste.net>

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> +test_expect_success SYMLINKS 'diff --no-index --literally with symlinks' '
> +	test_write_lines a b c >f1 &&
> +	test_write_lines a d c >f2 &&
> +	ln -s f1 s1 &&
> +	ln -s f2 s2 &&
> +	cat >expect <<-\EOF &&
> +	diff --git a/s1 b/s2
> +	--- a/s1
> +	+++ b/s2
> +	@@ -1,3 +1,3 @@
> +	 a
> +	-b
> +	+d
> +	 c
> +	EOF
> +	test_expect_code 1 git diff --no-index --literally s1 s2 >actual &&
> +	test_cmp expect actual
> +'

This is good as a goal, but the implementation seems to be overly
eager to dereference any symlink or non-regular file found in any
level of recursion.  The use case presented as the justification in
the proposed log message, and the explanation in the documentation,
suggests that only the paths given from the command line are treated
this way.

It may make sense to do this as two orthgonal flags.  Dereference
symlinks and named pipes given from the command line is one use
case.  Dereference any symlinks encountered during recursion is
another.  And the latter might be useful even inside a repository
as an option, even though the former would never make sense unless
running in --no-index mode.

Thanks.



^ permalink raw reply

* Re: [PATCH] test-lib: check Bash version for '-x' without using shell arrays
From: Carlo Arenas @ 2019-01-02 18:31 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: SZEDER Gábor, Junio C Hamano, Max Kirillov, Jeff King,
	Git List
In-Reply-To: <CAPig+cRLqy3Zdt4rUbMXC2dd1bhOXYZ=8HEn++zsb=Rau7zADw@mail.gmail.com>

Tested-By: Carlo Marcelo Arenas Belón <carenas@gmail.com>

^ permalink raw reply

* Re: [PATCH 3/3] t0006-date.sh: add `human` date format tests.
From: Junio C Hamano @ 2019-01-02 18:15 UTC (permalink / raw)
  To: Stephen P. Smith
  Cc: git, Linus Torvalds, Ævar Arnfjörð Bjarmason
In-Reply-To: <20181231003150.8031-4-ischis2@cox.net>

"Stephen P. Smith" <ischis2@cox.net> writes:

> +# Subtract some known constant time and look for expected field format
> +TODAY_REGEX='5 hours ago'
> +THIS_YEAR_REGEX='[A-Z][a-z][a-z] [A-Z][a-z][a-z] [0-9]* [012][0-9]:[0-6][0-9]'
> +MORE_THAN_A_YEAR_REGEX='[A-Z][a-z][a-z] [A-Z][a-z][a-z] [0-9]* [0-9][0-9][0-9][0-9]'
> +check_human_date "$(($(date +%s)-18000)) +0200" $TODAY_REGEX # 5 hours ago
> +check_human_date "$(($(date +%s)-432000)) +0200" $THIS_YEAR_REGEX  # 5 days ago

'date +%s' is used everywhere in this patch but has never been used
in our test suite before.  It is not portable.

We perhaps can use "test-tool date timestamp", like so

	check_human_date $(test-tool date timestamp "18000 seconds ago") ...

or moving the part that munges 18000 into the above form inside
check_human_date helper function, e.g.

	check_human_date () {
		commit_date=$(test-tool date timestamp "$1 seconds ago")
		commit_date="$commit_date +0200"
                expect=$2
		...
	}

which would let us write

	check_human_date 432000" $THIS_YEAR_REGEX  # 5 days ago

> +check_human_date() {
> +	commit_date=$1
> +	expect=$2
> +	test_expect_success "$commit_date" "
> +		echo $expect $commit_date >dates && 
> +		git add dates &&
> +		git commit -m 'Expect String' --date=\"$commit_date\" dates &&
> +		git log -1 --date=human | grep \"^Date:\" >actual &&
> +		grep \"$expect\" actual
> +"

As the body of the test_expect_success helper is eval'ed, variables
$commit_date and $expect should be visible to it, without turning
them into values before executing test_expect_success function,
i.e.

	test_expect_success "$commit_date" '
		echo "$expect $commit_date" >dates &&
		...
		git commit -m "Expect String" --date="$commit_date" dates &&
		git show -s --date=human | grep '^Date:" >actual &&
		grep "$expect" actual
	'

which would reduce the need for unreadable backslashes.

Instead of duplicating, perhaps move this to a more common place?
Would it make sense to make it "check_date_format ()" helper by
passing another argument to parameterize --date=human part

> +check_human_date() {
> +	commit_date=$1
> +	expect=$2
> +	test_expect_success "$commit_date" "
> +		echo $expect $commit_date >dates && 
> +		git add dates &&
> +		git commit -m 'Expect String' --date=\"$commit_date\" dates &&
> +		git show --date=human | grep \"^Date:\" >actual &&

Using "show" here is much better than "log -1" above; using "show
-s" would be even better.

> +		grep \"$expect\" actual
> +"
> +}
> +
> +TODAY_REGEX='[A-Z][a-z][a-z] [012][0-9]:[0-6][0-9] .0200'
> +THIS_YEAR_REGEX='[A-Z][a-z][a-z] [A-Z][a-z][a-z] [0-9]* [012][0-9]:[0-6][0-9]'
> +MORE_THAN_A_YEAR_REGEX='[A-Z][a-z][a-z] [A-Z][a-z][a-z] [0-9]* [0-9][0-9][0-9][0-9]'
> +check_human_date "$(($(date +%s)-18000)) +0200" $TODAY_REGEX # 5 hours ago
> +check_human_date "$(($(date +%s)-432000)) +0200" $THIS_YEAR_REGEX  # 5 days ago
> +check_human_date "$(($(date +%s)-1728000)) +0200" $THIS_YEAR_REGEX # 3 weeks ago
> +check_human_date "$(($(date +%s)-13000000)) +0200" $THIS_YEAR_REGEX # 5 months ago
> +check_human_date "$(($(date +%s)-31449600)) +0200" $THIS_YEAR_REGEX # 12 months ago
> +check_human_date "$(($(date +%s)-37500000)) +0200" $MORE_THAN_A_YEAR_REGEX # 1 year, 2 months ago
> +check_human_date "$(($(date +%s)-55188000)) +0200" $MORE_THAN_A_YEAR_REGEX # 1 year, 9 months ago
> +check_human_date "$(($(date +%s)-630000000)) +0200" $MORE_THAN_A_YEAR_REGEX # 20 years ago
> +
> +
>  test_done

^ permalink raw reply

* Re: [PATCH] test-lib: check Bash version for '-x' without using shell arrays
From: Eric Sunshine @ 2019-01-02 18:05 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Junio C Hamano, Max Kirillov, Carlo Arenas, Jeff King, Git List
In-Reply-To: <20190101231949.8184-1-szeder.dev@gmail.com>

On Tue, Jan 1, 2019 at 6:20 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> [...]
> To my understanding both shells are right and conform to POSIX,
> because the standard allows both behavior by stating the following

s/behavior/behaviors/

> under '2.8.1 Consequences of Shell Errors':
>
> Reported-by: Max Kirillov <max@max630.net>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>

^ permalink raw reply

* Re: [PATCH] banned.h: mark strncat() as banned
From: Eric Sunshine @ 2019-01-02 18:00 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jeff King, Junio C Hamano, Git List
In-Reply-To: <20190102093846.6664-1-e@80x24.org>

On Wed, Jan 2, 2019 at 4:38 AM Eric Wong <e@80x24.org> wrote:
>
> strncat() has the same quadratic behavior as strcat() and is
> difficult-to-read and bug-prone.  While it hasn't yet been a
> problem in git iself, strncat() found it's way into 'master'

s/iself/itself/

> of cgit and caused segfaults on my system.
>
> Signed-off-by: Eric Wong <e@80x24.org>

^ permalink raw reply

* Re: Git extra hook, pre-upload
From: Junio C Hamano @ 2019-01-02 17:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Xheroz 128, git
In-Reply-To: <87k1jqem1p.fsf@evledraar.gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We do not have such a pre-upload hook, but could have one. There's an
> old thread from 2011 detailing some potential downsides:
>
> https://public-inbox.org/git/CAMK1S_jaEWV=F6iHKZw_6u5ncDW0bPosNx-03W9bOLOfEEEY1Q@mail.gmail.com/
>
> FWIW I think most servers who find themselves needing such a hook use it
> to e.g. log how many fetches a given repository might serve, and end up
> instead wrapping git commands in some custom shell.

Thanks for a concise summary.


^ permalink raw reply

* Re: [PATCH] test-lib: check Bash version for '-x' without using shell arrays
From: Junio C Hamano @ 2019-01-02 16:46 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: SZEDER Gábor, Max Kirillov, Carlo Arenas, Jeff King, git
In-Reply-To: <a82251fa-38e1-233e-50d3-3ed4850b4e11@kdbg.org>

Johannes Sixt <j6t@kdbg.org> writes:

> Would it perhaps be simpler to just hide the syntax behind eval? Like
>
>  		if test -z "$test_untraceable" || {
>  		     test -n "$BASH_VERSION" && eval '
> 		       test ${BASH_VERSINFO[0]} -gt 4 || {
> 			 test ${BASH_VERSINFO[0]} -eq 4 &&
> 			 test ${BASH_VERSINFO[1]} -ge 1
> 		       }
>  		     '

It indeed is somewhat easier to see that your version does exactly
the same as the original than the posted patch, but the pattern
substitutions are not all that bad, either.

^ permalink raw reply

* Re: Regression `git checkout $rev -b branch` while in a `--no-checkout` clone does not check out files
From: Anthony Sottile @ 2019-01-02 16:18 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List
In-Reply-To: <CACsJy8B=-V7XY+=5pwwSzg8B6Goa55DPPU3ErgjOEsSJVni18Q@mail.gmail.com>

On Wed, Jan 2, 2019 at 3:08 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Wed, Jan 2, 2019 at 6:36 AM Anthony Sottile <asottile@umich.edu> wrote:
> >
> > Here's a simple regression test -- haven't had time to bisect this
>
> I can't reproduce with either 2.20.0, 2.20.1 or 'master'. It would be
> great if you could bisect this.
>
> There are no suspicious commits from 2.27.1 touching
> builtin/checkout.c. Though there are some more changes in
> unpack-trees.c that might cause this (big wild guess).
> --
> Duy


heated a small room but here's the results of the bisect!

fa655d8411cc2d7ffcf898e53a1493c737d7de68 is the first bad commit
commit fa655d8411cc2d7ffcf898e53a1493c737d7de68
Author: Ben Peart <Ben.Peart@microsoft.com>
Date:   Thu Aug 16 18:27:11 2018 +0000

    checkout: optimize "git checkout -b <new_branch>"

    Skip merging the commit, updating the index and working directory if and
    only if we are creating a new branch via "git checkout -b <new_branch>."
    Any other checkout options will still go through the former code path.

    If sparse_checkout is on, require the user to manually opt in to this
    optimzed behavior by setting the config setting checkout.optimizeNewBranch
    to true as we will no longer update the skip-worktree bit in the index, nor
    add/remove files in the working directory to reflect the current sparse
    checkout settings.

    For comparison, running "git checkout -b <new_branch>" on a large
repo takes:

    14.6 seconds - without this patch
    0.3 seconds - with this patch

    Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

:040000 040000 817bfb8ef961545a554005d42967b5ab7cfdb041
e57e576d0d4fb7f25c12a5dcc7651ef6698e961b M    Documentation
:040000 040000 c089f91f4532caa2a17e4f10a1a7ed3aa5d2023c
7cf16a0aa288f898a880ffefe82ee7506b83bef4 M    builtin
:040000 040000 adfdb05964a692e03ee07d2e43841f6304d996bd
8681416093802b9051599ebea8f63f5a45968e6f M    t
bisect run success


Here's the script and invocations:

```
#!/usr/bin/env bash
set -euxo pipefail

rm -rf "$PWD/prefix"
make prefix="$PWD/prefix" -j8 install
export PATH="$PWD/prefix/bin:$PATH"

rm -rf src dest

git --version

git init src
echo hi > src/a
git -C src add .
git -C src commit -m "initial commit"
rev="$(git -C src rev-parse HEAD)"

git clone --no-checkout src dest
git -C dest checkout "$rev" -b branch
test -f dest/a

: 'SUCCESS!'
```

```
git bisect begin
git bisect bad HEAD
git bisect good v2.17.1
git bisect run ./bisect.sh
```

Anthony

^ permalink raw reply

* Re: [PATCH 2/2] Rebase: Run post-checkout hook on checkout
From: Johannes Schindelin @ 2019-01-02 15:47 UTC (permalink / raw)
  To: Orgad Shaneh; +Cc: git
In-Reply-To: <20181229213759.12878-2-orgads@gmail.com>

Hi Orgad,

On Sat, 29 Dec 2018, orgads@gmail.com wrote:

> From: Orgad Shaneh <orgads@gmail.com>
> 
> The scripted version of rebase used to run this hook on the initial
> checkout. The transition to built-in introduced a regression.
> 
> Signed-off-by: Orgad Shaneh <orgads@gmail.com>

ACK!

For lurkers: there was a "pre-review" at
https://github.com/git-for-windows/git/pull/1992

Ciao,
Dscho

> ---
>  builtin/rebase.c              | 12 ++++++++++--
>  t/t5403-post-checkout-hook.sh | 20 ++++++++++++++++++++
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index b5c99ec10c..8402765a79 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -530,6 +530,7 @@ static int run_specific_rebase(struct rebase_options *opts)
>  
>  #define RESET_HEAD_DETACH (1<<0)
>  #define RESET_HEAD_HARD (1<<1)
> +#define RESET_HEAD_RUN_POST_CHECKOUT_HOOK (1<<2)
>  
>  static int reset_head(struct object_id *oid, const char *action,
>  		      const char *switch_to_branch, unsigned flags,
> @@ -537,6 +538,7 @@ static int reset_head(struct object_id *oid, const char *action,
>  {
>  	unsigned detach_head = flags & RESET_HEAD_DETACH;
>  	unsigned reset_hard = flags & RESET_HEAD_HARD;
> +	unsigned run_hook = flags & RESET_HEAD_RUN_POST_CHECKOUT_HOOK;
>  	struct object_id head_oid;
>  	struct tree_desc desc[2] = { { NULL }, { NULL } };
>  	struct lock_file lock = LOCK_INIT;
> @@ -636,6 +638,10 @@ static int reset_head(struct object_id *oid, const char *action,
>  			ret = update_ref(reflog_head, "HEAD", oid, NULL, 0,
>  					 UPDATE_REFS_MSG_ON_ERR);
>  	}
> +	if (run_hook)
> +		run_hook_le(NULL, "post-checkout",
> +			    oid_to_hex(orig ? orig : &null_oid),
> +			    oid_to_hex(oid), "1", NULL);
>  
>  leave_reset_head:
>  	strbuf_release(&msg);
> @@ -1465,7 +1471,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  					    getenv(GIT_REFLOG_ACTION_ENVIRONMENT),
>  					    options.switch_to);
>  				if (reset_head(&oid, "checkout",
> -					       options.head_name, 0,
> +					       options.head_name,
> +					       RESET_HEAD_RUN_POST_CHECKOUT_HOOK,
>  					       NULL, buf.buf) < 0) {
>  					ret = !!error(_("could not switch to "
>  							"%s"),
> @@ -1539,7 +1546,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  	strbuf_addf(&msg, "%s: checkout %s",
>  		    getenv(GIT_REFLOG_ACTION_ENVIRONMENT), options.onto_name);
>  	if (reset_head(&options.onto->object.oid, "checkout", NULL,
> -		       RESET_HEAD_DETACH, NULL, msg.buf))
> +		       RESET_HEAD_DETACH | RESET_HEAD_RUN_POST_CHECKOUT_HOOK,
> +		       NULL, msg.buf))
>  		die(_("Could not detach HEAD"));
>  	strbuf_release(&msg);
>  
> diff --git a/t/t5403-post-checkout-hook.sh b/t/t5403-post-checkout-hook.sh
> index 1d15a1031f..a539ffc080 100755
> --- a/t/t5403-post-checkout-hook.sh
> +++ b/t/t5403-post-checkout-hook.sh
> @@ -13,6 +13,8 @@ test_expect_success setup '
>  	EOF
>  	test_commit one &&
>  	test_commit two &&
> +	test_commit rebase-on-me &&
> +	git reset --hard HEAD^ &&
>  	test_commit three
>  '
>  
> @@ -44,6 +46,24 @@ test_expect_success 'post-checkout receives the right args when not switching br
>  	test $old = $new && test $flag = 0
>  '
>  
> +test_expect_success 'post-checkout is triggered on rebase' '
> +	test_when_finished "rm -f .git/post-checkout.args" &&
> +	git checkout -b rebase-test master &&
> +	rm -f .git/post-checkout.args &&
> +	git rebase rebase-on-me &&
> +	read old new flag <.git/post-checkout.args &&
> +	test $old != $new && test $flag = 1
> +'
> +
> +test_expect_success 'post-checkout is triggered on rebase with fast-forward' '
> +	test_when_finished "rm -f .git/post-checkout.args" &&
> +	git checkout -b ff-rebase-test rebase-on-me^ &&
> +	rm -f .git/post-checkout.args &&
> +	git rebase rebase-on-me &&
> +	read old new flag <.git/post-checkout.args &&
> +	test $old != $new && test $flag = 1
> +'
> +
>  test_expect_success 'post-checkout hook is triggered by clone' '
>  	mkdir -p templates/hooks &&
>  	write_script templates/hooks/post-checkout <<-\EOF &&
> -- 
> 2.20.1
> 
> 

^ permalink raw reply

* [PATCH v17 6/7] bisect--helper: `get_terms` & `bisect_terms` shell function in C
From: Pranit Bauva via GitGitGadget @ 2019-01-02 15:38 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Pranit Bauva
In-Reply-To: <pull.101.v17.git.gitgitgadget@gmail.com>

From: Pranit Bauva <pranit.bauva@gmail.com>

Reimplement the `get_terms` and `bisect_terms` shell function in C and
add `bisect-terms` subcommand to `git bisect--helper` to call it from
git-bisect.sh .

Using `--bisect-terms` subcommand is a temporary measure to port shell
function in C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired but its implementation will
be called by some other methods.

Also use error() to report "no terms defined" and accordingly change the
test in t6030.

We need to use PARSE_OPT_KEEP_UNKNOWN here to allow for parameters that
look like options (e.g --term-good) but should not be parsed by
cmd_bisect__helper(). This change is safe because all other cmdmodes have
strict argc checks already.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
---
 builtin/bisect--helper.c    | 62 +++++++++++++++++++++++++++++++++++--
 git-bisect.sh               | 35 ++-------------------
 t/t6030-bisect-porcelain.sh |  2 +-
 3 files changed, 63 insertions(+), 36 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 38ae825b9a..d61edd3c91 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -23,6 +23,7 @@ static const char * const git_bisect_helper_usage[] = {
 	N_("git bisect--helper --bisect-write [--no-log] <state> <revision> <good_term> <bad_term>"),
 	N_("git bisect--helper --bisect-check-and-set-terms <command> <good_term> <bad_term>"),
 	N_("git bisect--helper --bisect-next-check <good_term> <bad_term> [<term>]"),
+	N_("git bisect--helper --bisect-terms [--term-good | --term-old | --term-bad | --term-new]"),
 	NULL
 };
 
@@ -339,6 +340,55 @@ static int bisect_next_check(const struct bisect_terms *terms,
 	return retval;
 }
 
+static int get_terms(struct bisect_terms *terms)
+{
+	struct strbuf str = STRBUF_INIT;
+	FILE *fp = NULL;
+	int res = 0;
+
+	fp = fopen(git_path_bisect_terms(), "r");
+	if (!fp) {
+		res = -1;
+		goto finish;
+	}
+
+	free_terms(terms);
+	strbuf_getline_lf(&str, fp);
+	terms->term_bad = strbuf_detach(&str, NULL);
+	strbuf_getline_lf(&str, fp);
+	terms->term_good = strbuf_detach(&str, NULL);
+
+finish:
+	if (fp)
+		fclose(fp);
+	strbuf_release(&str);
+	return res;
+}
+
+static int bisect_terms(struct bisect_terms *terms, const char *option)
+{
+	if (get_terms(terms))
+		return error(_("no terms defined"));
+
+	if (option == NULL) {
+		printf(_("Your current terms are %s for the old state\n"
+			 "and %s for the new state.\n"),
+		       terms->term_good, terms->term_bad);
+		return 0;
+	}
+	if (one_of(option, "--term-good", "--term-old", NULL))
+		printf("%s\n", terms->term_good);
+	else if (one_of(option, "--term-bad", "--term-new", NULL))
+		printf("%s\n", terms->term_bad);
+	else
+		return error(_("invalid argument %s for 'git bisect terms'.\n"
+			       "Supported options are: "
+			       "--term-good|--term-old and "
+			       "--term-bad|--term-new."), option);
+
+	return 0;
+}
+
 int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 {
 	enum {
@@ -349,7 +399,8 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		BISECT_RESET,
 		BISECT_WRITE,
 		CHECK_AND_SET_TERMS,
-		BISECT_NEXT_CHECK
+		BISECT_NEXT_CHECK,
+		BISECT_TERMS
 	} cmdmode = 0;
 	int no_checkout = 0, res = 0, nolog = 0;
 	struct option options[] = {
@@ -369,6 +420,8 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 			 N_("check and set terms in a bisection state"), CHECK_AND_SET_TERMS),
 		OPT_CMDMODE(0, "bisect-next-check", &cmdmode,
 			 N_("check whether bad or good terms exist"), BISECT_NEXT_CHECK),
+		OPT_CMDMODE(0, "bisect-terms", &cmdmode,
+			 N_("print out the bisect terms"), BISECT_TERMS),
 		OPT_BOOL(0, "no-checkout", &no_checkout,
 			 N_("update BISECT_HEAD instead of checking out the current commit")),
 		OPT_BOOL(0, "no-log", &nolog,
@@ -378,7 +431,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 	struct bisect_terms terms = { .term_good = NULL, .term_bad = NULL };
 
 	argc = parse_options(argc, argv, prefix, options,
-			     git_bisect_helper_usage, 0);
+			     git_bisect_helper_usage, PARSE_OPT_KEEP_UNKNOWN);
 
 	if (!cmdmode)
 		usage_with_options(git_bisect_helper_usage, options);
@@ -419,6 +472,11 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		set_terms(&terms, argv[1], argv[0]);
 		res = bisect_next_check(&terms, argc == 3 ? argv[2] : NULL);
 		break;
+	case BISECT_TERMS:
+		if (argc > 1)
+			return error(_("--bisect-terms requires 0 or 1 argument"));
+		res = bisect_terms(&terms, argc == 1 ? argv[0] : NULL);
+		break;
 	default:
 		return error("BUG: unknown subcommand '%d'", cmdmode);
 	}
diff --git a/git-bisect.sh b/git-bisect.sh
index 5ef3e25621..bdb614e3c2 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -355,7 +355,7 @@ bisect_replay () {
 		"$TERM_GOOD"|"$TERM_BAD"|skip)
 			git bisect--helper --bisect-write "$command" "$rev" "$TERM_GOOD" "$TERM_BAD" || exit;;
 		terms)
-			bisect_terms $rev ;;
+			git bisect--helper --bisect-terms $rev || exit;;
 		*)
 			die "$(gettext "?? what are you talking about?")" ;;
 		esac
@@ -439,37 +439,6 @@ get_terms () {
 	fi
 }
 
-bisect_terms () {
-	get_terms
-	if ! test -s "$GIT_DIR/BISECT_TERMS"
-	then
-		die "$(gettext "no terms defined")"
-	fi
-	case "$#" in
-	0)
-		gettextln "Your current terms are $TERM_GOOD for the old state
-and $TERM_BAD for the new state."
-		;;
-	1)
-		arg=$1
-		case "$arg" in
-			--term-good|--term-old)
-				printf '%s\n' "$TERM_GOOD"
-				;;
-			--term-bad|--term-new)
-				printf '%s\n' "$TERM_BAD"
-				;;
-			*)
-				die "$(eval_gettext "invalid argument \$arg for 'git bisect terms'.
-Supported options are: --term-good|--term-old and --term-bad|--term-new.")"
-				;;
-		esac
-		;;
-	*)
-		usage ;;
-	esac
-}
-
 case "$#" in
 0)
 	usage ;;
@@ -500,7 +469,7 @@ case "$#" in
 	run)
 		bisect_run "$@" ;;
 	terms)
-		bisect_terms "$@" ;;
+		git bisect--helper --bisect-terms "$@" || exit;;
 	*)
 		usage ;;
 	esac
diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh
index f84ff941c3..55835ee4a4 100755
--- a/t/t6030-bisect-porcelain.sh
+++ b/t/t6030-bisect-porcelain.sh
@@ -802,7 +802,7 @@ test_expect_success 'bisect terms needs 0 or 1 argument' '
 	test_must_fail git bisect terms only-one &&
 	test_must_fail git bisect terms 1 2 &&
 	test_must_fail git bisect terms 2>actual &&
-	echo "no terms defined" >expected &&
+	echo "error: no terms defined" >expected &&
 	test_i18ncmp expected actual
 '
 
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v17 3/7] wrapper: move is_empty_file() and rename it as is_empty_or_missing_file()
From: Pranit Bauva via GitGitGadget @ 2019-01-02 15:38 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Pranit Bauva
In-Reply-To: <pull.101.v17.git.gitgitgadget@gmail.com>

From: Pranit Bauva <pranit.bauva@gmail.com>

is_empty_file() can help to refactor a lot of code. This will be very
helpful in porting "git bisect" to C.

Suggested-by: Torsten Bögershausen <tboegi@web.de>
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
---
 builtin/am.c | 20 ++------------------
 cache.h      |  3 +++
 wrapper.c    | 13 +++++++++++++
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 8f27f3375b..310eefe9e8 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -34,22 +34,6 @@
 #include "packfile.h"
 #include "repository.h"
 
-/**
- * Returns 1 if the file is empty or does not exist, 0 otherwise.
- */
-static int is_empty_file(const char *filename)
-{
-	struct stat st;
-
-	if (stat(filename, &st) < 0) {
-		if (errno == ENOENT)
-			return 1;
-		die_errno(_("could not stat %s"), filename);
-	}
-
-	return !st.st_size;
-}
-
 /**
  * Returns the length of the first line of msg.
  */
@@ -1220,7 +1204,7 @@ static int parse_mail(struct am_state *state, const char *mail)
 		goto finish;
 	}
 
-	if (is_empty_file(am_path(state, "patch"))) {
+	if (is_empty_or_missing_file(am_path(state, "patch"))) {
 		printf_ln(_("Patch is empty."));
 		die_user_resolve(state);
 	}
@@ -1803,7 +1787,7 @@ static void am_run(struct am_state *state, int resume)
 		resume = 0;
 	}
 
-	if (!is_empty_file(am_path(state, "rewritten"))) {
+	if (!is_empty_or_missing_file(am_path(state, "rewritten"))) {
 		assert(state->rebasing);
 		copy_notes_for_rebase(state);
 		run_post_rewrite_hook(state);
diff --git a/cache.h b/cache.h
index ca36b44ee0..9ff7e56b74 100644
--- a/cache.h
+++ b/cache.h
@@ -1788,4 +1788,7 @@ void safe_create_dir(const char *dir, int share);
  */
 extern int print_sha1_ellipsis(void);
 
+/* Return 1 if the file is empty or does not exists, 0 otherwise. */
+extern int is_empty_or_missing_file(const char *filename);
+
 #endif /* CACHE_H */
diff --git a/wrapper.c b/wrapper.c
index e4fa9d84cd..ea3cf64d4c 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -690,3 +690,16 @@ int xgethostname(char *buf, size_t len)
 		buf[len - 1] = 0;
 	return ret;
 }
+
+int is_empty_or_missing_file(const char *filename)
+{
+	struct stat st;
+
+	if (stat(filename, &st) < 0) {
+		if (errno == ENOENT)
+			return 1;
+		die_errno(_("could not stat %s"), filename);
+	}
+
+	return !st.st_size;
+}
-- 
gitgitgadget


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox