Git development
 help / color / mirror / Atom feed
* [PATCH v2 5/6] pack-objects: create pack.useSparse setting
From: Derrick Stolee via GitGitGadget @ 2018-11-29 14:24 UTC (permalink / raw)
  To: git; +Cc: peff, avarab, jrnieder, Junio C Hamano, Derrick Stolee
In-Reply-To: <pull.89.v2.git.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

The '--sparse' flag in 'git pack-objects' changes the algorithm
used to enumerate objects to one that is faster for individual
users pushing new objects that change only a small cone of the
working directory. The sparse algorithm is not recommended for a
server, which likely sends new objects that appear across the
entire working directory.

Create a 'pack.useSparse' setting that enables this new algorithm.
This allows 'git push' to use this algorithm without passing a
'--sparse' flag all the way through four levels of run_command()
calls.

If the '--no-sparse' flag is set, then this config setting is
overridden.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/pack-objects.c         |  4 ++++
 t/t5322-pack-objects-sparse.sh | 15 +++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7d5b0735e3..124b1bafc4 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2711,6 +2711,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		use_bitmap_index_default = git_config_bool(k, v);
 		return 0;
 	}
+	if (!strcmp(k, "pack.usesparse")) {
+		sparse = git_config_bool(k, v);
+		return 0;
+	}
 	if (!strcmp(k, "pack.threads")) {
 		delta_search_threads = git_config_int(k, v);
 		if (delta_search_threads < 0)
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
index 45dba6e014..8f5699bd91 100755
--- a/t/t5322-pack-objects-sparse.sh
+++ b/t/t5322-pack-objects-sparse.sh
@@ -121,4 +121,19 @@ test_expect_success 'sparse pack-objects' '
 	test_cmp expect_sparse_objects.txt sparse_objects.txt
 '
 
+test_expect_success 'pack.useSparse enables algorithm' '
+	git config pack.useSparse true &&
+	git pack-objects --stdout --revs <packinput.txt >sparse.pack &&
+	git index-pack -o sparse.idx sparse.pack &&
+	git show-index <sparse.idx | awk "{print \$2}" >sparse_objects.txt &&
+	test_cmp expect_sparse_objects.txt sparse_objects.txt
+'
+
+test_expect_success 'pack.useSparse overridden' '
+	git pack-objects --stdout --revs --no-sparse <packinput.txt >sparse.pack &&
+	git index-pack -o sparse.idx sparse.pack &&
+	git show-index <sparse.idx | awk "{print \$2}" >sparse_objects.txt &&
+	test_cmp expect_objects.txt sparse_objects.txt
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* Re: [PATCH 2/2] format-patch: allow for independent diff & range-diff options
From: Ævar Arnfjörð Bjarmason @ 2018-11-29 14:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Eric Sunshine
In-Reply-To: <nycvar.QRO.7.76.6.1811291307070.41@tvgsbejvaqbjf.bet>


On Thu, Nov 29 2018, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Thu, 29 Nov 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> On Thu, Nov 29 2018, Johannes Schindelin wrote:
>>
>> > On Wed, 28 Nov 2018, Ævar Arnfjörð Bjarmason wrote:
>> >
>> >> Change the semantics of the "--range-diff" option so that the regular
>> >> diff options can be provided separately for the range-diff and the
>> >> patch. This allows for supplying e.g. --range-diff-U0 and -U1 to
>> >> "format-patch" to provide different context for the range-diff and the
>> >> patch. This wasn't possible before.
>> >
>> > I really, really dislike the `--range-diff-<random-thing>`. We have
>> > precedent for passing optional arguments that are passed to some other
>> > command, so a much more logical and consistent convention would be to use
>> > `--range-diff[=<diff-option>..]`, allowing all of the diff options that
>> > you might want to pass to the outer diff in one go rather than having a
>> > lengthy string of `--range-diff-this` and `--range-diff-that` options.
>>
>> Where do we pass those sorts of arguments?
>>
>> Reasons I did it this way:
>>
>>  a) Passing it as one option will require the user to double-quote those
>>     options that take quoted arguments (e.g. --word-diff-regex), which I
>>     thought sucked more than the prefix. On the implementation side we
>>     couldn't leave the parsing of the command-line to the shell anymore.
>>
>>  b) I think people will want to tweak this very rarely, much more rarely
>>     than e.g. -U10 in format-patch itself, so having something long-ish
>>     doesn't sound bad.
>
> Hmm. I still don't like it. It sets a precedent, and we simply do not do
> it that way in other circumstances (most obvious would be the -X merge
> options). The more divergent user interfaces for the same sort of thing
> are, the more brain cycles you force users to spend on navigating said
> interfaces.

Yeah it sucks, I just think it sucks less than the alternative :)
I.e. I'm not picky about --range-diff-* prefix the name, but I think
doing our own shell parsing would be nasty.

>> > I only had time to skim the patch, and I have to wonder why you pass
>> > around full-blown `rev_info` structs for range diff (and with that really
>> > awful name `rd_rev`) rather than just the `diff_options` that you
>> > *actually* care about?
>>
>> Because setup_revisions() which does all the command-line parsing needs
>> a rev_info, so even if we only need the diffopt in the end we need to
>> initiate the whole thing.
>>
>> Suggestions for a better varibale name most welcome.
>
> `range_diff_revs`
>
> And you do not need to pass around the whole thing. You can easily pass
> `&range_diff_revs.diffopt`.
>
> Don't pass around what you do not need to pass around.

Ah, you mean internally in log.c, yes that makes sense. I thought you
meant just pass diffopt to setup_revisions() (which needs the containing
struct). Willdo.

> Ciao,
> Dscho
>
>>
>> > Ciao,
>> > Dscho
>> >
>> >>
>> >> Ever since the "--range-diff" option was added in
>> >> 31e2617a5f ("format-patch: add --range-diff option to embed diff in
>> >> cover letter", 2018-07-22) the "rev->diffopt" we pass down to the diff
>> >> machinery has been the one we get from "format-patch"'s own
>> >> setup_revisions().
>> >>
>> >> This sort of thing is unique among the log-like commands in
>> >> builtin/log.c, no command than format-patch will embed the output of
>> >> another log-like command. Since the "rev->diffopt" is reused we need
>> >> to munge it before we pass it to show_range_diff(). See
>> >> 43dafc4172 ("format-patch: don't include --stat with --range-diff
>> >> output", 2018-11-22) for a related regression fix which is being
>> >> mostly reverted here.
>> >>
>> >> Implementation notes: 1) We're not bothering with the full teardown
>> >> around die() and will leak memory, but it's too much boilerplate to do
>> >> all the frees with/without the die() and not worth it. 2) We call
>> >> repo_init_revisions() for "rd_rev" even though we could get away with
>> >> a shallow copy like the code we're replacing (and which
>> >> show_range_diff() itself does). This is to make this code more easily
>> >> understood.
>> >>
>> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> >> ---
>> >>  Documentation/git-format-patch.txt | 10 ++++++-
>> >>  builtin/log.c                      | 42 +++++++++++++++++++++++-------
>> >>  t/t3206-range-diff.sh              | 41 +++++++++++++++++++++++++++++
>> >>  3 files changed, 82 insertions(+), 11 deletions(-)
>> >>
>> >> diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
>> >> index aba4c5febe..6c048f415f 100644
>> >> --- a/Documentation/git-format-patch.txt
>> >> +++ b/Documentation/git-format-patch.txt
>> >> @@ -24,7 +24,8 @@ SYNOPSIS
>> >>  		   [--to=<email>] [--cc=<email>]
>> >>  		   [--[no-]cover-letter] [--quiet] [--notes[=<ref>]]
>> >>  		   [--interdiff=<previous>]
>> >> -		   [--range-diff=<previous> [--creation-factor=<percent>]]
>> >> +		   [--range-diff=<previous> [--creation-factor=<percent>]
>> >> +		      [--range-diff<common diff option>]]
>> >>  		   [--progress]
>> >>  		   [<common diff options>]
>> >>  		   [ <since> | <revision range> ]
>> >> @@ -257,6 +258,13 @@ feeding the result to `git send-email`.
>> >>  	creation/deletion cost fudge factor. See linkgit:git-range-diff[1])
>> >>  	for details.
>> >>
>> >> +--range-diff<common diff option>::
>> >> +	Other options prefixed with `--range-diff` are stripped of
>> >> +	that prefix and passed as-is to the diff machinery used to
>> >> +	generate the range-diff, e.g. `--range-diff-U0` and
>> >> +	`--range-diff--no-color`. This allows for adjusting the format
>> >> +	of the range-diff independently from the patch itself.
>> >> +
>> >>  --notes[=<ref>]::
>> >>  	Append the notes (see linkgit:git-notes[1]) for the commit
>> >>  	after the three-dash line.
>> >> diff --git a/builtin/log.c b/builtin/log.c
>> >> index 02d88fa233..7658e56ecc 100644
>> >> --- a/builtin/log.c
>> >> +++ b/builtin/log.c
>> >> @@ -1023,7 +1023,8 @@ static void show_diffstat(struct rev_info *rev,
>> >>  	fprintf(rev->diffopt.file, "\n");
>> >>  }
>> >>
>> >> -static void make_cover_letter(struct rev_info *rev, int use_stdout,
>> >> +static void make_cover_letter(struct rev_info *rev, struct rev_info *rd_rev,
>> >> +			      int use_stdout,
>> >>  			      struct commit *origin,
>> >>  			      int nr, struct commit **list,
>> >>  			      const char *branch_name,
>> >> @@ -1095,13 +1096,9 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
>> >>  	}
>> >>
>> >>  	if (rev->rdiff1) {
>> >> -		struct diff_options opts;
>> >> -		memcpy(&opts, &rev->diffopt, sizeof(opts));
>> >> -		opts.output_format &= ~(DIFF_FORMAT_DIFFSTAT | DIFF_FORMAT_SUMMARY);
>> >> -
>> >>  		fprintf_ln(rev->diffopt.file, "%s", rev->rdiff_title);
>> >>  		show_range_diff(rev->rdiff1, rev->rdiff2,
>> >> -				rev->creation_factor, 1, &opts);
>> >> +				rev->creation_factor, 1, &rd_rev->diffopt);
>> >>  	}
>> >>  }
>> >>
>> >> @@ -1485,6 +1482,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>> >>  	struct commit *commit;
>> >>  	struct commit **list = NULL;
>> >>  	struct rev_info rev;
>> >> +	struct rev_info rd_rev;
>> >>  	struct setup_revision_opt s_r_opt;
>> >>  	int nr = 0, total, i;
>> >>  	int use_stdout = 0;
>> >> @@ -1603,6 +1601,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>> >>  	init_log_defaults();
>> >>  	git_config(git_format_config, NULL);
>> >>  	repo_init_revisions(the_repository, &rev, prefix);
>> >> +	repo_init_revisions(the_repository, &rd_rev, prefix);
>> >>  	rev.commit_format = CMIT_FMT_EMAIL;
>> >>  	rev.expand_tabs_in_log_default = 0;
>> >>  	rev.verbose_header = 1;
>> >> @@ -1689,8 +1688,32 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>> >>  	rev.preserve_subject = keep_subject;
>> >>
>> >>  	argc = setup_revisions(argc, argv, &rev, &s_r_opt);
>> >> -	if (argc > 1)
>> >> -		die(_("unrecognized argument: %s"), argv[1]);
>> >> +	if (argc > 1) {
>> >> +		struct argv_array args = ARGV_ARRAY_INIT;
>> >> +		const char *prefix = "--range-diff";
>> >> +		int have_prefix = 0;
>> >> +
>> >> +		for (i = 0; i < argc; i++) {
>> >> +			struct strbuf sb = STRBUF_INIT;
>> >> +			char *str;
>> >> +
>> >> +			strbuf_addstr(&sb, argv[i]);
>> >> +			if (starts_with(argv[i], prefix)) {
>> >> +				have_prefix = 1;
>> >> +				strbuf_remove(&sb, 0, strlen(prefix));
>> >> +			}
>> >> +			str = strbuf_detach(&sb, NULL);
>> >> +			strbuf_release(&sb);
>> >> +
>> >> +			argv_array_push(&args, str);
>> >> +		}
>> >> +
>> >> +		if (!have_prefix)
>> >> +			die(_("unrecognized argument: %s"), argv[1]);
>> >> +		argc = setup_revisions(args.argc, args.argv, &rd_rev, NULL);
>> >> +		if (argc > 1)
>> >> +			die(_("unrecognized argument: %s"), argv[1]);
>> >> +	}
>> >>
>> >>  	if (rev.diffopt.output_format & DIFF_FORMAT_NAME)
>> >>  		die(_("--name-only does not make sense"));
>> >> @@ -1702,7 +1725,6 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>> >>  	if (!use_patch_format &&
>> >>  		(!rev.diffopt.output_format ||
>> >>  		 rev.diffopt.output_format == DIFF_FORMAT_PATCH))
>> >> -		/* Needs to be mirrored in show_range_diff() invocation */
>> >>  		rev.diffopt.output_format = DIFF_FORMAT_DIFFSTAT | DIFF_FORMAT_SUMMARY;
>> >>  	if (!rev.diffopt.stat_width)
>> >>  		rev.diffopt.stat_width = MAIL_DEFAULT_WRAP;
>> >> @@ -1877,7 +1899,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>> >>  	if (cover_letter) {
>> >>  		if (thread)
>> >>  			gen_message_id(&rev, "cover");
>> >> -		make_cover_letter(&rev, use_stdout,
>> >> +		make_cover_letter(&rev, &rd_rev, use_stdout,
>> >>  				  origin, nr, list, branch_name, quiet);
>> >>  		print_bases(&bases, rev.diffopt.file);
>> >>  		print_signature(rev.diffopt.file);
>> >> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
>> >> index bc5facc1cd..6916103888 100755
>> >> --- a/t/t3206-range-diff.sh
>> >> +++ b/t/t3206-range-diff.sh
>> >> @@ -308,6 +308,35 @@ test_expect_success 'format-patch with <common diff option>' '
>> >>  		--range-diff=topic~..topic changed~..changed >actual.raw &&
>> >>  	sed -ne "/^1:/,/^--/p" <actual.raw >actual.range-diff &&
>> >>  	sed -e "s|:$||" >expect <<-\EOF &&
>> >> +	1:  a63e992 ! 1:  d966c5c s/12/B/
>> >> +	    @@ -8,7 +8,7 @@
>> >> +	     @@
>> >> +	      9
>> >> +	      10
>> >> +	    - B
>> >> +	    + BB
>> >> +	     -12
>> >> +	     +B
>> >> +	      13
>> >> +	-- :
>> >> +	EOF
>> >> +	test_cmp expect actual.range-diff &&
>> >> +	sed -ne "/^--- /,/^--/p" <actual.raw >actual.diff &&
>> >> +	sed -e "s|:$||" >expect <<-\EOF &&
>> >> +	--- a/file
>> >> +	+++ b/file
>> >> +	@@ -12 +12 @@ BB
>> >> +	-12
>> >> +	+B
>> >> +	-- :
>> >> +	EOF
>> >> +	test_cmp expect actual.diff &&
>> >> +
>> >> +	# -U0 & --range-diff-U0
>> >> +	git format-patch --cover-letter --stdout -U0 --range-diff-U0 \
>> >> +		--range-diff=topic~..topic changed~..changed >actual.raw &&
>> >> +	sed -ne "/^1:/,/^--/p" <actual.raw >actual.range-diff &&
>> >> +	sed -e "s|:$||" >expect <<-\EOF &&
>> >>  	1:  a63e992 ! 1:  d966c5c s/12/B/
>> >>  	    @@ -11 +11 @@
>> >>  	    - B
>> >> @@ -327,4 +356,16 @@ test_expect_success 'format-patch with <common diff option>' '
>> >>  	test_cmp expect actual.diff
>> >>  '
>> >>
>> >> +test_expect_success 'format-patch option parsing with --range-diff-*' '
>> >> +	test_must_fail git format-patch --stdout --unknown \
>> >> +		master..unmodified 2>stderr &&
>> >> +	test_i18ngrep "unrecognized argument: --unknown" stderr &&
>> >> +	test_must_fail git format-patch --stdout --range-diff-unknown \
>> >> +		master..unmodified 2>stderr &&
>> >> +	test_i18ngrep "unrecognized argument: --range-diff-unknown" stderr &&
>> >> +	test_must_fail git format-patch --stdout --unknown --range-diff-unknown \
>> >> +		master..unmodified 2>stderr &&
>> >> +	test_i18ngrep "unrecognized argument: --unknown" stderr
>> >> +'
>> >> +
>> >>  test_done
>> >> --
>> >> 2.20.0.rc1.387.gf8505762e3
>> >>
>> >>
>>

^ permalink raw reply

* [PATCH v2 6/6] pack-objects: create GIT_TEST_PACK_SPARSE
From: Derrick Stolee via GitGitGadget @ 2018-11-29 14:24 UTC (permalink / raw)
  To: git; +Cc: peff, avarab, jrnieder, Junio C Hamano, Derrick Stolee
In-Reply-To: <pull.89.v2.git.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

Create a test variable GIT_TEST_PACK_SPARSE to enable the sparse
object walk algorithm by default during the test suite. Enabling
this variable ensures coverage in many interesting cases, such as
shallow clones, partial clones, and missing objects.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/pack-objects.c         | 1 +
 t/README                       | 4 ++++
 t/t5322-pack-objects-sparse.sh | 6 +++---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 124b1bafc4..507d381153 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3331,6 +3331,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 
 	read_replace_refs = 0;
 
+	sparse = git_env_bool("GIT_TEST_PACK_SPARSE", 0);
 	reset_pack_idx_option(&pack_idx_opts);
 	git_config(git_pack_config, NULL);
 
diff --git a/t/README b/t/README
index 28711cc508..8b6dfe1864 100644
--- a/t/README
+++ b/t/README
@@ -342,6 +342,10 @@ GIT_TEST_INDEX_VERSION=<n> exercises the index read/write code path
 for the index version specified.  Can be set to any valid version
 (currently 2, 3, or 4).
 
+GIT_TEST_PACK_SPARSE=<boolean> if enabled will default the pack-objects
+builtin to use the sparse object walk. This can still be overridden by
+the --no-sparse command-line argument.
+
 GIT_TEST_PRELOAD_INDEX=<boolean> exercises the preload-index code path
 by overriding the minimum number of cache entries required per thread.
 
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
index 8f5699bd91..e8cf41d1c6 100755
--- a/t/t5322-pack-objects-sparse.sh
+++ b/t/t5322-pack-objects-sparse.sh
@@ -36,7 +36,7 @@ test_expect_success 'setup repo' '
 '
 
 test_expect_success 'non-sparse pack-objects' '
-	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
+	git pack-objects --stdout --revs --no-sparse <packinput.txt >nonsparse.pack &&
 	git index-pack -o nonsparse.idx nonsparse.pack &&
 	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
 	test_cmp expect_objects.txt nonsparse_objects.txt
@@ -70,7 +70,7 @@ test_expect_success 'duplicate a folder from f3 and commit to topic1' '
 '
 
 test_expect_success 'non-sparse pack-objects' '
-	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
+	git pack-objects --stdout --revs --no-sparse <packinput.txt >nonsparse.pack &&
 	git index-pack -o nonsparse.idx nonsparse.pack &&
 	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
 	test_cmp expect_objects.txt nonsparse_objects.txt
@@ -102,7 +102,7 @@ test_expect_success 'non-sparse pack-objects' '
 		topic1			\
 		topic1^{tree}		\
 		topic1:f3 | sort >expect_objects.txt &&
-	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
+	git pack-objects --stdout --revs --no-sparse <packinput.txt >nonsparse.pack &&
 	git index-pack -o nonsparse.idx nonsparse.pack &&
 	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
 	test_cmp expect_objects.txt nonsparse_objects.txt
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 4/6] revision: implement sparse algorithm
From: Derrick Stolee via GitGitGadget @ 2018-11-29 14:24 UTC (permalink / raw)
  To: git; +Cc: peff, avarab, jrnieder, Junio C Hamano, Derrick Stolee
In-Reply-To: <pull.89.v2.git.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

When enumerating objects to place in a pack-file during 'git
pack-objects --revs', we discover the "frontier" of commits
that we care about and the boundary with commit we find
uninteresting. From that point, we walk trees to discover which
trees and blobs are uninteresting. Finally, we walk trees to find
the interesting trees.

This commit introduces a new, "sparse" way to discover the
uninteresting trees. We use the perspective of a single user trying
to push their topic to a large repository. That user likely changed
a very small fraction of the paths in their working directory, but
we spend a lot of time walking all reachable trees.

The way to switch the logic to work in this sparse way is to start
caring about which paths introduce new trees. While it is not
possible to generate a diff between the frontier boundary and all
of the interesting commits, we can simulate that behavior by
inspecting all of the root trees as a whole, then recursing down
to the set of trees at each path.

We already had taken the first step by passing an oidset to
mark_trees_uninteresting_sparse(). We now create a dictionary
whose keys are paths and values are oidsets. We consider the set
of trees that appear at each path. While we inspect a tree, we
add its subtrees to the oidsets corresponding to the tree entry's
path. We also mark trees as UNINTERESTING if the tree we are
parsing is UNINTERESTING.

To actually improve the peformance, we need to terminate our
recursion unless the oidset contains some intersting trees and
some uninteresting trees. Technically, we only need one interesting
tree for this to speed up in most cases, but we also will not mark
anything UNINTERESTING if there are no uninteresting trees, so
that would be wasted effort.

There are a few ways that this is not a universally better option.

First, we can pack extra objects. If someone copies a subtree
from one tree to another, the first tree will appear UNINTERESTING
and we will not recurse to see that the subtree should also be
UNINTERESTING. We will walk the new tree and see the subtree as
a "new" object and add it to the pack. We add a test case that
demonstrates this as a way to prove that the --sparse option is
actually working.

Second, we can have extra memory pressure. If instead of being a
single user pushing a small topic we are a server sending new
objects from across the entire working directory, then we will
gain very little (the recursion will rarely terminate early) but
will spend extra time maintaining the path-oidset dictionaries.

Despite these potential drawbacks, the benefits of the algorithm
are clear. By adding a counter to 'add_children_by_path' and
'mark_tree_contents_uninteresting', I measured the number of
parsed trees for the two algorithms in a variety of repos.

For git.git, I used the following input:

	v2.19.0
	^v2.19.0~10

 Objects to pack: 550
Walked (old alg): 282
Walked (new alg): 130

For the Linux repo, I used the following input:

	v4.18
	^v4.18~10

 Objects to pack:   518
Walked (old alg): 4,836
Walked (new alg):   188

The two repos above are rather "wide and flat" compared to
other repos that I have used in the past. As a comparison,
I tested an old topic branch in the Azure DevOps repo, which
has a much deeper folder structure than the Linux repo.

 Objects to pack:    220
Walked (old alg): 22,804
Walked (new alg):    129

I used the number of walked trees the main metric above because
it is consistent across multiple runs. When I ran my tests, the
performance of the pack-objects command with the same options
could change the end-to-end time by 10x depending on the file
system being warm. However, by repeating the same test on repeat
I could get more consistent timing results. The git.git and
Linux tests were too fast overall (less than 0.5s) to measure
an end-to-end difference. The Azure DevOps case was slow enough
to see the time improve from 15s to 1s in the warm case. The
cold case was 90s to 9s in my testing.

These improvements will have even larger benefits in the super-
large Windows repository. In our experiments, we see the
"Enumerate objects" phase of pack-objects taking 60-80% of the
end-to-end time of non-trivial pushes, taking longer than the
network time to send the pack and the server time to verify the
pack.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 revision.c                     | 116 ++++++++++++++++++++++++++++++---
 t/t5322-pack-objects-sparse.sh |  21 ++++--
 2 files changed, 121 insertions(+), 16 deletions(-)

diff --git a/revision.c b/revision.c
index f9eb6400f1..971f1bb095 100644
--- a/revision.c
+++ b/revision.c
@@ -99,29 +99,125 @@ void mark_tree_uninteresting(struct repository *r, struct tree *tree)
 	mark_tree_contents_uninteresting(r, tree);
 }
 
+struct paths_and_oids {
+	struct string_list list;
+};
+
+static void paths_and_oids_init(struct paths_and_oids *po)
+{
+	string_list_init(&po->list, 1);
+}
+
+static void paths_and_oids_clear(struct paths_and_oids *po)
+{
+	int i;
+	for (i = 0; i < po->list.nr; i++) {
+		oidset_clear(po->list.items[i].util);
+		free(po->list.items[i].util);
+	}
+
+	string_list_clear(&po->list, 0);
+}
+
+static void paths_and_oids_insert(struct paths_and_oids *po,
+				  const char *path,
+				  const struct object_id *oid)
+{
+	struct string_list_item *item = string_list_insert(&po->list, path);
+	struct oidset *set;
+
+	if (!item->util) {
+		set = xcalloc(1, sizeof(struct oidset));
+		oidset_init(set, 16);
+		item->util = set;
+	} else {
+		set = item->util;
+	}
+
+	oidset_insert(set, oid);
+}
+
+static void add_children_by_path(struct repository *r,
+				 struct tree *tree,
+				 struct paths_and_oids *po)
+{
+	struct tree_desc desc;
+	struct name_entry entry;
+
+	if (!tree)
+		return;
+
+	if (parse_tree_gently(tree, 1) < 0)
+		return;
+
+	init_tree_desc(&desc, tree->buffer, tree->size);
+	while (tree_entry(&desc, &entry)) {
+		switch (object_type(entry.mode)) {
+		case OBJ_TREE:
+			paths_and_oids_insert(po, entry.path, entry.oid);
+
+			if (tree->object.flags & UNINTERESTING) {
+				struct tree *child = lookup_tree(r, entry.oid);
+				if (child)
+					child->object.flags |= UNINTERESTING;
+			}
+			break;
+		case OBJ_BLOB:
+			if (tree->object.flags & UNINTERESTING) {
+				struct blob *child = lookup_blob(r, entry.oid);
+				if (child)
+					child->object.flags |= UNINTERESTING;
+			}
+			break;
+		default:
+			/* Subproject commit - not in this repository */
+			break;
+		}
+	}
+
+	free_tree_buffer(tree);
+}
+
 void mark_trees_uninteresting_sparse(struct repository *r,
 				     struct oidset *set)
 {
+	int i;
+	unsigned has_interesting = 0, has_uninteresting = 0;
+	struct paths_and_oids po;
 	struct object_id *oid;
 	struct oidset_iter iter;
 
 	oidset_iter_init(set, &iter);
-	while ((oid = oidset_iter_next(&iter))) {
+	while ((!has_interesting || !has_uninteresting) &&
+	       (oid = oidset_iter_next(&iter))) {
 		struct tree *tree = lookup_tree(r, oid);
 
 		if (!tree)
 			continue;
 
-		if (tree->object.flags & UNINTERESTING) {
-			/*
-			 * Remove the flag so the next call
-			 * is not a no-op. The flag is added
-			 * in mark_tree_unintersting().
-			 */
-			tree->object.flags ^= UNINTERESTING;
-			mark_tree_uninteresting(r, tree);
-		}
+		if (tree->object.flags & UNINTERESTING)
+			has_uninteresting = 1;
+		else
+			has_interesting = 1;
 	}
+
+	/* Do not walk unless we have both types of trees. */
+	if (!has_uninteresting || !has_interesting)
+		return;
+
+	paths_and_oids_init(&po);
+
+	oidset_iter_init(set, &iter);
+	while ((oid = oidset_iter_next(&iter))) {
+		struct tree *tree = lookup_tree(r, oid);
+		add_children_by_path(r, tree, &po);
+	}
+
+	for (i = 0; i < po.list.nr; i++)
+		mark_trees_uninteresting_sparse(
+			r, (struct oidset *)po.list.items[i].util);
+
+	paths_and_oids_clear(&po);
 }
 
 struct commit_stack {
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
index 81f6805bc3..45dba6e014 100755
--- a/t/t5322-pack-objects-sparse.sh
+++ b/t/t5322-pack-objects-sparse.sh
@@ -83,22 +83,25 @@ test_expect_success 'sparse pack-objects' '
 	test_cmp expect_objects.txt sparse_objects.txt
 '
 
+# Demonstrate that the algorithms differ when we copy a tree wholesale
+# from one folder to another.
+
 test_expect_success 'duplicate a folder from f1 into f3' '
 	mkdir f3/f4 &&
 	cp -r f1/f1/* f3/f4 &&
 	git add f3/f4 &&
 	git commit -m "Copied f1/f1 to f3/f4" &&
-	cat >packinput.txt <<-EOF &&
+	cat >packinput.txt <<-EOF
 	topic1
 	^topic1~1
 	EOF
-	git rev-parse		\
-		topic1		\
-		topic1^{tree}	\
-		topic1:f3 | sort >expect_objects.txt
 '
 
 test_expect_success 'non-sparse pack-objects' '
+	git rev-parse			\
+		topic1			\
+		topic1^{tree}		\
+		topic1:f3 | sort >expect_objects.txt &&
 	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
 	git index-pack -o nonsparse.idx nonsparse.pack &&
 	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
@@ -106,10 +109,16 @@ test_expect_success 'non-sparse pack-objects' '
 '
 
 test_expect_success 'sparse pack-objects' '
+	git rev-parse			\
+		topic1			\
+		topic1^{tree}		\
+		topic1:f3		\
+		topic1:f3/f4		\
+		topic1:f3/f4/data.txt | sort >expect_sparse_objects.txt &&
 	git pack-objects --stdout --revs --sparse <packinput.txt >sparse.pack &&
 	git index-pack -o sparse.idx sparse.pack &&
 	git show-index <sparse.idx | awk "{print \$2}" >sparse_objects.txt &&
-	test_cmp expect_objects.txt sparse_objects.txt
+	test_cmp expect_sparse_objects.txt sparse_objects.txt
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 3/6] pack-objects: add --sparse option
From: Derrick Stolee via GitGitGadget @ 2018-11-29 14:24 UTC (permalink / raw)
  To: git; +Cc: peff, avarab, jrnieder, Junio C Hamano, Derrick Stolee
In-Reply-To: <pull.89.v2.git.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

Add a '--sparse' option flag to the pack-objects builtin. This
allows the user to specify that they want to use the new logic
for walking trees. This logic currently does not differ from the
existing output, but will in a later change.

Create a new test script, t5322-pack-objects-sparse.sh, to ensure
the object list that is selected matches what we expect. When we
update the logic to walk in a sparse fashion, the final test will
be updated to show the extra objects that are added.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-pack-objects.txt |   9 ++-
 builtin/pack-objects.c             |   5 +-
 t/t5322-pack-objects-sparse.sh     | 115 +++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+), 2 deletions(-)
 create mode 100755 t/t5322-pack-objects-sparse.sh

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 40c825c381..ced2630eb3 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -14,7 +14,7 @@ SYNOPSIS
 	[--local] [--incremental] [--window=<n>] [--depth=<n>]
 	[--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
 	[--stdout [--filter=<filter-spec>] | base-name]
-	[--shallow] [--keep-true-parents] < object-list
+	[--shallow] [--keep-true-parents] [--sparse] < object-list
 
 
 DESCRIPTION
@@ -196,6 +196,13 @@ depth is 4095.
 	Add --no-reuse-object if you want to force a uniform compression
 	level on all data no matter the source.
 
+--sparse::
+	Use the "sparse" algorithm to determine which objects to include in
+	the pack. This can have significant performance benefits when computing
+	a pack to send a small change. However, it is possible that extra
+	objects are added to the pack-file if the included commits contain
+	certain types of direct renames.
+
 --thin::
 	Create a "thin" pack by omitting the common objects between a
 	sender and a receiver in order to reduce network transfer. This
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5f70d840a7..7d5b0735e3 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -84,6 +84,7 @@ static unsigned long pack_size_limit;
 static int depth = 50;
 static int delta_search_threads;
 static int pack_to_stdout;
+static int sparse;
 static int thin;
 static int num_preferred_base;
 static struct progress *progress_state;
@@ -3135,7 +3136,7 @@ static void get_object_list(int ac, const char **av)
 
 	if (prepare_revision_walk(&revs))
 		die(_("revision walk setup failed"));
-	mark_edges_uninteresting(&revs, show_edge, 0);
+	mark_edges_uninteresting(&revs, show_edge, sparse);
 
 	if (!fn_show_object)
 		fn_show_object = show_object;
@@ -3292,6 +3293,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		{ OPTION_CALLBACK, 0, "unpack-unreachable", NULL, N_("time"),
 		  N_("unpack unreachable objects newer than <time>"),
 		  PARSE_OPT_OPTARG, option_parse_unpack_unreachable },
+		OPT_BOOL(0, "sparse", &sparse,
+			 N_("use the sparse reachability algorithm")),
 		OPT_BOOL(0, "thin", &thin,
 			 N_("create thin packs")),
 		OPT_BOOL(0, "shallow", &shallow,
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
new file mode 100755
index 0000000000..81f6805bc3
--- /dev/null
+++ b/t/t5322-pack-objects-sparse.sh
@@ -0,0 +1,115 @@
+#!/bin/sh
+
+test_description='pack-objects object selection using sparse algorithm'
+. ./test-lib.sh
+
+test_expect_success 'setup repo' '
+	test_commit initial &&
+	for i in $(test_seq 1 3)
+	do
+		mkdir f$i &&
+		for j in $(test_seq 1 3)
+		do
+			mkdir f$i/f$j &&
+			echo $j >f$i/f$j/data.txt
+		done
+	done &&
+	git add . &&
+	git commit -m "Initialized trees" &&
+	for i in $(test_seq 1 3)
+	do
+		git checkout -b topic$i master &&
+		echo change-$i >f$i/f$i/data.txt &&
+		git commit -a -m "Changed f$i/f$i/data.txt"
+	done &&
+	cat >packinput.txt <<-EOF &&
+	topic1
+	^topic2
+	^topic3
+	EOF
+	git rev-parse			\
+		topic1			\
+		topic1^{tree}		\
+		topic1:f1		\
+		topic1:f1/f1		\
+		topic1:f1/f1/data.txt | sort >expect_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
+	git index-pack -o nonsparse.idx nonsparse.pack &&
+	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
+	test_cmp expect_objects.txt nonsparse_objects.txt
+'
+
+test_expect_success 'sparse pack-objects' '
+	git pack-objects --stdout --revs --sparse <packinput.txt >sparse.pack &&
+	git index-pack -o sparse.idx sparse.pack &&
+	git show-index <sparse.idx | awk "{print \$2}" >sparse_objects.txt &&
+	test_cmp expect_objects.txt sparse_objects.txt
+'
+
+# Demonstrate that both algorithms send "extra" objects because
+# they are not in the frontier.
+
+test_expect_success 'duplicate a folder from f3 and commit to topic1' '
+	git checkout topic1 &&
+	echo change-3 >f3/f3/data.txt &&
+	git commit -a -m "Changed f3/f3/data.txt" &&
+	git rev-parse			\
+		topic1~1		\
+		topic1~1^{tree}		\
+		topic1^{tree}		\
+		topic1			\
+		topic1:f1		\
+		topic1:f1/f1		\
+		topic1:f1/f1/data.txt	\
+		topic1:f3		\
+		topic1:f3/f3		\
+		topic1:f3/f3/data.txt | sort >expect_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
+	git index-pack -o nonsparse.idx nonsparse.pack &&
+	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
+	test_cmp expect_objects.txt nonsparse_objects.txt
+'
+
+test_expect_success 'sparse pack-objects' '
+	git pack-objects --stdout --revs --sparse <packinput.txt >sparse.pack &&
+	git index-pack -o sparse.idx sparse.pack &&
+	git show-index <sparse.idx | awk "{print \$2}" >sparse_objects.txt &&
+	test_cmp expect_objects.txt sparse_objects.txt
+'
+
+test_expect_success 'duplicate a folder from f1 into f3' '
+	mkdir f3/f4 &&
+	cp -r f1/f1/* f3/f4 &&
+	git add f3/f4 &&
+	git commit -m "Copied f1/f1 to f3/f4" &&
+	cat >packinput.txt <<-EOF &&
+	topic1
+	^topic1~1
+	EOF
+	git rev-parse		\
+		topic1		\
+		topic1^{tree}	\
+		topic1:f3 | sort >expect_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+	git pack-objects --stdout --revs <packinput.txt >nonsparse.pack &&
+	git index-pack -o nonsparse.idx nonsparse.pack &&
+	git show-index <nonsparse.idx | awk "{print \$2}" >nonsparse_objects.txt &&
+	test_cmp expect_objects.txt nonsparse_objects.txt
+'
+
+test_expect_success 'sparse pack-objects' '
+	git pack-objects --stdout --revs --sparse <packinput.txt >sparse.pack &&
+	git index-pack -o sparse.idx sparse.pack &&
+	git show-index <sparse.idx | awk "{print \$2}" >sparse_objects.txt &&
+	test_cmp expect_objects.txt sparse_objects.txt
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related

* Re: [PATCH 3/5] pack-objects: add --sparse option
From: Derrick Stolee @ 2018-11-29 14:20 UTC (permalink / raw)
  To: Stefan Beller, gitgitgadget
  Cc: git, Jeff King, Ævar Arnfjörð Bjarmason,
	Jonathan Nieder, Junio C Hamano, Derrick Stolee
In-Reply-To: <CAGZ79kbVGyX=Ky9HvmUGWPoHdLJYqnPTNda2kT3aC8XuHnH3+A@mail.gmail.com>

On 11/28/2018 5:11 PM, Stefan Beller wrote:
>> +--sparse::
>> +       Use the "sparse" algorithm to determine which objects to include in
>> +       the pack. This can have significant performance benefits when computing
>> +       a pack to send a small change. However, it is possible that extra
>> +       objects are added to the pack-file if the included commits contain
>> +       certain types of direct renames.
> As a user, where do I find a discussion of these walking algorithms?
> (i.e. how can I tell if I can really expect to gain performance benefits,
> what are the tradeoffs? "If it were strictly better than the original,
> it would be on by default" might be a thought of a user)

You're right that having this hidden as an opt-in config variable makes 
it hard to discover as a typical user.

I would argue that we should actually make the config setting true by 
default, and recommend that servers opt-out. Here are my reasons:

1. The vast majority of users are clients.

2. Client users are not likely to know about and tweak these settings.

3. Server users are more likely to keep an eye on the different knobs 
they can tweak.

4. Servers should use the reachability bitmaps, which don't use this 
logic anyway.

While _eventually_ we should make this opt-out, we shouldn't do that 
until it has cooked a while.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH] rebase: mark the C reimplementation as an experimental opt-in feature (was Re: [ANNOUNCE] Git v2.20.0-rc1)
From: Johannes Schindelin @ 2018-11-29 14:17 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, git,
	Ian Jackson
In-Reply-To: <nycvar.QRO.7.76.6.1811281015360.41@tvgsbejvaqbjf.bet>

[-- Attachment #1: Type: text/plain, Size: 3675 bytes --]

Hi Jonathan,

if you could pry more information (or better information) out of that bug
reporter, that would be nice. Apparently my email address is blacklisted
by his mail provider, so he is unlikely to have received my previous mail
(nor will he receive this one, I am sure).

Thanks,
Dscho

On Wed, 28 Nov 2018, Johannes Schindelin wrote:

> Hi Jonathan,
> 
> On Tue, 27 Nov 2018, Jonathan Nieder wrote:
> 
> > At https://bugs.debian.org/914695 is a report of a test regression in
> > an outside project that is very likely to have been triggered by the
> > new faster rebase code.
> 
> From looking through that log.gz (without having a clue where the test
> code lives, so I cannot say what it is supposed to do, and also: this is
> the first time I hear about dgit...), it would appear that this must be a
> regression in the reflog messages produced by `git rebase`.
> 
> > The issue has not been triaged, so I don't know yet whether it's a
> > problem in rebase-in-c or a manifestation of a bug in the test.
> 
> It ends thusly:
> 
> -- snip --
> [...]
> + git reflog
> + egrep 'debrebase new-upstream.*checkout'
> + test 1 = 0
> + t-report-failure
> + set +x
> TEST FAILED
> -- snap --
> 
> Which makes me think that the reflog we produce in *some* code path that
> originally called `git checkout` differs from the scripted rebase's
> generated reflog.
> 
> > That said, Google has been running with the new rebase since ~1 month
> > ago when it became the default, with no issues reported by users.  As a
> > result, I am confident that it can cope with what most users of "next"
> > throw at it, which means that if we are to find more issues to polish it
> > better, it will need all the exposure it can get.
> 
> Right. And there are a few weeks before the holidays, which should give me
> time to fix whatever bugs are discovered (I only half mind being the only
> one who fixes these bugs).
> 
> > In the Google deployment, we will keep using rebase-in-c even if it
> > gets disabled by default, in order to help with that.
> > 
> > From the Debian point of view, it's only a matter of time before
> > rebase-in-c becomes the default: even if it's not the default in 2.20,
> > it would presumably be so in 2.21 or 2.22.  That means the community's
> > attention when resolving security and reliability bugs would be on the
> > rebase-in-c implementation.  As a result, the Debian package will most
> > likely enable rebase-in-c by default even if upstream disables it, in
> > order to increase the package's shelf life (i.e. to ease the
> > maintenance burden of supporting whichever version of the package ends
> > up in the next Debian stable).
> > 
> > So with either hat on, it doesn't matter whether you apply this patch
> > upstream.
> > 
> > Having two pretty different deployments end up with the same
> > conclusion leads me to suspect that it's best for upstream not to
> > apply the revert patch, unless either
> > 
> >   (a) we have a concrete regression to address and then try again, or
> >   (b) we have a test or other plan to follow before trying again.
> 
> In this instance, I am more a fan of the "let's move fast and break
> things, then move even faster fixing them" approach.
> 
> Besides, the bug that Ævar discovered was a bug already in the scripted
> rebase, but hidden by yet another bug (the missing error checking).
> 
> I get the pretty firm impression that the common code paths are now pretty
> robust, and only lesser-exercised features may expose a bug (or
> regression, as in the case of the reflogs, where one could argue that the
> exact reflog message is not something we promise not to fiddle with).
> 
> Ciao,
> Dscho

^ permalink raw reply

* Re: [PATCH v1] teach git to support a virtual (partially populated) work directory
From: Ben Peart @ 2018-11-29 14:09 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, benpeart, pclouds, sandals, avarab, Johannes.Schindelin
In-Reply-To: <20181128133114.GF30222@szeder.dev>


On 11/28/2018 8:31 AM, SZEDER Gábor wrote:
> On Tue, Nov 27, 2018 at 02:50:57PM -0500, Ben Peart wrote:
>
>> diff --git a/t/t1092-virtualworkdir.sh b/t/t1092-virtualworkdir.sh
>> new file mode 100755
>> index 0000000000..0cdfe9b362
>> --- /dev/null
>> +++ b/t/t1092-virtualworkdir.sh
>> @@ -0,0 +1,393 @@
>> +#!/bin/sh
>> +
>> +test_description='virtual work directory tests'
>> +
>> +. ./test-lib.sh
>> +
>> +# We need total control of the virtual work directory hook
>> +sane_unset GIT_TEST_VIRTUALWORKDIR
>> +
>> +clean_repo () {
>> +	rm .git/index &&
>> +	git -c core.virtualworkdir=false reset --hard HEAD &&
>> +	git -c core.virtualworkdir=false clean -fd &&
>> +	touch untracked.txt &&
> We would usually run '>untracked.txt' instead, sparing the external
> process.
>
> A further nit is that a function called 'clean_repo' creates new
> untracked files...

Thanks, all good suggestions I've incorporated for the next iteration.

>
>> +	touch dir1/untracked.txt &&
>> +	touch dir2/untracked.txt
>> +}
>> +
>> +test_expect_success 'setup' '
>> +	mkdir -p .git/hooks/ &&
>> +	cat > .gitignore <<-\EOF &&
> CodingGuidelines suggest no space between redirection operator and
> filename.
>
>> +		.gitignore
>> +		expect*
>> +		actual*
>> +	EOF
>> +	touch file1.txt &&
>> +	touch file2.txt &&
>> +	mkdir -p dir1 &&
>> +	touch dir1/file1.txt &&
>> +	touch dir1/file2.txt &&
>> +	mkdir -p dir2 &&
>> +	touch dir2/file1.txt &&
>> +	touch dir2/file2.txt &&
>> +	git add . &&
>> +	git commit -m "initial" &&
>> +	git config --local core.virtualworkdir true
>> +'
>
>> +test_expect_success 'verify files not listed are ignored by git clean -f -x' '
>> +	clean_repo &&
> I find it odd to clean the repo right after setting it up; but then
> again, 'clean_repo' not only cleans, but also creates new files.
> Perhaps rename it to 'reset_repo'?  Dunno.
>
>> +	write_script .git/hooks/virtual-work-dir <<-\EOF &&
>> +		printf "untracked.txt\0"
>> +		printf "dir1/\0"
>> +	EOF
>> +	mkdir -p dir3 &&
>> +	touch dir3/untracked.txt &&
>> +	git clean -f -x &&
>> +	test -f file1.txt &&
> Please use the 'test_path_is_file', ...
>
>> +	test -f file2.txt &&
>> +	test ! -f untracked.txt &&
> ... 'test_path_is_missing', and ...
>
>> +	test -d dir1 &&
> ... 'test_path_is_dir' helpers, respectively, because they print
> informative error messages on failure.
>
>> +	test -f dir1/file1.txt &&
>> +	test -f dir1/file2.txt &&
>> +	test ! -f dir1/untracked.txt &&
>> +	test -f dir2/file1.txt &&
>> +	test -f dir2/file2.txt &&
>> +	test -f dir2/untracked.txt &&
>> +	test -d dir3 &&
>> +	test -f dir3/untracked.txt
>> +'

^ permalink raw reply

* Re: [PATCH v1] mem_pool: add GIT_TRACE_MEMPOOL support
From: Ben Peart @ 2018-11-29 14:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, benpeart
In-Reply-To: <nycvar.QRO.7.76.6.1811281036300.41@tvgsbejvaqbjf.bet>


On 11/28/2018 4:37 AM, Johannes Schindelin wrote:
> Hi Ben,
>
> On Tue, 27 Nov 2018, Ben Peart wrote:
>
>> From: Ben Peart <benpeart@microsoft.com>
>>
>> Add tracing around initializing and discarding mempools. In discard report
>> on the amount of memory unused in the current block to help tune setting
>> the initial_size.
>>
>> Signed-off-by: Ben Peart <benpeart@microsoft.com>
>> ---
> Looks good.
>
> My only question: should we also trace calls to _alloc(), _calloc() and
> _combine()?

I was trying to tune the initial size in my use of the mem_pool and so 
found this tracing useful to see how much memory was actually being 
used.  I'm inclined to only add tracing as it is needed rather that 
proactively because we think it _might_ be needed.  I suspect _alloc() 
and _calloc() would get very noisy and not add much value.

>
> Ciao,
> Johannes
>
>> Notes:
>>      Base Ref: * git-trace-mempool
>>      Web-Diff: https://github.com/benpeart/git/commit/9ac84bbca2
>>      Checkout: git fetch https://github.com/benpeart/git git-trace-mempool-v1 && git checkout 9ac84bbca2
>>
>>   mem-pool.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/mem-pool.c b/mem-pool.c
>> index a2841a4a9a..065389aaec 100644
>> --- a/mem-pool.c
>> +++ b/mem-pool.c
>> @@ -5,6 +5,7 @@
>>   #include "cache.h"
>>   #include "mem-pool.h"
>>   
>> +static struct trace_key trace_mem_pool = TRACE_KEY_INIT(MEMPOOL);
>>   #define BLOCK_GROWTH_SIZE 1024*1024 - sizeof(struct mp_block);
>>   
>>   /*
>> @@ -48,12 +49,16 @@ void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size)
>>   		mem_pool_alloc_block(pool, initial_size, NULL);
>>   
>>   	*mem_pool = pool;
>> +	trace_printf_key(&trace_mem_pool, "mem_pool (%p): init (%"PRIuMAX") initial size\n",
>> +		pool, (uintmax_t)initial_size);
>>   }
>>   
>>   void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
>>   {
>>   	struct mp_block *block, *block_to_free;
>>   
>> +	trace_printf_key(&trace_mem_pool, "mem_pool (%p): discard (%"PRIuMAX") unused\n",
>> +		mem_pool, (uintmax_t)(mem_pool->mp_block->end - mem_pool->mp_block->next_free));
>>   	block = mem_pool->mp_block;
>>   	while (block)
>>   	{
>>
>> base-commit: bb75be6cb916297f271c846f2f9caa3daaaec718
>> -- 
>> 2.18.0.windows.1
>>
>>

^ permalink raw reply

* Re: [PATCH v11 00/22] Convert "git stash" to C builtin
From: Johannes Schindelin @ 2018-11-29 14:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Paul-Sebastian Ungureanu, Thomas Gummerer, git
In-Reply-To: <nycvar.QRO.7.76.6.1811291352480.41@tvgsbejvaqbjf.bet>

Hi Junio,

On Thu, 29 Nov 2018, Johannes Schindelin wrote:

> On Mon, 26 Nov 2018, Junio C Hamano wrote:
> 
> > Junio C Hamano <gitster@pobox.com> writes:
> > 
> > > Thomas Gummerer <t.gummerer@gmail.com> writes:
> > >
> > >> Thanks for your work on this!  I have read through the range-diff and
> > >> the new patch of this last round, and this addresses all the comments
> > >> I had on v10 (and some more :)).  I consider it
> > >> Reviewed-by: Thomas Gummerer <t.gummerer@gmail.com>
> > >
> > > Thanks.
> > >
> > > One thing that bothers me is that this seems to have been rebased on
> > > 'master', but as long as we are rebasing, the updated series must
> > > also take into account of the sd/stash-wo-user-name topic, i.e. if
> > > we are rebasing it, it should be rebased on top of the result of
> > >
> > > 	git checkout -B ps/rebase-in-c master
> > > 	git merge --no-ff sd/stash-wo-user-name
> > >
> > > I think.
> > 
> > https://travis-ci.org/git/git/builds/459619672 would show that this
> > C reimplementation now regresses from the scripted version due to
> > lack of such rebasing (i.e. porting a correction from scripted one).
> 
> Oh, you know, at first I *mis-read* your mail to mean "don't you rebase
> all the time!", but in this case (in contrast to earlier statements about
> rebasing between iterations of patch series), you *do* want Paul to
> rebase.
> 
> Let me see what I can come up with in my `git-stash` branch on
> https://github.com/dscho/git

There. I force-pushed an update that is based on sd/stash-wo-user-name and
adds a `prepare_fallback_ident(name, email)` to `ident.c` for use in the
built-in stash:

https://github.com/dscho/git/commit/d37ce623fbd32e4345c701dea822e56de1a5417f

It passes t3903 in a little over a minute with
GIT_TEST_STASH_USE_BUILTIN=true and in a little less than seven minutes
with GIT_TEST_STASH_USE_BUILTIN=false.

Ciao,
Dscho


^ permalink raw reply

* Re: Git Tags
From: Mateusz Loskot @ 2018-11-29 13:45 UTC (permalink / raw)
  To: git
In-Reply-To: <001901d487e9$0f9e9d90$2edbd8b0$@nexbridge.com>

On Thu, 29 Nov 2018 at 14:40, Randall S. Becker <rsbecker@nexbridge.com> wrote:
> On November 29, 2018 6:56, Mateusz Loskot wrote:
> > On Thu, 29 Nov 2018 at 12:50, Stefanie Leisestreichler <stefanie.leisestreichler@peter-speer.de> wrote:
> > >
> > > git tag -a 0.9.0
> > > git push origin master
> > >
> > > In my local repository, when I run "git tag" it is showing me "0.9.0".
> > >
> > > Then I did (on box B)
> > > git clone ssh://user@host:/path/project.git cd project git tag
> > >
> > > Now git tag is showing nothing.
> > >
> > > Why is the tag only available in my local repository?
> >
> > >From https://git-scm.com/book/en/v2/Git-Basics-Tagging
> > "By default, the git push command doesn’t transfer tags to remote servers.
> > You will have to explicitly push tags to a shared server after you have created
> > them."
>
> git push --tags
>

After my yesterday experience [1], I'd be careful with git push --tags :)

[1] genuine screenshot and link to related thread at
https://twitter.com/mloskot/status/1068072285846859776

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net

^ permalink raw reply

* RE: Git Tags
From: Randall S. Becker @ 2018-11-29 13:40 UTC (permalink / raw)
  To: 'Mateusz Loskot', git
In-Reply-To: <CABUeae_4yxtxFmi14+OivX-wFQq4Hd5uEV3_WhRMsMHbvSxy7w@mail.gmail.com>

On November 29, 2018 6:56, Mateusz Loskot wrote:
> On Thu, 29 Nov 2018 at 12:50, Stefanie Leisestreichler
> <stefanie.leisestreichler@peter-speer.de> wrote:
> >
> > git tag -a 0.9.0
> > git push origin master
> >
> > In my local repository, when I run "git tag" it is showing me "0.9.0".
> >
> > Then I did (on box B)
> > git clone ssh://user@host:/path/project.git cd project git tag
> >
> > Now git tag is showing nothing.
> >
> > Why is the tag only available in my local repository?
> 
> >From https://git-scm.com/book/en/v2/Git-Basics-Tagging
> "By default, the git push command doesn’t transfer tags to remote servers.
> You will have to explicitly push tags to a shared server after you have created
> them."

git push --tags

and

git fetch --tags

to be symmetrical

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 211288444200000000
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.


^ permalink raw reply

* Re: [bug report] git-gui child windows are blank
From: Kenn Sebesta @ 2018-11-29 13:05 UTC (permalink / raw)
  To: sunshine; +Cc: sbeller, git
In-Reply-To: <CAPig+cRU5pxZr50UpDwA2044dbX_jKZP1rhm4JQPB_yjgSTmgA@mail.gmail.com>

Just checked gitk, it seems to work fine including children windows.
This problem seems to affect git-gui only.
On Thu, Nov 29, 2018 at 5:16 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Wed, Nov 28, 2018 at 2:29 PM Stefan Beller <sbeller@google.com> wrote:
> > On Wed, Nov 28, 2018 at 6:13 AM Kenn Sebesta <kenn@eissq.com> wrote:
> > > v2.19.2, installed from brew on macOS Mojave 14.2.1.
> > >
> > > `git-gui` is my much beloved go-to tool for everything git.
> > > Unfortunately, on my new Macbook Air it seems to have a bug. When I
> > > first load the program, the parent window populates normally with the
> > > stage/unstaged and diff panes. However, when I click Push, the child
> > > window is completely blank. The frame is there, but there is no
> > > content.
> >
> > Looking through the mailing list archive, this
> > seemed to be one of the last relevant messges
> > https://public-inbox.org/git/20180724065120.7664-1-sunshine@sunshineco.com/
>
> I was hoping that Junio would patch-monkey that one since that
> crash-at-launch makes gitk unusable for Mojave users, but apparently
> it never got picked up.
>
> Anyhow, the issue fixed by that patch has to do with 'osascript' on
> Apple, and it's the only such invocation in the source code of
> gitk/git-gui. The 'osascript' invocation merely attempts to foreground
> the application at launch, so is almost certainly unrelated to the
> above reported behavior. Somebody running Mojave will likely need to
> spend some time debugging it. (Unfortunately, I'm a couple major
> releases behind and don't plan on upgrading.)

^ permalink raw reply

* [PATCH v2 1/1] rebase --stat: fix when rebasing to an unrelated history
From: Johannes Schindelin via GitGitGadget @ 2018-11-29 13:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin
In-Reply-To: <pull.88.v2.git.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When rebasing to a commit history that has no common commits with the
current branch, there is no merge base. In diffstat mode, this means
that we cannot compare to the merge base, but we have to compare to the
empty tree instead.

Also, if running in verbose diffstat mode, we should not output

	Changes from <merge-base> to <onto>

as that does not make sense without any merge base.

Note: neither scripted nor built-in versoin of `git rebase` were
prepared for this situation well. We use this opportunity not only to
fix the bug(s), but also to make both versions' output consistent in
this instance. And add a regression test to keep this working in all
eternity.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase.c          | 18 ++++++++++++------
 git-legacy-rebase.sh      | 10 ++++++++--
 t/t3406-rebase-message.sh | 10 ++++++++++
 3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 5b3e5baec8..1c6f817f4b 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1481,10 +1481,15 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 	if (options.flags & REBASE_DIFFSTAT) {
 		struct diff_options opts;
 
-		if (options.flags & REBASE_VERBOSE)
-			printf(_("Changes from %s to %s:\n"),
-				oid_to_hex(&merge_base),
-				oid_to_hex(&options.onto->object.oid));
+		if (options.flags & REBASE_VERBOSE) {
+			if (is_null_oid(&merge_base))
+				printf(_("Changes to %s:\n"),
+				       oid_to_hex(&options.onto->object.oid));
+			else
+				printf(_("Changes from %s to %s:\n"),
+				       oid_to_hex(&merge_base),
+				       oid_to_hex(&options.onto->object.oid));
+		}
 
 		/* We want color (if set), but no pager */
 		diff_setup(&opts);
@@ -1494,8 +1499,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 			DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
 		opts.detect_rename = DIFF_DETECT_RENAME;
 		diff_setup_done(&opts);
-		diff_tree_oid(&merge_base, &options.onto->object.oid,
-			      "", &opts);
+		diff_tree_oid(is_null_oid(&merge_base) ?
+			      the_hash_algo->empty_tree : &merge_base,
+			      &options.onto->object.oid, "", &opts);
 		diffcore_std(&opts);
 		diff_flush(&opts);
 	}
diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
index b97ffdc9dd..b4c7dbfa57 100755
--- a/git-legacy-rebase.sh
+++ b/git-legacy-rebase.sh
@@ -718,10 +718,16 @@ if test -n "$diffstat"
 then
 	if test -n "$verbose"
 	then
-		echo "$(eval_gettext "Changes from \$mb to \$onto:")"
+		if test -z "$mb"
+		then
+			echo "$(eval_gettext "Changes to \$onto:")"
+		else
+			echo "$(eval_gettext "Changes from \$mb to \$onto:")"
+		fi
 	fi
+	mb_tree="${mb:-$(git hash-object -t tree /dev/null)}"
 	# We want color (if set), but no pager
-	GIT_PAGER='' git diff --stat --summary "$mb" "$onto"
+	GIT_PAGER='' git diff --stat --summary "$mb_tree" "$onto"
 fi
 
 test -n "$interactive_rebase" && run_specific_rebase
diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
index 38bd876cab..c2c9950568 100755
--- a/t/t3406-rebase-message.sh
+++ b/t/t3406-rebase-message.sh
@@ -91,4 +91,14 @@ test_expect_success 'error out early upon -C<n> or --whitespace=<bad>' '
 	test_i18ngrep "Invalid whitespace option" err
 '
 
+test_expect_success 'rebase -i onto unrelated history' '
+	git init unrelated &&
+	test_commit -C unrelated 1 &&
+	git -C unrelated remote add -f origin "$PWD" &&
+	git -C unrelated branch --set-upstream-to=origin/master &&
+	git -C unrelated -c core.editor=true rebase -i -v --stat >actual &&
+	test_i18ngrep "Changes to " actual &&
+	test_i18ngrep "5 files changed" actual
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 0/1] Fix git rebase --stat -i <unrelated-history>
From: Johannes Schindelin via GitGitGadget @ 2018-11-29 13:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <pull.88.git.gitgitgadget@gmail.com>

We're really entering obscure territory here, I would say.

To trigger the bug, two things have to come together: the user must have
asked for a diffstat afterwards, and the commits need to have been rebased
onto a completely unrelated commit history (i.e. there must exist no merge
base between the pre-rebase HEAD and the post-rebase HEAD).

Please note that this bug existed already in the scripted rebase, but it was
never detected because the scripted version would not even perform any error
checking.

It will make Junio very happy that I squashed the regression test in to the
patch that fixes it. The reason, however, was not to make Junio happy (I
hope to make him happy by fixing bugs), but simply that an earlier iteration
of test would only fail with the built-in rebase, but not with the scripted
version. The current version would fail with the scripted version, but I'll
save the time to split the patch again now.

Changes since v1:

 * The commit message now talks more about what we should do in case that
   there is no merge base, rather than stressing the differences between the
   way scripted vs built-in rebase handled it (both buggy, and fixed by this
   patch).
 * In case that there is no merge base, it no longer reports "Changes from
   (empty) to ..." but "Changes to ...", which should be a lot less
   controversial.

Johannes Schindelin (1):
  rebase --stat: fix when rebasing to an unrelated history

 builtin/rebase.c          | 18 ++++++++++++------
 git-legacy-rebase.sh      | 10 ++++++++--
 t/t3406-rebase-message.sh | 10 ++++++++++
 3 files changed, 30 insertions(+), 8 deletions(-)


base-commit: a1598010f775d82b5adf12c29d0f5bc9b41434c6
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-88%2Fdscho%2Frebase-stat-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-88/dscho/rebase-stat-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/88

Range-diff vs v1:

 1:  680385e4bd ! 1:  190c7856ad rebase --stat: fix when rebasing to an unrelated history
     @@ -3,22 +3,21 @@
          rebase --stat: fix when rebasing to an unrelated history
      
          When rebasing to a commit history that has no common commits with the
     -    current branch, there is no merge base. The scripted version of the `git
     -    rebase` command was not prepared for that and spewed out
     +    current branch, there is no merge base. In diffstat mode, this means
     +    that we cannot compare to the merge base, but we have to compare to the
     +    empty tree instead.
      
     -            fatal: ambiguous argument '': unknown revision or path not in
     -            the working tree.
     +    Also, if running in verbose diffstat mode, we should not output
      
     -    but then continued (due to lack of error checking).
     +            Changes from <merge-base> to <onto>
      
     -    The built-in version of the `git rebase` command blindly translated that
     -    shell script code, assuming that there is no need to test whether there
     -    *was* a merge base, and due to its better error checking, exited with a
     -    fatal error (because it tried to read the object with hash 00000000...
     -    as a tree).
     +    as that does not make sense without any merge base.
      
     -    Fix both scripted and built-in versions to output something sensibly,
     -    and add a regression test to keep this working in all eternity.
     +    Note: neither scripted nor built-in versoin of `git rebase` were
     +    prepared for this situation well. We use this opportunity not only to
     +    fix the bug(s), but also to make both versions' output consistent in
     +    this instance. And add a regression test to keep this working in all
     +    eternity.
      
          Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     @@ -27,15 +26,25 @@
      --- a/builtin/rebase.c
      +++ b/builtin/rebase.c
      @@
     + 	if (options.flags & REBASE_DIFFSTAT) {
     + 		struct diff_options opts;
       
     - 		if (options.flags & REBASE_VERBOSE)
     - 			printf(_("Changes from %s to %s:\n"),
     +-		if (options.flags & REBASE_VERBOSE)
     +-			printf(_("Changes from %s to %s:\n"),
      -				oid_to_hex(&merge_base),
     -+				is_null_oid(&merge_base) ?
     -+				"(empty)" : oid_to_hex(&merge_base),
     - 				oid_to_hex(&options.onto->object.oid));
     +-				oid_to_hex(&options.onto->object.oid));
     ++		if (options.flags & REBASE_VERBOSE) {
     ++			if (is_null_oid(&merge_base))
     ++				printf(_("Changes to %s:\n"),
     ++				       oid_to_hex(&options.onto->object.oid));
     ++			else
     ++				printf(_("Changes from %s to %s:\n"),
     ++				       oid_to_hex(&merge_base),
     ++				       oid_to_hex(&options.onto->object.oid));
     ++		}
       
       		/* We want color (if set), but no pager */
     + 		diff_setup(&opts);
      @@
       			DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
       		opts.detect_rename = DIFF_DETECT_RENAME;
     @@ -57,8 +66,12 @@
       	if test -n "$verbose"
       	then
      -		echo "$(eval_gettext "Changes from \$mb to \$onto:")"
     -+		mb_display="${mb:-(empty)}"
     -+		echo "$(eval_gettext "Changes from \$mb_display to \$onto:")"
     ++		if test -z "$mb"
     ++		then
     ++			echo "$(eval_gettext "Changes to \$onto:")"
     ++		else
     ++			echo "$(eval_gettext "Changes from \$mb to \$onto:")"
     ++		fi
       	fi
      +	mb_tree="${mb:-$(git hash-object -t tree /dev/null)}"
       	# We want color (if set), but no pager
     @@ -81,7 +94,7 @@
      +	git -C unrelated remote add -f origin "$PWD" &&
      +	git -C unrelated branch --set-upstream-to=origin/master &&
      +	git -C unrelated -c core.editor=true rebase -i -v --stat >actual &&
     -+	test_i18ngrep "Changes from (empty)" actual &&
     ++	test_i18ngrep "Changes to " actual &&
      +	test_i18ngrep "5 files changed" actual
      +'
      +

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH v11 00/22] Convert "git stash" to C builtin
From: Johannes Schindelin @ 2018-11-29 12:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Paul-Sebastian Ungureanu, Thomas Gummerer, git
In-Reply-To: <xmqq8t1gwano.fsf@gitster-ct.c.googlers.com>

Hi Junio,

On Mon, 26 Nov 2018, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Thomas Gummerer <t.gummerer@gmail.com> writes:
> >
> >> Thanks for your work on this!  I have read through the range-diff and
> >> the new patch of this last round, and this addresses all the comments
> >> I had on v10 (and some more :)).  I consider it
> >> Reviewed-by: Thomas Gummerer <t.gummerer@gmail.com>
> >
> > Thanks.
> >
> > One thing that bothers me is that this seems to have been rebased on
> > 'master', but as long as we are rebasing, the updated series must
> > also take into account of the sd/stash-wo-user-name topic, i.e. if
> > we are rebasing it, it should be rebased on top of the result of
> >
> > 	git checkout -B ps/rebase-in-c master
> > 	git merge --no-ff sd/stash-wo-user-name
> >
> > I think.
> 
> https://travis-ci.org/git/git/builds/459619672 would show that this
> C reimplementation now regresses from the scripted version due to
> lack of such rebasing (i.e. porting a correction from scripted one).

Oh, you know, at first I *mis-read* your mail to mean "don't you rebase
all the time!", but in this case (in contrast to earlier statements about
rebasing between iterations of patch series), you *do* want Paul to
rebase.

Let me see what I can come up with in my `git-stash` branch on
https://github.com/dscho/git

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 1/1] rebase --stat: fix when rebasing to an unrelated history
From: Johannes Schindelin @ 2018-11-29 12:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin via GitGitGadget, git
In-Reply-To: <xmqqlg5cph1n.fsf@gitster-ct.c.googlers.com>

Hi Junio,

On Thu, 29 Nov 2018, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
> 
> > The built-in version of the `git rebase` command blindly translated that
> > shell script code, assuming that there is no need to test whether there
> > *was* a merge base, and due to its better error checking, exited with a
> > fatal error (because it tried to read the object with hash 00000000...
> > as a tree).
> 
> And the scripted version produced a wrong result because it did not
> check the lack of merge base and fed an empty string (presumably, as
> that is what you would get from mb=$(merge-base A B) when A and B
> are unrelated) to later steps?  If that is the case, then it *is* a
> very significant thing to record here.  As it was not merely "failed
> to stop due to lack of error checking", but a lot worse.  It was
> producing a wrong result.

Indeed. But it was only in the `--stat` reporting, so it did not produce
an incorrect rebased history.

> A more faithful reimplementation would not have exited with fatal
> error, but would have produced the same wrong result, so "a better
> error checking caused the reimplementation die where the original
> did not die" may be correct but is much less important than the fact
> that "the logic used in the original to produce diffstat forgot that
> there may not be a merge base and would not have worked correctly in
> such a case, and that is what we are correcting" is more important.

True.

> Please descibe the issue as such, if that is the case (I cannot read
> from the description in the proposed log message if that is the
> case---but I came to the conclusion that it is what you wanted to
> fix reading the code).

Indeed, my commit message is a bit too close to what I fixed, and not
enough about what needed to be fixed.

> >  		if (options.flags & REBASE_VERBOSE)
> >  			printf(_("Changes from %s to %s:\n"),
> > -				oid_to_hex(&merge_base),
> > +				is_null_oid(&merge_base) ?
> > +				"(empty)" : oid_to_hex(&merge_base),
> 
> I am not sure (empty) is a good idea for two reasons.  It is going
> to be inserted into an translated string without translation.

Oooops.

> Also it is too similar to the fallback string used by some printf
> implementations when NULL was given to %s by mistake.

You mean `(null)`? That was actually intentional, I just thought that
`(empty)` would be different enough...

> If there weren't i18n issues, I would suggest to use "empty merge
> base" or "empty tree" instead, but the old one would have shown
> 0{40}, so perhaps empty_tree_oid_hex() is a good substitute?

As this is a user-facing message, I'd rather avoid something like
`empty_tree_oid_hex()`, which to every Git user who does not happen to be
a Git developer, too, must sound very cryptic.

But I guess that I should not be so lazy and really use two different
messages here:

	Changes from <merge-base> to <onto>

and if there is no merge base,

	Changes in <onto>

Will fix.

> > @@ -1494,8 +1495,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
> >  			DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
> >  		opts.detect_rename = DIFF_DETECT_RENAME;
> >  		diff_setup_done(&opts);
> > -		diff_tree_oid(&merge_base, &options.onto->object.oid,
> > -			      "", &opts);
> > +		diff_tree_oid(is_null_oid(&merge_base) ?
> > +			      the_hash_algo->empty_tree : &merge_base,
> > +			      &options.onto->object.oid, "", &opts);
> 
> The original would have run "git diff '' $onto", and died with an
> error message, so both the original and the reimplementation were
> failing.  Just in different ways ;-)

Right.

> > diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
> > index b97ffdc9dd..be3b241676 100755
> > --- a/git-legacy-rebase.sh
> > +++ b/git-legacy-rebase.sh
> > @@ -718,10 +718,12 @@ if test -n "$diffstat"
> >  then
> >  	if test -n "$verbose"
> >  	then
> > -		echo "$(eval_gettext "Changes from \$mb to \$onto:")"
> > +		mb_display="${mb:-(empty)}"
> > +		echo "$(eval_gettext "Changes from \$mb_display to \$onto:")"
> >  	fi
> > +	mb_tree="${mb:-$(git hash-object -t tree /dev/null)}"
> >  	# We want color (if set), but no pager
> > -	GIT_PAGER='' git diff --stat --summary "$mb" "$onto"
> > +	GIT_PAGER='' git diff --stat --summary "$mb_tree" "$onto"
> 
> Code fix for diff-tree invocation, both in the builtin/ version and
> this one, looks correct.

Okay. I fixed the two things you pointed out, just waiting for the Linux
phase to finish (I don't think there is anything OS-specific in this
patch, so I trust macOS and Windows to pass if Linux passes):
https://git-for-windows.visualstudio.com/git/_build/results?buildId=26116

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 2/2] format-patch: allow for independent diff & range-diff options
From: Johannes Schindelin @ 2018-11-29 12:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Eric Sunshine
In-Reply-To: <8736rkyy4h.fsf@evledraar.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 11467 bytes --]

Hi Ævar,

On Thu, 29 Nov 2018, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Nov 29 2018, Johannes Schindelin wrote:
> 
> > On Wed, 28 Nov 2018, Ævar Arnfjörð Bjarmason wrote:
> >
> >> Change the semantics of the "--range-diff" option so that the regular
> >> diff options can be provided separately for the range-diff and the
> >> patch. This allows for supplying e.g. --range-diff-U0 and -U1 to
> >> "format-patch" to provide different context for the range-diff and the
> >> patch. This wasn't possible before.
> >
> > I really, really dislike the `--range-diff-<random-thing>`. We have
> > precedent for passing optional arguments that are passed to some other
> > command, so a much more logical and consistent convention would be to use
> > `--range-diff[=<diff-option>..]`, allowing all of the diff options that
> > you might want to pass to the outer diff in one go rather than having a
> > lengthy string of `--range-diff-this` and `--range-diff-that` options.
> 
> Where do we pass those sorts of arguments?
> 
> Reasons I did it this way:
> 
>  a) Passing it as one option will require the user to double-quote those
>     options that take quoted arguments (e.g. --word-diff-regex), which I
>     thought sucked more than the prefix. On the implementation side we
>     couldn't leave the parsing of the command-line to the shell anymore.
> 
>  b) I think people will want to tweak this very rarely, much more rarely
>     than e.g. -U10 in format-patch itself, so having something long-ish
>     doesn't sound bad.

Hmm. I still don't like it. It sets a precedent, and we simply do not do
it that way in other circumstances (most obvious would be the -X merge
options). The more divergent user interfaces for the same sort of thing
are, the more brain cycles you force users to spend on navigating said
interfaces.

> > I only had time to skim the patch, and I have to wonder why you pass
> > around full-blown `rev_info` structs for range diff (and with that really
> > awful name `rd_rev`) rather than just the `diff_options` that you
> > *actually* care about?
> 
> Because setup_revisions() which does all the command-line parsing needs
> a rev_info, so even if we only need the diffopt in the end we need to
> initiate the whole thing.
> 
> Suggestions for a better varibale name most welcome.

`range_diff_revs`

And you do not need to pass around the whole thing. You can easily pass
`&range_diff_revs.diffopt`.

Don't pass around what you do not need to pass around.

Ciao,
Dscho

> 
> > Ciao,
> > Dscho
> >
> >>
> >> Ever since the "--range-diff" option was added in
> >> 31e2617a5f ("format-patch: add --range-diff option to embed diff in
> >> cover letter", 2018-07-22) the "rev->diffopt" we pass down to the diff
> >> machinery has been the one we get from "format-patch"'s own
> >> setup_revisions().
> >>
> >> This sort of thing is unique among the log-like commands in
> >> builtin/log.c, no command than format-patch will embed the output of
> >> another log-like command. Since the "rev->diffopt" is reused we need
> >> to munge it before we pass it to show_range_diff(). See
> >> 43dafc4172 ("format-patch: don't include --stat with --range-diff
> >> output", 2018-11-22) for a related regression fix which is being
> >> mostly reverted here.
> >>
> >> Implementation notes: 1) We're not bothering with the full teardown
> >> around die() and will leak memory, but it's too much boilerplate to do
> >> all the frees with/without the die() and not worth it. 2) We call
> >> repo_init_revisions() for "rd_rev" even though we could get away with
> >> a shallow copy like the code we're replacing (and which
> >> show_range_diff() itself does). This is to make this code more easily
> >> understood.
> >>
> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> >> ---
> >>  Documentation/git-format-patch.txt | 10 ++++++-
> >>  builtin/log.c                      | 42 +++++++++++++++++++++++-------
> >>  t/t3206-range-diff.sh              | 41 +++++++++++++++++++++++++++++
> >>  3 files changed, 82 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
> >> index aba4c5febe..6c048f415f 100644
> >> --- a/Documentation/git-format-patch.txt
> >> +++ b/Documentation/git-format-patch.txt
> >> @@ -24,7 +24,8 @@ SYNOPSIS
> >>  		   [--to=<email>] [--cc=<email>]
> >>  		   [--[no-]cover-letter] [--quiet] [--notes[=<ref>]]
> >>  		   [--interdiff=<previous>]
> >> -		   [--range-diff=<previous> [--creation-factor=<percent>]]
> >> +		   [--range-diff=<previous> [--creation-factor=<percent>]
> >> +		      [--range-diff<common diff option>]]
> >>  		   [--progress]
> >>  		   [<common diff options>]
> >>  		   [ <since> | <revision range> ]
> >> @@ -257,6 +258,13 @@ feeding the result to `git send-email`.
> >>  	creation/deletion cost fudge factor. See linkgit:git-range-diff[1])
> >>  	for details.
> >>
> >> +--range-diff<common diff option>::
> >> +	Other options prefixed with `--range-diff` are stripped of
> >> +	that prefix and passed as-is to the diff machinery used to
> >> +	generate the range-diff, e.g. `--range-diff-U0` and
> >> +	`--range-diff--no-color`. This allows for adjusting the format
> >> +	of the range-diff independently from the patch itself.
> >> +
> >>  --notes[=<ref>]::
> >>  	Append the notes (see linkgit:git-notes[1]) for the commit
> >>  	after the three-dash line.
> >> diff --git a/builtin/log.c b/builtin/log.c
> >> index 02d88fa233..7658e56ecc 100644
> >> --- a/builtin/log.c
> >> +++ b/builtin/log.c
> >> @@ -1023,7 +1023,8 @@ static void show_diffstat(struct rev_info *rev,
> >>  	fprintf(rev->diffopt.file, "\n");
> >>  }
> >>
> >> -static void make_cover_letter(struct rev_info *rev, int use_stdout,
> >> +static void make_cover_letter(struct rev_info *rev, struct rev_info *rd_rev,
> >> +			      int use_stdout,
> >>  			      struct commit *origin,
> >>  			      int nr, struct commit **list,
> >>  			      const char *branch_name,
> >> @@ -1095,13 +1096,9 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
> >>  	}
> >>
> >>  	if (rev->rdiff1) {
> >> -		struct diff_options opts;
> >> -		memcpy(&opts, &rev->diffopt, sizeof(opts));
> >> -		opts.output_format &= ~(DIFF_FORMAT_DIFFSTAT | DIFF_FORMAT_SUMMARY);
> >> -
> >>  		fprintf_ln(rev->diffopt.file, "%s", rev->rdiff_title);
> >>  		show_range_diff(rev->rdiff1, rev->rdiff2,
> >> -				rev->creation_factor, 1, &opts);
> >> +				rev->creation_factor, 1, &rd_rev->diffopt);
> >>  	}
> >>  }
> >>
> >> @@ -1485,6 +1482,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >>  	struct commit *commit;
> >>  	struct commit **list = NULL;
> >>  	struct rev_info rev;
> >> +	struct rev_info rd_rev;
> >>  	struct setup_revision_opt s_r_opt;
> >>  	int nr = 0, total, i;
> >>  	int use_stdout = 0;
> >> @@ -1603,6 +1601,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >>  	init_log_defaults();
> >>  	git_config(git_format_config, NULL);
> >>  	repo_init_revisions(the_repository, &rev, prefix);
> >> +	repo_init_revisions(the_repository, &rd_rev, prefix);
> >>  	rev.commit_format = CMIT_FMT_EMAIL;
> >>  	rev.expand_tabs_in_log_default = 0;
> >>  	rev.verbose_header = 1;
> >> @@ -1689,8 +1688,32 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >>  	rev.preserve_subject = keep_subject;
> >>
> >>  	argc = setup_revisions(argc, argv, &rev, &s_r_opt);
> >> -	if (argc > 1)
> >> -		die(_("unrecognized argument: %s"), argv[1]);
> >> +	if (argc > 1) {
> >> +		struct argv_array args = ARGV_ARRAY_INIT;
> >> +		const char *prefix = "--range-diff";
> >> +		int have_prefix = 0;
> >> +
> >> +		for (i = 0; i < argc; i++) {
> >> +			struct strbuf sb = STRBUF_INIT;
> >> +			char *str;
> >> +
> >> +			strbuf_addstr(&sb, argv[i]);
> >> +			if (starts_with(argv[i], prefix)) {
> >> +				have_prefix = 1;
> >> +				strbuf_remove(&sb, 0, strlen(prefix));
> >> +			}
> >> +			str = strbuf_detach(&sb, NULL);
> >> +			strbuf_release(&sb);
> >> +
> >> +			argv_array_push(&args, str);
> >> +		}
> >> +
> >> +		if (!have_prefix)
> >> +			die(_("unrecognized argument: %s"), argv[1]);
> >> +		argc = setup_revisions(args.argc, args.argv, &rd_rev, NULL);
> >> +		if (argc > 1)
> >> +			die(_("unrecognized argument: %s"), argv[1]);
> >> +	}
> >>
> >>  	if (rev.diffopt.output_format & DIFF_FORMAT_NAME)
> >>  		die(_("--name-only does not make sense"));
> >> @@ -1702,7 +1725,6 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >>  	if (!use_patch_format &&
> >>  		(!rev.diffopt.output_format ||
> >>  		 rev.diffopt.output_format == DIFF_FORMAT_PATCH))
> >> -		/* Needs to be mirrored in show_range_diff() invocation */
> >>  		rev.diffopt.output_format = DIFF_FORMAT_DIFFSTAT | DIFF_FORMAT_SUMMARY;
> >>  	if (!rev.diffopt.stat_width)
> >>  		rev.diffopt.stat_width = MAIL_DEFAULT_WRAP;
> >> @@ -1877,7 +1899,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >>  	if (cover_letter) {
> >>  		if (thread)
> >>  			gen_message_id(&rev, "cover");
> >> -		make_cover_letter(&rev, use_stdout,
> >> +		make_cover_letter(&rev, &rd_rev, use_stdout,
> >>  				  origin, nr, list, branch_name, quiet);
> >>  		print_bases(&bases, rev.diffopt.file);
> >>  		print_signature(rev.diffopt.file);
> >> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> >> index bc5facc1cd..6916103888 100755
> >> --- a/t/t3206-range-diff.sh
> >> +++ b/t/t3206-range-diff.sh
> >> @@ -308,6 +308,35 @@ test_expect_success 'format-patch with <common diff option>' '
> >>  		--range-diff=topic~..topic changed~..changed >actual.raw &&
> >>  	sed -ne "/^1:/,/^--/p" <actual.raw >actual.range-diff &&
> >>  	sed -e "s|:$||" >expect <<-\EOF &&
> >> +	1:  a63e992 ! 1:  d966c5c s/12/B/
> >> +	    @@ -8,7 +8,7 @@
> >> +	     @@
> >> +	      9
> >> +	      10
> >> +	    - B
> >> +	    + BB
> >> +	     -12
> >> +	     +B
> >> +	      13
> >> +	-- :
> >> +	EOF
> >> +	test_cmp expect actual.range-diff &&
> >> +	sed -ne "/^--- /,/^--/p" <actual.raw >actual.diff &&
> >> +	sed -e "s|:$||" >expect <<-\EOF &&
> >> +	--- a/file
> >> +	+++ b/file
> >> +	@@ -12 +12 @@ BB
> >> +	-12
> >> +	+B
> >> +	-- :
> >> +	EOF
> >> +	test_cmp expect actual.diff &&
> >> +
> >> +	# -U0 & --range-diff-U0
> >> +	git format-patch --cover-letter --stdout -U0 --range-diff-U0 \
> >> +		--range-diff=topic~..topic changed~..changed >actual.raw &&
> >> +	sed -ne "/^1:/,/^--/p" <actual.raw >actual.range-diff &&
> >> +	sed -e "s|:$||" >expect <<-\EOF &&
> >>  	1:  a63e992 ! 1:  d966c5c s/12/B/
> >>  	    @@ -11 +11 @@
> >>  	    - B
> >> @@ -327,4 +356,16 @@ test_expect_success 'format-patch with <common diff option>' '
> >>  	test_cmp expect actual.diff
> >>  '
> >>
> >> +test_expect_success 'format-patch option parsing with --range-diff-*' '
> >> +	test_must_fail git format-patch --stdout --unknown \
> >> +		master..unmodified 2>stderr &&
> >> +	test_i18ngrep "unrecognized argument: --unknown" stderr &&
> >> +	test_must_fail git format-patch --stdout --range-diff-unknown \
> >> +		master..unmodified 2>stderr &&
> >> +	test_i18ngrep "unrecognized argument: --range-diff-unknown" stderr &&
> >> +	test_must_fail git format-patch --stdout --unknown --range-diff-unknown \
> >> +		master..unmodified 2>stderr &&
> >> +	test_i18ngrep "unrecognized argument: --unknown" stderr
> >> +'
> >> +
> >>  test_done
> >> --
> >> 2.20.0.rc1.387.gf8505762e3
> >>
> >>
> 

^ permalink raw reply

* Re: Git Tags
From: Mateusz Loskot @ 2018-11-29 11:56 UTC (permalink / raw)
  To: git
In-Reply-To: <c8fc0da2-c3ff-4985-e4a2-a066a3a6f2af@peter-speer.de>

On Thu, 29 Nov 2018 at 12:50, Stefanie Leisestreichler
<stefanie.leisestreichler@peter-speer.de> wrote:
>
> git tag -a 0.9.0
> git push origin master
>
> In my local repository, when I run "git tag" it is showing me "0.9.0".
>
> Then I did (on box B)
> git clone ssh://user@host:/path/project.git
> cd project
> git tag
>
> Now git tag is showing nothing.
>
> Why is the tag only available in my local repository?

From https://git-scm.com/book/en/v2/Git-Basics-Tagging
"By default, the git push command doesn’t transfer tags to remote servers.
You will have to explicitly push tags to a shared server after you
have created them."

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net

^ permalink raw reply

* Git Tags
From: Stefanie Leisestreichler @ 2018-11-29 11:11 UTC (permalink / raw)
  To: git

Hi.

I have done this (on box A):

git commit -m "Message"
git tag -a 0.9.0
git push origin master

In my local repository, when I run "git tag" it is showing me "0.9.0".

Then I did (on box B)
git clone ssh://user@host:/path/project.git
cd project
git tag

Now git tag is showing nothing.

Why is the tag only available in my local repository?

Also when I try to
git clone --branch 0.9.0 ssh://user@host:/path/project.git
it tells me: fatal:remote branch not found in upstream repository origin

Why is the tag not available in origin?

Thanks,
Stefanie


^ permalink raw reply

* Re: [PATCH v11 20/22] stash: convert `stash--helper.c` into `stash.c`
From: Johannes Schindelin @ 2018-11-29 10:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Paul-Sebastian Ungureanu, git, t.gummerer
In-Reply-To: <87ftvmytqj.fsf@evledraar.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1998 bytes --]

Hi,

On Wed, 28 Nov 2018, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Nov 27 2018, Johannes Schindelin wrote:
> 
> > On Mon, 26 Nov 2018, Junio C Hamano wrote:
> >
> >> Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> writes:
> >>
> >> > The old shell script `git-stash.sh`  was removed and replaced
> >> > entirely by `builtin/stash.c`. In order to do that, `create` and
> >> > `push` were adapted to work without `stash.sh`. For example, before
> >> > this commit, `git stash create` called `git stash--helper create
> >> > --message "$*"`. If it called `git stash--helper create "$@"`, then
> >> > some of these changes wouldn't have been necessary.
> >> >
> >> > This commit also removes the word `helper` since now stash is
> >> > called directly and not by a shell script.
> >>
> >> Seeing the recent trouble in "rebase in C" and how keeping the
> >> scripted version as "git legacy-rebase" helped us postpone the
> >> rewritten version without ripping the whole thing out, I wonder if
> >> we can do the same here.
> >
> > Feel very free to cherry-pick
> > https://github.com/git-for-windows/git/commit/004da7e7faa36c872868ae938e06594ea1c2f01c
> > and
> > https://github.com/git-for-windows/git/commit/cedfcd39f5a4e4beb33e16fa67c4659fd4bdabf6
> > which is what we carry in Git for Windows.
> 
> ...and then something similar to 62c23938fa ("tests: add a special setup
> where rebase.useBuiltin is off", 2018-11-14) so those of us who're
> smoking next for bugs can test both and report if some of the test
> setups (odd OS's etc) show a difference in behavior.

I allowed myself to make those changes, and to reorder the last three
patches as asked by Junio. Paul, please find the result at
https://github.com/dscho/git as `git-stash`. If you agree with it, I would
be delighted if you resubmitted it directly after Git v2.20.0 is released
(Junio, it seemed that -rc1 slipped a couple of days, and now -rc2, too,
any word when you think the final v2.20.0 is due?).

Thanks,
Dscho

^ permalink raw reply

* Re: [PATCH] rebase -i: introduce the 'test' command
From: Johannes Schindelin @ 2018-11-29 10:55 UTC (permalink / raw)
  To: Paul Morelle; +Cc: Git Users
In-Reply-To: <nycvar.QRO.7.76.6.1811281935310.41@tvgsbejvaqbjf.bet>

Hi Paul,

On Thu, 29 Nov 2018, Johannes Schindelin wrote:

> I already added a test... See the reschedule-failed-exec branch on
> https://github.com/dscho/git.

And now I put up a proper Pull Request, to be submitted via GitGitGadget
right after Git v2.20.0 will be released (technically, we are in a
feature freeze period right now, I just could no resist, as I found your
reasoning for rescheduling failed `exec` commands compelling, and there
were no `rebase` bugs for me to fix).

https://github.com/gitgitgadget/git/pull/90

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH 2/2] format-patch: allow for independent diff & range-diff options
From: Ævar Arnfjörð Bjarmason @ 2018-11-29 10:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Eric Sunshine
In-Reply-To: <nycvar.QRO.7.76.6.1811291103190.41@tvgsbejvaqbjf.bet>


On Thu, Nov 29 2018, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Wed, 28 Nov 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> Change the semantics of the "--range-diff" option so that the regular
>> diff options can be provided separately for the range-diff and the
>> patch. This allows for supplying e.g. --range-diff-U0 and -U1 to
>> "format-patch" to provide different context for the range-diff and the
>> patch. This wasn't possible before.
>
> I really, really dislike the `--range-diff-<random-thing>`. We have
> precedent for passing optional arguments that are passed to some other
> command, so a much more logical and consistent convention would be to use
> `--range-diff[=<diff-option>..]`, allowing all of the diff options that
> you might want to pass to the outer diff in one go rather than having a
> lengthy string of `--range-diff-this` and `--range-diff-that` options.

Where do we pass those sorts of arguments?

Reasons I did it this way:

 a) Passing it as one option will require the user to double-quote those
    options that take quoted arguments (e.g. --word-diff-regex), which I
    thought sucked more than the prefix. On the implementation side we
    couldn't leave the parsing of the command-line to the shell anymore.

 b) I think people will want to tweak this very rarely, much more rarely
    than e.g. -U10 in format-patch itself, so having something long-ish
    doesn't sound bad.

> I only had time to skim the patch, and I have to wonder why you pass
> around full-blown `rev_info` structs for range diff (and with that really
> awful name `rd_rev`) rather than just the `diff_options` that you
> *actually* care about?

Because setup_revisions() which does all the command-line parsing needs
a rev_info, so even if we only need the diffopt in the end we need to
initiate the whole thing.

Suggestions for a better varibale name most welcome.

> Ciao,
> Dscho
>
>>
>> Ever since the "--range-diff" option was added in
>> 31e2617a5f ("format-patch: add --range-diff option to embed diff in
>> cover letter", 2018-07-22) the "rev->diffopt" we pass down to the diff
>> machinery has been the one we get from "format-patch"'s own
>> setup_revisions().
>>
>> This sort of thing is unique among the log-like commands in
>> builtin/log.c, no command than format-patch will embed the output of
>> another log-like command. Since the "rev->diffopt" is reused we need
>> to munge it before we pass it to show_range_diff(). See
>> 43dafc4172 ("format-patch: don't include --stat with --range-diff
>> output", 2018-11-22) for a related regression fix which is being
>> mostly reverted here.
>>
>> Implementation notes: 1) We're not bothering with the full teardown
>> around die() and will leak memory, but it's too much boilerplate to do
>> all the frees with/without the die() and not worth it. 2) We call
>> repo_init_revisions() for "rd_rev" even though we could get away with
>> a shallow copy like the code we're replacing (and which
>> show_range_diff() itself does). This is to make this code more easily
>> understood.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  Documentation/git-format-patch.txt | 10 ++++++-
>>  builtin/log.c                      | 42 +++++++++++++++++++++++-------
>>  t/t3206-range-diff.sh              | 41 +++++++++++++++++++++++++++++
>>  3 files changed, 82 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
>> index aba4c5febe..6c048f415f 100644
>> --- a/Documentation/git-format-patch.txt
>> +++ b/Documentation/git-format-patch.txt
>> @@ -24,7 +24,8 @@ SYNOPSIS
>>  		   [--to=<email>] [--cc=<email>]
>>  		   [--[no-]cover-letter] [--quiet] [--notes[=<ref>]]
>>  		   [--interdiff=<previous>]
>> -		   [--range-diff=<previous> [--creation-factor=<percent>]]
>> +		   [--range-diff=<previous> [--creation-factor=<percent>]
>> +		      [--range-diff<common diff option>]]
>>  		   [--progress]
>>  		   [<common diff options>]
>>  		   [ <since> | <revision range> ]
>> @@ -257,6 +258,13 @@ feeding the result to `git send-email`.
>>  	creation/deletion cost fudge factor. See linkgit:git-range-diff[1])
>>  	for details.
>>
>> +--range-diff<common diff option>::
>> +	Other options prefixed with `--range-diff` are stripped of
>> +	that prefix and passed as-is to the diff machinery used to
>> +	generate the range-diff, e.g. `--range-diff-U0` and
>> +	`--range-diff--no-color`. This allows for adjusting the format
>> +	of the range-diff independently from the patch itself.
>> +
>>  --notes[=<ref>]::
>>  	Append the notes (see linkgit:git-notes[1]) for the commit
>>  	after the three-dash line.
>> diff --git a/builtin/log.c b/builtin/log.c
>> index 02d88fa233..7658e56ecc 100644
>> --- a/builtin/log.c
>> +++ b/builtin/log.c
>> @@ -1023,7 +1023,8 @@ static void show_diffstat(struct rev_info *rev,
>>  	fprintf(rev->diffopt.file, "\n");
>>  }
>>
>> -static void make_cover_letter(struct rev_info *rev, int use_stdout,
>> +static void make_cover_letter(struct rev_info *rev, struct rev_info *rd_rev,
>> +			      int use_stdout,
>>  			      struct commit *origin,
>>  			      int nr, struct commit **list,
>>  			      const char *branch_name,
>> @@ -1095,13 +1096,9 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
>>  	}
>>
>>  	if (rev->rdiff1) {
>> -		struct diff_options opts;
>> -		memcpy(&opts, &rev->diffopt, sizeof(opts));
>> -		opts.output_format &= ~(DIFF_FORMAT_DIFFSTAT | DIFF_FORMAT_SUMMARY);
>> -
>>  		fprintf_ln(rev->diffopt.file, "%s", rev->rdiff_title);
>>  		show_range_diff(rev->rdiff1, rev->rdiff2,
>> -				rev->creation_factor, 1, &opts);
>> +				rev->creation_factor, 1, &rd_rev->diffopt);
>>  	}
>>  }
>>
>> @@ -1485,6 +1482,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>>  	struct commit *commit;
>>  	struct commit **list = NULL;
>>  	struct rev_info rev;
>> +	struct rev_info rd_rev;
>>  	struct setup_revision_opt s_r_opt;
>>  	int nr = 0, total, i;
>>  	int use_stdout = 0;
>> @@ -1603,6 +1601,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>>  	init_log_defaults();
>>  	git_config(git_format_config, NULL);
>>  	repo_init_revisions(the_repository, &rev, prefix);
>> +	repo_init_revisions(the_repository, &rd_rev, prefix);
>>  	rev.commit_format = CMIT_FMT_EMAIL;
>>  	rev.expand_tabs_in_log_default = 0;
>>  	rev.verbose_header = 1;
>> @@ -1689,8 +1688,32 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>>  	rev.preserve_subject = keep_subject;
>>
>>  	argc = setup_revisions(argc, argv, &rev, &s_r_opt);
>> -	if (argc > 1)
>> -		die(_("unrecognized argument: %s"), argv[1]);
>> +	if (argc > 1) {
>> +		struct argv_array args = ARGV_ARRAY_INIT;
>> +		const char *prefix = "--range-diff";
>> +		int have_prefix = 0;
>> +
>> +		for (i = 0; i < argc; i++) {
>> +			struct strbuf sb = STRBUF_INIT;
>> +			char *str;
>> +
>> +			strbuf_addstr(&sb, argv[i]);
>> +			if (starts_with(argv[i], prefix)) {
>> +				have_prefix = 1;
>> +				strbuf_remove(&sb, 0, strlen(prefix));
>> +			}
>> +			str = strbuf_detach(&sb, NULL);
>> +			strbuf_release(&sb);
>> +
>> +			argv_array_push(&args, str);
>> +		}
>> +
>> +		if (!have_prefix)
>> +			die(_("unrecognized argument: %s"), argv[1]);
>> +		argc = setup_revisions(args.argc, args.argv, &rd_rev, NULL);
>> +		if (argc > 1)
>> +			die(_("unrecognized argument: %s"), argv[1]);
>> +	}
>>
>>  	if (rev.diffopt.output_format & DIFF_FORMAT_NAME)
>>  		die(_("--name-only does not make sense"));
>> @@ -1702,7 +1725,6 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>>  	if (!use_patch_format &&
>>  		(!rev.diffopt.output_format ||
>>  		 rev.diffopt.output_format == DIFF_FORMAT_PATCH))
>> -		/* Needs to be mirrored in show_range_diff() invocation */
>>  		rev.diffopt.output_format = DIFF_FORMAT_DIFFSTAT | DIFF_FORMAT_SUMMARY;
>>  	if (!rev.diffopt.stat_width)
>>  		rev.diffopt.stat_width = MAIL_DEFAULT_WRAP;
>> @@ -1877,7 +1899,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>>  	if (cover_letter) {
>>  		if (thread)
>>  			gen_message_id(&rev, "cover");
>> -		make_cover_letter(&rev, use_stdout,
>> +		make_cover_letter(&rev, &rd_rev, use_stdout,
>>  				  origin, nr, list, branch_name, quiet);
>>  		print_bases(&bases, rev.diffopt.file);
>>  		print_signature(rev.diffopt.file);
>> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
>> index bc5facc1cd..6916103888 100755
>> --- a/t/t3206-range-diff.sh
>> +++ b/t/t3206-range-diff.sh
>> @@ -308,6 +308,35 @@ test_expect_success 'format-patch with <common diff option>' '
>>  		--range-diff=topic~..topic changed~..changed >actual.raw &&
>>  	sed -ne "/^1:/,/^--/p" <actual.raw >actual.range-diff &&
>>  	sed -e "s|:$||" >expect <<-\EOF &&
>> +	1:  a63e992 ! 1:  d966c5c s/12/B/
>> +	    @@ -8,7 +8,7 @@
>> +	     @@
>> +	      9
>> +	      10
>> +	    - B
>> +	    + BB
>> +	     -12
>> +	     +B
>> +	      13
>> +	-- :
>> +	EOF
>> +	test_cmp expect actual.range-diff &&
>> +	sed -ne "/^--- /,/^--/p" <actual.raw >actual.diff &&
>> +	sed -e "s|:$||" >expect <<-\EOF &&
>> +	--- a/file
>> +	+++ b/file
>> +	@@ -12 +12 @@ BB
>> +	-12
>> +	+B
>> +	-- :
>> +	EOF
>> +	test_cmp expect actual.diff &&
>> +
>> +	# -U0 & --range-diff-U0
>> +	git format-patch --cover-letter --stdout -U0 --range-diff-U0 \
>> +		--range-diff=topic~..topic changed~..changed >actual.raw &&
>> +	sed -ne "/^1:/,/^--/p" <actual.raw >actual.range-diff &&
>> +	sed -e "s|:$||" >expect <<-\EOF &&
>>  	1:  a63e992 ! 1:  d966c5c s/12/B/
>>  	    @@ -11 +11 @@
>>  	    - B
>> @@ -327,4 +356,16 @@ test_expect_success 'format-patch with <common diff option>' '
>>  	test_cmp expect actual.diff
>>  '
>>
>> +test_expect_success 'format-patch option parsing with --range-diff-*' '
>> +	test_must_fail git format-patch --stdout --unknown \
>> +		master..unmodified 2>stderr &&
>> +	test_i18ngrep "unrecognized argument: --unknown" stderr &&
>> +	test_must_fail git format-patch --stdout --range-diff-unknown \
>> +		master..unmodified 2>stderr &&
>> +	test_i18ngrep "unrecognized argument: --range-diff-unknown" stderr &&
>> +	test_must_fail git format-patch --stdout --unknown --range-diff-unknown \
>> +		master..unmodified 2>stderr &&
>> +	test_i18ngrep "unrecognized argument: --unknown" stderr
>> +'
>> +
>>  test_done
>> --
>> 2.20.0.rc1.387.gf8505762e3
>>
>>

^ permalink raw reply

* Re: [bug report] git-gui child windows are blank
From: Eric Sunshine @ 2018-11-29 10:16 UTC (permalink / raw)
  To: Stefan Beller; +Cc: kenn, Git List
In-Reply-To: <CAGZ79kZtrj-Gg8P-xJmn8TYjgXuUmj8pJUXU+Vf89F0tKHuYBg@mail.gmail.com>

On Wed, Nov 28, 2018 at 2:29 PM Stefan Beller <sbeller@google.com> wrote:
> On Wed, Nov 28, 2018 at 6:13 AM Kenn Sebesta <kenn@eissq.com> wrote:
> > v2.19.2, installed from brew on macOS Mojave 14.2.1.
> >
> > `git-gui` is my much beloved go-to tool for everything git.
> > Unfortunately, on my new Macbook Air it seems to have a bug. When I
> > first load the program, the parent window populates normally with the
> > stage/unstaged and diff panes. However, when I click Push, the child
> > window is completely blank. The frame is there, but there is no
> > content.
>
> Looking through the mailing list archive, this
> seemed to be one of the last relevant messges
> https://public-inbox.org/git/20180724065120.7664-1-sunshine@sunshineco.com/

I was hoping that Junio would patch-monkey that one since that
crash-at-launch makes gitk unusable for Mojave users, but apparently
it never got picked up.

Anyhow, the issue fixed by that patch has to do with 'osascript' on
Apple, and it's the only such invocation in the source code of
gitk/git-gui. The 'osascript' invocation merely attempts to foreground
the application at launch, so is almost certainly unrelated to the
above reported behavior. Somebody running Mojave will likely need to
spend some time debugging it. (Unfortunately, I'm a couple major
releases behind and don't plan on upgrading.)

^ permalink raw reply

* Re: in 2.19.2 t0061-run-command FAILs if . is in $PATH
From: Johannes Schindelin @ 2018-11-29 10:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: H.Merijn Brand, git
In-Reply-To: <xmqqpnuor6qs.fsf@gitster-ct.c.googlers.com>

Hi Merijn and Junio,

On Thu, 29 Nov 2018, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > -test_expect_success 'run_command is restricted to PATH' '
> > +test_lazy_prereq DOT_IN_PATH '
> > +	case ":$PATH:" in
> > +	*:.:*) true;;
> > +	*) false;;
> > +	esac
> > +'
> 
> An empty element in the colon-separated list also serves as an
> instruction to pick up executable from $cwd, so
> 
> 	case ":$PATH:" in
> 	*:.:** | *::*) true ;;
> 	*) false ;;
> 	esac
> 
> perhaps.

Good point.

Merijn, please be sure to squash this fix in before you submit the final
thing.

Thanks,
Johannes

> 
> > +test_expect_success !DOT_IN_PATH 'run_command is restricted to PATH' '
> >  	write_script should-not-run <<-\EOF &&
> >  	echo yikes
> >  	EOF
> > -- snap --
> >
> > If so, can you please provide a commit message for it (you can add my
> > Signed-off-by: line and your Tested-by: line).
> >
> > Thanks,
> > Johannes
> 

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox