Git development
 help / color / mirror / Atom feed
* Re: Assertion in git log graphing [regression in v2.25]
From: Derrick Stolee @ 2020-01-07 14:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Bradley Smith, Junio C Hamano, James Coglan, git
In-Reply-To: <20200107140417.GA12242@coredump.intra.peff.net>

On 1/7/2020 9:04 AM, Jeff King wrote:
> On Tue, Jan 07, 2020 at 08:25:59AM -0500, Derrick Stolee wrote:
> 
>> On 1/7/2020 6:48 AM, Jeff King wrote:
>>> The assertion itself is quite old, so I wondered if it was even still
>>> relevant. Removing it does produce a reasonable-looking graph:
>>
>> As I'm digging into this case, and finding when the assertion is hit,
>> I see that the issue is in the line further below your coloring issue:
> 
> Oh, you're right. I totally missed that.
> 
> So perhaps we have two bugs, or perhaps they have the same root cause.
> 
>>>   | | | | * dd068b4 Merge commit '8f076d8' into HEAD
>>>   | |_|_|/| 
>>>   |/| | |/  
>>>   | | |/|   
>>>   | |/| |   
>>>   | * | | 8f076d8 5
>>
>> What is output is actually this, above. But the logic that includes the
>> assert is checking where the underscores end, and the shown underscores
>> actually pass the check. The issue is that it seems like it really wants
>> to show this:
>>
>>>   | | | | * dd068b4 Merge commit '8f076d8' into HEAD
>>>   | |_|_|/| 
>>>   |/| |_|/  
>>>   | |/| |   
>>>   | * | | 8f076d8 5
>>
>> Note that I dropped a line and compressed a slash into an underscore. It's
>> on that line that this condition is being hit.
> 
> Hrm. I could see either being acceptable, but I do think the second one
> is a bit easier to read. I'm not sure which was intended for this case.
Somewhat expanding the situation to test more of these collapses, I can get
the following graph:

*   6_M1
|\
| | *   6_M2
| | |\
| | | * 6_H
| | | | *   6_M3
| | | | |\
| | | | | * 6_J
| | | | * | 6_I
| | | | | | *   6_M4
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
| | * | | | 6_G
| | | |_|/
| | |/| |
| | * | | 6_F
| * | | | 6_E
| | |/ /
| |/| |
| * | | 6_D
| | |/
| |/|
* | | 6_C
| |/
|/|
* | 6_B
|/
* 6_A

Note how 6_M4 has three parents, and the first parent has a horizontal
edge. Even giving an extra padding line between horizontal edges, we
_could_ output the following instead:


| | | | | | *   6_M4
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
| | * | | | 6_G

However, I'll stick with the fix for the coloring issue. I think it
was a previous bug that just wasn't hit until now.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v3 01/15] rebase: extend the options for handling of empty commits
From: Phillip Wood @ 2020-01-07 14:37 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Johannes.Schindelin, phillip.wood, liu.denton, gitster, plroskin,
	alban.gruin, szeder.dev, Elijah Newren
In-Reply-To: <1c2b77e94d63f86590ca934855066eca278f576e.1577217299.git.gitgitgadget@gmail.com>

Hi Elijah

Thanks for working on this series, I think making the sequencer the 
default backend is a good idea. I have a few reservations about this 
path though...

On 24/12/2019 19:54, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Extend the interactive machinery with the ability to handle the full
> spread of options for how to handle commits that either start or become
> empty (by "become empty" I mean the changes in a commit are a subset of
> changes that exist upstream, so the net effect of applying the commit is
> no changes).  Introduce a new command line flag for selecting the
> desired behavior:
>     --empty={drop,keep,ask}
> with the definitions:
>     drop: drop empty commits
>     keep: keep empty commits
>     ask:  provide the user a chance to interact and pick what to do with
>           empty commits on a case-by-case basis

I think we want to distinguish between commits that are empty before
rebasing and those that become empty when they are rebased. --keep-empty 
explicily only applies to commits that are already empty. Cherry-pick
has distinct options for those two cases. If I've explicitly created an 
empty commit then I want to keep it but I don't want to keep commits 
that become empty because the changes they contain are already upstream.

If we want an option that keeps commits that become empty (Off hand I 
don't know why we would though) we should consider if that option should 
disable --cherry-mark when we create the todo list so that it keeps all 
commits that become empty when they're rebased.

> Note that traditionally, am-based rebases have always dropped commits
> that either started or became empty, while interactive-based rebases
> have defaulted to ask (and provided an option to keep commits that
> started empty).  This difference made sense since users of an am-based
> rebase just wanted to quickly batch apply a sequence of commits, while
> users editing a todo list will likely want the chance to interact and
> handle unusual cases on a case-by-case basis.  

I don't see why it makes sense to drop an empty commit that I've made 
just because it is being rebased. I'm pretty sure the behavor of the 
am-based rebase is a function of `git am` not being able to create empty 
commits.

> However, not all rebases
> using the interactive machinery are explicitly interactive anymore.  In
> particular --merge was always meant to behave more like --am: just
> rebase a batch of commits without popping up a todo list.
> 
> If the --empty flag is not specified, pick defaults as follows:
>     explicitly interactive: ask
>     --exec: keep (exec is about checking existing commits, and often
>                   used without actually changing the base.  Thus the
>                   expectation is that the user doesn't necessarily want
>                   anything to change; they just want to test).
>     otherwise: drop

I'm not sure I like changing the behavior based on --exec, I see what
you're getting at but it has the potential to be confusing. What if I
want to rearrange the commits without changing the base - why must I
specify --empty=keep there but not if I add --exec to the command line?

> Also, this commit makes --keep-empty just imply --empty=keep, and hides
> it from help so that we aren't confusing users with different ways to do
> the same thing.  (I could have added a --drop-empty flag, but then that
> invites users to specify both --keep-empty and --drop-empty and we have
> to add sanity checking around that; it seems cleaner to have a single
> multi-valued option.)  This actually fixes --keep-empty too; previously,
> it only meant to sometimes keep empty commits, in particular commits
> which started empty would be kept.  But it would still error out and ask
> the user what to do with commits that became empty.  Now it keeps empty
> commits, as instructed.

It certainly changes the behavior of --keep-empty but I'm not sure it 
"fixes" it. If I have some empty commits I want to keep as placeholders 
then that's different from wanting to keep commits that become empty 
because their changes are upstream but --cherry-mark didn't detect them.

In summary I'm in favor of making it easier to drop commits that become 
empty but not tying that to the handling of commits that are empty 
before they are rebased.

I'm also not happy that the deprecation of --keep-empty suddenly makes 
--no-keep-empty an error.

> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  Documentation/git-rebase.txt      | 35 ++++++------
>  builtin/rebase.c                  | 83 +++++++++++++++++++++++++---
>  rebase-interactive.c              |  4 +-
>  rebase-interactive.h              |  2 +-
>  sequencer.c                       | 74 +++++++++++++++++++------
>  sequencer.h                       |  6 ++-
>  t/t3421-rebase-topology-linear.sh |  4 +-
>  t/t3424-rebase-empty.sh           | 89 +++++++++++++++++++++++++++++++
>  t/t3427-rebase-subtree.sh         | 16 +++---
>  9 files changed, 263 insertions(+), 50 deletions(-)
>  create mode 100755 t/t3424-rebase-empty.sh
> 
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 1d0e2d27cc..ff32ca1080 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -258,9 +258,25 @@ See also INCOMPATIBLE OPTIONS below.
>  	original branch. The index and working tree are also left
>  	unchanged as a result.
>  
> +--empty={drop,keep,ask}::
> +	How to handle commits that become empty (because they contain a
> +	subset of already upstream changes) or start empty.  With drop
> +	(the default), commits that start or become empty are dropped.
> +	With keep (implied by --exec), such commits are kept.  With ask
> +	(implied by --interactive), the rebase will halt when an empty
> +	commit is applied allowing you to choose whether to drop it or
> +	commit it.  Also with ask, if the rebase is interactive then
> +	commits which start empty will be commented out in the todo
> +	action list (giving you a chance to uncomment).
> ++
> +Note that this has no effect on commits which are already upstream (as
> +can be checked via `git log --cherry-mark ...`), which are always
> +dropped by rebase.
> ++
> +See also INCOMPATIBLE OPTIONS below.
> +
>  --keep-empty::
> -	Keep the commits that do not change anything from its
> -	parents in the result.
> +	Deprecated alias for what is now known as --empty=keep.
>  +
>  See also INCOMPATIBLE OPTIONS below.
>  
> @@ -569,6 +585,7 @@ are incompatible with the following options:
>   * --interactive
>   * --exec
>   * --keep-empty
> + * --empty=
>   * --edit-todo
>   * --root when used in combination with --onto
>  
> @@ -580,6 +597,7 @@ In addition, the following pairs of options are incompatible:
>   * --preserve-merges and --ignore-whitespace
>   * --preserve-merges and --committer-date-is-author-date
>   * --preserve-merges and --ignore-date
> + * --preserve-merges and --empty=
>   * --keep-base and --onto
>   * --keep-base and --root
>  
> @@ -588,19 +606,6 @@ BEHAVIORAL DIFFERENCES
>  
>  There are some subtle differences how the backends behave.
>  
> -Empty commits
> -~~~~~~~~~~~~~
> -
> -The am backend drops any "empty" commits, regardless of whether the
> -commit started empty (had no changes relative to its parent to
> -start with) or ended empty (all changes were already applied
> -upstream in other commits).
> -
> -The interactive backend drops commits by default that
> -started empty and halts if it hits a commit that ended up empty.
> -The `--keep-empty` option exists for the interactive backend to allow
> -it to keep commits that started empty.
> -
>  Directory rename detection
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index ddf33bc9d4..6903249307 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -50,8 +50,16 @@ enum rebase_type {
>  	REBASE_PRESERVE_MERGES
>  };
>  
> +enum empty_type {
> +	EMPTY_UNSPECIFIED = -1,
> +	EMPTY_DROP,
> +	EMPTY_KEEP,
> +	EMPTY_ASK
> +};
> +
>  struct rebase_options {
>  	enum rebase_type type;
> +	enum empty_type empty;
>  	const char *state_dir;
>  	struct commit *upstream;
>  	const char *upstream_name;
> @@ -77,7 +85,6 @@ struct rebase_options {
>  	const char *action;
>  	int signoff;
>  	int allow_rerere_autoupdate;
> -	int keep_empty;
>  	int autosquash;
>  	int ignore_whitespace;
>  	char *gpg_sign_opt;
> @@ -95,6 +102,7 @@ struct rebase_options {
>  
>  #define REBASE_OPTIONS_INIT {			  	\
>  		.type = REBASE_UNSPECIFIED,	  	\
> +		.empty = EMPTY_UNSPECIFIED,	  	\
>  		.flags = REBASE_NO_QUIET, 		\
>  		.git_am_opts = ARGV_ARRAY_INIT,		\
>  		.git_format_patch_opt = STRBUF_INIT	\
> @@ -114,6 +122,10 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
>  		replay.allow_rerere_auto = opts->allow_rerere_autoupdate;
>  	replay.allow_empty = 1;
>  	replay.allow_empty_message = opts->allow_empty_message;
> +	replay.drop_redundant_commits = (opts->empty == EMPTY_DROP);
> +	replay.keep_redundant_commits = (opts->empty == EMPTY_KEEP);
> +	replay.ask_on_initially_empty = (opts->empty == EMPTY_ASK &&
> +					 !(opts->flags & REBASE_INTERACTIVE_EXPLICIT));
>  	replay.verbose = opts->flags & REBASE_VERBOSE;
>  	replay.reschedule_failed_exec = opts->reschedule_failed_exec;
>  	replay.committer_date_is_author_date =
> @@ -389,7 +401,10 @@ static int run_rebase_interactive(struct rebase_options *opts,
>  
>  	git_config_get_bool("rebase.abbreviatecommands", &abbreviate_commands);
>  
> -	flags |= opts->keep_empty ? TODO_LIST_KEEP_EMPTY : 0;
> +	flags |= (opts->empty == EMPTY_DROP) ? TODO_LIST_DROP_EMPTY : 0;
> +	flags |= (opts->empty == EMPTY_ASK &&
> +		  opts->flags & REBASE_INTERACTIVE_EXPLICIT) ?
> +			TODO_LIST_ASK_EMPTY : 0;
>  	flags |= abbreviate_commands ? TODO_LIST_ABBREVIATE_CMDS : 0;
>  	flags |= opts->rebase_merges ? TODO_LIST_REBASE_MERGES : 0;
>  	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
> @@ -453,6 +468,19 @@ static int run_rebase_interactive(struct rebase_options *opts,
>  	return ret;
>  }
>  
> +static int parse_opt_keep_empty(const struct option *opt, const char *arg,
> +				int unset)
> +{
> +	struct rebase_options *opts = opt->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +	BUG_ON_OPT_ARG(arg);
> +
> +	opts->empty = EMPTY_KEEP;
> +	opts->type = REBASE_INTERACTIVE;
> +	return 0;
> +}
> +
>  static const char * const builtin_rebase_interactive_usage[] = {
>  	N_("git rebase--interactive [<options>]"),
>  	NULL
> @@ -466,7 +494,10 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
>  	struct option options[] = {
>  		OPT_NEGBIT(0, "ff", &opts.flags, N_("allow fast-forward"),
>  			   REBASE_FORCE),
> -		OPT_BOOL(0, "keep-empty", &opts.keep_empty, N_("keep empty commits")),
> +		{ OPTION_CALLBACK, 'k', "keep-empty", &options, NULL,
> +			N_("(DEPRECATED) keep empty commits"),
> +			PARSE_OPT_NOARG | PARSE_OPT_NONEG | PARSE_OPT_HIDDEN,

It is all very well deprecating --keep-empty but suddenly making 
'--no-keep-empty' an error goes beyond deprecation. Also I'm not sure 
it's worth changing these options as I think the only user is 
git-rebase--preserve-merges.sh

Best Wishes

Phillip

> +			parse_opt_keep_empty },
>		OPT_BOOL(0, "allow-empty-message", &opts.allow_empty_message,
>  			 N_("allow commits with empty messages")),
>  		OPT_BOOL(0, "rebase-merges", &opts.rebase_merges, N_("rebase merge commits")),
> @@ -1166,7 +1197,7 @@ static int run_specific_rebase(struct rebase_options *opts, enum action action)
>  		opts->allow_rerere_autoupdate ?
>  			opts->allow_rerere_autoupdate == RERERE_AUTOUPDATE ?
>  			"--rerere-autoupdate" : "--no-rerere-autoupdate" : "");
> -	add_var(&script_snippet, "keep_empty", opts->keep_empty ? "yes" : "");
> +	add_var(&script_snippet, "empty", opts->empty == EMPTY_KEEP ? "yes" : "");
>  	add_var(&script_snippet, "autosquash", opts->autosquash ? "t" : "");
>  	add_var(&script_snippet, "gpg_sign_opt", opts->gpg_sign_opt);
>  	add_var(&script_snippet, "cmd", opts->cmd);
> @@ -1360,6 +1391,29 @@ static int parse_opt_interactive(const struct option *opt, const char *arg,
>  	return 0;
>  }
>  
> +static enum empty_type parse_empty_value(const char *value)
> +{
> +	if (!strcasecmp(value, "drop"))
> +		return EMPTY_DROP;
> +	else if (!strcasecmp(value, "keep"))
> +		return EMPTY_KEEP;
> +	else if (!strcasecmp(value, "ask"))
> +		return EMPTY_ASK;
> +
> +	die(_("unrecognized empty type '%s'; valid values are \"drop\", \"keep\", and \"ask\"."), value);
> +}
> +
> +static int parse_opt_empty(const struct option *opt, const char *arg, int unset)
> +{
> +	struct rebase_options *options = opt->value;
> +	enum empty_type value = parse_empty_value(arg);
> +
> +	BUG_ON_OPT_NEG(unset);
> +
> +	options->empty = value;
> +	return 0;
> +}
> +
>  static void NORETURN error_on_missing_default_upstream(void)
>  {
>  	struct branch *current_branch = branch_get(NULL);
> @@ -1505,8 +1559,13 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  				 "ignoring them"),
>  			      REBASE_PRESERVE_MERGES, PARSE_OPT_HIDDEN),
>  		OPT_RERERE_AUTOUPDATE(&options.allow_rerere_autoupdate),
> -		OPT_BOOL('k', "keep-empty", &options.keep_empty,
> -			 N_("preserve empty commits during rebase")),
> +		OPT_CALLBACK_F(0, "empty", &options, N_("{drop,keep,ask}"),
> +			       N_("how to handle empty commits"),
> +			       PARSE_OPT_NONEG, parse_opt_empty),
> +		{ OPTION_CALLBACK, 'k', "keep-empty", &options, NULL,
> +			N_("(DEPRECATED) keep empty commits"),
> +			PARSE_OPT_NOARG | PARSE_OPT_NONEG | PARSE_OPT_HIDDEN,
> +			parse_opt_keep_empty },
>  		OPT_BOOL(0, "autosquash", &options.autosquash,
>  			 N_("move commits that begin with "
>  			    "squash!/fixup! under -i")),
> @@ -1770,8 +1829,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  	if (!(options.flags & REBASE_NO_QUIET))
>  		argv_array_push(&options.git_am_opts, "-q");
>  
> -	if (options.keep_empty)
> -		imply_interactive(&options, "--keep-empty");
> +	if (options.empty != EMPTY_UNSPECIFIED)
> +		imply_interactive(&options, "--empty");
>  
>  	if (gpg_sign) {
>  		free(options.gpg_sign_opt);
> @@ -1856,6 +1915,14 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  		break;
>  	}
>  
> +	if (options.empty == EMPTY_UNSPECIFIED) {
> +		if (options.flags & REBASE_INTERACTIVE_EXPLICIT)
> +			options.empty = EMPTY_ASK;
> +		else if (exec.nr > 0)
> +			options.empty = EMPTY_KEEP;
> +		else
> +			options.empty = EMPTY_DROP;
> +	}
>  	if (reschedule_failed_exec > 0 && !is_interactive(&options))
>  		die(_("--reschedule-failed-exec requires "
>  		      "--exec or --interactive"));
> diff --git a/rebase-interactive.c b/rebase-interactive.c
> index aa18ae82b7..ad82bf77df 100644
> --- a/rebase-interactive.c
> +++ b/rebase-interactive.c
> @@ -28,7 +28,7 @@ static enum missing_commit_check_level get_missing_commit_check_level(void)
>  	return MISSING_COMMIT_CHECK_IGNORE;
>  }
>  
> -void append_todo_help(unsigned keep_empty, int command_count,
> +void append_todo_help(unsigned no_ask_empty, int command_count,
>  		      const char *shortrevisions, const char *shortonto,
>  		      struct strbuf *buf)
>  {
> @@ -81,7 +81,7 @@ void append_todo_help(unsigned keep_empty, int command_count,
>  
>  	strbuf_add_commented_lines(buf, msg, strlen(msg));
>  
> -	if (!keep_empty) {
> +	if (!no_ask_empty) {
>  		msg = _("Note that empty commits are commented out");
>  		strbuf_add_commented_lines(buf, msg, strlen(msg));
>  	}
> diff --git a/rebase-interactive.h b/rebase-interactive.h
> index 44dbb06311..f531e00ba7 100644
> --- a/rebase-interactive.h
> +++ b/rebase-interactive.h
> @@ -5,7 +5,7 @@ struct strbuf;
>  struct repository;
>  struct todo_list;
>  
> -void append_todo_help(unsigned keep_empty, int command_count,
> +void append_todo_help(unsigned no_ask_empty, int command_count,
>  		      const char *shortrevisions, const char *shortonto,
>  		      struct strbuf *buf);
>  int edit_todo_list(struct repository *r, struct todo_list *todo_list,
> diff --git a/sequencer.c b/sequencer.c
> index 763ccbbc45..d2c11f34b7 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -160,6 +160,9 @@ static GIT_PATH_FUNC(rebase_path_strategy, "rebase-merge/strategy")
>  static GIT_PATH_FUNC(rebase_path_strategy_opts, "rebase-merge/strategy_opts")
>  static GIT_PATH_FUNC(rebase_path_allow_rerere_autoupdate, "rebase-merge/allow_rerere_autoupdate")
>  static GIT_PATH_FUNC(rebase_path_reschedule_failed_exec, "rebase-merge/reschedule-failed-exec")
> +static GIT_PATH_FUNC(rebase_path_drop_redundant_commits, "rebase-merge/drop_redundant_commits")
> +static GIT_PATH_FUNC(rebase_path_keep_redundant_commits, "rebase-merge/keep_redundant_commits")
> +static GIT_PATH_FUNC(rebase_path_ask_on_initially_empty, "rebase-merge/ask_on_initially_empty")
>  
>  static int git_sequencer_config(const char *k, const char *v, void *cb)
>  {
> @@ -1623,7 +1626,7 @@ static int allow_empty(struct repository *r,
>  	empty_commit = is_original_commit_empty(commit);
>  	if (empty_commit < 0)
>  		return empty_commit;
> -	if (!empty_commit)
> +	if (!empty_commit || opts->ask_on_initially_empty)
>  		return 0;
>  	else
>  		return 1;
> @@ -1837,7 +1840,7 @@ static int do_pick_commit(struct repository *r,
>  	char *author = NULL;
>  	struct commit_message msg = { NULL, NULL, NULL, NULL };
>  	struct strbuf msgbuf = STRBUF_INIT;
> -	int res, unborn = 0, reword = 0, allow;
> +	int res, unborn = 0, reword = 0, allow, drop_commit;
>  
>  	if (opts->no_commit) {
>  		/*
> @@ -2042,13 +2045,20 @@ static int do_pick_commit(struct repository *r,
>  		goto leave;
>  	}
>  
> -	allow = allow_empty(r, opts, commit);
> -	if (allow < 0) {
> -		res = allow;
> -		goto leave;
> -	} else if (allow)
> -		flags |= ALLOW_EMPTY;
> -	if (!opts->no_commit) {
> +	drop_commit = 0;
> +	if (opts->drop_redundant_commits && is_index_unchanged(r)) {
> +		drop_commit = 1;
> +		fprintf(stderr, _("No changes -- Patch already applied."));
> +	} else {
> +		allow = allow_empty(r, opts, commit);
> +		if (allow < 0) {
> +			res = allow;
> +			goto leave;
> +		} else if (allow) {
> +			flags |= ALLOW_EMPTY;
> +		}
> +	}
> +	if (!opts->no_commit && !drop_commit) {
>  		if (author || command == TODO_REVERT || (flags & AMEND_MSG))
>  			res = do_commit(r, msg_file, author, opts, flags);
>  		else
> @@ -2501,9 +2511,15 @@ static int populate_opts_cb(const char *key, const char *value, void *data)

This function is used by cherry-pick/revert not rebase do we need to
change it?

>  	else if (!strcmp(key, "options.allow-empty-message"))
>  		opts->allow_empty_message =
>  			git_config_bool_or_int(key, value, &error_flag);
> +	else if (!strcmp(key, "options.drop-redundant-commits"))
> +		opts->drop_redundant_commits =
> +			git_config_bool_or_int(key, value, &error_flag);
>  	else if (!strcmp(key, "options.keep-redundant-commits"))
>  		opts->keep_redundant_commits =
>  			git_config_bool_or_int(key, value, &error_flag);
> +	else if (!strcmp(key, "options.ask_on_initially_empty"))
> +		opts->ask_on_initially_empty =
> +			git_config_bool_or_int(key, value, &error_flag);>  	else if (!strcmp(key, "options.signoff"))
>  		opts->signoff = git_config_bool_or_int(key, value, &error_flag);
>  	else if (!strcmp(key, "options.record-origin"))
> @@ -2612,6 +2628,15 @@ static int read_populate_opts(struct replay_opts *opts)
>  		if (file_exists(rebase_path_reschedule_failed_exec()))
>  			opts->reschedule_failed_exec = 1;
>  
> +		if (file_exists(rebase_path_drop_redundant_commits()))
> +			opts->drop_redundant_commits = 1;
> +
> +		if (file_exists(rebase_path_keep_redundant_commits()))
> +			opts->keep_redundant_commits = 1;
> +
> +		if (file_exists(rebase_path_ask_on_initially_empty()))
> +			opts->ask_on_initially_empty = 1;
> +
>  		read_strategy_opts(opts, &buf);
>  		strbuf_release(&buf);
>  
> @@ -2695,6 +2720,12 @@ int write_basic_state(struct replay_opts *opts, const char *head_name,
>  		write_file(rebase_path_cdate_is_adate(), "%s", "");
>  	if (opts->ignore_date)
>  		write_file(rebase_path_ignore_date(), "%s", "");
> +	if (opts->drop_redundant_commits)
> +		write_file(rebase_path_drop_redundant_commits(), "%s", "");
> +	if (opts->keep_redundant_commits)
> +		write_file(rebase_path_keep_redundant_commits(), "%s", "");
> +	if (opts->ask_on_initially_empty)
> +		write_file(rebase_path_ask_on_initially_empty(), "%s", "");
>  	if (opts->reschedule_failed_exec)
>  		write_file(rebase_path_reschedule_failed_exec(), "%s", "");
>  
> @@ -3033,9 +3064,15 @@ static int save_opts(struct replay_opts *opts)

again this is for cherry-pick/revert

>  	if (opts->allow_empty_message)
>  		res |= git_config_set_in_file_gently(opts_file,
>  				"options.allow-empty-message", "true");
> +	if (opts->drop_redundant_commits)
> +		res |= git_config_set_in_file_gently(opts_file,
> +				"options.drop-redundant-commits", "true");
>  	if (opts->keep_redundant_commits)
>  		res |= git_config_set_in_file_gently(opts_file,
>  				"options.keep-redundant-commits", "true");
> +	if (opts->ask_on_initially_empty)
> +		res |= git_config_set_in_file_gently(opts_file,
> +				"options.ask_on_initially_empty", "true");
>  	if (opts->signoff)
>  		res |= git_config_set_in_file_gently(opts_file,
>  					"options.signoff", "true");
> @@ -4691,7 +4728,8 @@ static int make_script_with_merges(struct pretty_print_context *pp,
>  				   struct rev_info *revs, struct strbuf *out,
>  				   unsigned flags)
>  {
> -	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
> +	int drop_empty = flags & TODO_LIST_DROP_EMPTY;
> +	int ask_empty = flags & TODO_LIST_ASK_EMPTY;
>  	int rebase_cousins = flags & TODO_LIST_REBASE_COUSINS;
>  	int root_with_onto = flags & TODO_LIST_ROOT_WITH_ONTO;
>  	struct strbuf buf = STRBUF_INIT, oneline = STRBUF_INIT;
> @@ -4746,6 +4784,8 @@ static int make_script_with_merges(struct pretty_print_context *pp,
>  		is_empty = is_original_commit_empty(commit);
>  		if (!is_empty && (commit->object.flags & PATCHSAME))
>  			continue;
> +		if (is_empty && drop_empty)
> +			continue;
>  
>  		strbuf_reset(&oneline);
>  		pretty_print_commit(pp, commit, &oneline);
> @@ -4754,7 +4794,7 @@ static int make_script_with_merges(struct pretty_print_context *pp,
>  		if (!to_merge) {
>  			/* non-merge commit: easy case */
>  			strbuf_reset(&buf);
> -			if (!keep_empty && is_empty)
> +			if (is_empty && ask_empty)
>  				strbuf_addf(&buf, "%c ", comment_line_char);
>  			strbuf_addf(&buf, "%s %s %s", cmd_pick,
>  				    oid_to_hex(&commit->object.oid),
> @@ -4922,7 +4962,8 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>  	struct pretty_print_context pp = {0};
>  	struct rev_info revs;
>  	struct commit *commit;
> -	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
> +	int drop_empty = flags & TODO_LIST_DROP_EMPTY;
> +	int ask_empty = flags & TODO_LIST_ASK_EMPTY;
>  	const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
>  	int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
>  
> @@ -4958,11 +4999,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>  		return make_script_with_merges(&pp, &revs, out, flags);
>  
>  	while ((commit = get_revision(&revs))) {
> -		int is_empty  = is_original_commit_empty(commit);
> +		int is_empty = is_original_commit_empty(commit);
>  
>  		if (!is_empty && (commit->object.flags & PATCHSAME))
>  			continue;
> -		if (!keep_empty && is_empty)
> +		if (is_empty && drop_empty)
> +			continue;
> +		if (is_empty && ask_empty)
>  			strbuf_addf(out, "%c ", comment_line_char);
>  		strbuf_addf(out, "%s %s ", insn,
>  			    oid_to_hex(&commit->object.oid));
> @@ -5100,7 +5143,8 @@ int todo_list_write_to_file(struct repository *r, struct todo_list *todo_list,
>  
>  	todo_list_to_strbuf(r, todo_list, &buf, num, flags);
>  	if (flags & TODO_LIST_APPEND_TODO_HELP)
> -		append_todo_help(flags & TODO_LIST_KEEP_EMPTY, count_commands(todo_list),
> +		append_todo_help(!(flags & TODO_LIST_ASK_EMPTY),
> +				 count_commands(todo_list),
>  				 shortrevisions, shortonto, &buf);
>  
>  	res = write_message(buf.buf, buf.len, file, 0);
> diff --git a/sequencer.h b/sequencer.h
> index e9a0e03ea2..1c3abb661c 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -39,7 +39,9 @@ struct replay_opts {
>  	int allow_rerere_auto;
>  	int allow_empty;
>  	int allow_empty_message;
> +	int drop_redundant_commits;
>  	int keep_redundant_commits;
> +	int ask_on_initially_empty;
>  	int verbose;
>  	int quiet;
>  	int reschedule_failed_exec;
> @@ -134,7 +136,7 @@ int sequencer_rollback(struct repository *repo, struct replay_opts *opts);
>  int sequencer_skip(struct repository *repo, struct replay_opts *opts);
>  int sequencer_remove_state(struct replay_opts *opts);
>  
> -#define TODO_LIST_KEEP_EMPTY (1U << 0)
> +/* #define TODO_LIST_KEEP_EMPTY (1U << 0) */ /* No longer used */
>  #define TODO_LIST_SHORTEN_IDS (1U << 1)
>  #define TODO_LIST_ABBREVIATE_CMDS (1U << 2)
>  #define TODO_LIST_REBASE_MERGES (1U << 3)
> @@ -150,6 +152,8 @@ int sequencer_remove_state(struct replay_opts *opts);
>   * `--onto`, we do not want to re-generate the root commits.
>   */
>  #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
> +#define TODO_LIST_DROP_EMPTY (1U << 7)
> +#define TODO_LIST_ASK_EMPTY (1U << 8)
>  
>  
>  int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
> diff --git a/t/t3421-rebase-topology-linear.sh b/t/t3421-rebase-topology-linear.sh
> index 325072b0a3..d23e0bf778 100755
> --- a/t/t3421-rebase-topology-linear.sh
> +++ b/t/t3421-rebase-topology-linear.sh
> @@ -230,7 +230,7 @@ test_run_rebase () {
>  test_run_rebase success ''
>  test_run_rebase success -m
>  test_run_rebase success -i
> -test_have_prereq !REBASE_P || test_run_rebase failure -p
> +test_have_prereq !REBASE_P || test_run_rebase success -p
>  
>  test_run_rebase () {
>  	result=$1
> @@ -245,7 +245,7 @@ test_run_rebase () {
>  test_run_rebase success ''
>  test_run_rebase success -m
>  test_run_rebase success -i
> -test_have_prereq !REBASE_P || test_run_rebase failure -p
> +test_have_prereq !REBASE_P || test_run_rebase success -p
>  test_run_rebase success --rebase-merges
>  
>  #       m
> diff --git a/t/t3424-rebase-empty.sh b/t/t3424-rebase-empty.sh
> new file mode 100755
> index 0000000000..9d52e1417f
> --- /dev/null
> +++ b/t/t3424-rebase-empty.sh
> @@ -0,0 +1,89 @@
> +#!/bin/sh
> +
> +test_description='git rebase of commits that start or become empty'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup test repository' '
> +	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
> +	test_write_lines A B C D E F G H I J >letters &&
> +	git add numbers letters &&
> +	git commit -m A &&
> +
> +	git branch upstream &&
> +	git branch localmods &&
> +
> +	git checkout upstream &&
> +	test_write_lines A B C D E >letters &&
> +	git add letters &&
> +	git commit -m B &&
> +
> +	test_write_lines 1 2 3 4 five 6 7 8 9 ten >numbers &&
> +	git add numbers &&
> +	git commit -m C &&
> +
> +	git checkout localmods &&
> +	test_write_lines 1 2 3 4 five 6 7 8 9 10 >numbers &&
> +	git add numbers &&
> +	git commit -m C2 &&
> +
> +	git commit --allow-empty -m D &&
> +
> +	test_write_lines A B C D E >letters &&
> +	git add letters &&
> +	git commit -m "Five letters ought to be enough for anybody"
> +'
> +
> +test_expect_success 'rebase --merge --empty=drop' '
> +	git checkout -B testing localmods &&
> +	git rebase --merge --empty=drop upstream &&
> +
> +	test_write_lines C B A >expect &&
> +	git log --format=%s >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'rebase --merge --empty=keep' '
> +	git checkout -B testing localmods &&
> +	git rebase --merge --empty=keep upstream &&
> +
> +	test_write_lines D C2 C B A >expect &&
> +	git log --format=%s >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'rebase --merge --empty=ask' '
> +	git checkout -B testing localmods &&
> +	test_must_fail git rebase --merge --empty=ask upstream &&
> +
> +	test_must_fail git rebase --skip &&
> +	git commit --allow-empty &&
> +	git rebase --continue &&
> +
> +	test_write_lines D C B A >expect &&
> +	git log --format=%s >actual &&
> +	test_cmp expect actual
> +'
> +
> +GIT_SEQUENCE_EDITOR=: && export GIT_SEQUENCE_EDITOR
> +
> +test_expect_success 'rebase --interactive --empty=drop' '
> +	git checkout -B testing localmods &&
> +	git rebase --interactive --empty=drop upstream &&
> +
> +	test_write_lines C B A >expect &&
> +	git log --format=%s >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'rebase --interactive --empty=keep' '
> +	git checkout -B testing localmods &&
> +	git rebase --interactive --empty=keep upstream &&
> +
> +	test_write_lines D C2 C B A >expect &&
> +	git log --format=%s >actual &&
> +	test_cmp expect actual
> +'
> +
> +
> +test_done
> diff --git a/t/t3427-rebase-subtree.sh b/t/t3427-rebase-subtree.sh
> index bec48e6a1f..468ebc1bef 100755
> --- a/t/t3427-rebase-subtree.sh
> +++ b/t/t3427-rebase-subtree.sh
> @@ -85,23 +85,27 @@ test_expect_failure REBASE_P 'Rebase -Xsubtree --keep-empty --preserve-merges --
>  	verbose test "$(commit_message HEAD)" = "Empty commit"
>  '
>  
> -test_expect_success 'Rebase -Xsubtree --keep-empty --onto commit' '
> +test_expect_success 'Rebase -Xsubtree --empty=ask --onto commit' '
>  	reset_rebase &&
>  	git checkout -b rebase-onto to-rebase &&
> -	test_must_fail git rebase -Xsubtree=files_subtree --keep-empty --onto files-master master &&
> +	test_must_fail git rebase -Xsubtree=files_subtree --empty=ask --onto files-master master &&
>  	: first pick results in no changes &&
> -	git rebase --continue &&
> +	test_must_fail git rebase --skip &&
> +	: last pick was an empty commit that has no changes, but we want to keep it &&
> +	git commit --allow-empty &&
>  	verbose test "$(commit_message HEAD~2)" = "master4" &&
>  	verbose test "$(commit_message HEAD~)" = "files_subtree/master5" &&
>  	verbose test "$(commit_message HEAD)" = "Empty commit"
>  '
>  
> -test_expect_success 'Rebase -Xsubtree --keep-empty --rebase-merges --onto commit' '
> +test_expect_success 'Rebase -Xsubtree --empty=ask --rebase-merges --onto commit' '
>  	reset_rebase &&
>  	git checkout -b rebase-merges-onto to-rebase &&
> -	test_must_fail git rebase -Xsubtree=files_subtree --keep-empty --rebase-merges --onto files-master --root &&
> +	test_must_fail git rebase -Xsubtree=files_subtree --empty=ask --rebase-merges --onto files-master --root &&
>  	: first pick results in no changes &&
> -	git rebase --continue &&
> +	test_must_fail git rebase --skip &&
> +	: last pick was an empty commit that has no changes, but we want to keep it &&
> +	git commit --allow-empty &&
>  	verbose test "$(commit_message HEAD~2)" = "master4" &&
>  	verbose test "$(commit_message HEAD~)" = "files_subtree/master5" &&
>  	verbose test "$(commit_message HEAD)" = "Empty commit"
> 


^ permalink raw reply

* Re: [PATCH v3 10/15] rebase: add an --am option
From: Phillip Wood @ 2020-01-07 14:43 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Johannes.Schindelin, phillip.wood, liu.denton, gitster, plroskin,
	alban.gruin, szeder.dev, Elijah Newren
In-Reply-To: <1df11f0b5105b1f602fdd723e0f74565e436faba.1577217299.git.gitgitgadget@gmail.com>

Hi Elijah

On 24/12/2019 19:54, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Currently, this option doesn't do anything except error out if any
> options requiring the interactive-backend are also passed.  However,
> when we make the default backend configurable later in this series, this
> flag will provide a way to override the config setting.

I wonder if we really want to add a new option that we know we don't 
want to keep in the long term. Could we not just default to the 
interactive backend and fallback to the am backend if the user passes 
any of the am-based options on the commandline? i.e. reverse the current 
situation of defaulting to the am backend and selecting the interactive 
backend when the user passes certain options.

Best Wishes

Phillip


> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>   Documentation/git-rebase.txt | 11 ++++++++++-
>   builtin/rebase.c             | 18 +++++++++++++++++-
>   2 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index f1ace07c38..cf1ac2e359 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -258,6 +258,13 @@ See also INCOMPATIBLE OPTIONS below.
>   	original branch. The index and working tree are also left
>   	unchanged as a result.
>   
> +--am:
> +	Use git-am internally to rebase.  This option may become a
> +	no-op in the future once the interactive backend handles
> +	everything the am one does.
> ++
> +See also INCOMPATIBLE OPTIONS below.
> +
>   --empty={drop,keep,ask}::
>   	How to handle commits that become empty (because they contain a
>   	subset of already upstream changes) or start empty.  With drop
> @@ -372,7 +379,7 @@ See also INCOMPATIBLE OPTIONS below.
>   	Ensure at least <n> lines of surrounding context match before
>   	and after each change.  When fewer lines of surrounding
>   	context exist they all must match.  By default no context is
> -	ever ignored.
> +	ever ignored.  Implies --am.
>   +
>   See also INCOMPATIBLE OPTIONS below.
>   
> @@ -417,6 +424,7 @@ with `--keep-base` in order to drop those commits from your branch.
>   --whitespace=<option>::
>   	This flag is passed to the 'git apply' program
>   	(see linkgit:git-apply[1]) that applies the patch.
> +	Implies --am.
>   +
>   See also INCOMPATIBLE OPTIONS below.
>   
> @@ -567,6 +575,7 @@ INCOMPATIBLE OPTIONS
>   
>   The following options:
>   
> + * --am
>    * --whitespace
>    * -C
>   
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index d2b99e9908..b7915fc0cb 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1361,6 +1361,18 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
>   	return res && is_linear_history(onto, head);
>   }
>   
> +static int parse_opt_am(const struct option *opt, const char *arg, int unset)
> +{
> +	struct rebase_options *opts = opt->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +	BUG_ON_OPT_ARG(arg);
> +
> +	opts->type = REBASE_AM;
> +
> +	return 0;
> +}
> +
>   /* -i followed by -m is still -i */
>   static int parse_opt_merge(const struct option *opt, const char *arg, int unset)
>   {
> @@ -1546,6 +1558,10 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   		OPT_CMDMODE(0, "show-current-patch", &action,
>   			    N_("show the patch file being applied or merged"),
>   			    ACTION_SHOW_CURRENT_PATCH),
> +		{ OPTION_CALLBACK, 0, "am", &options, NULL,
> +			N_("use apply-mail strategies to rebase"),
> +			PARSE_OPT_NOARG | PARSE_OPT_NONEG,
> +			parse_opt_am },
>   		{ OPTION_CALLBACK, 'm', "merge", &options, NULL,
>   			N_("use merging strategies to rebase"),
>   			PARSE_OPT_NOARG | PARSE_OPT_NONEG,
> @@ -1906,7 +1922,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   	if (isatty(2) && options.flags & REBASE_NO_QUIET)
>   		strbuf_addstr(&options.git_format_patch_opt, " --progress");
>   
> -	if (options.git_am_opts.argc) {
> +	if (options.git_am_opts.argc || options.type == REBASE_AM) {
>   		/* all am options except -q are compatible only with --am */
>   		for (i = options.git_am_opts.argc - 1; i >= 0; i--)
>   			if (strcmp(options.git_am_opts.argv[i], "-q"))
> 

^ permalink raw reply

* Re: Assertion in git log graphing [regression in v2.25]
From: Derrick Stolee @ 2020-01-07 14:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Bradley Smith, Junio C Hamano, James Coglan, git
In-Reply-To: <39277f9a-f88d-94a3-2a8b-b6e0a3dfdc62@gmail.com>

On 1/7/2020 9:22 AM, Derrick Stolee wrote:
> Note how 6_M4 has three parents, and the first parent has a horizontal
> edge. Even giving an extra padding line between horizontal edges, we
> _could_ output the following instead:
> 
> 
> | | | | | | *   6_M4
> | |_|_|_|_|/|\
> |/| | | | |/ /
> | | |_|_|/| /
> | |/| | | |/
> | | | |_|/|
> | | |/| | |
> | | * | | | 6_G

I should have tested this example with v2.24.1, which gives this output
in this section:

| | | | | | *-.   6_M4
| | | | | | |\ \  
| |_|_|_|_|/ / /  
|/| | | | | / /   
| | |_|_|_|/ /    
| |/| | | | /     
| | | |_|_|/      
| | |/| | |       
| | * | | | 6_G

So, this is likely the behavior we would prefer.

For the release, it is likely best to keep the change that does not
collapse properly and fix this collapse bug later. (It is likely an
invasive change.)

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH 4/9] commit-graph: document bloom filter format
From: Jakub Narebski @ 2020-01-07 14:46 UTC (permalink / raw)
  To: Garima Singh via GitGitGadget
  Cc: git, Derrick Stolee, SZEDER Gábor, Jonathan Tan,
	Jeff Hostetler, Taylor Blau, Jeff King, Junio C Hamano,
	Garima Singh
In-Reply-To: <3182a11f7c07af834ba71dc7861742458754eb91.1576879520.git.gitgitgadget@gmail.com>

"Garima Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Garima Singh <garima.singh@microsoft.com>
>
> Update the technical documentation for commit-graph-format with BIDX
> and BDAT chunk information.
>
> RFC Notes:
> 1. [Call for advice] We specifically mention that we are using Bloom
>    filters in this technical document. Should this document also be
>    made open to other data structures in the future, with versioning
>    information?

I'm not sure.  In theory we might want to switch to another
probabilistic set inclusion query structure, like xor filters or cuckoo
hashing.

On one hand side we could use separate chunks (e.g. XIDX, XDAT for xor
filters), on the other hand we need only one such structure.  On the
gripping hand this can be left for the future, if needed.

Sidenote: using Bloom filters is somewhat encoded in the name of chunk
(B from Bloom filter).  I don't have a better poposal for 4-char name
(XIDX / XDAT for cXange?  CHDX / CHDT for CHange?  FIDX / FDAT for
changed Files?... I don't know).

>
> 2. [Call for advice] We are also not describing the explicit nature
>    of how we store the bloom filter binary data. Would it be useful
>    to document details about the hash algorithm, the number of hashes
>    and the specific seed values we are using in a separate document,
>    or perhaps in a separate section in this document?

I think it would be best to keep description of the commit graph format
concise.  The details about Bloom filter implementation would be better
put in Documentation/technical/commit-graph.txt in my opinion, together
with reasoning behind it (perhaps borrowing from Derrick Stolee blog
post).

This could be done as a separate patch.

>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Garima Singh <garima.singh@microsoft.com>
> ---
>  Documentation/technical/commit-graph-format.txt | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
> index a4f17441ae..6497f19f08 100644
> --- a/Documentation/technical/commit-graph-format.txt
> +++ b/Documentation/technical/commit-graph-format.txt
> @@ -17,6 +17,9 @@ metadata, including:
>  - The parents of the commit, stored using positional references within
>    the graph file.
>  
> +- The bloom filter of the commit carrying the paths that were changed between
> +  the commit and it's first parent.

s/bloom/Bloom/ and s/it's/its/

I am not sure about exact wording, but I could at this time think of a
better but concise way of stating it.

> +
>  These positional references are stored as unsigned 32-bit integers
>  corresponding to the array position within the list of commit OIDs. Due
>  to some special constants we use to track parents, we can store at most
> @@ -93,6 +96,20 @@ CHUNK DATA:
>        positions for the parents until reaching a value with the most-significant
>        bit on. The other bits correspond to the position of the last parent.
>  
> +  Bloom Filter Index (ID: {'B', 'I', 'D', 'X'}) [Optional]
> +      For each commit we store the offset of its bloom filter in the BDAT chunk
> +      as follows:
> +      BIDX[i] = number of 8-byte words in all the bloom filters from commit 0 to
> +		commit i (inclusive)

I think it would be better for consistency and ease of reading to follow
the example of OID Fanout (OIDF) chunk description:

 +  Bloom Filter Index (ID: {'B', 'I', 'D', 'X'}) (N * 4 bytes) [Optional]
 +      The ith entry, BIDX[i], stores the number of 8-byte word blocks
 +      in all Bloom filters from commit 0 up to commit i (inclusive)
 +      in lexicographical order.

Maybe even add the following to make implementing it easier:

 +      Data for Bloom filter for i-th commit spans from BIDX[i-1] to
 +      BIDX[i] (plus header length), where we take BIDX[-1] to be 0.

Is it possible for (BIDX[i] - BIDX[i-1]) to be zero (no Bloom filter),
for example for commits with more than 512 changes?  Or is this case
handled by 1 8-byte word Bloom filter of all bits sets to '1', i.e.
0xffffffffffffffff?

How the case of too many changes is distingushed from the case of no
changes (`git commit --allow-empty`, or `git merge --ours`)?  Is the
case of no changes uninteresting, i.e. Bloom filter consisting of zero,
that is with all bits set to '0'?

> +
> +  Bloom Filter Data (ID: {'B', 'D', 'A', 'T'}) [Optional]
> +      * It starts with three 32 bit integers for the

I would say "It starts with the header consisting of three unsigned
32-bit integers:" (but current version is not bad).

I wonder if this metadata should perhaps be put in a separate chunk,
BMET... (Bloom filter METadata).

> +	    - version of the hash algorithm being used

This number not only encodes that the base hash algorithm being used is
32-bit Murmur3 hash, but that 'k' hashes used in the computation are
created out of Murmur3 hash using double hashing technique, and
specifies two specific seed values for this double hashing technique.

[Maybe we should store those two seed values here too?]

It might be important to say that the currently supported version is
'1', and if Git encounters unknown hashing algorithm version it should
not use Bloom filter data.

Unless we store encoded _name_ of the hash algorithm, e.g. bytes
'm','u','r','3' for MurmurHash3_32... though it is about more than
a base hash.

Do we need whole 4 bytes for hash version, or is it for ease of use and
alignment?

> +	    - the number of hashes used in the computation

All right.  Perhaps we should test in the future patches that the value
different from the default of 7 would also work.

Also 8-bits / 1 byte for number of hashes (hash functions) should be
enough: as I have written in prevous reply there is no need for k > 32.

> +	    - the number of bits per entry

This is important for construction of Bloom filter, but I think it is
not necessary to use it -- so it may not be necessary to store it.

Would also fit in a single byte: we don't need exceedingly low false
positive probability.

We could use it to estimate the false positive probability, and...

> +	  * The rest of the chunk is the concatenation of all the computed bloom 
> +	  filters for the commits in lexicographic order.

  +	 * The rest of the chunk is the concatenation of all the computed Bloom 
  +	   filters for the commits in lexicographic order.

It would be, I think, a good idea to make it explicit that BDAT is
present iff BIDX is present (iff == if and only if), i.e. that either
both or neither of those chunks should be present.

> +
>    Base Graphs List (ID: {'B', 'A', 'S', 'E'}) [Optional]
>        This list of H-byte hashes describe a set of B commit-graph files that
>        form a commit-graph chain. The graph position for the ith commit in this

Best,
-- 
Jakub Narębski

^ permalink raw reply

* [PATCH 0/3] Fix two bugs in graph.c
From: Derrick Stolee via GitGitGadget @ 2020-01-07 14:55 UTC (permalink / raw)
  To: git; +Cc: peff, brad, sunshine, Derrick Stolee, Junio C Hamano

This is a possible fix for the bug reported in [1].

The first commit fixes the runtime failure due to the assert() statement.

The second commit replaces the assert() statements with a macro that
triggers a BUG().

The third commit adds another test that shows a more complicated example and
how the new code in v2.25.0-rc1 has a behavior change that is not
necessarily wanted.

Thanks, -Stolee

[1] 
https://lore.kernel.org/git/CAHt=fUXTHc4JPsapvHKnw5vHhp2cBOYRNfdaSDWBUnKt8fWfeA@mail.gmail.com/

Derrick Stolee (3):
  graph: fix case that hit assert()
  graph: replace assert() with graph_assert() macro
  t4215: add bigger graph collapse test

 graph.c                      |  39 +++++++------
 t/t4215-log-skewed-merges.sh | 105 +++++++++++++++++++++++++++++++++++
 2 files changed, 126 insertions(+), 18 deletions(-)


base-commit: 8679ef24ed64018bb62170c43ce73e0261c0600a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-517%2Fderrickstolee%2Fgraph-assert-fix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-517/derrickstolee/graph-assert-fix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/517
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 2/3] graph: replace assert() with graph_assert() macro
From: Derrick Stolee via GitGitGadget @ 2020-01-07 14:55 UTC (permalink / raw)
  To: git; +Cc: peff, brad, sunshine, Derrick Stolee, Junio C Hamano,
	Derrick Stolee
In-Reply-To: <pull.517.git.1578408947.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

The assert() macro is sometimes compiled out. Instead, switch these into
BUG() statements using our own custom macro.

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 graph.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/graph.c b/graph.c
index aaf97069bd..ac78bdf96c 100644
--- a/graph.c
+++ b/graph.c
@@ -6,6 +6,8 @@
 #include "revision.h"
 #include "argv-array.h"
 
+#define graph_assert(exp) if (!(exp)) { BUG("assert failed: "#exp""); }
+
 /* Internal API */
 
 /*
@@ -316,7 +318,7 @@ static struct strbuf *diff_output_prefix_callback(struct diff_options *opt, void
 	struct git_graph *graph = data;
 	static struct strbuf msgbuf = STRBUF_INIT;
 
-	assert(opt);
+	graph_assert(opt);
 
 	strbuf_reset(&msgbuf);
 	if (opt->line_prefix)
@@ -865,13 +867,13 @@ static void graph_output_pre_commit_line(struct git_graph *graph,
 	 *
 	 * We need 2 extra rows for every parent over 2.
 	 */
-	assert(graph->num_parents >= 3);
+	graph_assert(graph->num_parents >= 3);
 
 	/*
 	 * graph->expansion_row tracks the current expansion row we are on.
 	 * It should be in the range [0, num_expansion_rows - 1]
 	 */
-	assert(0 <= graph->expansion_row &&
+	graph_assert(0 <= graph->expansion_row &&
 	       graph->expansion_row < graph_num_expansion_rows(graph));
 
 	/*
@@ -923,13 +925,13 @@ static void graph_output_commit_char(struct git_graph *graph, struct graph_line
 	 * (We should only see boundary commits when revs->boundary is set.)
 	 */
 	if (graph->commit->object.flags & BOUNDARY) {
-		assert(graph->revs->boundary);
+		graph_assert(graph->revs->boundary);
 		graph_line_addch(line, 'o');
 		return;
 	}
 
 	/*
-	 * get_revision_mark() handles all other cases without assert()
+	 * get_revision_mark() handles all other cases without graph_assert()
 	 */
 	graph_line_addstr(line, get_revision_mark(graph->revs, graph->commit));
 }
@@ -1094,7 +1096,7 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
 
 			for (j = 0; j < graph->num_parents; j++) {
 				par_column = graph_find_new_column_by_commit(graph, parents->item);
-				assert(par_column >= 0);
+				graph_assert(par_column >= 0);
 
 				c = merge_chars[idx];
 				graph_line_write_column(line, &graph->new_columns[par_column], c);
@@ -1172,14 +1174,14 @@ static void graph_output_collapsing_line(struct git_graph *graph, struct graph_l
 		 * the graph much more legible, since whenever branches
 		 * cross, only one is moving directions.
 		 */
-		assert(target * 2 <= i);
+		graph_assert(target * 2 <= i);
 
 		if (target * 2 == i) {
 			/*
 			 * This column is already in the
 			 * correct place
 			 */
-			assert(graph->mapping[i] == -1);
+			graph_assert(graph->mapping[i] == -1);
 			graph->mapping[i] = target;
 		} else if (graph->mapping[i - 1] < 0) {
 			/*
@@ -1225,8 +1227,8 @@ static void graph_output_collapsing_line(struct git_graph *graph, struct graph_l
 			 * The space just to the left of this
 			 * branch should always be empty.
 			 */
-			assert(graph->mapping[i - 1] > target);
-			assert(graph->mapping[i - 2] < 0);
+			graph_assert(graph->mapping[i - 1] > target);
+			graph_assert(graph->mapping[i - 2] < 0);
 			graph->mapping[i - 2] = target;
 			/*
 			 * Mark this branch as the horizontal edge to
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 1/3] graph: fix case that hit assert()
From: Derrick Stolee via GitGitGadget @ 2020-01-07 14:55 UTC (permalink / raw)
  To: git; +Cc: peff, brad, sunshine, Derrick Stolee, Junio C Hamano,
	Derrick Stolee
In-Reply-To: <pull.517.git.1578408947.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

A failure was reported in "git log --graph --all" with the new
graph-rendering logic. Create a test case that matches the
topology of that example and uses an explicit ref ordering instead
of the "--all" option. The test would fail with the following error:

	graph.c:1228: graph_output_collapsing_line: Assertion
		      `graph->mapping[i - 3] == target' failed.

The situation is a little complicated, so let's break it down.

The assert was introduced by eaf158f8 ("graph API: Use horizontal
lines for more compact graphs", 2009-04-21), which is quite old.
This assert is trying to say that when we complete a horizontal
line with a single slash, it is because we have reached our target.

This assertion is hit when we have two collapsing lines from the
same merge commit, as follows:

    | | | | *
    | |_|_|/|
    |/| | |/
    | | |/|
    | |/| |
    | * | |
    * | | |

It is actually the _second_ collapsing line that hits this assert.
The reason we are in this code path is because we are collapsing
the first line, and it in that case we are hitting our target now
that the horizontal line is complete. However, the second line
cannot be a horizontal line, so it will collapse without horizontal
lines. In this case, it is inappropriate to assert that we have
reached our target, as we need to continue for another column
before reaching the target. Dropping the assert is safe here.

Second, the horizontal lines in that first line drop their coloring.
This is due to a use of graph_line_addch() instead of
graph_line_write_column(). Using a ternary operator to pick the
character is nice for compact code, but we actually need a column
to provide the color.

Helped-by: Jeff King <peff@peff.net>
Reported-by: Bradley Smith <brad@brad-smith.co.uk>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 graph.c                      | 17 ++++++++-------
 t/t4215-log-skewed-merges.sh | 42 ++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/graph.c b/graph.c
index 66ae18add8..aaf97069bd 100644
--- a/graph.c
+++ b/graph.c
@@ -1063,7 +1063,7 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
 	int i, j;
 
 	struct commit_list *first_parent = first_interesting_parent(graph);
-	int seen_parent = 0;
+	struct column *parent_col = NULL;
 
 	/*
 	 * Output the post-merge row
@@ -1117,12 +1117,17 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
 			graph_line_addch(line, ' ');
 		} else {
 			graph_line_write_column(line, col, '|');
-			if (graph->merge_layout != 0 || i != graph->commit_index - 1)
-				graph_line_addch(line, seen_parent ? '_' : ' ');
+			if (graph->merge_layout != 0 || i != graph->commit_index - 1) {
+				if (parent_col)
+					graph_line_write_column(
+						line, parent_col, '_');
+				else
+					graph_line_addch(line, ' ');
+			}
 		}
 
 		if (col_commit == first_parent->item)
-			seen_parent = 1;
+			parent_col = col;
 	}
 
 	/*
@@ -1219,13 +1224,9 @@ static void graph_output_collapsing_line(struct git_graph *graph, struct graph_l
 			 *
 			 * The space just to the left of this
 			 * branch should always be empty.
-			 *
-			 * The branch to the left of that space
-			 * should be our eventual target.
 			 */
 			assert(graph->mapping[i - 1] > target);
 			assert(graph->mapping[i - 2] < 0);
-			assert(graph->mapping[i - 3] == target);
 			graph->mapping[i - 2] = target;
 			/*
 			 * Mark this branch as the horizontal edge to
diff --git a/t/t4215-log-skewed-merges.sh b/t/t4215-log-skewed-merges.sh
index 18709a723e..ddf6f6f5d3 100755
--- a/t/t4215-log-skewed-merges.sh
+++ b/t/t4215-log-skewed-merges.sh
@@ -240,4 +240,46 @@ test_expect_success 'log --graph with octopus merge with column joining its penu
 	EOF
 '
 
+test_expect_success 'log --graph with multiple tips' '
+	git checkout --orphan 6_1 &&
+	test_commit 6_A &&
+	git branch 6_2 &&
+	git branch 6_4 &&
+	test_commit 6_B &&
+	git branch 6_3 &&
+	test_commit 6_C &&
+	git checkout 6_2 && test_commit 6_D &&
+	git checkout 6_3 && test_commit 6_E &&
+	git checkout -b 6_5 6_1 &&
+	git merge --no-ff 6_2 -m 6_F &&
+	git checkout 6_4 && test_commit 6_G &&
+	git checkout 6_3 &&
+	git merge --no-ff 6_4 -m 6_H &&
+	git checkout 6_1 &&
+	git merge --no-ff 6_2 -m 6_I &&
+
+	check_graph 6_1 6_3 6_5 <<-\EOF
+	*   6_I
+	|\
+	| | *   6_H
+	| | |\
+	| | | * 6_G
+	| | * | 6_E
+	| | | | * 6_F
+	| |_|_|/|
+	|/| | |/
+	| | |/|
+	| |/| |
+	| * | | 6_D
+	| | |/
+	| |/|
+	* | | 6_C
+	| |/
+	|/|
+	* | 6_B
+	|/
+	* 6_A
+	EOF
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 3/3] t4215: add bigger graph collapse test
From: Derrick Stolee via GitGitGadget @ 2020-01-07 14:55 UTC (permalink / raw)
  To: git; +Cc: peff, brad, sunshine, Derrick Stolee, Junio C Hamano,
	Derrick Stolee
In-Reply-To: <pull.517.git.1578408947.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

A previous test in t4215-log-skewed-merges.sh was added to demonstrate
exactly the topology of a reported failure in "git log --graph". While
investigating the fix, we realized that multiple edges that could
collapse with horizontal lines were not doing so.

Specifically, examine this section of the graph:

	| | | | | | *   7_M4
	| |_|_|_|_|/|\
	|/| | | | |/ /
	| | | | |/| /
	| | | |/| |/
	| | |/| |/|
	| |/| |/| |
	| | |/| | |
	| | * | | | 7_G

Document this behavior with a test. This behavior is new, as the
behavior in v2.24.1 has the following output:

	| | | | | | *-.   7_M4
	| | | | | | |\ \
	| |_|_|_|_|/ / /
	|/| | | | | / /
	| | |_|_|_|/ /
	| |/| | | | /
	| | | |_|_|/
	| | |/| | |
	| | * | | | 7_G

The behavior changed logically in 479db18b ("graph: smooth appearance
of collapsing edges on commit lines", 2019-10-15), but was actually
broken due to an assert() bug in 458152cc ("graph: example of graph
output that can be simplified", 2019-10-15). A future change could
modify this behavior to do the following instead:

	| | | | | | *   7_M4
	| |_|_|_|_|/|\
	|/| | | | |/ /
	| | |_|_|/| /
	| |/| | | |/
	| | | |_|/|
	| | |/| | |
	| | * | | | 7_G

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t4215-log-skewed-merges.sh | 63 ++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/t/t4215-log-skewed-merges.sh b/t/t4215-log-skewed-merges.sh
index ddf6f6f5d3..be2c24564b 100755
--- a/t/t4215-log-skewed-merges.sh
+++ b/t/t4215-log-skewed-merges.sh
@@ -282,4 +282,67 @@ test_expect_success 'log --graph with multiple tips' '
 	EOF
 '
 
+test_expect_success 'log --graph with multiple tips' '
+	git checkout --orphan 7_1 &&
+	test_commit 7_A &&
+	test_commit 7_B &&
+	test_commit 7_C &&
+	git checkout -b 7_2 7_1~2 &&
+	test_commit 7_D &&
+	test_commit 7_E &&
+	git checkout -b 7_3 7_1~1 &&
+	test_commit 7_F &&
+	test_commit 7_G &&
+	git checkout -b 7_4 7_2~1 &&
+	test_commit 7_H &&
+	git checkout -b 7_5 7_1~2 &&
+	test_commit 7_I &&
+	git checkout -b 7_6 7_3~1 &&
+	test_commit 7_J &&
+	git checkout -b M_1 7_1 &&
+	git merge --no-ff 7_2 -m 7_M1 &&
+	git checkout -b M_3 7_3 &&
+	git merge --no-ff 7_4 -m 7_M2 &&
+	git checkout -b M_5 7_5 &&
+	git merge --no-ff 7_6 -m 7_M3 &&
+	git checkout -b M_7 7_1 &&
+	git merge --no-ff 7_2 7_3 -m 7_M4 &&
+
+	check_graph M_1 M_3 M_5 M_7 <<-\EOF
+	*   7_M1
+	|\
+	| | *   7_M2
+	| | |\
+	| | | * 7_H
+	| | | | *   7_M3
+	| | | | |\
+	| | | | | * 7_J
+	| | | | * | 7_I
+	| | | | | | *   7_M4
+	| |_|_|_|_|/|\
+	|/| | | | |/ /
+	| | | | |/| /
+	| | | |/| |/
+	| | |/| |/|
+	| |/| |/| |
+	| | |/| | |
+	| | * | | | 7_G
+	| | | |_|/
+	| | |/| |
+	| | * | | 7_F
+	| * | | | 7_E
+	| | |/ /
+	| |/| |
+	| * | | 7_D
+	| | |/
+	| |/|
+	* | | 7_C
+	| |/
+	|/|
+	* | 7_B
+	|/
+	* 7_A
+	EOF
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* Re: Assertion in git log graphing
From: Derrick Stolee @ 2020-01-07 14:57 UTC (permalink / raw)
  To: Bradley Smith, git, Jeff King, Eric Sunshine
In-Reply-To: <CAHt=fUXTHc4JPsapvHKnw5vHhp2cBOYRNfdaSDWBUnKt8fWfeA@mail.gmail.com>

On 1/7/2020 6:24 AM, Bradley Smith wrote:
> Hi,
> 
> The following git repository (https://github.com/brads55/git-testcase)
> causes an assertion when running:
> 
>   $ git log --oneline --graph --all
> 
>   git-log: graph.c:1228: graph_output_collapsing_line: Assertion
> `graph->mapping[i - 3] == target' failed.

Bradley,

Thanks for the report. I've submitted [1] what I hope to be a
sufficient fix.

Thanks,
-Stolee

[1] https://lore.kernel.org/git/pull.517.git.1578408947.gitgitgadget@gmail.com/

^ permalink raw reply

* [PATCH 0/1] string-list: note in docs that callers can specify sorting function
From: Elijah Newren via GitGitGadget @ 2020-01-07 15:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Simple change to the string-list docs to reflect the fact that string-list
now has a cmp member field these days that can be set to something other
than strcmp().

Elijah Newren (1):
  string-list: note in docs that callers can specify sorting function

 string-list.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)


base-commit: 8679ef24ed64018bb62170c43ce73e0261c0600a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-689%2Fnewren%2Fstring-list-update-docs-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-689/newren/string-list-update-docs-v1
Pull-Request: https://github.com/git/git/pull/689
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/1] string-list: note in docs that callers can specify sorting function
From: Elijah Newren via GitGitGadget @ 2020-01-07 15:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Elijah Newren
In-Reply-To: <pull.689.git.git.1578409650.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

In commit 1959bf6430 (string_list API: document what "sorted" means,
2012-09-17), Documentation/technical/api-string-list.txt was updated to
specify that strcmp() was used for sorting.  In commit 8dd5afc926
(string-list: allow case-insensitive string list, 2013-01-07), a cmp
member was added to struct string_list to allow callers to specify an
alternative comparison function, but api-string-list.txt was not
updated.  In commit 4f665f2cf3 (string-list.h: move documentation from
Documentation/api/ into header, 2017-09-26), the now out-dated
api-string-list.txt documentation was moved into string-list.h.  Update
the docs to reflect the configurability of sorting.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 string-list.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/string-list.h b/string-list.h
index 7bb0ad07e6..6c5d274126 100644
--- a/string-list.h
+++ b/string-list.h
@@ -4,7 +4,8 @@
 /**
  * The string_list API offers a data structure and functions to handle
  * sorted and unsorted arrays of strings.  A "sorted" list is one whose
- * entries are sorted by string value in `strcmp()` order.
+ * entries are sorted by string value in the order specified by the `cmp`
+ * member (`strcmp()` by default).
  *
  * The caller:
  *
@@ -209,7 +210,8 @@ struct string_list_item *string_list_append(struct string_list *list, const char
 struct string_list_item *string_list_append_nodup(struct string_list *list, char *string);
 
 /**
- * Sort the list's entries by string value in `strcmp()` order.
+ * Sort the list's entries by string value in order specified by list->cmp
+ * (strcmp() if list->cmp is NULL).
  */
 void string_list_sort(struct string_list *list);
 
-- 
gitgitgadget

^ permalink raw reply related

* Re: Broken branch after git commit - tracked files in staging area can't be removed with restore --staged, or commit or stash
From: Torsten Krah @ 2020-01-07 15:28 UTC (permalink / raw)
  To: git
In-Reply-To: <234df85965f8a685be5e563fe795ed477f359d7c.camel@gmail.com>

Am Dienstag, den 07.01.2020, 14:43 +0100 schrieb Torsten Krah:
> Although restore --staged moved my unwanted files away from the
> staging
> area and "git status" told me that they are not "in" the commit the
> commit itself did still include them.

I can reproduce that (locally) at least:

What does *not* work for me:

   git clone XX main
   cd main
   git fetch XX && git checkout FETCH_HEAD
   git checkout -b TEST
   git reset --soft HEAD~1
   git restore --staged $FILES

git status now lists $FILES as unstaged and they are not included in
the staging area.

   git commit

-> now $FILES are included in the commit (I would expect them not to be
included - right?) and git status does list those still in the working
area.

What does work:

   git clone XX main
   cd main
   git fetch XX && git checkout FETCH_HEAD
   git checkout -b TEST
   git reset --soft HEAD~1
   git reset HEAD $FILES

git status now lists $FILES as unstaged and they are not included in
the staging area.

   git commit

produces a commit where $FILES are not included and they are still in
the working area, unstaged - like expected.



git status tells me this in the staging area part:

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)

I did that and its not working (for me) - looks at least like a bug or
I am doing something wrong and I am just too dumb at the moment to see
my failure.

Cheers

Torsten

PS: $FILES are files which are all "new" and first time added in the
commit I want to modify with restore.

PPS: The second problem with those staged deleted, unstageable,
uncomittable files still persists in my copy of those branch (I can't
reproduce that - still I have the repository in that state).



^ permalink raw reply

* Re: [PATCH 1/3] graph: fix case that hit assert()
From: Jeff King @ 2020-01-07 15:30 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, brad, sunshine, Derrick Stolee, Junio C Hamano
In-Reply-To: <65186f3ded251e0bcf1fcb18160163a3efd97c37.1578408947.git.gitgitgadget@gmail.com>

On Tue, Jan 07, 2020 at 02:55:45PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> A failure was reported in "git log --graph --all" with the new
> graph-rendering logic. Create a test case that matches the
> topology of that example and uses an explicit ref ordering instead
> of the "--all" option. The test would fail with the following error:
> 
> 	graph.c:1228: graph_output_collapsing_line: Assertion
> 		      `graph->mapping[i - 3] == target' failed.
> 
> The situation is a little complicated, so let's break it down.

First off, thanks for digging into this so promptly. Your solution looks
correct to me. Everything else I'll mention here are nits. :)

Your commit message starts off talking about the test, but without
describing what's interesting about it. I think the answer is that we
have two "skewed" merge parents for the same merge; maybe it would make
sense to lead with that. I.e.:

  Subject: graph: drop assert() for merge with two collapsing parents

  When "git log --graph" shows a merge commit that has two collapsing
  lines, like:

    [your diagram]

  we trigger an assert():

    graph.c:1228: graph_output_collapsing_line: Assertion
                  `graph->mapping[i - 3] == target' failed.

  ...and so on...

> The assert was introduced by eaf158f8 ("graph API: Use horizontal
> lines for more compact graphs", 2009-04-21), which is quite old.
> This assert is trying to say that when we complete a horizontal
> line with a single slash, it is because we have reached our target.

Thanks for this final sentence; writing that out in English made the
purpose of the assert() much clearer.

That could perhaps be an argument in favor of writing it as a BUG()
with a similar human-readable explanation. I guess there was already a
comment in the code, but it didn't quite click with me as much as what
you wrote above.

> It is actually the _second_ collapsing line that hits this assert.
> The reason we are in this code path is because we are collapsing
> the first line, and it in that case we are hitting our target now

s/it//

> that the horizontal line is complete. However, the second line
> cannot be a horizontal line, so it will collapse without horizontal
> lines. In this case, it is inappropriate to assert that we have
> reached our target, as we need to continue for another column
> before reaching the target. Dropping the assert is safe here.

I think that makes sense. My big concern here is that the assert() was
preventing us from looking out of bounds in the graph->mapping array,
but I don't think that's the case here.

Worth mentioning that this was due to 0f0f389f12 (graph: tidy up display
of left-skewed merges, 2019-10-15), in case somebody has to later dig
deeper?

> Second, the horizontal lines in that first line drop their coloring.
> This is due to a use of graph_line_addch() instead of
> graph_line_write_column(). Using a ternary operator to pick the
> character is nice for compact code, but we actually need a column
> to provide the color.

It seems like this is a totally separate bug, and could be its own
commit?

I think it's also caused by 0f0f389f12, which actually introduced the
seen_parent mechanism that you're correcting here.

> Helped-by: Jeff King <peff@peff.net>
> Reported-by: Bradley Smith <brad@brad-smith.co.uk>

I don't know that I did much, but OK. :)

Thanks once again Bradley for the reproducible case.

> diff --git a/t/t4215-log-skewed-merges.sh b/t/t4215-log-skewed-merges.sh
> index 18709a723e..ddf6f6f5d3 100755
> --- a/t/t4215-log-skewed-merges.sh
> +++ b/t/t4215-log-skewed-merges.sh
> @@ -240,4 +240,46 @@ test_expect_success 'log --graph with octopus merge with column joining its penu
>  	EOF
>  '
>  
> +test_expect_success 'log --graph with multiple tips' '

This nicely covers the assert() problem. Could we check the same case
with "--color" and test_decode_color to check the coloring issue (see
t4214 for some prior art)?

-Peff

^ permalink raw reply

* Re: [PATCH 2/3] graph: replace assert() with graph_assert() macro
From: Jeff King @ 2020-01-07 15:36 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, brad, sunshine, Derrick Stolee, Junio C Hamano
In-Reply-To: <5dd305d2f0de43a70b46336c8f1a62437e0511e1.1578408947.git.gitgitgadget@gmail.com>

On Tue, Jan 07, 2020 at 02:55:46PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The assert() macro is sometimes compiled out. Instead, switch these into
> BUG() statements using our own custom macro.
> 
> Reported-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

I can buy the argument that compiling with and without NDEBUG can lead
to confusion. But if that is the case, wouldn't it be so for all of the
assert() calls, not just ones in the graph code?

Previous discussions[1] seemed to conclude that having a kernel-style
BUG_ON() is probably the right way forward. I.e., replace this:

> +#define graph_assert(exp) if (!(exp)) { BUG("assert failed: "#exp""); }

with something similar in git-compat-util.h. Even if we don't convert
everybody to it immediately, it would be available for use.

At any rate, I think this patch (and the third one) can be post-v2.25.
But we'd want the first one before the release.

-Peff

[1] https://lore.kernel.org/git/20171122223827.26773-1-sbeller@google.com/

^ permalink raw reply

* Fwd: Add support for axiterm colors in parse_color
From: Eyal Soha @ 2020-01-07 15:36 UTC (permalink / raw)
  To: git
In-Reply-To: <CANsz78K-BiswWPdhd_N25NuApcv7zSb2cw2Y9DSinkpNpuogYw@mail.gmail.com>

https://en.wikipedia.org/wiki/ANSI_escape_code

ANSI color codes 90-97 and 100-107 are brighter versions of the 30-37
and 40-47 colors.  To view them, run `colortest-16`.  In that
colortest, they are named with a leading "+".  Maybe git config could
support those colors, too, with a leading plus in the name?  They are
nice for having a different shade of a color in a diff.

Here's a patch with tests.  I don't know how to submit it.  Thoughts?

Eyal

diff --git a/color.c b/color.c
index ebb222ec33..a39fe7d133 100644
--- a/color.c
+++ b/color.c
@@ -33,6 +33,7 @@ struct color {
                COLOR_UNSPECIFIED = 0,
                COLOR_NORMAL,
                COLOR_ANSI, /* basic 0-7 ANSI colors */
+               COLOR_AXITERM, /* brighter than 0-7 ANSI colors */
                COLOR_256,
                COLOR_RGB
        } type;
@@ -95,6 +96,12 @@ static int parse_color(struct color *out, const
char *name, int len)
                        out->value = i;
                        return 0;
                }
+               if (*name == '+' &&
+                   match_word(name+1, len-1, color_names[i])) {
+                       out->type = COLOR_AXITERM;
+                       out->value = i;
+                       return 0;
+               }
        }

        /* And finally try a literal 256-color-mode number */
@@ -166,23 +173,24 @@ int color_parse(const char *value, char *dst)
  * already have the ANSI escape code in it. "out" should have enough
  * space in it to fit any color.
  */
-static char *color_output(char *out, int len, const struct color *c, char type)
+static char *color_output(char *out, int len, const struct color *c,
+                          const char *type)
 {
        switch (c->type) {
        case COLOR_UNSPECIFIED:
        case COLOR_NORMAL:
                break;
        case COLOR_ANSI:
-               if (len < 2)
+       case COLOR_AXITERM:
+               if (len < strlen(type) + 1)
                        BUG("color parsing ran out of space");
-               *out++ = type;
-               *out++ = '0' + c->value;
+               out += xsnprintf(out, len, "%s%c", type, '0' + c->value);
                break;
        case COLOR_256:
-               out += xsnprintf(out, len, "%c8;5;%d", type, c->value);
+               out += xsnprintf(out, len, "%s8;5;%d", type, c->value);
                break;
        case COLOR_RGB:
-               out += xsnprintf(out, len, "%c8;2;%d;%d;%d", type,
+               out += xsnprintf(out, len, "%s8;2;%d;%d;%d", type,
                                 c->red, c->green, c->blue);
                break;
        }
@@ -279,14 +287,24 @@ int color_parse_mem(const char *value, int
value_len, char *dst)
                if (!color_empty(&fg)) {
                        if (sep++)
                                OUT(';');
-                       /* foreground colors are all in the 3x range */
-                       dst = color_output(dst, end - dst, &fg, '3');
+                       /* foreground colors are in the 3x range */
+                       char *range = "3";
+                       if (fg.type == COLOR_AXITERM) {
+                               /* axiterm fg colors are in the 9x range */
+                               range = "9";
+                       }
+                       dst = color_output(dst, end - dst, &fg, range);
                }
                if (!color_empty(&bg)) {
                        if (sep++)
                                OUT(';');
                        /* background colors are all in the 4x range */
-                       dst = color_output(dst, end - dst, &bg, '4');
+                       char *range = "4";
+                       if (bg.type == COLOR_AXITERM) {
+                               /* axiterm bg colors are in the 10x range */
+                               range = "10";
+                       }
+                       dst = color_output(dst, end - dst, &bg, range);
                }
                OUT('m');
        }
diff --git a/t/t4026-color.sh b/t/t4026-color.sh
index 671e951ee5..2b8430584f 100755
--- a/t/t4026-color.sh
+++ b/t/t4026-color.sh
@@ -30,6 +30,14 @@ test_expect_success 'attribute before color name' '
        color "bold red" "[1;31m"
 '

+test_expect_success 'axiterm bright fg color' '
+       color "+red" "[91m"
+'
+
+test_expect_success 'axiterm bright bg color' '
+       color "green +blue" "[32;104m"
+'
+
 test_expect_success 'color name before attribute' '
        color "red bold" "[1;31m"
 '

^ permalink raw reply related

* Re: [PATCH 3/3] t4215: add bigger graph collapse test
From: Jeff King @ 2020-01-07 15:39 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, brad, sunshine, Derrick Stolee, Junio C Hamano
In-Reply-To: <f74e82bea68701beb734537cafd147162d1bb2c6.1578408947.git.gitgitgadget@gmail.com>

On Tue, Jan 07, 2020 at 02:55:47PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> A previous test in t4215-log-skewed-merges.sh was added to demonstrate
> exactly the topology of a reported failure in "git log --graph". While
> investigating the fix, we realized that multiple edges that could
> collapse with horizontal lines were not doing so.

Thanks for constructing this larger case.

As for including this patch, I could take or leave it for now. I like
the idea of documenting things further, but unless it's marked
expect_failure, I don't think it's going to call anybody's attention
more than this thread already has.

So I'd love to hear what James thinks should happen here, given that
it's an extension of his other work. But I'd just as soon punt on the
patch until we decide whether it _should_ change (and then either mark
it with expect_failure, or include the test along with a patch changing
the behavior).

-Peff

^ permalink raw reply

* Re: [PATCH 2/3] graph: replace assert() with graph_assert() macro
From: Eric Sunshine @ 2020-01-07 15:51 UTC (permalink / raw)
  To: Jeff King
  Cc: Derrick Stolee via GitGitGadget, Git List, Bradley Smith,
	Derrick Stolee, Junio C Hamano
In-Reply-To: <20200107153651.GB20591@coredump.intra.peff.net>

On Tue, Jan 7, 2020 at 10:36 AM Jeff King <peff@peff.net> wrote:
> On Tue, Jan 07, 2020 at 02:55:46PM +0000, Derrick Stolee via GitGitGadget wrote:
> > The assert() macro is sometimes compiled out. Instead, switch these into
> > BUG() statements using our own custom macro.
>
> I can buy the argument that compiling with and without NDEBUG can lead
> to confusion. But if that is the case, wouldn't it be so for all of the
> assert() calls, not just ones in the graph code?

This wasn't just a matter of potential confusion. It's one thing to
have assert()s in the code in general, but another thing when a
scripted test specifically depends upon the asserted condition, as was
the case with the test as originally proposed. Since the final patch
series removes that particular assert() altogether, it's perhaps not
that important anymore.

^ permalink raw reply

* git-p4 cannot use perforce client created by p4java — "Expected view key View1 missing"
From: Ivan Selin @ 2020-01-07 15:53 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1171 bytes --]

Hello!

If I create a perforce client from java using p4java, it gets created
with an extra key "ViewDepotType" in client definition. When I try to
do `git p4 sync --use-client-spec`, git-p4 dies with message like
"Expected view key View1 missing" — because it assumes that all keys
starting with "View" are "View0", "View1" and so on.

How to reproduce:
1) Create perforce repository;
2) Create a perforce client (let's name it "my-client") in said
perforce repository using p4java; add one view mapping to the client
on creation;
3) Run `P4CLIENT=my-client git p4 sync --use-client-spec`;
4) git p4 finishes with error "Expected view key View1 missing".

Attaching hexdumped/unmarshalled examples of "regular" client
definition and client created with p4java. Note that p4java's version
has "ViewDepotType" key and regular client does not. Also,
"ViewDepotType" key is not showing up in text output of `p4 client
-o`, only in binary format (`p4 -G client -o`). And I'm attaching a
patch that solved the issue for me.

Is that information enough or should I add anything else? I'm new to
git community, but willing to help.

Regards,
Ivan.

[-- Attachment #2: p4-client-regular.txt --]
[-- Type: text/plain, Size: 2581 bytes --]

  $ p4 -G client -o c-1 | hexdump -C
00000000  7b 73 04 00 00 00 63 6f  64 65 73 04 00 00 00 73  |{s....codes....s|
00000010  74 61 74 73 06 00 00 00  43 6c 69 65 6e 74 73 03  |tats....Clients.|
00000020  00 00 00 63 2d 31 73 06  00 00 00 55 70 64 61 74  |...c-1s....Updat|
00000030  65 73 13 00 00 00 32 30  32 30 2f 30 31 2f 30 35  |es....2020/01/05|
00000040  20 31 36 3a 34 35 3a 33  36 73 06 00 00 00 41 63  | 16:45:36s....Ac|
00000050  63 65 73 73 73 13 00 00  00 32 30 32 30 2f 30 31  |cesss....2020/01|
00000060  2f 30 35 20 31 36 3a 34  35 3a 33 36 73 04 00 00  |/05 16:45:36s...|
00000070  00 52 6f 6f 74 73 3e 00  00 00 2f 68 6f 6d 65 2f  |.Roots>.../home/|
00000080  64 69 72 65 63 74 6f 72  79 2f 64 65 76 2f 66 72  |directory/dev/fr|
00000090  65 65 2f 67 69 74 2d 70  34 2f 70 34 2d 73 61 6d  |ee/git-p4/p4-sam|
000000a0  70 6c 65 2d 72 65 70 6f  2f 64 69 72 65 63 74 2d  |ple-repo/direct-|
000000b0  63 6d 64 2d 74 65 73 74  73 07 00 00 00 4f 70 74  |cmd-tests....Opt|
000000c0  69 6f 6e 73 73 3a 00 00  00 6e 6f 61 6c 6c 77 72  |ionss:...noallwr|
000000d0  69 74 65 20 6e 6f 63 6c  6f 62 62 65 72 20 6e 6f  |ite noclobber no|
000000e0  63 6f 6d 70 72 65 73 73  20 75 6e 6c 6f 63 6b 65  |compress unlocke|
000000f0  64 20 6e 6f 6d 6f 64 74  69 6d 65 20 6e 6f 72 6d  |d nomodtime norm|
00000100  64 69 72 73 0d 00 00 00  53 75 62 6d 69 74 4f 70  |dirs....SubmitOp|
00000110  74 69 6f 6e 73 73 0f 00  00 00 73 75 62 6d 69 74  |tionss....submit|
00000120  75 6e 63 68 61 6e 67 65  64 73 07 00 00 00 4c 69  |unchangeds....Li|
00000130  6e 65 45 6e 64 73 05 00  00 00 6c 6f 63 61 6c 73  |neEnds....locals|
00000140  05 00 00 00 56 69 65 77  30 73 15 00 00 00 2f 2f  |....View0s....//|
00000150  64 65 70 6f 74 2f 2e 2e  2e 20 2f 2f 63 2d 31 2f  |depot/... //c-1/|
00000160  2e 2e 2e 73 04 00 00 00  54 79 70 65 73 09 00 00  |...s....Types...|
00000170  00 77 72 69 74 65 61 62  6c 65 73 06 00 00 00 42  |.writeables....B|
00000180  61 63 6b 75 70 73 06 00  00 00 65 6e 61 62 6c 65  |ackups....enable|
00000190  30                                                |0|
00000191

  $ p4 -G client -o c-1 | python -c 'import marshal, pprint, sys; pprint.pprint(marshal.load(sys.stdin))'
{'Access': '2020/01/05 16:45:36',
 'Backup': 'enable',
 'Client': 'c-1',
 'LineEnd': 'local',
 'Options': 'noallwrite noclobber nocompress unlocked nomodtime normdir',
 'Root': '/home/directory/dev/free/git-p4/p4-sample-repo/direct-cmd-test',
 'SubmitOptions': 'submitunchanged',
 'Type': 'writeable',
 'Update': '2020/01/05 16:45:36',
 'View0': '//depot/... //c-1/...',
 'code': 'stat'}


[-- Attachment #3: git-p4-view-keys.patch --]
[-- Type: text/x-patch, Size: 557 bytes --]

 git-p4.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git git-p4.py git-p4.py
index ca0a874501..1dc1588255 100755
--- git-p4.py
+++ git-p4.py
@@ -1108,8 +1108,8 @@ def getClientSpec():
     # the //client/ name
     client_name = entry["Client"]
 
-    # just the keys that start with "View"
-    view_keys = [ k for k in entry.keys() if k.startswith("View") ]
+    # just the keys "View0", "View1", ...
+    view_keys = [ k for k in entry.keys() if re.match(r"View\d+", k) ]
 
     # hold this new View
     view = View(client_name)

[-- Attachment #4: p4-client-from-p4java.txt --]
[-- Type: text/plain, Size: 3386 bytes --]

  $ p4 -G client -o git-p4-sync-2 | hexdump -C
00000000  7b 73 04 00 00 00 63 6f  64 65 73 04 00 00 00 73  |{s....codes....s|
00000010  74 61 74 73 06 00 00 00  43 6c 69 65 6e 74 73 0d  |tats....Clients.|
00000020  00 00 00 67 69 74 2d 70  34 2d 73 79 6e 63 2d 32  |...git-p4-sync-2|
00000030  73 06 00 00 00 55 70 64  61 74 65 73 13 00 00 00  |s....Updates....|
00000040  32 30 31 39 2f 31 32 2f  31 35 20 31 33 3a 34 39  |2019/12/15 13:49|
00000050  3a 35 31 73 06 00 00 00  41 63 63 65 73 73 73 13  |:51s....Accesss.|
00000060  00 00 00 32 30 31 39 2f  31 32 2f 31 35 20 31 33  |...2019/12/15 13|
00000070  3a 34 39 3a 35 31 73 05  00 00 00 4f 77 6e 65 72  |:49:51s....Owner|
00000080  73 02 00 00 00 73 61 73  04 00 00 00 52 6f 6f 74  |s....sas....Root|
00000090  73 52 00 00 00 2f 76 61  72 2f 61 74 6c 61 73 73  |sR.../var/atlass|
000000a0  69 61 6e 2f 61 70 70 6c  69 63 61 74 69 6f 6e 2d  |ian/application-|
000000b0  64 61 74 61 2f 62 69 74  62 75 63 6b 65 74 2f 73  |data/bitbucket/s|
000000c0  68 61 72 65 64 2f 69 73  2d 67 69 74 2d 70 34 2f  |hared/is-git-p4/|
000000d0  72 65 70 6f 73 69 74 6f  72 79 2f 72 65 70 6f 5f  |repository/repo_|
000000e0  6c 6f 63 61 6c 5f 32 73  07 00 00 00 4f 70 74 69  |local_2s....Opti|
000000f0  6f 6e 73 73 3a 00 00 00  6e 6f 61 6c 6c 77 72 69  |onss:...noallwri|
00000100  74 65 20 6e 6f 63 6c 6f  62 62 65 72 20 6e 6f 63  |te noclobber noc|
00000110  6f 6d 70 72 65 73 73 20  75 6e 6c 6f 63 6b 65 64  |ompress unlocked|
00000120  20 6e 6f 6d 6f 64 74 69  6d 65 20 6e 6f 72 6d 64  | nomodtime normd|
00000130  69 72 73 0d 00 00 00 53  75 62 6d 69 74 4f 70 74  |irs....SubmitOpt|
00000140  69 6f 6e 73 73 0f 00 00  00 73 75 62 6d 69 74 75  |ionss....submitu|
00000150  6e 63 68 61 6e 67 65 64  73 07 00 00 00 4c 69 6e  |nchangeds....Lin|
00000160  65 45 6e 64 73 05 00 00  00 6c 6f 63 61 6c 73 05  |eEnds....locals.|
00000170  00 00 00 56 69 65 77 30  73 1e 00 00 00 2f 2f 72  |...View0s....//r|
00000180  65 70 6f 2f 2e 2e 2e 20  2f 2f 67 69 74 2d 70 34  |epo/... //git-p4|
00000190  2d 73 79 6e 63 2d 32 2f  2e 2e 2e 73 04 00 00 00  |-sync-2/...s....|
000001a0  54 79 70 65 73 09 00 00  00 77 72 69 74 65 61 62  |Types....writeab|
000001b0  6c 65 73 06 00 00 00 42  61 63 6b 75 70 73 06 00  |les....Backups..|
000001c0  00 00 65 6e 61 62 6c 65  73 09 00 00 00 65 78 74  |..enables....ext|
000001d0  72 61 54 61 67 30 73 0d  00 00 00 56 69 65 77 44  |raTag0s....ViewD|
000001e0  65 70 6f 74 54 79 70 65  73 0d 00 00 00 65 78 74  |epotTypes....ext|
000001f0  72 61 54 61 67 54 79 70  65 30 73 04 00 00 00 77  |raTagType0s....w|
00000200  6f 72 64 73 0d 00 00 00  56 69 65 77 44 65 70 6f  |ords....ViewDepo|
00000210  74 54 79 70 65 73 05 00  00 00 67 72 61 70 68 30  |tTypes....graph0|
00000220

  $ p4 -G client -o git-p4-sync-2 | python -c 'import marshal, pprint, sys; pprint.pprint(marshal.load(sys.stdin))'
{'Access': '2019/12/15 13:49:51',
 'Backup': 'enable',
 'Client': 'git-p4-sync-2',
 'LineEnd': 'local',
 'Options': 'noallwrite noclobber nocompress unlocked nomodtime normdir',
 'Owner': 'sa',
 'Root': '/var/atlassian/application-data/bitbucket/shared/is-git-p4/repository/repo_local_2',
 'SubmitOptions': 'submitunchanged',
 'Type': 'writeable',
 'Update': '2019/12/15 13:49:51',
 'View0': '//repo/... //git-p4-sync-2/...',
 'ViewDepotType': 'graph',
 'code': 'stat',
 'extraTag0': 'ViewDepotType',
 'extraTagType0': 'word'}

^ permalink raw reply

* Re: [PATCH 5/9] commit-graph: write changed path bloom filters to commit-graph file.
From: Jakub Narebski @ 2020-01-07 16:01 UTC (permalink / raw)
  To: Garima Singh via GitGitGadget
  Cc: git, Derrick Stolee, SZEDER Gábor, Jonathan Tan,
	Jeff Hostetler, Taylor Blau, Jeff King, Junio C Hamano,
	Garima Singh
In-Reply-To: <7648021072ca11153ac65c90f0ebed5973f20e1a.1576879520.git.gitgitgadget@gmail.com>

"Garima Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Garima Singh <garima.singh@microsoft.com>
>
> Write bloom filters to the commit-graph using the format described in
> Documentation/technical/commit-graph-format.txt
>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Garima Singh <garima.singh@microsoft.com>

Looks good to me.

> ---
>  commit-graph.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  commit-graph.h |  5 ++++
>  2 files changed, 85 insertions(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 8c4941eeaa..def2ade166 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -24,7 +24,9 @@
>  #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */
>  #define GRAPH_CHUNKID_EXTRAEDGES 0x45444745 /* "EDGE" */
>  #define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */
> -#define MAX_NUM_CHUNKS 5
> +#define GRAPH_CHUNKID_BLOOMINDEXES 0x42494458 /* "BIDX" */
> +#define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */
> +#define MAX_NUM_CHUNKS 7

Very minor nitpick: shouldn't we follow the order in the
commit-graph-format.txt document (i.e. "BASE" as last chunk and last
preprocessor constant)?

>  
>  #define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16)
>  
> @@ -282,6 +284,32 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
>  				chunk_repeated = 1;
>  			else
>  				graph->chunk_base_graphs = data + chunk_offset;
> +			break;
> +
> +		case GRAPH_CHUNKID_BLOOMINDEXES:
> +			if (graph->chunk_bloom_indexes)
> +				chunk_repeated = 1;
> +			else
> +				graph->chunk_bloom_indexes = data + chunk_offset;
> +			break;

All right.

> +
> +		case GRAPH_CHUNKID_BLOOMDATA:
> +			if (graph->chunk_bloom_data)
> +				chunk_repeated = 1;
> +			else {
> +				uint32_t hash_version;
> +				graph->chunk_bloom_data = data + chunk_offset;
> +				hash_version = get_be32(data + chunk_offset);

All right, now I see why all those header values for BDAT chunk are
defined to be 32-bit integers.  For code simplicity.

> +
> +				if (hash_version != 1)
> +					break;

What does it mean for Git?  Behave as if there were no Bloom filter
data?

> +
> +				graph->settings = xmalloc(sizeof(struct bloom_filter_settings));
> +				graph->settings->hash_version = hash_version;
> +				graph->settings->num_hashes = get_be32(data + chunk_offset + 4);
> +				graph->settings->bits_per_entry = get_be32(data + chunk_offset + 8);

All right, looks O.K.

> +			}
> +			break;
>  		}
>  
>  		if (chunk_repeated) {
> @@ -996,6 +1024,39 @@ static void write_graph_chunk_extra_edges(struct hashfile *f,
>  	}
>  }
>  
> +static void write_graph_chunk_bloom_indexes(struct hashfile *f,
> +					    struct write_commit_graph_context *ctx)
> +{
> +	struct commit **list = ctx->commits.list;
> +	struct commit **last = ctx->commits.list + ctx->commits.nr;
> +	uint32_t cur_pos = 0;
> +
> +	while (list < last) {
> +		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list);
> +		cur_pos += filter->len;
> +		hashwrite_be32(f, cur_pos);
> +		list++;
> +	}

Why not follow the write_graph_chunk_oids() example, instead of
write_graph_chunk_data(), that is use simply:

  +	struct commit **list = ctx->commits.list;
  +	uint32_t cur_pos = 0;
  +
  +	for (count = 0; count < ctx->commits.nr; count++, list++) {
  +		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list);
  +		cur_pos += filter->len;
  +		hashwrite_be32(f, cur_pos);
  +	}

I guess using here

  +		cur_pos += get_bloom_filter(ctx->r, *list)->len;

would be too cryptic, and hard to debug?

Also, wouldn't we need

  +		display_progress(ctx->progress, ++ctx->progress_cnt);

before hashwrite_be32()?

> +}
> +
> +static void write_graph_chunk_bloom_data(struct hashfile *f,
> +					 struct write_commit_graph_context *ctx,
> +					 struct bloom_filter_settings *settings)
> +{
> +	struct commit **first = ctx->commits.list;

Even if we decide to use `while` loop, like write_graph_chunk_data(),
and not `for` loop, like write_graph_chunk_oids(), why the change from
`struct commit **list = ...` to `struct commit **first = ...`?

> +	struct commit **last = ctx->commits.list + ctx->commits.nr;
> +
> +	hashwrite_be32(f, settings->hash_version);
> +	hashwrite_be32(f, settings->num_hashes);
> +	hashwrite_be32(f, settings->bits_per_entry);

All right, simple.

> +
> +	while (first < last) {
> +		struct bloom_filter *filter = get_bloom_filter(ctx->r, *first);

Hmmm... wouldn't this compute Bloom filter second time?
get_bloom_filter() does work unconditionally.

Wouldn't

  +		struct bloom_filter *filter = bloom_filter_slab_at(&bloom_filters, *first);

be enough?

Or make get_bloom_filter() use *_peek() to check if Bloom filter for
given commit was already computed, and only if it returns NULL do the
work.

> +		hashwrite(f, filter->data, filter->len * sizeof(uint64_t));

Might need display_progress() before hashwrote().

> +		first++;
> +	}
> +}
> +
>  static int oid_compare(const void *_a, const void *_b)
>  {
>  	const struct object_id *a = (const struct object_id *)_a;
> @@ -1388,6 +1449,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
>  	struct strbuf progress_title = STRBUF_INIT;
>  	int num_chunks = 3;
>  	struct object_id file_hash;
> +	struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
>  
>  	if (ctx->split) {
>  		struct strbuf tmp_file = STRBUF_INIT;
> @@ -1432,6 +1494,12 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
>  		chunk_ids[num_chunks] = GRAPH_CHUNKID_EXTRAEDGES;
>  		num_chunks++;
>  	}
> +	if (ctx->bloom) {
> +		chunk_ids[num_chunks] = GRAPH_CHUNKID_BLOOMINDEXES;
> +		num_chunks++;
> +		chunk_ids[num_chunks] = GRAPH_CHUNKID_BLOOMDATA;
> +		num_chunks++;
> +	}

Looks all right.

>  	if (ctx->num_commit_graphs_after > 1) {
>  		chunk_ids[num_chunks] = GRAPH_CHUNKID_BASE;
>  		num_chunks++;
> @@ -1450,6 +1518,13 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
>  						4 * ctx->num_extra_edges;
>  		num_chunks++;
>  	}
> +	if (ctx->bloom) {
> +		chunk_offsets[num_chunks + 1] = chunk_offsets[num_chunks] + sizeof(uint32_t) * ctx->commits.nr;
> +		num_chunks++;
> +
> +		chunk_offsets[num_chunks + 1] = chunk_offsets[num_chunks] + sizeof(uint32_t) * 3 + ctx->total_bloom_filter_size;
> +		num_chunks++;
> +	}

Better wrap those long lines, like above:

  +	if (ctx->bloom) {
  +		chunk_offsets[num_chunks + 1] = chunk_offsets[num_chunks] +
  +						sizeof(uint32_t) * ctx->commits.nr;
  +		num_chunks++;
  +
  +		chunk_offsets[num_chunks + 1] = chunk_offsets[num_chunks] +
  +						sizeof(uint32_t) * 3 + ctx->total_bloom_filter_size;
  +		num_chunks++;
  +	}

>  	if (ctx->num_commit_graphs_after > 1) {
>  		chunk_offsets[num_chunks + 1] = chunk_offsets[num_chunks] +
>  						hashsz * (ctx->num_commit_graphs_after - 1);
> @@ -1487,6 +1562,10 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
>  	write_graph_chunk_data(f, hashsz, ctx);
>  	if (ctx->num_extra_edges)
>  		write_graph_chunk_extra_edges(f, ctx);
> +	if (ctx->bloom) {
> +		write_graph_chunk_bloom_indexes(f, ctx);
> +		write_graph_chunk_bloom_data(f, ctx, &bloom_settings);
> +	}

All right.

>  	if (ctx->num_commit_graphs_after > 1 &&
>  	    write_graph_chunk_base(f, ctx)) {
>  		return -1;
> diff --git a/commit-graph.h b/commit-graph.h
> index 952a4b83be..2202ad91ae 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -10,6 +10,7 @@
>  #define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
>  
>  struct commit;
> +struct bloom_filter_settings;
>  
>  char *get_commit_graph_filename(const char *obj_dir);
>  int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
> @@ -58,6 +59,10 @@ struct commit_graph {
>  	const unsigned char *chunk_commit_data;
>  	const unsigned char *chunk_extra_edges;
>  	const unsigned char *chunk_base_graphs;
> +	const unsigned char *chunk_bloom_indexes;
> +	const unsigned char *chunk_bloom_data;
> +
> +	struct bloom_filter_settings *settings;

Should this be part of `struct commit_graph`?  Shouldn't we free() this
data, or is it a pointer into xmmap-ped file... no it isn't -- we
xalloc() it, so we should free() it.

I think it should be done in 'cleanup:' section of write_commit_graph(),
but I am not entirely sure.

>  };
>  
>  struct commit_graph *load_commit_graph_one_fd_st(int fd, struct stat *st);

Best,
-- 
Jakub Narębski

^ permalink raw reply

* Re: [PATCH 0/1] sequencer: comment out the 'squash!' line
From: Junio C Hamano @ 2020-01-07 16:15 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Mike Rappazzo, Michael Rappazzo via GitGitGadget, Git List
In-Reply-To: <20200107013401.GI6570@camp.crustytoothpaste.net>

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I can see the argument that this makes it a little harder for mechanical
> processing across versions, but I suspect most of that looks something
> like "sed -i -e '/^squash! /d' COMMIT_EDITMSG" and it probably won't be
> affected.

With the left-anchoring, the search pattern will no longer find that
line if "squash!" is commented out, but people tend to be sloppy and
do not anchor the pattern would not notice the difference.  Perhaps
the downside may not be too severe?  I dunno.

> We do make occasional slightly incompatible changes across
> versions in order to improve user experience, and I think a lot of folks
> who use squash commits will find this a pleasant improvement.


^ permalink raw reply

* Re: [PATCH v2 1/1] branch: advise the user to checkout a different branch before deleting
From: Junio C Hamano @ 2020-01-07 16:34 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Heba Waly via GitGitGadget, Git List, Heba Waly
In-Reply-To: <CAPig+cQ0qY8KDZrQ8khuz34DqPimorN7JHHn0Ms=KpvJYtxJoA@mail.gmail.com>

Eric Sunshine <sunshine@sunshineco.com> writes:

[jc: skipped all the good suggestions I agree with]

>> +                                       }
>> +                                       else {
>> +                                               advise(_("The branch you are trying to delete is checked "
>> +                                                       "out on another worktree, run the following command "
>> +                                                       "to checkout a different branch then try again:\n"
>> +                                                       "git -C %s switch <branch>"), wt->path);
>
> I like the use of -C here because it makes the command self-contained,
> however, I also worry because wt->path is an absolute path, thus
> likely to be quite lengthy, which means that the important part of the
> example command (the "switch <branch>") can get pushed quite far away,
> thus is more easily overlooked by the reader. I wonder if it would
> make more sense to show the 'cd' command explicitly, although doing so
> ties the example to a particular shell, which may be a downside.
>
>     cd %s
>     git switch <different-branch>
>     cd -
>     git branch -%c <this-branch>

Note that wt->path may have special characters that would need to be
protected from the user's shell (worse, the quoting convention may
be different depending on which shell is in use).  That is one of
the reasons why I would suggest to stay away from giving an advice
that pretends to be cut-and-paste-able without being so.  In this
case, <different-branch> and <this-branch> must be filled by the
user anyway, and the only thing worth cutting-and-pasting is the
path to the other worktree, not the "git -C" or "cd" that users
should be able to come up with.

	"The branch is checked out on another worktree at\n"
	"path '%s'\n"
	"and cannot be deleted.  Go there, check out some other\n"
	"branch and try again."

or something like that, perhaps?  

> (It is rather verbose and ugly, though.)

I tend to agree.  It also feels to me that it is giving too much
hand-holding, but after all advise() may turning out to be about
giving that.


^ permalink raw reply

* Re: [PATCH 1/1] add: use advise function to display hints
From: Junio C Hamano @ 2020-01-07 16:35 UTC (permalink / raw)
  To: Heba Waly; +Cc: Git Mailing List, Heba Waly via GitGitGadget
In-Reply-To: <CACg5j27ce5BfR9RKekMEXokvCnXiXzmCVsyEKce+HORe8kL_GQ@mail.gmail.com>

Heba Waly <heba.waly@gmail.com> writes:

> On Fri, Jan 3, 2020 at 11:48 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>> A side note.
>>
>> Right now, the advise() API is a bit awkweard to use correctly.
>> When introducing a new advice message, you would
>>
>>  * come up with advice.frotz configuration variable
>>
>>  * define and declare advice_frotz global variable that defaults to
>>    true
>>
>>  * sprinkle calls like this:
>>
>>         if (advice_frotz)
>>                 advise(_("helpful message about frotz"));
>>
>> I am wondering about two things:
>>
>>  (1) if we can update the API so that the above can be reduced to
>>      just adding calls like:
>>
>>         advise_ng("frotz", _("helpful message about frotz"));
>>
>>  (2) if such a simplified advise_ng API is a good idea to begin
>>      with.
>>
>
> That's a valid suggestion, I can investigate that in a new patch, I'd rather
> keep this one as simple as calling the existing advise function.

Yeah, the side note wasn't even a suggestion for improving _this_
topic, nor even specifically addressed to you.  Let's stay focused.

Thanks.


^ permalink raw reply

* Re: [PATCH v2 1/1] unpack-trees: exit check_updates() early if updates are not wanted
From: Junio C Hamano @ 2020-01-07 16:37 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren
In-Reply-To: <dd277273324eaba5e1f4a368bf2e4d046033c776.1578380277.git.gitgitgadget@gmail.com>

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> check_updates() has a lot of code that repeatedly checks whether
> o->update or o->dry_run are set.  (Note that o->dry_run is a
> near-synonym for !o->update, but not quite as per commit 2c9078d05bf2
> ("unpack-trees: add the dry_run flag to unpack_trees_options",
> 2011-05-25).)  In fact, this function almost turns into a no-op whenever
> the condition
>    !o->update || o->dry_run
> is met.  Simplify the code by checking this condition at the beginning
> of the function, and when it is true, do the few things that are
> relevant and return early.

Thanks; will queue.


^ permalink raw reply

* Re: [PATCH 0/5] refactor gpg-interface and add gpg verification for clones
From: Junio C Hamano @ 2020-01-07 16:54 UTC (permalink / raw)
  To: Hans Jerry Illikainen; +Cc: git
In-Reply-To: <87lfqjnc5o.hji@dyntopia.com>

Hans Jerry Illikainen <hji@dyntopia.com> writes:

>> It almost makes me suspect that what you are trying to do with the
>> "verification upon cloning" may be much better done as a "verify the
>> tree for trustworthyness before checking it out to the working tree"
>> mechanism, where the trustworthyness of a tree-ish object may be
>> defined (and possibly made customizable by the policy of the project
>> the user is working on) like so:
>>
>>  - A tree is trustworthy if it is the tree of a trustworthy commit.
>>
>>  - A commit is trustworthy if
>>
>>    . it carries a trusted signature, or
>>    . it is pointed by a tag that carries a trusted signature, or
>>    . it is reachable from a trustworthy commit.
>>
>> Now, it is prohibitively expensive to compute the trusttworthiness
>> of a tree, defined like the above, when checking it out, UNLESS you
>> force each and every commit to carry a trusted signature, which is
>> not necessarily practical in the real world.
>
> Even though you mention that it's computationally expensive, I kind of
> like this approach.  It seems generalized enough that it doesn't need to
> be tied to a single operation like 'clone'.

Well, I don't ;-)  But even if you are not signing each and every
commit, it may be possible that the reachability information kept in
the commit-graph may help reduce the cost of the computation.  We'd
probably need a way to memoise which tags and commits have already
been verified earlier for trustworthiness for the scheme to work.

> Wouldn't that still forgo signature verification when doing something
> like:
>
>     $ git fetch
>     $ git checkout -b foo origin/branch-with-previously-unseen-commits
>
> unless either fetch or checkout is equipped with signature verification?

Yes, but I was assuming that "fetch", as an underlying mechanism for
"pull" would be what you'd teach the verification.

> Also, in case of a revoked key, this approach would require that tags
> signed with that key be deleted on origin.  

I am not sure if I follow.  An object that was signed back when a
key was OK and then the key gets revoked later---what should happen
to the trustworthiness of that object?  Another object that was
obtained from elsewhere was verified to be trustworthy, but later we
found out that the key we thought was trusted was already revoked---
what should happen to the trustworthiness of that object?

In either case, I think it is an unrelated matter if the tag object
should be removed either here or at the origin.  When spread *now*,
the key that was used to sign that tag object is known to have been
revoked, so while it is of no use to convey trustworthiness, if the
upstream project chooses to keep the tag (perhaps while re-issuing
a new version of the tag that points at the same object but signed
with a different key), that is perfectly fine, I would say.

> Maybe that's to be
> considered best practice anyway, but distro maintainers might not
> appreciate disappearing release tags.

I think that is best left as a policy decision for each project, and
we are not in the business of removing objects---we just should give
them tools to determine what key was used to sign an object, GPG or
whatever signing and key management infrastructure the projects
chooses to use gives them tools to decide if the key is what the
project trusts, and we give them tools to remove tags the project
finds unwanted.  Combining the tools according to the policy decision
the project makes is outside the scope of what we do here.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox