* Re: [PATCH v2 3/2] credential-cache: use child_process.args
From: Jeff King @ 2020-09-01 4:49 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqpn76e873.fsf@gitster.c.googlers.com>
On Mon, Aug 31, 2020 at 03:56:00PM -0700, Junio C Hamano wrote:
> > - argv[0] = "git-credential-cache--daemon";
> > - argv[1] = socket;
> > - daemon.argv = argv;
> > + strvec_pushl(&daemon.args,
> > + "credential-cache--daemon", socket,
> > + NULL);
> > + daemon.git_cmd = 1;
> > daemon.no_stdin = 1;
> > daemon.out = -1;
> >
>
> By the way, an interesting fact is that this cannot graduate UNTIL
> credential-cache becomes a built-in. Having an intermediate level
> process seems to break t0301.
Hmm, that is interesting. I thought it would work OK because we don't
rely on any process-id magic for finding the daemon, etc, and instead
talk over pipe descriptors. But that proves to be our undoing.
What happens usually is this:
- credential-cache spawns credential-cache--daemon with its
stdout connected to a pipe
- the client calls read_in_full() waiting for the daemon to tell us
that it started up
- the daemon opens the socket, then writes "ok\n" to stdout and closes
the pipe
- the client gets EOF on the pipe, then compares what it read to
"ok\n", and continues (or relays an error)
But when we add the extra "git" wrapper process into the mix, we never
see that EOF. The wrapper's stdout also points to that pipe, so closing
it in the daemon process isn't enough for the client to get EOF. And the
wrapper doesn't exit, because it's waiting on the daemon.
So one hacky fix is:
diff --git a/credential-cache.c b/credential-cache.c
index 04df61cf02..9bfddaf050 100644
--- a/credential-cache.c
+++ b/credential-cache.c
@@ -51,7 +51,7 @@ static void spawn_daemon(const char *socket)
if (start_command(&daemon))
die_errno("unable to start cache daemon");
- r = read_in_full(daemon.out, buf, sizeof(buf));
+ r = read_in_full(daemon.out, buf, 3);
if (r < 0)
die_errno("unable to read result code from cache daemon");
if (r != 3 || memcmp(buf, "ok\n", 3))
which makes t0301 pass. A less-hacky solution might be to loosen its
expectations not to require EOF at all (and just accept anything
starting with "ok\n". But I don't think it's worth doing either, if we
know we're going to switch it to a builtin anyway (and that also makes
me feel slightly better about the plan to do so).
I do wonder if it points to a problem in the git.c wrapper. It's
duplicating all of the descriptors that the child external commands will
see, so callers can't rely on EOF (or EPIPE for its stdin) to know when
the external program has closed them. For the most part that's OK,
because we expect them to close when the external program exits, at
which point the wrapper will exit, too. But things get weird around
half-duplex shutdowns, or programs that try to daemonize.
Perhaps git.c should be closing all descriptors after spawning the
child. Of course that gets weird if it wants to write an error to stderr
about spawning the child. I dunno. It seems as likely to introduce
problems as solve them, so if nothing is broken beyond this cache-daemon
thing, I'd just as soon leave it.
I wouldn't be surprised if older pre-builtin "upload-pack" could run
into problems. But we always called it as "git-upload-pack" back then
anyway. Possibly modern "git daemon" could suffer weirdness, as it's
still a separate program (I shied away from including it in my recent
"slimming" series exactly because I was afraid of these kinds of issues;
but it sounds like being a builtin probably has less-surprising
implications overall).
All of which is to say I'm happy to do nothing, but that turned out to
be a very interesting data point. Thanks for mentioning it. :)
-Peff
^ permalink raw reply related
* Re: [PATCH v4 4/6] config: correctly read worktree configs in submodules
From: Jonathan Nieder @ 2020-09-01 2:41 UTC (permalink / raw)
To: Matheus Tavares; +Cc: git, gitster, stolee, newren, jonathantanmy
In-Reply-To: <6402c968077900d48d189551a652e10984437a9f.1591974940.git.matheus.bernardino@usp.br>
Hi,
Matheus Tavares wrote:
> One of the steps in do_git_config_sequence() is to load the
> worktree-specific config file. Although the function receives a git_dir
> string, it relies on git_pathdup(), which uses the_repository->git_dir,
> to make the path to the file. Furthermore, it also checks that
> extensions.worktreeConfig is set through the
> repository_format_worktree_config variable, which refers to
> the_repository only. Thus, when a submodule has worktree-specific
> settings, a command executed in the superproject that recurses into the
> submodule won't find the said settings.
I think the above goes out of order: it states the "how" before the
"what". Instead, a commit message should lead with the problem the
change aims to solve.
Is the idea here that until this patch, we're only able to read
worktree config from a repository when extensions.worktreeConfig is
set in the_repository, meaning that
- when examining submodule config in a process where the_repository
represents the superproject, we do not read the submodule's worktree
config even if extensions.worktreeConfig is set in the submodule,
unless the superproject has extensions.worktreeConfig set, and
- when examining submodule config in a process where the_repository
represents the superproject, we *do* read the submodule's worktree
config even if extensions.worktreeConfig is not set in the submodule,
if the superproject has extensions.worktreeConfig set, and
?
That sounds like a serious problem indeed. Thanks for fixing it.
> This will be especially important in the next patch: git-grep will learn
> to honor sparse checkouts and, when running with --recurse-submodules,
> the submodule's sparse checkout settings must be loaded. As these
> settings are stored in the config.worktree file, they would be ignored
> without this patch. So let's fix this by reading the right
> config.worktree file and extensions.worktreeConfig setting, based on the
> git_dir and commondir paths given to do_git_config_sequence(). Also
> add a test to avoid any regressions.
I see. I'm not sure that's more important than other cases, but I
can understand if the problem was noticed in this circumstance. :)
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> config.c | 21 +++++++++---
> t/helper/test-config.c | 67 +++++++++++++++++++++++++++++++++-----
> t/t2404-worktree-config.sh | 16 +++++++++
> 3 files changed, 91 insertions(+), 13 deletions(-)
>
> diff --git a/config.c b/config.c
> index 8db9c77098..c2d56309dc 100644
> --- a/config.c
> +++ b/config.c
> @@ -1747,11 +1747,22 @@ static int do_git_config_sequence(const struct config_options *opts,
> ret += git_config_from_file(fn, repo_config, data);
>
> current_parsing_scope = CONFIG_SCOPE_WORKTREE;
> - if (!opts->ignore_worktree && repository_format_worktree_config) {
> + if (!opts->ignore_worktree && repo_config && opts->git_dir) {
Can we eliminate the repository_format_worktree_config global to save
the next caller from the same problem?
> + struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
> + struct strbuf buf = STRBUF_INIT;
> +
> + read_repository_format(&repo_fmt, repo_config);
> +
> + if (!verify_repository_format(&repo_fmt, &buf) &&
> + repo_fmt.worktree_config) {
This undoes the caching the repository_format_worktree_config means to
do. Can we cache the value in "struct repository" instead? That way,
in the common case where we're reading the_repository, we wouldn't
experience a slowdown.
> - char *path = git_pathdup("config.worktree");
> + char *path = mkpathdup("%s/config.worktree", opts->git_dir);
Can this use a helper like repo_git_path or strbuf_repo_git_path
(preferably one using strbuf like the latter)?
[...]
> + strbuf_release(&buf);
> + clear_repository_format(&repo_fmt);
> }
>
> current_parsing_scope = CONFIG_SCOPE_COMMAND;
> diff --git a/t/helper/test-config.c b/t/helper/test-config.c
> index 61da2574c5..284f83a921 100644
> --- a/t/helper/test-config.c
> +++ b/t/helper/test-config.c
> @@ -2,12 +2,19 @@
> #include "cache.h"
> #include "config.h"
> #include "string-list.h"
> +#include "submodule-config.h"
>
> /*
> * This program exposes the C API of the configuration mechanism
> * as a set of simple commands in order to facilitate testing.
> *
> - * Reads stdin and prints result of command to stdout:
> + * Usage: test-tool config [--submodule=<path>] <cmd> [<args>]
> + *
> + * If --submodule=<path> is given, <cmd> will operate on the submodule at the
> + * given <path>. This option is not valid for the commands: read_early_config,
> + * configset_get_value and configset_get_value_multi.
Nice!
[...]
> @@ -93,7 +102,18 @@ int cmd__config(int argc, const char **argv)
> if (argc == 0)
> goto print_usage_error;
>
> + if (skip_prefix(*argv, "--submodule=", &subrepo_path)) {
> + argc--;
> + argv++;
> + if (argc == 0)
> + goto print_usage_error;
> + }
Can this use the parse_options API?
> +
> if (argc == 2 && !strcmp(argv[0], "read_early_config")) {
> + if (subrepo_path) {
> + fprintf(stderr, "Cannot use --submodule with read_early_config\n");
> + return TC_USAGE_ERROR;
Should this use die() or BUG()?
> + }
> read_early_config(early_config_cb, (void *)argv[1]);
> return TC_SUCCESS;
> }
> @@ -101,8 +121,23 @@ int cmd__config(int argc, const char **argv)
> setup_git_directory();
> git_configset_init(&cs);
>
> + if (subrepo_path) {
> + const struct submodule *sub;
> + struct repository *subrepo = xcalloc(1, sizeof(*repo));
nit: this could be scoped to cmd__config:
struct repository subrepo = {0};
> +
> + sub = submodule_from_path(the_repository, &null_oid, subrepo_path);
> + if (!sub || repo_submodule_init(subrepo, the_repository, sub)) {
> + fprintf(stderr, "Invalid argument to --submodule: '%s'\n",
> + subrepo_path);
> + free(subrepo);
> + ret = TC_USAGE_ERROR;
Likewise: I think may want to use die() or BUG() (and likewise for other
USAGE_ERROR cases).
Thanks and hope that helps,
Jonathan
^ permalink raw reply
* Re: Bug Report: "git reset --hard" does not cancel an on-going rebase
From: Junio C Hamano @ 2020-08-31 23:11 UTC (permalink / raw)
To: Thomas Bétous; +Cc: git
In-Reply-To: <34B7EFB9-8710-4993-ACCD-604313D543E7@gmail.com>
Thomas Bétous <th.betous@gmail.com> writes:
> I would like to report an inconsistent behavior of the
> rebase/reset commands. I don’t know whether it is an actual bug
> or something else but according to me something is not right.
> When a rebase gets paused (because of a conflict for instance) I
> would expect the command "git reset --hard" to cancel this
> on-going rebase but it does not. I expect this because for
> instance "git reset --hard" cancels a cherry-pick in the same use
> case so I think the behavior of these 2 commands should be
> consistent.
It is reasonable and desirable for multi-commit operations like
"rebase", "cherry-pick A..B" and "revert A..B" not to abort the
entire sequence with a mere "reset --hard". After a step resulted
in conflicts, the user may try to resolve them, getting into too
deep a hole by botching resolution, and wish to redo the current
step from scratch, and "reset --hard" can be a way to clean the
slate before recreating the same conflict. To abort the whole
thing, "rebase --abort" and "cherry-pick --abort" would be needed
to differentiate from the "clean this single step" request made with
"reset --hard".
On the other hand, operations on a single commit like "cherry-pick
X" does not have to retain "what to do after we have dealt with the
current step", so "reset --hard" that finishes the whole thing
(after all, the whole thing is the single step that may have
conflicted) would just be a convenient short-hand.
^ permalink raw reply
* Re: [PATCH v2 3/2] credential-cache: use child_process.args
From: Junio C Hamano @ 2020-08-31 22:56 UTC (permalink / raw)
To: git; +Cc: Jeff King
In-Reply-To: <xmqqzh6ht7fg.fsf_-_@gitster.c.googlers.com>
Junio C Hamano <gitster@pobox.com> writes:
> As child_process structure has an embedded strvec args for
> formulating the command line, let's use it instead of using
> an out-of-line argv[] whose length needs to be maintained
> correctly.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> builtin/credential-cache.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/credential-cache.c b/builtin/credential-cache.c
> index d0fafdeb9e..195335a783 100644
> --- a/builtin/credential-cache.c
> +++ b/builtin/credential-cache.c
> @@ -42,13 +42,13 @@ static int send_request(const char *socket, const struct strbuf *out)
> static void spawn_daemon(const char *socket)
> {
> struct child_process daemon = CHILD_PROCESS_INIT;
> - const char *argv[] = { NULL, NULL, NULL };
> char buf[128];
> int r;
>
> - argv[0] = "git-credential-cache--daemon";
> - argv[1] = socket;
> - daemon.argv = argv;
> + strvec_pushl(&daemon.args,
> + "credential-cache--daemon", socket,
> + NULL);
> + daemon.git_cmd = 1;
> daemon.no_stdin = 1;
> daemon.out = -1;
>
By the way, an interesting fact is that this cannot graduate UNTIL
credential-cache becomes a built-in. Having an intermediate level
process seems to break t0301.
^ permalink raw reply
* Bug Report: "git reset --hard" does not cancel an on-going rebase
From: Thomas Bétous @ 2020-08-31 21:59 UTC (permalink / raw)
To: git
Hello,
I would like to report an inconsistent behavior of the rebase/reset commands. I don’t know whether it is an actual bug or something else but according to me something is not right.
When a rebase gets paused (because of a conflict for instance) I would expect the command "git reset --hard" to cancel this on-going rebase but it does not.
I expect this because for instance "git reset --hard" cancels a cherry-pick in the same use case so I think the behavior of these 2 commands should be consistent.
The issue can be reproduced with the following commands:
# First trigger a rebase that will paused because of a conflict
# (there is probably a simpler way to end up in this use case)
$ git init test-repo-rebase-reset
$ cd test-repo-rebase-reset/
$ touch file
$ git add file
$ git commit file -m 'First commit’
$ echo "dummy line" > file
$ git commit --all -m 'Second commit’
$ git switch --create new-branch HEAD~1
$ git rm file
$ git commit -m 'Remove file’
$ git rebase master
# Let’s check that the rebase got paused because of the conflict
$ git status
$ git reset --hard
# Even after the “git reset --hard“ the rebase was not aborted
$ git status
# Now let’s check what happens with the cherry-pick command
$ git rebase --abort
$ git switch master
$ git cherry-pick $(git rev-parse -1 new-branch)
# The cherry-pick got paused because of the same conflict
git status
# Let’s try to abort the cherry-pick with the reset command
git reset --hard
# Double check the cherry-pick was aborted
git status
I think the behavior will be the same on all OS but if it helps I was able to reproduce it on Cygwin/Windows 10 with git 2.28.0 and Mac OS with git 2.26.0
Note that in the commands above I didn’t give any revision to the reset command but the issue would be the same if I did. Moreover it makes it even more confusing I think: you can move your HEAD to any revision while the rebase is still waiting to be completed.
What do you think? Does this behavior seems normal to you?
Thank you in advance for your answer.
Best regards,
Thomas
^ permalink raw reply
* Re: [PATCH v4] revision: add separate field for "-m" of "diff-index -m"
From: Sergey Organov @ 2020-08-31 21:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git
In-Reply-To: <xmqqzh6aeedr.fsf@gitster.c.googlers.com>
Junio C Hamano <gitster@pobox.com> writes:
> Sergey Organov <sorganov@gmail.com> writes:
>
>> Add separate 'match_missing' field for diff-index to use and set it when we
>> encounter "-m" option. This field won't then be cleared when another
>> meaning of
>> "-m" is reverted (e.g., by "--no-diff-merges"), nor it will be affected by
>> future option(s) that might drive 'ignore_merges' field.
>>
>> Use this new field from diff-lib:do_oneway_diff() instead of reusing
>> 'ignore_merges' field.
>>
>> Signed-off-by: Sergey Organov <sorganov@gmail.com>
>> ---
>
> Looks good. Will queue. Thanks.
Nice! Thanks a lot!
^ permalink raw reply
* Re: [PATCH v4] revision: add separate field for "-m" of "diff-index -m"
From: Junio C Hamano @ 2020-08-31 20:42 UTC (permalink / raw)
To: Sergey Organov; +Cc: Jeff King, git
In-Reply-To: <20200831201422.27189-1-sorganov@gmail.com>
Sergey Organov <sorganov@gmail.com> writes:
> Add separate 'match_missing' field for diff-index to use and set it when we
> encounter "-m" option. This field won't then be cleared when another meaning of
> "-m" is reverted (e.g., by "--no-diff-merges"), nor it will be affected by
> future option(s) that might drive 'ignore_merges' field.
>
> Use this new field from diff-lib:do_oneway_diff() instead of reusing
> 'ignore_merges' field.
>
> Signed-off-by: Sergey Organov <sorganov@gmail.com>
> ---
Looks good. Will queue. Thanks.
^ permalink raw reply
* [PATCH v4] revision: add separate field for "-m" of "diff-index -m"
From: Sergey Organov @ 2020-08-31 20:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git, Sergey Organov
In-Reply-To: <20200829194634.23306-1-sorganov@gmail.com>
Add separate 'match_missing' field for diff-index to use and set it when we
encounter "-m" option. This field won't then be cleared when another meaning of
"-m" is reverted (e.g., by "--no-diff-merges"), nor it will be affected by
future option(s) that might drive 'ignore_merges' field.
Use this new field from diff-lib:do_oneway_diff() instead of reusing
'ignore_merges' field.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
---
v4: fix new field name to be 'match_missing', and improve comment in the code to
treat both involved bits similarily
v3: improve commit message
v2: rebased from 'maint' onto 'master'
diff-lib.c | 10 ++--------
revision.c | 6 ++++++
revision.h | 1 +
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/diff-lib.c b/diff-lib.c
index 50521e2093fc..5d5d3dafab33 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -405,14 +405,8 @@ static void do_oneway_diff(struct unpack_trees_options *o,
/* if the entry is not checked out, don't examine work tree */
cached = o->index_only ||
(idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx)));
- /*
- * Backward compatibility wart - "diff-index -m" does
- * not mean "do not ignore merges", but "match_missing".
- *
- * But with the revision flag parsing, that's found in
- * "!revs->ignore_merges".
- */
- match_missing = !revs->ignore_merges;
+
+ match_missing = revs->match_missing;
if (cached && idx && ce_stage(idx)) {
struct diff_filepair *pair;
diff --git a/revision.c b/revision.c
index 96630e31867d..73e3d14cc165 100644
--- a/revision.c
+++ b/revision.c
@@ -2344,7 +2344,13 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
revs->diffopt.flags.recursive = 1;
revs->diffopt.flags.tree_in_recursive = 1;
} else if (!strcmp(arg, "-m")) {
+ /*
+ * To "diff-index", "-m" means "match missing", and to the "log"
+ * family of commands, it means "show full diff for merges". Set
+ * both fields appropriately.
+ */
revs->ignore_merges = 0;
+ revs->match_missing = 1;
} else if ((argcount = parse_long_opt("diff-merges", argv, &optarg))) {
if (!strcmp(optarg, "off")) {
revs->ignore_merges = 1;
diff --git a/revision.h b/revision.h
index c1e5bcf139d7..f6bf860d19e5 100644
--- a/revision.h
+++ b/revision.h
@@ -188,6 +188,7 @@ struct rev_info {
unsigned int diff:1,
full_diff:1,
show_root_diff:1,
+ match_missing:1,
no_commit_id:1,
verbose_header:1,
combine_merges:1,
--
2.25.1
^ permalink raw reply related
* Re: [PATCH v3] revision: add separate field for "-m" of "diff-index -m"
From: Sergey Organov @ 2020-08-31 19:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git
In-Reply-To: <xmqq4koihgpk.fsf@gitster.c.googlers.com>
Junio C Hamano <gitster@pobox.com> writes:
> Sergey Organov <sorganov@gmail.com> writes:
>
>> Add separate 'diff_index_match_missing' field for diff-index to use
>> and set it
>> when we encounter "-m" option. This field won't then be cleared when another
>> meaning of "-m" is reverted (e.g., by "--no-diff-merges"), nor it will be
>> affected by future option(s) that might drive 'ignore_merges' field.
>>
>> Use this new field from diff-lib:do_oneway_diff() instead of reusing
>> 'ignore_merges' field.
>>
>> Signed-off-by: Sergey Organov <sorganov@gmail.com>
>> ---
>
> Much easier to reason about. As I said, I think we would ideally
> want to detect and diagnose --[no-]diff-merges on the command line
> of "diff" or "diff-{files,index,trees}" as an error, but for now
> this is a good first step.
>
>> } else if (!strcmp(arg, "-m")) {
>> revs->ignore_merges = 0;
>> + /*
>> + * Backward compatibility wart - "diff-index -m" does
>> + * not mean "do not ignore merges", but "match_missing",
>> + * so set separate flag for it.
>> + */
>> + revs->diff_index_match_missing = 1;
>
> Half the wart has been removed thanks to this patch and the rest of
> the code can look at the right field for their purpose. The parsing,
> unless we make a bigger change that allows us to detect and diagnose
> "diff-index --no-diff-merges" as an error, still needs to be tricky
> and may deserve a comment.
>
> The comment should apply to and treat both fields equally, perhaps
> like this:
>
> } else if (!strcmp(arg, "-m")) {
> /*
> * To "diff-index", "-m" means "match missing", and to
> * the "log" family of commands, it means "keep merges".
> * Set both fields appropriately.
> */
> revs->ignore_merges = 0;
> revs->match_missing = 1;
> }
>
> By the way, let's drop diff_index_ prefix from the name of the new
> field. I do not see a strong reason to object to a possible update
> to "diff-files" to match the behaviour of "diff-index".
>
> In a sparsely checked out working tree (e.g. start from "clone
> --no-checkout"), you can check out only paths that you want to
> modify, edit them, and then "git diff-files -m" would be able to
> show useful result without having to show deletions to all other
> files you are not interested in. And that is exactly the same use
> case as "git diff-index -m HEAD" was invented for.
Fine with me, thanks! I'll re-roll soon.
Thanks,
-- Sergey
^ permalink raw reply
* Re: [PATCH] Avoid infinite loop in malformed packfiles
From: Jeff King @ 2020-08-31 19:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: ori, René Scharfe, git
In-Reply-To: <xmqqk0xehj38.fsf@gitster.c.googlers.com>
On Mon, Aug 31, 2020 at 09:32:27AM -0700, Junio C Hamano wrote:
> > A related point is that delta chains might be composed of both types. If
> > we don't differentiate between the two types, then the limit is clearly
> > total chain length. If we do, then is the limit the total number of
> > ref-deltas found in the current lookup, or is it the number of
> > consecutive ref-deltas? I guess it would have to be the former if our
> > goal is to catch cycles (since a cycle could include an ofs-delta, as
> > long as a ref-delta is the part that forms the loop).
>
> Ah, OK, you've thought about it already.
>
> I wonder we can just count both and limit the chain length to the
> total number of objects in the pack we are currently looking at?
That's an interesting suggestion. Within a single pack, it does prevent
cycles, and it does so without needing a separate knob, which is nice.
As you note, it only works as long as packs aren't thin. That shouldn't
matter for the current scheme (where all on-disk packs are
self-contained with respect to deltas), but I do wonder if we'll
eventually want to support on-disk thin packs (coupled with a
multi-pack-index, that eliminates most of the reason that one needs
repack existing objects; it's probably a necessary step in scaling to
repos with hundreds of millions of objects). We could still auto-bound
it with the total number of packed objects in the repository, though.
> It
> guarantees to catch any cycle as long as pack is not thin, but is
> that too lenient and likely to bust the stack while counting? On
> the other side of the coin, we saw 10000 as a hard-coded limit in
> the patch, but do we know 10000 is low enough that most boxes have
> no trouble recursing that deep?
I don't think we have to worry about stack size. We already ran into
stack-busting problems with non-broken cases. ;) That led to 790d96c023
(sha1_file: remove recursion in packed_object_info, 2013-03-25) using
its own stack.
I do wonder about CPU, though. We might have tens of millions of objects
in a single pack file. How long does it take to convince ourselves we're
cycling (even if the cycle itself might only involve a handful of
objects)? I'm not sure we care too much about this being a fast
operation (after all, the point is that it should never happen and we're
just trying not to spin forever). But if it takes 60 minutes to detect
the cycle, from a user's perspective that might not be any different
than an infinite loop.
-Peff
^ permalink raw reply
* Re: [PATCH 3/5] worktree: teach "repair" to fix outgoing links to worktrees
From: Junio C Hamano @ 2020-08-31 19:07 UTC (permalink / raw)
To: Eric Sunshine; +Cc: Git List, Henré Botha, Jeff King
In-Reply-To: <CAPig+cT-w6LV490MGNyG_ihWkSzdgfnEBrjQCsafjndTRmMgFA@mail.gmail.com>
Eric Sunshine <sunshine@sunshineco.com> writes:
> No. For this repair to work, the gitfile in the linked worktree must
> be intact; it must be pointing back at the .git/worktrees/<id>
> directory so that "git worktree repair" can repair the corresponding
> .git/worktrees/<id>/gitdir file.
OK, I missed the fact that we can learn <id> quite easily if we have
the gitfile. Thanks.
^ permalink raw reply
* Re: [PATCH v2 0/5] add "git worktree repair" command
From: Junio C Hamano @ 2020-08-31 18:59 UTC (permalink / raw)
To: Eric Sunshine
Cc: git, Henré Botha, Jeff King, Johannes Schindelin,
Ramsay Jones
In-Reply-To: <20200831065800.62502-1-sunshine@sunshineco.com>
Eric Sunshine <sunshine@sunshineco.com> writes:
> This is a re-roll of [1] which adds a "git worktree repair" command
> and which fixes bugs with "git init --separate-git-dir" related to
> worktrees.
These look good to me. Thanks.
^ permalink raw reply
* Re: [PATCH] t3200: clean side effect of git checkout --orphan
From: Junio C Hamano @ 2020-08-31 18:21 UTC (permalink / raw)
To: Eric Sunshine; +Cc: Aaron Lipman, Git List
In-Reply-To: <CAPig+cQeYz9Mh+26YshuCQSzXCCUyKNGGr1wJ3FNNLpf=9QRuw@mail.gmail.com>
Eric Sunshine <sunshine@sunshineco.com> writes:
> On Sat, Aug 29, 2020 at 6:57 PM Aaron Lipman <alipman88@gmail.com> wrote:
>> The "refuse --edit-description on unborn branch for now" test in t3200
>> switches to an orphan branch, causing subsequent git commands
>> referencing HEAD to fail. Avoid this side-effect by switching back to
>> master after that test finishes.
>>
>> This has gone undetected, as the next effected test expects failure -
>> but it currently fails for the wrong reason.
>
> s/effected/affected
>
> In fact, the three tests following the orphan test all expect failure
> (though I didn't check if they also fail for the wrong reason), and
> the following test which doesn't expect failure has an explicit "git
> checkout master" early on, which explains why it was not impacted by
> this problem.
>
>> Verbose output of the next test referencing HEAD,
>> "--merged is incompatible with --no-merged":
>>
>> fatal: malformed object name HEAD
>>
>> Which this commit corrects to:
>>
>> error: option `no-merged' is incompatible with --merged
>>
>> Signed-off-by: Aaron Lipman <alipman88@gmail.com>
>
> Description and actual fix make perfect sense.
Yeah, looks good. Of course, the affected test can be made more
defensive to protect the precondition it relies on from getting
broken by other tests (i.e. if it refers to HEAD, it should make
sure it is on an actual commit). Each test cleaning up after
itself is a good discipline to have, but what is "clean" is quite
subjective and depends on each test piece in the script X-<.
Thanks, will queue.
^ permalink raw reply
* Re: Git in Outreachy?
From: Emily Shaffer @ 2020-08-31 18:05 UTC (permalink / raw)
To: Jeff King; +Cc: git, Christian Couder, Johannes Schindelin
In-Reply-To: <20200828065609.GA2105118@coredump.intra.peff.net>
On Fri, Aug 28, 2020 at 02:56:09AM -0400, Jeff King wrote:
>
> Are we interested in participating in the December 2020 round of
> Outreachy? The community application period is now open.
>
> I can look into lining up funding, but we'd also need:
>
> - volunteers to act as mentors
I'm interested in mentoring or co-mentoring this term. (I'm not working
this week but I didn't want to miss a deadline to volunteer.)
>
> - updates to our applicant materials (proposed projects, but also
> microproject / patch suggestions)
One thought I had was whether it might be cool to shop for another
co-mentor from Wireshark and have an intern teach Wireshark how to
decipher Git protocol. But it seems large, and maybe last-minute for
this term.
- Emily
^ permalink raw reply
* Re: [PATCH v1 3/3] git: catch an attempt to run "git-foo"
From: Junio C Hamano @ 2020-08-31 17:45 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Jeff King
In-Reply-To: <nycvar.QRO.7.76.6.2008311156020.56@tvgsbejvaqbjf.bet>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> Hi Junio,
>
> On Fri, 28 Aug 2020, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> > ---
>> > git.c | 3 ++-
>> > help.c | 5 ++++-
>> > 2 files changed, 6 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/git.c b/git.c
>> > index 71ef4835b20e..863fd0c58a66 100644
>> > --- a/git.c
>> > +++ b/git.c
>> > @@ -851,7 +851,8 @@ int cmd_main(int argc, const char **argv)
>> > * that one cannot handle it.
>> > */
>> > if (skip_prefix(cmd, "git-", &cmd)) {
>> > - warn_on_dashed_git(argv[0]);
>> > + strip_extension(&cmd);
>> > + warn_on_dashed_git(cmd);
>>
>> The argv[0] may have been NULL from the beginning of cmd_main(), and
>> cmd would be "git-help" in such a case. We used to pass NULL to
>> warn_on_dashed_git() in such a case because "git-help" is not what the
>> user typed, but what we internally trigger, so we didn't want
>> warn_on_dashed_git() to do anything based on that internal string.
>
> True. The test suite passes with this, though, which means we haven't
> covered that case.
Yup, it would be a good thing to add to our tests, with or without
this patch.
>> How does your handle_builtin() work, by the way?
>>
>> The original code (even before we added warn_on_dashed_git() in this
>> codepath) did not do any strip_extension(), so handle_builtin() can
>> take commands with ".exe" suffix, but now we are feeding the result
>> of strip_extension() to it, so it can deal with both?
>
> Yes, it can deal with both. It calls `strip_extension()`, which on Windows
> removes the `.exe` suffix, if found.
>
>> Sounds convenient and sloppy (not the handle_builtin's
>> implementation, but its callers that sometimes feeds the full
>> executable name, and feeds only the basename some other times) at
>> the same time.
>
> Right, it does not _require_ the extension. I do not necessarily agree
> that it is sloppy, but I do agree that it is convenient ;-)
Being lenient to its input is not sloppy.
It just is by being convenient, it allows its callers to be sloppy,
which may hurt them as not all callees they call may not be as
lenient as handle_builtin(), which is the only downside I would be
worried about. Nothing big.
thanks.
^ permalink raw reply
* Re: Git in Outreachy?
From: Junio C Hamano @ 2020-08-31 17:41 UTC (permalink / raw)
To: Jeff King; +Cc: git, Christian Couder, Johannes Schindelin
In-Reply-To: <20200828065609.GA2105118@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> Are we interested in participating in the December 2020 round of
> Outreachy? The community application period is now open.
>
> I can look into lining up funding, but we'd also need:
>
> - volunteers to act as mentors
>
> - updates to our applicant materials (proposed projects, but also
> microproject / patch suggestions)
>
> -Peff
FWIW, I am interested in seeing this project participating. As
usual, I won't be able to mentor to avoid biases, though.
As to microprojects, I think we saw #leftoverbits and #micrproject
sprinkled in a handful of messages in recent discussions, so with
the help of list archive, we may come up with new ones.
Thanks.
^ permalink raw reply
* Re: [PATCH v2 2/2] wt-status: tolerate dangling marks
From: Junio C Hamano @ 2020-08-31 17:37 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, jrnieder
In-Reply-To: <20200831171732.1199755-1-jonathantanmy@google.com>
Jonathan Tan <jonathantanmy@google.com> writes:
>> For example, what is the reasoning behind making dwim_ref() unable
>> to ask the "do so gently" variant, while allowing repo_dwim_ref()
>> to?
>>
>> I am NOT necessarily saying these two functions MUST be able to
>> access the same set of features and the only difference between them
>> MUST be kept to the current "repo_* variant can work on an arbitrary
>> repository, while the variant without repo_* would work on the
>> primary repository only". As long as there is a good reason to make
>> their power diverge, it is OK---I just do not see why and I'd like
>> to know.
>
> Maybe add the following to the end of the last paragraph of the commit
> message:
>
> (dwim_ref() is unchanged because I expect more and more code to use
> repo_dwim_ref(), and to reduce the size of the diff.)
If that is the reasoning, I totally disagree. We may be staring
problems in submodules too long to think everybody should use the
variant with repo_ prefix, but a single repository operations will
continue to exist, and when you know you only need to access the
current repository, not having to use the function with longer name
without having to pass an extra parameter is a plus.
I even wonder why dwim_ref() is not defined like so in a header:
#define dwim_ref(s, l, o, r) \
repo_dwim_ref(the_repository, (s), (l), (o), (r))
Which would force a change like the one we are discussing to keep
them in parallel and keep the promise that only difference between
the two is that dwim_ref() works with the_repository and the other
can take an arbitrary repository. Perhaps that can be a preliminary
clean-up patch before these two patches ;-)
Doing so would mean that coders have to remember one fewer thing
than "dwim_ref(), not only cannot take a repository pointer, cannot
be told to report error gently". The worst part is that we know
that the difference will GROW, because the purpose of adding the
extra option argument to the API is exactly to make it easy to do
so.
Reducing the size of the diff is a good justification only when the
end result is the same. If it burdens future developers more, that
is "I write less at the expense of forcing others to write more",
which is not quite the same thing.
Thanks.
^ permalink raw reply
* Re: [PATCH v3] revision: add separate field for "-m" of "diff-index -m"
From: Junio C Hamano @ 2020-08-31 17:23 UTC (permalink / raw)
To: Sergey Organov; +Cc: Jeff King, git
In-Reply-To: <20200831125350.26472-1-sorganov@gmail.com>
Sergey Organov <sorganov@gmail.com> writes:
> Add separate 'diff_index_match_missing' field for diff-index to use and set it
> when we encounter "-m" option. This field won't then be cleared when another
> meaning of "-m" is reverted (e.g., by "--no-diff-merges"), nor it will be
> affected by future option(s) that might drive 'ignore_merges' field.
>
> Use this new field from diff-lib:do_oneway_diff() instead of reusing
> 'ignore_merges' field.
>
> Signed-off-by: Sergey Organov <sorganov@gmail.com>
> ---
Much easier to reason about. As I said, I think we would ideally
want to detect and diagnose --[no-]diff-merges on the command line
of "diff" or "diff-{files,index,trees}" as an error, but for now
this is a good first step.
> } else if (!strcmp(arg, "-m")) {
> revs->ignore_merges = 0;
> + /*
> + * Backward compatibility wart - "diff-index -m" does
> + * not mean "do not ignore merges", but "match_missing",
> + * so set separate flag for it.
> + */
> + revs->diff_index_match_missing = 1;
Half the wart has been removed thanks to this patch and the rest of
the code can look at the right field for their purpose. The parsing,
unless we make a bigger change that allows us to detect and diagnose
"diff-index --no-diff-merges" as an error, still needs to be tricky
and may deserve a comment.
The comment should apply to and treat both fields equally, perhaps
like this:
} else if (!strcmp(arg, "-m")) {
/*
* To "diff-index", "-m" means "match missing", and to
* the "log" family of commands, it means "keep merges".
* Set both fields appropriately.
*/
revs->ignore_merges = 0;
revs->match_missing = 1;
}
By the way, let's drop diff_index_ prefix from the name of the new
field. I do not see a strong reason to object to a possible update
to "diff-files" to match the behaviour of "diff-index".
In a sparsely checked out working tree (e.g. start from "clone
--no-checkout"), you can check out only paths that you want to
modify, edit them, and then "git diff-files -m" would be able to
show useful result without having to show deletions to all other
files you are not interested in. And that is exactly the same use
case as "git diff-index -m HEAD" was invented for.
Thanks.
> diff --git a/revision.h b/revision.h
> index c1e5bcf139d7..5ae8254ffaed 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -188,6 +188,7 @@ struct rev_info {
> unsigned int diff:1,
> full_diff:1,
> show_root_diff:1,
> + diff_index_match_missing:1,
> no_commit_id:1,
> verbose_header:1,
> combine_merges:1,
^ permalink raw reply
* Re: [PATCH v2 2/2] wt-status: tolerate dangling marks
From: Jonathan Tan @ 2020-08-31 17:17 UTC (permalink / raw)
To: gitster; +Cc: jonathantanmy, git, jrnieder
In-Reply-To: <xmqqo8mti8od.fsf@gitster.c.googlers.com>
> Jonathan Tan <jonathantanmy@google.com> writes:
>
> > +
> > + /*
> > + * If ^{upstream} or ^{push} (or equivalent) is requested, and the
> > + * branch in question does not have such a reference, return -1 instead
> > + * of die()-ing.
> > + */
> > + unsigned nonfatal_dangling_mark : 1;
>
> Micronit; I would have avoided "or equivalent" as the point of
> parenthetical comment was not to say these two modifiers upstream
> and push (and other forms that spell differently but invokes exactly
> one of these two features) are special, but to say that these two
> are merely examples, and any other ^{modifiers} we have or we will
> add in the future would honor this bit. Perhaps "(and the like)"?
"(and the like)" sounds good.
> Among these callers that reach substitute_branch_name(), how were
> those that can specify the new bit chosen?
I just did the minimal change to fix the bug in the test.
> For example, what is the reasoning behind making dwim_ref() unable
> to ask the "do so gently" variant, while allowing repo_dwim_ref()
> to?
>
> I am NOT necessarily saying these two functions MUST be able to
> access the same set of features and the only difference between them
> MUST be kept to the current "repo_* variant can work on an arbitrary
> repository, while the variant without repo_* would work on the
> primary repository only". As long as there is a good reason to make
> their power diverge, it is OK---I just do not see why and I'd like
> to know.
Maybe add the following to the end of the last paragraph of the commit
message:
(dwim_ref() is unchanged because I expect more and more code to use
repo_dwim_ref(), and to reduce the size of the diff.)
If you prefer not to make this change locally, let me know and I'll
resend one with the updated commit message and the "(and the like)"
change above.
> The same question about not allowing the gentler variant while
> drimming the reflog.
Same as above - I only did the minimal change to fix the bug.
Admittedly, if a reflog-accessing command could fail in the same way
(dangling mark), we would need to fix repo_dwim_log() and then we could
simplify substitute_branch_name() to not take the nonfatal_dangling_mark
parameter (since all dangling marks would now be nonfatal), but I
haven't looked beyond this.
^ permalink raw reply
* Re: [PATCH v6 06/13] bisect--helper: reimplement `bisect_next` and `bisect_auto_next` shell functions in C
From: Junio C Hamano @ 2020-08-31 17:09 UTC (permalink / raw)
To: Miriam R.; +Cc: git
In-Reply-To: <CAN7CjDBd=41PQ5qfqazdtx4uoRcfcc6cUWf5u0cNiooSo24ENg@mail.gmail.com>
"Miriam R." <mirucam@gmail.com> writes:
> I don't know if it is something related to my compiler but my -Werror
> build does not break.
FWIW, here is a full compilation command line from "make V=1" for
the file:
cc -o builtin/bisect--helper.o -c -MF builtin/.depend/bisect--helper.o.d -MQ builtin/bisect--helper.o -MMD -MP -Werror -Wall -Wdeclaration-after-statement -Wformat-security -Wold-style-definition -Woverflow -Wpointer-arith -Wstrict-prototypes -Wunused -Wvla -Wextra -Wmissing-prototypes -Wno-empty-body -Wno-missing-field-initializers -Wno-sign-compare -Wno-unused-parameter -g -O2 -Wall -I. -DHAVE_SYSINFO -DGIT_HOST_CPU="\"x86_64\"" -DHAVE_ALLOCA_H -DUSE_CURL_FOR_IMAP_SEND -DSHA1_DC -DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 -DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" -DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\"" -DSHA256_BLK -DHAVE_PATHS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM '-DPROCFS_EXECUTABLE_PATH="/proc/self/exe"' -DFREAD_READS_DIRECTORIES -DNO_STRLCPY -DSHELL_PATH='"/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"' builtin/bisect--helper.c
builtin/bisect--helper.c: In function 'bisect_next':
builtin/bisect--helper.c:570:6: error: variable 'no_checkout' set but not used [-Werror=unused-but-set-variable]
570 | int no_checkout;
| ^~~~~~~~~~~
cc1: all warnings being treated as errors
^ permalink raw reply
* Re: [PATCH v2] revision: add separate field for "-m" of "diff-index -m"
From: Junio C Hamano @ 2020-08-31 17:00 UTC (permalink / raw)
To: Sergey Organov; +Cc: Jeff King, git
In-Reply-To: <87a6ybugpe.fsf@osv.gnss.ru>
Sergey Organov <sorganov@gmail.com> writes:
> $ git diff-index -m --no-diff-merges HEAD
> :100644 000000 4aec621a6d1a9a5892f0b4b6feb2ed329fd04bf2 0000000000000000000000000000000000000000 D main/main.cc
At the first glance, this looked like a good justification for this
patch.
> If you say "compatibility wart" is not a trouble by itself, -- I'm fine
> with it, -- then "more" in my commit message is misplaced indeed.
Yeah, when I wrote the "compatibility wart" comment originally, I
was describing "this needs a tricky code because two independent
options happen to share the command line parser" and nothing more.
I was not reacting to "more", by the way. I was reacting the lack
of concrete problem description. "A '-m' option given to the
'diff-index' command can be defeated by giving '--no-diff-merges'
later" you showed above can be a good replacement for "causes more
troubles".
But in the ideal world, "--[no-]diff-merges" should be rejected as
an irrelevant/unrecognised option to the "diff" family of commands
(as I said in the message you are responding to, it is only relevant
to the "log" family of commands where the diff machinery is solely
to compare between (some of) its parents and in that context, what,
if anything, kind of special treatment is made for merge commits
makes sense as an optional instruction to the command). Splitting
the field into two fields, setting both fields upon "-m" but
toggling only one with longhand "--[no-]diff-merges" would allow the
code to notice and make the above command line silently turn the
"--[no-]diff-merges" into a no-op, so in that sense it would be a
good first step, but an ideal solution would probably need to know
if we are parsing for the "log" family or for the "diff" family and
error out upon seeing a "log"-only option like "--[no-]diff-merges"
when checking the command line option for "diff".
> This change has nothing to do with defaults. It rather about correct and
> clear code.
OK, I misread your intention. Sorry about that.
Thanks.
^ permalink raw reply
* Re: [PATCH] Avoid infinite loop in malformed packfiles
From: ori @ 2020-08-31 16:50 UTC (permalink / raw)
To: peff, ori; +Cc: gitster, l.s.r, git
In-Reply-To: <20200831092946.GA2812764@coredump.intra.peff.net>
> On Sun, Aug 30, 2020 at 09:15:10AM -0700, Junio C Hamano wrote:
>
>> René Scharfe <l.s.r@web.de> writes:
>>
>> >> Will that work? I'd expect that modern pack files end up being
>> >> offset deltas, rather than reference deltas.
>> >
>> > True, but going down all the way would work:
>>
>> Perhaps, but I'd rather use pack-objects to prepare the repository
>> with no-delta-base-offset to force ref deltas.
>
> Yeah, that seems like a much better test setup.
>
> It does raise an interesting question, though. I had imagined we would
> limit the depth of all delta chains here, not just ref-deltas. But it is
> true that ofs deltas can't cycle. Without cycles, neither type can go on
> indefinitely (they are limited by the number of entries in the
> packfile). I could see arguments going either way:
Yeah -- that's what I'd implemented. I was just thinking that I'd want to
test the issue that caused the problem in the first place, but it's the
same code path either way.
I like the idea of limiting to the total number of objects in the
pack. If we do that, we don't need a knob at all, since if we need
more objects in the stack than are in the pack, it's obviously
invalid.
That does eliminate an obvious way to test things, and we'd need
to provide in an invalid pack file.
^ permalink raw reply
* Re: [PATCH] Avoid infinite loop in malformed packfiles
From: Junio C Hamano @ 2020-08-31 16:32 UTC (permalink / raw)
To: Jeff King; +Cc: ori, René Scharfe, git
In-Reply-To: <20200831092946.GA2812764@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> It does raise an interesting question, though. I had imagined we would
> limit the depth of all delta chains here, not just ref-deltas. But it is
> true that ofs deltas can't cycle. Without cycles, neither type can go on
> indefinitely (they are limited by the number of entries in the
> packfile). I could see arguments going either way:
>
> - ofs deltas cannot cycle, so we do not need a counter that limits
> them (and which _could_ find a false positive). So we should not
> limit them.
>
> - a counter is preventing us from following cycles indefinitely, but
> also hardening us against misbehavior due to bugs or insanely large
> delta chains (intentional or not). So we should include ofs deltas
> in our limit.
A chain can have both types, so I am fuzzy how the counting would
go. We just do not count ofs_delta at all and only count ref_delta
we've seen during the recursion?
> A related point is that delta chains might be composed of both types. If
> we don't differentiate between the two types, then the limit is clearly
> total chain length. If we do, then is the limit the total number of
> ref-deltas found in the current lookup, or is it the number of
> consecutive ref-deltas? I guess it would have to be the former if our
> goal is to catch cycles (since a cycle could include an ofs-delta, as
> long as a ref-delta is the part that forms the loop).
Ah, OK, you've thought about it already.
I wonder we can just count both and limit the chain length to the
total number of objects in the pack we are currently looking at? It
guarantees to catch any cycle as long as pack is not thin, but is
that too lenient and likely to bust the stack while counting? On
the other side of the coin, we saw 10000 as a hard-coded limit in
the patch, but do we know 10000 is low enough that most boxes have
no trouble recursing that deep?
Thanks.
^ permalink raw reply
* Re: [GSoC][PATCH] submodule: port submodule subcommand 'add' from shell to C
From: Shourya Shukla @ 2020-08-31 13:04 UTC (permalink / raw)
To: Kaartic Sivaraam
Cc: gitster, git, christian.couder, johannes.schindelin, liu.denton
In-Reply-To: <ce151a1408291bb0991ce89459e36ee13ccdfa52.camel@gmail.com>
On 31/08 01:28, Kaartic Sivaraam wrote:
> On Wed, 2020-08-26 at 14:45 +0530, Shourya Shukla wrote:
> > On 24/08 11:35, Junio C Hamano wrote:
> > > Shourya Shukla <shouryashukla.oo@gmail.com> writes:
> > >
> > > The shell version would error out with anything in the index, so I'd
> > > expect that a faithful conversion would not call is_directory() nor
> > > submodule_from_path() at all---it would just look path up in the_index
> > > and complains if anything is found. For example, the quoted part in
> > > the original above is what gives the error message when I do
> > >
> > > $ git submodule add ./Makefile
> > > 'Makefile' already exists in the index.
> > >
> > > I think. And the above code won't trigger the "already exists" at
> > > all because 'path' is not a directory.
> >
> > Alright. That is correct. I tried to use a multitude of functions but
> > did not find luck with any of them. The functions I tried:
> >
>
> It would've been nice to see the actual code you tried so that it's
> easier for others to more easily identify if you're using the wrong
> function or using the correct function in the wrong way.
Yeah, that is my fault. I will tag along below.
> > - index_path() to check if the path is in the index. For some
> > reason, it switched to the 'default' case and return the
> > 'unsupported file type' error.
> >
> > - A combination of doing an OR with index_file_exists() and
> > index_dir_exists(). Still no luck. t7406.43 fails.
> >
> > - Using index_name_pos() along with the above two functions. Again a
> > failure in the same test.
> >
> > I feel that index_name_pos() should suffice this task but it fails in
> > t7406.43. The SM is in index since 'git ls-files --error-unmatch s1'
> > does return 's1' (s1 is the submodule). What am I missing here?
> >
>
> You're likely missing the fact that you should call `read_cache` before
> using `index_name_pos` or the likes of it.
Alright, called it.
> For instance, the following works without issues for most cases (more
> on that below):
>
> if (read_cache() < 0)
> die(_("index file corrupt"));
>
> cache_pos = cache_name_pos(path, strlen(path));
> if (cache_pos >= 0) {
> if (!force) {
> die(_("'%s' already exists in the index"),
> path);
> }
> else {
> struct cache_entry *ce = the_index.cache[cache_pos];
>
> if (!S_ISGITLINK(ce->ce_mode))
> die(_("'%s' already exists in the index and is not a "
> "submodule"), path);
> }
> }
I actually did this only using 'index_*()' functions. But made a very
very very silly mistake:
I did a sizeof() instead of strlen() and I did not notice this until
I saw what you did. IDK how I made this mistake.
This is what I have done finally:
---
if (read_cache() < 0)
die(_("index file corrupt"));
if (!force) {
if (cache_file_exists(path, strlen(path), ignore_case) ||
cache_dir_exists(path, strlen(path)))
die(_("'%s' already exists in the index"), path);
} else {
int cache_pos = cache_name_pos(path, strlen(path));
struct cache_entry *ce = the_index.cache[cache_pos];
if (cache_pos >= 0 && !S_ISGITLINK(ce->ce_mode))
die(_("'%s' already exists in the index and is not a "
"submodule"), path);
}
---
I did not put the 'cache_pos >= 0' at the start since I thought that it
will unnecessarily increase an indentation level. Since we are using
'cache_{file,dir}_exists' in the first check and 'cache_name_pos()' in
the second, the placement of check at another indentation level would be
unnecessary. What do you think about this?
> This is more close to what the shell version did but misses one case
> which might or might not be covered by the test suite[1]. The case when
> path is a directory that has tracked contents. In the shell version we
> would get:
>
> $ git submodule add ../git-crypt/ builtin
> 'builtin' already exists in the index
> $ git submodule add --force ../git-crypt/ builtin
> 'builtin' already exists in the index and is not a submodule
>
> In the C version with the above snippet we get:
>
> $ git submodule add --force ../git-crypt/ builtin
> fatal: 'builtin' does not have a commit checked out
> $ git submodule add ../git-crypt/ builtin
> fatal: 'builtin' does not have a commit checked out
>
> That's not appropriate and should be fixed. I believe we could do
> something with `cache_dir_exists` to fix this.
>
>
> Footnote
> ===
>
> [1]: If it's not covered already, it might be a good idea to add a test
> for the above case.
Like Junio said, we do not care if it is a file or a directory of any
sorts, we will give the error if it already exists. Therefore, even if
it is an untracked or a tracked one, it should not matter to us. Hence
testing for it may not be necessary is what I feel. Why should we test
it?
^ permalink raw reply
* [PATCH v3] revision: add separate field for "-m" of "diff-index -m"
From: Sergey Organov @ 2020-08-31 12:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git, Sergey Organov
In-Reply-To: <20200829194634.23306-1-sorganov@gmail.com>
Add separate 'diff_index_match_missing' field for diff-index to use and set it
when we encounter "-m" option. This field won't then be cleared when another
meaning of "-m" is reverted (e.g., by "--no-diff-merges"), nor it will be
affected by future option(s) that might drive 'ignore_merges' field.
Use this new field from diff-lib:do_oneway_diff() instead of reusing
'ignore_merges' field.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
---
v3: improve commit message
v2: rebased from 'maint' onto 'master'
diff-lib.c | 10 ++--------
revision.c | 6 ++++++
revision.h | 1 +
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/diff-lib.c b/diff-lib.c
index 50521e2093fc..f2aee78e7aa2 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -405,14 +405,8 @@ static void do_oneway_diff(struct unpack_trees_options *o,
/* if the entry is not checked out, don't examine work tree */
cached = o->index_only ||
(idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx)));
- /*
- * Backward compatibility wart - "diff-index -m" does
- * not mean "do not ignore merges", but "match_missing".
- *
- * But with the revision flag parsing, that's found in
- * "!revs->ignore_merges".
- */
- match_missing = !revs->ignore_merges;
+
+ match_missing = revs->diff_index_match_missing;
if (cached && idx && ce_stage(idx)) {
struct diff_filepair *pair;
diff --git a/revision.c b/revision.c
index 96630e31867d..64b16f7d1033 100644
--- a/revision.c
+++ b/revision.c
@@ -2345,6 +2345,12 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
revs->diffopt.flags.tree_in_recursive = 1;
} else if (!strcmp(arg, "-m")) {
revs->ignore_merges = 0;
+ /*
+ * Backward compatibility wart - "diff-index -m" does
+ * not mean "do not ignore merges", but "match_missing",
+ * so set separate flag for it.
+ */
+ revs->diff_index_match_missing = 1;
} else if ((argcount = parse_long_opt("diff-merges", argv, &optarg))) {
if (!strcmp(optarg, "off")) {
revs->ignore_merges = 1;
diff --git a/revision.h b/revision.h
index c1e5bcf139d7..5ae8254ffaed 100644
--- a/revision.h
+++ b/revision.h
@@ -188,6 +188,7 @@ struct rev_info {
unsigned int diff:1,
full_diff:1,
show_root_diff:1,
+ diff_index_match_missing:1,
no_commit_id:1,
verbose_header:1,
combine_merges:1,
--
2.25.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox