From: Junio C Hamano <gitster@pobox.com>
To: "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Tao Klerks" <tao@klerks.biz>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH v5 2/2] untracked-cache: support '--untracked-files=all' if configured
Date: Tue, 29 Mar 2022 10:43:35 -0700 [thread overview]
Message-ID: <xmqqsfr0u01k.fsf@gitster.g> (raw)
In-Reply-To: <f60d2c6e36c3218f9b19d7ce62a090d7d6e0e7f6.1648553134.git.gitgitgadget@gmail.com> (Tao Klerks via GitGitGadget's message of "Tue, 29 Mar 2022 11:25:34 +0000")
"Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Tao Klerks <tao@klerks.biz>
>
> Untracked cache was originally designed to only work with
> '--untracked-files=normal', but this causes performance issues for UI
> tooling that wants to see "all" on a frequent basis. On the other hand,
> the conditions that prevented applicability to the "all" mode no
> longer seem to apply.
There is a slight leap in logic flow before ", but this causes" that
forces readers read the above twice and guess what is missing. I am
guessing that
... was designed to only work with "--untracked-files=normal",
and is bypassed when "--untracked-files=all" request, but this
causes ...
is what you meant to say.
The claim in the last sentence "... no longer seem to apply" is
thrown at the readers without much rationale. Either justify it
more solidly or discard the claim?
> The disqualification of untracked cache has a particularly significant
> impact on Windows with the defaulted fscache, where the introduction
> of fsmonitor can make the first and costly directory-iteration happen
> in untracked file detection, single-threaded, rather than in
> preload-index on multiple threads. Improving the performance of a
> "normal" 'git status' run with fsmonitor can make
> 'git status --untracked-files=all' perform much worse.
The last sentence is a bit hard to parse and very hard to reason
about. Do you mean to say "--untracked-files=all is slower when the
untracked cache is in use, so the performance of 'git status' may be
improved for '--untracked-files=normal' when the untracked cache is
used, it hurts when 'git status --untracked-files=all' is run"?
> To partially address this, align the supported directory flags for the
> stored untracked cache data with the git config. If a user specifies
> an '--untracked-files=' commandline parameter that does not align with
> their 'status.showuntrackedfiles' config value, then the untracked
> cache will be ignored - as it is for other unsupported situations like
> when a pathspec is specified.
>
> If the previously stored flags no longer match the current
> configuration, but the currently-applicable flags do match the current
> configuration, then discard the previously stored untracked cache
> data.
Let me follow what actually happens in the patch aloud.
> -static void new_untracked_cache(struct index_state *istate)
> +static unsigned new_untracked_cache_flags(struct index_state *istate)
> +{
> + struct repository *repo = istate->repo;
> + char *val;
> +
> + /*
> + * This logic is coordinated with the setting of these flags in
> + * wt-status.c#wt_status_collect_untracked(), and the evaluation
> + * of the config setting in commit.c#git_status_config()
> + */
> + if (!repo_config_get_string(repo, "status.showuntrackedfiles", &val) &&
> + !strcmp(val, "all"))
> + return 0;
> +
> + /*
> + * The default, if "all" is not set, is "normal" - leading us here.
> + * If the value is "none" then it really doesn't matter.
> + */
> + return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
> +}
This _guesses_ the user preference from the configuration.
> +static void new_untracked_cache(struct index_state *istate, int flags)
> {
> struct untracked_cache *uc = xcalloc(1, sizeof(*uc));
> strbuf_init(&uc->ident, 100);
> uc->exclude_per_dir = ".gitignore";
> - /* should be the same flags used by git-status */
> - uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
> + uc->dir_flags = flags >= 0 ? flags : new_untracked_cache_flags(istate);
We use unsigned for the flag word unless there is a reason not to,
but this function wants to use top-bit as a signal to "guess from
config". The caller either dictates what bits to set, or we guess.
And the created uc records the flags.
> @@ -2762,11 +2782,11 @@ static void new_untracked_cache(struct index_state *istate)
> void add_untracked_cache(struct index_state *istate)
> {
> if (!istate->untracked) {
> - new_untracked_cache(istate);
> + new_untracked_cache(istate, -1);
We are newly creating, so "-1" (guess from config) may be the most
appropriate (unless the caller of add_untracked_cache() already
knows what operation it is using for, that is).
> } else {
> if (!ident_in_untracked(istate->untracked)) {
Found an existing one but need to recreate.
> free_untracked_cache(istate->untracked);
> - new_untracked_cache(istate);
> + new_untracked_cache(istate, -1);
In this case, is it likely to give us a better guess to read the
configuration, or copy from the existing untracked file? My gut
feeling says it would be the latter, and if that is correct, then
we might want
+ int old_flags = istate->untracked->dir_flags;
free_untracked_cache(istate->untracked);
- new_untracked_cache(istate);
+ new_untracked_cache(istate, old_flags);
instead? I dunno.
> @@ -2781,9 +2801,9 @@ void remove_untracked_cache(struct index_state *istate)
> }
>
> static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir,
> - int base_len,
> - const struct pathspec *pathspec,
> - struct index_state *istate)
> + int base_len,
> + const struct pathspec *pathspec,
> + struct index_state *istate)
> {
> struct untracked_cache_dir *root;
> static int untracked_cache_disabled = -1;
Is this just re-indenting? Not very welcome, but OK.
> @@ -2814,17 +2834,9 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
> if (base_len || (pathspec && pathspec->nr))
> return NULL;
>
> - /* Different set of flags may produce different results */
> - if (dir->flags != dir->untracked->dir_flags ||
> - /*
> - * See treat_directory(), case index_nonexistent. Without
> - * this flag, we may need to also cache .git file content
> - * for the resolve_gitlink_ref() call, which we don't.
> - */
> - !(dir->flags & DIR_SHOW_OTHER_DIRECTORIES) ||
> - /* We don't support collecting ignore files */
> - (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO |
> - DIR_COLLECT_IGNORED)))
> + /* We don't support collecting ignore files */
> + if (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO |
> + DIR_COLLECT_IGNORED))
> return NULL;
>
> /*
> @@ -2847,6 +2859,41 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
> return NULL;
> }
>
> + /*
> + * We don't support using or preparing the untracked cache if
> + * the current effective flags don't match the configured
> + * flags.
> + */
Is that what we want? What happens when your user does this:
- set showuntrackedfiles to "normal"
- generate the untracked cache
- set showuntrackedfiles to "all"
And now the user wants to make a request that wants to see flags
that are suitable for "normal".
The best case would be for the untracked cache to know what flags
were in use when it was generated, so that in the above sequence,
even though the current value of configuration variable and the
current request from the command line contradict each other, we
can notice that the untracked cache data is suitable for the current
request and the right thing happens.
> + if (dir->flags != new_untracked_cache_flags(istate))
> + return NULL;
But this only pays attention to what is in the configuration? It
seems to me that it is being too pessimistic, but perhaps it is
necessary for correctness somehow?
Thanks.
next prev parent reply other threads:[~2022-03-29 17:43 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-23 6:44 [PATCH] Support untracked cache with '--untracked-files=all' if configured Tao Klerks via GitGitGadget
2022-02-25 17:52 ` [PATCH v2] untracked-cache: support " Tao Klerks via GitGitGadget
2022-02-25 19:43 ` Junio C Hamano
2022-02-27 11:21 ` Tao Klerks
2022-02-27 19:54 ` Junio C Hamano
2022-02-27 11:22 ` [PATCH v3] " Tao Klerks via GitGitGadget
2022-02-27 14:39 ` Tao Klerks
2022-02-27 15:13 ` [PATCH v4] " Tao Klerks via GitGitGadget
2022-02-28 14:02 ` Ævar Arnfjörð Bjarmason
2022-03-02 8:47 ` Tao Klerks
2022-03-29 11:25 ` [PATCH v5 0/2] Support untracked cache with " Tao Klerks via GitGitGadget
2022-03-29 11:25 ` [PATCH v5 1/2] untracked-cache: test untracked-cache-bypassing behavior with -uall Tao Klerks via GitGitGadget
2022-03-29 16:51 ` Junio C Hamano
2022-03-30 4:46 ` Tao Klerks
2022-03-30 16:39 ` Junio C Hamano
2022-03-31 5:15 ` Tao Klerks
2022-03-29 11:25 ` [PATCH v5 2/2] untracked-cache: support '--untracked-files=all' if configured Tao Klerks via GitGitGadget
2022-03-29 17:43 ` Junio C Hamano [this message]
2022-03-30 19:59 ` Tao Klerks
2022-03-31 16:02 ` [PATCH v6 0/2] Support untracked cache with " Tao Klerks via GitGitGadget
2022-03-31 16:02 ` [PATCH v6 1/2] untracked-cache: test untracked-cache-bypassing behavior with -uall Tao Klerks via GitGitGadget
2022-03-31 16:02 ` [PATCH v6 2/2] untracked-cache: support '--untracked-files=all' if configured Tao Klerks via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqsfr0u01k.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=tao@klerks.biz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).