From: "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Tao Klerks <tao@klerks.biz>
Subject: [PATCH v2 0/3] Empty untracked cache performance issue
Date: Fri, 25 Feb 2022 17:10:11 +0000 [thread overview]
Message-ID: <pull.986.v2.git.1645809015.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.986.git.1624559401.gitgitgadget@gmail.com>
This patchset addresses a performance issue with untracked cache. When a new
untracked cache structure is added to the index but remains empty,
subsequent "git status" calls populate it but do not write the index - so
they perform as though the index were disabled.
This situation can be caused in several different ways:
* Running "git update-index --untracked-cache" on a repo that did not have
the untracked cache
* Modifying the git configuration to enable untracked cache, but then
immediately running "git status -uall", causing the untracked cache to be
created, but not used/populated (and the index written).
* (likely others)
The patchset includes fixes to t7519, which otherwise starts failing because
it wasn't testing what it intended to.
Tao Klerks (3):
t7519: avoid file to index mtime race for untracked cache
t7519: populate untracked cache before test
untracked-cache: write index when populating empty untracked cache
dir.c | 10 +++++++---
t/t7519-status-fsmonitor.sh | 7 +++++++
2 files changed, 14 insertions(+), 3 deletions(-)
base-commit: 4c53a8c20f8984adb226293a3ffd7b88c3f4ac1a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-986%2FTaoK%2Ftaok-empty-untracked-cache-bug-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-986/TaoK/taok-empty-untracked-cache-bug-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/986
Range-diff vs v1:
1: c4d2a3cab4b ! 1: 9421b71540d Add a second's delay to t7519 for untracked cache
@@ Metadata
Author: Tao Klerks <tao@klerks.biz>
## Commit message ##
- Add a second's delay to t7519 for untracked cache
+ t7519: avoid file to index mtime race for untracked cache
In t7519 there is a test that writes files to disk, and immediately
- writes the untracked index. Because of mtime-comparison logic
- that uses a 1-second resolution, this means the cached entries
- are not trusted/used under some circumstances (see
- read-cache.c#is_racy_stat()).
+ writes the untracked index. Because of mtime-comparison logic that
+ uses a 1-second resolution, this means the cached entries are not
+ trusted/used under some circumstances
+ (see read-cache.c#is_racy_stat()).
- Untracked cache tests in t7063 use a 1-second delay to avoid
- this issue. We should do the same here.
+ Untracked cache tests in t7063 use a 1-second delay to avoid this
+ issue, but we don't want to introduce arbitrary slowdowns, so instead
+ use test-tool chmtime to backdate the files slightly.
- This change doesn't actually affect the outcome of the test,
- but does enhance its validity, and becomes relevant after
- later changes
+ This change doesn't actually affect the outcome of the test, but does
+ enhance its validity, and becomes relevant after later changes.
Signed-off-by: Tao Klerks <tao@klerks.biz>
## t/t7519-status-fsmonitor.sh ##
-@@ t/t7519-status-fsmonitor.sh: clean_repo () {
- git clean -fd
- }
-
-+avoid_racy() {
-+ sleep 1
-+}
-+
- dirty_repo () {
- : >untracked &&
- : >dir1/untracked &&
@@ t/t7519-status-fsmonitor.sh: test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR'
+ cd dot-git &&
+ mkdir -p .git/hooks &&
+ : >tracked &&
++ test-tool chmtime =-60 tracked &&
+ : >modified &&
++ test-tool chmtime =-60 modified &&
+ mkdir dir1 &&
+ : >dir1/tracked &&
++ test-tool chmtime =-60 dir1/tracked &&
+ : >dir1/modified &&
++ test-tool chmtime =-60 dir1/modified &&
+ mkdir dir2 &&
+ : >dir2/tracked &&
++ test-tool chmtime =-60 dir2/tracked &&
: >dir2/modified &&
++ test-tool chmtime =-60 dir2/modified &&
write_integration_script &&
git config core.fsmonitor .git/hooks/fsmonitor-test &&
-+ avoid_racy &&
git update-index --untracked-cache &&
- git update-index --fsmonitor &&
- GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \
2: d63faad03a4 ! 2: d29a68e65a0 In t7519, populate untracked cache before test
@@ Metadata
Author: Tao Klerks <tao@klerks.biz>
## Commit message ##
- In t7519, populate untracked cache before test
+ t7519: populate untracked cache before test
- In its current state, the t7519 test dealing with untracked
- cache assumes that
- "git update-index --untracked-cache" will *populate*
- the untracked cache. This is not correct - it will only add
- an empty untracked cache structure to the index.
+ In its current state, the t7519 test dealing with untracked cache
+ assumes that "git update-index --untracked-cache" will *populate* the
+ untracked cache. This is not correct - it will only add an empty
+ untracked cache structure to the index.
- If we're going to compare two git status runs with
- something interesting happening in-between, we
- need to ensure that the index is in a stable/steady
- state *before* that first run.
+ If we're going to compare two git status runs with something
+ interesting happening in-between, we need to ensure that the index is
+ in a stable/steady state *before* that first run.
- We achieve this by adding another prior "git status"
- run.
+ Achieve this by adding another prior "git status" run.
- At this stage this change does nothing, because there
- is a bug, addressed in the next patch. whereby once
- the empty untracked cache structure is added by the
- update-index invocation, the untracked cache gets
- updated in every subsequent "git status" call, but the
- index with these updates does not get written down.
+ At this stage this change does nothing, because there is a bug,
+ addressed in the next patch, whereby once the empty untracked cache
+ structure is added by the update-index invocation, the untracked cache
+ gets updated in every subsequent "git status" call, but the index with
+ these updates does not get written down.
- That bug actually invalidates this entire test case -
- but we're fixing that next.
+ That bug actually invalidates this entire test case - but we're fixing
+ that next.
Signed-off-by: Tao Klerks <tao@klerks.biz>
## t/t7519-status-fsmonitor.sh ##
@@ t/t7519-status-fsmonitor.sh: test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR'
- avoid_racy &&
+ git config core.fsmonitor .git/hooks/fsmonitor-test &&
git update-index --untracked-cache &&
git update-index --fsmonitor &&
+ git status &&
3: 627f1952fd8 ! 3: 190b27e518a Write index when populating empty untracked cache
@@ Metadata
Author: Tao Klerks <tao@klerks.biz>
## Commit message ##
- Write index when populating empty untracked cache
+ untracked-cache: write index when populating empty untracked cache
- It is expected that an empty/unpopulated untracked
- cache structure can be written to the index - by update-
- index, or by a "git status" call that sees the untracked cache
- should be enabled, but is running with options that make
- it non-applicable in that run.
+ It is expected that an empty/unpopulated untracked cache structure can
+ be written to the index - by update-index, or by a "git status" call
+ that sees the untracked cache should be enabled and is not, but is
+ running with options that make the untracked cache non-applicable in
+ that run (eg a pathspec).
- Currently, if that happens, then subsequent "git status"
- calls end up populating the untracked cache, but not
- writing the index (not saving their work) - so the
- performance outcome is almost identical to the cache
- being altogether disabled.
+ Currently, if that happens, then subsequent "git status" calls end up
+ populating the untracked cache, but not writing the index (not saving
+ their work) - so the performance outcome is almost identical to the
+ cache being altogether disabled.
- This continues until the index gets written with the cache
- populated, for some *other* reason.
+ This continues until the index gets written with the untracked cache
+ populated, for some *other* reason, such as a working tree change.
- In this change, we detect the "existing cache is empty
- and it looks like we are using it" condition, and queues
- an index write when this happens.
+ Detect the condition where an empty untracked cache exists in the
+ index and we will collect the list of untracked paths, and queue an
+ index write under that condition, so that the collected untracked
+ paths can be written out to the untracked cache extension in the
+ index.
- This change depends on previous fixes to t7519 for the
- "ignore .git changes when invalidating UNTR" test case to
- pass - before this fix, the test never actually did anything
- as it was not set up correctly.
+ This change depends on previous fixes to t7519 for the "ignore .git
+ changes when invalidating UNTR" test case to pass - before this fix,
+ the test never actually did anything as it was not set up correctly.
Signed-off-by: Tao Klerks <tao@klerks.biz>
@@ dir.c: static struct untracked_cache_dir *validate_untracked_cache(struct dir_st
- if (!dir->untracked->root)
+ if (!dir->untracked->root) {
++ /* Untracked cache existed but is not initialized; fix that */
FLEX_ALLOC_STR(dir->untracked->root, name, "");
-+ /*
-+ * If we've had to initialize the root, then what we had was an
-+ * empty uninitialized untracked cache structure. We will be
-+ * populating it now, so we should trigger an index write.
-+ */
+ istate->cache_changed |= UNTRACKED_CHANGED;
+ }
--
gitgitgadget
next prev parent reply other threads:[~2022-02-25 17:10 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget
2021-06-29 4:22 ` Junio C Hamano
2021-06-24 18:30 ` [PATCH 2/3] In t7519, populate untracked cache before test Tao Klerks via GitGitGadget
2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget
2021-06-29 4:42 ` Junio C Hamano
2022-02-24 17:52 ` Tao Klerks
2022-02-24 20:35 ` Junio C Hamano
2022-02-25 17:10 ` Tao Klerks via GitGitGadget [this message]
2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for " Tao Klerks via GitGitGadget
2022-02-25 19:07 ` Junio C Hamano
2022-02-27 22:12 ` Tao Klerks
2022-02-25 17:10 ` [PATCH v2 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget
2022-02-25 17:10 ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget
2022-02-25 19:12 ` Junio C Hamano
2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2022-02-27 21:56 ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget
2022-02-27 21:57 ` [PATCH v3 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget
2022-02-27 21:57 ` [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.986.v2.git.1645809015.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=tao@klerks.biz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.