All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Utsav Shah via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Utsav Shah <ukshah2@illinois.edu>, Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v3 0/1] unpack-trees: skip stat on fsmonitor-valid files
Date: Wed, 06 Nov 2019 04:54:14 +0000	[thread overview]
Message-ID: <pull.424.v3.git.1573016055.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.424.v2.git.1572967644.gitgitgadget@gmail.com>

The index might be aware that a file hasn't modified via fsmonitor, but
unpack-trees did not pay attention to it and checked via ie_match_stat which
can be inefficient on certain filesystems. This significantly slows down
commands that run oneway_merge, like checkout and reset --hard.

This patch makes oneway_merge check whether a file is considered unchanged
through fsmonitor and skips ie_match_stat on it. unpack-trees also now
correctly copies over fsmonitor validity state from the source index.
Finally, for correctness, we force a refresh of fsmonitor state in
tweak_fsmonitor.

After this change, commands like stash (that use reset --hard internally) go
from 8s or more to ~2s on a 250k file repository on a mac.

Signed-off-by: Utsav Shah utsav@dropbox.com [utsav@dropbox.com]

Utsav Shah (1):
  unpack-trees: skip stat on fsmonitor-valid files

 fsmonitor.c                 | 39 ++++++++++++++++++++++++-------------
 t/t7519-status-fsmonitor.sh |  9 +++++++--
 unpack-trees.c              |  6 +++++-
 3 files changed, 37 insertions(+), 17 deletions(-)


base-commit: 566a1439f6f56c2171b8853ddbca0ad3f5098770
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-424%2FUtsav2%2Fskip-lstat-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-424/Utsav2/skip-lstat-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/424

Range-diff vs v2:

 1:  f76ba554ed ! 1:  4bea7075cf unpack-trees: skip stat on fsmonitor-valid files
     @@ -23,6 +23,15 @@
       --- a/fsmonitor.c
       +++ b/fsmonitor.c
      @@
     + 
     + 	if (pos >= istate->cache_nr)
     + 		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" >= %u)",
     +-		    (uintmax_t)pos, istate->cache_nr);
     ++			(uintmax_t)pos, istate->cache_nr);
     + 
     + 	ce = istate->cache[pos];
     + 	ce->ce_flags &= ~CE_FSMONITOR_VALID;
     +@@
       	}
       	istate->fsmonitor_dirty = fsmonitor_dirty;
       
     @@ -31,7 +40,7 @@
      -		    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
      +	if (!istate->split_index && istate->fsmonitor_dirty->bit_size > istate->cache_nr)
      +		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %"PRIuMAX")",
     -+		    (uintmax_t)istate->fsmonitor_dirty->bit_size, (uintmax_t)istate->cache_nr);
     ++			(uintmax_t)istate->fsmonitor_dirty->bit_size, (uintmax_t)istate->cache_nr);
      +
       
       	trace_printf_key(&trace_fsmonitor, "read fsmonitor extension successful");
     @@ -45,7 +54,7 @@
      -		    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
      +	if (!istate->split_index && istate->fsmonitor_dirty->bit_size > istate->cache_nr)
      +		BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %"PRIuMAX")",
     -+		    (uintmax_t)istate->fsmonitor_dirty->bit_size, (uintmax_t)istate->cache_nr);
     ++			(uintmax_t)istate->fsmonitor_dirty->bit_size, (uintmax_t)istate->cache_nr);
       
       	put_be32(&hdr_version, INDEX_EXTENSION_VERSION);
       	strbuf_add(sb, &hdr_version, sizeof(uint32_t));
     @@ -57,10 +66,33 @@
      +		if (istate->untracked)
      +			istate->untracked->use_fsmonitor = 1;
       	} else {
     ++
     ++		/* We only want to run the post index changed hook if we've actually changed entries, so keep track
     ++		 * if we actually changed entries or not */
     ++		int is_cache_changed = 0;
       		/* Mark all entries invalid */
     - 		for (i = 0; i < istate->cache_nr; i++)
     +-		for (i = 0; i < istate->cache_nr; i++)
     +-			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
     ++		for (i = 0; i < istate->cache_nr; i++) {
     ++			if (istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) {
     ++				is_cache_changed = 1;
     ++				istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
     ++			}
     ++		}
     + 
     + 		/* If we're going to check every file, ensure we save the results */
     +-		istate->cache_changed |= FSMONITOR_CHANGED;
     ++		if (is_cache_changed)
     ++			istate->cache_changed |= FSMONITOR_CHANGED;
     + 
     + 		if (istate->untracked)
     + 			istate->untracked->use_fsmonitor = 0;
      @@
     - 				    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
     + 			/* Mark all previously saved entries as dirty */
     + 			if (istate->fsmonitor_dirty->bit_size > istate->cache_nr)
     + 				BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %u)",
     +-				    (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
     ++					(uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
       			ewah_each_bit(istate->fsmonitor_dirty, fsmonitor_ewah_callback, istate);
       
      -			/* Now mark the untracked cache for fsmonitor usage */
     @@ -71,20 +103,6 @@
       
       		ewah_free(istate->fsmonitor_dirty);
      
     - diff --git a/t/t7113-post-index-change-hook.sh b/t/t7113-post-index-change-hook.sh
     - --- a/t/t7113-post-index-change-hook.sh
     - +++ b/t/t7113-post-index-change-hook.sh
     -@@
     - 	git checkout -- dir1/file1.txt &&
     - 	test_path_is_file testsuccess && rm -f testsuccess &&
     - 	test_path_is_missing testfailure &&
     --	git update-index &&
     --	test_path_is_missing testsuccess &&
     --	test_path_is_missing testfailure &&
     - 	git reset --soft &&
     - 	test_path_is_missing testsuccess &&
     - 	test_path_is_missing testfailure
     -
       diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
       --- a/t/t7519-status-fsmonitor.sh
       +++ b/t/t7519-status-fsmonitor.sh
     @@ -113,8 +131,8 @@
      -	write_script .git/hooks/fsmonitor-test<<-\EOF &&
      -	EOF
       	clean_repo &&
     -+	write_integration_script &&
       	dirty_repo &&
     ++	write_integration_script &&
       	git add . &&
      +	write_script .git/hooks/fsmonitor-test<<-\EOF &&
      +	EOF

-- 
gitgitgadget

  parent reply	other threads:[~2019-11-06  4:54 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-25 15:23 [PATCH 0/1] unpack-trees: skip lstat on files based on fsmonitor Utsav Shah via GitGitGadget
2019-10-25 15:23 ` [PATCH 1/1] unpack-trees: skip lstat " Utsav Shah via GitGitGadget
2019-10-28  3:37   ` Junio C Hamano
2019-10-28  6:39     ` Utsav Shah
2019-10-28 19:23       ` Kevin Willford
2019-10-29 19:06         ` Utsav Shah
2019-10-29 20:12           ` Kevin Willford
2019-10-29 23:50             ` Utsav Shah
2019-10-30  0:21               ` Junio C Hamano
2019-10-30 16:41                 ` Utsav Shah
2019-11-04  6:02                   ` Junio C Hamano
2019-11-05 15:27 ` [PATCH v2 0/1] unpack-trees: skip stat on fsmonitor-valid files Utsav Shah via GitGitGadget
2019-11-05 15:27   ` [PATCH v2 1/1] " Utsav Shah via GitGitGadget
2019-11-05 21:40     ` Kevin Willford
2019-11-06  4:36       ` Utsav Shah
2019-11-06 17:24         ` Kevin Willford
2019-11-06  4:54   ` Utsav Shah via GitGitGadget [this message]
2019-11-06  4:54     ` [PATCH v3 " Utsav Shah via GitGitGadget
2019-11-06 10:46       ` Junio C Hamano
2019-11-06 22:33         ` Utsav Shah
2019-11-08  3:51           ` Utsav Shah
2019-11-08  4:11             ` Junio C Hamano
2019-11-06 10:16     ` [PATCH v3 0/1] " Junio C Hamano
2019-11-20  8:32     ` [PATCH v4 " Utsav Shah via GitGitGadget
2019-11-20  8:32       ` [PATCH v4 1/1] " Utsav Shah via GitGitGadget
2019-11-21  4:15         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.424.v3.git.1573016055.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ukshah2@illinois.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.