git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Junio C Hamano <gitster@pobox.com>, Ben Peart <benpeart@microsoft.com>
Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net
Subject: Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
Date: Wed, 20 Sep 2017 12:19:29 -0400	[thread overview]
Message-ID: <3311de8b-f9df-07e0-6c5d-7f491e9bcaa8@gmail.com> (raw)
In-Reply-To: <xmqqr2v2p0gn.fsf@gitster.mtv.corp.google.com>



On 9/19/2017 10:28 PM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> +/* do stat comparison even if CE_FSMONITOR_VALID is true */
>> +#define CE_MATCH_IGNORE_FSMONITOR 0X20
> 
> Hmm, when should a programmer use this flag?
> 

Pretty much the same places you would also use CE_MATCH_IGNORE_VALID and 
CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those 
features.  That is generally when you are about to overwrite data so 
want to be *really* sure you have what you think you have.

The other place I used it was in preload_index(). In that case, I didn't 
want to trigger the call to refresh_fsmonitor() as preload_index() is 
about trying to do a fast precompute of state for the bulk of the index 
entries but is not required for correctness as refresh_cache_ent() will 
ensure any "missed" by preload_index() are up-to-date if/when that is 
needed.

>> +int git_config_get_fsmonitor(void)
>> +{
>> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
>> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
> 
> Will the environment be part of the published API, or is it a
> remnant of a useful tool for debugging while developing the feature?
> 
> If it is the former (and I'd say why not, even though "git -c
> core.fsmontor=..." may be easy enough), we might want to rename it
> to replace _TEST with _PROGRAM or something and document it.
> 

This was added this to facilitate testing.  That is why it has the magic 
naming of "GIT_***_TEST" which is the only way I found to ensure that 
env variable gets passed to tests.  Its use is discussed in patch 10 
which contains the tests.

>> diff --git a/diff-lib.c b/diff-lib.c
>> index 2a52b07954..23c6d03ca9 100644
>> --- a/diff-lib.c
>> +++ b/diff-lib.c
>> @@ -12,6 +12,7 @@
>>   #include "refs.h"
>>   #include "submodule.h"
>>   #include "dir.h"
>> +#include "fsmonitor.h"
>>   
>>   /*
>>    * diff-files
>> @@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>>   
>>   		if (!changed && !dirty_submodule) {
>>   			ce_mark_uptodate(ce);
>> +			mark_fsmonitor_valid(ce);
> 
> I notice all calls to mark_fsmonitor_valid(ce) always follow a call
> to ce_mark_uptodate(ce).  Should the call to the former be made as
> part of the latter instead?  Are there cases where we want to call
> ce_mark_uptodate(ce) to mark the ce up-to-date, yet we do not want
> to call mark_fsmonitor_valid(ce)?  How does a programmer tell when
> s/he wants to call ce_mark_uptodate(ce) if s/he also should call
> mark_fsmonitor_valid(ce)?

mark_fsmonitor_valid(ce) is the way to indicate that cache entries that 
were once fsmonitor dirty are now properly reflected in the index so can 
come off the "dirty" list.  It can't really be combined with 
ce_mark_uptodate(ce) as that would prevent the CE_MATCH_IGNORE_FSMONITOR 
logic:

	if (!ignore_skip_worktree && ce_skip_worktree(ce)) {
		ce_mark_uptodate(ce);
		return ce;
	}
	if (!ignore_valid && (ce->ce_flags & CE_VALID)) {
		ce_mark_uptodate(ce);
		return ce;
	}
	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) {
		ce_mark_uptodate(ce);
		return ce;
	}

In addition, fsmonitor is an optional feature and so the 
mark_fsmonitor_valid(ce) call should only happen when the feature is 
turned on. I tried to keep it as simple as possible by making that test 
and set logic part of the function but still be performant by making the 
function inline.

> 
> Together with "when to pass IGNORE_FSMONITOR" question, is there a
> summary that guides new programmers to answer these questions in the
> new documentation?
> 

Only the discussion in this mail thread.  I could add something to the 
function header in fsmonitor.h if that would help.  How about something 
like:

diff --git a/fsmonitor.h b/fsmonitor.h
index c2240b811a..03bf3efe61 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -34,9 +34,11 @@ extern void tweak_fsmonitor(struct index_state *istate);
   */
  extern void refresh_fsmonitor(struct index_state *istate);

-/*
- * Set the given cache entries CE_FSMONITOR_VALID bit.
- */
+/*
+ * Set the given cache entries CE_FSMONITOR_VALID bit. This should be
+ * called any time the cache entry has been updated to reflect the
+ * current state of the file on disk.
+ */
  static inline void mark_fsmonitor_valid(struct cache_entry *ce)
  {
         if (core_fsmonitor) {
@@ -46,8 +48,11 @@ static inline void mark_fsmonitor_valid(struct 
cache_entry *ce)
  }

  /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate any
- * corresponding untracked cache directory structures.
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
+ * any corresponding untracked cache directory structures. This should
+ * be called any time git creates or modifies a file that should
+ * trigger an lstat() or invalidate the untracked cache for the
+ * corresponding directory
   */
  static inline void mark_fsmonitor_invalid(struct index_state *istate, 
struct cache_entry *ce)
  {


>> diff --git a/dir.h b/dir.h
>> index e3717055d1..fab8fc1561 100644
>> --- a/dir.h
>> +++ b/dir.h
>> @@ -139,6 +139,8 @@ struct untracked_cache {
>>   	int gitignore_invalidated;
>>   	int dir_invalidated;
>>   	int dir_opened;
>> +	/* fsmonitor invalidation data */
>> +	unsigned int use_fsmonitor : 1;
> 
> This makes it look like we will add a bit more fields in later
> patches that are about "invalidation" around fsmonitor, but it
> appears that such an addition never happens til the end of the
> series.  And use_fsmonitor boolean does not seem to be what the
> comment says---it just tells us if fsmonitor is in use in the
> operation of the untracked cache, no?

I don't have any more planned bit fields.  I only needed a single bit so 
used a bit field just in case someone else comes by later and needs 
another bit.  If you aren't worried about that, we can just make this an 
int.

Correct.  The bit just indicates whether fsmonitor has been used to 
ensure the cache is current or if it needs to be checked.  I have 
comments to this effect where the flag is used in the code but could 
move/duplicate them into the header if wished. For example:

/* With fsmonitor, we can trust the untracked cache's valid field. */

and

/* Now mark the untracked cache for fsmonitor usage */

> 
>> diff --git a/entry.c b/entry.c
>> index cb291aa88b..5e6794f9fc 100644
>> --- a/entry.c
>> +++ b/entry.c
>> @@ -4,6 +4,7 @@
>>   #include "streaming.h"
>>   #include "submodule.h"
>>   #include "progress.h"
>> +#include "fsmonitor.h"
>>   
>>   static void create_directories(const char *path, int path_len,
>>   			       const struct checkout *state)
>> @@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
>>   			lstat(ce->name, &st);
>>   		fill_stat_cache_info(ce, &st);
>>   		ce->ce_flags |= CE_UPDATE_IN_BASE;
>> +		mark_fsmonitor_invalid(state->istate, ce);
>>   		state->istate->cache_changed |= CE_ENTRY_CHANGED;
> 
> Similar to "how does mark_fsmonitor_valid() relate to
> ce_mark_uptodate()?" question earlier, mark_fsmonitor_invalid()
> seems to often appear in pairs with the update to cache_changed.
> Are there cases where we need CE_ENTRY_CHANGED bit added to the
> index state yet we do not want to call mark_fsmonitor_invalid()?  I
> am wondering if there should be a single helper function that lets
> callers say "I changed this ce in this istate this way.  Please
> update CE_VALID, CE_UPDATE_IN_BASE and CE_FSMONITOR_VALID
> accordingly".
> 
> That "changed _this way_" is deliberately vague in my question
> above, because it is not yet clear to me when mark-valid and
> mark-invalid should and should not be called from the series.
> 

Please let me know if my comment/patch above does not address this 
concern sufficiently.

>> +	/* a fsmonitor process can return '*' to indicate all entries are invalid */
>> +	if (query_success && query_result.buf[0] != '/') {
>> +		/* Mark all entries returned by the monitor as dirty */
> 
> The comment talks about '*' and code checks if it is not '/'?  The
> else clause of this if/else handles the "invalidate all" case, so
> shouldn't we be comparing with '*' instead here?
> 
> Ah, fsmonitor-watchman section of the doc tells us to write the hook
> in a way to return slash, so the comment above the code is stale and
> the comparison with '/' is what we want, I guess.
> 

Correct.  Sorry about missing that.  Here is a patch that can be 
squashed in.

diff --git a/fsmonitor.c b/fsmonitor.c
index b8b2d88fe1..7c1540c054 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -176,7 +176,7 @@ void refresh_fsmonitor(struct index_state *istate)
                         core_fsmonitor, query_success ? "success" : 
"failure");
         }

-       /* a fsmonitor process can return '*' to indicate all entries 
are invalid */
+       /* a fsmonitor process can return '/' to indicate all entries 
are invalid */
         if (query_success && query_result.buf[0] != '/') {
                 /* Mark all entries returned by the monitor as dirty */
                 buf = query_result.buf;


  reply	other threads:[~2017-09-20 16:19 UTC|newest]

Thread overview: 137+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-06-10 13:40 ` [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c Ben Peart
2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-06-27 15:43   ` Christian Couder
2017-07-03 21:25     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-06-27 16:20   ` Christian Couder
2017-07-07 18:50     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 5/7] fsmonitor: add documentation for the " Ben Peart
2017-06-10 13:40 ` [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
2017-06-10 14:04   ` Ben Peart
2017-06-12 22:04   ` Junio C Hamano
2017-06-14 14:12     ` Ben Peart
2017-06-14 18:36       ` Junio C Hamano
2017-07-07 18:14         ` Ben Peart
2017-07-07 18:35           ` Junio C Hamano
2017-07-07 19:07             ` Ben Peart
2017-07-07 19:33             ` David Turner
2017-07-08  7:19             ` Christian Couder
2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
2017-07-10 13:36   ` Ben Peart
2017-07-10 14:40     ` Ben Peart
2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-15 19:20   ` [PATCH v6 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-15 21:35     ` David Turner
2017-09-18 13:07       ` Ben Peart
2017-09-18 13:32         ` David Turner
2017-09-18 13:49           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-15 19:43     ` David Turner
2017-09-18 13:27       ` Ben Peart
2017-09-17  8:03     ` Junio C Hamano
2017-09-18 13:29       ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-15 20:34     ` David Turner
2017-09-15 19:20   ` [PATCH v6 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-17  8:02     ` Junio C Hamano
2017-09-18 13:38       ` Ben Peart
2017-09-18 15:43         ` Torsten Bögershausen
2017-09-18 16:28           ` Ben Peart
2017-09-19 14:16             ` Torsten Bögershausen
2017-09-19 15:36               ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 20:43     ` Jonathan Nieder
2017-09-20 17:11       ` Ben Peart
2017-09-20 17:46         ` Jonathan Nieder
2017-09-21  0:05           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-15 22:00     ` David Turner
2017-09-19 19:32       ` David Turner
2017-09-19 20:30         ` Ben Peart
2017-09-16 15:27     ` Torsten Bögershausen
2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
2017-09-19 20:37         ` Jonathan Nieder
2017-09-20 13:49           ` Torsten Bögershausen
2017-09-22  1:04             ` Junio C Hamano
2017-09-18 14:06       ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-17  4:47     ` Junio C Hamano
2017-09-18 15:25       ` Ben Peart
2017-09-19 20:34         ` Jonathan Nieder
2017-09-15 19:20   ` [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
2017-09-15 21:56     ` David Turner
2017-09-18 14:24     ` Johannes Schindelin
2017-09-18 18:19       ` Ben Peart
2017-09-19 15:28         ` Johannes Schindelin
2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-20 22:06       ` Stefan Beller
2017-09-21  0:02         ` Ben Peart
2017-09-21  0:44           ` Stefan Beller
2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-20  5:47       ` Junio C Hamano
2017-09-20 14:58         ` Ben Peart
2017-09-21  1:46           ` Junio C Hamano
2017-09-21  2:06             ` Ben Peart
2017-09-21  2:18               ` Junio C Hamano
2017-09-21  2:32                 ` Junio C Hamano
2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-20  2:28       ` Junio C Hamano
2017-09-20 16:19         ` Ben Peart [this message]
2017-09-21  2:00           ` Junio C Hamano
2017-09-21  2:24             ` Ben Peart
2017-09-21 14:35               ` Ben Peart
2017-09-22  1:02                 ` Junio C Hamano
2017-09-20  6:23       ` Junio C Hamano
2017-09-20 16:29         ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-20 10:00       ` Martin Ågren
2017-09-20 17:02         ` Ben Peart
2017-09-20 17:11           ` Martin Ågren
2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-19 19:46       ` David Turner
2017-09-19 20:44         ` Ben Peart
2017-09-19 21:27           ` David Turner
2017-09-19 22:44             ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-19 19:27     ` [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 19:27     ` [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-19 19:27     ` [PATCH v7 12/12] fsmonitor: add a performance test Ben Peart
2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:31           ` Ben Peart
2017-09-24  3:51             ` Jeff King
2017-09-24  3:52             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-22 16:35       ` [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-22 16:35       ` [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-22 16:35       ` [PATCH v8 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:33           ` Ben Peart
2017-09-24  3:51             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-22 16:35       ` [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-22 16:35       ` [PATCH v8 12/12] fsmonitor: add a performance test Ben Peart
2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
2017-09-29 12:07         ` Ben Peart
2017-10-01  8:24           ` Junio C Hamano
2017-10-03 19:48             ` Ben Peart
2017-10-04  2:09               ` Junio C Hamano
2017-10-04  6:38                 ` Alex Vandiver
2017-10-04 12:48                   ` Ben Peart
2017-10-04 12:27                 ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3311de8b-f9df-07e0-6c5d-7f491e9bcaa8@gmail.com \
    --to=peartben@gmail.com \
    --cc=David.Turner@twosigma.com \
    --cc=avarab@gmail.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).