From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [RFC] On watchman support
Date: Mon, 17 Nov 2014 19:25:36 -0500
Message-ID: <1416270336.13653.23.camel@leckie>
In-Reply-To: <20141111124901.GA6011@lanh>

On Tue, 2014-11-11 at 19:49 +0700, Duy Nguyen wrote:
> I've come to the last piece to speed up "git status", watchman
> support. And I realized it's not as good as I thought.
>
> Watchman could be used for two things: to avoid refreshing the index,
> and to avoid searching for ignored files. The first one can be done
> (with the patch below as demonstration). And it should keep refresh
> cost to near zero in the best case, the cost is proportional to the
> number of modified files.
>
> As for avoiding searching for ignored files, my intention was to build on
> top of untracked cache. If watchman can tell me what files are added
> or deleted since last observed time, then I can invalidate just
> directories that contain them, or even better, calculate ignore status
> for those files only.
>
> This is important because in reality compilers and editors tend to
> update files by creating a new version then rename them, updating
> directory mtime and invalidating untracked cache as a consequence. As
> you edit more files (or your rebuild touches more dirs), untracked
> cache performance drops (until the next "git status"). The numbers I
> posted so far are the best case.
>
> The problem with watchman is it cannot tell me "new" files since the
> last observed time (let's say 'T'). If a file exists at 'T', gets
> deleted then recreated, then watchman tells me it's a new file. I want
> to separate those from ones that did not exist before 'T'.
>
> David's watchman approach does not have this problem because he keeps
> track of all entries under $GIT_WORK_TREE and knows which files are
> truly new. But I don't really want to keep the whole file list around,
> especially when watchman already manages the same list.
>
> So we got a few options:
>
> 1) Convince watchman devs to add something to make it work

Based on the thread on the watchman GitHub it looks like this won't
happen.

> 2) Fork watchman
>
> 3) Make another daemon to keep file list around, or put it in a shared
> memory.
>
> 4) Move David's watchman series forward (and maybe make use of shared
> mem for fs_cache).
>
> 5) Go with something similar to the patch below and accept untracked
> cache performance degrades from time to time
>
> 6) ??
>
> I'm working on 1). 2) is just bad taste, listed for completeness
> only. If we go with 3) and watchman starts to support Windows (seems
> to be in their plan), we'll need to rework it somehow. And I really
> don't like 3).
>
> If 1-3 do not work out, we're left with 4) and 5). We could
> support both, but it's probably not worth the code complexity and we
> should just go with one.
>
> And if we go with 4) we should probably think of dropping untracked
> cache if watchman will support Windows in the end. 4) also has another
> advantage over untracked cache, that it could speed up listing ignored
> files as well as untracked files.
>
> Comments?

I don't think it would be impossible to add Windows support to watchman;
the necessary functions exist, although I don't know how well they work.
My experience with watchman is that it is something of a stress test of
a filesystem's notification layer. It has exposed bugs in inotify, and
caused system instability on OS X.

My patches are not the world's most beautiful, but they do work. I
think some improvement might be possible by keeping info about tracked
files in the index, and only storing the tree of ignored and untracked
files separately. But I have not thought this through fully. In any
case, making use of shared memory for the fs_cache (as some of your
other patches do for the index) would definitely save time.
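
For illustration only, here is a minimal sketch of the bookkeeping
discussed above: keeping a list of known paths makes it possible to tell
a truly new file from one that was deleted and recreated since 'T', and
to invalidate only the directories whose contents changed. It is written
in Python for brevity and is not git's C code. The Change record, the
classify_changes() function and the shape of the change list are all
hypothetical; they are not watchman's API or the fs_cache layout. It
assumes each path is reported at most once per query.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One path reported as changed since time T (hypothetical shape)."""
    path: str     # '/'-separated path relative to the work tree
    exists: bool  # whether the path exists at query time


def classify_changes(known_paths, changes):
    """Classify changes against the set of paths known at time T.

    known_paths is updated in place so it reflects the current state.
    Returns (truly_new, touched, deleted, dirty_dirs), all sets of str.
    dirty_dirs holds directories that gained or lost an entry; the
    empty string stands for the top of the work tree.
    """
    truly_new, touched, deleted, dirty_dirs = set(), set(), set(), set()
    for change in changes:
        was_known = change.path in known_paths
        parent = posixpath.dirname(change.path)
        if change.exists and not was_known:
            # Did not exist at T: a genuinely new entry in its directory.
            truly_new.add(change.path)
            known_paths.add(change.path)
            dirty_dirs.add(parent)
        elif change.exists:
            # Existed at T and exists now: modified, or deleted and
            # recreated. The directory listing is unchanged.
            touched.add(change.path)
        elif was_known:
            # Existed at T and is gone now.
            deleted.add(change.path)
            known_paths.discard(change.path)
            dirty_dirs.add(parent)
        # Otherwise the path was created and removed between two
        # observations, so there is nothing to record.
    return truly_new, touched, deleted, dirty_dirs
```

In this sketch, only the directories in dirty_dirs would need their
untracked-cache entries invalidated, and only the paths in truly_new
would need their ignore status computed.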