From: David Turner <dturner@twopensource.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: git <git@vger.kernel.org>,
	"Nguyen Thai Ngoc Duy" <pclouds@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: Watchman/inotify support and other ways to speed up git status
Date: Tue, 27 Oct 2015 19:54:49 -0400
Message-ID: <1445990089.8302.27.camel@twopensource.com>
In-Reply-To: <CAP8UFD3Cd9SOh6EYwcx9hTVv7P24M5bEJRCYCT5Qgj=qPRJ8hw@mail.gmail.com>


On Thu, 2015-10-22 at 07:59 +0200, Christian Couder wrote:
> Hi everyone,
> 
> I am starting to investigate ways to speed up git status and other git
> commands for Booking.com (thanks to AEvar) and I'd be happy to discuss
> the current status or be pointed to relevant documentation or mailing
> list threads.
> 
> From the threads below ([0], [1], [2], [3], [4], [5], [6], [7], [8]) I
> understand that the status is roughly the following:
> 
> - instead of working on inotify support it's better to work on using a
> cross platform tool like Watchman
> 
> - instead of working on Watchman support it is better to work first on
> caching information in the index
> 
> - git update-index --untracked-cache has been developed by Duy and
> others and merged to master in May 2015 to cache untracked status in
> the index; it is still considered experimental
> 
> - git index-helper has been worked on by Duy but its status is not
> clear (at least to me)
> 
> Is that correct?
> What are the possible/planned next steps in this area? improving

We're using Watchman at Twitter.  A week or two ago I posted a dump of
our code to GitHub, but I would advise waiting a day or two before using
it, as I'm about to pull a large number of bugfixes into it (I'll update
this thread and provide a link once I do so).

It's good, but it's not great.  One major problem is a bug on OS X[1]
that causes missed updates.  Another is that wide changes end up being
quite inefficient when querying watchman.  This means that we do some
hackery to manually update the fs_cache during various large git
operations.

I agree that in general it would be better to store all or some of this
information in the index, and the untracked-cache is a good step toward
that. But with it enabled and watchman disabled, there still appears to
be 1 lstat per file (plus one stat per dir).  The stats per-directory
alone are a large issue for Twitter because we have a relatively deep
and bushy directory structure (an average dir has about 3 or 4 entries
in it).  As a result, git status with watchman is almost twice as fast
as with the untracked cache (on my particular machine).
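
To make the cost model above concrete, here is a rough sketch in Python
that counts how many stat-family calls a full scan would need if it costs
one lstat per file plus one stat per directory. The function name
count_stat_calls is hypothetical, and this only walks the tree and counts
entries; it does not reproduce what git actually does with the untracked
cache enabled.

```python
import os


def count_stat_calls(root):
    """Estimate stat-family calls for one full scan of `root`.

    Assumed cost model (taken from the message, not measured from git):
    one lstat per file plus one stat per directory. The .git directory
    is skipped.
    """
    files = 0
    dirs = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        dirs += 1                # one stat for this directory
        files += len(filenames)  # one lstat for each file in it
    return {"lstat_files": files, "stat_dirs": dirs, "total": files + dirs}
```

In a tree where the average directory holds only 3 or 4 entries, the
stat_dirs count is a large fraction of the total, which is why the
per-directory stats matter so much for a deep and bushy layout.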


[1] https://github.com/facebook/watchman/issues/172
