git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Turner <dturner@twopensource.com>
To: git mailing list <git@vger.kernel.org>, Duy Nguyen <pclouds@gmail.com>
Subject: Re: Watchman support for git
Date: Mon, 05 May 2014 20:13:44 -0700	[thread overview]
Message-ID: <1399346024.11843.43.camel@stross> (raw)
In-Reply-To: <CACsJy8Dr=m5FwmD2gXTM3bN-iYv+fyZ9RBUyvcj3UJJCC1yYtg@mail.gmail.com>

On Sun, 2014-05-04 at 07:15 +0700, Duy Nguyen wrote:
> > I would like to merge the feature into master.  It works well for me,
> > and some of my colleagues who have tried it out.
> 
> Have you tried to turn watchman on by default, then run it with git
> test suite? That usually helps.

I have.  The tests work run fine under make, but prove sometimes freezes
due to an issue in libwatchman which I just fixed (and which I plan to
merge as soon as I can get a colleague to look the changes over).

> > I can split the vmac patch into two, but one of them will remain quite
> > large because it contains the code for VMAC and AES, which total a bit
> > over 100k.  Since the list will probably reject that, I'll post a link
> > to a repository containing the patches.
> 
> With the read-cache deamon, I think hashing cost is less of an issue,
> so new hashing algorithm becomes less important. If you store the file
> cache in the deamon's memory only, there's no need to hash anything.
> But I guess you already tried this.

I agree that with the daemon, the cost is less of an issue, but I am not
100% sure it is a non-issue; consecutive commands that need to
read/write the index can still be slowed down.

> > I'm not 100% sure how to split the watchman patch up.  I could add the
> > fs_cache code and then separately add the watchman code that populates
> > the cache.  Do you think there is a need to divide it up beyond this?
> 
> I'll need to have closer look at your patches to give any suggestions.

I have uploaded a new version (which is about 5-10% faster and which
corrects some minor changes) to https://github.com/dturner-tw/git.git on
the watchman branch.  

> Although if you don't mind waiting a bit, I can try to put my
> untracked cache patches in good shape (hopefully in 2 weeks), then you
> can mostly avoid touching dir.c and reuse my work.

If the untracked cache patches are going to make it into master, then I
would of course be willing to rewrite on top of them.  But I would also
like to have a sense of whether there is any interest in watchman
support (outside of Twitter).

For what it's worth, the numbers today for index version 4 are for my
superscience repo are:
~380 (no watchman), ~260 (untracked-cache), ~175 (watchman).

That's because untracked-cache still has to stat every directory.

> I backed away from watchman support because I was worried about its
> overhead (of watchman itself, and git/watchman IPC because it's not
> designed specifically for git), which led me to try optimizing git as
> much as possible without watchman first, then see how/if watchman can
> help on top of that. I still think it's a good approach (maybe because
> it started to make me doubt if watchman could pull a big performance
> win on top to justify the changes to support it)

I think on large repositories (especially deeply-nested ones), with the
common case of a small number of changes, watchman will end up being a
big win.  Java tends towards deep nesting
(src/main/java/com/twitter/common/...), which is probably why my test
repo had the largest speedup (>50%).  The IPC overhead might become bad
if there were a large number of changes, but so far this has not been an
issue for me in testing.  

  reply	other threads:[~2014-05-06 16:32 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-02 23:14 Watchman support for git dturner
2014-05-02 23:14 ` [PATCH 1/3] After chdir to run grep, return to old directory dturner
2014-05-06 22:24   ` Junio C Hamano
2014-05-07  0:06     ` David Turner
2014-05-07  3:00       ` Jeff King
2014-05-07  3:33         ` David Turner
2014-05-07 17:42           ` Junio C Hamano
2014-05-07 20:57             ` David Turner
2014-05-02 23:14 ` [PATCH 3/3] Watchman support dturner
2014-05-02 23:20 ` Watchman support for git Felipe Contreras
2014-05-03  2:24   ` David Turner
2014-05-03  3:40     ` Felipe Contreras
2014-05-05 18:08       ` David Turner
2014-05-05 18:14         ` Felipe Contreras
2014-05-08 19:17       ` Sebastian Schuberth
2014-05-09  7:08         ` David Lang
2014-05-09 17:17           ` David Turner
2014-05-09 18:08             ` David Lang
2014-05-09 18:17               ` David Turner
2014-05-09 18:27                 ` David Lang
2014-05-09 18:47                   ` David Turner
2014-05-03  0:52 ` Duy Nguyen
2014-05-03  4:39   ` David Turner
2014-05-03  8:49     ` Duy Nguyen
2014-05-03 20:49       ` David Turner
2014-05-04  0:15         ` Duy Nguyen
2014-05-06  3:13           ` David Turner [this message]
2014-05-06  0:26   ` Duy Nguyen
2014-05-06  0:30     ` Duy Nguyen
2014-05-10  5:26 ` Duy Nguyen
2014-05-10 18:38   ` David Turner
2014-05-11  0:21     ` Duy Nguyen
2014-05-11 22:56       ` David Turner
2014-05-12 10:45         ` Duy Nguyen
2014-05-13 22:38           ` David Turner
2014-05-13 22:54             ` Duy Nguyen
2014-05-13 23:19               ` David Turner
2014-05-10  8:16 ` Duy Nguyen
2014-05-13 23:44   ` David Turner
2014-05-14 10:36     ` Duy Nguyen
2014-05-14 10:52       ` Duy Nguyen
2014-05-15 19:42       ` David Turner
2014-05-19 10:10         ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1399346024.11843.43.camel@stross \
    --to=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).