git.vger.kernel.org archive mirror
* Watchman/inotify support and other ways to speed up git status
@ 2015-10-22  5:59 Christian Couder
  2015-10-22  7:29 ` Duy Nguyen
  2015-10-27 23:54 ` David Turner
  0 siblings, 2 replies; 8+ messages in thread
From: Christian Couder @ 2015-10-22  5:59 UTC (permalink / raw)
  To: git
  Cc: Nguyen Thai Ngoc Duy, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, David Turner

Hi everyone,

I am starting to investigate ways to speed up git status and other git
commands for Booking.com (thanks to AEvar) and I'd be happy to discuss
the current status or be pointed to relevant documentation or mailing
list threads.

From the threads below ([0], [1], [2], [3], [4], [5], [6], [7], [8]) I
understand that the status is roughly the following:

- instead of working on inotify support it's better to work on using a
cross platform tool like Watchman

- instead of working on Watchman support it is better to work first on
caching information in the index

- git update-index --untracked-cache has been developed by Duy and
others and merged to master in May 2015 to cache untracked status in
the index; it is still considered experimental

- git index-helper has been worked on by Duy but its status is not
clear (at least to me)

Is that correct?
What are the possible/planned next steps in this area? improving
--untracked-cache? git index-helper? watchman support?

Thanks,
Christian.

[0] March 8 2015: [PATCH 00/24] nd/untracked-cache updates
http://thread.gmane.org/gmane.comp.version-control.git/265053/

[1] November 11 2014: [RFC] On watchman support
http://thread.gmane.org/gmane.comp.version-control.git/259399/

[2] October 27 2014: [PATCH 00/19] Untracked cache to speed up "git status"
http://thread.gmane.org/gmane.comp.version-control.git/258766

[3] July 28 2014: [PATCH v3 0/9] Speed up cache loading time
http://thread.gmane.org/gmane.comp.version-control.git/254314/

[4] May 7 2014: [PATCH 00/20] Untracked cache to speed up "git status"
http://thread.gmane.org/gmane.comp.version-control.git/248306

[5] May 2 2014: Watchman support for git
http://thread.gmane.org/gmane.comp.version-control.git/248004/

[6] March 10 2014:
http://git.661346.n2.nabble.com/question-about-Facebook-makes-Mercurial-faster-than-Git-tt7605273.html#a7605280

[7] January 29 2014:
http://git.661346.n2.nabble.com/inotify-support-nearly-there-tt7602739.html

[8] January 12 2014:
http://git.661346.n2.nabble.com/PATCH-0-6-inotify-support-tt7601877.html#a7603955

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Watchman/inotify support and other ways to speed up git status
  2015-10-22  5:59 Watchman/inotify support and other ways to speed up git status Christian Couder
@ 2015-10-22  7:29 ` Duy Nguyen
  2015-10-27 23:54 ` David Turner
  1 sibling, 0 replies; 8+ messages in thread
From: Duy Nguyen @ 2015-10-22  7:29 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	David Turner

On Thu, Oct 22, 2015 at 7:59 AM, Christian Couder
<christian.couder@gmail.com> wrote:
> Hi everyone,
>
> I am starting to investigate ways to speed up git status and other git
> commands for Booking.com (thanks to AEvar) and I'd be happy to discuss
> the current status or be pointed to relevant documentation or mailing
> list threads.
>
> From the threads below ([0], [1], [2], [3], [4], [5], [6], [7], [8]) I
> understand that the status is roughly the following:
>
> - instead of working on inotify support it's better to work on using a
> cross platform tool like Watchman

Definitely. Especially because watchman has recently gained
(experimental?) Windows support.

> - instead of working on Watchman support it is better to work first on
> caching information in the index
>
> - git update-index --untracked-cache has been developed by Duy and
> others and merged to master in May 2015 to cache untracked status in
> the index; it is still considered experimental
>
> - git index-helper has been worked on by Duy but its status is not
> clear (at least to me)

My roadmap is speeding up index write speed (split index), then read
speed (index-helper), and watchman can be run on top of the
index-helper. Untracked cache solves another performance problem with
.gitignore. All four of these pieces are big and are slowly getting
into git.git.

The last piece is using watchman with untracked cache to kill the
(small) last stream of lstat() calls. But we'll see if we actually
need it.
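
To make the untracked-cache idea concrete, here is a minimal Python
sketch of a per-directory cache that is revalidated with a single
stat() on the directory. It is a toy model, not git's implementation:
the class name UntrackedCache, its fields, and the use of the directory
mtime as the only validity check are assumptions made for illustration
(the real cache also has to notice changes to the ignore rules).

```python
import os


class UntrackedCache:
    """Toy per-directory cache of untracked file names (hypothetical)."""

    def __init__(self, tracked):
        self.tracked = set(tracked)  # full paths that are already tracked
        self.entries = {}            # directory -> (mtime_ns, untracked names)
        self.rescans = 0             # number of times a directory was re-read

    def untracked_in(self, directory):
        # One stat() on the directory decides whether the cache is valid.
        mtime = os.stat(directory).st_mtime_ns
        cached = self.entries.get(directory)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # directory unchanged: reuse the old listing

        # Directory changed (or was never seen): read it again.
        self.rescans += 1
        names = sorted(
            name
            for name in os.listdir(directory)
            if os.path.join(directory, name) not in self.tracked
            and not os.path.isdir(os.path.join(directory, name))
        )
        self.entries[directory] = (mtime, names)
        return names
```

Calling untracked_in() twice on an unchanged directory reads the
directory only once; the second call costs a single stat(). That
remaining stat() per directory is the kind of call a file-watching
service could in principle make unnecessary.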

> What are the possible/planned next steps in this area? improving
> --untracked-cache? git index-helper? watchman support?

The index-helper needs some polishing and perhaps more eyeballing. I
have the watchman patch on top of it, but I don't remember if I have
ever sent it out. If anyone wants to help I can resend everything I
have. I don't think I can resume the work soon.
-- 
Duy

* Re: Watchman/inotify support and other ways to speed up git status
  2015-10-22  5:59 Watchman/inotify support and other ways to speed up git status Christian Couder
  2015-10-22  7:29 ` Duy Nguyen
@ 2015-10-27 23:54 ` David Turner
  2015-10-29  8:10   ` Christian Couder
  1 sibling, 1 reply; 8+ messages in thread
From: David Turner @ 2015-10-27 23:54 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Junio C Hamano,
	Ævar Arnfjörð Bjarmason


On Thu, 2015-10-22 at 07:59 +0200, Christian Couder wrote:
> Hi everyone,
> 
> I am starting to investigate ways to speed up git status and other git
> commands for Booking.com (thanks to AEvar) and I'd be happy to discuss
> the current status or be pointed to relevant documentation or mailing
> list threads.
> 
> From the threads below ([0], [1], [2], [3], [4], [5], [6], [7], [8]) I
> understand that the status is roughly the following:
> 
> - instead of working on inotify support it's better to work on using a
> cross platform tool like Watchman
> 
> - instead of working on Watchman support it is better to work first on
> caching information in the index
> 
> - git update-index --untracked-cache has been developed by Duy and
> others and merged to master in May 2015 to cache untracked status in
> the index; it is still considered experimental
> 
> - git index-helper has been worked on by Duy but its status is not
> clear (at least to me)
> 
> Is that correct?
> What are the possible/planned next steps in this area? improving

We're using Watchman at Twitter.  A week or two ago I posted a dump of our
code to github, but I would advise waiting a day or two to use it, as
I'm about to pull a large number of bugfixes into it (I'll update this
thread and provide a link once I do so).  

It's good, but it's not great.  One major problem is a bug on OS X[1]
that causes missed updates.  Another is that wide changes end up being
quite inefficient when querying watchman.  This means that we do some
hackery to manually update the fs_cache during various large git
operations.

I agree that in general it would be better to store all or some of this
information in the index, and the untracked-cache is a good step in that
direction. But with it enabled and watchman disabled, there still appears to
be 1 lstat per file (plus one stat per dir).  The stats per-directory
alone are a large issue for Twitter because we have a relatively deep
and bushy directory structure (an average dir has about 3 or 4 entries
in it).  As a result, git status with watchman is almost twice as fast
as with the untracked cache (on my particular machine).


[1] https://github.com/facebook/watchman/issues/172
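
The cost model described above (one lstat per file plus one stat per
directory) can be estimated for any working tree. The following is a
rough Python sketch; the function name count_status_stats and the
choice to skip .git are assumptions for illustration, and it counts
what a scan would touch rather than measuring what git actually does.

```python
import os


def count_status_stats(root):
    """Return (files, dirs) under root, ignoring any .git directory.

    Under the model in this thread, `files` approximates the number of
    lstat() calls and `dirs` the number of per-directory stat() calls.
    """
    files = 0
    dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        dirs += 1
        files += len(filenames)
    return files, dirs
```

In a tree where a typical directory holds only 3 or 4 entries, the
second number is a sizeable fraction of the first, which is why the
per-directory stats matter on their own.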

* Re: Watchman/inotify support and other ways to speed up git status
  2015-10-27 23:54 ` David Turner
@ 2015-10-29  8:10   ` Christian Couder
  2015-11-02 20:56     ` David Turner
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Couder @ 2015-10-29  8:10 UTC (permalink / raw)
  To: David Turner
  Cc: git, Nguyen Thai Ngoc Duy, Junio C Hamano,
	Ævar Arnfjörð, Luciano Rocha

On Wed, Oct 28, 2015 at 12:54 AM, David Turner <dturner@twopensource.com> wrote:
>
> On Thu, 2015-10-22 at 07:59 +0200, Christian Couder wrote:
>> Hi everyone,
>>
>> I am starting to investigate ways to speed up git status and other git
>> commands for Booking.com (thanks to AEvar) and I'd be happy to discuss
>> the current status or be pointed to relevant documentation or mailing
>> list threads.
>>
>> From the threads below ([0], [1], [2], [3], [4], [5], [6], [7], [8]) I
>> understand that the status is roughly the following:
>>
>> - instead of working on inotify support it's better to work on using a
>> cross platform tool like Watchman
>>
>> - instead of working on Watchman support it is better to work first on
>> caching information in the index
>>
>> - git update-index --untracked-cache has been developed by Duy and
>> others and merged to master in May 2015 to cache untracked status in
>> the index; it is still considered experimental
>>
>> - git index-helper has been worked on by Duy but its status is not
>> clear (at least to me)
>>
>> Is that correct?
>> What are the possible/planned next steps in this area? improving
>
> We're using Watchman at Twitter.  A week or two ago I posted a dump of our
> code to github, but I would advise waiting a day or two to use it, as
> I'm about to pull a large number of bugfixes into it (I'll update this
> thread and provide a link once I do so).

Great, I will have a look at it then!

> It's good, but it's not great.  One major problem is a bug on OS X[1]
> that causes missed updates.  Another is that wide changes end up being
> quite inefficient when querying watchman.  This means that we do some
> hackery to manually update the fs_cache during various large git
> operations.
>
> I agree that in general it would be better to store all or some of this
> information in the index, and the untracked-cache is a good step in that
> direction. But with it enabled and watchman disabled, there still appears to
> be 1 lstat per file (plus one stat per dir).  The stats per-directory
> alone are a large issue for Twitter because we have a relatively deep
> and bushy directory structure (an average dir has about 3 or 4 entries
> in it).  As a result, git status with watchman is almost twice as fast
> as with the untracked cache (on my particular machine).

Thanks for this detailed description.

> [1] https://github.com/facebook/watchman/issues/172

* Re: Watchman/inotify support and other ways to speed up git status
  2015-10-29  8:10   ` Christian Couder
@ 2015-11-02 20:56     ` David Turner
  2015-11-03  5:45       ` Duy Nguyen
  0 siblings, 1 reply; 8+ messages in thread
From: David Turner @ 2015-11-02 20:56 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Junio C Hamano,
	Ævar Arnfjörð, Luciano Rocha, Lars Schneider

On Thu, 2015-10-29 at 09:10 +0100, Christian Couder wrote:
> > We're using Watchman at Twitter.  A week or two ago I posted a dump of our
> > code to github, but I would advise waiting a day or two to use it, as
> > I'm about to pull a large number of bugfixes into it (I'll update this
> > thread and provide a link once I do so).
> 
> Great, I will have a look at it then!

Here's the most recent version:

https://github.com/dturner-tw/git/tree/dturner/watchman

* Re: Watchman/inotify support and other ways to speed up git status
  2015-11-02 20:56     ` David Turner
@ 2015-11-03  5:45       ` Duy Nguyen
  2015-11-03  7:09         ` Christian Couder
  0 siblings, 1 reply; 8+ messages in thread
From: Duy Nguyen @ 2015-11-03  5:45 UTC (permalink / raw)
  To: David Turner, Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð, Luciano Rocha,
	Lars Schneider

On Mon, Nov 2, 2015 at 9:56 PM, David Turner <dturner@twopensource.com> wrote:
> On Thu, 2015-10-29 at 09:10 +0100, Christian Couder wrote:
>> > We're using Watchman at Twitter.  A week or two ago I posted a dump of our
>> > code to github, but I would advise waiting a day or two to use it, as
>> > I'm about to pull a large number of bugfixes into it (I'll update this
>> > thread and provide a link once I do so).
>>
>> Great, I will have a look at it then!
>
> Here's the most recent version:
>
> https://github.com/dturner-tw/git/tree/dturner/watchman

Christian, the index-helper/watchman series are posted because you
showed interest in this area. I'm not rerolling to address David's
comments on the series for now. Take your time to evaluate the two
approaches, then you can pick one (and let me know if you want me to
hand my series over, very glad to do so).
-- 
Duy

* Re: Watchman/inotify support and other ways to speed up git status
  2015-11-03  5:45       ` Duy Nguyen
@ 2015-11-03  7:09         ` Christian Couder
  2015-11-03 20:32           ` David Turner
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Couder @ 2015-11-03  7:09 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: David Turner, git, Junio C Hamano, Ævar Arnfjörð,
	Luciano Rocha, Lars Schneider

On Tue, Nov 3, 2015 at 6:45 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Mon, Nov 2, 2015 at 9:56 PM, David Turner <dturner@twopensource.com> wrote:
>> On Thu, 2015-10-29 at 09:10 +0100, Christian Couder wrote:
>>> > We're using Watchman at Twitter.  A week or two ago I posted a dump of our
>>> > code to github, but I would advise waiting a day or two to use it, as
>>> > I'm about to pull a large number of bugfixes into it (I'll update this
>>> > thread and provide a link once I do so).
>>>
>>> Great, I will have a look at it then!
>>
>> Here's the most recent version:
>>
>> https://github.com/dturner-tw/git/tree/dturner/watchman
>
> Christian, the index-helper/watchman series are posted because you
> showed interest in this area. I'm not rerolling to address David's
> comments on the series for now.

Ok no problem. Thanks a lot to you and David for posting your rebased series!

> Take your time to evaluate the two
> approaches, then you can pick one (and let me know if you want me to
> hand my series over, very glad to do so).

Yeah, I will do that, thanks again!

* Re: Watchman/inotify support and other ways to speed up git status
  2015-11-03  7:09         ` Christian Couder
@ 2015-11-03 20:32           ` David Turner
  0 siblings, 0 replies; 8+ messages in thread
From: David Turner @ 2015-11-03 20:32 UTC (permalink / raw)
  To: Christian Couder
  Cc: Duy Nguyen, git, Junio C Hamano, Ævar Arnfjörð,
	Luciano Rocha, Lars Schneider

On Tue, 2015-11-03 at 08:09 +0100, Christian Couder wrote:
> On Tue, Nov 3, 2015 at 6:45 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> > On Mon, Nov 2, 2015 at 9:56 PM, David Turner <dturner@twopensource.com> wrote:
> >> On Thu, 2015-10-29 at 09:10 +0100, Christian Couder wrote:
> >>> > We're using Watchman at Twitter.  A week or two ago I posted a dump of our
> >>> > code to github, but I would advise waiting a day or two to use it, as
> >>> > I'm about to pull a large number of bugfixes into it (I'll update this
> >>> > thread and provide a link once I do so).
> >>>
> >>> Great, I will have a look at it then!
> >>
> >> Here's the most recent version:
> >>
> >> https://github.com/dturner-tw/git/tree/dturner/watchman
> >
> > Christian, the index-helper/watchman series are posted because you
> > showed interest in this area. I'm not rerolling to address David's
> > comments on the series for now.
> 
> Ok no problem. Thanks a lot to you and David for posting your rebased series!
> 
> > Take your time to evaluate the two
> > approaches, then you can pick one (and let me know if you want me to
> > hand my series over, very glad to do so).
> 
> Yeah, I will do that, thanks again!

To be clear: I think Duy's approach is probably best in the long term.
