All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Peart <benpeart@microsoft.com>
To: benpeart@microsoft.com
Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net
Subject: [PATCH v6 00/12] Fast git status via a file system watcher
Date: Fri, 15 Sep 2017 15:20:31 -0400	[thread overview]
Message-ID: <20170915192043.4516-1-benpeart@microsoft.com> (raw)
In-Reply-To: <20170610134026.104552-1-benpeart@microsoft.com>

This is a fairly significant rewrite since V5. The big changes include:

Multiple functions including preload-index(), ie_match_stat(), and
refresh_cache_ent() have been updated to honor the CE_FSMONITOR_VALID bit
following the same pattern as skip_worktree and CE_VALID.  As a result,
performance improvements apply to all git commands that would otherwise
have had to scan the entire working directory.

core.fsmonitor is now a registered command (instead of a hook) to
provide additional flexibility.  It is called when needed to ensure the
state of the index is up-to-date.

The Watchman integration script is now entirely written in perl to
minimize spawning additional helper commands.  This along with the other
changes have helped reduce the overhead and made the extension applicable
to more (ie smaller) repos.

There are additional opportunities for performance improvements but I
wanted to get this version out there and then build on it as the
foundation.  Some potential examples of future patches include:

 - call the integration script on a background thread so that it can
   execute in parallel.

 - optimize traverse trees by pruning out entire branches that do not
   contain any changes.

Other optimizations likely exist where knowledge that files have not
changed can be used to short circuit some of the normal workflow.

Performance
===========

With the various enhancements, performance has been improved especially
for smaller repos.  The included perf test compares status times without
fsmonitor to those with fsmonitor using the provided Watchman integration
script.

Due to the overhead of calling out to Watchman, on very small repos
(<10K files) the overhead exceeds the savings.  Once repos hit 10K files
the savings kick in and for repos beyond that, the savings are dramatic.

Test with 10,000 files                                           this tree
------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.35(0.03+0.04)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.37(0.00+0.09)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   0.43(0.03+0.06)
7519.6: status (fsmonitor=)                                      0.45(0.00+0.07)
7519.7: status -uno (fsmonitor=)                                 0.40(0.03+0.07)
7519.8: status -uall (fsmonitor=)                                0.44(0.04+0.04)

Test with 100,000 files                                          this tree
------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.33(0.01+0.03)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.36(0.00+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   0.93(0.00+0.07)
7519.6: status (fsmonitor=)                                      2.66(0.04+0.03)
7519.7: status -uno (fsmonitor=)                                 2.44(0.01+0.06)
7519.8: status -uall (fsmonitor=)                                2.94(0.03+0.07)

Test with 1,000,000 files                                        this tree
---------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         1.45(0.00+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.88(0.01+0.04)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   6.14(0.03+0.04)
7519.6: status (fsmonitor=)                                      25.91(0.04+0.06)
7519.7: status -uno (fsmonitor=)                                 23.96(0.04+0.03)
7519.8: status -uall (fsmonitor=)                                28.81(0.00+0.07)

Note: all numbers above are with a warm disk cache on a fast SSD, real
world performance numbers are often dramatically better as fsmonitor can
eliminate all the file IO to lstat every file and then traverse the
working directory looking for untracked files.  For example, a cold
status without fsmonitor on a HDD with 1M files takes 1m22.774s

$ time git -c core.fsmonitor= status
On branch p0006-ballast

It took 2.09 seconds to enumerate untracked files. 'status -uno'
may speed it up, but you have to be careful not to forget to add
new files yourself (see 'git help status').
nothing to commit, working tree clean

real    1m22.774s
user    0m0.000s
sys     0m0.000s


Ben Peart (12):
  bswap: add 64 bit endianness helper get_be64
  preload-index: add override to enable testing preload-index
  update-index: add a new --force-write-index option
  fsmonitor: teach git to optionally utilize a file system monitor to
    speed up detecting new or changed files.
  fsmonitor: add documentation for the fsmonitor extension.
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  update-index: add fsmonitor support to update-index
  fsmonitor: add a test tool to dump the index extension
  split-index: disable the fsmonitor extension when running the split
    index test
  fsmonitor: add test cases for fsmonitor extension
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add a performance test

 Documentation/config.txt                   |   6 +
 Documentation/githooks.txt                 |  23 +++
 Documentation/technical/index-format.txt   |  19 +++
 Makefile                                   |   3 +
 apply.c                                    |   2 +-
 builtin/ls-files.c                         |   8 +-
 builtin/update-index.c                     |  26 ++-
 cache.h                                    |  10 +-
 compat/bswap.h                             |  22 +++
 config.c                                   |  14 ++
 config.h                                   |   1 +
 diff-lib.c                                 |   2 +
 dir.c                                      |  27 +--
 dir.h                                      |   2 +
 entry.c                                    |   4 +-
 environment.c                              |   1 +
 fsmonitor.c                                | 253 ++++++++++++++++++++++++++++
 fsmonitor.h                                |  61 +++++++
 preload-index.c                            |   8 +-
 read-cache.c                               |  49 +++++-
 submodule.c                                |   2 +-
 t/helper/.gitignore                        |   1 +
 t/helper/test-drop-caches.c                | 161 ++++++++++++++++++
 t/helper/test-dump-fsmonitor.c             |  21 +++
 t/perf/p7519-fsmonitor.sh                  | 184 ++++++++++++++++++++
 t/t1700-split-index.sh                     |   1 +
 t/t7519-status-fsmonitor.sh                | 259 +++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all                      |  23 +++
 t/t7519/fsmonitor-none                     |  21 +++
 t/t7519/fsmonitor-watchman                 | 128 ++++++++++++++
 templates/hooks--fsmonitor-watchman.sample | 119 +++++++++++++
 unpack-trees.c                             |   8 +-
 32 files changed, 1440 insertions(+), 29 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100644 t/helper/test-dump-fsmonitor.c
 create mode 100755 t/perf/p7519-fsmonitor.sh
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

-- 
2.14.1.548.ge54b1befee.dirty


  parent reply	other threads:[~2017-09-15 19:21 UTC|newest]

Thread overview: 137+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-06-10 13:40 ` [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c Ben Peart
2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-06-27 15:43   ` Christian Couder
2017-07-03 21:25     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-06-27 16:20   ` Christian Couder
2017-07-07 18:50     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 5/7] fsmonitor: add documentation for the " Ben Peart
2017-06-10 13:40 ` [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
2017-06-10 14:04   ` Ben Peart
2017-06-12 22:04   ` Junio C Hamano
2017-06-14 14:12     ` Ben Peart
2017-06-14 18:36       ` Junio C Hamano
2017-07-07 18:14         ` Ben Peart
2017-07-07 18:35           ` Junio C Hamano
2017-07-07 19:07             ` Ben Peart
2017-07-07 19:33             ` David Turner
2017-07-08  7:19             ` Christian Couder
2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
2017-07-10 13:36   ` Ben Peart
2017-07-10 14:40     ` Ben Peart
2017-09-15 19:20 ` Ben Peart [this message]
2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-15 19:20   ` [PATCH v6 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-15 21:35     ` David Turner
2017-09-18 13:07       ` Ben Peart
2017-09-18 13:32         ` David Turner
2017-09-18 13:49           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-15 19:43     ` David Turner
2017-09-18 13:27       ` Ben Peart
2017-09-17  8:03     ` Junio C Hamano
2017-09-18 13:29       ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-15 20:34     ` David Turner
2017-09-15 19:20   ` [PATCH v6 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-17  8:02     ` Junio C Hamano
2017-09-18 13:38       ` Ben Peart
2017-09-18 15:43         ` Torsten Bögershausen
2017-09-18 16:28           ` Ben Peart
2017-09-19 14:16             ` Torsten Bögershausen
2017-09-19 15:36               ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 20:43     ` Jonathan Nieder
2017-09-20 17:11       ` Ben Peart
2017-09-20 17:46         ` Jonathan Nieder
2017-09-21  0:05           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-15 22:00     ` David Turner
2017-09-19 19:32       ` David Turner
2017-09-19 20:30         ` Ben Peart
2017-09-16 15:27     ` Torsten Bögershausen
2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
2017-09-19 20:37         ` Jonathan Nieder
2017-09-20 13:49           ` Torsten Bögershausen
2017-09-22  1:04             ` Junio C Hamano
2017-09-18 14:06       ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-17  4:47     ` Junio C Hamano
2017-09-18 15:25       ` Ben Peart
2017-09-19 20:34         ` Jonathan Nieder
2017-09-15 19:20   ` [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
2017-09-15 21:56     ` David Turner
2017-09-18 14:24     ` Johannes Schindelin
2017-09-18 18:19       ` Ben Peart
2017-09-19 15:28         ` Johannes Schindelin
2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-20 22:06       ` Stefan Beller
2017-09-21  0:02         ` Ben Peart
2017-09-21  0:44           ` Stefan Beller
2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-20  5:47       ` Junio C Hamano
2017-09-20 14:58         ` Ben Peart
2017-09-21  1:46           ` Junio C Hamano
2017-09-21  2:06             ` Ben Peart
2017-09-21  2:18               ` Junio C Hamano
2017-09-21  2:32                 ` Junio C Hamano
2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-20  2:28       ` Junio C Hamano
2017-09-20 16:19         ` Ben Peart
2017-09-21  2:00           ` Junio C Hamano
2017-09-21  2:24             ` Ben Peart
2017-09-21 14:35               ` Ben Peart
2017-09-22  1:02                 ` Junio C Hamano
2017-09-20  6:23       ` Junio C Hamano
2017-09-20 16:29         ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-20 10:00       ` Martin Ågren
2017-09-20 17:02         ` Ben Peart
2017-09-20 17:11           ` Martin Ågren
2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-19 19:46       ` David Turner
2017-09-19 20:44         ` Ben Peart
2017-09-19 21:27           ` David Turner
2017-09-19 22:44             ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-19 19:27     ` [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 19:27     ` [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-19 19:27     ` [PATCH v7 12/12] fsmonitor: add a performance test Ben Peart
2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:31           ` Ben Peart
2017-09-24  3:51             ` Jeff King
2017-09-24  3:52             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-22 16:35       ` [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-22 16:35       ` [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-22 16:35       ` [PATCH v8 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:33           ` Ben Peart
2017-09-24  3:51             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-22 16:35       ` [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-22 16:35       ` [PATCH v8 12/12] fsmonitor: add a performance test Ben Peart
2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
2017-09-29 12:07         ` Ben Peart
2017-10-01  8:24           ` Junio C Hamano
2017-10-03 19:48             ` Ben Peart
2017-10-04  2:09               ` Junio C Hamano
2017-10-04  6:38                 ` Alex Vandiver
2017-10-04 12:48                   ` Ben Peart
2017-10-04 12:27                 ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170915192043.4516-1-benpeart@microsoft.com \
    --to=benpeart@microsoft.com \
    --cc=David.Turner@twosigma.com \
    --cc=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.