git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Ben Peart" <benpeart@microsoft.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>,
	"David Turner" <David.Turner@twosigma.com>,
	"Jeff King" <peff@peff.net>,
	"Christian Couder" <christian.couder@gmail.com>
Subject: Re: [PATCH v4 0/6] Fast git status via a file system watcher
Date: Thu, 1 Jun 2017 20:40:20 -0400	[thread overview]
Message-ID: <4c3fed40-774b-8ae6-fa1b-50efe6ef552f@gmail.com> (raw)
In-Reply-To: <CACBZZX4eZ3G8LQ8O+_BkbkJ-ZXTOkUi9cW=QKYjfHKtmA3pgrA@mail.gmail.com>

Any chance you can provide me with a bash script that contains the exact 
sequence of commands you are running to get this result?  I've been 
trying to replicate it using your notes but have not been able to.  I'd 
like to see if it is a repo difference, a platform difference, a command 
sequence difference (or something else entirely :)).

Thanks,

Ben

On 6/1/2017 5:13 PM, Ævar Arnfjörð Bjarmason wrote:
> On Thu, Jun 1, 2017 at 10:51 PM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Thu, Jun 1, 2017 at 5:50 PM, Ben Peart <peartben@gmail.com> wrote:
>>> Changes from V3 include:
>>>   - update test script based on feedback
>>>   - update template hook proc with better post-processing code and make
>>>     it executable
>>
>> I have watchman running finally, so aside from issues applying this I
>> can finally test this. I set up a watch of linux.git:
>>
>> $ watchman watch $PWD
>> $ watchman watch-list|jq '.roots[]'
>> "/home/avar/g/linux"
>>
>> And first, for simplicity, I'll be turning core.fsmonitore on later:
>>
>> $ for c in fsmonitor untrackedCache splitIndex; do git config core.$c
>> false; done
>>
>> On a hot fs cache running "git status" takes me 80ms, but if I flush caches:
>>
>> # free && sync && echo 3 > /proc/sys/vm/drop_caches && free
>>
>> Running "git status" takes me 4s. Now, right after flushing those caches:
>>
>> $ date +%s
>> 1496349235
>> $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235,
>> "fields":["name"]}]' | watchman -j) | jq '.files[]'
>>
>> real    0m0.664s
>> user    0m0.000s
>> sys     0m0.004s
>> $ touch foo bar
>> $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235,
>> "fields":["name"]}]' | watchman -j) | jq '.files[]'
>> "bar"
>> "foo"
>>
>> real    0m0.044s
>> user    0m0.000s
>> sys     0m0.000s
>>
>> Without the monitor running "git status' takes me ~2.5s, i.e. from cold cache:
>>
>> $ time GIT_TRACE=1 ~/g/git/git-status
>> 20:34:49.154731 git.c:369               trace: built-in: git 'status'
>> On branch master
>> Your branch is up-to-date with 'origin/master'.
>>
>> Untracked files:
>>    (use "git add <file>..." to include in what will be committed)
>>
>>          bar
>>          foo
>>
>> nothing added to commit but untracked files present (use "git add" to track)
>>
>> real    0m2.546s
>> user    0m0.060s
>> sys     0m0.228s
>>
>> Now, I would expect the case where the working tree isn't in the fs
>> cache to be lightning fast, since the watchman command is really fast
>> (flushed the cache again, turned on the fsmonitor):
>>
>> $ date +%s
>> 1496349523
>> $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523,
>> "fields":["name"]}]' | watchman -j) | jq '.files[]'
>>
>> real    0m0.529s
>> user    0m0.000s
>> sys     0m0.004s
>> $ touch foo bar
>> $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523,
>> "fields":["name"]}]' | watchman -j) | jq '.files[]'
>> "bar"
>> "foo"
>>
>> real    0m0.312s
>> user    0m0.000s
>> sys     0m0.000s
>>
>> But if I run "git status" (and I instrumented the hook to dump the
>> changed files) it takes a long time (those 10s are just the disk being
>> slow, but it should be ~0s, right?):
>>
>> $ time GIT_TRACE=1 ~/g/git/git-status
>> 20:39:11.250128 git.c:369               trace: built-in: git 'status'
>> 20:39:14.586720 run-command.c:626       trace: run_command:
>> '.git/hooks/query-fsmonitor' '1' '1496349494197821000'
>> Watchman says these changed: bar foo hi there .git .git/index.lock .git/index
>> On branch master
>> Your branch is up-to-date with 'origin/master'.
>>
>> Untracked files:
>>    (use "git add <file>..." to include in what will be committed)
>>
>>          bar
>>          foo
>>
>> nothing added to commit but untracked files present (use "git add" to track)
>>
>> real    0m10.791s
>> user    0m0.108s
>> sys     0m0.228s
>>
>> If I re-run that it takes ~160-200ms with the hook, 80-120ms without
>> it, so in this case once the data is in the fscache watchman is
>> actually slower.
>>
>> But manually running watchman shows that it's really fast, usually
>> returning within 3ms, I flush the fscache in the middle of this,
>> that's the 600ms time, that could be some global kernel lock or
>> something due to the odd manual proc operation, I haven't actually
>> tested this on a system under memory pressure:
>>
>> $ for i in {1..10}; do (time (printf '["query", "/home/avar/g/linux",
>> {"since": %s, "fields":["name"]}]' $(($(date +%s)-3)) | watchman -j |
>> jq '.files[]')) 2>&1 | grep -e '"' -e ^real && touch $i && sleep 1;
>> done
>> real    0m0.004s
>> "1"
>> real    0m0.003s
>> "2"
>> "1"
>> real    0m0.605s
>> "3"
>> real    0m0.003s
>> "4"
>> "3"
>> real    0m0.003s
>> "5"
>> "4"
>> real    0m0.003s
>> "6"
>> "5"
>> real    0m0.003s
>> "7"
>> "6"
>> real    0m0.003s
>> "8"
>> "7"
>> real    0m0.003s
>> "9"
>> "8"
>> real    0m0.003s
>>
>> So something really odd is going on here. This should be speeding up
>> "git status" a lot, even with a hot fs cache doing stat on all of
>> linux.git takes a lot longer than 3ms, but for some reason it's
>> slower.
> 
> Dug a bit further, with cold cache and no fsmonitor, gprof output from
> -O0 compiled git:
> 
> Each sample counts as 0.01 seconds.
>    %   cumulative   self              self     total
>   time   seconds   seconds    calls  ms/call  ms/call  name
>   37.50      0.03     0.03   123886     0.00     0.00  memihash
>   12.50      0.04     0.01   284727     0.00     0.00  match_basename
>   12.50      0.05     0.01    73989     0.00     0.00  match_pathname
>   12.50      0.06     0.01    59924     0.00     0.00  hashmap_add
>   12.50      0.07     0.01    59847     0.00     0.00  same_name
>   12.50      0.08     0.01    59844     0.00     0.00  hash_index_entry
>    0.00      0.08     0.00   598446     0.00     0.00  git_bswap32
>    0.00      0.08     0.00   259150     0.00     0.00  fspathncmp
>    0.00      0.08     0.00   178731     0.00     0.00  match_pathspec
>    0.00      0.08     0.00   178655     0.00     0.00  do_match_pathspec
> And then cold cache with fsmonitor:
> 
> Each sample counts as 0.01 seconds.
>    %   cumulative   self              self     total
>   time   seconds   seconds    calls  ms/call  ms/call  name
>   66.67      0.04     0.04    96696     0.00     0.00  blk_SHA1_Block
>   16.67      0.05     0.01    59844     0.00     0.00  cache_entry_from_ondisk
>   16.67      0.06     0.01    58869     0.00     0.00  longest_path_match
>    0.00      0.06     0.00  1547157     0.00     0.00  git_bswap32
>    0.00      0.06     0.00  1196893     0.00     0.00  git_bswap32
>    0.00      0.06     0.00   284727     0.00     0.00  match_basename
>    0.00      0.06     0.00   259150     0.00     0.00  fspathncmp
>    0.00      0.06     0.00   178561     0.00     0.00  do_match_pathspec
>    0.00      0.06     0.00   178543     0.00     0.00  match_pathspec
>    0.00      0.06     0.00   145141     0.00     0.00  memory_limit_check
> 

  reply	other threads:[~2017-06-02  0:40 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-01 15:50 [PATCH v4 0/6] Fast git status via a file system watcher Ben Peart
2017-06-01 15:51 ` [PATCH v4 1/6] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-06-01 15:51 ` [PATCH v4 2/6] dir: make lookup_untracked() available outside of dir.c Ben Peart
2017-06-01 15:51 ` [PATCH v4 3/6] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-06-01 15:51 ` [PATCH v4 4/6] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-06-01 15:51 ` [PATCH v4 5/6] fsmonitor: add documentation for the " Ben Peart
2017-06-01 15:51 ` [PATCH v4 6/6] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
2017-06-07 21:38   ` Ævar Arnfjörð Bjarmason
2017-06-01 19:57 ` [PATCH v4 0/6] Fast git status via a file system watcher Ævar Arnfjörð Bjarmason
2017-06-01 21:06   ` Ben Peart
2017-06-01 21:12     ` Ævar Arnfjörð Bjarmason
2017-06-01 21:13     ` Stefan Beller
2017-06-01 21:26       ` Jeff King
2017-06-01 20:51 ` Ævar Arnfjörð Bjarmason
2017-06-01 21:13   ` Ævar Arnfjörð Bjarmason
2017-06-02  0:40     ` Ben Peart [this message]
2017-06-02 10:28       ` [WIP/PATCH 7/6] perf: add a performance test for core.fsmonitor Ævar Arnfjörð Bjarmason
2017-06-02 21:44         ` David Turner
2017-06-03 18:08           ` Ævar Arnfjörð Bjarmason
2017-06-05 14:27           ` Ben Peart
2017-06-02 22:05         ` Ben Peart
2017-06-02 23:06           ` Ævar Arnfjörð Bjarmason
2017-06-07 19:51             ` Ben Peart
2017-06-07 21:46               ` Ævar Arnfjörð Bjarmason
2017-06-08  1:57                 ` Ben Peart
2017-06-04  1:59         ` Junio C Hamano
2017-06-04  7:46           ` Ævar Arnfjörð Bjarmason
2017-06-04  8:21             ` Jeff King
2017-06-02  1:56 ` [PATCH v4 0/6] Fast git status via a file system watcher Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c3fed40-774b-8ae6-fa1b-50efe6ef552f@gmail.com \
    --to=peartben@gmail.com \
    --cc=David.Turner@twosigma.com \
    --cc=avarab@gmail.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).