From: A Large Angry SCM <gitzilla@gmail.com>
To: Dun Peal <dunpealer@gmail.com>
Cc: Joshua Jensen <jjensen@workspacewhiz.com>,
	Tay Ray Chuan <rctay89@gmail.com>,
	Wilbert van Dolleweerd <wilbert@arentheym.com>,
	Git ML <git@vger.kernel.org>
Subject: Re: Inexplicably deteriorating performance of Git repositories on Windows
Date: Wed, 24 Nov 2010 16:18:31 -0500
Message-ID: <4CED8127.8060505@gmail.com>
In-Reply-To: <AANLkTi=X724OJgUvG0Ggu3OwxyaJprr9CLL+t+x=MbTO@mail.gmail.com>

On 11/24/2010 04:00 PM, Dun Peal wrote:
> On Wed, Nov 24, 2010 at 5:16 PM, Joshua Jensen
> <jjensen@workspacewhiz.com>  wrote:
>> Whenever I want to know exactly what is going on with disk access, I
>> download Process Monitor from http://sysinternals.com/.
>>
>> In order to just show disk access, I filter entries that begin with TCP,
>> UDP, and Reg out.
>>
>> Josh
>
> Thanks, we tried that and we don't see a whole lot of disk activity on
> the "fast" machines.
>
> One emerging theory is that the "slow" Windows machines differ from
> the "fast" ones by how their disk cache works.
>
> So `git status` on a large tree heavily depends on caching. Without
> it, it would be slow; with it, it's much faster.
>
> We verified that part since when we reboot a fast Windows machine, the
> first run of `git status` is slow (~30s) but the next one is much
> faster (~5s).
>
> We see a similar phenomenon on Linux: the first run is always
> significantly slower than the others.
>
> On slow Windows machines, this difference is much less pronounced.
>
> On a typical "slow" machine, if you clone the repo, the first run of
> `git status` on it would already be fast (5s). But then you reboot,
> and the first run is slow, but then it only gets up to 14s. And you
> can't get back the 5s latency unless you re-clone the repo and status
> the fresh clone.
>
> So my theory is that there's a cache that on the "fast" machines
> aggressively caches the entire tree on a regular `git status` run. On
> such a machine, it's enough to run `git status` once, and after that
> initial cold run, the rest will be warm... until you reboot the
> machine, rinse, repeat.
>
> On a slow machine, however, cache isn't so aggressive. It might be
> write-oriented. So when you write out a whole new working tree, that
> tree gets cached as it is written. And for the remainder of the
> lifetime of that cache, you get the fully-cached performance you see
> on the "fast" machines. But then you reboot the machine, and lose the
> cache. And since the caching process isn't aggressive, any number of
> `git status` runs won't get you back to the fully cached state. You
> will only get that on a newly written working copy.
>
> What do you think?

How much memory do the fast and slow machines have? How much memory will
Windows use for disk caching? Is it possible that your normal workflow
between `git status` runs is forcing the caches to be pruned due to
memory pressure?
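
One way to put numbers on the cold/warm difference described above would
be to time several consecutive `git status` runs on each machine, once
right after a reboot and once after your normal workflow. The following
is only a rough sketch: the helper name `time_runs` and the default of
five runs are my own choices for illustration, and it measures wall-clock
time only, so it says nothing about what the disk cache is actually doing.

```python
import subprocess
import sys
import time


def time_runs(cmd, runs=5, cwd=None):
    """Run cmd `runs` times and return the wall-clock seconds of each run.

    Hypothetical helper for illustration; not part of Git.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time is of interest here.
        subprocess.run(cmd, cwd=cwd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Optional first argument: path to the working tree (defaults to ".").
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    for i, secs in enumerate(time_runs(["git", "status"], cwd=repo), start=1):
        print(f"run {i}: {secs:.2f}s")
```

If the first run is much slower than the rest and later runs stay fast,
that would be consistent with the cache being populated by reads. If the
times creep back up after other work, memory pressure evicting the cache
would be worth looking at.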

