git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jeff King <peff@peff.net>
Cc: John Keeping <john@keeping.me.uk>,
	Jim Kinsman <jakinsman@gmail.com>,
	Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: git status takes 30 seconds on Windows 7. Why?
Date: Wed, 27 Mar 2013 12:27:23 -0700
Message-ID: <CA+55aFypcwbLwPLq++AU9FggCKLYkgkuN6i-gOD9pRioH1Dz2g@mail.gmail.com>
In-Reply-To: <20130327190425.GA26380@sigill.intra.peff.net>

On Wed, Mar 27, 2013 at 12:04 PM, Jeff King <peff@peff.net> wrote:
>
> Yes, I think that's pretty much the case (though most of my
> Git-on-Windows experience is from cygwin long ago, where the stat
> performance was truly horrendous). Have you tried setting
> core.preloadindex, which should run the stats in parallel?

I wonder if preloadindex shouldn't be enabled by default. It's a huge
deal on NFS, and the only real downside is that it expects threading
to work. It potentially slows things down a tiny bit for single-CPU
cases with everything cached, but that isn't likely to be a relevant
case.

Of course, it can trigger filesystem scalability issues, and as a
result it will often not help very much if you have the bulk of your
files in one (or a few) directories. But anybody who has so many files
that performance is an issue is not likely to have them all in one
place.

And apparently the Windows FS metadata caching sucks, and things fall
out of the cache for large trees. Color me not-very-surprised. It's
probably some size limit on the metadata that you can tweak. So I'm
sure there's some registry setting or other that would make windows
able to cache more than a few thousand filenames, and it would
probably improve performance a lot, but I do think preloadindex has
been around long enough that it could just be the default.

Of course, Jim should verify that preloadindex actually does solve his
problem.  With 20k+ files, it should max out the 20 IO threads for
preloading, and assuming the filesystem IO scales reasonably well, it
should fix the problem. But we do do a number of metadata ops
synchronously even with preloadindex, so things won't scale perfectly.
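
To make the preloading idea concrete, here is a minimal sketch in Python
(not git's actual C code): split the list of tracked paths into slices
and lstat each slice on its own thread. The cap of 20 threads comes from
the paragraph above; FILES_PER_THREAD and all function names are
assumptions made up for this illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 20         # the "20 IO threads" cap mentioned above
FILES_PER_THREAD = 500   # assumed threshold, for illustration only


def thread_count(nr_files):
    """Pick a worker count: one per FILES_PER_THREAD files, capped."""
    return max(1, min(MAX_THREADS, nr_files // FILES_PER_THREAD))


def lstat_chunk(paths):
    """lstat one slice of paths; paths that cannot be stat'ed map to None."""
    results = {}
    for path in paths:
        try:
            results[path] = os.lstat(path)
        except OSError:
            results[path] = None
    return results


def preload_stats(paths):
    """lstat every path, splitting the list into per-thread slices."""
    if not paths:
        return {}
    workers = thread_count(len(paths))
    chunk = -(-len(paths) // workers)  # ceiling division
    slices = [paths[i:i + chunk] for i in range(0, len(paths), chunk)]
    stats = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(lstat_chunk, slices):
            stats.update(partial)
    return stats
```

Under these assumed numbers, a tree with 20k files asks for 40 workers
and is clamped to 20, which is the "max out" case described above; a
tree with a few hundred files stays on a single thread.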

(In particular: we do open each directory, do the readdir stuff, and
try to open .gitignore whether it exists or not. So you'll get
synchronous IO for each directory, but at least the per-file IO to
check all the file stat data should scale).
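
As a rough way to see how much of that per-directory work there is, the
following Python sketch counts the directories that contain tracked
files, parents included. It is only a lower bound under stated
assumptions: untracked directories are not counted, and the helper name
and the use of "" for the top of the work tree are hypothetical choices
for this example.

```python
import posixpath


def directories_to_scan(index_paths):
    """Return every directory (parents included) that holds a tracked path."""
    dirs = {""}  # the top of the work tree, represented as ""
    for path in index_paths:
        parent = posixpath.dirname(path)
        # Walk upward until we reach the top or a directory already seen.
        while parent and parent not in dirs:
            dirs.add(parent)
            parent = posixpath.dirname(parent)
    return dirs
```

Each directory in the returned set would cost one open, one readdir
pass, and one .gitignore probe in sequence, so a tree with many small
directories gains less from the parallel stat step than a tree with the
same number of files in fewer directories.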

             Linus
