git.vger.kernel.org archive mirror
* git status takes 30 seconds on Windows 7. Why?
@ 2013-03-27 16:39 Jim Kinsman
  2013-03-27 16:44 ` Andreas Ericsson
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Jim Kinsman @ 2013-03-27 16:39 UTC (permalink / raw)
  To: git

git status takes 30 seconds on Windows 7. Here are some stats:
git ls-files | wc -l
27330

git ls-files -o | wc -l
4

$ git diff --name-only | xargs du -chs
68K     update_import_contacts.php
68K     total

What can I do??? This is so slow it is unbearable.
By the way i've done git gc several times and nothing changed.


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 16:39 git status takes 30 seconds on Windows 7. Why? Jim Kinsman
@ 2013-03-27 16:44 ` Andreas Ericsson
  2013-03-27 17:02 ` Konstantin Khomoutov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Andreas Ericsson @ 2013-03-27 16:44 UTC (permalink / raw)
  To: Jim Kinsman; +Cc: git

On 03/27/2013 05:39 PM, Jim Kinsman wrote:
> git status takes 30 seconds on Windows 7. Here are some stats:
> git ls-files | wc -l
> 27330
> 
> git ls-files -o | wc -l
> 4
> 
> $ git diff --name-only | xargs du -chs
> 68K     update_import_contacts.php
> 68K     total
> 
> What can I do??? This is so slow it is unbearable.
> By the way i've done git gc several times and nothing changed.

I'm guessing it's the disk that's so slow. I accidentally put a git
repo on a network-mounted drive once. With 20ms round-trip time to
the server, git operations took forever.

Could you try it on a disk you know is local? Preferably a solid-state
drive. If it's still slow there, we know for sure something's
broken inside git. If switching media causes git to become fast,
you'll know it's a hardware problem.
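To see why a network mount hurts so much, here is a rough back-of-the-envelope
sketch. It assumes one serial network round trip per tracked file, which is a
simplification of mine and not a measurement; the function name is
hypothetical. The 20 ms figure is from the anecdote above and 27330 is the
file count from the original report.

```shell
# Rough estimate only: assumes each tracked file costs exactly one
# serial round trip to the file server (a simplifying assumption).
serial_stat_seconds() {
    files=$1     # number of tracked files
    rtt_ms=$2    # round-trip time to the server, in milliseconds
    echo $(( files * rtt_ms / 1000 ))
}

serial_stat_seconds 27330 20
```

Under that assumption the estimate comes out at roughly nine minutes, so even
a modest round-trip time dominates everything else git does.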

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 16:39 git status takes 30 seconds on Windows 7. Why? Jim Kinsman
  2013-03-27 16:44 ` Andreas Ericsson
@ 2013-03-27 17:02 ` Konstantin Khomoutov
  2013-03-27 17:17 ` Matthieu Moy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Konstantin Khomoutov @ 2013-03-27 17:02 UTC (permalink / raw)
  To: Jim Kinsman; +Cc: git

On Wed, 27 Mar 2013 11:39:31 -0500
Jim Kinsman <jakinsman@gmail.com> wrote:

> git status takes 30 seconds on Windows 7. Here are some stats:
[...]
> What can I do??? This is so slow it is unbearable.
> By the way i've done git gc several times and nothing changed.

You could try some voodoo [1] or experimental caching features [2].

1. http://groups.google.com/group/msysgit/browse_thread/thread/02e3c0e046f07215
2. http://groups.google.com/group/msysgit/browse_thread/thread/7cbfe3ca452650d1/93ce48e3875f7416


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 16:39 git status takes 30 seconds on Windows 7. Why? Jim Kinsman
  2013-03-27 16:44 ` Andreas Ericsson
  2013-03-27 17:02 ` Konstantin Khomoutov
@ 2013-03-27 17:17 ` Matthieu Moy
  2013-03-27 18:15   ` Jim Kinsman
  2013-03-27 17:22 ` John Keeping
  2013-03-28  1:19 ` Duy Nguyen
  4 siblings, 1 reply; 12+ messages in thread
From: Matthieu Moy @ 2013-03-27 17:17 UTC (permalink / raw)
  To: Jim Kinsman; +Cc: git

Jim Kinsman <jakinsman@gmail.com> writes:

> git status takes 30 seconds on Windows 7.

Any anti-virus installed? They can interfere badly with disk-intensive
tasks ...

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 16:39 git status takes 30 seconds on Windows 7. Why? Jim Kinsman
                   ` (2 preceding siblings ...)
  2013-03-27 17:17 ` Matthieu Moy
@ 2013-03-27 17:22 ` John Keeping
  2013-03-28  1:19 ` Duy Nguyen
  4 siblings, 0 replies; 12+ messages in thread
From: John Keeping @ 2013-03-27 17:22 UTC (permalink / raw)
  To: Jim Kinsman; +Cc: git

On Wed, Mar 27, 2013 at 11:39:31AM -0500, Jim Kinsman wrote:
> git status takes 30 seconds on Windows 7. Here are some stats:
> git ls-files | wc -l
> 27330
> 
> git ls-files -o | wc -l
> 4
> 
> $ git diff --name-only | xargs du -chs
> 68K     update_import_contacts.php
> 68K     total
> 
> What can I do??? This is so slow it is unbearable.
> By the way i've done git gc several times and nothing changed.

Can you run these commands under "time" so that we can see that it's
definitely the "git ls-files" taking 30 seconds and not something in
$PS1?
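In a shell that has the "time" keyword, prefixing the command with it is
enough. Where it is not available, a small wrapper along these lines might do.
This is only a sketch with whole-second resolution, and the function name is
made up for illustration.

```shell
# Runs the given command with its output discarded and prints how many
# whole seconds it took. Resolution is one second, so short runs print 0.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}
```

For example, "elapsed_seconds git status" and "elapsed_seconds git ls-files",
run one after the other in the same repository, show which of the two
commands is the slow one without the prompt being involved.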


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 17:17 ` Matthieu Moy
@ 2013-03-27 18:15   ` Jim Kinsman
  2013-03-27 18:46     ` John Keeping
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Kinsman @ 2013-03-27 18:15 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git

The only anti-virus I have installed is Microsoft Security Essentials.
I turned it off and it was still the same:
$ cat /usr/bin/gitstatus
start_time=`date +%s`
git status && echo run time is $(expr `date +%s` - $start_time) s


$ gitstatus
# On branch test
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   orgoptions.php
#       modified:   update_import_contacts.php
#
no changes added to commit (use "git add" and/or "git commit -a")
run time is 10 s

On Wed, Mar 27, 2013 at 12:17 PM, Matthieu Moy
<Matthieu.Moy@grenoble-inp.fr> wrote:
> Jim Kinsman <jakinsman@gmail.com> writes:
>
>> git status takes 30 seconds on Windows 7.
>
> Any anti-virus installed? They can interfere badly with disk-intensive
> tasks ...
>
> --
> Matthieu Moy
> http://www-verimag.imag.fr/~moy/


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 18:15   ` Jim Kinsman
@ 2013-03-27 18:46     ` John Keeping
  2013-03-27 19:04       ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: John Keeping @ 2013-03-27 18:46 UTC (permalink / raw)
  To: Jim Kinsman; +Cc: Matthieu Moy, git

On Wed, Mar 27, 2013 at 01:15:43PM -0500, Jim Kinsman wrote:
> The only anti-virus I have installed is Microsoft Security Essentials.
> I turned it off and it was still the same:
> $ cat /usr/bin/gitstatus
> start_time=`date +%s`
> git status && echo run time is $(expr `date +%s` - $start_time) s
> 
> 
> $ gitstatus
> # On branch test
> # Changes not staged for commit:
> #   (use "git add <file>..." to update what will be committed)
> #   (use "git checkout -- <file>..." to discard changes in working directory)
> #
> #       modified:   orgoptions.php
> #       modified:   update_import_contacts.php
> #
> no changes added to commit (use "git add" and/or "git commit -a")
> run time is 10 s

That doesn't seem hugely surprising to me.  I have a moderately sized
repository (3047 files, although it's Java so there are some deep trees)
and I get the following (Vista on a reasonably old laptop, best of 3,
Git version 1.8.1.msysgit.1):

$ time git ls-files >/dev/null

real	0m0.047s
user	0m0.015s
sys	0m0.015s

$ time git status >/dev/null

real	0m2.715s
user	0m0.000s
sys	0m0.031s


I'm not sure the "user" and "sys" times are correct, but the "real"
times feel right.  By comparison, on Linux on a much newer machine (so
not much of a comparison) on the same repository:

$ time git status >/dev/null

real	0m0.347s
user	0m0.171s
sys	0m0.167s


I think the simple reality is that Git was written with the assumption
that stat is cheap and that isn't really the case on Windows, where the
filesystem cache doesn't seem to do that well with this.  It may be that
Git's Windows compatibility code could be made more efficient, but I
know nothing about that, although a quick look in compat/mingw.c
indicates that Git does already use its own stat implementations in
place of the MSys ones in search of speed.


John


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 18:46     ` John Keeping
@ 2013-03-27 19:04       ` Jeff King
  2013-03-27 19:27         ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2013-03-27 19:04 UTC (permalink / raw)
  To: John Keeping; +Cc: Jim Kinsman, Matthieu Moy, git

On Wed, Mar 27, 2013 at 06:46:57PM +0000, John Keeping wrote:

> I think the simple reality is that Git was written with the assumption
> that stat is cheap and that isn't really the case on Windows, where the
> filesystem cache doesn't seem to do that well with this.

Yes, I think that's pretty much the case (though most of my
Git-on-Windows experience is from cygwin long ago, where the stat
performance was truly horrendous). Have you tried setting
core.preloadindex, which should run the stats in parallel?

-Peff
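For anyone who wants to try it, the setting can be switched on per repository
as sketched below. The wrapper function is only there for illustration and
its name is hypothetical; the configuration key is the one named above.

```shell
# Turns on parallel index preloading for the repository in the
# current directory. Must be run from inside a git work tree.
enable_preload() {
    git config core.preloadindex true
}
```

Run it inside the slow repository and time "git status" again. Adding
"--global" to the "git config" call would apply the setting to every
repository for the current user.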


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 19:04       ` Jeff King
@ 2013-03-27 19:27         ` Linus Torvalds
  2013-03-27 20:00           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2013-03-27 19:27 UTC (permalink / raw)
  To: Jeff King; +Cc: John Keeping, Jim Kinsman, Matthieu Moy, Git Mailing List

On Wed, Mar 27, 2013 at 12:04 PM, Jeff King <peff@peff.net> wrote:
>
> Yes, I think that's pretty much the case (though most of my
> Git-on-Windows experience is from cygwin long ago, where the stat
> performance was truly horrendous). Have you tried setting
> core.preloadindex, which should run the stats in parallel?

I wonder if preloadindex shouldn't be enabled by default. It's a huge
deal on NFS, and the only real downside is that it expects threading
to work. It potentially slows things down a tiny bit for single-CPU
cases with everything cached, but that isn't likely to be a relevant
case.

Of course, it can trigger filesystem scalability issues, and as a
result it will often not help very much if you have the bulk of your
files in one (or a few) directories. But anybody who has so many files
that performance is an issue is not likely to have them all in one
place.

And apparently the Windows FS metadata caching sucks, and things fall
out of the cache for large trees. Color me not-very-surprised. It's
probably some size limit on the metadata that you can tweak. So I'm
sure there's some registry setting or other that would make windows
able to cache more than a few thousand filenames, and it would
probably improve performance a lot, but I do think preloadindex has
been around long enough that it could just be the default.

Of course, Jim should verify that preloadindex actually does solve his
problem.  With 20k+ files, it should max out the 20 IO threads for
preloading, and assuming the filesystem IO scales reasonably well, it
should fix the problem. But we do do a number of metadata ops
synchronously even with preloadindex, so things won't scale perfectly.

(In particular: we do open each directory and do the readdir stuff and
try to open .gitignore whether it exists or not. So you'll get
synchronous IO for each directory, but at least the per-file IO to
check all the file stat data should scale).
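To get a feel for how much of that per-directory work a repository involves,
one can count the directories that hold tracked files. The sketch below is
only an approximation: it counts directories that directly contain a tracked
file, says nothing about untracked directories, and the function name is made
up.

```shell
# Reads paths on stdin, one per line, and prints the number of distinct
# directories that directly contain at least one of them.
count_dirs() {
    awk '{
        if (match($0, /\/[^\/]*$/)) print substr($0, 1, RSTART - 1)
        else print "."    # file at the top of the work tree
    }' | sort -u | wc -l | tr -d " "
}
```

Used as "git ls-files | count_dirs", the result can be compared against
"git ls-files | wc -l" to see the ratio of directories to files.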

             Linus


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 19:27         ` Linus Torvalds
@ 2013-03-27 20:00           ` Junio C Hamano
  2013-03-27 20:12             ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2013-03-27 20:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff King, John Keeping, Jim Kinsman, Matthieu Moy,
	Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, Mar 27, 2013 at 12:04 PM, Jeff King <peff@peff.net> wrote:
>>
>> Yes, I think that's pretty much the case (though most of my
>> Git-on-Windows experience is from cygwin long ago, where the stat
>> performance was truly horrendous). Have you tried setting
>> core.preloadindex, which should run the stats in parallel?
>
> I wonder if preloadindex shouldn't be enabled by default.

I am surprised that we haven't done so.

Given that we haven't tweaked the parallelism or thread-cost
parameters since the inception of the mechanism in Nov 2008, I
suspect that we would see praises from some and grievances from
other corners of the user base for a while until we find acceptable
values for them, but I agree the feature has been in use
sufficiently by some people (heh, I just discovered that I don't
have it in my config), it can be the default.


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 20:00           ` Junio C Hamano
@ 2013-03-27 20:12             ` Linus Torvalds
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2013-03-27 20:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, John Keeping, Jim Kinsman, Matthieu Moy,
	Git Mailing List

On Wed, Mar 27, 2013 at 1:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Given that we haven't tweaked the parallelism or thread-cost
> parameters since the inception of the mechanism in Nov 2008, I
> suspect that we would see praises from some and grievances from
> other corners of the user base for a while until we find acceptable
> values for them

Looking at the parameters again, I really think they are pretty sane,
and I don't think the numbers are all that likely to have shifted from
2008. The maximum thread value is quite reasonable: twenty threads is
sufficient to cover quite a bit of latency, and brings "several
seconds" down to "under half a second" for any truly IO-limited load,
while not being disastrous for the case where everything is in cache
and we only have a limited number of CPU cores.

And the "at least 500 files per thread" limit is eminently reasonable
too - smaller projects like git won't have more than five or so
threads.
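The 20/500 rule can be sketched as simple arithmetic. This is an illustration
of the rule as described here, not a copy of the git source. The function
name is hypothetical, and treating "fewer than two threads" as "no preloading"
is an assumption on my part.

```shell
# Illustrative only: one thread per 500 index entries, capped at 20.
# Assumption: with fewer than two threads there is no preloading (prints 0).
preload_threads() {
    files=$1
    max_threads=20
    files_per_thread=500
    threads=$(( files / files_per_thread ))
    if [ "$threads" -lt 2 ]; then
        echo 0
        return
    fi
    if [ "$threads" -gt "$max_threads" ]; then
        threads=$max_threads
    fi
    echo "$threads"
}
```

With the 27330 files from the original report the division gives 54, which is
capped at 20. A project with about 2600 files would get five threads.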

So I'd be very surprised if the values need much tweaking. Sure, there
might be some extreme cases that might tune for some particular
patterns, and maybe we should make the values be tunable rather than
totally hardcoded, but I suspect there's limited up-side.

It might be interesting for the people who really like tuning, though.
So in addition to "core.preloadindex=true", maybe an extended config
format like "index_preload=50,200" to say "maximum of fifty threads,
for every 200 files" could be done just so people could play around
with the numbers and see how much (if at all) they actually matter.

But I really don't think the original 20/500 rule is likely to be all
that bad for anybody. Unless there is some *really* sucky thread
library out there (ie fully user-space threads, so filename lookup
isn't actually parallelised at all), but at least for that case the
fix is to just say "ok, your threads aren't real threads, so just
disable index preloading entirely".

                 Linus


* Re: git status takes 30 seconds on Windows 7. Why?
  2013-03-27 16:39 git status takes 30 seconds on Windows 7. Why? Jim Kinsman
                   ` (3 preceding siblings ...)
  2013-03-27 17:22 ` John Keeping
@ 2013-03-28  1:19 ` Duy Nguyen
  4 siblings, 0 replies; 12+ messages in thread
From: Duy Nguyen @ 2013-03-28  1:19 UTC (permalink / raw)
  To: Jim Kinsman; +Cc: git

On Wed, Mar 27, 2013 at 11:39 PM, Jim Kinsman <jakinsman@gmail.com> wrote:
> git status takes 30 seconds on Windows 7. Here are some stats:
> git ls-files | wc -l
> 27330
>
> git ls-files -o | wc -l
> 4
>
> $ git diff --name-only | xargs du -chs
> 68K     update_import_contacts.php
> 68K     total
>
> What can I do??? This is so slow it is unbearable.
> By the way i've done git gc several times and nothing changed.

You can try "status -uno" to skip showing untracked files (and maybe
run it without -uno before committing so you don't miss files). You may
also try core.ignoreStat (but I think it's not very convenient to use).
-- 
Duy
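A small wrapper for the first suggestion might look like this. The function
name is made up for illustration; "-uno" is the short form of
"--untracked-files=no".

```shell
# Runs "git status" without scanning for untracked files.
# Any extra arguments are passed through to git status.
quick_status() {
    git status -uno "$@"
}
```

Untracked files are not listed at all in this mode, so a plain "git status"
before committing is still worth the wait.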


