git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Piotr Krukowiecki <piotr.krukowiecki@gmail.com>
Cc: Thomas Rast <trast@inf.ethz.ch>,
	Git Mailing List <git@vger.kernel.org>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Subject: Re: git status: small difference between stating whole repository and small subdirectory
Date: Wed, 15 Feb 2012 14:03:18 -0500	[thread overview]
Message-ID: <20120215190318.GA5992@sigill.intra.peff.net> (raw)
In-Reply-To: <CAA01Cso_8=159UDMFUHiYz1X=gYtpbqRO4h3TMw7N=4YMV8YNg@mail.gmail.com>

On Wed, Feb 15, 2012 at 09:57:29AM +0100, Piotr Krukowiecki wrote:

> All is on local disk and system is idle.
> 
> Indeed, after gc the times went down:
> 10s -> 2.3s (subdirectory)
> 17s -> 9.5s (whole repo)
> 
> 2 seconds is much better and I'd say acceptable for me. But my questions are:

Obviously these answers didn't come from any deep analysis, but are
educated guesses from me based on previous performance patterns we've
seen on the list:

> - why is it so slow with not packed repo?

Your numbers show that you're I/O-bound:

>>> $ time git status    > /dev/null
>>> real    0m41.670s
>>> user    0m0.980s
>>> sys     0m2.908s
>>>
>>> $ time git status -- src/.../somedir   > /dev/null
>>> real    0m17.380s
>>> user    0m0.748s
>>> sys     0m0.328s

which is not surprising, since you said you dropped caches before-hand.
Repacking probably reduced your disk footprint by a lot, which meant
less I/O.

I notice that you're still I/O bound even after the repack:

> $ time git status  -- .
> real    0m2.503s
> user    0m0.160s
> sys     0m0.096s
> 
> $ time git status
> real    0m9.663s
> user    0m0.232s
> sys     0m0.556s

Did you drop caches here, too?  Usually that would not be the case on a
warm cache. If it is, then it sounds like you are short on memory to
actually hold the directory tree and object db in cache. If not, what do
the warm cache numbers look like?

> - can it be faster without repacking?

Not really. You're showing an I/O problem, and repacking is git's way of
reducing I/O.

> - even with packed repo, the time on small subdirectory is much higher
> than I'd expect given time on whole repo and subdirectory size - why?

Hard to say without profiling.  It may be that we reduced the object db
lookups, saving some time, but still end up stat()ing the whole tree.
The optimization to stat only the directories of interest was in 688cd6d
(status: only touch path we may need to check, 2010-01-14), which went
into v1.7.0. What version of git are you using?

-Peff

  parent reply	other threads:[~2012-02-15 19:03 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-10  9:42 git status: small difference between stating whole repository and small subdirectory Piotr Krukowiecki
2012-02-10 12:33 ` Nguyen Thai Ngoc Duy
2012-02-10 13:46   ` Piotr Krukowiecki
2012-02-10 14:37     ` Nguyen Thai Ngoc Duy
2012-02-13 16:54       ` Piotr Krukowiecki
2012-02-10 16:18 ` Piotr Krukowiecki
2012-02-14 11:34   ` Thomas Rast
2012-02-15  8:57     ` Piotr Krukowiecki
2012-02-15 11:01       ` Nguyen Thai Ngoc Duy
2012-02-15 15:14         ` Piotr Krukowiecki
2012-02-16 13:22           ` Piotr Krukowiecki
2012-02-15 19:03       ` Jeff King [this message]
2012-02-16 13:37         ` Piotr Krukowiecki
2012-02-16 14:05           ` Thomas Rast
2012-02-16 20:15             ` Junio C Hamano
2012-02-17 16:55             ` Piotr Krukowiecki
2012-02-16 19:20           ` Jeff King
2012-02-17 17:19             ` Piotr Krukowiecki
2012-02-17 20:37               ` Jeff King
2012-02-17 22:25                 ` Junio C Hamano
2012-02-17 22:29                   ` Jeff King
2012-02-20  8:25                     ` Piotr Krukowiecki
2012-02-20 14:06                       ` Jeff King
2012-02-20 14:09                         ` Thomas Rast
2012-02-20 14:36                           ` Nguyen Thai Ngoc Duy
2012-02-20 14:39                             ` Jeff King
2012-02-20 15:11                               ` Jeff King
2012-02-20 18:45                                 ` Thomas Rast
2012-02-20 20:35                                   ` Jeff King
2012-02-20 22:04                                     ` Junio C Hamano
2012-02-20 22:41                                       ` Jeff King
2012-02-20 23:31                                         ` Junio C Hamano
2012-02-21  7:21                                           ` Piotr Krukowiecki
2012-02-20 20:08                                 ` Junio C Hamano
2012-02-20 20:17                                   ` Jeff King
2012-02-21 14:45                             ` Nguyen Thai Ngoc Duy
2012-02-21 19:16                               ` Junio C Hamano
2012-02-22  2:12                                 ` Nguyen Thai Ngoc Duy
2012-02-22  2:55                                   ` Junio C Hamano
2012-02-22 12:54                                     ` Nguyen Thai Ngoc Duy
2012-02-22 13:17                                       ` Thomas Rast
2012-02-22 10:34                                 ` Nguyen Thai Ngoc Duy
2012-02-22  3:32                               ` Junio C Hamano
2012-04-10 15:16                                 ` Piotr Krukowiecki
2012-04-10 16:23                                   ` Junio C Hamano
2012-04-10 18:00                                     ` Jeff King
2012-02-20 19:57                           ` Junio C Hamano
2012-02-20 19:59                             ` Thomas Rast
2012-02-20 14:16                         ` Nguyen Thai Ngoc Duy
2012-02-20 14:22                           ` Jeff King
2012-02-20 19:56                         ` Junio C Hamano
2012-02-20 20:09                           ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120215190318.GA5992@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=piotr.krukowiecki@gmail.com \
    --cc=trast@inf.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).