git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Rast <trast@student.ethz.ch>
To: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
Cc: Eric Herman <eric@freesa.org>, <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 0/3] grep multithreading and scaling
Date: Fri, 2 Dec 2011 14:07:45 +0100	[thread overview]
Message-ID: <cover.1322830368.git.trast@student.ethz.ch> (raw)
In-Reply-To: <201111291507.04754.trast@student.ethz.ch>

[Eric, I measured some numbers that may be interesting to the
discussion about b2924dc.  See below.]

This round wraps up the original patch I posted, plus the draft patch
I posted inline the other day with René's review taken into account.
I also added a patch that rips out threading in the non-worktree case;
read on for the reasoning.

René Scharfe wrote:
> Hmm, why are [gitattributes lookups] that expensive?
> 
> callgrind tells me that userdiff_find_by_path() contributes only 0.18%
> to the total cost with your first patch.  Timings in my virtual machine
> are very volatile, but it seems that here the difference is in the
> system time while user is basically the same for all combinations of
> patches.

Well, turns out I was measuring something completely stupid.  I had

  git grep --cached -W INITRAMFS_ROOT_UID

where I put the --cached originally because that makes it independent
of the worktree (which in the very first measurements I still had
wiped, as I tend to do for this repo; I checked it out again after
that).  This in fact gives me (~/g/git-grep --cached
INITRAMFS_ROOT_UID, leaving aside -W; best of 10):

  THREADS=8:   2.88user 0.21system 0:02.94elapsed
  THREADS=4:   2.89user 0.29system 0:02.99elapsed
  THREADS=2:   2.83user 0.36system 0:02.87elapsed
  NO_PTHREADS: 2.16user 0.08system 0:02.25elapsed

Uhuh.  Doesn't scale so well after all.  But removing the --cached, as
most people probably would:

  THREADS=8:   0.19user 0.32system 0:00.16elapsed
  THREADS=4:   0.16user 0.34system 0:00.17elapsed
  THREADS=2:   0.18user 0.32system 0:00.26elapsed
  NO_PTHREADS: 0.12user 0.17system 0:00.31elapsed

So I conclude that during any grep that cannot use the worktree,
having any threads hurts.

In addition, during a grep that *can* use the worktree, THREADS=8
still helps somewhat on my dual-core i7, though it goes downhill from
there (12 is again as fast as 4; I verified these details using
best-of-50 timings, and it is reproducible.)

I have also run timings on a 2*6-core workstation running OS X, where
performance is best at 5 cores:

  2 threads:  0.96 real   0.41 user   1.27 sys
  3 threads:  0.68 real   0.41 user   1.30 sys
  4 threads:  0.54 real   0.43 user   1.63 sys
  5 threads:  0.50 real   0.41 user   1.51 sys
  6 threads:  0.54 real   0.43 user   1.63 sys
  7 threads:  0.86 real   0.49 user   1.93 sys
  8 threads:  0.98 real   0.51 user   2.07 sys

I kid you not.  That's best-of-50 and rather stable.  It's on the same
tree as the Linux machine too, except for the problem that the OS X FS
is set to case-insensitive and thus cannot represent the tree exactly.
So from git's POV, there are unstaged changes.

Sadly I do not have access to a Linux box having more than 2 physical
cores.  If you have one, please run some tests :-)

So based on my measurements, I would suggest that unless we have
evidence of it scaling beyond 8 cores on some machine, b2924dc (grep:
detect number of CPUs for thread spawning) be dropped.  For now I'm
ignoring the problem that on OS X it doesn't even scale to 8; I'd
rather check how it fares on Linux first.

I added a third patch on top that disables threading in any case that
does not hit the worktree.  I wonder if I missed something or if it
really is that simple.  The neat part is that it's also a reduction in
code required, and at the same time avoids any issues 2/3 might have
with a future attributes-from-trees implementation.

With this I get

  worktree, 8 threads: 0.15user 0.37system 0:00.17elapsed
  --cached, 8 threads: 2.18user 0.07system 0:02.27elapsed

Of course, we could probably gain a huge boost if the read_sha1
machinery could be made threaded, so that it can unpack several
objects at a time.  In addition, I can well imagine that there are
combinations of delta density, object size, and luck where it pays off
to grep in parallel.  Do we care?

Now I really should do something else than fretting over the
sub-second performance of git-grep...


Thomas Rast (3):
  grep: load funcname patterns for -W
  grep: enable threading with -p and -W using lazy attribute lookup
  grep: disable threading in all but worktree case

 builtin/grep.c  |  153 ++++++++++++++++--------------------------------------
 grep.c          |   73 ++++++++++++++++----------
 grep.h          |    7 +++
 t/t7810-grep.sh |   14 +++++
 4 files changed, 112 insertions(+), 135 deletions(-)

-- 
1.7.8.rc4.388.ge53ab

  reply	other threads:[~2011-12-02 13:08 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-25 14:46 [PATCH] grep: load funcname patterns for -W Thomas Rast
2011-11-25 16:32 ` René Scharfe
2011-11-26 12:15   ` [PATCH] grep: enable multi-threading for -p and -W René Scharfe
2011-11-29  9:54     ` Thomas Rast
2011-11-29 13:49       ` René Scharfe
2011-11-29 14:07         ` Thomas Rast
2011-12-02 13:07           ` Thomas Rast [this message]
2011-12-02 13:07             ` [PATCH v2 1/3] grep: load funcname patterns for -W Thomas Rast
2011-12-02 13:07             ` [PATCH v2 2/3] grep: enable threading with -p and -W using lazy attribute lookup Thomas Rast
2011-12-02 13:07             ` [PATCH v2 3/3] grep: disable threading in all but worktree case Thomas Rast
2011-12-02 16:15               ` René Scharfe
2011-12-05  9:02                 ` Thomas Rast
2011-12-06 22:48                 ` René Scharfe
2011-12-06 23:01                   ` [PATCH 4/2] grep: turn off threading for non-worktree René Scharfe
2011-12-07  4:42                     ` Jeff King
2011-12-07 17:11                       ` René Scharfe
2011-12-07 18:28                         ` Jeff King
2011-12-07 20:11                       ` J. Bruce Fields
2011-12-07 20:45                         ` Jeff King
2011-12-07  8:12                     ` Thomas Rast
2011-12-07 17:00                       ` René Scharfe
2011-12-10 13:13                         ` Pete Wyckoff
2011-12-12 22:37                           ` René Scharfe
2011-12-07  4:24                   ` [PATCH v2 3/3] grep: disable threading in all but worktree case Jeff King
2011-12-07 16:52                     ` René Scharfe
2011-12-07 18:10                       ` Jeff King
2011-12-07  8:11                   ` Thomas Rast
2011-12-07 16:54                     ` René Scharfe
2011-12-12 21:16                 ` [PATCH v3 0/3] grep attributes and multithreading Thomas Rast
2011-12-12 21:16                   ` [PATCH v3 1/3] grep: load funcname patterns for -W Thomas Rast
2011-12-12 21:16                   ` [PATCH v3 2/3] grep: enable threading with -p and -W using lazy attribute lookup Thomas Rast
2011-12-16  8:22                     ` Johannes Sixt
2011-12-16 17:34                       ` Junio C Hamano
2011-12-12 21:16                   ` [PATCH v3 3/3] grep: disable threading in non-worktree case Thomas Rast
2011-12-12 22:37                   ` [PATCH v3 0/3] grep attributes and multithreading René Scharfe
2011-12-12 23:44                   ` Junio C Hamano
2011-12-13  8:44                     ` Thomas Rast
2011-12-23 22:37               ` [PATCH v2 3/3] grep: disable threading in all but worktree case Ævar Arnfjörð Bjarmason
2011-12-23 22:49                 ` Thomas Rast
2011-12-24  1:39                   ` Ævar Arnfjörð Bjarmason
2011-12-24  7:07                     ` Jeff King
2011-12-24 10:49                       ` Nguyen Thai Ngoc Duy
2011-12-24 10:55                       ` Nguyen Thai Ngoc Duy
2011-12-24 13:38                         ` Jeff King
2011-12-25  3:32                       ` Nguyen Thai Ngoc Duy
2011-12-02 17:34             ` [PATCH v2 0/3] grep multithreading and scaling Jeff King
2011-12-05  9:38               ` Thomas Rast
2011-12-05 20:16                 ` Thomas Rast
2011-12-06  0:40                 ` Jeff King
2011-12-02 20:02             ` Eric Herman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1322830368.git.trast@student.ethz.ch \
    --to=trast@student.ethz.ch \
    --cc=eric@freesa.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=rene.scharfe@lsrfire.ath.cx \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).