git.vger.kernel.org archive mirror
* "git grep" parallelism question
@ 2013-04-26 17:31 Linus Torvalds
From: Linus Torvalds @ 2013-04-26 17:31 UTC
  To: Junio C Hamano, Git Mailing List

Since I reboot fairly regularly to test new kernels, I don't *always*
have the kernel source tree in my caches, so I still care about some
cold-cache performance. And "git grep" tends to be the most noticeable
one.

Now, I have an SSD, and even the cold-cache case takes just five
seconds or so, but that's still something I react to, since the normal
"kernel tree in cache" case ends up being close enough to
instantaneous (half a second) that when it takes longer I react
to it.

And I started thinking about it, and our "git grep" parallelism seems
to be limited to 8.

Which is fine most of the time for CPU parallelism (although maybe
some people with big machines would prefer higher numbers), but for IO
parallelism I thought that maybe we'd like a higher number...

So I tried it out, and with THREADS set to 32, I get a roughly 15%
performance boost for the cold-cache case (the error bar is high
enough to not give a very precise number, but I see it going from
~4.8-4.9s on my machine down to 4.2-4.6s).
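
As a rough check on those numbers, the quoted ranges can be turned into bounds on the relative improvement. This is only a sketch: the function name `improvement_bounds` is made up for illustration, and the only inputs are the timings quoted above.

```python
def improvement_bounds(before, after):
    """Return (worst, best) relative speedup as fractions.

    `before` and `after` are (low, high) wall-clock ranges in seconds.
    """
    before_lo, before_hi = before
    after_lo, after_hi = after
    # Best case: slowest baseline run compared with fastest new run.
    best = (before_hi - after_lo) / before_hi
    # Worst case: fastest baseline run compared with slowest new run.
    worst = (before_lo - after_hi) / before_lo
    return worst, best


def as_percent(fraction):
    """Format a fraction such as 0.1 as the string '10.0%'."""
    return f"{fraction * 100:.1f}%"
```

Calling `improvement_bounds((4.8, 4.9), (4.2, 4.6))` gives a worst case of about 4% and a best case of about 14%, so "roughly 15%" sits at the optimistic end of a noisy measurement.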

That's on an SSD, though - the performance implications might be very
different for other use cases (NFS would likely prefer higher IO
parallelism and might show bigger improvement, while a rotational disk
might end up just thrashing more).

Is this a big deal? Probably not. But I did react to how annoying it
was to set the parallelism factor (recompile git with a new number).
Wouldn't it be lovely if it was slightly smarter (something more akin
to the index preloading that takes number of files into account) or at
least allowed people to set the parallelism explicitly with a command
line switch?
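
One way to picture the "slightly smarter" option is a helper that scales the thread count with the number of files and lets an explicit request override it. This is a hedged sketch, not git's code: `pick_thread_count`, `files_per_thread` and `max_threads` are hypothetical names, and the values 500 and 32 are illustrative assumptions only.

```python
def pick_thread_count(num_files, requested=None,
                      files_per_thread=500, max_threads=32):
    """Choose how many grep worker threads to start.

    All names and default values here are assumptions for illustration.
    """
    # An explicit request (e.g. from a command-line switch) always wins.
    if requested is not None:
        return max(1, requested)
    # Otherwise scale with the amount of work, up to a fixed cap.
    threads = min(num_files // files_per_thread, max_threads)
    # Fewer than two threads means threading is not worth the overhead.
    return threads if threads >= 2 else 1
```

With these assumed defaults, a tree of 100 files stays single-threaded, 1200 files get 2 threads, and a kernel-sized tree hits the cap of 32 unless the caller asks for something else.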

Right now it disables the parallel grep entirely on UP (uniprocessor)
machines, for example.
Which makes perfect sense if grep is all about CPU use. But even UP
might improve from parallel IO..
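
The point about parallel IO can be illustrated with a small thread pool in which each worker spends most of its time blocked on a read, so many requests are in flight at once even on a single CPU. This is a minimal sketch under stated assumptions, not how "git grep" is implemented: `grep_files` is a hypothetical helper, and the default of 32 threads just mirrors the experiment above.

```python
from concurrent.futures import ThreadPoolExecutor


def grep_files(paths, needle, threads=32):
    """Return the paths whose contents contain the bytes `needle`.

    Files are read concurrently so that reads overlap while threads wait on IO.
    """
    paths = list(paths)

    def contains(path):
        try:
            with open(path, "rb") as f:
                return needle in f.read()
        except OSError:
            # Unreadable or missing files are treated as non-matching.
            return False

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results line up with paths.
        hits = list(pool.map(contains, paths))
    return [p for p, hit in zip(paths, hits) if hit]
```

Whether more threads help depends on the storage: the sketch makes it easy to time the same file list with different `threads` values on a cold cache.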

              Linus


Thread overview: 16+ messages
2013-04-26 17:31 "git grep" parallelism question Linus Torvalds
2013-04-26 18:47 ` Junio C Hamano
2013-04-26 18:54   ` Linus Torvalds
2013-04-26 19:19     ` Junio C Hamano
2013-04-26 20:31       ` Linus Torvalds
2013-04-27 13:46         ` Thomas Rast
2013-04-29 14:05         ` Ramkumar Ramachandra
2013-04-29 16:18           ` John Keeping
2013-04-29 18:04             ` Thomas Rast
2013-04-29 18:08               ` John Keeping
2013-04-29 22:22                 ` Junio C Hamano
2013-04-30  8:08                   ` John Keeping
2013-04-30 15:59                     ` Jeff King
2013-04-30 16:12                       ` John Keeping
2013-04-30 16:14                         ` Jeff King
2013-05-05 15:40         ` Pete Wyckoff
