git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Victor Leschuk <vleschuk@accesssoftek.com>,
	Junio C Hamano <gitster@pobox.com>,
	Victor Leschuk <vleschuk@gmail.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	"john@keeping.me.uk" <john@keeping.me.uk>
Subject: Re: [PATCH v4] Add git-grep threads param
Date: Mon, 9 Nov 2015 13:13:19 -0500	[thread overview]
Message-ID: <20151109181319.GA30003@sigill.intra.peff.net> (raw)
In-Reply-To: <CA+55aFzHic5AN05QkbERFszRC=i3aDDGy9yhXEjgzZjwzFVBLQ@mail.gmail.com>

On Mon, Nov 09, 2015 at 09:55:34AM -0800, Linus Torvalds wrote:

> > if (list.nr || cached )
> >   num_threads = 0; // do not use threads
> > else if (num_threads == 0)
> >   num_threads = online_cpus() <= 1 ? 0 : GREP_NUM_THREADS_DEFAULT;
> 
> I will say this AGAIN.
> 
> The number of threads is *not* about the number of CPU's. Stop this
> craziness. It's wrong.

I agree with you (and that's why we have a hard-coded default here, and
not online_cpus()).

This check is just whether to turn the threading on or not, based on
whether we have multiple CPUs at all. And I agree that probably should
_not_ be respecting online_cpus() either, because we are better off
parallelizing even on a single CPU to avoid fs latency.

But note that this is already what the code _currently_ does. I do not
mind changing it, but that is a separate notion from Victor's topic,
which is to make the number of threads configurable at all.

What I'd really hope to see is:

  1. Victor adds --threads support, and changes _nothing else_ about the
     defaults.

  2. People experiment with --threads to see what works best in a
     variety of cases (especially things like cold-cache NFS). Then we
     get patches based on real world numbers to adjust:

       2a. GREP_NUM_THREADS_DEFAULT (either bumping the hard-coded
           value, or using some dynamic heuristic)

       2b. What to do for the single-CPU case.

We're doing step 1 here. I don't mind jumping to 2b if we're pretty sure
it's the right thing (and I think it probably is), but it should
definitely be a separate patch.

> So *none* of the threading in git is about CPU's. Maybe we should add
> big honking comments about that.

Minor nit: there definitely _are_ CPU-bound parallel tasks in git.
pack-objects and index-pack come to mind. If we add any user-visible
advice about tweaking thread configuration, we should be sure
that it is scoped to the appropriate flags.

> And that big honking comment should be in the documentation for the
> new flag too. Because it would be really really sad if people say "I
> have a laptop with just two cores, so I'll set the threading option to
> 2", when they then work mostly over a slow wireless network and their
> company wants minimal local installs.

Agreed. It would probably make sense for the "--threads" option
documentation for each command to discuss whether it is meant to
parallelize work across CPUs, or avoid filesystem latency.

-Peff

  reply	other threads:[~2015-11-09 18:13 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-27 21:22 [PATCH v4] Add git-grep threads param Victor Leschuk
2015-11-02  4:43 ` Victor Leschuk
2015-11-02 15:13   ` Junio C Hamano
2015-11-03  8:36     ` Victor Leschuk
2015-11-03 17:22 ` Junio C Hamano
2015-11-04  6:40   ` Jeff King
2015-11-09 11:36     ` Victor Leschuk
2015-11-09 15:55       ` Jeff King
2015-11-09 16:34         ` Victor Leschuk
2015-11-09 16:53           ` Jeff King
2015-11-09 17:28             ` Victor Leschuk
2015-11-09 17:47               ` Jeff King
2015-11-09 17:55               ` Linus Torvalds
2015-11-09 18:13                 ` Jeff King [this message]
2015-11-09 18:32                 ` Victor Leschuk
2015-11-09 19:11                   ` Linus Torvalds
2015-11-09 19:52                     ` Victor Leschuk
     [not found]                       ` <CA+55aFyhBN5fEpB-CLQFhhDyf7nijs_Y3aZCSQAcpVPmruZLFg@mail.gmail.com>
2015-11-09 21:51                         ` Victor Leschuk
2015-11-09 18:40                 ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151109181319.GA30003@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=john@keeping.me.uk \
    --cc=torvalds@linux-foundation.org \
    --cc=vleschuk@accesssoftek.com \
    --cc=vleschuk@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).