From: Jeff King <peff@peff.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Victor Leschuk <vleschuk@accesssoftek.com>,
Junio C Hamano <gitster@pobox.com>,
Victor Leschuk <vleschuk@gmail.com>,
"git@vger.kernel.org" <git@vger.kernel.org>,
"john@keeping.me.uk" <john@keeping.me.uk>
Subject: Re: [PATCH v4] Add git-grep threads param
Date: Mon, 9 Nov 2015 13:13:19 -0500 [thread overview]
Message-ID: <20151109181319.GA30003@sigill.intra.peff.net> (raw)
In-Reply-To: <CA+55aFzHic5AN05QkbERFszRC=i3aDDGy9yhXEjgzZjwzFVBLQ@mail.gmail.com>
On Mon, Nov 09, 2015 at 09:55:34AM -0800, Linus Torvalds wrote:
> > if (list.nr || cached )
> > num_threads = 0; // do not use threads
> > else if (num_threads == 0)
> > num_threads = online_cpus() <= 1 ? 0 : GREP_NUM_THREADS_DEFAULT;
>
> I will say this AGAIN.
>
> The number of threads is *not* about the number of CPU's. Stop this
> craziness. It's wrong.
I agree with you (and that's why we have a hard-coded default here, and
not online_cpus()).
This check is just whether to turn the threading on or not, based on
whether we have multiple CPUs at all. And I agree that probably should
_not_ be respecting online_cpus() either, because we are better off
parallelizing even on a single CPU to avoid fs latency.
But note that this is already what the code _currently_ does. I do not
mind changing it, but that is a separate notion from Victor's topic,
which is to make the number of threads configurable at all.
What I'd really hope to see is:
1. Victor adds --threads support, and changes _nothing else_ about the
defaults.
2. People experiment with --threads to see what works best in a
variety of cases (especially things like cold-cache NFS). Then we
get patches based on real world numbers to adjust:
2a. GREP_NUM_THREADS_DEFAULT (either bumping the hard-coded
value, or using some dynamic heuristic)
2b. What to do for the single-CPU case.
We're doing step 1 here. I don't mind jumping to 2b if we're pretty sure
it's the right thing (and I think it probably is), but it should
definitely be a separate patch.
> So *none* of the threading in git is about CPU's. Maybe we should add
> big honking comments about that.
Minor nit: there definitely _are_ CPU-bound parallel tasks in git.
pack-objects and index-pack come to mind. If we add any user-visible
advice about tweaking thread configuration, we should be sure
that it is scoped to the appropriate flags.
> And that big honking comment should be in the documentation for the
> new flag too. Because it would be really really sad if people say "I
> have a laptop with just two cores, so I'll set the threading option to
> 2", when they then work mostly over a slow wireless network and their
> company wants minimal local installs.
Agreed. It would probably make sense for the "--threads" option
documentation for each command to discuss whether it is meant to
parallelize work across CPUs, or avoid filesystem latency.
-Peff
next prev parent reply other threads:[~2015-11-09 18:13 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-27 21:22 [PATCH v4] Add git-grep threads param Victor Leschuk
2015-11-02 4:43 ` Victor Leschuk
2015-11-02 15:13 ` Junio C Hamano
2015-11-03 8:36 ` Victor Leschuk
2015-11-03 17:22 ` Junio C Hamano
2015-11-04 6:40 ` Jeff King
2015-11-09 11:36 ` Victor Leschuk
2015-11-09 15:55 ` Jeff King
2015-11-09 16:34 ` Victor Leschuk
2015-11-09 16:53 ` Jeff King
2015-11-09 17:28 ` Victor Leschuk
2015-11-09 17:47 ` Jeff King
2015-11-09 17:55 ` Linus Torvalds
2015-11-09 18:13 ` Jeff King [this message]
2015-11-09 18:32 ` Victor Leschuk
2015-11-09 19:11 ` Linus Torvalds
2015-11-09 19:52 ` Victor Leschuk
[not found] ` <CA+55aFyhBN5fEpB-CLQFhhDyf7nijs_Y3aZCSQAcpVPmruZLFg@mail.gmail.com>
2015-11-09 21:51 ` Victor Leschuk
2015-11-09 18:40 ` Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151109181319.GA30003@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=john@keeping.me.uk \
--cc=torvalds@linux-foundation.org \
--cc=vleschuk@accesssoftek.com \
--cc=vleschuk@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).