From: Ingo Molnar <mingo@kernel.org>
To: Mike Galbraith <efault@gmx.de>
Cc: Borislav Petkov <bp@alien8.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Mel Gorman <mgorman@suse.de>,
Nikolay Ulyanitsky <lystor@gmail.com>,
linux-kernel@vger.kernel.org,
Andreas Herrmann <andreas.herrmann3@amd.com>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Suresh Siddha <suresh.b.siddha@intel.com>
Subject: Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected
Date: Thu, 27 Sep 2012 07:47:42 +0200
Message-ID: <20120927054742.GA4370@gmail.com>
In-Reply-To: <1348722568.7059.115.camel@marge.simpson.net>
* Mike Galbraith <efault@gmx.de> wrote:
> I think the pgbench problem is more about latency for the 1 in
> 1:N than spinlocks.
So my understanding of the psql workload is that basically we've
got a central psql proxy process that is distributing work to
worker psql processes. If a freshly woken worker process ever
preempts the central proxy process then it is preventing a lot
of new work from getting distributed.
Correct?
So the central proxy psql process is 'much more important' to
run than any of the worker processes - an importance that is not
(currently) visible from the behavioral statistics the scheduler
keeps on tasks.
So the scheduler has the following problem here: a new wakee
might be starved enough and the proxy might have run long enough
to really justify the preemption here and now. The buddy
statistics help avoid some of these cases - but not all, and the
difference is measurable.
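
To make the dilemma concrete, here is a toy model of a wakeup
preemption test. It is only a sketch under stated assumptions, not
the kernel's code: struct toy_task, toy_wakeup_preempts() and the
gran_ns parameter are hypothetical names, and the rule is reduced to
"a non-batch wakee preempts if its virtual runtime lags the running
task's by more than a granularity".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified task record; not the kernel's sched_entity. */
struct toy_task {
	uint64_t vruntime_ns;	/* virtual runtime consumed so far */
	bool batch;		/* true if the task behaves like SCHED_BATCH */
};

/*
 * Toy wakeup-preemption test (an assumption-laden simplification):
 * the wakee preempts the running task only if it is not a batch task
 * and its vruntime lags the running task's by more than gran_ns.
 */
static bool toy_wakeup_preempts(const struct toy_task *curr,
				const struct toy_task *wakee,
				uint64_t gran_ns)
{
	assert(curr && wakee);

	if (wakee->batch)
		return false;	/* batch wakees never preempt on wakeup */
	if (wakee->vruntime_ns >= curr->vruntime_ns)
		return false;	/* wakee is not behind, nothing to make up */

	return curr->vruntime_ns - wakee->vruntime_ns > gran_ns;
}
```

In this model a starved worker waking up while a long-running proxy is
on the CPU passes the test, which is exactly the preemption that hurts
the workload; marking the worker as batch makes the test fail.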
Yet the 'best' way for psql to run is for this proxy process to
never be preempted. Your SCHED_BATCH experiments confirmed that.
The way remote CPU selection affects it is that if we ever get
more aggressive in selecting a remote CPU then we, as a side
effect, also reduce the chance of harmful preemption of the
central proxy psql process.
So in that sense sibling selection is somewhat of an indirect
red herring: it really only helps psql indirectly by preventing
the harmful preemption. It also, somewhat paradoxically, argues
for suboptimal code: for example tearing apart buddies is
beneficial in the psql workload, because it also allows the more
important part of the buddy to run more (the proxy).
In that sense the *real* problem isn't even parallelism (although
we obviously should improve the decisions there - and the logic
has suffered in the past from the psql dilemma outlined above),
but whether the scheduler can (and should) identify the central
proxy and keep it running as much as possible, deprioritizing
fairness, wakeup buddies, runtime overlap and cache affinity
considerations.
There are two broad solutions that I can see:
- Add a kernel solution to somehow identify 'central' processes
and bias them. Xorg is a similar kind of process, so it would
help other workloads as well. That way lie dragons, but it might
be worth an attempt or two. We already try to maintain a couple of
robust metrics, like overlap statistics, to identify buddies.
- Let user-space occasionally identify its important (and less
important) tasks - say psql could mark its worker processes as
SCHED_BATCH and keep its central process(es) higher prio. A
single line of obvious code in 100 KLOCs of user-space code.
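
The user-space option could look roughly like the sketch below. It
assumes Linux with glibc and uses the real sched_setscheduler(2) call;
the helper name mark_as_batch_worker() is hypothetical, and nothing
here is taken from the PostgreSQL sources.

```c
#define _GNU_SOURCE		/* glibc exposes SCHED_BATCH only with this */
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3		/* Linux value, in case the header hides it */
#endif

/*
 * Hypothetical helper: switch the given process (0 means the caller)
 * to SCHED_BATCH. Returns 0 on success, -1 with errno set on failure.
 * SCHED_BATCH requires a static priority of 0.
 */
static int mark_as_batch_worker(pid_t pid)
{
	struct sched_param param = { .sched_priority = 0 };

	assert(pid >= 0);
	return sched_setscheduler(pid, SCHED_BATCH, &param);
}
```

A worker would call mark_as_batch_worker(0) once, right after it is
forked, while the central process stays at the default policy.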
Just to confirm, if you turn off all preemption via a hack
(basically if you turn SCHED_OTHER into SCHED_BATCH), does psql
perform and scale much better, with the quality of sibling
selection and spreading of processes only being a secondary
effect?
Thanks,
Ingo