From: Mike Galbraith <efault@gmx.de>
To: Balazs Scheidler <bazsi@balabit.hu>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: scheduler oddity [bug?]
Date: Sun, 08 Mar 2009 10:58:29 +0100
Message-ID: <1236506309.6972.8.camel@marge.simson.net>
In-Reply-To: <1236505323.6281.57.camel@marge.simson.net>
On Sun, 2009-03-08 at 10:42 +0100, Mike Galbraith wrote:
> On Sat, 2009-03-07 at 18:47 +0100, Balazs Scheidler wrote:
> > Hi,
> >
> > I'm experiencing an odd behaviour from the Linux scheduler. I have an
> > application that feeds data to another process using a pipe. Both
> > processes use a fair amount of CPU time apart from writing to/reading
> > from this pipe.
> >
> > The machine I'm running on is an Opteron Quad-Core CPU:
> > model name : Quad-Core AMD Opteron(tm) Processor 2347 HE
> > stepping : 3
> >
> > What I see is that only one of the cores is used, the other three are
> > idling without doing any work. If I explicitly set the CPU affinity of
> > the processes to use distinct CPUs the performance goes up
> > significantly. (e.g. it starts to use the other cores and the load
> > scales linearly).
> >
> > I've tried to reproduce the problem by writing a small test program,
> > which you can find attached. The program creates two processes, one
> > feeds the other using a pipe and each does a series of memset() calls to
> > simulate CPU load. I've also added capability to the program to set its
> > own CPU affinity. The results (the more the better):
> >
> > Without enabling CPU affinity:
> > $ ./a.out
> > Check: 0 loops/sec, sum: 1
> > Check: 12 loops/sec, sum: 13
> > Check: 41 loops/sec, sum: 54
> > Check: 41 loops/sec, sum: 95
> > Check: 41 loops/sec, sum: 136
> > Check: 41 loops/sec, sum: 177
> > Check: 41 loops/sec, sum: 218
> > Check: 40 loops/sec, sum: 258
> > Check: 41 loops/sec, sum: 299
> > Check: 41 loops/sec, sum: 340
> > Check: 41 loops/sec, sum: 381
> > Check: 41 loops/sec, sum: 422
> > Check: 41 loops/sec, sum: 463
> > Check: 41 loops/sec, sum: 504
> > Check: 41 loops/sec, sum: 545
> > Check: 40 loops/sec, sum: 585
> > Check: 41 loops/sec, sum: 626
> > Check: 41 loops/sec, sum: 667
> > Check: 41 loops/sec, sum: 708
> > Check: 41 loops/sec, sum: 749
> > Check: 41 loops/sec, sum: 790
> > Check: 41 loops/sec, sum: 831
> > Final: 39 loops/sec, sum: 831
> >
> >
> > With CPU affinity:
> > # ./a.out 1
> > Check: 0 loops/sec, sum: 1
> > Check: 41 loops/sec, sum: 42
> > Check: 49 loops/sec, sum: 91
> > Check: 49 loops/sec, sum: 140
> > Check: 49 loops/sec, sum: 189
> > Check: 49 loops/sec, sum: 238
> > Check: 49 loops/sec, sum: 287
> > Check: 50 loops/sec, sum: 337
> > Check: 49 loops/sec, sum: 386
> > Check: 49 loops/sec, sum: 435
> > Check: 49 loops/sec, sum: 484
> > Check: 49 loops/sec, sum: 533
> > Check: 49 loops/sec, sum: 582
> > Check: 49 loops/sec, sum: 631
> > Check: 49 loops/sec, sum: 680
> > Check: 49 loops/sec, sum: 729
> > Check: 49 loops/sec, sum: 778
> > Check: 49 loops/sec, sum: 827
> > Check: 49 loops/sec, sum: 876
> > Check: 49 loops/sec, sum: 925
> > Check: 50 loops/sec, sum: 975
> > Check: 49 loops/sec, sum: 1024
> > Final: 48 loops/sec, sum: 1024
> >
> > The difference is about 20%, which is about the same work performed by
> > the slave process. If the two processes race for the same CPU this 20%
> > of performance is lost.
> >
> > I've tested this on 3 computers and each showed the same symptoms:
> > * quad core Opteron, running Ubuntu kernel 2.6.27-13.29
> > * Core 2 Duo, running Ubuntu kernel 2.6.27-11.27
> > * Dual Core Opteron, Debian backports.org kernel 2.6.26-13~bpo40+1
> >
> > Is this a bug, or a feature?
>
> Both. Affine wakeups are cache friendly, and generally a feature, but
> can lead to underutilized CPUs in some cases, thus turning feature into
> bug as your testcase demonstrates. The metric we use for the affinity hint
> works well, but clearly wants some refinement.
>
> You can turn this scheduler hint off via:
> echo NO_SYNC_WAKEUPS > /sys/kernel/debug/sched_features
>
The problem with your particular testcase is that while one half has an
avg_overlap (what we use as the affinity hint for synchronous wakeups) which
triggers the affinity hint, the other half has an avg_overlap of zero, the
value it was born with, so despite significant execution overlap, the
scheduler treats them as if they were truly synchronous tasks.

The below cures it, but is only a demo hack.
diff --git a/kernel/sched.c b/kernel/sched.c
index 8e2558c..85f9ced 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1712,11 +1712,15 @@ static void enqueue_task(struct rq *rq, struct task_struct *p, int wakeup)

 static void dequeue_task(struct rq *rq, struct task_struct *p, int sleep)
 {
+	u64 limit = sysctl_sched_migration_cost;
+	u64 runtime = p->se.sum_exec_runtime - p->se.prev_sum_exec_runtime;
+
 	if (sleep && p->se.last_wakeup) {
 		update_avg(&p->se.avg_overlap,
 			   p->se.sum_exec_runtime - p->se.last_wakeup);
 		p->se.last_wakeup = 0;
-	}
+	} else if (p->se.avg_overlap < limit && runtime >= limit)
+		update_avg(&p->se.avg_overlap, runtime);

 	sched_info_dequeued(p);
 	p->sched_class->dequeue_task(rq, p, sleep);

pipetest (6701, #threads: 1)
---------------------------------------------------------
se.exec_start : 5607096.896687
se.vruntime : 274158.274352
se.sum_exec_runtime : 139434.783417
se.avg_overlap : 6.477067 <== was zero
nr_switches : 2246
nr_voluntary_switches : 1
nr_involuntary_switches : 2245
se.load.weight : 1024
policy : 0
prio : 120
clock-delta : 102

pipetest (6702, #threads: 1)
---------------------------------------------------------
se.exec_start : 5607096.896687
se.vruntime : 274098.273516
se.sum_exec_runtime : 32987.899515
se.avg_overlap : 0.502174
nr_switches : 13631
nr_voluntary_switches : 11639
nr_involuntary_switches : 1992
se.load.weight : 1024
policy : 0
prio : 120
clock-delta : 117
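
To see what the hack does to a task that is only ever preempted, here is a
rough userspace model in C. It is a sketch, not kernel code. Assumptions:
update_avg() is taken to move the average one eighth of the way toward each
new sample, struct task_model and simulate_preemptions() are hypothetical
names made up for illustration, and the limit passed in stands in for
sysctl_sched_migration_cost. Only the branch logic is copied from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few sched_entity fields the patch touches. */
struct task_model {
	uint64_t sum_exec_runtime;      /* total CPU time consumed, in ns */
	uint64_t prev_sum_exec_runtime; /* sum_exec_runtime when last scheduled in */
	uint64_t last_wakeup;           /* nonzero only after the task woke another */
	uint64_t avg_overlap;           /* running average used as the affinity hint */
};

/* Assumed behaviour: move the average 1/8 of the way toward the sample. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/* Same branch structure as the patched dequeue_task(). */
static void dequeue_model(struct task_model *p, int sleep, uint64_t limit)
{
	uint64_t runtime = p->sum_exec_runtime - p->prev_sum_exec_runtime;

	if (sleep && p->last_wakeup) {
		update_avg(&p->avg_overlap,
			   p->sum_exec_runtime - p->last_wakeup);
		p->last_wakeup = 0;
	} else if (p->avg_overlap < limit && runtime >= limit) {
		/* The hack: a long uninterrupted run also counts as overlap. */
		update_avg(&p->avg_overlap, runtime);
	}
}

/*
 * A task born with avg_overlap == 0 runs n slices of slice_ns each and is
 * preempted (sleep == 0) at the end of every slice. Returns the resulting
 * avg_overlap in ns.
 */
static uint64_t simulate_preemptions(uint64_t slice_ns, int n, uint64_t limit)
{
	struct task_model t = { 0, 0, 0, 0 };
	int i;

	for (i = 0; i < n; i++) {
		t.prev_sum_exec_runtime = t.sum_exec_runtime;
		t.sum_exec_runtime += slice_ns;
		dequeue_model(&t, 0, limit);
	}
	return t.avg_overlap;
}
```

With a limit of 500000 ns, a task running 8 ms slices leaves zero on its first
preemption, while one running 0.1 ms slices stays at zero. In this model the
new branch stops firing once avg_overlap reaches the limit, so by itself it
only lifts the value off the floor.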
Thread overview: 42+ messages
2009-03-07 17:47 scheduler oddity [bug?] Balazs Scheidler
2009-03-07 18:47 ` Balazs Scheidler
2009-03-08 19:45 ` Balazs Scheidler
2009-03-08 22:03 ` Willy Tarreau
2009-03-09 3:35 ` Mike Galbraith
2009-03-09 11:19 ` David Newall
2009-03-08 9:42 ` Mike Galbraith
2009-03-08 9:58 ` Mike Galbraith [this message]
2009-03-08 10:02 ` Mike Galbraith
2009-03-08 10:19 ` Peter Zijlstra
2009-03-08 13:35 ` Mike Galbraith
2009-03-08 15:39 ` Ingo Molnar
2009-03-08 16:20 ` Mike Galbraith
2009-03-08 17:52 ` Ingo Molnar
2009-03-08 18:39 ` Mike Galbraith
2009-03-08 18:55 ` Ingo Molnar
2009-03-09 4:10 ` Mike Galbraith
2009-03-09 6:52 ` Ingo Molnar
2009-03-09 8:02 ` [patch] " Mike Galbraith
2009-03-09 8:07 ` Ingo Molnar
2009-03-09 10:16 ` David Newall
2009-03-09 11:04 ` Peter Zijlstra
2009-03-09 13:16 ` Mike Galbraith
2009-03-09 13:27 ` Peter Zijlstra
2009-03-09 13:51 ` Mike Galbraith
2009-03-09 14:00 ` David Newall
2009-03-09 14:19 ` Peter Zijlstra
2009-03-10 0:20 ` David Newall
2009-03-09 13:37 ` Mike Galbraith
2009-03-09 13:46 ` Peter Zijlstra
2009-03-09 13:58 ` Mike Galbraith
2009-03-09 14:11 ` Mike Galbraith
2009-03-09 14:41 ` Peter Zijlstra
2009-03-09 15:30 ` Mike Galbraith
2009-03-09 16:12 ` Peter Zijlstra
2009-03-09 17:28 ` Mike Galbraith
2009-03-15 13:53 ` Balazs Scheidler
2009-03-15 17:16 ` Mike Galbraith
2009-03-15 18:57 ` Ingo Molnar
2009-03-16 11:55 ` Balazs Scheidler
2009-03-09 15:57 ` Balazs Scheidler
2009-03-10 3:16 ` Mike Galbraith