From: Mike Galbraith <efault@gmx.de>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>, Balazs Scheidler <bazsi@balabit.hu>,
linux-kernel@vger.kernel.org, Willy Tarreau <w@1wt.eu>
Subject: Re: [patch] Re: scheduler oddity [bug?]
Date: Mon, 09 Mar 2009 16:30:49 +0100
Message-ID: <1236612649.6019.38.camel@marge.simson.net>
In-Reply-To: <1236609711.8389.583.camel@laptop>
On Mon, 2009-03-09 at 15:41 +0100, Peter Zijlstra wrote:
> On Mon, 2009-03-09 at 15:11 +0100, Mike Galbraith wrote:
>
> > > Yes 2* worked fine. Mysql+oltp was my worry spot, being a very affinity
> > > sensitive little <bleep>, but my patchlet didn't cause any trouble, so
> > > this one shouldn't either. I'll do some re-test in any case, and squeak
> > > should anything turn up.
> >
> > Squeak! Didn't even get to mysql+oltp.
> >
> > marge:..local/tmp # netperf -t UDP_STREAM -l 60 -H 127.0.0.1 -- -P 15888,12384 -s 32768 -S 32768 -m 4096
> > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 15888 AF_INET to 127.0.0.1 (127.0.0.1) port 12384 AF_INET
> > Socket  Message  Elapsed      Messages
> > Size    Size     Time         Okay Errors   Throughput
> > bytes   bytes    secs            #      #   10^6bits/sec
> >
> >  65536    4096   60.00     5161103      0    2818.65
> >  65536           60.00     5149666           2812.40
> >
> > 6188 root 20 0 1040 544 324 R 100 0.0 0:31.49 0 netperf
> > 6189 root 20 0 1044 260 164 R 48 0.0 0:15.35 3 netserver
> >
> > Hurt, pain, ouch, vs...
> >
> > marge:..local/tmp # netperf -t UDP_STREAM -l 60 -H 127.0.0.1 -T 0,0 -- -P 15888,12384 -s 32768 -S 32768 -m 4096
> > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 15888 AF_INET to 127.0.0.1 (127.0.0.1) port 12384 AF_INET : cpu bind
> > Socket  Message  Elapsed      Messages
> > Size    Size     Time         Okay Errors   Throughput
> > bytes   bytes    secs            #      #   10^6bits/sec
> >
> >  65536    4096   60.00     8452028      0    4615.93
> >  65536           60.00     8442945           4610.97
> >
> > Drat.
>
> Bugger, so back to the drawing board it is...

Hm.

CPU utilization wise, this test is similar to pipetest. The major
difference is chunk size. Netperf is waking and being preempted (if on
the same CPU) at a very high rate, so the hog component gets cpu in tiny
chunks, vs hefty chunks for pipetest.

Simply doing the below (will look very familiar) made both netperf and
pipetest happy again, because of that preemption rate. Both start life
wanting to be affine, and due to the switch rate, pipetest becomes
non-affine, but netperf remains affine.

Maybe we should factor in wakeup rate, and whether we're waking many vs
one. Wakeup is tied to data, so there is correlation to potential
cache-miss pain, no?

There is also evidence that your patch did in fact make the right
decision, but that we really REALLY should try to punt to a CPU that
shares a cache if available. Check out the numbers when the netperf
test runs on two CPUs that share cache.

marge:..local/tmp # netperf -t UDP_STREAM -l 60 -H 127.0.0.1 -T 0,1 -- -P 15888,12384 -s 32768 -S 32768 -m 4096
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 15888 AF_INET to 127.0.0.1 (127.0.0.1) port 12384 AF_INET : cpu bind
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

 65536    4096   60.00    15325632      0    8369.84
 65536           60.00    15321176           8367.40

(You can skip the below, nothing new there. Just for completeness;)

diff --git a/kernel/sched.c b/kernel/sched.c
index 8e2558c..0f67b2a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4508,6 +4508,24 @@ static inline void schedule_debug(struct task_struct *prev)
 #endif
 }
 
+static void put_prev_task(struct rq *rq, struct task_struct *prev)
+{
+	if (prev->state == TASK_RUNNING) {
+		u64 runtime = prev->se.sum_exec_runtime;
+
+		runtime -= prev->se.prev_sum_exec_runtime;
+		runtime = min_t(u64, runtime, 2*sysctl_sched_migration_cost);
+
+		/*
+		 * In order to avoid avg_overlap growing stale when we are
+		 * indeed overlapping and hence not getting put to sleep, grow
+		 * the avg_overlap on preemption.
+		 */
+		update_avg(&prev->se.avg_overlap, runtime);
+	}
+	prev->sched_class->put_prev_task(rq, prev);
+}
+
 /*
  * Pick up the highest-prio task:
  */
@@ -4586,7 +4604,7 @@ need_resched_nonpreemptible:
 	if (unlikely(!rq->nr_running))
 		idle_balance(cpu, rq);
 
-	prev->sched_class->put_prev_task(rq, prev);
+	put_prev_task(rq, prev);
 	next = pick_next_task(rq, prev);
 
 	if (likely(prev != next)) {