From: Nicholas Piggin <npiggin@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Nicholas Piggin <npiggin@gmail.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Subject: [RFC PATCH] kernel/sched/core: busy wait before going idle
Date: Sun, 15 Apr 2018 23:31:49 +1000 [thread overview]
Message-ID: <20180415133149.24112-1-npiggin@gmail.com> (raw)
This is a quick hack for comments, but I've always wondered --
if we have a short term polling idle states in cpuidle for performance
-- why not skip the context switch and entry into all the idle states,
and just wait for a bit to see if something wakes up again.
It's not uncommon to see various going-to-idle work in kernel profiles.
This might be a way to reduce that (and just the cost of switching
registers and kernel stack to idle thread). This can be an important
path for single thread request-response throughput.
tbench bandwidth seems to be improved (the numbers aren't too stable
but they pretty consistently show some gain). 10-20% would be a pretty
nice gain for such workloads
clients 1 2 4 8 16 128
vanilla 232 467 823 1819 3218 9065
patched 310 503 962 2465 3743 9820
---
kernel/sched/core.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e8afd6086f23..30a0b13edfa5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3404,6 +3404,7 @@ static void __sched notrace __schedule(bool preempt)
struct rq_flags rf;
struct rq *rq;
int cpu;
+ bool do_idle_spin = true;
cpu = smp_processor_id();
rq = cpu_rq(cpu);
@@ -3428,6 +3429,7 @@ static void __sched notrace __schedule(bool preempt)
rq_lock(rq, &rf);
smp_mb__after_spinlock();
+idle_spin_end:
/* Promote REQ to ACT */
rq->clock_update_flags <<= 1;
update_rq_clock(rq);
@@ -3437,6 +3439,32 @@ static void __sched notrace __schedule(bool preempt)
if (unlikely(signal_pending_state(prev->state, prev))) {
prev->state = TASK_RUNNING;
} else {
+ /*
+ * Busy wait before switching to idle thread. This
+ * is marked unlikely because we're idle so jumping
+ * out of line doesn't matter too much.
+ */
+ if (unlikely(do_idle_spin && rq->nr_running == 1)) {
+ u64 start;
+
+ do_idle_spin = false;
+
+ rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
+ rq_unlock_irq(rq, &rf);
+
+ spin_begin();
+ start = local_clock();
+ while (!need_resched() && prev->state &&
+ !signal_pending_state(prev->state, prev)) {
+ spin_cpu_relax();
+ if (local_clock() - start > 1000000)
+ break;
+ }
+ spin_end();
+
+ rq_lock_irq(rq, &rf);
+ goto idle_spin_end;
+ }
deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);
prev->on_rq = 0;
--
2.17.0
next reply other threads:[~2018-04-15 13:32 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-15 13:31 Nicholas Piggin [this message]
2018-04-20 7:44 ` [RFC PATCH] kernel/sched/core: busy wait before going idle Peter Zijlstra
2018-04-20 9:01 ` Nicholas Piggin
2018-04-20 10:58 ` Peter Zijlstra
2018-04-20 12:28 ` Nicholas Piggin
2018-04-23 10:17 ` Pavan Kondeti
2018-04-24 5:26 ` Nicholas Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180415133149.24112-1-npiggin@gmail.com \
--to=npiggin@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=rjw@rjwysocki.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox