From: Mel Gorman <mgorman@techsingularity.net>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>,
	Matt Fleming <matt@codeblueprint.co.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 3/4] sched/fair: Do not migrate if the prev_cpu is idle
Date: Tue, 30 Jan 2018 10:45:54 +0000
Message-ID: <20180130104555.4125-4-mgorman@techsingularity.net>
In-Reply-To: <20180130104555.4125-1-mgorman@techsingularity.net>

wake_affine_idle() prefers to move a task to the current CPU if the
wakeup is due to an interrupt. The expectation is that the interrupt
data is cache hot and relevant to the waking task, and that a search is
avoided. However, there is no way to determine whether cache hot data
on the previous CPU matters more than the interrupt data. Furthermore,
round-robin delivery of interrupts can migrate tasks around a socket on
which every CPU is under-utilised. This can interact badly with cpufreq,
which makes decisions based on per-CPU data. It has been observed on
machines with HWP that p-states are not boosted to their maximum levels
even though the workload is latency and throughput sensitive.
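
As a rough illustration of the cpufreq concern above, the following toy
model counts how much of one task's work the busiest CPU sees when its
wakeups are delivered round-robin versus when the task stays put. It is
only a sketch: TOY_NR_CPUS and max_cpu_share() are hypothetical names,
bursts are assumed to be of equal length, and nothing here models how
HWP or any cpufreq governor actually picks a p-state.

```c
#define TOY_NR_CPUS 4	/* hypothetical socket size for the toy model */

/*
 * Place nr_bursts equal-length bursts of one task either on a single CPU
 * (sticky != 0) or round-robin across TOY_NR_CPUS (sticky == 0), and
 * return the fraction of the task's work seen by the busiest CPU.
 */
double max_cpu_share(int nr_bursts, int sticky)
{
	int busy[TOY_NR_CPUS] = { 0 };
	int cpu = 0, max = 0;

	for (int i = 0; i < nr_bursts; i++) {
		if (!sticky)
			cpu = i % TOY_NR_CPUS;	/* round-robin IRQ delivery */
		busy[cpu]++;
	}

	for (int c = 0; c < TOY_NR_CPUS; c++)
		if (busy[c] > max)
			max = busy[c];

	return nr_bursts ? (double)max / nr_bursts : 0.0;
}
```

In this model a task that stays on one CPU concentrates all of its work
there, while round-robin placement leaves each of the four CPUs with a
quarter of it, so any per-CPU view of utilisation is diluted.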

This patch uses the previous CPU for the task if it is idle and
cache-affine with the current CPU, even if the current CPU is idle due
to the wakeup being related to an interrupt. This reduces migrations at
the cost of the interrupt data not being cache hot when the task wakes.
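
The resulting decision can be sketched in userspace as follows. This is
a minimal model, not the kernel code: struct toy_topology, the toy_*()
helpers and TOY_NO_TARGET are hypothetical stand-ins for idle_cpu(),
cpus_share_cache(), cpu_rq(cpu)->nr_running and whatever the real
function returns when it has no preference.

```c
#include <stdbool.h>

#define TOY_NR_CPUS   4		/* hypothetical machine size */
#define TOY_NO_TARGET (-1)	/* hypothetical "no preference" value */

/* Hypothetical snapshot of the state the wakeup path looks at. */
struct toy_topology {
	bool idle[TOY_NR_CPUS];		/* models idle_cpu(cpu) */
	int llc_id[TOY_NR_CPUS];	/* equal ids share a cache */
	int nr_running[TOY_NR_CPUS];	/* models cpu_rq(cpu)->nr_running */
};

bool toy_idle_cpu(const struct toy_topology *t, int cpu)
{
	return t->idle[cpu];
}

bool toy_cpus_share_cache(const struct toy_topology *t, int a, int b)
{
	return t->llc_id[a] == t->llc_id[b];
}

/* Mirrors the two checks visible in the hunk of this patch. */
int toy_wake_affine_idle(const struct toy_topology *t,
			 int this_cpu, int prev_cpu, int sync)
{
	/* Waker's CPU is idle and shares cache: stay on prev_cpu if idle. */
	if (toy_idle_cpu(t, this_cpu) &&
	    toy_cpus_share_cache(t, this_cpu, prev_cpu))
		return toy_idle_cpu(t, prev_cpu) ? prev_cpu : this_cpu;

	/* Sync wakeup and the waker is the only runnable task. */
	if (sync && t->nr_running[this_cpu] == 1)
		return this_cpu;

	return TOY_NO_TARGET;
}
```

With both CPUs idle and sharing a cache, the sketch returns prev_cpu,
where the unpatched check would have returned this_cpu. The busy
prev_cpu case and the sync case are unchanged.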

A variety of workloads were tested on various machines and no adverse
impact outside the noise was noticed. dbench on ext4 on UMA showed a
roughly 10% reduction in the number of CPU migrations, and it is a case
where interrupts are frequent due to IO completions. In most cases, the
difference in performance is quite small but variability is often
reduced. For example, this is the result for pgbench running on a UMA
machine with different numbers of clients.

                           4.15.0-rc9              4.15.0-rc9
                             baseline               waprev-v1
Hmean     1       22096.28 (   0.00%)     22734.86 (   2.89%)
Hmean     4       74633.42 (   0.00%)     75496.77 (   1.16%)
Hmean     7      115017.50 (   0.00%)    113030.81 (  -1.73%)
Hmean     12     126209.63 (   0.00%)    126613.40 (   0.32%)
Hmean     16     131886.91 (   0.00%)    130844.35 (  -0.79%)
Stddev    1         636.38 (   0.00%)       417.11 (  34.46%)
Stddev    4         614.64 (   0.00%)       583.24 (   5.11%)
Stddev    7         542.46 (   0.00%)       435.45 (  19.73%)
Stddev    12        173.93 (   0.00%)       171.50 (   1.40%)
Stddev    16        671.42 (   0.00%)       680.30 (  -1.32%)
CoeffVar  1           2.88 (   0.00%)         1.83 (  36.26%)
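
The bracketed percentages in the table appear to be changes relative to
the baseline column, with the sign arranged so that a positive value is
an improvement. That reading is inferred from the numbers themselves and
is an assumption, not something taken from the reporting tool. The
helper names below are hypothetical.

```c
/* Gain for a metric where higher is better, e.g. the Hmean rows. */
double gain_higher_is_better(double baseline, double patched)
{
	return (patched - baseline) / baseline * 100.0;
}

/* Gain for a metric where lower is better, e.g. the Stddev rows. */
double gain_lower_is_better(double baseline, double patched)
{
	return (baseline - patched) / baseline * 100.0;
}
```

For the single-client rows this gives about 2.89 for Hmean
(22096.28 to 22734.86) and about 34.46 for Stddev (636.38 to 417.11),
which matches the figures reported above.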

Note that the difference in performance is marginal but for low
utilisation, there is less variability.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
kernel/sched/fair.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1aebe79da2ab..3b732caa6fba 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5704,9 +5704,15 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	 * context. Only allow the move if cache is shared. Otherwise an
 	 * interrupt intensive workload could force all tasks onto one
 	 * node depending on the IO topology or IRQ affinity settings.
+	 *
+	 * If the prev_cpu is idle and cache affine then avoid a migration.
+	 * There is no guarantee that the cache hot data from an interrupt
+	 * is more important than cache hot data on the prev_cpu and from
+	 * a cpufreq perspective, it's better to have higher utilisation
+	 * on one CPU.
 	 */
 	if (idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
-		return this_cpu;
+		return idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
 
 	if (sync && cpu_rq(this_cpu)->nr_running == 1)
 		return this_cpu;
--
2.15.1