From: Mike Galbraith <efault@gmx.de>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Barry Song <21cnbao@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Valentin Schneider <valentin.schneider@arm.com>,
Aubrey Li <aubrey.li@linux.intel.com>,
Barry Song <song.bao.hua@hisilicon.com>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: wakeup_affine_weight() is b0rked - was Re: [PATCH 2/2] sched/fair: Scale wakeup granularity relative to nr_running
Date: Fri, 08 Oct 2021 07:06:51 +0200
Message-ID: <63057cf75e91bd0d348b5475ffab8e5a9f5d20f4.camel@gmx.de>
In-Reply-To: <20211005093137.GQ3959@techsingularity.net>

On Tue, 2021-10-05 at 10:31 +0100, Mel Gorman wrote:
> Ideally, I would do some tracing to confirm that maximum runqueue depth
> is really reduced by the patch.

I would expect your worst case to remain unchanged, mine does.  The
patch mitigates, it does not eradicate.

I dug up a late 2016 mitigation patch, wedged it into 2021 and added a
BFH that does eradicate my stacking depth woes.  I'll probably keep it,
at least for a while.  Not because I feel anything in my desktop, rather
because meeting this again (and it being deeper than I recall) reminded
me of measuring impact on NFS etc, making it a tad difficult to ignore.

Oh well, I'll forget about it eventually.. BTDT.

	(standard beloved Granny disclaimer)

sched: Add SIS stacking mitigation feature

Select the least loaded LLC CPU for cache cold tasks and kthreads.

Addendum: renamed feature, and give it a big brother.

Not-Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 kernel/sched/fair.c     |   54 ++++++++++++++++++++++++++++++++++++++++++++----
 kernel/sched/features.h |    5 ++++
 2 files changed, 55 insertions(+), 4 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6261,6 +6261,26 @@ static inline int select_idle_smt(struct
 
 #endif /* CONFIG_SCHED_SMT */
 
+static bool task_is_kthread_or_cold(struct task_struct *p)
+{
+	s64 cold = sysctl_sched_migration_cost;
+
+	if (p->flags & PF_KTHREAD)
+		return true;
+	if (cold <= 0)
+		return false;
+	return task_rq(p)->clock_task - p->se.exec_start > cold;
+}
+
+static bool cpu_load_inconsistent(int cpu)
+{
+	struct rq *rq = cpu_rq(cpu);
+
+	if (rq->cfs.h_nr_running < 4)
+		return false;
+	return cpu_load(rq) << 2 < scale_load_down(rq->cfs.load.weight);
+}
+
 /*
  * Scan the LLC domain for idle CPUs; this is dynamically regulated by
  * comparing the average scan cost (tracked in sd->avg_scan_cost) against the
@@ -6269,7 +6289,7 @@ static inline int select_idle_smt(struct
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	int i, cpu, idle_cpu = -1, nr = INT_MAX;
+	int i, cpu, idle_cpu = -1, nr = INT_MAX, ld = -1;
 	struct rq *this_rq = this_rq();
 	int this = smp_processor_id();
 	struct sched_domain *this_sd;
@@ -6309,6 +6329,21 @@ static int select_idle_cpu(struct task_s
 		time = cpu_clock(this);
 	}
 
+	/*
+	 * Select the least loaded CPU for kthreads and cache cold tasks
+	 * if no idle CPU is found.
+	 */
+	if ((sched_feat(SIS_SPOT) && task_is_kthread_or_cold(p)) ||
+	    (sched_feat(SIS_REXY) && cpu_load_inconsistent(target))) {
+		idle_cpu = task_cpu(p);
+		if (idle_cpu != target && !cpus_share_cache(idle_cpu, target))
+			idle_cpu = target;
+		if (unlikely(!sched_cpu_cookie_match(cpu_rq(idle_cpu), p)))
+			idle_cpu = -1;
+		else
+			ld = scale_load_down(cpu_rq(idle_cpu)->cfs.load.weight);
+	}
+
 	for_each_cpu_wrap(cpu, cpus, target + 1) {
 		if (has_idle_core) {
 			i = select_idle_core(p, cpu, cpus, &idle_cpu);
@@ -6317,10 +6352,21 @@ static int select_idle_cpu(struct task_s
 
 		} else {
 			if (!--nr)
-				return -1;
-			idle_cpu = __select_idle_cpu(cpu, p);
-			if ((unsigned int)idle_cpu < nr_cpumask_bits)
+				return idle_cpu;
+			i = __select_idle_cpu(cpu, p);
+			if ((unsigned int)i < nr_cpumask_bits) {
+				idle_cpu = i;
 				break;
+			}
+		}
+		if (ld > 0 && sched_cpu_cookie_match(cpu_rq(cpu), p)) {
+			i = scale_load_down(cpu_rq(cpu)->cfs.load.weight);
+			if (i < ld) {
+				idle_cpu = cpu;
+				if (i == 0)
+					break;
+				ld = i;
+			}
 		}
 	}
 
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -95,3 +95,8 @@ SCHED_FEAT(LATENCY_WARN, false)
 
 SCHED_FEAT(ALT_PERIOD, true)
 SCHED_FEAT(BASE_SLICE, true)
+
+/* Mitigate PELT induced stacking. */
+SCHED_FEAT(SIS_SPOT, true)
+/* Spot's 12 ton big brother. */
+SCHED_FEAT(SIS_REXY, true)
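
For anyone skimming the diff, the selection logic can be sketched in
userspace C roughly as follows.  This is an illustration only, not kernel
code: struct toy_rq, pick_cpu(), is_cold() and load_inconsistent() are
made-up names, and the sketch assumes a plain array of runqueues.  The real
patch additionally honours the SIS scan limit, core scheduling cookies and
the has_idle_core path, and skips the load comparison entirely when the
starting CPU's weight is zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a runqueue; only what the sketch needs. */
struct toy_rq {
	bool idle;		/* nothing runnable on this CPU */
	long load_weight;	/* plays the role of scale_load_down(cfs.load.weight) */
};

/* Cache cold: the task last ran more than migration_cost ago. */
static bool is_cold(long long now, long long exec_start, long long migration_cost)
{
	if (migration_cost <= 0)
		return false;
	return now - exec_start > migration_cost;
}

/*
 * Load "inconsistent": at least 4 runnable tasks, yet the averaged load
 * is below a quarter of the instantaneous weight.
 */
static bool load_inconsistent(unsigned int h_nr_running, long avg_load, long weight)
{
	if (h_nr_running < 4)
		return false;
	return (avg_load << 2) < weight;
}

/*
 * Walk all CPUs starting at target + 1 and wrapping back to target.
 * The first idle CPU wins outright; failing that, return the least
 * loaded CPU seen, starting the comparison from 'fallback'.
 */
static int pick_cpu(const struct toy_rq *rqs, int nr_cpus, int target, int fallback)
{
	int best = fallback;
	long ld;

	assert(nr_cpus > 0 && target >= 0 && target < nr_cpus);
	assert(fallback >= 0 && fallback < nr_cpus);
	ld = rqs[fallback].load_weight;

	for (int n = 1; n <= nr_cpus; n++) {
		int cpu = (target + n) % nr_cpus;

		if (rqs[cpu].idle)
			return cpu;
		if (rqs[cpu].load_weight < ld) {
			best = cpu;
			ld = rqs[cpu].load_weight;
		}
	}
	return best;
}
```

In the patch, the least loaded fallback only kicks in when the task is a
kthread or cache cold (SIS_SPOT), or when the target's load looks
inconsistent (SIS_REXY); otherwise the scan behaves as before and only an
idle CPU is accepted.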