* [PATCH v5 0/4] Scan for an idle sibling in a single pass
@ 2021-01-27 13:51 Mel Gorman
2021-01-27 13:52 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
` (4 more replies)
0 siblings, 5 replies; 22+ messages in thread
From: Mel Gorman @ 2021-01-27 13:51 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman
Changelog since v4
o Avoid use of intermediate variable during select_idle_cpu
Changelog since v3
o Drop scanning based on cores, SMT4 results showed problems
Changelog since v2
o Remove unnecessary parameters
o Update nr during scan only when scanning for cpus
Changelog since v1
o Move extern declaration to header for coding style
o Remove unnecessary parameter from __select_idle_cpu
This series of 4 patches reposts three patches from Peter entitled
"select_idle_sibling() wreckage". The net effect is that the runqueues are
scanned in a single pass when searching for an idle sibling.
Three patches from Peter were dropped. The first patch altered how scan
depth was calculated. Scan depth selection is effectively a random number
generator with two major limitations. The avg_idle time is based on the
time between a CPU going idle and being woken up, clamped approximately by
2*sysctl_sched_migration_cost. This is difficult to compare in a sensible
fashion to avg_scan_cost. The second issue is that only the avg_scan_cost
of scan failures is recorded and it does not decay. This requires deeper
surgery that would justify a patch on its own, although Peter notes that
https://lkml.kernel.org/r/20180530143105.977759909@infradead.org is
potentially useful for an alternative avg_idle metric.
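
For reference, the proportional depth that SIS_PROP keeps after this series
works out to span_weight * (avg_idle / 512) / (avg_scan_cost + 1), with a
floor of four CPUs. A minimal userspace sketch of that arithmetic is below;
it is illustrative only, the function name and example numbers are invented
and the constants simply mirror the hunk in patch 1.

#include <stdio.h>
#include <stdint.h>

/*
 * Illustrative sketch of the SIS_PROP scan-depth calculation kept by this
 * series (mirrors the hunk in patch 1; not the kernel implementation).
 */
static unsigned int sis_prop_depth(uint64_t avg_idle_ns, uint64_t avg_scan_cost_ns,
				   unsigned int span_weight)
{
	uint64_t avg_idle = avg_idle_ns / 512;		/* large fuzz factor */
	uint64_t avg_cost = avg_scan_cost_ns + 1;	/* avoid division by zero */
	uint64_t span_avg = (uint64_t)span_weight * avg_idle;

	if (span_avg > 4 * avg_cost)
		return (unsigned int)(span_avg / avg_cost);

	return 4;					/* scan at least 4 CPUs */
}

int main(void)
{
	/* e.g. 200us average idle, 4us average scan cost, 24-CPU LLC */
	printf("depth = %u\n", sis_prop_depth(200000, 4000, 24));
	return 0;
}
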
The second dropped patch scanned based on cores instead of CPUs as it
rationalised the difference between core scanning and CPU scanning.
Unfortunately, Vincent reported problems with SMT4, so it is dropped
for now until depth searching can be fixed.
The third dropped patch converted the idle core scan throttling mechanism
to SIS_PROP. While this would unify the throttling of core and CPU
scanning, it was not free of regressions, and has_idle_cores is a fairly
effective throttling mechanism with the caveat that it can have a lot of
false positives for workloads like hackbench.
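
For context, has_idle_cores is a single hint per LLC: it is set when a CPU
notices that its whole core has gone idle and cleared when a scan for an
idle core fails, so subsequent wakeups skip the core scan entirely while the
hint is clear. A rough userspace-only sketch of that gating is below; the
helper names and structure are simplified stand-ins rather than the kernel's
test_idle_cores()/set_idle_cores() implementation.

#include <stdbool.h>
#include <stdio.h>

/*
 * Rough sketch of the has_idle_cores hint described above.  It is a single
 * boolean per LLC, not per core, which is why it can give false positives
 * for rapidly idling workloads such as hackbench.
 */
struct llc_shared {
	bool has_idle_cores;		/* hint only, may be stale */
};

/* Idle path: a CPU noticed that every sibling on its core is now idle. */
static void note_core_idle(struct llc_shared *llc)
{
	llc->has_idle_cores = true;
}

/*
 * Wakeup path: only scan for a whole idle core while the hint is set, and
 * clear the hint when such a scan comes up empty so later wakeups skip it.
 */
static int pick_cpu(struct llc_shared *llc, int core_scan_result)
{
	if (llc->has_idle_cores) {
		/* core_scan_result stands in for an actual scan of the LLC */
		if (core_scan_result >= 0)
			return core_scan_result;
		llc->has_idle_cores = false;	/* throttle future core scans */
	}

	return -1;	/* caller falls back to the depth-limited CPU scan */
}

int main(void)
{
	struct llc_shared llc = { .has_idle_cores = false };

	note_core_idle(&llc);
	printf("first wakeup:  %d\n", pick_cpu(&llc, -1));	/* scan fails, hint cleared */
	printf("second wakeup: %d\n", pick_cpu(&llc, 2));	/* core scan skipped */
	return 0;
}
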
Peter's series tried to solve three problems at once; this subset addresses
just one of them.
Mel Gorman (4):
sched/fair: Remove SIS_AVG_CPU
sched/fair: Move avg_scan_cost calculations under SIS_PROP
sched/fair: Remove select_idle_smt()
sched/fair: Merge select_idle_core/cpu()
kernel/sched/fair.c | 151 +++++++++++++++++++---------------------
kernel/sched/features.h | 1 -
2 files changed, 70 insertions(+), 82 deletions(-)
--
2.26.2
^ permalink raw reply [flat|nested] 22+ messages in thread* [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman @ 2021-01-27 13:52 ` Mel Gorman 2021-01-27 13:52 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman ` (3 subsequent siblings) 4 siblings, 0 replies; 22+ messages in thread From: Mel Gorman @ 2021-01-27 13:52 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman SIS_AVG_CPU was introduced as a means of avoiding a search when the average search cost indicated that the search would likely fail. It was a blunt instrument and disabled by commit 4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive") and later replaced with a proportional search depth by commit 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()"). While there are corner cases where SIS_AVG_CPU is better, it has now been disabled for almost three years. As the intent of SIS_PROP is to reduce the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus on SIS_PROP as a throttling mechanism. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/fair.c | 20 +++++++++----------- kernel/sched/features.h | 1 - 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 04a3ce20da67..9f5682aeda2e 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6145,7 +6145,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t { struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); struct sched_domain *this_sd; - u64 avg_cost, avg_idle; u64 time; int this = smp_processor_id(); int cpu, nr = INT_MAX; @@ -6154,18 +6153,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t if (!this_sd) return -1; - /* - * Due to large variance we need a large fuzz factor; hackbench in - * particularly is sensitive here. - */ - avg_idle = this_rq()->avg_idle / 512; - avg_cost = this_sd->avg_scan_cost + 1; + if (sched_feat(SIS_PROP)) { + u64 avg_cost, avg_idle, span_avg; - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) - return -1; + /* + * Due to large variance we need a large fuzz factor; + * hackbench in particularly is sensitive here. + */ + avg_idle = this_rq()->avg_idle / 512; + avg_cost = this_sd->avg_scan_cost + 1; - if (sched_feat(SIS_PROP)) { - u64 span_avg = sd->span_weight * avg_idle; + span_avg = sd->span_weight * avg_idle; if (span_avg > 4*avg_cost) nr = div_u64(span_avg, avg_cost); else diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 68d369cba9e4..e875eabb6600 100644 --- a/kernel/sched/features.h +++ b/kernel/sched/features.h @@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true) /* * When doing wakeups, attempt to limit superfluous scans of the LLC domain. */ -SCHED_FEAT(SIS_AVG_CPU, false) SCHED_FEAT(SIS_PROP, true) /* -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP 2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman 2021-01-27 13:52 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman @ 2021-01-27 13:52 ` Mel Gorman 2021-01-27 13:52 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman ` (2 subsequent siblings) 4 siblings, 0 replies; 22+ messages in thread From: Mel Gorman @ 2021-01-27 13:52 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP check and while we are at it, exclude the cost of initialising the CPU mask from the average scan cost. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/fair.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9f5682aeda2e..c8d8e185cf3b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t if (!this_sd) return -1; + cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); + if (sched_feat(SIS_PROP)) { u64 avg_cost, avg_idle, span_avg; @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t nr = div_u64(span_avg, avg_cost); else nr = 4; - } - - time = cpu_clock(this); - cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); + time = cpu_clock(this); + } for_each_cpu_wrap(cpu, cpus, target) { if (!--nr) @@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t break; } - time = cpu_clock(this) - time; - update_avg(&this_sd->avg_scan_cost, time); + if (sched_feat(SIS_PROP)) { + time = cpu_clock(this) - time; + update_avg(&this_sd->avg_scan_cost, time); + } return cpu; } -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 3/4] sched/fair: Remove select_idle_smt() 2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman 2021-01-27 13:52 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman 2021-01-27 13:52 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman @ 2021-01-27 13:52 ` Mel Gorman 2021-01-27 13:52 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman 2021-02-01 1:13 ` [PATCH v5 0/4] Scan for an idle sibling in a single pass Li, Aubrey 4 siblings, 0 replies; 22+ messages in thread From: Mel Gorman @ 2021-01-27 13:52 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman From: Peter Zijlstra (Intel) <peterz@infradead.org> In order to make the next patch more readable, and to quantify the actual effectiveness of this pass, start by removing it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/fair.c | 30 ------------------------------ 1 file changed, 30 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c8d8e185cf3b..fe587350ea14 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6101,27 +6101,6 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int return -1; } -/* - * Scan the local SMT mask for idle CPUs. - */ -static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target) -{ - int cpu; - - if (!static_branch_likely(&sched_smt_present)) - return -1; - - for_each_cpu(cpu, cpu_smt_mask(target)) { - if (!cpumask_test_cpu(cpu, p->cpus_ptr) || - !cpumask_test_cpu(cpu, sched_domain_span(sd))) - continue; - if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) - return cpu; - } - - return -1; -} - #else /* CONFIG_SCHED_SMT */ static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target) @@ -6129,11 +6108,6 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s return -1; } -static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target) -{ - return -1; -} - #endif /* CONFIG_SCHED_SMT */ /* @@ -6323,10 +6297,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target) if ((unsigned)i < nr_cpumask_bits) return i; - i = select_idle_smt(p, sd, target); - if ((unsigned)i < nr_cpumask_bits) - return i; - return target; } -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() 2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman ` (2 preceding siblings ...) 2021-01-27 13:52 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman @ 2021-01-27 13:52 ` Mel Gorman 2021-02-17 13:17 ` [tip: sched/core] " tip-bot2 for Mel Gorman 2021-02-01 1:13 ` [PATCH v5 0/4] Scan for an idle sibling in a single pass Li, Aubrey 4 siblings, 1 reply; 22+ messages in thread From: Mel Gorman @ 2021-01-27 13:52 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman From: Peter Zijlstra (Intel) <peterz@infradead.org> Both select_idle_core() and select_idle_cpu() do a loop over the same cpumask. Observe that by clearing the already visited CPUs, we can fold the iteration and iterate a core at a time. All we need to do is remember any non-idle CPU we encountered while scanning for an idle core. This way we'll only iterate every CPU once. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/fair.c | 99 +++++++++++++++++++++++++++------------------ 1 file changed, 59 insertions(+), 40 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index fe587350ea14..01591e24a95f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6006,6 +6006,14 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p return new_cpu; } +static inline int __select_idle_cpu(int cpu) +{ + if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) + return cpu; + + return -1; +} + #ifdef CONFIG_SCHED_SMT DEFINE_STATIC_KEY_FALSE(sched_smt_present); EXPORT_SYMBOL_GPL(sched_smt_present); @@ -6064,48 +6072,51 @@ void __update_idle_core(struct rq *rq) * there are no idle cores left in the system; tracked through * sd_llc->shared->has_idle_cores and enabled through update_idle_core() above. */ -static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target) +static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu) { - struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); - int core, cpu; + bool idle = true; + int cpu; if (!static_branch_likely(&sched_smt_present)) - return -1; - - if (!test_idle_cores(target, false)) - return -1; - - cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); + return __select_idle_cpu(core); - for_each_cpu_wrap(core, cpus, target) { - bool idle = true; - - for_each_cpu(cpu, cpu_smt_mask(core)) { - if (!available_idle_cpu(cpu)) { - idle = false; - break; + for_each_cpu(cpu, cpu_smt_mask(core)) { + if (!available_idle_cpu(cpu)) { + idle = false; + if (*idle_cpu == -1) { + if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) { + *idle_cpu = cpu; + break; + } + continue; } + break; } - - if (idle) - return core; - - cpumask_andnot(cpus, cpus, cpu_smt_mask(core)); + if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr)) + *idle_cpu = cpu; } - /* - * Failed to find an idle core; stop looking for one. 
- */ - set_idle_cores(target, 0); + if (idle) + return core; + cpumask_andnot(cpus, cpus, cpu_smt_mask(core)); return -1; } #else /* CONFIG_SCHED_SMT */ -static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target) +static inline void set_idle_cores(int cpu, int val) { - return -1; +} + +static inline bool test_idle_cores(int cpu, bool def) +{ + return def; +} + +static inline int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu) +{ + return __select_idle_cpu(core); } #endif /* CONFIG_SCHED_SMT */ @@ -6118,10 +6129,11 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target) { struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); + int i, cpu, idle_cpu = -1, nr = INT_MAX; + bool smt = test_idle_cores(target, false); + int this = smp_processor_id(); struct sched_domain *this_sd; u64 time; - int this = smp_processor_id(); - int cpu, nr = INT_MAX; this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc)); if (!this_sd) @@ -6129,7 +6141,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); - if (sched_feat(SIS_PROP)) { + if (sched_feat(SIS_PROP) && !smt) { u64 avg_cost, avg_idle, span_avg; /* @@ -6149,18 +6161,29 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t } for_each_cpu_wrap(cpu, cpus, target) { - if (!--nr) - return -1; - if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) - break; + if (smt) { + i = select_idle_core(p, cpu, cpus, &idle_cpu); + if ((unsigned int)i < nr_cpumask_bits) + return i; + + } else { + if (!--nr) + return -1; + idle_cpu = __select_idle_cpu(cpu); + if ((unsigned int)idle_cpu < nr_cpumask_bits) + break; + } } - if (sched_feat(SIS_PROP)) { + if (smt) + set_idle_cores(this, false); + + if (sched_feat(SIS_PROP) && !smt) { time = cpu_clock(this) - time; update_avg(&this_sd->avg_scan_cost, time); } - return cpu; + return idle_cpu; } /* @@ -6289,10 +6312,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target) if (!sd) return target; - i = select_idle_core(p, sd, target); - if ((unsigned)i < nr_cpumask_bits) - return i; - i = select_idle_cpu(p, sd, target); if ((unsigned)i < nr_cpumask_bits) return i; -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [tip: sched/core] sched/fair: Merge select_idle_core/cpu() 2021-01-27 13:52 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman @ 2021-02-17 13:17 ` tip-bot2 for Mel Gorman 0 siblings, 0 replies; 22+ messages in thread From: tip-bot2 for Mel Gorman @ 2021-02-17 13:17 UTC (permalink / raw) To: linux-tip-commits Cc: Mel Gorman, Peter Zijlstra (Intel), Ingo Molnar, Vincent Guittot, x86, linux-kernel The following commit has been merged into the sched/core branch of tip: Commit-ID: 9fe1f127b913318c631d0041ecf71486e38c2c2d Gitweb: https://git.kernel.org/tip/9fe1f127b913318c631d0041ecf71486e38c2c2d Author: Mel Gorman <mgorman@techsingularity.net> AuthorDate: Wed, 27 Jan 2021 13:52:03 Committer: Ingo Molnar <mingo@kernel.org> CommitterDate: Wed, 17 Feb 2021 14:07:25 +01:00 sched/fair: Merge select_idle_core/cpu() Both select_idle_core() and select_idle_cpu() do a loop over the same cpumask. Observe that by clearing the already visited CPUs, we can fold the iteration and iterate a core at a time. All we need to do is remember any non-idle CPU we encountered while scanning for an idle core. This way we'll only iterate every CPU once. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210127135203.19633-5-mgorman@techsingularity.net --- kernel/sched/fair.c | 99 +++++++++++++++++++++++++------------------ 1 file changed, 59 insertions(+), 40 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 6a0fc8a..c73d588 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6019,6 +6019,14 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p return new_cpu; } +static inline int __select_idle_cpu(int cpu) +{ + if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) + return cpu; + + return -1; +} + #ifdef CONFIG_SCHED_SMT DEFINE_STATIC_KEY_FALSE(sched_smt_present); EXPORT_SYMBOL_GPL(sched_smt_present); @@ -6077,48 +6085,51 @@ unlock: * there are no idle cores left in the system; tracked through * sd_llc->shared->has_idle_cores and enabled through update_idle_core() above. */ -static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target) +static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu) { - struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); - int core, cpu; + bool idle = true; + int cpu; if (!static_branch_likely(&sched_smt_present)) - return -1; - - if (!test_idle_cores(target, false)) - return -1; - - cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); + return __select_idle_cpu(core); - for_each_cpu_wrap(core, cpus, target) { - bool idle = true; - - for_each_cpu(cpu, cpu_smt_mask(core)) { - if (!available_idle_cpu(cpu)) { - idle = false; - break; + for_each_cpu(cpu, cpu_smt_mask(core)) { + if (!available_idle_cpu(cpu)) { + idle = false; + if (*idle_cpu == -1) { + if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) { + *idle_cpu = cpu; + break; + } + continue; } + break; } - - if (idle) - return core; - - cpumask_andnot(cpus, cpus, cpu_smt_mask(core)); + if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr)) + *idle_cpu = cpu; } - /* - * Failed to find an idle core; stop looking for one. 
- */ - set_idle_cores(target, 0); + if (idle) + return core; + cpumask_andnot(cpus, cpus, cpu_smt_mask(core)); return -1; } #else /* CONFIG_SCHED_SMT */ -static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target) +static inline void set_idle_cores(int cpu, int val) { - return -1; +} + +static inline bool test_idle_cores(int cpu, bool def) +{ + return def; +} + +static inline int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu) +{ + return __select_idle_cpu(core); } #endif /* CONFIG_SCHED_SMT */ @@ -6131,10 +6142,11 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target) { struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); + int i, cpu, idle_cpu = -1, nr = INT_MAX; + bool smt = test_idle_cores(target, false); + int this = smp_processor_id(); struct sched_domain *this_sd; u64 time; - int this = smp_processor_id(); - int cpu, nr = INT_MAX; this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc)); if (!this_sd) @@ -6142,7 +6154,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); - if (sched_feat(SIS_PROP)) { + if (sched_feat(SIS_PROP) && !smt) { u64 avg_cost, avg_idle, span_avg; /* @@ -6162,18 +6174,29 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t } for_each_cpu_wrap(cpu, cpus, target) { - if (!--nr) - return -1; - if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) - break; + if (smt) { + i = select_idle_core(p, cpu, cpus, &idle_cpu); + if ((unsigned int)i < nr_cpumask_bits) + return i; + + } else { + if (!--nr) + return -1; + idle_cpu = __select_idle_cpu(cpu); + if ((unsigned int)idle_cpu < nr_cpumask_bits) + break; + } } - if (sched_feat(SIS_PROP)) { + if (smt) + set_idle_cores(this, false); + + if (sched_feat(SIS_PROP) && !smt) { time = cpu_clock(this) - time; update_avg(&this_sd->avg_scan_cost, time); } - return cpu; + return idle_cpu; } /* @@ -6302,10 +6325,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target) if (!sd) return target; - i = select_idle_core(p, sd, target); - if ((unsigned)i < nr_cpumask_bits) - return i; - i = select_idle_cpu(p, sd, target); if ((unsigned)i < nr_cpumask_bits) return i; ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v5 0/4] Scan for an idle sibling in a single pass 2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman ` (3 preceding siblings ...) 2021-01-27 13:52 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman @ 2021-02-01 1:13 ` Li, Aubrey 2021-02-01 13:06 ` Mel Gorman 4 siblings, 1 reply; 22+ messages in thread From: Li, Aubrey @ 2021-02-01 1:13 UTC (permalink / raw) To: Mel Gorman, Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Qais Yousef, LKML On 2021/1/27 21:51, Mel Gorman wrote: > Changelog since v4 > o Avoid use of intermediate variable during select_idle_cpu > > Changelog since v3 > o Drop scanning based on cores, SMT4 results showed problems > > Changelog since v2 > o Remove unnecessary parameters > o Update nr during scan only when scanning for cpus > > Changlog since v1 > o Move extern declaration to header for coding style > o Remove unnecessary parameter from __select_idle_cpu > > This series of 4 patches reposts three patches from Peter entitled > "select_idle_sibling() wreckage". It only scans the runqueues in a single > pass when searching for an idle sibling. > > Three patches from Peter were dropped. The first patch altered how scan > depth was calculated. Scan depth deletion is a random number generator > with two major limitations. The avg_idle time is based on the time > between a CPU going idle and being woken up clamped approximately by > 2*sysctl_sched_migration_cost. This is difficult to compare in a sensible > fashion to avg_scan_cost. The second issue is that only the avg_scan_cost > of scan failures is recorded and it does not decay. This requires deeper > surgery that would justify a patch on its own although Peter notes that > https://lkml.kernel.org/r/20180530143105.977759909@infradead.org is > potentially useful for an alternative avg_idle metric. > > The second patch dropped scanned based on cores instead of CPUs as it > rationalised the difference between core scanning and CPU scanning. > Unfortunately, Vincent reported problems with SMT4 so it's dropped > for now until depth searching can be fixed. > > The third patch dropped converted the idle core scan throttling mechanism > to SIS_PROP. While this would unify the throttling of core and CPU > scanning, it was not free of regressions and has_idle_cores is a fairly > effective throttling mechanism with the caveat that it can have a lot of > false positives for workloads like hackbench. > > Peter's series tried to solve three problems at once, this subset addresses > one problem. > > kernel/sched/fair.c | 151 +++++++++++++++++++--------------------- > kernel/sched/features.h | 1 - > 2 files changed, 70 insertions(+), 82 deletions(-) > 4 benchmarks measured on a x86 4s system with 24 cores per socket and 2 HTs per core, total 192 CPUs. The load level is [25%, 50%, 75%, 100%]. - hackbench almost has a universal win. - netperf high load has notable changes, as well as tbench 50% load. 
Details below:

hackbench: 10 iterations, 10000 loops, 40 fds per group
========================================================

- pipe process

  group   base    %std    v5      %std
  3       1       19.18   1.0266  9.06
  6       1       9.17    0.987   13.03
  9       1       7.11    1.0195  4.61
  12      1       1.07    0.9927  1.43

- pipe thread

  group   base    %std    v5      %std
  3       1       11.14   0.9742  7.27
  6       1       9.15    0.9572  7.48
  9       1       2.95    0.986   4.05
  12      1       1.75    0.9992  1.68

- socket process

  group   base    %std    v5      %std
  3       1       2.9     0.9586  2.39
  6       1       0.68    0.9641  1.3
  9       1       0.64    0.9388  0.76
  12      1       0.56    0.9375  0.55

- socket thread

  group   base    %std    v5      %std
  3       1       3.82    0.9686  2.97
  6       1       2.06    0.9667  1.91
  9       1       0.44    0.9354  1.25
  12      1       0.54    0.9362  0.6

netperf: 10 iterations x 100 seconds, transactions rate / sec
=============================================================

- tcp request/response performance

  thread  base    %std    v4      %std
  25%     1       5.34    1.0039  5.13
  50%     1       4.97    1.0115  6.3
  75%     1       5.09    0.9257  6.75
  100%    1       4.53    0.908   4.83

- udp request/response performance

  thread  base    %std    v4      %std
  25%     1       6.18    0.9896  6.09
  50%     1       5.88    1.0198  8.92
  75%     1       24.38   0.9236  29.14
  100%    1       26.16   0.9063  22.16

tbench: 10 iterations x 100 seconds, throughput / sec
=====================================================

  thread  base    %std    v4      %std
  25%     1       0.45    1.003   1.48
  50%     1       1.71    0.9286  0.82
  75%     1       0.84    0.9928  0.94
  100%    1       0.76    0.9762  0.59

schbench: 10 iterations x 100 seconds, 99th percentile latency
==============================================================

  mthread base    %std    v4      %std
  25%     1       2.89    0.9884  7.34
  50%     1       40.38   1.0055  38.37
  75%     1       4.76    1.0095  4.62
  100%    1       10.09   1.0083  8.03

Thanks,
-Aubrey

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [PATCH v5 0/4] Scan for an idle sibling in a single pass
  2021-02-01  1:13 ` [PATCH v5 0/4] Scan for an idle sibling in a single pass Li, Aubrey
@ 2021-02-01 13:06 ` Mel Gorman
  0 siblings, 0 replies; 22+ messages in thread
From: Mel Gorman @ 2021-02-01 13:06 UTC (permalink / raw)
To: Li, Aubrey
Cc: Peter Zijlstra, Ingo Molnar, Vincent Guittot, Qais Yousef, LKML

On Mon, Feb 01, 2021 at 09:13:16AM +0800, Li, Aubrey wrote:
> > Peter's series tried to solve three problems at once, this subset addresses
> > one problem.
> >
> >  kernel/sched/fair.c     | 151 +++++++++++++++++++---------------------
> >  kernel/sched/features.h |   1 -
> >  2 files changed, 70 insertions(+), 82 deletions(-)
> >
>
> 4 benchmarks measured on a x86 4s system with 24 cores per socket and
> 2 HTs per core, total 192 CPUs.
>
> The load level is [25%, 50%, 75%, 100%].
>
> - hackbench almost has a universal win.
> - netperf high load has notable changes, as well as tbench 50% load.
>

Ok, both netperf and tbench are somewhat expected as those loads are
rapidly idling. Previously I observed that rapidly idling loads can
allow the has_idle_cores test to pass for short durations and the double
scanning means there is a greater chance of finding an idle CPU over the
two passes.

I think overall it's better to avoid double scanning even if there are
counter examples as it's possible we'll get that back through better
depth selection in the future.

Thanks.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread
* [PATCH v4 0/4] Scan for an idle sibling in a single pass @ 2021-01-25 8:59 Mel Gorman 2021-01-25 8:59 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman 0 siblings, 1 reply; 22+ messages in thread From: Mel Gorman @ 2021-01-25 8:59 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman Changelog since v3 o Drop scanning based on cores, SMT4 results showed problems Changelog since v2 o Remove unnecessary parameters o Update nr during scan only when scanning for cpus Changlog since v1 o Move extern declaration to header for coding style o Remove unnecessary parameter from __select_idle_cpu This series of 4 patches reposts three patches from Peter entitled "select_idle_sibling() wreckage". It only scans the runqueues in a single pass when searching for an idle sibling. Three patches from Peter were dropped. The first patch altered how scan depth was calculated. Scan depth deletion is a random number generator with two major limitations. The avg_idle time is based on the time between a CPU going idle and being woken up clamped approximately by 2*sysctl_sched_migration_cost. This is difficult to compare in a sensible fashion to avg_scan_cost. The second issue is that only the avg_scan_cost of scan failures is recorded and it does not decay. This requires deeper surgery that would justify a patch on its own although Peter notes that https://lkml.kernel.org/r/20180530143105.977759909@infradead.org is potentially useful for an alternative avg_idle metric. The second patch dropped scanned based on cores instead of CPUs as it rationalised the difference between core scanning and CPU scanning. Unfortunately, Vincent reported problems with SMT4 so it's dropped for now until depth searching can be fixed. The third patch dropped converted the idle core scan throttling mechanism to SIS_PROP. While this would unify the throttling of core and CPU scanning, it was not free of regressions and has_idle_cores is a fairly effective throttling mechanism with the caveat that it can have a lot of false positives for workloads like hackbench. Peter's series tried to solve three problems at once, this subset addresses one problem. losses but won more than it lost across a range of workloads and machines. kernel/sched/fair.c | 153 +++++++++++++++++++--------------------- kernel/sched/features.h | 1 - 2 files changed, 72 insertions(+), 82 deletions(-) -- 2.26.2 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2021-01-25 8:59 [PATCH v4 " Mel Gorman @ 2021-01-25 8:59 ` Mel Gorman 0 siblings, 0 replies; 22+ messages in thread From: Mel Gorman @ 2021-01-25 8:59 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman SIS_AVG_CPU was introduced as a means of avoiding a search when the average search cost indicated that the search would likely fail. It was a blunt instrument and disabled by commit 4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive") and later replaced with a proportional search depth by commit 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()"). While there are corner cases where SIS_AVG_CPU is better, it has now been disabled for almost three years. As the intent of SIS_PROP is to reduce the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus on SIS_PROP as a throttling mechanism. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/fair.c | 20 +++++++++----------- kernel/sched/features.h | 1 - 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 04a3ce20da67..9f5682aeda2e 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6145,7 +6145,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t { struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); struct sched_domain *this_sd; - u64 avg_cost, avg_idle; u64 time; int this = smp_processor_id(); int cpu, nr = INT_MAX; @@ -6154,18 +6153,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t if (!this_sd) return -1; - /* - * Due to large variance we need a large fuzz factor; hackbench in - * particularly is sensitive here. - */ - avg_idle = this_rq()->avg_idle / 512; - avg_cost = this_sd->avg_scan_cost + 1; + if (sched_feat(SIS_PROP)) { + u64 avg_cost, avg_idle, span_avg; - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) - return -1; + /* + * Due to large variance we need a large fuzz factor; + * hackbench in particularly is sensitive here. + */ + avg_idle = this_rq()->avg_idle / 512; + avg_cost = this_sd->avg_scan_cost + 1; - if (sched_feat(SIS_PROP)) { - u64 span_avg = sd->span_weight * avg_idle; + span_avg = sd->span_weight * avg_idle; if (span_avg > 4*avg_cost) nr = div_u64(span_avg, avg_cost); else diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 68d369cba9e4..e875eabb6600 100644 --- a/kernel/sched/features.h +++ b/kernel/sched/features.h @@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true) /* * When doing wakeups, attempt to limit superfluous scans of the LLC domain. */ -SCHED_FEAT(SIS_AVG_CPU, false) SCHED_FEAT(SIS_PROP, true) /* -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 0/4] Reduce scanning of runqueues in select_idle_sibling
@ 2020-12-08 15:34 Mel Gorman
2020-12-08 15:34 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
0 siblings, 1 reply; 22+ messages in thread
From: Mel Gorman @ 2020-12-08 15:34 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, LKML
Cc: Aubrey Li, Barry Song, Juri Lelli, Vincent Guittot,
Valentin Schneider, Linux-ARM, Mel Gorman
Changelog since v1
o Drop single-pass patch (vincent)
o Scope variables used for SIS_AVG_CPU (dietmar)
o Remove redundant assignment (dietmar)
This reduces the amount of runqueue scanning in select_idle_sibling in
the worst case.
Patch 1 removes SIS_AVG_CPU because it's unused.
Patch 2 moves all SIS_PROP-related calculations under SIS_PROP.
Patch 3 improves the hit rate of p->recent_used_cpu to reduce the amount
of scanning. It should be relatively uncontroversial.
Patch 4 returns an idle candidate if one is found while scanning for a
free core.
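
To illustrate the idea, the scan for a free core can remember the first idle
CPU it encounters so that a failed core search still yields a usable
candidate. A simplified userspace sketch is below; the helper name and data
layout are invented and it is not the actual select_idle_core().

#include <stdio.h>

/*
 * Simplified sketch: check whether an entire core is idle and, as a side
 * effect, remember an idle CPU as a fallback candidate for the caller.
 */
static int check_core(const int *cpu_is_idle, const int *core_cpus, int nr_smt,
		      int *idle_cpu)
{
	int i, all_idle = 1;

	for (i = 0; i < nr_smt; i++) {
		int cpu = core_cpus[i];

		if (cpu_is_idle[cpu]) {
			if (*idle_cpu == -1)
				*idle_cpu = cpu;	/* remember a fallback */
		} else {
			all_idle = 0;
		}
	}

	return all_idle ? core_cpus[0] : -1;	/* whole core idle, or -1 */
}

int main(void)
{
	int cpu_is_idle[4] = { 0, 1, 0, 1 };	/* CPUs 1 and 3 are idle */
	int core0[2] = { 0, 1 }, core1[2] = { 2, 3 };
	int idle_cpu = -1;

	if (check_core(cpu_is_idle, core0, 2, &idle_cpu) < 0 &&
	    check_core(cpu_is_idle, core1, 2, &idle_cpu) < 0)
		printf("no idle core, fall back to CPU %d\n", idle_cpu);

	return 0;
}
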
--
2.26.2
Mel Gorman (4):
sched/fair: Remove SIS_AVG_CPU
sched/fair: Move avg_scan_cost calculations under SIS_PROP
sched/fair: Do not replace recent_used_cpu with the new target
sched/fair: Return an idle cpu if one is found after a failed search
for an idle core
kernel/sched/fair.c | 51 ++++++++++++++++++++---------------------
kernel/sched/features.h | 1 -
2 files changed, 25 insertions(+), 27 deletions(-)
--
2.26.2
^ permalink raw reply [flat|nested] 22+ messages in thread* [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-08 15:34 [PATCH 0/4] Reduce scanning of runqueues in select_idle_sibling Mel Gorman @ 2020-12-08 15:34 ` Mel Gorman 2020-12-08 16:13 ` Vincent Guittot 0 siblings, 1 reply; 22+ messages in thread From: Mel Gorman @ 2020-12-08 15:34 UTC (permalink / raw) To: Peter Ziljstra, Ingo Molnar, LKML Cc: Aubrey Li, Barry Song, Juri Lelli, Vincent Guittot, Valentin Schneider, Linux-ARM, Mel Gorman SIS_AVG_CPU was introduced as a means of avoiding a search when the average search cost indicated that the search would likely fail. It was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive") and later replaced with a proportional search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()"). While there are corner cases where SIS_AVG_CPU is better, it has now been disabled for almost three years. As the intent of SIS_PROP is to reduce the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus on SIS_PROP as a throttling mechanism. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> --- kernel/sched/fair.c | 20 +++++++++----------- kernel/sched/features.h | 1 - 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 98075f9ea9a8..ac7b34e7372b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6145,7 +6145,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t { struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); struct sched_domain *this_sd; - u64 avg_cost, avg_idle; u64 time; int this = smp_processor_id(); int cpu, nr = INT_MAX; @@ -6154,18 +6153,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t if (!this_sd) return -1; - /* - * Due to large variance we need a large fuzz factor; hackbench in - * particularly is sensitive here. - */ - avg_idle = this_rq()->avg_idle / 512; - avg_cost = this_sd->avg_scan_cost + 1; + if (sched_feat(SIS_PROP)) { + u64 avg_cost, avg_idle, span_avg; - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) - return -1; + /* + * Due to large variance we need a large fuzz factor; + * hackbench in particularly is sensitive here. + */ + avg_idle = this_rq()->avg_idle / 512; + avg_cost = this_sd->avg_scan_cost + 1; - if (sched_feat(SIS_PROP)) { - u64 span_avg = sd->span_weight * avg_idle; + span_avg = sd->span_weight * avg_idle; if (span_avg > 4*avg_cost) nr = div_u64(span_avg, avg_cost); else diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 68d369cba9e4..e875eabb6600 100644 --- a/kernel/sched/features.h +++ b/kernel/sched/features.h @@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true) /* * When doing wakeups, attempt to limit superfluous scans of the LLC domain. */ -SCHED_FEAT(SIS_AVG_CPU, false) SCHED_FEAT(SIS_PROP, true) /* -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-08 15:34 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman @ 2020-12-08 16:13 ` Vincent Guittot 0 siblings, 0 replies; 22+ messages in thread From: Vincent Guittot @ 2020-12-08 16:13 UTC (permalink / raw) To: Mel Gorman Cc: Peter Ziljstra, Ingo Molnar, LKML, Aubrey Li, Barry Song, Juri Lelli, Valentin Schneider, Linux-ARM On Tue, 8 Dec 2020 at 16:35, Mel Gorman <mgorman@techsingularity.net> wrote: > > SIS_AVG_CPU was introduced as a means of avoiding a search when the > average search cost indicated that the search would likely fail. It > was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make > select_idle_cpu() more aggressive") and later replaced with a proportional > search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to > scale select_idle_cpu()"). > > While there are corner cases where SIS_AVG_CPU is better, it has now been > disabled for almost three years. As the intent of SIS_PROP is to reduce > the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus > on SIS_PROP as a throttling mechanism. > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> > --- > kernel/sched/fair.c | 20 +++++++++----------- > kernel/sched/features.h | 1 - > 2 files changed, 9 insertions(+), 12 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 98075f9ea9a8..ac7b34e7372b 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -6145,7 +6145,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t > { > struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); > struct sched_domain *this_sd; > - u64 avg_cost, avg_idle; > u64 time; > int this = smp_processor_id(); > int cpu, nr = INT_MAX; > @@ -6154,18 +6153,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t > if (!this_sd) > return -1; > > - /* > - * Due to large variance we need a large fuzz factor; hackbench in > - * particularly is sensitive here. > - */ > - avg_idle = this_rq()->avg_idle / 512; > - avg_cost = this_sd->avg_scan_cost + 1; > + if (sched_feat(SIS_PROP)) { > + u64 avg_cost, avg_idle, span_avg; > > - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) > - return -1; > + /* > + * Due to large variance we need a large fuzz factor; > + * hackbench in particularly is sensitive here. > + */ > + avg_idle = this_rq()->avg_idle / 512; > + avg_cost = this_sd->avg_scan_cost + 1; > > - if (sched_feat(SIS_PROP)) { > - u64 span_avg = sd->span_weight * avg_idle; > + span_avg = sd->span_weight * avg_idle; > if (span_avg > 4*avg_cost) > nr = div_u64(span_avg, avg_cost); > else > diff --git a/kernel/sched/features.h b/kernel/sched/features.h > index 68d369cba9e4..e875eabb6600 100644 > --- a/kernel/sched/features.h > +++ b/kernel/sched/features.h > @@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true) > /* > * When doing wakeups, attempt to limit superfluous scans of the LLC domain. > */ > -SCHED_FEAT(SIS_AVG_CPU, false) > SCHED_FEAT(SIS_PROP, true) > > /* > -- > 2.26.2 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC PATCH 0/4] Reduce worst-case scanning of runqueues in select_idle_sibling @ 2020-12-07 9:15 Mel Gorman 2020-12-07 9:15 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman 0 siblings, 1 reply; 22+ messages in thread From: Mel Gorman @ 2020-12-07 9:15 UTC (permalink / raw) To: LKML Cc: Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Vincent Guittot, Valentin Schneider, Linux-ARM, Mel Gorman This is a minimal series to reduce the amount of runqueue scanning in select_idle_sibling in the worst case. Patch 1 removes SIS_AVG_CPU because it's unused. Patch 2 improves the hit rate of p->recent_used_cpu to reduce the amount of scanning. It should be relatively uncontroversial Patch 3-4 scans the runqueues in a single pass for select_idle_core() and select_idle_cpu() so runqueues are not scanned twice. It's a tradeoff because it benefits deep scans but introduces overhead for shallow scans. Even if patch 3-4 is rejected to allow more time for Aubrey's idle cpu mask approach to stand on its own, patches 1-2 should be fine. The main decision with patch 4 is whether select_idle_core() should do a full scan when searching for an idle core, whether it should be throttled in some other fashion or whether it should be just left alone. -- 2.26.2 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-07 9:15 [RFC PATCH 0/4] Reduce worst-case scanning of runqueues in select_idle_sibling Mel Gorman @ 2020-12-07 9:15 ` Mel Gorman 2020-12-07 15:05 ` Vincent Guittot 2020-12-08 10:07 ` Dietmar Eggemann 0 siblings, 2 replies; 22+ messages in thread From: Mel Gorman @ 2020-12-07 9:15 UTC (permalink / raw) To: LKML Cc: Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Vincent Guittot, Valentin Schneider, Linux-ARM, Mel Gorman SIS_AVG_CPU was introduced as a means of avoiding a search when the average search cost indicated that the search would likely fail. It was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive") and later replaced with a proportional search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()"). While there are corner cases where SIS_AVG_CPU is better, it has now been disabled for almost three years. As the intent of SIS_PROP is to reduce the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus on SIS_PROP as a throttling mechanism. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> --- kernel/sched/fair.c | 3 --- kernel/sched/features.h | 1 - 2 files changed, 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 98075f9ea9a8..23934dbac635 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6161,9 +6161,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t avg_idle = this_rq()->avg_idle / 512; avg_cost = this_sd->avg_scan_cost + 1; - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) - return -1; - if (sched_feat(SIS_PROP)) { u64 span_avg = sd->span_weight * avg_idle; if (span_avg > 4*avg_cost) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 68d369cba9e4..e875eabb6600 100644 --- a/kernel/sched/features.h +++ b/kernel/sched/features.h @@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true) /* * When doing wakeups, attempt to limit superfluous scans of the LLC domain. */ -SCHED_FEAT(SIS_AVG_CPU, false) SCHED_FEAT(SIS_PROP, true) /* -- 2.26.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-07 9:15 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman @ 2020-12-07 15:05 ` Vincent Guittot 2020-12-08 10:07 ` Dietmar Eggemann 1 sibling, 0 replies; 22+ messages in thread From: Vincent Guittot @ 2020-12-07 15:05 UTC (permalink / raw) To: Mel Gorman Cc: LKML, Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Valentin Schneider, Linux-ARM On Mon, 7 Dec 2020 at 10:15, Mel Gorman <mgorman@techsingularity.net> wrote: > > SIS_AVG_CPU was introduced as a means of avoiding a search when the > average search cost indicated that the search would likely fail. It > was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make > select_idle_cpu() more aggressive") and later replaced with a proportional > search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to > scale select_idle_cpu()"). > > While there are corner cases where SIS_AVG_CPU is better, it has now been > disabled for almost three years. As the intent of SIS_PROP is to reduce > the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus > on SIS_PROP as a throttling mechanism. > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Let see if someone complains but looks reasonable > --- > kernel/sched/fair.c | 3 --- > kernel/sched/features.h | 1 - > 2 files changed, 4 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 98075f9ea9a8..23934dbac635 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -6161,9 +6161,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t > avg_idle = this_rq()->avg_idle / 512; > avg_cost = this_sd->avg_scan_cost + 1; > > - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) > - return -1; > - > if (sched_feat(SIS_PROP)) { > u64 span_avg = sd->span_weight * avg_idle; > if (span_avg > 4*avg_cost) > diff --git a/kernel/sched/features.h b/kernel/sched/features.h > index 68d369cba9e4..e875eabb6600 100644 > --- a/kernel/sched/features.h > +++ b/kernel/sched/features.h > @@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true) > /* > * When doing wakeups, attempt to limit superfluous scans of the LLC domain. > */ > -SCHED_FEAT(SIS_AVG_CPU, false) > SCHED_FEAT(SIS_PROP, true) > > /* > -- > 2.26.2 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-07 9:15 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman 2020-12-07 15:05 ` Vincent Guittot @ 2020-12-08 10:07 ` Dietmar Eggemann 2020-12-08 10:59 ` Mel Gorman 1 sibling, 1 reply; 22+ messages in thread From: Dietmar Eggemann @ 2020-12-08 10:07 UTC (permalink / raw) To: Mel Gorman, LKML Cc: Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Vincent Guittot, Valentin Schneider, Linux-ARM On 07/12/2020 10:15, Mel Gorman wrote: > SIS_AVG_CPU was introduced as a means of avoiding a search when the > average search cost indicated that the search would likely fail. It > was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make > select_idle_cpu() more aggressive") and later replaced with a proportional > search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to > scale select_idle_cpu()"). > > While there are corner cases where SIS_AVG_CPU is better, it has now been > disabled for almost three years. As the intent of SIS_PROP is to reduce > the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus > on SIS_PROP as a throttling mechanism. > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > --- > kernel/sched/fair.c | 3 --- > kernel/sched/features.h | 1 - > 2 files changed, 4 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 98075f9ea9a8..23934dbac635 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -6161,9 +6161,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t > avg_idle = this_rq()->avg_idle / 512; > avg_cost = this_sd->avg_scan_cost + 1; > > - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) > - return -1; > - > if (sched_feat(SIS_PROP)) { > u64 span_avg = sd->span_weight * avg_idle; > if (span_avg > 4*avg_cost) Nitpick: Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go completely into the SIS_PROP if condition. diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 09f6f0edead4..fce9457cccb9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6121,7 +6121,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t { struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask); struct sched_domain *this_sd; - u64 avg_cost, avg_idle; u64 time; int this = smp_processor_id(); int cpu, nr = INT_MAX; @@ -6130,14 +6129,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t if (!this_sd) return -1; - /* - * Due to large variance we need a large fuzz factor; hackbench in - * particularly is sensitive here. - */ - avg_idle = this_rq()->avg_idle / 512; - avg_cost = this_sd->avg_scan_cost + 1; - if (sched_feat(SIS_PROP)) { + /* + * Due to large variance we need a large fuzz factor; hackbench in + * particularly is sensitive here. + */ + u64 avg_idle = this_rq()->avg_idle / 512; + u64 avg_cost = this_sd->avg_scan_cost + 1; u64 span_avg = sd->span_weight * avg_idle; if (span_avg > 4*avg_cost) nr = div_u64(span_avg, avg_cost); -- 2.17.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-08 10:07 ` Dietmar Eggemann @ 2020-12-08 10:59 ` Mel Gorman 2020-12-08 13:24 ` Vincent Guittot 0 siblings, 1 reply; 22+ messages in thread From: Mel Gorman @ 2020-12-08 10:59 UTC (permalink / raw) To: Dietmar Eggemann Cc: LKML, Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Vincent Guittot, Valentin Schneider, Linux-ARM On Tue, Dec 08, 2020 at 11:07:19AM +0100, Dietmar Eggemann wrote: > On 07/12/2020 10:15, Mel Gorman wrote: > > SIS_AVG_CPU was introduced as a means of avoiding a search when the > > average search cost indicated that the search would likely fail. It > > was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make > > select_idle_cpu() more aggressive") and later replaced with a proportional > > search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to > > scale select_idle_cpu()"). > > > > While there are corner cases where SIS_AVG_CPU is better, it has now been > > disabled for almost three years. As the intent of SIS_PROP is to reduce > > the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus > > on SIS_PROP as a throttling mechanism. > > > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > > --- > > kernel/sched/fair.c | 3 --- > > kernel/sched/features.h | 1 - > > 2 files changed, 4 deletions(-) > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index 98075f9ea9a8..23934dbac635 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -6161,9 +6161,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t > > avg_idle = this_rq()->avg_idle / 512; > > avg_cost = this_sd->avg_scan_cost + 1; > > > > - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) > > - return -1; > > - > > if (sched_feat(SIS_PROP)) { > > u64 span_avg = sd->span_weight * avg_idle; > > if (span_avg > 4*avg_cost) > > Nitpick: > > Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go > completely into the SIS_PROP if condition. > Yeah, I can do that. In the initial prototype, that happened in a separate patch that split out SIS_PROP into a helper function and I never merged it back. It's a trivial change. Thanks. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-08 10:59 ` Mel Gorman @ 2020-12-08 13:24 ` Vincent Guittot 2020-12-08 13:36 ` Mel Gorman 0 siblings, 1 reply; 22+ messages in thread From: Vincent Guittot @ 2020-12-08 13:24 UTC (permalink / raw) To: Mel Gorman Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Valentin Schneider, Linux-ARM On Tue, 8 Dec 2020 at 11:59, Mel Gorman <mgorman@techsingularity.net> wrote: > > On Tue, Dec 08, 2020 at 11:07:19AM +0100, Dietmar Eggemann wrote: > > On 07/12/2020 10:15, Mel Gorman wrote: > > > SIS_AVG_CPU was introduced as a means of avoiding a search when the > > > average search cost indicated that the search would likely fail. It > > > was a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make > > > select_idle_cpu() more aggressive") and later replaced with a proportional > > > search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to > > > scale select_idle_cpu()"). > > > > > > While there are corner cases where SIS_AVG_CPU is better, it has now been > > > disabled for almost three years. As the intent of SIS_PROP is to reduce > > > the time complexity of select_idle_cpu(), lets drop SIS_AVG_CPU and focus > > > on SIS_PROP as a throttling mechanism. > > > > > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > > > --- > > > kernel/sched/fair.c | 3 --- > > > kernel/sched/features.h | 1 - > > > 2 files changed, 4 deletions(-) > > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > index 98075f9ea9a8..23934dbac635 100644 > > > --- a/kernel/sched/fair.c > > > +++ b/kernel/sched/fair.c > > > @@ -6161,9 +6161,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t > > > avg_idle = this_rq()->avg_idle / 512; > > > avg_cost = this_sd->avg_scan_cost + 1; > > > > > > - if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost) > > > - return -1; > > > - > > > if (sched_feat(SIS_PROP)) { > > > u64 span_avg = sd->span_weight * avg_idle; > > > if (span_avg > 4*avg_cost) > > > > Nitpick: > > > > Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go > > completely into the SIS_PROP if condition. > > > > Yeah, I can do that. In the initial prototype, that happened in a > separate patch that split out SIS_PROP into a helper function and I > never merged it back. It's a trivial change. while doing this, should you also put the update of this_sd->avg_scan_cost under the SIS_PROP feature ? > > Thanks. > > -- > Mel Gorman > SUSE Labs ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU 2020-12-08 13:24 ` Vincent Guittot @ 2020-12-08 13:36 ` Mel Gorman 2020-12-08 13:43 ` Vincent Guittot 0 siblings, 1 reply; 22+ messages in thread From: Mel Gorman @ 2020-12-08 13:36 UTC (permalink / raw) To: Vincent Guittot Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar, Peter Ziljstra, Juri Lelli, Valentin Schneider, Linux-ARM On Tue, Dec 08, 2020 at 02:24:32PM +0100, Vincent Guittot wrote: > > > Nitpick: > > > > > > Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go > > > completely into the SIS_PROP if condition. > > > > > > > Yeah, I can do that. In the initial prototype, that happened in a > > separate patch that split out SIS_PROP into a helper function and I > > never merged it back. It's a trivial change. > > while doing this, should you also put the update of > this_sd->avg_scan_cost under the SIS_PROP feature ? > It's outside the scope of the series but why not. This? --8<-- sched/fair: Move avg_scan_cost calculations under SIS_PROP As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP check and while we are at it, exclude the cost of initialising the CPU mask from the average scan cost. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 19ca0265f8aa..0fee53b1aae4 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6176,10 +6176,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t nr = 4; } - time = cpu_clock(this); - cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); + if (sched_feat(SIS_PROP)) + time = cpu_clock(this); for_each_cpu_wrap(cpu, cpus, target) { if (!--nr) return -1; @@ -6187,8 +6187,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t break; } - time = cpu_clock(this) - time; - update_avg(&this_sd->avg_scan_cost, time); + if (sched_feat(SIS_PROP)) { + time = cpu_clock(this) - time; + update_avg(&this_sd->avg_scan_cost, time); + } return cpu; } ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
  2020-12-08 13:36                 ` Mel Gorman
@ 2020-12-08 13:43                   ` Vincent Guittot
  2020-12-08 13:53                     ` Mel Gorman
  0 siblings, 1 reply; 22+ messages in thread
From: Vincent Guittot @ 2020-12-08 13:43 UTC (permalink / raw)
To: Mel Gorman
Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Valentin Schneider, Linux-ARM

On Tue, 8 Dec 2020 at 14:36, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Tue, Dec 08, 2020 at 02:24:32PM +0100, Vincent Guittot wrote:
> > > > Nitpick:
> > > >
> > > > Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go
> > > > completely into the SIS_PROP if condition.
> > > >
> > >
> > > Yeah, I can do that. In the initial prototype, that happened in a
> > > separate patch that split out SIS_PROP into a helper function and I
> > > never merged it back. It's a trivial change.
> >
> > While doing this, should you also put the update of
> > this_sd->avg_scan_cost under the SIS_PROP feature?
> >
>
> It's outside the scope of the series but why not. This?
>
> --8<--
> sched/fair: Move avg_scan_cost calculations under SIS_PROP
>
> As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> check and while we are at it, exclude the cost of initialising the CPU
> mask from the average scan cost.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 19ca0265f8aa..0fee53b1aae4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6176,10 +6176,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>                 nr = 4;
>         }
>
> -       time = cpu_clock(this);

I would move it into the if (sched_feat(SIS_PROP)) above.

> -
>         cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>
> +       if (sched_feat(SIS_PROP))
> +               time = cpu_clock(this);
>         for_each_cpu_wrap(cpu, cpus, target) {
>                 if (!--nr)
>                         return -1;
> @@ -6187,8 +6187,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>                 break;
>         }
>
> -       time = cpu_clock(this) - time;
> -       update_avg(&this_sd->avg_scan_cost, time);
> +       if (sched_feat(SIS_PROP)) {
> +               time = cpu_clock(this) - time;
> +               update_avg(&this_sd->avg_scan_cost, time);
> +       }
>
>         return cpu;
> }

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
  2020-12-08 13:43                   ` Vincent Guittot
@ 2020-12-08 13:53                     ` Mel Gorman
  2020-12-08 14:47                       ` Vincent Guittot
  0 siblings, 1 reply; 22+ messages in thread
From: Mel Gorman @ 2020-12-08 13:53 UTC (permalink / raw)
To: Vincent Guittot
Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Valentin Schneider, Linux-ARM

On Tue, Dec 08, 2020 at 02:43:10PM +0100, Vincent Guittot wrote:
> On Tue, 8 Dec 2020 at 14:36, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Tue, Dec 08, 2020 at 02:24:32PM +0100, Vincent Guittot wrote:
> > > > > Nitpick:
> > > > >
> > > > > Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go
> > > > > completely into the SIS_PROP if condition.
> > > > >
> > > >
> > > > Yeah, I can do that. In the initial prototype, that happened in a
> > > > separate patch that split out SIS_PROP into a helper function and I
> > > > never merged it back. It's a trivial change.
> > >
> > > While doing this, should you also put the update of
> > > this_sd->avg_scan_cost under the SIS_PROP feature?
> > >
> >
> > It's outside the scope of the series but why not. This?
> >
> > --8<--
> > sched/fair: Move avg_scan_cost calculations under SIS_PROP
> >
> > As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> > even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> > check and while we are at it, exclude the cost of initialising the CPU
> > mask from the average scan cost.
> >
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 19ca0265f8aa..0fee53b1aae4 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6176,10 +6176,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >                 nr = 4;
> >         }
> >
> > -       time = cpu_clock(this);
>
> I would move it into the if (sched_feat(SIS_PROP)) above.
>

I considered it but made the choice to exclude the cost of cpumask_and()
from the avg_scan_cost instead. It's minor but when doing the original
prototype, I didn't think it was appropriate to count the cpumask
clearing as part of the scan cost as it's not directly related.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread
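Mel's rationale here is that the cpumask setup is a fixed overhead, while the quantity SIS_PROP later divides by should track the variable per-runqueue scan time. A rough standalone illustration of why folding a constant into the tracked average matters is below; it assumes update_avg() is a simple EWMA of the form avg += (sample - avg) / 8 and uses invented timings, so treat it as a sketch rather than a reproduction of kernel behaviour.

#include <stdio.h>
#include <stdint.h>

/* Assumed EWMA shape of update_avg(): each sample moves the average an
 * eighth of the way towards it. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
        int64_t diff = (int64_t)(sample - *avg);
        *avg += diff / 8;
}

int main(void)
{
        uint64_t cost_only = 0, cost_plus_init = 0;
        const uint64_t scan_ns = 1500;  /* variable part: walking the runqueues */
        const uint64_t init_ns = 300;   /* constant part: cpumask_and() setup */

        for (int i = 0; i < 64; i++) {
                update_avg(&cost_only, scan_ns);
                update_avg(&cost_plus_init, scan_ns + init_ns);
        }

        /* The second average settles roughly 300ns higher, inflating the
         * avg_scan_cost that SIS_PROP divides by when picking a depth. */
        printf("avg scan only: %llu\n", (unsigned long long)cost_only);
        printf("avg scan+init: %llu\n", (unsigned long long)cost_plus_init);
        return 0;
}

Whether that constant belongs in the average is exactly what the following replies debate.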
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
  2020-12-08 13:53                     ` Mel Gorman
@ 2020-12-08 14:47                       ` Vincent Guittot
  2020-12-08 15:12                         ` Mel Gorman
  0 siblings, 1 reply; 22+ messages in thread
From: Vincent Guittot @ 2020-12-08 14:47 UTC (permalink / raw)
To: Mel Gorman
Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Valentin Schneider, Linux-ARM

On Tue, 8 Dec 2020 at 14:54, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Tue, Dec 08, 2020 at 02:43:10PM +0100, Vincent Guittot wrote:
> > On Tue, 8 Dec 2020 at 14:36, Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Tue, Dec 08, 2020 at 02:24:32PM +0100, Vincent Guittot wrote:
> > > > > > Nitpick:
> > > > > >
> > > > > > Since now avg_cost and avg_idle are only used w/ SIS_PROP, they could go
> > > > > > completely into the SIS_PROP if condition.
> > > > > >
> > > > >
> > > > > Yeah, I can do that. In the initial prototype, that happened in a
> > > > > separate patch that split out SIS_PROP into a helper function and I
> > > > > never merged it back. It's a trivial change.
> > > >
> > > > While doing this, should you also put the update of
> > > > this_sd->avg_scan_cost under the SIS_PROP feature?
> > > >
> > >
> > > It's outside the scope of the series but why not. This?
> > >
> > > --8<--
> > > sched/fair: Move avg_scan_cost calculations under SIS_PROP
> > >
> > > As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> > > even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> > > check and while we are at it, exclude the cost of initialising the CPU
> > > mask from the average scan cost.
> > >
> > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 19ca0265f8aa..0fee53b1aae4 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -6176,10 +6176,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> > >                 nr = 4;
> > >         }
> > >
> > > -       time = cpu_clock(this);
> >
> > I would move it into the if (sched_feat(SIS_PROP)) above.
> >
>
> I considered it but made the choice to exclude the cost of cpumask_and()
> from the avg_scan_cost instead. It's minor but when doing the original

At the cost of less readable code.

> prototype, I didn't think it was appropriate to count the cpumask
> clearing as part of the scan cost as it's not directly related.

Hmm... I think it is, because the number of loops is directly related
to the allowed cpus.

>
> --
> Mel Gorman
> SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread
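Vincent's counter-argument is that the work inside the timed region scales with how many CPUs the task is allowed on, since the loop only visits bits that survive the cpumask_and(). The toy stand-in below, with invented masks and a made-up depth limit, shows that effect; it is not the kernel's for_each_cpu_wrap() implementation.

#include <stdio.h>
#include <stdint.h>

/* Stand-in for the scan loop: visit set bits of 'allowed' starting after
 * 'target' and wrapping around, giving up once 'nr' candidates were seen. */
static int count_scanned(uint64_t allowed, int ncpus, int target, int nr)
{
        int scanned = 0;

        for (int i = 0; i < ncpus && nr > 0; i++) {
                int cpu = (target + 1 + i) % ncpus;

                if (allowed & (1ULL << cpu)) {
                        scanned++;
                        nr--;
                }
        }
        return scanned;
}

int main(void)
{
        /* 16-CPU LLC, depth limit of 8, every CPU allowed: 8 visits */
        printf("all cpus allowed: %d runqueues visited\n",
               count_scanned(0xffff, 16, 3, 8));
        /* same limit but the task is affined to 4 CPUs: only 4 visits */
        printf("4 cpus allowed:   %d runqueues visited\n",
               count_scanned(0x00f0, 16, 3, 8));
        return 0;
}

The sketch only makes the trade-off concrete: the mask initialisation cost is fixed per call, while the number of runqueues actually walked, and therefore the timed work, varies with the allowed mask and the depth limit.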
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
  2020-12-08 14:47                       ` Vincent Guittot
@ 2020-12-08 15:12                         ` Mel Gorman
  2020-12-08 15:19                           ` Vincent Guittot
  0 siblings, 1 reply; 22+ messages in thread
From: Mel Gorman @ 2020-12-08 15:12 UTC (permalink / raw)
To: Vincent Guittot
Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Valentin Schneider, Linux-ARM

On Tue, Dec 08, 2020 at 03:47:40PM +0100, Vincent Guittot wrote:
> > I considered it but made the choice to exclude the cost of cpumask_and()
> > from the avg_scan_cost instead. It's minor but when doing the original
>
> At the cost of less readable code.
>

Slightly less readable, yes.

> > prototype, I didn't think it was appropriate to count the cpumask
> > clearing as part of the scan cost as it's not directly related.
>
> Hmm... I think it is, because the number of loops is directly related
> to the allowed cpus.
>

While that is true, the cost of initialising the map is constant and
what is most important is tracking the scan cost, which is variable.
Without SIS_AVG_CPU, the cpumask init can go before SIS_PROP without any
penalty, so is this version preferable?

--8<--
sched/fair: Move avg_scan_cost calculations under SIS_PROP

As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
check and while we are at it, exclude the cost of initialising the CPU
mask from the average scan cost.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ac7b34e7372b..5c41875aec23 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	if (!this_sd)
 		return -1;
 
+	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+
 	if (sched_feat(SIS_PROP)) {
 		u64 avg_cost, avg_idle, span_avg;
 
@@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 			nr = div_u64(span_avg, avg_cost);
 		else
 			nr = 4;
-	}
-
-	time = cpu_clock(this);
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		time = cpu_clock(this);
+	}
 
 	for_each_cpu_wrap(cpu, cpus, target) {
 		if (!--nr)
@@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 		break;
 	}
 
-	time = cpu_clock(this) - time;
-	update_avg(&this_sd->avg_scan_cost, time);
+	if (sched_feat(SIS_PROP)) {
+		time = cpu_clock(this) - time;
+		update_avg(&this_sd->avg_scan_cost, time);
+	}
 
 	return cpu;
 }

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
  2020-12-08 15:12                         ` Mel Gorman
@ 2020-12-08 15:19                           ` Vincent Guittot
  0 siblings, 0 replies; 22+ messages in thread
From: Vincent Guittot @ 2020-12-08 15:19 UTC (permalink / raw)
To: Mel Gorman
Cc: Dietmar Eggemann, LKML, Aubrey Li, Barry Song, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Valentin Schneider, Linux-ARM

On Tue, 8 Dec 2020 at 16:12, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Tue, Dec 08, 2020 at 03:47:40PM +0100, Vincent Guittot wrote:
> > > I considered it but made the choice to exclude the cost of cpumask_and()
> > > from the avg_scan_cost instead. It's minor but when doing the original
> >
> > At the cost of less readable code.
> >
>
> Slightly less readable, yes.
>
> > > prototype, I didn't think it was appropriate to count the cpumask
> > > clearing as part of the scan cost as it's not directly related.
> >
> > Hmm... I think it is, because the number of loops is directly related
> > to the allowed cpus.
> >
>
> While that is true, the cost of initialising the map is constant and
> what is most important is tracking the scan cost, which is variable.
> Without SIS_AVG_CPU, the cpumask init can go before SIS_PROP without any
> penalty, so is this version preferable?

Yes, looks good to me.

> --8<--
> sched/fair: Move avg_scan_cost calculations under SIS_PROP
>
> As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> check and while we are at it, exclude the cost of initialising the CPU
> mask from the average scan cost.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ac7b34e7372b..5c41875aec23 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>         if (!this_sd)
>                 return -1;
>
> +       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +
>         if (sched_feat(SIS_PROP)) {
>                 u64 avg_cost, avg_idle, span_avg;
>
> @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>                         nr = div_u64(span_avg, avg_cost);
>                 else
>                         nr = 4;
> -       }
> -
> -       time = cpu_clock(this);
>
> -       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +               time = cpu_clock(this);
> +       }
>
>         for_each_cpu_wrap(cpu, cpus, target) {
>                 if (!--nr)
> @@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>                 break;
>         }
>
> -       time = cpu_clock(this) - time;
> -       update_avg(&this_sd->avg_scan_cost, time);
> +       if (sched_feat(SIS_PROP)) {
> +               time = cpu_clock(this) - time;
> +               update_avg(&this_sd->avg_scan_cost, time);
> +       }
>
>         return cpu;
> }

^ permalink raw reply	[flat|nested] 22+ messages in thread
end of thread, other threads:[~2021-02-17 13:30 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman
2021-01-27 13:52 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
2021-01-27 13:52 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman
2021-01-27 13:52 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman
2021-01-27 13:52 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman
2021-02-17 13:17   ` [tip: sched/core] " tip-bot2 for Mel Gorman
2021-02-01  1:13 ` [PATCH v5 0/4] Scan for an idle sibling in a single pass Li, Aubrey
2021-02-01 13:06   ` Mel Gorman
 -- strict thread matches above, loose matches on Subject: below --
2021-01-25  8:59 [PATCH v4 " Mel Gorman
2021-01-25  8:59 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
2020-12-08 15:34 [PATCH 0/4] Reduce scanning of runqueues in select_idle_sibling Mel Gorman
2020-12-08 15:34 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
2020-12-08 16:13   ` Vincent Guittot
2020-12-07  9:15 [RFC PATCH 0/4] Reduce worst-case scanning of runqueues in select_idle_sibling Mel Gorman
2020-12-07  9:15 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
2020-12-07 15:05   ` Vincent Guittot
2020-12-08 10:07   ` Dietmar Eggemann
2020-12-08 10:59     ` Mel Gorman
2020-12-08 13:24       ` Vincent Guittot
2020-12-08 13:36         ` Mel Gorman
2020-12-08 13:43           ` Vincent Guittot
2020-12-08 13:53             ` Mel Gorman
2020-12-08 14:47               ` Vincent Guittot
2020-12-08 15:12                 ` Mel Gorman
2020-12-08 15:19                   ` Vincent Guittot