From: Tobias Huschle <huschle@linux.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
bristot@redhat.com, vschneid@redhat.com,
sshegde@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com,
linuxppc-dev@lists.ozlabs.org
Subject: [RFC 1/1] sched/fair: Consider asymmetric scheduler groups in load balancer
Date: Mon, 15 May 2023 13:46:01 +0200 [thread overview]
Message-ID: <20230515114601.12737-2-huschle@linux.ibm.com> (raw)
In-Reply-To: <20230515114601.12737-1-huschle@linux.ibm.com>
The current load balancer implementation implies that scheduler groups,
within the same domain, all host the same number of CPUs. This is
reflected in the condition, that a scheduler group, which is load
balancing and classified as having spare capacity, should pull work
from the busiest group, if the local group runs less processes than
the busiest one. This implies that these two groups should run the
same number of processes, which is problematic if the groups are not
of the same size.
The assumption that scheduler groups within the same scheduler domain
host the same number of CPUs appears to be true for non-s390
architectures. Nevertheless, s390 can have scheduler groups of unequal
size.
This introduces a performance degredation in the following scenario:
Consider a system with 8 CPUs, 6 CPUs are located on one CPU socket,
the remaining 2 are located on another socket:
Socket -----1----- -2-
CPU 1 2 3 4 5 6 7 8
Placing some workload ( x = one task ) yields the following
scenarios:
The first 5 tasks are distributed evenly across the two groups.
Socket -----1----- -2-
CPU 1 2 3 4 5 6 7 8
x x x x x
Adding a 6th task yields the following distribution:
Socket -----1----- -2-
CPU 1 2 3 4 5 6 7 8
SMT1 x x x x x
SMT2 x
The task is added to the 2nd scheduler group, as the scheduler has the
assumption that scheduler groups are of the same size, so they should
also host the same number of tasks. This makes CPU 7 run into SMT
thread, which comes with a performance penalty. This means, that in
the window of 6-8 tasks, load balancing is done suboptimally, because
SMT is used although there is no reason to do so as fully idle CPUs
are still available.
Taking the weight of the scheduler groups into account, ensures that
a load balancing CPU within a smaller group will not try to pull tasks
from a bigger group while the bigger group still has idle CPUs
available.
Signed-off-by: Tobias Huschle <huschle@linux.ibm.com>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 48b6f0ca13ac..b1307d7e4065 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10426,7 +10426,8 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
* group's child domain.
*/
if (sds.prefer_sibling && local->group_type == group_has_spare &&
- busiest->sum_nr_running > local->sum_nr_running + 1)
+ busiest->sum_nr_running * local->group_weight >
+ local->sum_nr_running * busiest->group_weight + 1)
goto force_balance;
if (busiest->group_type != group_overloaded) {
--
2.34.1
next prev parent reply other threads:[~2023-05-15 11:55 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-15 11:46 [RFC 0/1] sched/fair: Consider asymmetric scheduler groups in load balancer Tobias Huschle
2023-05-15 11:46 ` Tobias Huschle [this message]
2023-05-16 13:36 ` [RFC 1/1] " Vincent Guittot
2023-06-05 8:07 ` Tobias Huschle
2023-07-05 7:52 ` Vincent Guittot
2023-07-07 7:44 ` Tobias Huschle
2023-07-07 14:33 ` Shrikanth Hegde
2023-07-07 15:59 ` Tobias Huschle
2023-07-07 16:26 ` Shrikanth Hegde
2023-07-04 13:40 ` Peter Zijlstra
2023-07-07 7:44 ` Tobias Huschle
2023-07-06 17:19 ` Shrikanth Hegde
2023-07-07 7:45 ` Tobias Huschle
2023-05-16 16:35 ` [RFC 0/1] " Dietmar Eggemann
2023-07-04 9:11 ` Tobias Huschle
2023-07-06 11:11 ` Dietmar Eggemann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230515114601.12737-2-huschle@linux.ibm.com \
--to=huschle@linux.ibm.com \
--cc=bristot@redhat.com \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=srikar@linux.vnet.ibm.com \
--cc=sshegde@linux.vnet.ibm.com \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox