From: Andrea Arcangeli <aarcange@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@techsingularity.net>
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH 1/1] sched/fair: skip select_idle_sibling() in presence of sync wakeups
Date: Tue, 8 Jan 2019 22:49:41 -0500
Message-ID: <20190109034941.28759-2-aarcange@redhat.com>
In-Reply-To: <20190109034941.28759-1-aarcange@redhat.com>
__wake_up_sync() gives the scheduler a very explicit hint that the
current task will go to sleep immediately and won't run again
until after the woken task has started running.
This is common behavior for message passing through pipes or local
sockets (AF_UNIX or through the loopback interface).
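As an illustration of the message-passing pattern described above, here is a minimal user-space sketch: a parent and child bounce one byte over a pair of pipes, so each write is immediately followed by a blocking read. The function name ping_pong() is hypothetical, and whether a given kernel issues a sync wakeup on this exact path is assumed from the description above, not verified here.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between parent and child over two pipes.
 * Returns the number of completed round trips, or -1 on setup failure.
 * Hypothetical helper for illustration only.
 */
int ping_pong(int rounds)
{
	int to_child[2], to_parent[2];
	char token = 'x';
	int done = 0;
	pid_t pid;

	assert(rounds >= 0);

	if (pipe(to_child) < 0)
		return -1;
	if (pipe(to_parent) < 0) {
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(to_child[0]);
		close(to_child[1]);
		close(to_parent[0]);
		close(to_parent[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: echo every byte back, leave the loop on EOF. */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &token, 1) == 1) {
			if (write(to_parent[1], &token, 1) != 1)
				break;
		}
		_exit(0);
	}

	/* Parent: keep only the ends it uses. */
	close(to_child[0]);
	close(to_parent[1]);

	while (done < rounds) {
		/* The write wakes the child; the read then blocks the parent. */
		if (write(to_child[1], &token, 1) != 1)
			break;
		if (read(to_parent[0], &token, 1) != 1)
			break;
		done++;
	}

	close(to_child[1]);	/* EOF terminates the child's loop */
	close(to_parent[0]);
	waitpid(pid, NULL, 0);

	return done;
}
```

Calling ping_pong() with a large round count from a small main() and watching per-CPU utilization is one way to observe whether the two tasks share a CPU or spread across two.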
The scheduler does everything right up to the point where it calls
select_idle_sibling(). Up to that point the CPU selected for the
task that got the sync wakeup could very well be the local CPU. That way
the sync-woken task starts running immediately after the current
task goes to sleep, without requiring a remote CPU wakeup.
However, when select_idle_sibling() is called (especially with
SCHED_MC=y) and there is at least one idle core in the same package, the
sync-woken task is forcibly moved to a different idle
core, discarding the "sync" information and all the work done up
to that point.
Without this patch, such a workload leaves two different
CPUs at ~50% utilization each, and the __wake_up_sync() hint provides
little benefit compared to a regular non-sync
wakeup. With this patch a single CPU is used at 100%
utilization, which improves performance for these common workloads.
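The early-return condition this patch adds to select_idle_sibling() can be modeled in user space roughly as follows. This is only a sketch: struct cpu_model and keeps_target() are hypothetical names that stand in for available_idle_cpu() and cpu_rq(cpu)->nr_running, and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU runqueue state; not a kernel type. */
struct cpu_model {
	bool idle;			/* models available_idle_cpu(cpu) */
	unsigned int nr_running;	/* models cpu_rq(cpu)->nr_running */
};

/*
 * Returns true when the patched check would hand back 'target' right away
 * instead of going on to search for another idle CPU.
 */
bool keeps_target(const struct cpu_model *cpus, int ncpus,
		  int this_cpu, int target, bool sync)
{
	assert(target >= 0 && target < ncpus);

	/* Unchanged behavior: an idle target is always taken. */
	if (cpus[target].idle)
		return true;

	/*
	 * New behavior: on a sync wakeup aimed at the waker's own CPU,
	 * stay there if the waker is the only runnable task on it.
	 * this_cpu may be -1 (as passed by the NUMA caller in the patch),
	 * which never matches a valid target, so the index is not used.
	 */
	return sync && target == this_cpu && cpus[this_cpu].nr_running == 1;
}
```

With a busy CPU 0 running only the waker and an idle CPU 1, keeps_target() returns true for a sync wakeup targeting CPU 0 and false for a non-sync one, which is the difference between one CPU at 100% and two CPUs at ~50% described above.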
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
kernel/sched/fair.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d1907506318a..b2ac152a6935 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -691,7 +691,8 @@ static u64 sched_vslice(struct cfs_rq *cfs_rq, struct sched_entity *se)
#include "pelt.h"
#include "sched-pelt.h"
-static int select_idle_sibling(struct task_struct *p, int prev_cpu, int cpu);
+static int select_idle_sibling(struct task_struct *p, int prev_cpu,
+ int cpu, int target, int sync);
static unsigned long task_h_load(struct task_struct *p);
static unsigned long capacity_of(int cpu);
@@ -1678,7 +1679,7 @@ static void task_numa_compare(struct task_numa_env *env,
*/
local_irq_disable();
env->dst_cpu = select_idle_sibling(env->p, env->src_cpu,
- env->dst_cpu);
+ -1, env->dst_cpu, 0);
local_irq_enable();
}
@@ -6161,12 +6162,14 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
/*
* Try and locate an idle core/thread in the LLC cache domain.
*/
-static int select_idle_sibling(struct task_struct *p, int prev, int target)
+static int select_idle_sibling(struct task_struct *p, int prev, int this_cpu,
+ int target, int sync)
{
struct sched_domain *sd;
int i, recent_used_cpu;
- if (available_idle_cpu(target))
+ if (available_idle_cpu(target) ||
+ (sync && target == this_cpu && cpu_rq(this_cpu)->nr_running == 1))
return target;
/*
@@ -6649,7 +6652,7 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
} else if (sd_flag & SD_BALANCE_WAKE) { /* XXX always ? */
/* Fast path */
- new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
+ new_cpu = select_idle_sibling(p, prev_cpu, cpu, new_cpu, sync);
if (want_affine)
current->recent_used_cpu = cpu;