From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org,
linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Carsten Emde <C.Emde@osadl.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
John Kacur <jkacur@redhat.com>, Daniel Wagner <wagi@monom.org>,
Tom Zanussi <zanussi@kernel.org>,
"Srivatsa S. Bhat" <srivatsa@csail.mit.edu>,
stable@kernel.org,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Valentin Schneider <valentin.schneider@arm.com>,
Paul Gortmaker <paul.gortmaker@windriver.com>
Subject: [PATCH RT 1/8] sched: Fix migration_cpu_stop() requeueing
Date: Fri, 09 Jul 2021 17:59:54 -0400
Message-ID: <20210709220017.262194093@goodmis.org>
In-Reply-To: 20210709215953.122804544@goodmis.org
5.10.47-rt46-rc1 stable review patch.
If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra <peterz@infradead.org>
commit 8a6edb5257e2a84720fe78cb179eca58ba76126f upstream.
When affine_move_task(p) is called on a running task @p, which is not
otherwise already changing affinity, we'll first set
p->migration_pending and then do:
stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
This then gets us to migration_cpu_stop() running on the CPU that was
previously running our victim task @p.
If we find that our task is no longer on that runqueue (this can
happen because of a concurrent migration due to load-balance etc.),
then we'll end up at the:
} else if (dest_cpu < 0 || pending) {
branch. Which we'll take because we set pending earlier. Here we first
check if the task @p has already satisfied the affinity constraints,
if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop()
onto the CPU that is now hosting our task @p:
stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
&pending->arg, &pending->stop_work);
Except, we've never initialized pending->arg, which will be all 0s.
This then results in running migration_cpu_stop() on the next CPU with
arg->task == NULL, which gives the by now obvious result of fireworks.
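
A minimal userspace sketch of that failure mode, under simplified
assumptions: the structures below are cut-down stand-ins for the kernel
ones, and toy_stopper_has_task() and requeue_before_fix() are
hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct task { int cpu; };
struct set_affinity_pending;

struct migration_arg {
	struct task *task;
	int dest_cpu;
	struct set_affinity_pending *pending;
};

struct set_affinity_pending {
	struct migration_arg arg;
};

/* Toy stopper: reports whether it was handed a usable task pointer. */
static int toy_stopper_has_task(const struct migration_arg *arg)
{
	return arg->task != NULL;
}

/* Pre-fix flow: pending is installed, but pending->arg is never filled in. */
static int requeue_before_fix(void)
{
	struct set_affinity_pending my_pending = { 0 };

	/* The requeue path hands out &pending->arg, which is still all zeroes. */
	return toy_stopper_has_task(&my_pending.arg);
}
```

In this model requeue_before_fix() returns 0: the stopper sees a NULL
task pointer, which is the condition the real stopper would go on to
dereference.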
The cure is to change affine_move_task() to always use pending->arg,
furthermore we can use the exact same pattern as the
SCA_MIGRATE_ENABLE case, since we'll block on the pending->done
completion anyway, no point in adding yet another completion in
stop_one_cpu().
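
As a rough illustration of the cure, again with cut-down stand-in
structures, the argument can be filled in at the moment the request is
installed, so that any later requeue finds it valid. install_pending()
and requeue_after_fix() are hypothetical helpers for this sketch; in the
patch the assignment sits inline in affine_move_task().

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct task { int cpu; };
struct set_affinity_pending;

struct migration_arg {
	struct task *task;
	int dest_cpu;
	struct set_affinity_pending *pending;
};

struct set_affinity_pending {
	struct migration_arg arg;
};

/* Fill in the stopper argument when the pending request is installed. */
static void install_pending(struct set_affinity_pending *pending,
			    struct task *p)
{
	pending->arg = (struct migration_arg) {
		.task = p,
		.dest_cpu = -1,		/* any */
		.pending = pending,
	};
}

/* Returns 1 when a later requeue would see a fully initialized argument. */
static int requeue_after_fix(void)
{
	struct task t = { .cpu = 0 };
	struct set_affinity_pending my_pending = { 0 };

	install_pending(&my_pending, &t);

	return my_pending.arg.task == &t &&
	       my_pending.arg.dest_cpu == -1 &&
	       my_pending.arg.pending == &my_pending;
}
```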
This then gives a clear distinction between the two
migration_cpu_stop() use cases:
- sched_exec() / migrate_task_to() : arg->pending == NULL
- affine_move_task() : arg->pending != NULL;
And we can have it ignore p->migration_pending when !arg->pending. Any
stop work from sched_exec() / migrate_task_to() is in addition to stop
works from affine_move_task(), which will be sufficient to issue the
completion.
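
The selection rule can be sketched as a small pure function. This is an
illustrative model only: effective_pending() is a hypothetical name, and
the structure is a placeholder for the kernel's set_affinity_pending.

```c
#include <stddef.h>

/* Placeholder for the kernel structure; only its identity matters here. */
struct set_affinity_pending { int done; };

/*
 * Decide which pending request, if any, a stopper invocation should act on.
 * task_pending models p->migration_pending, arg_pending models arg->pending.
 */
static struct set_affinity_pending *
effective_pending(struct set_affinity_pending *task_pending,
		  struct set_affinity_pending *arg_pending)
{
	/* sched_exec() / migrate_task_to(): ignore any pending request. */
	if (task_pending && !arg_pending)
		return NULL;

	/*
	 * affine_move_task(): use the task's current pending request, which
	 * need not be the one the argument was created with.
	 */
	return task_pending;
}
```

With distinct requests a and b, effective_pending(&a, NULL) yields NULL,
while effective_pending(&a, &b) yields &a, mirroring the note in the
patch that pending doesn't need to match arg->pending.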
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.357743989@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
kernel/sched/core.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3d3aa9db1548..a3dea38f410a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1958,6 +1958,24 @@ static int migration_cpu_stop(void *data)
rq_lock(rq, &rf);
pending = p->migration_pending;
+ if (pending && !arg->pending) {
+ /*
+ * This happens from sched_exec() and migrate_task_to(),
+ * neither of them care about pending and just want a task to
+ * maybe move about.
+ *
+ * Even if there is a pending, we can ignore it, since
+ * affine_move_task() will have it's own stop_work's in flight
+ * which will manage the completion.
+ *
+ * Notably, pending doesn't need to match arg->pending. This can
+ * happen when tripple concurrent affine_move_task() first sets
+ * pending, then clears pending and eventually sets another
+ * pending.
+ */
+ pending = NULL;
+ }
+
/*
* If task_rq(p) != rq, it cannot be migrated here, because we're
* holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
@@ -2230,10 +2248,6 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
int dest_cpu, unsigned int flags)
{
struct set_affinity_pending my_pending = { }, *pending = NULL;
- struct migration_arg arg = {
- .task = p,
- .dest_cpu = dest_cpu,
- };
bool complete = false;
/* Can the task run on the task's current CPU? If so, we're done */
@@ -2271,6 +2285,12 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
/* Install the request */
refcount_set(&my_pending.refs, 1);
init_completion(&my_pending.done);
+ my_pending.arg = (struct migration_arg) {
+ .task = p,
+ .dest_cpu = -1, /* any */
+ .pending = &my_pending,
+ };
+
p->migration_pending = &my_pending;
} else {
pending = p->migration_pending;
@@ -2301,12 +2321,6 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
p->migration_flags &= ~MDF_PUSH;
task_rq_unlock(rq, p, rf);
- pending->arg = (struct migration_arg) {
- .task = p,
- .dest_cpu = -1,
- .pending = pending,
- };
-
stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
&pending->arg, &pending->stop_work);
@@ -2319,8 +2333,11 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
* is_migration_disabled(p) checks to the stopper, which will
* run on the same CPU as said p.
*/
+ refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
task_rq_unlock(rq, p, rf);
- stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
+
+ stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
+ &pending->arg, &pending->stop_work);
} else {
--
2.30.2