From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>,
"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
Boqun Feng <boqun.feng@gmail.com>,
Andy Lutomirski <luto@amacapital.net>,
Dave Watson <davejwatson@fb.com>
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
Paul Turner <pjt@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Russell King <linux@arm.linux.org.uk>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
Andrew Hunter <ahh@google.com>, Andi Kleen <andi@firstfloor.org>,
Chris Lameter <cl@linux.com>, Ben Maurer <bmaurer@fb.com>,
Steven Rostedt <rostedt@goodmis.org>,
Josh Triplett <josh@joshtriplett.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Michael Kerrisk <mtk.manpages@gmail.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Maged Michael <maged.michael@gmail.com>, Avi Kivity <avi@scyllad>
Subject: [RFC PATCH for 4.15 v5 20/22] membarrier: Document scheduler barrier requirements
Date: Tue, 21 Nov 2017 09:18:58 -0500 [thread overview]
Message-ID: <20171121141900.18471-21-mathieu.desnoyers@efficios.com> (raw)
In-Reply-To: <20171121141900.18471-1-mathieu.desnoyers@efficios.com>
Document the membarrier requirement on having a full memory barrier in
__schedule() after coming from user-space, before storing to rq->curr.
It is provided by smp_mb__after_spinlock() in __schedule().
Document that membarrier requires a full barrier on transition from
kernel thread to userspace thread. We currently have an implicit barrier
from atomic_dec_and_test() in mmdrop() that ensures this.
The x86 switch_mm_irqs_off() full barrier is currently provided by many
cpumask update operations as well as write_cr3(). Document that
write_cr3() provides this barrier.
Changes since v1:
- Update comments to match reality for code paths which are after
storing to rq->curr, before returning to user-space, based on feedback
from Andrea Parri.
Changes since v2:
- Update changelog (smp_mb__before_spinlock -> smp_mb__after_spinlock).
Based on feedback from Andrea Parri.
Changes since v3:
- Clarify comments following feeback from Peter Zijlstra.
Changes since v4:
- Update comment regarding powerpc barrier.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Maged Michael <maged.michael@gmail.com>
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Dave Watson <davejwatson@fb.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andrea Parri <parri.andrea@gmail.com>
CC: x86@kernel.org
---
arch/x86/mm/tlb.c | 5 +++++
include/linux/sched/mm.h | 5 +++++
kernel/sched/core.c | 37 ++++++++++++++++++++++++++-----------
3 files changed, 36 insertions(+), 11 deletions(-)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 3118392cdf75..5abf9bfcca1f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -146,6 +146,11 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
#endif
this_cpu_write(cpu_tlbstate.is_lazy, false);
+ /*
+ * The membarrier system call requires a full memory barrier
+ * before returning to user-space, after storing to rq->curr.
+ * Writing to CR3 provides that full memory barrier.
+ */
if (real_prev == next) {
VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
next->context.ctx_id);
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 7077253d0df4..0f9e1a96b890 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -39,6 +39,11 @@ static inline void mmgrab(struct mm_struct *mm)
extern void __mmdrop(struct mm_struct *);
static inline void mmdrop(struct mm_struct *mm)
{
+ /*
+ * The implicit full barrier implied by atomic_dec_and_test is
+ * required by the membarrier system call before returning to
+ * user-space, after storing to rq->curr.
+ */
if (unlikely(atomic_dec_and_test(&mm->mm_count)))
__mmdrop(mm);
}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 55cc426ff46e..0151eb1fcd5b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2697,6 +2697,12 @@ static struct rq *finish_task_switch(struct task_struct *prev)
finish_arch_post_lock_switch();
fire_sched_in_preempt_notifiers(current);
+ /*
+ * When transitioning from a kernel thread to a userspace
+ * thread, mmdrop()'s implicit full barrier is required by the
+ * membarrier system call, because the current active_mm can
+ * become the current mm without going through switch_mm().
+ */
if (mm)
mmdrop(mm);
if (unlikely(prev_state == TASK_DEAD)) {
@@ -2802,6 +2808,13 @@ context_switch(struct rq *rq, struct task_struct *prev,
*/
arch_start_context_switch(prev);
+ /*
+ * If mm is non-NULL, we pass through switch_mm(). If mm is
+ * NULL, we will pass through mmdrop() in finish_task_switch().
+ * Both of these contain the full memory barrier required by
+ * membarrier after storing to rq->curr, before returning to
+ * user-space.
+ */
if (!mm) {
next->active_mm = oldmm;
mmgrab(oldmm);
@@ -3338,6 +3351,9 @@ static void __sched notrace __schedule(bool preempt)
* Make sure that signal_pending_state()->signal_pending() below
* can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
* done by the caller to avoid the race with signal_wake_up().
+ *
+ * The membarrier system call requires a full memory barrier
+ * after coming from user-space, before storing to rq->curr.
*/
rq_lock(rq, &rf);
smp_mb__after_spinlock();
@@ -3386,17 +3402,16 @@ static void __sched notrace __schedule(bool preempt)
/*
* The membarrier system call requires each architecture
* to have a full memory barrier after updating
- * rq->curr, before returning to user-space. For TSO
- * (e.g. x86), the architecture must provide its own
- * barrier in switch_mm(). For weakly ordered machines
- * for which spin_unlock() acts as a full memory
- * barrier, finish_lock_switch() in common code takes
- * care of this barrier. For weakly ordered machines for
- * which spin_unlock() acts as a RELEASE barrier (only
- * arm64 and PowerPC), arm64 has a full barrier in
- * switch_to(), and PowerPC has
- * smp_mb__after_unlock_lock() before
- * finish_lock_switch().
+ * rq->curr, before returning to user-space.
+ *
+ * Here are the schemes providing that barrier on the
+ * various architectures:
+ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
+ * switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
+ * - finish_lock_switch() for weakly-ordered
+ * architectures where spin_unlock is a full barrier,
+ * - switch_to() for arm64 (weakly-ordered, spin_unlock
+ * is a RELEASE barrier),
*/
++*switch_count;
--
2.11.0
next prev parent reply other threads:[~2017-11-21 14:18 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-21 14:18 [RFC PATCH for 4.15 v12 00/22] Restartable sequences and CPU op vector Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 01/22] uapi headers: Provide types_32_64.h Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 v12 02/22] rseq: Introduce restartable sequences system call Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 03/22] arm: Add restartable sequences support Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 04/22] arm: Wire up restartable sequences system call Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 07/22] powerpc: Add support for restartable sequences Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 08/22] powerpc: Wire up restartable sequences system call Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 09/22] sched: Implement push_task_to_cpu Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 v4 10/22] cpu_opv: Provide cpu_opv system call Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 11/22] x86: Wire up " Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 12/22] powerpc: " Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 13/22] arm: " Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 v3 14/22] cpu_opv: selftests: Implement selftests Mathieu Desnoyers
[not found] ` <20171121141900.18471-15-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 15:17 ` Shuah Khan
[not found] ` <c311cd4b-4d8f-bfb5-3731-2cb2eefa4865-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-11-21 16:46 ` Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 v3 15/22] rseq: selftests: Provide self-tests Mathieu Desnoyers
[not found] ` <20171121141900.18471-16-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 15:34 ` Shuah Khan
2017-11-21 17:05 ` Mathieu Desnoyers
2017-11-21 17:40 ` Shuah Khan
2017-11-21 21:22 ` Mathieu Desnoyers
[not found] ` <911090321.19621.1511299340968.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 21:24 ` Shuah Khan
[not found] ` <d059d29d-83a7-f16d-dc84-05a4609a3479-JPH+aEBZ4P+UEJcrhfAQsw@public.gmane.org>
2017-11-21 21:44 ` Mathieu Desnoyers
2017-11-22 19:38 ` Peter Zijlstra
[not found] ` <20171122193821.GJ3165-IIpfhp3q70x9+YH6RuovlLjjLBE8jN/0@public.gmane.org>
2017-11-23 21:16 ` Mathieu Desnoyers
2017-11-22 21:48 ` Peter Zijlstra
[not found] ` <20171122214813.GK3165-IIpfhp3q70x9+YH6RuovlLjjLBE8jN/0@public.gmane.org>
2017-11-23 22:53 ` Mathieu Desnoyers
2017-11-23 8:55 ` Peter Zijlstra
[not found] ` <20171123085511.ohwuobf5v32vggho-Nxj+rRp3nVydTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
2017-11-23 8:57 ` Peter Zijlstra
[not found] ` <20171123085721.bzburngdsatgewjo-Nxj+rRp3nVydTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
2017-11-24 14:15 ` Mathieu Desnoyers
2017-11-24 13:55 ` Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 16/22] rseq: selftests: arm: workaround gcc asm size guess Mathieu Desnoyers
[not found] ` <20171121141900.18471-17-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 15:39 ` Shuah Khan
2017-11-21 14:18 ` [RFC PATCH for 4.15 17/22] Fix: membarrier: add missing preempt off around smp_call_function_many Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 18/22] membarrier: selftest: Test private expedited cmd Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 v7 19/22] powerpc: membarrier: Skip memory barrier in switch_mm() Mathieu Desnoyers
2017-11-21 14:18 ` Mathieu Desnoyers [this message]
2017-11-21 14:18 ` [RFC PATCH for 4.15 v2 21/22] membarrier: provide SHARED_EXPEDITED command Mathieu Desnoyers
2017-11-21 14:19 ` [RFC PATCH for 4.15 22/22] membarrier: selftest: Test shared expedited cmd Mathieu Desnoyers
2017-11-21 17:21 ` [RFC PATCH for 4.15 v12 00/22] Restartable sequences and CPU op vector Andi Kleen
[not found] ` <20171121172144.GL2482-1g7Xle2YJi4/4alezvVtWx2eb7JE58TQ@public.gmane.org>
2017-11-21 22:05 ` Mathieu Desnoyers
[not found] ` <740195164.19702.1511301908907.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 22:59 ` Thomas Gleixner
2017-11-22 12:36 ` Mathieu Desnoyers
[not found] ` <809252084.19901.1511354219731.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-22 15:25 ` Thomas Gleixner
2017-11-22 19:32 ` Peter Zijlstra
2017-11-22 19:37 ` Will Deacon
[not found] ` <20171122193734.GO22648-5wv7dgnIgG8@public.gmane.org>
2017-11-23 21:15 ` Mathieu Desnoyers
[not found] ` <165707648.21250.1511471721845.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-23 22:51 ` Thomas Gleixner
2017-11-23 23:01 ` Mathieu Desnoyers
[not found] ` <739755311.21380.1511478071883.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-23 23:38 ` Thomas Gleixner
2017-11-24 0:04 ` Mathieu Desnoyers
[not found] ` <1589826840.21401.1511481874141.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-24 14:47 ` Thomas Gleixner
2017-11-23 21:13 ` Mathieu Desnoyers
2017-11-23 21:49 ` Andi Kleen
2017-11-22 15:28 ` Andy Lutomirski
[not found] ` <CALCETrX84s-WEa5osP0t6CKh8C4Wj7-ARNMpaSp8Eop1_2ycLA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-11-22 16:43 ` Mathieu Desnoyers
[not found] ` <718035530.20074.1511369023901.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-22 18:10 ` Andi Kleen
[not found] ` <20171121141900.18471-1-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 14:18 ` [RFC PATCH for 4.15 05/22] x86: Add support for restartable sequences Mathieu Desnoyers
2017-11-21 14:18 ` [RFC PATCH for 4.15 06/22] x86: Wire up restartable sequence system call Mathieu Desnoyers
2017-11-21 22:19 ` [PATCH update for 4.15 1/3] selftests: lib.mk: Introduce OVERRIDE_TARGETS Mathieu Desnoyers
[not found] ` <20171121221933.25959-1-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-21 22:19 ` [PATCH update for 4.15 2/3] cpu_opv: selftests: Implement selftests (v4) Mathieu Desnoyers
[not found] ` <20171121221933.25959-2-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-22 15:20 ` Shuah Khan
2017-11-21 22:22 ` [PATCH update for 4.15 1/3] selftests: lib.mk: Introduce OVERRIDE_TARGETS Mathieu Desnoyers
2017-11-22 15:16 ` Shuah Khan
2017-11-21 22:19 ` [PATCH update for 4.15 3/3] rseq: selftests: Provide self-tests (v4) Mathieu Desnoyers
[not found] ` <20171121221933.25959-3-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2017-11-22 15:23 ` Shuah Khan
2017-11-22 16:31 ` Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171121141900.18471-21-mathieu.desnoyers@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=ahh@google.com \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=avi@scyllad \
--cc=bmaurer@fb.com \
--cc=boqun.feng@gmail.com \
--cc=catalin.marinas@arm.com \
--cc=cl@linux.com \
--cc=davejwatson@fb.com \
--cc=hpa@zytor.com \
--cc=josh@joshtriplett.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@arm.linux.org.uk \
--cc=luto@amacapital.net \
--cc=maged.michael@gmail.com \
--cc=mingo@redhat.com \
--cc=mtk.manpages@gmail.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).