public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>,
	"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Dave Watson <davejwatson@fb.com>
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	Paul Turner <pjt@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	Andrew Hunter <ahh@google.com>, Andi Kleen <andi@firstfloor.org>,
	Chris Lameter <cl@linux.com>, Ben Maurer <bmaurer@fb.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andy Lutomirski <luto@kernel.org>,
	Maged Michael <maged.michael@gmail.com>,
	Avi Kivity <avi@scylladb.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Andrea Parri <parri.andrea@gmail.com>,
	Russell King <linux@armlinux.org.uk>,
	Greg Hackmann <ghackmann@google.com>,
	David Sehr <sehr@google.com>,
	x86@kernel.org, linux-arch@vger.kernel.org
Subject: [RFC PATCH v2 for 4.15 22/24] membarrier: x86: Provide core serializing command
Date: Tue, 14 Nov 2017 15:04:12 -0500	[thread overview]
Message-ID: <20171114200414.2188-23-mathieu.desnoyers@efficios.com> (raw)
In-Reply-To: <20171114200414.2188-1-mathieu.desnoyers@efficios.com>

There are two places where core serialization is needed by membarrier:

1) When returning from the membarrier IPI,
2) After scheduler updates curr to a thread with a different mm, before
   going back to user-space, since the curr->mm is used by membarrier to
   check whether it needs to send an IPI to that CPU.

x86-32 uses iret as return from interrupt, and both iret and sysexit to go
back to user-space. The iret instruction is core serializing, but not
sysexit.

x86-64 uses iret as return from interrupt, which takes care of the IPI.
However, it can return to user-space through either sysretl (compat
code), sysretq, or iret. Given that sysret{l,q} is not core serializing,
we rely instead on write_cr3() performed by switch_mm() to provide core
serialization after changing the current mm, and deal with the special
case of kthread -> uthread (temporarily keeping current mm into
active_mm) by adding a sync_core() in that specific case.

Use the new sync_core_before_usermode() to guarantee this.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Andy Lutomirski <luto@kernel.org>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Maged Michael <maged.michael@gmail.com>
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Dave Watson <davejwatson@fb.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andrea Parri <parri.andrea@gmail.com>
CC: Russell King <linux@armlinux.org.uk>
CC: Greg Hackmann <ghackmann@google.com>
CC: Will Deacon <will.deacon@arm.com>
CC: David Sehr <sehr@google.com>
CC: x86@kernel.org
CC: linux-arch@vger.kernel.org

---
Changes since v1:
- Use the newly introduced sync_core_before_usermode(). Move all state
  handling to generic code.
- Add linux/processor.h include to include/linux/sched/mm.h.
---
 arch/x86/Kconfig          |  1 +
 arch/x86/entry/entry_32.S |  5 +++++
 arch/x86/entry/entry_64.S |  8 ++++++++
 arch/x86/mm/tlb.c         |  7 ++++---
 include/linux/sched/mm.h  | 12 ++++++++++++
 kernel/sched/core.c       |  6 +++++-
 kernel/sched/membarrier.c |  4 ++++
 7 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 54fbb8960d94..94bdf5fc7d94 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -54,6 +54,7 @@ config X86
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_KCOV			if X86_64
+	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_PMEM_API		if X86_64
 	# Causing hangs/crashes, see the commit that added this change for details.
 	select ARCH_HAS_REFCOUNT		if BROKEN
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 4838037f97f6..04e5daba8456 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -553,6 +553,11 @@ restore_all:
 .Lrestore_nocheck:
 	RESTORE_REGS 4				# skip orig_eax/error_code
 .Lirq_return:
+	/*
+	 * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on iret core serialization
+	 * when returning from IPI handler and when returning from
+	 * scheduler to user-space.
+	 */
 	INTERRUPT_RETURN
 
 .section .fixup, "ax"
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index bcfc5668dcb2..4859f04e1695 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -642,6 +642,10 @@ GLOBAL(restore_regs_and_iret)
 restore_c_regs_and_iret:
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
+	/*
+	 * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on iret core serialization
+	 * when returning from IPI handler.
+	 */
 	INTERRUPT_RETURN
 
 ENTRY(native_iret)
@@ -1122,6 +1126,10 @@ paranoid_exit_restore:
 	RESTORE_EXTRA_REGS
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
+	/*
+	 * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on iret core serialization
+	 * when returning from IPI handler.
+	 */
 	INTERRUPT_RETURN
 END(paranoid_exit)
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 5abf9bfcca1f..3b13d6735fa5 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -147,9 +147,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	this_cpu_write(cpu_tlbstate.is_lazy, false);
 
 	/*
-	 * The membarrier system call requires a full memory barrier
-	 * before returning to user-space, after storing to rq->curr.
-	 * Writing to CR3 provides that full memory barrier.
+	 * The membarrier system call requires a full memory barrier and
+	 * core serialization before returning to user-space, after
+	 * storing to rq->curr. Writing to CR3 provides that full
+	 * memory barrier and core serializing instruction.
 	 */
 	if (real_prev == next) {
 		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index b7abb7de250f..5b763a73ae43 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -7,6 +7,7 @@
 #include <linux/sched.h>
 #include <linux/mm_types.h>
 #include <linux/gfp.h>
+#include <linux/processor.h>
 
 /*
  * Routines for handling mm_structs
@@ -235,6 +236,14 @@ enum {
 #include <asm/membarrier.h>
 #endif
 
+static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
+{
+	if (likely(!(atomic_read(&mm->membarrier_state) &
+			MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE)))
+		return;
+	sync_core_before_usermode();
+}
+
 static inline void membarrier_execve(struct task_struct *t)
 {
 	atomic_set(&t->mm->membarrier_state, 0);
@@ -249,6 +258,9 @@ static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
 static inline void membarrier_execve(struct task_struct *t)
 {
 }
+static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
+{
+}
 #endif
 
 #endif /* _LINUX_SCHED_MM_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 688b65c68731..43099a091742 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2694,9 +2694,13 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	 * thread, mmdrop()'s implicit full barrier is required by the
 	 * membarrier system call, because the current active_mm can
 	 * become the current mm without going through switch_mm().
+	 * membarrier also requires a core serializing instruction
+	 * before going back to user-space after storing to rq->curr.
 	 */
-	if (mm)
+	if (mm) {
 		mmdrop(mm);
+		membarrier_mm_sync_core_before_usermode(mm);
+	}
 	if (unlikely(prev_state == TASK_DEAD)) {
 		if (prev->sched_class->task_dead)
 			prev->sched_class->task_dead(prev);
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 72f42eac99ab..d48185974ae0 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -239,6 +239,10 @@ static int membarrier_register_private_expedited(int flags)
 		return 0;
 	atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED,
 			&mm->membarrier_state);
+	if (flags & MEMBARRIER_FLAG_SYNC_CORE) {
+		atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE,
+				&mm->membarrier_state);
+	}
 	if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
 		/*
 		 * Ensure all future scheduler executions will observe the
-- 
2.11.0

  parent reply	other threads:[~2017-11-14 20:07 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-14 20:03 [RFC PATCH for 4.15 00/24] Restartable sequences and CPU op vector v11 Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH v11 for 4.15 01/24] Restartable sequences system call Mathieu Desnoyers
2017-11-14 20:39   ` Ben Maurer
2017-11-14 20:52     ` Mathieu Desnoyers
2017-11-14 21:48       ` Ben Maurer
     [not found]   ` <CY4PR15MB168884529B3C0F8E6CC06257CF280@CY4PR15MB1688.namprd15.prod.outlook.com>
2017-11-14 20:49     ` Ben Maurer
2017-11-14 21:03       ` Mathieu Desnoyers
2017-11-16 16:18   ` Peter Zijlstra
2017-11-16 16:27     ` Mathieu Desnoyers
2017-11-16 16:32       ` Peter Zijlstra
2017-11-16 17:09         ` Mathieu Desnoyers
2017-11-16 18:43   ` Peter Zijlstra
2017-11-16 18:49     ` Mathieu Desnoyers
2017-11-16 19:06       ` Thomas Gleixner
2017-11-16 20:06         ` Mathieu Desnoyers
2017-11-16 19:14   ` Peter Zijlstra
2017-11-16 20:37     ` Mathieu Desnoyers
2017-11-16 20:46       ` Peter Zijlstra
2017-11-16 21:08   ` Thomas Gleixner
2017-11-19 17:24     ` Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH for 4.15 02/24] Restartable sequences: ARM 32 architecture support Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH for 4.15 03/24] Restartable sequences: wire up ARM 32 system call Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH for 4.15 04/24] Restartable sequences: x86 32/64 architecture support Mathieu Desnoyers
2017-11-16 21:14   ` Thomas Gleixner
2017-11-19 17:41     ` Mathieu Desnoyers
2017-11-20  8:38       ` Thomas Gleixner
2017-11-14 20:03 ` [RFC PATCH for 4.15 05/24] Restartable sequences: wire up x86 32/64 system call Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH for 4.15 06/24] Restartable sequences: powerpc architecture support Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH for 4.15 07/24] Restartable sequences: Wire up powerpc system call Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH v3 for 4.15 08/24] Provide cpu_opv " Mathieu Desnoyers
2017-11-15  1:34   ` Mathieu Desnoyers
2017-11-15  7:44   ` Michael Kerrisk (man-pages)
2017-11-15 14:30     ` Mathieu Desnoyers
2017-11-16 23:26   ` Thomas Gleixner
2017-11-17  0:14     ` Andi Kleen
2017-11-17 10:09       ` Thomas Gleixner
2017-11-17 17:14         ` Mathieu Desnoyers
2017-11-17 18:18           ` Andi Kleen
2017-11-17 18:59             ` Thomas Gleixner
2017-11-17 19:15               ` Andi Kleen
2017-11-17 20:07                 ` Thomas Gleixner
2017-11-18 21:09                   ` Andy Lutomirski
2017-11-17 20:22           ` Thomas Gleixner
2017-11-20 17:13             ` Mathieu Desnoyers
2017-11-20 16:13     ` Mathieu Desnoyers
2017-11-20 17:48       ` Thomas Gleixner
2017-11-20 18:03         ` Thomas Gleixner
2017-11-20 18:42           ` Mathieu Desnoyers
2017-11-20 18:39         ` Mathieu Desnoyers
2017-11-20 18:49           ` Andi Kleen
2017-11-20 22:46             ` Mathieu Desnoyers
2017-11-20 19:44           ` Thomas Gleixner
2017-11-21 11:25             ` Mathieu Desnoyers
2017-11-14 20:03 ` [RFC PATCH for 4.15 09/24] cpu_opv: Wire up x86 32/64 " Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH for 4.15 10/24] cpu_opv: Wire up powerpc " Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH for 4.15 11/24] cpu_opv: Wire up ARM32 " Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH v2 for 4.15 12/24] cpu_opv: Implement selftests Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH v2 for 4.15 13/24] Restartable sequences: Provide self-tests Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH for 4.15 14/24] Restartable sequences selftests: arm: workaround gcc asm size guess Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH v2 for 4.15 15/24] membarrier: selftest: Test private expedited cmd Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH v7 for 4.15 16/24] membarrier: powerpc: Skip memory barrier in switch_mm() Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH v5 for 4.15 17/24] membarrier: Document scheduler barrier requirements Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH for 4.15 18/24] membarrier: provide SHARED_EXPEDITED command Mathieu Desnoyers
2017-11-15  1:36   ` Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH for 4.15 19/24] membarrier: selftest: Test shared expedited cmd Mathieu Desnoyers
2017-11-17 15:07   ` Shuah Khan
2017-11-14 20:04 ` [RFC PATCH for 4.15 20/24] membarrier: Provide core serializing command Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH v2 for 4.15 21/24] x86: Introduce sync_core_before_usermode Mathieu Desnoyers
2017-11-14 20:04 ` Mathieu Desnoyers [this message]
2017-11-14 20:04 ` [RFC PATCH for 4.15 23/24] membarrier: selftest: Test private expedited sync core cmd Mathieu Desnoyers
2017-11-17 15:09   ` Shuah Khan
2017-11-17 16:17     ` Mathieu Desnoyers
2017-11-14 20:04 ` [RFC PATCH for 4.15 24/24] membarrier: arm64: Provide core serializing command Mathieu Desnoyers
2017-11-14 21:08 ` [RFC PATCH for 4.15 00/24] Restartable sequences and CPU op vector v11 Linus Torvalds
2017-11-14 21:15   ` Andy Lutomirski
2017-11-14 21:32     ` Paul Turner
2018-03-27 18:15       ` Mathieu Desnoyers
2017-11-14 21:32     ` Mathieu Desnoyers
2017-11-15  4:12       ` Andy Lutomirski
2017-11-15  6:34         ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171114200414.2188-23-mathieu.desnoyers@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=ahh@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=avi@scylladb.com \
    --cc=benh@kernel.crashing.org \
    --cc=bmaurer@fb.com \
    --cc=boqun.feng@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=cl@linux.com \
    --cc=davejwatson@fb.com \
    --cc=ghackmann@google.com \
    --cc=hpa@zytor.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=linux@armlinux.org.uk \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=maged.michael@gmail.com \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=mtk.manpages@gmail.com \
    --cc=parri.andrea@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sehr@google.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox