linux-arch.vger.kernel.org archive mirror
  • * [PATCH for 4.16 02/11] powerpc: membarrier: Skip memory barrier in switch_mm() (v7)
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: Maged Michael, Dave Watson, Will Deacon, Russell King, David Sehr,
    	Paul Mackerras, H . Peter Anvin, linux-arch, x86, Andrew Hunter,
    	Greg Hackmann, Alan Stern, Paul E . McKenney, Andrea Parri,
    	Avi Kivity, Boqun Feng, linuxppc-dev, Nicholas Piggin,
    	Mathieu Desnoyers, Alexander Viro, Andy Lutomirski, linux-api,
    	linux-kernel, Linus Torvalds
    
    Allow PowerPC to skip the full memory barrier in switch_mm(), and
    only issue the barrier when scheduling into a task belonging to a
    process that has registered to use the private expedited command.
    
    Threads targeting the same VM but belonging to different thread
    groups are a tricky case, with a few consequences:
    
    It turns out that we cannot rely on get_nr_threads(p) to count the
    number of threads using a VM. We can use
    (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)
    instead to skip the synchronize_sched() for cases where the VM only has
    a single user, and that user only has a single thread.
    
    It also turns out that we cannot use for_each_thread() to set
    thread flags in all threads using a VM, as it only iterates on the
    thread group.
    
    Therefore, test the membarrier state variable directly rather than
    relying on thread flags. This means
    membarrier_register_private_expedited() needs to set the
    MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and
    only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows
    private expedited membarrier commands to succeed.
    membarrier_arch_switch_mm() now tests for the
    MEMBARRIER_STATE_PRIVATE_EXPEDITED flag.
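
    For illustration, a minimal user-space sketch of the resulting usage
    model (hypothetical example, not part of this patch; sys_membarrier()
    is the same thin syscall wrapper the membarrier selftests define):

    	#include <linux/membarrier.h>
    	#include <sys/syscall.h>
    	#include <unistd.h>

    	static int sys_membarrier(int cmd, int flags)
    	{
    		return syscall(__NR_membarrier, cmd, flags);
    	}

    	int main(void)
    	{
    		/* Register intent; once this returns, switch_mm() on
    		 * powerpc issues the full barrier for this mm. */
    		if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0))
    			return 1;
    		/* Expedited barrier on all running threads of this process. */
    		if (sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0))
    			return 1;
    		return 0;
    	}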
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Alan Stern <stern@rowland.harvard.edu>
    CC: Will Deacon <will.deacon@arm.com>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Ingo Molnar <mingo@redhat.com>
    CC: Alexander Viro <viro@zeniv.linux.org.uk>
    CC: Nicholas Piggin <npiggin@gmail.com>
    CC: linuxppc-dev@lists.ozlabs.org
    CC: linux-arch@vger.kernel.org
    ---
    Changes since v1:
    - Use test_ti_thread_flag(next, ...) instead of test_thread_flag() in
      powerpc membarrier_arch_sched_in(), given that we want to specifically
      check the next thread state.
    - Add missing ARCH_HAS_MEMBARRIER_HOOKS in Kconfig.
    - Use task_thread_info() to pass thread_info from task to
      *_ti_thread_flag().
    
    Changes since v2:
    - Move membarrier_arch_sched_in() call to finish_task_switch().
    - Check for NULL t->mm in membarrier_arch_fork().
    - Use membarrier_sched_in() in generic code, which invokes the
      arch-specific membarrier_arch_sched_in(). This fixes allnoconfig
      build on PowerPC.
    - Move asm/membarrier.h include under CONFIG_MEMBARRIER, fixing
      allnoconfig build on PowerPC.
    - Build and runtime tested on PowerPC.
    
    Changes since v3:
    - Simply rely on copy_mm() to copy the membarrier_private_expedited mm
      field on fork.
    - powerpc: test thread flag instead of reading
      membarrier_private_expedited in membarrier_arch_fork().
    - powerpc: skip memory barrier in membarrier_arch_sched_in() if coming
      from kernel thread, since mmdrop() implies a full barrier.
    - Set membarrier_private_expedited to 1 only after arch registration
      code, thus eliminating a race where concurrent commands could succeed
      when they should fail if issued concurrently with process
      registration.
    - Use READ_ONCE() for membarrier_private_expedited field access in
      membarrier_private_expedited. Matches WRITE_ONCE() performed in
      process registration.
    
    Changes since v4:
    - Move powerpc hook from sched_in() to switch_mm(), based on feedback
      from Nicholas Piggin.
    
    Changes since v5:
    - Rebase on v4.14-rc6.
    - Fold "Fix: membarrier: Handle CLONE_VM + !CLONE_THREAD correctly on
      powerpc (v2)"
    
    Changes since v6:
    - Rename MEMBARRIER_STATE_SWITCH_MM to MEMBARRIER_STATE_PRIVATE_EXPEDITED.
    ---
     MAINTAINERS                           |  1 +
     arch/powerpc/Kconfig                  |  1 +
     arch/powerpc/include/asm/membarrier.h | 26 ++++++++++++++++++++++++++
     arch/powerpc/mm/mmu_context.c         |  7 +++++++
     include/linux/sched/mm.h              | 13 ++++++++++++-
     init/Kconfig                          |  3 +++
     kernel/sched/core.c                   | 10 ----------
     kernel/sched/membarrier.c             |  8 ++++++++
     8 files changed, 58 insertions(+), 11 deletions(-)
     create mode 100644 arch/powerpc/include/asm/membarrier.h
    
    diff --git a/MAINTAINERS b/MAINTAINERS
    index 18994806e441..c2f0d9a48a10 100644
    --- a/MAINTAINERS
    +++ b/MAINTAINERS
    @@ -8931,6 +8931,7 @@ L:	linux-kernel@vger.kernel.org
     S:	Supported
     F:	kernel/sched/membarrier.c
     F:	include/uapi/linux/membarrier.h
    +F:	arch/powerpc/include/asm/membarrier.h
     
     MEMORY MANAGEMENT
     L:	linux-mm@kvack.org
    diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
    index c51e6ce42e7a..a63adb082c0a 100644
    --- a/arch/powerpc/Kconfig
    +++ b/arch/powerpc/Kconfig
    @@ -140,6 +140,7 @@ config PPC
     	select ARCH_HAS_FORTIFY_SOURCE
     	select ARCH_HAS_GCOV_PROFILE_ALL
     	select ARCH_HAS_PMEM_API                if PPC64
    +	select ARCH_HAS_MEMBARRIER_HOOKS
     	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE
     	select ARCH_HAS_SG_CHAIN
     	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
    diff --git a/arch/powerpc/include/asm/membarrier.h b/arch/powerpc/include/asm/membarrier.h
    new file mode 100644
    index 000000000000..98ff4f1fcf2b
    --- /dev/null
    +++ b/arch/powerpc/include/asm/membarrier.h
    @@ -0,0 +1,26 @@
    +#ifndef _ASM_POWERPC_MEMBARRIER_H
    +#define _ASM_POWERPC_MEMBARRIER_H
    +
    +static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
    +					     struct mm_struct *next,
    +					     struct task_struct *tsk)
    +{
    +	/*
    +	 * Only need the full barrier when switching between processes.
    +	 * Barrier when switching from kernel to userspace is not
    +	 * required here, given that it is implied by mmdrop(). Barrier
    +	 * when switching from userspace to kernel is not needed after
    +	 * store to rq->curr.
    +	 */
    +	if (likely(!(atomic_read(&next->membarrier_state) &
    +		     MEMBARRIER_STATE_PRIVATE_EXPEDITED) || !prev))
    +		return;
    +
    +	/*
    +	 * The membarrier system call requires a full memory barrier
    +	 * after storing to rq->curr, before going back to user-space.
    +	 */
    +	smp_mb();
    +}
    +
    +#endif /* _ASM_POWERPC_MEMBARRIER_H */
    diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c
    index d60a62bf4fc7..0ab297c4cfad 100644
    --- a/arch/powerpc/mm/mmu_context.c
    +++ b/arch/powerpc/mm/mmu_context.c
    @@ -12,6 +12,7 @@
     
     #include <linux/mm.h>
     #include <linux/cpu.h>
    +#include <linux/sched/mm.h>
     
     #include <asm/mmu_context.h>
     
    @@ -58,6 +59,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
     		 *
     		 * On the read side the barrier is in pte_xchg(), which orders
     		 * the store to the PTE vs the load of mm_cpumask.
    +		 *
    +		 * This full barrier is needed by membarrier when switching
    +		 * between processes after store to rq->curr, before user-space
    +		 * memory accesses.
     		 */
     		smp_mb();
     
    @@ -80,6 +85,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
     
     	if (new_on_cpu)
     		radix_kvm_prefetch_workaround(next);
    +	else
    +		membarrier_arch_switch_mm(prev, next, tsk);
     
     	/*
     	 * The actual HW switching method differs between the various
    diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
    index 3d49b91b674d..1754396795f6 100644
    --- a/include/linux/sched/mm.h
    +++ b/include/linux/sched/mm.h
    @@ -215,14 +215,25 @@ static inline void memalloc_noreclaim_restore(unsigned int flags)
     #ifdef CONFIG_MEMBARRIER
     enum {
     	MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY	= (1U << 0),
    -	MEMBARRIER_STATE_SWITCH_MM			= (1U << 1),
    +	MEMBARRIER_STATE_PRIVATE_EXPEDITED		= (1U << 1),
     };
     
    +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_HOOKS
    +#include <asm/membarrier.h>
    +#endif
    +
     static inline void membarrier_execve(struct task_struct *t)
     {
     	atomic_set(&t->mm->membarrier_state, 0);
     }
     #else
    +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_HOOKS
    +static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
    +					     struct mm_struct *next,
    +					     struct task_struct *tsk)
    +{
    +}
    +#endif
     static inline void membarrier_execve(struct task_struct *t)
     {
     }
    diff --git a/init/Kconfig b/init/Kconfig
    index a9a2e2c86671..2d118b6adee2 100644
    --- a/init/Kconfig
    +++ b/init/Kconfig
    @@ -1412,6 +1412,9 @@ config USERFAULTFD
     	  Enable the userfaultfd() system call that allows to intercept and
     	  handle page faults in userland.
     
    +config ARCH_HAS_MEMBARRIER_HOOKS
    +	bool
    +
     config EMBEDDED
     	bool "Embedded system"
     	option allnoconfig_y
    diff --git a/kernel/sched/core.c b/kernel/sched/core.c
    index 644fa2e3d993..524b705892db 100644
    --- a/kernel/sched/core.c
    +++ b/kernel/sched/core.c
    @@ -2653,16 +2653,6 @@ static struct rq *finish_task_switch(struct task_struct *prev)
     	prev_state = prev->state;
     	vtime_task_switch(prev);
     	perf_event_task_sched_in(prev, current);
    -	/*
    -	 * The membarrier system call requires a full memory barrier
    -	 * after storing to rq->curr, before going back to user-space.
    -	 *
    -	 * TODO: This smp_mb__after_unlock_lock can go away if PPC end
    -	 * up adding a full barrier to switch_mm(), or we should figure
    -	 * out if a smp_mb__after_unlock_lock is really the proper API
    -	 * to use.
    -	 */
    -	smp_mb__after_unlock_lock();
     	finish_lock_switch(rq, prev);
     	finish_arch_post_lock_switch();
     
    diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
    index 9bcbacba82a8..678577267a9a 100644
    --- a/kernel/sched/membarrier.c
    +++ b/kernel/sched/membarrier.c
    @@ -118,6 +118,14 @@ static void membarrier_register_private_expedited(void)
     	if (atomic_read(&mm->membarrier_state)
     			& MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY)
     		return;
    +	atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED, &mm->membarrier_state);
    +	if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
    +		/*
    +		 * Ensure all future scheduler executions will observe the
    +		 * new thread flag state for this process.
    +		 */
    +		synchronize_sched();
    +	}
     	atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY,
     			&mm->membarrier_state);
     }
    -- 
    2.11.0
    
  • * [PATCH for 4.16 05/11] membarrier: selftest: Test global expedited cmd (v2)
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: linux-kernel, linux-api, Andy Lutomirski, Paul E . McKenney,
    	Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
    	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    	Dave Watson, H . Peter Anvin, Andrea Parri, Russell King,
    	Greg Hackmann, Will Deacon, David Sehr, Linus Torvalds, x86,
    	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    
    Test the new MEMBARRIER_CMD_GLOBAL_EXPEDITED and
    MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED commands.
    
    Adapt to the MEMBARRIER_CMD_SHARED -> MEMBARRIER_CMD_GLOBAL rename.
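
    For reference, a hedged sketch of how a caller can probe for the
    renamed command before using it, reusing the selftest's
    sys_membarrier() wrapper (illustrative only; MEMBARRIER_CMD_QUERY
    returns a bitmask of the supported commands):

    	int ret = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);

    	if (ret >= 0 && (ret & MEMBARRIER_CMD_GLOBAL))	/* was CMD_SHARED */
    		sys_membarrier(MEMBARRIER_CMD_GLOBAL, 0);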
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
    CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Alan Stern <stern@rowland.harvard.edu>
    CC: Will Deacon <will.deacon@arm.com>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Alice Ferrazzi <alice.ferrazzi@gmail.com>
    CC: Paul Elder <paul.elder@pitt.edu>
    CC: linux-kselftest@vger.kernel.org
    CC: linux-arch@vger.kernel.org
    ---
    Changes since v1:
    - Rename SHARED to GLOBAL.
    ---
     .../testing/selftests/membarrier/membarrier_test.c | 59 ++++++++++++++++++++--
     1 file changed, 54 insertions(+), 5 deletions(-)
    
    diff --git a/tools/testing/selftests/membarrier/membarrier_test.c b/tools/testing/selftests/membarrier/membarrier_test.c
    index e6ee73d01fa1..eec39a0e1f77 100644
    --- a/tools/testing/selftests/membarrier/membarrier_test.c
    +++ b/tools/testing/selftests/membarrier/membarrier_test.c
    @@ -59,10 +59,10 @@ static int test_membarrier_flags_fail(void)
     	return 0;
     }
     
    -static int test_membarrier_shared_success(void)
    +static int test_membarrier_global_success(void)
     {
    -	int cmd = MEMBARRIER_CMD_SHARED, flags = 0;
    -	const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED";
    +	int cmd = MEMBARRIER_CMD_GLOBAL, flags = 0;
    +	const char *test_name = "sys membarrier MEMBARRIER_CMD_GLOBAL";
     
     	if (sys_membarrier(cmd, flags) != 0) {
     		ksft_exit_fail_msg(
    @@ -132,6 +132,40 @@ static int test_membarrier_private_expedited_success(void)
     	return 0;
     }
     
    +static int test_membarrier_register_global_expedited_success(void)
    +{
    +	int cmd = MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED, flags = 0;
    +	const char *test_name = "sys membarrier MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED";
    +
    +	if (sys_membarrier(cmd, flags) != 0) {
    +		ksft_exit_fail_msg(
    +			"%s test: flags = %d, errno = %d\n",
    +			test_name, flags, errno);
    +	}
    +
    +	ksft_test_result_pass(
    +		"%s test: flags = %d\n",
    +		test_name, flags);
    +	return 0;
    +}
    +
    +static int test_membarrier_global_expedited_success(void)
    +{
    +	int cmd = MEMBARRIER_CMD_GLOBAL_EXPEDITED, flags = 0;
    +	const char *test_name = "sys membarrier MEMBARRIER_CMD_GLOBAL_EXPEDITED";
    +
    +	if (sys_membarrier(cmd, flags) != 0) {
    +		ksft_exit_fail_msg(
    +			"%s test: flags = %d, errno = %d\n",
    +			test_name, flags, errno);
    +	}
    +
    +	ksft_test_result_pass(
    +		"%s test: flags = %d\n",
    +		test_name, flags);
    +	return 0;
    +}
    +
     static int test_membarrier(void)
     {
     	int status;
    @@ -142,7 +176,7 @@ static int test_membarrier(void)
     	status = test_membarrier_flags_fail();
     	if (status)
     		return status;
    -	status = test_membarrier_shared_success();
    +	status = test_membarrier_global_success();
     	if (status)
     		return status;
     	status = test_membarrier_private_expedited_fail();
    @@ -154,6 +188,19 @@ static int test_membarrier(void)
     	status = test_membarrier_private_expedited_success();
     	if (status)
     		return status;
    +	/*
    +	 * It is valid to send a global membarrier from a non-registered
    +	 * process.
    +	 */
    +	status = test_membarrier_global_expedited_success();
    +	if (status)
    +		return status;
    +	status = test_membarrier_register_global_expedited_success();
    +	if (status)
    +		return status;
    +	status = test_membarrier_global_expedited_success();
    +	if (status)
    +		return status;
     	return 0;
     }
     
    @@ -173,8 +220,10 @@ static int test_membarrier_query(void)
     		}
     		ksft_exit_fail_msg("sys_membarrier() failed\n");
     	}
    -	if (!(ret & MEMBARRIER_CMD_SHARED))
    +	if (!(ret & MEMBARRIER_CMD_GLOBAL)) {
    +		ksft_test_result_fail("sys_membarrier() CMD_GLOBAL query failed\n");
     		ksft_exit_fail_msg("sys_membarrier is not supported.\n");
    +	}
     
     	ksft_test_result_pass("sys_membarrier available\n");
     	return 0;
    -- 
    2.11.0
    
  • * [PATCH for 4.16 07/11] x86: Implement sync_core_before_usermode (v3)
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: linux-kernel, linux-api, Andy Lutomirski, Paul E . McKenney,
    	Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
    	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    	Dave Watson, H . Peter Anvin, Andrea Parri, Russell King,
    	Greg Hackmann, Will Deacon, David Sehr, Linus Torvalds, x86,
    	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    
    Ensure that a core serializing instruction is issued before returning to
    user-mode. x86 implements return to user-space through sysexit, sysretl,
    and sysretq, which are not core serializing.
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Ingo Molnar <mingo@redhat.com>
    CC: "H. Peter Anvin" <hpa@zytor.com>
    CC: Andrea Parri <parri.andrea@gmail.com>
    CC: Russell King <linux@armlinux.org.uk>
    CC: Greg Hackmann <ghackmann@google.com>
    CC: Will Deacon <will.deacon@arm.com>
    CC: David Sehr <sehr@google.com>
    CC: Linus Torvalds <torvalds@linux-foundation.org>
    CC: x86@kernel.org
    CC: linux-arch@vger.kernel.org
    ---
    Changes since v1:
    - Fix prototype of sync_core_before_usermode in generic code (missing
      return type).
    - Add linux/processor.h include to sched/core.c.
    - Add ARCH_HAS_SYNC_CORE_BEFORE_USERMODE to init/Kconfig.
    - Fix linux/processor.h ifdef to target
      CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE rather than
      ARCH_HAS_SYNC_CORE_BEFORE_USERMODE.
    - Move empty static inline in processor.h to generic patch.
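
    For context, a sketch of the generic fallback referred to above; the
    actual code lands in patch 06/11 ("Introduce sync_core_before_usermode",
    not shown in this excerpt), so details may differ:

    	/* include/linux/processor.h (sketch) */
    	#ifndef CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
    	static inline void sync_core_before_usermode(void)
    	{
    		/* No arch hook selected: nothing to do here. */
    	}
    	#endif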
    ---
     arch/x86/Kconfig                 |  1 +
     arch/x86/include/asm/processor.h | 10 ++++++++++
     2 files changed, 11 insertions(+)
    
    diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
    index 20da391b5f32..0b44c8dd0e95 100644
    --- a/arch/x86/Kconfig
    +++ b/arch/x86/Kconfig
    @@ -61,6 +61,7 @@ config X86
     	select ARCH_HAS_SG_CHAIN
     	select ARCH_HAS_STRICT_KERNEL_RWX
     	select ARCH_HAS_STRICT_MODULE_RWX
    +	select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
     	select ARCH_HAS_UBSAN_SANITIZE_ALL
     	select ARCH_HAS_ZONE_DEVICE		if X86_64
     	select ARCH_HAVE_NMI_SAFE_CMPXCHG
    diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
    index d3a67fba200a..3257d34dbb40 100644
    --- a/arch/x86/include/asm/processor.h
    +++ b/arch/x86/include/asm/processor.h
    @@ -722,6 +722,16 @@ static inline void sync_core(void)
     #endif
     }
     
    +/*
    + * Ensure that a core serializing instruction is issued before returning
    + * to user-mode. x86 implements return to user-space through sysexit,
    + * sysretl, and sysretq, which are not core serializing.
    + */
    +static inline void sync_core_before_usermode(void)
    +{
    +	sync_core();
    +}
    +
     extern void select_idle_routine(const struct cpuinfo_x86 *c);
     extern void amd_e400_c1e_apic_setup(void);
     
    -- 
    2.11.0
    
  • * [PATCH for 4.16 08/11] membarrier: Provide core serializing command
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: linux-kernel, linux-api, Andy Lutomirski, Paul E . McKenney,
    	Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
    	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    	Dave Watson, H . Peter Anvin, Andrea Parri, Russell King,
    	Greg Hackmann, Will Deacon, David Sehr, Linus Torvalds, x86,
    	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    
    Provide a core serializing membarrier command to support memory
    reclaim by JITs.
    
    Each architecture needs to explicitly opt into that support by
    documenting in its architecture code how it provides the core
    serializing instructions required when returning from the membarrier
    IPI, and after the scheduler has updated the curr->mm pointer (before
    going back to user-space). It should then select
    ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on
    its architecture.
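
    As an illustration of the intended JIT use case, a hypothetical
    user-space sketch (not part of this patch; unpublish_jit_code() and
    reuse_memory() are made-up helpers standing in for the JIT's own
    logic):

    	/* One-time setup: register intent to use the sync_core command. */
    	sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0);

    	/* Reclaim: make the code unreachable, then guarantee no sibling
    	 * thread can still execute stale instructions before the memory
    	 * is reused. */
    	unpublish_jit_code(region);
    	sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
    	reuse_memory(region);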
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Thomas Gleixner <tglx@linutronix.de>
    CC: Ingo Molnar <mingo@redhat.com>
    CC: "H. Peter Anvin" <hpa@zytor.com>
    CC: Andrea Parri <parri.andrea@gmail.com>
    CC: Russell King <linux@armlinux.org.uk>
    CC: Greg Hackmann <ghackmann@google.com>
    CC: Will Deacon <will.deacon@arm.com>
    CC: David Sehr <sehr@google.com>
    CC: linux-arch@vger.kernel.org
    ---
     include/linux/sched/mm.h        | 18 ++++++++++++++
     include/uapi/linux/membarrier.h | 32 ++++++++++++++++++++++++-
     init/Kconfig                    |  3 +++
     kernel/sched/core.c             |  6 ++++-
     kernel/sched/membarrier.c       | 53 +++++++++++++++++++++++++++++++----------
     5 files changed, 98 insertions(+), 14 deletions(-)
    
    diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
    index c7ad7a70cfef..14553f8cc666 100644
    --- a/include/linux/sched/mm.h
    +++ b/include/linux/sched/mm.h
    @@ -7,6 +7,7 @@
     #include <linux/sched.h>
     #include <linux/mm_types.h>
     #include <linux/gfp.h>
    +#include <linux/processor.h>
     
     /*
      * Routines for handling mm_structs
    @@ -223,12 +224,26 @@ enum {
     	MEMBARRIER_STATE_PRIVATE_EXPEDITED			= (1U << 1),
     	MEMBARRIER_STATE_GLOBAL_EXPEDITED_READY			= (1U << 2),
     	MEMBARRIER_STATE_GLOBAL_EXPEDITED			= (1U << 3),
    +	MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY	= (1U << 4),
    +	MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE		= (1U << 5),
    +};
    +
    +enum {
    +	MEMBARRIER_FLAG_SYNC_CORE	= (1U << 0),
     };
     
     #ifdef CONFIG_ARCH_HAS_MEMBARRIER_HOOKS
     #include <asm/membarrier.h>
     #endif
     
    +static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
    +{
    +	if (likely(!(atomic_read(&mm->membarrier_state) &
    +		     MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE)))
    +		return;
    +	sync_core_before_usermode();
    +}
    +
     static inline void membarrier_execve(struct task_struct *t)
     {
     	atomic_set(&t->mm->membarrier_state, 0);
    @@ -244,6 +259,9 @@ static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
     static inline void membarrier_execve(struct task_struct *t)
     {
     }
    +static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
    +{
    +}
     #endif
     
     #endif /* _LINUX_SCHED_MM_H */
    diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
    index d252506e1b5e..5891d7614c8c 100644
    --- a/include/uapi/linux/membarrier.h
    +++ b/include/uapi/linux/membarrier.h
    @@ -73,7 +73,7 @@
      *                          to and return from the system call
      *                          (non-running threads are de facto in such a
      *                          state). This only covers threads from the
    - *                          same processes as the caller thread. This
    + *                          same process as the caller thread. This
      *                          command returns 0 on success. The
      *                          "expedited" commands complete faster than
      *                          the non-expedited ones, they never block,
    @@ -86,6 +86,34 @@
      *                          Register the process intent to use
      *                          MEMBARRIER_CMD_PRIVATE_EXPEDITED. Always
      *                          returns 0.
    + * @MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE:
    + *                          In addition to providing the memory
    + *                          ordering guarantees described in
    + *                          MEMBARRIER_CMD_PRIVATE_EXPEDITED, ensure
    + *                          that, upon return from the system call,
    + *                          all running thread siblings of the caller
    + *                          have executed a core serializing
    + *                          instruction. (Architectures are required to
    + *                          guarantee that non-running threads issue
    + *                          core serializing instructions before they
    + *                          resume user-space execution). This only
    + *                          covers threads from the same process as the
    + *                          caller thread. This command returns 0 on
    + *                          success. The "expedited" commands complete
    + *                          faster than the non-expedited ones, they
    + *                          never block, but have the downside of
    + *                          causing extra overhead. If this command is
    + *                          not implemented by an architecture, -EINVAL
    + *                          is returned. A process needs to register its
    + *                          intent to use the private expedited sync
    + *                          core command prior to using it, otherwise
    + *                          this command returns -EPERM.
    + * @MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE:
    + *                          Register the process intent to use
    + *                          MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.
    + *                          If this command is not implemented by an
    + *                          architecture, -EINVAL is returned.
    + *                          Returns 0 on success.
      * @MEMBARRIER_CMD_SHARED:
      *                          Alias to MEMBARRIER_CMD_GLOBAL. Provided for
      *                          header backward compatibility.
    @@ -101,6 +129,8 @@ enum membarrier_cmd {
     	MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED		= (1 << 2),
     	MEMBARRIER_CMD_PRIVATE_EXPEDITED			= (1 << 3),
     	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED		= (1 << 4),
    +	MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE		= (1 << 5),
    +	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE	= (1 << 6),
     
     	/* Alias for header backward compatibility. */
     	MEMBARRIER_CMD_SHARED			= MEMBARRIER_CMD_GLOBAL,
    diff --git a/init/Kconfig b/init/Kconfig
    index 30208da2221f..30b65febeb23 100644
    --- a/init/Kconfig
    +++ b/init/Kconfig
    @@ -1415,6 +1415,9 @@ config USERFAULTFD
     config ARCH_HAS_MEMBARRIER_HOOKS
     	bool
     
    +config ARCH_HAS_MEMBARRIER_SYNC_CORE
    +	bool
    +
     config EMBEDDED
     	bool "Embedded system"
     	option allnoconfig_y
    diff --git a/kernel/sched/core.c b/kernel/sched/core.c
    index 62f269980e29..f86cbba038b9 100644
    --- a/kernel/sched/core.c
    +++ b/kernel/sched/core.c
    @@ -2662,9 +2662,13 @@ static struct rq *finish_task_switch(struct task_struct *prev)
     	 * thread, mmdrop()'s implicit full barrier is required by the
     	 * membarrier system call, because the current active_mm can
     	 * become the current mm without going through switch_mm().
    +	 * membarrier also requires a core serializing instruction
    +	 * before going back to user-space after storing to rq->curr.
     	 */
    -	if (mm)
    +	if (mm) {
    +		membarrier_mm_sync_core_before_usermode(mm);
     		mmdrop(mm);
    +	}
     	if (unlikely(prev_state == TASK_DEAD)) {
     		if (prev->sched_class->task_dead)
     			prev->sched_class->task_dead(prev);
    diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
    index d2087d5f9837..5d0762633639 100644
    --- a/kernel/sched/membarrier.c
    +++ b/kernel/sched/membarrier.c
    @@ -26,11 +26,20 @@
      * Bitmask made from a "or" of all commands within enum membarrier_cmd,
      * except MEMBARRIER_CMD_QUERY.
      */
    +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE
    +#define MEMBARRIER_PRIVATE_EXPEDITED_SYNC_CORE_BITMASK	\
    +	(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE \
    +	| MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE)
    +#else
    +#define MEMBARRIER_PRIVATE_EXPEDITED_SYNC_CORE_BITMASK	0
    +#endif
    +
     #define MEMBARRIER_CMD_BITMASK	\
     	(MEMBARRIER_CMD_GLOBAL | MEMBARRIER_CMD_GLOBAL_EXPEDITED \
     	| MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED \
     	| MEMBARRIER_CMD_PRIVATE_EXPEDITED	\
    -	| MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED)
    +	| MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED	\
    +	| MEMBARRIER_PRIVATE_EXPEDITED_SYNC_CORE_BITMASK)
     
     static void ipi_mb(void *info)
     {
    @@ -104,15 +113,23 @@ static int membarrier_global_expedited(void)
     	return 0;
     }
     
    -static int membarrier_private_expedited(void)
    +static int membarrier_private_expedited(int flags)
     {
     	int cpu;
     	bool fallback = false;
     	cpumask_var_t tmpmask;
     
    -	if (!(atomic_read(&current->mm->membarrier_state)
    -			& MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY))
    -		return -EPERM;
    +	if (flags & MEMBARRIER_FLAG_SYNC_CORE) {
    +		if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE))
    +			return -EINVAL;
    +		if (!(atomic_read(&current->mm->membarrier_state) &
    +		      MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY))
    +			return -EPERM;
    +	} else {
    +		if (!(atomic_read(&current->mm->membarrier_state) &
    +		      MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY))
    +			return -EPERM;
    +	}
     
     	if (num_online_cpus() == 1)
     		return 0;
    @@ -205,20 +222,29 @@ static int membarrier_register_global_expedited(void)
     	return 0;
     }
     
    -static int membarrier_register_private_expedited(void)
    +static int membarrier_register_private_expedited(int flags)
     {
     	struct task_struct *p = current;
     	struct mm_struct *mm = p->mm;
    +	int state = MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY;
    +
    +	if (flags & MEMBARRIER_FLAG_SYNC_CORE) {
    +		if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE))
    +			return -EINVAL;
    +		state = MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY;
    +	}
     
     	/*
     	 * We need to consider threads belonging to different thread
     	 * groups, which use the same mm. (CLONE_VM but not
     	 * CLONE_THREAD).
     	 */
    -	if (atomic_read(&mm->membarrier_state)
    -			& MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY)
    +	if (atomic_read(&mm->membarrier_state) & state)
     		return 0;
     	atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED, &mm->membarrier_state);
    +	if (flags & MEMBARRIER_FLAG_SYNC_CORE)
    +		atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE,
    +			  &mm->membarrier_state);
     	if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
     		/*
     		 * Ensure all future scheduler executions will observe the
    @@ -226,8 +252,7 @@ static int membarrier_register_private_expedited(void)
     		 */
     		synchronize_sched();
     	}
    -	atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY,
    -			&mm->membarrier_state);
    +	atomic_or(state, &mm->membarrier_state);
     	return 0;
     }
     
    @@ -283,9 +308,13 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
     	case MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED:
     		return membarrier_register_global_expedited();
     	case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
    -		return membarrier_private_expedited();
    +		return membarrier_private_expedited(0);
     	case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED:
    -		return membarrier_register_private_expedited();
    +		return membarrier_register_private_expedited(0);
    +	case MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE:
    +		return membarrier_private_expedited(MEMBARRIER_FLAG_SYNC_CORE);
    +	case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE:
    +		return membarrier_register_private_expedited(MEMBARRIER_FLAG_SYNC_CORE);
     	default:
     		return -EINVAL;
     	}
    -- 
    2.11.0
    
  • * [PATCH for 4.16 09/11] membarrier: x86: Provide core serializing command (v4)
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: linux-kernel, linux-api, Andy Lutomirski, Paul E . McKenney,
    	Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
    	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    	Dave Watson, H . Peter Anvin, Andrea Parri, Russell King,
    	Greg Hackmann, Will Deacon, David Sehr, Linus Torvalds, x86,
    	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    
    There are two places where core serialization is needed by membarrier:
    
    1) When returning from the membarrier IPI,
    2) After the scheduler updates curr to a thread with a different mm,
       before going back to user-space, since curr->mm is used by
       membarrier to check whether it needs to send an IPI to that CPU.
    
    x86-32 uses iret to return from interrupts, and both iret and sysexit
    to go back to user-space. The iret instruction is core serializing,
    but sysexit is not.
    
    x86-64 uses iret to return from interrupts, which takes care of the
    IPI. However, it can return to user-space through either sysretl
    (compat code), sysretq, or iret. Given that sysret{l,q} are not core
    serializing, we rely instead on the write_cr3() performed by
    switch_mm() to provide core serialization after changing the current
    mm, and deal with the special case of kthread -> uthread (which
    temporarily keeps the current mm in active_mm) by adding a
    sync_core() in that specific case.
    
    Use the new sync_core_before_usermode() to guarantee this.
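
    The kthread -> uthread special case is handled in generic code by
    patch 08/11; a condensed sketch of that control flow, for reference:

    	/* kernel/sched/core.c: finish_task_switch() (condensed) */
    	if (mm) {
    		/*
    		 * Lazy active_mm handoff: curr->mm changed without going
    		 * through switch_mm(), so no write_cr3() happened. Issue
    		 * sync_core() here if this mm registered for the private
    		 * expedited sync_core command.
    		 */
    		membarrier_mm_sync_core_before_usermode(mm);
    		mmdrop(mm);
    	}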
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Thomas Gleixner <tglx@linutronix.de>
    CC: Ingo Molnar <mingo@redhat.com>
    CC: "H. Peter Anvin" <hpa@zytor.com>
    CC: Andrea Parri <parri.andrea@gmail.com>
    CC: Russell King <linux@armlinux.org.uk>
    CC: Greg Hackmann <ghackmann@google.com>
    CC: Will Deacon <will.deacon@arm.com>
    CC: David Sehr <sehr@google.com>
    CC: x86@kernel.org
    CC: linux-arch@vger.kernel.org
    ---
    Changes since v1:
    - Use the newly introduced sync_core_before_usermode(). Move all state
      handling to generic code.
    - Add linux/processor.h include to include/linux/sched/mm.h.
    Changes since v2:
    - Fix use-after-free in membarrier_mm_sync_core_before_usermode.
    Changes since v3:
    - Move generic code into separate patch.
    ---
     arch/x86/Kconfig          | 1 +
     arch/x86/entry/entry_32.S | 5 +++++
     arch/x86/entry/entry_64.S | 4 ++++
     arch/x86/mm/tlb.c         | 7 ++++---
     4 files changed, 14 insertions(+), 3 deletions(-)
    
    diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
    index 0b44c8dd0e95..b5324f2e3162 100644
    --- a/arch/x86/Kconfig
    +++ b/arch/x86/Kconfig
    @@ -54,6 +54,7 @@ config X86
     	select ARCH_HAS_FORTIFY_SOURCE
     	select ARCH_HAS_GCOV_PROFILE_ALL
     	select ARCH_HAS_KCOV			if X86_64
    +	select ARCH_HAS_MEMBARRIER_SYNC_CORE
     	select ARCH_HAS_PMEM_API		if X86_64
     	select ARCH_HAS_REFCOUNT
     	select ARCH_HAS_UACCESS_FLUSHCACHE	if X86_64
    diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
    index a1f28a54f23a..0c89cef690cf 100644
    --- a/arch/x86/entry/entry_32.S
    +++ b/arch/x86/entry/entry_32.S
    @@ -554,6 +554,11 @@ restore_all:
     .Lrestore_nocheck:
     	RESTORE_REGS 4				# skip orig_eax/error_code
     .Lirq_return:
    +	/*
    +	 * ARCH_HAS_MEMBARRIER_SYNC_CORE relies on iret core serialization
    +	 * when returning from the IPI handler and when returning from
    +	 * the scheduler to user-space.
    +	 */
     	INTERRUPT_RETURN
     
     .section .fixup, "ax"
    diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
    index 4f8e1d35a97c..8a32390240f1 100644
    --- a/arch/x86/entry/entry_64.S
    +++ b/arch/x86/entry/entry_64.S
    @@ -792,6 +792,10 @@ GLOBAL(restore_regs_and_return_to_kernel)
     	POP_EXTRA_REGS
     	POP_C_REGS
     	addq	$8, %rsp	/* skip regs->orig_ax */
    +	/*
    +	 * ARCH_HAS_MEMBARRIER_SYNC_CORE relies on iret core serialization
    +	 * when returning from the IPI handler.
    +	 */
     	INTERRUPT_RETURN
     
     ENTRY(native_iret)
    diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
    index c28cd5592b0d..df4e21371c89 100644
    --- a/arch/x86/mm/tlb.c
    +++ b/arch/x86/mm/tlb.c
    @@ -201,9 +201,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
     	this_cpu_write(cpu_tlbstate.is_lazy, false);
     
     	/*
    -	 * The membarrier system call requires a full memory barrier
    -	 * before returning to user-space, after storing to rq->curr.
    -	 * Writing to CR3 provides that full memory barrier.
    +	 * The membarrier system call requires a full memory barrier and
    +	 * core serialization before returning to user-space, after
    +	 * storing to rq->curr. Writing to CR3 provides that full
    +	 * memory barrier and core serializing instruction.
     	 */
     	if (real_prev == next) {
     		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
    -- 
    2.11.0
    
  • * [PATCH for 4.16 10/11] membarrier: arm64: Provide core serializing command
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: linux-kernel, linux-api, Andy Lutomirski, Paul E . McKenney,
    	Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
    	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    	Dave Watson, H . Peter Anvin, Andrea Parri, Russell King,
    	Greg Hackmann, Will Deacon, David Sehr, Linus Torvalds, x86,
    	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
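
    Select ARCH_HAS_MEMBARRIER_SYNC_CORE on arm64. Return to user-space
    is performed through eret, which is context synchronizing, covering
    both return from the membarrier IPI and return to user-space after
    the scheduler updates rq->curr.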
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Thomas Gleixner <tglx@linutronix.de>
    CC: Ingo Molnar <mingo@redhat.com>
    CC: "H. Peter Anvin" <hpa@zytor.com>
    CC: Andrea Parri <parri.andrea@gmail.com>
    CC: Russell King <linux@armlinux.org.uk>
    CC: Greg Hackmann <ghackmann@google.com>
    CC: Will Deacon <will.deacon@arm.com>
    CC: David Sehr <sehr@google.com>
    CC: x86@kernel.org
    CC: linux-arch@vger.kernel.org
    ---
     arch/arm64/Kconfig        | 1 +
     arch/arm64/kernel/entry.S | 4 ++++
     2 files changed, 5 insertions(+)
    
    diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
    index c9a7e9e1414f..5b0c06d8dbbe 100644
    --- a/arch/arm64/Kconfig
    +++ b/arch/arm64/Kconfig
    @@ -16,6 +16,7 @@ config ARM64
     	select ARCH_HAS_GCOV_PROFILE_ALL
     	select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
     	select ARCH_HAS_KCOV
    +	select ARCH_HAS_MEMBARRIER_SYNC_CORE
     	select ARCH_HAS_SET_MEMORY
     	select ARCH_HAS_SG_CHAIN
     	select ARCH_HAS_STRICT_KERNEL_RWX
    diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
    index 6d14b8f29b5f..5edde1c2e93e 100644
    --- a/arch/arm64/kernel/entry.S
    +++ b/arch/arm64/kernel/entry.S
    @@ -302,6 +302,10 @@ alternative_else_nop_endif
     	ldp	x28, x29, [sp, #16 * 14]
     	ldr	lr, [sp, #S_LR]
     	add	sp, sp, #S_FRAME_SIZE		// restore sp
    +	/*
    +	 * ARCH_HAS_MEMBARRIER_SYNC_CORE relies on eret context synchronization
    +	 * when returning from the IPI handler, and when returning to user-space.
    +	 */
     	eret					// return to kernel
     	.endm
     
    -- 
    2.11.0
    
  • * [PATCH for 4.16 11/11] membarrier: selftest: Test private expedited sync core cmd
    From: Mathieu Desnoyers @ 2018-01-17 16:54 UTC (permalink / raw)
      To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
      Cc: linux-kernel, linux-api, Andy Lutomirski, Paul E . McKenney,
    	Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
    	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    	Dave Watson, H . Peter Anvin, Andrea Parri, Russell King,
    	Greg Hackmann, Will Deacon, David Sehr, Linus Torvalds, x86,
    	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    
    Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE and
    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE commands.
    
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
    CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    CC: Peter Zijlstra <peterz@infradead.org>
    CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    CC: Boqun Feng <boqun.feng@gmail.com>
    CC: Andrew Hunter <ahh@google.com>
    CC: Maged Michael <maged.michael@gmail.com>
    CC: Avi Kivity <avi@scylladb.com>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: Paul Mackerras <paulus@samba.org>
    CC: Michael Ellerman <mpe@ellerman.id.au>
    CC: Dave Watson <davejwatson@fb.com>
    CC: Alan Stern <stern@rowland.harvard.edu>
    CC: Will Deacon <will.deacon@arm.com>
    CC: Andy Lutomirski <luto@kernel.org>
    CC: Alice Ferrazzi <alice.ferrazzi@gmail.com>
    CC: Paul Elder <paul.elder@pitt.edu>
    CC: linux-kselftest@vger.kernel.org
    CC: linux-arch@vger.kernel.org
    ---
     .../testing/selftests/membarrier/membarrier_test.c | 73 ++++++++++++++++++++++
     1 file changed, 73 insertions(+)
    
    diff --git a/tools/testing/selftests/membarrier/membarrier_test.c b/tools/testing/selftests/membarrier/membarrier_test.c
    index eec39a0e1f77..22bffd55a523 100644
    --- a/tools/testing/selftests/membarrier/membarrier_test.c
    +++ b/tools/testing/selftests/membarrier/membarrier_test.c
    @@ -132,6 +132,63 @@ static int test_membarrier_private_expedited_success(void)
     	return 0;
     }
     
    +static int test_membarrier_private_expedited_sync_core_fail(void)
    +{
    +	int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, flags = 0;
    +	const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE not registered failure";
    +
    +	if (sys_membarrier(cmd, flags) != -1) {
    +		ksft_exit_fail_msg(
    +			"%s test: flags = %d. Should fail, but passed\n",
    +			test_name, flags);
    +	}
    +	if (errno != EPERM) {
    +		ksft_exit_fail_msg(
    +			"%s test: flags = %d. Should return (%d: \"%s\"), but returned (%d: \"%s\").\n",
    +			test_name, flags, EPERM, strerror(EPERM),
    +			errno, strerror(errno));
    +	}
    +
    +	ksft_test_result_pass(
    +		"%s test: flags = %d, errno = %d\n",
    +		test_name, flags, errno);
    +	return 0;
    +}
    +
    +static int test_membarrier_register_private_expedited_sync_core_success(void)
    +{
    +	int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, flags = 0;
    +	const char *test_name = "sys membarrier MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE";
    +
    +	if (sys_membarrier(cmd, flags) != 0) {
    +		ksft_exit_fail_msg(
    +			"%s test: flags = %d, errno = %d\n",
    +			test_name, flags, errno);
    +	}
    +
    +	ksft_test_result_pass(
    +		"%s test: flags = %d\n",
    +		test_name, flags);
    +	return 0;
    +}
    +
    +static int test_membarrier_private_expedited_sync_core_success(void)
    +{
    +	int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, flags = 0;
    +	const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE";
    +
    +	if (sys_membarrier(cmd, flags) != 0) {
    +		ksft_exit_fail_msg(
    +			"%s test: flags = %d, errno = %d\n",
    +			test_name, flags, errno);
    +	}
    +
    +	ksft_test_result_pass(
    +		"%s test: flags = %d\n",
    +		test_name, flags);
    +	return 0;
    +}
    +
     static int test_membarrier_register_global_expedited_success(void)
     {
     	int cmd = MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED, flags = 0;
    @@ -188,6 +245,22 @@ static int test_membarrier(void)
     	status = test_membarrier_private_expedited_success();
     	if (status)
     		return status;
    +	status = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);
    +	if (status < 0) {
    +		ksft_test_result_fail("sys_membarrier() failed\n");
    +		return status;
    +	}
    +	if (status & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE) {
    +		status = test_membarrier_private_expedited_sync_core_fail();
    +		if (status)
    +			return status;
    +		status = test_membarrier_register_private_expedited_sync_core_success();
    +		if (status)
    +			return status;
    +		status = test_membarrier_private_expedited_sync_core_success();
    +		if (status)
    +			return status;
    +	}
     	/*
     	 * It is valid to send a global membarrier from a non-registered
     	 * process.
    -- 
    2.11.0
    