* [PATCH 6/8] powerpc/membarrier: Remove special barrier on mm switch [not found] <cover.1623813516.git.luto@kernel.org> @ 2021-06-16 3:21 ` Andy Lutomirski 2021-06-16 4:36 ` Nicholas Piggin 2021-06-16 3:21 ` [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski 1 sibling, 1 reply; 19+ messages in thread From: Andy Lutomirski @ 2021-06-16 3:21 UTC (permalink / raw) To: x86 Cc: linux-mm, Peter Zijlstra, LKML, Nicholas Piggin, Dave Hansen, Paul Mackerras, Andy Lutomirski, Mathieu Desnoyers, Andrew Morton, linuxppc-dev powerpc did the following on some, but not all, paths through switch_mm_irqs_off(): /* * Only need the full barrier when switching between processes. * Barrier when switching from kernel to userspace is not * required here, given that it is implied by mmdrop(). Barrier * when switching from userspace to kernel is not needed after * store to rq->curr. */ if (likely(!(atomic_read(&next->membarrier_state) & (MEMBARRIER_STATE_PRIVATE_EXPEDITED | MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) return; This is puzzling: if !prev, then one might expect that we are switching from kernel to user, not user to kernel, which is inconsistent with the comment. But this is all nonsense, because the one and only caller would never have prev == NULL and would, in fact, OOPS if prev == NULL. In any event, this code is unnecessary, since the new generic membarrier_finish_switch_mm() provides the same barrier without arch help. 
Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> --- arch/powerpc/include/asm/membarrier.h | 27 --------------------------- arch/powerpc/mm/mmu_context.c | 2 -- 2 files changed, 29 deletions(-) delete mode 100644 arch/powerpc/include/asm/membarrier.h diff --git a/arch/powerpc/include/asm/membarrier.h b/arch/powerpc/include/asm/membarrier.h deleted file mode 100644 index 6e20bb5c74ea..000000000000 --- a/arch/powerpc/include/asm/membarrier.h +++ /dev/null @@ -1,27 +0,0 @@ -#ifndef _ASM_POWERPC_MEMBARRIER_H -#define _ASM_POWERPC_MEMBARRIER_H - -static inline void membarrier_arch_switch_mm(struct mm_struct *prev, - struct mm_struct *next, - struct task_struct *tsk) -{ - /* - * Only need the full barrier when switching between processes. - * Barrier when switching from kernel to userspace is not - * required here, given that it is implied by mmdrop(). Barrier - * when switching from userspace to kernel is not needed after - * store to rq->curr. - */ - if (likely(!(atomic_read(&next->membarrier_state) & - (MEMBARRIER_STATE_PRIVATE_EXPEDITED | - MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) - return; - - /* - * The membarrier system call requires a full memory barrier - * after storing to rq->curr, before going back to user-space. 
- */ - smp_mb(); -} - -#endif /* _ASM_POWERPC_MEMBARRIER_H */ diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c index a857af401738..8daa95b3162b 100644 --- a/arch/powerpc/mm/mmu_context.c +++ b/arch/powerpc/mm/mmu_context.c @@ -85,8 +85,6 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, if (new_on_cpu) radix_kvm_prefetch_workaround(next); - else - membarrier_arch_switch_mm(prev, next, tsk); /* * The actual HW switching method differs between the various -- 2.31.1 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 6/8] powerpc/membarrier: Remove special barrier on mm switch 2021-06-16 3:21 ` [PATCH 6/8] powerpc/membarrier: Remove special barrier on mm switch Andy Lutomirski @ 2021-06-16 4:36 ` Nicholas Piggin 0 siblings, 0 replies; 19+ messages in thread From: Nicholas Piggin @ 2021-06-16 4:36 UTC (permalink / raw) To: Andy Lutomirski, x86 Cc: Dave Hansen, Peter Zijlstra, LKML, linux-mm, Mathieu Desnoyers, Paul Mackerras, Andrew Morton, linuxppc-dev Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm: > powerpc did the following on some, but not all, paths through > switch_mm_irqs_off(): > > /* > * Only need the full barrier when switching between processes. > * Barrier when switching from kernel to userspace is not > * required here, given that it is implied by mmdrop(). Barrier > * when switching from userspace to kernel is not needed after > * store to rq->curr. > */ > if (likely(!(atomic_read(&next->membarrier_state) & > (MEMBARRIER_STATE_PRIVATE_EXPEDITED | > MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) > return; > > This is puzzling: if !prev, then one might expect that we are switching > from kernel to user, not user to kernel, which is inconsistent with the > comment. But this is all nonsense, because the one and only caller would > never have prev == NULL and would, in fact, OOPS if prev == NULL. Yeah that's strange, code definitely doesn't match comment. Good catch. > > In any event, this code is unnecessary, since the new generic > membarrier_finish_switch_mm() provides the same barrier without arch help. If that's merged then I think this could be too. I'll do a bit more digging into this too. 
Thanks, Nick > > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: Paul Mackerras <paulus@samba.org> > Cc: linuxppc-dev@lists.ozlabs.org > Cc: Nicholas Piggin <npiggin@gmail.com> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Signed-off-by: Andy Lutomirski <luto@kernel.org> > --- > arch/powerpc/include/asm/membarrier.h | 27 --------------------------- > arch/powerpc/mm/mmu_context.c | 2 -- > 2 files changed, 29 deletions(-) > delete mode 100644 arch/powerpc/include/asm/membarrier.h > > diff --git a/arch/powerpc/include/asm/membarrier.h b/arch/powerpc/include/asm/membarrier.h > deleted file mode 100644 > index 6e20bb5c74ea..000000000000 > --- a/arch/powerpc/include/asm/membarrier.h > +++ /dev/null > @@ -1,27 +0,0 @@ > -#ifndef _ASM_POWERPC_MEMBARRIER_H > -#define _ASM_POWERPC_MEMBARRIER_H > - > -static inline void membarrier_arch_switch_mm(struct mm_struct *prev, > - struct mm_struct *next, > - struct task_struct *tsk) > -{ > - /* > - * Only need the full barrier when switching between processes. > - * Barrier when switching from kernel to userspace is not > - * required here, given that it is implied by mmdrop(). Barrier > - * when switching from userspace to kernel is not needed after > - * store to rq->curr. > - */ > - if (likely(!(atomic_read(&next->membarrier_state) & > - (MEMBARRIER_STATE_PRIVATE_EXPEDITED | > - MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) > - return; > - > - /* > - * The membarrier system call requires a full memory barrier > - * after storing to rq->curr, before going back to user-space. 
> - */ > - smp_mb(); > -} > - > -#endif /* _ASM_POWERPC_MEMBARRIER_H */ > diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c > index a857af401738..8daa95b3162b 100644 > --- a/arch/powerpc/mm/mmu_context.c > +++ b/arch/powerpc/mm/mmu_context.c > @@ -85,8 +85,6 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, > > if (new_on_cpu) > radix_kvm_prefetch_workaround(next); > - else > - membarrier_arch_switch_mm(prev, next, tsk); > > /* > * The actual HW switching method differs between the various > -- > 2.31.1 > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation [not found] <cover.1623813516.git.luto@kernel.org> 2021-06-16 3:21 ` [PATCH 6/8] powerpc/membarrier: Remove special barrier on mm switch Andy Lutomirski @ 2021-06-16 3:21 ` Andy Lutomirski 2021-06-16 4:45 ` Nicholas Piggin ` (3 more replies) 1 sibling, 4 replies; 19+ messages in thread From: Andy Lutomirski @ 2021-06-16 3:21 UTC (permalink / raw) To: x86 Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, LKML, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Andy Lutomirski, Mathieu Desnoyers, Andrew Morton, linuxppc-dev, linux-arm-kernel The old sync_core_before_usermode() comments suggested that a non-icache-syncing return-to-usermode instruction is x86-specific and that all other architectures automatically notice cross-modified code on return to userspace. This is misleading. The incantation needed to modify code from one CPU and execute it on another CPU is highly architecture dependent. On x86, according to the SDM, one must modify the code, issue SFENCE if the modification was WC or nontemporal, and then issue a "serializing instruction" on the CPU that will execute the code. membarrier() can do the latter. On arm64 and powerpc, one must flush the icache and then flush the pipeline on the target CPU, although the CPU manuals don't necessarily use this language. So let's drop any pretense that we can have a generic way to define or implement membarrier's SYNC_CORE operation and instead require all architectures to define the helper and supply their own documentation as to how to use it. This means x86, arm64, and powerpc for now. Let's also rename the function from sync_core_before_usermode() to membarrier_sync_core_before_usermode() because the precise flushing details may very well be specific to membarrier, and even the concept of "sync_core" in the kernel is mostly an x86-ism. 
(It may well be the case that, on real x86 processors, synchronizing the icache (which requires no action at all) and "flushing the pipeline" is sufficient, but trying to use this language would be confusing at best. LFENCE does something awfully like "flushing the pipeline", but the SDM does not permit LFENCE as an alternative to a "serializing instruction" for this purpose.) Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Cc: stable@vger.kernel.org Fixes: 70216e18e519 ("membarrier: Provide core serializing command, *_SYNC_CORE") Signed-off-by: Andy Lutomirski <luto@kernel.org> --- .../membarrier-sync-core/arch-support.txt | 68 ++++++------------- arch/arm64/include/asm/sync_core.h | 19 ++++++ arch/powerpc/include/asm/sync_core.h | 14 ++++ arch/x86/Kconfig | 1 - arch/x86/include/asm/sync_core.h | 7 +- arch/x86/kernel/alternative.c | 2 +- arch/x86/kernel/cpu/mce/core.c | 2 +- arch/x86/mm/tlb.c | 3 +- drivers/misc/sgi-gru/grufault.c | 2 +- drivers/misc/sgi-gru/gruhandles.c | 2 +- drivers/misc/sgi-gru/grukservices.c | 2 +- include/linux/sched/mm.h | 1 - include/linux/sync_core.h | 21 ------ init/Kconfig | 3 - kernel/sched/membarrier.c | 15 ++-- 15 files changed, 75 insertions(+), 87 deletions(-) create mode 100644 arch/arm64/include/asm/sync_core.h create mode 100644 arch/powerpc/include/asm/sync_core.h delete mode 100644 include/linux/sync_core.h diff --git a/Documentation/features/sched/membarrier-sync-core/arch-support.txt b/Documentation/features/sched/membarrier-sync-core/arch-support.txt index 883d33b265d6..41c9ebcb275f 100644 --- 
a/Documentation/features/sched/membarrier-sync-core/arch-support.txt +++ b/Documentation/features/sched/membarrier-sync-core/arch-support.txt @@ -5,51 +5,25 @@ # # Architecture requirements # -# * arm/arm64/powerpc # -# Rely on implicit context synchronization as a result of exception return -# when returning from IPI handler, and when returning to user-space. -# -# * x86 -# -# x86-32 uses IRET as return from interrupt, which takes care of the IPI. -# However, it uses both IRET and SYSEXIT to go back to user-space. The IRET -# instruction is core serializing, but not SYSEXIT. -# -# x86-64 uses IRET as return from interrupt, which takes care of the IPI. -# However, it can return to user-space through either SYSRETL (compat code), -# SYSRETQ, or IRET. -# -# Given that neither SYSRET{L,Q}, nor SYSEXIT, are core serializing, we rely -# instead on write_cr3() performed by switch_mm() to provide core serialization -# after changing the current mm, and deal with the special case of kthread -> -# uthread (temporarily keeping current mm into active_mm) by issuing a -# sync_core_before_usermode() in that specific case. -# - ----------------------- - | arch |status| - ----------------------- - | alpha: | TODO | - | arc: | TODO | - | arm: | ok | - | arm64: | ok | - | csky: | TODO | - | h8300: | TODO | - | hexagon: | TODO | - | ia64: | TODO | - | m68k: | TODO | - | microblaze: | TODO | - | mips: | TODO | - | nds32: | TODO | - | nios2: | TODO | - | openrisc: | TODO | - | parisc: | TODO | - | powerpc: | ok | - | riscv: | TODO | - | s390: | TODO | - | sh: | TODO | - | sparc: | TODO | - | um: | TODO | - | x86: | ok | - | xtensa: | TODO | - ----------------------- +# An architecture that wants to support +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE needs to define precisely what it +# is supposed to do and implement membarrier_sync_core_before_usermode() to +# make it do that. 
Then it can select ARCH_HAS_MEMBARRIER_SYNC_CORE via +# Kconfig. Unfortunately, MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is not a +# fantastic API and may not make sense on all architectures. Once an +# architecture meets these requirements, the semantics are as follows. +# +# On x86, a program can safely modify code, issue +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, and then execute that code, via +# the modified address or an alias, from any thread in the calling process. +# +# On arm64, a program can modify code, flush the icache as needed, and issue +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE to force a "context synchronizing +# event", aka pipeline flush on all CPUs that might run the calling process. +# Then the program can execute the modified code as long as it is executed +# from an address consistent with the icache flush and the CPU's cache type. +# +# On powerpc, a program can use MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE +# similarly to arm64. It would be nice if the powerpc maintainers could +# add a clearer explanation. diff --git a/arch/arm64/include/asm/sync_core.h b/arch/arm64/include/asm/sync_core.h new file mode 100644 index 000000000000..74996bf533bb --- /dev/null +++ b/arch/arm64/include/asm/sync_core.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_ARM64_SYNC_CORE_H +#define _ASM_ARM64_SYNC_CORE_H + +#include <asm/barrier.h> + +/* + * On arm64, anyone trying to use membarrier() to handle JIT code is + * required to first flush the icache and then do SYNC_CORE. All that's + * needed after the icache flush is to execute a "context synchronization + * event". Right now, ERET does this, and we are guaranteed to ERET before + * any user code runs. If Linux ever programs the CPU to make ERET stop + * being a context synchronizing event, then this will need to be adjusted. 
+ */ +static inline void membarrier_sync_core_before_usermode(void) +{ +} + +#endif /* _ASM_ARM64_SYNC_CORE_H */ diff --git a/arch/powerpc/include/asm/sync_core.h b/arch/powerpc/include/asm/sync_core.h new file mode 100644 index 000000000000..589fdb34beab --- /dev/null +++ b/arch/powerpc/include/asm/sync_core.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_SYNC_CORE_H +#define _ASM_POWERPC_SYNC_CORE_H + +#include <asm/barrier.h> + +/* + * XXX: can a powerpc person put an appropriate comment here? + */ +static inline void membarrier_sync_core_before_usermode(void) +{ +} + +#endif /* _ASM_POWERPC_SYNC_CORE_H */ diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 0045e1b44190..f010897a1e8a 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -89,7 +89,6 @@ config X86 select ARCH_HAS_SET_DIRECT_MAP select ARCH_HAS_STRICT_KERNEL_RWX select ARCH_HAS_STRICT_MODULE_RWX - select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE select ARCH_HAS_SYSCALL_WRAPPER select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_HAS_DEBUG_WX diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h index ab7382f92aff..c665b453969a 100644 --- a/arch/x86/include/asm/sync_core.h +++ b/arch/x86/include/asm/sync_core.h @@ -89,11 +89,10 @@ static inline void sync_core(void) } /* - * Ensure that a core serializing instruction is issued before returning - * to user-mode. x86 implements return to user-space through sysexit, - * sysrel, and sysretq, which are not core serializing. + * Ensure that the CPU notices any instruction changes before the next time + * it returns to usermode. */ -static inline void sync_core_before_usermode(void) +static inline void membarrier_sync_core_before_usermode(void) { /* With PTI, we unconditionally serialize before running user code. 
*/ if (static_cpu_has(X86_FEATURE_PTI)) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 6974b5174495..52ead5f4fcdc 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -17,7 +17,7 @@ #include <linux/kprobes.h> #include <linux/mmu_context.h> #include <linux/bsearch.h> -#include <linux/sync_core.h> +#include <asm/sync_core.h> #include <asm/text-patching.h> #include <asm/alternative.h> #include <asm/sections.h> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index bf7fe87a7e88..4a577980d4d1 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -41,12 +41,12 @@ #include <linux/irq_work.h> #include <linux/export.h> #include <linux/set_memory.h> -#include <linux/sync_core.h> #include <linux/task_work.h> #include <linux/hardirq.h> #include <asm/intel-family.h> #include <asm/processor.h> +#include <asm/sync_core.h> #include <asm/traps.h> #include <asm/tlbflush.h> #include <asm/mce.h> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 59488d663e68..35b622fd2ed1 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -11,6 +11,7 @@ #include <linux/sched/mm.h> #include <asm/tlbflush.h> +#include <asm/sync_core.h> #include <asm/mmu_context.h> #include <asm/nospec-branch.h> #include <asm/cache.h> @@ -538,7 +539,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, */ if (unlikely(atomic_read(&next->membarrier_state) & MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE)) - sync_core_before_usermode(); + membarrier_sync_core_before_usermode(); #endif return; diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c index 723825524ea0..48fd5b101de1 100644 --- a/drivers/misc/sgi-gru/grufault.c +++ b/drivers/misc/sgi-gru/grufault.c @@ -20,8 +20,8 @@ #include <linux/io.h> #include <linux/uaccess.h> #include <linux/security.h> -#include <linux/sync_core.h> #include <linux/prefetch.h> +#include <asm/sync_core.h> #include 
"gru.h" #include "grutables.h" #include "grulib.h" diff --git a/drivers/misc/sgi-gru/gruhandles.c b/drivers/misc/sgi-gru/gruhandles.c index 1d75d5e540bc..c8cba1c1b00f 100644 --- a/drivers/misc/sgi-gru/gruhandles.c +++ b/drivers/misc/sgi-gru/gruhandles.c @@ -16,7 +16,7 @@ #define GRU_OPERATION_TIMEOUT (((cycles_t) local_cpu_data->itc_freq)*10) #define CLKS2NSEC(c) ((c) *1000000000 / local_cpu_data->itc_freq) #else -#include <linux/sync_core.h> +#include <asm/sync_core.h> #include <asm/tsc.h> #define GRU_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000) #define CLKS2NSEC(c) ((c) * 1000000 / tsc_khz) diff --git a/drivers/misc/sgi-gru/grukservices.c b/drivers/misc/sgi-gru/grukservices.c index 0ea923fe6371..ce03ff3f7c3a 100644 --- a/drivers/misc/sgi-gru/grukservices.c +++ b/drivers/misc/sgi-gru/grukservices.c @@ -16,10 +16,10 @@ #include <linux/miscdevice.h> #include <linux/proc_fs.h> #include <linux/interrupt.h> -#include <linux/sync_core.h> #include <linux/uaccess.h> #include <linux/delay.h> #include <linux/export.h> +#include <asm/sync_core.h> #include <asm/io_apic.h> #include "gru.h" #include "grulib.h" diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index c6eebbafadb0..845db11190cd 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -7,7 +7,6 @@ #include <linux/sched.h> #include <linux/mm_types.h> #include <linux/gfp.h> -#include <linux/sync_core.h> /* * Routines for handling mm_structs diff --git a/include/linux/sync_core.h b/include/linux/sync_core.h deleted file mode 100644 index 013da4b8b327..000000000000 --- a/include/linux/sync_core.h +++ /dev/null @@ -1,21 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef _LINUX_SYNC_CORE_H -#define _LINUX_SYNC_CORE_H - -#ifdef CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE -#include <asm/sync_core.h> -#else -/* - * This is a dummy sync_core_before_usermode() implementation that can be used - * on all architectures which return to user-space through core serializing - * instructions. 
- * If your architecture returns to user-space through non-core-serializing - * instructions, you need to write your own functions. - */ -static inline void sync_core_before_usermode(void) -{ -} -#endif - -#endif /* _LINUX_SYNC_CORE_H */ - diff --git a/init/Kconfig b/init/Kconfig index 1ea12c64e4c9..e5d552b0823e 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2377,9 +2377,6 @@ source "kernel/Kconfig.locks" config ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE bool -config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE - bool - # It may be useful for an architecture to override the definitions of the # SYSCALL_DEFINE() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h> # and the COMPAT_ variants in <linux/compat.h>, in particular to use a diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index c32c32a2441e..f72a6ab3fac2 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -5,6 +5,9 @@ * membarrier system call */ #include "sched.h" +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE +#include <asm/sync_core.h> +#endif /* * The basic principle behind the regular memory barrier mode of membarrier() @@ -221,6 +224,7 @@ static void ipi_mb(void *info) smp_mb(); /* IPIs should be serializing but paranoid. */ } +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE static void ipi_sync_core(void *info) { /* @@ -230,13 +234,14 @@ static void ipi_sync_core(void *info) * the big comment at the top of this file. * * A sync_core() would provide this guarantee, but - * sync_core_before_usermode() might end up being deferred until - * after membarrier()'s smp_mb(). + * membarrier_sync_core_before_usermode() might end up being deferred + * until after membarrier()'s smp_mb(). */ smp_mb(); /* IPIs should be serializing but paranoid. 
*/ - sync_core_before_usermode(); + membarrier_sync_core_before_usermode(); } +#endif static void ipi_rseq(void *info) { @@ -368,12 +373,14 @@ static int membarrier_private_expedited(int flags, int cpu_id) smp_call_func_t ipi_func = ipi_mb; if (flags == MEMBARRIER_FLAG_SYNC_CORE) { - if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) +#ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE return -EINVAL; +#else if (!(atomic_read(&mm->membarrier_state) & MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) return -EPERM; ipi_func = ipi_sync_core; +#endif } else if (flags == MEMBARRIER_FLAG_RSEQ) { if (!IS_ENABLED(CONFIG_RSEQ)) return -EINVAL; -- 2.31.1 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 3:21 ` [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski @ 2021-06-16 4:45 ` Nicholas Piggin 2021-06-16 18:52 ` Andy Lutomirski 2021-06-16 10:20 ` Will Deacon ` (2 subsequent siblings) 3 siblings, 1 reply; 19+ messages in thread From: Nicholas Piggin @ 2021-06-16 4:45 UTC (permalink / raw) To: Andy Lutomirski, x86 Cc: Will Deacon, linux-mm, Peter Zijlstra, LKML, stable, Dave Hansen, Mathieu Desnoyers, Catalin Marinas, Paul Mackerras, Andrew Morton, linuxppc-dev, linux-arm-kernel Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm: > The old sync_core_before_usermode() comments suggested that a non-icache-syncing > return-to-usermode instruction is x86-specific and that all other > architectures automatically notice cross-modified code on return to > userspace. > > This is misleading. The incantation needed to modify code from one > CPU and execute it on another CPU is highly architecture dependent. > On x86, according to the SDM, one must modify the code, issue SFENCE > if the modification was WC or nontemporal, and then issue a "serializing > instruction" on the CPU that will execute the code. membarrier() can do > the latter. > > On arm64 and powerpc, one must flush the icache and then flush the pipeline > on the target CPU, although the CPU manuals don't necessarily use this > language. > > So let's drop any pretense that we can have a generic way to define or > implement membarrier's SYNC_CORE operation and instead require all > architectures to define the helper and supply their own documentation as to > how to use it. This means x86, arm64, and powerpc for now. 
Let's also > rename the function from sync_core_before_usermode() to > membarrier_sync_core_before_usermode() because the precise flushing details > may very well be specific to membarrier, and even the concept of > "sync_core" in the kernel is mostly an x86-ism. > > (It may well be the case that, on real x86 processors, synchronizing the > icache (which requires no action at all) and "flushing the pipeline" is > sufficient, but trying to use this language would be confusing at best. > LFENCE does something awfully like "flushing the pipeline", but the SDM > does not permit LFENCE as an alternative to a "serializing instruction" > for this purpose.) > > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: Paul Mackerras <paulus@samba.org> > Cc: linuxppc-dev@lists.ozlabs.org > Cc: Nicholas Piggin <npiggin@gmail.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: Nicholas Piggin <npiggin@gmail.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: x86@kernel.org > Cc: stable@vger.kernel.org > Fixes: 70216e18e519 ("membarrier: Provide core serializing command, *_SYNC_CORE") > Signed-off-by: Andy Lutomirski <luto@kernel.org> > --- > .../membarrier-sync-core/arch-support.txt | 68 ++++++------------- > arch/arm64/include/asm/sync_core.h | 19 ++++++ > arch/powerpc/include/asm/sync_core.h | 14 ++++ > arch/x86/Kconfig | 1 - > arch/x86/include/asm/sync_core.h | 7 +- > arch/x86/kernel/alternative.c | 2 +- > arch/x86/kernel/cpu/mce/core.c | 2 +- > arch/x86/mm/tlb.c | 3 +- > drivers/misc/sgi-gru/grufault.c | 2 +- > drivers/misc/sgi-gru/gruhandles.c | 2 +- > drivers/misc/sgi-gru/grukservices.c | 2 +- > include/linux/sched/mm.h | 1 - > include/linux/sync_core.h | 21 ------ > init/Kconfig | 3 - > kernel/sched/membarrier.c | 15 ++-- > 15 files changed, 75 insertions(+), 87 deletions(-) > 
create mode 100644 arch/arm64/include/asm/sync_core.h > create mode 100644 arch/powerpc/include/asm/sync_core.h > delete mode 100644 include/linux/sync_core.h > > diff --git a/Documentation/features/sched/membarrier-sync-core/arch-support.txt b/Documentation/features/sched/membarrier-sync-core/arch-support.txt > index 883d33b265d6..41c9ebcb275f 100644 > --- a/Documentation/features/sched/membarrier-sync-core/arch-support.txt > +++ b/Documentation/features/sched/membarrier-sync-core/arch-support.txt > @@ -5,51 +5,25 @@ > # > # Architecture requirements > # > -# * arm/arm64/powerpc > # > -# Rely on implicit context synchronization as a result of exception return > -# when returning from IPI handler, and when returning to user-space. > -# > -# * x86 > -# > -# x86-32 uses IRET as return from interrupt, which takes care of the IPI. > -# However, it uses both IRET and SYSEXIT to go back to user-space. The IRET > -# instruction is core serializing, but not SYSEXIT. > -# > -# x86-64 uses IRET as return from interrupt, which takes care of the IPI. > -# However, it can return to user-space through either SYSRETL (compat code), > -# SYSRETQ, or IRET. > -# > -# Given that neither SYSRET{L,Q}, nor SYSEXIT, are core serializing, we rely > -# instead on write_cr3() performed by switch_mm() to provide core serialization > -# after changing the current mm, and deal with the special case of kthread -> > -# uthread (temporarily keeping current mm into active_mm) by issuing a > -# sync_core_before_usermode() in that specific case. 
> -# > - ----------------------- > - | arch |status| > - ----------------------- > - | alpha: | TODO | > - | arc: | TODO | > - | arm: | ok | > - | arm64: | ok | > - | csky: | TODO | > - | h8300: | TODO | > - | hexagon: | TODO | > - | ia64: | TODO | > - | m68k: | TODO | > - | microblaze: | TODO | > - | mips: | TODO | > - | nds32: | TODO | > - | nios2: | TODO | > - | openrisc: | TODO | > - | parisc: | TODO | > - | powerpc: | ok | > - | riscv: | TODO | > - | s390: | TODO | > - | sh: | TODO | > - | sparc: | TODO | > - | um: | TODO | > - | x86: | ok | > - | xtensa: | TODO | > - ----------------------- > +# An architecture that wants to support > +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE needs to define precisely what it > +# is supposed to do and implement membarrier_sync_core_before_usermode() to > +# make it do that. Then it can select ARCH_HAS_MEMBARRIER_SYNC_CORE via > +# Kconfig. Unfortunately, MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is not a > +# fantastic API and may not make sense on all architectures. Once an > +# architecture meets these requirements, the semantics are as follows. > +# > +# On x86, a program can safely modify code, issue > +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, and then execute that code, via > +# the modified address or an alias, from any thread in the calling process. > +# > +# On arm64, a program can modify code, flush the icache as needed, and issue > +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE to force a "context synchronizing > +# event", aka pipeline flush on all CPUs that might run the calling process. > +# Then the program can execute the modified code as long as it is executed > +# from an address consistent with the icache flush and the CPU's cache type. > +# > +# On powerpc, a program can use MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE > +# similarly to arm64. It would be nice if the powerpc maintainers could > +# add a clearer explanation. 
> diff --git a/arch/arm64/include/asm/sync_core.h b/arch/arm64/include/asm/sync_core.h > new file mode 100644 > index 000000000000..74996bf533bb > --- /dev/null > +++ b/arch/arm64/include/asm/sync_core.h > @@ -0,0 +1,19 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _ASM_ARM64_SYNC_CORE_H > +#define _ASM_ARM64_SYNC_CORE_H > + > +#include <asm/barrier.h> > + > +/* > + * On arm64, anyone trying to use membarrier() to handle JIT code is > + * required to first flush the icache and then do SYNC_CORE. All that's > + * needed after the icache flush is to execute a "context synchronization > + * event". Right now, ERET does this, and we are guaranteed to ERET before > + * any user code runs. If Linux ever programs the CPU to make ERET stop > + * being a context synchronizing event, then this will need to be adjusted. > + */ > +static inline void membarrier_sync_core_before_usermode(void) > +{ > +} > + > +#endif /* _ASM_ARM64_SYNC_CORE_H */ > diff --git a/arch/powerpc/include/asm/sync_core.h b/arch/powerpc/include/asm/sync_core.h > new file mode 100644 > index 000000000000..589fdb34beab > --- /dev/null > +++ b/arch/powerpc/include/asm/sync_core.h > @@ -0,0 +1,14 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _ASM_POWERPC_SYNC_CORE_H > +#define _ASM_POWERPC_SYNC_CORE_H > + > +#include <asm/barrier.h> > + > +/* > + * XXX: can a powerpc person put an appropriate comment here? > + */ > +static inline void membarrier_sync_core_before_usermode(void) > +{ > +} > + > +#endif /* _ASM_POWERPC_SYNC_CORE_H */ powerpc's can just go in asm/membarrier.h /* * The RFI family of instructions are context synchronising, and * that is how we return to userspace, so nothing is required here. 
*/ > diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c > index c32c32a2441e..f72a6ab3fac2 100644 > --- a/kernel/sched/membarrier.c > +++ b/kernel/sched/membarrier.c > @@ -5,6 +5,9 @@ > * membarrier system call > */ > #include "sched.h" > +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > +#include <asm/sync_core.h> > +#endif Can you #else static inline void membarrier_sync_core_before_usermode(void) { /* this gets constant folded out */ } #endif And avoid adding the ifdefs in the following code? Otherwise I think this is good. Acked-by: Nicholas Piggin <npiggin@gmail.com> Thanks, Nick > > /* > * The basic principle behind the regular memory barrier mode of membarrier() > @@ -221,6 +224,7 @@ static void ipi_mb(void *info) > smp_mb(); /* IPIs should be serializing but paranoid. */ > } > > +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > static void ipi_sync_core(void *info) > { > /* > @@ -230,13 +234,14 @@ static void ipi_sync_core(void *info) > * the big comment at the top of this file. > * > * A sync_core() would provide this guarantee, but > - * sync_core_before_usermode() might end up being deferred until > - * after membarrier()'s smp_mb(). > + * membarrier_sync_core_before_usermode() might end up being deferred > + * until after membarrier()'s smp_mb(). > */ > smp_mb(); /* IPIs should be serializing but paranoid. 
*/ > > - sync_core_before_usermode(); > + membarrier_sync_core_before_usermode(); > } > +#endif > > static void ipi_rseq(void *info) > { > @@ -368,12 +373,14 @@ static int membarrier_private_expedited(int flags, int cpu_id) > smp_call_func_t ipi_func = ipi_mb; > > if (flags == MEMBARRIER_FLAG_SYNC_CORE) { > - if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) > +#ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > return -EINVAL; > +#else > if (!(atomic_read(&mm->membarrier_state) & > MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) > return -EPERM; > ipi_func = ipi_sync_core; > +#endif > } else if (flags == MEMBARRIER_FLAG_RSEQ) { > if (!IS_ENABLED(CONFIG_RSEQ)) > return -EINVAL; > -- > 2.31.1 > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 4:45 ` Nicholas Piggin @ 2021-06-16 18:52 ` Andy Lutomirski 2021-06-16 23:48 ` Andy Lutomirski 2021-06-18 15:27 ` Christophe Leroy 0 siblings, 2 replies; 19+ messages in thread From: Andy Lutomirski @ 2021-06-16 18:52 UTC (permalink / raw) To: Nicholas Piggin, x86 Cc: Will Deacon, linux-mm, Peter Zijlstra, LKML, stable, Dave Hansen, Mathieu Desnoyers, Catalin Marinas, Paul Mackerras, Andrew Morton, linuxppc-dev, linux-arm-kernel On 6/15/21 9:45 PM, Nicholas Piggin wrote: > Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm: >> The old sync_core_before_usermode() comments suggested that a non-icache-syncing >> return-to-usermode instruction is x86-specific and that all other >> architectures automatically notice cross-modified code on return to >> userspace. >> +/* >> + * XXX: can a powerpc person put an appropriate comment here? >> + */ >> +static inline void membarrier_sync_core_before_usermode(void) >> +{ >> +} >> + >> +#endif /* _ASM_POWERPC_SYNC_CORE_H */ > > powerpc's can just go in asm/membarrier.h $ ls arch/powerpc/include/asm/membarrier.h ls: cannot access 'arch/powerpc/include/asm/membarrier.h': No such file or directory > > /* > * The RFI family of instructions are context synchronising, and > * that is how we return to userspace, so nothing is required here. > */ Thanks! ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 18:52 ` Andy Lutomirski @ 2021-06-16 23:48 ` Andy Lutomirski 2021-06-18 15:27 ` Christophe Leroy 1 sibling, 0 replies; 19+ messages in thread From: Andy Lutomirski @ 2021-06-16 23:48 UTC (permalink / raw) To: Nicholas Piggin, the arch/x86 maintainers Cc: Will Deacon, linux-mm, Peter Zijlstra (Intel), Linux Kernel Mailing List, stable, Dave Hansen, Mathieu Desnoyers, Catalin Marinas, Paul Mackerras, Andrew Morton, linuxppc-dev, linux-arm-kernel On Wed, Jun 16, 2021, at 11:52 AM, Andy Lutomirski wrote: > On 6/15/21 9:45 PM, Nicholas Piggin wrote: > > Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm: > >> The old sync_core_before_usermode() comments suggested that a non-icache-syncing > >> return-to-usermode instruction is x86-specific and that all other > >> architectures automatically notice cross-modified code on return to > >> userspace. > > >> +/* > >> + * XXX: can a powerpc person put an appropriate comment here? > >> + */ > >> +static inline void membarrier_sync_core_before_usermode(void) > >> +{ > >> +} > >> + > >> +#endif /* _ASM_POWERPC_SYNC_CORE_H */ > > > > powerpc's can just go in asm/membarrier.h > > $ ls arch/powerpc/include/asm/membarrier.h > ls: cannot access 'arch/powerpc/include/asm/membarrier.h': No such file > or directory Which is because I deleted it. Duh. I'll clean this up. > > > > > > /* > > * The RFI family of instructions are context synchronising, and > > * that is how we return to userspace, so nothing is required here. > > */ > > Thanks! > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 18:52 ` Andy Lutomirski 2021-06-16 23:48 ` Andy Lutomirski @ 2021-06-18 15:27 ` Christophe Leroy 1 sibling, 0 replies; 19+ messages in thread From: Christophe Leroy @ 2021-06-18 15:27 UTC (permalink / raw) To: Andy Lutomirski, Nicholas Piggin, x86 Cc: Dave Hansen, Peter Zijlstra, Catalin Marinas, linuxppc-dev, LKML, stable, linux-mm, Mathieu Desnoyers, Paul Mackerras, Andrew Morton, Will Deacon, linux-arm-kernel Le 16/06/2021 à 20:52, Andy Lutomirski a écrit : > On 6/15/21 9:45 PM, Nicholas Piggin wrote: >> Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm: >>> The old sync_core_before_usermode() comments suggested that a non-icache-syncing >>> return-to-usermode instruction is x86-specific and that all other >>> architectures automatically notice cross-modified code on return to >>> userspace. > >>> +/* >>> + * XXX: can a powerpc person put an appropriate comment here? >>> + */ >>> +static inline void membarrier_sync_core_before_usermode(void) >>> +{ >>> +} >>> + >>> +#endif /* _ASM_POWERPC_SYNC_CORE_H */ >> >> powerpc's can just go in asm/membarrier.h > > $ ls arch/powerpc/include/asm/membarrier.h > ls: cannot access 'arch/powerpc/include/asm/membarrier.h': No such file > or directory https://github.com/torvalds/linux/blob/master/arch/powerpc/include/asm/membarrier.h Was added by https://github.com/torvalds/linux/commit/3ccfebedd8cf54e291c809c838d8ad5cc00f5688 > > >> >> /* >> * The RFI family of instructions are context synchronising, and >> * that is how we return to userspace, so nothing is required here. >> */ > > Thanks! > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 3:21 ` [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski 2021-06-16 4:45 ` Nicholas Piggin @ 2021-06-16 10:20 ` Will Deacon 2021-06-16 23:58 ` Andy Lutomirski 2021-06-17 14:47 ` Mathieu Desnoyers 2021-06-17 15:16 ` Mathieu Desnoyers 3 siblings, 1 reply; 19+ messages in thread From: Will Deacon @ 2021-06-16 10:20 UTC (permalink / raw) To: Andy Lutomirski Cc: Catalin Marinas, linux-mm, Peter Zijlstra, x86, LKML, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Mathieu Desnoyers, Andrew Morton, linuxppc-dev, linux-arm-kernel On Tue, Jun 15, 2021 at 08:21:13PM -0700, Andy Lutomirski wrote: > The old sync_core_before_usermode() comments suggested that a non-icache-syncing > return-to-usermode instruction is x86-specific and that all other > architectures automatically notice cross-modified code on return to > userspace. > > This is misleading. The incantation needed to modify code from one > CPU and execute it on another CPU is highly architecture dependent. > On x86, according to the SDM, one must modify the code, issue SFENCE > if the modification was WC or nontemporal, and then issue a "serializing > instruction" on the CPU that will execute the code. membarrier() can do > the latter. > > On arm64 and powerpc, one must flush the icache and then flush the pipeline > on the target CPU, although the CPU manuals don't necessarily use this > language. > > So let's drop any pretense that we can have a generic way to define or > implement membarrier's SYNC_CORE operation and instead require all > architectures to define the helper and supply their own documentation as to > how to use it. This means x86, arm64, and powerpc for now. 
Let's also > rename the function from sync_core_before_usermode() to > membarrier_sync_core_before_usermode() because the precise flushing details > may very well be specific to membarrier, and even the concept of > "sync_core" in the kernel is mostly an x86-ism. > > (It may well be the case that, on real x86 processors, synchronizing the > icache (which requires no action at all) and "flushing the pipeline" is > sufficient, but trying to use this language would be confusing at best. > LFENCE does something awfully like "flushing the pipeline", but the SDM > does not permit LFENCE as an alternative to a "serializing instruction" > for this purpose.) > > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: Paul Mackerras <paulus@samba.org> > Cc: linuxppc-dev@lists.ozlabs.org > Cc: Nicholas Piggin <npiggin@gmail.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: Nicholas Piggin <npiggin@gmail.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: x86@kernel.org > Cc: stable@vger.kernel.org > Fixes: 70216e18e519 ("membarrier: Provide core serializing command, *_SYNC_CORE") > Signed-off-by: Andy Lutomirski <luto@kernel.org> > --- > .../membarrier-sync-core/arch-support.txt | 68 ++++++------------- > arch/arm64/include/asm/sync_core.h | 19 ++++++ > arch/powerpc/include/asm/sync_core.h | 14 ++++ > arch/x86/Kconfig | 1 - > arch/x86/include/asm/sync_core.h | 7 +- > arch/x86/kernel/alternative.c | 2 +- > arch/x86/kernel/cpu/mce/core.c | 2 +- > arch/x86/mm/tlb.c | 3 +- > drivers/misc/sgi-gru/grufault.c | 2 +- > drivers/misc/sgi-gru/gruhandles.c | 2 +- > drivers/misc/sgi-gru/grukservices.c | 2 +- > include/linux/sched/mm.h | 1 - > include/linux/sync_core.h | 21 ------ > init/Kconfig | 3 - > kernel/sched/membarrier.c | 15 ++-- > 15 files changed, 75 insertions(+), 87 deletions(-) > 
create mode 100644 arch/arm64/include/asm/sync_core.h > create mode 100644 arch/powerpc/include/asm/sync_core.h > delete mode 100644 include/linux/sync_core.h For the arm64 bits (docs and asm/sync_core.h): Acked-by: Will Deacon <will@kernel.org> Will ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 10:20 ` Will Deacon @ 2021-06-16 23:58 ` Andy Lutomirski 0 siblings, 0 replies; 19+ messages in thread From: Andy Lutomirski @ 2021-06-16 23:58 UTC (permalink / raw) To: Will Deacon Cc: Catalin Marinas, linux-mm, Peter Zijlstra (Intel), the arch/x86 maintainers, Linux Kernel Mailing List, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Mathieu Desnoyers, Andrew Morton, linuxppc-dev, linux-arm-kernel On Wed, Jun 16, 2021, at 3:20 AM, Will Deacon wrote: > > For the arm64 bits (docs and asm/sync_core.h): > > Acked-by: Will Deacon <will@kernel.org> > Thanks. Per Nick's suggestion, I renamed the header to membarrier.h. Unless I hear otherwise, I'll keep the ack. > Will > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-16 3:21 ` [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski 2021-06-16 4:45 ` Nicholas Piggin 2021-06-16 10:20 ` Will Deacon @ 2021-06-17 14:47 ` Mathieu Desnoyers 2021-06-18 0:12 ` Andy Lutomirski 2021-06-17 15:16 ` Mathieu Desnoyers 3 siblings, 1 reply; 19+ messages in thread From: Mathieu Desnoyers @ 2021-06-17 14:47 UTC (permalink / raw) To: Andy Lutomirski Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, x86, linux-kernel, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Andrew Morton, linuxppc-dev, linux-arm-kernel ----- On Jun 15, 2021, at 11:21 PM, Andy Lutomirski luto@kernel.org wrote: > The old sync_core_before_usermode() comments suggested that a non-icache-syncing > return-to-usermode instruction is x86-specific and that all other > architectures automatically notice cross-modified code on return to > userspace. > > This is misleading. The incantation needed to modify code from one > CPU and execute it on another CPU is highly architecture dependent. > On x86, according to the SDM, one must modify the code, issue SFENCE > if the modification was WC or nontemporal, and then issue a "serializing > instruction" on the CPU that will execute the code. membarrier() can do > the latter. > > On arm64 and powerpc, one must flush the icache and then flush the pipeline > on the target CPU, although the CPU manuals don't necessarily use this > language. > > So let's drop any pretense that we can have a generic way to define or > implement membarrier's SYNC_CORE operation and instead require all > architectures to define the helper and supply their own documentation as to > how to use it. Agreed. Documentation of the sequence of operations that need to be performed when cross-modifying code on SMP should be per-architecture. 
The documentation of the architectural effects of membarrier sync-core should be per-arch as well. > This means x86, arm64, and powerpc for now. And also arm32, as discussed in the other leg of the patchset's email thread. > Let's also > rename the function from sync_core_before_usermode() to > membarrier_sync_core_before_usermode() because the precise flushing details > may very well be specific to membarrier, and even the concept of > "sync_core" in the kernel is mostly an x86-ism. OK > [...] > > static void ipi_rseq(void *info) > { > @@ -368,12 +373,14 @@ static int membarrier_private_expedited(int flags, int > cpu_id) > smp_call_func_t ipi_func = ipi_mb; > > if (flags == MEMBARRIER_FLAG_SYNC_CORE) { > - if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) > +#ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > return -EINVAL; > +#else > if (!(atomic_read(&mm->membarrier_state) & > MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) > return -EPERM; > ipi_func = ipi_sync_core; > +#endif Please change back this #ifndef / #else / #endif within function for if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { ... } else { ... } I don't think mixing up preprocessor and code logic makes it more readable. Thanks, Mathieu > } else if (flags == MEMBARRIER_FLAG_RSEQ) { > if (!IS_ENABLED(CONFIG_RSEQ)) > return -EINVAL; > -- > 2.31.1 -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-17 14:47 ` Mathieu Desnoyers @ 2021-06-18 0:12 ` Andy Lutomirski 2021-06-18 16:31 ` Mathieu Desnoyers 0 siblings, 1 reply; 19+ messages in thread From: Andy Lutomirski @ 2021-06-18 0:12 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, x86, linux-kernel, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Andrew Morton, linuxppc-dev, linux-arm-kernel On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: > Please change back this #ifndef / #else / #endif within function for > > if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { > ... > } else { > ... > } > > I don't think mixing up preprocessor and code logic makes it more readable. I agree, but I don't know how to make the result work well. membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED case, so either I need to fake up a definition or use #ifdef. If I faked up a definition, I would want to assert, at build time, that it isn't called. I don't think we can do: static void membarrier_sync_core_before_usermode() { BUILD_BUG_IF_REACHABLE(); } ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-18 0:12 ` Andy Lutomirski @ 2021-06-18 16:31 ` Mathieu Desnoyers 2021-06-18 19:58 ` Andy Lutomirski 0 siblings, 1 reply; 19+ messages in thread From: Mathieu Desnoyers @ 2021-06-18 16:31 UTC (permalink / raw) To: Andy Lutomirski Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, x86, linux-kernel, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Andrew Morton, linuxppc-dev, linux-arm-kernel ----- On Jun 17, 2021, at 8:12 PM, Andy Lutomirski luto@kernel.org wrote: > On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: > >> Please change back this #ifndef / #else / #endif within function for >> >> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { >> ... >> } else { >> ... >> } >> >> I don't think mixing up preprocessor and code logic makes it more readable. > > I agree, but I don't know how to make the result work well. > membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED > case, so either I need to fake up a definition or use #ifdef. > > If I faked up a definition, I would want to assert, at build time, that > it isn't called. I don't think we can do: > > static void membarrier_sync_core_before_usermode() > { > BUILD_BUG_IF_REACHABLE(); > } Let's look at the context here: static void ipi_sync_core(void *info) { [....] membarrier_sync_core_before_usermode() } ^ this can be within #ifdef / #endif static int membarrier_private_expedited(int flags, int cpu_id) [...] if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) return -EINVAL; if (!(atomic_read(&mm->membarrier_state) & MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) return -EPERM; ipi_func = ipi_sync_core; All we need to make the line above work is to define an empty ipi_sync_core function in the #else case after the ipi_sync_core() function definition. Or am I missing your point ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. 
http://www.efficios.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-18 16:31 ` Mathieu Desnoyers @ 2021-06-18 19:58 ` Andy Lutomirski 2021-06-18 20:09 ` Mathieu Desnoyers 0 siblings, 1 reply; 19+ messages in thread From: Andy Lutomirski @ 2021-06-18 19:58 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra (Intel), the arch/x86 maintainers, Linux Kernel Mailing List, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Andrew Morton, linuxppc-dev, linux-arm-kernel On Fri, Jun 18, 2021, at 9:31 AM, Mathieu Desnoyers wrote: > ----- On Jun 17, 2021, at 8:12 PM, Andy Lutomirski luto@kernel.org wrote: > > > On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: > > > >> Please change back this #ifndef / #else / #endif within function for > >> > >> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { > >> ... > >> } else { > >> ... > >> } > >> > >> I don't think mixing up preprocessor and code logic makes it more readable. > > > > I agree, but I don't know how to make the result work well. > > membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED > > case, so either I need to fake up a definition or use #ifdef. > > > > If I faked up a definition, I would want to assert, at build time, that > > it isn't called. I don't think we can do: > > > > static void membarrier_sync_core_before_usermode() > > { > > BUILD_BUG_IF_REACHABLE(); > > } > > Let's look at the context here: > > static void ipi_sync_core(void *info) > { > [....] > membarrier_sync_core_before_usermode() > } > > ^ this can be within #ifdef / #endif > > static int membarrier_private_expedited(int flags, int cpu_id) > [...] 
> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) > return -EINVAL; > if (!(atomic_read(&mm->membarrier_state) & > MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) > return -EPERM; > ipi_func = ipi_sync_core; > > All we need to make the line above work is to define an empty ipi_sync_core > function in the #else case after the ipi_sync_core() function definition. > > Or am I missing your point ? Maybe? My objection is that an empty ipi_sync_core is a lie — it doesn’t sync the core. I would be fine with that if I could have the compiler statically verify that it’s not called, but I’m uncomfortable having it there if the implementation is actively incorrect. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-18 19:58 ` Andy Lutomirski @ 2021-06-18 20:09 ` Mathieu Desnoyers 2021-06-19 6:02 ` Nicholas Piggin 0 siblings, 1 reply; 19+ messages in thread From: Mathieu Desnoyers @ 2021-06-18 20:09 UTC (permalink / raw) To: Andy Lutomirski Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, x86, linux-kernel, Nicholas Piggin, Dave Hansen, Paul Mackerras, stable, Andrew Morton, linuxppc-dev, linux-arm-kernel ----- On Jun 18, 2021, at 3:58 PM, Andy Lutomirski luto@kernel.org wrote: > On Fri, Jun 18, 2021, at 9:31 AM, Mathieu Desnoyers wrote: >> ----- On Jun 17, 2021, at 8:12 PM, Andy Lutomirski luto@kernel.org wrote: >> >> > On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: >> > >> >> Please change back this #ifndef / #else / #endif within function for >> >> >> >> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { >> >> ... >> >> } else { >> >> ... >> >> } >> >> >> >> I don't think mixing up preprocessor and code logic makes it more readable. >> > >> > I agree, but I don't know how to make the result work well. >> > membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED >> > case, so either I need to fake up a definition or use #ifdef. >> > >> > If I faked up a definition, I would want to assert, at build time, that >> > it isn't called. I don't think we can do: >> > >> > static void membarrier_sync_core_before_usermode() >> > { >> > BUILD_BUG_IF_REACHABLE(); >> > } >> >> Let's look at the context here: >> >> static void ipi_sync_core(void *info) >> { >> [....] >> membarrier_sync_core_before_usermode() >> } >> >> ^ this can be within #ifdef / #endif >> >> static int membarrier_private_expedited(int flags, int cpu_id) >> [...] 
>> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) >> return -EINVAL; >> if (!(atomic_read(&mm->membarrier_state) & >> MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) >> return -EPERM; >> ipi_func = ipi_sync_core; >> >> All we need to make the line above work is to define an empty ipi_sync_core >> function in the #else case after the ipi_sync_core() function definition. >> >> Or am I missing your point ? > > Maybe? > > My objection is that an empty ipi_sync_core is a lie — it doesn’t sync the core. > I would be fine with that if I could have the compiler statically verify that > it’s not called, but I’m uncomfortable having it there if the implementation is > actively incorrect. I see. Another approach would be to implement a "setter" function to populate "ipi_func". That setter function would return -EINVAL in its #ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE implementation. Would that be better ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-18 20:09 ` Mathieu Desnoyers @ 2021-06-19 6:02 ` Nicholas Piggin 2021-06-19 15:50 ` Andy Lutomirski 0 siblings, 1 reply; 19+ messages in thread From: Nicholas Piggin @ 2021-06-19 6:02 UTC (permalink / raw) To: Andy Lutomirski, Mathieu Desnoyers Cc: Will Deacon, linux-mm, Peter Zijlstra, x86, linux-kernel, stable, Dave Hansen, Paul Mackerras, Catalin Marinas, Andrew Morton, linuxppc-dev, linux-arm-kernel Excerpts from Mathieu Desnoyers's message of June 19, 2021 6:09 am: > ----- On Jun 18, 2021, at 3:58 PM, Andy Lutomirski luto@kernel.org wrote: > >> On Fri, Jun 18, 2021, at 9:31 AM, Mathieu Desnoyers wrote: >>> ----- On Jun 17, 2021, at 8:12 PM, Andy Lutomirski luto@kernel.org wrote: >>> >>> > On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: >>> > >>> >> Please change back this #ifndef / #else / #endif within function for >>> >> >>> >> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { >>> >> ... >>> >> } else { >>> >> ... >>> >> } >>> >> >>> >> I don't think mixing up preprocessor and code logic makes it more readable. >>> > >>> > I agree, but I don't know how to make the result work well. >>> > membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED >>> > case, so either I need to fake up a definition or use #ifdef. >>> > >>> > If I faked up a definition, I would want to assert, at build time, that >>> > it isn't called. I don't think we can do: >>> > >>> > static void membarrier_sync_core_before_usermode() >>> > { >>> > BUILD_BUG_IF_REACHABLE(); >>> > } >>> >>> Let's look at the context here: >>> >>> static void ipi_sync_core(void *info) >>> { >>> [....] >>> membarrier_sync_core_before_usermode() >>> } >>> >>> ^ this can be within #ifdef / #endif >>> >>> static int membarrier_private_expedited(int flags, int cpu_id) >>> [...] 
>>> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) >>> return -EINVAL; >>> if (!(atomic_read(&mm->membarrier_state) & >>> MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) >>> return -EPERM; >>> ipi_func = ipi_sync_core; >>> >>> All we need to make the line above work is to define an empty ipi_sync_core >>> function in the #else case after the ipi_sync_core() function definition. >>> >>> Or am I missing your point ? >> >> Maybe? >> >> My objection is that an empty ipi_sync_core is a lie — it doesn’t sync the core. >> I would be fine with that if I could have the compiler statically verify that >> it’s not called, but I’m uncomfortable having it there if the implementation is >> actively incorrect. > > I see. Another approach would be to implement a "setter" function to populate > "ipi_func". That setter function would return -EINVAL in its #ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > implementation. I still don't get the problem with my suggestion. Sure the ipi is a "lie", but it doesn't get used. That's how a lot of ifdef folding works out. E.g., diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index b5add64d9698..54cb32d064af 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -5,6 +5,15 @@ * membarrier system call */ #include "sched.h" +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE +#include <asm/sync_core.h> +#else +static inline void membarrier_sync_core_before_usermode(void) +{ + compiletime_assert(0, "architecture does not implement membarrier_sync_core_before_usermode"); +} + +#endif /* * For documentation purposes, here are some membarrier ordering ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-19 6:02 ` Nicholas Piggin @ 2021-06-19 15:50 ` Andy Lutomirski 2021-06-20 2:10 ` Nicholas Piggin 0 siblings, 1 reply; 19+ messages in thread From: Andy Lutomirski @ 2021-06-19 15:50 UTC (permalink / raw) To: Nicholas Piggin, Mathieu Desnoyers Cc: Will Deacon, linux-mm, Peter Zijlstra (Intel), the arch/x86 maintainers, Linux Kernel Mailing List, stable, Dave Hansen, Paul Mackerras, Catalin Marinas, Andrew Morton, linuxppc-dev, linux-arm-kernel On Fri, Jun 18, 2021, at 11:02 PM, Nicholas Piggin wrote: > Excerpts from Mathieu Desnoyers's message of June 19, 2021 6:09 am: > > ----- On Jun 18, 2021, at 3:58 PM, Andy Lutomirski luto@kernel.org wrote: > > > >> On Fri, Jun 18, 2021, at 9:31 AM, Mathieu Desnoyers wrote: > >>> ----- On Jun 17, 2021, at 8:12 PM, Andy Lutomirski luto@kernel.org wrote: > >>> > >>> > On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: > >>> > > >>> >> Please change back this #ifndef / #else / #endif within function for > >>> >> > >>> >> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { > >>> >> ... > >>> >> } else { > >>> >> ... > >>> >> } > >>> >> > >>> >> I don't think mixing up preprocessor and code logic makes it more readable. > >>> > > >>> > I agree, but I don't know how to make the result work well. > >>> > membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED > >>> > case, so either I need to fake up a definition or use #ifdef. > >>> > > >>> > If I faked up a definition, I would want to assert, at build time, that > >>> > it isn't called. I don't think we can do: > >>> > > >>> > static void membarrier_sync_core_before_usermode() > >>> > { > >>> > BUILD_BUG_IF_REACHABLE(); > >>> > } > >>> > >>> Let's look at the context here: > >>> > >>> static void ipi_sync_core(void *info) > >>> { > >>> [....] 
> >>> membarrier_sync_core_before_usermode() > >>> } > >>> > >>> ^ this can be within #ifdef / #endif > >>> > >>> static int membarrier_private_expedited(int flags, int cpu_id) > >>> [...] > >>> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) > >>> return -EINVAL; > >>> if (!(atomic_read(&mm->membarrier_state) & > >>> MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY)) > >>> return -EPERM; > >>> ipi_func = ipi_sync_core; > >>> > >>> All we need to make the line above work is to define an empty ipi_sync_core > >>> function in the #else case after the ipi_sync_core() function definition. > >>> > >>> Or am I missing your point ? > >> > >> Maybe? > >> > >> My objection is that an empty ipi_sync_core is a lie — it doesn’t sync the core. > >> I would be fine with that if I could have the compiler statically verify that > >> it’s not called, but I’m uncomfortable having it there if the implementation is > >> actively incorrect. > > > > I see. Another approach would be to implement a "setter" function to populate > > "ipi_func". That setter function would return -EINVAL in its #ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > > implementation. > > I still don't get the problem with my suggestion. Sure the > ipi is a "lie", but it doesn't get used. That's how a lot of > ifdef folding works out. E.g., > > diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c > index b5add64d9698..54cb32d064af 100644 > --- a/kernel/sched/membarrier.c > +++ b/kernel/sched/membarrier.c > @@ -5,6 +5,15 @@ > * membarrier system call > */ > #include "sched.h" > +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE > +#include <asm/sync_core.h> > +#else > +static inline void membarrier_sync_core_before_usermode(void) > +{ > + compiletime_assert(0, "architecture does not implement > membarrier_sync_core_before_usermode"); > +} > + With the assert there, I’m fine with this. Let me see if the result builds. 
> +#endif > > /* > * For documentation purposes, here are some membarrier ordering > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation 2021-06-19 15:50 ` Andy Lutomirski @ 2021-06-20 2:10 ` Nicholas Piggin 0 siblings, 0 replies; 19+ messages in thread From: Nicholas Piggin @ 2021-06-20 2:10 UTC (permalink / raw) To: Andy Lutomirski, Mathieu Desnoyers Cc: Will Deacon, linux-mm, Peter Zijlstra (Intel), the arch/x86 maintainers, Linux Kernel Mailing List, stable, Dave Hansen, Paul Mackerras, Catalin Marinas, Andrew Morton, linuxppc-dev, linux-arm-kernel Excerpts from Andy Lutomirski's message of June 20, 2021 1:50 am: > > > On Fri, Jun 18, 2021, at 11:02 PM, Nicholas Piggin wrote: >> Excerpts from Mathieu Desnoyers's message of June 19, 2021 6:09 am: >> > ----- On Jun 18, 2021, at 3:58 PM, Andy Lutomirski luto@kernel.org wrote: >> > >> >> On Fri, Jun 18, 2021, at 9:31 AM, Mathieu Desnoyers wrote: >> >>> ----- On Jun 17, 2021, at 8:12 PM, Andy Lutomirski luto@kernel.org wrote: >> >>> >> >>> > On 6/17/21 7:47 AM, Mathieu Desnoyers wrote: >> >>> > >> >>> >> Please change back this #ifndef / #else / #endif within function for >> >>> >> >> >>> >> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) { >> >>> >> ... >> >>> >> } else { >> >>> >> ... >> >>> >> } >> >>> >> >> >>> >> I don't think mixing up preprocessor and code logic makes it more readable. >> >>> > >> >>> > I agree, but I don't know how to make the result work well. >> >>> > membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED >> >>> > case, so either I need to fake up a definition or use #ifdef. >> >>> > >> >>> > If I faked up a definition, I would want to assert, at build time, that >> >>> > it isn't called. I don't think we can do: >> >>> > >> >>> > static void membarrier_sync_core_before_usermode() >> >>> > { >> >>> > BUILD_BUG_IF_REACHABLE(); >> >>> > } >> >>> >> >>> Let's look at the context here: >> >>> >> >>> static void ipi_sync_core(void *info) >> >>> { >> >>> [....] 
>> >>>     membarrier_sync_core_before_usermode()
>> >>> }
>> >>> 
>> >>> ^ this can be within #ifdef / #endif
>> >>> 
>> >>> static int membarrier_private_expedited(int flags, int cpu_id)
>> >>> [...]
>> >>>         if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE))
>> >>>                 return -EINVAL;
>> >>>         if (!(atomic_read(&mm->membarrier_state) &
>> >>>               MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY))
>> >>>                 return -EPERM;
>> >>>         ipi_func = ipi_sync_core;
>> >>> 
>> >>> All we need to make the line above work is to define an empty ipi_sync_core
>> >>> function in the #else case after the ipi_sync_core() function definition.
>> >>> 
>> >>> Or am I missing your point ?
>> >> 
>> >> Maybe?
>> >> 
>> >> My objection is that an empty ipi_sync_core is a lie — it doesn't sync the core.
>> >> I would be fine with that if I could have the compiler statically verify that
>> >> it's not called, but I'm uncomfortable having it there if the implementation is
>> >> actively incorrect.
>> > 
>> > I see.  Another approach would be to implement a "setter" function to populate
>> > "ipi_func".  That setter function would return -EINVAL in its
>> > #ifndef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE implementation.
>> 
>> I still don't get the problem with my suggestion. Sure the
>> ipi is a "lie", but it doesn't get used. That's how a lot of
>> ifdef folding works out. E.g.,
>> 
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index b5add64d9698..54cb32d064af 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -5,6 +5,15 @@
>>   * membarrier system call
>>   */
>>  #include "sched.h"
>> +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE
>> +#include <asm/sync_core.h>
>> +#else
>> +static inline void membarrier_sync_core_before_usermode(void)
>> +{
>> +	compiletime_assert(0, "architecture does not implement
>> membarrier_sync_core_before_usermode");
>> +}
>> +
> 
> With the assert there, I'm fine with this.  Let me see if the result builds.
It had better, because compiletime_assert already relies on a similar
level of code elimination to work.

I think it's fine to use for now, but it may not be quite the logically
correct primitive if we want to be really clean, because a valid
compiletime_assert implementation should be able to fire even for code
that is never linked.

We would want something like this to be clean IMO:

#ifdef CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE
#include <asm/sync_core.h>
#else
extern void membarrier_sync_core_before_usermode(void)
	__compiletime_error("architecture does not implement membarrier_sync_core_before_usermode");
#endif

However that does not have the ifdef for an optimising compile, so AFAIKS
it could break with a false positive in some cases.

Something like compiletime_assert_not_called("msg"), which either compiles
to a noop or a __compiletime_error depending on __OPTIMIZE__, would be the
way to go IMO. I don't know if anything exists that fits, but it's
certainly not a unique thing in the kernel so I may not be looking hard
enough.

Thanks,
Nick
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation
  2021-06-16  3:21 ` [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski
                   ` (2 preceding siblings ...)
  2021-06-17 14:47 ` Mathieu Desnoyers
@ 2021-06-17 15:16 ` Mathieu Desnoyers
  2021-06-18  0:13   ` Andy Lutomirski
  3 siblings, 1 reply; 19+ messages in thread
From: Mathieu Desnoyers @ 2021-06-17 15:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, x86,
	linux-kernel, Nicholas Piggin, Dave Hansen, Paul Mackerras,
	stable, Andrew Morton, linuxppc-dev, linux-arm-kernel

----- On Jun 15, 2021, at 11:21 PM, Andy Lutomirski luto@kernel.org wrote:

[...]

> +# An architecture that wants to support
> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE needs to define precisely what it
> +# is supposed to do and implement membarrier_sync_core_before_usermode() to
> +# make it do that.  Then it can select ARCH_HAS_MEMBARRIER_SYNC_CORE via
> +# Kconfig.  Unfortunately, MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is not a
> +# fantastic API and may not make sense on all architectures.  Once an
> +# architecture meets these requirements,

Can we please remove the editorial comment about the quality of the membarrier
sync-core API?

At least it's better than having all of userspace rely on undocumented
mprotect() side-effects to perform something that typically works, until it
won't, or until it prevents mprotect's implementation from being improved
because doing so would start breaking JITs all over the place.

We can simply state that what membarrier sync-core does is defined
per-architecture, and document the sequence of operations to perform for
cross-modifying code on each architecture.
Now if there are weird architectures where membarrier is an odd fit (I've heard
that riscv might need address ranges to which the core sync needs to apply),
then those might need to implement their own arch-specific system call, which
is all fine.

> +#
> +# On x86, a program can safely modify code, issue
> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, and then execute that code, via
> +# the modified address or an alias, from any thread in the calling process.
> +#
> +# On arm64, a program can modify code, flush the icache as needed, and issue
> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE to force a "context synchronizing
> +# event", aka pipeline flush, on all CPUs that might run the calling process.
> +# Then the program can execute the modified code as long as it is executed
> +# from an address consistent with the icache flush and the CPU's cache type.
> +#
> +# On powerpc, a program can use MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
> +# similarly to arm64.  It would be nice if the powerpc maintainers could
> +# add a clearer explanation.

We should document the requirements on ARMv7 as well.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
* Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation
  2021-06-17 15:16 ` Mathieu Desnoyers
@ 2021-06-18  0:13   ` Andy Lutomirski
  0 siblings, 0 replies; 19+ messages in thread
From: Andy Lutomirski @ 2021-06-18  0:13 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Catalin Marinas, Will Deacon, linux-mm, Peter Zijlstra, x86,
	linux-kernel, Nicholas Piggin, Dave Hansen, Paul Mackerras,
	stable, Andrew Morton, linuxppc-dev, linux-arm-kernel

On 6/17/21 8:16 AM, Mathieu Desnoyers wrote:
> ----- On Jun 15, 2021, at 11:21 PM, Andy Lutomirski luto@kernel.org wrote:
> 
> [...]
> 
>> +# An architecture that wants to support
>> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE needs to define precisely what it
>> +# is supposed to do and implement membarrier_sync_core_before_usermode() to
>> +# make it do that.  Then it can select ARCH_HAS_MEMBARRIER_SYNC_CORE via
>> +# Kconfig.  Unfortunately, MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is not a
>> +# fantastic API and may not make sense on all architectures.  Once an
>> +# architecture meets these requirements,
> 
> Can we please remove the editorial comment about the quality of the membarrier
> sync-core's API ?

Done.

>> +#
>> +# On x86, a program can safely modify code, issue
>> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, and then execute that code, via
>> +# the modified address or an alias, from any thread in the calling process.
>> +#
>> +# On arm64, a program can modify code, flush the icache as needed, and issue
>> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE to force a "context synchronizing
>> +# event", aka pipeline flush on all CPUs that might run the calling process.
>> +# Then the program can execute the modified code as long as it is executed
>> +# from an address consistent with the icache flush and the CPU's cache type.
>> +#
>> +# On powerpc, a program can use MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
>> +# similarly to arm64.
>> +# It would be nice if the powerpc maintainers could
>> +# add a clearer explanation.
> 
> We should document the requirements on ARMv7 as well.

Done.

> 
> Thanks,
> 
> Mathieu
> 
Thread overview: 19+ messages (newest: 2021-06-20  2:10 UTC)
[not found] <cover.1623813516.git.luto@kernel.org>
2021-06-16 3:21 ` [PATCH 6/8] powerpc/membarrier: Remove special barrier on mm switch Andy Lutomirski
2021-06-16 4:36 ` Nicholas Piggin
2021-06-16 3:21 ` [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski
2021-06-16 4:45 ` Nicholas Piggin
2021-06-16 18:52 ` Andy Lutomirski
2021-06-16 23:48 ` Andy Lutomirski
2021-06-18 15:27 ` Christophe Leroy
2021-06-16 10:20 ` Will Deacon
2021-06-16 23:58 ` Andy Lutomirski
2021-06-17 14:47 ` Mathieu Desnoyers
2021-06-18 0:12 ` Andy Lutomirski
2021-06-18 16:31 ` Mathieu Desnoyers
2021-06-18 19:58 ` Andy Lutomirski
2021-06-18 20:09 ` Mathieu Desnoyers
2021-06-19 6:02 ` Nicholas Piggin
2021-06-19 15:50 ` Andy Lutomirski
2021-06-20 2:10 ` Nicholas Piggin
2021-06-17 15:16 ` Mathieu Desnoyers
2021-06-18 0:13 ` Andy Lutomirski