public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/3] x86/bugs: more BHI
@ 2024-05-07  5:30 Josh Poimboeuf
  2024-05-07  5:30 ` [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn Josh Poimboeuf
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-07  5:30 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

Patch 1 fixes some objtool warnings and enables noreturn-related
optimizations for direct-called syscall handlers.

Patches 2 and 3 add 'spectre_bhi=vmexit' which is useful for mitigating
BHI in cloud host environments.

v5:
- dropped syscall hardening patch for now
- dropped "Fix CPU mitigation defaults for !x86" in favor of Sean's fix
- patch 1 fixes (Paul)

Josh Poimboeuf (3):
  x86/syscall: Mark exit[_group] syscall handlers __noreturn
  x86/bugs: Remove duplicate Spectre cmdline option descriptions
  x86/bugs: Add 'spectre_bhi=vmexit' cmdline option

 Documentation/admin-guide/hw-vuln/spectre.rst | 84 ++-----------------
 .../admin-guide/kernel-parameters.txt         | 12 ++-
 arch/x86/entry/syscall_32.c                   | 10 ++-
 arch/x86/entry/syscall_64.c                   |  9 +-
 arch/x86/entry/syscall_x32.c                  |  7 +-
 arch/x86/entry/syscalls/syscall_32.tbl        |  6 +-
 arch/x86/entry/syscalls/syscall_64.tbl        |  6 +-
 arch/x86/kernel/cpu/bugs.c                    | 16 ++--
 arch/x86/um/sys_call_table_32.c               | 10 ++-
 arch/x86/um/sys_call_table_64.c               | 11 ++-
 scripts/syscalltbl.sh                         | 18 +++-
 tools/objtool/noreturns.h                     |  4 +
 12 files changed, 85 insertions(+), 108 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-05-07  5:30 [PATCH v5 0/3] x86/bugs: more BHI Josh Poimboeuf
@ 2024-05-07  5:30 ` Josh Poimboeuf
  2024-05-07 14:38   ` Paul E. McKenney
  2024-05-27 11:15   ` Nikolay Borisov
  2024-05-07  5:30 ` [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions Josh Poimboeuf
  2024-05-07  5:30 ` [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option Josh Poimboeuf
  2 siblings, 2 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-07  5:30 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar, Paul E. McKenney

The direct-call syscall dispatch function doesn't know that the exit()
and exit_group() syscall handlers don't return, so the call sites aren't
optimized accordingly.

Fix that by marking those exit syscall declarations __noreturn.

Fixes the following warnings:

  vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
  vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation

Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 arch/x86/entry/syscall_32.c            | 10 ++++++----
 arch/x86/entry/syscall_64.c            |  9 ++++++---
 arch/x86/entry/syscall_x32.c           |  7 +++++--
 arch/x86/entry/syscalls/syscall_32.tbl |  6 +++---
 arch/x86/entry/syscalls/syscall_64.tbl |  6 +++---
 arch/x86/um/sys_call_table_32.c        | 10 ++++++----
 arch/x86/um/sys_call_table_64.c        | 11 +++++++----
 scripts/syscalltbl.sh                  | 18 ++++++++++++++++--
 tools/objtool/noreturns.h              |  4 ++++
 9 files changed, 56 insertions(+), 25 deletions(-)

diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
index c2235bae17ef..8cc9950d7104 100644
--- a/arch/x86/entry/syscall_32.c
+++ b/arch/x86/entry/syscall_32.c
@@ -14,9 +14,12 @@
 #endif
 
 #define __SYSCALL(nr, sym) extern long __ia32_##sym(const struct pt_regs *);
-
+#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __ia32_##sym(const struct pt_regs *);
 #include <asm/syscalls_32.h>
-#undef __SYSCALL
+#undef  __SYSCALL
+
+#undef  __SYSCALL_NORETURN
+#define __SYSCALL_NORETURN __SYSCALL
 
 /*
  * The sys_call_table[] is no longer used for system calls, but
@@ -28,11 +31,10 @@
 const sys_call_ptr_t sys_call_table[] = {
 #include <asm/syscalls_32.h>
 };
-#undef __SYSCALL
+#undef  __SYSCALL
 #endif
 
 #define __SYSCALL(nr, sym) case nr: return __ia32_##sym(regs);
-
 long ia32_sys_call(const struct pt_regs *regs, unsigned int nr)
 {
 	switch (nr) {
diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
index 33b3f09e6f15..ba8354424860 100644
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -8,8 +8,12 @@
 #include <asm/syscall.h>
 
 #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
+#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
 #include <asm/syscalls_64.h>
-#undef __SYSCALL
+#undef  __SYSCALL
+
+#undef  __SYSCALL_NORETURN
+#define __SYSCALL_NORETURN __SYSCALL
 
 /*
  * The sys_call_table[] is no longer used for system calls, but
@@ -20,10 +24,9 @@
 const sys_call_ptr_t sys_call_table[] = {
 #include <asm/syscalls_64.h>
 };
-#undef __SYSCALL
+#undef  __SYSCALL
 
 #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
-
 long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
 {
 	switch (nr) {
diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
index 03de4a932131..fb77908f44f3 100644
--- a/arch/x86/entry/syscall_x32.c
+++ b/arch/x86/entry/syscall_x32.c
@@ -8,11 +8,14 @@
 #include <asm/syscall.h>
 
 #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
+#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
 #include <asm/syscalls_x32.h>
-#undef __SYSCALL
+#undef  __SYSCALL
+
+#undef  __SYSCALL_NORETURN
+#define __SYSCALL_NORETURN __SYSCALL
 
 #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
-
 long x32_sys_call(const struct pt_regs *regs, unsigned int nr)
 {
 	switch (nr) {
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 5f8591ce7f25..9e9a908cd50d 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -2,7 +2,7 @@
 # 32-bit system call numbers and entry vectors
 #
 # The format is:
-# <number> <abi> <name> <entry point> <compat entry point>
+# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
 #
 # The __ia32_sys and __ia32_compat_sys stubs are created on-the-fly for
 # sys_*() system calls and compat_sys_*() compat system calls if
@@ -12,7 +12,7 @@
 # The abi is always "i386" for this file.
 #
 0	i386	restart_syscall		sys_restart_syscall
-1	i386	exit			sys_exit
+1	i386	exit			sys_exit			-			noreturn
 2	i386	fork			sys_fork
 3	i386	read			sys_read
 4	i386	write			sys_write
@@ -263,7 +263,7 @@
 249	i386	io_cancel		sys_io_cancel
 250	i386	fadvise64		sys_ia32_fadvise64
 # 251 is available for reuse (was briefly sys_set_zone_reclaim)
-252	i386	exit_group		sys_exit_group
+252	i386	exit_group		sys_exit_group			-			noreturn
 253	i386	lookup_dcookie
 254	i386	epoll_create		sys_epoll_create
 255	i386	epoll_ctl		sys_epoll_ctl
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 7e8d46f4147f..5ea7387c1aa1 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -2,7 +2,7 @@
 # 64-bit system call numbers and entry vectors
 #
 # The format is:
-# <number> <abi> <name> <entry point>
+# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
 #
 # The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls
 #
@@ -68,7 +68,7 @@
 57	common	fork			sys_fork
 58	common	vfork			sys_vfork
 59	64	execve			sys_execve
-60	common	exit			sys_exit
+60	common	exit			sys_exit			-			noreturn
 61	common	wait4			sys_wait4
 62	common	kill			sys_kill
 63	common	uname			sys_newuname
@@ -239,7 +239,7 @@
 228	common	clock_gettime		sys_clock_gettime
 229	common	clock_getres		sys_clock_getres
 230	common	clock_nanosleep		sys_clock_nanosleep
-231	common	exit_group		sys_exit_group
+231	common	exit_group		sys_exit_group			-			noreturn
 232	common	epoll_wait		sys_epoll_wait
 233	common	epoll_ctl		sys_epoll_ctl
 234	common	tgkill			sys_tgkill
diff --git a/arch/x86/um/sys_call_table_32.c b/arch/x86/um/sys_call_table_32.c
index 89df5d89d664..51655133eee3 100644
--- a/arch/x86/um/sys_call_table_32.c
+++ b/arch/x86/um/sys_call_table_32.c
@@ -9,6 +9,10 @@
 #include <linux/cache.h>
 #include <asm/syscall.h>
 
+extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long,
+				      unsigned long, unsigned long,
+				      unsigned long, unsigned long);
+
 /*
  * Below you can see, in terms of #define's, the differences between the x86-64
  * and the UML syscall table.
@@ -22,15 +26,13 @@
 #define sys_vm86 sys_ni_syscall
 
 #define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, native)
+#define __SYSCALL_NORETURN __SYSCALL
 
 #define __SYSCALL(nr, sym) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
 #include <asm/syscalls_32.h>
+#undef  __SYSCALL
 
-#undef __SYSCALL
 #define __SYSCALL(nr, sym) sym,
-
-extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-
 const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
 #include <asm/syscalls_32.h>
 };
diff --git a/arch/x86/um/sys_call_table_64.c b/arch/x86/um/sys_call_table_64.c
index b0b4cfd2308c..943d414f2109 100644
--- a/arch/x86/um/sys_call_table_64.c
+++ b/arch/x86/um/sys_call_table_64.c
@@ -9,6 +9,10 @@
 #include <linux/cache.h>
 #include <asm/syscall.h>
 
+extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long,
+				      unsigned long, unsigned long,
+				      unsigned long, unsigned long);
+
 /*
  * Below you can see, in terms of #define's, the differences between the x86-64
  * and the UML syscall table.
@@ -18,14 +22,13 @@
 #define sys_iopl sys_ni_syscall
 #define sys_ioperm sys_ni_syscall
 
+#define __SYSCALL_NORETURN __SYSCALL
+
 #define __SYSCALL(nr, sym) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
 #include <asm/syscalls_64.h>
+#undef  __SYSCALL
 
-#undef __SYSCALL
 #define __SYSCALL(nr, sym) sym,
-
-extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
-
 const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
 #include <asm/syscalls_64.h>
 };
diff --git a/scripts/syscalltbl.sh b/scripts/syscalltbl.sh
index 6abe143889ef..6a903b87a7c2 100755
--- a/scripts/syscalltbl.sh
+++ b/scripts/syscalltbl.sh
@@ -54,7 +54,7 @@ nxt=0
 
 grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
 
-	while read nr abi name native compat ; do
+	while read nr abi name native compat noreturn; do
 
 		if [ $nxt -gt $nr ]; then
 			echo "error: $infile: syscall table is not sorted or duplicates the same syscall number" >&2
@@ -66,7 +66,21 @@ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
 			nxt=$((nxt + 1))
 		done
 
-		if [ -n "$compat" ]; then
+		if [ "$compat" = "-" ]; then
+			unset compat
+		fi
+
+		if [ -n "$noreturn" ]; then
+			if [ "$noreturn" != "noreturn" ]; then
+				echo "error: $infile: invalid string \"$noreturn\" in 'noreturn' column"
+				exit 1
+			fi
+			if [ -n "$compat" ]; then
+				echo "__SYSCALL_COMPAT_NORETURN($nr, $native, $compat)"
+			else
+				echo "__SYSCALL_NORETURN($nr, $native)"
+			fi
+		elif [ -n "$compat" ]; then
 			echo "__SYSCALL_WITH_COMPAT($nr, $native, $compat)"
 		elif [ -n "$native" ]; then
 			echo "__SYSCALL($nr, $native)"
diff --git a/tools/objtool/noreturns.h b/tools/objtool/noreturns.h
index 7ebf29c91184..1e8141ef1b15 100644
--- a/tools/objtool/noreturns.h
+++ b/tools/objtool/noreturns.h
@@ -7,12 +7,16 @@
  * Yes, this is unfortunate.  A better solution is in the works.
  */
 NORETURN(__fortify_panic)
+NORETURN(__ia32_sys_exit)
+NORETURN(__ia32_sys_exit_group)
 NORETURN(__kunit_abort)
 NORETURN(__module_put_and_kthread_exit)
 NORETURN(__reiserfs_panic)
 NORETURN(__stack_chk_fail)
 NORETURN(__tdx_hypercall_failed)
 NORETURN(__ubsan_handle_builtin_unreachable)
+NORETURN(__x64_sys_exit)
+NORETURN(__x64_sys_exit_group)
 NORETURN(arch_cpu_idle_dead)
 NORETURN(bch2_trans_in_restart_error)
 NORETURN(bch2_trans_restart_error)
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions
  2024-05-07  5:30 [PATCH v5 0/3] x86/bugs: more BHI Josh Poimboeuf
  2024-05-07  5:30 ` [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn Josh Poimboeuf
@ 2024-05-07  5:30 ` Josh Poimboeuf
  2024-05-07 15:04   ` Daniel Sneddon
  2024-05-07  5:30 ` [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option Josh Poimboeuf
  2 siblings, 1 reply; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-07  5:30 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

Duplicating the documentation of all the Spectre kernel cmdline options
in two separate files is unwieldy and error-prone.  Instead just add a
reference to kernel-parameters.txt from spectre.rst.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 Documentation/admin-guide/hw-vuln/spectre.rst | 84 ++-----------------
 1 file changed, 9 insertions(+), 75 deletions(-)

diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index 25a04cda4c2c..f9797ab6b38f 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -592,85 +592,19 @@ Spectre variant 2
 Mitigation control on the kernel command line
 ---------------------------------------------
 
-Spectre variant 2 mitigation can be disabled or force enabled at the
-kernel command line.
+In general the kernel selects reasonable default mitigations for the
+current CPU.
+
+Spectre default mitigations can be disabled or changed at the kernel
+command line with the following options:
 
 	nospectre_v1
-
-		[X86,PPC] Disable mitigations for Spectre Variant 1
-		(bounds check bypass). With this option data leaks are
-		possible in the system.
-
 	nospectre_v2
+	spectre_v2={option}
+	spectre_v2_user={option}
+	spectre_bhi={option}
 
-		[X86] Disable all mitigations for the Spectre variant 2
-		(indirect branch prediction) vulnerability. System may
-		allow data leaks with this option, which is equivalent
-		to spectre_v2=off.
-
-
-        spectre_v2=
-
-		[X86] Control mitigation of Spectre variant 2
-		(indirect branch speculation) vulnerability.
-		The default operation protects the kernel from
-		user space attacks.
-
-		on
-			unconditionally enable, implies
-			spectre_v2_user=on
-		off
-			unconditionally disable, implies
-		        spectre_v2_user=off
-		auto
-			kernel detects whether your CPU model is
-		        vulnerable
-
-		Selecting 'on' will, and 'auto' may, choose a
-		mitigation method at run time according to the
-		CPU, the available microcode, the setting of the
-		CONFIG_MITIGATION_RETPOLINE configuration option,
-		and the compiler with which the kernel was built.
-
-		Selecting 'on' will also enable the mitigation
-		against user space to user space task attacks.
-
-		Selecting 'off' will disable both the kernel and
-		the user space protections.
-
-		Specific mitigations can also be selected manually:
-
-                retpoline               auto pick between generic,lfence
-                retpoline,generic       Retpolines
-                retpoline,lfence        LFENCE; indirect branch
-                retpoline,amd           alias for retpoline,lfence
-                eibrs                   Enhanced/Auto IBRS
-                eibrs,retpoline         Enhanced/Auto IBRS + Retpolines
-                eibrs,lfence            Enhanced/Auto IBRS + LFENCE
-                ibrs                    use IBRS to protect kernel
-
-		Not specifying this option is equivalent to
-		spectre_v2=auto.
-
-		In general the kernel by default selects
-		reasonable mitigations for the current CPU. To
-		disable Spectre variant 2 mitigations, boot with
-		spectre_v2=off. Spectre variant 1 mitigations
-		cannot be disabled.
-
-	spectre_bhi=
-
-		[X86] Control mitigation of Branch History Injection
-		(BHI) vulnerability.  This setting affects the deployment
-		of the HW BHI control and the SW BHB clearing sequence.
-
-		on
-			(default) Enable the HW or SW mitigation as
-			needed.
-		off
-			Disable the mitigation.
-
-For spectre_v2_user see Documentation/admin-guide/kernel-parameters.txt
+For more details on the available options, refer to Documentation/admin-guide/kernel-parameters.txt
 
 Mitigation selection guide
 --------------------------
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-07  5:30 [PATCH v5 0/3] x86/bugs: more BHI Josh Poimboeuf
  2024-05-07  5:30 ` [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn Josh Poimboeuf
  2024-05-07  5:30 ` [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions Josh Poimboeuf
@ 2024-05-07  5:30 ` Josh Poimboeuf
  2024-05-07 14:58   ` Daniel Sneddon
                     ` (2 more replies)
  2 siblings, 3 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-07  5:30 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar, Maksim Davydov

In cloud environments it can be useful to *only* enable the vmexit
mitigation and leave syscalls vulnerable.  Add that as an option.

This is similar to the old spectre_bhi=auto option which was removed
with the following commit:

  36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")

with the main difference being that this has a more descriptive name and
is disabled by default.

Requested-by: Maksim Davydov <davydov-max@yandex-team.ru>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 Documentation/admin-guide/kernel-parameters.txt | 12 +++++++++---
 arch/x86/kernel/cpu/bugs.c                      | 16 +++++++++++-----
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 213d0719e2b7..9c1f63f04502 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6072,9 +6072,15 @@
 			deployment of the HW BHI control and the SW BHB
 			clearing sequence.
 
-			on   - (default) Enable the HW or SW mitigation
-			       as needed.
-			off  - Disable the mitigation.
+			on     - (default) Enable the HW or SW mitigation as
+				 needed.  This protects the kernel from
+				 both syscalls and VMs.
+			vmexit - On systems which don't have the HW mitigation
+				 available, enable the SW mitigation on vmexit
+				 ONLY.  On such systems, the host kernel is
+				 protected from VM-originated BHI attacks, but
+				 may still be vulnerable to syscall attacks.
+			off    - Disable the mitigation.
 
 	spectre_v2=	[X86,EARLY] Control mitigation of Spectre variant 2
 			(indirect branch speculation) vulnerability.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ab18185894df..6974c8c9792d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1625,6 +1625,7 @@ static bool __init spec_ctrl_bhi_dis(void)
 enum bhi_mitigations {
 	BHI_MITIGATION_OFF,
 	BHI_MITIGATION_ON,
+	BHI_MITIGATION_VMEXIT_ONLY,
 };
 
 static enum bhi_mitigations bhi_mitigation __ro_after_init =
@@ -1639,6 +1640,8 @@ static int __init spectre_bhi_parse_cmdline(char *str)
 		bhi_mitigation = BHI_MITIGATION_OFF;
 	else if (!strcmp(str, "on"))
 		bhi_mitigation = BHI_MITIGATION_ON;
+	else if (!strcmp(str, "vmexit"))
+		bhi_mitigation = BHI_MITIGATION_VMEXIT_ONLY;
 	else
 		pr_err("Ignoring unknown spectre_bhi option (%s)", str);
 
@@ -1659,19 +1662,22 @@ static void __init bhi_select_mitigation(void)
 			return;
 	}
 
+	/* Mitigate in hardware if supported */
 	if (spec_ctrl_bhi_dis())
 		return;
 
 	if (!IS_ENABLED(CONFIG_X86_64))
 		return;
 
-	/* Mitigate KVM by default */
-	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
-	pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
+	if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) {
+		pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit only\n");
+		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
+		return;
+	}
 
-	/* Mitigate syscalls when the mitigation is forced =on */
+	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall and vm exit\n");
 	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP);
-	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall\n");
+	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
 }
 
 static void __init spectre_v2_select_mitigation(void)
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-05-07  5:30 ` [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn Josh Poimboeuf
@ 2024-05-07 14:38   ` Paul E. McKenney
  2024-06-26  2:21     ` Paul E. McKenney
  2024-05-27 11:15   ` Nikolay Borisov
  1 sibling, 1 reply; 23+ messages in thread
From: Paul E. McKenney @ 2024-05-07 14:38 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

On Mon, May 06, 2024 at 10:30:04PM -0700, Josh Poimboeuf wrote:
> The direct-call syscall dispatch function doesn't know that the exit()
> and exit_group() syscall handlers don't return, so the call sites aren't
> optimized accordingly.
> 
> Fix that by marking those exit syscall declarations __noreturn.
> 
> Fixes the following warnings:
> 
>   vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
>   vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
> 
> Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
> Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
> Tested-by: Paul E. McKenney <paulmck@kernel.org>

Just reaffirming my Tested-by, and thank you!

							Thanx, Paul

> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
>  arch/x86/entry/syscall_32.c            | 10 ++++++----
>  arch/x86/entry/syscall_64.c            |  9 ++++++---
>  arch/x86/entry/syscall_x32.c           |  7 +++++--
>  arch/x86/entry/syscalls/syscall_32.tbl |  6 +++---
>  arch/x86/entry/syscalls/syscall_64.tbl |  6 +++---
>  arch/x86/um/sys_call_table_32.c        | 10 ++++++----
>  arch/x86/um/sys_call_table_64.c        | 11 +++++++----
>  scripts/syscalltbl.sh                  | 18 ++++++++++++++++--
>  tools/objtool/noreturns.h              |  4 ++++
>  9 files changed, 56 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
> index c2235bae17ef..8cc9950d7104 100644
> --- a/arch/x86/entry/syscall_32.c
> +++ b/arch/x86/entry/syscall_32.c
> @@ -14,9 +14,12 @@
>  #endif
>  
>  #define __SYSCALL(nr, sym) extern long __ia32_##sym(const struct pt_regs *);
> -
> +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __ia32_##sym(const struct pt_regs *);
>  #include <asm/syscalls_32.h>
> -#undef __SYSCALL
> +#undef  __SYSCALL
> +
> +#undef  __SYSCALL_NORETURN
> +#define __SYSCALL_NORETURN __SYSCALL
>  
>  /*
>   * The sys_call_table[] is no longer used for system calls, but
> @@ -28,11 +31,10 @@
>  const sys_call_ptr_t sys_call_table[] = {
>  #include <asm/syscalls_32.h>
>  };
> -#undef __SYSCALL
> +#undef  __SYSCALL
>  #endif
>  
>  #define __SYSCALL(nr, sym) case nr: return __ia32_##sym(regs);
> -
>  long ia32_sys_call(const struct pt_regs *regs, unsigned int nr)
>  {
>  	switch (nr) {
> diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
> index 33b3f09e6f15..ba8354424860 100644
> --- a/arch/x86/entry/syscall_64.c
> +++ b/arch/x86/entry/syscall_64.c
> @@ -8,8 +8,12 @@
>  #include <asm/syscall.h>
>  
>  #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
> +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
>  #include <asm/syscalls_64.h>
> -#undef __SYSCALL
> +#undef  __SYSCALL
> +
> +#undef  __SYSCALL_NORETURN
> +#define __SYSCALL_NORETURN __SYSCALL
>  
>  /*
>   * The sys_call_table[] is no longer used for system calls, but
> @@ -20,10 +24,9 @@
>  const sys_call_ptr_t sys_call_table[] = {
>  #include <asm/syscalls_64.h>
>  };
> -#undef __SYSCALL
> +#undef  __SYSCALL
>  
>  #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
> -
>  long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
>  {
>  	switch (nr) {
> diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
> index 03de4a932131..fb77908f44f3 100644
> --- a/arch/x86/entry/syscall_x32.c
> +++ b/arch/x86/entry/syscall_x32.c
> @@ -8,11 +8,14 @@
>  #include <asm/syscall.h>
>  
>  #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
> +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
>  #include <asm/syscalls_x32.h>
> -#undef __SYSCALL
> +#undef  __SYSCALL
> +
> +#undef  __SYSCALL_NORETURN
> +#define __SYSCALL_NORETURN __SYSCALL
>  
>  #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
> -
>  long x32_sys_call(const struct pt_regs *regs, unsigned int nr)
>  {
>  	switch (nr) {
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 5f8591ce7f25..9e9a908cd50d 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -2,7 +2,7 @@
>  # 32-bit system call numbers and entry vectors
>  #
>  # The format is:
> -# <number> <abi> <name> <entry point> <compat entry point>
> +# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
>  #
>  # The __ia32_sys and __ia32_compat_sys stubs are created on-the-fly for
>  # sys_*() system calls and compat_sys_*() compat system calls if
> @@ -12,7 +12,7 @@
>  # The abi is always "i386" for this file.
>  #
>  0	i386	restart_syscall		sys_restart_syscall
> -1	i386	exit			sys_exit
> +1	i386	exit			sys_exit			-			noreturn
>  2	i386	fork			sys_fork
>  3	i386	read			sys_read
>  4	i386	write			sys_write
> @@ -263,7 +263,7 @@
>  249	i386	io_cancel		sys_io_cancel
>  250	i386	fadvise64		sys_ia32_fadvise64
>  # 251 is available for reuse (was briefly sys_set_zone_reclaim)
> -252	i386	exit_group		sys_exit_group
> +252	i386	exit_group		sys_exit_group			-			noreturn
>  253	i386	lookup_dcookie
>  254	i386	epoll_create		sys_epoll_create
>  255	i386	epoll_ctl		sys_epoll_ctl
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index 7e8d46f4147f..5ea7387c1aa1 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -2,7 +2,7 @@
>  # 64-bit system call numbers and entry vectors
>  #
>  # The format is:
> -# <number> <abi> <name> <entry point>
> +# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
>  #
>  # The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls
>  #
> @@ -68,7 +68,7 @@
>  57	common	fork			sys_fork
>  58	common	vfork			sys_vfork
>  59	64	execve			sys_execve
> -60	common	exit			sys_exit
> +60	common	exit			sys_exit			-			noreturn
>  61	common	wait4			sys_wait4
>  62	common	kill			sys_kill
>  63	common	uname			sys_newuname
> @@ -239,7 +239,7 @@
>  228	common	clock_gettime		sys_clock_gettime
>  229	common	clock_getres		sys_clock_getres
>  230	common	clock_nanosleep		sys_clock_nanosleep
> -231	common	exit_group		sys_exit_group
> +231	common	exit_group		sys_exit_group			-			noreturn
>  232	common	epoll_wait		sys_epoll_wait
>  233	common	epoll_ctl		sys_epoll_ctl
>  234	common	tgkill			sys_tgkill
> diff --git a/arch/x86/um/sys_call_table_32.c b/arch/x86/um/sys_call_table_32.c
> index 89df5d89d664..51655133eee3 100644
> --- a/arch/x86/um/sys_call_table_32.c
> +++ b/arch/x86/um/sys_call_table_32.c
> @@ -9,6 +9,10 @@
>  #include <linux/cache.h>
>  #include <asm/syscall.h>
>  
> +extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long,
> +				      unsigned long, unsigned long,
> +				      unsigned long, unsigned long);
> +
>  /*
>   * Below you can see, in terms of #define's, the differences between the x86-64
>   * and the UML syscall table.
> @@ -22,15 +26,13 @@
>  #define sys_vm86 sys_ni_syscall
>  
>  #define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, native)
> +#define __SYSCALL_NORETURN __SYSCALL
>  
>  #define __SYSCALL(nr, sym) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
>  #include <asm/syscalls_32.h>
> +#undef  __SYSCALL
>  
> -#undef __SYSCALL
>  #define __SYSCALL(nr, sym) sym,
> -
> -extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
> -
>  const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
>  #include <asm/syscalls_32.h>
>  };
> diff --git a/arch/x86/um/sys_call_table_64.c b/arch/x86/um/sys_call_table_64.c
> index b0b4cfd2308c..943d414f2109 100644
> --- a/arch/x86/um/sys_call_table_64.c
> +++ b/arch/x86/um/sys_call_table_64.c
> @@ -9,6 +9,10 @@
>  #include <linux/cache.h>
>  #include <asm/syscall.h>
>  
> +extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long,
> +				      unsigned long, unsigned long,
> +				      unsigned long, unsigned long);
> +
>  /*
>   * Below you can see, in terms of #define's, the differences between the x86-64
>   * and the UML syscall table.
> @@ -18,14 +22,13 @@
>  #define sys_iopl sys_ni_syscall
>  #define sys_ioperm sys_ni_syscall
>  
> +#define __SYSCALL_NORETURN __SYSCALL
> +
>  #define __SYSCALL(nr, sym) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
>  #include <asm/syscalls_64.h>
> +#undef  __SYSCALL
>  
> -#undef __SYSCALL
>  #define __SYSCALL(nr, sym) sym,
> -
> -extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
> -
>  const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
>  #include <asm/syscalls_64.h>
>  };
> diff --git a/scripts/syscalltbl.sh b/scripts/syscalltbl.sh
> index 6abe143889ef..6a903b87a7c2 100755
> --- a/scripts/syscalltbl.sh
> +++ b/scripts/syscalltbl.sh
> @@ -54,7 +54,7 @@ nxt=0
>  
>  grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
>  
> -	while read nr abi name native compat ; do
> +	while read nr abi name native compat noreturn; do
>  
>  		if [ $nxt -gt $nr ]; then
>  			echo "error: $infile: syscall table is not sorted or duplicates the same syscall number" >&2
> @@ -66,7 +66,21 @@ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
>  			nxt=$((nxt + 1))
>  		done
>  
> -		if [ -n "$compat" ]; then
> +		if [ "$compat" = "-" ]; then
> +			unset compat
> +		fi
> +
> +		if [ -n "$noreturn" ]; then
> +			if [ "$noreturn" != "noreturn" ]; then
> +				echo "error: $infile: invalid string \"$noreturn\" in 'noreturn' column"
> +				exit 1
> +			fi
> +			if [ -n "$compat" ]; then
> +				echo "__SYSCALL_COMPAT_NORETURN($nr, $native, $compat)"
> +			else
> +				echo "__SYSCALL_NORETURN($nr, $native)"
> +			fi
> +		elif [ -n "$compat" ]; then
>  			echo "__SYSCALL_WITH_COMPAT($nr, $native, $compat)"
>  		elif [ -n "$native" ]; then
>  			echo "__SYSCALL($nr, $native)"
> diff --git a/tools/objtool/noreturns.h b/tools/objtool/noreturns.h
> index 7ebf29c91184..1e8141ef1b15 100644
> --- a/tools/objtool/noreturns.h
> +++ b/tools/objtool/noreturns.h
> @@ -7,12 +7,16 @@
>   * Yes, this is unfortunate.  A better solution is in the works.
>   */
>  NORETURN(__fortify_panic)
> +NORETURN(__ia32_sys_exit)
> +NORETURN(__ia32_sys_exit_group)
>  NORETURN(__kunit_abort)
>  NORETURN(__module_put_and_kthread_exit)
>  NORETURN(__reiserfs_panic)
>  NORETURN(__stack_chk_fail)
>  NORETURN(__tdx_hypercall_failed)
>  NORETURN(__ubsan_handle_builtin_unreachable)
> +NORETURN(__x64_sys_exit)
> +NORETURN(__x64_sys_exit_group)
>  NORETURN(arch_cpu_idle_dead)
>  NORETURN(bch2_trans_in_restart_error)
>  NORETURN(bch2_trans_restart_error)
> -- 
> 2.44.0
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-07  5:30 ` [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option Josh Poimboeuf
@ 2024-05-07 14:58   ` Daniel Sneddon
  2024-05-08  5:19     ` Josh Poimboeuf
  2024-05-08 15:10   ` Nikolay Borisov
  2024-05-20 13:12   ` Maksim Davydov
  2 siblings, 1 reply; 23+ messages in thread
From: Daniel Sneddon @ 2024-05-07 14:58 UTC (permalink / raw)
  To: Josh Poimboeuf, x86
  Cc: linux-kernel, Linus Torvalds, Pawan Gupta, Thomas Gleixner,
	Alexandre Chartre, Konrad Rzeszutek Wilk, Peter Zijlstra,
	Greg Kroah-Hartman, Sean Christopherson, Andrew Cooper,
	Dave Hansen, Nikolay Borisov, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar, Maksim Davydov

On 5/6/24 22:30, Josh Poimboeuf wrote:
> In cloud environments it can be useful to *only* enable the vmexit
> mitigation and leave syscalls vulnerable.  Add that as an option.
> 
> This is similar to the old spectre_bhi=auto option which was removed
> with the following commit:
> 
>   36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")
> 
> with the main difference being that this has a more descriptive name and
> is disabled by default.
> 
> Requested-by: Maksim Davydov <davydov-max@yandex-team.ru>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---

Does the KConfig option need to be updated to support this as well? Other than
that,
Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>

>  Documentation/admin-guide/kernel-parameters.txt | 12 +++++++++---
>  arch/x86/kernel/cpu/bugs.c                      | 16 +++++++++++-----
>  2 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 213d0719e2b7..9c1f63f04502 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6072,9 +6072,15 @@
>  			deployment of the HW BHI control and the SW BHB
>  			clearing sequence.
>  
> -			on   - (default) Enable the HW or SW mitigation
> -			       as needed.
> -			off  - Disable the mitigation.
> +			on     - (default) Enable the HW or SW mitigation as
> +				 needed.  This protects the kernel from
> +				 both syscalls and VMs.
> +			vmexit - On systems which don't have the HW mitigation
> +				 available, enable the SW mitigation on vmexit
> +				 ONLY.  On such systems, the host kernel is
> +				 protected from VM-originated BHI attacks, but
> +				 may still be vulnerable to syscall attacks.
> +			off    - Disable the mitigation.
>  
>  	spectre_v2=	[X86,EARLY] Control mitigation of Spectre variant 2
>  			(indirect branch speculation) vulnerability.
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index ab18185894df..6974c8c9792d 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1625,6 +1625,7 @@ static bool __init spec_ctrl_bhi_dis(void)
>  enum bhi_mitigations {
>  	BHI_MITIGATION_OFF,
>  	BHI_MITIGATION_ON,
> +	BHI_MITIGATION_VMEXIT_ONLY,
>  };
>  
>  static enum bhi_mitigations bhi_mitigation __ro_after_init =
> @@ -1639,6 +1640,8 @@ static int __init spectre_bhi_parse_cmdline(char *str)
>  		bhi_mitigation = BHI_MITIGATION_OFF;
>  	else if (!strcmp(str, "on"))
>  		bhi_mitigation = BHI_MITIGATION_ON;
> +	else if (!strcmp(str, "vmexit"))
> +		bhi_mitigation = BHI_MITIGATION_VMEXIT_ONLY;
>  	else
>  		pr_err("Ignoring unknown spectre_bhi option (%s)", str);
>  
> @@ -1659,19 +1662,22 @@ static void __init bhi_select_mitigation(void)
>  			return;
>  	}
>  
> +	/* Mitigate in hardware if supported */
>  	if (spec_ctrl_bhi_dis())
>  		return;
>  
>  	if (!IS_ENABLED(CONFIG_X86_64))
>  		return;
>  
> -	/* Mitigate KVM by default */
> -	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> -	pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
> +	if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) {
> +		pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit only\n");
> +		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> +		return;
> +	}
>  
> -	/* Mitigate syscalls when the mitigation is forced =on */
> +	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall and vm exit\n");
>  	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP);
> -	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall\n");
> +	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
>  }
>  
>  static void __init spectre_v2_select_mitigation(void)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions
  2024-05-07  5:30 ` [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions Josh Poimboeuf
@ 2024-05-07 15:04   ` Daniel Sneddon
  2024-05-08  5:55     ` Josh Poimboeuf
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel Sneddon @ 2024-05-07 15:04 UTC (permalink / raw)
  To: Josh Poimboeuf, x86
  Cc: linux-kernel, Linus Torvalds, Pawan Gupta, Thomas Gleixner,
	Alexandre Chartre, Konrad Rzeszutek Wilk, Peter Zijlstra,
	Greg Kroah-Hartman, Sean Christopherson, Andrew Cooper,
	Dave Hansen, Nikolay Borisov, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar

I love the idea here, but

>  	nospectre_v2
> +	spectre_v2={option}
> +	spectre_v2_user={option}
> +	spectre_bhi={option}
>  

this comes out as just a single line when I run make htmldocs.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-07 14:58   ` Daniel Sneddon
@ 2024-05-08  5:19     ` Josh Poimboeuf
  2024-05-27 10:45       ` Maksim Davydov
  0 siblings, 1 reply; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-08  5:19 UTC (permalink / raw)
  To: Daniel Sneddon
  Cc: x86, linux-kernel, Linus Torvalds, Pawan Gupta, Thomas Gleixner,
	Alexandre Chartre, Konrad Rzeszutek Wilk, Peter Zijlstra,
	Greg Kroah-Hartman, Sean Christopherson, Andrew Cooper,
	Dave Hansen, Nikolay Borisov, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar, Maksim Davydov

On Tue, May 07, 2024 at 07:58:07AM -0700, Daniel Sneddon wrote:
> On 5/6/24 22:30, Josh Poimboeuf wrote:
> > In cloud environments it can be useful to *only* enable the vmexit
> > mitigation and leave syscalls vulnerable.  Add that as an option.
> > 
> > This is similar to the old spectre_bhi=auto option which was removed
> > with the following commit:
> > 
> >   36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")
> > 
> > with the main difference being that this has a more descriptive name and
> > is disabled by default.
> > 
> > Requested-by: Maksim Davydov <davydov-max@yandex-team.ru>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > ---
> 
> Does the KConfig option need to be updated to support this as well?

In general we don't provide a config option for every possible
mitigation cmdline option.  If someone requests it we could add it
later.

> Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>

Thanks!

-- 
Josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions
  2024-05-07 15:04   ` Daniel Sneddon
@ 2024-05-08  5:55     ` Josh Poimboeuf
  2024-05-08 14:28       ` Daniel Sneddon
  0 siblings, 1 reply; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-08  5:55 UTC (permalink / raw)
  To: Daniel Sneddon
  Cc: x86, linux-kernel, Linus Torvalds, Pawan Gupta, Thomas Gleixner,
	Alexandre Chartre, Konrad Rzeszutek Wilk, Peter Zijlstra,
	Greg Kroah-Hartman, Sean Christopherson, Andrew Cooper,
	Dave Hansen, Nikolay Borisov, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar

On Tue, May 07, 2024 at 08:04:37AM -0700, Daniel Sneddon wrote:
> I love the idea here, but
> 
> >  	nospectre_v2
> > +	spectre_v2={option}
> > +	spectre_v2_user={option}
> > +	spectre_bhi={option}
> >  
> 
> this comes out as just a single line when I run make htmldocs.

Thanks, the below turns it into a bulleted list:

diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index f9797ab6b38f..132e0bc6007e 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -598,11 +598,11 @@ current CPU.
 Spectre default mitigations can be disabled or changed at the kernel
 command line with the following options:
 
-	nospectre_v1
-	nospectre_v2
-	spectre_v2={option}
-	spectre_v2_user={option}
-	spectre_bhi={option}
+	- nospectre_v1
+	- nospectre_v2
+	- spectre_v2={option}
+	- spectre_v2_user={option}
+	- spectre_bhi={option}
 
 For more details on the available options, refer to Documentation/admin-guide/kernel-parameters.txt
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions
  2024-05-08  5:55     ` Josh Poimboeuf
@ 2024-05-08 14:28       ` Daniel Sneddon
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Sneddon @ 2024-05-08 14:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, Linus Torvalds, Pawan Gupta, Thomas Gleixner,
	Alexandre Chartre, Konrad Rzeszutek Wilk, Peter Zijlstra,
	Greg Kroah-Hartman, Sean Christopherson, Andrew Cooper,
	Dave Hansen, Nikolay Borisov, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar

On 5/7/24 22:55, Josh Poimboeuf wrote:
> On Tue, May 07, 2024 at 08:04:37AM -0700, Daniel Sneddon wrote:
>> I love the idea here, but
>>
>>>  	nospectre_v2
>>> +	spectre_v2={option}
>>> +	spectre_v2_user={option}
>>> +	spectre_bhi={option}
>>>  
>>
>> this comes out as just a single line when I run make htmldocs.
> 
> Thanks, the below turns it into a bulleted list:
> 
> diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
> index f9797ab6b38f..132e0bc6007e 100644
> --- a/Documentation/admin-guide/hw-vuln/spectre.rst
> +++ b/Documentation/admin-guide/hw-vuln/spectre.rst
> @@ -598,11 +598,11 @@ current CPU.
>  Spectre default mitigations can be disabled or changed at the kernel
>  command line with the following options:
>  
> -	nospectre_v1
> -	nospectre_v2
> -	spectre_v2={option}
> -	spectre_v2_user={option}
> -	spectre_bhi={option}
> +	- nospectre_v1
> +	- nospectre_v2
> +	- spectre_v2={option}
> +	- spectre_v2_user={option}
> +	- spectre_bhi={option}
>  
>  For more details on the available options, refer to Documentation/admin-guide/kernel-parameters.txt
>  

Looks good.

Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-07  5:30 ` [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option Josh Poimboeuf
  2024-05-07 14:58   ` Daniel Sneddon
@ 2024-05-08 15:10   ` Nikolay Borisov
  2024-05-09  5:24     ` Josh Poimboeuf
  2024-05-20 13:12   ` Maksim Davydov
  2 siblings, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2024-05-08 15:10 UTC (permalink / raw)
  To: Josh Poimboeuf, x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar, Maksim Davydov



On 7.05.24 г. 8:30 ч., Josh Poimboeuf wrote:
> In cloud environments it can be useful to *only* enable the vmexit
> mitigation and leave syscalls vulnerable.  Add that as an option.
> 
> This is similar to the old spectre_bhi=auto option which was removed
> with the following commit:
> 
>    36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")
> 
> with the main difference being that this has a more descriptive name and
> is disabled by default.
> 
> Requested-by: Maksim Davydov <davydov-max@yandex-team.ru>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
>   Documentation/admin-guide/kernel-parameters.txt | 12 +++++++++---
>   arch/x86/kernel/cpu/bugs.c                      | 16 +++++++++++-----
>   2 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 213d0719e2b7..9c1f63f04502 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6072,9 +6072,15 @@
>   			deployment of the HW BHI control and the SW BHB
>   			clearing sequence.
>   
> -			on   - (default) Enable the HW or SW mitigation
> -			       as needed.
> -			off  - Disable the mitigation.
> +			on     - (default) Enable the HW or SW mitigation as
> +				 needed.  This protects the kernel from
> +				 both syscalls and VMs.
> +			vmexit - On systems which don't have the HW mitigation
> +				 available, enable the SW mitigation on vmexit
> +				 ONLY.  On such systems, the host kernel is
> +				 protected from VM-originated BHI attacks, but
> +				 may still be vulnerable to syscall attacks.
> +			off    - Disable the mitigation.
>   
>   	spectre_v2=	[X86,EARLY] Control mitigation of Spectre variant 2
>   			(indirect branch speculation) vulnerability.
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index ab18185894df..6974c8c9792d 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1625,6 +1625,7 @@ static bool __init spec_ctrl_bhi_dis(void)
>   enum bhi_mitigations {
>   	BHI_MITIGATION_OFF,
>   	BHI_MITIGATION_ON,
> +	BHI_MITIGATION_VMEXIT_ONLY,
>   };
>   
>   static enum bhi_mitigations bhi_mitigation __ro_after_init =
> @@ -1639,6 +1640,8 @@ static int __init spectre_bhi_parse_cmdline(char *str)
>   		bhi_mitigation = BHI_MITIGATION_OFF;
>   	else if (!strcmp(str, "on"))
>   		bhi_mitigation = BHI_MITIGATION_ON;
> +	else if (!strcmp(str, "vmexit"))
> +		bhi_mitigation = BHI_MITIGATION_VMEXIT_ONLY;
>   	else
>   		pr_err("Ignoring unknown spectre_bhi option (%s)", str);
>   
> @@ -1659,19 +1662,22 @@ static void __init bhi_select_mitigation(void)
>   			return;
>   	}
>   
> +	/* Mitigate in hardware if supported */
>   	if (spec_ctrl_bhi_dis())
>   		return;
>   
>   	if (!IS_ENABLED(CONFIG_X86_64))
>   		return;
>   
> -	/* Mitigate KVM by default */
> -	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> -	pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
> +	if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) {
> +		pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit only\n");
> +		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> +		return;
> +	}

nit: How about setting CLEAR_BHB_LOOP_ON_VMEXIT unconditionally, then 
afterwards checking if MITIGATION_VMEXIT_ONLY is set and if yes simply 
return, that way you don't duplicate the setup of the VMEXIT code

>   
> -	/* Mitigate syscalls when the mitigation is forced =on */
> +	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall and vm exit\n");
>   	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP);
> -	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall\n");
> +	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
>   }
>   
>   static void __init spectre_v2_select_mitigation(void)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-08 15:10   ` Nikolay Borisov
@ 2024-05-09  5:24     ` Josh Poimboeuf
  2024-05-09  8:21       ` Nikolay Borisov
  0 siblings, 1 reply; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-09  5:24 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar, Maksim Davydov

On Wed, May 08, 2024 at 06:10:21PM +0300, Nikolay Borisov wrote:
> > @@ -1659,19 +1662,22 @@ static void __init bhi_select_mitigation(void)
> >   			return;
> >   	}
> > +	/* Mitigate in hardware if supported */
> >   	if (spec_ctrl_bhi_dis())
> >   		return;
> >   	if (!IS_ENABLED(CONFIG_X86_64))
> >   		return;
> > -	/* Mitigate KVM by default */
> > -	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> > -	pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
> > +	if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) {
> > +		pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit only\n");
> > +		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> > +		return;
> > +	}
> 
> nit: How about setting CLEAR_BHB_LOOP_ON_VMEXIT unconditionally, then
> afterwards checking if MITIGATION_VMEXIT_ONLY is set and if yes simply
> return, that way you don't duplicate the setup of the VMEXIT code

I think the duplication actually makes it more readable.  In both cases
it puts the setting of the features together along with the
corresponding pr_info().

-- 
Josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-09  5:24     ` Josh Poimboeuf
@ 2024-05-09  8:21       ` Nikolay Borisov
  0 siblings, 0 replies; 23+ messages in thread
From: Nikolay Borisov @ 2024-05-09  8:21 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar, Maksim Davydov



On 9.05.24 г. 8:24 ч., Josh Poimboeuf wrote:
> On Wed, May 08, 2024 at 06:10:21PM +0300, Nikolay Borisov wrote:
>>> @@ -1659,19 +1662,22 @@ static void __init bhi_select_mitigation(void)
>>>    			return;
>>>    	}
>>> +	/* Mitigate in hardware if supported */
>>>    	if (spec_ctrl_bhi_dis())
>>>    		return;
>>>    	if (!IS_ENABLED(CONFIG_X86_64))
>>>    		return;
>>> -	/* Mitigate KVM by default */
>>> -	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
>>> -	pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
>>> +	if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) {
>>> +		pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit only\n");
>>> +		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
>>> +		return;
>>> +	}
>>
>> nit: How about setting CLEAR_BHB_LOOP_ON_VMEXIT unconditionally, then
>> afterwards checking if MITIGATION_VMEXIT_ONLY is set and if yes simply
>> return, that way you don't duplicate the setup of the VMEXIT code
> 
> I think the duplication actually makes it more readable.  In both cases
> it puts the setting of the features together along with the
> corresponding pr_info().

Right, my suggestion also meant that setting + pr info will be together, 
unconditional and if MITIGATION_VMEXIT_ONLY is set we return early, 
without setting X86_FEATURE_CLEAR_BHB_LOOP. In any case it's a minor 
remark, feel free to ignore.

Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>

> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-07  5:30 ` [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option Josh Poimboeuf
  2024-05-07 14:58   ` Daniel Sneddon
  2024-05-08 15:10   ` Nikolay Borisov
@ 2024-05-20 13:12   ` Maksim Davydov
  2024-05-23  1:04     ` Josh Poimboeuf
  2 siblings, 1 reply; 23+ messages in thread
From: Maksim Davydov @ 2024-05-20 13:12 UTC (permalink / raw)
  To: Josh Poimboeuf, x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

Hi!
What is the current status of the series?


On 5/7/24 08:30, Josh Poimboeuf wrote:
> In cloud environments it can be useful to *only* enable the vmexit
> mitigation and leave syscalls vulnerable.  Add that as an option.
> 
> This is similar to the old spectre_bhi=auto option which was removed
> with the following commit:
> 
>    36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")
> 
> with the main difference being that this has a more descriptive name and
> is disabled by default.
> 
> Requested-by: Maksim Davydov <davydov-max@yandex-team.ru>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
>   Documentation/admin-guide/kernel-parameters.txt | 12 +++++++++---
>   arch/x86/kernel/cpu/bugs.c                      | 16 +++++++++++-----
>   2 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 213d0719e2b7..9c1f63f04502 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6072,9 +6072,15 @@
>   			deployment of the HW BHI control and the SW BHB
>   			clearing sequence.
>   
> -			on   - (default) Enable the HW or SW mitigation
> -			       as needed.
> -			off  - Disable the mitigation.
> +			on     - (default) Enable the HW or SW mitigation as
> +				 needed.  This protects the kernel from
> +				 both syscalls and VMs.
> +			vmexit - On systems which don't have the HW mitigation
> +				 available, enable the SW mitigation on vmexit
> +				 ONLY.  On such systems, the host kernel is
> +				 protected from VM-originated BHI attacks, but
> +				 may still be vulnerable to syscall attacks.
> +			off    - Disable the mitigation.
>   
>   	spectre_v2=	[X86,EARLY] Control mitigation of Spectre variant 2
>   			(indirect branch speculation) vulnerability.
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index ab18185894df..6974c8c9792d 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1625,6 +1625,7 @@ static bool __init spec_ctrl_bhi_dis(void)
>   enum bhi_mitigations {
>   	BHI_MITIGATION_OFF,
>   	BHI_MITIGATION_ON,
> +	BHI_MITIGATION_VMEXIT_ONLY,
>   };
>   
>   static enum bhi_mitigations bhi_mitigation __ro_after_init =
> @@ -1639,6 +1640,8 @@ static int __init spectre_bhi_parse_cmdline(char *str)
>   		bhi_mitigation = BHI_MITIGATION_OFF;
>   	else if (!strcmp(str, "on"))
>   		bhi_mitigation = BHI_MITIGATION_ON;
> +	else if (!strcmp(str, "vmexit"))
> +		bhi_mitigation = BHI_MITIGATION_VMEXIT_ONLY;
>   	else
>   		pr_err("Ignoring unknown spectre_bhi option (%s)", str);
>   
> @@ -1659,19 +1662,22 @@ static void __init bhi_select_mitigation(void)
>   			return;
>   	}
>   
> +	/* Mitigate in hardware if supported */
>   	if (spec_ctrl_bhi_dis())
>   		return;
>   
>   	if (!IS_ENABLED(CONFIG_X86_64))
>   		return;
>   
> -	/* Mitigate KVM by default */
> -	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> -	pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
> +	if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) {
> +		pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit only\n");
> +		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
> +		return;
> +	}
>   
> -	/* Mitigate syscalls when the mitigation is forced =on */
> +	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall and vm exit\n");
>   	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP);
> -	pr_info("Spectre BHI mitigation: SW BHB clearing on syscall\n");
> +	setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
>   }
>   
>   static void __init spectre_v2_select_mitigation(void)

-- 
Best regards,
Maksim Davydov

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-20 13:12   ` Maksim Davydov
@ 2024-05-23  1:04     ` Josh Poimboeuf
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-05-23  1:04 UTC (permalink / raw)
  To: Maksim Davydov
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

On Mon, May 20, 2024 at 04:12:58PM +0300, Maksim Davydov wrote:
> Hi!
> What is the current status of the series?

Looks like it didn't make the merge window.  I can post a new version of
the series next week (with the minor documentation fix in patch 2).

-- 
Josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-08  5:19     ` Josh Poimboeuf
@ 2024-05-27 10:45       ` Maksim Davydov
  2024-06-26  5:58         ` Josh Poimboeuf
  0 siblings, 1 reply; 23+ messages in thread
From: Maksim Davydov @ 2024-05-27 10:45 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, Daniel Sneddon, Linus Torvalds, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar



On 5/8/24 08:19, Josh Poimboeuf wrote:
> On Tue, May 07, 2024 at 07:58:07AM -0700, Daniel Sneddon wrote:
>> On 5/6/24 22:30, Josh Poimboeuf wrote:
>>> In cloud environments it can be useful to *only* enable the vmexit
>>> mitigation and leave syscalls vulnerable.  Add that as an option.
>>>
>>> This is similar to the old spectre_bhi=auto option which was removed
>>> with the following commit:
>>>
>>>    36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")
>>>
>>> with the main difference being that this has a more descriptive name and
>>> is disabled by default.
>>>
>>> Requested-by: Maksim Davydov <davydov-max@yandex-team.ru>
>>> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
>>> ---
>>
>> Does the KConfig option need to be updated to support this as well?
> 
> In general we don't provide a config option for every possible
> mitigation cmdline option.  If someone requests it we could add it
> later.
> 
>> Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
> 
> Thanks!
> 

I think it will be useful for us to have appropriate Kconfig option. 
Could you please add it to the next version?

-- 
Best regards,
Maksim Davydov

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-05-07  5:30 ` [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn Josh Poimboeuf
  2024-05-07 14:38   ` Paul E. McKenney
@ 2024-05-27 11:15   ` Nikolay Borisov
  2024-06-26  5:21     ` Josh Poimboeuf
  1 sibling, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2024-05-27 11:15 UTC (permalink / raw)
  To: Josh Poimboeuf, x86
  Cc: linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar, Paul E. McKenney



On 7.05.24 г. 8:30 ч., Josh Poimboeuf wrote:
> The direct-call syscall dispatch function doesn't know that the exit()
> and exit_group() syscall handlers don't return, so the call sites aren't
> optimized accordingly.
> 
> Fix that by marking those exit syscall declarations __noreturn.
> 
> Fixes the following warnings:
> 
>    vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
>    vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
> 
> Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
> Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
> Tested-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
>   arch/x86/entry/syscall_32.c            | 10 ++++++----
>   arch/x86/entry/syscall_64.c            |  9 ++++++---
>   arch/x86/entry/syscall_x32.c           |  7 +++++--
>   arch/x86/entry/syscalls/syscall_32.tbl |  6 +++---
>   arch/x86/entry/syscalls/syscall_64.tbl |  6 +++---
>   arch/x86/um/sys_call_table_32.c        | 10 ++++++----
>   arch/x86/um/sys_call_table_64.c        | 11 +++++++----
>   scripts/syscalltbl.sh                  | 18 ++++++++++++++++--
>   tools/objtool/noreturns.h              |  4 ++++
>   9 files changed, 56 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
> index c2235bae17ef..8cc9950d7104 100644
> --- a/arch/x86/entry/syscall_32.c
> +++ b/arch/x86/entry/syscall_32.c
> @@ -14,9 +14,12 @@
>   #endif
>   
>   #define __SYSCALL(nr, sym) extern long __ia32_##sym(const struct pt_regs *);
> -
> +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __ia32_##sym(const struct pt_regs *);
>   #include <asm/syscalls_32.h>
> -#undef __SYSCALL
> +#undef  __SYSCALL
> +
> +#undef  __SYSCALL_NORETURN
> +#define __SYSCALL_NORETURN __SYSCALL
>   
>   /*
>    * The sys_call_table[] is no longer used for system calls, but
> @@ -28,11 +31,10 @@
>   const sys_call_ptr_t sys_call_table[] = {
>   #include <asm/syscalls_32.h>
>   };
> -#undef __SYSCALL
> +#undef  __SYSCALL

nit: Am I blind or all the __SYSCALL lines have an extra whitespace?

<snip>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-05-07 14:38   ` Paul E. McKenney
@ 2024-06-26  2:21     ` Paul E. McKenney
  2024-06-26  5:28       ` Josh Poimboeuf
  0 siblings, 1 reply; 23+ messages in thread
From: Paul E. McKenney @ 2024-06-26  2:21 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

On Tue, May 07, 2024 at 07:38:32AM -0700, Paul E. McKenney wrote:
> On Mon, May 06, 2024 at 10:30:04PM -0700, Josh Poimboeuf wrote:
> > The direct-call syscall dispatch function doesn't know that the exit()
> > and exit_group() syscall handlers don't return, so the call sites aren't
> > optimized accordingly.
> > 
> > Fix that by marking those exit syscall declarations __noreturn.
> > 
> > Fixes the following warnings:
> > 
> >   vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
> >   vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
> > 
> > Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
> > Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> > Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
> > Tested-by: Paul E. McKenney <paulmck@kernel.org>
> 
> Just reaffirming my Tested-by, and thank you!

And just following up, given that I do not yet see this in -next.  Any
chance of this making the upcoming merge window?

							Thanx, Paul

> > Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > ---
> >  arch/x86/entry/syscall_32.c            | 10 ++++++----
> >  arch/x86/entry/syscall_64.c            |  9 ++++++---
> >  arch/x86/entry/syscall_x32.c           |  7 +++++--
> >  arch/x86/entry/syscalls/syscall_32.tbl |  6 +++---
> >  arch/x86/entry/syscalls/syscall_64.tbl |  6 +++---
> >  arch/x86/um/sys_call_table_32.c        | 10 ++++++----
> >  arch/x86/um/sys_call_table_64.c        | 11 +++++++----
> >  scripts/syscalltbl.sh                  | 18 ++++++++++++++++--
> >  tools/objtool/noreturns.h              |  4 ++++
> >  9 files changed, 56 insertions(+), 25 deletions(-)
> > 
> > diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
> > index c2235bae17ef..8cc9950d7104 100644
> > --- a/arch/x86/entry/syscall_32.c
> > +++ b/arch/x86/entry/syscall_32.c
> > @@ -14,9 +14,12 @@
> >  #endif
> >  
> >  #define __SYSCALL(nr, sym) extern long __ia32_##sym(const struct pt_regs *);
> > -
> > +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __ia32_##sym(const struct pt_regs *);
> >  #include <asm/syscalls_32.h>
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> > +
> > +#undef  __SYSCALL_NORETURN
> > +#define __SYSCALL_NORETURN __SYSCALL
> >  
> >  /*
> >   * The sys_call_table[] is no longer used for system calls, but
> > @@ -28,11 +31,10 @@
> >  const sys_call_ptr_t sys_call_table[] = {
> >  #include <asm/syscalls_32.h>
> >  };
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> >  #endif
> >  
> >  #define __SYSCALL(nr, sym) case nr: return __ia32_##sym(regs);
> > -
> >  long ia32_sys_call(const struct pt_regs *regs, unsigned int nr)
> >  {
> >  	switch (nr) {
> > diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
> > index 33b3f09e6f15..ba8354424860 100644
> > --- a/arch/x86/entry/syscall_64.c
> > +++ b/arch/x86/entry/syscall_64.c
> > @@ -8,8 +8,12 @@
> >  #include <asm/syscall.h>
> >  
> >  #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
> > +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
> >  #include <asm/syscalls_64.h>
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> > +
> > +#undef  __SYSCALL_NORETURN
> > +#define __SYSCALL_NORETURN __SYSCALL
> >  
> >  /*
> >   * The sys_call_table[] is no longer used for system calls, but
> > @@ -20,10 +24,9 @@
> >  const sys_call_ptr_t sys_call_table[] = {
> >  #include <asm/syscalls_64.h>
> >  };
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> >  
> >  #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
> > -
> >  long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
> >  {
> >  	switch (nr) {
> > diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
> > index 03de4a932131..fb77908f44f3 100644
> > --- a/arch/x86/entry/syscall_x32.c
> > +++ b/arch/x86/entry/syscall_x32.c
> > @@ -8,11 +8,14 @@
> >  #include <asm/syscall.h>
> >  
> >  #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
> > +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
> >  #include <asm/syscalls_x32.h>
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> > +
> > +#undef  __SYSCALL_NORETURN
> > +#define __SYSCALL_NORETURN __SYSCALL
> >  
> >  #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
> > -
> >  long x32_sys_call(const struct pt_regs *regs, unsigned int nr)
> >  {
> >  	switch (nr) {
> > diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> > index 5f8591ce7f25..9e9a908cd50d 100644
> > --- a/arch/x86/entry/syscalls/syscall_32.tbl
> > +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> > @@ -2,7 +2,7 @@
> >  # 32-bit system call numbers and entry vectors
> >  #
> >  # The format is:
> > -# <number> <abi> <name> <entry point> <compat entry point>
> > +# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
> >  #
> >  # The __ia32_sys and __ia32_compat_sys stubs are created on-the-fly for
> >  # sys_*() system calls and compat_sys_*() compat system calls if
> > @@ -12,7 +12,7 @@
> >  # The abi is always "i386" for this file.
> >  #
> >  0	i386	restart_syscall		sys_restart_syscall
> > -1	i386	exit			sys_exit
> > +1	i386	exit			sys_exit			-			noreturn
> >  2	i386	fork			sys_fork
> >  3	i386	read			sys_read
> >  4	i386	write			sys_write
> > @@ -263,7 +263,7 @@
> >  249	i386	io_cancel		sys_io_cancel
> >  250	i386	fadvise64		sys_ia32_fadvise64
> >  # 251 is available for reuse (was briefly sys_set_zone_reclaim)
> > -252	i386	exit_group		sys_exit_group
> > +252	i386	exit_group		sys_exit_group			-			noreturn
> >  253	i386	lookup_dcookie
> >  254	i386	epoll_create		sys_epoll_create
> >  255	i386	epoll_ctl		sys_epoll_ctl
> > diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> > index 7e8d46f4147f..5ea7387c1aa1 100644
> > --- a/arch/x86/entry/syscalls/syscall_64.tbl
> > +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> > @@ -2,7 +2,7 @@
> >  # 64-bit system call numbers and entry vectors
> >  #
> >  # The format is:
> > -# <number> <abi> <name> <entry point>
> > +# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
> >  #
> >  # The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls
> >  #
> > @@ -68,7 +68,7 @@
> >  57	common	fork			sys_fork
> >  58	common	vfork			sys_vfork
> >  59	64	execve			sys_execve
> > -60	common	exit			sys_exit
> > +60	common	exit			sys_exit			-			noreturn
> >  61	common	wait4			sys_wait4
> >  62	common	kill			sys_kill
> >  63	common	uname			sys_newuname
> > @@ -239,7 +239,7 @@
> >  228	common	clock_gettime		sys_clock_gettime
> >  229	common	clock_getres		sys_clock_getres
> >  230	common	clock_nanosleep		sys_clock_nanosleep
> > -231	common	exit_group		sys_exit_group
> > +231	common	exit_group		sys_exit_group			-			noreturn
> >  232	common	epoll_wait		sys_epoll_wait
> >  233	common	epoll_ctl		sys_epoll_ctl
> >  234	common	tgkill			sys_tgkill
> > diff --git a/arch/x86/um/sys_call_table_32.c b/arch/x86/um/sys_call_table_32.c
> > index 89df5d89d664..51655133eee3 100644
> > --- a/arch/x86/um/sys_call_table_32.c
> > +++ b/arch/x86/um/sys_call_table_32.c
> > @@ -9,6 +9,10 @@
> >  #include <linux/cache.h>
> >  #include <asm/syscall.h>
> >  
> > +extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long,
> > +				      unsigned long, unsigned long,
> > +				      unsigned long, unsigned long);
> > +
> >  /*
> >   * Below you can see, in terms of #define's, the differences between the x86-64
> >   * and the UML syscall table.
> > @@ -22,15 +26,13 @@
> >  #define sys_vm86 sys_ni_syscall
> >  
> >  #define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, native)
> > +#define __SYSCALL_NORETURN __SYSCALL
> >  
> >  #define __SYSCALL(nr, sym) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
> >  #include <asm/syscalls_32.h>
> > +#undef  __SYSCALL
> >  
> > -#undef __SYSCALL
> >  #define __SYSCALL(nr, sym) sym,
> > -
> > -extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
> > -
> >  const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
> >  #include <asm/syscalls_32.h>
> >  };
> > diff --git a/arch/x86/um/sys_call_table_64.c b/arch/x86/um/sys_call_table_64.c
> > index b0b4cfd2308c..943d414f2109 100644
> > --- a/arch/x86/um/sys_call_table_64.c
> > +++ b/arch/x86/um/sys_call_table_64.c
> > @@ -9,6 +9,10 @@
> >  #include <linux/cache.h>
> >  #include <asm/syscall.h>
> >  
> > +extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long,
> > +				      unsigned long, unsigned long,
> > +				      unsigned long, unsigned long);
> > +
> >  /*
> >   * Below you can see, in terms of #define's, the differences between the x86-64
> >   * and the UML syscall table.
> > @@ -18,14 +22,13 @@
> >  #define sys_iopl sys_ni_syscall
> >  #define sys_ioperm sys_ni_syscall
> >  
> > +#define __SYSCALL_NORETURN __SYSCALL
> > +
> >  #define __SYSCALL(nr, sym) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
> >  #include <asm/syscalls_64.h>
> > +#undef  __SYSCALL
> >  
> > -#undef __SYSCALL
> >  #define __SYSCALL(nr, sym) sym,
> > -
> > -extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
> > -
> >  const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
> >  #include <asm/syscalls_64.h>
> >  };
> > diff --git a/scripts/syscalltbl.sh b/scripts/syscalltbl.sh
> > index 6abe143889ef..6a903b87a7c2 100755
> > --- a/scripts/syscalltbl.sh
> > +++ b/scripts/syscalltbl.sh
> > @@ -54,7 +54,7 @@ nxt=0
> >  
> >  grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
> >  
> > -	while read nr abi name native compat ; do
> > +	while read nr abi name native compat noreturn; do
> >  
> >  		if [ $nxt -gt $nr ]; then
> >  			echo "error: $infile: syscall table is not sorted or duplicates the same syscall number" >&2
> > @@ -66,7 +66,21 @@ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
> >  			nxt=$((nxt + 1))
> >  		done
> >  
> > -		if [ -n "$compat" ]; then
> > +		if [ "$compat" = "-" ]; then
> > +			unset compat
> > +		fi
> > +
> > +		if [ -n "$noreturn" ]; then
> > +			if [ "$noreturn" != "noreturn" ]; then
> > +				echo "error: $infile: invalid string \"$noreturn\" in 'noreturn' column"
> > +				exit 1
> > +			fi
> > +			if [ -n "$compat" ]; then
> > +				echo "__SYSCALL_COMPAT_NORETURN($nr, $native, $compat)"
> > +			else
> > +				echo "__SYSCALL_NORETURN($nr, $native)"
> > +			fi
> > +		elif [ -n "$compat" ]; then
> >  			echo "__SYSCALL_WITH_COMPAT($nr, $native, $compat)"
> >  		elif [ -n "$native" ]; then
> >  			echo "__SYSCALL($nr, $native)"
> > diff --git a/tools/objtool/noreturns.h b/tools/objtool/noreturns.h
> > index 7ebf29c91184..1e8141ef1b15 100644
> > --- a/tools/objtool/noreturns.h
> > +++ b/tools/objtool/noreturns.h
> > @@ -7,12 +7,16 @@
> >   * Yes, this is unfortunate.  A better solution is in the works.
> >   */
> >  NORETURN(__fortify_panic)
> > +NORETURN(__ia32_sys_exit)
> > +NORETURN(__ia32_sys_exit_group)
> >  NORETURN(__kunit_abort)
> >  NORETURN(__module_put_and_kthread_exit)
> >  NORETURN(__reiserfs_panic)
> >  NORETURN(__stack_chk_fail)
> >  NORETURN(__tdx_hypercall_failed)
> >  NORETURN(__ubsan_handle_builtin_unreachable)
> > +NORETURN(__x64_sys_exit)
> > +NORETURN(__x64_sys_exit_group)
> >  NORETURN(arch_cpu_idle_dead)
> >  NORETURN(bch2_trans_in_restart_error)
> >  NORETURN(bch2_trans_restart_error)
> > -- 
> > 2.44.0
> > 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-05-27 11:15   ` Nikolay Borisov
@ 2024-06-26  5:21     ` Josh Poimboeuf
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-06-26  5:21 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, KP Singh, Waiman Long,
	Borislav Petkov, Ingo Molnar, Paul E. McKenney

On Mon, May 27, 2024 at 02:15:57PM +0300, Nikolay Borisov wrote:
> 
> 
> On 7.05.24 г. 8:30 ч., Josh Poimboeuf wrote:
> > The direct-call syscall dispatch function doesn't know that the exit()
> > and exit_group() syscall handlers don't return, so the call sites aren't
> > optimized accordingly.
> > 
> > Fix that by marking those exit syscall declarations __noreturn.
> > 
> > Fixes the following warnings:
> > 
> >    vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
> >    vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
> > 
> > Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
> > Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> > Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
> > Tested-by: Paul E. McKenney <paulmck@kernel.org>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > ---
> >   arch/x86/entry/syscall_32.c            | 10 ++++++----
> >   arch/x86/entry/syscall_64.c            |  9 ++++++---
> >   arch/x86/entry/syscall_x32.c           |  7 +++++--
> >   arch/x86/entry/syscalls/syscall_32.tbl |  6 +++---
> >   arch/x86/entry/syscalls/syscall_64.tbl |  6 +++---
> >   arch/x86/um/sys_call_table_32.c        | 10 ++++++----
> >   arch/x86/um/sys_call_table_64.c        | 11 +++++++----
> >   scripts/syscalltbl.sh                  | 18 ++++++++++++++++--
> >   tools/objtool/noreturns.h              |  4 ++++
> >   9 files changed, 56 insertions(+), 25 deletions(-)
> > 
> > diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
> > index c2235bae17ef..8cc9950d7104 100644
> > --- a/arch/x86/entry/syscall_32.c
> > +++ b/arch/x86/entry/syscall_32.c
> > @@ -14,9 +14,12 @@
> >   #endif
> >   #define __SYSCALL(nr, sym) extern long __ia32_##sym(const struct pt_regs *);
> > -
> > +#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __ia32_##sym(const struct pt_regs *);
> >   #include <asm/syscalls_32.h>
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> > +
> > +#undef  __SYSCALL_NORETURN
> > +#define __SYSCALL_NORETURN __SYSCALL
> >   /*
> >    * The sys_call_table[] is no longer used for system calls, but
> > @@ -28,11 +31,10 @@
> >   const sys_call_ptr_t sys_call_table[] = {
> >   #include <asm/syscalls_32.h>
> >   };
> > -#undef __SYSCALL
> > +#undef  __SYSCALL
> 
> nit: Am I blind or all the __SYSCALL lines have an extra whitespace?
> 
> <snip>

That was a small readability tweak to make '__SYSCALL' vertically aligned:

#define __SYSCALL(nr, sym) extern long __ia32_##sym(const struct pt_regs *);
#define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __ia32_##sym(const struct pt_regs *);
#include <asm/syscalls_32.h>
#undef  __SYSCALL

-- 
Josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-06-26  2:21     ` Paul E. McKenney
@ 2024-06-26  5:28       ` Josh Poimboeuf
  2024-06-26  6:35         ` Paul E. McKenney
  2024-06-27  6:36         ` Alexandre Chartre
  0 siblings, 2 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-06-26  5:28 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

On Tue, Jun 25, 2024 at 07:21:34PM -0700, Paul E. McKenney wrote:
> On Tue, May 07, 2024 at 07:38:32AM -0700, Paul E. McKenney wrote:
> > On Mon, May 06, 2024 at 10:30:04PM -0700, Josh Poimboeuf wrote:
> > > The direct-call syscall dispatch function doesn't know that the exit()
> > > and exit_group() syscall handlers don't return, so the call sites aren't
> > > optimized accordingly.
> > > 
> > > Fix that by marking those exit syscall declarations __noreturn.
> > > 
> > > Fixes the following warnings:
> > > 
> > >   vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
> > >   vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
> > > 
> > > Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
> > > Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> > > Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
> > > Tested-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > Just reaffirming my Tested-by, and thank you!
> 
> And just following up, given that I do not yet see this in -next.  Any
> chance of this making the upcoming merge window?

Sorry for my slowness!  I'm traveling this week but let me repost this
(with your Tested-by) and grab somebody to merge it.

-- 
Josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  2024-05-27 10:45       ` Maksim Davydov
@ 2024-06-26  5:58         ` Josh Poimboeuf
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2024-06-26  5:58 UTC (permalink / raw)
  To: Maksim Davydov
  Cc: x86, linux-kernel, Daniel Sneddon, Linus Torvalds, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

On Mon, May 27, 2024 at 01:45:59PM +0300, Maksim Davydov wrote:
> I think it will be useful for us to have appropriate Kconfig option. Could
> you please add it to the next version?

That should probably be a separate patch, something like the below?

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1d7122a1883e..ab1ea701bc42 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2642,17 +2642,46 @@ config MITIGATION_RFDS
 	  stored in floating point, vector and integer registers.
 	  See also <file:Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst>
 
-config MITIGATION_SPECTRE_BHI
-	bool "Mitigate Spectre-BHB (Branch History Injection)"
+choice
+	prompt "Mitigate Spectre-BHB (Branch History Injection)"
 	depends on CPU_SUP_INTEL
-	default y
+	default MITIGATION_SPECTRE_BHI_ON
 	help
 	  Enable BHI mitigations. BHI attacks are a form of Spectre V2 attacks
 	  where the branch history buffer is poisoned to speculatively steer
 	  indirect branches.
+
+	  The compile-time default can be set to on, vmexit, or off,
+	  corresponding to the "spectre_bhi=" cmdline defaults described in
+	  Documentation/admin-guide/kernel-parameters.rst.  The cmdline
+	  options can be used to override this compile-time default.
+
 	  See <file:Documentation/admin-guide/hw-vuln/spectre.rst>
 
-endif
+config MITIGATION_SPECTRE_BHI_ON
+	bool "on"
+	help
+	  Enable the HW or SW mitigation as needed.  This protects the kernel
+	  from both syscalls and VMs.  Equivalent to the spectre_bhi=on cmdline
+	  option.
+
+config MITIGATION_SPECTRE_BHI_VMEXIT
+	bool "vmexit"
+	help
+	  On systems which don't have the HW mitigation available, enable the
+	  SW mitigation on vmexit ONLY.  On such systems, the host kernel is
+	  protected from VM-originated BHI attacks, but may still be vulnerable
+	  to syscall attacks.  Equivalent to the spectre_bhi=vmexit cmdline
+	  option.
+
+config MITIGATION_SPECTRE_BHI_OFF
+	bool "off"
+	help
+	  Disable the mitigation.  Equivalent to the spectre_bhi=off cmdline
+	  option.
+endchoice
+
+endif # CPU_MITIGATIONS
 
 config ARCH_HAS_ADD_PAGES
 	def_bool y
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 94bcf29df465..d415f24b7169 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1628,8 +1628,13 @@ enum bhi_mitigations {
 	BHI_MITIGATION_VMEXIT_ONLY,
 };
 
-static enum bhi_mitigations bhi_mitigation __ro_after_init =
-	IS_ENABLED(CONFIG_MITIGATION_SPECTRE_BHI) ? BHI_MITIGATION_ON : BHI_MITIGATION_OFF;
+#ifdef CONFIG_MITIGATION_SPECTRE_BHI_ON
+static enum bhi_mitigations bhi_mitigation __ro_after_init = BHI_MITIGATION_ON;
+#elif CONFIG_MITIGATION_SPECTRE_BHI_VMEXIT
+static enum bhi_mitigations bhi_mitigation __ro_after_init = BHI_MITIGATION_VMEXIT;
+#else
+static enum bhi_mitigations bhi_mitigation __ro_after_init = BHI_MITIGATION_OFF;
+#endif
 
 static int __init spectre_bhi_parse_cmdline(char *str)
 {

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-06-26  5:28       ` Josh Poimboeuf
@ 2024-06-26  6:35         ` Paul E. McKenney
  2024-06-27  6:36         ` Alexandre Chartre
  1 sibling, 0 replies; 23+ messages in thread
From: Paul E. McKenney @ 2024-06-26  6:35 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, Linus Torvalds, Daniel Sneddon, Pawan Gupta,
	Thomas Gleixner, Alexandre Chartre, Konrad Rzeszutek Wilk,
	Peter Zijlstra, Greg Kroah-Hartman, Sean Christopherson,
	Andrew Cooper, Dave Hansen, Nikolay Borisov, KP Singh,
	Waiman Long, Borislav Petkov, Ingo Molnar

On Tue, Jun 25, 2024 at 10:28:25PM -0700, Josh Poimboeuf wrote:
> On Tue, Jun 25, 2024 at 07:21:34PM -0700, Paul E. McKenney wrote:
> > On Tue, May 07, 2024 at 07:38:32AM -0700, Paul E. McKenney wrote:
> > > On Mon, May 06, 2024 at 10:30:04PM -0700, Josh Poimboeuf wrote:
> > > > The direct-call syscall dispatch function doesn't know that the exit()
> > > > and exit_group() syscall handlers don't return, so the call sites aren't
> > > > optimized accordingly.
> > > > 
> > > > Fix that by marking those exit syscall declarations __noreturn.
> > > > 
> > > > Fixes the following warnings:
> > > > 
> > > >   vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
> > > >   vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
> > > > 
> > > > Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
> > > > Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> > > > Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
> > > > Tested-by: Paul E. McKenney <paulmck@kernel.org>
> > > 
> > > Just reaffirming my Tested-by, and thank you!
> > 
> > And just following up, given that I do not yet see this in -next.  Any
> > chance of this making the upcoming merge window?
> 
> Sorry for my slowness!  I'm traveling this week but let me repost this
> (with your Tested-by) and grab somebody to merge it.

No worries, and thank you!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn
  2024-06-26  5:28       ` Josh Poimboeuf
  2024-06-26  6:35         ` Paul E. McKenney
@ 2024-06-27  6:36         ` Alexandre Chartre
  1 sibling, 0 replies; 23+ messages in thread
From: Alexandre Chartre @ 2024-06-27  6:36 UTC (permalink / raw)
  To: Josh Poimboeuf, Paul E. McKenney
  Cc: alexandre.chartre, x86, linux-kernel, Linus Torvalds,
	Daniel Sneddon, Pawan Gupta, Thomas Gleixner,
	Konrad Rzeszutek Wilk, Peter Zijlstra, Greg Kroah-Hartman,
	Sean Christopherson, Andrew Cooper, Dave Hansen, Nikolay Borisov,
	KP Singh, Waiman Long, Borislav Petkov, Ingo Molnar


On 6/26/24 07:28, Josh Poimboeuf wrote:
> On Tue, Jun 25, 2024 at 07:21:34PM -0700, Paul E. McKenney wrote:
>> On Tue, May 07, 2024 at 07:38:32AM -0700, Paul E. McKenney wrote:
>>> On Mon, May 06, 2024 at 10:30:04PM -0700, Josh Poimboeuf wrote:
>>>> The direct-call syscall dispatch function doesn't know that the exit()
>>>> and exit_group() syscall handlers don't return, so the call sites aren't
>>>> optimized accordingly.
>>>>
>>>> Fix that by marking those exit syscall declarations __noreturn.
>>>>
>>>> Fixes the following warnings:
>>>>
>>>>    vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
>>>>    vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation
>>>>
>>>> Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")
>>>> Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
>>>> Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
>>>> Tested-by: Paul E. McKenney <paulmck@kernel.org>
>>>
>>> Just reaffirming my Tested-by, and thank you!
>>
>> And just following up, given that I do not yet see this in -next.  Any
>> chance of this making the upcoming merge window?
> 
> Sorry for my slowness!  I'm traveling this week but let me repost this
> (with your Tested-by) and grab somebody to merge it.
> 

Hi Josh,

I have another BHI related patch which has been reviewed:

  [PATCH v2] x86/bhi: BHI mitigation can trigger warning in #DB handler
  https://lore.kernel.org/kvm/20240524070459.3674025-1-alexandre.chartre@oracle.com/T/#rd3e9a5aadc18931f777b9a9e9c71f1efb178b7ef


Any chance it can be merged as well?

Thanks,

alex.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-06-27  6:37 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-07  5:30 [PATCH v5 0/3] x86/bugs: more BHI Josh Poimboeuf
2024-05-07  5:30 ` [PATCH v5 1/3] x86/syscall: Mark exit[_group] syscall handlers __noreturn Josh Poimboeuf
2024-05-07 14:38   ` Paul E. McKenney
2024-06-26  2:21     ` Paul E. McKenney
2024-06-26  5:28       ` Josh Poimboeuf
2024-06-26  6:35         ` Paul E. McKenney
2024-06-27  6:36         ` Alexandre Chartre
2024-05-27 11:15   ` Nikolay Borisov
2024-06-26  5:21     ` Josh Poimboeuf
2024-05-07  5:30 ` [PATCH v5 2/3] x86/bugs: Remove duplicate Spectre cmdline option descriptions Josh Poimboeuf
2024-05-07 15:04   ` Daniel Sneddon
2024-05-08  5:55     ` Josh Poimboeuf
2024-05-08 14:28       ` Daniel Sneddon
2024-05-07  5:30 ` [PATCH v5 3/3] x86/bugs: Add 'spectre_bhi=vmexit' cmdline option Josh Poimboeuf
2024-05-07 14:58   ` Daniel Sneddon
2024-05-08  5:19     ` Josh Poimboeuf
2024-05-27 10:45       ` Maksim Davydov
2024-06-26  5:58         ` Josh Poimboeuf
2024-05-08 15:10   ` Nikolay Borisov
2024-05-09  5:24     ` Josh Poimboeuf
2024-05-09  8:21       ` Nikolay Borisov
2024-05-20 13:12   ` Maksim Davydov
2024-05-23  1:04     ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox