* [PATCH v11 0/9] x86: Enable Linear Address Space Separation support
@ 2025-10-29 21:03 Sohil Mehta
  2025-10-29 21:03 ` [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits Sohil Mehta
                   ` (8 more replies)
  0 siblings, 9 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

Changes in v11
--------------
Based on internal suggestions, I significantly trimmed down the series
to make reviewing and merging easier. It still has all the essential
changes to enable the base LASS support.

Improvements:
 - Separate out a patch to explain LASS dependency on SMAP (patch 2)
 - Use lass_disable/enable() instead of lass_stac()/clac() (patch 4)
 - New patch to fix a vsyscall selftest that expects a #PF (patch 8)
 - Solidify commit logs and code comments

Removals:
 - Vsyscall emulation: Legacy vsyscalls aren't required in newer
   environments. For now, LASS is only enabled when vsyscall emulation
   support is absent.

 - CR pinning: Pinning CR4.LASS isn't strictly necessary because it
   only mitigates speculative attacks beyond SMAP. The only benefit
   would be to get a warning if LASS is accidentally disabled.

 - #SS LASS hints: Kernel stack segment faults are very rare. Also, in
   most cases, the faulting instruction is unlikely to have a memory
   operand. There isn't any compelling reason to add LASS hints for
   these right now.

Dropping the non-essential patches reduces the patch count and code
changes by ~50%. I am planning to pursue these later after the base
series has merged.

Important changes in v10
------------------------
 - Use the simplified versions of inline memcpy/memset    (patch 3)
 - New patch to fix an issue during Kexec relocate kernel (patch 6)

v10: https://lore.kernel.org/lkml/20251007065119.148605-1-sohil.mehta@intel.com/

Patch structure
---------------
Patch   1-2: Enumerate LASS and its dependency on SMAP
Patch   3-4: Update text poking
Patch   5-6: Update EFI and kexec flows
Patch   7-8: Expecting a #GP instead of #PF
Patch     9: Enable LASS (without Vsyscall)

Please consider providing review tags/acks for patches that seem ready.

Background
----------
Linear Address Space Separation (LASS) is a security feature [1] that
works prior to page-walks to prevent a class of side-channel attacks
that rely on speculative access across the user/kernel boundary.

Privilege mode based access protection already exists today with paging
and features such as SMEP and SMAP. However, to enforce these
protections, the processor must traverse the paging structures in
memory.  An attacker can use timing information resulting from this
traversal to determine details about the paging structures, and to
determine the layout of the kernel memory.

The LASS mechanism provides the same mode-based protections as paging,
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, an attacker will not be able
to derive timing information from the various caching structures such as
the TLBs, mid-level caches, page walkers, data caches, etc. LASS can
prevent probing using double page faults, TLB flush and reload, and
software prefetch instructions. See [2], [3], and [4] for research
on the related attack vectors.

Though LASS was developed in response to Meltdown, in hindsight, it
alone could have mitigated Meltdown had it been available. In addition,
LASS prevents an attack vector targeting Linear Address Masking (LAM)
described in the Spectre LAM (SLAM) whitepaper [5].

LASS enforcement relies on the typical kernel implementation dividing
the 64-bit virtual address space into two halves:
  Addr[63]=0 -> User address space
  Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically
results in a #GP, with an #SS generated in some rare cases.
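
As a rough illustration of the split above, the check reduces to testing
bit 63 of the linear address. This is a user-space sketch only; the
helper names are hypothetical and do not exist in the series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the series: true when Addr[63]=1. */
static inline bool addr_in_kernel_half(uint64_t addr)
{
	return (addr >> 63) & 1;
}

/*
 * Hypothetical helper: an access crosses the user/kernel boundary when
 * the privilege mode and the address half disagree.
 */
static inline bool crosses_user_kernel_boundary(uint64_t addr, bool user_mode)
{
	return addr_in_kernel_half(addr) == user_mode;
}
```

For instance, a kernel-mode access to 0x100000 crosses the boundary,
while the same access from user mode does not.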

Kernel accesses
---------------
When there are valid reasons for the kernel to access memory in the user
half, it can temporarily suspend LASS enforcement by toggling the
RFLAGS.AC bit. Most of these cases are already covered today through the
stac()/clac() pairs, which avoid SMAP violations. However, there are
kernel usages, such as text poking, that access mappings (!_PAGE_USER)
in the lower half of the address space. LASS-specific AC bit toggling is
added for these cases.

There are a couple of cases where instruction fetches are done from a
lower address. Toggling the AC bit is not sufficient here because it
only manages data accesses. Therefore, CR4.LASS is modified in the case
of EFI set_virtual_address_map() and kexec relocate_kernel() to avoid
LASS violations.
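
The rules described so far can be summarized in a small user-space
model. It is a simplified sketch under stated assumptions, not the
architectural definition: the struct, enum, and function names are
made up for illustration, and corner cases such as implicit supervisor
accesses are deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

enum access_type { ACCESS_DATA, ACCESS_FETCH };

/* Hypothetical snapshot of the state that the model looks at. */
struct cpu_state {
	bool cr4_lass;   /* CR4.LASS */
	bool cr4_smap;   /* CR4.SMAP */
	bool rflags_ac;  /* RFLAGS.AC */
	bool user_mode;  /* CPL == 3 */
};

/* Simplified model: returns true when LASS would block the access. */
static bool lass_blocks(const struct cpu_state *cpu, uint64_t addr,
			enum access_type type)
{
	bool kernel_half = addr >> 63;

	if (!cpu->cr4_lass)
		return false;

	/* User mode may never touch the kernel half. */
	if (cpu->user_mode)
		return kernel_half;

	/* Kernel mode accessing the kernel half is not a LASS matter. */
	if (kernel_half)
		return false;

	/* Instruction fetches from the user half: AC does not help. */
	if (type == ACCESS_FETCH)
		return true;

	/* Data accesses: enforced only with SMAP set and AC clear. */
	return cpu->cr4_smap && !cpu->rflags_ac;
}
```

In this model, setting rflags_ac unblocks a kernel data access to the
lower half but leaves an instruction fetch blocked, which is why the
EFI and kexec paths clear CR4.LASS instead.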

Exception handling
------------------
With LASS enabled, NULL pointer dereferences generate a #GP instead of a
#PF. Due to the limited error information available during #GP, some of
the helpful hints would no longer be printed. The patches enhance the
#GP address decoding logic to identify LASS violations and NULL pointer
exceptions.

For example, two invalid userspace accesses would now generate:
#PF (without LASS):
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  BUG: unable to handle page fault for address: 0000000000100000

#GP (with LASS):
  Oops: general protection fault, kernel NULL pointer dereference 0x0: 0000
  Oops: general protection fault, probably LASS violation for address 0x100000: 0000

Similar debug hints can be added for the #SS handling as well. But
running into a #SS is very rare and the complexity isn't worth it.
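
To make the two #GP hints above concrete, here is a hedged sketch of
how a decoded fault address could be sorted into them. It is not the
code from patch 7; the names, the 4 KiB page size, and the exact
thresholds are assumptions chosen for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumption: 4 KiB pages */

/* Hypothetical hint categories, mirroring the messages shown above. */
enum gp_hint_sketch { HINT_NULL_DEREF, HINT_LASS_VIOLATION, HINT_NONE };

static enum gp_hint_sketch classify_gp_address(uint64_t addr)
{
	/* Addresses in the first page are treated as NULL dereferences. */
	if (addr < SKETCH_PAGE_SIZE)
		return HINT_NULL_DEREF;

	/* Any other lower-half address is a probable LASS violation. */
	if (!(addr >> 63))
		return HINT_LASS_VIOLATION;

	return HINT_NONE;
}
```

With these assumptions, address 0x0 maps to the NULL pointer hint and
0x100000 maps to the LASS violation hint, matching the example output.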

Userspace accesses
------------------
When LASS is enabled, userspace attempts to access any kernel address
generate a #GP instead of a #PF. A SIGSEGV is delivered to userspace in
both cases. However, the exception address present in the siginfo
structure for a #PF is absent for a #GP. This is a minor change that is
expected to be inconsequential for userspace.
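
A userspace program can observe this difference by inspecting si_addr
in a SIGSEGV handler. The following is a minimal sketch assuming a
64-bit Linux system; the helper names are hypothetical, and the value
reported in si_addr depends on whether the fault arrived as a #PF or a
#GP, so no particular value is claimed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

static sigjmp_buf probe_env;
static void * volatile probe_si_addr;

/* Record the address reported in siginfo and unwind out of the fault. */
static void probe_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	probe_si_addr = info->si_addr;
	siglongjmp(probe_env, 1);
}

/*
 * Hypothetical helper: read one byte from addr. Returns 1 if SIGSEGV
 * was delivered (storing siginfo's si_addr in *si_addr_out), 0 if the
 * read succeeded, and -1 if the handler could not be installed.
 */
static int probe_read_faults(uintptr_t addr, void **si_addr_out)
{
	struct sigaction sa, old;
	int faulted = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = probe_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old))
		return -1;

	if (sigsetjmp(probe_env, 1) == 0) {
		(void)*(volatile const unsigned char *)addr;
	} else {
		faulted = 1;
		*si_addr_out = probe_si_addr;
	}

	sigaction(SIGSEGV, &old, NULL);
	return faulted;
}
```

Calling probe_read_faults() with a kernel-half address such as
0xffff800000000000 is expected to report a fault either way; only the
recorded si_addr differs between the #PF and #GP cases.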

Legacy vsyscalls
----------------
Legacy vsyscall functions are located in the address range
0xffffffffff600000 - 0xffffffffff601000. Prior to LASS, accesses to the
vsyscall page would generate a #PF, and they would be emulated in the
#PF handler. Extending the emulation support to the #GP handler needs
complex instruction decoding and some refactoring.

Modern environments do not require legacy vsyscalls. To avoid breaking
user applications, LASS is disabled if vsyscall emulation support is
compiled in. Though this limits the initial LASS deployment, it makes
the merge considerably easier.

In the future, the restriction can be relaxed by extending the emulation in
XONLY mode to the #GP handler.
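
For reference, recognizing a vsyscall address is a simple range check.
The sketch below uses hypothetical names and assumes the upper bound
quoted above is exclusive (a single 4 KiB page):

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VSYSCALL_START 0xffffffffff600000ULL
#define SKETCH_VSYSCALL_END   0xffffffffff601000ULL  /* assumed exclusive */

/* Hypothetical helper: true when addr falls inside the vsyscall page. */
static bool sketch_is_vsyscall_addr(uint64_t addr)
{
	return addr >= SKETCH_VSYSCALL_START && addr < SKETCH_VSYSCALL_END;
}
```

Every address in this range has bit 63 set, so a userspace access to it
is a LASS violation and raises a #GP before the #PF-based emulation can
run.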

Links
-----
[1]: "Linear-Address Pre-Processing", Intel SDM (June 2025), Vol 3, Chapter 4.
[2]: "Practical Timing Side Channel Attacks against Kernel Space ASLR", https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[3]: "Prefetch Side-Channel Attacks: Bypassing SMAP and Kernel ASLR", http://doi.acm.org/10.1145/2976749.2978356
[4]: "Harmful prefetch on Intel", https://ioactive.com/harmful-prefetch-on-intel/ (H/T Anders)
[5]: "Spectre LAM", https://download.vusec.net/papers/slam_sp24.pdf

Alexander Shishkin (2):
  x86/efi: Disable LASS while mapping the EFI runtime services
  x86/traps: Communicate a LASS violation in #GP message

Peter Zijlstra (Intel) (1):
  x86/asm: Introduce inline memcpy and memset

Sohil Mehta (6):
  x86/cpufeatures: Enumerate the LASS feature bits
  x86/cpu: Add an LASS dependency on SMAP
  x86/alternatives: Disable LASS when patching kernel code
  x86/kexec: Disable LASS during relocate kernel
  selftests/x86: Update the negative vsyscall tests to expect a #GP
  x86/cpu: Enable LASS by default during CPU initialization

 arch/x86/Kconfig.cpufeatures                |  4 ++
 arch/x86/include/asm/cpufeatures.h          |  1 +
 arch/x86/include/asm/smap.h                 | 41 ++++++++++++++++++-
 arch/x86/include/asm/string.h               | 26 ++++++++++++
 arch/x86/include/uapi/asm/processor-flags.h |  2 +
 arch/x86/kernel/alternative.c               | 18 ++++++++-
 arch/x86/kernel/cpu/common.c                | 21 +++++++++-
 arch/x86/kernel/cpu/cpuid-deps.c            |  1 +
 arch/x86/kernel/relocate_kernel_64.S        |  7 +++-
 arch/x86/kernel/traps.c                     | 45 +++++++++++++++------
 arch/x86/platform/efi/efi.c                 | 14 ++++++-
 tools/testing/selftests/x86/test_vsyscall.c | 21 +++++-----
 12 files changed, 172 insertions(+), 29 deletions(-)


base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
-- 
2.43.0



* [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:03   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 2/9] x86/cpu: Add an LASS dependency on SMAP Sohil Mehta
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

Linear Address Space Separation (LASS) is a security feature that
mitigates a class of side-channel attacks relying on speculative access
across the user/kernel boundary.

Privilege mode based access protection already exists today with paging
and features such as SMEP and SMAP. However, to enforce these
protections, the processor must traverse the paging structures in
memory. An attacker can use timing information resulting from this
traversal to determine details about the paging structures, and to
determine the layout of the kernel memory.

LASS provides the same mode-based protections as paging but without
traversing the paging structures. Because the protections are enforced
prior to page-walks, an attacker will not be able to derive paging-based
timing information from the various caching structures such as the TLBs,
mid-level caches, page walker, data caches, etc.

LASS enforcement relies on the kernel implementation to divide the
64-bit virtual address space into two halves:
  Addr[63]=0 -> User address space
  Addr[63]=1 -> Kernel address space

Any data access or code execution across address spaces typically
results in a #GP fault, with an #SS generated in some rare cases. The
LASS enforcement for kernel data accesses is dependent on CR4.SMAP being
set. The enforcement can be disabled by toggling the RFLAGS.AC bit
similar to SMAP.

Define the CPU feature bits to enumerate LASS. Also, disable the feature
at compile time on 32-bit kernels. Use a direct dependency on X86_32
(instead of !X86_64) to make it easier to combine with similar 32-bit
specific dependencies in the future.

LASS mitigates a class of side-channel speculative attacks, such as
Spectre LAM, described in the paper, "Leaky Address Masking: Exploiting
Unmasked Spectre Gadgets with Noncanonical Address Translation".

Add the "lass" flag to /proc/cpuinfo to indicate that the feature is
supported by hardware and enabled by the kernel. This allows userspace
to determine if the system is secure against such attacks.
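
As an illustration of how userspace might consume the flag, the sketch
below does a whole-word match on a "flags" line as read from
/proc/cpuinfo. The helper name is hypothetical and reading the file is
left to the caller:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: whole-word match of flag within a flags line. */
static bool flags_line_has(const char *line, const char *flag)
{
	size_t n = strlen(flag);
	const char *p = line;

	if (n == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		/* The match must be delimited on both sides. */
		bool start_ok = (p == line) || p[-1] == ' ' ||
				p[-1] == '\t' || p[-1] == ':';
		bool end_ok = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';

		if (start_ok && end_ok)
			return true;
		p += n;
	}
	return false;
}
```

The delimiter checks keep a longer flag name that merely contains
"lass" from being reported as a match.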

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Xin Li (Intel) <xin@zytor.com>
---
v11:
 - Split the SMAP dependency hunk into a separate patch (patch 2).
 - Improve commit message.

v10:
 - Do not modify tools/**/cpufeatures.h as those are synced separately.
---
 arch/x86/Kconfig.cpufeatures                | 4 ++++
 arch/x86/include/asm/cpufeatures.h          | 1 +
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/arch/x86/Kconfig.cpufeatures b/arch/x86/Kconfig.cpufeatures
index 250c10627ab3..733d5aff2456 100644
--- a/arch/x86/Kconfig.cpufeatures
+++ b/arch/x86/Kconfig.cpufeatures
@@ -124,6 +124,10 @@ config X86_DISABLED_FEATURE_PCID
 	def_bool y
 	depends on !X86_64
 
+config X86_DISABLED_FEATURE_LASS
+	def_bool y
+	depends on X86_32
+
 config X86_DISABLED_FEATURE_PKU
 	def_bool y
 	depends on !X86_INTEL_MEMORY_PROTECTION_KEYS
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 4091a776e37a..8d872eb08c16 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -314,6 +314,7 @@
 #define X86_FEATURE_SM4			(12*32+ 2) /* SM4 instructions */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS		(12*32+ 6) /* "lass" Linear Address Space Separation */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* CMPccXADD instructions */
 #define X86_FEATURE_ARCH_PERFMON_EXT	(12*32+ 8) /* Intel Architectural PerfMon Extension */
 #define X86_FEATURE_FZRM		(12*32+10) /* Fast zero-length REP MOVSB */
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index f1a4adc78272..81d0c8bf1137 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -136,6 +136,8 @@
 #define X86_CR4_PKE		_BITUL(X86_CR4_PKE_BIT)
 #define X86_CR4_CET_BIT		23 /* enable Control-flow Enforcement Technology */
 #define X86_CR4_CET		_BITUL(X86_CR4_CET_BIT)
+#define X86_CR4_LASS_BIT	27 /* enable Linear Address Space Separation support */
+#define X86_CR4_LASS		_BITUL(X86_CR4_LASS_BIT)
 #define X86_CR4_LAM_SUP_BIT	28 /* LAM for supervisor pointers */
 #define X86_CR4_LAM_SUP		_BITUL(X86_CR4_LAM_SUP_BIT)
 
-- 
2.43.0



* [PATCH v11 2/9] x86/cpu: Add an LASS dependency on SMAP
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
  2025-10-29 21:03 ` [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:04   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 3/9] x86/asm: Introduce inline memcpy and memset Sohil Mehta
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

With LASS enabled, any kernel data access to userspace typically results
in a #GP, or a #SS in some stack-related cases. When the kernel needs to
access user memory, it can suspend LASS enforcement by toggling the
RFLAGS.AC bit. Most of these cases are already covered by the
stac()/clac() pairs used to avoid SMAP violations.

Even though LASS could potentially be enabled independently, it would be
very painful without SMAP and the related stac()/clac() calls. There is
no reason to support such a configuration because all future hardware
with LASS is expected to have SMAP as well. Also, the STAC/CLAC
instructions are architected to:
	#UD - If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.

So, make LASS depend on SMAP to conveniently reuse the existing AC bit
toggling already in place.

Note: Additional STAC/CLAC would still be needed for accesses such as
text poking which are not flagged by SMAP. This is because such mappings
are in the lower half but do not have the _PAGE_USER bit set which SMAP
uses for enforcement.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - New patch (split from patch 1).
---
 arch/x86/kernel/cpu/cpuid-deps.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 46efcbd6afa4..98d0cdd82574 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -89,6 +89,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_SHSTK,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_FRED,			X86_FEATURE_LKGS      },
 	{ X86_FEATURE_SPEC_CTRL_SSBD,		X86_FEATURE_SPEC_CTRL },
+	{ X86_FEATURE_LASS,			X86_FEATURE_SMAP      },
 	{}
 };
 
-- 
2.43.0



* [PATCH v11 3/9] x86/asm: Introduce inline memcpy and memset
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
  2025-10-29 21:03 ` [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits Sohil Mehta
  2025-10-29 21:03 ` [PATCH v11 2/9] x86/cpu: Add an LASS dependency on SMAP Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:06   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code Sohil Mehta
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

From: "Peter Zijlstra (Intel)" <peterz@infradead.org>

Provide inline memcpy and memset functions that can be used instead of
the GCC builtins when necessary. The immediate use case is for the text
poking functions to avoid the standard memcpy()/memset() calls because
objtool complains about such dynamic calls within an AC=1 region. See
tools/objtool/Documentation/objtool.txt, warning #9, regarding function
calls with UACCESS enabled.

Some user copy functions such as copy_user_generic() and __clear_user()
have similar rep_{movs,stos} usages. But, those are highly specialized
and hard to combine or reuse for other things. Define these new helpers
for all other usages that need a completely unoptimized, strictly inline
version of memcpy() or memset().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - Improve commit log.

v10:
 - Reintroduce the simpler inline patch (dropped in v8).
---
 arch/x86/include/asm/string.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
index c3c2c1914d65..9cb5aae7fba9 100644
--- a/arch/x86/include/asm/string.h
+++ b/arch/x86/include/asm/string.h
@@ -1,6 +1,32 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_STRING_H
+#define _ASM_X86_STRING_H
+
 #ifdef CONFIG_X86_32
 # include <asm/string_32.h>
 #else
 # include <asm/string_64.h>
 #endif
+
+static __always_inline void *__inline_memcpy(void *to, const void *from, size_t len)
+{
+	void *ret = to;
+
+	asm volatile("rep movsb"
+		     : "+D" (to), "+S" (from), "+c" (len)
+		     : : "memory");
+	return ret;
+}
+
+static __always_inline void *__inline_memset(void *s, int v, size_t n)
+{
+	void *ret = s;
+
+	asm volatile("rep stosb"
+		     : "+D" (s), "+c" (n)
+		     : "a" ((uint8_t)v)
+		     : "memory");
+	return ret;
+}
+
+#endif /* _ASM_X86_STRING_H */
-- 
2.43.0



* [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
                   ` (2 preceding siblings ...)
  2025-10-29 21:03 ` [PATCH v11 3/9] x86/asm: Introduce inline memcpy and memset Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:10   ` Dave Hansen
  2025-11-10 18:15   ` Sohil Mehta
  2025-10-29 21:03 ` [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services Sohil Mehta
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

For patching, the kernel initializes a temporary mm area in the lower
half of the address range. LASS blocks these accesses because its
enforcement relies on bit 63 of the virtual address as opposed to SMAP
which depends on the _PAGE_BIT_USER bit in the page table. Disable LASS
enforcement by toggling the RFLAGS.AC bit during patching to avoid
triggering a #GP fault.

Introduce LASS-specific STAC/CLAC helpers to set the AC bit only on
platforms that need it. Clarify the usage of the new helpers versus the
existing stac()/clac() helpers for SMAP.

The text poking functions use standard memcpy()/memset() while patching
kernel code. However, objtool complains about calling such dynamic
functions within an AC=1 region. See warning #9, regarding function
calls with UACCESS enabled, in tools/objtool/Documentation/objtool.txt.

To pacify objtool, one option is to add memcpy() and memset() to the
list of allowed-functions. However, that would provide a blanket
exemption for all usages of memcpy() and memset(). Instead, replace the
standard calls in the text poking functions with their unoptimized,
always-inlined versions. Considering that patching is usually small,
there is no performance impact expected.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - Use lass_enable()/lass_disable() naming.
 - Improve commit log and code comments.

v10:
 - Revert to the inline functions instead of open-coding in assembly.
 - Simplify code comments.
---
 arch/x86/include/asm/smap.h   | 41 +++++++++++++++++++++++++++++++++--
 arch/x86/kernel/alternative.c | 18 +++++++++++++--
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..90f178c78f9c 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -23,18 +23,55 @@
 
 #else /* __ASSEMBLER__ */
 
+/*
+ * The CLAC/STAC instructions toggle the enforcement of
+ * X86_FEATURE_SMAP along with X86_FEATURE_LASS.
+ *
+ * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page
+ * tables. The kernel is not allowed to touch pages with that bit set
+ * unless the AC bit is set.
+ *
+ * Use stac()/clac() when accessing userspace (_PAGE_USER) mappings,
+ * regardless of location.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
 static __always_inline void clac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
 	alternative("", "clac", X86_FEATURE_SMAP);
 }
 
 static __always_inline void stac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
 	alternative("", "stac", X86_FEATURE_SMAP);
 }
 
+/*
+ * LASS enforcement is based on bit 63 of the virtual address. The
+ * kernel is not allowed to touch memory in the lower half of the
+ * virtual address space.
+ *
+ * Use lass_disable()/lass_enable() to toggle the AC bit for kernel data
+ * accesses (!_PAGE_USER) that are blocked by LASS, but not by SMAP.
+ *
+ * Even with the AC bit set, LASS will continue to block instruction
+ * fetches from the user half of the address space. To allow those,
+ * clear CR4.LASS to disable the LASS mechanism entirely.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
+static __always_inline void lass_enable(void)
+{
+	alternative("", "clac", X86_FEATURE_LASS);
+}
+
+static __always_inline void lass_disable(void)
+{
+	alternative("", "stac", X86_FEATURE_LASS);
+}
+
 static __always_inline unsigned long smap_save(void)
 {
 	unsigned long flags;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 8ee5ff547357..b38dbf08d5cd 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2469,16 +2469,30 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
 __ro_after_init struct mm_struct *text_poke_mm;
 __ro_after_init unsigned long text_poke_mm_addr;
 
+/*
+ * Text poking creates and uses a mapping in the lower half of the
+ * address space. Relax LASS enforcement when accessing the poking
+ * address.
+ *
+ * objtool enforces a strict policy of "no function calls within AC=1
+ * regions". Adhere to the policy by using inline versions of
+ * memcpy()/memset() that will never result in a function call.
+ */
+
 static void text_poke_memcpy(void *dst, const void *src, size_t len)
 {
-	memcpy(dst, src, len);
+	lass_disable();
+	__inline_memcpy(dst, src, len);
+	lass_enable();
 }
 
 static void text_poke_memset(void *dst, const void *src, size_t len)
 {
 	int c = *(const int *)src;
 
-	memset(dst, c, len);
+	lass_disable();
+	__inline_memset(dst, c, len);
+	lass_enable();
 }
 
 typedef void text_poke_f(void *dst, const void *src, size_t len);
-- 
2.43.0



* [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
                   ` (3 preceding siblings ...)
  2025-10-29 21:03 ` [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:11   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 6/9] x86/kexec: Disable LASS during relocate kernel Sohil Mehta
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

From: Alexander Shishkin <alexander.shishkin@linux.intel.com>

While mapping EFI runtime services, set_virtual_address_map() is called
at its lower mapping, which LASS prohibits. Wrapping the EFI call with
lass_disable()/_enable() is not enough, because the AC flag only
controls data accesses, and not instruction fetches.

Use the big hammer and toggle the CR4.LASS bit to make this work.
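
The save/clear/restore pattern can be modeled in user space as follows.
This is a sketch only: a plain variable stands in for the CR4 shadow,
and all sketch_* names are hypothetical stand-ins rather than the
kernel helpers themselves.

```c
#define SKETCH_CR4_LASS (1UL << 27)  /* bit 27, as defined in patch 1 */

static unsigned long sketch_cr4;       /* stand-in for the CR4 shadow */
static unsigned long sketch_seen_cr4;  /* value observed by the callee */

static void sketch_cr4_clear_bits(unsigned long mask) { sketch_cr4 &= ~mask; }
static void sketch_cr4_set_bits(unsigned long mask)   { sketch_cr4 |= mask; }

/* Example callee: records the CR4 value it runs with. */
static void sketch_record_cr4(void)
{
	sketch_seen_cr4 = sketch_cr4;
}

/* Runs fn() with CR4.LASS clear, then restores the previous state. */
static void sketch_call_without_lass(void (*fn)(void))
{
	/* Zero if LASS was not enabled, so both updates become no-ops. */
	unsigned long lass = sketch_cr4 & SKETCH_CR4_LASS;

	sketch_cr4_clear_bits(lass);
	fn();
	sketch_cr4_set_bits(lass);
}
```

Masking the saved value with the LASS bit means the sequence never
turns LASS on where it was previously off.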

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - No change.

v10:
 - Reword code comments
---
 arch/x86/platform/efi/efi.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 463b784499a8..ad9f76f90581 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -786,8 +786,8 @@ static void __init __efi_enter_virtual_mode(void)
 {
 	int count = 0, pg_shift = 0;
 	void *new_memmap = NULL;
+	unsigned long pa, lass;
 	efi_status_t status;
-	unsigned long pa;
 
 	if (efi_alloc_page_tables()) {
 		pr_err("Failed to allocate EFI page tables\n");
@@ -825,11 +825,23 @@ static void __init __efi_enter_virtual_mode(void)
 
 	efi_sync_low_kernel_mappings();
 
+	/*
+	 * LASS complains because set_virtual_address_map() is located
+	 * at a lower address. To pause enforcement, flipping RFLAGS.AC
+	 * is not sufficient, as it only permits data access and not
+	 * instruction fetch. Disable the entire LASS mechanism.
+	 */
+	lass = cr4_read_shadow() & X86_CR4_LASS;
+	cr4_clear_bits(lass);
+
 	status = efi_set_virtual_address_map(efi.memmap.desc_size * count,
 					     efi.memmap.desc_size,
 					     efi.memmap.desc_version,
 					     (efi_memory_desc_t *)pa,
 					     efi_systab_phys);
+
+	cr4_set_bits(lass);
+
 	if (status != EFI_SUCCESS) {
 		pr_err("Unable to switch EFI into virtual mode (status=%lx)!\n",
 		       status);
-- 
2.43.0



* [PATCH v11 6/9] x86/kexec: Disable LASS during relocate kernel
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
                   ` (4 preceding siblings ...)
  2025-10-29 21:03 ` [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:14   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message Sohil Mehta
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

The relocate kernel mechanism uses an identity mapping to copy the new
kernel, which leads to a LASS violation when executing from a low
address.

LASS must be disabled after the original CR4 value is saved because
kexec paths that preserve context need to restore CR4.LASS. But,
disabling it along with CET during identity_mapped() is too late. So,
disable LASS immediately after saving CR4, along with PGE, and before
jumping to the identity-mapped page.
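
In C terms, the masking done in the assembly amounts to the sketch
below. The names are hypothetical; the bit positions are PGE at bit 7
and LASS at bit 27.

```c
#include <stdint.h>

#define KEXEC_SKETCH_CR4_PGE  (1ULL << 7)
#define KEXEC_SKETCH_CR4_LASS (1ULL << 27)

/*
 * Hypothetical helper: derive the CR4 value to load before jumping to
 * the identity-mapped page. The caller keeps saved_cr4 untouched so
 * that context-preserving kexec can restore CR4.LASS later.
 */
static uint64_t kexec_sketch_relocate_cr4(uint64_t saved_cr4)
{
	return saved_cr4 & ~(KEXEC_SKETCH_CR4_PGE | KEXEC_SKETCH_CR4_LASS);
}
```

Only the working copy is modified; the saved value still carries the
original PGE and LASS settings.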

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - Improve commit message.

v10:
 - New patch to fix an issue detected during internal testing.
---
 arch/x86/kernel/relocate_kernel_64.S | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 11e20bb13aca..4ffba68dc57b 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -95,9 +95,12 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	/* Leave CR4 in %r13 to enable the right paging mode later. */
 	movq	%cr4, %r13
 
-	/* Disable global pages immediately to ensure this mapping is RWX */
+	/*
+	 * Disable global pages immediately to ensure this mapping is RWX.
+	 * Disable LASS before jumping to the identity mapped page.
+	 */
 	movq	%r13, %r12
-	andq	$~(X86_CR4_PGE), %r12
+	andq	$~(X86_CR4_PGE | X86_CR4_LASS), %r12
 	movq	%r12, %cr4
 
 	/* Save %rsp and CRs. */
-- 
2.43.0



* [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
                   ` (5 preceding siblings ...)
  2025-10-29 21:03 ` [PATCH v11 6/9] x86/kexec: Disable LASS during relocate kernel Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:16   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 8/9] selftests/x86: Update the negative vsyscall tests to expect a #GP Sohil Mehta
  2025-10-29 21:03 ` [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization Sohil Mehta
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

From: Alexander Shishkin <alexander.shishkin@linux.intel.com>

A LASS violation typically results in a #GP. With LASS active, any
invalid access to user memory (including the first page frame) would be
reported as a #GP, instead of a #PF.

Unfortunately, the #GP error messages provide limited information about
the cause of the fault. This could be confusing for kernel developers
and users who are accustomed to the friendly #PF messages.

To make the transition easier, enhance the #GP Oops message to include a
hint about LASS violations. Also, add a special hint for kernel NULL
pointer dereferences to match with the existing #PF message.
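
The order of the checks matters, so here is a rough userspace model of
the classification the patch below performs. It is a sketch under stated
assumptions, not the kernel function: classify_gp() is a hypothetical
name, VIRTUAL_MASK assumes 4-level paging (47-bit user half), and
PAGE_SIZE_4K assumes 4 KiB pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum gp_hint {
	GP_NON_CANONICAL,
	GP_CANONICAL,
	GP_LASS_VIOLATION,
	GP_NULL_POINTER,
};

/* Assumptions: 4-level paging and 4 KiB pages. */
#define VIRTUAL_MASK	((UINT64_C(1) << 47) - 1)
#define PAGE_SIZE_4K	UINT64_C(4096)

/* Hypothetical model of the hint selection for a decoded operand. */
static inline enum gp_hint classify_gp(uint64_t addr, unsigned int opnd_bytes,
				       bool lass_enabled)
{
	/* Operand starts in the kernel half: no better hint available. */
	if (addr >= ~VIRTUAL_MASK)
		return GP_CANONICAL;

	/* Last byte of the operand falls outside the user canonical half. */
	if (addr + opnd_bytes - 1 > VIRTUAL_MASK)
		return GP_NON_CANONICAL;

	/* Canonical user address within the first page. */
	if (addr < PAGE_SIZE_4K)
		return GP_NULL_POINTER;

	/* Canonical user address: blame LASS only if it is enabled. */
	return lass_enabled ? GP_LASS_VIOLATION : GP_CANONICAL;
}
```

Note that in this model, as in the patch, the NULL pointer hint is chosen
before the LASS check.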

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - Improve commit log.

v10:
 - Minor improvement to code comments and hints.
---
 arch/x86/kernel/traps.c | 45 ++++++++++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 6b22611e69cc..30d5c690f9a1 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -635,13 +635,23 @@ DEFINE_IDTENTRY(exc_bounds)
 enum kernel_gp_hint {
 	GP_NO_HINT,
 	GP_NON_CANONICAL,
-	GP_CANONICAL
+	GP_CANONICAL,
+	GP_LASS_VIOLATION,
+	GP_NULL_POINTER,
+};
+
+static const char * const kernel_gp_hint_help[] = {
+	[GP_NON_CANONICAL]	= "probably for non-canonical address",
+	[GP_CANONICAL]		= "maybe for address",
+	[GP_LASS_VIOLATION]	= "probably LASS violation for address",
+	[GP_NULL_POINTER]	= "kernel NULL pointer dereference",
 };
 
 /*
  * When an uncaught #GP occurs, try to determine the memory address accessed by
  * the instruction and return that address to the caller. Also, try to figure
- * out whether any part of the access to that address was non-canonical.
+ * out whether any part of the access to that address was non-canonical or
+ * across privilege levels.
  */
 static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 						 unsigned long *addr)
@@ -663,14 +673,27 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 		return GP_NO_HINT;
 
 #ifdef CONFIG_X86_64
-	/*
-	 * Check that:
-	 *  - the operand is not in the kernel half
-	 *  - the last byte of the operand is not in the user canonical half
-	 */
-	if (*addr < ~__VIRTUAL_MASK &&
-	    *addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
+	/* Operand is in the kernel half */
+	if (*addr >= ~__VIRTUAL_MASK)
+		return GP_CANONICAL;
+
+	/* The last byte of the operand is not in the user canonical half */
+	if (*addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
 		return GP_NON_CANONICAL;
+
+	/*
+	 * If LASS is active, a NULL pointer dereference generates a #GP
+	 * instead of a #PF.
+	 */
+	if (*addr < PAGE_SIZE)
+		return GP_NULL_POINTER;
+
+	/*
+	 * Assume that LASS caused the exception, because the address is
+	 * canonical and in the user half.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_LASS))
+		return GP_LASS_VIOLATION;
 #endif
 
 	return GP_CANONICAL;
@@ -833,9 +856,7 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
 
 	if (hint != GP_NO_HINT)
 		snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
-			 (hint == GP_NON_CANONICAL) ? "probably for non-canonical address"
-						    : "maybe for address",
-			 gp_addr);
+			 kernel_gp_hint_help[hint], gp_addr);
 
 	/*
 	 * KASAN is interested only in the non-canonical case, clear it
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v11 8/9] selftests/x86: Update the negative vsyscall tests to expect a #GP
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
                   ` (6 preceding siblings ...)
  2025-10-29 21:03 ` [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-31 17:20   ` Dave Hansen
  2025-10-29 21:03 ` [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization Sohil Mehta
  8 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

Some of the vsyscall selftests expect a #PF when vsyscalls are disabled.
However, with LASS enabled, an invalid access results in a SIGSEGV due
to a #GP instead of a #PF. One such negative test fails because it is
expecting X86_PF_INSTR to be set.

Update the failing test to expect either a #GP or a #PF. Also, update
the printed messages to show the trap number (denoting the type of
fault) instead of assuming a #PF.
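
The acceptance condition described above can be written as a small
predicate. This is a sketch for illustration only; the function and macro
names are made up here, with 13 and 14 being the x86 vector numbers for
#GP and #PF and bit 4 of the error code being X86_PF_INSTR.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names; the selftest uses the raw numbers. */
#define TRAP_GP		13UL
#define TRAP_PF		14UL
#define PF_INSTR	(1UL << 4)

/*
 * Hypothetical helper: a failed attempt to execute the vsyscall page is
 * acceptable if it raised a #GP (LASS active), or a #PF whose error code
 * marks it as an instruction fetch.
 */
static inline bool exec_fault_is_expected(unsigned long trapno,
					  unsigned long err)
{
	if (trapno == TRAP_GP)
		return true;

	return trapno == TRAP_PF && (err & PF_INSTR);
}
```

Any other trap number, or a #PF without the instruction-fetch bit, is
still treated as a test failure.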

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - New patch (Fixes a vsyscall selftest failure)
---
 tools/testing/selftests/x86/test_vsyscall.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/x86/test_vsyscall.c b/tools/testing/selftests/x86/test_vsyscall.c
index 05e1e6774fba..918eaec8bfbe 100644
--- a/tools/testing/selftests/x86/test_vsyscall.c
+++ b/tools/testing/selftests/x86/test_vsyscall.c
@@ -308,12 +308,13 @@ static void test_getcpu(int cpu)
 #ifdef __x86_64__
 
 static jmp_buf jmpbuf;
-static volatile unsigned long segv_err;
+static volatile unsigned long segv_err, segv_trapno;
 
 static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
 {
 	ucontext_t *ctx = (ucontext_t *)ctx_void;
 
+	segv_trapno = ctx->uc_mcontext.gregs[REG_TRAPNO];
 	segv_err =  ctx->uc_mcontext.gregs[REG_ERR];
 	siglongjmp(jmpbuf, 1);
 }
@@ -336,7 +337,8 @@ static void test_vsys_r(void)
 	else if (can_read)
 		ksft_test_result_pass("We have read access\n");
 	else
-		ksft_test_result_pass("We do not have read access: #PF(0x%lx)\n", segv_err);
+		ksft_test_result_pass("We do not have read access (trap=%ld, error=0x%lx)\n",
+				      segv_trapno, segv_err);
 }
 
 static void test_vsys_x(void)
@@ -347,7 +349,7 @@ static void test_vsys_x(void)
 		return;
 	}
 
-	ksft_print_msg("Make sure that vsyscalls really page fault\n");
+	ksft_print_msg("Make sure that vsyscalls really cause a fault\n");
 
 	bool can_exec;
 	if (sigsetjmp(jmpbuf, 1) == 0) {
@@ -358,13 +360,14 @@ static void test_vsys_x(void)
 	}
 
 	if (can_exec)
-		ksft_test_result_fail("Executing the vsyscall did not page fault\n");
-	else if (segv_err & (1 << 4)) /* INSTR */
-		ksft_test_result_pass("Executing the vsyscall page failed: #PF(0x%lx)\n",
-				      segv_err);
+		ksft_test_result_fail("Executing the vsyscall did not fault\n");
+	/* #GP or #PF (with X86_PF_INSTR) */
+	else if ((segv_trapno == 13) || ((segv_trapno == 14) && (segv_err & (1 << 4))))
+		ksft_test_result_pass("Executing the vsyscall page failed (trap=%ld, error=0x%lx)\n",
+				      segv_trapno, segv_err);
 	else
-		ksft_test_result_fail("Execution failed with the wrong error: #PF(0x%lx)\n",
-				      segv_err);
+		ksft_test_result_fail("Execution failed with the wrong error (trap=%ld, error=0x%lx)\n",
+				      segv_trapno, segv_err);
 }
 
 /*
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
                   ` (7 preceding siblings ...)
  2025-10-29 21:03 ` [PATCH v11 8/9] selftests/x86: Update the negative vsyscall tests to expect a #GP Sohil Mehta
@ 2025-10-29 21:03 ` Sohil Mehta
  2025-10-30  8:40   ` H. Peter Anvin
  2025-10-31 17:21   ` Dave Hansen
  8 siblings, 2 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-29 21:03 UTC (permalink / raw)
  To: x86, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Sohil Mehta,
	Xin Li, David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

Linear Address Space Separation (LASS) mitigates a class of side-channel
attacks that rely on speculative access across the user/kernel boundary.

Enable LASS by default if the platform supports it. While at it, remove
the comment above the SMAP/SMEP/UMIP/LASS setup instead of updating it,
as the whole sequence is quite self-explanatory.

The legacy vsyscall page is mapped at 0xffffffffff60?000. Prior to LASS,
vsyscall page accesses would always generate a #PF. The kernel emulates
the accesses in the #PF handler and returns the appropriate values to
userspace.

With LASS, these accesses are intercepted before the paging structures
are traversed, triggering a #GP instead of a #PF. To avoid breaking user
applications, equivalent emulation support is required in the #GP
handler. However, the #GP provides limited error information compared to
the #PF, making the emulation more complex.

For now, keep it simple and disable LASS if vsyscall emulation is
compiled in. This restricts LASS usability to newer environments where
legacy vsyscalls are absolutely not needed. In the future, LASS support
can be expanded by enhancing the #GP handler.
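
To make the interaction concrete: as I understand the LASS
specification, an address is treated as a supervisor address when bit 63
is set, and user-mode accesses to such addresses fault before any page
walk. The sketch below is not kernel code; the helper name is
hypothetical and VSYSCALL_BASE assumes the conventional
0xffffffffff600000 base of the legacy page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed conventional base of the legacy vsyscall page. */
#define VSYSCALL_BASE	UINT64_C(0xffffffffff600000)

/*
 * Hypothetical model: with LASS active, a user-mode access faults with a
 * #GP when the linear address has bit 63 set, regardless of what the
 * page tables say about that address.
 */
static inline bool lass_blocks_user_access(uint64_t linear_addr)
{
	return (linear_addr >> 63) != 0;
}
```

Under this model the vsyscall page is always blocked for user mode, which
is why the #PF-based emulation never gets a chance to run.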

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v11:
 - Disable LASS if vsyscall emulation support is compiled in.
 - Drop Rick's review tag because of the new changes.

v10:
 - No change.
---
 arch/x86/kernel/cpu/common.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c7d3512914ca..71e89859dfb4 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -401,6 +401,25 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 	cr4_clear_bits(X86_CR4_UMIP);
 }
 
+static __always_inline void setup_lass(struct cpuinfo_x86 *c)
+{
+	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
+		/*
+		 * Legacy vsyscall page access causes a #GP when LASS is
+		 * active. However, vsyscall emulation isn't supported
+		 * with #GP. To avoid breaking userspace, disable LASS
+		 * if the emulation code is compiled in.
+		 */
+		if (IS_ENABLED(CONFIG_X86_VSYSCALL_EMULATION)) {
+			pr_info_once("x86/cpu: Disabling LASS due to CONFIG_X86_VSYSCALL_EMULATION=y\n");
+			setup_clear_cpu_cap(X86_FEATURE_LASS);
+			return;
+		}
+
+		cr4_set_bits(X86_CR4_LASS);
+	}
+}
+
 /* These bits should not change their value after CPU init is finished. */
 static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
 					     X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED;
@@ -2011,10 +2030,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	/* Disable the PN if appropriate */
 	squash_the_stupid_serial_number(c);
 
-	/* Set up SMEP/SMAP/UMIP */
 	setup_smep(c);
 	setup_smap(c);
 	setup_umip(c);
+	setup_lass(c);
 
 	/* Enable FSGSBASE instructions if available. */
 	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-29 21:03 ` [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization Sohil Mehta
@ 2025-10-30  8:40   ` H. Peter Anvin
  2025-10-30 15:45     ` Andy Lutomirski
  2025-10-30 16:27     ` Dave Hansen
  2025-10-31 17:21   ` Dave Hansen
  1 sibling, 2 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-10-30  8:40 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick Edgecombe, Vegard Nossum, Andrew Cooper,
	Randy Dunlap, Geert Uytterhoeven, Kees Cook, Tony Luck,
	Alexander Shishkin, linux-doc, linux-kernel, linux-efi

Legacy vsyscalls have been obsolete for how long now?


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30  8:40   ` H. Peter Anvin
@ 2025-10-30 15:45     ` Andy Lutomirski
  2025-10-30 16:44       ` Sohil Mehta
  2025-10-30 16:27     ` Dave Hansen
  1 sibling, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2025-10-30 15:45 UTC (permalink / raw)
  To: H. Peter Anvin, Sohil Mehta, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi



On Thu, Oct 30, 2025, at 1:40 AM, H. Peter Anvin wrote:
> Legacy vsyscalls have been obsolete for how long now?


A looooong time.

I would suggest defaulting LASS to on and *maybe* decoding just enough to log, once per boot, that a legacy vsyscall may have been attempted. It’s too bad that #GP doesn’t report the faulting address.

—Andy

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30  8:40   ` H. Peter Anvin
  2025-10-30 15:45     ` Andy Lutomirski
@ 2025-10-30 16:27     ` Dave Hansen
  2025-11-07  8:01       ` H. Peter Anvin
  1 sibling, 1 reply; 67+ messages in thread
From: Dave Hansen @ 2025-10-30 16:27 UTC (permalink / raw)
  To: H. Peter Anvin, Sohil Mehta, x86, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick Edgecombe, Vegard Nossum, Andrew Cooper,
	Randy Dunlap, Geert Uytterhoeven, Kees Cook, Tony Luck,
	Alexander Shishkin, linux-doc, linux-kernel, linux-efi

On 10/30/25 01:40, H. Peter Anvin wrote:
> Legacy vsyscalls have been obsolete for how long now?

I asked Sohil to start throwing out all the non-essential bits from this
series. My thought was that we could get mainline to a point where LASS
itself could be enabled, even if only in a somewhat weird config that a
distro would probably never use.

After that is merged, we can circle back and decide what to do with the
remaining bits like CR pinning and vsyscall emulation. I don't think any
of those bits will change the basic desire to have LASS support in the
kernel.

Does that sound like a sane approach to everyone?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 15:45     ` Andy Lutomirski
@ 2025-10-30 16:44       ` Sohil Mehta
  2025-10-30 16:53         ` Andy Lutomirski
  2025-10-30 21:13         ` David Laight
  0 siblings, 2 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-30 16:44 UTC (permalink / raw)
  To: Andy Lutomirski, H. Peter Anvin, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/30/2025 8:45 AM, Andy Lutomirski wrote:
> On Thu, Oct 30, 2025, at 1:40 AM, H. Peter Anvin wrote:
>> Legacy vsyscalls have been obsolete for how long now?
> 
> A looooong time.
> 
> I would suggest defaulting LASS to on and *maybe* decoding just enough to log, once per boot, that a legacy vsyscall may have been attempted. It’s too bad that #GP doesn’t report the faulting address.
> 

Unfortunately, CONFIG_X86_VSYSCALL_EMULATION defaults to y. Also, the
default Vsyscall mode is XONLY. So even if vsyscalls are deprecated,
there is a non-zero possibility someone would complain about it.

My primary goal here is to get the base LASS series merged (soonish)
with the simplest possible option.

I am planning to follow up immediately with a vsyscall-specific series
that relaxes *most* restrictions.

IIUC, supporting XONLY mode with LASS probably does not need complicated
decoding because the vsyscall address is available in the faulting RIP.

The spec says:
"LASS for instruction fetches applies when the linear address in RIP is
used to load an instruction from memory. Unlike canonicality checking
(see Section 4.5.2), LASS does not apply to branch instructions that
load RIP. A branch instruction can load RIP with an address that would
violate LASS. Only when the address is used to fetch an instruction will
a LASS violation occur, generating a #GP. (The return instruction
pointer of the #GP handler is the address that incurred the LASS
violation.)"

I attempted to do that in the last revision here:
https://lore.kernel.org/lkml/20251007065119.148605-9-sohil.mehta@intel.com/
https://lore.kernel.org/lkml/20251007065119.148605-11-sohil.mehta@intel.com/
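
As a rough illustration of the XONLY idea (this is not taken from the
linked patches): since the saved RIP of the #GP is the fetch address, a
handler could in principle test it against the vsyscall page directly.
The helper name is hypothetical, and the base address and 4 KiB page size
are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed conventional base of the legacy vsyscall page, 4 KiB pages. */
#define VSYSCALL_BASE	UINT64_C(0xffffffffff600000)
#define PAGE_MASK_4K	(~UINT64_C(0xfff))

/*
 * Hypothetical check: does the faulting instruction pointer lie inside
 * the vsyscall page? No instruction decoding is needed because the
 * address comes straight from the saved RIP.
 */
static inline bool rip_in_vsyscall_page(uint64_t rip)
{
	return (rip & PAGE_MASK_4K) == VSYSCALL_BASE;
}
```

A real handler would also have to confirm that the fault came from user
mode and that the address is a valid vsyscall entry point.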

On the other hand, supporting EMULATE mode during a #GP is a bit tricky,
which isn't worth the effort.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 16:44       ` Sohil Mehta
@ 2025-10-30 16:53         ` Andy Lutomirski
  2025-10-30 17:24           ` Sohil Mehta
  2025-10-30 21:13         ` David Laight
  1 sibling, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2025-10-30 16:53 UTC (permalink / raw)
  To: Sohil Mehta, H. Peter Anvin, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi



On Thu, Oct 30, 2025, at 9:44 AM, Sohil Mehta wrote:
> On 10/30/2025 8:45 AM, Andy Lutomirski wrote:
>> On Thu, Oct 30, 2025, at 1:40 AM, H. Peter Anvin wrote:
>>> Legacy vsyscalls have been obsolete for how long now?
>> 
>> A looooong time.
>> 
>> I would suggest defaulting LASS to on and *maybe* decoding just enough to log, once per boot, that a legacy vsyscall may have been attempted. It’s too bad that #GP doesn’t report the faulting address.
>> 
>
> Unfortunately, CONFIG_X86_VSYSCALL_EMULATION defaults to y. Also, the
> default Vsyscall mode is XONLY. So even if vsyscalls are deprecated,
> there is a non-zero possibility someone would complain about it.
>
> My primary goal here is to get the base LASS series merged (soonish)
> with the simplest possible option.
>
> I am planning to follow up immediately with a vsyscall-specific series
> that relaxes *most* restrictions.
>
> IIUC, supporting XONLY mode with LASS probably does not need complicated
> decoding because the vsyscall address is available in the faulting RIP.
>
> The spec says:
> "LASS for instruction fetches applies when the linear address in RIP is
> used to load an instruction from memory. Unlike canonicality checking
> (see Section 4.5.2), LASS does not apply to branch instructions that
> load RIP. A branch instruction can load RIP with an address that would
> violate LASS. Only when the address is used to fetch an instruction will
> a LASS violation occur, generating a #GP. (The return instruction
> pointer of the #GP handler is the address that incurred the LASS
> violation.)"
>
> I attempted to do that in the last revision here:
> https://lore.kernel.org/lkml/20251007065119.148605-9-sohil.mehta@intel.com/
> https://lore.kernel.org/lkml/20251007065119.148605-11-sohil.mehta@intel.com/
>
> On the other hand, supporting EMULATE mode during a #GP is a bit tricky,
> which isn't worth the effort.

I would say it's definitely worth the effort, but it probably does make sense to get the rest of the series in a mergeable condition such that it only works with vsyscall=none.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 16:53         ` Andy Lutomirski
@ 2025-10-30 17:24           ` Sohil Mehta
  2025-10-30 17:31             ` Andy Lutomirski
  0 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-30 17:24 UTC (permalink / raw)
  To: Andy Lutomirski, H. Peter Anvin, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/30/2025 9:53 AM, Andy Lutomirski wrote:

>> On the other hand, supporting EMULATE mode during a #GP is a bit tricky,
>> which isn't worth the effort.
> 
> I would say it's definitely worth the effort, but it probably does make sense to get the rest of the series in a mergeable condition such that it only works with vsyscall=none.

I meant the full emulation mode where the Vsyscall page is readable. It
is only available via vsyscall=emulate. No one should be using that one,
right?

I thought you and Linus agreed on removing EMULATE mode completely:
https://lore.kernel.org/all/CALCETrXHJ7837+cmahg-wjR3iRHbDJ6JtVGaoDFC4dx-L8r8OA@mail.gmail.com/

I agree that it would be worthwhile (and relatively easy) to support the
execute (XONLY) mode (that only does instruction fetches). That is what
the separate vsyscall series would do once the LASS base is in.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 17:24           ` Sohil Mehta
@ 2025-10-30 17:31             ` Andy Lutomirski
  0 siblings, 0 replies; 67+ messages in thread
From: Andy Lutomirski @ 2025-10-30 17:31 UTC (permalink / raw)
  To: Sohil Mehta, H. Peter Anvin, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi



On Thu, Oct 30, 2025, at 10:24 AM, Sohil Mehta wrote:
> On 10/30/2025 9:53 AM, Andy Lutomirski wrote:
>
>>> On the other hand, supporting EMULATE mode during a #GP is a bit tricky,
>>> which isn't worth the effort.
>> 
>> I would say it's definitely worth the effort, but it probably does make sense to get the rest of the series in a mergeable condition such that it only works with vsyscall=none.
>
> I meant the full emulation mode where the Vsyscall page is readable. It
> is only available via vsyscall=emulate. No one should be using that one,
> right?
>
> I thought you and Linus agreed on removing EMULATE mode completely:
> https://lore.kernel.org/all/CALCETrXHJ7837+cmahg-wjR3iRHbDJ6JtVGaoDFC4dx-L8r8OA@mail.gmail.com/
>
> I agree that it would be worthwhile (and relatively easy) to support the
> execute (XONLY) mode (that only does instruction fetches). That is what
> the separate vsyscall series would do once the LASS base is in.

Ah, I misunderstood you. I agree.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 16:44       ` Sohil Mehta
  2025-10-30 16:53         ` Andy Lutomirski
@ 2025-10-30 21:13         ` David Laight
  2025-10-31  6:41           ` H. Peter Anvin
  2025-10-31 16:55           ` Dave Hansen
  1 sibling, 2 replies; 67+ messages in thread
From: David Laight @ 2025-10-30 21:13 UTC (permalink / raw)
  To: Sohil Mehta
  Cc: Andy Lutomirski, H. Peter Anvin, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Thu, 30 Oct 2025 09:44:02 -0700
Sohil Mehta <sohil.mehta@intel.com> wrote:

> On 10/30/2025 8:45 AM, Andy Lutomirski wrote:
> > On Thu, Oct 30, 2025, at 1:40 AM, H. Peter Anvin wrote:  
> >> Legacy vsyscalls have been obsolete for how long now?  
> > 
> > A looooong time.
> > 
> > I would suggest defaulting LASS to on and *maybe* decoding just enough to log, once per boot, that a legacy vsyscall may have been attempted. It’s too bad that #GP doesn’t report the faulting address.
> >   
> 
> Unfortunately, CONFIG_X86_VSYSCALL_EMULATION defaults to y. Also, the
> default Vsyscall mode is XONLY. So even if vsyscalls are deprecated,
> there is a non-zero possibility someone would complain about it.

Presumably a command line parameter could be used to disable LASS
in order to enable vsyscall emulation?

That might let LASS be enabled by default.

	David

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 21:13         ` David Laight
@ 2025-10-31  6:41           ` H. Peter Anvin
  2025-10-31 16:55           ` Dave Hansen
  1 sibling, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-10-31  6:41 UTC (permalink / raw)
  To: David Laight, Sohil Mehta
  Cc: Andy Lutomirski, the arch/x86 maintainers, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jonathan Corbet,
	Josh Poimboeuf, Peter Zijlstra (Intel), Ard Biesheuvel,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick P Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, Linux Kernel Mailing List, linux-efi

On October 30, 2025 2:13:18 PM PDT, David Laight <david.laight.linux@gmail.com> wrote:
>On Thu, 30 Oct 2025 09:44:02 -0700
>Sohil Mehta <sohil.mehta@intel.com> wrote:
>
>> On 10/30/2025 8:45 AM, Andy Lutomirski wrote:
>> > On Thu, Oct 30, 2025, at 1:40 AM, H. Peter Anvin wrote:  
>> >> Legacy vsyscalls have been obsolete for how long now?  
>> > 
>> > A looooong time.
>> > 
>> > I would suggest defaulting LASS to on and *maybe* decoding just enough to log, once per boot, that a legacy vsyscall may have been attempted. It’s too bad that #GP doesn’t report the faulting address.
>> >   
>> 
>> Unfortunately, CONFIG_X86_VSYSCALL_EMULATION defaults to y. Also, the
>> default Vsyscall mode is XONLY. So even if vsyscalls are deprecated,
>> there is a non-zero possibility someone would complain about it.
>
>Presumably a command line parameter could be used to disable LASS
>in order to enable vsyscall emulation?
>
>That might let LASS be enabled by default.
>
>	David
>

So I talked with Sohil about this earlier today, and there was a bit of a miscommunication — XONLY mode is just fine. It is read emulation mode that has problems.

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 21:13         ` David Laight
  2025-10-31  6:41           ` H. Peter Anvin
@ 2025-10-31 16:55           ` Dave Hansen
  1 sibling, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 16:55 UTC (permalink / raw)
  To: David Laight, Sohil Mehta
  Cc: Andy Lutomirski, H. Peter Anvin, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Jonathan Corbet, Josh Poimboeuf, Peter Zijlstra (Intel),
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/30/25 14:13, David Laight wrote:
>> Unfortunately, CONFIG_X86_VSYSCALL_EMULATION defaults to y. Also, the
>> default Vsyscall mode is XONLY. So even if vsyscalls are deprecated,
>> there is a non-zero possibility someone would complain about it.
> Presumably a command line parameter could be used to disable LASS
> in order to enable vsyscall emulation?
> 
> That might let LASS be enabled by default.

Sure... There are a million ways to skin this cat. That's the problem.

The compile switch is the smallest amount of code with the fewest
implications to ABI or documentation that we can muster. All I'm saying
is we should _start_ here, not _end_ here.

If anyone agrees with that approach, acks would be appreciated.

* Re: [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits
  2025-10-29 21:03 ` [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits Sohil Mehta
@ 2025-10-31 17:03   ` Dave Hansen
  0 siblings, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:03 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> Linear Address Space Separation (LASS) is a security feature that
> mitigates a class of side-channel attacks relying on speculative access
> across the user/kernel boundary.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

* Re: [PATCH v11 2/9] x86/cpu: Add an LASS dependency on SMAP
  2025-10-29 21:03 ` [PATCH v11 2/9] x86/cpu: Add an LASS dependency on SMAP Sohil Mehta
@ 2025-10-31 17:04   ` Dave Hansen
  0 siblings, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:04 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> So, make LASS depend on SMAP to conveniently reuse the existing AC bit
> toggling already in place.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

* Re: [PATCH v11 3/9] x86/asm: Introduce inline memcpy and memset
  2025-10-29 21:03 ` [PATCH v11 3/9] x86/asm: Introduce inline memcpy and memset Sohil Mehta
@ 2025-10-31 17:06   ` Dave Hansen
  0 siblings, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:06 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> Provide inline memcpy and memset functions that can be used instead of
> the GCC builtins when necessary.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-10-29 21:03 ` [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code Sohil Mehta
@ 2025-10-31 17:10   ` Dave Hansen
  2025-11-10 18:15   ` Sohil Mehta
  1 sibling, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:10 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> Introduce LASS-specific STAC/CLAC helpers to set the AC bit only on
> platforms that need it. Clarify the usage of the new helpers versus the
> existing stac()/clac() helpers for SMAP.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

One review nit: The

-	/* Note: a barrier is implicit in alternative() */

looks a bit funky in the diffstat. It took me a minute to realize that
you'd moved it. I _probably_ would have specifically called out that you
*added* comments for stac()/clac() and moved an existing duplicated
comment there. Adding a whole new comment block deserves calling out
explicitly. It is far beyond the "clarify" that's in the changelog.

But it's just a nit in the end.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-29 21:03 ` [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services Sohil Mehta
@ 2025-10-31 17:11   ` Dave Hansen
  2025-10-31 17:38     ` Andy Lutomirski
  2025-10-31 18:32     ` Sohil Mehta
  0 siblings, 2 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:11 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> 
> While mapping EFI runtime services, set_virtual_address_map() is called
> at its lower mapping, which LASS prohibits. Wrapping the EFI call with
> lass_disable()/_enable() is not enough, because the AC flag only
> controls data accesses, and not instruction fetches.
> 
> Use the big hammer and toggle the CR4.LASS bit to make this work.

One thing that's actually missing here is an explanation on how it's OK
to munge CR bits here. Why are preemption and interrupts not a problem?

A reviewer would have to go off and figure this out on their own.

* Re: [PATCH v11 6/9] x86/kexec: Disable LASS during relocate kernel
  2025-10-29 21:03 ` [PATCH v11 6/9] x86/kexec: Disable LASS during relocate kernel Sohil Mehta
@ 2025-10-31 17:14   ` Dave Hansen
  0 siblings, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:14 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> The relocate kernel mechanism uses an identity mapping to copy the new
> kernel, which leads to a LASS violation when executing from a low
> address.
> 
> LASS must be disabled after the original CR4 value is saved because
> kexec paths that preserve context need to restore CR4.LASS. But,
> disabling it along with CET during identity_mapped() is too late. So,
> disable LASS immediately after saving CR4, along with PGE, and before
> jumping to the identity-mapped page.

It's not great when we have to thread the needle like this, but I don't
have any better ideas:

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

* Re: [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message
  2025-10-29 21:03 ` [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message Sohil Mehta
@ 2025-10-31 17:16   ` Dave Hansen
  2025-10-31 19:59     ` Sohil Mehta
  0 siblings, 1 reply; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:16 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> To make the transition easier, enhance the #GP Oops message to include a
> hint about LASS violations. Also, add a special hint for kernel NULL
> pointer dereferences to match with the existing #PF message.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

This also reminds me... Are there tests for this somewhere? How did you
test all these new messages?

* Re: [PATCH v11 8/9] selftests/x86: Update the negative vsyscall tests to expect a #GP
  2025-10-29 21:03 ` [PATCH v11 8/9] selftests/x86: Update the negative vsyscall tests to expect a #GP Sohil Mehta
@ 2025-10-31 17:20   ` Dave Hansen
  0 siblings, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:20 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
> Update the failing test to expect either a #GP or a #PF. Also, update
> the printed messages to show the trap number (denoting the type of
> fault) instead of assuming a #PF.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
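
The check described in the quoted changelog can be illustrated from user space. What follows is a minimal sketch, not the selftest from the patch: the helper names (trap_is_expected(), probe_vsyscall_read()) are hypothetical, and it assumes x86-64 Linux with glibc, where REG_TRAPNO indexes the trap number in the signal context. It reads one byte from the legacy vsyscall page and records which exception vector delivered the SIGSEGV.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes REG_TRAPNO in glibc's <ucontext.h> */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <ucontext.h>

/* x86 exception vectors: #GP is 13, #PF is 14. */
#define X86_TRAP_GP 13
#define X86_TRAP_PF 14

/* Accept either fault type instead of assuming a #PF. */
bool trap_is_expected(int trapno)
{
	return trapno == X86_TRAP_GP || trapno == X86_TRAP_PF;
}

#if defined(__x86_64__) && defined(REG_TRAPNO)
static sigjmp_buf probe_jmpbuf;
static volatile sig_atomic_t probe_trapno;

/* Record the trap number from the signal context, then unwind. */
static void probe_sigsegv(int sig, siginfo_t *info, void *ctx_void)
{
	ucontext_t *ctx = ctx_void;

	(void)sig;
	(void)info;
	probe_trapno = (int)ctx->uc_mcontext.gregs[REG_TRAPNO];
	siglongjmp(probe_jmpbuf, 1);
}

/*
 * Hypothetical helper: read one byte from the legacy vsyscall page.
 * Returns the trap number of the resulting fault, -1 if the read did
 * not fault, or -2 if the handler could not be installed.
 */
int probe_vsyscall_read(void)
{
	struct sigaction sa, old;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = probe_sigsegv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old))
		return -2;

	probe_trapno = -1;
	if (sigsetjmp(probe_jmpbuf, 1) == 0)
		(void)*(volatile const char *)0xffffffffff600000UL;

	sigaction(SIGSEGV, &old, NULL);
	return probe_trapno;
}
#endif
```

A caller would print the value returned by probe_vsyscall_read() and pass it to trap_is_expected(). A return of -1 means the read did not fault at all, which can happen when the vsyscall page is readable (vsyscall=emulate).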

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-29 21:03 ` [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization Sohil Mehta
  2025-10-30  8:40   ` H. Peter Anvin
@ 2025-10-31 17:21   ` Dave Hansen
  2025-10-31 20:04     ` Sohil Mehta
  1 sibling, 1 reply; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:21 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/29/25 14:03, Sohil Mehta wrote:
...
> +static __always_inline void setup_lass(struct cpuinfo_x86 *c)
> +{
> +	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> +		/*
> +		 * Legacy vsyscall page access causes a #GP when LASS is
> +		 * active. However, vsyscall emulation isn't supported
> +		 * with #GP. To avoid breaking userspace, disable LASS
> +		 * if the emulation code is compiled in.
> +		 */
> +		if (IS_ENABLED(CONFIG_X86_VSYSCALL_EMULATION)) {
> +			pr_info_once("x86/cpu: Disabling LASS due to CONFIG_X86_VSYSCALL_EMULATION=y\n");
> +			setup_clear_cpu_cap(X86_FEATURE_LASS);
> +			return;
> +		}
> +
> +		cr4_set_bits(X86_CR4_LASS);
> +	}
> +}

This breaks two rules I have:

 1. Indent as little as practical
 2. Keep the main code flow at the lowest indentation level

IOW, this should be:

static __always_inline void setup_lass(struct cpuinfo_x86 *c)
{
	if (!cpu_feature_enabled(X86_FEATURE_LASS))
		return;
...

But I can fix that up when I apply this. I think it's mostly ready.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
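
The early-return shape suggested above can be sketched in full as a user-space mock. This is not the code that was applied: struct mock_cpu and mock_setup_lass() are hypothetical stand-ins, with each field modelling the kernel call named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel state touched by setup_lass(). */
struct mock_cpu {
	bool has_lass;		 /* models cpu_feature_enabled(X86_FEATURE_LASS) */
	bool vsyscall_emulation; /* models IS_ENABLED(CONFIG_X86_VSYSCALL_EMULATION) */
	bool cr4_lass;		 /* models the CR4.LASS bit */
};

void mock_setup_lass(struct mock_cpu *c)
{
	assert(c != NULL);

	/* Bail out first so the main flow stays at the lowest indentation. */
	if (!c->has_lass)
		return;

	/* The special case is handled next and also returns early. */
	if (c->vsyscall_emulation) {
		printf("Disabling LASS due to CONFIG_X86_VSYSCALL_EMULATION=y\n");
		c->has_lass = false;	/* models setup_clear_cpu_cap() */
		return;
	}

	/* Main code path: no nesting left. */
	c->cr4_lass = true;		/* models cr4_set_bits(X86_CR4_LASS) */
}
```

Compared with the quoted hunk, the behaviour is unchanged; only the outer if block is inverted so that the CR4 write ends up at function scope.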

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 17:11   ` Dave Hansen
@ 2025-10-31 17:38     ` Andy Lutomirski
  2025-10-31 17:41       ` Dave Hansen
  2025-10-31 19:04       ` Sohil Mehta
  2025-10-31 18:32     ` Sohil Mehta
  1 sibling, 2 replies; 67+ messages in thread
From: Andy Lutomirski @ 2025-10-31 17:38 UTC (permalink / raw)
  To: Dave Hansen, Sohil Mehta, the arch/x86 maintainers, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi



On Fri, Oct 31, 2025, at 10:11 AM, Dave Hansen wrote:
> On 10/29/25 14:03, Sohil Mehta wrote:
>> From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> 
>> While mapping EFI runtime services, set_virtual_address_map() is called
>> at its lower mapping, which LASS prohibits. Wrapping the EFI call with
>> lass_disable()/_enable() is not enough, because the AC flag only
>> controls data accesses, and not instruction fetches.
>> 
>> Use the big hammer and toggle the CR4.LASS bit to make this work.
>
> One thing that's actually missing here is an explanation on how it's OK
> to munge CR bits here. Why are preemption and interrupts not a problem?
>
> A reviewer would have to go off and figure this out on their own.

I have another question: why is this one specific call a problem as opposed to something more general?  Wouldn’t any EFI call that touches the low EFI mapping be a problem?  Are there any odd code paths that touch low mapped EFI *data* that would fault?

Am I imagining an issue that doesn’t exist?  Is there some way to be reasonably convinced that you haven’t missed another EFI code path?  Would it be ridiculous to defer enabling LASS until we’re almost ready to run user code?

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 17:38     ` Andy Lutomirski
@ 2025-10-31 17:41       ` Dave Hansen
  2025-10-31 18:03         ` Sohil Mehta
  2025-10-31 19:04       ` Sohil Mehta
  1 sibling, 1 reply; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 17:41 UTC (permalink / raw)
  To: Andy Lutomirski, Sohil Mehta, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/31/25 10:38, Andy Lutomirski wrote:
> Am I imagining an issue that doesn’t exist?  Is there some way to be
> reasonably convinced that you haven’t missed another EFI code path?
> Would it be ridiculous to defer enabling LASS until we’re almost
> ready to run user code?

Deferring is a good idea. I was just asking for the same thing for the
CR pinning enforcement. The earlier we try to do these things, the more
we just trip over ourselves.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 17:41       ` Dave Hansen
@ 2025-10-31 18:03         ` Sohil Mehta
  2025-10-31 18:12           ` Dave Hansen
  0 siblings, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-31 18:03 UTC (permalink / raw)
  To: Dave Hansen, Andy Lutomirski, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/31/2025 10:41 AM, Dave Hansen wrote:
> On 10/31/25 10:38, Andy Lutomirski wrote:
>> Am I imagining an issue that doesn’t exist?  Is there some way to be
>> reasonably convinced that you haven’t missed another EFI code path?
>> Would it be ridiculous to defer enabling LASS until we’re almost
>> ready to run user code?

> Deferring is a good idea. I was just asking for the same thing for the
> CR pinning enforcement. The earlier we try to do these things, the more
> we just trip over ourselves.

I had suggested deferring as well to Kirill when I was reviewing the
series. He preferred to enable LASS with other similar features such as
SMAP, SMEP.

One other thing to consider:

Doing it in identify_cpu() makes it easy for all the APs to program
their CR4.LASS bit. If we were to defer it, we would need some
additional work to set up all the APs.

Do we already do this for something else? That would make it easier to
tag along.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 18:03         ` Sohil Mehta
@ 2025-10-31 18:12           ` Dave Hansen
  2025-11-07  9:04             ` Peter Zijlstra
  0 siblings, 1 reply; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 18:12 UTC (permalink / raw)
  To: Sohil Mehta, Andy Lutomirski, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/31/25 11:03, Sohil Mehta wrote:
>> Deferring is a good idea. I was just asking for the same thing for the
>> CR pinning enforcement. The earlier we try to do these things, the more
>> we just trip over ourselves.
> I had suggested deferring as well to Kirill when I was reviewing the
> series. He preferred to enable LASS with other similar features such as
> SMAP, SMEP.
> 
> One other thing to consider:
> 
> Doing it in identify_cpu() makes it easy for all the APs to program
> their CR4.LASS bit. If we were to defer it, we would need some
> additional work to set up all the APs.

That's true. We'd need an smp_call_function() of some kind. *But*, once
that is in place, it's hopefully just a matter of moving that one line
of code per feature from identify_cpu() over to the new function.

> Do we already do this for something else? That would make it easier to
> tag along.

We don't do it for anything else that I can think of.

But there's a pretty broad set of things that are for "security" that
aren't necessary while you're just running trusted ring0 code:

 * SMAP/SMEP
 * CR pinning itself
 * MSR_IA32_SPEC_CTRL
 * MSR_IA32_TSX_CTRL

They just haven't mattered until now because they don't have any
practical effect until you actually have code running on _PAGE_USER
mappings trying to attack the kernel.
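
The deferral idea discussed above can be sketched as a user-space mock. Nothing like this exists in the series: mock_cr4[], mock_on_each_cpu() and mock_enable_deferred_security() are hypothetical names, and the loop only stands in for an smp_call_function()-style cross-call. The CR4 bit positions (SMEP 20, SMAP 21, LASS 27) follow the architectural layout.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_NR_CPUS	4
#define MOCK_CR4_SMEP	(1UL << 20)
#define MOCK_CR4_SMAP	(1UL << 21)
#define MOCK_CR4_LASS	(1UL << 27)

/* One mock CR4 value per CPU; all security bits start out clear. */
static unsigned long mock_cr4[MOCK_NR_CPUS];

typedef void (*mock_cpu_fn)(int cpu);

/* Models a cross-call: run fn once for every CPU, BSP and APs alike. */
void mock_on_each_cpu(mock_cpu_fn fn)
{
	for (int cpu = 0; cpu < MOCK_NR_CPUS; cpu++)
		fn(cpu);
}

/*
 * The single late hook. Each deferred feature contributes one line
 * here instead of one line in identify_cpu().
 */
void mock_enable_deferred_security(int cpu)
{
	assert(cpu >= 0 && cpu < MOCK_NR_CPUS);
	mock_cr4[cpu] |= MOCK_CR4_SMEP;
	mock_cr4[cpu] |= MOCK_CR4_SMAP;
	mock_cr4[cpu] |= MOCK_CR4_LASS;
}

/* True when every CPU has all of the given CR4 bits set. */
bool mock_all_cpus_have(unsigned long bits)
{
	for (int cpu = 0; cpu < MOCK_NR_CPUS; cpu++)
		if ((mock_cr4[cpu] & bits) != bits)
			return false;
	return true;
}
```

In this model, early boot code (including the EFI switch to virtual mode) runs before mock_on_each_cpu(mock_enable_deferred_security) is called, so it never sees LASS enabled. The cost is the extra cross-CPU step that enabling in identify_cpu() avoids.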

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 17:11   ` Dave Hansen
  2025-10-31 17:38     ` Andy Lutomirski
@ 2025-10-31 18:32     ` Sohil Mehta
  1 sibling, 0 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-31 18:32 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/31/2025 10:11 AM, Dave Hansen wrote:
> One thing that's actually missing here is an explanation on how it's OK
> to munge CR bits here. Why are preemption and interrupts not a problem?
> 

This is called pretty early on from the BSP init flow.

start_kernel()
  arch_cpu_finalize_init()
    efi_enter_virtual_mode()
      __efi_enter_virtual_mode()

I had assumed we run with interrupts disabled. But, that's not true.
Interrupts are enabled midway during start_kernel(). So,
arch_cpu_finalize_init() is called with interrupts enabled.

We write to CR bits during FPU init which happens right before EFI
enters virtual mode. So I am probably missing something obvious that
makes it okay.



* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 17:38     ` Andy Lutomirski
  2025-10-31 17:41       ` Dave Hansen
@ 2025-10-31 19:04       ` Sohil Mehta
  2025-11-07  7:36         ` Sohil Mehta
  1 sibling, 1 reply; 67+ messages in thread
From: Sohil Mehta @ 2025-10-31 19:04 UTC (permalink / raw)
  To: Andy Lutomirski, Dave Hansen, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/31/2025 10:38 AM, Andy Lutomirski wrote:

> I have another question: why is this one specific call a problem as opposed to something more general?  Wouldn’t any EFI call that touches the low EFI mapping be a problem?  Are there any odd code paths that touch low mapped EFI *data* that would fault?
> 

I assumed EFI is running in physical mode before this.
efi_sync_low_kernel_mappings() is called right before calling
set_virtual_address_map(). So, this is the only call that happens at the
low mapping while switching to virtual mode.

But, my EFI knowledge is fairly limited. I am realizing that there are
some assumptions built into this patch that I may not be aware of.

> Is there some way to be reasonably convinced that you haven’t missed another EFI code path?

We have been running the patches on internal test platforms for a couple
of years. But, that would only cover the common paths. I'll dig deeper
to get you a convincing answer.

* Re: [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message
  2025-10-31 17:16   ` Dave Hansen
@ 2025-10-31 19:59     ` Sohil Mehta
  2025-10-31 20:03       ` Andy Lutomirski
  2025-10-31 20:56       ` Dave Hansen
  0 siblings, 2 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-31 19:59 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/31/2025 10:16 AM, Dave Hansen wrote:
> On 10/29/25 14:03, Sohil Mehta wrote:
>> To make the transition easier, enhance the #GP Oops message to include a
>> hint about LASS violations. Also, add a special hint for kernel NULL
>> pointer dereferences to match with the existing #PF message.
> 
> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
> 
> This also reminds me... Are there tests for this somewhere? How did you
> test all these new messages?

I have some very simple kernel modules that access invalid user memory
and generate these faults. I configure the kernel not to panic/reboot.
But, I have been running them manually.

Invalid accesses from the kernel generate:
#PF (without LASS):
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  BUG: unable to handle page fault for address: 0000000000100000

#GP (with LASS):
  Oops: general protection fault, kernel NULL pointer dereference 0x0: 0000
  Oops: general protection fault, probably LASS violation for address 0x100000: 0000

For testing user SIGSEGVs, the Vsyscall tests have been sufficient to
cover all scenarios.

Were you looking for anything specific? I can clean them up and post
them if required.
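
The two #GP hints shown above can be modelled with a small classifier. This is a sketch under assumptions rather than the logic in the patch: the enum and function names are hypothetical, the NULL-pointer cutoff is taken to be one 4 KiB page, and non-canonical addresses (which also raise #GP) are ignored. The one architectural fact it relies on is that LASS separates user from kernel addresses by bit 63 of the linear address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hint categories for a kernel-mode #GP. */
enum mock_gp_hint {
	MOCK_GP_NO_HINT,
	MOCK_GP_NULL_POINTER,
	MOCK_GP_LASS_VIOLATION,
};

#define MOCK_PAGE_SIZE 4096ULL	/* assumed NULL-pointer cutoff */

enum mock_gp_hint mock_classify_gp(uint64_t addr)
{
	/* Addresses in the first page are reported as NULL dereferences. */
	if (addr < MOCK_PAGE_SIZE)
		return MOCK_GP_NULL_POINTER;

	/* Bit 63 clear: a user-half address touched from the kernel. */
	if (!(addr >> 63))
		return MOCK_GP_LASS_VIOLATION;

	return MOCK_GP_NO_HINT;
}

/* Format the hint text in the style of the messages quoted above. */
int mock_format_gp_hint(char *buf, size_t len, uint64_t addr)
{
	assert(buf != NULL && len > 0);

	switch (mock_classify_gp(addr)) {
	case MOCK_GP_NULL_POINTER:
		return snprintf(buf, len,
				"kernel NULL pointer dereference 0x%llx",
				(unsigned long long)addr);
	case MOCK_GP_LASS_VIOLATION:
		return snprintf(buf, len,
				"probably LASS violation for address 0x%llx",
				(unsigned long long)addr);
	default:
		return snprintf(buf, len, "no hint");
	}
}
```

With the two addresses from the example output, 0x0 falls in the NULL-pointer bucket and 0x100000 in the LASS bucket; a kernel-half address such as 0xffffffff81000000 gets no hint.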


* Re: [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message
  2025-10-31 19:59     ` Sohil Mehta
@ 2025-10-31 20:03       ` Andy Lutomirski
  2025-10-31 20:56       ` Dave Hansen
  1 sibling, 0 replies; 67+ messages in thread
From: Andy Lutomirski @ 2025-10-31 20:03 UTC (permalink / raw)
  To: Sohil Mehta, Dave Hansen, the arch/x86 maintainers, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi



On Fri, Oct 31, 2025, at 12:59 PM, Sohil Mehta wrote:
> On 10/31/2025 10:16 AM, Dave Hansen wrote:
>> On 10/29/25 14:03, Sohil Mehta wrote:
>>> To make the transition easier, enhance the #GP Oops message to include a
>>> hint about LASS violations. Also, add a special hint for kernel NULL
>>> pointer dereferences to match with the existing #PF message.
>> 
>> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
>> 
>> This also reminds me... Are there tests for this somewhere? How did you
>> test all these new messages?
>
> I have some very simple kernel modules that access invalid user memory
> and generate these faults. I configure the kernel not to panic/reboot.
> But, I have been running them manually.
>
> Invalid accesses from the kernel generate:
> #PF (without LASS):
>   BUG: kernel NULL pointer dereference, address: 0000000000000000
>   BUG: unable to handle page fault for address: 0000000000100000
>
> #GP (with LASS):
>   Oops: general protection fault, kernel NULL pointer dereference 0x0: 0000
>   Oops: general protection fault, probably LASS violation for address 0x100000: 0000
>
> For testing user SIGSEGVs, the Vsyscall tests have been sufficient to
> cover all scenarios.
>
> Were you looking for anything specific? I can clean them up and post
> them if required.

LKDTM is basically meant for this use case. If you can’t provoke a LASS failure from there, maybe just add another failure type?  I would expect that LKDTM can already do a SMAP violation.

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-31 17:21   ` Dave Hansen
@ 2025-10-31 20:04     ` Sohil Mehta
  0 siblings, 0 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-10-31 20:04 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/31/2025 10:21 AM, Dave Hansen wrote:

> 
> But I can fix that up when I apply this. I think it's mostly ready.
> 

Thanks a lot for all the reviews. I'll try to get the EFI thing sorted.

If we end up applying this revision, can you please remove the "default"
from the patch title?

Making LASS depend on CONFIG_X86_VSYSCALL_EMULATION being disabled makes
it off by default.



* Re: [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message
  2025-10-31 19:59     ` Sohil Mehta
  2025-10-31 20:03       ` Andy Lutomirski
@ 2025-10-31 20:56       ` Dave Hansen
  1 sibling, 0 replies; 67+ messages in thread
From: Dave Hansen @ 2025-10-31 20:56 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi

On 10/31/25 12:59, Sohil Mehta wrote:
> Were you looking for anything specific? I can clean them up and post
> them if required.

It would be nice to have these in-kernel somehow, even if they were
silly debugfs knobs or something. Ira had some tests for PKS that never
went in but you might be able to reuse some of his techniques.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 19:04       ` Sohil Mehta
@ 2025-11-07  7:36         ` Sohil Mehta
  0 siblings, 0 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-11-07  7:36 UTC (permalink / raw)
  To: Andy Lutomirski, Dave Hansen, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf,
	Peter Zijlstra (Intel), Ard Biesheuvel, Kirill A . Shutemov,
	Xin Li, David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On 10/31/2025 12:04 PM, Sohil Mehta wrote:
>> Is there some way to be reasonably convinced that you haven’t missed another EFI code path?
> 
> We have been running the patches on internal test platforms for a couple
> of years. But, that would only cover the common paths. I'll dig deeper
> to get you a convincing answer.

In summary, the current approach could work for BIOSes that behave well.
But, the kernel makes lots of exceptions for broken firmware and odd
implementations. We would need extra guardrails and changes to support
those, or mark them unsupported. Please see my analysis below.

For now, I am wondering if we should leave LASS disabled when EFI is
enabled as well (similar to vsyscall emulation).

	if (IS_ENABLED(CONFIG_EFI))
		// Do not enable LASS

I think the rest of the patches are ready. I can post a new revision
with the above change to collect additional reviews/acks. Even though
this would significantly restrict usage, it would make it easier to
review EFI support (as well as vsyscall support) in its own independent,
focused series.
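
Taken together with the vsyscall restriction already in v11, the interim policy proposed here amounts to a simple predicate. A rough user-space model follows. The names struct lass_policy_input and lass_should_enable() are made up for illustration and do not come from the series, and the real check would use IS_ENABLED() on the Kconfig symbols rather than runtime booleans.

```c
#include <stdbool.h>

/* Hypothetical inputs; each field stands in for a CPU or Kconfig fact. */
struct lass_policy_input {
	bool cpu_has_lass;		/* CPU enumerates LASS */
	bool vsyscall_emulation;	/* models CONFIG_X86_VSYSCALL_EMULATION */
	bool efi;			/* models CONFIG_EFI */
};

/* Return true only when none of the unsupported configurations apply. */
static bool lass_should_enable(const struct lass_policy_input *in)
{
	if (!in->cpu_has_lass)
		return false;
	if (in->vsyscall_emulation)
		return false;	/* v11: vsyscall emulation not handled yet */
	if (in->efi)
		return false;	/* proposal in this mail: skip LASS with EFI */
	return true;
}
```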

My analysis
-----------
After a one-week crash course in EFI (mainly reading LKML archives),
below is my understanding. Thanks to Rick and Peter Anvin for the
pointers and insights. I would highly appreciate it if folks could
validate my assumptions and help with some open questions.

1) Does LASS affect EFI BootTimeServices?

Contrary to my assumption, EFI_BOOT_SERVICES_CODE/_DATA could be
accessed even after ExitBootServices() has been called. For example,
early ACPI code in efi_bgrt_init() accesses it.

efi_check_for_embedded_firmwares() accesses this memory even after
SetVirtualAddressMap() has been called, right before
efi_free_boot_services().

At a minimum, we need to disable LASS around these special cases or
enable LASS only after EFI has completely finished entering virtual mode
(including freeing boot services).

Ideally, we would enable LASS much later, right before enabling userspace.

2) How does SetVirtualAddressMap() impact LASS?

SetVirtualAddressMap() is the first and only runtime service call that
is made in EFI physical mode (at the lower mapping). After the call,
firmware is expected to switch all its pointers to the high virtual
address provided by the kernel.

If LASS is enabled, it needs to be temporarily turned off during
SetVirtualAddressMap() as done in this patch. Though, the resolution in
#1 would likely make this patch moot.

3) Would LASS interfere with other runtime services?

Unfortunately, some firmware tends to cling to the old physical
addresses even after SetVirtualAddressMap() and doesn't completely
switch over to using the new virtual addresses. To work around this, the
kernel dual-maps all the memory marked as EFI_RUNTIME under a separate
efi_mm: first with a 1:1 map and second with the high virtual address.
See efi_map_region().

Also, some runtime services expect to access the first 4 KB of physical
memory, which is also mapped 1:1 to avoid failures.
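
To illustrate why the dual mapping matters for LASS: each runtime region ends up reachable through two virtual aliases, and only the 1:1 alias lies in the lower half of the address space. The sketch below is a user-space model only. The struct, the helper names, and any addresses used with it are invented for illustration. It relies on the fact that LASS classifies a linear address by bit 63.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: one runtime region and its two virtual aliases. */
struct efi_region_model {
	uint64_t phys;		/* physical base */
	uint64_t va_1to1;	/* identity alias: VA == PA */
	uint64_t va_high;	/* alias handed to SetVirtualAddressMap() */
};

/* LASS treats addresses with bit 63 clear as user-mode addresses. */
static bool lass_user_address(uint64_t va)
{
	return (va >> 63) == 0;
}

/* Would a supervisor-mode access through this alias trip LASS? */
static bool alias_trips_lass(uint64_t va, bool lass_enabled)
{
	return lass_enabled && lass_user_address(va);
}
```

With a region at physical 0x7f000000 and a high alias somewhere in the kernel half, alias_trips_lass() is true only for the 1:1 alias, which is exactly the alias that misbehaving firmware keeps using.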

To avoid any of these corner cases, LASS must be toggled every time we
make a runtime EFI call. Because efi_mm doesn't have real user mappings,
disabling LASS after efi_enter_mm() should be fine.

I am unsure whether the accesses are only data accesses, or whether
instruction fetches could happen as well. Based on that, we would need a
STAC/CLAC pair or a CR4.LASS toggle to disable LASS.

Writing to CR4 might be the safest option, because performance is not a
concern here, right?
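
The CR4-based option can be pictured as a save/clear/call/restore wrapper around each runtime service call. The following is a user-space model, not kernel code: model_cr4 is a plain variable standing in for the real register, efi_call_lass_off() and record_lass_state() are hypothetical names, and the CR4.LASS bit position (27) is stated here as an assumption.

```c
#include <stdint.h>

#define MODEL_CR4_LASS (1ULL << 27)	/* CR4.LASS bit position, assumed */

static uint64_t model_cr4 = MODEL_CR4_LASS;	/* stand-in for the real CR4 */

typedef int (*efi_service_fn)(void *arg);

/* Hypothetical wrapper: clear CR4.LASS for the duration of one call. */
static int efi_call_lass_off(efi_service_fn fn, void *arg)
{
	uint64_t saved = model_cr4;
	int ret;

	model_cr4 &= ~MODEL_CR4_LASS;	/* firmware may touch low addresses */
	ret = fn(arg);
	model_cr4 = saved;		/* restore whatever was set before */
	return ret;
}

/* Example service: records whether LASS was set while it ran. */
static int record_lass_state(void *arg)
{
	*(int *)arg = (model_cr4 & MODEL_CR4_LASS) != 0;
	return 0;
}
```

Unlike a STAC/CLAC pair, clearing the CR4 bit covers instruction fetches as well as data accesses, which is why it sidesteps the open question above at the cost of a slower toggle.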

4) What happens if an EFI runtime call trips LASS?

If a LASS violation happens with EFI, the system would trigger a #GP and
hang. For page faults, we seem to have introduced
efi_crash_gracefully_on_page_fault() to attribute the fault to EFI. Do
we require something similar for #GP?

My inclination is to add the helpful prints after we run into an issue.

5) Is there any other aspect of EFI that should be considered?

Please let me know if I have missed something.

Another approach could be to support only limited (well-behaved)
firmware implementations with LASS. But, I am not sure how practical
that would be given all the quirks we have in place.

Thanks,
Sohil




* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-10-30 16:27     ` Dave Hansen
@ 2025-11-07  8:01       ` H. Peter Anvin
  2025-11-07 20:08         ` Sohil Mehta
  0 siblings, 1 reply; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-07  8:01 UTC (permalink / raw)
  To: Dave Hansen, Sohil Mehta, x86, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick Edgecombe, Vegard Nossum, Andrew Cooper,
	Randy Dunlap, Geert Uytterhoeven, Kees Cook, Tony Luck,
	Alexander Shishkin, linux-doc, linux-kernel, linux-efi

On 2025-10-30 09:27, Dave Hansen wrote:
> On 10/30/25 01:40, H. Peter Anvin wrote:
>> Legacy vsyscalls have been obsolete for how long now?
> 
> I asked Sohil to start throwing out all the non-essential bits from this
> series. My thought was that we could get mainline so that LASS itself
> could be enabled, even if it was in a somewhat weird config that a
> distro would probably never do.
> 
> After that is merged, we can circle back and decide what to do with the
> remaining bits like CR pinning and vsyscall emulation. I don't think any
> of those bits will change the basic desire to have LASS support in the
> kernel.
> 
> Does that sound like a sane approach to everyone?

XONLY vsyscall emulation should be trivial, though (just look for the magic
RIP values, just like the page fault code does now, too.) The FULL emulation
mode is completely irrelevant these days, so I don't think it matters at all.
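
As a rough illustration of "look for the magic RIP values": the legacy vsyscall page sits at a fixed address and has three entry points spaced 0x400 bytes apart, so recognizing a vsyscall attempt from the faulting RIP is a short range-and-alignment check. The function name vsyscall_nr_from_rip() is hypothetical, and this user-space sketch is not the code from the series or from the page fault path.

```c
#include <stdint.h>

#define VSYSCALL_BASE 0xffffffffff600000ULL	/* legacy vsyscall page */

/*
 * Return the vsyscall number (0..2) if rip is one of the three legacy
 * entry points, or -1 otherwise. Entries are spaced 0x400 bytes apart.
 */
static int vsyscall_nr_from_rip(uint64_t rip)
{
	uint64_t off = rip - VSYSCALL_BASE;

	if (rip < VSYSCALL_BASE)
		return -1;
	if (off & 0x3ff)	/* not on a 1 KiB entry boundary */
		return -1;
	if ((off >> 10) >= 3)	/* only three entries exist */
		return -1;
	return (int)(off >> 10);
}
```

A #GP handler built along these lines would emulate the call when the result is non-negative and fall through to the normal fault handling otherwise.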

EFI handling is similarly straightforward: just disable CR4.LASS right before
calling EFI, and enable it on return. As long as we are *already* in the
efi_mm context, it is perfectly safe to disable CR4.LASS, because there *is no
user memory mapped at that point*.

These two things should only be a few lines of code each, and I don't see any
reason to further elaborate them at this time (to handle FULL emulation, or to
take a LASS trap inside EFI to write a shame-the-vendor message; if we wanted
to do that, it would be better to make that independent of LASS and empty out
efi_mm entirely).

Am I missing something?

	-hpa



* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-10-31 18:12           ` Dave Hansen
@ 2025-11-07  9:04             ` Peter Zijlstra
  2025-11-07  9:22               ` Ard Biesheuvel
  0 siblings, 1 reply; 67+ messages in thread
From: Peter Zijlstra @ 2025-11-07  9:04 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Sohil Mehta, Andy Lutomirski, the arch/x86 maintainers,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Jonathan Corbet, H. Peter Anvin, Josh Poimboeuf, Ard Biesheuvel,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick P Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, Linux Kernel Mailing List, linux-efi

On Fri, Oct 31, 2025 at 11:12:53AM -0700, Dave Hansen wrote:

> But there's a pretty broad set of things that are for "security" that
> aren't necessary while you're just running trusted ring0 code:
> 
>  * SMAP/SMEP
>  * CR pinning itself
>  * MSR_IA32_SPEC_CTRL
>  * MSR_IA32_TSX_CTRL
> 
> They just haven't mattered until now because they don't have any
> practical effect until you actually have code running on _PAGE_USER
> mappings trying to attack the kernel.

But that's just the thing: EFI is *NOT* trusted! We're basically
disabling all security features (not listed above are CET and CFI) to
run this random garbage we have no control over.

How about we just flat out refuse EFI runtime services? What are they
actually needed for? Why are we bending over backwards and subverting
our security for this stuff?

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07  9:04             ` Peter Zijlstra
@ 2025-11-07  9:22               ` Ard Biesheuvel
  2025-11-07  9:27                 ` H. Peter Anvin
                                   ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-07  9:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, H. Peter Anvin,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Fri, 7 Nov 2025 at 10:04, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Oct 31, 2025 at 11:12:53AM -0700, Dave Hansen wrote:
>
> > But there's a pretty broad set of things that are for "security" that
> > aren't necessary while you're just running trusted ring0 code:
> >
> >  * SMAP/SMEP
> >  * CR pinning itself
> >  * MSR_IA32_SPEC_CTRL
> >  * MSR_IA32_TSX_CTRL
> >
> > They just haven't mattered until now because they don't have any
> > practical effect until you actually have code running on _PAGE_USER
> > mappings trying to attack the kernel.
>
> But that's just the thing: EFI is *NOT* trusted! We're basically
> disabling all security features (not listed above are CET and CFI) to
> run this random garbage we have no control over.
>
> How about we just flat out refuse EFI runtime services? What are they
> actually needed for? Why are we bending over backwards and subverting
> our security for this stuff?

On x86, it is mostly the EFI variable services that user space has
come to rely on, not only for setting the boot path (which typically
happens only once at installation time, when the path to GRUB is set
as the first boot option). Unfortunately, the systemd folks have taken
a liking to this feature too, and have started storing things in
there.

There is also PRM, which is much worse, as it permits devices in the
ACPI namespace to call firmware routines that are mapped privileged in
the OS address space in the same way. I objected to this at the time,
and asked for a facility where we could at least mark such code as
unprivileged (and run it as such) but this was ignored, as Intel and
MS had already sealed the deal and put this into production. This is
much worse than typical EFI routines, as the PRM code is ODM/OEM code
rather than something that comes from the upstream EFI implementation.
It is basically a dumping ground for code that used to run in SMM
because it was too ugly to run anywhere else. </rant>

It would be nice if we could

a) Get rid of SetVirtualAddressMap(), which is another insane hack
that should never have been supported on 64-bit systems. On arm64, we
no longer call it unless there is a specific need for it (some Ampere
Altra systems with buggy firmware will crash otherwise). On x86,
though, it might be tricky because there is so much buggy firmware.
Perhaps we should phase it out by checking for the UEFI version, so
that future systems will avoid it. This would mean, however, that EFI
code remains in the low user address space, which may not be what you
want (unless we do c) perhaps?)

b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
for arm64 a couple of years ago, but it was very intrusive and the ARM
intern in question went on to do more satisfying work.

c) Unmap the kernel KPTI style while the runtime calls are in
progress? This should be rather straight-forward, although it might
not help a lot as the code in question still runs privileged.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07  9:22               ` Ard Biesheuvel
@ 2025-11-07  9:27                 ` H. Peter Anvin
  2025-11-07  9:35                   ` Ard Biesheuvel
  2025-11-07  9:40                 ` Peter Zijlstra
  2025-11-07 10:10                 ` Peter Zijlstra
  2 siblings, 1 reply; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-07  9:27 UTC (permalink / raw)
  To: Ard Biesheuvel, Peter Zijlstra
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, Josh Poimboeuf,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick P Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, Linux Kernel Mailing List, linux-efi

On November 7, 2025 1:22:30 AM PST, Ard Biesheuvel <ardb@kernel.org> wrote:
>On Fri, 7 Nov 2025 at 10:04, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Fri, Oct 31, 2025 at 11:12:53AM -0700, Dave Hansen wrote:
>>
>> > But there's a pretty broad set of things that are for "security" that
>> > aren't necessary while you're just running trusted ring0 code:
>> >
>> >  * SMAP/SMEP
>> >  * CR pinning itself
>> >  * MSR_IA32_SPEC_CTRL
>> >  * MSR_IA32_TSX_CTRL
>> >
>> > They just haven't mattered until now because they don't have any
>> > practical effect until you actually have code running on _PAGE_USER
>> > mappings trying to attack the kernel.
>>
>> But that's just the thing: EFI is *NOT* trusted! We're basically
>> disabling all security features (not listed above are CET and CFI) to
>> run this random garbage we have no control over.
>>
>> How about we just flat out refuse EFI runtime services? What are they
>> actually needed for? Why are we bending over backwards and subverting
>> our security for this stuff?
>
>On x86, it is mostly the EFI variable services that user space has
>come to rely on, not only for setting the boot path (which typically
>happens only once at installation time, when the path to GRUB is set
>as the first boot option). Unfortunately, the systemd folks have taken
>a liking to this feature too, and have started storing things in
>there.
>
>There is also PRM, which is much worse, as it permits devices in the
>ACPI namespace to call firmware routines that are mapped privileged in
>the OS address space in the same way. I objected to this at the time,
>and asked for a facility where we could at least mark such code as
>unprivileged (and run it as such) but this was ignored, as Intel and
>MS had already sealed the deal and put this into production. This is
>much worse than typical EFI routines, as the PRM code is ODM/OEM code
>rather than something that comes from the upstream EFI implementation.
>It is basically a dumping ground for code that used to run in SMM
>because it was too ugly to run anywhere else. </rant>
>
>It would be nice if we could
>
>a) Get rid of SetVirtualAddressMap(), which is another insane hack
>that should never have been supported on 64-bit systems. On arm64, we
>no longer call it unless there is a specific need for it (some Ampere
>Altra systems with buggy firmware will crash otherwise). On x86,
>though, it might be tricky because there is so much buggy firmware.
>Perhaps we should phase it out by checking for the UEFI version, so
>that future systems will avoid it. This would mean, however, that EFI
>code remains in the low user address space, which may not be what you
>want (unless we do c) perhaps?)
>
>b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
>for arm64 a couple of years ago, but it was very intrusive and the ARM
>intern in question went on to do more satisfying work.
>
>c) Unmap the kernel KPTI style while the runtime calls are in
>progress? This should be rather straight-forward, although it might
>not help a lot as the code in question still runs privileged.

Firmware update is a big one.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07  9:27                 ` H. Peter Anvin
@ 2025-11-07  9:35                   ` Ard Biesheuvel
  0 siblings, 0 replies; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-07  9:35 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Peter Zijlstra, Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, Josh Poimboeuf,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick P Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, Linux Kernel Mailing List, linux-efi

On Fri, 7 Nov 2025 at 10:27, H. Peter Anvin <hpa@zytor.com> wrote:
>
> On November 7, 2025 1:22:30 AM PST, Ard Biesheuvel <ardb@kernel.org> wrote:
> >On Fri, 7 Nov 2025 at 10:04, Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Fri, Oct 31, 2025 at 11:12:53AM -0700, Dave Hansen wrote:
> >>
> >> > But there's a pretty broad set of things that are for "security" that
> >> > aren't necessary while you're just running trusted ring0 code:
> >> >
> >> >  * SMAP/SMEP
> >> >  * CR pinning itself
> >> >  * MSR_IA32_SPEC_CTRL
> >> >  * MSR_IA32_TSX_CTRL
> >> >
> >> > They just haven't mattered until now because they don't have any
> >> > practical effect until you actually have code running on _PAGE_USER
> >> > mappings trying to attack the kernel.
> >>
> >> But that's just the thing: EFI is *NOT* trusted! We're basically
> >> disabling all security features (not listed above are CET and CFI) to
> >> run this random garbage we have no control over.
> >>
> >> How about we just flat out refuse EFI runtime services? What are they
> >> actually needed for? Why are we bending over backwards and subverting
> >> our security for this stuff?
> >
> >On x86, it is mostly the EFI variable services that user space has
> >come to rely on, not only for setting the boot path (which typically
> >happens only once at installation time, when the path to GRUB is set
> >as the first boot option). Unfortunately, the systemd folks have taken
> >a liking to this feature too, and have started storing things in
> >there.
> >
> >There is also PRM, which is much worse, as it permits devices in the
> >ACPI namespace to call firmware routines that are mapped privileged in
> >the OS address space in the same way. I objected to this at the time,
> >and asked for a facility where we could at least mark such code as
> >unprivileged (and run it as such) but this was ignored, as Intel and
> >MS had already sealed the deal and put this into production. This is
> >much worse than typical EFI routines, as the PRM code is ODM/OEM code
> >rather than something that comes from the upstream EFI implementation.
> >It is basically a dumping ground for code that used to run in SMM
> >because it was too ugly to run anywhere else. </rant>
> >
> >It would be nice if we could
> >
> >a) Get rid of SetVirtualAddressMap(), which is another insane hack
> >that should never have been supported on 64-bit systems. On arm64, we
> >no longer call it unless there is a specific need for it (some Ampere
> >Altra systems with buggy firmware will crash otherwise). On x86,
> >though, it might be tricky because there is so much buggy firmware.
> >Perhaps we should phase it out by checking for the UEFI version, so
> >that future systems will avoid it. This would mean, however, that EFI
> >code remains in the low user address space, which may not be what you
> >want (unless we do c) perhaps?)
> >
> >b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
> >for arm64 a couple of years ago, but it was very intrusive and the ARM
> >intern in question went on to do more satisfying work.
> >
> >c) Unmap the kernel KPTI style while the runtime calls are in
> >progress? This should be rather straight-forward, although it might
> >not help a lot as the code in question still runs privileged.
>
> Firmware update is a big one.

Firmware update does not run under the OS.

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07  9:22               ` Ard Biesheuvel
  2025-11-07  9:27                 ` H. Peter Anvin
@ 2025-11-07  9:40                 ` Peter Zijlstra
  2025-11-07 10:09                   ` Ard Biesheuvel
  2025-11-07 10:10                 ` Peter Zijlstra
  2 siblings, 1 reply; 67+ messages in thread
From: Peter Zijlstra @ 2025-11-07  9:40 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, H. Peter Anvin,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:

> > But that's just the thing: EFI is *NOT* trusted! We're basically
> > disabling all security features (not listed above are CET and CFI) to
> > run this random garbage we have no control over.
> >
> > How about we just flat out refuse EFI runtime services? What are they
> > actually needed for? Why are we bending over backwards and subverting
> > our security for this stuff?
> 
> On x86, it is mostly the EFI variable services that user space has
> come to rely on, not only for setting the boot path (which typically
> happens only once at installation time, when the path to GRUB is set
> as the first boot option). Unfortunately, the systemd folks have taken
> a liking to this feature too, and have started storing things in
> there.

*groan*, so booting with noefi (I just went and found that option) will
cause a modern Linux system to fail booting?

> There is also PRM, which is much worse, as it permits devices in the
> ACPI namespace to call firmware routines that are mapped privileged in
> the OS address space in the same way. I objected to this at the time,
> and asked for a facility where we could at least mark such code as
> unprivileged (and run it as such) but this was ignored, as Intel and
> MS had already sealed the deal and put this into production. This is
> much worse than typical EFI routines, as the PRM code is ODM/OEM code
> rather than something that comes from the upstream EFI implementation.
> It is basically a dumping ground for code that used to run in SMM
> because it was too ugly to run anywhere else. </rant>

What the actual fuck!! And we support this garbage? Without
pr_err(FW_BUG ) notification?

How can one find such devices? I need to check my machine.

> It would be nice if we could
> 
> a) Get rid of SetVirtualAddressMap(), which is another insane hack
> that should never have been supported on 64-bit systems. On arm64, we
> no longer call it unless there is a specific need for it (some Ampere
> Altra systems with buggy firmware will crash otherwise). On x86,
> though, it might be tricky because there is so much buggy firmware.
> Perhaps we should phase it out by checking for the UEFI version, so
> that future systems will avoid it. This would mean, however, that EFI
> code remains in the low user address space, which may not be what you
> want (unless we do c) perhaps?)
> 
> b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
> for arm64 a couple of years ago, but it was very intrusive and the ARM
> intern in question went on to do more satisfying work.
> 
> c) Unmap the kernel KPTI style while the runtime calls are in
> progress? This should be rather straight-forward, although it might
> not help a lot as the code in question still runs privileged.

At the very least I think we should start printing scary messages about
disabling security to run untrusted code. This is all quite insane :/

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07  9:40                 ` Peter Zijlstra
@ 2025-11-07 10:09                   ` Ard Biesheuvel
  2025-11-07 10:27                     ` Peter Zijlstra
  2025-11-08  0:48                     ` Andy Lutomirski
  0 siblings, 2 replies; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-07 10:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, H. Peter Anvin,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Fri, 7 Nov 2025 at 10:40, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:
>
> > > But that's just the thing: EFI is *NOT* trusted! We're basically
> > > disabling all security features (not listed above are CET and CFI) to
> > > run this random garbage we have no control over.
> > >
> > > How about we just flat out refuse EFI runtime services? What are they
> > > actually needed for? Why are we bending over backwards and subverting
> > > our security for this stuff?
> >
> > On x86, it is mostly the EFI variable services that user space has
> > come to rely on, not only for setting the boot path (which typically
> > happens only once at installation time, when the path to GRUB is set
> > as the first boot option). Unfortunately, the systemd folks have taken
> > a liking to this feature too, and have started storing things in
> > there.
>
> *groan*, so booting with noefi (I just went and found that option) will
> cause a modern Linux system to fail booting?
>

As long as you install with EFI enabled, the impact of efi=noruntime
should be limited, given that x86 does not rely on EFI runtime
services for the RTC or for reboot/poweroff. But you will lose access
to the EFI variable store. (Not sure what 'noefi' does in comparison,
but keeping EFI enabled at boot time for things like secure/measured
boot and storage encryption will probably result in a net positive
impact on security/hardening as long as you avoid calling into the
firmware after boot)


> > There is also PRM, which is much worse, as it permits devices in the
> > ACPI namespace to call firmware routines that are mapped privileged in
> > the OS address space in the same way. I objected to this at the time,
> > and asked for a facility where we could at least mark such code as
> > unprivileged (and run it as such) but this was ignored, as Intel and
> > MS had already sealed the deal and put this into production. This is
> > much worse than typical EFI routines, as the PRM code is ODM/OEM code
> > rather than something that comes from the upstream EFI implementation.
> > It is basically a dumping ground for code that used to run in SMM
> > because it was too ugly to run anywhere else. </rant>
>
> What the actual fuck!! And we support this garbage? Without
> pr_err(FW_BUG ) notification?
>
> How can one find such devices? I need to check my machine.
>

Unless you have a PRMT table in the list of ACPI tables, your system
shouldn't be affected by this.
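
One way to check for that table from user space is to look for a file named after the table signature in the directory where the kernel exposes the ACPI tables. The sketch below assumes that directory is /sys/firmware/acpi/tables, as on typical Linux systems. The helper name acpi_table_present() is hypothetical, and reading that directory may require root on some systems.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* True if a file named after the table signature exists under dir. */
static bool acpi_table_present(const char *dir, const char *sig)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/%s", dir, sig);

	if (n < 0 || (size_t)n >= sizeof(path))
		return false;	/* path did not fit; treat as absent */
	return access(path, F_OK) == 0;
}
```

For example, acpi_table_present("/sys/firmware/acpi/tables", "PRMT") returning false suggests the firmware does not advertise any PRM handlers.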

> > It would be nice if we could
> >
> > a) Get rid of SetVirtualAddressMap(), which is another insane hack
> > that should never have been supported on 64-bit systems. On arm64, we
> > no longer call it unless there is a specific need for it (some Ampere
> > Altra systems with buggy firmware will crash otherwise). On x86,
> > though, it might be tricky because there is so much buggy firmware.
> > Perhaps we should phase it out by checking for the UEFI version, so
> > that future systems will avoid it. This would mean, however, that EFI
> > code remains in the low user address space, which may not be what you
> > want (unless we do c) perhaps?)
> >
> > b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
> > for arm64 a couple of years ago, but it was very intrusive and the ARM
> > intern in question went on to do more satisfying work.
> >
> > c) Unmap the kernel KPTI style while the runtime calls are in
> > progress? This should be rather straight-forward, although it might
> > not help a lot as the code in question still runs privileged.
>
> At the very least I think we should start printing scary messages about
> disabling security to run untrusted code. This is all quite insane :/

I agree in principle. However, calling it 'untrusted' is a bit
misleading here, given that you already rely on the same body of code
to boot your computer to begin with. I.e., if you suspect that the
code in question is conspiring against you, not calling it at runtime
to manipulate EFI variables is not going to help with that.

But from a robustness point of view, I agree - running vendor code at
the OS's privilege level at runtime that was only tested with Windows
is not great for stability, and it would be nice if we could leverage
the principle of least privilege and only permit it to access the
things that it actually needs to perform the task that we've asked it
to. This is why I asked for the ability to mark PRM services as
unprivileged, given that they typically only run some code and perhaps
poke some memory (either RAM or MMIO registers) that the OS never
accesses directly.

Question is though whether on x86, sandboxing is feasible: can VMs
call into SMM? Because that is where 95% of the EFI variable services
logic resides - the code running directly under the OS does very
little other than marshalling the arguments and passing them on.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07  9:22               ` Ard Biesheuvel
  2025-11-07  9:27                 ` H. Peter Anvin
  2025-11-07  9:40                 ` Peter Zijlstra
@ 2025-11-07 10:10                 ` Peter Zijlstra
  2025-11-07 10:17                   ` Ard Biesheuvel
  2 siblings, 1 reply; 67+ messages in thread
From: Peter Zijlstra @ 2025-11-07 10:10 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, H. Peter Anvin,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:

> There is also PRM, which is much worse, as it permits devices in the
> ACPI namespace to call firmware routines that are mapped privileged in
> the OS address space in the same way. I objected to this at the time,
> and asked for a facility where we could at least mark such code as
> unprivileged (and run it as such) but this was ignored, as Intel and
> MS had already sealed the deal and put this into production. This is
> much worse than typical EFI routines, as the PRM code is ODM/OEM code
> rather than something that comes from the upstream EFI implementation.
> It is basically a dumping ground for code that used to run in SMM
> because it was too ugly to run anywhere else. </rant>

'https://uefi.org/sites/default/files/resources/Platform Runtime Mechanism - with legal notice.pdf'

Has on page 16, section 3.1:

  8. PRM handlers must not contain any privileged instructions.

So we should be able to actually run this crap in ring3, right?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07 10:10                 ` Peter Zijlstra
@ 2025-11-07 10:17                   ` Ard Biesheuvel
  0 siblings, 0 replies; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-07 10:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, H. Peter Anvin,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Fri, 7 Nov 2025 at 11:10, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:
>
> > There is also PRM, which is much worse, as it permits devices in the
> > ACPI namespace to call firmware routines that are mapped privileged in
> > the OS address space in the same way. I objected to this at the time,
> > and asked for a facility where we could at least mark such code as
> > unprivileged (and run it as such) but this was ignored, as Intel and
> > MS had already sealed the deal and put this into production. This is
> > much worse than typical EFI routines, as the PRM code is ODM/OEM code
> > rather than something that comes from the upstream EFI implementation.
> > It is basically a dumping ground for code that used to run in SMM
> > because it was too ugly to run anywhere else. </rant>
>
> 'https://uefi.org/sites/default/files/resources/Platform Runtime Mechanism - with legal notice.pdf'
>
> Has on page 16, section 3.1:
>
>   8. PRM handlers must not contain any privileged instructions.
>
> So we should be able to actually run this crap in ring3, right?

How interesting! This wasn't in the draft that I reviewed at the time,
so someone did listen.

So it does seem feasible to drop privileges and reacquire them in
principle, as long as we ensure that all the memory touched by the PRM
services (stack, code, data, MMIO regions) is mapped appropriately in
the EFI memory map.
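
The mapping precondition above can be sketched as a simple coverage check.
This is a user-space illustration only, under stated assumptions: the
struct mem_desc layout is a simplified stand-in for a real EFI memory
descriptor, and region_is_runtime_mapped() is a hypothetical helper, not
an existing kernel function. Only the 4 KiB EFI page size and the
EFI_MEMORY_RUNTIME attribute bit are taken from the UEFI specification.
It checks runtime-attribute coverage and says nothing about whether the
page tables would grant ring3 access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_PAGE_SIZE      4096ULL
#define EFI_MEMORY_RUNTIME 0x8000000000000000ULL

/* Simplified, hypothetical stand-in for an EFI memory descriptor. */
struct mem_desc {
	uint64_t phys_addr;	/* start of the region */
	uint64_t num_pages;	/* length in 4 KiB pages */
	uint64_t attribute;	/* EFI_MEMORY_* attribute bits */
};

/*
 * Return true if every byte of [start, start + len) lies inside some
 * descriptor that carries EFI_MEMORY_RUNTIME. The region may span
 * several adjacent descriptors. Address overflow is not handled here.
 */
static bool region_is_runtime_mapped(const struct mem_desc *map, size_t n,
				     uint64_t start, uint64_t len)
{
	uint64_t cur = start;
	uint64_t end = start + len;

	while (cur < end) {
		bool advanced = false;

		for (size_t i = 0; i < n; i++) {
			uint64_t d_start = map[i].phys_addr;
			uint64_t d_end = d_start +
					 map[i].num_pages * EFI_PAGE_SIZE;

			if (!(map[i].attribute & EFI_MEMORY_RUNTIME))
				continue;
			if (cur >= d_start && cur < d_end) {
				/* Covered up to the end of this descriptor. */
				cur = d_end;
				advanced = true;
				break;
			}
		}
		if (!advanced)
			return false;	/* hole or non-runtime memory */
	}
	return true;
}
```

A caller would run this over each stack, code, data and MMIO range a
handler is expected to touch before attempting to run it unprivileged.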

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07 10:09                   ` Ard Biesheuvel
@ 2025-11-07 10:27                     ` Peter Zijlstra
  2025-11-08  0:48                     ` Andy Lutomirski
  1 sibling, 0 replies; 67+ messages in thread
From: Peter Zijlstra @ 2025-11-07 10:27 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Dave Hansen, Sohil Mehta, Andy Lutomirski,
	the arch/x86 maintainers, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Jonathan Corbet, H. Peter Anvin,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On Fri, Nov 07, 2025 at 11:09:44AM +0100, Ard Biesheuvel wrote:

> As long as you install with EFI enabled, the impact of efi=noruntime
> should be limited, given that x86 does not rely on EFI runtime
> services for the RTC or for reboot/poweroff. But you will lose access
> to the EFI variable store. (Not sure what 'noefi' does in comparison,
> but keeping EFI enabled at boot time for things like secure/measured
> boot and storage encryption will probably result in a net positive
> impact on security/hardening as long as you avoid calling into the
> firmware after boot)

I would say it should all stay before we start userspace, because that's
where our trust boundary is. We definitely do not trust userspace.

Also, if they all think this is 'important' why not provide native
drivers for this service?

> > At the very least I think we should start printing scary messages about
> > disabling security to run untrusted code. This is all quite insane :/
> 
> I agree in principle. However, calling it 'untrusted' is a bit
> misleading here, given that you already rely on the same body of code
> to boot your computer to begin with. 

That PRM stuff really doesn't sound like it's needed to boot. And it
sounds like it really should be part of the normal Linux driver, but
isn't for $corp reasons or something.

> I.e., if you suspect that the
> code in question is conspiring against you, not calling it at runtime
> to manipulate EFI variables is not going to help with that.

Well, the problem is the disabling of all the hardware and software
security measures to run this crap. This makes it a prime target to take
over stuff. Also, while EFI code might be good enough to boot the
machine, using it at runtime is a whole different league of security.

What if they have a 'bug' in the variable name parser and a variable
named "NSAWantsAccess" gets you a buffer overflow and random code
execution.

Trusting it to boot the machine and trusting it to be safe for general
runtime are two very different things.

> Question is though whether on x86, sandboxing is feasible: can VMs
> call into SMM? Because that is where 95% of the EFI variable services
> logic resides - the code running directly under the OS does very
> little other than marshalling the arguments and passing them on.

I just read in that PRM document that they *REALLY* want to get away
from SMM because it freezes all CPUs in the system for the duration of
the SMI. So this variable crud being in SMM would be inconsistent.

Anyway, I'm all for very aggressive runtime warnings and pushing vendors
that object to provide native drivers. I don't believe there is any real
technical reason for any of this.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization
  2025-11-07  8:01       ` H. Peter Anvin
@ 2025-11-07 20:08         ` Sohil Mehta
  0 siblings, 0 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-11-07 20:08 UTC (permalink / raw)
  To: H. Peter Anvin, Dave Hansen, x86, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov
  Cc: Jonathan Corbet, Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick Edgecombe, Vegard Nossum, Andrew Cooper,
	Randy Dunlap, Geert Uytterhoeven, Kees Cook, Tony Luck,
	Alexander Shishkin, linux-doc, linux-kernel, linux-efi

On 11/7/2025 12:01 AM, H. Peter Anvin wrote:
>> I asked Sohil to start throwing out all the non-essential bits from this
>> series. My thought was that we could get mainline so that LASS itself
>> could be enabled, even if it was in a somewhat weird config that a
>> distro would probably never do.
>>
>> After that is merged, we can circle back and decide what to do with the
>> remaining bits like CR pinning and vsyscall emulation. I don't think any
>> of those bits will change the basic desire to have LASS support in the
>> kernel.
>>
>> Does that sound like a sane approach to everyone?
> 
> XONLY vsyscall emulation should be trivial, though (just look for the magic
> RIP values, just like the page fault code does now, too.) The FULL emulation
> mode is completely irrelevant these days, so I don't think it matters at all.
> 

Yes, the actual change is quite simple. But along with minor refactoring
and updates to code comments and documentation, it ends up being a set
of 3 to 4 small patches.
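
For illustration, the "magic RIP" test could look roughly like the sketch
below. This is a standalone user-space fragment, not the kernel code: the
helper name vsyscall_nr_from_rip() is hypothetical. It relies only on the
fixed legacy vsyscall page address (0xffffffffff600000) and its three
entry points spaced 0x400 bytes apart.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed address of the legacy vsyscall page on x86-64. */
#define VSYSCALL_ADDR		0xffffffffff600000ULL
/* Entry points are 1024 bytes apart: gettimeofday, time, getcpu. */
#define VSYSCALL_ENTRY_STRIDE	0x400ULL
#define VSYSCALL_NR_ENTRIES	3

/*
 * Hypothetical helper: map a faulting instruction pointer to a vsyscall
 * number (0..2), or return -1 if the RIP is not a vsyscall entry point.
 */
static int vsyscall_nr_from_rip(uint64_t rip)
{
	uint64_t offset;

	if (rip < VSYSCALL_ADDR)
		return -1;

	offset = rip - VSYSCALL_ADDR;
	if (offset % VSYSCALL_ENTRY_STRIDE != 0)
		return -1;	/* not aligned to an entry point */
	if (offset / VSYSCALL_ENTRY_STRIDE >= VSYSCALL_NR_ENTRIES)
		return -1;	/* past the last entry */

	return (int)(offset / VSYSCALL_ENTRY_STRIDE);
}
```

In XONLY mode only instruction fetches from these addresses need to be
recognized, which is why a check on the faulting RIP is sufficient.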

> EFI handling is similarly straightforward: just disable CR4.LASS right before
> calling EFI, and enable it on return. As long as we are *already* in the
> efi_mm context, it is perfectly safe to disable CR4.LASS, because there *is no
> user memory mapped at that point*.
> 

At a minimum, we need to defer the initial LASS enabling until we are
truly done with the boot services memory, i.e. efi_enter_virtual_mode()
has completed (including efi_free_boot_services()).

Also, there is preference for deferring LASS enabling until userspace
comes up. Again, all of that should be simple enough but ends up adding
to the patch count (touching 20 now). It was getting difficult to get
the reviews/acks from experts in all the impacted areas.

Dave's suggestion of splitting the series makes it easier to merge the
base set and get focussed reviews on the missing features. My plan is to
promptly follow up with the EFI and vsyscall support to make LASS
enabled by default.

> These two things should only be a few lines of code each, and I don't see any
> reason to further elaborate them at this time (to handle FULL emulation, or to

Yes, there is absolutely no need to support the full vsyscall emulation.

> take a LASS trap inside EFI to write a shame-the-vendor message; if we wanted
> to do that, it would be better to make that independent of LASS and empty out
> efi_mm entirely.
> 

This seems to be the takeaway of PeterZ's interaction with Ard. Though,
I agree, warning about misbehaving firmware should probably be done
separately and independent of LASS.

I'll send out another revision of this series without EFI next week, and
follow up with support for EFI and vsyscall soon after.



^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-07 10:09                   ` Ard Biesheuvel
  2025-11-07 10:27                     ` Peter Zijlstra
@ 2025-11-08  0:48                     ` Andy Lutomirski
  2025-11-08 16:18                       ` H. Peter Anvin
  2025-11-08 22:50                       ` H. Peter Anvin
  1 sibling, 2 replies; 67+ messages in thread
From: Andy Lutomirski @ 2025-11-08  0:48 UTC (permalink / raw)
  To: Ard Biesheuvel, Peter Zijlstra (Intel)
  Cc: Dave Hansen, Sohil Mehta, the arch/x86 maintainers, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jonathan Corbet,
	H. Peter Anvin, Josh Poimboeuf, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick P Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi



On Fri, Nov 7, 2025, at 2:09 AM, Ard Biesheuvel wrote:
> On Fri, 7 Nov 2025 at 10:40, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:
>>
>> > > But that's just the thing EFI is *NOT* trusted! We're basically
>> > > disabling all security features (not listed above are CET and CFI) to
>> > > run this random garbage we have no control over.
>> > >
>> > > How about we just flat out refuse EFI runtime services? What are they
>> > > actually needed for? Why are we bending over backwards and subverting
>> > > our security for this stuff?
>> >
>> > On x86, it is mostly the EFI variable services that user space has
>> > come to rely on, not only for setting the boot path (which typically
>> > happens only once at installation time, when the path to GRUB is set
>> > as the first boot option). Unfortunately, the systemd folks have taken
>> > a liking to this feature too, and have started storing things in
>> > there.
>>
>> *groan*, so booting with noefi (I just went and found that option) will
>> cause a modern Linux system to fail booting?
>>
>
> As long as you install with EFI enabled, the impact of efi=noruntime
> should be limited, given that x86 does not rely on EFI runtime
> services for the RTC or for reboot/poweroff. But you will lose access
> to the EFI variable store. (Not sure what 'noefi' does in comparison,
> but keeping EFI enabled at boot time for things like secure/measured
> boot and storage encryption will probably result in a net positive
> impact on security/hardening as long as you avoid calling into the
> firmware after boot)
>
>
>> > There is also PRM, which is much worse, as it permits devices in the
>> > ACPI namespace to call firmware routines that are mapped privileged in
>> > the OS address space in the same way. I objected to this at the time,
>> > and asked for a facility where we could at least mark such code as
>> > unprivileged (and run it as such) but this was ignored, as Intel and
>> > MS had already sealed the deal and put this into production. This is
>> > much worse than typical EFI routines, as the PRM code is ODM/OEM code
>> > rather than something that comes from the upstream EFI implementation.
>> > It is basically a dumping ground for code that used to run in SMM
>> > because it was too ugly to run anywhere else. </rant>
>>
>> What the actual fuck!! And we support this garbage? Without
>> pr_err(FW_BUG ) notification?
>>
>> How can one find such devices? I need to check my machine.
>>
>
> Unless you have a PRMT table in the list of ACPI tables, your system
> shouldn't be affected by this.
>
>> > It would be nice if we could
>> >
>> > a) Get rid of SetVirtualAddressMap(), which is another insane hack
>> > that should never have been supported on 64-bit systems. On arm64, we
>> > no longer call it unless there is a specific need for it (some Ampere
>> > Altra systems with buggy firmware will crash otherwise). On x86,
>> > though, it might be tricky because there is so much buggy firmware.
>> > Perhaps we should phase it out by checking for the UEFI version, so
>> > that future systems will avoid it. This would mean, however, that EFI
>> > code remains in the low user address space, which may not be what you
>> > want (unless we do c) perhaps?)
>> >
>> > b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
>> > for arm64 a couple of years ago, but it was very intrusive and the ARM
>> > intern in question went on to do more satisfying work.
>> >
>> > c) Unmap the kernel KPTI style while the runtime calls are in
>> > progress? This should be rather straight-forward, although it might
>> > not help a lot as the code in question still runs privileged.
>>
>> At the very least I think we should start printing scary messages about
>> disabling security to run untrusted code. This is all quite insane :/
>
> I agree in principle. However, calling it 'untrusted' is a bit
> misleading here, given that you already rely on the same body of code
> to boot your computer to begin with. I.e., if you suspect that the
> code in question is conspiring against you, not calling it at runtime
> to manipulate EFI variables is not going to help with that.
>
> But from a robustness point of view, I agree - running vendor code at
> the OS's privilege level at runtime that was only tested with Windows
> is not great for stability, and it would be nice if we could leverage
> the principle of least privilege and only permit it to access the
> things that it actually needs to perform the task that we've asked it
> to. This is why I asked for the ability to mark PRM services as
> unprivileged, given that they typically only run some code and perhaps
> poke some memory (either RAM or MMIO registers) that the OS never
> accesses directly.
>
> Question is though whether on x86, sandboxing is feasible: can VMs
> call into SMM? Because that is where 95% of the EFI variable services
> logic resides - the code running directly under the OS does very
> little other than marshalling the arguments and passing them on.

Last time I looked at the calls into SMM (which was quite a while ago), they were fairly recognizable sequences that would nicely cause VM exits.  So the VM would exit and we would invoke SMM on its behalf.

But it’s very very very common for VMX/SVM to be unavailable.

Has anyone tried running EFI at CPL3?

P.S. Forget about relying on AC to make EFI work. I doubt we can trust EFI to leave AC set.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-08  0:48                     ` Andy Lutomirski
@ 2025-11-08 16:18                       ` H. Peter Anvin
  2025-11-08 22:50                       ` H. Peter Anvin
  1 sibling, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-08 16:18 UTC (permalink / raw)
  To: Andy Lutomirski, Ard Biesheuvel, Peter Zijlstra (Intel)
  Cc: Dave Hansen, Sohil Mehta, the arch/x86 maintainers, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jonathan Corbet,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On November 7, 2025 4:48:05 PM PST, Andy Lutomirski <luto@kernel.org> wrote:
>
>
>On Fri, Nov 7, 2025, at 2:09 AM, Ard Biesheuvel wrote:
>> On Fri, 7 Nov 2025 at 10:40, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:
>>>
>>> > > But that's just the thing EFI is *NOT* trusted! We're basically
>>> > > disabling all security features (not listed above are CET and CFI) to
>>> > > run this random garbage we have no control over.
>>> > >
>>> > > How about we just flat out refuse EFI runtime services? What are they
>>> > > actually needed for? Why are we bending over backwards and subverting
>>> > > our security for this stuff?
>>> >
>>> > On x86, it is mostly the EFI variable services that user space has
>>> > come to rely on, not only for setting the boot path (which typically
>>> > happens only once at installation time, when the path to GRUB is set
>>> > as the first boot option). Unfortunately, the systemd folks have taken
>>> > a liking to this feature too, and have started storing things in
>>> > there.
>>>
>>> *groan*, so booting with noefi (I just went and found that option) will
>>> cause a modern Linux system to fail booting?
>>>
>>
>> As long as you install with EFI enabled, the impact of efi=noruntime
>> should be limited, given that x86 does not rely on EFI runtime
>> services for the RTC or for reboot/poweroff. But you will lose access
>> to the EFI variable store. (Not sure what 'noefi' does in comparison,
>> but keeping EFI enabled at boot time for things like secure/measured
>> boot and storage encryption will probably result in a net positive
>> impact on security/hardening as long as you avoid calling into the
>> firmware after boot)
>>
>>
>>> > There is also PRM, which is much worse, as it permits devices in the
>>> > ACPI namespace to call firmware routines that are mapped privileged in
>>> > the OS address space in the same way. I objected to this at the time,
>>> > and asked for a facility where we could at least mark such code as
>>> > unprivileged (and run it as such) but this was ignored, as Intel and
>>> > MS had already sealed the deal and put this into production. This is
>>> > much worse than typical EFI routines, as the PRM code is ODM/OEM code
>>> > rather than something that comes from the upstream EFI implementation.
>>> > It is basically a dumping ground for code that used to run in SMM
>>> > because it was too ugly to run anywhere else. </rant>
>>>
>>> What the actual fuck!! And we support this garbage? Without
>>> pr_err(FW_BUG ) notification?
>>>
>>> How can one find such devices? I need to check my machine.
>>>
>>
>> Unless you have a PRMT table in the list of ACPI tables, your system
>> shouldn't be affected by this.
>>
>>> > It would be nice if we could
>>> >
>>> > a) Get rid of SetVirtualAddressMap(), which is another insane hack
>>> > that should never have been supported on 64-bit systems. On arm64, we
>>> > no longer call it unless there is a specific need for it (some Ampere
>>> > Altra systems with buggy firmware will crash otherwise). On x86,
>>> > though, it might be tricky because there is so much buggy firmware.
>>> > Perhaps we should phase it out by checking for the UEFI version, so
>>> > that future systems will avoid it. This would mean, however, that EFI
>>> > code remains in the low user address space, which may not be what you
>>> > want (unless we do c) perhaps?)
>>> >
>>> > b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
>>> > for arm64 a couple of years ago, but it was very intrusive and the ARM
>>> > intern in question went on to do more satisfying work.
>>> >
>>> > c) Unmap the kernel KPTI style while the runtime calls are in
>>> > progress? This should be rather straight-forward, although it might
>>> > not help a lot as the code in question still runs privileged.
>>>
>>> At the very least I think we should start printing scary messages about
>>> disabling security to run untrusted code. This is all quite insane :/
>>
>> I agree in principle. However, calling it 'untrusted' is a bit
>> misleading here, given that you already rely on the same body of code
>> to boot your computer to begin with. I.e., if you suspect that the
>> code in question is conspiring against you, not calling it at runtime
>> to manipulate EFI variables is not going to help with that.
>>
>> But from a robustness point of view, I agree - running vendor code at
>> the OS's privilege level at runtime that was only tested with Windows
>> is not great for stability, and it would be nice if we could leverage
>> the principle of least privilege and only permit it to access the
>> things that it actually needs to perform the task that we've asked it
>> to. This is why I asked for the ability to mark PRM services as
>> unprivileged, given that they typically only run some code and perhaps
>> poke some memory (either RAM or MMIO registers) that the OS never
>> accesses directly.
>>
>> Question is though whether on x86, sandboxing is feasible: can VMs
>> call into SMM? Because that is where 95% of the EFI variable services
>> logic resides - the code running directly under the OS does very
>> little other than marshalling the arguments and passing them on.
>
>Last time I looked at the calls into SMM (which was quite a while ago), they were fairly recognizable sequences that would nicely cause VM exits.  So the VM would exit and we would invoke SMM on its behalf.
>
>But it’s very very very common for VMX/SVM to be unavailable.
>
>Has anyone tried running EFI at CPL3?
>
>P.S. Forget about relying on AC to make EFI work. I doubt we can trust EFI to leave AC set.
>

Yeah, AC is way too volatile. 

This thread veered off topic, though. The point wasn't whether EFI runtime
calls are crap. It was that LASS, SMEP, and SMAP add no value during an EFI
runtime call *because we explicitly unmap user space anyway* (efi_mm), so
there are no user space mappings to worry about. Disabling them for the
duration of the call therefore makes no difference at all, *as long as* the
CR4 manipulation is done strictly inside the efi_mm switch.
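
The ordering constraint can be sketched with a toy state model. This is
plain user-space C, not kernel code, and every name in it (struct
cpu_model, model_efi_call() and friends) is hypothetical. It only
encodes the invariant that CR4.LASS may be clear solely while efi_mm is
the active address space.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two pieces of state the argument depends on. */
struct cpu_model {
	bool in_efi_mm;	/* true while the EFI page tables are active */
	bool cr4_lass;	/* models the CR4.LASS bit */
};

/* Invariant: LASS may only be off while no user mappings are present. */
static bool model_state_is_safe(const struct cpu_model *cpu)
{
	return cpu->cr4_lass || cpu->in_efi_mm;
}

static void model_switch_mm(struct cpu_model *cpu, bool to_efi_mm)
{
	cpu->in_efi_mm = to_efi_mm;
	assert(model_state_is_safe(cpu));
}

static void model_set_lass(struct cpu_model *cpu, bool on)
{
	cpu->cr4_lass = on;
	assert(model_state_is_safe(cpu));
}

/* Stand-in for a firmware call; it runs with LASS off inside efi_mm. */
static void model_dummy_service(struct cpu_model *cpu)
{
	assert(cpu->in_efi_mm);
	assert(!cpu->cr4_lass);
}

/* CR4 is touched strictly inside the efi_mm switch, in both directions. */
static void model_efi_call(struct cpu_model *cpu,
			   void (*runtime_service)(struct cpu_model *))
{
	model_switch_mm(cpu, true);	/* user mappings are gone */
	model_set_lass(cpu, false);	/* only now drop LASS */

	runtime_service(cpu);

	model_set_lass(cpu, true);	/* restore LASS first ... */
	model_switch_mm(cpu, false);	/* ... then leave efi_mm */
}
```

Swapping either pair of steps in model_efi_call() would trip the
assertion in the setters, which is the point of the "strictly inside"
wording.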


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services
  2025-11-08  0:48                     ` Andy Lutomirski
  2025-11-08 16:18                       ` H. Peter Anvin
@ 2025-11-08 22:50                       ` H. Peter Anvin
  1 sibling, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-08 22:50 UTC (permalink / raw)
  To: Andy Lutomirski, Ard Biesheuvel, Peter Zijlstra (Intel)
  Cc: Dave Hansen, Sohil Mehta, the arch/x86 maintainers, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jonathan Corbet,
	Josh Poimboeuf, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick P Edgecombe, Vegard Nossum,
	Andrew Cooper, Randy Dunlap, Geert Uytterhoeven, Kees Cook,
	Tony Luck, Alexander Shishkin, linux-doc,
	Linux Kernel Mailing List, linux-efi

On November 7, 2025 4:48:05 PM PST, Andy Lutomirski <luto@kernel.org> wrote:
>
>
>On Fri, Nov 7, 2025, at 2:09 AM, Ard Biesheuvel wrote:
>> On Fri, 7 Nov 2025 at 10:40, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Fri, Nov 07, 2025 at 10:22:30AM +0100, Ard Biesheuvel wrote:
>>>
>>> > > But that's just the thing EFI is *NOT* trusted! We're basically
>>> > > disabling all security features (not listed above are CET and CFI) to
>>> > > run this random garbage we have no control over.
>>> > >
>>> > > How about we just flat out refuse EFI runtime services? What are they
>>> > > actually needed for? Why are we bending over backwards and subverting
>>> > > our security for this stuff?
>>> >
>>> > On x86, it is mostly the EFI variable services that user space has
>>> > come to rely on, not only for setting the boot path (which typically
>>> > happens only once at installation time, when the path to GRUB is set
>>> > as the first boot option). Unfortunately, the systemd folks have taken
>>> > a liking to this feature too, and have started storing things in
>>> > there.
>>>
>>> *groan*, so booting with noefi (I just went and found that option) will
>>> cause a modern Linux system to fail booting?
>>>
>>
>> As long as you install with EFI enabled, the impact of efi=noruntime
>> should be limited, given that x86 does not rely on EFI runtime
>> services for the RTC or for reboot/poweroff. But you will lose access
>> to the EFI variable store. (Not sure what 'noefi' does in comparison,
>> but keeping EFI enabled at boot time for things like secure/measured
>> boot and storage encryption will probably result in a net positive
>> impact on security/hardening as long as you avoid calling into the
>> firmware after boot)
>>
>>
>>> > There is also PRM, which is much worse, as it permits devices in the
>>> > ACPI namespace to call firmware routines that are mapped privileged in
>>> > the OS address space in the same way. I objected to this at the time,
>>> > and asked for a facility where we could at least mark such code as
>>> > unprivileged (and run it as such) but this was ignored, as Intel and
>>> > MS had already sealed the deal and put this into production. This is
>>> > much worse than typical EFI routines, as the PRM code is ODM/OEM code
>>> > rather than something that comes from the upstream EFI implementation.
>>> > It is basically a dumping ground for code that used to run in SMM
>>> > because it was too ugly to run anywhere else. </rant>
>>>
>>> What the actual fuck!! And we support this garbage? Without
>>> pr_err(FW_BUG ) notification?
>>>
>>> How can one find such devices? I need to check my machine.
>>>
>>
>> Unless you have a PRMT table in the list of ACPI tables, your system
>> shouldn't be affected by this.
>>
>>> > It would be nice if we could
>>> >
>>> > a) Get rid of SetVirtualAddressMap(), which is another insane hack
>>> > that should never have been supported on 64-bit systems. On arm64, we
>>> > no longer call it unless there is a specific need for it (some Ampere
>>> > Altra systems with buggy firmware will crash otherwise). On x86,
>>> > though, it might be tricky because there is so much buggy firmware.
>>> > Perhaps we should phase it out by checking for the UEFI version, so
>>> > that future systems will avoid it. This would mean, however, that EFI
>>> > code remains in the low user address space, which may not be what you
>>> > want (unless we do c) perhaps?)
>>> >
>>> > b) Run EFI runtime calls in a sandbox VM - there was a PoC implemented
>>> > for arm64 a couple of years ago, but it was very intrusive and the ARM
>>> > intern in question went on to do more satisfying work.
>>> >
>>> > c) Unmap the kernel KPTI style while the runtime calls are in
>>> > progress? This should be rather straight-forward, although it might
>>> > not help a lot as the code in question still runs privileged.
>>>
>>> At the very least I think we should start printing scary messages about
>>> disabling security to run untrusted code. This is all quite insane :/
>>
>> I agree in principle. However, calling it 'untrusted' is a bit
>> misleading here, given that you already rely on the same body of code
>> to boot your computer to begin with. I.e., if you suspect that the
>> code in question is conspiring against you, not calling it at runtime
>> to manipulate EFI variables is not going to help with that.
>>
>> But from a robustness point of view, I agree - running vendor code at
>> the OS's privilege level at runtime that was only tested with Windows
>> is not great for stability, and it would be nice if we could leverage
>> the principle of least privilege and only permit it to access the
>> things that it actually needs to perform the task that we've asked it
>> to. This is why I asked for the ability to mark PRM services as
>> unprivileged, given that they typically only run some code and perhaps
>> poke some memory (either RAM or MMIO registers) that the OS never
>> accesses directly.
>>
>> Question is though whether on x86, sandboxing is feasible: can VMs
>> call into SMM? Because that is where 95% of the EFI variable services
>> logic resides - the code running directly under the OS does very
>> little other than marshalling the arguments and passing them on.
>
>Last time I looked at the calls into SMM (which was quite a while ago), they were fairly recognizable sequences that would nicely cause VM exits.  So the VM would exit and we would invoke SMM on its behalf.
>
>But it’s very very very common for VMX/SVM to be unavailable.
>
>Has anyone tried running EFI at CPL3?
>
>P.S. Forget about relying on AC to make EFI work. I doubt we can trust EFI to leave AC set.
>

They certainly cause vmexits, as they are mostly I/O port accesses. Maybe there are MSRs on some platforms. But what do you do with them?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-10-29 21:03 ` [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code Sohil Mehta
  2025-10-31 17:10   ` Dave Hansen
@ 2025-11-10 18:15   ` Sohil Mehta
  2025-11-10 19:09     ` H. Peter Anvin
                       ` (2 more replies)
  1 sibling, 3 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-11-10 18:15 UTC (permalink / raw)
  To: x86, Borislav Petkov
  Cc: Jonathan Corbet, H . Peter Anvin, Andy Lutomirski, Josh Poimboeuf,
	Peter Zijlstra, Ard Biesheuvel, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi, Dave Hansen, Thomas Gleixner, Ingo Molnar

Hi Boris,

On 10/29/2025 2:03 PM, Sohil Mehta wrote:
> +/*
> + * LASS enforcement is based on bit 63 of the virtual address. The
> + * kernel is not allowed to touch memory in the lower half of the
> + * virtual address space.
> + *
> + * Use lass_disable()/lass_enable() to toggle the AC bit for kernel data
> + * accesses (!_PAGE_USER) that are blocked by LASS, but not by SMAP.
> + *
> + * Even with the AC bit set, LASS will continue to block instruction
> + * fetches from the user half of the address space. To allow those,
> + * clear CR4.LASS to disable the LASS mechanism entirely.
> + *

Based on the EFI discussion, it looks like we would now need to toggle
CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
naming would be more suitable for those wrappers.

I am thinking of reverting this back to lass_clac()/lass_stac().

lass_clac()/_stac():
	Disable enforcement for kernel data accesses similar to SMAP.

lass_enable()/_disable():
	Disable the entire LASS mechanism (Data and instruction fetch)
	by toggling CR4.LASS

Would that work? Any other suggestions?


> +
> +static __always_inline void lass_enable(void)
> +{
> +	alternative("", "clac", X86_FEATURE_LASS);
> +}
> +
> +static __always_inline void lass_disable(void)
> +{
> +	alternative("", "stac", X86_FEATURE_LASS);
> +}
> +
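
For reference, a minimal user-space sketch of the two levels being discussed. It only models what the code comment quoted above states: enforcement keys off bit 63, setting AC lifts it for data accesses, and instruction fetches stay blocked until CR4.LASS is cleared. The struct, the model_* helpers and lass_blocks() are hypothetical names for illustration, not kernel code, and the model deliberately ignores the SMAP dependency and any other corner cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the CPU state that matters here; not kernel code. */
struct cpu_model {
	bool cr4_lass;	/* CR4.LASS: whole mechanism on/off */
	bool eflags_ac;	/* EFLAGS.AC: set by stac, cleared by clac */
};

enum access_type { ACCESS_DATA, ACCESS_FETCH };

/* Bit 63 clear means the address is in the lower (user) half. */
static bool in_lower_half(uint64_t vaddr)
{
	return (vaddr >> 63) == 0;
}

/* Data-only wrappers (the proposed lass_stac()/lass_clac()): toggle AC. */
static void model_lass_stac(struct cpu_model *cpu) { cpu->eflags_ac = true; }
static void model_lass_clac(struct cpu_model *cpu) { cpu->eflags_ac = false; }

/* Full wrappers (the proposed lass_disable()/lass_enable()): toggle CR4.LASS. */
static void model_lass_disable(struct cpu_model *cpu) { cpu->cr4_lass = false; }
static void model_lass_enable(struct cpu_model *cpu) { cpu->cr4_lass = true; }

/* Would a kernel-mode access be blocked under this simplified model? */
static bool lass_blocks(const struct cpu_model *cpu, uint64_t vaddr,
			enum access_type type)
{
	assert(cpu);
	if (!cpu->cr4_lass || !in_lower_half(vaddr))
		return false;
	if (type == ACCESS_FETCH)
		return true;		/* AC does not help instruction fetches */
	return !cpu->eflags_ac;		/* data access: AC set lifts enforcement */
}
```

In this model, model_lass_stac() is enough for a data access to a lower-half address, while a fetch from the lower half is only allowed after model_lass_disable().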

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-10 18:15   ` Sohil Mehta
@ 2025-11-10 19:09     ` H. Peter Anvin
  2025-11-10 19:24     ` Borislav Petkov
  2025-11-12 13:56     ` Ard Biesheuvel
  2 siblings, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-10 19:09 UTC (permalink / raw)
  To: Sohil Mehta, x86, Borislav Petkov
  Cc: Jonathan Corbet, Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Ard Biesheuvel, Kirill A . Shutemov, Xin Li, David Woodhouse,
	Sean Christopherson, Rick Edgecombe, Vegard Nossum, Andrew Cooper,
	Randy Dunlap, Geert Uytterhoeven, Kees Cook, Tony Luck,
	Alexander Shishkin, linux-doc, linux-kernel, linux-efi,
	Dave Hansen, Thomas Gleixner, Ingo Molnar

On November 10, 2025 10:15:23 AM PST, Sohil Mehta <sohil.mehta@intel.com> wrote:
>Hi Boris,
>
>On 10/29/2025 2:03 PM, Sohil Mehta wrote:
>> +/*
>> + * LASS enforcement is based on bit 63 of the virtual address. The
>> + * kernel is not allowed to touch memory in the lower half of the
>> + * virtual address space.
>> + *
>> + * Use lass_disable()/lass_enable() to toggle the AC bit for kernel data
>> + * accesses (!_PAGE_USER) that are blocked by LASS, but not by SMAP.
>> + *
>> + * Even with the AC bit set, LASS will continue to block instruction
>> + * fetches from the user half of the address space. To allow those,
>> + * clear CR4.LASS to disable the LASS mechanism entirely.
>> + *
>
>Based on the EFI discussion, it looks like we would now need to toggle
>CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
>naming would be more suitable for those wrappers.
>
>I am thinking of reverting this back to lass_clac()/lass_stac().
>
>lass_clac()/_stac():
>	Disable enforcement for kernel data accesses similar to SMAP.
>
>lass_enable()/_disable():
>	Disable the entire LASS mechanism (Data and instruction fetch)
>	by toggling CR4.LASS
>
>Would that work? Any other suggestions?
>
>
>> +
>> +static __always_inline void lass_enable(void)
>> +{
>> +	alternative("", "clac", X86_FEATURE_LASS);
>> +}
>> +
>> +static __always_inline void lass_disable(void)
>> +{
>> +	alternative("", "stac", X86_FEATURE_LASS);
>> +}
>> +

That would be my suggestion for naming, too.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-10 18:15   ` Sohil Mehta
  2025-11-10 19:09     ` H. Peter Anvin
@ 2025-11-10 19:24     ` Borislav Petkov
  2025-11-12 13:56     ` Ard Biesheuvel
  2 siblings, 0 replies; 67+ messages in thread
From: Borislav Petkov @ 2025-11-10 19:24 UTC (permalink / raw)
  To: Sohil Mehta
  Cc: x86, Jonathan Corbet, H . Peter Anvin, Andy Lutomirski,
	Josh Poimboeuf, Peter Zijlstra, Ard Biesheuvel,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On Mon, Nov 10, 2025 at 10:15:23AM -0800, Sohil Mehta wrote:
> lass_clac()/_stac():
> 	Disable enforcement for kernel data accesses similar to SMAP.
> 
> lass_enable()/_disable():
> 	Disable the entire LASS mechanism (Data and instruction fetch)
> 	by toggling CR4.LASS
> 
> Would that work? Any other suggestions?

Sure, as long as they're documented. And if we decide to change them later for
whatever reason, we can. More than enough bikeshedding we did here.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-10 18:15   ` Sohil Mehta
  2025-11-10 19:09     ` H. Peter Anvin
  2025-11-10 19:24     ` Borislav Petkov
@ 2025-11-12 13:56     ` Ard Biesheuvel
  2025-11-12 14:51       ` Dave Hansen
  2 siblings, 1 reply; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-12 13:56 UTC (permalink / raw)
  To: Sohil Mehta
  Cc: x86, Borislav Petkov, Jonathan Corbet, H . Peter Anvin,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On Mon, 10 Nov 2025 at 19:15, Sohil Mehta <sohil.mehta@intel.com> wrote:
>
> Hi Boris,
>
> On 10/29/2025 2:03 PM, Sohil Mehta wrote:
> > +/*
> > + * LASS enforcement is based on bit 63 of the virtual address. The
> > + * kernel is not allowed to touch memory in the lower half of the
> > + * virtual address space.
> > + *
> > + * Use lass_disable()/lass_enable() to toggle the AC bit for kernel data
> > + * accesses (!_PAGE_USER) that are blocked by LASS, but not by SMAP.
> > + *
> > + * Even with the AC bit set, LASS will continue to block instruction
> > + * fetches from the user half of the address space. To allow those,
> > + * clear CR4.LASS to disable the LASS mechanism entirely.
> > + *
>
> Based on the EFI discussion,

Which discussion is that?

> it looks like we would now need to toggle
> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
> naming would be more suitable for those wrappers.
>

Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
runtime regions into the upper [kernel] half of the address space.

SetVirtualAddressMap() itself is a terrible idea, but given that we
are already stuck with it, we should be able to rely on ordinary EFI
runtime calls to only execute from the upper address range. The only
exception is the call to SetVirtualAddressMap() itself, which occurs
only once during early boot.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 13:56     ` Ard Biesheuvel
@ 2025-11-12 14:51       ` Dave Hansen
  2025-11-12 14:57         ` H. Peter Anvin
  0 siblings, 1 reply; 67+ messages in thread
From: Dave Hansen @ 2025-11-12 14:51 UTC (permalink / raw)
  To: Ard Biesheuvel, Sohil Mehta
  Cc: x86, Borislav Petkov, Jonathan Corbet, H . Peter Anvin,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On 11/12/25 05:56, Ard Biesheuvel wrote:
...
>> it looks like we would now need to toggle
>> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
>> naming would be more suitable for those wrappers.
>>
> Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
> runtime regions into the upper [kernel] half of the address space.
> 
> SetVirtualAddressMap() itself is a terrible idea, but given that we
> are already stuck with it, we should be able to rely on ordinary EFI
> runtime calls to only execute from the upper address range. The only
> exception is the call to SetVirtualAddressMap() itself, which occurs
> only once during early boot.

Gah, I had it in my head that we needed to use the lower mapping at
runtime. The efi_mm gets used for that SetVirtualAddressMap() and the
efi_mm continues to get used at runtime. So I think I just assumed that
the lower mappings needed to get used too.

Thanks for the education!

Let's say we simply delayed CR4.LASS=1 until later in boot. Could we
completely ignore LASS during EFI calls, since the calls only use the
upper address range?

Also, in practice, are there buggy EFI implementations that use the
lower address range even though they're not supposed to? *If* we just
keep LASS on for these calls is there a chance it will cause a
regression in some buggy EFI implementations?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 14:51       ` Dave Hansen
@ 2025-11-12 14:57         ` H. Peter Anvin
  2025-11-12 15:18           ` Ard Biesheuvel
  0 siblings, 1 reply; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-12 14:57 UTC (permalink / raw)
  To: Dave Hansen, Ard Biesheuvel, Sohil Mehta
  Cc: x86, Borislav Petkov, Jonathan Corbet, Andy Lutomirski,
	Josh Poimboeuf, Peter Zijlstra, Kirill A . Shutemov, Xin Li,
	David Woodhouse, Sean Christopherson, Rick Edgecombe,
	Vegard Nossum, Andrew Cooper, Randy Dunlap, Geert Uytterhoeven,
	Kees Cook, Tony Luck, Alexander Shishkin, linux-doc, linux-kernel,
	linux-efi, Dave Hansen, Thomas Gleixner, Ingo Molnar

On November 12, 2025 6:51:45 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>On 11/12/25 05:56, Ard Biesheuvel wrote:
>...
>>> it looks like we would now need to toggle
>>> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
>>> naming would be more suitable for those wrappers.
>>>
>> Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
>> runtime regions into the upper [kernel] half of the address space.
>> 
>> SetVirtualAddressMap() itself is a terrible idea, but given that we
>> are already stuck with it, we should be able to rely on ordinary EFI
>> runtime calls to only execute from the upper address range. The only
>> exception is the call to SetVirtualAddressMap() itself, which occurs
>> only once during early boot.
>
>Gah, I had it in my head that we needed to use the lower mapping at
>runtime. The efi_mm gets used for that SetVirtualAddressMap() and the
>efi_mm continues to get used at runtime. So I think I just assumed that
>the lower mappings needed to get used too.
>
>Thanks for the education!
>
>Let's say we simply delayed CR4.LASS=1 until later in boot. Could we
>completely ignore LASS during EFI calls, since the calls only use the
>upper address range?
>
>Also, in practice, are there buggy EFI implementations that use the
>lower address range even though they're not supposed to? *If* we just
>keep LASS on for these calls is there a chance it will cause a
>regression in some buggy EFI implementations?

Yes, there are. And there are buggy ones which die if set up with virtual addresses in the low half.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 14:57         ` H. Peter Anvin
@ 2025-11-12 15:18           ` Ard Biesheuvel
  2025-11-12 15:23             ` H. Peter Anvin
  0 siblings, 1 reply; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-12 15:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, Sohil Mehta, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On Wed, 12 Nov 2025 at 15:58, H. Peter Anvin <hpa@zytor.com> wrote:
>
> On November 12, 2025 6:51:45 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
> >On 11/12/25 05:56, Ard Biesheuvel wrote:
> >...
> >>> it looks like we would now need to toggle
> >>> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
> >>> naming would be more suitable for those wrappers.
> >>>
> >> Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
> >> runtime regions into the upper [kernel] half of the address space.
> >>
> >> SetVirtualAddressMap() itself is a terrible idea, but given that we
> >> are already stuck with it, we should be able to rely on ordinary EFI
> >> runtime calls to only execute from the upper address range. The only
> >> exception is the call to SetVirtualAddressMap() itself, which occurs
> >> only once during early boot.
> >
> >Gah, I had it in my head that we needed to use the lower mapping at
> >runtime. The efi_mm gets used for that SetVirtualAddressMap() and the
> >efi_mm continues to get used at runtime. So I think I just assumed that
> >the lower mappings needed to get used too.
> >
> >Thanks for the education!
> >
> >Let's say we simply delayed CR4.LASS=1 until later in boot. Could we
> >completely ignore LASS during EFI calls, since the calls only use the
> >upper address range?
> >
> >Also, in practice, are there buggy EFI implementations that use the
> >lower address range even though they're not supposed to? *If* we just
> >keep LASS on for these calls is there a chance it will cause a
> >regression in some buggy EFI implementations?
>
> Yes, there are. And there are buggy ones which die if set up with virtual addresses in the low half.

To elaborate on that, there are systems where

a) not calling SetVirtualAddressMap() crashes the firmware, because,
in spite of being clearly documented as optional, not calling it
results in some event hook not being called, causing the firmware to
misbehave

b) calling SetVirtualAddressMap() with a 1:1 mapping crashes the
firmware (and so this is not a possible workaround for a))

c) calling SetVirtualAddressMap() crashes the firmware when not both
the old 1:1 and the new kernel mapping are already live (which
violates the UEFI spec)

d) calling SetVirtualAddressMap() does not result in all 1:1
references being converted to the new mapping.


To address d), the x86_64 implementation of efi_map_region() indeed
maps a 1:1 alias of each remapped runtime region, so that stray
accesses don't fault. But the code addresses are all remapped, and so
the firmware routines are always invoked via their remapped aliases in
the kernel VA space. Not calling SetVirtualAddressMap() at all, or
calling it with a 1:1 mapping is not feasible, essentially because
Windows doesn't do that, and that is the only thing that is tested on
all x86 PCs by the respective OEMs.

Given that remapping the code is dealt with by the firmware's PE/COFF
loader, whereas remapping [dynamically allocated] data requires effort
on the part of the programmer, I'd hazard a guess that 99.9% of those
bugs do not involve attempts to execute via the lower mapping, but
stray references to data objects that were not remapped properly.

So we might consider
a) remapping those 1:1 aliases NX, so we don't have those patches of
RWX memory around
b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.
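
A small sketch of case d) and the 1:1 alias that covers for it, in plain user-space C. Everything here is an assumption made for illustration: struct rt_region, convert_pointer(), resolves() and the addresses used with them are hypothetical and are not taken from efi_map_region() or from any firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor for one EFI runtime region (illustration only). */
struct rt_region {
	uint64_t phys_start;	/* physical base, also the 1:1 virtual alias */
	uint64_t virt_start;	/* remapped base in the kernel half */
	uint64_t size;
};

static bool in_region(uint64_t addr, uint64_t start, uint64_t size)
{
	return addr >= start && addr - start < size;
}

/* What firmware is expected to do for each pointer it keeps: 1:1 -> remapped. */
static uint64_t convert_pointer(const struct rt_region *r, uint64_t ptr)
{
	assert(in_region(ptr, r->phys_start, r->size));
	return r->virt_start + (ptr - r->phys_start);
}

/* Bit 63 set means the address is in the upper (kernel) half. */
static bool is_upper_half(uint64_t vaddr)
{
	return (vaddr >> 63) != 0;
}

/* Does the address resolve at runtime when both mappings are live? */
static bool resolves(const struct rt_region *r, uint64_t vaddr)
{
	return in_region(vaddr, r->virt_start, r->size) ||	/* remapped */
	       in_region(vaddr, r->phys_start, r->size);	/* 1:1 alias */
}
```

A pointer that the firmware forgot to pass through convert_pointer() keeps its lower-half value; resolves() still returns true for it only because the 1:1 alias is mapped.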

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 15:18           ` Ard Biesheuvel
@ 2025-11-12 15:23             ` H. Peter Anvin
  2025-11-12 15:28               ` Ard Biesheuvel
  0 siblings, 1 reply; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-12 15:23 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Dave Hansen, Sohil Mehta, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On November 12, 2025 7:18:33 AM PST, Ard Biesheuvel <ardb@kernel.org> wrote:
>On Wed, 12 Nov 2025 at 15:58, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> On November 12, 2025 6:51:45 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>> >On 11/12/25 05:56, Ard Biesheuvel wrote:
>> >...
>> >>> it looks like we would now need to toggle
>> >>> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
>> >>> naming would be more suitable for those wrappers.
>> >>>
>> >> Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
>> >> runtime regions into the upper [kernel] half of the address space.
>> >>
>> >> SetVirtualAddressMap() itself is a terrible idea, but given that we
>> >> are already stuck with it, we should be able to rely on ordinary EFI
>> >> runtime calls to only execute from the upper address range. The only
>> >> exception is the call to SetVirtualAddressMap() itself, which occurs
>> >> only once during early boot.
>> >
>> >Gah, I had it in my head that we needed to use the lower mapping at
>> >runtime. The efi_mm gets used for that SetVirtualAddressMap() and the
>> >efi_mm continues to get used at runtime. So I think I just assumed that
>> >the lower mappings needed to get used too.
>> >
>> >Thanks for the education!
>> >
>> >Let's say we simply delayed CR4.LASS=1 until later in boot. Could we
>> >completely ignore LASS during EFI calls, since the calls only use the
>> >upper address range?
>> >
>> >Also, in practice, are there buggy EFI implementations that use the
>> >lower address range even though they're not supposed to? *If* we just
>> >keep LASS on for these calls is there a chance it will cause a
>> >regression in some buggy EFI implementations?
>>
>> Yes, there are. And there are buggy ones which die if set up with virtual addresses in the low half.
>
>To elaborate on that, there are systems where
>
>a) not calling SetVirtualAddressMap() crashes the firmware, because,
>in spite of being clearly documented as optional, not calling it
>results in some event hook not being called, causing the firmware to
>misbehave
>
>b) calling SetVirtualAddressMap() with a 1:1 mapping crashes the
>firmware (and so this is not a possible workaround for a))
>
>c) calling SetVirtualAddressMap() crashes the firmware when not both
>the old 1:1 and the new kernel mapping are already live (which
>violates the UEFI spec)
>
>d) calling SetVirtualAddressMap() does not result in all 1:1
>references being converted to the new mapping.
>
>
>To address d), the x86_64 implementation of efi_map_region() indeed
>maps a 1:1 alias of each remapped runtime region, so that stray
>accesses don't fault. But the code addresses are all remapped, and so
>the firmware routines are always invoked via their remapped aliases in
>the kernel VA space. Not calling SetVirtualAddressMap() at all, or
>calling it with a 1:1 mapping is not feasible, essentially because
>Windows doesn't do that, and that is the only thing that is tested on
>all x86 PCs by the respective OEMs.
>
>Given that remapping the code is dealt with by the firmware's PE/COFF
>loader, whereas remapping [dynamically allocated] data requires effort
>on the part of the programmer, I'd hazard a guess that 99.9% of those
>bugs do not involve attempts to execute via the lower mapping, but
>stray references to data objects that were not remapped properly.
>
>So we might consider
>a) remapping those 1:1 aliases NX, so we don't have those patches of
>RWX memory around
>b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.

Unless someone has a code pointer in their code.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 15:23             ` H. Peter Anvin
@ 2025-11-12 15:28               ` Ard Biesheuvel
  2025-11-12 15:47                 ` H. Peter Anvin
  2025-11-12 16:18                 ` Sohil Mehta
  0 siblings, 2 replies; 67+ messages in thread
From: Ard Biesheuvel @ 2025-11-12 15:28 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, Sohil Mehta, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On Wed, 12 Nov 2025 at 16:23, H. Peter Anvin <hpa@zytor.com> wrote:
>
> On November 12, 2025 7:18:33 AM PST, Ard Biesheuvel <ardb@kernel.org> wrote:
> >On Wed, 12 Nov 2025 at 15:58, H. Peter Anvin <hpa@zytor.com> wrote:
> >>
> >> On November 12, 2025 6:51:45 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
> >> >On 11/12/25 05:56, Ard Biesheuvel wrote:
> >> >...
> >> >>> it looks like we would now need to toggle
> >> >>> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
> >> >>> naming would be more suitable for those wrappers.
> >> >>>
> >> >> Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
> >> >> runtime regions into the upper [kernel] half of the address space.
> >> >>
> >> >> SetVirtualAddressMap() itself is a terrible idea, but given that we
> >> >> are already stuck with it, we should be able to rely on ordinary EFI
> >> >> runtime calls to only execute from the upper address range. The only
> >> >> exception is the call to SetVirtualAddressMap() itself, which occurs
> >> >> only once during early boot.
> >> >
> >> >Gah, I had it in my head that we needed to use the lower mapping at
> >> >runtime. The efi_mm gets used for that SetVirtualAddressMap() and the
> >> >efi_mm continues to get used at runtime. So I think I just assumed that
> >> >the lower mappings needed to get used too.
> >> >
> >> >Thanks for the education!
> >> >
> >> >Let's say we simply delayed CR4.LASS=1 until later in boot. Could we
> >> >completely ignore LASS during EFI calls, since the calls only use the
> >> >upper address range?
> >> >
> >> >Also, in practice, are there buggy EFI implementations that use the
> >> >lower address range even though they're not supposed to? *If* we just
> >> >keep LASS on for these calls is there a chance it will cause a
> >> >regression in some buggy EFI implementations?
> >>
> >> Yes, there are. And there are buggy ones which die if set up with virtual addresses in the low half.
> >
> >To elaborate on that, there are systems where
> >
> >a) not calling SetVirtualAddressMap() crashes the firmware, because,
> >in spite of being clearly documented as optional, not calling it
> >results in some event hook not being called, causing the firmware to
> >misbehave
> >
> >b) calling SetVirtualAddressMap() with a 1:1 mapping crashes the
> >firmware (and so this is not a possible workaround for a))
> >
> >c) calling SetVirtualAddressMap() crashes the firmware when not both
> >the old 1:1 and the new kernel mapping are already live (which
> >violates the UEFI spec)
> >
> >d) calling SetVirtualAddressMap() does not result in all 1:1
> >references being converted to the new mapping.
> >
> >
> >To address d), the x86_64 implementation of efi_map_region() indeed
> >maps a 1:1 alias of each remapped runtime region, so that stray
> >accesses don't fault. But the code addresses are all remapped, and so
> >the firmware routines are always invoked via their remapped aliases in
> >the kernel VA space. Not calling SetVirtualAddressMap() at all, or
> >calling it with a 1:1 mapping is not feasible, essentially because
> >Windows doesn't do that, and that is the only thing that is tested on
> >all x86 PCs by the respective OEMs.
> >
> >Given that remapping the code is dealt with by the firmware's PE/COFF
> >loader, whereas remapping [dynamically allocated] data requires effort
> >on the part of the programmer, I'd hazard a guess that 99.9% of those
> >bugs do not involve attempts to execute via the lower mapping, but
> >stray references to data objects that were not remapped properly.
> >
> >So we might consider
> >a) remapping those 1:1 aliases NX, so we don't have those patches of
> >RWX memory around
> >b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.
>
> Unless someone has a code pointer in their code.

That is a good point, especially because the EFI universe is
constructed out of GUIDs and so-called protocols, which are just
structs with function pointers.

However, EFI protocols are only supported at boot time, and the
runtime execution context is much more restricted. So I'd still expect
the code pointer case to be much less likely.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 15:28               ` Ard Biesheuvel
@ 2025-11-12 15:47                 ` H. Peter Anvin
  2025-11-12 16:18                 ` Sohil Mehta
  1 sibling, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-12 15:47 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Dave Hansen, Sohil Mehta, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On November 12, 2025 7:28:12 AM PST, Ard Biesheuvel <ardb@kernel.org> wrote:
>On Wed, 12 Nov 2025 at 16:23, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> On November 12, 2025 7:18:33 AM PST, Ard Biesheuvel <ardb@kernel.org> wrote:
>> >On Wed, 12 Nov 2025 at 15:58, H. Peter Anvin <hpa@zytor.com> wrote:
>> >>
>> >> On November 12, 2025 6:51:45 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>> >> >On 11/12/25 05:56, Ard Biesheuvel wrote:
>> >> >...
>> >> >>> it looks like we would now need to toggle
>> >> >>> CR4.LASS every time we switch to efi_mm. The lass_enable()/_disable()
>> >> >>> naming would be more suitable for those wrappers.
>> >> >>>
>> >> >> Note that Linux/x86 uses SetVirtualAddressMap() to remap all EFI
>> >> >> runtime regions into the upper [kernel] half of the address space.
>> >> >>
>> >> >> SetVirtualAddressMap() itself is a terrible idea, but given that we
>> >> >> are already stuck with it, we should be able to rely on ordinary EFI
>> >> >> runtime calls to only execute from the upper address range. The only
>> >> >> exception is the call to SetVirtualAddressMap() itself, which occurs
>> >> >> only once during early boot.
>> >> >
>> >> >Gah, I had it in my head that we needed to use the lower mapping at
>> >> >runtime. The efi_mm gets used for that SetVirtualAddressMap() and the
>> >> >efi_mm continues to get used at runtime. So I think I just assumed that
>> >> >the lower mappings needed to get used too.
>> >> >
>> >> >Thanks for the education!
>> >> >
>> >> >Let's say we simply delayed CR4.LASS=1 until later in boot. Could we
>> >> >completely ignore LASS during EFI calls, since the calls only use the
>> >> >upper address range?
>> >> >
>> >> >Also, in practice, are there buggy EFI implementations that use the
>> >> >lower address range even though they're not supposed to? *If* we just
>> >> >keep LASS on for these calls is there a chance it will cause a
>> >> >regression in some buggy EFI implementations?
>> >>
>> >> Yes, there are. And there are buggy ones which die if set up with virtual addresses in the low half.
>> >
>> >To elaborate on that, there are systems where
>> >
>> >a) not calling SetVirtualAddressMap() crashes the firmware, because,
>> >in spite of being clearly documented as optional, not calling it
>> >results in some event hook not being called, causing the firmware to
>> >misbehave
>> >
>> >b) calling SetVirtualAddressMap() with a 1:1 mapping crashes the
>> >firmware (and so this is not a possible workaround for a))
>> >
>> >c) calling SetVirtualAddressMap() crashes the firmware when not both
>> >the old 1:1 and the new kernel mapping are already live (which
>> >violates the UEFI spec)
>> >
>> >d) calling SetVirtualAddressMap() does not result in all 1:1
>> >references being converted to the new mapping.
>> >
>> >
>> >To address d), the x86_64 implementation of efi_map_region() indeed
>> >maps a 1:1 alias of each remapped runtime region, so that stray
>> >accesses don't fault. But the code addresses are all remapped, and so
>> >the firmware routines are always invoked via their remapped aliases in
>> >the kernel VA space. Not calling SetVirtualAddressMap() at all, or
>> >calling it with a 1:1 mapping is not feasible, essentially because
>> >Windows doesn't do that, and that is the only thing that is tested on
>> >all x86 PCs by the respective OEMs.
>> >
>> >Given that remapping the code is dealt with by the firmware's PE/COFF
>> >loader, whereas remapping [dynamically allocated] data requires effort
>> >on the part of the programmer, I'd hazard a guess that 99.9% of those
>> >bugs do not involve attempts to execute via the lower mapping, but
>> >stray references to data objects that were not remapped properly.
>> >
>> >So we might consider
>> >a) remapping those 1:1 aliases NX, so we don't have those patches of
>> >RWX memory around
>> >b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.
>>
>> Unless someone has a code pointer in their code.
>
>That is a good point, especially because the EFI universe is
>constructed out of GUIDs and so-called protocols, which are just
>structs with function pointers.
>
>However, EFI protocols are only supported at boot time, and the
>runtime execution context is much more restricted. So I'd still expect
>the code pointer case to be much less likely.

Yes, but it only takes one. 

The main thing, though, is that this is being bikeshedded for no good reason: there isn't much to be had from trying to narrow down from what we have now, other than restricting the *upper* mapping further.

And this has nothing to do with LASS.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 15:28               ` Ard Biesheuvel
  2025-11-12 15:47                 ` H. Peter Anvin
@ 2025-11-12 16:18                 ` Sohil Mehta
  2025-11-12 16:26                   ` H. Peter Anvin
  2025-11-12 16:29                   ` H. Peter Anvin
  1 sibling, 2 replies; 67+ messages in thread
From: Sohil Mehta @ 2025-11-12 16:18 UTC (permalink / raw)
  To: Ard Biesheuvel, H. Peter Anvin
  Cc: Dave Hansen, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On 11/12/2025 7:28 AM, Ard Biesheuvel wrote:

>>> d) calling SetVirtualAddressMap() does not result in all 1:1
>>> references being converted to the new mapping.
>>>
>>>
>>> To address d), the x86_64 implementation of efi_map_region() indeed
>>> maps an 1:1 alias of each remapped runtime regions, so that stray
>>> accesses don't fault. But the code addresses are all remapped, and so
>>> the firmware routines are always invoked via their remapped aliases in
>>> the kernel VA space. Not calling SetVirtualAddressMap() at all, or
>>> calling it with a 1:1 mapping is not feasible, essentially because
>>> Windows doesn't do that, and that is the only thing that is tested on
>>> all x86 PCs by the respective OEMs.
>>>
>>> Given that remapping the code is dealt with by the firmware's PE/COFF
>>> loader, whereas remapping [dynamically allocated] data requires effort
>>> on the part of the programmer, I'd hazard a guess that 99.9% of those
>>> bugs do not involve attempts to execute via the lower mapping, but
>>> stray references to data objects that were not remapped properly.
>>>
>>> So we might consider
>>> a) remapping those 1:1 aliases NX, so we don't have those patches of
>>> RWX memory around
>>> b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.
>>
>> Unless someone has a code pointer in their code.
> 
> That is a good point, especially because the EFI universe is
> constructed out of GUIDs and so-called protocols, which are just
> structs with function pointers.
> 
> However, EFI protocols are only supported at boot time, and the
> runtime execution context is much more restricted. So I'd still expect
> the code pointer case to be much less likely.

But, that still leaves the stray data accesses. We would still need to
disable the LASS data access enforcement by toggling RFLAGS.AC during
the runtime calls.

Can we rely on EFI to not mess up RFLAGS and keep the AC bit intact?
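
To make the concern concrete, here is a minimal user-space model of the
question, not kernel code. Everything in it is an assumption made for
illustration: struct cpu_model, the two fw_* "services" and
call_with_ac_set() are hypothetical names, and the flags register is
just an integer. Only the AC bit position (bit 18) is architectural.
The sketch shows that a caller can set AC before the call and restore
its own flags afterwards, but it cannot stop the callee from clearing
AC in between, which is the window where a stray 1:1 data access would
fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit position of RFLAGS.AC. */
#define X86_EFLAGS_AC (1ULL << 18)

/* Hypothetical CPU model: only the flags register matters here. */
struct cpu_model {
	uint64_t rflags;
};

/* Hypothetical stand-in for an EFI runtime service entry point. */
typedef void (*fw_service_t)(struct cpu_model *cpu);

/* A service that leaves the flags alone. */
void fw_well_behaved(struct cpu_model *cpu)
{
	(void)cpu;
}

/* A service that clears AC, e.g. by reloading a stale flags image. */
void fw_clobbers_ac(struct cpu_model *cpu)
{
	cpu->rflags &= ~X86_EFLAGS_AC;
}

/*
 * Set AC around the call (modelling the suspension of data-access
 * enforcement) and restore the caller's flags afterwards.
 * Returns true if AC was still set when the service returned.
 */
bool call_with_ac_set(struct cpu_model *cpu, fw_service_t service)
{
	uint64_t saved = cpu->rflags;
	bool ac_survived;

	cpu->rflags |= X86_EFLAGS_AC;
	service(cpu);
	ac_survived = (cpu->rflags & X86_EFLAGS_AC) != 0;

	/* The caller's own state is recoverable either way. */
	cpu->rflags = saved;
	return ac_survived;
}
```

In this model, call_with_ac_set() reports true for fw_well_behaved and
false for fw_clobbers_ac. The caller's flags are intact in both cases,
but in the second case the enforcement was back on for part of the call
and the caller only finds out after the fact.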


* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 16:18                 ` Sohil Mehta
@ 2025-11-12 16:26                   ` H. Peter Anvin
  2025-11-12 16:29                   ` H. Peter Anvin
  1 sibling, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-12 16:26 UTC (permalink / raw)
  To: Sohil Mehta, Ard Biesheuvel
  Cc: Dave Hansen, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On November 12, 2025 8:18:20 AM PST, Sohil Mehta <sohil.mehta@intel.com> wrote:
>On 11/12/2025 7:28 AM, Ard Biesheuvel wrote:
>
>>>> d) calling SetVirtualAddressMap() does not result in all 1:1
>>>> references being converted to the new mapping.
>>>>
>>>>
>>>> To address d), the x86_64 implementation of efi_map_region() indeed
>>>> maps an 1:1 alias of each remapped runtime regions, so that stray
>>>> accesses don't fault. But the code addresses are all remapped, and so
>>>> the firmware routines are always invoked via their remapped aliases in
>>>> the kernel VA space. Not calling SetVirtualAddressMap() at all, or
>>>> calling it with a 1:1 mapping is not feasible, essentially because
>>>> Windows doesn't do that, and that is the only thing that is tested on
>>>> all x86 PCs by the respective OEMs.
>>>>
>>>> Given that remapping the code is dealt with by the firmware's PE/COFF
>>>> loader, whereas remapping [dynamically allocated] data requires effort
>>>> on the part of the programmer, I'd hazard a guess that 99.9% of those
>>>> bugs do not involve attempts to execute via the lower mapping, but
>>>> stray references to data objects that were not remapped properly.
>>>>
>>>> So we might consider
>>>> a) remapping those 1:1 aliases NX, so we don't have those patches of
>>>> RWX memory around
>>>> b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.
>>>
>>> Unless someone has a code pointer in their code.
>> 
>> That is a good point, especially because the EFI universe is
>> constructed out of GUIDs and so-called protocols, which are just
>> structs with function pointers.
>> 
>> However, EFI protocols are only supported at boot time, and the
>> runtime execution context is much more restricted. So I'd still expect
>> the code pointer case to be much less likely.
>
>But, that still leaves the stray data accesses. We would still need to
>disable the LASS data access enforcement by toggling RFLAGS.AC during
>the runtime calls.
>
>Can we rely on EFI to not mess up RFLAGS and keep the AC bit intact?

No.


* Re: [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code
  2025-11-12 16:18                 ` Sohil Mehta
  2025-11-12 16:26                   ` H. Peter Anvin
@ 2025-11-12 16:29                   ` H. Peter Anvin
  1 sibling, 0 replies; 67+ messages in thread
From: H. Peter Anvin @ 2025-11-12 16:29 UTC (permalink / raw)
  To: Sohil Mehta, Ard Biesheuvel
  Cc: Dave Hansen, x86, Borislav Petkov, Jonathan Corbet,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra,
	Kirill A . Shutemov, Xin Li, David Woodhouse, Sean Christopherson,
	Rick Edgecombe, Vegard Nossum, Andrew Cooper, Randy Dunlap,
	Geert Uytterhoeven, Kees Cook, Tony Luck, Alexander Shishkin,
	linux-doc, linux-kernel, linux-efi, Dave Hansen, Thomas Gleixner,
	Ingo Molnar

On November 12, 2025 8:18:20 AM PST, Sohil Mehta <sohil.mehta@intel.com> wrote:
>On 11/12/2025 7:28 AM, Ard Biesheuvel wrote:
>
>>>> d) calling SetVirtualAddressMap() does not result in all 1:1
>>>> references being converted to the new mapping.
>>>>
>>>>
>>>> To address d), the x86_64 implementation of efi_map_region() indeed
>>>> maps an 1:1 alias of each remapped runtime regions, so that stray
>>>> accesses don't fault. But the code addresses are all remapped, and so
>>>> the firmware routines are always invoked via their remapped aliases in
>>>> the kernel VA space. Not calling SetVirtualAddressMap() at all, or
>>>> calling it with a 1:1 mapping is not feasible, essentially because
>>>> Windows doesn't do that, and that is the only thing that is tested on
>>>> all x86 PCs by the respective OEMs.
>>>>
>>>> Given that remapping the code is dealt with by the firmware's PE/COFF
>>>> loader, whereas remapping [dynamically allocated] data requires effort
>>>> on the part of the programmer, I'd hazard a guess that 99.9% of those
>>>> bugs do not involve attempts to execute via the lower mapping, but
>>>> stray references to data objects that were not remapped properly.
>>>>
>>>> So we might consider
>>>> a) remapping those 1:1 aliases NX, so we don't have those patches of
>>>> RWX memory around
>>>> b) keeping LASS enabled during ordinary EFI runtime calls, as you suggest.
>>>
>>> Unless someone has a code pointer in their code.
>> 
>> That is a good point, especially because the EFI universe is
>> constructed out of GUIDs and so-called protocols, which are just
>> structs with function pointers.
>> 
>> However, EFI protocols are only supported at boot time, and the
>> runtime execution context is much more restricted. So I'd still expect
>> the code pointer case to be much less likely.
>
>But, that still leaves the stray data accesses. We would still need to
>disable the LASS data access enforcement by toggling RFLAGS.AC during
>the runtime calls.
>
>Can we rely on EFI to not mess up RFLAGS and keep the AC bit intact?

Let's not muck with this now; it is largely pointless and, as you can see, it's a rathole.


end of thread, other threads:[~2025-11-12 16:29 UTC | newest]

Thread overview: 67+ messages
2025-10-29 21:03 [PATCH v11 0/9] x86: Enable Linear Address Space Separation support Sohil Mehta
2025-10-29 21:03 ` [PATCH v11 1/9] x86/cpufeatures: Enumerate the LASS feature bits Sohil Mehta
2025-10-31 17:03   ` Dave Hansen
2025-10-29 21:03 ` [PATCH v11 2/9] x86/cpu: Add an LASS dependency on SMAP Sohil Mehta
2025-10-31 17:04   ` Dave Hansen
2025-10-29 21:03 ` [PATCH v11 3/9] x86/asm: Introduce inline memcpy and memset Sohil Mehta
2025-10-31 17:06   ` Dave Hansen
2025-10-29 21:03 ` [PATCH v11 4/9] x86/alternatives: Disable LASS when patching kernel code Sohil Mehta
2025-10-31 17:10   ` Dave Hansen
2025-11-10 18:15   ` Sohil Mehta
2025-11-10 19:09     ` H. Peter Anvin
2025-11-10 19:24     ` Borislav Petkov
2025-11-12 13:56     ` Ard Biesheuvel
2025-11-12 14:51       ` Dave Hansen
2025-11-12 14:57         ` H. Peter Anvin
2025-11-12 15:18           ` Ard Biesheuvel
2025-11-12 15:23             ` H. Peter Anvin
2025-11-12 15:28               ` Ard Biesheuvel
2025-11-12 15:47                 ` H. Peter Anvin
2025-11-12 16:18                 ` Sohil Mehta
2025-11-12 16:26                   ` H. Peter Anvin
2025-11-12 16:29                   ` H. Peter Anvin
2025-10-29 21:03 ` [PATCH v11 5/9] x86/efi: Disable LASS while mapping the EFI runtime services Sohil Mehta
2025-10-31 17:11   ` Dave Hansen
2025-10-31 17:38     ` Andy Lutomirski
2025-10-31 17:41       ` Dave Hansen
2025-10-31 18:03         ` Sohil Mehta
2025-10-31 18:12           ` Dave Hansen
2025-11-07  9:04             ` Peter Zijlstra
2025-11-07  9:22               ` Ard Biesheuvel
2025-11-07  9:27                 ` H. Peter Anvin
2025-11-07  9:35                   ` Ard Biesheuvel
2025-11-07  9:40                 ` Peter Zijlstra
2025-11-07 10:09                   ` Ard Biesheuvel
2025-11-07 10:27                     ` Peter Zijlstra
2025-11-08  0:48                     ` Andy Lutomirski
2025-11-08 16:18                       ` H. Peter Anvin
2025-11-08 22:50                       ` H. Peter Anvin
2025-11-07 10:10                 ` Peter Zijlstra
2025-11-07 10:17                   ` Ard Biesheuvel
2025-10-31 19:04       ` Sohil Mehta
2025-11-07  7:36         ` Sohil Mehta
2025-10-31 18:32     ` Sohil Mehta
2025-10-29 21:03 ` [PATCH v11 6/9] x86/kexec: Disable LASS during relocate kernel Sohil Mehta
2025-10-31 17:14   ` Dave Hansen
2025-10-29 21:03 ` [PATCH v11 7/9] x86/traps: Communicate a LASS violation in #GP message Sohil Mehta
2025-10-31 17:16   ` Dave Hansen
2025-10-31 19:59     ` Sohil Mehta
2025-10-31 20:03       ` Andy Lutomirski
2025-10-31 20:56       ` Dave Hansen
2025-10-29 21:03 ` [PATCH v11 8/9] selftests/x86: Update the negative vsyscall tests to expect a #GP Sohil Mehta
2025-10-31 17:20   ` Dave Hansen
2025-10-29 21:03 ` [PATCH v11 9/9] x86/cpu: Enable LASS by default during CPU initialization Sohil Mehta
2025-10-30  8:40   ` H. Peter Anvin
2025-10-30 15:45     ` Andy Lutomirski
2025-10-30 16:44       ` Sohil Mehta
2025-10-30 16:53         ` Andy Lutomirski
2025-10-30 17:24           ` Sohil Mehta
2025-10-30 17:31             ` Andy Lutomirski
2025-10-30 21:13         ` David Laight
2025-10-31  6:41           ` H. Peter Anvin
2025-10-31 16:55           ` Dave Hansen
2025-10-30 16:27     ` Dave Hansen
2025-11-07  8:01       ` H. Peter Anvin
2025-11-07 20:08         ` Sohil Mehta
2025-10-31 17:21   ` Dave Hansen
2025-10-31 20:04     ` Sohil Mehta
