* [PATCHv9 00/16] x86: Enable Linear Address Space Separation support
@ 2025-07-07 8:03 Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
` (15 more replies)
0 siblings, 16 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
Linear Address Space Separation (LASS) is a security feature that intends to
prevent malicious virtual address space accesses across user/kernel mode.
Such mode-based access protection already exists today with paging and features
such as SMEP and SMAP. However, to enforce these protections, the processor
must traverse the paging structures in memory. Malicious software can use
timing information resulting from this traversal to determine details about the
paging structures, and these details may also be used to determine the layout
of kernel memory.
The LASS mechanism provides the same mode-based protections as paging but
without traversing the paging structures. Because the protections enforced by
LASS are applied before paging, software will not be able to derive
paging-based timing information from the various caching structures such as the
TLBs, mid-level caches, page walker, data caches, etc. LASS also defeats
probing via double page faults, TLB flush and reload, and software prefetch
instructions.
See [2], [3] and [4] for some research on the related attack vectors.
Had it been available, LASS alone would have mitigated Meltdown. (Hindsight is
20/20 :)
In addition, LASS prevents an attack vector described in a Spectre LAM (SLAM)
whitepaper [7].
LASS enforcement relies on the typical kernel implementation to divide the
64-bit virtual address space into two halves:
Addr[63]=0 -> User address space
Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically results in a
#GP fault.
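As a rough illustration (a conceptual userspace-C model, not code from this
series; the real check is done by hardware before any page-table walk), the
decision LASS makes for a data access can be written as:

#include <stdbool.h>

/* Models LASS for *data* accesses only, on a 64-bit kernel. */
static bool lass_would_fault(unsigned long long vaddr, bool supervisor,
                             bool rflags_ac)
{
        bool kernel_half = vaddr >> 63;

        if (supervisor) {
                /* Kernel touching the user half faults unless AC is set. */
                return !kernel_half && !rflags_ac;
        }

        /* User mode touching the kernel half always faults. */
        return kernel_half;
}

Instruction fetches are stricter: RFLAGS.AC does not excuse a supervisor
fetch from the user half, which is why the EFI case below needs LASS cleared
in CR4 rather than just setting AC.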
Kernel accesses usually only happen to the kernel address space. However, there
are valid reasons for kernel to access memory in the user half. For these cases
(such as text poking and EFI runtime accesses), the kernel can temporarily
suspend the enforcement of LASS by toggling SMAP (Supervisor Mode Access
Prevention) with the stac()/clac() helpers and, in one instance, by disabling
LASS outright around an EFI runtime call.
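For the text poking case, patch 02 of this series adds lass_stac()/lass_clac()
(NOPs on CPUs without LASS) and open-codes the copy so that objtool does not
see function calls inside the AC region. A minimal sketch of that pattern
(mirroring the patch, not a drop-in snippet):

        lass_stac();
        /* dst points into the temporary poking mm, in the lower half */
        asm volatile("rep movsb"
                     : "+D" (dst), "+S" (src), "+c" (len) : : "memory");
        lass_clac();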
User space cannot access any kernel address while LASS is enabled.
Unfortunately, legacy vsyscall functions are located in the address range
0xffffffffff600000 - 0xffffffffff601000 and emulated in kernel. To avoid
breaking user applications when LASS is enabled, extend the vsyscall emulation
in execute (XONLY) mode to the #GP fault handler.
In contrast, the vsyscall EMULATE mode is deprecated and not expected to be
used by anyone. Supporting EMULATE mode with LASS would need complex
instruction decoding in the #GP fault handler and is probably not worth the
hassle. Disable LASS in the rare case that someone absolutely needs EMULATE
mode and enables vsyscall=emulate via the command line.
Changed from v8[12]:
- Drop __inline_memcpy()/memset(). Directly use asm() for text poke;
- Rework #SS handler;
- Restructure get_kernel_gp_address();
- Update commit messages and comments;
Changes from v7[11]:
- Fix __inline_memset();
- Rename lass_disable/enable_enforcement() back to lass_clac/stac();
- Generalize #GP address decode and hint code. Rename stuff to be
non-GP-centric;
- Reorder patches;
- Update commit messages and comments;
Changes from v6[10]:
- Rework #SS handler to work properly on FRED;
- Do not require X86_PF_INSTR to emulate vsyscall;
- Move lass_clac()/stac() definition to the patch where they are used;
- Rename lass_clac/stac() to lass_disable/enable_enforcement();
- Fix several build issues around inline memcpy and memset;
- Fix sparse warning;
- Adjust comments and commit messages;
- Drop "x86/efi: Move runtime service initialization to arch/x86" patch
as it got applied;
Changes from v5[9]:
- Report LASS violation as NULL pointer dereference if the address is in the
first page frame;
- Provide helpful error message on #SS due to LASS violation;
- Fold patch for vsyscall=emulate documentation into patch
that disables LASS with vsyscall=emulate;
- Rewrite __inline_memset() and __inline_memcpy();
- Adjust comments and commit messages;
Changes from v4[8]:
- Added PeterZ's Originally-by and SoB to 2/16
- Added lass_clac()/lass_stac() to differentiate from SMAP necessitated
clac()/stac() and to be NOPs on CPUs that don't support LASS
- Moved LASS enabling patch to the end to avoid rendering machines
unbootable in the window before the patch that disables LASS around EFI
initialization
- Reverted Pawan's LAM disabling commit
Changes from v3[6]:
- Made LAM dependent on LASS
- Moved EFI runtime initialization to x86 side of things
- Suspended LASS validation around EFI set_virtual_address_map call
- Added a message for the case of kernel side LASS violation
- Moved inline memset/memcpy versions to the common string.h
Changes from v2[5]:
- Added myself to the SoB chain
Changes from v1[1]:
- Emulate vsyscall violations in execute mode in the #GP fault handler
- Use inline memcpy and memset while patching alternatives
- Remove CONFIG_X86_LASS
- Make LASS depend on SMAP
- Dropped the minimal KVM enabling patch
[1] https://lore.kernel.org/lkml/20230110055204.3227669-1-yian.chen@intel.com/
[2] “Practical Timing Side Channel Attacks against Kernel Space ASLR”,
https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[3] “Prefetch Side-Channel Attacks: Bypassing SMAP and Kernel ASLR”, http://doi.acm.org/10.1145/2976749.2978356
[4] “Harmful prefetch on Intel”, https://ioactive.com/harmful-prefetch-on-intel/ (H/T Anders)
[5] https://lore.kernel.org/all/20230530114247.21821-1-alexander.shishkin@linux.intel.com/
[6] https://lore.kernel.org/all/20230609183632.48706-1-alexander.shishkin@linux.intel.com/
[7] https://download.vusec.net/papers/slam_sp24.pdf
[8] https://lore.kernel.org/all/20240710160655.3402786-1-alexander.shishkin@linux.intel.com/
[9] https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com
[10] https://lore.kernel.org/all/20250620135325.3300848-1-kirill.shutemov@linux.intel.com/
[11] https://lore.kernel.org/all/20250625125112.3943745-1-kirill.shutemov@linux.intel.com/
[12] https://lore.kernel.org/all/20250701095849.2360685-1-kirill.shutemov@linux.intel.com/
Alexander Shishkin (4):
x86/cpu: Defer CR pinning setup until core initcall
efi: Disable LASS around set_virtual_address_map() EFI call
x86/traps: Communicate a LASS violation in #GP message
x86/cpu: Make LAM depend on LASS
Kirill A. Shutemov (4):
x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
x86/traps: Generalize #GP address decode and hint code
x86/traps: Handle LASS thrown #SS
x86: Re-enable Linear Address Masking
Sohil Mehta (7):
x86/cpu: Enumerate the LASS feature bits
x86/alternatives: Disable LASS when patching kernel alternatives
x86/vsyscall: Reorganize the #PF emulation code
x86/traps: Consolidate user fixups in exc_general_protection()
x86/vsyscall: Add vsyscall emulation for #GP
x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
x86/cpu: Enable LASS during CPU initialization
Yian Chen (1):
x86/cpu: Set LASS CR4 bit as pinning sensitive
.../admin-guide/kernel-parameters.txt | 4 +-
arch/x86/Kconfig | 1 -
arch/x86/Kconfig.cpufeatures | 4 +
arch/x86/entry/vsyscall/vsyscall_64.c | 69 +++++++---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/smap.h | 33 ++++-
arch/x86/include/asm/vsyscall.h | 14 +-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/alternative.c | 28 +++-
arch/x86/kernel/cpu/common.c | 22 ++--
arch/x86/kernel/cpu/cpuid-deps.c | 2 +
arch/x86/kernel/dumpstack.c | 6 +-
arch/x86/kernel/traps.c | 122 ++++++++++++------
arch/x86/kernel/umip.c | 3 +
arch/x86/mm/fault.c | 2 +-
arch/x86/platform/efi/efi.c | 15 +++
tools/arch/x86/include/asm/cpufeatures.h | 1 +
17 files changed, 249 insertions(+), 80 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCHv9 01/16] x86/cpu: Enumerate the LASS feature bits
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
` (14 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Linear Address Space Separation (LASS) is a security feature that
intends to prevent malicious virtual address space accesses across
user/kernel mode.
Such mode-based access protection already exists today with paging and
features such as SMEP and SMAP. However, to enforce these protections,
the processor must traverse the paging structures in memory. Malicious
software can use timing information resulting from this traversal to
determine details about the paging structures, and these details may
also be used to determine the layout of kernel memory.
The LASS mechanism provides the same mode-based protections as paging
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, software will not be able to
derive paging-based timing information from the various caching
structures such as the TLBs, mid-level caches, page walker, data caches,
etc.
LASS enforcement relies on the typical kernel implementation to divide
the 64-bit virtual address space into two halves:
Addr[63]=0 -> User address space
Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically
results in a #GP fault.
The LASS enforcement for kernel data access is dependent on CR4.SMAP
being set. The enforcement can be disabled by toggling the RFLAGS.AC bit,
just like SMAP.
Define the CPU feature bits to enumerate this feature and add the
corresponding feature dependency (LASS depends on SMAP).
LASS provides protection against a class of speculative attacks, such as
SLAM[1]. Add the "lass" flag to /proc/cpuinfo to indicate that the feature
is supported by hardware and enabled by the kernel. This allows userspace
to determine if the setup is secure against such attacks.
[1] https://download.vusec.net/papers/slam_sp24.pdf
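As an illustration only (no such tool is part of this series), userspace
could check for the flag like this:

#include <stdio.h>
#include <string.h>

int main(void)
{
        char line[4096];
        FILE *f = fopen("/proc/cpuinfo", "r");

        if (!f)
                return 2;

        while (fgets(line, sizeof(line), f)) {
                if (strncmp(line, "flags", 5))
                        continue;
                for (char *tok = strtok(line, " \t\n:"); tok;
                     tok = strtok(NULL, " \t\n:")) {
                        if (!strcmp(tok, "lass")) {
                                fclose(f);
                                puts("lass: enabled");
                                return 0;
                        }
                }
        }
        fclose(f);
        puts("lass: not enabled");
        return 1;
}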
Co-developed-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Xin Li (Intel) <xin@zytor.com>
---
arch/x86/Kconfig.cpufeatures | 4 ++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/uapi/asm/processor-flags.h | 2 ++
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
tools/arch/x86/include/asm/cpufeatures.h | 1 +
5 files changed, 9 insertions(+)
diff --git a/arch/x86/Kconfig.cpufeatures b/arch/x86/Kconfig.cpufeatures
index 250c10627ab3..733d5aff2456 100644
--- a/arch/x86/Kconfig.cpufeatures
+++ b/arch/x86/Kconfig.cpufeatures
@@ -124,6 +124,10 @@ config X86_DISABLED_FEATURE_PCID
def_bool y
depends on !X86_64
+config X86_DISABLED_FEATURE_LASS
+ def_bool y
+ depends on X86_32
+
config X86_DISABLED_FEATURE_PKU
def_bool y
depends on !X86_INTEL_MEMORY_PROTECTION_KEYS
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b78af55aa22e..8eef1ad7aca2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -313,6 +313,7 @@
#define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */
#define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */
#define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */
#define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index f1a4adc78272..81d0c8bf1137 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -136,6 +136,8 @@
#define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT)
#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */
#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT)
+#define X86_CR4_LASS_BIT 27 /* enable Linear Address Space Separation support */
+#define X86_CR4_LASS _BITUL(X86_CR4_LASS_BIT)
#define X86_CR4_LAM_SUP_BIT 28 /* LAM for supervisor pointers */
#define X86_CR4_LAM_SUP _BITUL(X86_CR4_LAM_SUP_BIT)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 46efcbd6afa4..98d0cdd82574 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -89,6 +89,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_SHSTK, X86_FEATURE_XSAVES },
{ X86_FEATURE_FRED, X86_FEATURE_LKGS },
{ X86_FEATURE_SPEC_CTRL_SSBD, X86_FEATURE_SPEC_CTRL },
+ { X86_FEATURE_LASS, X86_FEATURE_SMAP },
{}
};
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index ee176236c2be..4473a6f7800b 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -313,6 +313,7 @@
#define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */
#define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */
#define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */
#define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 1:08 ` Sohil Mehta
` (2 more replies)
2025-07-07 8:03 ` [PATCHv9 03/16] x86/cpu: Set LASS CR4 bit as pinning sensitive Kirill A. Shutemov
` (13 subsequent siblings)
15 siblings, 3 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
For patching, the kernel initializes a temporary mm area in the lower
half of the address range. See commit 4fc19708b165 ("x86/alternatives:
Initialize temporary mm for patching").
Disable LASS enforcement during patching to avoid triggering a #GP
fault.
Objtool warns when code inside a stac()/clac() region calls a function that
is not on its allow-list, or references any function through a dynamic
function pointer inside the region. See the Objtool warnings section #9 in
the document tools/objtool/Documentation/objtool.txt.
Considering that patching is usually small, replace the memcpy() and
memset() calls in the text poking functions with open-coded versions.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/smap.h | 33 +++++++++++++++++++++++++++++++--
arch/x86/kernel/alternative.c | 28 ++++++++++++++++++++++++++--
2 files changed, 57 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..d0cc24348641 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -23,18 +23,47 @@
#else /* __ASSEMBLER__ */
+/*
+ * The CLAC/STAC instructions toggle the enforcement of X86_FEATURE_SMAP and
+ * X86_FEATURE_LASS.
+ *
+ * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page tables: the
+ * kernel is not allowed to touch pages with the bit set unless the AC bit is
+ * set.
+ *
+ * LASS enforcement is based on bit 63 of the virtual address. The kernel is
+ * not allowed to touch memory in the lower half of the virtual address space
+ * unless the AC bit is set.
+ *
+ * Use stac()/clac() when accessing userspace (_PAGE_USER) mappings,
+ * regardless of location.
+ *
+ * Use lass_stac()/lass_clac() when accessing kernel mappings (!_PAGE_USER)
+ * in the lower half of the address space.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
static __always_inline void clac(void)
{
- /* Note: a barrier is implicit in alternative() */
alternative("", "clac", X86_FEATURE_SMAP);
}
static __always_inline void stac(void)
{
- /* Note: a barrier is implicit in alternative() */
alternative("", "stac", X86_FEATURE_SMAP);
}
+static __always_inline void lass_clac(void)
+{
+ alternative("", "clac", X86_FEATURE_LASS);
+}
+
+static __always_inline void lass_stac(void)
+{
+ alternative("", "stac", X86_FEATURE_LASS);
+}
+
static __always_inline unsigned long smap_save(void)
{
unsigned long flags;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index ea1d984166cd..992ece0e879a 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2447,16 +2447,40 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
__ro_after_init struct mm_struct *text_poke_mm;
__ro_after_init unsigned long text_poke_mm_addr;
+/*
+ * Text poking creates and uses a mapping in the lower half of the
+ * address space. Relax LASS enforcement when accessing the poking
+ * address.
+ */
+
static void text_poke_memcpy(void *dst, const void *src, size_t len)
{
- memcpy(dst, src, len);
+ lass_stac();
+
+ /*
+ * Objtool is picky about what occurs within the STAC/CLAC region
+ * because this code runs with protection disabled. Objtool typically
+ * does not permit function calls in this area.
+ *
+ * Avoid using memcpy() here. Instead, open code it.
+ */
+ asm volatile("rep movsb"
+ : "+D" (dst), "+S" (src), "+c" (len) : : "memory");
+
+ lass_clac();
}
static void text_poke_memset(void *dst, const void *src, size_t len)
{
int c = *(const int *)src;
- memset(dst, c, len);
+ lass_stac();
+
+ /* Open code memset(): make objtool happy. See text_poke_memcpy(). */
+ asm volatile("rep stosb"
+ : "+D" (dst), "+c" (len) : "a" (c) : "memory");
+
+ lass_clac();
}
typedef void text_poke_f(void *dst, const void *src, size_t len);
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 03/16] x86/cpu: Set LASS CR4 bit as pinning sensitive
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall Kirill A. Shutemov
` (12 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Yian Chen <yian.chen@intel.com>
Security features such as LASS are not expected to be disabled once
initialized. Add LASS to the CR4 pinned mask.
Signed-off-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/common.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 325f892ee42d..ec62e2f9ea16 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -403,7 +403,8 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
- X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED;
+ X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED |
+ X86_CR4_LASS;
static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
static unsigned long cr4_pinned_bits __ro_after_init;
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (2 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 03/16] x86/cpu: Set LASS CR4 bit as pinning sensitive Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 1:19 ` Sohil Mehta
2025-07-09 17:00 ` Dave Hansen
2025-07-07 8:03 ` [PATCHv9 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
` (11 subsequent siblings)
15 siblings, 2 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
In order to map the EFI runtime services, set_virtual_address_map(), which
resides in the lower half of the address space, needs to be called. This
means that LASS needs to be temporarily disabled around this call. That can
only be done before the CR pinning is set up.
Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
arch_cpu_finalize_init(), defer it until core initcall.
Wrapping efi_enter_virtual_mode() in lass_stac()/lass_clac() is not enough
because the AC flag only gates data accesses, not instruction fetches.
Clearing the CR4 bit is required.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/common.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ec62e2f9ea16..f10f9f618805 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -490,11 +490,14 @@ void cr4_init(void)
* parsed), record any of the sensitive CR bits that are set, and
* enable CR pinning.
*/
-static void __init setup_cr_pinning(void)
+static int __init setup_cr_pinning(void)
{
cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask;
static_key_enable(&cr_pinning.key);
+
+ return 0;
}
+core_initcall(setup_cr_pinning);
static __init int x86_nofsgsbase_setup(char *arg)
{
@@ -2082,7 +2085,6 @@ static __init void identify_boot_cpu(void)
enable_sep_cpu();
#endif
cpu_detect_tlb(&boot_cpu_data);
- setup_cr_pinning();
tsx_init();
tdx_init();
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 05/16] efi: Disable LASS around set_virtual_address_map() EFI call
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (3 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 1:27 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
` (10 subsequent siblings)
15 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Of all the EFI runtime services, set_virtual_address_map() is the only
one that is called at its lower mapping, which LASS prohibits regardless
of the EFLAGS.AC setting. The only way to allow this to happen is to
disable LASS in the CR4 register.
Disable LASS around this low-address EFI call.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/platform/efi/efi.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 463b784499a8..5b23c0daedef 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -787,6 +787,7 @@ static void __init __efi_enter_virtual_mode(void)
int count = 0, pg_shift = 0;
void *new_memmap = NULL;
efi_status_t status;
+ unsigned long lass;
unsigned long pa;
if (efi_alloc_page_tables()) {
@@ -825,11 +826,25 @@ static void __init __efi_enter_virtual_mode(void)
efi_sync_low_kernel_mappings();
+ /*
+ * set_virtual_address_map() is the only service located at lower
+ * addresses, so LASS has to be disabled around it.
+ *
+ * Note that flipping RFLAGS.AC is not sufficient for this, as it only
+ * permits data accesses and not instruction fetch. The entire LASS
+ * needs to be disabled.
+ */
+ lass = cr4_read_shadow() & X86_CR4_LASS;
+ cr4_clear_bits(lass);
+
status = efi_set_virtual_address_map(efi.memmap.desc_size * count,
efi.memmap.desc_size,
efi.memmap.desc_version,
(efi_memory_desc_t *)pa,
efi_systab_phys);
+
+ cr4_set_bits(lass);
+
if (status != EFI_SUCCESS) {
pr_err("Unable to switch EFI into virtual mode (status=%lx)!\n",
status);
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (4 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 07/16] x86/vsyscall: Reorganize the #PF emulation code Kirill A. Shutemov
` (9 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
emulate_vsyscall() expects to see X86_PF_INSTR in PFEC on a vsyscall
page fault, but the CPU does not report X86_PF_INSTR if neither
X86_FEATURE_NX nor X86_FEATURE_SMEP is enabled.
X86_FEATURE_NX should be enabled on nearly all 64-bit CPUs, except for
early P4 processors that did not support this feature.
Instead of explicitly checking for X86_PF_INSTR, compare the fault
address against RIP.
On machines with X86_FEATURE_NX enabled, issue a warning if RIP is equal
to the fault address but X86_PF_INSTR is absent.
Originally-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/all/bd81a98b-f8d4-4304-ac55-d4151a1a77ab@intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index c9103a6fa06e..0b0e0283994f 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -124,7 +124,8 @@ bool emulate_vsyscall(unsigned long error_code,
if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
return false;
- if (!(error_code & X86_PF_INSTR)) {
+ /* Avoid emulation unless userspace was executing from vsyscall page: */
+ if (address != regs->ip) {
/* Failed vsyscall read */
if (vsyscall_mode == EMULATE)
return false;
@@ -136,13 +137,16 @@ bool emulate_vsyscall(unsigned long error_code,
return false;
}
+
+ /* X86_PF_INSTR is only set when NX is supported: */
+ if (cpu_feature_enabled(X86_FEATURE_NX))
+ WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
/*
* No point in checking CS -- the only way to get here is a user mode
* trap to a high address, which means that we're in 64-bit user code.
*/
- WARN_ON_ONCE(address != regs->ip);
-
if (vsyscall_mode == NONE) {
warn_bad_vsyscall(KERN_INFO, regs,
"vsyscall attempted with vsyscall=none");
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 07/16] x86/vsyscall: Reorganize the #PF emulation code
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (5 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 08/16] x86/traps: Consolidate user fixups in exc_general_protection() Kirill A. Shutemov
` (8 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Separate out the actual vsyscall emulation from the page fault specific
handling in preparation for the upcoming #GP fault emulation.
No functional change intended.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 52 ++++++++++++++-------------
arch/x86/include/asm/vsyscall.h | 8 ++---
arch/x86/mm/fault.c | 2 +-
3 files changed, 33 insertions(+), 29 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 0b0e0283994f..25f94ac5fd35 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -112,36 +112,13 @@ static bool write_ok_or_segv(unsigned long ptr, size_t size)
}
}
-bool emulate_vsyscall(unsigned long error_code,
- struct pt_regs *regs, unsigned long address)
+static bool __emulate_vsyscall(struct pt_regs *regs, unsigned long address)
{
unsigned long caller;
int vsyscall_nr, syscall_nr, tmp;
long ret;
unsigned long orig_dx;
- /* Write faults or kernel-privilege faults never get fixed up. */
- if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
- return false;
-
- /* Avoid emulation unless userspace was executing from vsyscall page: */
- if (address != regs->ip) {
- /* Failed vsyscall read */
- if (vsyscall_mode == EMULATE)
- return false;
-
- /*
- * User code tried and failed to read the vsyscall page.
- */
- warn_bad_vsyscall(KERN_INFO, regs, "vsyscall read attempt denied -- look up the vsyscall kernel parameter if you need a workaround");
- return false;
- }
-
-
- /* X86_PF_INSTR is only set when NX is supported: */
- if (cpu_feature_enabled(X86_FEATURE_NX))
- WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
-
/*
* No point in checking CS -- the only way to get here is a user mode
* trap to a high address, which means that we're in 64-bit user code.
@@ -274,6 +251,33 @@ bool emulate_vsyscall(unsigned long error_code,
return true;
}
+bool emulate_vsyscall_pf(unsigned long error_code, struct pt_regs *regs,
+ unsigned long address)
+{
+ /* Write faults or kernel-privilege faults never get fixed up. */
+ if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
+ return false;
+
+ if (address == regs->ip) {
+ /* X86_PF_INSTR is only set when NX is supported: */
+ if (cpu_feature_enabled(X86_FEATURE_NX))
+ WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
+ return __emulate_vsyscall(regs, address);
+ }
+
+ /* Failed vsyscall read */
+ if (vsyscall_mode == EMULATE)
+ return false;
+
+ /*
+ * User code tried and failed to read the vsyscall page.
+ */
+ warn_bad_vsyscall(KERN_INFO, regs,
+ "vsyscall read attempt denied -- look up the vsyscall kernel parameter if you need a workaround");
+ return false;
+}
+
/*
* A pseudo VMA to allow ptrace access for the vsyscall page. This only
* covers the 64bit vsyscall page now. 32bit has a real VMA now and does
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index 472f0263dbc6..214977f4fa11 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -14,12 +14,12 @@ extern void set_vsyscall_pgtable_user_bits(pgd_t *root);
* Called on instruction fetch fault in vsyscall page.
* Returns true if handled.
*/
-extern bool emulate_vsyscall(unsigned long error_code,
- struct pt_regs *regs, unsigned long address);
+extern bool emulate_vsyscall_pf(unsigned long error_code,
+ struct pt_regs *regs, unsigned long address);
#else
static inline void map_vsyscall(void) {}
-static inline bool emulate_vsyscall(unsigned long error_code,
- struct pt_regs *regs, unsigned long address)
+static inline bool emulate_vsyscall_pf(unsigned long error_code,
+ struct pt_regs *regs, unsigned long address)
{
return false;
}
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 998bd807fc7b..fbcc2da75fd6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1316,7 +1316,7 @@ void do_user_addr_fault(struct pt_regs *regs,
* to consider the PF_PK bit.
*/
if (is_vsyscall_vaddr(address)) {
- if (emulate_vsyscall(error_code, regs, address))
+ if (emulate_vsyscall_pf(error_code, regs, address))
return;
}
#endif
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 08/16] x86/traps: Consolidate user fixups in exc_general_protection()
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (6 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 07/16] x86/vsyscall: Reorganize the #PF emulation code Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 09/16] x86/vsyscall: Add vsyscall emulation for #GP Kirill A. Shutemov
` (7 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Move the UMIP exception fixup along with the other user mode fixups,
that is, under the common "if (user_mode(regs))" condition where the
rest of the fixups reside.
No functional change intended.
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
arch/x86/kernel/traps.c | 8 +++-----
arch/x86/kernel/umip.c | 3 +++
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 36354b470590..25b45193eb19 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -800,11 +800,6 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
cond_local_irq_enable(regs);
- if (static_cpu_has(X86_FEATURE_UMIP)) {
- if (user_mode(regs) && fixup_umip_exception(regs))
- goto exit;
- }
-
if (v8086_mode(regs)) {
local_irq_enable();
handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
@@ -819,6 +814,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
if (fixup_vdso_exception(regs, X86_TRAP_GP, error_code, 0))
goto exit;
+ if (fixup_umip_exception(regs))
+ goto exit;
+
gp_user_force_sig_segv(regs, X86_TRAP_GP, error_code, desc);
goto exit;
}
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5a4b21389b1d..80f2ad26363c 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -343,6 +343,9 @@ bool fixup_umip_exception(struct pt_regs *regs)
void __user *uaddr;
struct insn insn;
+ if (!cpu_feature_enabled(X86_FEATURE_UMIP))
+ return false;
+
if (!regs)
return false;
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 09/16] x86/vsyscall: Add vsyscall emulation for #GP
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (7 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 08/16] x86/traps: Consolidate user fixups in exc_general_protection() Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE Kirill A. Shutemov
` (6 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
The legacy vsyscall page is mapped at a fixed address in the kernel
address range 0xffffffffff600000-0xffffffffff601000. Prior to LASS being
introduced, a legacy vsyscall page access from userspace would always
generate a page fault. The kernel emulates the execute (XONLY) accesses
in the page fault handler and returns to userspace with the
appropriate register values.
Since LASS intercepts these accesses before the paging structures are
traversed, it generates a general protection fault instead of a page
fault. The #GP fault doesn't provide much information in terms of the
error code. So, use the faulting RIP, which is preserved in the user
registers, to emulate the vsyscall access without going through complex
instruction decoding.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 14 +++++++++++++-
arch/x86/include/asm/vsyscall.h | 6 ++++++
arch/x86/kernel/traps.c | 4 ++++
3 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 25f94ac5fd35..be77385b311e 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -23,7 +23,7 @@
* soon be no new userspace code that will ever use a vsyscall.
*
* The code in this file emulates vsyscalls when notified of a page
- * fault to a vsyscall address.
+ * fault or a general protection fault to a vsyscall address.
*/
#include <linux/kernel.h>
@@ -278,6 +278,18 @@ bool emulate_vsyscall_pf(unsigned long error_code, struct pt_regs *regs,
return false;
}
+bool emulate_vsyscall_gp(struct pt_regs *regs)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_LASS))
+ return false;
+
+ /* Emulate only if the RIP points to the vsyscall address */
+ if (!is_vsyscall_vaddr(regs->ip))
+ return false;
+
+ return __emulate_vsyscall(regs, regs->ip);
+}
+
/*
* A pseudo VMA to allow ptrace access for the vsyscall page. This only
* covers the 64bit vsyscall page now. 32bit has a real VMA now and does
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index 214977f4fa11..4eb8d3673223 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -16,6 +16,7 @@ extern void set_vsyscall_pgtable_user_bits(pgd_t *root);
*/
extern bool emulate_vsyscall_pf(unsigned long error_code,
struct pt_regs *regs, unsigned long address);
+extern bool emulate_vsyscall_gp(struct pt_regs *regs);
#else
static inline void map_vsyscall(void) {}
static inline bool emulate_vsyscall_pf(unsigned long error_code,
@@ -23,6 +24,11 @@ static inline bool emulate_vsyscall_pf(unsigned long error_code,
{
return false;
}
+
+static inline bool emulate_vsyscall_gp(struct pt_regs *regs)
+{
+ return false;
+}
#endif
/*
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 25b45193eb19..59bfbdf0a1a0 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -69,6 +69,7 @@
#include <asm/tdx.h>
#include <asm/cfi.h>
#include <asm/msr.h>
+#include <asm/vsyscall.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@@ -817,6 +818,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
if (fixup_umip_exception(regs))
goto exit;
+ if (emulate_vsyscall_gp(regs))
+ goto exit;
+
gp_user_force_sig_segv(regs, X86_TRAP_GP, error_code, desc);
goto exit;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (8 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 09/16] x86/vsyscall: Add vsyscall emulation for #GP Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
` (5 subsequent siblings)
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
The EMULATE mode of vsyscall maps the vsyscall page into the user address
space, where it can be read directly by the user application. This mode has
been deprecated recently and can only be enabled via the special
command-line parameter vsyscall=emulate. See commit bf00745e7791
("x86/vsyscall: Remove CONFIG_LEGACY_VSYSCALL_EMULATE")
Fixing the LASS violations in EMULATE mode would need complex
instruction decoding since the resulting #GP fault does not include any
useful error information and the vsyscall address is not readily
available in the RIP.
At this point, no one is expected to be using the insecure and
deprecated EMULATE mode. The rare usages that need support probably
don't care much about security anyway. Disable LASS when EMULATE mode is
requested during command line parsing to avoid breaking user software.
LASS will be supported if vsyscall mode is set to XONLY or NONE.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
Documentation/admin-guide/kernel-parameters.txt | 4 +++-
arch/x86/entry/vsyscall/vsyscall_64.c | 7 +++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f1f2c0874da9..796c987372df 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -7926,7 +7926,9 @@
emulate Vsyscalls turn into traps and are emulated
reasonably safely. The vsyscall page is
- readable.
+ readable. This disables the Linear
+ Address Space Separation (LASS) security
+ feature and makes the system less secure.
xonly [default] Vsyscalls turn into traps and are
emulated reasonably safely. The vsyscall
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index be77385b311e..d37df40bfb26 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -63,6 +63,13 @@ static int __init vsyscall_setup(char *str)
else
return -EINVAL;
+ if (cpu_feature_enabled(X86_FEATURE_LASS) &&
+ vsyscall_mode == EMULATE) {
+ cr4_clear_bits(X86_CR4_LASS);
+ setup_clear_cpu_cap(X86_FEATURE_LASS);
+ pr_warn_once("x86/cpu: Disabling LASS support due to vsyscall=emulate\n");
+ }
+
return 0;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (9 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 2:40 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 12/16] x86/traps: Generalize #GP address decode and hint code Kirill A. Shutemov
` (4 subsequent siblings)
15 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Provide a more helpful message on #GP when a kernel-side LASS violation
is detected.
A NULL pointer dereference is reported if a LASS violation occurs due to
accessing the first page frame.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/traps.c | 41 +++++++++++++++++++++++++++++------------
1 file changed, 29 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 59bfbdf0a1a0..4a4194e1d119 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -636,7 +636,16 @@ DEFINE_IDTENTRY(exc_bounds)
enum kernel_gp_hint {
GP_NO_HINT,
GP_NON_CANONICAL,
- GP_CANONICAL
+ GP_CANONICAL,
+ GP_LASS_VIOLATION,
+ GP_NULL_POINTER,
+};
+
+static const char * const kernel_gp_hint_help[] = {
+ [GP_NON_CANONICAL] = "probably for non-canonical address",
+ [GP_CANONICAL] = "maybe for address",
+ [GP_LASS_VIOLATION] = "LASS prevented access to address",
+ [GP_NULL_POINTER] = "kernel NULL pointer dereference",
};
/*
@@ -664,14 +673,23 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
return GP_NO_HINT;
#ifdef CONFIG_X86_64
- /*
- * Check that:
- * - the operand is not in the kernel half
- * - the last byte of the operand is not in the user canonical half
- */
- if (*addr < ~__VIRTUAL_MASK &&
- *addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
+ /* Operand is in the kernel half */
+ if (*addr >= ~__VIRTUAL_MASK)
+ return GP_CANONICAL;
+
+ /* The last byte of the operand is not in the user canonical half */
+ if (*addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
return GP_NON_CANONICAL;
+
+ /*
+ * If LASS is enabled, NULL pointer dereference generates
+ * #GP instead of #PF.
+ */
+ if (*addr < PAGE_SIZE)
+ return GP_NULL_POINTER;
+
+ if (cpu_feature_enabled(X86_FEATURE_LASS))
+ return GP_LASS_VIOLATION;
#endif
return GP_CANONICAL;
@@ -833,11 +851,10 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
else
hint = get_kernel_gp_address(regs, &gp_addr);
- if (hint != GP_NO_HINT)
+ if (hint != GP_NO_HINT) {
snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
- (hint == GP_NON_CANONICAL) ? "probably for non-canonical address"
- : "maybe for address",
- gp_addr);
+ kernel_gp_hint_help[hint], gp_addr);
+ }
/*
* KASAN is interested only in the non-canonical case, clear it
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 12/16] x86/traps: Generalize #GP address decode and hint code
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (10 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 4:59 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
` (3 subsequent siblings)
15 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
In most cases, an access causing a LASS violation results in a general
protection exception (#GP); for stack accesses (those due to
stack-oriented instructions, as well as accesses that implicitly or
explicitly use the SS segment register), a stack fault (#SS) is
generated.
Handlers for #GP and #SS will now share code to decode the exception
address and retrieve the exception hint string.
The helper, enum, and array should be renamed as they are no longer
specific to #GP.
No functional change intended.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/dumpstack.c | 6 ++--
arch/x86/kernel/traps.c | 58 ++++++++++++++++++-------------------
2 files changed, 32 insertions(+), 32 deletions(-)
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 71ee20102a8a..e0f85214e92f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -441,14 +441,14 @@ void die(const char *str, struct pt_regs *regs, long err)
oops_end(flags, regs, sig);
}
-void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr)
+void die_addr(const char *str, struct pt_regs *regs, long err, long addr)
{
unsigned long flags = oops_begin();
int sig = SIGSEGV;
__die_header(str, regs, err);
- if (gp_addr)
- kasan_non_canonical_hook(gp_addr);
+ if (addr)
+ kasan_non_canonical_hook(addr);
if (__die_body(str, regs, err))
sig = 0;
oops_end(flags, regs, sig);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4a4194e1d119..f75d6a8dcf20 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -633,28 +633,28 @@ DEFINE_IDTENTRY(exc_bounds)
cond_local_irq_disable(regs);
}
-enum kernel_gp_hint {
- GP_NO_HINT,
- GP_NON_CANONICAL,
- GP_CANONICAL,
- GP_LASS_VIOLATION,
- GP_NULL_POINTER,
+enum kernel_exc_hint {
+ EXC_NO_HINT,
+ EXC_NON_CANONICAL,
+ EXC_CANONICAL,
+ EXC_LASS_VIOLATION,
+ EXC_NULL_POINTER,
};
-static const char * const kernel_gp_hint_help[] = {
- [GP_NON_CANONICAL] = "probably for non-canonical address",
- [GP_CANONICAL] = "maybe for address",
- [GP_LASS_VIOLATION] = "LASS prevented access to address",
- [GP_NULL_POINTER] = "kernel NULL pointer dereference",
+static const char * const kernel_exc_hint_help[] = {
+ [EXC_NON_CANONICAL] = "probably for non-canonical address",
+ [EXC_CANONICAL] = "maybe for address",
+ [EXC_LASS_VIOLATION] = "LASS prevented access to address",
+ [EXC_NULL_POINTER] = "kernel NULL pointer dereference",
};
/*
- * When an uncaught #GP occurs, try to determine the memory address accessed by
- * the instruction and return that address to the caller. Also, try to figure
- * out whether any part of the access to that address was non-canonical.
+ * When an uncaught #GP/#SS occurs, try to determine the memory address accessed
+ * by the instruction and return that address to the caller. Also, try to
+ * figure out whether any part of the access to that address was non-canonical.
*/
-static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
- unsigned long *addr)
+static enum kernel_exc_hint get_kernel_exc_address(struct pt_regs *regs,
+ unsigned long *addr)
{
u8 insn_buf[MAX_INSN_SIZE];
struct insn insn;
@@ -662,37 +662,37 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
if (copy_from_kernel_nofault(insn_buf, (void *)regs->ip,
MAX_INSN_SIZE))
- return GP_NO_HINT;
+ return EXC_NO_HINT;
ret = insn_decode_kernel(&insn, insn_buf);
if (ret < 0)
- return GP_NO_HINT;
+ return EXC_NO_HINT;
*addr = (unsigned long)insn_get_addr_ref(&insn, regs);
if (*addr == -1UL)
- return GP_NO_HINT;
+ return EXC_NO_HINT;
#ifdef CONFIG_X86_64
/* Operand is in the kernel half */
if (*addr >= ~__VIRTUAL_MASK)
- return GP_CANONICAL;
+ return EXC_CANONICAL;
/* The last byte of the operand is not in the user canonical half */
if (*addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
- return GP_NON_CANONICAL;
+ return EXC_NON_CANONICAL;
/*
* If LASS is enabled, NULL pointer dereference generates
* #GP instead of #PF.
*/
if (*addr < PAGE_SIZE)
- return GP_NULL_POINTER;
+ return EXC_NULL_POINTER;
if (cpu_feature_enabled(X86_FEATURE_LASS))
- return GP_LASS_VIOLATION;
+ return EXC_LASS_VIOLATION;
#endif
- return GP_CANONICAL;
+ return EXC_CANONICAL;
}
#define GPFSTR "general protection fault"
@@ -811,7 +811,7 @@ static void gp_user_force_sig_segv(struct pt_regs *regs, int trapnr,
DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
{
char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
- enum kernel_gp_hint hint = GP_NO_HINT;
+ enum kernel_exc_hint hint = EXC_NO_HINT;
unsigned long gp_addr;
if (user_mode(regs) && try_fixup_enqcmd_gp())
@@ -849,18 +849,18 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
if (error_code)
snprintf(desc, sizeof(desc), "segment-related " GPFSTR);
else
- hint = get_kernel_gp_address(regs, &gp_addr);
+ hint = get_kernel_exc_address(regs, &gp_addr);
- if (hint != GP_NO_HINT) {
+ if (hint != EXC_NO_HINT) {
snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
- kernel_gp_hint_help[hint], gp_addr);
+ kernel_exc_hint_help[hint], gp_addr);
}
/*
* KASAN is interested only in the non-canonical case, clear it
* otherwise.
*/
- if (hint != GP_NON_CANONICAL)
+ if (hint != EXC_NON_CANONICAL)
gp_addr = 0;
die_addr(desc, regs, error_code, gp_addr);
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (11 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 12/16] x86/traps: Generalize #GP address decode and hint code Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 5:12 ` Sohil Mehta
2025-07-11 1:23 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 14/16] x86/cpu: Enable LASS during CPU initialization Kirill A. Shutemov
` (2 subsequent siblings)
15 siblings, 2 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
LASS throws a #GP for any violation except stack register accesses, in
which case it throws a #SS instead. Handle this similarly to how other
LASS violations are handled.
In the case of FRED, before handling a #SS as a LASS violation, the kernel
has to check if there's a fixup for the exception. The fixup can address a
#SS due to an invalid user context on ERETU. See 5105e7687ad3 ("x86/fred:
Fixup fault on ERETU by jumping to fred_entrypoint_user") for more details.
Co-developed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/traps.c | 41 +++++++++++++++++++++++++++++++++++------
1 file changed, 35 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index f75d6a8dcf20..0f6f187b1a9e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -418,12 +418,6 @@ DEFINE_IDTENTRY_ERRORCODE(exc_segment_not_present)
SIGBUS, 0, NULL);
}
-DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
-{
- do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
- 0, NULL);
-}
-
DEFINE_IDTENTRY_ERRORCODE(exc_alignment_check)
{
char *str = "alignment check";
@@ -869,6 +863,41 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
cond_local_irq_disable(regs);
}
+#define SSFSTR "stack segment"
+
+DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
+{
+ enum kernel_exc_hint hint;
+ unsigned long exc_addr;
+
+ if (user_mode(regs))
+ goto error_trap;
+
+ if (cpu_feature_enabled(X86_FEATURE_FRED) &&
+ fixup_exception(regs, X86_TRAP_SS, error_code, 0))
+ return;
+
+ if (!cpu_feature_enabled(X86_FEATURE_LASS))
+ goto error_trap;
+
+ if (notify_die(DIE_TRAP, SSFSTR, regs, error_code,
+ X86_TRAP_SS, SIGBUS) == NOTIFY_STOP)
+ return;
+
+ hint = get_kernel_exc_address(regs, &exc_addr);
+ if (hint != EXC_NO_HINT)
+ printk(SSFSTR ", %s 0x%lx", kernel_exc_hint_help[hint], exc_addr);
+
+ if (hint != EXC_NON_CANONICAL)
+ exc_addr = 0;
+
+ die_addr(SSFSTR, regs, error_code, exc_addr);
+ return;
+
+error_trap:
+ do_error_trap(regs, error_code, SSFSTR, X86_TRAP_SS, SIGBUS, 0, NULL);
+}
+
static bool do_int3(struct pt_regs *regs)
{
int res;
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 14/16] x86/cpu: Enable LASS during CPU initialization
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (12 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 15/16] x86/cpu: Make LAM depend on LASS Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
LASS is a security feature, so enable it by default if the platform
supports it.
While at it, get rid of the comment above the SMAP/SMEP/UMIP/LASS setup
instead of updating it to mention LASS as well, as the whole sequence is
quite self-explanatory.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f10f9f618805..382b687ce7e2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -401,6 +401,12 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
cr4_clear_bits(X86_CR4_UMIP);
}
+static __always_inline void setup_lass(struct cpuinfo_x86 *c)
+{
+ if (cpu_feature_enabled(X86_FEATURE_LASS))
+ cr4_set_bits(X86_CR4_LASS);
+}
+
/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED |
@@ -1978,10 +1984,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
- /* Set up SMEP/SMAP/UMIP */
setup_smep(c);
setup_smap(c);
setup_umip(c);
+ setup_lass(c);
/* Enable FSGSBASE instructions if available. */
if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 15/16] x86/cpu: Make LAM depend on LASS
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (13 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 14/16] x86/cpu: Enable LASS during CPU initialization Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
15 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
To prevent Spectre exploits based on LAM, as demonstrated in the SLAM
whitepaper [1], make LAM depend on LASS.
[1] https://download.vusec.net/papers/slam_sp24.pdf
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
---
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 98d0cdd82574..11bb9ed40140 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -90,6 +90,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_FRED, X86_FEATURE_LKGS },
{ X86_FEATURE_SPEC_CTRL_SSBD, X86_FEATURE_SPEC_CTRL },
{ X86_FEATURE_LASS, X86_FEATURE_SMAP },
+ { X86_FEATURE_LAM, X86_FEATURE_LASS },
{}
};
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCHv9 16/16] x86: Re-enable Linear Address Masking
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (14 preceding siblings ...)
2025-07-07 8:03 ` [PATCHv9 15/16] x86/cpu: Make LAM depend on LASS Kirill A. Shutemov
@ 2025-07-07 8:03 ` Kirill A. Shutemov
2025-07-09 5:31 ` Sohil Mehta
15 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-07 8:03 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
This reverts commit 3267cb6d3a174ff83d6287dcd5b0047bbd912452.
LASS mitigates Spectre based on LAM (SLAM) [1], and the previous commit
made LAM depend on LASS, so LAM no longer needs to be disabled at compile
time. Revert the commit that disabled it.
Adjust USER_PTR_MAX if LAM is enabled, allowing tag bits to be set in
userspace pointers. The value of the constant is defined in a way that
avoids an overflow compiler warning on 32-bit configs.
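For illustration: on a 32-bit config unsigned long is only 32 bits wide,
so spelling the constant as
	USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
would trigger -Wshift-count-overflow even though the surrounding
IS_ENABLED(CONFIG_X86_64) branch is never taken there, while
	USER_PTR_MAX = (-1UL >> 1) & PAGE_MASK;
evaluates to the same value on 64-bit and is well-defined on 32-bit as
well.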
[1] https://download.vusec.net/papers/slam_sp24.pdf
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/Kconfig | 1 -
arch/x86/kernel/cpu/common.c | 5 +----
2 files changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..2b48e916b754 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2181,7 +2181,6 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
config ADDRESS_MASKING
bool "Linear Address Masking support"
depends on X86_64
- depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS
help
Linear Address Masking (LAM) modifies the checking that is applied
to 64-bit linear addresses, allowing software to use of the
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 382b687ce7e2..7ae757498a6f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2553,11 +2553,8 @@ void __init arch_cpu_finalize_init(void)
if (IS_ENABLED(CONFIG_X86_64)) {
unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
- /*
- * Enable this when LAM is gated on LASS support
if (cpu_feature_enabled(X86_FEATURE_LAM))
- USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
- */
+ USER_PTR_MAX = (-1UL >> 1) & PAGE_MASK;
runtime_const_init(ptr, USER_PTR_MAX);
/*
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-07 8:03 ` [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
@ 2025-07-09 1:08 ` Sohil Mehta
2025-07-09 9:35 ` Kirill A. Shutemov
2025-07-09 16:58 ` Dave Hansen
2025-07-28 19:11 ` David Laight
2 siblings, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 1:08 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> From: Sohil Mehta <sohil.mehta@intel.com>
>
> For patching, the kernel initializes a temporary mm area in the lower
> half of the address range. See commit 4fc19708b165 ("x86/alternatives:
> Initialize temporary mm for patching").
>
> Disable LASS enforcement during patching to avoid triggering a #GP
> fault.
>
> The objtool warns due to a call to a non-allowed function that exists
> outside of the stac/clac guard, or references to any function with a
> dynamic function pointer inside the guard. See the Objtool warnings
> section #9 in the document tools/objtool/Documentation/objtool.txt.
>
> Considering that patching is usually small, replace the memcpy() and
> memset() functions in the text poking functions with their open coded
> versions.
>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Kirill, it might be worth adding your co-developed-by tag. The patch has
more changes than I can claim credit for.
> ---
> arch/x86/include/asm/smap.h | 33 +++++++++++++++++++++++++++++++--
> arch/x86/kernel/alternative.c | 28 ++++++++++++++++++++++++++--
> 2 files changed, 57 insertions(+), 4 deletions(-)
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-07-07 8:03 ` [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall Kirill A. Shutemov
@ 2025-07-09 1:19 ` Sohil Mehta
2025-07-09 9:38 ` Kirill A. Shutemov
2025-07-09 17:00 ` Dave Hansen
1 sibling, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 1:19 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>
> In order to map the EFI runtime services, set_virtual_address_map()
> needs to be called, which resides in the lower half of the address
> space. This means that LASS needs to be temporarily disabled around
> this call. This can only be done before the CR pinning is set up.
>
> Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
> arch_cpu_finalize_init(), defer it until core initcall.
>
> Wrapping efi_enter_virtual_mode() into lass_stac()/clac() is not enough
> because AC flag gates data accesses, but not instruction fetch. Clearing
> the CR4 bit is required.
>
I think the wording might need to be reordered. How about?
In order to map the EFI runtime services, set_virtual_address_map()
needs to be called, which resides in the lower half of the address
space. This means that LASS needs to be temporarily disabled around
this call.
Wrapping efi_enter_virtual_mode() into lass_stac()/clac() is not enough
because AC flag gates data accesses, but not instruction fetch. Clearing
the CR4 bit is required.
However, this must be done before the CR pinning is set up. Instead of
arbitrarily moving setup_cr_pinning() after efi_enter_virtual_mode() in
arch_cpu_finalize_init(), defer it until core initcall.
Other than that,
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/cpu/common.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index ec62e2f9ea16..f10f9f618805 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -490,11 +490,14 @@ void cr4_init(void)
> * parsed), record any of the sensitive CR bits that are set, and
> * enable CR pinning.
> */
> -static void __init setup_cr_pinning(void)
> +static int __init setup_cr_pinning(void)
> {
> cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask;
> static_key_enable(&cr_pinning.key);
> +
> + return 0;
> }
> +core_initcall(setup_cr_pinning);
>
> static __init int x86_nofsgsbase_setup(char *arg)
> {
> @@ -2082,7 +2085,6 @@ static __init void identify_boot_cpu(void)
> enable_sep_cpu();
> #endif
> cpu_detect_tlb(&boot_cpu_data);
> - setup_cr_pinning();
>
> tsx_init();
> tdx_init();
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 05/16] efi: Disable LASS around set_virtual_address_map() EFI call
2025-07-07 8:03 ` [PATCHv9 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
@ 2025-07-09 1:27 ` Sohil Mehta
0 siblings, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 1:27 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>
> Of all the EFI runtime services, set_virtual_address_map() is the only
> one that is called at its lower mapping, which LASS prohibits regardless
> of EFLAGS.AC setting. The only way to allow this to happen is to disable
> LASS in the CR4 register.
>
> Disable LASS around this low address EFI call.
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/platform/efi/efi.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
A minor nit below.
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 463b784499a8..5b23c0daedef 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -787,6 +787,7 @@ static void __init __efi_enter_virtual_mode(void)
> int count = 0, pg_shift = 0;
> void *new_memmap = NULL;
> efi_status_t status;
> + unsigned long lass;
> unsigned long pa;
>
The two unsigned longs can be on the same line.
> if (efi_alloc_page_tables()) {
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message
2025-07-07 8:03 ` [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
@ 2025-07-09 2:40 ` Sohil Mehta
2025-07-09 9:31 ` Kirill A. Shutemov
0 siblings, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 2:40 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>
> Provide a more helpful message on #GP when a kernel side LASS violation
> is detected.
>
> A NULL pointer dereference is reported if a LASS violation occurs due to
> accessing the first page frame.
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/traps.c | 41 +++++++++++++++++++++++++++++------------
> 1 file changed, 29 insertions(+), 12 deletions(-)
>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
A nit below.
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 59bfbdf0a1a0..4a4194e1d119 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -636,7 +636,16 @@ DEFINE_IDTENTRY(exc_bounds)
> enum kernel_gp_hint {
> GP_NO_HINT,
> GP_NON_CANONICAL,
> - GP_CANONICAL
> + GP_CANONICAL,
> + GP_LASS_VIOLATION,
> + GP_NULL_POINTER,
> +};
> +
> +static const char * const kernel_gp_hint_help[] = {
> + [GP_NON_CANONICAL] = "probably for non-canonical address",
> + [GP_CANONICAL] = "maybe for address",
> + [GP_LASS_VIOLATION] = "LASS prevented access to address",
> + [GP_NULL_POINTER] = "kernel NULL pointer dereference",
> };
>
> /*
> @@ -664,14 +673,23 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> return GP_NO_HINT;
>
> #ifdef CONFIG_X86_64
Might as well get rid of the #ifdef in C code, if possible.
	if (!IS_ENABLED(CONFIG_X86_64))
		return GP_CANONICAL;
or combine it with the next check.
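For example (untested):
	if (!IS_ENABLED(CONFIG_X86_64) || *addr >= ~__VIRTUAL_MASK)
		return GP_CANONICAL;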
> - /*
> - * Check that:
> - * - the operand is not in the kernel half
> - * - the last byte of the operand is not in the user canonical half
> - */
> - if (*addr < ~__VIRTUAL_MASK &&
> - *addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
> + /* Operand is in the kernel half */
> + if (*addr >= ~__VIRTUAL_MASK)
> + return GP_CANONICAL;
> +
> + /* The last byte of the operand is not in the user canonical half */
> + if (*addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
> return GP_NON_CANONICAL;
> +
> + /*
> + * If LASS is enabled, NULL pointer dereference generates
> + * #GP instead of #PF.
> + */
> + if (*addr < PAGE_SIZE)
> + return GP_NULL_POINTER;
> +
> + if (cpu_feature_enabled(X86_FEATURE_LASS))
> + return GP_LASS_VIOLATION;
> #endif
>
> return GP_CANONICAL;
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 12/16] x86/traps: Generalize #GP address decode and hint code
2025-07-07 8:03 ` [PATCHv9 12/16] x86/traps: Generalize #GP address decode and hint code Kirill A. Shutemov
@ 2025-07-09 4:59 ` Sohil Mehta
0 siblings, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 4:59 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> In most cases, an access causing a LASS violation results in a general
> protection exception (#GP); for stack accesses (those due to
> stack-oriented instructions, as well as accesses that implicitly or
> explicitly use the SS segment register), a stack fault (#SS) is
> generated.
>
> Handlers for #GP and #SS will now share code to decode the exception
> address and retrieve the exception hint string.
>
> The helper, enum, and array should be renamed as they are no longer
> specific to #GP.
>
> No functional change intended.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/dumpstack.c | 6 ++--
> arch/x86/kernel/traps.c | 58 ++++++++++++++++++-------------------
> 2 files changed, 32 insertions(+), 32 deletions(-)
>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS
2025-07-07 8:03 ` [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
@ 2025-07-09 5:12 ` Sohil Mehta
2025-07-09 10:38 ` Kirill A. Shutemov
2025-07-11 1:23 ` Sohil Mehta
1 sibling, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 5:12 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> + hint = get_kernel_exc_address(regs, &exc_addr);
> + if (hint != EXC_NO_HINT)
> + printk(SSFSTR ", %s 0x%lx", kernel_exc_hint_help[hint], exc_addr);
> +
> + if (hint != EXC_NON_CANONICAL)
> + exc_addr = 0;
> +
> + die_addr(SSFSTR, regs, error_code, exc_addr);
I see a slight difference between the #GP handling and the #SS handling
here. For the #GP case, we seem to pass the hint string to die_addr().
However, for #SS, the hint is printed above and only SSFSTR gets
passed on to die_addr(). I am curious about the reasoning.
> + return;
> +
> +error_trap:
> + do_error_trap(regs, error_code, SSFSTR, X86_TRAP_SS, SIGBUS, 0, NULL);
> +}
> +
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 16/16] x86: Re-enable Linear Address Masking
2025-07-07 8:03 ` [PATCHv9 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
@ 2025-07-09 5:31 ` Sohil Mehta
2025-07-09 11:00 ` Kirill A. Shutemov
0 siblings, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-07-09 5:31 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> This reverts commit 3267cb6d3a174ff83d6287dcd5b0047bbd912452.
>
> LASS mitigates the Spectre based on LAM (SLAM) [1] and the previous
> commit made LAM depend on LASS, so we no longer need to disable LAM at
> compile time, so revert the commit that disables LAM.
>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
You may have missed my comments in the previous revision.
https://lore.kernel.org/all/af709ffa-eb11-4de5-9ae8-a179cb99750c@intel.com/
Mainly, x86 maintainers prefer an imperative tone, and references such as
"previous commit" can sometimes be confusing.
> Adjust USER_PTR_MAX if LAM enabled, allowing tag bits to be set for
> userspace pointers. The value for the constant is defined in a way to
> avoid overflow compiler warning on 32-bit config.
>
> [1] https://download.vusec.net/papers/slam_sp24.pdf
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> ---
> arch/x86/Kconfig | 1 -
> arch/x86/kernel/cpu/common.c | 5 +----
> 2 files changed, 1 insertion(+), 5 deletions(-)
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message
2025-07-09 2:40 ` Sohil Mehta
@ 2025-07-09 9:31 ` Kirill A. Shutemov
2025-07-09 9:36 ` Geert Uytterhoeven
0 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-09 9:31 UTC (permalink / raw)
To: Sohil Mehta
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Tue, Jul 08, 2025 at 07:40:35PM -0700, Sohil Mehta wrote:
> > @@ -664,14 +673,23 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> > return GP_NO_HINT;
> >
> > #ifdef CONFIG_X86_64
>
> Might as well get rid of the #ifdef in C code, if possible.
>
> > 	if (!IS_ENABLED(CONFIG_X86_64))
> > 		return GP_CANONICAL;
>
> or combine it with the next check.
I tried this before. It triggers a compiler error on 32-bit:
arch/x86/kernel/traps.c:673:16: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
673 | if (*addr >= ~__VIRTUAL_MASK)
| ^~~~~~~~~~~~~~
__VIRTUAL_MASK is not usable on 32-bit configs.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-09 1:08 ` Sohil Mehta
@ 2025-07-09 9:35 ` Kirill A. Shutemov
0 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-09 9:35 UTC (permalink / raw)
To: Sohil Mehta
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Tue, Jul 08, 2025 at 06:08:18PM -0700, Sohil Mehta wrote:
> On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> > From: Sohil Mehta <sohil.mehta@intel.com>
> >
> > For patching, the kernel initializes a temporary mm area in the lower
> > half of the address range. See commit 4fc19708b165 ("x86/alternatives:
> > Initialize temporary mm for patching").
> >
> > Disable LASS enforcement during patching to avoid triggering a #GP
> > fault.
> >
> > The objtool warns due to a call to a non-allowed function that exists
> > outside of the stac/clac guard, or references to any function with a
> > dynamic function pointer inside the guard. See the Objtool warnings
> > section #9 in the document tools/objtool/Documentation/objtool.txt.
> >
> > Considering that patching is usually small, replace the memcpy() and
> > memset() functions in the text poking functions with their open coded
> > versions.
> >
> > Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> Kirill, it might be worth adding your co-developed-by tag. The patch has
> more changes than I can claim credit for.
Okay. Will do.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message
2025-07-09 9:31 ` Kirill A. Shutemov
@ 2025-07-09 9:36 ` Geert Uytterhoeven
2025-07-09 9:51 ` Kirill A. Shutemov
0 siblings, 1 reply; 47+ messages in thread
From: Geert Uytterhoeven @ 2025-07-09 9:36 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Sohil Mehta, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin,
Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
Hi Kirill,
On Wed, 9 Jul 2025 at 11:31, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> On Tue, Jul 08, 2025 at 07:40:35PM -0700, Sohil Mehta wrote:
> > > @@ -664,14 +673,23 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> > > return GP_NO_HINT;
> > >
> > > #ifdef CONFIG_X86_64
> >
> > Might as well get rid of the #ifdef in C code, if possible.
> >
> > > 	if (!IS_ENABLED(CONFIG_X86_64))
> > > 		return GP_CANONICAL;
> >
> > or combine it with the next check.
>
> I tried this before. It triggers compiler error on 32-bit:
>
> arch/x86/kernel/traps.c:673:16: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
> 673 | if (*addr >= ~__VIRTUAL_MASK)
> | ^~~~~~~~~~~~~~
>
> __VIRTUAL_MASK is not usable on 32-bit configs.
arch/x86/include/asm/page_32_types.h:#define __VIRTUAL_MASK_SHIFT 32
arch/x86/include/asm/page_32_types.h:#define __VIRTUAL_MASK_SHIFT 32
arch/x86/include/asm/page_64_types.h:#define __VIRTUAL_MASK_SHIFT	(pgtable_l5_enabled() ? 56 : 47)
arch/x86/include/asm/page_types.h:#define __VIRTUAL_MASK	((1UL << __VIRTUAL_MASK_SHIFT) - 1)
Given __VIRTUAL_MASK_SHIFT is 32 on 32-bit platforms, perhaps
__VIRTUAL_MASK should just be changed to shift 1ULL instead?
Or better, use GENMASK(__VIRTUAL_MASK_SHIFT - 1, 0), so the
resulting type is still unsigned long.
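Something like the below should do it (untested, assuming <linux/bits.h>
is usable there):
	#define __VIRTUAL_MASK	GENMASK(__VIRTUAL_MASK_SHIFT - 1, 0)
That sidesteps the shift-count-overflow when __VIRTUAL_MASK_SHIFT equals
BITS_PER_LONG, while keeping the result unsigned long.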
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-07-09 1:19 ` Sohil Mehta
@ 2025-07-09 9:38 ` Kirill A. Shutemov
0 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-09 9:38 UTC (permalink / raw)
To: Sohil Mehta
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Tue, Jul 08, 2025 at 06:19:03PM -0700, Sohil Mehta wrote:
> On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> > From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> >
> > In order to map the EFI runtime services, set_virtual_address_map()
> > needs to be called, which resides in the lower half of the address
> > space. This means that LASS needs to be temporarily disabled around
> > this call. This can only be done before the CR pinning is set up.
> >
> > Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
> > arch_cpu_finalize_init(), defer it until core initcall.
> >
> > Wrapping efi_enter_virtual_mode() into lass_stac()/clac() is not enough
> > because AC flag gates data accesses, but not instruction fetch. Clearing
> > the CR4 bit is required.
> >
>
> I think the wording might need to be reordered. How about?
>
> In order to map the EFI runtime services, set_virtual_address_map()
> needs to be called, which resides in the lower half of the address
> space. This means that LASS needs to be temporarily disabled around
> this call.
>
> Wrapping efi_enter_virtual_mode() into lass_stac()/clac() is not enough
> because AC flag gates data accesses, but not instruction fetch. Clearing
> the CR4 bit is required.
>
> However, this must be done before the CR pinning is set up. Instead of
> arbitrarily moving setup_cr_pinning() after efi_enter_virtual_mode() in
> arch_cpu_finalize_init(), defer it until core initcall.
>
> Other than that,
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Okay, looks good, thanks!
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message
2025-07-09 9:36 ` Geert Uytterhoeven
@ 2025-07-09 9:51 ` Kirill A. Shutemov
0 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-09 9:51 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Sohil Mehta, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin,
Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On Wed, Jul 09, 2025 at 11:36:27AM +0200, Geert Uytterhoeven wrote:
> Hi Kirill,
>
> On Wed, 9 Jul 2025 at 11:31, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> > On Tue, Jul 08, 2025 at 07:40:35PM -0700, Sohil Mehta wrote:
> > > > @@ -664,14 +673,23 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> > > > return GP_NO_HINT;
> > > >
> > > > #ifdef CONFIG_X86_64
> > >
> > > Might as well get rid of the #ifdef in C code, if possible.
> > >
> > > > 	if (!IS_ENABLED(CONFIG_X86_64))
> > > > 		return GP_CANONICAL;
> > >
> > > or combine it with the next check.
> >
> > I tried this before. It triggers compiler error on 32-bit:
> >
> > arch/x86/kernel/traps.c:673:16: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
> > 673 | if (*addr >= ~__VIRTUAL_MASK)
> > | ^~~~~~~~~~~~~~
> >
> > __VIRTUAL_MASK is not usable on 32-bit configs.
>
> arch/x86/include/asm/page_32_types.h:#define __VIRTUAL_MASK_SHIFT 32
> arch/x86/include/asm/page_32_types.h:#define __VIRTUAL_MASK_SHIFT 32
> arch/x86/include/asm/page_64_types.h:#define __VIRTUAL_MASK_SHIFT
> (pgtable_l5_enabled() ? 56 : 47)
> arch/x86/include/asm/page_types.h:#define __VIRTUAL_MASK
> ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
>
> Given __VIRTUAL_MASK_SHIFT is 32 on 32-bit platforms, perhaps
> __VIRTUAL_MASK should just be changed to shift 1ULL instead?
> Or better, use GENMASK(__VIRTUAL_MASK_SHIFT - 1, 0), so the
> resulting type is still unsigned long.
Making __VIRTUAL_MASK unsigned long long is a no-go. Virtual addresses are
unsigned long. I guess GENMASK() would work.
I think re-defining __VIRTUAL_MASK is out of scope for the patchset. Feel
free to prepare a separate patch to do it.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS
2025-07-09 5:12 ` Sohil Mehta
@ 2025-07-09 10:38 ` Kirill A. Shutemov
2025-07-11 1:22 ` Sohil Mehta
0 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-09 10:38 UTC (permalink / raw)
To: Sohil Mehta
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Tue, Jul 08, 2025 at 10:12:28PM -0700, Sohil Mehta wrote:
> On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
>
> > + hint = get_kernel_exc_address(regs, &exc_addr);
> > + if (hint != EXC_NO_HINT)
> > + printk(SSFSTR ", %s 0x%lx", kernel_exc_hint_help[hint], exc_addr);
> > +
> > + if (hint != EXC_NON_CANONICAL)
> > + exc_addr = 0;
> > +
> > + die_addr(SSFSTR, regs, error_code, exc_addr);
>
> I see a slight difference between the #GP handling and the #SS handling
> here. For the #GP case, we seem to pass the hint string to die_addr().
>
> However, for the #SS, the hint is printed above and only SSFSTR gets
> passed onto die_addr(). I am curious about the reasoning.
I hate how the 'desc' size is defined in the #GP handler:
	char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
Too much voodoo for my liking. And it will overflow if any hint string
ends up longer than 50 characters.
I don't want to repeat this magic for #SS.
I would argue we need to print directly in the #GP handler, as I do in #SS.
But, IMO, it is outside the scope of this patchset.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 16/16] x86: Re-enable Linear Address Masking
2025-07-09 5:31 ` Sohil Mehta
@ 2025-07-09 11:00 ` Kirill A. Shutemov
2025-07-11 0:42 ` Sohil Mehta
0 siblings, 1 reply; 47+ messages in thread
From: Kirill A. Shutemov @ 2025-07-09 11:00 UTC (permalink / raw)
To: Sohil Mehta
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Tue, Jul 08, 2025 at 10:31:05PM -0700, Sohil Mehta wrote:
> On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> > This reverts commit 3267cb6d3a174ff83d6287dcd5b0047bbd912452.
> >
> > LASS mitigates the Spectre based on LAM (SLAM) [1] and the previous
> > commit made LAM depend on LASS, so we no longer need to disable LAM at
> > compile time, so revert the commit that disables LAM.
> >
>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
>
> You may have missed my comments in the previous revision.
> https://lore.kernel.org/all/af709ffa-eb11-4de5-9ae8-a179cb99750c@intel.com/
>
> Mainly, x86 maintainers prefer imperative tone and references such as
> "previous commit" can be confusing sometimes.
Indeed, missed. My bad.
I've merged the last two patches and updated the commit message:
https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git/commit/?h=x86/lass
I hope it is still okay to use your Reviewed-by tag.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-07 8:03 ` [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
2025-07-09 1:08 ` Sohil Mehta
@ 2025-07-09 16:58 ` Dave Hansen
2025-07-25 2:35 ` Sohil Mehta
2025-07-28 19:11 ` David Laight
2 siblings, 1 reply; 47+ messages in thread
From: Dave Hansen @ 2025-07-09 16:58 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/25 01:03, Kirill A. Shutemov wrote:
> From: Sohil Mehta <sohil.mehta@intel.com>
>
> For patching, the kernel initializes a temporary mm area in the lower
> half of the address range. See commit 4fc19708b165 ("x86/alternatives:
> Initialize temporary mm for patching").
>
> Disable LASS enforcement during patching to avoid triggering a #GP
> fault.
>
> The objtool warns due to a call to a non-allowed function that exists
> outside of the stac/clac guard, or references to any function with a
> dynamic function pointer inside the guard. See the Objtool warnings
> section #9 in the document tools/objtool/Documentation/objtool.txt.
>
> Considering that patching is usually small, replace the memcpy() and
> memset() functions in the text poking functions with their open coded
> versions.
>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/include/asm/smap.h | 33 +++++++++++++++++++++++++++++++--
> arch/x86/kernel/alternative.c | 28 ++++++++++++++++++++++++++--
> 2 files changed, 57 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index 4f84d421d1cf..d0cc24348641 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -23,18 +23,47 @@
>
> #else /* __ASSEMBLER__ */
>
> +/*
> + * The CLAC/STAC instructions toggle the enforcement of X86_FEATURE_SMAP and
> + * X86_FEATURE_LASS.
> + *
> + * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page tables: the
> + * kernel is not allowed to touch pages with the bit set unless the AC bit is
> + * set.
> + *
> + * LASS enforcement is based on bit 63 of the virtual address. The kernel is
> + * not allowed to touch memory in the lower half of the virtual address space
> + * unless the AC bit is set.
> + *
> + * Use stac()/clac() when accessing userspace (_PAGE_USER) mappings,
> + * regardless of location.
> + *
> + * Use lass_stac()/lass_clac() when accessing kernel mappings (!_PAGE_USER)
> + * in the lower half of the address space.
> + *
> + * Note: a barrier is implicit in alternative().
> + */
> +
> static __always_inline void clac(void)
> {
> -	/* Note: a barrier is implicit in alternative() */
> 	alternative("", "clac", X86_FEATURE_SMAP);
> }
>
> static __always_inline void stac(void)
> {
> -	/* Note: a barrier is implicit in alternative() */
> 	alternative("", "stac", X86_FEATURE_SMAP);
> }
>
> +static __always_inline void lass_clac(void)
> +{
> +	alternative("", "clac", X86_FEATURE_LASS);
> +}
> +
> +static __always_inline void lass_stac(void)
> +{
> +	alternative("", "stac", X86_FEATURE_LASS);
> +}
Could we please move the comments about lass_*() closer to the LASS
functions?
> static __always_inline unsigned long smap_save(void)
> {
> 	unsigned long flags;
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index ea1d984166cd..992ece0e879a 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -2447,16 +2447,40 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
> __ro_after_init struct mm_struct *text_poke_mm;
> __ro_after_init unsigned long text_poke_mm_addr;
>
> +/*
> + * Text poking creates and uses a mapping in the lower half of the
> + * address space. Relax LASS enforcement when accessing the poking
> + * address.
> + */
> +
> static void text_poke_memcpy(void *dst, const void *src, size_t len)
> {
> -	memcpy(dst, src, len);
> +	lass_stac();
> +
> +	/*
> +	 * Objtool is picky about what occurs within the STAC/CLAC region
> +	 * because this code runs with protection disabled. Objtool typically
> +	 * does not permit function calls in this area.
> +	 *
> +	 * Avoid using memcpy() here. Instead, open code it.
> +	 */
> +	asm volatile("rep movsb"
> +		     : "+D" (dst), "+S" (src), "+c" (len) : : "memory");
> +
> +	lass_clac();
> }
This didn't turn out great. At the _very_ least, we could have a:
inline_memcpy_i_really_mean_it()
with the rep mov. Or even a #define if we were super paranoid the
compiler is out to get us.
But _actually_ open-coding inline assembly is far too ugly to live.
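For illustration, a minimal sketch of the #define variant alluded to above.
The name inline_memcpy_i_really_mean_it() is only a placeholder from this
discussion, and the macro body is an assumption rather than code posted in
the series; it simply wraps the same rep movsb used in text_poke_memcpy():

#include <linux/types.h>

/*
 * Hypothetical sketch: a macro cannot be lowered to an out-of-line
 * memcpy() call by the compiler, which is the "paranoid" property
 * being discussed here.
 */
#define inline_memcpy_i_really_mean_it(dst, src, len)			\
do {									\
	void *__d = (dst);						\
	const void *__s = (src);					\
	size_t __l = (len);						\
	asm volatile("rep movsb"					\
		     : "+D" (__d), "+S" (__s), "+c" (__l)		\
		     : : "memory");					\
} while (0)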
We can also be a bit more compact about the comments:
/*
* objtool enforces a strict policy of "no function calls within
* AC=1 regions". Adhere to the policy by doing a memcpy() that
* will never result in a function call.
*/
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-07-07 8:03 ` [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall Kirill A. Shutemov
2025-07-09 1:19 ` Sohil Mehta
@ 2025-07-09 17:00 ` Dave Hansen
2025-07-31 23:45 ` Sohil Mehta
1 sibling, 1 reply; 47+ messages in thread
From: Dave Hansen @ 2025-07-09 17:00 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/25 01:03, Kirill A. Shutemov wrote:
> Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
> arch_cpu_finalize_init(), defer it until core initcall.
What are the side effects of this move? Are there other benefits? What
are the risks?
BTW, ideally, you'd get an ack from one of the folks who put the CR
pinning in the kernel in the first place to make sure this isn't
breaking the mechanism in any important ways.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 16/16] x86: Re-enable Linear Address Masking
2025-07-09 11:00 ` Kirill A. Shutemov
@ 2025-07-11 0:42 ` Sohil Mehta
0 siblings, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-07-11 0:42 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On 7/9/2025 4:00 AM, Kirill A. Shutemov wrote:
> I've merged last two patches and updated the commit message:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git/commit/?h=x86/lass
>
> I hope it is still okay to use your Reviewed-by tag.
>
Yes, that should be fine.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS
2025-07-09 10:38 ` Kirill A. Shutemov
@ 2025-07-11 1:22 ` Sohil Mehta
0 siblings, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-07-11 1:22 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On 7/9/2025 3:38 AM, Kirill A. Shutemov wrote:
> On Tue, Jul 08, 2025 at 10:12:28PM -0700, Sohil Mehta wrote:
>> On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
>>
>>> +	hint = get_kernel_exc_address(regs, &exc_addr);
>>> +	if (hint != EXC_NO_HINT)
>>> +		printk(SSFSTR ", %s 0x%lx", kernel_exc_hint_help[hint], exc_addr);
>>> +
>>> +	if (hint != EXC_NON_CANONICAL)
>>> +		exc_addr = 0;
>>> +
>>> +	die_addr(SSFSTR, regs, error_code, exc_addr);
>>
>> I see a slight difference between the #GP handling and the #SS handling
>> here. For the #GP case, we seem to pass the hint string to die_addr().
>>
>> However, for the #SS, the hint is printed above and only SSFSTR gets
>> passed onto die_addr(). I am curious about the reasoning.
>
> I hate how 'desc' size is defined in #GP handler:
>
> char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
>
> Too much voodoo for my liking. And it will overflow if any hint string
> ends up longer than 50 characters.
>
> I don't want to repeat this magic for #SS.
>
Thanks, that makes sense.
> I would argue we need to print directly in #GP handler as I do in #SS.
> But, IMO, it is outside of the scope of this patchset.
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS
2025-07-07 8:03 ` [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
2025-07-09 5:12 ` Sohil Mehta
@ 2025-07-11 1:23 ` Sohil Mehta
1 sibling, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-07-11 1:23 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 7/7/2025 1:03 AM, Kirill A. Shutemov wrote:
> LASS throws a #GP for any violation except stack register accesses, in
> which case it throws a #SS instead. Handle this similarly to how other
> LASS violations are handled.
>
> In the case of FRED, before handling #SS as a LASS violation, the kernel
> has to check whether there is a fixup for the exception. The fixup handles
> #SS due to an invalid user context on ERETU. See 5105e7687ad3 ("x86/fred:
> Fixup fault on ERETU by jumping to fred_entrypoint_user") for more details.
>
> Co-developed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/traps.c | 41 +++++++++++++++++++++++++++++++++++------
> 1 file changed, 35 insertions(+), 6 deletions(-)
>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-09 16:58 ` Dave Hansen
@ 2025-07-25 2:35 ` Sohil Mehta
0 siblings, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-07-25 2:35 UTC (permalink / raw)
To: Dave Hansen, Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Peter Zijlstra, Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf,
Xiongwei Song, Xin Li, Mike Rapoport (IBM), Brijesh Singh,
Michael Roth, Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
[-- Attachment #1: Type: text/plain, Size: 783 bytes --]
On 7/9/2025 9:58 AM, Dave Hansen wrote:
>> +	 * Avoid using memcpy() here. Instead, open code it.
>> +	 */
>> +	asm volatile("rep movsb"
>> +		     : "+D" (dst), "+S" (src), "+c" (len) : : "memory");
>> +
>> +	lass_clac();
>> }
>
> This didn't turn out great. At the _very_ least, we could have a:
>
> inline_memcpy_i_really_mean_it()
>
It looks like we should go back to the __inline_memcpy()/__inline_memset()
implementation that PeterZ had initially proposed. It seems to fit all
the requirements, right? Patch attached.
https://lore.kernel.org/lkml/20241028160917.1380714-3-alexander.shishkin@linux.intel.com/
> with the rep mov. Or even a #define if we were super paranoid the
> compiler is out to get us.
>
> But _actually_ open-coding inline assembly is far too ugly to live.
>
[-- Attachment #2: x86-asm-Introduce-inline-memcpy-and-memset.patch --]
[-- Type: text/plain, Size: 1419 bytes --]
From eb3b45b377df90d3b367e2b3fddfff1a72624a4e Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon, 28 Oct 2024 18:07:50 +0200
Subject: [PATCH] x86/asm: Introduce inline memcpy and memset
Provide inline memcpy and memset functions that can be used instead of
the GCC builtins whenever necessary.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
arch/x86/include/asm/string.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
index c3c2c1914d65..9cb5aae7fba9 100644
--- a/arch/x86/include/asm/string.h
+++ b/arch/x86/include/asm/string.h
@@ -1,6 +1,32 @@
/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_STRING_H
+#define _ASM_X86_STRING_H
+
#ifdef CONFIG_X86_32
# include <asm/string_32.h>
#else
# include <asm/string_64.h>
#endif
+
+static __always_inline void *__inline_memcpy(void *to, const void *from, size_t len)
+{
+	void *ret = to;
+
+	asm volatile("rep movsb"
+		     : "+D" (to), "+S" (from), "+c" (len)
+		     : : "memory");
+	return ret;
+}
+
+static __always_inline void *__inline_memset(void *s, int v, size_t n)
+{
+	void *ret = s;
+
+	asm volatile("rep stosb"
+		     : "+D" (s), "+c" (n)
+		     : "a" ((uint8_t)v)
+		     : "memory");
+	return ret;
+}
+
+#endif /* _ASM_X86_STRING_H */
--
2.43.0
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-07 8:03 ` [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
2025-07-09 1:08 ` Sohil Mehta
2025-07-09 16:58 ` Dave Hansen
@ 2025-07-28 19:11 ` David Laight
2025-07-28 19:28 ` H. Peter Anvin
2 siblings, 1 reply; 47+ messages in thread
From: David Laight @ 2025-07-28 19:11 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Mon, 7 Jul 2025 11:03:02 +0300
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> From: Sohil Mehta <sohil.mehta@intel.com>
>
> For patching, the kernel initializes a temporary mm area in the lower
> half of the address range. See commit 4fc19708b165 ("x86/alternatives:
> Initialize temporary mm for patching").
>
> Disable LASS enforcement during patching to avoid triggering a #GP
> fault.
>
> The objtool warns due to a call to a non-allowed function that exists
> outside of the stac/clac guard, or references to any function with a
> dynamic function pointer inside the guard. See the Objtool warnings
> section #9 in the document tools/objtool/Documentation/objtool.txt.
>
> Considering that patching is usually small, replace the memcpy() and
> memset() functions in the text poking functions with their open coded
> versions.
...
Or just write a byte copy loop in C with (eg) barrier() inside it
to stop gcc converting it to memcpy().
David
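As a rough, hypothetical sketch of that suggestion (the helper name is made
up here, and this is not code from the series; barrier() is the kernel's
compiler barrier from <linux/compiler.h>):

#include <linux/compiler.h>
#include <linux/types.h>

static __always_inline void byte_copy(void *dst, const void *src, size_t len)
{
	char *d = dst;
	const char *s = src;

	while (len--) {
		*d++ = *s++;
		/* Keep the compiler from recognising the loop as memcpy(). */
		barrier();
	}
}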
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-28 19:11 ` David Laight
@ 2025-07-28 19:28 ` H. Peter Anvin
2025-07-28 19:38 ` David Laight
0 siblings, 1 reply; 47+ messages in thread
From: H. Peter Anvin @ 2025-07-28 19:28 UTC (permalink / raw)
To: David Laight, Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On July 28, 2025 12:11:37 PM PDT, David Laight <david.laight.linux@gmail.com> wrote:
>On Mon, 7 Jul 2025 11:03:02 +0300
>"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
>
>> From: Sohil Mehta <sohil.mehta@intel.com>
>>
>> For patching, the kernel initializes a temporary mm area in the lower
>> half of the address range. See commit 4fc19708b165 ("x86/alternatives:
>> Initialize temporary mm for patching").
>>
>> Disable LASS enforcement during patching to avoid triggering a #GP
>> fault.
>>
>> The objtool warns due to a call to a non-allowed function that exists
>> outside of the stac/clac guard, or references to any function with a
>> dynamic function pointer inside the guard. See the Objtool warnings
>> section #9 in the document tools/objtool/Documentation/objtool.txt.
>>
>> Considering that patching is usually small, replace the memcpy() and
>> memset() functions in the text poking functions with their open coded
>> versions.
>...
>
>Or just write a byte copy loop in C with (eg) barrier() inside it
>to stop gcc converting it to memcpy().
>
> David
Great. It's rep movsb without any of the performance.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-28 19:28 ` H. Peter Anvin
@ 2025-07-28 19:38 ` David Laight
2025-08-01 0:15 ` Sohil Mehta
0 siblings, 1 reply; 47+ messages in thread
From: David Laight @ 2025-07-28 19:38 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Mon, 28 Jul 2025 12:28:33 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:
> On July 28, 2025 12:11:37 PM PDT, David Laight <david.laight.linux@gmail.com> wrote:
> >On Mon, 7 Jul 2025 11:03:02 +0300
> >"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> >
> >> From: Sohil Mehta <sohil.mehta@intel.com>
> >>
> >> For patching, the kernel initializes a temporary mm area in the lower
> >> half of the address range. See commit 4fc19708b165 ("x86/alternatives:
> >> Initialize temporary mm for patching").
> >>
> >> Disable LASS enforcement during patching to avoid triggering a #GP
> >> fault.
> >>
> >> The objtool warns due to a call to a non-allowed function that exists
> >> outside of the stac/clac guard, or references to any function with a
> >> dynamic function pointer inside the guard. See the Objtool warnings
> >> section #9 in the document tools/objtool/Documentation/objtool.txt.
> >>
> >> Considering that patching is usually small, replace the memcpy() and
> >> memset() functions in the text poking functions with their open coded
> >> versions.
> >...
> >
> >Or just write a byte copy loop in C with (eg) barrier() inside it
> >to stop gcc converting it to memcpy().
> >
> > David
>
> Great. It's rep movsb without any of the performance.
And without the massive setup overhead that dominates short copies.
Given the rest of the code I'm sure a byte copy loop won't make
any difference to the overall performance.
David
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-07-09 17:00 ` Dave Hansen
@ 2025-07-31 23:45 ` Sohil Mehta
2025-08-01 0:01 ` Dave Hansen
0 siblings, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-07-31 23:45 UTC (permalink / raw)
To: Dave Hansen, Thomas Gleixner, Dave Hansen, Kees Cook
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Eric Biggers, Jason Gunthorpe, Masami Hiramatsu (Google),
Andrew Morton, Luis Chamberlain, Yuntao Wang, Rasmus Villemoes,
Christophe Leroy, Tejun Heo, Changbin Du, Huang Shijie,
Geert Uytterhoeven, Namhyung Kim, Arnaldo Carvalho de Melo,
linux-doc, linux-kernel, linux-efi, linux-mm, Kirill A. Shutemov,
Kirill A. Shutemov, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel, Paul E. McKenney,
Josh Poimboeuf, Xiongwei Song, Xin Li, Mike Rapoport (IBM),
Brijesh Singh, Michael Roth, Tony Luck, Alexey Kardashevskiy,
Alexander Shishkin, X86-kernel
On 7/9/2025 10:00 AM, Dave Hansen wrote:
> On 7/7/25 01:03, Kirill A. Shutemov wrote:
>> Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
>> arch_cpu_finalize_init(), defer it until core initcall.
>
> What are the side effects of this move? Are there other benefits? What
> are the risks?
>
Picking this up from Kirill. Reevaluating this, core_initcall() seems
too late for setup_cr_pinning().
We need to have CR pinning completed, and the associated static key
enabled before AP bring up. start_secondary()->cr4_init() depends on the
cr_pinning static key to initialize CR4 for APs.
To find the optimal location for CR pinning, here are the constraints:
1) The initialization of all the CPU-specific security features such as
SMAP, SMEP, UMIP and LASS must be done.
2) Since EFI needs to toggle CR4.LASS, EFI initialization must be completed.
3) Since APs depend on the BSP for CR initialization, CR pinning should
happen before AP bringup.
4) CR pinning should happen before userspace comes up, since that's what
we are protecting against.
I shortlisted two locations, arch_cpu_finalize_init() and early_initcall().
a) start_kernel()
arch_cpu_finalize_init()
arch_cpu_finalize_init() seems like the logical fit, since CR pinning
can be considered as "finalizing" the security-sensitive control
registers. Doing it at the conclusion of CPU initialization makes sense.
b) start_kernel()
rest_init()
kernel_init()
kernel_init_freeable()
do_pre_smp_initcalls()
early_initcall()
We could push the pinning out to early_initcall(), since that happens just
before SMP bringup as well as before the init process is executed. But I
don't see any benefit to doing that.
Most of the stuff between arch_cpu_finalize_init() and rest_init() seems
to be arch agnostic (except maybe ACPI). Though the likelihood of
anything touching the pinned bits is low, it would be better to have the
bits pinned and complain if someone modifies them.
I am inclined towards arch_cpu_finalize_init() but I don't have a strong
preference. Dave, is there any other aspect I should consider?
> BTW, ideally, you'd get an ack from one of the folks who put the CR
> pinning in the kernel in the first place to make sure this isn't
> breaking the mechanism in any important ways.
Kees, do you have any preference or suggestions?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-07-31 23:45 ` Sohil Mehta
@ 2025-08-01 0:01 ` Dave Hansen
2025-08-01 4:43 ` Sohil Mehta
2025-08-02 18:51 ` Kees Cook
0 siblings, 2 replies; 47+ messages in thread
From: Dave Hansen @ 2025-08-01 0:01 UTC (permalink / raw)
To: Sohil Mehta, Thomas Gleixner, Dave Hansen, Kees Cook
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Eric Biggers, Jason Gunthorpe, Masami Hiramatsu (Google),
Andrew Morton, Luis Chamberlain, Yuntao Wang, Rasmus Villemoes,
Christophe Leroy, Tejun Heo, Changbin Du, Huang Shijie,
Geert Uytterhoeven, Namhyung Kim, Arnaldo Carvalho de Melo,
linux-doc, linux-kernel, linux-efi, linux-mm, Kirill A. Shutemov,
Kirill A. Shutemov, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel, Paul E. McKenney,
Josh Poimboeuf, Xiongwei Song, Xin Li, Mike Rapoport (IBM),
Brijesh Singh, Michael Roth, Tony Luck, Alexey Kardashevskiy,
Alexander Shishkin, X86-kernel
On 7/31/25 16:45, Sohil Mehta wrote:
> On 7/9/2025 10:00 AM, Dave Hansen wrote:
>> On 7/7/25 01:03, Kirill A. Shutemov wrote:
>>> Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
>>> arch_cpu_finalize_init(), defer it until core initcall.
>> What are the side effects of this move? Are there other benefits? What
>> are the risks?
>>
> Picking this up from Kirill.. Reevaluating this, core_initcall() seems
> too late for setup_cr_pinning().
>
> We need to have CR pinning completed, and the associated static key
> enabled before AP bring up. start_secondary()->cr4_init() depends on the
> cr_pinning static key to initialize CR4 for APs.
Sure, if you leave cr4_init() completely as-is.
'cr4_pinned_bits' should be set by the boot CPU. Secondary CPUs should
also read 'cr4_pinned_bits' when setting up their own cr4's,
unconditionally, independent of 'cr_pinning'.
The thing I think we should change is the pinning _enforcement_. The
easiest way to do that is to remove the static_branch_likely() in
cr4_init() and then delay flipping the static branch until just before
userspace starts.
Basically, split up the:
static void __init setup_cr_pinning(void)
{
	cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask;
	static_key_enable(&cr_pinning.key);
}
code into its two logical pieces:
1. Populate 'cr4_pinned_bits' from the boot CPU so the secondaries can
use it
2. Enable the static key so pinning enforcement is enabled
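A rough sketch of that split might look as follows. The symbols used come
from the setup_cr_pinning() snippet above; the new function names and the
late_initcall() placement are assumptions made for illustration, not part
of the posted patches:

/* 1. Record the boot CPU's CR4 so secondary CPUs can mirror it. */
static void __init cr_pinning_record_bits(void)
{
	cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask;
}

/* 2. Turn on enforcement only once the kernel is about to run userspace. */
static int __init cr_pinning_enable_enforcement(void)
{
	static_key_enable(&cr_pinning.key);
	return 0;
}
late_initcall(cr_pinning_enable_enforcement);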
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-07-28 19:38 ` David Laight
@ 2025-08-01 0:15 ` Sohil Mehta
0 siblings, 0 replies; 47+ messages in thread
From: Sohil Mehta @ 2025-08-01 0:15 UTC (permalink / raw)
To: David Laight, H. Peter Anvin, Dave Hansen
Cc: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, x86, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On 7/28/2025 12:38 PM, David Laight wrote:
>>> ...
>>>
>>> Or just write a byte copy loop in C with (eg) barrier() inside it
>>> to stop gcc converting it to memcpy().
>>>
>>> David
>>
>> Great. It's rep movsb without any of the performance.
>
> And without the massive setup overhead that dominates short copies.
> Given the rest of the code I'm sure a byte copy loop won't make
> any difference to the overall performance.
>
Wouldn't it be better to introduce a generic mechanism rather than
something customized for this scenario?
PeterZ had suggested that inline memcpy could have more usages:
https://lore.kernel.org/lkml/20241029113611.GS14555@noisy.programming.kicks-ass.net/
Is there a concern that the inline versions might get optimized into
standard memcpy/memset calls by GCC? Wouldn't the volatile keyword
prevent that?
static __always_inline void *__inline_memcpy(void *to, const void *from,
					     size_t len)
{
	void *ret = to;

	asm volatile("rep movsb"
		     : "+D" (to), "+S" (from), "+c" (len)
		     : : "memory");
	return ret;
}

static __always_inline void *__inline_memset(void *s, int v, size_t n)
{
	void *ret = s;

	asm volatile("rep stosb"
		     : "+D" (s), "+c" (n)
		     : "a" ((uint8_t)v)
		     : "memory");
	return ret;
}
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-08-01 0:01 ` Dave Hansen
@ 2025-08-01 4:43 ` Sohil Mehta
2025-08-01 14:22 ` Dave Hansen
2025-08-02 18:51 ` Kees Cook
1 sibling, 1 reply; 47+ messages in thread
From: Sohil Mehta @ 2025-08-01 4:43 UTC (permalink / raw)
To: Dave Hansen, Thomas Gleixner, Dave Hansen, Kees Cook
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Eric Biggers, Jason Gunthorpe, Masami Hiramatsu (Google),
Andrew Morton, Luis Chamberlain, Yuntao Wang, Rasmus Villemoes,
Christophe Leroy, Tejun Heo, Changbin Du, Huang Shijie,
Geert Uytterhoeven, Namhyung Kim, Arnaldo Carvalho de Melo,
linux-doc, linux-kernel, linux-efi, linux-mm, Kirill A. Shutemov,
Kirill A. Shutemov, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel, Paul E. McKenney,
Josh Poimboeuf, Xiongwei Song, Xin Li, Mike Rapoport (IBM),
Brijesh Singh, Michael Roth, Tony Luck, Alexey Kardashevskiy,
Alexander Shishkin, X86-kernel
On 7/31/2025 5:01 PM, Dave Hansen wrote:
>
> 'cr4_pinned_bits' should be set by the boot CPU. Secondary CPUs should
> also read 'cr4_pinned_bits' when setting up their own cr4's,
> unconditionally, independent of 'cr_pinning'.
>
> The thing I think we should change is the pinning _enforcement_. The
> easiest way to do that is to remove the static_branch_likely() in
> cr4_init() and then delay flipping the static branch until just before
> userspace starts.
>
Based on the current implementation and some git history [1], it seems
that cr_pinning is expected to catch two types of issues.
One is a malicious user trying to change the CR registers from
userspace. The other is a regular caller of native_write_cr4()
*mistakenly* clearing a pinned bit.
[1]:
https://lore.kernel.org/all/alpine.DEB.2.21.1906141646320.1722@nanos.tec.linutronix.de/
Could deferring enforcement lead to a scenario where we end up with
different CR4 values on different CPUs? Maybe I am misinterpreting this
and protecting against in-kernel errors is not a goal.
In general, you want to delay the CR pinning enforcement until
absolutely needed. I am curious about the motivation. I understand we
should avoid doing it at arbitrary points in the boot. But,
arch_cpu_finalize_init() and early_initcall() seem to be decent
mileposts to me.
Are you anticipating that we would need to move setup_cr_pinning() again
when another user similar to EFI shows up?
> Basically, split up the:
>
> static void __init setup_cr_pinning(void)
> {
> 	cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask;
> 	static_key_enable(&cr_pinning.key);
> }
>
> code into its two logical pieces:
>
> 1. Populate 'cr4_pinned_bits' from the boot CPU so the secondaries can
> use it
> 2. Enable the static key so pinning enforcement is enabled
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-08-01 4:43 ` Sohil Mehta
@ 2025-08-01 14:22 ` Dave Hansen
0 siblings, 0 replies; 47+ messages in thread
From: Dave Hansen @ 2025-08-01 14:22 UTC (permalink / raw)
To: Sohil Mehta, Thomas Gleixner, Dave Hansen, Kees Cook
Cc: Jonathan Corbet, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
Alexei Starovoitov, Hou Tao, Juergen Gross, Vegard Nossum,
Eric Biggers, Jason Gunthorpe, Masami Hiramatsu (Google),
Andrew Morton, Luis Chamberlain, Yuntao Wang, Rasmus Villemoes,
Christophe Leroy, Tejun Heo, Changbin Du, Huang Shijie,
Geert Uytterhoeven, Namhyung Kim, Arnaldo Carvalho de Melo,
linux-doc, linux-kernel, linux-efi, linux-mm, Kirill A. Shutemov,
Kirill A. Shutemov, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel, Paul E. McKenney,
Josh Poimboeuf, Xiongwei Song, Xin Li, Mike Rapoport (IBM),
Brijesh Singh, Michael Roth, Tony Luck, Alexey Kardashevskiy,
Alexander Shishkin, X86-kernel
On 7/31/25 21:43, Sohil Mehta wrote:
...
> Could deferring enforcement lead to a scenario where we end up with
> different CR4 values on different CPUs? Maybe I am misinterpreting this
> and protecting against in-kernel errors is not a goal.
Sure, theoretically.
But if that's a concern, it can be checked at the time that enforcement
starts:
	for_each_online_cpu(cpu) {
		unsigned long cr4 = per_cpu(cpu_tlbstate.cr4, cpu);

		if ((cr4 & cr4_pinned_mask) == cr4_pinned_bits)
			continue;

		WARN(1, "blah blah");
	}
Or use smp_call_function() to check each CPU's CR4 directly.
Or, the next time that CPU does a TLB flush that toggles X86_CR4_PGE,
it'll get the warning from the regular pinning path.
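For example, the cross-CPU check mentioned above could look roughly like
this. The helper names are hypothetical, and on_each_cpu() is used instead
of smp_call_function() so the calling CPU is covered as well:

static void check_cr4_pinning(void *unused)
{
	unsigned long cr4 = native_read_cr4();

	WARN_ONCE((cr4 & cr4_pinned_mask) != cr4_pinned_bits,
		  "CPU %d lost pinned CR4 bits: 0x%lx\n",
		  smp_processor_id(), cr4);
}

static void verify_cr4_pinning_on_all_cpus(void)
{
	/* Runs the callback on every online CPU and waits for completion. */
	on_each_cpu(check_cr4_pinning, NULL, 1);
}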
So, sure, this does widen the window during boot where a secondary CPU
might get a bad CR4 value, and it would make it harder to track down
where it happened. We _could_ print a pr_debug() message when the bit
gets cleared but not enforce things if anyone is super worried about this.
> In general, you want to delay the CR pinning enforcement until
> absolutely needed. I am curious about the motivation. I understand we
> should avoid doing it at arbitrary points in the boot. But,
> arch_cpu_finalize_init() and early_initcall() seem to be decent
> mileposts to me.
>
> Are you anticipating that we would need to move setup_cr_pinning() again
> when another user similar to EFI shows up?
Yep.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-08-01 0:01 ` Dave Hansen
2025-08-01 4:43 ` Sohil Mehta
@ 2025-08-02 18:51 ` Kees Cook
2025-08-04 6:55 ` H. Peter Anvin
1 sibling, 1 reply; 47+ messages in thread
From: Kees Cook @ 2025-08-02 18:51 UTC (permalink / raw)
To: Dave Hansen
Cc: Sohil Mehta, Thomas Gleixner, Dave Hansen, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov, Kirill A. Shutemov, Andy Lutomirski,
Ingo Molnar, Borislav Petkov, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin, X86-kernel
On Thu, Jul 31, 2025 at 05:01:37PM -0700, Dave Hansen wrote:
> On 7/31/25 16:45, Sohil Mehta wrote:
> > On 7/9/2025 10:00 AM, Dave Hansen wrote:
> >> On 7/7/25 01:03, Kirill A. Shutemov wrote:
> >>> Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
> >>> arch_cpu_finalize_init(), defer it until core initcall.
> >> What are the side effects of this move? Are there other benefits? What
> >> are the risks?
> >>
> > Picking this up from Kirill.. Reevaluating this, core_initcall() seems
> > too late for setup_cr_pinning().
> >
> > We need to have CR pinning completed, and the associated static key
> > enabled before AP bring up. start_secondary()->cr4_init() depends on the
> > cr_pinning static key to initialize CR4 for APs.
>
> Sure, if you leave cr4_init() completely as-is.
>
> 'cr4_pinned_bits' should be set by the boot CPU. Secondary CPUs should
> also read 'cr4_pinned_bits' when setting up their own cr4's,
> unconditionally, independent of 'cr_pinning'.
>
> The thing I think we should change is the pinning _enforcement_. The
> easiest way to do that is to remove the static_branch_likely() in
> cr4_init() and then delay flipping the static branch until just before
> userspace starts.
Yeah, this is fine from my perspective. The goal with the pinning was
to keep things safe in the face of an attack from userspace that
managed to get at MSR values, and to keep them from being trivially
changed.
--
Kees Cook
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall
2025-08-02 18:51 ` Kees Cook
@ 2025-08-04 6:55 ` H. Peter Anvin
0 siblings, 0 replies; 47+ messages in thread
From: H. Peter Anvin @ 2025-08-04 6:55 UTC (permalink / raw)
To: Kees Cook, Dave Hansen
Cc: Sohil Mehta, Thomas Gleixner, Dave Hansen, Jonathan Corbet,
Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov, Kirill A. Shutemov, Andy Lutomirski,
Ingo Molnar, Borislav Petkov, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, X86-kernel
On August 2, 2025 11:51:28 AM PDT, Kees Cook <kees@kernel.org> wrote:
>On Thu, Jul 31, 2025 at 05:01:37PM -0700, Dave Hansen wrote:
>> On 7/31/25 16:45, Sohil Mehta wrote:
>> > On 7/9/2025 10:00 AM, Dave Hansen wrote:
>> >> On 7/7/25 01:03, Kirill A. Shutemov wrote:
>> >>> Instead of moving setup_cr_pinning() below efi_enter_virtual_mode() in
>> >>> arch_cpu_finalize_init(), defer it until core initcall.
>> >> What are the side effects of this move? Are there other benefits? What
>> >> are the risks?
>> >>
>> > Picking this up from Kirill.. Reevaluating this, core_initcall() seems
>> > too late for setup_cr_pinning().
>> >
>> > We need to have CR pinning completed, and the associated static key
>> > enabled before AP bring up. start_secondary()->cr4_init() depends on the
>> > cr_pinning static key to initialize CR4 for APs.
>>
>> Sure, if you leave cr4_init() completely as-is.
>>
>> 'cr4_pinned_bits' should be set by the boot CPU. Secondary CPUs should
>> also read 'cr4_pinned_bits' when setting up their own cr4's,
>> unconditionally, independent of 'cr_pinning'.
>>
>> The thing I think we should change is the pinning _enforcement_. The
>> easiest way to do that is to remove the static_branch_likely() in
>> cr4_init() and then delay flipping the static branch until just before
>> userspace starts.
>
>Yeah, this is fine from my perspective. The goal with the pinning was
>about keeping things safe in the face of an attack from userspace that
>managed to get at MSR values and keeping them from being trivially
>changed.
>
I have mentioned this before: I would like to see CR4-pinning use a patchable immediate to make it harder to manipulate. If the mask is final when alternatives are run, that would be a good time to install it; the code can just contain a zero immediate (no pinning) or a very limited set of bits that must never be changed at all up to that point.
^ permalink raw reply [flat|nested] 47+ messages in thread
end of thread
Thread overview: 47+ messages
2025-07-07 8:03 [PATCHv9 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
2025-07-09 1:08 ` Sohil Mehta
2025-07-09 9:35 ` Kirill A. Shutemov
2025-07-09 16:58 ` Dave Hansen
2025-07-25 2:35 ` Sohil Mehta
2025-07-28 19:11 ` David Laight
2025-07-28 19:28 ` H. Peter Anvin
2025-07-28 19:38 ` David Laight
2025-08-01 0:15 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 03/16] x86/cpu: Set LASS CR4 bit as pinning sensitive Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 04/16] x86/cpu: Defer CR pinning setup until core initcall Kirill A. Shutemov
2025-07-09 1:19 ` Sohil Mehta
2025-07-09 9:38 ` Kirill A. Shutemov
2025-07-09 17:00 ` Dave Hansen
2025-07-31 23:45 ` Sohil Mehta
2025-08-01 0:01 ` Dave Hansen
2025-08-01 4:43 ` Sohil Mehta
2025-08-01 14:22 ` Dave Hansen
2025-08-02 18:51 ` Kees Cook
2025-08-04 6:55 ` H. Peter Anvin
2025-07-07 8:03 ` [PATCHv9 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
2025-07-09 1:27 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 07/16] x86/vsyscall: Reorganize the #PF emulation code Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 08/16] x86/traps: Consolidate user fixups in exc_general_protection() Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 09/16] x86/vsyscall: Add vsyscall emulation for #GP Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 11/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
2025-07-09 2:40 ` Sohil Mehta
2025-07-09 9:31 ` Kirill A. Shutemov
2025-07-09 9:36 ` Geert Uytterhoeven
2025-07-09 9:51 ` Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 12/16] x86/traps: Generalize #GP address decode and hint code Kirill A. Shutemov
2025-07-09 4:59 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
2025-07-09 5:12 ` Sohil Mehta
2025-07-09 10:38 ` Kirill A. Shutemov
2025-07-11 1:22 ` Sohil Mehta
2025-07-11 1:23 ` Sohil Mehta
2025-07-07 8:03 ` [PATCHv9 14/16] x86/cpu: Enable LASS during CPU initialization Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 15/16] x86/cpu: Make LAM depend on LASS Kirill A. Shutemov
2025-07-07 8:03 ` [PATCHv9 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
2025-07-09 5:31 ` Sohil Mehta
2025-07-09 11:00 ` Kirill A. Shutemov
2025-07-11 0:42 ` Sohil Mehta