* [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
2025-06-24 14:11 [PATCHv6 07/16] x86/vsyscall: Reorganize the #PF emulation code Dave Hansen
@ 2025-06-24 14:59 ` Kirill A. Shutemov
0 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-24 14:59 UTC (permalink / raw)
To: dave.hansen
Cc: acme, aik, akpm, alexander.shishkin, andrew.cooper3, ardb, ast,
bp, brijesh.singh, changbin.du, christophe.leroy, corbet,
daniel.sneddon, dave.hansen, ebiggers, geert+renesas, houtao1,
hpa, jgg, jgross, jpoimboe, kai.huang, kees, kirill.shutemov,
leitao, linux-doc, linux-efi, linux-kernel, linux-mm, linux, luto,
mcgrof, mhiramat, michael.roth, mingo, mingo, namhyung, paulmck,
pawan.kumar.gupta, peterz, rick.p.edgecombe, rppt, sandipan.das,
shijie, sohil.mehta, tglx, tj, tony.luck, vegard.nossum, x86,
xin3.li, xiongwei.song, ytcoode
emulate_vsyscall() expects to see X86_PF_INSTR in the PFEC on a vsyscall
page fault, but the CPU does not report X86_PF_INSTR if neither
X86_FEATURE_NX nor X86_FEATURE_SMEP is enabled.
X86_FEATURE_NX should be enabled on nearly all 64-bit CPUs, except for
early P4 processors that did not support this feature.
Instead of explicitly checking for X86_PF_INSTR, compare the fault
address to RIP.
On machines with X86_FEATURE_NX enabled, issue a warning if RIP is equal
to the fault address but X86_PF_INSTR is absent.
Originally-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/all/bd81a98b-f8d4-4304-ac55-d4151a1a77ab@intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index c9103a6fa06e..0b0e0283994f 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -124,7 +124,8 @@ bool emulate_vsyscall(unsigned long error_code,
if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
return false;
- if (!(error_code & X86_PF_INSTR)) {
+ /* Avoid emulation unless userspace was executing from vsyscall page: */
+ if (address != regs->ip) {
/* Failed vsyscall read */
if (vsyscall_mode == EMULATE)
return false;
@@ -136,13 +137,16 @@ bool emulate_vsyscall(unsigned long error_code,
return false;
}
+
+ /* X86_PF_INSTR is only set when NX is supported: */
+ if (cpu_feature_enabled(X86_FEATURE_NX))
+ WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
/*
* No point in checking CS -- the only way to get here is a user mode
* trap to a high address, which means that we're in 64-bit user code.
*/
- WARN_ON_ONCE(address != regs->ip);
-
if (vsyscall_mode == NONE) {
warn_bad_vsyscall(KERN_INFO, regs,
"vsyscall attempted with vsyscall=none");
--
2.47.2
* [PATCHv7 00/16] x86: Enable Linear Address Space Separation support
@ 2025-06-25 12:50 Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
` (17 more replies)
0 siblings, 18 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
Linear Address Space Separation (LASS) is a security feature intended to
prevent malicious virtual address space accesses across the user/kernel
mode boundary.
Such mode-based access protection already exists today with paging and features
such as SMEP and SMAP. However, to enforce these protections, the processor
must traverse the paging structures in memory. Malicious software can use
timing information resulting from this traversal to determine details about the
paging structures, and these details may also be used to determine the layout
of kernel memory.
The LASS mechanism provides the same mode-based protections as paging but
without traversing the paging structures. Because the protections enforced by
LASS are applied before paging, software cannot derive paging-based timing
information from the various caching structures such as the TLBs, mid-level
caches, page walker, data caches, etc. LASS also defeats probing via double
page faults, TLB flush and reload, and software prefetch instructions.
See [2], [3] and [4] for some research on the related attack vectors.
Had it been available, LASS alone would have mitigated Meltdown. (Hindsight is
20/20 :)
In addition, LASS prevents an attack vector described in the Spectre LAM (SLAM)
whitepaper [7].
LASS enforcement relies on the typical kernel implementation to divide the
64-bit virtual address space into two halves:
Addr[63]=0 -> User address space
Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically results in a
#GP fault.
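As a sketch (the helper name is illustrative, not part of this series),
the check LASS applies amounts to a single bit test on the linear
address, with no page table walk:

  /* Illustrative only: LASS classifies a linear address purely by
   * bit 63 (BIT_ULL() is from <linux/bits.h>). */
  static inline bool lass_addr_is_kernel(unsigned long addr)
  {
          return addr & BIT_ULL(63);
  }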
Kernel accesses usually only happen to the kernel address space. However, there
are valid reasons for the kernel to access memory in the user half. For these
cases (such as text poking and EFI runtime accesses), the kernel can
temporarily suspend the enforcement of LASS by toggling SMAP (Supervisor Mode
Access Prevention) using the stac()/clac() helpers, and in one instance by
outright disabling LASS for an EFI runtime call.
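A minimal sketch of that suspend/resume pattern, built on the
__inline_memcpy() and lass_disable/enable_enforcement() helpers
introduced later in this series (the wrapper function itself is
illustrative):

  /* Illustrative only: touch a user-half mapping from kernel context.
   * STAC sets EFLAGS.AC, suspending SMAP and LASS checks for data
   * accesses; CLAC re-arms them. */
  static void copy_to_low_mapping(void *dst, const void *src, size_t len)
  {
          lass_disable_enforcement();     /* STAC */
          __inline_memcpy(dst, src, len); /* no out-of-line call with AC set */
          lass_enable_enforcement();      /* CLAC */
  }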
User space cannot access any kernel address while LASS is enabled.
Unfortunately, the legacy vsyscall functions are located in the address range
0xffffffffff600000 - 0xffffffffff601000 and are emulated by the kernel. To
avoid breaking user applications when LASS is enabled, extend the vsyscall
emulation in execute (XONLY) mode to the #GP fault handler.
In contrast, the vsyscall EMULATE mode is deprecated and not expected to be
used by anyone. Supporting EMULATE mode with LASS would require complex
instruction decoding in the #GP fault handler and is probably not worth the
hassle. Disable LASS in the rare case that someone absolutely needs EMULATE
mode and enables vsyscall=emulate via the command line.
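For reference, the vsyscall page is a single page at a fixed address,
and the is_vsyscall_vaddr() helper that both fault handlers rely on
reduces to (matching the helper already in the tree):

  #define VSYSCALL_ADDR 0xffffffffff600000UL

  /* A fault targets the vsyscall page iff its address falls within
   * the one 4K page at VSYSCALL_ADDR. */
  static inline bool is_vsyscall_vaddr(unsigned long vaddr)
  {
          return unlikely((vaddr & PAGE_MASK) == VSYSCALL_ADDR);
  }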
Changes from v6[10]:
- Rework the #SS handler to work properly on FRED;
- Do not require X86_PF_INSTR to emulate vsyscall;
- Move lass_clac()/stac() definition to the patch where they are used;
- Rename lass_clac/stac() to lass_disable/enable_enforcement();
- Fix several build issues around inline memcpy and memset;
- Fix sparse warning;
- Adjust comments and commit messages;
- Drop "x86/efi: Move runtime service initialization to arch/x86" patch
as it got applied;
Changes from v5[9]:
- Report LASS violation as NULL pointer dereference if the address is in the
first page frame;
- Provide helpful error message on #SS due to LASS violation;
- Fold patch for vsyscall=emulate documentation into patch
that disables LASS with vsyscall=emulate;
- Rewrite __inline_memeset() and __inline_memcpy();
- Adjust comments and commit messages;
Changes from v4[8]:
- Added PeterZ's Originally-by and SoB to 2/16
- Added lass_clac()/lass_stac() to differentiate from SMAP necessitated
clac()/stac() and to be NOPs on CPUs that don't support LASS
- Moved LASS enabling patch to the end of the series to avoid rendering
  machines unbootable in the window before the patch that disables LASS
  around EFI initialization
- Reverted Pawan's LAM disabling commit
Changes from v3[6]:
- Made LAM dependent on LASS
- Moved EFI runtime initialization to x86 side of things
- Suspended LASS validation around EFI set_virtual_address_map call
- Added a message for the case of kernel side LASS violation
- Moved inline memset/memcpy versions to the common string.h
Changes from v2[5]:
- Added myself to the SoB chain
Changes from v1[1]:
- Emulate vsyscall violations in execute mode in the #GP fault handler
- Use inline memcpy and memset while patching alternatives
- Remove CONFIG_X86_LASS
- Make LASS depend on SMAP
- Dropped the minimal KVM enabling patch
[1] https://lore.kernel.org/lkml/20230110055204.3227669-1-yian.chen@intel.com/
[2] “Practical Timing Side Channel Attacks against Kernel Space ASLR”,
https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[3] “Prefetch Side-Channel Attacks: Bypassing SMAP and Kernel ASLR”, http://doi.acm.org/10.1145/2976749.2978356
[4] “Harmful prefetch on Intel”, https://ioactive.com/harmful-prefetch-on-intel/ (H/T Anders)
[5] https://lore.kernel.org/all/20230530114247.21821-1-alexander.shishkin@linux.intel.com/
[6] https://lore.kernel.org/all/20230609183632.48706-1-alexander.shishkin@linux.intel.com/
[7] https://download.vusec.net/papers/slam_sp24.pdf
[8] https://lore.kernel.org/all/20240710160655.3402786-1-alexander.shishkin@linux.intel.com/
[9] https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com
[10] https://lore.kernel.org/all/20250620135325.3300848-1-kirill.shutemov@linux.intel.com/
Alexander Shishkin (4):
x86/cpu: Defer CR pinning setup until after EFI initialization
efi: Disable LASS around set_virtual_address_map() EFI call
x86/traps: Communicate a LASS violation in #GP message
x86/cpu: Make LAM depend on LASS
Kirill A. Shutemov (4):
x86/asm: Introduce inline memcpy and memset
x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
x86/traps: Handle LASS thrown #SS
x86: Re-enable Linear Address Masking
Sohil Mehta (7):
x86/cpu: Enumerate the LASS feature bits
x86/alternatives: Disable LASS when patching kernel alternatives
x86/vsyscall: Reorganize the #PF emulation code
x86/traps: Consolidate user fixups in exc_general_protection()
x86/vsyscall: Add vsyscall emulation for #GP
x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
x86/cpu: Enable LASS during CPU initialization
Yian Chen (1):
x86/cpu: Set LASS CR4 bit as pinning sensitive
.../admin-guide/kernel-parameters.txt | 4 +-
arch/x86/Kconfig | 1 -
arch/x86/Kconfig.cpufeatures | 4 +
arch/x86/entry/vsyscall/vsyscall_64.c | 69 +++++++++++------
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/smap.h | 27 ++++++-
arch/x86/include/asm/string.h | 46 ++++++++++++
arch/x86/include/asm/uaccess_64.h | 38 +++-------
arch/x86/include/asm/vsyscall.h | 14 +++-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/alternative.c | 14 +++-
arch/x86/kernel/cpu/common.c | 21 ++++--
arch/x86/kernel/cpu/cpuid-deps.c | 2 +
arch/x86/kernel/traps.c | 75 +++++++++++++++----
arch/x86/kernel/umip.c | 3 +
arch/x86/lib/clear_page_64.S | 10 ++-
arch/x86/mm/fault.c | 2 +-
arch/x86/platform/efi/efi.c | 15 ++++
tools/arch/x86/include/asm/cpufeatures.h | 1 +
19 files changed, 264 insertions(+), 85 deletions(-)
--
2.47.2
* [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-26 15:22 ` Borislav Petkov
2025-06-26 18:00 ` Xin Li
2025-06-25 12:50 ` [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
` (16 subsequent siblings)
17 siblings, 2 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Linear Address Space Separation (LASS) is a security feature intended
to prevent malicious virtual address space accesses across the
user/kernel mode boundary.
Such mode-based access protection already exists today with paging and
features such as SMEP and SMAP. However, to enforce these protections,
the processor must traverse the paging structures in memory. Malicious
software can use timing information resulting from this traversal to
determine details about the paging structures, and these details may
also be used to determine the layout of kernel memory.
The LASS mechanism provides the same mode-based protections as paging
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, software will not be able to
derive paging-based timing information from the various caching
structures such as the TLBs, mid-level caches, page walker, data caches,
etc.
LASS enforcement relies on the typical kernel implementation to divide
the 64-bit virtual address space into two halves:
Addr[63]=0 -> User address space
Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically
results in a #GP fault.
The LASS enforcement for kernel data access depends on CR4.SMAP being
set. The enforcement can be disabled by toggling the RFLAGS.AC bit,
similar to SMAP.
Define the CPU feature bit to enumerate this feature and add the
corresponding dependency on the SMAP feature bit.
LASS provides protection against a class of speculative attacks, such as
SLAM[1]. Add the "lass" flag to /proc/cpuinfo to indicate that the feature
is supported by hardware and enabled by the kernel. This allows userspace
to determine if the setup is secure against such attacks.
[1] https://download.vusec.net/papers/slam_sp24.pdf
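A minimal userspace sketch of such a check (simplified parsing; real
code should match the flag as a whole word):

  #include <stdbool.h>
  #include <stdio.h>
  #include <string.h>

  /* Scan the "flags" lines of /proc/cpuinfo for the "lass" flag added
   * by this patch. Simplified: strstr() also matches substrings. */
  static bool cpu_has_lass(void)
  {
          char line[4096];
          bool found = false;
          FILE *f = fopen("/proc/cpuinfo", "r");

          if (!f)
                  return false;
          while (!found && fgets(line, sizeof(line), f))
                  found = !strncmp(line, "flags", 5) && strstr(line, " lass");
          fclose(f);
          return found;
  }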
Co-developed-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/Kconfig.cpufeatures | 4 ++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/uapi/asm/processor-flags.h | 2 ++
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
tools/arch/x86/include/asm/cpufeatures.h | 1 +
5 files changed, 9 insertions(+)
diff --git a/arch/x86/Kconfig.cpufeatures b/arch/x86/Kconfig.cpufeatures
index 250c10627ab3..733d5aff2456 100644
--- a/arch/x86/Kconfig.cpufeatures
+++ b/arch/x86/Kconfig.cpufeatures
@@ -124,6 +124,10 @@ config X86_DISABLED_FEATURE_PCID
def_bool y
depends on !X86_64
+config X86_DISABLED_FEATURE_LASS
+ def_bool y
+ depends on X86_32
+
config X86_DISABLED_FEATURE_PKU
def_bool y
depends on !X86_INTEL_MEMORY_PROTECTION_KEYS
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b78af55aa22e..8eef1ad7aca2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -313,6 +313,7 @@
#define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */
#define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */
#define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */
#define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index f1a4adc78272..81d0c8bf1137 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -136,6 +136,8 @@
#define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT)
#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */
#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT)
+#define X86_CR4_LASS_BIT 27 /* enable Linear Address Space Separation support */
+#define X86_CR4_LASS _BITUL(X86_CR4_LASS_BIT)
#define X86_CR4_LAM_SUP_BIT 28 /* LAM for supervisor pointers */
#define X86_CR4_LAM_SUP _BITUL(X86_CR4_LAM_SUP_BIT)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 46efcbd6afa4..98d0cdd82574 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -89,6 +89,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_SHSTK, X86_FEATURE_XSAVES },
{ X86_FEATURE_FRED, X86_FEATURE_LKGS },
{ X86_FEATURE_SPEC_CTRL_SSBD, X86_FEATURE_SPEC_CTRL },
+ { X86_FEATURE_LASS, X86_FEATURE_SMAP },
{}
};
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index ee176236c2be..4473a6f7800b 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -313,6 +313,7 @@
#define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */
#define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */
#define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */
#define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
--
2.47.2
* [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-25 12:55 ` Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 02/16] x86/asm: Introduce inline memcpy and memset Kirill A. Shutemov
` (15 subsequent siblings)
17 siblings, 1 reply; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
emulate_vsyscall() expects to see X86_PF_INSTR in the PFEC on a vsyscall
page fault, but the CPU does not report X86_PF_INSTR if neither
X86_FEATURE_NX nor X86_FEATURE_SMEP is enabled.
X86_FEATURE_NX should be enabled on nearly all 64-bit CPUs, except for
early P4 processors that did not support this feature.
Instead of explicitly checking for X86_PF_INSTR, compare the fault
address to RIP.
On machines with X86_FEATURE_NX enabled, issue a warning if RIP is equal
to the fault address but X86_PF_INSTR is absent.
Originally-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/all/bd81a98b-f8d4-4304-ac55-d4151a1a77ab@intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index c9103a6fa06e..0b0e0283994f 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -124,7 +124,8 @@ bool emulate_vsyscall(unsigned long error_code,
if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
return false;
- if (!(error_code & X86_PF_INSTR)) {
+ /* Avoid emulation unless userspace was executing from vsyscall page: */
+ if (address != regs->ip) {
/* Failed vsyscall read */
if (vsyscall_mode == EMULATE)
return false;
@@ -136,13 +137,16 @@ bool emulate_vsyscall(unsigned long error_code,
return false;
}
+
+ /* X86_PF_INSTR is only set when NX is supported: */
+ if (cpu_feature_enabled(X86_FEATURE_NX))
+ WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
/*
* No point in checking CS -- the only way to get here is a user mode
* trap to a high address, which means that we're in 64-bit user code.
*/
- WARN_ON_ONCE(address != regs->ip);
-
if (vsyscall_mode == NONE) {
warn_bad_vsyscall(KERN_INFO, regs,
"vsyscall attempted with vsyscall=none");
--
2.47.2
* [PATCHv7 02/16] x86/asm: Introduce inline memcpy and memset
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
2025-06-25 12:50 ` [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
` (14 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
Extract the memcpy and memset functionality from copy_user_generic() and
__clear_user().
They can be used as inline memcpy and memset instead of the GCC builtins
whenever necessary. LASS requires them to handle text_poke().
Originally-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/all/20241029184840.GJ14555@noisy.programming.kicks-ass.net/
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/string.h | 46 +++++++++++++++++++++++++++++++
arch/x86/include/asm/uaccess_64.h | 38 +++++++------------------
arch/x86/lib/clear_page_64.S | 10 +++++--
3 files changed, 64 insertions(+), 30 deletions(-)
diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
index c3c2c1914d65..5cd0f18a431f 100644
--- a/arch/x86/include/asm/string.h
+++ b/arch/x86/include/asm/string.h
@@ -1,6 +1,52 @@
/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_STRING_H
+#define _ASM_X86_STRING_H
+
+#include <asm/asm.h>
+#include <asm/alternative.h>
+#include <asm/cpufeatures.h>
+
#ifdef CONFIG_X86_32
# include <asm/string_32.h>
#else
# include <asm/string_64.h>
#endif
+
+#ifdef CONFIG_X86_64
+#define ALT_64(orig, alt, feat) ALTERNATIVE(orig, alt, feat)
+#else
+#define ALT_64(orig, alt, feat) orig "\n"
+#endif
+
+static __always_inline void *__inline_memcpy(void *to, const void *from, size_t len)
+{
+ void *ret = to;
+
+ asm volatile("1:\n\t"
+ ALT_64("rep movsb",
+ "call rep_movs_alternative", ALT_NOT(X86_FEATURE_FSRM))
+ "2:\n\t"
+ _ASM_EXTABLE_UA(1b, 2b)
+ :"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
+ : : "memory", _ASM_AX);
+
+ return ret + len;
+}
+
+static __always_inline void *__inline_memset(void *addr, int v, size_t len)
+{
+ void *ret = addr;
+
+ asm volatile("1:\n\t"
+ ALT_64("rep stosb",
+ "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRM))
+ "2:\n\t"
+ _ASM_EXTABLE_UA(1b, 2b)
+ : "+c" (len), "+D" (addr), ASM_CALL_CONSTRAINT
+ : "a" ((uint8_t)v)
+ : "memory", _ASM_SI);
+
+ return ret + len;
+}
+
+#endif /* _ASM_X86_STRING_H */
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c8a5ae35c871..eb531e13e659 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -13,6 +13,7 @@
#include <asm/page.h>
#include <asm/percpu.h>
#include <asm/runtime-const.h>
+#include <asm/string.h>
/*
* Virtual variable: there's no actual backing store for this,
@@ -118,21 +119,12 @@ rep_movs_alternative(void *to, const void *from, unsigned len);
static __always_inline __must_check unsigned long
copy_user_generic(void *to, const void *from, unsigned long len)
{
+ void *ret;
+
stac();
- /*
- * If CPU has FSRM feature, use 'rep movs'.
- * Otherwise, use rep_movs_alternative.
- */
- asm volatile(
- "1:\n\t"
- ALTERNATIVE("rep movsb",
- "call rep_movs_alternative", ALT_NOT(X86_FEATURE_FSRM))
- "2:\n"
- _ASM_EXTABLE_UA(1b, 2b)
- :"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
- : : "memory", "rax");
+ ret = __inline_memcpy(to, from, len);
clac();
- return len;
+ return ret - to;
}
static __always_inline __must_check unsigned long
@@ -178,25 +170,15 @@ rep_stos_alternative(void __user *addr, unsigned long len);
static __always_inline __must_check unsigned long __clear_user(void __user *addr, unsigned long size)
{
+ void *ptr = (__force void *)addr;
+ void *ret;
+
might_fault();
stac();
-
- /*
- * No memory constraint because it doesn't change any memory gcc
- * knows about.
- */
- asm volatile(
- "1:\n\t"
- ALTERNATIVE("rep stosb",
- "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRS))
- "2:\n"
- _ASM_EXTABLE_UA(1b, 2b)
- : "+c" (size), "+D" (addr), ASM_CALL_CONSTRAINT
- : "a" (0));
-
+ ret = __inline_memset(ptr, 0, size);
clac();
- return size;
+ return ret - ptr;
}
static __always_inline unsigned long clear_user(void __user *to, unsigned long n)
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index a508e4a8c66a..ca94828def62 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -55,17 +55,23 @@ SYM_FUNC_END(clear_page_erms)
EXPORT_SYMBOL_GPL(clear_page_erms)
/*
- * Default clear user-space.
+ * Default memset.
* Input:
* rdi destination
+ * rsi scratch
* rcx count
- * rax is zero
+ * al is value
*
* Output:
* rcx: uncleared bytes or 0 if successful.
*/
SYM_FUNC_START(rep_stos_alternative)
ANNOTATE_NOENDBR
+
+ movzbq %al, %rsi
+ movabs $0x0101010101010101, %rax
+ mulq %rsi
+
cmpq $64,%rcx
jae .Lunrolled
--
2.47.2
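The movzbq/movabs/mulq prologue added to rep_stos_alternative() above
broadcasts the low byte of the memset value into all eight bytes of
%rax; the equivalent C computation is (a sketch, not part of the patch):

  /* 0xab * 0x0101010101010101 == 0xabababababababab: one multiply
   * replicates the byte into a 64-bit store pattern. */
  static inline unsigned long long memset_pattern(unsigned char v)
  {
          return v * 0x0101010101010101ULL;
  }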
* [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (2 preceding siblings ...)
2025-06-25 12:50 ` [PATCHv7 02/16] x86/asm: Introduce inline memcpy and memset Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-26 13:49 ` Peter Zijlstra
2025-06-25 12:50 ` [PATCHv7 04/16] x86/cpu: Defer CR pinning setup until after EFI initialization Kirill A. Shutemov
` (13 subsequent siblings)
17 siblings, 1 reply; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
For patching, the kernel initializes a temporary mm area in the lower
half of the address range. See commit 4fc19708b165 ("x86/alternatives:
Initialize temporary mm for patching").
Disable LASS enforcement during patching to avoid triggering a #GP
fault.
Objtool warns when code inside the stac()/clac() guard calls a function
that is not on its allow-list, or references any function through a
dynamic function pointer inside the guard. See the Objtool warnings
section #9 in the document tools/objtool/Documentation/objtool.txt.
Considering that patching is usually small, replace the memcpy and
memset calls in the text poking functions with their inline versions.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/smap.h | 27 +++++++++++++++++++++++++--
arch/x86/kernel/alternative.c | 14 ++++++++++++--
2 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..2bb7f3808e51 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -23,18 +23,41 @@
#else /* __ASSEMBLER__ */
+/*
+ * The CLAC/STAC instructions toggle the enforcement of X86_FEATURE_SMAP and
+ * X86_FEATURE_LASS.
+ *
+ * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page tables: the
+ * kernel is not allowed to touch pages with the bit set unless the AC bit is
+ * set.
+ *
+ * LASS enforcement is based on bit 63 of the virtual address. The kernel is
+ * not allowed to touch memory in the lower half of the virtual address space
+ * unless the AC bit is set.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
static __always_inline void clac(void)
{
- /* Note: a barrier is implicit in alternative() */
alternative("", "clac", X86_FEATURE_SMAP);
}
static __always_inline void stac(void)
{
- /* Note: a barrier is implicit in alternative() */
alternative("", "stac", X86_FEATURE_SMAP);
}
+static __always_inline void lass_enable_enforcement(void)
+{
+ alternative("", "clac", X86_FEATURE_LASS);
+}
+
+static __always_inline void lass_disable_enforcement(void)
+{
+ alternative("", "stac", X86_FEATURE_LASS);
+}
+
static __always_inline unsigned long smap_save(void)
{
unsigned long flags;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index ea1d984166cd..ed75892590ee 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2447,16 +2447,26 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
__ro_after_init struct mm_struct *text_poke_mm;
__ro_after_init unsigned long text_poke_mm_addr;
+/*
+ * Text poking creates and uses a mapping in the lower half of the
+ * address space. Relax LASS enforcement when accessing the poking
+ * address.
+ */
+
static void text_poke_memcpy(void *dst, const void *src, size_t len)
{
- memcpy(dst, src, len);
+ lass_disable_enforcement();
+ __inline_memcpy(dst, src, len);
+ lass_enable_enforcement();
}
static void text_poke_memset(void *dst, const void *src, size_t len)
{
int c = *(const int *)src;
- memset(dst, c, len);
+ lass_disable_enforcement();
+ __inline_memset(dst, c, len);
+ lass_enable_enforcement();
}
typedef void text_poke_f(void *dst, const void *src, size_t len);
--
2.47.2
* [PATCHv7 04/16] x86/cpu: Defer CR pinning setup until after EFI initialization
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (3 preceding siblings ...)
2025-06-25 12:50 ` [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
` (12 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
In order to map the EFI runtime services, set_virtual_address_map()
needs to be called; it resides in the lower half of the address space.
This means that LASS needs to be temporarily disabled around this call,
which can only be done before CR pinning is set up.
Move the CR pinning setup behind the EFI initialization.
Wrapping efi_enter_virtual_mode() in lass_disable/enable_enforcement()
is not enough because the AC flag gates only data accesses, not
instruction fetches. Clearing the CR4 bit is required.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/common.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4f430be285de..9918121e0adc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2081,7 +2081,6 @@ static __init void identify_boot_cpu(void)
enable_sep_cpu();
#endif
cpu_detect_tlb(&boot_cpu_data);
- setup_cr_pinning();
tsx_init();
tdx_init();
@@ -2532,10 +2531,14 @@ void __init arch_cpu_finalize_init(void)
/*
* This needs to follow the FPU initializtion, since EFI depends on it.
+ *
+ * EFI twiddles CR4.LASS. Do it before CR pinning.
*/
if (efi_enabled(EFI_RUNTIME_SERVICES))
efi_enter_virtual_mode();
+ setup_cr_pinning();
+
/*
* Ensure that access to the per CPU representation has the initial
* boot CPU configuration.
--
2.47.2
* [PATCHv7 05/16] efi: Disable LASS around set_virtual_address_map() EFI call
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (4 preceding siblings ...)
2025-06-25 12:50 ` [PATCHv7 04/16] x86/cpu: Defer CR pinning setup until after EFI initialization Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
` (11 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Of all the EFI runtime services, set_virtual_address_map() is the only
one that is called via its lower mapping, which LASS prohibits regardless
of the EFLAGS.AC setting. The only way to allow this is to disable LASS
in the CR4 register.
Disable LASS around this low address EFI call.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/platform/efi/efi.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 463b784499a8..5b23c0daedef 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -787,6 +787,7 @@ static void __init __efi_enter_virtual_mode(void)
int count = 0, pg_shift = 0;
void *new_memmap = NULL;
efi_status_t status;
+ unsigned long lass;
unsigned long pa;
if (efi_alloc_page_tables()) {
@@ -825,11 +826,25 @@ static void __init __efi_enter_virtual_mode(void)
efi_sync_low_kernel_mappings();
+ /*
+ * set_virtual_address_map() is the only service located at lower
+ * addresses, so LASS has to be disabled around it.
+ *
+ * Note that flipping RFLAGS.AC is not sufficient for this, as it only
+ * permits data accesses and not instruction fetch. The entire LASS
+ * needs to be disabled.
+ */
+ lass = cr4_read_shadow() & X86_CR4_LASS;
+ cr4_clear_bits(lass);
+
status = efi_set_virtual_address_map(efi.memmap.desc_size * count,
efi.memmap.desc_size,
efi.memmap.desc_version,
(efi_memory_desc_t *)pa,
efi_systab_phys);
+
+ cr4_set_bits(lass);
+
if (status != EFI_SUCCESS) {
pr_err("Unable to switch EFI into virtual mode (status=%lx)!\n",
status);
--
2.47.2
* [PATCHv7 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (5 preceding siblings ...)
2025-06-25 12:50 ` [PATCHv7 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
@ 2025-06-25 12:50 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 07/16] x86/vsyscall: Reorganize the #PF emulation code Kirill A. Shutemov
` (10 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:50 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
emulate_vsyscall() expects to see X86_PF_INSTR in the PFEC on a vsyscall
page fault, but the CPU does not report X86_PF_INSTR if neither
X86_FEATURE_NX nor X86_FEATURE_SMEP is enabled.
X86_FEATURE_NX should be enabled on nearly all 64-bit CPUs, except for
early P4 processors that did not support this feature.
Instead of explicitly checking for X86_PF_INSTR, compare the fault
address against RIP.
On machines with X86_FEATURE_NX enabled, issue a warning if RIP is equal
to the fault address but X86_PF_INSTR is absent.
Originally-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/all/bd81a98b-f8d4-4304-ac55-d4151a1a77ab@intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index c9103a6fa06e..0b0e0283994f 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -124,7 +124,8 @@ bool emulate_vsyscall(unsigned long error_code,
if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
return false;
- if (!(error_code & X86_PF_INSTR)) {
+ /* Avoid emulation unless userspace was executing from vsyscall page: */
+ if (address != regs->ip) {
/* Failed vsyscall read */
if (vsyscall_mode == EMULATE)
return false;
@@ -136,13 +137,16 @@ bool emulate_vsyscall(unsigned long error_code,
return false;
}
+
+ /* X86_PF_INSTR is only set when NX is supported: */
+ if (cpu_feature_enabled(X86_FEATURE_NX))
+ WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
/*
* No point in checking CS -- the only way to get here is a user mode
* trap to a high address, which means that we're in 64-bit user code.
*/
- WARN_ON_ONCE(address != regs->ip);
-
if (vsyscall_mode == NONE) {
warn_bad_vsyscall(KERN_INFO, regs,
"vsyscall attempted with vsyscall=none");
--
2.47.2
* [PATCHv7 07/16] x86/vsyscall: Reorganize the #PF emulation code
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (6 preceding siblings ...)
2025-06-25 12:50 ` [PATCHv7 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 08/16] x86/traps: Consolidate user fixups in exc_general_protection() Kirill A. Shutemov
` (9 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Separate out the actual vsyscall emulation from the page fault specific
handling in preparation for the upcoming #GP fault emulation.
No functional change intended.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 52 ++++++++++++++-------------
arch/x86/include/asm/vsyscall.h | 8 ++---
arch/x86/mm/fault.c | 2 +-
3 files changed, 33 insertions(+), 29 deletions(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 0b0e0283994f..25f94ac5fd35 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -112,36 +112,13 @@ static bool write_ok_or_segv(unsigned long ptr, size_t size)
}
}
-bool emulate_vsyscall(unsigned long error_code,
- struct pt_regs *regs, unsigned long address)
+static bool __emulate_vsyscall(struct pt_regs *regs, unsigned long address)
{
unsigned long caller;
int vsyscall_nr, syscall_nr, tmp;
long ret;
unsigned long orig_dx;
- /* Write faults or kernel-privilege faults never get fixed up. */
- if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
- return false;
-
- /* Avoid emulation unless userspace was executing from vsyscall page: */
- if (address != regs->ip) {
- /* Failed vsyscall read */
- if (vsyscall_mode == EMULATE)
- return false;
-
- /*
- * User code tried and failed to read the vsyscall page.
- */
- warn_bad_vsyscall(KERN_INFO, regs, "vsyscall read attempt denied -- look up the vsyscall kernel parameter if you need a workaround");
- return false;
- }
-
-
- /* X86_PF_INSTR is only set when NX is supported: */
- if (cpu_feature_enabled(X86_FEATURE_NX))
- WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
-
/*
* No point in checking CS -- the only way to get here is a user mode
* trap to a high address, which means that we're in 64-bit user code.
@@ -274,6 +251,33 @@ bool emulate_vsyscall(unsigned long error_code,
return true;
}
+bool emulate_vsyscall_pf(unsigned long error_code, struct pt_regs *regs,
+ unsigned long address)
+{
+ /* Write faults or kernel-privilege faults never get fixed up. */
+ if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
+ return false;
+
+ if (address == regs->ip) {
+ /* X86_PF_INSTR is only set when NX is supported: */
+ if (cpu_feature_enabled(X86_FEATURE_NX))
+ WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
+ return __emulate_vsyscall(regs, address);
+ }
+
+ /* Failed vsyscall read */
+ if (vsyscall_mode == EMULATE)
+ return false;
+
+ /*
+ * User code tried and failed to read the vsyscall page.
+ */
+ warn_bad_vsyscall(KERN_INFO, regs,
+ "vsyscall read attempt denied -- look up the vsyscall kernel parameter if you need a workaround");
+ return false;
+}
+
/*
* A pseudo VMA to allow ptrace access for the vsyscall page. This only
* covers the 64bit vsyscall page now. 32bit has a real VMA now and does
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index 472f0263dbc6..214977f4fa11 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -14,12 +14,12 @@ extern void set_vsyscall_pgtable_user_bits(pgd_t *root);
* Called on instruction fetch fault in vsyscall page.
* Returns true if handled.
*/
-extern bool emulate_vsyscall(unsigned long error_code,
- struct pt_regs *regs, unsigned long address);
+extern bool emulate_vsyscall_pf(unsigned long error_code,
+ struct pt_regs *regs, unsigned long address);
#else
static inline void map_vsyscall(void) {}
-static inline bool emulate_vsyscall(unsigned long error_code,
- struct pt_regs *regs, unsigned long address)
+static inline bool emulate_vsyscall_pf(unsigned long error_code,
+ struct pt_regs *regs, unsigned long address)
{
return false;
}
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 998bd807fc7b..fbcc2da75fd6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1316,7 +1316,7 @@ void do_user_addr_fault(struct pt_regs *regs,
* to consider the PF_PK bit.
*/
if (is_vsyscall_vaddr(address)) {
- if (emulate_vsyscall(error_code, regs, address))
+ if (emulate_vsyscall_pf(error_code, regs, address))
return;
}
#endif
--
2.47.2
* [PATCHv7 08/16] x86/traps: Consolidate user fixups in exc_general_protection()
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (7 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 07/16] x86/vsyscall: Reorganize the #PF emulation code Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 09/16] x86/vsyscall: Add vsyscall emulation for #GP Kirill A. Shutemov
` (8 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Move the UMIP exception fixup along with the other user mode fixups,
that is, under the common "if (user_mode(regs))" condition where the
rest of the fixups reside.
No functional change intended.
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
arch/x86/kernel/traps.c | 8 +++-----
arch/x86/kernel/umip.c | 3 +++
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c5c897a86418..10856e0ac46c 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -800,11 +800,6 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
cond_local_irq_enable(regs);
- if (static_cpu_has(X86_FEATURE_UMIP)) {
- if (user_mode(regs) && fixup_umip_exception(regs))
- goto exit;
- }
-
if (v8086_mode(regs)) {
local_irq_enable();
handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
@@ -819,6 +814,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
if (fixup_vdso_exception(regs, X86_TRAP_GP, error_code, 0))
goto exit;
+ if (fixup_umip_exception(regs))
+ goto exit;
+
gp_user_force_sig_segv(regs, X86_TRAP_GP, error_code, desc);
goto exit;
}
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5a4b21389b1d..80f2ad26363c 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -343,6 +343,9 @@ bool fixup_umip_exception(struct pt_regs *regs)
void __user *uaddr;
struct insn insn;
+ if (!cpu_feature_enabled(X86_FEATURE_UMIP))
+ return false;
+
if (!regs)
return false;
--
2.47.2
* [PATCHv7 09/16] x86/vsyscall: Add vsyscall emulation for #GP
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (8 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 08/16] x86/traps: Consolidate user fixups in exc_general_protection() Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE Kirill A. Shutemov
` (7 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
The legacy vsyscall page is mapped at a fixed address in the kernel
address range 0xffffffffff600000-0xffffffffff601000. Prior to LASS being
introduced, a legacy vsyscall page access from userspace would always
generate a page fault. The kernel emulates the execute (XONLY) accesses
in the page fault handler and returns to userspace with the appropriate
register values.
Since LASS intercepts these accesses before the paging structures are
traversed, it generates a general protection fault instead of a page
fault. The #GP fault does not provide much information in its error
code, so use the faulting RIP, which is preserved in the user registers,
to emulate the vsyscall access without going through complex instruction
decoding.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 14 +++++++++++++-
arch/x86/include/asm/vsyscall.h | 6 ++++++
arch/x86/kernel/traps.c | 4 ++++
3 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 25f94ac5fd35..be77385b311e 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -23,7 +23,7 @@
* soon be no new userspace code that will ever use a vsyscall.
*
* The code in this file emulates vsyscalls when notified of a page
- * fault to a vsyscall address.
+ * fault or a general protection fault to a vsyscall address.
*/
#include <linux/kernel.h>
@@ -278,6 +278,18 @@ bool emulate_vsyscall_pf(unsigned long error_code, struct pt_regs *regs,
return false;
}
+bool emulate_vsyscall_gp(struct pt_regs *regs)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_LASS))
+ return false;
+
+ /* Emulate only if the RIP points to the vsyscall address */
+ if (!is_vsyscall_vaddr(regs->ip))
+ return false;
+
+ return __emulate_vsyscall(regs, regs->ip);
+}
+
/*
* A pseudo VMA to allow ptrace access for the vsyscall page. This only
* covers the 64bit vsyscall page now. 32bit has a real VMA now and does
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index 214977f4fa11..4eb8d3673223 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -16,6 +16,7 @@ extern void set_vsyscall_pgtable_user_bits(pgd_t *root);
*/
extern bool emulate_vsyscall_pf(unsigned long error_code,
struct pt_regs *regs, unsigned long address);
+extern bool emulate_vsyscall_gp(struct pt_regs *regs);
#else
static inline void map_vsyscall(void) {}
static inline bool emulate_vsyscall_pf(unsigned long error_code,
@@ -23,6 +24,11 @@ static inline bool emulate_vsyscall_pf(unsigned long error_code,
{
return false;
}
+
+static inline bool emulate_vsyscall_gp(struct pt_regs *regs)
+{
+ return false;
+}
#endif
/*
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 10856e0ac46c..40e34bb66d7c 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -69,6 +69,7 @@
#include <asm/tdx.h>
#include <asm/cfi.h>
#include <asm/msr.h>
+#include <asm/vsyscall.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@@ -817,6 +818,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
if (fixup_umip_exception(regs))
goto exit;
+ if (emulate_vsyscall_gp(regs))
+ goto exit;
+
gp_user_force_sig_segv(regs, X86_TRAP_GP, error_code, desc);
goto exit;
}
--
2.47.2
* [PATCHv7 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (9 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 09/16] x86/vsyscall: Add vsyscall emulation for #GP Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 11/16] x86/cpu: Set LASS CR4 bit as pinning sensitive Kirill A. Shutemov
` (6 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
The EMULATE mode of vsyscall maps the vsyscall page into user address
space where it can be read directly by the user application. This mode
was deprecated recently and can only be enabled with the special
command line parameter vsyscall=emulate. See commit bf00745e7791
("x86/vsyscall: Remove CONFIG_LEGACY_VSYSCALL_EMULATE").
Fixing the LASS violations in EMULATE mode would need complex
instruction decoding, since the resulting #GP fault does not include
any useful error information and the vsyscall address is not readily
available in RIP.
At this point, no one is expected to be using the insecure and
deprecated EMULATE mode. The rare users that still need it probably
don't care much about security anyway. Disable LASS when EMULATE mode
is requested during command line parsing to avoid breaking user
software.
LASS will be supported if vsyscall mode is set to XONLY or NONE.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
Documentation/admin-guide/kernel-parameters.txt | 4 +++-
arch/x86/entry/vsyscall/vsyscall_64.c | 7 +++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f1f2c0874da9..796c987372df 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -7926,7 +7926,9 @@
emulate Vsyscalls turn into traps and are emulated
reasonably safely. The vsyscall page is
- readable.
+ readable. This disables the Linear
+ Address Space Separation (LASS) security
+ feature and makes the system less secure.
xonly [default] Vsyscalls turn into traps and are
emulated reasonably safely. The vsyscall
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index be77385b311e..d37df40bfb26 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -63,6 +63,13 @@ static int __init vsyscall_setup(char *str)
else
return -EINVAL;
+ if (cpu_feature_enabled(X86_FEATURE_LASS) &&
+ vsyscall_mode == EMULATE) {
+ cr4_clear_bits(X86_CR4_LASS);
+ setup_clear_cpu_cap(X86_FEATURE_LASS);
+ pr_warn_once("x86/cpu: Disabling LASS support due to vsyscall=emulate\n");
+ }
+
return 0;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv7 11/16] x86/cpu: Set LASS CR4 bit as pinning sensitive
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (10 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 12/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
` (5 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Yian Chen <yian.chen@intel.com>
Security features such as LASS are not expected to be disabled once
initialized. Add LASS to the CR4 pinned mask.
Signed-off-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/common.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9918121e0adc..1552c7510380 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -403,7 +403,8 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
- X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED;
+ X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED |
+ X86_CR4_LASS;
static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
static unsigned long cr4_pinned_bits __ro_after_init;
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv7 12/16] x86/traps: Communicate a LASS violation in #GP message
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (11 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 11/16] x86/cpu: Set LASS CR4 bit as pinning sensitive Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
` (4 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Provide a more helpful message on #GP when a kernel-side LASS violation
is detected.
Report a NULL pointer dereference if a LASS violation occurs due to
accessing the first page frame.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/traps.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 40e34bb66d7c..e2ad760b17ea 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -636,7 +636,16 @@ DEFINE_IDTENTRY(exc_bounds)
enum kernel_gp_hint {
GP_NO_HINT,
GP_NON_CANONICAL,
- GP_CANONICAL
+ GP_CANONICAL,
+ GP_LASS_VIOLATION,
+ GP_NULL_POINTER,
+};
+
+static const char *kernel_gp_hint_help[] = {
+ [GP_NON_CANONICAL] = "probably for non-canonical address",
+ [GP_CANONICAL] = "maybe for address",
+ [GP_LASS_VIOLATION] = "LASS prevented access to address",
+ [GP_NULL_POINTER] = "kernel NULL pointer dereference",
};
/*
@@ -672,6 +681,12 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
if (*addr < ~__VIRTUAL_MASK &&
*addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
return GP_NON_CANONICAL;
+ else if (*addr < ~__VIRTUAL_MASK &&
+ cpu_feature_enabled(X86_FEATURE_LASS)) {
+ if (*addr < PAGE_SIZE)
+ return GP_NULL_POINTER;
+ return GP_LASS_VIOLATION;
+ }
#endif
return GP_CANONICAL;
@@ -833,11 +848,10 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
else
hint = get_kernel_gp_address(regs, &gp_addr);
- if (hint != GP_NO_HINT)
+ if (hint != GP_NO_HINT) {
snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
- (hint == GP_NON_CANONICAL) ? "probably for non-canonical address"
- : "maybe for address",
- gp_addr);
+ kernel_gp_hint_help[hint], gp_addr);
+ }
/*
* KASAN is interested only in the non-canonical case, clear it
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (12 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 12/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-26 17:57 ` Xin Li
2025-06-25 12:51 ` [PATCHv7 14/16] x86/cpu: Make LAM depend on LASS Kirill A. Shutemov
` (3 subsequent siblings)
17 siblings, 1 reply; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
LASS throws a #GP for any violations except for stack register accesses,
in which case it throws a #SS instead. Handle this similarly to how other
LASS violations are handled.
In the case of FRED, before handling #SS as a LASS violation, the kernel
has to check if there's a fixup for the exception. It can address a #SS
due to an invalid user context on ERETU [1]. See 5105e7687ad3 ("x86/fred:
Fixup fault on ERETU by jumping to fred_entrypoint_user") for more details.
Co-developed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/traps.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e2ad760b17ea..f1f92e1ba524 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -418,12 +418,6 @@ DEFINE_IDTENTRY_ERRORCODE(exc_segment_not_present)
SIGBUS, 0, NULL);
}
-DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
-{
- do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
- 0, NULL);
-}
-
DEFINE_IDTENTRY_ERRORCODE(exc_alignment_check)
{
char *str = "alignment check";
@@ -866,6 +860,39 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
cond_local_irq_disable(regs);
}
+#define SSFSTR "stack segment fault"
+
+DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
+{
+ if (user_mode(regs))
+ goto error_trap;
+
+ if (cpu_feature_enabled(X86_FEATURE_FRED) &&
+ fixup_exception(regs, X86_TRAP_SS, error_code, 0))
+ return;
+
+ if (cpu_feature_enabled(X86_FEATURE_LASS)) {
+ enum kernel_gp_hint hint;
+ unsigned long gp_addr;
+
+ hint = get_kernel_gp_address(regs, &gp_addr);
+ if (hint != GP_NO_HINT) {
+ printk(SSFSTR ", %s 0x%lx", kernel_gp_hint_help[hint],
+ gp_addr);
+ }
+
+ if (hint != GP_NON_CANONICAL)
+ gp_addr = 0;
+
+ die_addr(SSFSTR, regs, error_code, gp_addr);
+ return;
+ }
+
+error_trap:
+ do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
+ 0, NULL);
+}
+
static bool do_int3(struct pt_regs *regs)
{
int res;
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv7 14/16] x86/cpu: Make LAM depend on LASS
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (13 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 15/16] x86/cpu: Enable LASS during CPU initialization Kirill A. Shutemov
` (2 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
To prevent Spectre exploits based on LAM, as demonstrated by the SLAM
whitepaper [1], make LAM depend on LASS, which avoids this type of
vulnerability.
[1] https://download.vusec.net/papers/slam_sp24.pdf
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 98d0cdd82574..11bb9ed40140 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -90,6 +90,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_FRED, X86_FEATURE_LKGS },
{ X86_FEATURE_SPEC_CTRL_SSBD, X86_FEATURE_SPEC_CTRL },
{ X86_FEATURE_LASS, X86_FEATURE_SMAP },
+ { X86_FEATURE_LAM, X86_FEATURE_LASS },
{}
};
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv7 15/16] x86/cpu: Enable LASS during CPU initialization
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (14 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 14/16] x86/cpu: Make LAM depend on LASS Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
2025-06-26 9:22 ` [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Vegard Nossum
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
From: Sohil Mehta <sohil.mehta@intel.com>
Since LASS is a security feature, enable it by default if the platform
supports it.
While at it, get rid of the comment above the SMAP/SMEP/UMIP/LASS setup
instead of updating it to mention LASS as well, as the whole sequence is
quite self-explanatory.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 1552c7510380..97a228f917a9 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -401,6 +401,12 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
cr4_clear_bits(X86_CR4_UMIP);
}
+static __always_inline void setup_lass(struct cpuinfo_x86 *c)
+{
+ if (cpu_feature_enabled(X86_FEATURE_LASS))
+ cr4_set_bits(X86_CR4_LASS);
+}
+
/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED |
@@ -1975,10 +1981,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
- /* Set up SMEP/SMAP/UMIP */
setup_smep(c);
setup_smap(c);
setup_umip(c);
+ setup_lass(c);
/* Enable FSGSBASE instructions if available. */
if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv7 16/16] x86: Re-enable Linear Address Masking
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (15 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 15/16] x86/cpu: Enable LASS during CPU initialization Kirill A. Shutemov
@ 2025-06-25 12:51 ` Kirill A. Shutemov
2025-06-26 9:22 ` [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Vegard Nossum
17 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:51 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm, Kirill A. Shutemov
This reverts commit 3267cb6d3a174ff83d6287dcd5b0047bbd912452.
LASS mitigates Spectre based on LAM (SLAM) [1], and the previous commit
made LAM depend on LASS, so LAM no longer needs to be disabled at
compile time. Revert the commit that disabled it.
Adjust USER_PTR_MAX if LAM is enabled, allowing tag bits to be set for
userspace pointers. The constant is defined in a way that avoids an
overflow compiler warning on a 32-bit config.
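For illustration, with 4K pages (values shown only as a sanity check,
not part of the patch):

	(-1UL >> 1)             == 0x7fffffffffffffff  /* on 64-bit */
	(-1UL >> 1) & PAGE_MASK == 0x7ffffffffffff000
	                        == (1UL << 63) - PAGE_SIZE

whereas spelling it as "(1ul << 63) - PAGE_SIZE" would trigger a
shift-count warning when the (dead) branch is compiled on 32-bit.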
[1] https://download.vusec.net/papers/slam_sp24.pdf
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/Kconfig | 1 -
arch/x86/kernel/cpu/common.c | 5 +----
2 files changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..2b48e916b754 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2181,7 +2181,6 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
config ADDRESS_MASKING
bool "Linear Address Masking support"
depends on X86_64
- depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS
help
Linear Address Masking (LAM) modifies the checking that is applied
to 64-bit linear addresses, allowing software to use of the
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 97a228f917a9..6f2ae9e702bc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2558,11 +2558,8 @@ void __init arch_cpu_finalize_init(void)
if (IS_ENABLED(CONFIG_X86_64)) {
unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
- /*
- * Enable this when LAM is gated on LASS support
if (cpu_feature_enabled(X86_FEATURE_LAM))
- USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
- */
+ USER_PTR_MAX = (-1UL >> 1) & PAGE_MASK;
runtime_const_init(ptr, USER_PTR_MAX);
/*
--
2.47.2
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall
2025-06-25 12:50 ` [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
@ 2025-06-25 12:55 ` Kirill A. Shutemov
0 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-25 12:55 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
Please ignore this patch. It was sent by mistake. The same patch is
included in the patchset in the right spot.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 00/16] x86: Enable Linear Address Space Separation support
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
` (16 preceding siblings ...)
2025-06-25 12:51 ` [PATCHv7 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
@ 2025-06-26 9:22 ` Vegard Nossum
2025-06-26 9:35 ` Vegard Nossum
17 siblings, 1 reply; 34+ messages in thread
From: Vegard Nossum @ 2025-06-26 9:22 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 25/06/2025 14:50, Kirill A. Shutemov wrote:
> Linear Address Space Separation (LASS) is a security feature that intends to
> prevent malicious virtual address space accesses across user/kernel mode.
I applied these patches on top of tip/master and when I try to boot it
fails with errno 12 (ENOMEM - Cannot allocate memory):
[ 1.517526] Kernel panic - not syncing: Requested init /bin/bash
failed (error -12).
Just using standard defconfig and booting in qemu/KVM with 2G RAM.
Bisect lands on "x86/asm: Introduce inline memcpy and memset".
Thanks,
Vegard
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 00/16] x86: Enable Linear Address Space Separation support
2025-06-26 9:22 ` [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Vegard Nossum
@ 2025-06-26 9:35 ` Vegard Nossum
2025-06-26 12:47 ` Kirill A. Shutemov
0 siblings, 1 reply; 34+ messages in thread
From: Vegard Nossum @ 2025-06-26 9:35 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 26/06/2025 11:22, Vegard Nossum wrote:
>
> On 25/06/2025 14:50, Kirill A. Shutemov wrote:
>> Linear Address Space Separation (LASS) is a security feature that intends
>> to prevent malicious virtual address space accesses across user/kernel mode.
>
> I applied these patches on top of tip/master and when I try to boot it
> fails with errno 12 (ENOMEM - Cannot allocate memory):
>
> [ 1.517526] Kernel panic - not syncing: Requested init /bin/bash
> failed (error -12).
>
> Just using standard defconfig and booting in qemu/KVM with 2G RAM.
>
> Bisect lands on "x86/asm: Introduce inline memcpy and memset".
I think the newly added mulq to rep_stos_alternative clobbers %rdx, at
least this patch fixed it for me:
diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
index 5cd0f18a431fe..bc096526432a1 100644
--- a/arch/x86/include/asm/string.h
+++ b/arch/x86/include/asm/string.h
@@ -28,7 +28,7 @@ static __always_inline void *__inline_memcpy(void *to, const void *from, size_t
"2:\n\t"
_ASM_EXTABLE_UA(1b, 2b)
:"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
- : : "memory", _ASM_AX);
+ : : "memory", _ASM_AX, _ASM_DX);
return ret + len;
}
@@ -44,7 +44,7 @@ static __always_inline void *__inline_memset(void *addr, int v, size_t len)
_ASM_EXTABLE_UA(1b, 2b)
: "+c" (len), "+D" (addr), ASM_CALL_CONSTRAINT
: "a" ((uint8_t)v)
- : "memory", _ASM_SI);
+ : "memory", _ASM_SI, _ASM_DX);
return ret + len;
}
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index ca94828def624..77cfd75718623 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -64,6 +64,7 @@ EXPORT_SYMBOL_GPL(clear_page_erms)
*
* Output:
* rcx: uncleared bytes or 0 if successful.
+ * rdx: clobbered
*/
SYM_FUNC_START(rep_stos_alternative)
ANNOTATE_NOENDBR
Thanks,
Vegard
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCHv7 00/16] x86: Enable Linear Address Space Separation support
2025-06-26 9:35 ` Vegard Nossum
@ 2025-06-26 12:47 ` Kirill A. Shutemov
2025-06-26 13:15 ` Vegard Nossum
2025-06-29 11:40 ` David Laight
0 siblings, 2 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-26 12:47 UTC (permalink / raw)
To: Vegard Nossum
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On Thu, Jun 26, 2025 at 11:35:21AM +0200, Vegard Nossum wrote:
>
> On 26/06/2025 11:22, Vegard Nossum wrote:
> >
> > On 25/06/2025 14:50, Kirill A. Shutemov wrote:
> > > Linear Address Space Separation (LASS) is a security feature that intends
> > > to prevent malicious virtual address space accesses across user/kernel mode.
> >
> > I applied these patches on top of tip/master and when I try to boot it
> > fails with errno 12 (ENOMEM - Cannot allocate memory):
> >
> > [ 1.517526] Kernel panic - not syncing: Requested init /bin/bash
> > failed (error -12).
For some reason, I failed to reproduce it. What is your toolchain?
> > Just using standard defconfig and booting in qemu/KVM with 2G RAM.
> >
> > Bisect lands on "x86/asm: Introduce inline memcpy and memset".
>
> I think the newly added mulq to rep_stos_alternative clobbers %rdx,
Yes, it makes sense.
> at least this patch fixed it for me:
>
> diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
> index 5cd0f18a431fe..bc096526432a1 100644
> --- a/arch/x86/include/asm/string.h
> +++ b/arch/x86/include/asm/string.h
> @@ -28,7 +28,7 @@ static __always_inline void *__inline_memcpy(void *to, const void *from, size_t
> "2:\n\t"
> _ASM_EXTABLE_UA(1b, 2b)
> :"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
> - : : "memory", _ASM_AX);
> + : : "memory", _ASM_AX, _ASM_DX);
>
> return ret + len;
> }
This part is not needed. rep_movs_alternative() doesn't touch RDX.
I will fold the patch below.
Or maybe some asm guru can suggest a better way to fix it without
clobbering RDX?
diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
index 5cd0f18a431f..b0a26a3f11e0 100644
--- a/arch/x86/include/asm/string.h
+++ b/arch/x86/include/asm/string.h
@@ -44,7 +44,7 @@ static __always_inline void *__inline_memset(void *addr, int v, size_t len)
_ASM_EXTABLE_UA(1b, 2b)
: "+c" (len), "+D" (addr), ASM_CALL_CONSTRAINT
: "a" ((uint8_t)v)
- : "memory", _ASM_SI);
+ : "memory", _ASM_SI, _ASM_DX);
return ret + len;
}
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index ca94828def62..d904c781fa3f 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -64,12 +64,15 @@ EXPORT_SYMBOL_GPL(clear_page_erms)
*
* Output:
* rcx: uncleared bytes or 0 if successful.
+ * rdx: clobbered
*/
SYM_FUNC_START(rep_stos_alternative)
ANNOTATE_NOENDBR
movzbq %al, %rsi
movabs $0x0101010101010101, %rax
+
+ /* %rdx:%rax = %rax * %rsi */
mulq %rsi
cmpq $64,%rcx
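For completeness, one untested way to avoid the clobber entirely would
be a truncating two-operand multiply, which leaves %rdx alone. The low
64 bits are exact here, since 0x0101010101010101 * 0xff ==
0xffffffffffffffff still fits in 64 bits:

	-	mulq %rsi
	+	imulq %rsi, %rax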
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCHv7 00/16] x86: Enable Linear Address Space Separation support
2025-06-26 12:47 ` Kirill A. Shutemov
@ 2025-06-26 13:15 ` Vegard Nossum
2025-06-29 11:40 ` David Laight
1 sibling, 0 replies; 34+ messages in thread
From: Vegard Nossum @ 2025-06-26 13:15 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 26/06/2025 14:47, Kirill A. Shutemov wrote:
> On Thu, Jun 26, 2025 at 11:35:21AM +0200, Vegard Nossum wrote:
>> On 26/06/2025 11:22, Vegard Nossum wrote:
>>> On 25/06/2025 14:50, Kirill A. Shutemov wrote:
>>>> Linear Address Space Separation (LASS) is a security feature that intends
>>>> to prevent malicious virtual address space accesses across user/kernel mode.
>>>
>>> I applied these patches on top of tip/master and when I try to boot it
>>> fails with errno 12 (ENOMEM - Cannot allocate memory):
>>>
>>> [ 1.517526] Kernel panic - not syncing: Requested init /bin/bash
>>> failed (error -12).
>
> For some reason, I failed to reproduce it. What is your toolchain?
$ gcc --version
gcc (GCC) 11.4.1 20230605 (Red Hat 11.4.1-2.1.0.1)
I tried to diff vmlinux with and without the clobber change and I see a
bunch of changed functions; the first one I looked at calls
put_user() -- I guess anything could be affected, really.
>> @@ -28,7 +28,7 @@ static __always_inline void *__inline_memcpy(void *to, const void *from, size_t
>> "2:\n\t"
>> _ASM_EXTABLE_UA(1b, 2b)
>> :"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
>> - : : "memory", _ASM_AX);
>> + : : "memory", _ASM_AX, _ASM_DX);
>>
>> return ret + len;
>> }
>
> This part is not needed. rep_movs_alternative() doesn't touch RDX.
True, I didn't look closely enough...
> I will fold the patch below.
Thanks,
Vegard
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-06-25 12:50 ` [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
@ 2025-06-26 13:49 ` Peter Zijlstra
2025-06-26 14:18 ` Dave Hansen
0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2025-06-26 13:49 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Wed, Jun 25, 2025 at 03:50:56PM +0300, Kirill A. Shutemov wrote:
> +/*
> + * The CLAC/STAC instructions toggle the enforcement of X86_FEATURE_SMAP and
> + * X86_FEATURE_LASS.
> + *
> + * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page tables: the
> + * kernel is not allowed to touch pages with the bit set unless the AC bit is
> + * set.
> + *
> + * LASS enforcement is based on bit 63 of the virtual address. The kernel is
> + * not allowed to touch memory in the lower half of the virtual address space
> + * unless the AC bit is set.
> + *
> + * Note: a barrier is implicit in alternative().
> + */
> +
> static __always_inline void clac(void)
> {
> - /* Note: a barrier is implicit in alternative() */
> alternative("", "clac", X86_FEATURE_SMAP);
> }
>
> static __always_inline void stac(void)
> {
> - /* Note: a barrier is implicit in alternative() */
> alternative("", "stac", X86_FEATURE_SMAP);
> }
>
> +static __always_inline void lass_enable_enforcement(void)
> +{
> + alternative("", "clac", X86_FEATURE_LASS);
> +}
> +
> +static __always_inline void lass_disable_enforcement(void)
> +{
> + alternative("", "stac", X86_FEATURE_LASS);
> +}
Much hate for this naming. WTH was wrong with lass_{clac,stac}()?
We're not calling those other functions smap_{en,dis}able_enforcement()
either (and please don't take that as a suggestion, it's terrible
naming).
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-06-26 13:49 ` Peter Zijlstra
@ 2025-06-26 14:18 ` Dave Hansen
2025-06-27 10:27 ` Kirill A. Shutemov
0 siblings, 1 reply; 34+ messages in thread
From: Dave Hansen @ 2025-06-26 14:18 UTC (permalink / raw)
To: Peter Zijlstra, Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On 6/26/25 06:49, Peter Zijlstra wrote:
>> +static __always_inline void lass_enable_enforcement(void)
>> +{
>> + alternative("", "clac", X86_FEATURE_LASS);
>> +}
>> +
>> +static __always_inline void lass_disable_enforcement(void)
>> +{
>> + alternative("", "stac", X86_FEATURE_LASS);
>> +}
> Much hate for this naming. WTH was wrong with lass_{clac,stac}()?
>
> We're not calling those other functions smap_{en,dis}able_enforcement()
> either (and please don't take that as a suggestion, it's terrible
> naming).
It was a response to a comment from Sohil about the delta between
lass_{cl,st}ac() and plain {cl,st}ac() being subtle. They are subtle,
but I don't think it's fixable with naming.
There are lots of crazy gymnastics we could do. But there are so few
sites where AC is twiddled for LASS that I don't think it's worth it.
Let's just use the lass_{cl,st}ac() and comment both variants. First,
the existing stac()/clac():
/*
* Use these when accessing userspace (_PAGE_USER)
* mappings, regardless of location.
*/
and the new ones:
/*
* Use these when accessing kernel mappings (!_PAGE_USER)
* in the lower half of the address space.
*/
Any objections to doing that?
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits
2025-06-25 12:50 ` [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
@ 2025-06-26 15:22 ` Borislav Petkov
2025-06-26 18:00 ` Xin Li
1 sibling, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2025-06-26 15:22 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Dave Hansen, x86,
H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel, Paul E. McKenney,
Josh Poimboeuf, Xiongwei Song, Xin Li, Mike Rapoport (IBM),
Brijesh Singh, Michael Roth, Tony Luck, Alexey Kardashevskiy,
Alexander Shishkin, Jonathan Corbet, Sohil Mehta, Ingo Molnar,
Pawan Gupta, Daniel Sneddon, Kai Huang, Sandipan Das,
Breno Leitao, Rick Edgecombe, Alexei Starovoitov, Hou Tao,
Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Wed, Jun 25, 2025 at 03:50:53PM +0300, Kirill A. Shutemov wrote:
> From: Sohil Mehta <sohil.mehta@intel.com>
>
> Linear Address Space Separation (LASS) is a security feature that
> intends to prevent malicious virtual address space accesses across
> user/kernel mode.
>
> Such mode-based access protection already exists today with paging and
> features such as SMEP and SMAP. However, to enforce these protections,
> the processor must traverse the paging structures in memory. Malicious
> software can use timing information resulting from this traversal to
> determine details about the paging structures, and these details may
> also be used to determine the layout of the kernel memory.
>
> The LASS mechanism provides the same mode-based protections as paging
> but without traversing the paging structures. Because the protections
> enforced by LASS are applied before paging, software will not be able to
> derive paging-based timing information from the various caching
> structures such as the TLBs, mid-level caches, page walker, data caches,
> etc.
>
> LASS enforcement relies on the typical kernel implementation to divide
> the 64-bit virtual address space into two halves:
> Addr[63]=0 -> User address space
> Addr[63]=1 -> Kernel address space
>
> Any data access or code execution across address spaces typically
> results in a #GP fault.
>
> The LASS enforcement for kernel data access is dependent on CR4.SMAP
> being set. The enforcement can be disabled by toggling the RFLAGS.AC bit
> similar to SMAP.
>
> Define the CPU feature bits to enumerate this feature and include
> feature dependencies to reflect the same.
>
> LASS provides protection against a class of speculative attacks, such as
> SLAM[1]. Add the "lass" flag to /proc/cpuinfo to indicate that the feature
> is supported by hardware and enabled by the kernel. This allows userspace
> to determine if the setup is secure against such attacks.
>
> [1] https://download.vusec.net/papers/slam_sp24.pdf
>
> Co-developed-by: Yian Chen <yian.chen@intel.com>
> Signed-off-by: Yian Chen <yian.chen@intel.com>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/Kconfig.cpufeatures | 4 ++++
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/uapi/asm/processor-flags.h | 2 ++
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> tools/arch/x86/include/asm/cpufeatures.h | 1 +
> 5 files changed, 9 insertions(+)
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS
2025-06-25 12:51 ` [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
@ 2025-06-26 17:57 ` Xin Li
2025-06-27 10:31 ` Kirill A. Shutemov
0 siblings, 1 reply; 34+ messages in thread
From: Xin Li @ 2025-06-26 17:57 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 6/25/2025 5:51 AM, Kirill A. Shutemov wrote:
> LASS throws a #GP for any violations except for stack register accesses,
> in which case it throws a #SS instead. Handle this similarly to how other
> LASS violations are handled.
>
> In the case of FRED, before handling #SS as a LASS violation, the kernel
> has to check if there's a fixup for the exception. It can address a #SS due
> to an invalid user context on ERETU [1]. See 5105e7687ad3 ("x86/fred: Fixup
Forgot to put the link to [1]? Maybe just remove "[1]"?
> fault on ERETU by jumping to fred_entrypoint_user") for more details.
>
> Co-developed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/kernel/traps.c | 39 +++++++++++++++++++++++++++++++++------
> 1 file changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index e2ad760b17ea..f1f92e1ba524 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -418,12 +418,6 @@ DEFINE_IDTENTRY_ERRORCODE(exc_segment_not_present)
> SIGBUS, 0, NULL);
> }
>
> -DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
> -{
> - do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
> - 0, NULL);
> -}
> -
> DEFINE_IDTENTRY_ERRORCODE(exc_alignment_check)
> {
> char *str = "alignment check";
> @@ -866,6 +860,39 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
> cond_local_irq_disable(regs);
> }
>
> +#define SSFSTR "stack segment fault"
> +
> +DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
> +{
> + if (user_mode(regs))
> + goto error_trap;
> +
> + if (cpu_feature_enabled(X86_FEATURE_FRED) &&
> + fixup_exception(regs, X86_TRAP_SS, error_code, 0))
> + return;
> +
Thanks for making the change for FRED.
> + if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> + enum kernel_gp_hint hint;
> + unsigned long gp_addr;
> +
> + hint = get_kernel_gp_address(regs, &gp_addr);
> + if (hint != GP_NO_HINT) {
> + printk(SSFSTR ", %s 0x%lx", kernel_gp_hint_help[hint],
> + gp_addr);
> + }
> +
> + if (hint != GP_NON_CANONICAL)
> + gp_addr = 0;
Nit: GP/gp don't seem to fit here; maybe we need a more generic name?
Sorry, I don't have a recommendation.
> +
> + die_addr(SSFSTR, regs, error_code, gp_addr);
> + return;
> + }
> +
> +error_trap:
> + do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
> + 0, NULL);
The indentation has changed; I believe the original formatting is
preferable.
> +}
> +
> static bool do_int3(struct pt_regs *regs)
> {
> int res;
Just minor comments, so
Reviewed-by: Xin Li (Intel) <xin@zytor.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits
2025-06-25 12:50 ` [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
2025-06-26 15:22 ` Borislav Petkov
@ 2025-06-26 18:00 ` Xin Li
1 sibling, 0 replies; 34+ messages in thread
From: Xin Li @ 2025-06-26 18:00 UTC (permalink / raw)
To: Kirill A. Shutemov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin
Cc: Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On 6/25/2025 5:50 AM, Kirill A. Shutemov wrote:
> From: Sohil Mehta <sohil.mehta@intel.com>
>
> Linear Address Space Separation (LASS) is a security feature that
> intends to prevent malicious virtual address space accesses across
> user/kernel mode.
>
> Such mode-based access protection already exists today with paging and
> features such as SMEP and SMAP. However, to enforce these protections,
> the processor must traverse the paging structures in memory. Malicious
> software can use timing information resulting from this traversal to
> determine details about the paging structures, and these details may
> also be used to determine the layout of the kernel memory.
>
> The LASS mechanism provides the same mode-based protections as paging
> but without traversing the paging structures. Because the protections
> enforced by LASS are applied before paging, software will not be able to
> derive paging-based timing information from the various caching
> structures such as the TLBs, mid-level caches, page walker, data caches,
> etc.
>
> LASS enforcement relies on the typical kernel implementation to divide
> the 64-bit virtual address space into two halves:
> Addr[63]=0 -> User address space
> Addr[63]=1 -> Kernel address space
>
> Any data access or code execution across address spaces typically
> results in a #GP fault.
>
> The LASS enforcement for kernel data access is dependent on CR4.SMAP
> being set. The enforcement can be disabled by toggling the RFLAGS.AC bit
> similar to SMAP.
>
> Define the CPU feature bits to enumerate this feature and include
> feature dependencies to reflect the same.
>
> LASS provides protection against a class of speculative attacks, such as
> SLAM[1]. Add the "lass" flag to /proc/cpuinfo to indicate that the feature
> is supported by hardware and enabled by the kernel. This allows userspace
> to determine if the setup is secure against such attacks.
>
> [1] https://download.vusec.net/papers/slam_sp24.pdf
>
> Co-developed-by: Yian Chen <yian.chen@intel.com>
> Signed-off-by: Yian Chen <yian.chen@intel.com>
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> arch/x86/Kconfig.cpufeatures | 4 ++++
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/uapi/asm/processor-flags.h | 2 ++
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> tools/arch/x86/include/asm/cpufeatures.h | 1 +
> 5 files changed, 9 insertions(+)
>
> diff --git a/arch/x86/Kconfig.cpufeatures b/arch/x86/Kconfig.cpufeatures
> index 250c10627ab3..733d5aff2456 100644
> --- a/arch/x86/Kconfig.cpufeatures
> +++ b/arch/x86/Kconfig.cpufeatures
> @@ -124,6 +124,10 @@ config X86_DISABLED_FEATURE_PCID
> def_bool y
> depends on !X86_64
>
> +config X86_DISABLED_FEATURE_LASS
> + def_bool y
> + depends on X86_32
> +
> config X86_DISABLED_FEATURE_PKU
> def_bool y
> depends on !X86_INTEL_MEMORY_PROTECTION_KEYS
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index b78af55aa22e..8eef1ad7aca2 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -313,6 +313,7 @@
> #define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */
> #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
> #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
> +#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */
> #define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */
> #define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */
> #define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
> diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
> index f1a4adc78272..81d0c8bf1137 100644
> --- a/arch/x86/include/uapi/asm/processor-flags.h
> +++ b/arch/x86/include/uapi/asm/processor-flags.h
> @@ -136,6 +136,8 @@
> #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT)
> #define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */
> #define X86_CR4_CET _BITUL(X86_CR4_CET_BIT)
> +#define X86_CR4_LASS_BIT 27 /* enable Linear Address Space Separation support */
> +#define X86_CR4_LASS _BITUL(X86_CR4_LASS_BIT)
> #define X86_CR4_LAM_SUP_BIT 28 /* LAM for supervisor pointers */
> #define X86_CR4_LAM_SUP _BITUL(X86_CR4_LAM_SUP_BIT)
>
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index 46efcbd6afa4..98d0cdd82574 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -89,6 +89,7 @@ static const struct cpuid_dep cpuid_deps[] = {
> { X86_FEATURE_SHSTK, X86_FEATURE_XSAVES },
> { X86_FEATURE_FRED, X86_FEATURE_LKGS },
> { X86_FEATURE_SPEC_CTRL_SSBD, X86_FEATURE_SPEC_CTRL },
> + { X86_FEATURE_LASS, X86_FEATURE_SMAP },
> {}
> };
>
> diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
> index ee176236c2be..4473a6f7800b 100644
> --- a/tools/arch/x86/include/asm/cpufeatures.h
> +++ b/tools/arch/x86/include/asm/cpufeatures.h
> @@ -313,6 +313,7 @@
> #define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */
> #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */
> #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */
> +#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */
> #define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */
> #define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */
> #define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */
Reviewed-by: Xin Li (Intel) <xin@zytor.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives
2025-06-26 14:18 ` Dave Hansen
@ 2025-06-27 10:27 ` Kirill A. Shutemov
0 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-27 10:27 UTC (permalink / raw)
To: Dave Hansen
Cc: Peter Zijlstra, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Thu, Jun 26, 2025 at 07:18:59AM -0700, Dave Hansen wrote:
> On 6/26/25 06:49, Peter Zijlstra wrote:
> >> +static __always_inline void lass_enable_enforcement(void)
> >> +{
> >> + alternative("", "clac", X86_FEATURE_LASS);
> >> +}
> >> +
> >> +static __always_inline void lass_disable_enforcement(void)
> >> +{
> >> + alternative("", "stac", X86_FEATURE_LASS);
> >> +}
> > Much hate for this naming. WTH was wrong with lass_{clac,stac}()?
> >
> > We're not calling those other functions smap_{en,dis}able_enforcement()
> > either (and please don't take that as a suggestion, it's terrible
> > naming).
>
> It was a response to a comment from Sohil about the delta between
> lass_{cl,st}ac() and plain {cl,st}ac() being subtle. They are subtle,
> but I don't think it's fixable with naming.
>
> There are lots of crazy gymnastics we could do. But there are so few
> sites where AC is twiddled for LASS that I don't think it's worth it.
>
> Let's just use the lass_{cl,st}ac() and comment both variants. First,
> the existing stac()/clac():
>
> /*
> * Use these when accessing userspace (_PAGE_USER)
> * mappings, regardless of location.
> */
>
> and the new ones:
>
> /*
> * Use these when accessing kernel mappings (!_PAGE_USER)
> * in the lower half of the address space.
> */
>
> Any objections to doing that?
Looks good. Will update the patch.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS
2025-06-26 17:57 ` Xin Li
@ 2025-06-27 10:31 ` Kirill A. Shutemov
2025-06-30 8:30 ` David Laight
0 siblings, 1 reply; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-27 10:31 UTC (permalink / raw)
To: Xin Li
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra, Ard Biesheuvel,
Paul E. McKenney, Josh Poimboeuf, Xiongwei Song, Xin Li,
Mike Rapoport (IBM), Brijesh Singh, Michael Roth, Tony Luck,
Alexey Kardashevskiy, Alexander Shishkin, Jonathan Corbet,
Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon, Kai Huang,
Sandipan Das, Breno Leitao, Rick Edgecombe, Alexei Starovoitov,
Hou Tao, Juergen Gross, Vegard Nossum, Kees Cook, Eric Biggers,
Jason Gunthorpe, Masami Hiramatsu (Google), Andrew Morton,
Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven,
Namhyung Kim, Arnaldo Carvalho de Melo, linux-doc, linux-kernel,
linux-efi, linux-mm
On Thu, Jun 26, 2025 at 10:57:47AM -0700, Xin Li wrote:
> On 6/25/2025 5:51 AM, Kirill A. Shutemov wrote:
> > LASS throws a #GP for any violations except for stack register accesses,
> > in which case it throws a #SS instead. Handle this similarly to how other
> > LASS violations are handled.
> >
> > In the case of FRED, before handling #SS as a LASS violation, the kernel
> > has to check if there's a fixup for the exception. The fixup can address
> > #SS due to an invalid user context on ERETU[1]. See 5105e7687ad3 ("x86/fred: Fixup
>
> Forgot to put the link to [1]? Maybe just remove "[1]"?
I will add the link. It is important context.
> > fault on ERETU by jumping to fred_entrypoint_user") for more details.
> >
> > Co-developed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> > arch/x86/kernel/traps.c | 39 +++++++++++++++++++++++++++++++++------
> > 1 file changed, 33 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index e2ad760b17ea..f1f92e1ba524 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -418,12 +418,6 @@ DEFINE_IDTENTRY_ERRORCODE(exc_segment_not_present)
> > SIGBUS, 0, NULL);
> > }
> > -DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
> > -{
> > - do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
> > - 0, NULL);
> > -}
> > -
> > DEFINE_IDTENTRY_ERRORCODE(exc_alignment_check)
> > {
> > char *str = "alignment check";
> > @@ -866,6 +860,39 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
> > cond_local_irq_disable(regs);
> > }
> > +#define SSFSTR "stack segment fault"
> > +
> > +DEFINE_IDTENTRY_ERRORCODE(exc_stack_segment)
> > +{
> > + if (user_mode(regs))
> > + goto error_trap;
> > +
> > + if (cpu_feature_enabled(X86_FEATURE_FRED) &&
> > + fixup_exception(regs, X86_TRAP_SS, error_code, 0))
> > + return;
> > +
>
> Thanks for making the change for FRED.
>
> > + if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> > + enum kernel_gp_hint hint;
> > + unsigned long gp_addr;
> > +
> > + hint = get_kernel_gp_address(regs, &gp_addr);
> > + if (hint != GP_NO_HINT) {
> > + printk(SSFSTR ", %s 0x%lx", kernel_gp_hint_help[hint],
> > + gp_addr);
> > + }
> > +
> > + if (hint != GP_NON_CANONICAL)
> > + gp_addr = 0;
>
> Nit: GP/gp don't seem fit here, maybe we need a more generic name?
>
> Sorry I don't have a recommendation.
Naming is hard.
Maybe get_kernel_exc_address()/kernel_exc_hint_help/EXC_NO_HINT/... ?
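As a rough sketch, that rename would mirror the existing GP-prefixed
declarations in traps.c (hypothetical names, following the suggestion
above; same logic, exception-neutral spelling):

	/* Generic replacements for the GP-specific hint helpers: */
	enum kernel_exc_hint {
		EXC_NO_HINT,
		EXC_NON_CANONICAL,
		EXC_CANONICAL,
	};

	/* Same logic as get_kernel_gp_address(), under a generic name: */
	static enum kernel_exc_hint get_kernel_exc_address(struct pt_regs *regs,
							   unsigned long *addr);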
> > +
> > + die_addr(SSFSTR, regs, error_code, gp_addr);
> > + return;
> > + }
> > +
> > +error_trap:
> > + do_error_trap(regs, error_code, "stack segment", X86_TRAP_SS, SIGBUS,
> > + 0, NULL);
>
> The indentation has changed; I believe the original formatting is
> preferable.
>
> > +}
> > +
> > static bool do_int3(struct pt_regs *regs)
> > {
> > int res;
>
> Just minor comments, so
>
> Reviewed-by: Xin Li (Intel) <xin@zytor.com>
Thanks.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 00/16] x86: Enable Linear Address Space Separation support
2025-06-26 12:47 ` Kirill A. Shutemov
2025-06-26 13:15 ` Vegard Nossum
@ 2025-06-29 11:40 ` David Laight
1 sibling, 0 replies; 34+ messages in thread
From: David Laight @ 2025-06-29 11:40 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Vegard Nossum, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin,
Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On Thu, 26 Jun 2025 15:47:36 +0300
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> On Thu, Jun 26, 2025 at 11:35:21AM +0200, Vegard Nossum wrote:
> >
> > On 26/06/2025 11:22, Vegard Nossum wrote:
> > >
> > > On 25/06/2025 14:50, Kirill A. Shutemov wrote:
> > > > Linear Address Space Separation (LASS) is a security feature that
> > > > intends to
> > > > prevent malicious virtual address space accesses across user/kernel mode.
> > >
> > > I applied these patches on top of tip/master and when I try to boot it
> > > fails with errno 12 (ENOMEM - Cannot allocate memory):
> > >
> > > [ 1.517526] Kernel panic - not syncing: Requested init /bin/bash
> > > failed (error -12).
>
> For some reason, I failed to reproduce it. What is your toolchain?
>
> > > Just using standard defconfig and booting in qemu/KVM with 2G RAM.
> > >
> > > Bisect lands on "x86/asm: Introduce inline memcpy and memset".
> >
> > I think the newly added mulq to rep_stos_alternative clobbers %rdx,
>
> Yes, it makes sense.
>
> > at least this patch fixed it for me:
> >
> > diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
> > index 5cd0f18a431fe..bc096526432a1 100644
> > --- a/arch/x86/include/asm/string.h
> > +++ b/arch/x86/include/asm/string.h
> > @@ -28,7 +28,7 @@ static __always_inline void *__inline_memcpy(void *to, const void *from, size_t
> > "2:\n\t"
> > _ASM_EXTABLE_UA(1b, 2b)
> > :"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
> > - : : "memory", _ASM_AX);
> > + : : "memory", _ASM_AX, _ASM_DX);
> >
> > return ret + len;
> > }
>
> This part is not needed. rep_movs_alternative() doesn't touch RDX.
>
> I will fold the patch below.
>
> Or maybe some asm guru can suggest a better way to fix it without
> clobbering RDX?
Or separate out the code where the value is a compile-time zero.
That is pretty much 99% of the calls.
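(mulq writes its 128-bit product to RDX:RAX, which is why the byte-broadcast
multiply silently clobbers RDX in the first place.) A minimal sketch of that
split, with hypothetical helper names standing in for the real
rep_stos_alternative() plumbing:

	/* Hypothetical helpers (sketch only): */
	void *__inline_memset_zero(void *s, size_t len);	 /* plain rep stos, no mulq */
	void *__inline_memset_value(void *s, int v, size_t len);/* mulq broadcast of v */

	static __always_inline void *__inline_memset(void *s, int v, size_t len)
	{
		/* Compile-time-zero fills never need the RDX-clobbering mulq: */
		if (__builtin_constant_p(v) && v == 0)
			return __inline_memset_zero(s, len);
		return __inline_memset_value(s, v, len);
	}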
David
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS
2025-06-27 10:31 ` Kirill A. Shutemov
@ 2025-06-30 8:30 ` David Laight
2025-06-30 9:50 ` Kirill A. Shutemov
0 siblings, 1 reply; 34+ messages in thread
From: David Laight @ 2025-06-30 8:30 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Xin Li, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin,
Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On Fri, 27 Jun 2025 13:31:44 +0300
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> On Thu, Jun 26, 2025 at 10:57:47AM -0700, Xin Li wrote:
> > On 6/25/2025 5:51 AM, Kirill A. Shutemov wrote:
> > > LASS throws a #GP for any violations except for stack register accesses,
> > > in which case it throws a #SS instead. Handle this similarly to how other
> > > LASS violations are handled.
> > >
> > > In case of FRED, before handling #SS as LASS violation, kernel has to
> > > check if there's a fixup for the exception. It can address #SS due to
> > > invalid user context on ERETU[1]. See 5105e7687ad3 ("x86/fred: Fixup
> >
> > Forgot to put the link to [1]? Maybe just remove "[1]"?
>
> I will add the link. It is important context.
Will the link still be valid in 5 years' time when someone
is looking back at the changes?
David
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS
2025-06-30 8:30 ` David Laight
@ 2025-06-30 9:50 ` Kirill A. Shutemov
0 siblings, 0 replies; 34+ messages in thread
From: Kirill A. Shutemov @ 2025-06-30 9:50 UTC (permalink / raw)
To: David Laight
Cc: Xin Li, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
Ard Biesheuvel, Paul E. McKenney, Josh Poimboeuf, Xiongwei Song,
Xin Li, Mike Rapoport (IBM), Brijesh Singh, Michael Roth,
Tony Luck, Alexey Kardashevskiy, Alexander Shishkin,
Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta,
Daniel Sneddon, Kai Huang, Sandipan Das, Breno Leitao,
Rick Edgecombe, Alexei Starovoitov, Hou Tao, Juergen Gross,
Vegard Nossum, Kees Cook, Eric Biggers, Jason Gunthorpe,
Masami Hiramatsu (Google), Andrew Morton, Luis Chamberlain,
Yuntao Wang, Rasmus Villemoes, Christophe Leroy, Tejun Heo,
Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
Arnaldo Carvalho de Melo, linux-doc, linux-kernel, linux-efi,
linux-mm
On Mon, Jun 30, 2025 at 09:30:27AM +0100, David Laight wrote:
> On Fri, 27 Jun 2025 13:31:44 +0300
> "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
>
> > On Thu, Jun 26, 2025 at 10:57:47AM -0700, Xin Li wrote:
> > > On 6/25/2025 5:51 AM, Kirill A. Shutemov wrote:
> > > > LASS throws a #GP for any violations except for stack register accesses,
> > > > in which case it throws a #SS instead. Handle this similarly to how other
> > > > LASS violations are handled.
> > > >
> > > > In the case of FRED, before handling #SS as a LASS violation, the kernel
> > > > has to check if there's a fixup for the exception. The fixup can address
> > > > #SS due to an invalid user context on ERETU[1]. See 5105e7687ad3 ("x86/fred: Fixup
> > >
> > > Forgot to put the link to [1]? Maybe just remove "[1]"?
> >
> > I will add the link. It is important context.
>
> Will the link still be valid in 5 years time when someone
> is looking back at the changes?
Re-reading the commit message I wrote, it is obvious that I reconsidered
putting in the link and referenced the commit instead.
I will drop [1].
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 34+ messages in thread
Thread overview: 34+ messages
2025-06-25 12:50 [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 01/16] x86/cpu: Enumerate the LASS feature bits Kirill A. Shutemov
2025-06-26 15:22 ` Borislav Petkov
2025-06-26 18:00 ` Xin Li
2025-06-25 12:50 ` [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
2025-06-25 12:55 ` Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 02/16] x86/asm: Introduce inline memcpy and memset Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 03/16] x86/alternatives: Disable LASS when patching kernel alternatives Kirill A. Shutemov
2025-06-26 13:49 ` Peter Zijlstra
2025-06-26 14:18 ` Dave Hansen
2025-06-27 10:27 ` Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 04/16] x86/cpu: Defer CR pinning setup until after EFI initialization Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 05/16] efi: Disable LASS around set_virtual_address_map() EFI call Kirill A. Shutemov
2025-06-25 12:50 ` [PATCHv7 06/16] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 07/16] x86/vsyscall: Reorganize the #PF emulation code Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 08/16] x86/traps: Consolidate user fixups in exc_general_protection() Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 09/16] x86/vsyscall: Add vsyscall emulation for #GP Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 10/16] x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 11/16] x86/cpu: Set LASS CR4 bit as pinning sensitive Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 12/16] x86/traps: Communicate a LASS violation in #GP message Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 13/16] x86/traps: Handle LASS thrown #SS Kirill A. Shutemov
2025-06-26 17:57 ` Xin Li
2025-06-27 10:31 ` Kirill A. Shutemov
2025-06-30 8:30 ` David Laight
2025-06-30 9:50 ` Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 14/16] x86/cpu: Make LAM depend on LASS Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 15/16] x86/cpu: Enable LASS during CPU initialization Kirill A. Shutemov
2025-06-25 12:51 ` [PATCHv7 16/16] x86: Re-enable Linear Address Masking Kirill A. Shutemov
2025-06-26 9:22 ` [PATCHv7 00/16] x86: Enable Linear Address Space Separation support Vegard Nossum
2025-06-26 9:35 ` Vegard Nossum
2025-06-26 12:47 ` Kirill A. Shutemov
2025-06-26 13:15 ` Vegard Nossum
2025-06-29 11:40 ` David Laight
-- strict thread matches above, loose matches on Subject: below --
2025-06-24 14:11 [PATCHv6 07/16] x86/vsyscall: Reorganize the #PF emulation code Dave Hansen
2025-06-24 14:59 ` [PATCH] x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall Kirill A. Shutemov