public inbox for linux-kernel@vger.kernel.org
* [PATCH v3] selftests/x86: Fix sysret_rip assertion failure on FRED systems
From: Yi Lai @ 2026-03-26  9:44 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andrew Cooper, Xin Li, x86, hpa, Shuah Khan, linux-kernel,
	linux-kselftest, yi1.lai, yi1.lai

The existing 'sysret_rip' selftest asserts that 'regs->r11 ==
regs->flags'. This check relies on the behavior of the SYSCALL
instruction on legacy x86_64, which saves 'RFLAGS' into 'R11'.

However, on systems with FRED (Flexible Return and Event Delivery)
enabled, instead of using registers, all state is saved onto the stack.
Consequently, 'R11' retains its userspace value, causing the assertion
to fail.

Fix this by detecting if FRED is enabled and skipping the register
assertion in that case. The detection is done by checking if the RPL
bits of the GS selector are preserved after a hardware exception.
IDT (via IRET) clears the RPL bits of NULL selectors, while FRED (via
ERETU) preserves them.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Yi Lai <yi1.lai@intel.com>
---
v3:
- Move is_fred_enabled() to helpers.h for other x86 selftests to use.
  Rename empty_handler to fred_handler to avoid symbol conflicts.

v2:
- Replaced CPUID check with a runtime probe using INT3 and GS RPL
  preservation to robustly detect active FRED usage (Suggested by
  Andrew Cooper).

 tools/testing/selftests/x86/helpers.h    | 34 ++++++++++++++++++++++++
 tools/testing/selftests/x86/sysret_rip.c | 12 ++++++---
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/x86/helpers.h b/tools/testing/selftests/x86/helpers.h
index 4c747a1278d9..4d09ed97aaac 100644
--- a/tools/testing/selftests/x86/helpers.h
+++ b/tools/testing/selftests/x86/helpers.h
@@ -4,6 +4,7 @@
 
 #include <signal.h>
 #include <string.h>
+#include <stdbool.h>
 
 #include <asm/processor-flags.h>
 
@@ -50,4 +51,37 @@ static inline void clearhandler(int sig)
 		ksft_exit_fail_msg("sigaction failed");
 }
 
+static inline void fred_handler(int sig, siginfo_t *info, void *ctx_void)
+{
+}
+
+static inline bool is_fred_enabled(void)
+{
+	unsigned short gs_val;
+
+	sethandler(SIGTRAP, fred_handler, 0);
+
+	/*
+	 * Distinguish IDT and FRED mode by loading GS with a non-zero RPL and
+	 * triggering an exception:
+	 * IDT (IRET) clears RPL bits of NULL selectors.
+	 * FRED (ERETU) preserves them.
+	 *
+	 * If GS is loaded with 3 (Index=0, RPL=3), trigger an exception:
+	 * IDT should restore GS as 0.
+	 * FRED should preserve GS as 3.
+	 */
+	asm volatile (
+		"mov %[rpl3], %%gs\n\t"
+		"int3\n\t"
+		"mov %%gs, %[res]"
+		: [res] "=r" (gs_val)
+		: [rpl3] "r" (3)
+	);
+
+	clearhandler(SIGTRAP);
+
+	return gs_val == 3;
+}
+
 #endif /* __SELFTESTS_X86_HELPERS_H */
diff --git a/tools/testing/selftests/x86/sysret_rip.c b/tools/testing/selftests/x86/sysret_rip.c
index 2e423a335e1c..30b195266779 100644
--- a/tools/testing/selftests/x86/sysret_rip.c
+++ b/tools/testing/selftests/x86/sysret_rip.c
@@ -64,9 +64,15 @@ static void sigusr1(int sig, siginfo_t *info, void *ctx_void)
 	ctx->uc_mcontext.gregs[REG_RIP] = rip;
 	ctx->uc_mcontext.gregs[REG_RCX] = rip;
 
-	/* R11 and EFLAGS should already match. */
-	assert(ctx->uc_mcontext.gregs[REG_EFL] ==
-	       ctx->uc_mcontext.gregs[REG_R11]);
+	/*
+	 * SYSCALL works differently on FRED: it does not save RIP and RFLAGS
+	 * to RCX and R11.
+	 */
+	if (!is_fred_enabled()) {
+		/* R11 and EFLAGS should already match. */
+		assert(ctx->uc_mcontext.gregs[REG_EFL] ==
+		       ctx->uc_mcontext.gregs[REG_R11]);
+	}
 
 	sethandler(SIGSEGV, sigsegv_for_sigreturn_test, SA_RESETHAND);
 }
-- 
2.43.0


* Re: [PATCH v3] selftests/x86: Fix sysret_rip assertion failure on FRED systems
From: Xin Li @ 2026-04-01 14:59 UTC
  To: Anvin H. Peter
  Cc: Yi Lai, Zijlstra Peter, Lutomirski Andy, Thomas Gleixner,
	Molnar Ingo, Petkov Borislav, Hansen Dave, Cooper Andrew,
	x86 maintainers the arch, Khan Shuah, Kernel Mailing List Linux,
	linux-kselftest, yi1.lai


>>>>>>> The existing 'sysret_rip' selftest asserts that 'regs->r11 ==
>>>>>>> regs->flags'. This check relies on the behavior of the SYSCALL
>>>>>>> instruction on legacy x86_64, which saves 'RFLAGS' into 'R11'.
>>>>>>> However, on systems with FRED (Flexible Return and Event Delivery)
>>>>>>> enabled, instead of using registers, all state is saved onto the stack.
>>>>>>> Consequently, 'R11' retains its userspace value, causing the assertion
>>>>>>> to fail.
>>>>>>> Fix this by detecting if FRED is enabled and skipping the register
>>>>>>> assertion in that case. The detection is done by checking if the RPL
>>>>>>> bits of the GS selector are preserved after a hardware exception.
>>>>>>> IDT (via IRET) clears the RPL bits of NULL selectors, while FRED (via
>>>>>>> ERETU) preserves them.
>>>>>> I don't really like this.  I think we have two credible choices:
>>>>>> 1. Define the Linux ABI to be that, on FRED systems, SYSCALL preserves
>>>>>> R11 and RCX on entry and exit.  And update the test to actually test
>>>>>> this.
>>>>>> 2. Define the Linux ABI to be what it has been for quite a few years:
>>>>>> SYSCALL entry copies RFLAGS to R11 and RIP to RCX and SYSCALL exit
>>>>>> preserves all registers.
>>>>>> I'm in favor of #2.  People love making new programming languages and
>>>>>> runtimes and inline asm and, these days, vibe coded crap.  And it's
>>>>>> *easier* to emit a SYSCALL and forget to tell the compiler / code
>>>>>> generator that RCX and R11 are clobbered than it is to remember that
>>>>>> they're clobbered.  And it's easy to test on FRED (well, not really,
>>>>>> but it hopefully will be some day) and it's easy to publish one's
>>>>>> code, and then everyone is a bit screwed when the resulting program
>>>>>> crashes sometimes on non-FRED systems.  And it will be miserable to
>>>>>> debug.
>>>>>> (It's *really* *really* easy to screw this up in a way that sort of
>>>>>> works even on non-FRED: RCX and R11 are usually clobbered across
>>>>>> function calls, so one can get into a situation in which one's
>>>>>> generated code usually doesn't require that SYSCALL preserve one of
>>>>>> these registers until an inlining decision changes or some code gets
>>>>>> reordered, and then it will start failing.  And making the failure
>>>>>> depend on hardware details is just nasty.)
>>>>>> So I think we should add the ~2 lines of code to fix the SYSCALL entry
>>>>>> on FRED to match non-FRED.
>>>>> Yes; I'm afraid I have to concur. Preserving the clobber on entry for
>>>>> FRED systems is by far the safest choice.
>>>>> Aside from this selftest, fancy debuggers and anything that can transfer
>>>>> userspace state between machines might be 'surprised'.
>>>> Thanks Andy and Peter.
>>>> Indeed, making the selftest branch on FRED vs. non-FRED behavior
>>>> is not a good practice. The selftest should validate ABI consistency.
>>>> I agree with Andy's option #2, so this should be fixed in the FRED
>>>> syscall entry implementation.
>>>> Li Xin, does this direction look right to you? I can assist with
>>>> validation and keep the selftest aligned with the agreed ABI.
>>> Yes, consistency should take precedence over hardware-specific variations.
>>> I would like to hear from Andrew Cooper and hpa before we do it.
>> Per Andy’s suggestion, the change would be:
>> diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
>> index 88c757ac8ccd..a19898747a2c 100644
>> --- a/arch/x86/entry/entry_fred.c
>> +++ b/arch/x86/entry/entry_fred.c
>> @@ -79,6 +79,9 @@ static __always_inline void fred_other(struct pt_regs *regs)
>>  {
>>  	/* The compiler can fold these conditions into a single test */
>>  	if (likely(regs->fred_ss.vector == FRED_SYSCALL && regs->fred_ss.l)) {
>> +		regs->cx = regs->ip;
>> +		regs->r11 = regs->flags;
>> +
>>  		regs->orig_ax = regs->ax;
>>  		regs->ax = -ENOSYS;
>>  		do_syscall_64(regs, regs->orig_ax);
>> It adds 4 extra MOVs on this hot path, but I don’t see that as a problem here.
> 
> We discussed this over a year ago, and at that point agreed that preserving the registers was the desired behavior. Why has this changed now?

Yes, that is technically simpler and cleaner.

The question Andy raised is whether the RCX/R11 clobbering behavior is an established architectural contract or an implementation detail that software ignores.

Either is hard to prove.

I think Andy and PeterZ want to be on the safer side, i.e., to treat this clobbering behavior as established.
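
For readers following the clobber discussion, here is a minimal userspace sketch of the pattern Andy describes: a hand-written SYSCALL in inline asm that declares RCX and R11 as clobbered. The helper name raw_syscall0() is hypothetical and is not part of this patch or of the selftests. The non-x86-64 fallback is only there so the sketch builds elsewhere.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): issue a zero-argument syscall. */
static inline long raw_syscall0(long nr)
{
#if defined(__x86_64__)
	long ret;

	/*
	 * On legacy (non-FRED) x86-64, SYSCALL loads RCX with the return
	 * RIP and R11 with RFLAGS, so both are listed as clobbered here.
	 * Dropping them from the clobber list may appear to work until the
	 * compiler happens to keep a live value in one of them.
	 */
	__asm__ volatile ("syscall"
			  : "=a" (ret)
			  : "a" (nr)
			  : "rcx", "r11", "memory");
	return ret;
#else
	/* Assumption: on other architectures, defer to the libc wrapper. */
	return syscall(nr);
#endif
}
```

Calling raw_syscall0(SYS_getpid) should return the same value as getpid(). Code written this way does not depend on whether the kernel entry path clobbers or preserves the two registers, which is why the missing-clobber variant is the case that would behave differently between FRED and non-FRED systems.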

