public inbox for linux-arch@vger.kernel.org
* Re: [patch V3 00/14] futex: Address the robust futex unlock race for real
       [not found] ` <acp-l792SFNG9ROS@J2N7QTR9R3>
@ 2026-03-31 15:22   ` Thomas Gleixner
  0 siblings, 0 replies; only message in thread
From: Thomas Gleixner @ 2026-03-31 15:22 UTC (permalink / raw)
  To: Mark Rutland
  Cc: LKML, Mathieu Desnoyers, André Almeida,
	Sebastian Andrzej Siewior, Carlos O'Donell, Peter Zijlstra,
	Florian Weimer, Rich Felker, Torvald Riegel, Darren Hart,
	Ingo Molnar, Davidlohr Bueso, Arnd Bergmann, Liam R . Howlett,
	Uros Bizjak, Thomas Weißschuh, linux-arch


Resending with linux-arch in CC and a bit more context.

On Mon, Mar 30 2026 at 14:45, Mark Rutland wrote:
> On Mon, Mar 30, 2026 at 02:01:58PM +0200, Thomas Gleixner wrote:
>> User space can't solve this problem without help from the kernel. This
>> series provides the kernel side infrastructure to help it along:
>> 
>>   1) Combined unlock, pointer clearing, wake-up for the contended case
>> 
>>   2) VDSO based unlock and pointer clearing helpers with a fix-up function
>>      in the kernel when user space was interrupted within the critical
>>      section.

There is an effort to solve the long standing issue of robust futexes
versus unlock and clearing the list op pointer not being atomic. See:

   https://lore.kernel.org/20260316162316.356674433@kernel.org

TLDR:

The robust futex unlock mechanism is racy with respect to the clearing of the
robust_list_head::list_op_pending pointer because the unlock and the clearing
of the pointer are not atomic. The race window is between the unlock and the
clearing of the pending op pointer. If the task is forced to exit in this
window, the exit code will access a potentially invalid pending op pointer
when cleaning up the robust list. That happens if another task manages to
unmap the object containing the lock before the cleanup, which results in a
use-after-free (UAF). In the worst case this UAF can lead to memory
corruption when unrelated content has been mapped at the same address by the
time the access happens.

The contended unlock case will be solved with an extension to the
futex() syscall so that the kernel handles the "atomicity".

The uncontended case, where user space unlocks the futex, needs VDSO
support, so that the kernel can clear the list op pending pointer when
user space gets interrupted between the unlock and the clearing. Below is
how this works on x86. The series in progress covers only x86, so this is
a heads up for what's coming to architectures which support a VDSO. If
anyone sees a problem with this, please raise your concerns now.

> I see the vdso bits in this series are specific to x86. Do other
> architectures need something here?

Yes.

> I might be missing some context; I'm not sure whether that's not
> necessary or just not implemented by this series, and so I'm not sure
> whether arm64 folk and other need to go dig into this.

The VDSO functions __vdso_futex_robust_list64_try_unlock() and
__vdso_futex_robust_list32_try_unlock() are architecture specific.

The scheme in x86 ASM is:

	mov		%esi,%eax	// Load TID into EAX
	xor		%ecx,%ecx	// Set ECX to 0
	lock cmpxchg	%ecx,(%rdi)	// Try the TID -> 0 transition
  .Lstart:
	jnz    		.Lend
	movq		%rcx,(%rdx)	// Clear list_op_pending
  .Lend:
	ret

.Lstart is the start of the critical section, .Lend the end. These two
addresses need to be retrieved from the VDSO when the VDSO is mapped to
user space and stored in mm::futex::unlock::cs_ranges[]. See patch 11/14.

If the cmpxchg was successful, then the pending pointer has to be
cleared when user space was interrupted before reaching .Lend.

So .Lstart has to be immediately after the instruction which did the try
compare exchange, and each architecture needs its own ASM variant plus a
helper function which tells the generic code whether the pointer has to
be cleared or not. On x86 that is:

	return regs->flags & X86_EFLAGS_ZF ? (void __user *)regs->dx : NULL;

as the result of CMPXCHG is in the Zero Flag and the pointer is in
[ER]DX. The former is defined by the ISA, the latter is enforced by the
ASM constraints and has to be kept in sync between the VDSO ASM and the
evaluation helper.

Thanks,

	tglx
