From: Thomas Gleixner <tglx@kernel.org>
To: "André Almeida" <andrealmeid@igalia.com>,
"Rich Felker" <dalias@aerifal.cx>
Cc: LKML <linux-kernel@vger.kernel.org>,
"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
"Carlos O'Donell" <carlos@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Florian Weimer" <fweimer@redhat.com>,
"Torvald Riegel" <triegel@redhat.com>,
"Darren Hart" <dvhart@infradead.org>,
"Ingo Molnar" <mingo@kernel.org>,
"Davidlohr Bueso" <dave@stgolabs.net>,
"Arnd Bergmann" <arnd@arndb.de>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
"Uros Bizjak" <ubizjak@gmail.com>,
"Thomas Weißschuh" <linux@weissschuh.net>
Subject: Re: [patch v2 00/11] futex: Address the robust futex unlock race for real
Date: Fri, 27 Mar 2026 11:08:10 +0100
Message-ID: <875x6hd1px.ffs@tglx>
In-Reply-To: <d51aec74-64ee-419f-a880-b9c41e7f1c95@igalia.com>
On Fri, Mar 27 2026 at 00:42, André Almeida wrote:
> On 26/03/2026 19:08, Rich Felker wrote:
>> On Thu, Mar 26, 2026 at 10:59:20PM +0100, Thomas Gleixner wrote:
>>> On Fri, Mar 20 2026 at 00:24, Thomas Gleixner wrote:
>>>> If the functionality itself is agreed on we only need to agree on the names
>>>> and signatures of the functions exposed through the VDSO before we set them
>>>> in stone. That will hopefully not take another 15 years :)
>>>
>>> Have the libc folks any further opinion on the syscall and the vDSO part
>>> before I prepare v3?
>>
>> This whole conversation has been way too much for me to keep up with,
>> so I'm not sure where it's at right now.
>>
>> From musl's perspective, the way we make robust mutex unlocking safe
>> right now is by inhibiting munmap/mremap/MAP_FIXED and
>> pthread_mutex_destroy while there are any in-flight robust unlocks. It
>> will be nice to be able to conditionally stop doing that if vdso is
>> available, but I can't see using a fallback that requires a syscall,
>> as that would just be a lot more expensive than what we're doing right
>> now and still not work on older kernels. So I think the only part
>> we're interested in is the fully-userspace approach in vdso.
>>
>
> You just need the syscall for the contended case (where you would need a
> syscall anyway for a FUTEX_WAKE).
>
> As Thomas wrote in patch 09/11:
>
> The resulting code sequence for user space is:
>
>   if (__vdso_futex_robust_list$SZ_try_unlock(lock, tid, &pending_op) != tid)
>           err = sys_futex($OP | FUTEX_ROBUST_UNLOCK,....);
>
> Both the VDSO unlock and the kernel side unlock ensure that the
> pending_op pointer is always cleared when the lock becomes unlocked.
>
>
> So you call the vDSO first. If it fails, it means that the lock is
> contended and you need to call futex(). It will wake a waiter, release
> the lock and clear list_op_pending.

See also the V1 cover letter which has a full deep dive:

  https://lore.kernel.org/20260316162316.356674433@kernel.org

TLDR:

The problem can be split into two issues:

  1) Contended unlock

  2) Uncontended unlock

#1 is solved by moving the unlock into the kernel instead of unlocking
   first and then invoking the syscall to wake waiters. The syscall
   takes the list_pending_op pointer as an argument and, after unlocking
   (i.e. *lock = 0), clears the list_pending_op pointer.

   For this to work, it needs to use try_cmpxchg() like PI unlock does.

#2 The race is between the successful try_cmpxchg() and the clearing of
   the list_pending_op pointer.

   That's where the VDSO comes into play. Instead of having the
   try_cmpxchg() in the library code, the library invokes the variant
   provided by the VDSO. That allows the kernel to check in the signal
   delivery path whether a successful unlock requires a helping hand to
   clear the list_pending_op pointer. If the interrupted IP is in the
   critical section _and_ the try_cmpxchg() succeeded, then the kernel
   clears the pointer.
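
To make the two paths concrete, here is a rough user-space model in C11.
It is a sketch only, not code from the series: every model_* name is
made up for illustration, model_kernel_unlock() merely stands in for the
futex() call with FUTEX_ROBUST_UNLOCK, and the owner check is simplified
to masking a single waiters bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the waiters bit; same value as FUTEX_WAITERS. */
#define MODEL_FUTEX_WAITERS 0x80000000u

/*
 * Model of the VDSO helper (uncontended path). Returns the value that
 * was found in *lock: equal to tid on success, anything else on failure.
 */
static uint32_t model_try_unlock(_Atomic uint32_t *lock, uint32_t tid,
                                 void **pending_op)
{
    uint32_t observed = tid;

    assert(lock && pending_op);

    /* Rough equivalent of try_cmpxchg(lock, &observed, 0) */
    if (atomic_compare_exchange_strong_explicit(lock, &observed, 0,
                                                memory_order_release,
                                                memory_order_relaxed)) {
        /*
         * Race window: the lock is already free, but the pending op
         * pointer is still set until the next store completes.
         */
        *pending_op = NULL;
    }
    return observed;
}

/*
 * Model of the kernel side (contended path): unlock first, then clear
 * the pending op pointer. A real kernel would also wake a waiter.
 */
static int model_kernel_unlock(_Atomic uint32_t *lock, uint32_t tid,
                               void **pending_op)
{
    uint32_t observed = atomic_load_explicit(lock, memory_order_relaxed);

    do {
        /* Simplified owner check: ignore only the waiters bit */
        if ((observed & ~MODEL_FUTEX_WAITERS) != tid)
            return -1;
    } while (!atomic_compare_exchange_weak_explicit(lock, &observed, 0,
                                                    memory_order_release,
                                                    memory_order_relaxed));
    *pending_op = NULL;
    return 0;
}

/* The user-space sequence: try the fast path, fall back if contended. */
static int model_robust_unlock(_Atomic uint32_t *lock, uint32_t tid,
                               void **pending_op)
{
    if (model_try_unlock(lock, tid, pending_op) != tid)
        return model_kernel_unlock(lock, tid, pending_op);
    return 0;
}
```

With the lock word equal to tid the fast path alone releases the lock
and clears the pointer. With the waiters bit set the fast path leaves
both untouched and the fallback does the work.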

In x86 ASM:

0000000000001590 <__vdso_futex_robust_list64_try_unlock@@LINUX_2.6>:
    1590:   mov    %esi,%eax
    1592:   xor    %ecx,%ecx
    1594:   lock cmpxchg %ecx,(%rdi)   // Result goes into ZF
    1598:   jne    159d                <- CS start
    159a:   mov    %rcx,(%rdx)         // Clear list_pending_op
    159d:   ret                        <- CS end
    159e:   xchg   %ax,%ax

So if the kernel observes

    IP >= CS start && IP < CS end

then it checks the ZF flag in pt_regs and, if it is set, clears the
list_pending_op pointer.
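
That check can be sketched as a small stand-alone C model. This is an
illustration under assumptions, not the kernel implementation: struct
model_regs is a hypothetical stand-in for pt_regs, the function name is
invented, and the store is a plain write instead of a user access. The
only facts taken as given are that ZF is bit 6 of EFLAGS and that the
third argument (the address of the pending op pointer) is in %rdx.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ZF is bit 6 of the x86 EFLAGS register */
#define MODEL_EFLAGS_ZF (1ULL << 6)

/* Hypothetical, stripped-down stand-in for the saved register state */
struct model_regs {
    uint64_t ip;    /* interrupted instruction pointer */
    uint64_t flags; /* saved EFLAGS image */
    uint64_t dx;    /* third argument: address of the pending op pointer */
};

/*
 * Returns true and clears the pending op pointer when the task was
 * interrupted inside [cs_start, cs_end) after a successful cmpxchg.
 */
static bool model_fixup_robust_unlock(const struct model_regs *regs,
                                      uint64_t cs_start, uint64_t cs_end)
{
    /* Outside the critical section: nothing to do */
    if (regs->ip < cs_start || regs->ip >= cs_end)
        return false;

    /* cmpxchg failed, so the lock was not released by this task */
    if (!(regs->flags & MODEL_EFLAGS_ZF))
        return false;

    /* Finish the store that the interrupted code had not completed */
    *(void **)(uintptr_t)regs->dx = NULL;
    return true;
}
```

Using the offsets from the disassembly (CS start 0x1598, CS end 0x159d),
an interrupted IP of 0x159a with ZF set triggers the fixup, while an IP
of 0x159d does not. Clearing the pointer at IP 0x1598 or 0x159a is
harmless because the resumed code merely stores zero again.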

Obviously #1 depends on #2 to close all holes.

Thanks,

        tglx