From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: "Thomas Weißschuh" <thomas.weissschuh@linutronix.de>
Cc: "André Almeida" <andrealmeid@igalia.com>,
linux-kernel@vger.kernel.org,
"Carlos O'Donell" <carlos@redhat.com>,
"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
"Peter Zijlstra" <peterz@infradead.org>,
"Florian Weimer" <fweimer@redhat.com>,
"Rich Felker" <dalias@aerifal.cx>,
"Torvald Riegel" <triegel@redhat.com>,
"Darren Hart" <dvhart@infradead.org>,
"Thomas Gleixner" <tglx@kernel.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Davidlohr Bueso" <dave@stgolabs.net>,
"Arnd Bergmann" <arnd@arndb.de>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
linux-api@vger.kernel.org
Subject: Re: [RFC PATCH v4] futex: Introduce __vdso_robust_futex_unlock and __vdso_robust_pi_futex_try_unlock
Date: Fri, 13 Mar 2026 11:19:20 -0400 [thread overview]
Message-ID: <29d24bd3-530d-429b-ae27-4c5f6e5723e1@efficios.com> (raw)
In-Reply-To: <20260313150111-64c38feb-825d-433e-9c71-f4f109b8cbfb@linutronix.de>
On 2026-03-13 10:24, Thomas Weißschuh wrote:
> Hi Mathieu,
>
> some small remarks around the vDSO code.
>
> On Fri, Mar 13, 2026 at 09:39:03AM -0400, Mathieu Desnoyers wrote:
>
> (...)
>
>> The approach taken to fix this is to introduce two vDSO and extend the
>> x86 vDSO exception table to track the relevant ip ranges: one for non-PI
>> robust futexes, and one for PI robust futexes.
>
> One of the central points behind the vDSO so far was that it is only a
> performance optimization. It is never required for correctness.
> What are applications supposed to do when the vDSO is disabled?
Good point!
For non-PI futexes, we'd need to introduce a new robust_futex_unlock
system call to handle the scenario where vDSO is unavailable. This
system call would perform the equivalent of what is done within the
vDSO, but from within the kernel. This comes with a significant
performance overhead, as this would be called on _every_ robust
futex unlock.
For PI futexes, applications could either directly invoke the
futex_unlock_pi system call when the vDSO is not available, or
we could provide a new system call which does the equivalent of
the vDSO within the kernel. The advantage of the new system call
is that we would not hit the hb->lock in the uncontended case.
Another alternative is that we tweak the existing futex_unlock_pi
to handle the case where it is called when there are no waiters more
efficiently.
>
> (...)
>
>> diff --git a/arch/x86/entry/vdso/common/vfutex.c b/arch/x86/entry/vdso/common/vfutex.c
>> new file mode 100644
>> index 000000000000..336095b04952
>> --- /dev/null
>> +++ b/arch/x86/entry/vdso/common/vfutex.c
>> @@ -0,0 +1,90 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (C) 2026 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> + */
>> +#include <linux/types.h>
>> +#include <linux/futex.h>
>
> This should be uapi/linux/futex.h. Headers from the linux/ namespace should
> not be used in vDSO code. The definitions from them may end up being wrong
> in the compat vDSO. Either use uapi/ or vdso/ headers. (linux/types.h is a bit
> of an exception for historic reasons, it could be replaced by uapi/linux/types.h)
OK, will fix.
>
>> +#include <vdso/futex.h>
>> +#include "extable.h"
>
>> +
>> +#ifdef CONFIG_X86_64
>
> This only works because of the ugly hacks in fake_32bit_build.h.
> Testing for '#ifdef __x86_64__' is simpler and nicer to read.
OK, will fix.
>
>> +# define ASM_PTR_BIT_SET "btsq "
>> +# define ASM_PTR_SET "movq "
>> +#else
>> +# define ASM_PTR_BIT_SET "btsl "
>> +# define ASM_PTR_SET "movl "
>> +#endif
>> +
>> +u32 __vdso_robust_futex_unlock(u32 *uaddr, struct robust_list_head *robust_list_head)
>> +{
>> + u32 val = 0;
>> +
>> + /*
>> + * Within the ip range identified by the futex exception table,
>> + * the register "eax" contains the value loaded by xchg. This is
>> + * expected by futex_vdso_exception() to check whether waiters
>> + * need to be woken up. This register state is transferred to
>> + * bit 1 (NEED_ACTION) of *op_pending_addr before the ip range
>> + * ends.
>> + */
>> + asm volatile (
>> + _ASM_VDSO_EXTABLE_FUTEX_HANDLE(1f, 3f)
>> + /* Exchange uaddr (store-release). */
>> + "xchg %[uaddr], %[val]\n\t"
>> + "1:\n\t"
>> + /* Test if FUTEX_WAITERS (0x80000000) is set. */
>> + "test %[val], %[val]\n\t"
>> + "js 2f\n\t"
>> + /* Clear *op_pending_addr if there are no waiters. */
>> + ASM_PTR_SET "$0, %[op_pending_addr]\n\t"
>> + "jmp 3f\n\t"
>> + "2:\n\t"
>> + /* Set bit 1 (NEED_ACTION) in *op_pending_addr. */
>> + ASM_PTR_BIT_SET "$1, %[op_pending_addr]\n\t"
>> + "3:\n\t"
>> + : [val] "+a" (val),
>> + [uaddr] "+m" (*uaddr)
>> + : [op_pending_addr] "m" (robust_list_head->list_op_pending)
>> + : "memory"
>> + );
>> + return val;
>> +}
>> +
>> +u32 robust_futex_unlock(u32 *, struct robust_list_head *)
>> + __attribute__((weak, alias("__vdso_robust_futex_unlock")));
>
> __weak and __alias() as per checkpatch.pl.
OK, will fix.
>
> The entries in the linkerscripts are missing.
Good catch, will fix for:
vdso/vdso64/vdsox32.lds.S
vdso/vdso64/vdso64.lds.S
vdso/vdso32/vdso32.lds.S
>
> (...)
>
>> --- /dev/null
>> +++ b/include/vdso/futex.h
>> @@ -0,0 +1,72 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright (C) 2026 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> + */
>> +
>> +#ifndef _VDSO_FUTEX_H
>> +#define _VDSO_FUTEX_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/futex.h>
>
> Same remarks about the linux/ namespace as before.
OK, fixed.
>
>> +/**
>> + * __vdso_robust_futex_unlock - Architecture-specific vDSO implementation of robust futex unlock.
>> + * @uaddr: Lock address (points to a 32-bit unsigned integer type).
>> + * @robust_list_head: The thread-specific robust list that has been registered with set_robust_list.
>> + *
>> + * This vDSO unlocks the robust futex by exchanging the content of
>> + * *@uaddr with 0 with a store-release semantic. If the futex has
>> + * waiters, it sets bit 1 of *@robust_list_head->list_op_pending, else
>> + * it clears *@robust_list_head->list_op_pending. Those operations are
>> + * within a code region known by the kernel, making them safe with
>> + * respect to asynchronous program termination either from thread
>> + * context or from a nested signal handler.
>> + *
>> + * Returns: The old value present at *@uaddr.
>> + *
>> + * Expected use of this vDSO:
>> + *
>> + * robust_list_head is the thread-specific robust list that has been
>> + * registered with set_robust_list.
>> + *
>> + * if ((__vdso_robust_futex_unlock((u32 *) &mutex->__data.__lock, robust_list_head)
>> + * & FUTEX_WAITERS) != 0)
>> + * futex_wake((u32 *) &mutex->__data.__lock, 1, private);
>> + * WRITE_ONCE(robust_list_head->list_op_pending, 0);
>> + */
>> +extern u32 __vdso_robust_futex_unlock(u32 *uaddr, struct robust_list_head *robust_list_head);
>
> Drop the extern.
OK, fixed.
Thanks,
Mathieu
>
> (...)
>
>> +#endif /* _VDSO_FUTEX_H */
>
> (...)
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
prev parent reply other threads:[~2026-03-13 15:19 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-13 13:39 [RFC PATCH v4] futex: Introduce __vdso_robust_futex_unlock and __vdso_robust_pi_futex_try_unlock Mathieu Desnoyers
2026-03-13 14:24 ` Thomas Weißschuh
2026-03-13 15:19 ` Mathieu Desnoyers [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=29d24bd3-530d-429b-ae27-4c5f6e5723e1@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=Liam.Howlett@oracle.com \
--cc=andrealmeid@igalia.com \
--cc=arnd@arndb.de \
--cc=bigeasy@linutronix.de \
--cc=carlos@redhat.com \
--cc=dalias@aerifal.cx \
--cc=dave@stgolabs.net \
--cc=dvhart@infradead.org \
--cc=fweimer@redhat.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@kernel.org \
--cc=thomas.weissschuh@linutronix.de \
--cc=triegel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox