From: Alejandro Colomar <alx@kernel.org>
To: наб <nabijaczleweli@nabijaczleweli.xyz>
Cc: linux-man@vger.kernel.org
Subject: Re: [PATCH v4] futex_waitv.2: new page
Date: Wed, 11 Feb 2026 14:23:13 +0100 [thread overview]
Message-ID: <aYx8Ghyq8E76xx9c@devuan> (raw)
In-Reply-To: <mlooycjkz6pfdvket6f2gwet76aar3zsihtvpm7klt2wl2z3y4@tarta.nabijaczleweli.xyz>
[-- Attachment #1: Type: text/plain, Size: 14529 bytes --]
Hi наб,
On 2026-02-11T05:00:26+0100, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
> man/man2/futex_waitv.2 | 426 +++++++++++++++++++++++++++++++++++++++++
> man/man7/futex.7 | 9 +-
> 2 files changed, 433 insertions(+), 2 deletions(-)
> create mode 100644 man/man2/futex_waitv.2
>
> diff --git u/man/man2/futex_waitv.2 p/man/man2/futex_waitv.2
> new file mode 100644
> index 000000000..b05eb08ef
> --- /dev/null
> +++ p/man/man2/futex_waitv.2
> @@ -0,0 +1,426 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: MIT
> +.\"
> +.TH futex_waitv 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +futex_waitv \- wait for FUTEX_WAKE operation on multiple futexes
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.BR "#include <linux/futex.h>" " /* Definition of " "struct futex_waitv" " */"
The comment should mention the FUTEX_ constants, and not the structure
(which should be documented next to the structure).
> +.BR "#include <sys/syscall.h>" " /* Definition of " SYS_* " constants */"
> +.B #include <unistd.h>
> +.B #include <time.h>
> +.P
> +.BR "long syscall(" "unsigned int nr_futexes;"
Let's call this just 'n'. I have been wanting to globally trim
parameter names that are unnecessarily verbose. Let's not add more.
> +.BI " SYS_futex_waitv, struct futex_waitv " waiters [ nr_futexes ],
> +.BI " unsigned int " nr_futexes ", unsigned int " flags ,
> +.BI " const struct timespec *_Nullable " timeout ", clockid_t " clockid ");"
> +.fi
> +.P
> +.EX
Please include a header for this structure (consider as if the other
headers were not included).
> +struct futex_waitv {
> + u64 val; /* Expected value at \f[I]uaddr\f[] */
> + u64 uaddr; /* User address to wait on */
> + u32 flags; /* Flags for this waiter */
> + u32 __reserved; /* Align to u64 */
> +};
> +.EE
> +.SH DESCRIPTION
> +.\" This name is used internally in the kernel
> +Implements the FUTEX_WAIT_MULTIPLE operation,
> +analogous to a synchronous atomic parallel
> +.BR FUTEX_WAIT (2const)
> +or
> +.B FUTEX_WAIT_PRIVATE
> +on up to
> +.B FUTEX_WAITV_MAX
> +futex words.
> +For an overview of futexes, see
> +.BR futex (7);
> +for a description of the general interface, see
> +.BR futex (2);
> +for general minutiae of futex waiting, see the page above.
> +.P
> +This operation tests that the values at the
> +futex words pointed to by the addresses
> +.IR waiters []. uaddr
> +still contain respective expected values
> +.IR waiters []. val ,
> +and if so, sleeps waiting for a
> +.BR FUTEX_WAKE (2const)
> +operation on any of the futex words,
> +and returns the index of
> +.I a
> +waiter whose futex was woken.
> +.P
> +If the thread starts to sleep,
> +it is considered a waiter on all given futex words.
> +If any of the futex values do not match their respective
> +.IR waiters []. val ,
> +the call fails immediately with the error
> +.BR EAGAIN .
> +.P
> +If
> +.I timeout
> +is not NULL,
> +.I *timeout
> +specifies a deadline measured against clock
> +.IR clockid .
> +This interval will be rounded up to the system clock granularity,
> +and is guaranteed not to expire early.
> +If
> +.I timeout
> +is NULL, the call blocks indefinitely.
> +.P
> +Futex words to monitor are given by
> +.IR "struct futex_waitv" ,
> +whose fields are analogous to
> +.BR FUTEX_WAIT (2const)
> +parameters, except
> +.I __reserved
Please use .__reserved syntax for struct members (also with other
members).
> +must be 0
> +and
> +.I flags
> +must contain one of
> +.BR FUTEX2_SIZE_*
> +ORed with some of the flags below.
I expect the flags should start immediately after 'below'.
> +.P
> +C programs should assign to
> +.I uaddr
> +by casting a pointer to
> +.B uintptr_t
> +to ensure the top bits are cleared on 32-bit systems.
I'm not sure the paragraph above is correct.
What code are you worried about exactly? Could you show an example?
> +.TP
> +.BR FUTEX2_SIZE_U32
> +.I val
> +and
> +.I *uaddr
> +are 32-bit unsigned integers.
> +.TP
> +.B FUTEX2_NUMA
> +The futex word is followed by another word of the same size
> +.RI ( uaddr
> +points to
> +.IR uint N _t[2]
> +rather than
> +.IR uint N _t .
> +The word is given by
> +.IR uaddr[1] ),
> +which can be either
> +.B FUTEX_NO_NODE
> +(all bits set)
> +or a NUMA node number.
> +.IP
> +If the NUMA word is
> +.BR FUTEX_NO_NODE ,
> +the node number of the processor the syscall executes on is written to it.
> +(Except in an
> +.B EINVAL
> +or
> +.B EFAULT
> +condition, this happens to all waiters whose
> +.I flags
> +have
> +.B FUTEX2_NUMA
> +set.)
> +.IP
> +Futexes are placed on the NUMA node given by the NUMA word.
> +Futexes without this flag are placed on a random node.
> +.\" commit cec199c5e39bde7191a08087cc3d002ccfab31ff
> +.\" Author: Peter Zijlstra <peterz@infradead.org>
> +.\" Date: Wed Apr 16 18:29:16 2025 +0200
> +.\"
> +.\" futex: Implement FUTEX2_NUMA
> +.\"
> +.\" FUTEX2_MPOL is not documented or used anywhere;
> +.\" it's unclear to me what it does
> +.\" (defined in commit c042c505210dc3453f378df432c10fff3d471bc5
> +.\" "futex: Implement FUTEX2_MPOL")
> +.TP
> +.B FUTEX2_PRIVATE
> +By default, the futex is shared
> +.RB "(like " FUTEX_WAIT (2const)),
> +and can be accessed by multiple processes;
> +this flag waits on a private futex word,
> +where all users must use the same virtual memory map
> +(like
> +.BR FUTEX_WAIT_PRIVATE ;
> +this most often means they are part of the same process).
> +Private futexes are faster than shared ones.
> +.\"
> +.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"
> +.SH RETURN VALUE
> +Returns an index to an arbitrary entry in
> +.I waiters
> +corresponding to some woken-up futex.
> +This implies no information about other waiters.
> +.P
> +On error,
> +\-1 is returned,
> +and
> +.I errno
> +is set to indicate the error.
> +.\"
> +.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"
> +.SH ERRORS
> +.TP
> +.B EFAULT
> +.I waiters
> +points outside the accessible address space.
> +.TP
> +.B EFAULT
> +.I timeout
> +was not NULL and did not point to a valid user-space address.
> +.TP
> +.B EFAULT
> +Any
> +.IR waiters []. uaddr
> +field is not a valid user-space address.
Please homogenize the wording of this and the first EFAULT.
(I don't have a preference for which language is best).
> +.TP
> +.B EINVAL
> +Any
> +.IR waiters []. uaddr
> +field does not point to a valid object\[em]that is,
> +the address is not aligned appropriately for the specified
> +.BR FUTEX2_SIZE_* .
> +.TP
> +.B EINVAL
> +.I flags
> +was not 0.
> +.TP
> +.B EINVAL
> +.I nr_futexes
> +was not in
> +[1,
> +.B FUTEX_WAITV_MAX
> +(128)].
> +.TP
> +.B EINVAL
> +.I timeout
> +was not NULL and
> +.I clockid
> +was not a valid clock
> +.RB ( CLOCK_MONOTONIC
> +or
> +.BR CLOCK_REALTIME ).
> +.TP
> +.B EINVAL
> +.I *timeout
> +is denormal (before epoch or
> +.I tv_nsec
> +more than 999'999'999).
' should actually be \[aq].
> +.TP
> +.B EINVAL
> +Any
> +.IR waiters []. flags
> +field contains an unknown flag.
> +.TP
> +.B EINVAL
> +Any
> +.IR waiters []. flags
> +field is missing a
> +.B FUTEX2_SIZE_*
> +flag or has a size flag different than
> +.BR FUTEX2_SIZE_U32
> +set.
> +.TP
> +.B EINVAL
> +Any
> +.IR waiters []. __reserved
> +field is not 0.
> +.TP
> +.B EINVAL
> +Any
> +.IR waiters []. value
> +field has more bits set than permitted than the size flags.
> +.TP
> +.B EINVAL
> +.B FUTEX2_NUMA
> +was set in
> +.IR waiters []. flags ,
> +and the NUMA word
> +(which is the same size as the futex word)
> +is too small to contain the index of the biggest NUMA domain
> +(for example,
> +.B FUTEX2_SIZE_U8
> +and there are more than 255 NUMA domains).
> +.TP
> +.B EINVAL
> +.B FUTEX2_NUMA
> +was set in
> +.IR waiters []. flags ,
> +and the NUMA word is larger than the maximum possible NUMA node and not
> +.BR FUTEX_NO_NODE .
> +.TP
> +.B ETIMEDOUT
> +.I timeout
> +was not NULL and no futex was woken before the timeout elapsed.
> +.TP
> +.BR EAGAIN " or " EWOULDBLOCK
> +The value pointed to by
> +.I uaddr
> +was not equal to the expected value
> +.I val
> +at the time of the call.
> +.TP
> +.B EINTR
> +The
> +operation was interrupted by a signal (see
> +.BR signal (7)).
> +.\"
> +.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"
> +.SH STANDARDS
> +Linux.
> +.SH NOTES
> +.BR FUTEX2_SIZE_U8 ,
> +.BR FUTEX2_SIZE_U16 ,
> +and
> +.B FUTEX2_SIZE_U64
> +where
> +.I val
> +and
> +.I *uaddr
> +are 8, 16, or 64 bits are defined, but not implemented
> +.RB ( EINVAL ).
> +.SH HISTORY
> +.\" commit bf69bad38cf63d980e8a603f8d1bd1f85b5ed3d9
> +.\" Author: André Almeida <andrealmeid@igalia.com>
> +.\" Date: Thu Sep 23 14:11:05 2021 -0300
> +.\"
> +.\" futex: Implement sys_futex_waitv()
> +Linux 5.16.
> +.SH EXAMPLES
> +The program below executes a linear-time operation on 10 threads,
> +displaying the results in real time,
> +waiting at most 1 second for each new result.
> +The first 3 threads operate on the same data (complete in the same time).
> +.B !\&
> +indicates the futex that woke up each
> +.BR futex_waitv ().
> +.in +4
> +.EX
> +.RB $\~ ./futex_waitv
> +153 153 153 237 100 245 177 127 215 61
> + 122!
> + 200!
> + 254!
> +306 306!
> + 306!
> + 354!
> + 430!
> + 474!
> + 490!
> +Connection timed out
> +.EE
> +.P
> +.\" SRC BEGIN (futex_waitv.c)
> +.EX
> +#include <errno.h>
> +#include <inttypes.h>
> +#include <linux/futex.h>
> +#include <pthread.h>
> +#include <stdatomic.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/syscall.h>
> +#include <time.h>
> +#include <unistd.h>
> +\&
> +static inline long
> +my_futex_wait_private(_Atomic uint32_t *uaddr, uint32_t val,
Should this be atomic? If so, should we document futex(2) as taking
a pointer to atomic element?
> + const struct timespec *_Nullable timeout)
The two spaces are for variables, but not for function parameters.
Also, my bad: this shouldn't have _Nullable. We don't use it in
actual code (because Clang's _Nullable is really broken, and only serves
documentation purposes). Actually, Clang is still sub-par in many ways;
it's a C++ compiler, after all, where the C part is just a side-effect.
> +{
> + return syscall(SYS_futex, uaddr, FUTEX_WAKE_PRIVATE, val, timeout);
> +}
> +\&
> +static inline long
> +my_futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes,
> + unsigned int flags, const struct timespec *_Nullable timeout,
> + clockid_t clockid)
> +{
> + return syscall(SYS_futex_waitv, waiters, nr_futexes, flags, timeout, clockid);
> +}
> +\&
> +void *
> +worker(void *arg)
> +{
> + _Atomic uint32_t *futex = arg;
> +\&
> + usleep(*futex * 10000);
> + *futex *= 2;
> + my_futex_wait_private(futex, 1, NULL);
> + return NULL;
> +}
> +\&
> +int
> +main(void)
> +{
> + _Atomic uint32_t futexes[10];
> + uint8_t init[countof(futexes)];
> + struct futex_waitv waiters[countof(futexes)] = {};
> + int i;
> +\&
> + getentropy(init, sizeof(init));
> + init[0] = init[1] = init[2];
> + for (i = 0; i < countof(futexes); ++i) {
> + printf("%" PRIu8 "\[rs]t", init[i]);
> + atomic_init(&futexes[i], init[i]);
> + pthread_create(&(pthread_t){}, NULL, worker, &futexes[i]);
> + }
> + putchar('\[rs]n');
> +\&
> + for (i = 0; i < countof(futexes); ++i) {
> + waiters[i].val = futexes[i];
> + waiters[i].uaddr = (uintptr_t)&futexes[i];
> + waiters[i].flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE;
> + }
> + for (;;) {
> + struct timespec timeout;
> + int woke;
> +\&
> + clock_gettime(CLOCK_MONOTONIC, &timeout);
> + timeout.tv_sec += 1;
> +\&
> + woke = my_futex_waitv(waiters, countof(futexes), 0, &timeout, CLOCK_MONOTONIC);
> + if (woke == -1 && (errno != EAGAIN && errno != EWOULDBLOCK))
> + break;
> +\&
> + for (i = 0; i < countof(futexes); ++i) {
> + if (futexes[i] != waiters[i].val)
> + printf("%" PRIu32 "%s", futexes[i], i == woke ? "!" : "");
> + putchar('\[rs]t');
> + }
> + putchar('\[rs]n');
> +\&
> + for (i = 0; i < countof(futexes); ++i)
> + waiters[i].val = futexes[i];
> + }
> + fprintf(stderr, "%s\[rs]n", strerror(errno));
> +}
> +.EE
> +.\" SRC END
> +.SH SEE ALSO
> +.ad l
What's the effect of this ad l, and why do we want it?
> +.BR futex (2),
> +.BR FUTEX_WAIT (2const),
> +.BR FUTEX_WAKE (2const),
> +.BR futex (7)
> +.P
> +The following kernel source files:
> +.IP \[bu]
> +.I Documentation/userspace-api/futex2.rst
I'd remove the below, and only point to the rst.
Have a lovely day!
Alex
> +.IP \[bu]
> +.I kernel/futex/syscall.c
> +.IP \[bu]
> +.I kernel/futex/waitwake.c
> +.IP \[bu]
> +.I kernel/futex/futex.h
> diff --git u/man/man7/futex.7 p/man/man7/futex.7
> index 51c5d5d9b..d271144ff 100644
> --- u/man/man7/futex.7
> +++ p/man/man7/futex.7
> @@ -45,7 +45,9 @@ .SS Semantics
> Any futex operation starts in user space,
> but it may be necessary to communicate with the kernel using the
> .BR futex (2)
> -system call.
> +or
> +.BR futex_waitv (2)
> +system calls.
> .P
> To "up" a futex, execute the proper assembler instructions that
> will cause the host CPU to atomically increment the integer.
> @@ -72,7 +74,9 @@ .SS Semantics
> .P
> The
> .BR futex (2)
> -system call can optionally be passed a timeout specifying how long
> +and
> +.BR futex_waitv (2)
> +system calls can optionally be passed a timeout specifying how long
> the kernel should
> wait for the futex to be upped.
> In this case, semantics are more complex and the programmer is referred
> @@ -107,6 +111,7 @@ .SH NOTES
> .SH SEE ALSO
> .BR clone (2),
> .BR futex (2),
> +.BR futex_waitv (2),
> .BR get_robust_list (2),
> .BR set_robust_list (2),
> .BR set_tid_address (2),
> --
> 2.39.5
--
<https://www.alejandro-colomar.es>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2026-02-11 13:23 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-07 12:49 [PATCH] futex_waitv.2: new page наб
2026-02-07 18:57 ` Alejandro Colomar
2026-02-07 19:16 ` наб
2026-02-07 21:50 ` Alejandro Colomar
2026-02-07 22:00 ` [PATCH v2] " наб
2026-02-09 22:35 ` Alejandro Colomar
2026-02-10 14:17 ` наб
2026-02-10 14:30 ` Alejandro Colomar
2026-02-10 15:54 ` Kristoffer Haugsbakk
2026-02-10 18:39 ` Alejandro Colomar
2026-02-11 7:35 ` Jeff King
2026-02-11 8:15 ` Kristoffer Haugsbakk
2026-02-11 15:43 ` Junio C Hamano
2026-02-10 16:54 ` Junio C Hamano
2026-02-10 17:11 ` Kristoffer Haugsbakk
2026-02-10 18:44 ` Alejandro Colomar
2026-02-10 20:05 ` Alejandro Colomar
2026-02-10 20:32 ` [PATCH v3] " наб
2026-02-10 21:11 ` Alejandro Colomar
2026-02-11 4:00 ` [PATCH v4] " наб
2026-02-11 13:23 ` Alejandro Colomar [this message]
2026-02-11 13:51 ` [PATCH v5] " наб
2026-02-11 14:15 ` [PATCH v6] " наб
2026-02-11 14:31 ` Alejandro Colomar
2026-02-11 14:44 ` [PATCH v7] " наб
2026-02-11 14:55 ` Alejandro Colomar
2026-02-11 14:59 ` наб
2026-02-11 15:13 ` Alejandro Colomar
2026-02-14 17:32 ` Alejandro Colomar
2026-02-14 19:30 ` [PATCH v8] " наб
2026-02-14 20:03 ` Alejandro Colomar
2026-02-14 20:48 ` [PATCH v9] " наб
2026-02-15 18:18 ` Alejandro Colomar
2026-02-15 19:00 ` [PATCH v10] " наб
2026-02-16 0:32 ` Alejandro Colomar
2026-02-16 14:20 ` [PATCH v11] " наб
2026-02-16 14:50 ` Alejandro Colomar
2026-02-16 20:43 ` [PATCH v12] " наб
2026-02-17 13:07 ` Alejandro Colomar
2026-02-17 14:31 ` [PATCH v13] " наб
2026-02-17 15:46 ` Alejandro Colomar
2026-02-17 16:17 ` наб
2026-02-18 0:31 ` Alejandro Colomar
2026-02-11 14:28 ` [PATCH v5] " Alejandro Colomar
2026-02-18 0:41 ` [PATCH v1 0/1] futex_waitv.2: Move text to a new PARAMETERS section Alejandro Colomar
2026-02-18 0:41 ` [PATCH v1 1/1] man/man2/futex_waitv.2: " Alejandro Colomar
2026-02-18 20:16 ` [PATCH v1 0/1] futex_waitv.2: " наб
2026-02-18 20:26 ` Alejandro Colomar
2026-02-18 20:30 ` наб
2026-02-18 20:33 ` Alejandro Colomar
2026-02-18 20:40 ` G. Branden Robinson
2026-02-18 21:28 ` Alejandro Colomar
2026-02-18 22:04 ` Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aYx8Ghyq8E76xx9c@devuan \
--to=alx@kernel.org \
--cc=linux-man@vger.kernel.org \
--cc=nabijaczleweli@nabijaczleweli.xyz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox