All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Andrè Almeida" <andrealmeid@igalia.com>,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"Carlos O'Donell" <carlos@redhat.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Florian Weimer" <fweimer@redhat.com>,
	"Rich Felker" <dalias@aerifal.cx>,
	"Torvald Riegel" <triegel@redhat.com>,
	"Darren Hart" <dvhart@infradead.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	"Uros Bizjak" <ubizjak@gmail.com>,
	"Thomas Weißschuh" <linux@weissschuh.net>
Subject: [patch V3 13/14] Documentation: futex: Add a note about robust list race condition
Date: Mon, 30 Mar 2026 14:03:06 +0200	[thread overview]
Message-ID: <20260330120117.946243065@kernel.org> (raw)
In-Reply-To: 20260330114212.927686587@kernel.org

From: André Almeida <andrealmeid@igalia.com>

Add a note to the documentation giving a brief explanation why doing a
robust futex release in userspace is racy, what should be done to avoid
it and provide links to read more.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260326-tonyk-vdso_test-v1-1-30a6f78c8bc3@igalia.com

---
 Documentation/locking/robust-futex-ABI.rst |   44 +++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
--- a/Documentation/locking/robust-futex-ABI.rst
+++ b/Documentation/locking/robust-futex-ABI.rst
@@ -153,6 +153,9 @@ manipulating this list), the user code m
  3) release the futex lock, and
  4) clear the 'lock_op_pending' word.
 
+Please note that the removal of a robust futex purely in userspace is
+racy. Refer to the next chapter to learn more and how to avoid this.
+
 On exit, the kernel will consider the address stored in
 'list_op_pending' and the address of each 'lock word' found by walking
 the list starting at 'head'.  For each such address, if the bottom 30
@@ -182,3 +185,44 @@ The kernel exit code will silently stop
 When the kernel sees a list entry whose 'lock word' doesn't have the
 current threads TID in the lower 30 bits, it does nothing with that
 entry, and goes on to the next entry.
+
+Robust release is racy
+----------------------
+
+The removal of a robust futex from the list is racy when doing it solely in
+userspace. Quoting Thomas Gleixer for the explanation:
+
+  The robust futex unlock mechanism is racy in respect to the clearing of the
+  robust_list_head::list_op_pending pointer because unlock and clearing the
+  pointer are not atomic. The race window is between the unlock and clearing
+  the pending op pointer. If the task is forced to exit in this window, exit
+  will access a potentially invalid pending op pointer when cleaning up the
+  robust list. That happens if another task manages to unmap the object
+  containing the lock before the cleanup, which results in an UAF. In the
+  worst case this UAF can lead to memory corruption when unrelated content
+  has been mapped to the same address by the time the access happens.
+
+A full in dept analysis can be read at
+https://lore.kernel.org/lkml/20260316162316.356674433@kernel.org/
+
+To overcome that, the kernel needs to participate in the lock release operation.
+This ensures that the release happens "atomically" in the regard of releasing
+the lock and removing the address from ``list_op_pending``. If the release is
+interrupted by a signal, the kernel will also verify if it interrupted the
+release operation.
+
+For the contended unlock case, where other threads are waiting for the lock
+release, there's the ``FUTEX_ROBUST_UNLOCK`` operation feature flag for the
+``futex()`` system call, which must be used with one of the following
+operations: ``FUTEX_WAKE``, ``FUTEX_WAKE_BITSET`` or ``FUTEX_UNLOCK_PI``.
+The kernel will release the lock (set the futex word to zero), clean the
+``list_op_pending`` field. Then, it will proceed with the normal wake path.
+
+For the non-contended path, there's still a race between checking the futex word
+and clearing the ``list_op_pending`` field. To solve this without the need of a
+complete system call, userspace should call the virtual syscall
+``__vdso_futex_robust_listXX_try_unlock()`` (where XX is either 32 or 64,
+depending on the size of the pointer). If the vDSO call succeeds, it means that
+it released the lock and cleared ``list_op_pending``. If it fails, that means
+that there are waiters for this lock and a call to ``futex()`` syscall with
+``FUTEX_ROBUST_UNLOCK`` is needed.


  parent reply	other threads:[~2026-03-30 12:03 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-30 12:01 [patch V3 00/14] futex: Address the robust futex unlock race for real Thomas Gleixner
2026-03-30 12:02 ` [patch V3 01/14] futex: Move futex task related data into a struct Thomas Gleixner
2026-03-30 12:02 ` [patch V3 02/14] futex: Make futex_mm_init() void Thomas Gleixner
2026-03-30 12:02 ` [patch V3 03/14] futex: Move futex related mm_struct data into a struct Thomas Gleixner
2026-03-30 15:23   ` Alexander Kuleshov
2026-03-30 12:02 ` [patch V3 04/14] futex: Provide UABI defines for robust list entry modifiers Thomas Gleixner
2026-03-30 12:02 ` [patch V3 05/14] uaccess: Provide unsafe_atomic_store_release_user() Thomas Gleixner
2026-03-30 13:33   ` Mark Rutland
2026-03-30 12:02 ` [patch V3 06/14] x86: Select ARCH_MEMORY_ORDER_TOS Thomas Gleixner
2026-03-30 13:34   ` Mark Rutland
2026-03-30 19:48     ` Thomas Gleixner
2026-03-30 12:02 ` [patch V3 07/14] futex: Cleanup UAPI defines Thomas Gleixner
2026-03-30 12:02 ` [patch V3 08/14] futex: Add support for unlocking robust futexes Thomas Gleixner
2026-03-30 12:02 ` [patch V3 09/14] futex: Add robust futex unlock IP range Thomas Gleixner
2026-03-30 12:02 ` [patch V3 10/14] futex: Provide infrastructure to plug the non contended robust futex unlock race Thomas Gleixner
2026-03-30 12:02 ` [patch V3 11/14] x86/vdso: Prepare for robust futex unlock support Thomas Gleixner
2026-03-30 12:03 ` [patch V3 12/14] x86/vdso: Implement __vdso_futex_robust_try_unlock() Thomas Gleixner
2026-03-30 12:03 ` Thomas Gleixner [this message]
2026-03-30 12:03 ` [patch V3 14/14] selftests: futex: Add tests for robust release operations Thomas Gleixner
2026-03-30 13:45 ` [patch V3 00/14] futex: Address the robust futex unlock race for real Mark Rutland
2026-03-30 13:51   ` Peter Zijlstra
2026-03-30 19:36   ` Thomas Gleixner
2026-03-31 14:12     ` Mark Rutland
2026-03-31 12:59   ` André Almeida
2026-03-31 13:03     ` Sebastian Andrzej Siewior
2026-03-31 14:13     ` Mark Rutland
2026-03-31 15:22   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260330120117.946243065@kernel.org \
    --to=tglx@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=andrealmeid@igalia.com \
    --cc=arnd@arndb.de \
    --cc=bigeasy@linutronix.de \
    --cc=carlos@redhat.com \
    --cc=dalias@aerifal.cx \
    --cc=dave@stgolabs.net \
    --cc=dvhart@infradead.org \
    --cc=fweimer@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@weissschuh.net \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=triegel@redhat.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.