* [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
@ 2025-01-27 20:26 André Almeida
2025-01-27 20:26 ` [PATCH v2 1/4] " André Almeida
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: André Almeida @ 2025-01-27 20:26 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, Florian Weimer
Cc: linux-kernel, kernel-dev, Vinicius Peixoto, André Almeida
As requested by Peter at [1], this patchset drops the ROBUST_LIST_LIMIT. This is
achieve by simply rewriting the processed list element ->next to point to the
head->list address, destroying the linked list to avoid any circular list.
Patches 2-4 create selftest for robust lists. Patch 2/4 creates helpful
macros for futex selftests and 3/4 creates a bunch of tests for the interface (as I
had submitted before at [2]).
Patch 4/4 is used to validate the changes made at 1/4:
- That the kernel can now handle a lock that is allocated in an index bigger than
ROBUST_LIST_LIMIT
- That the kernel can still handle circular linked lists.
[1] https://lore.kernel.org/lkml/20241219171344.GA26279@noisy.programming.kicks-ass.net/
[2] https://lore.kernel.org/lkml/20241212210436.112076-1-andrealmeid@igalia.com/
Changelog:
- Rebased on top of tip/locking/urgent (for 6.14-rc1)
v1: https://lore.kernel.org/lkml/20250110200508.353290-1-andrealmeid@igalia.com/#t
André Almeida (4):
futex: Drop ROBUST_LIST_LIMIT
selftests/futex: Add ASSERT_ macros
selftests/futex: Create test for robust list
selftests/futex: Create tests for long and circular robust lists
include/uapi/linux/futex.h | 3 +-
kernel/futex/core.c | 13 +-
.../selftests/futex/functional/.gitignore | 1 +
.../selftests/futex/functional/Makefile | 3 +-
.../selftests/futex/functional/robust_list.c | 641 ++++++++++++++++++
.../testing/selftests/futex/include/logging.h | 38 ++
6 files changed, 690 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/futex/functional/robust_list.c
--
2.48.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/4] futex: Drop ROBUST_LIST_LIMIT
2025-01-27 20:26 [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT André Almeida
@ 2025-01-27 20:26 ` André Almeida
2025-01-27 20:26 ` [PATCH v2 2/4] selftests/futex: Add ASSERT_ macros André Almeida
` (3 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: André Almeida @ 2025-01-27 20:26 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, Florian Weimer
Cc: linux-kernel, kernel-dev, Vinicius Peixoto, André Almeida
ROBUST_LIST_LIMIT was introduced to avoid the kernel get stuck in
circular lists, stopping to handle locks after the limit. This could
cause starvation in applications that have very long lists with valid
and non repetitive elements.
Instead of having a hard limit, rewrite the next pointer list while
walking on it. In this way, if the kernel ever revisits a repetitive
element (characterizing a circular list) the loop will stop.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
include/uapi/linux/futex.h | 3 +--
kernel/futex/core.c | 13 +++++++------
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index d2ee625ea189..3f31f38245f9 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -165,8 +165,7 @@ struct robust_list_head {
#define FUTEX_TID_MASK 0x3fffffff
/*
- * This limit protects against a deliberately circular list.
- * (Not worth introducing an rlimit for it)
+ * Deprecated, do not use. There is no limit of items in a list.
*/
#define ROBUST_LIST_LIMIT 2048
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 3db8567f5a44..b1328697d7cc 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -779,7 +779,7 @@ static void exit_robust_list(struct task_struct *curr)
{
struct robust_list_head __user *head = curr->robust_list;
struct robust_list __user *entry, *next_entry, *pending;
- unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
+ unsigned int pi, pip;
unsigned int next_pi;
unsigned long futex_offset;
int rc;
@@ -820,13 +820,14 @@ static void exit_robust_list(struct task_struct *curr)
}
if (rc)
return;
- entry = next_entry;
- pi = next_pi;
+
/*
- * Avoid excessively long or circular lists:
+ * Avoid circular lists:
*/
- if (!--limit)
- break;
+ put_user(&head->list, &entry->next);
+
+ entry = next_entry;
+ pi = next_pi;
cond_resched();
}
--
2.48.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 2/4] selftests/futex: Add ASSERT_ macros
2025-01-27 20:26 [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT André Almeida
2025-01-27 20:26 ` [PATCH v2 1/4] " André Almeida
@ 2025-01-27 20:26 ` André Almeida
2025-01-27 20:26 ` [PATCH v2 3/4] selftests/futex: Create test for robust list André Almeida
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: André Almeida @ 2025-01-27 20:26 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, Florian Weimer
Cc: linux-kernel, kernel-dev, Vinicius Peixoto, André Almeida
Create ASSERT_{EQ, NE, TRUE, FALSE} macros to make test creation easier.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
.../testing/selftests/futex/include/logging.h | 38 +++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/tools/testing/selftests/futex/include/logging.h b/tools/testing/selftests/futex/include/logging.h
index 874c69ce5cce..a19755622a87 100644
--- a/tools/testing/selftests/futex/include/logging.h
+++ b/tools/testing/selftests/futex/include/logging.h
@@ -23,6 +23,44 @@
#include <linux/futex.h>
#include "kselftest.h"
+#define ASSERT_EQ(var, value) \
+do { \
+ if (var != value) { \
+ ksft_test_result_fail("%s: expected %ld, but %s has %ld\n", \
+ __func__, (long) value, #var, \
+ (long) var); \
+ return; \
+ } \
+} while (0)
+
+#define ASSERT_NE(var, value) \
+do { \
+ if (var == value) { \
+ ksft_test_result_fail("%s: expected not %ld, but %s has %ld\n", \
+ __func__, (long) value, #var, \
+ (long) var); \
+ return; \
+ } \
+} while (0)
+
+#define ASSERT_TRUE(var) \
+do { \
+ if ((var) == 0) { \
+ ksft_test_result_fail("%s: expected %s to be true\n", \
+ __func__, #var); \
+ return; \
+ } \
+} while (0)
+
+#define ASSERT_FALSE(var) \
+do { \
+ if (var) { \
+ ksft_test_result_fail("%s: expected %s to be false\n", \
+ __func__, #var); \
+ return; \
+ } \
+} while (0)
+
/*
* Define PASS, ERROR, and FAIL strings with and without color escape
* sequences, default to no color.
--
2.48.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 3/4] selftests/futex: Create test for robust list
2025-01-27 20:26 [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT André Almeida
2025-01-27 20:26 ` [PATCH v2 1/4] " André Almeida
2025-01-27 20:26 ` [PATCH v2 2/4] selftests/futex: Add ASSERT_ macros André Almeida
@ 2025-01-27 20:26 ` André Almeida
2025-01-27 20:26 ` [PATCH v2 4/4] selftests/futex: Create tests for long and circular robust lists André Almeida
2025-01-28 7:50 ` [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT Florian Weimer
4 siblings, 0 replies; 11+ messages in thread
From: André Almeida @ 2025-01-27 20:26 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, Florian Weimer
Cc: linux-kernel, kernel-dev, Vinicius Peixoto, André Almeida
Create a test for the robust list mechanism. Test the following uAPI
operations:
- Creating a robust mutex where the lock waiter is wake by the kernel
when the lock owner died
- Setting a robust list to the current task
- Getting a robust list from the current task
- Getting a robust list from another task
- Using the list_op_pending field from robust_list_head struct to test
robustness when the lock owner dies before completing the locking
- Setting a invalid size for syscall argument `len`
This is the expected output:
TAP version 13
1..6
ok 1 test_robustness
ok 2 test_set_robust_list_invalid_size
ok 3 test_get_robust_list_self
ok 4 test_get_robust_list_child
ok 5 test_set_list_op_pending
ok 6 test_robust_list_multiple_elements
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
.../selftests/futex/functional/.gitignore | 1 +
.../selftests/futex/functional/Makefile | 3 +-
.../selftests/futex/functional/robust_list.c | 516 ++++++++++++++++++
3 files changed, 519 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/futex/functional/robust_list.c
diff --git a/tools/testing/selftests/futex/functional/.gitignore b/tools/testing/selftests/futex/functional/.gitignore
index fbcbdb6963b3..4726e1be7497 100644
--- a/tools/testing/selftests/futex/functional/.gitignore
+++ b/tools/testing/selftests/futex/functional/.gitignore
@@ -9,3 +9,4 @@ futex_wait_wouldblock
futex_wait
futex_requeue
futex_waitv
+robust_list
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
index f79f9bac7918..b8635a1ac7f6 100644
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -17,7 +17,8 @@ TEST_GEN_PROGS := \
futex_wait_private_mapped_file \
futex_wait \
futex_requeue \
- futex_waitv
+ futex_waitv \
+ robust_list
TEST_PROGS := run.sh
diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools/testing/selftests/futex/functional/robust_list.c
new file mode 100644
index 000000000000..bd4437c6aebb
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/robust_list.c
@@ -0,0 +1,516 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Igalia S.L.
+ *
+ * Robust list test by André Almeida <andrealmeid@igalia.com>
+ *
+ * The robust list uAPI allows userspace to create "robust" locks, in the sense
+ * that if the lock holder thread dies, the remaining threads that are waiting
+ * for the lock won't block forever, waiting for a lock that will never be
+ * released.
+ *
+ * This is achieve by userspace setting a list where a thread can enter all the
+ * locks (futexes) that it is holding. The robust list is a linked list, and
+ * userspace register the start of the list with the syscall set_robust_list().
+ * If such thread eventually dies, the kernel will walk this list, waking up one
+ * thread waiting for each futex and marking the futex word with the flag
+ * FUTEX_OWNER_DIED.
+ *
+ * See also
+ * man set_robust_list
+ * Documententation/locking/robust-futex-ABI.rst
+ * Documententation/locking/robust-futexes.rst
+ */
+
+#define _GNU_SOURCE
+
+#include "futextest.h"
+#include "logging.h"
+
+#include <errno.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdatomic.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+
+#define STACK_SIZE (1024 * 1024)
+
+#define FUTEX_TIMEOUT 3
+
+static pthread_barrier_t barrier, barrier2;
+
+int set_robust_list(struct robust_list_head *head, size_t len)
+{
+ return syscall(SYS_set_robust_list, head, len);
+}
+
+int get_robust_list(int pid, struct robust_list_head **head, size_t *len_ptr)
+{
+ return syscall(SYS_get_robust_list, pid, head, len_ptr);
+}
+
+/*
+ * Basic lock struct, contains just the futex word and the robust list element
+ * Real implementations have also a *prev to easily walk in the list
+ */
+struct lock_struct {
+ _Atomic(unsigned int) futex;
+ struct robust_list list;
+};
+
+/*
+ * Helper function to spawn a child thread. Returns -1 on error, pid on success
+ */
+static int create_child(int (*fn)(void *arg), void *arg)
+{
+ char *stack;
+ pid_t pid;
+
+ stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
+ if (stack == MAP_FAILED)
+ return -1;
+
+ stack += STACK_SIZE;
+
+ pid = clone(fn, stack, CLONE_VM | SIGCHLD, arg);
+
+ if (pid == -1)
+ return -1;
+
+ return pid;
+}
+
+/*
+ * Helper function to prepare and register a robust list
+ */
+static int set_list(struct robust_list_head *head)
+{
+ int ret;
+
+ ret = set_robust_list(head, sizeof(struct robust_list_head));
+ if (ret)
+ return ret;
+
+ head->futex_offset = (size_t) offsetof(struct lock_struct, futex) -
+ (size_t) offsetof(struct lock_struct, list);
+ head->list.next = &head->list;
+ head->list_op_pending = NULL;
+
+ return 0;
+}
+
+/*
+ * A basic (and incomplete) mutex lock function with robustness
+ */
+static int mutex_lock(struct lock_struct *lock, struct robust_list_head *head, bool error_inject)
+{
+ _Atomic(unsigned int) *futex = &lock->futex;
+ unsigned int zero = 0;
+ int ret = -1;
+ pid_t tid = gettid();
+
+ /*
+ * Set list_op_pending before starting the lock, so the kernel can catch
+ * the case where the thread died during the lock operation
+ */
+ head->list_op_pending = &lock->list;
+
+ if (atomic_compare_exchange_strong(futex, &zero, tid)) {
+ /*
+ * We took the lock, insert it in the robust list
+ */
+ struct robust_list *list = &head->list;
+
+ /* Error injection to test list_op_pending */
+ if (error_inject)
+ return 0;
+
+ while (list->next != &head->list)
+ list = list->next;
+
+ list->next = &lock->list;
+ lock->list.next = &head->list;
+
+ ret = 0;
+ } else {
+ /*
+ * We didn't take the lock, wait until the owner wakes (or dies)
+ */
+ struct timespec to;
+
+ to.tv_sec = FUTEX_TIMEOUT;
+ to.tv_nsec = 0;
+
+ tid = atomic_load(futex);
+ /* Kernel ignores futexes without the waiters flag */
+ tid |= FUTEX_WAITERS;
+ atomic_store(futex, tid);
+
+ ret = futex_wait((futex_t *) futex, tid, &to, 0);
+
+ /*
+ * A real mutex_lock() implementation would loop here to finally
+ * take the lock. We don't care about that, so we stop here.
+ */
+ }
+
+ head->list_op_pending = NULL;
+
+ return ret;
+}
+
+/*
+ * This child thread will succeed taking the lock, and then will exit holding it
+ */
+static int child_fn_lock(void *arg)
+{
+ struct lock_struct *lock = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int ret;
+
+ ret = set_list(&head);
+ if (ret)
+ ksft_test_result_fail("set_robust_list error\n");
+
+ ret = mutex_lock(lock, &head, false);
+ if (ret)
+ ksft_test_result_fail("mutex_lock error\n");
+
+ pthread_barrier_wait(&barrier);
+
+ /*
+ * There's a race here: the parent thread needs to be inside
+ * futex_wait() before the child thread dies, otherwise it will miss the
+ * wakeup from handle_futex_death() that this child will emit. We wait a
+ * little bit just to make sure that this happens.
+ */
+ sleep(1);
+
+ return 0;
+}
+
+/*
+ * Spawns a child thread that will set a robust list, take the lock, register it
+ * in the robust list and die. The parent thread will wait on this futex, and
+ * should be waken up when the child exits.
+ */
+static void test_robustness(void)
+{
+ struct lock_struct lock = { .futex = 0 };
+ struct robust_list_head head;
+ _Atomic(unsigned int) *futex = &lock.futex;
+ int ret;
+
+ ret = set_list(&head);
+ ASSERT_EQ(ret, 0);
+
+ /*
+ * Lets use a barrier to ensure that the child thread takes the lock
+ * before the parent
+ */
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ ret = create_child(&child_fn_lock, &lock);
+ ASSERT_NE(ret, -1);
+
+ pthread_barrier_wait(&barrier);
+ ret = mutex_lock(&lock, &head, false);
+
+ /*
+ * futex_wait() should return 0 and the futex word should be marked with
+ * FUTEX_OWNER_DIED
+ */
+ ASSERT_EQ(ret, 0);
+ if (ret != 0)
+ printf("futex wait returned %d", errno);
+
+ ASSERT_TRUE(*futex | FUTEX_OWNER_DIED);
+
+ wait(NULL);
+ pthread_barrier_destroy(&barrier);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+/*
+ * The only valid value for len is sizeof(*head)
+ */
+static void test_set_robust_list_invalid_size(void)
+{
+ struct robust_list_head head;
+ size_t head_size = sizeof(struct robust_list_head);
+ int ret;
+
+ ret = set_robust_list(&head, head_size);
+ ASSERT_EQ(ret, 0);
+
+ ret = set_robust_list(&head, head_size * 2);
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EINVAL);
+
+ ret = set_robust_list(&head, head_size - 1);
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EINVAL);
+
+ ret = set_robust_list(&head, 0);
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EINVAL);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+/*
+ * Test get_robust_list with pid = 0, getting the list of the running thread
+ */
+static void test_get_robust_list_self(void)
+{
+ struct robust_list_head head, head2, *get_head;
+ size_t head_size = sizeof(struct robust_list_head), len_ptr;
+ int ret;
+
+ ret = set_robust_list(&head, head_size);
+ ASSERT_EQ(ret, 0);
+
+ ret = get_robust_list(0, &get_head, &len_ptr);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(get_head, &head);
+ ASSERT_EQ(head_size, len_ptr);
+
+ ret = set_robust_list(&head2, head_size);
+ ASSERT_EQ(ret, 0);
+
+ ret = get_robust_list(0, &get_head, &len_ptr);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(get_head, &head2);
+ ASSERT_EQ(head_size, len_ptr);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_list(void *arg)
+{
+ struct robust_list_head *head = (struct robust_list_head *) arg;
+ int ret;
+
+ ret = set_robust_list(head, sizeof(struct robust_list_head));
+ if (ret)
+ ksft_test_result_fail("set_robust_list error\n");
+
+ pthread_barrier_wait(&barrier);
+ pthread_barrier_wait(&barrier2);
+
+ return 0;
+}
+
+/*
+ * Test get_robust_list from another thread. We use two barriers here to ensure
+ * that:
+ * 1) the child thread set the list before we try to get it from the
+ * parent
+ * 2) the child thread still alive when we try to get the list from it
+ */
+static void test_get_robust_list_child(void)
+{
+ pid_t tid;
+ int ret;
+ struct robust_list_head head, *get_head;
+ size_t len_ptr;
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ret = pthread_barrier_init(&barrier2, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ tid = create_child(&child_list, &head);
+ ASSERT_NE(tid, -1);
+
+ pthread_barrier_wait(&barrier);
+
+ ret = get_robust_list(tid, &get_head, &len_ptr);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(&head, get_head);
+
+ pthread_barrier_wait(&barrier2);
+
+ wait(NULL);
+ pthread_barrier_destroy(&barrier);
+ pthread_barrier_destroy(&barrier2);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_fn_lock_with_error(void *arg)
+{
+ struct lock_struct *lock = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int ret;
+
+ ret = set_list(&head);
+ if (ret)
+ ksft_test_result_fail("set_robust_list error\n");
+
+ ret = mutex_lock(lock, &head, true);
+ if (ret)
+ ksft_test_result_fail("mutex_lock error\n");
+
+ pthread_barrier_wait(&barrier);
+
+ sleep(1);
+
+ return 0;
+}
+
+/*
+ * Same as robustness test, but inject an error where the mutex_lock() exits
+ * earlier, just after setting list_op_pending and taking the lock, to test the
+ * list_op_pending mechanism
+ */
+static void test_set_list_op_pending(void)
+{
+ struct lock_struct lock = { .futex = 0 };
+ struct robust_list_head head;
+ _Atomic(unsigned int) *futex = &lock.futex;
+ int ret;
+
+ ret = set_list(&head);
+ ASSERT_EQ(ret, 0);
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ ret = create_child(&child_fn_lock_with_error, &lock);
+ ASSERT_NE(ret, -1);
+
+ pthread_barrier_wait(&barrier);
+ ret = mutex_lock(&lock, &head, false);
+
+ ASSERT_EQ(ret, 0);
+ if (ret != 0)
+ printf("futex wait returned %d", errno);
+
+ ASSERT_TRUE(*futex | FUTEX_OWNER_DIED);
+
+ wait(NULL);
+ pthread_barrier_destroy(&barrier);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+#define CHILD_NR 10
+
+static int child_lock_holder(void *arg)
+{
+ struct lock_struct *locks = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int i;
+
+ set_list(&head);
+
+ for (i = 0; i < CHILD_NR; i++) {
+ locks[i].futex = 0;
+ mutex_lock(&locks[i], &head, false);
+ }
+
+ pthread_barrier_wait(&barrier);
+ pthread_barrier_wait(&barrier2);
+
+ sleep(1);
+ return 0;
+}
+
+static int child_wait_lock(void *arg)
+{
+ struct lock_struct *lock = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int ret;
+
+ pthread_barrier_wait(&barrier2);
+ ret = mutex_lock(lock, &head, false);
+
+ if (ret)
+ ksft_test_result_fail("mutex_lock error\n");
+
+ if (!(lock->futex | FUTEX_OWNER_DIED))
+ ksft_test_result_fail("futex not marked with FUTEX_OWNER_DIED\n");
+
+ return 0;
+}
+
+/*
+ * Test a robust list of more than one element. All the waiters should wake when
+ * the holder dies
+ */
+static void test_robust_list_multiple_elements(void)
+{
+ struct lock_struct locks[CHILD_NR];
+ int i, ret;
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+ ret = pthread_barrier_init(&barrier2, NULL, CHILD_NR + 1);
+ ASSERT_EQ(ret, 0);
+
+ create_child(&child_lock_holder, &locks);
+
+ /* Wait until the locker thread takes the look */
+ pthread_barrier_wait(&barrier);
+
+ for (i = 0; i < CHILD_NR; i++)
+ create_child(&child_wait_lock, &locks[i]);
+
+ /* Wait for all children to return */
+ while (wait(NULL) > 0);
+
+ pthread_barrier_destroy(&barrier);
+ pthread_barrier_destroy(&barrier2);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+void usage(char *prog)
+{
+ printf("Usage: %s\n", prog);
+ printf(" -c Use color\n");
+ printf(" -h Display this help message\n");
+ printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
+ VQUIET, VCRITICAL, VINFO);
+}
+
+int main(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "cht:v:")) != -1) {
+ switch (c) {
+ case 'c':
+ log_color(1);
+ break;
+ case 'h':
+ usage(basename(argv[0]));
+ exit(0);
+ case 'v':
+ log_verbosity(atoi(optarg));
+ break;
+ default:
+ usage(basename(argv[0]));
+ exit(1);
+ }
+ }
+
+ ksft_print_header();
+ ksft_set_plan(6);
+
+ test_robustness();
+ test_set_robust_list_invalid_size();
+ test_get_robust_list_self();
+ test_get_robust_list_child();
+ test_set_list_op_pending();
+ test_robust_list_multiple_elements();
+
+ ksft_print_cnts();
+ return 0;
+}
--
2.48.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 4/4] selftests/futex: Create tests for long and circular robust lists
2025-01-27 20:26 [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT André Almeida
` (2 preceding siblings ...)
2025-01-27 20:26 ` [PATCH v2 3/4] selftests/futex: Create test for robust list André Almeida
@ 2025-01-27 20:26 ` André Almeida
2025-01-28 7:50 ` [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT Florian Weimer
4 siblings, 0 replies; 11+ messages in thread
From: André Almeida @ 2025-01-27 20:26 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, Florian Weimer
Cc: linux-kernel, kernel-dev, Vinicius Peixoto, André Almeida
After dropping ROBUST_LIST_LIMIT, create new robust list tests to ensure
two conditions:
- That the kernel can correctly handle circular lists
- That the kernel can correctly wake up elements in very long lists,
larger that the old ROBUST_LIST_LIMIT.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
.../selftests/futex/functional/robust_list.c | 127 +++++++++++++++++-
1 file changed, 126 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools/testing/selftests/futex/functional/robust_list.c
index bd4437c6aebb..4bd6795b3ac5 100644
--- a/tools/testing/selftests/futex/functional/robust_list.c
+++ b/tools/testing/selftests/futex/functional/robust_list.c
@@ -471,6 +471,128 @@ static void test_robust_list_multiple_elements(void)
ksft_test_result_pass("%s\n", __func__);
}
+/*
+ * This is the old limit defined by the kernel
+ */
+#define ROBUST_LIST_LIMIT 2048
+#define CHILD_LIST_LIMIT (ROBUST_LIST_LIMIT + 10)
+
+static int child_robust_list_limit(void *arg)
+{
+ struct lock_struct *locks;
+ struct robust_list *list;
+ struct robust_list_head head;
+ int ret, i;
+
+ locks = (struct lock_struct *) arg;
+
+ ret = set_list(&head);
+ if (ret)
+ ksft_test_result_fail("set_list error\n");
+
+ /*
+ * Create a very long list of locks
+ */
+ head.list.next = &locks[0].list;
+
+ list = head.list.next;
+ for (i = 0; i < CHILD_LIST_LIMIT - 1; i++) {
+ list->next = &locks[i+1].list;
+ list = list->next;
+ }
+ list->next = &head.list;
+
+ /*
+ * Grab the lock in the last one, and die without releasing it
+ */
+ mutex_lock(&locks[CHILD_LIST_LIMIT], &head, false);
+ pthread_barrier_wait(&barrier);
+
+ sleep(1);
+
+ return 0;
+}
+
+/*
+ * Robust list used to have a limit of 2048 items from the kernel side. After
+ * this limit the kernel would stop walking the list and ignore the other
+ * futexes, causing deadlocks.
+ *
+ * After this limit has been dropped, test if we can wait for a list of more
+ * than 2048 elements.
+ */
+static void test_robust_list_limit(void)
+{
+ struct lock_struct locks[CHILD_LIST_LIMIT + 1];
+ _Atomic(unsigned int) *futex = &locks[CHILD_LIST_LIMIT].futex;
+ struct robust_list_head head;
+ int ret;
+
+ *futex = 0;
+
+ ret = set_list(&head);
+ ASSERT_EQ(ret, 0);
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ create_child(child_robust_list_limit, locks);
+
+ /*
+ * After the child thread creates the very long list of locks, wait on
+ * the last one.
+ */
+ pthread_barrier_wait(&barrier);
+ ret = mutex_lock(&locks[CHILD_LIST_LIMIT], &head, false);
+
+ if (ret != 0)
+ printf("futex wait returned %d\n", errno);
+ ASSERT_EQ(ret, 0);
+
+ ASSERT_TRUE(*futex | FUTEX_OWNER_DIED);
+
+ wait(NULL);
+ pthread_barrier_destroy(&barrier);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_circular_list(void *arg)
+{
+ static struct robust_list_head head;
+ struct lock_struct a, b, c;
+ int ret;
+
+ ret = set_list(&head);
+ if (ret)
+ ksft_test_result_fail("set_list error\n");
+
+ head.list.next = &a.list;
+
+ /*
+ * The last element should point to head list, but we short circuit it
+ */
+ a.list.next = &b.list;
+ b.list.next = &c.list;
+ c.list.next = &a.list;
+
+ return 0;
+}
+
+/*
+ * Create a circular robust list. The kernel should be able to destroy the list
+ * while processing it so it won't be trapped in an infinite loop while handling
+ * a process exit
+ */
+static void test_circular_list(void)
+{
+ create_child(child_circular_list, NULL);
+
+ wait(NULL);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
void usage(char *prog)
{
printf("Usage: %s\n", prog);
@@ -502,14 +624,17 @@ int main(int argc, char *argv[])
}
ksft_print_header();
- ksft_set_plan(6);
+ ksft_set_plan(8);
test_robustness();
+
test_set_robust_list_invalid_size();
test_get_robust_list_self();
test_get_robust_list_child();
test_set_list_op_pending();
test_robust_list_multiple_elements();
+ test_robust_list_limit();
+ test_circular_list();
ksft_print_cnts();
return 0;
--
2.48.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
2025-01-27 20:26 [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT André Almeida
` (3 preceding siblings ...)
2025-01-27 20:26 ` [PATCH v2 4/4] selftests/futex: Create tests for long and circular robust lists André Almeida
@ 2025-01-28 7:50 ` Florian Weimer
2025-01-28 14:28 ` André Almeida
2025-02-03 13:29 ` Peter Zijlstra
4 siblings, 2 replies; 11+ messages in thread
From: Florian Weimer @ 2025-01-28 7:50 UTC (permalink / raw)
To: André Almeida
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, linux-kernel, kernel-dev,
Vinicius Peixoto
* André Almeida:
> As requested by Peter at [1], this patchset drops the
> ROBUST_LIST_LIMIT. This is achieve by simply rewriting the processed
> list element ->next to point to the head->list address, destroying the
> linked list to avoid any circular list.
Doesn't this turn a robust mutex overwrite or a TCB overwrite into a
write-anything-anywhere primitive? Furthermore, I'm not entirely sure
if this is entirely backwards-compatible.
Could you use the tortoise/hare approach instead?
Thanks,
Florian
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
2025-01-28 7:50 ` [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT Florian Weimer
@ 2025-01-28 14:28 ` André Almeida
2025-01-28 20:35 ` Florian Weimer
2025-02-03 13:29 ` Peter Zijlstra
1 sibling, 1 reply; 11+ messages in thread
From: André Almeida @ 2025-01-28 14:28 UTC (permalink / raw)
To: Florian Weimer
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, linux-kernel, kernel-dev,
Vinicius Peixoto
Hi Florian,
Em 28/01/2025 04:50, Florian Weimer escreveu:
> * André Almeida:
>
>> As requested by Peter at [1], this patchset drops the
>> ROBUST_LIST_LIMIT. This is achieve by simply rewriting the processed
>> list element ->next to point to the head->list address, destroying the
>> linked list to avoid any circular list.
>
> Doesn't this turn a robust mutex overwrite or a TCB overwrite into a
> write-anything-anywhere primitive? Furthermore, I'm not entirely sure
> if this is entirely backwards-compatible.
>
The robust list is meant to be a private resource, per-process, and this
patch only rewrites it after the process exits, so I believe that any
changes done in this memory should be safe given that the process will
soon disappear anyway, right?
Do you think you can point out a scenario that wouldn't be
backwards-compatible? I would like to try to test it.
> Could you use the tortoise/hare approach instead?
>
I believe that you want the approach to be "slow and steady" but I'm not
sure what you have in mind, if you could you please elaborate :)
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
2025-01-28 14:28 ` André Almeida
@ 2025-01-28 20:35 ` Florian Weimer
2025-02-03 13:34 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Florian Weimer @ 2025-01-28 20:35 UTC (permalink / raw)
To: André Almeida
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, linux-kernel, kernel-dev,
Vinicius Peixoto
* André Almeida:
> Hi Florian,
>
> Em 28/01/2025 04:50, Florian Weimer escreveu:
>> * André Almeida:
>>
>>> As requested by Peter at [1], this patchset drops the
>>> ROBUST_LIST_LIMIT. This is achieve by simply rewriting the processed
>>> list element ->next to point to the head->list address, destroying the
>>> linked list to avoid any circular list.
>> Doesn't this turn a robust mutex overwrite or a TCB overwrite into a
>> write-anything-anywhere primitive? Furthermore, I'm not entirely sure
>> if this is entirely backwards-compatible.
>>
>
> The robust list is meant to be a private resource, per-process, and
> this patch only rewrites it after the process exits, so I believe that
> any changes done in this memory should be safe given that the process
> will soon disappear anyway, right?
At least in the glibc implementation, we let the kernel handle robust
mutex notification on thread exit, and that's observable.
Beyond that, process-shared robust mutexes exist, too, and those updates
will be observable, too.
> Do you think you can point out a scenario that wouldn't be
> backwards-compatible? I would like to try to test it.
I think it should be okay for the glibc implementation. The robust list
is libc-owned (at least in glibc implementation), so it should not
matter, but the are other libs out there.
>> Could you use the tortoise/hare approach instead?
> I believe that you want the approach to be "slow and steady" but I'm
> not sure what you have in mind, if you could you please elaborate :)
I meant cycle detection using Floyd's algorithm.
Thanks,
Florian
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
2025-01-28 7:50 ` [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT Florian Weimer
2025-01-28 14:28 ` André Almeida
@ 2025-02-03 13:29 ` Peter Zijlstra
2025-02-03 14:16 ` André Almeida
1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-02-03 13:29 UTC (permalink / raw)
To: Florian Weimer
Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, linux-kernel, kernel-dev,
Vinicius Peixoto
On Tue, Jan 28, 2025 at 08:50:41AM +0100, Florian Weimer wrote:
> * André Almeida:
>
> > As requested by Peter at [1], this patchset drops the
> > ROBUST_LIST_LIMIT. This is achieve by simply rewriting the processed
> > list element ->next to point to the head->list address, destroying the
> > linked list to avoid any circular list.
Well, I suggested we do this for a new robust list.
> Furthermore, I'm not entirely sure
> if this is entirely backwards-compatible.
I share Florian's concern about backward compat here. It might work, it
might not.
I was just saying that if we're going to be doing new robust lists, we
should try and fix all the known wrongs, and this one lets us get rid of the limit.
> Could you use the tortoise/hare approach instead?
That seems overly complicated for what we need.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
2025-01-28 20:35 ` Florian Weimer
@ 2025-02-03 13:34 ` Peter Zijlstra
0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2025-02-03 13:34 UTC (permalink / raw)
To: Florian Weimer
Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, linux-kernel, kernel-dev,
Vinicius Peixoto
On Tue, Jan 28, 2025 at 09:35:26PM +0100, Florian Weimer wrote:
> >> Doesn't this turn a robust mutex overwrite or a TCB overwrite into a
> >> write-anything-anywhere primitive?
> >
> > The robust list is meant to be a private resource, per-process, and
> > this patch only rewrites it after the process exits, so I believe that
> > any changes done in this memory should be safe given that the process
> > will soon disappear anyway, right?
>
> At least in the glibc implementation, we let the kernel handle robust
> mutex notification on thread exit, and that's observable.
>
> Beyond that, process-shared robust mutexes exist, too, and those updates
> will be observable, too.
AFAICT we don't allow writing anywhere we couldn't already. The process
shared things should be in shared memory, something we can already write
to.
Notably, the kernel doesn't change address space while walking the
robust list (the robust list doesn't even contain enough information to
do this, even if we wanted to), it can only write to things the dying
task could already write to.
So I don't think there is a security angle here. Yes userspace can shoot
itself in the foot with this, but what else is new.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT
2025-02-03 13:29 ` Peter Zijlstra
@ 2025-02-03 14:16 ` André Almeida
0 siblings, 0 replies; 11+ messages in thread
From: André Almeida @ 2025-02-03 14:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, Florian Weimer, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, linux-kernel, kernel-dev,
Vinicius Peixoto
Em 03/02/2025 10:29, Peter Zijlstra escreveu:
> On Tue, Jan 28, 2025 at 08:50:41AM +0100, Florian Weimer wrote:
>> * André Almeida:
>>
>>> As requested by Peter at [1], this patchset drops the
>>> ROBUST_LIST_LIMIT. This is achieve by simply rewriting the processed
>>> list element ->next to point to the head->list address, destroying the
>>> linked list to avoid any circular list.
>
> Well, I suggested we do this for a new robust list.
>
>> Furthermore, I'm not entirely sure
>> if this is entirely backwards-compatible.
>
> I share Florian's concern about backward compat here. It might work, it
> might not.
>
> I was just saying that if we're going to be doing new robust lists, we
> should try and fix all the known wrongs, and this one lets us get rid of the limit.
>
Oh, now I see, thanks for the clarification. I will add this for the new
interface then.
In the meanwhile, if you can a look at patch 3/4 "selftests/futex:
Create test for robust list" that would help to make sure the new
interface don't mess with the old one.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-02-03 14:16 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-27 20:26 [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT André Almeida
2025-01-27 20:26 ` [PATCH v2 1/4] " André Almeida
2025-01-27 20:26 ` [PATCH v2 2/4] selftests/futex: Add ASSERT_ macros André Almeida
2025-01-27 20:26 ` [PATCH v2 3/4] selftests/futex: Create test for robust list André Almeida
2025-01-27 20:26 ` [PATCH v2 4/4] selftests/futex: Create tests for long and circular robust lists André Almeida
2025-01-28 7:50 ` [PATCH v2 0/4] futex: Drop ROBUST_LIST_LIMIT Florian Weimer
2025-01-28 14:28 ` André Almeida
2025-01-28 20:35 ` Florian Weimer
2025-02-03 13:34 ` Peter Zijlstra
2025-02-03 13:29 ` Peter Zijlstra
2025-02-03 14:16 ` André Almeida
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox