* [PATCH v2 0/3] futex: Create set_robust_list2
@ 2024-11-01 16:21 André Almeida
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
André Almeida
This patch series adds a new set_robust_list2() syscall. The current
syscall can't be expanded to cover the following use case, so a new one is
needed. This new syscall allows users to set multiple robust lists per
process and to have either 32bit or 64bit pointers in the list.
* Use case
FEX-Emu[1] is an application that runs x86 and x86-64 binaries on an
AArch64 Linux host. One of the tasks of FEX-Emu is to translate syscalls
from one platform to another. Existing set_robust_list() can't be easily
translated because of two limitations:
1) x86 apps can have robust lists with 32bit pointers. For an x86-64 kernel
this is not a problem, because of the compat entry point. But there's
no such compat entry point for AArch64, so the kernel would do the
pointer arithmetic wrongly. It is also unviable for userspace to keep
track of every addition/removal to the robust list and keep a 64bit
version of it somewhere else to feed the kernel. Thus, the new
interface has an option for telling the kernel whether the list is filled
with 32bit or 64bit pointers.
2) Apps can set just one robust list (in theory, x86-64 apps can set two if
they also use the compat entry point). That means that when an x86 app
asks FEX-Emu to call set_robust_list(), FEX has two options: overwrite
its own robust list pointer and make the app robust, or ignore the
app's robust list and keep the emulator robust. The new interface
allows multiple robust lists per application, solving this.
* Interface
This is the proposed interface:
long set_robust_list2(void *head, int index, unsigned int flags)
`head` is the head of the userspace struct robust_list_head, just as in the
old set_robust_list(). It needs to be a void pointer since it can point to
either a normal robust_list_head or a compat_robust_list_head.
`flags` can be used for defining the list type:
enum robust_list_type {
ROBUST_LIST_32BIT,
ROBUST_LIST_64BIT,
};
`index` is the index into the kernel's internal linked list of robust lists
(the naming starts to get confusing, I reckon). If `index == -1`, the user
wants to set a new robust list, and the kernel will append it at the end of
the internal list, assign a new index and return this index to the user. If
`index >= 0`, the user wants to re-set the `*head` of an already existing
list (similarly to what happens when you call set_robust_list() twice with
different `*head`). If `index` is out of range, points to a non-existing
robust list, or the internal list is full, an error is returned.
* Implementation
The implementation reuses as much of the existing robust list interface as
possible. The new task_struct member `struct list_head robust_list2` is just a
linked list where new lists are appended as the user requests more of them,
and in futex_cleanup() the kernel walks through this internal list, feeding
exit_robust_list() with each robust list.
This implementation supports up to 10 lists (defined by ROBUST_LISTS_PER_TASK),
but that is an arbitrary number for this RFC. For the use case described above,
4 should be enough; I'm not sure what the limit should be.
It doesn't support list removal (should it?). It doesn't have a proper
get_robust_list2() yet either, but I can add one in a next revision. We could
also have a generic robust_list() syscall that can be used to set/get and is
controlled by flags.
The new interface has a `unsigned int flags` argument, making it
extensible for future use cases as well.
* Testing
I will provide a selftest similar to the one I proposed for the current
interface here:
https://lore.kernel.org/lkml/20241010011142.905297-1-andrealmeid@igalia.com/
Also, FEX-Emu added support for this interface to validate it:
https://github.com/FEX-Emu/FEX/pull/3966
Feedback is very welcome!
Thanks,
André
[1] https://github.com/FEX-Emu/FEX
Changelog:
- Added a patch to properly deal with exit_robust_list() in 64bit vs 32bit
- Wired-up syscall for all archs
- Added more of the cover letter to the commit message
v1: https://lore.kernel.org/lkml/20241024145735.162090-1-andrealmeid@igalia.com/
André Almeida (3):
futex: Use explicit sizes for compat_exit_robust_list
futex: Create set_robust_list2
futex: Wire up set_robust_list2 syscall
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/linux/compat.h | 12 +-
include/linux/futex.h | 12 ++
include/linux/sched.h | 3 +-
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/futex.h | 24 ++++
init/init_task.c | 3 +
kernel/futex/core.c | 116 +++++++++++++++++---
kernel/futex/futex.h | 3 +
kernel/futex/syscalls.c | 40 ++++++-
kernel/sys_ni.c | 1 +
scripts/syscall.tbl | 1 +
26 files changed, 203 insertions(+), 32 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
@ 2024-11-01 16:21 ` André Almeida
2024-11-02 5:44 ` kernel test robot
2024-11-02 14:57 ` kernel test robot
2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
` (2 subsequent siblings)
3 siblings, 2 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
André Almeida
There are two functions for handling robust lists during task
exit: exit_robust_list() and compat_exit_robust_list(). The first one
handles either 64bit or 32bit lists, depending on whether it's a 64bit or a
32bit kernel. compat_exit_robust_list() only exists in 64bit kernels that
support 32bit syscalls, and handles 32bit lists.
For the new syscall set_robust_list2(), 64bit kernels need to be able to
handle 32bit lists regardless of whether they support 32bit syscalls, so
make compat_exit_robust_list() exist regardless of the compat config.
Also, use explicit sizing; otherwise, in a 32bit kernel both
exit_robust_list() and compat_exit_robust_list() would be exactly the
same function, with neither of them dealing with 64bit robust lists.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
This code was tested in 3 different setups:
- 64bit binary in 64bit kernel
- 32bit binary in 64bit kernel
- 32bit binary in 32bit kernel
Using this selftest: https://lore.kernel.org/lkml/20241010011142.905297-1-andrealmeid@igalia.com/
include/linux/compat.h | 12 +----------
include/linux/futex.h | 11 ++++++++++
include/linux/sched.h | 2 +-
kernel/futex/core.c | 45 ++++++++++++++++++++++++++---------------
kernel/futex/syscalls.c | 4 ++--
5 files changed, 44 insertions(+), 30 deletions(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 56cebaff0c91..968a9135ff48 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -385,16 +385,6 @@ struct compat_ifconf {
compat_caddr_t ifcbuf;
};
-struct compat_robust_list {
- compat_uptr_t next;
-};
-
-struct compat_robust_list_head {
- struct compat_robust_list list;
- compat_long_t futex_offset;
- compat_uptr_t list_op_pending;
-};
-
#ifdef CONFIG_COMPAT_OLD_SIGACTION
struct compat_old_sigaction {
compat_uptr_t sa_handler;
@@ -672,7 +662,7 @@ asmlinkage long compat_sys_waitid(int, compat_pid_t,
struct compat_siginfo __user *, int,
struct compat_rusage __user *);
asmlinkage long
-compat_sys_set_robust_list(struct compat_robust_list_head __user *head,
+compat_sys_set_robust_list(struct robust_list_head32 __user *head,
compat_size_t len);
asmlinkage long
compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
diff --git a/include/linux/futex.h b/include/linux/futex.h
index b70df27d7e85..8217b5ebdd9c 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -53,6 +53,17 @@ union futex_key {
#define FUTEX_KEY_INIT (union futex_key) { .both = { .ptr = 0ULL } }
#ifdef CONFIG_FUTEX
+
+struct robust_list32 {
+ u32 next;
+};
+
+struct robust_list_head32 {
+ struct robust_list32 list;
+ s32 futex_offset;
+ u32 list_op_pending;
+};
+
enum {
FUTEX_STATE_OK,
FUTEX_STATE_EXITING,
diff --git a/include/linux/sched.h b/include/linux/sched.h
index bb343136ddd0..8f20b703557d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1282,7 +1282,7 @@ struct task_struct {
#ifdef CONFIG_FUTEX
struct robust_list_head __user *robust_list;
#ifdef CONFIG_COMPAT
- struct compat_robust_list_head __user *compat_robust_list;
+ struct robust_list_head32 __user *compat_robust_list;
#endif
struct list_head pi_state_list;
struct futex_pi_state *pi_state_cache;
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 136768ae2637..bcd0e2a7ba65 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -790,13 +790,14 @@ static inline int fetch_robust_entry(struct robust_list __user **entry,
return 0;
}
+#ifdef CONFIG_64BIT
/*
* Walk curr->robust_list (very carefully, it's a userspace list!)
* and mark any locks found there dead, and notify any waiters.
*
* We silently return on any sign of list-walking problem.
*/
-static void exit_robust_list(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr)
{
struct robust_list_head __user *head = curr->robust_list;
struct robust_list __user *entry, *next_entry, *pending;
@@ -857,8 +858,14 @@ static void exit_robust_list(struct task_struct *curr)
curr, pip, HANDLE_DEATH_PENDING);
}
}
+#else
+static void exit_robust_list64(struct task_struct *curr)
+{
+ pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT\n");
+ return;
+}
+#endif
-#ifdef CONFIG_COMPAT
static void __user *futex_uaddr(struct robust_list __user *entry,
compat_long_t futex_offset)
{
@@ -872,13 +879,13 @@ static void __user *futex_uaddr(struct robust_list __user *entry,
* Fetch a robust-list pointer. Bit 0 signals PI futexes:
*/
static inline int
-compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
- compat_uptr_t __user *head, unsigned int *pi)
+fetch_robust_entry32(u32 *uentry, struct robust_list __user **entry,
+ u32 __user *head, unsigned int *pi)
{
if (get_user(*uentry, head))
return -EFAULT;
- *entry = compat_ptr((*uentry) & ~1);
+ *entry = (void __user *)(unsigned long)((*uentry) & ~1);
*pi = (unsigned int)(*uentry) & 1;
return 0;
@@ -890,21 +897,21 @@ compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **ent
*
* We silently return on any sign of list-walking problem.
*/
-static void compat_exit_robust_list(struct task_struct *curr)
+static void exit_robust_list32(struct task_struct *curr)
{
- struct compat_robust_list_head __user *head = curr->compat_robust_list;
+ struct robust_list_head32 __user *head = curr->compat_robust_list;
struct robust_list __user *entry, *next_entry, *pending;
unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
unsigned int next_pi;
- compat_uptr_t uentry, next_uentry, upending;
- compat_long_t futex_offset;
+ u32 uentry, next_uentry, upending;
+ s32 futex_offset;
int rc;
/*
* Fetch the list head (which was registered earlier, via
* sys_set_robust_list()):
*/
- if (compat_fetch_robust_entry(&uentry, &entry, &head->list.next, &pi))
+ if (fetch_robust_entry32(&uentry, &entry, &head->list.next, &pi))
return;
/*
* Fetch the relative futex offset:
@@ -915,7 +922,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
* Fetch any possibly pending lock-add first, and handle it
* if it exists:
*/
- if (compat_fetch_robust_entry(&upending, &pending,
+ if (fetch_robust_entry32(&upending, &pending,
&head->list_op_pending, &pip))
return;
@@ -925,8 +932,8 @@ static void compat_exit_robust_list(struct task_struct *curr)
* Fetch the next entry in the list before calling
* handle_futex_death:
*/
- rc = compat_fetch_robust_entry(&next_uentry, &next_entry,
- (compat_uptr_t __user *)&entry->next, &next_pi);
+ rc = fetch_robust_entry32(&next_uentry, &next_entry,
+ (u32 __user *)&entry->next, &next_pi);
/*
* A pending lock might already be on the list, so
* dont process it twice:
@@ -957,7 +964,6 @@ static void compat_exit_robust_list(struct task_struct *curr)
handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
}
}
-#endif
#ifdef CONFIG_FUTEX_PI
@@ -1040,14 +1046,21 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
static void futex_cleanup(struct task_struct *tsk)
{
+#ifdef CONFIG_64BIT
if (unlikely(tsk->robust_list)) {
- exit_robust_list(tsk);
+ exit_robust_list64(tsk);
tsk->robust_list = NULL;
}
+#else
+ if (unlikely(tsk->robust_list)) {
+ exit_robust_list32(tsk);
+ tsk->robust_list = NULL;
+ }
+#endif
#ifdef CONFIG_COMPAT
if (unlikely(tsk->compat_robust_list)) {
- compat_exit_robust_list(tsk);
+ exit_robust_list32(tsk);
tsk->compat_robust_list = NULL;
}
#endif
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index 4b6da9116aa6..dba193dfd216 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -440,7 +440,7 @@ SYSCALL_DEFINE4(futex_requeue,
#ifdef CONFIG_COMPAT
COMPAT_SYSCALL_DEFINE2(set_robust_list,
- struct compat_robust_list_head __user *, head,
+ struct robust_list_head32 __user *, head,
compat_size_t, len)
{
if (unlikely(len != sizeof(*head)))
@@ -455,7 +455,7 @@ COMPAT_SYSCALL_DEFINE3(get_robust_list, int, pid,
compat_uptr_t __user *, head_ptr,
compat_size_t __user *, len_ptr)
{
- struct compat_robust_list_head __user *head;
+ struct robust_list_head32 __user *head;
unsigned long ret;
struct task_struct *p;
--
2.47.0
* [PATCH v2 2/3] futex: Create set_robust_list2
2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
@ 2024-11-01 16:21 ` André Almeida
2024-11-02 16:40 ` kernel test robot
2024-11-04 11:22 ` Peter Zijlstra
2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
3 siblings, 2 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
André Almeida
Create a new set_robust_list2() syscall. The current syscall can't be
expanded to cover the following use case, so a new one is needed. This
new syscall allows users to set multiple robust lists per process and to
have either 32bit or 64bit pointers in the list.
* Interface
This is the proposed interface:
long set_robust_list2(void *head, int index, unsigned int flags)
`head` is the head of the userspace struct robust_list_head, just as in
the old set_robust_list(). It needs to be a void pointer since it can
point to either a normal robust_list_head or a compat_robust_list_head.
`flags` can be used for defining the list type:
enum robust_list_type {
ROBUST_LIST_32BIT,
ROBUST_LIST_64BIT,
};
`index` is the index into the kernel's internal linked list of robust
lists (the naming starts to get confusing, I reckon). If `index == -1`,
the user wants to set a new robust list, and the kernel will append it
at the end of the internal list, assign a new index and return this
index to the user. If `index >= 0`, the user wants to re-set the `*head`
of an already existing list (similarly to what happens when you call
set_robust_list() twice with different `*head`).
If `index` is out of range, points to a non-existing robust list, or
the internal list is full, an error is returned.
Users cannot remove lists.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
include/linux/futex.h | 1 +
include/linux/sched.h | 1 +
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/futex.h | 24 +++++++++
init/init_task.c | 3 ++
kernel/futex/core.c | 85 ++++++++++++++++++++++++++++---
kernel/futex/futex.h | 3 ++
kernel/futex/syscalls.c | 36 +++++++++++++
8 files changed, 149 insertions(+), 9 deletions(-)
diff --git a/include/linux/futex.h b/include/linux/futex.h
index 8217b5ebdd9c..997fe0013bc0 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -76,6 +76,7 @@ static inline void futex_init_task(struct task_struct *tsk)
#ifdef CONFIG_COMPAT
tsk->compat_robust_list = NULL;
#endif
+ INIT_LIST_HEAD(&tsk->robust_list2);
INIT_LIST_HEAD(&tsk->pi_state_list);
tsk->pi_state_cache = NULL;
tsk->futex_state = FUTEX_STATE_OK;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8f20b703557d..4a2455f1b07c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1284,6 +1284,7 @@ struct task_struct {
#ifdef CONFIG_COMPAT
struct robust_list_head32 __user *compat_robust_list;
#endif
+ struct list_head robust_list2;
struct list_head pi_state_list;
struct futex_pi_state *pi_state_cache;
struct mutex futex_exit_mutex;
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 5bf6148cac2b..c1f5c9635c07 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -841,8 +841,11 @@ __SYSCALL(__NR_lsm_list_modules, sys_lsm_list_modules)
#define __NR_mseal 462
__SYSCALL(__NR_mseal, sys_mseal)
+#define __NR_set_robust_list2 463
+__SYSCALL(__NR_set_robust_list2, sys_set_robust_list2)
+
#undef __NR_syscalls
-#define __NR_syscalls 463
+#define __NR_syscalls 464
/*
* 32 bit systems traditionally used different
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index d2ee625ea189..13903a278b71 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -146,6 +146,30 @@ struct robust_list_head {
struct robust_list __user *list_op_pending;
};
+#define ROBUST_LISTS_PER_TASK 10
+
+enum robust_list2_type {
+ ROBUST_LIST_32BIT,
+ ROBUST_LIST_64BIT,
+};
+
+#define ROBUST_LIST_TYPE_MASK (ROBUST_LIST_32BIT | ROBUST_LIST_64BIT)
+
+/*
+ * This is an entry of a linked list of robust lists.
+ *
+ * @head: can point to a 64bit list or a 32bit list
+ * @list_type: determine the size of the futex pointers in the list
+ * @index: the index of this entry in the list
+ * @list: linked list element
+ */
+struct robust_list2_entry {
+ void __user *head;
+ enum robust_list2_type list_type;
+ unsigned int index;
+ struct list_head list;
+};
+
/*
* Are there any waiters for this robust futex:
*/
diff --git a/init/init_task.c b/init/init_task.c
index 136a8231355a..1b08e745c47d 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -219,6 +219,9 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
#ifdef CONFIG_SECCOMP_FILTER
.seccomp = { .filter_count = ATOMIC_INIT(0) },
#endif
+#ifdef CONFIG_FUTEX
+ .robust_list2 = LIST_HEAD_INIT(init_task.robust_list2),
+#endif
};
EXPORT_SYMBOL(init_task);
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index bcd0e2a7ba65..f74476d0bcc1 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -797,9 +797,9 @@ static inline int fetch_robust_entry(struct robust_list __user **entry,
*
* We silently return on any sign of list-walking problem.
*/
-static void exit_robust_list64(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr,
+ struct robust_list_head __user *head)
{
- struct robust_list_head __user *head = curr->robust_list;
struct robust_list __user *entry, *next_entry, *pending;
unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
unsigned int next_pi;
@@ -859,7 +859,8 @@ static void exit_robust_list64(struct task_struct *curr)
}
}
#else
-static void exit_robust_list64(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr,
+ struct robust_list_head __user *head)
{
pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT\n");
return;
@@ -897,9 +898,9 @@ fetch_robust_entry32(u32 *uentry, struct robust_list __user **entry,
*
* We silently return on any sign of list-walking problem.
*/
-static void exit_robust_list32(struct task_struct *curr)
+static void exit_robust_list32(struct task_struct *curr,
+ struct robust_list_head32 __user *head)
{
- struct robust_list_head32 __user *head = curr->compat_robust_list;
struct robust_list __user *entry, *next_entry, *pending;
unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
unsigned int next_pi;
@@ -965,6 +966,54 @@ static void exit_robust_list32(struct task_struct *curr)
}
}
+long do_set_robust_list2(struct robust_list_head __user *head,
+ int index, unsigned int type)
+{
+ struct list_head *list2 = &current->robust_list2;
+ struct robust_list2_entry *prev, *new = NULL;
+
+ if (index == -1) {
+ if (list_empty(list2)) {
+ index = 0;
+ } else {
+ prev = list_last_entry(list2, struct robust_list2_entry, list);
+ index = prev->index + 1;
+ }
+
+ if (index >= ROBUST_LISTS_PER_TASK)
+ return -EINVAL;
+
+ new = kmalloc(sizeof(struct robust_list2_entry), GFP_KERNEL);
+ if (!new)
+ return -ENOMEM;
+
+ list_add_tail(&new->list, list2);
+ new->index = index;
+
+ } else if (index >= 0) {
+ struct robust_list2_entry *curr;
+
+ if (list_empty(list2))
+ return -ENOENT;
+
+ list_for_each_entry(curr, list2, list) {
+ if (index == curr->index) {
+ new = curr;
+ break;
+ }
+ }
+
+ if (!new)
+ return -ENOENT;
+ }
+
+ BUG_ON(!new);
+ new->head = head;
+ new->list_type = type;
+
+ return index;
+}
+
#ifdef CONFIG_FUTEX_PI
/*
@@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
static void futex_cleanup(struct task_struct *tsk)
{
+ struct robust_list2_entry *curr, *n;
+ struct list_head *list2 = &tsk->robust_list2;
+
#ifdef CONFIG_64BIT
if (unlikely(tsk->robust_list)) {
- exit_robust_list64(tsk);
+ exit_robust_list64(tsk, tsk->robust_list);
tsk->robust_list = NULL;
}
#else
if (unlikely(tsk->robust_list)) {
- exit_robust_list32(tsk);
+ exit_robust_list32(tsk, (struct robust_list_head32 __user *)tsk->robust_list);
tsk->robust_list = NULL;
}
#endif
#ifdef CONFIG_COMPAT
if (unlikely(tsk->compat_robust_list)) {
- exit_robust_list32(tsk);
+ exit_robust_list32(tsk, tsk->compat_robust_list);
tsk->compat_robust_list = NULL;
}
#endif
+ /*
+ * Walk through the linked list, parsing robust lists and freeing the
+ * allocated lists
+ */
+ if (unlikely(!list_empty(list2))) {
+ list_for_each_entry_safe(curr, n, list2, list) {
+ if (curr->head != NULL) {
+ if (curr->list_type == ROBUST_LIST_64BIT)
+ exit_robust_list64(tsk, curr->head);
+ else if (curr->list_type == ROBUST_LIST_32BIT)
+ exit_robust_list32(tsk, curr->head);
+ curr->head = NULL;
+ }
+ list_del_init(&curr->list);
+ kfree(curr);
+ }
+ }
if (unlikely(!list_empty(&tsk->pi_state_list)))
exit_pi_state_list(tsk);
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index 8b195d06f4e8..7247d5c583d5 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -349,6 +349,9 @@ extern int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
extern int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
ktime_t *abs_time, u32 bitset);
+extern long do_set_robust_list2(struct robust_list_head __user *head,
+ int index, unsigned int type);
+
/**
* struct futex_vector - Auxiliary struct for futex_waitv()
* @w: Userspace provided data
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index dba193dfd216..ff61570bb9c8 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -39,6 +39,42 @@ SYSCALL_DEFINE2(set_robust_list, struct robust_list_head __user *, head,
return 0;
}
+#define ROBUST_LIST_FLAGS ROBUST_LIST_TYPE_MASK
+
+/*
+ * sys_set_robust_list2()
+ *
+ * When index == -1, create a new list for user. When index >= 0, try to find
+ * the corresponding list and re-set the head there.
+ *
+ * Return values:
+ * >= 0: success, index of the robust list
+ * -EINVAL: invalid flags, invalid index, or too many allocated lists
+ * -ENOENT: requested index nowhere to be found
+ * -ENOMEM: error allocating new list
+ */
+SYSCALL_DEFINE3(set_robust_list2, struct robust_list_head __user *, head,
+ int, index, unsigned int, flags)
+{
+ unsigned int type;
+
+ type = flags & ROBUST_LIST_TYPE_MASK;
+
+ if (index < -1 || index >= ROBUST_LISTS_PER_TASK)
+ return -EINVAL;
+
+ if ((flags & ~ROBUST_LIST_FLAGS) != 0)
+ return -EINVAL;
+
+#ifndef CONFIG_64BIT
+ if (type == ROBUST_LIST_64BIT)
+ return -EINVAL;
+#endif
+
+ return do_set_robust_list2(head, index, type);
+}
+
/**
* sys_get_robust_list() - Get the robust-futex list head of a task
* @pid: pid of the process [zero for current task]
--
2.47.0
* [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
@ 2024-11-01 16:21 ` André Almeida
2024-11-02 5:13 ` kernel test robot
2024-11-02 6:05 ` kernel test robot
2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
3 siblings, 2 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
André Almeida
Wire up the new set_robust_list2 syscall in all available architectures.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
kernel/sys_ni.c | 1 +
scripts/syscall.tbl | 1 +
17 files changed, 17 insertions(+)
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 74720667fe09..e7f9d9befdd5 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -502,3 +502,4 @@
570 common lsm_set_self_attr sys_lsm_set_self_attr
571 common lsm_list_modules sys_lsm_list_modules
572 common mseal sys_mseal
+573 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 23c98203c40f..31070d427ea2 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -477,3 +477,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 22a3cbd4c602..9c5d1fa7ca54 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -462,3 +462,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 2b81a6bd78b2..c03933182b4d 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -468,3 +468,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 953f5b7dc723..8f12bcd55d26 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -401,3 +401,4 @@
460 n32 lsm_set_self_attr sys_lsm_set_self_attr
461 n32 lsm_list_modules sys_lsm_list_modules
462 n32 mseal sys_mseal
+463 n32 set_robust_list2 sys_set_robust_list2
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 1464c6be6eb3..5e500a32c980 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -377,3 +377,4 @@
460 n64 lsm_set_self_attr sys_lsm_set_self_attr
461 n64 lsm_list_modules sys_lsm_list_modules
462 n64 mseal sys_mseal
+463 n64 set_robust_list2 sys_set_robust_list2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 2439a2491cff..ea5be2805b3f 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -450,3 +450,4 @@
460 o32 lsm_set_self_attr sys_lsm_set_self_attr
461 o32 lsm_list_modules sys_lsm_list_modules
462 o32 mseal sys_mseal
+463 o32 set_robust_list2 sys_set_robust_list2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 66dc406b12e4..49adcdb392a2 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -461,3 +461,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index ebae8415dfbb..eff6641f35e6 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -553,3 +553,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 01071182763e..a2366aa6791e 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -465,3 +465,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2 sys_set_robust_list2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index c55fd7696d40..e6d7e565b942 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -466,3 +466,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index cfdfb3707c16..176022f9a236 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -508,3 +508,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 534c74b14fab..8607563a5510 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -468,3 +468,4 @@
460 i386 lsm_set_self_attr sys_lsm_set_self_attr
461 i386 lsm_list_modules sys_lsm_list_modules
462 i386 mseal sys_mseal
+463 i386 set_robust_list2 sys_set_robust_list2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 7093ee21c0d1..fbc0cef1a97c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -386,6 +386,7 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
#
# Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 67083fc1b2f5..9081b3bf8272 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -433,3 +433,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index c00a86931f8c..71fbac6176c8 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -195,6 +195,7 @@ COND_SYSCALL(move_pages);
COND_SYSCALL(set_mempolicy_home_node);
COND_SYSCALL(cachestat);
COND_SYSCALL(mseal);
+COND_SYSCALL(set_robust_list2);
COND_SYSCALL(perf_event_open);
COND_SYSCALL(accept4);
diff --git a/scripts/syscall.tbl b/scripts/syscall.tbl
index 845e24eb372e..e174f6e2d521 100644
--- a/scripts/syscall.tbl
+++ b/scripts/syscall.tbl
@@ -403,3 +403,4 @@
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
+463 common set_robust_list2 sys_set_robust_list2
--
2.47.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
@ 2024-11-02 5:13 ` kernel test robot
2024-11-02 6:05 ` kernel test robot
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 5:13 UTC (permalink / raw)
To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor, André Almeida
Hi André,
kernel test robot noticed the following build errors:
[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/sched/core linus/master v6.12-rc5 next-20241101]
[cannot apply to tip/x86/asm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base: tip/locking/core
patch link: https://lore.kernel.org/r/20241101162147.284993-4-andrealmeid%40igalia.com
patch subject: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20241102/202411021208.9UQz3ahR-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021208.9UQz3ahR-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021208.9UQz3ahR-lkp@intel.com/
All errors (new ones prefixed by >>):
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:455:1: note: in expansion of macro '__SYSCALL'
455 | __SYSCALL(454, sys_futex_wake)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
456 | __SYSCALL(455, sys_futex_wait)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[455]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
456 | __SYSCALL(455, sys_futex_wait)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
457 | __SYSCALL(456, sys_futex_requeue)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[456]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
457 | __SYSCALL(456, sys_futex_requeue)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
458 | __SYSCALL(457, sys_statmount)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[457]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
458 | __SYSCALL(457, sys_statmount)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
459 | __SYSCALL(458, sys_listmount)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[458]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
459 | __SYSCALL(458, sys_listmount)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
460 | __SYSCALL(459, sys_lsm_get_self_attr)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[459]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
460 | __SYSCALL(459, sys_lsm_get_self_attr)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
461 | __SYSCALL(460, sys_lsm_set_self_attr)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[460]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
461 | __SYSCALL(460, sys_lsm_set_self_attr)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
462 | __SYSCALL(461, sys_lsm_list_modules)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[461]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
462 | __SYSCALL(461, sys_lsm_list_modules)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
463 | __SYSCALL(462, sys_mseal)
| ^~~~~~~~~
arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[462]')
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^
./arch/arc/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
463 | __SYSCALL(462, sys_mseal)
| ^~~~~~~~~
>> ./arch/arc/include/generated/asm/syscall_table_32.h:464:16: error: 'sys_set_robust_list2' undeclared here (not in a function); did you mean 'sys_set_robust_list'?
464 | __SYSCALL(463, sys_set_robust_list2)
| ^~~~~~~~~~~~~~~~~~~~
arch/arc/kernel/sys.c:13:37: note: in definition of macro '__SYSCALL'
13 | #define __SYSCALL(nr, call) [nr] = (call),
| ^~~~
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
@ 2024-11-02 5:44 ` kernel test robot
2024-11-02 14:57 ` kernel test robot
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 5:44 UTC (permalink / raw)
To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor, André Almeida
Hi André,
kernel test robot noticed the following build warnings:
[auto build test WARNING on tip/locking/core]
[also build test WARNING on tip/sched/core linus/master tip/x86/asm v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base: tip/locking/core
patch link: https://lore.kernel.org/r/20241101162147.284993-2-andrealmeid%40igalia.com
patch subject: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
config: alpha-allnoconfig (https://download.01.org/0day-ci/archive/20241102/202411021349.tqg42lGq-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 13.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021349.tqg42lGq-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021349.tqg42lGq-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from include/net/compat.h:8,
from include/net/scm.h:13,
from include/linux/netlink.h:9,
from include/uapi/linux/neighbour.h:6,
from include/linux/netdevice.h:44,
from include/net/sock.h:46,
from include/linux/tcp.h:19,
from include/linux/ipv6.h:102,
from include/net/ipv6.h:12,
from include/linux/sunrpc/clnt.h:29,
from include/linux/nfs_fs.h:32,
from init/do_mounts.c:23:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
| ^~~~~~~~~~~~~~~~~~
--
In file included from include/net/compat.h:8,
from include/net/scm.h:13,
from include/linux/netlink.h:9,
from include/uapi/linux/neighbour.h:6,
from include/linux/netdevice.h:44,
from include/net/sock.h:46,
from include/linux/tcp.h:19,
from include/linux/ipv6.h:102,
from include/net/addrconf.h:65,
from lib/vsprintf.c:41:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
| ^~~~~~~~~~~~~~~~~~
lib/vsprintf.c: In function 'va_format':
lib/vsprintf.c:1683:9: warning: function 'va_format' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
1683 | buf += vsnprintf(buf, end > buf ? end - buf : 0, va_fmt->fmt, va);
| ^~~
--
In file included from kernel/time/hrtimer.c:44:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
| ^~~~~~~~~~~~~~~~~~
kernel/time/hrtimer.c:121:35: warning: initialized field overwritten [-Woverride-init]
121 | [CLOCK_REALTIME] = HRTIMER_BASE_REALTIME,
| ^~~~~~~~~~~~~~~~~~~~~
kernel/time/hrtimer.c:121:35: note: (near initialization for 'hrtimer_clock_to_base_table[0]')
kernel/time/hrtimer.c:122:35: warning: initialized field overwritten [-Woverride-init]
122 | [CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC,
| ^~~~~~~~~~~~~~~~~~~~~~
kernel/time/hrtimer.c:122:35: note: (near initialization for 'hrtimer_clock_to_base_table[1]')
kernel/time/hrtimer.c:123:35: warning: initialized field overwritten [-Woverride-init]
123 | [CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME,
| ^~~~~~~~~~~~~~~~~~~~~
kernel/time/hrtimer.c:123:35: note: (near initialization for 'hrtimer_clock_to_base_table[7]')
kernel/time/hrtimer.c:124:35: warning: initialized field overwritten [-Woverride-init]
124 | [CLOCK_TAI] = HRTIMER_BASE_TAI,
| ^~~~~~~~~~~~~~~~
kernel/time/hrtimer.c:124:35: note: (near initialization for 'hrtimer_clock_to_base_table[11]')
--
In file included from kernel/futex/core.c:34:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
| ^~~~~~~~~~~~~~~~~~
kernel/futex/core.c: In function 'exit_robust_list32':
kernel/futex/core.c:902:54: error: 'struct task_struct' has no member named 'compat_robust_list'
902 | struct robust_list_head32 __user *head = curr->compat_robust_list;
| ^~
kernel/futex/core.c: At top level:
kernel/futex/core.c:900:13: warning: 'exit_robust_list32' defined but not used [-Wunused-function]
900 | static void exit_robust_list32(struct task_struct *curr)
| ^~~~~~~~~~~~~~~~~~
vim +665 include/linux/compat.h
621
622 #ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64
623 asmlinkage long compat_sys_pwritev64(unsigned long fd,
624 const struct iovec __user *vec,
625 unsigned long vlen, loff_t pos);
626 #endif
627 asmlinkage long compat_sys_sendfile(int out_fd, int in_fd,
628 compat_off_t __user *offset, compat_size_t count);
629 asmlinkage long compat_sys_sendfile64(int out_fd, int in_fd,
630 compat_loff_t __user *offset, compat_size_t count);
631 asmlinkage long compat_sys_pselect6_time32(int n, compat_ulong_t __user *inp,
632 compat_ulong_t __user *outp,
633 compat_ulong_t __user *exp,
634 struct old_timespec32 __user *tsp,
635 void __user *sig);
636 asmlinkage long compat_sys_pselect6_time64(int n, compat_ulong_t __user *inp,
637 compat_ulong_t __user *outp,
638 compat_ulong_t __user *exp,
639 struct __kernel_timespec __user *tsp,
640 void __user *sig);
641 asmlinkage long compat_sys_ppoll_time32(struct pollfd __user *ufds,
642 unsigned int nfds,
643 struct old_timespec32 __user *tsp,
644 const compat_sigset_t __user *sigmask,
645 compat_size_t sigsetsize);
646 asmlinkage long compat_sys_ppoll_time64(struct pollfd __user *ufds,
647 unsigned int nfds,
648 struct __kernel_timespec __user *tsp,
649 const compat_sigset_t __user *sigmask,
650 compat_size_t sigsetsize);
651 asmlinkage long compat_sys_signalfd4(int ufd,
652 const compat_sigset_t __user *sigmask,
653 compat_size_t sigsetsize, int flags);
654 asmlinkage long compat_sys_newfstatat(unsigned int dfd,
655 const char __user *filename,
656 struct compat_stat __user *statbuf,
657 int flag);
658 asmlinkage long compat_sys_newfstat(unsigned int fd,
659 struct compat_stat __user *statbuf);
660 /* No generic prototype for sync_file_range and sync_file_range2 */
661 asmlinkage long compat_sys_waitid(int, compat_pid_t,
662 struct compat_siginfo __user *, int,
663 struct compat_rusage __user *);
664 asmlinkage long
> 665 compat_sys_set_robust_list(struct robust_list_head32 __user *head,
666 compat_size_t len);
667 asmlinkage long
668 compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
669 compat_size_t __user *len_ptr);
670 asmlinkage long compat_sys_getitimer(int which,
671 struct old_itimerval32 __user *it);
672 asmlinkage long compat_sys_setitimer(int which,
673 struct old_itimerval32 __user *in,
674 struct old_itimerval32 __user *out);
675 asmlinkage long compat_sys_kexec_load(compat_ulong_t entry,
676 compat_ulong_t nr_segments,
677 struct compat_kexec_segment __user *,
678 compat_ulong_t flags);
679 asmlinkage long compat_sys_timer_create(clockid_t which_clock,
680 struct compat_sigevent __user *timer_event_spec,
681 timer_t __user *created_timer_id);
682 asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
683 compat_long_t addr, compat_long_t data);
684 asmlinkage long compat_sys_sched_setaffinity(compat_pid_t pid,
685 unsigned int len,
686 compat_ulong_t __user *user_mask_ptr);
687 asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid,
688 unsigned int len,
689 compat_ulong_t __user *user_mask_ptr);
690 asmlinkage long compat_sys_sigaltstack(const compat_stack_t __user *uss_ptr,
691 compat_stack_t __user *uoss_ptr);
692 asmlinkage long compat_sys_rt_sigsuspend(compat_sigset_t __user *unewset,
693 compat_size_t sigsetsize);
694 #ifndef CONFIG_ODD_RT_SIGACTION
695 asmlinkage long compat_sys_rt_sigaction(int,
696 const struct compat_sigaction __user *,
697 struct compat_sigaction __user *,
698 compat_size_t);
699 #endif
700 asmlinkage long compat_sys_rt_sigprocmask(int how, compat_sigset_t __user *set,
701 compat_sigset_t __user *oset,
702 compat_size_t sigsetsize);
703 asmlinkage long compat_sys_rt_sigpending(compat_sigset_t __user *uset,
704 compat_size_t sigsetsize);
705 asmlinkage long compat_sys_rt_sigtimedwait_time32(compat_sigset_t __user *uthese,
706 struct compat_siginfo __user *uinfo,
707 struct old_timespec32 __user *uts, compat_size_t sigsetsize);
708 asmlinkage long compat_sys_rt_sigtimedwait_time64(compat_sigset_t __user *uthese,
709 struct compat_siginfo __user *uinfo,
710 struct __kernel_timespec __user *uts, compat_size_t sigsetsize);
711 asmlinkage long compat_sys_rt_sigqueueinfo(compat_pid_t pid, int sig,
712 struct compat_siginfo __user *uinfo);
713 /* No generic prototype for rt_sigreturn */
714 asmlinkage long compat_sys_times(struct compat_tms __user *tbuf);
715 asmlinkage long compat_sys_getrlimit(unsigned int resource,
716 struct compat_rlimit __user *rlim);
717 asmlinkage long compat_sys_setrlimit(unsigned int resource,
718 struct compat_rlimit __user *rlim);
719 asmlinkage long compat_sys_getrusage(int who, struct compat_rusage __user *ru);
720 asmlinkage long compat_sys_gettimeofday(struct old_timeval32 __user *tv,
721 struct timezone __user *tz);
722 asmlinkage long compat_sys_settimeofday(struct old_timeval32 __user *tv,
723 struct timezone __user *tz);
724 asmlinkage long compat_sys_sysinfo(struct compat_sysinfo __user *info);
725 asmlinkage long compat_sys_mq_open(const char __user *u_name,
726 int oflag, compat_mode_t mode,
727 struct compat_mq_attr __user *u_attr);
728 asmlinkage long compat_sys_mq_notify(mqd_t mqdes,
729 const struct compat_sigevent __user *u_notification);
730 asmlinkage long compat_sys_mq_getsetattr(mqd_t mqdes,
731 const struct compat_mq_attr __user *u_mqstat,
732 struct compat_mq_attr __user *u_omqstat);
733 asmlinkage long compat_sys_msgctl(int first, int second, void __user *uptr);
734 asmlinkage long compat_sys_msgrcv(int msqid, compat_uptr_t msgp,
735 compat_ssize_t msgsz, compat_long_t msgtyp, int msgflg);
736 asmlinkage long compat_sys_msgsnd(int msqid, compat_uptr_t msgp,
737 compat_ssize_t msgsz, int msgflg);
738 asmlinkage long compat_sys_semctl(int semid, int semnum, int cmd, int arg);
739 asmlinkage long compat_sys_shmctl(int first, int second, void __user *uptr);
740 asmlinkage long compat_sys_shmat(int shmid, compat_uptr_t shmaddr, int shmflg);
741 asmlinkage long compat_sys_recvfrom(int fd, void __user *buf, compat_size_t len,
742 unsigned flags, struct sockaddr __user *addr,
743 int __user *addrlen);
744 asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg,
745 unsigned flags);
746 asmlinkage long compat_sys_recvmsg(int fd, struct compat_msghdr __user *msg,
747 unsigned int flags);
748 /* No generic prototype for readahead */
749 asmlinkage long compat_sys_keyctl(u32 option,
750 u32 arg2, u32 arg3, u32 arg4, u32 arg5);
751 asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr_t __user *argv,
752 const compat_uptr_t __user *envp);
753 /* No generic prototype for fadvise64_64 */
754 /* CONFIG_MMU only */
755 asmlinkage long compat_sys_rt_tgsigqueueinfo(compat_pid_t tgid,
756 compat_pid_t pid, int sig,
757 struct compat_siginfo __user *uinfo);
758 asmlinkage long compat_sys_recvmmsg_time64(int fd, struct compat_mmsghdr __user *mmsg,
759 unsigned vlen, unsigned int flags,
760 struct __kernel_timespec __user *timeout);
761 asmlinkage long compat_sys_recvmmsg_time32(int fd, struct compat_mmsghdr __user *mmsg,
762 unsigned vlen, unsigned int flags,
763 struct old_timespec32 __user *timeout);
764 asmlinkage long compat_sys_wait4(compat_pid_t pid,
765 compat_uint_t __user *stat_addr, int options,
766 struct compat_rusage __user *ru);
767 asmlinkage long compat_sys_fanotify_mark(int, unsigned int, __u32, __u32,
768 int, const char __user *);
769 asmlinkage long compat_sys_open_by_handle_at(int mountdirfd,
770 struct file_handle __user *handle,
771 int flags);
772 asmlinkage long compat_sys_sendmmsg(int fd, struct compat_mmsghdr __user *mmsg,
773 unsigned vlen, unsigned int flags);
774 asmlinkage long compat_sys_execveat(int dfd, const char __user *filename,
775 const compat_uptr_t __user *argv,
776 const compat_uptr_t __user *envp, int flags);
777 asmlinkage ssize_t compat_sys_preadv2(compat_ulong_t fd,
778 const struct iovec __user *vec,
779 compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
780 asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
781 const struct iovec __user *vec,
782 compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
783 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
784 asmlinkage long compat_sys_preadv64v2(unsigned long fd,
785 const struct iovec __user *vec,
786 unsigned long vlen, loff_t pos, rwf_t flags);
787 #endif
788
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
2024-11-02 5:13 ` kernel test robot
@ 2024-11-02 6:05 ` kernel test robot
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 6:05 UTC (permalink / raw)
To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor, André Almeida
Hi André,
kernel test robot noticed the following build errors:
[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/sched/core linus/master v6.12-rc5 next-20241101]
[cannot apply to tip/x86/asm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base: tip/locking/core
patch link: https://lore.kernel.org/r/20241101162147.284993-4-andrealmeid%40igalia.com
patch subject: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
config: csky-allnoconfig (https://download.01.org/0day-ci/archive/20241102/202411021323.fazJ8GOs-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021323.fazJ8GOs-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021323.fazJ8GOs-lkp@intel.com/
All errors (new ones prefixed by >>):
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:455:1: note: in expansion of macro '__SYSCALL'
455 | __SYSCALL(454, sys_futex_wake)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
456 | __SYSCALL(455, sys_futex_wait)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[455]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
456 | __SYSCALL(455, sys_futex_wait)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
457 | __SYSCALL(456, sys_futex_requeue)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[456]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
457 | __SYSCALL(456, sys_futex_requeue)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
458 | __SYSCALL(457, sys_statmount)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[457]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
458 | __SYSCALL(457, sys_statmount)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
459 | __SYSCALL(458, sys_listmount)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[458]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
459 | __SYSCALL(458, sys_listmount)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
460 | __SYSCALL(459, sys_lsm_get_self_attr)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[459]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
460 | __SYSCALL(459, sys_lsm_get_self_attr)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
461 | __SYSCALL(460, sys_lsm_set_self_attr)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[460]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
461 | __SYSCALL(460, sys_lsm_set_self_attr)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
462 | __SYSCALL(461, sys_lsm_list_modules)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[461]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
462 | __SYSCALL(461, sys_lsm_list_modules)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
463 | __SYSCALL(462, sys_mseal)
| ^~~~~~~~~
arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[462]')
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^
./arch/csky/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
463 | __SYSCALL(462, sys_mseal)
| ^~~~~~~~~
>> ./arch/csky/include/generated/asm/syscall_table_32.h:464:16: error: 'sys_set_robust_list2' undeclared here (not in a function); did you mean 'sys_set_robust_list'?
464 | __SYSCALL(463, sys_set_robust_list2)
| ^~~~~~~~~~~~~~~~~~~~
arch/csky/kernel/syscall_table.c:8:36: note: in definition of macro '__SYSCALL'
8 | #define __SYSCALL(nr, call)[nr] = (call),
| ^~~~
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
2024-11-02 5:44 ` kernel test robot
@ 2024-11-02 14:57 ` kernel test robot
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 14:57 UTC (permalink / raw)
To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor, André Almeida
Hi André,
kernel test robot noticed the following build warnings:
[auto build test WARNING on tip/locking/core]
[also build test WARNING on tip/sched/core linus/master tip/x86/asm v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base: tip/locking/core
patch link: https://lore.kernel.org/r/20241101162147.284993-2-andrealmeid%40igalia.com
patch subject: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
config: x86_64-randconfig-123-20241102 (https://download.01.org/0day-ci/archive/20241102/202411022242.XCJECOCz-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411022242.XCJECOCz-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411022242.XCJECOCz-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
>> kernel/futex/core.c:914:59: sparse: sparse: cast removes address space '__user' of expression
>> kernel/futex/core.c:914:59: sparse: sparse: incorrect type in argument 3 (different address spaces) @@ expected unsigned int [noderef] [usertype] __user *head @@ got unsigned int [usertype] * @@
kernel/futex/core.c:914:59: sparse: expected unsigned int [noderef] [usertype] __user *head
kernel/futex/core.c:914:59: sparse: got unsigned int [usertype] *
vim +/__user +914 kernel/futex/core.c
893
894 /*
895 * Walk curr->robust_list (very carefully, it's a userspace list!)
896 * and mark any locks found there dead, and notify any waiters.
897 *
898 * We silently return on any sign of list-walking problem.
899 */
900 static void exit_robust_list32(struct task_struct *curr)
901 {
902 struct robust_list_head32 __user *head = curr->compat_robust_list;
903 struct robust_list __user *entry, *next_entry, *pending;
904 unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
905 unsigned int next_pi;
906 u32 uentry, next_uentry, upending;
907 s32 futex_offset;
908 int rc;
909
910 /*
911 * Fetch the list head (which was registered earlier, via
912 * sys_set_robust_list()):
913 */
> 914 if (fetch_robust_entry32((u32 *)&uentry, &entry, (u32 *)&head->list.next, &pi))
915 return;
916 /*
917 * Fetch the relative futex offset:
918 */
919 if (get_user(futex_offset, &head->futex_offset))
920 return;
921 /*
922 * Fetch any possibly pending lock-add first, and handle it
923 * if it exists:
924 */
925 if (fetch_robust_entry32(&upending, &pending,
926 &head->list_op_pending, &pip))
927 return;
928
929 next_entry = NULL; /* avoid warning with gcc */
930 while (entry != (struct robust_list __user *) &head->list) {
931 /*
932 * Fetch the next entry in the list before calling
933 * handle_futex_death:
934 */
935 rc = fetch_robust_entry32(&next_uentry, &next_entry,
936 (u32 __user *)&entry->next, &next_pi);
937 /*
938 * A pending lock might already be on the list, so
939 * dont process it twice:
940 */
941 if (entry != pending) {
942 void __user *uaddr = futex_uaddr(entry, futex_offset);
943
944 if (handle_futex_death(uaddr, curr, pi,
945 HANDLE_DEATH_LIST))
946 return;
947 }
948 if (rc)
949 return;
950 uentry = next_uentry;
951 entry = next_entry;
952 pi = next_pi;
953 /*
954 * Avoid excessively long or circular lists:
955 */
956 if (!--limit)
957 break;
958
959 cond_resched();
960 }
961 if (pending) {
962 void __user *uaddr = futex_uaddr(pending, futex_offset);
963
964 handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
965 }
966 }
967
* Re: [PATCH v2 2/3] futex: Create set_robust_list2
2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
@ 2024-11-02 16:40 ` kernel test robot
2024-11-04 11:22 ` Peter Zijlstra
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 16:40 UTC (permalink / raw)
To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor, André Almeida
Hi André,
kernel test robot noticed the following build warnings:
[auto build test WARNING on tip/locking/core]
[also build test WARNING on tip/sched/core linus/master v6.12-rc5 next-20241101]
[cannot apply to tip/x86/asm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base: tip/locking/core
patch link: https://lore.kernel.org/r/20241101162147.284993-3-andrealmeid%40igalia.com
patch subject: [PATCH v2 2/3] futex: Create set_robust_list2
config: i386-randconfig-062-20241102 (https://download.01.org/0day-ci/archive/20241103/202411030038.W5DgvCYP-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241103/202411030038.W5DgvCYP-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411030038.W5DgvCYP-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
kernel/futex/core.c:915:59: sparse: sparse: cast removes address space '__user' of expression
kernel/futex/core.c:915:59: sparse: sparse: incorrect type in argument 3 (different address spaces) @@ expected unsigned int [noderef] [usertype] __user *head @@ got unsigned int [usertype] * @@
kernel/futex/core.c:915:59: sparse: expected unsigned int [noderef] [usertype] __user *head
kernel/futex/core.c:915:59: sparse: got unsigned int [usertype] *
kernel/futex/core.c:1108:42: sparse: sparse: cast removes address space '__user' of expression
>> kernel/futex/core.c:1108:42: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected struct robust_list_head32 [noderef] __user *head @@ got struct robust_list_head32 * @@
kernel/futex/core.c:1108:42: sparse: expected struct robust_list_head32 [noderef] __user *head
kernel/futex/core.c:1108:42: sparse: got struct robust_list_head32 *
kernel/futex/core.c: note: in included file (through include/linux/smp.h, include/linux/alloc_tag.h, include/linux/percpu.h, ...):
include/linux/list.h:83:21: sparse: sparse: self-comparison always evaluates to true
vim +1108 kernel/futex/core.c
1095
1096 static void futex_cleanup(struct task_struct *tsk)
1097 {
1098 struct robust_list2_entry *curr, *n;
1099 struct list_head *list2 = &tsk->robust_list2;
1100
1101 #ifdef CONFIG_64BIT
1102 if (unlikely(tsk->robust_list)) {
1103 exit_robust_list64(tsk, tsk->robust_list);
1104 tsk->robust_list = NULL;
1105 }
1106 #else
1107 if (unlikely(tsk->robust_list)) {
> 1108 exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
1109 tsk->robust_list = NULL;
1110 }
1111 #endif
1112
* Re: [PATCH v2 0/3] futex: Create set_robust_list2
2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
` (2 preceding siblings ...)
2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
@ 2024-11-02 21:58 ` Florian Weimer
2024-11-04 11:32 ` Peter Zijlstra
2024-11-04 21:49 ` André Almeida
3 siblings, 2 replies; 18+ messages in thread
From: Florian Weimer @ 2024-11-02 21:58 UTC (permalink / raw)
To: André Almeida
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
kernel-dev, linux-api, Nathan Chancellor
* André Almeida:
> 1) x86 apps can have 32bit pointers robust lists. For a x86-64 kernel
> this is not a problem, because of the compat entry point. But there's
> no such compat entry point for AArch64, so the kernel would do the
> pointer arithmetic wrongly. Is also unviable to userspace to keep
> track every addition/removal to the robust list and keep a 64bit
> version of it somewhere else to feed the kernel. Thus, the new
> interface has an option of telling the kernel if the list is filled
> with 32bit or 64bit pointers.
The size is typically different for 32-bit and 64-bit mode (12 vs 24
bytes). Why isn't this enough to disambiguate?
> 2) Apps can set just one robust list (in theory, x86-64 can set two if
> they also use the compat entry point). That means that when a x86 app
> asks FEX-Emu to call set_robust_list(), FEX have two options: to
> overwrite their own robust list pointer and make the app robust, or
> to ignore the app robust list and keep the emulator robust. The new
> interface allows for multiple robust lists per application, solving
> this.
Can't you avoid mixing emulated and general userspace code on the same
thread? On emulator threads, you have full control over the TCB.
QEMU hints towards further problems (in linux-user/syscall.c):
case TARGET_NR_set_robust_list:
case TARGET_NR_get_robust_list:
/* The ABI for supporting robust futexes has userspace pass
* the kernel a pointer to a linked list which is updated by
* userspace after the syscall; the list is walked by the kernel
* when the thread exits. Since the linked list in QEMU guest
* memory isn't a valid linked list for the host and we have
* no way to reliably intercept the thread-death event, we can't
* support these. Silently return ENOSYS so that guest userspace
* falls back to a non-robust futex implementation (which should
* be OK except in the corner case of the guest crashing while
* holding a mutex that is shared with another process via
* shared memory).
*/
return -TARGET_ENOSYS;
The glibc implementation is not really prepared for this
(__ASSUME_SET_ROBUST_LIST is defined for most architectures). But a
couple of years ago, we had a bunch of kernels that regressed robust
list support on POWER, and I think we found out only when we tested an
unrelated glibc update and saw unexpected glibc test suite failures …
Thanks,
Florian
* Re: [PATCH v2 2/3] futex: Create set_robust_list2
2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
2024-11-02 16:40 ` kernel test robot
@ 2024-11-04 11:22 ` Peter Zijlstra
2024-11-04 21:55 ` André Almeida
1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-04 11:22 UTC (permalink / raw)
To: André Almeida
Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso,
Arnd Bergmann, sonicadvance1, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor
On Fri, Nov 01, 2024 at 01:21:46PM -0300, André Almeida wrote:
> @@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
>
> static void futex_cleanup(struct task_struct *tsk)
> {
> + struct robust_list2_entry *curr, *n;
> + struct list_head *list2 = &tsk->robust_list2;
> +
> #ifdef CONFIG_64BIT
> if (unlikely(tsk->robust_list)) {
> - exit_robust_list64(tsk);
> + exit_robust_list64(tsk, tsk->robust_list);
> tsk->robust_list = NULL;
> }
> #else
> if (unlikely(tsk->robust_list)) {
> - exit_robust_list32(tsk);
> + exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
> tsk->robust_list = NULL;
> }
> #endif
>
> #ifdef CONFIG_COMPAT
> if (unlikely(tsk->compat_robust_list)) {
> - exit_robust_list32(tsk);
> + exit_robust_list32(tsk, tsk->compat_robust_list);
> tsk->compat_robust_list = NULL;
> }
> #endif
> + /*
> + * Walk through the linked list, parsing robust lists and freeing the
> + * allocated lists
> + */
> + if (unlikely(!list_empty(list2))) {
> + list_for_each_entry_safe(curr, n, list2, list) {
> + if (curr->head != NULL) {
> + if (curr->list_type == ROBUST_LIST_64BIT)
> + exit_robust_list64(tsk, curr->head);
> + else if (curr->list_type == ROBUST_LIST_32BIT)
> + exit_robust_list32(tsk, curr->head);
> + curr->head = NULL;
> + }
> + list_del_init(&curr->list);
> + kfree(curr);
> + }
> + }
>
> if (unlikely(!list_empty(&tsk->pi_state_list)))
> exit_pi_state_list(tsk);
I'm still digesting this, but the above seems particularly silly.
Should not the legacy lists also be on the list of lists? I mean, it
makes no sense to have two completely separate means of tracking lists.
* Re: [PATCH v2 0/3] futex: Create set_robust_list2
2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
@ 2024-11-04 11:32 ` Peter Zijlstra
2024-11-04 11:56 ` Peter Zijlstra
2024-11-04 12:36 ` Florian Weimer
2024-11-04 21:49 ` André Almeida
1 sibling, 2 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-04 11:32 UTC (permalink / raw)
To: Florian Weimer
Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
kernel-dev, linux-api, Nathan Chancellor
On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
> QEMU hints towards further problems (in linux-user/syscall.c):
>
> case TARGET_NR_set_robust_list:
> case TARGET_NR_get_robust_list:
> /* The ABI for supporting robust futexes has userspace pass
> * the kernel a pointer to a linked list which is updated by
> * userspace after the syscall; the list is walked by the kernel
> * when the thread exits. Since the linked list in QEMU guest
> * memory isn't a valid linked list for the host and we have
> * no way to reliably intercept the thread-death event, we can't
> * support these. Silently return ENOSYS so that guest userspace
> * falls back to a non-robust futex implementation (which should
> * be OK except in the corner case of the guest crashing while
> * holding a mutex that is shared with another process via
> * shared memory).
> */
> return -TARGET_ENOSYS;
I don't think we can sanely fix that. Can't QEMU track the robust thing
itself and use waitpid() to discover the thread is gone and fudge things
from there?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 0/3] futex: Create set_robust_list2
2024-11-04 11:32 ` Peter Zijlstra
@ 2024-11-04 11:56 ` Peter Zijlstra
2024-11-04 12:36 ` Florian Weimer
1 sibling, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-04 11:56 UTC (permalink / raw)
To: Florian Weimer
Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
kernel-dev, linux-api, Nathan Chancellor
On Mon, Nov 04, 2024 at 12:32:40PM +0100, Peter Zijlstra wrote:
> On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
>
> > QEMU hints towards further problems (in linux-user/syscall.c):
> >
> > case TARGET_NR_set_robust_list:
> > case TARGET_NR_get_robust_list:
> > /* The ABI for supporting robust futexes has userspace pass
> > * the kernel a pointer to a linked list which is updated by
> > * userspace after the syscall; the list is walked by the kernel
> > * when the thread exits. Since the linked list in QEMU guest
> > * memory isn't a valid linked list for the host and we have
> > * no way to reliably intercept the thread-death event, we can't
> > * support these. Silently return ENOSYS so that guest userspace
> > * falls back to a non-robust futex implementation (which should
> > * be OK except in the corner case of the guest crashing while
> > * holding a mutex that is shared with another process via
> > * shared memory).
> > */
> > return -TARGET_ENOSYS;
>
> I don't think we can sanely fix that. Can't QEMU track the robust thing
> itself and use waitpid() to discover the thread is gone and fudge things
> from there?
Hmm, what about we mandate 'natural' alignement of the structure such
that it is always inside a single page, then QEMU can do the translation
here and hand the kernel the 'real' address.
* Re: [PATCH v2 0/3] futex: Create set_robust_list2
2024-11-04 11:32 ` Peter Zijlstra
2024-11-04 11:56 ` Peter Zijlstra
@ 2024-11-04 12:36 ` Florian Weimer
2024-11-05 12:18 ` Peter Zijlstra
1 sibling, 1 reply; 18+ messages in thread
From: Florian Weimer @ 2024-11-04 12:36 UTC (permalink / raw)
To: Peter Zijlstra
Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
kernel-dev, linux-api, Nathan Chancellor
* Peter Zijlstra:
> On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
>
>> QEMU hints towards further problems (in linux-user/syscall.c):
>>
>> case TARGET_NR_set_robust_list:
>> case TARGET_NR_get_robust_list:
>> /* The ABI for supporting robust futexes has userspace pass
>> * the kernel a pointer to a linked list which is updated by
>> * userspace after the syscall; the list is walked by the kernel
>> * when the thread exits. Since the linked list in QEMU guest
>> * memory isn't a valid linked list for the host and we have
>> * no way to reliably intercept the thread-death event, we can't
>> * support these. Silently return ENOSYS so that guest userspace
>> * falls back to a non-robust futex implementation (which should
>> * be OK except in the corner case of the guest crashing while
>> * holding a mutex that is shared with another process via
>> * shared memory).
>> */
>> return -TARGET_ENOSYS;
>
> I don't think we can sanely fix that. Can't QEMU track the robust thing
> itself and use waitpid() to discover the thread is gone and fudge things
> from there?
There are race conditions with munmap, I think, and they probably get
a lot worse if QEMU does that.
See Rich Felker's bug report:
| The corruption is performed by the kernel when it walks the robust
| list. The basic situation is the same as in PR #13690, except that
| here there's actually a potential write to the memory rather than just
| a read.
|
| The sequence of events leading to corruption goes like this:
|
| 1. Thread A unlocks the process-shared, robust mutex and is preempted
| after the mutex is removed from the robust list and atomically
| unlocked, but before it's removed from the list_op_pending field of
| the robust list header.
|
| 2. Thread B locks the mutex, and, knowing by program logic that it's
| the last user of the mutex, unlocks and unmaps it, allocates/maps
| something else that gets assigned the same address as the shared mutex
| mapping, and then exits.
|
| 3. The kernel destroys the process, which involves walking each
| thread's robust list and processing each thread's list_op_pending
| field of the robust list header. Since thread A has a list_op_pending
| pointing at the address previously occupied by the mutex, the kernel
| obliviously "unlocks the mutex" by writing a 0 to the address and
| futex-waking it. However, the kernel has instead overwritten part of
| whatever mapping thread A created. If this is private memory it
| (probably) doesn't matter since the process is ending anyway (but are
| there race conditions where this can be seen?). If this is shared
| memory or a shared file mapping, however, the kernel corrupts it.
|
| I suspect the race is difficult to hit since thread A has to get
| preempted at exactly the wrong time AND thread B has to do a fair
| amount of work without thread A getting scheduled again. So I'm not
| sure how much luck we'd have getting a test case.
<https://sourceware.org/bugzilla/show_bug.cgi?id=14485#c3>
We also have a silent unlocking failure because userspace does not know
about ROBUST_LIST_LIMIT:
Bug 19089 - Robust mutexes do not take ROBUST_LIST_LIMIT into account
<https://sourceware.org/bugzilla/show_bug.cgi?id=19089>
(I think we may have discussed this one before, and you may have
suggested to just hard-code 2048 in userspace because the constant is
not expected to change.)
So the in-mutex linked list has quite a few problems even outside of
emulation. 8-(
Thanks,
Florian
* Re: [PATCH v2 0/3] futex: Create set_robust_list2
2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
2024-11-04 11:32 ` Peter Zijlstra
@ 2024-11-04 21:49 ` André Almeida
1 sibling, 0 replies; 18+ messages in thread
From: André Almeida @ 2024-11-04 21:49 UTC (permalink / raw)
To: Florian Weimer
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
kernel-dev, linux-api, Nathan Chancellor
Hi Florian,
Em 02/11/2024 18:58, Florian Weimer escreveu:
> * André Almeida:
>
>> 1) x86 apps can have 32bit pointers robust lists. For a x86-64 kernel
>> this is not a problem, because of the compat entry point. But there's
>> no such compat entry point for AArch64, so the kernel would do the
>> pointer arithmetic wrongly. Is also unviable to userspace to keep
>> track every addition/removal to the robust list and keep a 64bit
>> version of it somewhere else to feed the kernel. Thus, the new
>> interface has an option of telling the kernel if the list is filled
>> with 32bit or 64bit pointers.
>
> The size is typically different for 32-bit and 64-bit mode (12 vs 24
> bytes). Why isn't this enough to disambiguate?
>
Right, so the idea would be to use `size_t len` from the syscall
arguments for that?
>> 2) Apps can set just one robust list (in theory, x86-64 can set two if
>> they also use the compat entry point). That means that when a x86 app
>> asks FEX-Emu to call set_robust_list(), FEX have two options: to
>> overwrite their own robust list pointer and make the app robust, or
>> to ignore the app robust list and keep the emulator robust. The new
>> interface allows for multiple robust lists per application, solving
>> this.
>
> Can't you avoid mixing emulated and general userspace code on the same
> thread? On emulator threads, you have full control over the TCB.
>
FEX can't avoid that because it doesn't do full system emulation; it
only translates instructions. FEX doesn't have full control over the
TCB; that's still all glibc, or whatever other dynamic linker is used.
* Re: [PATCH v2 2/3] futex: Create set_robust_list2
2024-11-04 11:22 ` Peter Zijlstra
@ 2024-11-04 21:55 ` André Almeida
2024-11-05 12:10 ` Peter Zijlstra
0 siblings, 1 reply; 18+ messages in thread
From: André Almeida @ 2024-11-04 21:55 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso,
Arnd Bergmann, sonicadvance1, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor
Hi Peter,
Em 04/11/2024 08:22, Peter Zijlstra escreveu:
> On Fri, Nov 01, 2024 at 01:21:46PM -0300, André Almeida wrote:
>> @@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
>>
>> static void futex_cleanup(struct task_struct *tsk)
>> {
>> + struct robust_list2_entry *curr, *n;
>> + struct list_head *list2 = &tsk->robust_list2;
>> +
>> #ifdef CONFIG_64BIT
>> if (unlikely(tsk->robust_list)) {
>> - exit_robust_list64(tsk);
>> + exit_robust_list64(tsk, tsk->robust_list);
>> tsk->robust_list = NULL;
>> }
>> #else
>> if (unlikely(tsk->robust_list)) {
>> - exit_robust_list32(tsk);
>> + exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
>> tsk->robust_list = NULL;
>> }
>> #endif
>>
>> #ifdef CONFIG_COMPAT
>> if (unlikely(tsk->compat_robust_list)) {
>> - exit_robust_list32(tsk);
>> + exit_robust_list32(tsk, tsk->compat_robust_list);
>> tsk->compat_robust_list = NULL;
>> }
>> #endif
>> + /*
>> + * Walk through the linked list, parsing robust lists and freeing the
>> + * allocated lists
>> + */
>> + if (unlikely(!list_empty(list2))) {
>> + list_for_each_entry_safe(curr, n, list2, list) {
>> + if (curr->head != NULL) {
>> + if (curr->list_type == ROBUST_LIST_64BIT)
>> + exit_robust_list64(tsk, curr->head);
>> + else if (curr->list_type == ROBUST_LIST_32BIT)
>> + exit_robust_list32(tsk, curr->head);
>> + curr->head = NULL;
>> + }
>> + list_del_init(&curr->list);
>> + kfree(curr);
>> + }
>> + }
>>
>> if (unlikely(!list_empty(&tsk->pi_state_list)))
>> exit_pi_state_list(tsk);
>
> I'm still digesting this, but the above seems particularly silly.
>
> Should not the legacy lists also be on the list of lists? I mean, it
> makes no sense to have two completely separate means of tracking lists.
>
You are asking if, whenever someone calls set_robust_list() or
compat_set_robust_list() to be inserted into &current->robust_list2
instead of using tsk->robust_list and tsk->compat_robust_list?
I was thinking of doing that, but my current implementation has a
kmalloc() call for every insertion, and I wasn't sure if I could add
this new latency to the old set_robust_list() syscall. Assuming it is
usually called just once during thread initialization, perhaps it
shouldn't cause much harm, I guess.
* Re: [PATCH v2 2/3] futex: Create set_robust_list2
2024-11-04 21:55 ` André Almeida
@ 2024-11-05 12:10 ` Peter Zijlstra
0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-05 12:10 UTC (permalink / raw)
To: André Almeida
Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso,
Arnd Bergmann, sonicadvance1, linux-kernel, kernel-dev, linux-api,
Nathan Chancellor
On Mon, Nov 04, 2024 at 06:55:45PM -0300, André Almeida wrote:
> Hi Peter,
>
> Em 04/11/2024 08:22, Peter Zijlstra escreveu:
> > On Fri, Nov 01, 2024 at 01:21:46PM -0300, André Almeida wrote:
> > > @@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
> > > static void futex_cleanup(struct task_struct *tsk)
> > > {
> > > + struct robust_list2_entry *curr, *n;
> > > + struct list_head *list2 = &tsk->robust_list2;
> > > +
> > > #ifdef CONFIG_64BIT
> > > if (unlikely(tsk->robust_list)) {
> > > - exit_robust_list64(tsk);
> > > + exit_robust_list64(tsk, tsk->robust_list);
> > > tsk->robust_list = NULL;
> > > }
> > > #else
> > > if (unlikely(tsk->robust_list)) {
> > > - exit_robust_list32(tsk);
> > > + exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
> > > tsk->robust_list = NULL;
> > > }
> > > #endif
> > > #ifdef CONFIG_COMPAT
> > > if (unlikely(tsk->compat_robust_list)) {
> > > - exit_robust_list32(tsk);
> > > + exit_robust_list32(tsk, tsk->compat_robust_list);
> > > tsk->compat_robust_list = NULL;
> > > }
> > > #endif
> > > + /*
> > > + * Walk through the linked list, parsing robust lists and freeing the
> > > + * allocated lists
> > > + */
> > > + if (unlikely(!list_empty(list2))) {
> > > + list_for_each_entry_safe(curr, n, list2, list) {
> > > + if (curr->head != NULL) {
> > > + if (curr->list_type == ROBUST_LIST_64BIT)
> > > + exit_robust_list64(tsk, curr->head);
> > > + else if (curr->list_type == ROBUST_LIST_32BIT)
> > > + exit_robust_list32(tsk, curr->head);
> > > + curr->head = NULL;
> > > + }
> > > + list_del_init(&curr->list);
> > > + kfree(curr);
> > > + }
> > > + }
> > > if (unlikely(!list_empty(&tsk->pi_state_list)))
> > > exit_pi_state_list(tsk);
> >
> > I'm still digesting this, but the above seems particularly silly.
> >
> > Should not the legacy lists also be on the list of lists? I mean, it
> > makes no sense to have two completely separate means of tracking lists.
> >
>
> You are asking if, whenever someone calls set_robust_list() or
> compat_set_robust_list() to be inserted into &current->robust_list2 instead
> of using tsk->robust_list and tsk->compat_robust_list?
Yes, that.
* Re: [PATCH v2 0/3] futex: Create set_robust_list2
2024-11-04 12:36 ` Florian Weimer
@ 2024-11-05 12:18 ` Peter Zijlstra
0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-05 12:18 UTC (permalink / raw)
To: Florian Weimer
Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
kernel-dev, linux-api, Nathan Chancellor, Mathieu Desnoyers
On Mon, Nov 04, 2024 at 01:36:43PM +0100, Florian Weimer wrote:
> * Peter Zijlstra:
>
> > On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
> >
> >> QEMU hints towards further problems (in linux-user/syscall.c):
> >>
> >> case TARGET_NR_set_robust_list:
> >> case TARGET_NR_get_robust_list:
> >> /* The ABI for supporting robust futexes has userspace pass
> >> * the kernel a pointer to a linked list which is updated by
> >> * userspace after the syscall; the list is walked by the kernel
> >> * when the thread exits. Since the linked list in QEMU guest
> >> * memory isn't a valid linked list for the host and we have
> >> * no way to reliably intercept the thread-death event, we can't
> >> * support these. Silently return ENOSYS so that guest userspace
> >> * falls back to a non-robust futex implementation (which should
> >> * be OK except in the corner case of the guest crashing while
> >> * holding a mutex that is shared with another process via
> >> * shared memory).
> >> */
> >> return -TARGET_ENOSYS;
> >
> > I don't think we can sanely fix that. Can't QEMU track the robust thing
> > itself and use waitpid() to discover the thread is gone and fudge things
> > from there?
>
> There are race conditions with munmap, I think, and they probably get a
> lot of worse if QEMU does that.
>
> See Rich Felker's bug report:
>
> | The corruption is performed by the kernel when it walks the robust
> | list. The basic situation is the same as in PR #13690, except that
> | here there's actually a potential write to the memory rather than just
> | a read.
> |
> | The sequence of events leading to corruption goes like this:
> |
> | 1. Thread A unlocks the process-shared, robust mutex and is preempted
> | after the mutex is removed from the robust list and atomically
> | unlocked, but before it's removed from the list_op_pending field of
> | the robust list header.
> |
> | 2. Thread B locks the mutex, and, knowing by program logic that it's
> | the last user of the mutex, unlocks and unmaps it, allocates/maps
> | something else that gets assigned the same address as the shared mutex
> | mapping, and then exits.
> |
> | 3. The kernel destroys the process, which involves walking each
> | thread's robust list and processing each thread's list_op_pending
> | field of the robust list header. Since thread A has a list_op_pending
> | pointing at the address previously occupied by the mutex, the kernel
> | obliviously "unlocks the mutex" by writing a 0 to the address and
> | futex-waking it. However, the kernel has instead overwritten part of
> | whatever mapping thread A created. If this is private memory it
> | (probably) doesn't matter since the process is ending anyway (but are
> | there race conditions where this can be seen?). If this is shared
> | memory or a shared file mapping, however, the kernel corrupts it.
> |
> | I suspect the race is difficult to hit since thread A has to get
> | preempted at exactly the wrong time AND thread B has to do a fair
> | amount of work without thread A getting scheduled again. So I'm not
> | sure how much luck we'd have getting a test case.
>
>
> <https://sourceware.org/bugzilla/show_bug.cgi?id=14485#c3>
So I've only managed to conjure up two horrible solutions for this:
- put the robust futex operations under user-space RCU, and mandate a
matching synchronize_rcu() before any munmap() calls.
- add a robust-barrier syscall that waits until all list_op_pending are
either NULL or changed since invocation. And mandate this call before
munmap().
Neither are particularly pretty I admit, but at least they should work.
But doing this and mandating the alignment thing should at least make
this qemu thing workable, no?
> We also have a silent unlocking failure because userspace does not know
> about ROBUST_LIST_LIMIT:
>
> Bug 19089 - Robust mutexes do not take ROBUST_LIST_LIMIT into account
> <https://sourceware.org/bugzilla/show_bug.cgi?id=19089>
>
> (I think we may have discussed this one before, and you may have
> suggested to just hard-code 2048 in userspace because the constant is
> not expected to change.)
>
> So the in-mutex linked list has quite a few problems even outside of
> emulation. 8-(
It's futex, of course it's a pain in the arse :-)
And yeah, no better ideas on that limit for now...
end of thread, other threads:[~2024-11-05 12:18 UTC | newest]
Thread overview: 18+ messages
2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
2024-11-02 5:44 ` kernel test robot
2024-11-02 14:57 ` kernel test robot
2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
2024-11-02 16:40 ` kernel test robot
2024-11-04 11:22 ` Peter Zijlstra
2024-11-04 21:55 ` André Almeida
2024-11-05 12:10 ` Peter Zijlstra
2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
2024-11-02 5:13 ` kernel test robot
2024-11-02 6:05 ` kernel test robot
2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
2024-11-04 11:32 ` Peter Zijlstra
2024-11-04 11:56 ` Peter Zijlstra
2024-11-04 12:36 ` Florian Weimer
2024-11-05 12:18 ` Peter Zijlstra
2024-11-04 21:49 ` André Almeida