linux-api.vger.kernel.org archive mirror
* [PATCH v2 0/3] futex: Create set_robust_list2
@ 2024-11-01 16:21 André Almeida
  2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
	André Almeida

This patch series adds a new set_robust_list2() syscall. The current
syscall can't be expanded to cover the following use case, so a new one
is needed. The new syscall allows users to set multiple robust lists
per process and to have either 32bit or 64bit pointers in each list.

* Use case

FEX-Emu[1] is an application that runs x86 and x86-64 binaries on an
AArch64 Linux host. One of the tasks of FEX-Emu is to translate syscalls
from one platform to another. Existing set_robust_list() can't be easily
translated because of two limitations:

1) x86 apps can have robust lists with 32bit pointers. For an x86-64
   kernel this is not a problem, because of the compat entry point. But
   there's no such compat entry point for AArch64, so the kernel would
   do the pointer arithmetic wrongly. It is also not viable for
   userspace to track every addition/removal to the robust list and to
   keep a 64bit copy of it somewhere else to feed to the kernel. Thus,
   the new interface has an option for telling the kernel whether the
   list is filled with 32bit or 64bit pointers (a sketch of the two
   layouts follows this list).

2) Apps can set just one robust list (in theory, x86-64 apps can set
   two if they also use the compat entry point). That means that when
   an x86 app asks FEX-Emu to call set_robust_list(), FEX has two
   options: overwrite its own robust list pointer and make the app
   robust, or ignore the app's robust list and keep the emulator
   robust. The new interface allows for multiple robust lists per
   application, solving this.
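
For reference, a minimal sketch of the two list-head layouts involved:
the 64bit one as defined in include/uapi/linux/futex.h and the 32bit
one as patch 1 of this series names it (__user annotations omitted):

	/* native 64bit layout */
	struct robust_list {
		struct robust_list *next;
	};

	struct robust_list_head {
		struct robust_list list;
		long futex_offset;
		struct robust_list *list_op_pending;
	};

	/* 32bit layout, robust_list_head32 from patch 1 */
	struct robust_list32 {
		u32 next;
	};

	struct robust_list_head32 {
		struct robust_list32	list;
		s32			futex_offset;
		u32			list_op_pending;
	};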

* Interface

This is the proposed interface:

	long set_robust_list2(void *head, int index, unsigned int flags)

`head` is the head of the userspace struct robust_list_head, just as in
the old set_robust_list(). It needs to be a void pointer since it can
point to either a normal robust_list_head or a compat_robust_list_head.

`flags` can be used for defining the list type:

	enum robust_list_type {
		ROBUST_LIST_32BIT,
		ROBUST_LIST_64BIT,
	};

`index` is the index in the kernel's internal linked list of robust
lists (the naming starts to get confusing, I reckon). If `index == -1`,
the user wants to set a new robust list: the kernel will append it to
the end of the internal list, assign a new index and return this index
to the user. If `index >= 0`, the user wants to re-set the `*head` of
an already existing list (similar to what happens when
set_robust_list() is called twice with different `*head` values).

If `index` is out of range, or it points to a non-existing robust_list, or if
the internal list is full, an error is returned.
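
To make the semantics concrete, here is a minimal userspace sketch (not
part of this series): it registers a list and then re-sets its head. It
assumes the uapi additions from patch 2 (ROBUST_LIST_64BIT) and the
asm-generic syscall number 463 from patch 3, and calls via syscall(2)
since no libc wrapper exists:

	#include <linux/futex.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	#ifndef __NR_set_robust_list2
	#define __NR_set_robust_list2 463	/* asm-generic number, patch 3/3 */
	#endif

	/* an empty robust list: the head points to itself */
	static struct robust_list_head head64 = {
		.list = { .next = &head64.list },
	};

	int main(void)
	{
		/* index == -1: ask the kernel to allocate a new slot */
		long idx = syscall(__NR_set_robust_list2, &head64, -1,
				   ROBUST_LIST_64BIT);
		if (idx < 0)
			return 1;

		/* index >= 0: re-set the head of the list just registered */
		if (syscall(__NR_set_robust_list2, &head64, (int)idx,
			    ROBUST_LIST_64BIT) < 0)
			return 1;

		return 0;
	}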

* Implementation

The implementation reuses as much of the existing robust list interface
as possible. The new task_struct member `struct list_head robust_list2`
is just a linked list where new lists are appended as the user requests
more lists, and in futex_cleanup() the kernel walks through this
internal list, feeding exit_robust_list() with each registered robust
list.

This implementation supports up to 10 lists (defined by
ROBUST_LISTS_PER_TASK), but that is an arbitrary number for this RFC.
For the use case described above, 4 should be enough; I'm not sure what
the limit should be.

It doesn't support list removal (should it?). It doesn't have a proper
get_robust_list2() yet either, but I can add one in a next revision. We
could also have a generic robust_list() syscall that can be used to
set/get and be controlled by flags.

The new interface has an `unsigned int flags` argument, making it
extensible for future use cases as well.

* Testing

I will provide a selftest similar to the one I proposed for the current
interface here:
https://lore.kernel.org/lkml/20241010011142.905297-1-andrealmeid@igalia.com/

Also, FEX-Emu added support for this interface to validate it:
https://github.com/FEX-Emu/FEX/pull/3966

Feedback is very welcome!

Thanks,
	André

[1] https://github.com/FEX-Emu/FEX

Changelog:
- Added a patch to properly deal with exit_robust_list() on 64bit vs 32bit kernels
- Wired-up syscall for all archs
- Added more of the cover letter to the commit message
v1: https://lore.kernel.org/lkml/20241024145735.162090-1-andrealmeid@igalia.com/

André Almeida (3):
  futex: Use explicit sizes for compat_exit_robust_list
  futex: Create set_robust_list2
  futex: Wire up set_robust_list2 syscall

 arch/alpha/kernel/syscalls/syscall.tbl      |   1 +
 arch/arm/tools/syscall.tbl                  |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   1 +
 arch/s390/kernel/syscalls/syscall.tbl       |   1 +
 arch/sh/kernel/syscalls/syscall.tbl         |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   1 +
 include/linux/compat.h                      |  12 +-
 include/linux/futex.h                       |  12 ++
 include/linux/sched.h                       |   3 +-
 include/uapi/asm-generic/unistd.h           |   5 +-
 include/uapi/linux/futex.h                  |  24 ++++
 init/init_task.c                            |   3 +
 kernel/futex/core.c                         | 116 +++++++++++++++++---
 kernel/futex/futex.h                        |   3 +
 kernel/futex/syscalls.c                     |  40 ++++++-
 kernel/sys_ni.c                             |   1 +
 scripts/syscall.tbl                         |   1 +
 26 files changed, 203 insertions(+), 32 deletions(-)

-- 
2.47.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
  2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
@ 2024-11-01 16:21 ` André Almeida
  2024-11-02  5:44   ` kernel test robot
  2024-11-02 14:57   ` kernel test robot
  2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
	André Almeida

There are two functions for handling robust lists during task exit:
exit_robust_list() and compat_exit_robust_list(). The first one handles
either 64bit or 32bit lists, depending on whether it's a 64bit or 32bit
kernel. compat_exit_robust_list() only exists on 64bit kernels that
support 32bit syscalls, and handles 32bit lists.

For the new syscall set_robust_list2(), 64bit kernels need to be able
to handle 32bit lists regardless of whether they support 32bit
syscalls, so make compat_exit_robust_list() exist regardless of
CONFIG_COMPAT.

Also, use explicit sizing, otherwise in a 32bit kernel both
exit_robust_list() and compat_exit_robust_list() would be exactly the
same function, with neither of them dealing with 64bit robust lists.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
This code was tested in 3 different setups:
- 64bit binary in 64bit kernel
- 32bit binary in 64bit kernel
- 32bit binary in 32bit kernel

Using this selftest: https://lore.kernel.org/lkml/20241010011142.905297-1-andrealmeid@igalia.com/

 include/linux/compat.h  | 12 +----------
 include/linux/futex.h   | 11 ++++++++++
 include/linux/sched.h   |  2 +-
 kernel/futex/core.c     | 45 ++++++++++++++++++++++++++---------------
 kernel/futex/syscalls.c |  4 ++--
 5 files changed, 44 insertions(+), 30 deletions(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 56cebaff0c91..968a9135ff48 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -385,16 +385,6 @@ struct compat_ifconf {
 	compat_caddr_t  ifcbuf;
 };
 
-struct compat_robust_list {
-	compat_uptr_t			next;
-};
-
-struct compat_robust_list_head {
-	struct compat_robust_list	list;
-	compat_long_t			futex_offset;
-	compat_uptr_t			list_op_pending;
-};
-
 #ifdef CONFIG_COMPAT_OLD_SIGACTION
 struct compat_old_sigaction {
 	compat_uptr_t			sa_handler;
@@ -672,7 +662,7 @@ asmlinkage long compat_sys_waitid(int, compat_pid_t,
 		struct compat_siginfo __user *, int,
 		struct compat_rusage __user *);
 asmlinkage long
-compat_sys_set_robust_list(struct compat_robust_list_head __user *head,
+compat_sys_set_robust_list(struct robust_list_head32 __user *head,
 			   compat_size_t len);
 asmlinkage long
 compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
diff --git a/include/linux/futex.h b/include/linux/futex.h
index b70df27d7e85..8217b5ebdd9c 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -53,6 +53,17 @@ union futex_key {
 #define FUTEX_KEY_INIT (union futex_key) { .both = { .ptr = 0ULL } }
 
 #ifdef CONFIG_FUTEX
+
+struct robust_list32 {
+	u32 next;
+};
+
+struct robust_list_head32 {
+	struct robust_list32	list;
+	s32			futex_offset;
+	u32			list_op_pending;
+};
+
 enum {
 	FUTEX_STATE_OK,
 	FUTEX_STATE_EXITING,
diff --git a/include/linux/sched.h b/include/linux/sched.h
index bb343136ddd0..8f20b703557d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1282,7 +1282,7 @@ struct task_struct {
 #ifdef CONFIG_FUTEX
 	struct robust_list_head __user	*robust_list;
 #ifdef CONFIG_COMPAT
-	struct compat_robust_list_head __user *compat_robust_list;
+	struct robust_list_head32 __user *compat_robust_list;
 #endif
 	struct list_head		pi_state_list;
 	struct futex_pi_state		*pi_state_cache;
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 136768ae2637..bcd0e2a7ba65 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -790,13 +790,14 @@ static inline int fetch_robust_entry(struct robust_list __user **entry,
 	return 0;
 }
 
+#ifdef CONFIG_64BIT
 /*
  * Walk curr->robust_list (very carefully, it's a userspace list!)
  * and mark any locks found there dead, and notify any waiters.
  *
  * We silently return on any sign of list-walking problem.
  */
-static void exit_robust_list(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr)
 {
 	struct robust_list_head __user *head = curr->robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
@@ -857,8 +858,14 @@ static void exit_robust_list(struct task_struct *curr)
 				   curr, pip, HANDLE_DEATH_PENDING);
 	}
 }
+#else
+static void exit_robust_list64(struct task_struct *curr)
+{
+	pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT");
+	return;
+}
+#endif
 
-#ifdef CONFIG_COMPAT
 static void __user *futex_uaddr(struct robust_list __user *entry,
 				compat_long_t futex_offset)
 {
@@ -872,13 +879,13 @@ static void __user *futex_uaddr(struct robust_list __user *entry,
  * Fetch a robust-list pointer. Bit 0 signals PI futexes:
  */
 static inline int
-compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
-		   compat_uptr_t __user *head, unsigned int *pi)
+fetch_robust_entry32(u32 *uentry, struct robust_list __user **entry,
+		     u32 __user *head, unsigned int *pi)
 {
 	if (get_user(*uentry, head))
 		return -EFAULT;
 
-	*entry = compat_ptr((*uentry) & ~1);
+	*entry = (void __user *)(unsigned long)((*uentry) & ~1);
 	*pi = (unsigned int)(*uentry) & 1;
 
 	return 0;
@@ -890,21 +897,21 @@ compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **ent
  *
  * We silently return on any sign of list-walking problem.
  */
-static void compat_exit_robust_list(struct task_struct *curr)
+static void exit_robust_list32(struct task_struct *curr)
 {
-	struct compat_robust_list_head __user *head = curr->compat_robust_list;
+	struct robust_list_head32 __user *head = curr->compat_robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
 	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
 	unsigned int next_pi;
-	compat_uptr_t uentry, next_uentry, upending;
-	compat_long_t futex_offset;
+	u32 uentry, next_uentry, upending;
+	s32 futex_offset;
 	int rc;
 
 	/*
 	 * Fetch the list head (which was registered earlier, via
 	 * sys_set_robust_list()):
 	 */
-	if (compat_fetch_robust_entry(&uentry, &entry, &head->list.next, &pi))
+	if (fetch_robust_entry32((u32 *)&uentry, &entry, (u32 *)&head->list.next, &pi))
 		return;
 	/*
 	 * Fetch the relative futex offset:
@@ -915,7 +922,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 	 * Fetch any possibly pending lock-add first, and handle it
 	 * if it exists:
 	 */
-	if (compat_fetch_robust_entry(&upending, &pending,
+	if (fetch_robust_entry32(&upending, &pending,
 			       &head->list_op_pending, &pip))
 		return;
 
@@ -925,8 +932,8 @@ static void compat_exit_robust_list(struct task_struct *curr)
 		 * Fetch the next entry in the list before calling
 		 * handle_futex_death:
 		 */
-		rc = compat_fetch_robust_entry(&next_uentry, &next_entry,
-			(compat_uptr_t __user *)&entry->next, &next_pi);
+		rc = fetch_robust_entry32(&next_uentry, &next_entry,
+			(u32 __user *)&entry->next, &next_pi);
 		/*
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
@@ -957,7 +964,6 @@ static void compat_exit_robust_list(struct task_struct *curr)
 		handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
 	}
 }
-#endif
 
 #ifdef CONFIG_FUTEX_PI
 
@@ -1040,14 +1046,21 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
 
 static void futex_cleanup(struct task_struct *tsk)
 {
+#ifdef CONFIG_64BIT
 	if (unlikely(tsk->robust_list)) {
-		exit_robust_list(tsk);
+		exit_robust_list64(tsk);
 		tsk->robust_list = NULL;
 	}
+#else
+	if (unlikely(tsk->robust_list)) {
+		exit_robust_list32(tsk);
+		tsk->robust_list = NULL;
+	}
+#endif
 
 #ifdef CONFIG_COMPAT
 	if (unlikely(tsk->compat_robust_list)) {
-		compat_exit_robust_list(tsk);
+		exit_robust_list32(tsk);
 		tsk->compat_robust_list = NULL;
 	}
 #endif
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index 4b6da9116aa6..dba193dfd216 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -440,7 +440,7 @@ SYSCALL_DEFINE4(futex_requeue,
 
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE2(set_robust_list,
-		struct compat_robust_list_head __user *, head,
+		struct robust_list_head32 __user *, head,
 		compat_size_t, len)
 {
 	if (unlikely(len != sizeof(*head)))
@@ -455,7 +455,7 @@ COMPAT_SYSCALL_DEFINE3(get_robust_list, int, pid,
 			compat_uptr_t __user *, head_ptr,
 			compat_size_t __user *, len_ptr)
 {
-	struct compat_robust_list_head __user *head;
+	struct robust_list_head32 __user *head;
 	unsigned long ret;
 	struct task_struct *p;
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/3] futex: Create set_robust_list2
  2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
  2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
@ 2024-11-01 16:21 ` André Almeida
  2024-11-02 16:40   ` kernel test robot
  2024-11-04 11:22   ` Peter Zijlstra
  2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
  2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
  3 siblings, 2 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
	André Almeida

Create a new set_robust_list2() syscall. The current syscall can't be
expanded to cover the following use case, so a new one is needed. This
new syscall allows users to set multiple robust lists per process and
to have either 32bit or 64bit pointers in each list.

* Interface

This is the proposed interface:

	long set_robust_list2(void *head, int index, unsigned int flags)

`head` is the head of the userspace struct robust_list_head, just as in
the old set_robust_list(). It needs to be a void pointer since it can
point to either a normal robust_list_head or a compat_robust_list_head.

`flags` can be used for defining the list type:

	enum robust_list_type {
		ROBUST_LIST_32BIT,
		ROBUST_LIST_64BIT,
	};

`index` is the index in the kernel's internal linked list of robust
lists (the naming starts to get confusing, I reckon). If `index == -1`,
the user wants to set a new robust list: the kernel will append it to
the end of the internal list, assign a new index and return this index
to the user. If `index >= 0`, the user wants to re-set the `*head` of
an already existing list (similar to what happens when
set_robust_list() is called twice with different `*head` values).

If `index` is out of range, or it points to a non-existing robust_list,
or if the internal list is full, an error is returned.

Users cannot remove lists.
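
As an illustration of the multi-list semantics (a hypothetical
emulator-style sketch, not part of this patch: identifiers are
illustrative, and it assumes a wired-up __NR_set_robust_list2 plus the
usual syscall(2) includes, as in the cover letter example), a process
can keep a native 64bit list and a guest's 32bit list registered at the
same time, each with its own index:

	/* emulator's own native robust list, pointing to itself when empty */
	static struct robust_list_head emu_head = {
		.list = { .next = &emu_head.list },
	};

	static int register_lists(void *guest_head32)
	{
		long emu_idx, guest_idx;

		emu_idx = syscall(__NR_set_robust_list2, &emu_head, -1,
				  ROBUST_LIST_64BIT);
		if (emu_idx < 0)
			return -1;

		/* guest's 32bit list, handed over by the emulated app */
		guest_idx = syscall(__NR_set_robust_list2, guest_head32, -1,
				    ROBUST_LIST_32BIT);
		if (guest_idx < 0)
			return -1;

		/* both lists are walked by the kernel on task exit */
		return 0;
	}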

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 include/linux/futex.h             |  1 +
 include/linux/sched.h             |  1 +
 include/uapi/asm-generic/unistd.h |  5 +-
 include/uapi/linux/futex.h        | 24 +++++++++
 init/init_task.c                  |  3 ++
 kernel/futex/core.c               | 85 ++++++++++++++++++++++++++++---
 kernel/futex/futex.h              |  3 ++
 kernel/futex/syscalls.c           | 36 +++++++++++++
 8 files changed, 149 insertions(+), 9 deletions(-)

diff --git a/include/linux/futex.h b/include/linux/futex.h
index 8217b5ebdd9c..997fe0013bc0 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -76,6 +76,7 @@ static inline void futex_init_task(struct task_struct *tsk)
 #ifdef CONFIG_COMPAT
 	tsk->compat_robust_list = NULL;
 #endif
+	INIT_LIST_HEAD(&tsk->robust_list2);
 	INIT_LIST_HEAD(&tsk->pi_state_list);
 	tsk->pi_state_cache = NULL;
 	tsk->futex_state = FUTEX_STATE_OK;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8f20b703557d..4a2455f1b07c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1284,6 +1284,7 @@ struct task_struct {
 #ifdef CONFIG_COMPAT
 	struct robust_list_head32 __user *compat_robust_list;
 #endif
+	struct list_head		robust_list2;
 	struct list_head		pi_state_list;
 	struct futex_pi_state		*pi_state_cache;
 	struct mutex			futex_exit_mutex;
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 5bf6148cac2b..c1f5c9635c07 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -841,8 +841,11 @@ __SYSCALL(__NR_lsm_list_modules, sys_lsm_list_modules)
 #define __NR_mseal 462
 __SYSCALL(__NR_mseal, sys_mseal)
 
+#define __NR_set_robust_list2 463
+__SYSCALL(__NR_set_robust_list2, sys_set_robust_list2)
+
 #undef __NR_syscalls
-#define __NR_syscalls 463
+#define __NR_syscalls 464
 
 /*
  * 32 bit systems traditionally used different
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index d2ee625ea189..13903a278b71 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -146,6 +146,30 @@ struct robust_list_head {
 	struct robust_list __user *list_op_pending;
 };
 
+#define ROBUST_LISTS_PER_TASK 10
+
+enum robust_list2_type {
+	ROBUST_LIST_32BIT,
+	ROBUST_LIST_64BIT,
+};
+
+#define ROBUST_LIST_TYPE_MASK (ROBUST_LIST_32BIT | ROBUST_LIST_64BIT)
+
+/*
+ * This is an entry of a linked list of robust lists.
+ *
+ * @head: can point to a 64bit list or a 32bit list
+ * @list_type: determine the size of the futex pointers in the list
+ * @index: the index of this entry in the list
+ * @list: linked list element
+ */
+struct robust_list2_entry {
+	void __user *head;
+	enum robust_list2_type list_type;
+	unsigned int index;
+	struct list_head list;
+};
+
 /*
  * Are there any waiters for this robust futex:
  */
diff --git a/init/init_task.c b/init/init_task.c
index 136a8231355a..1b08e745c47d 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -219,6 +219,9 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 #ifdef CONFIG_SECCOMP_FILTER
 	.seccomp	= { .filter_count = ATOMIC_INIT(0) },
 #endif
+#ifdef CONFIG_FUTEX
+	.robust_list2 = LIST_HEAD_INIT(init_task.robust_list2),
+#endif
 };
 EXPORT_SYMBOL(init_task);
 
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index bcd0e2a7ba65..f74476d0bcc1 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -797,9 +797,9 @@ static inline int fetch_robust_entry(struct robust_list __user **entry,
  *
  * We silently return on any sign of list-walking problem.
  */
-static void exit_robust_list64(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr,
+			       struct robust_list_head __user *head)
 {
-	struct robust_list_head __user *head = curr->robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
 	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
 	unsigned int next_pi;
@@ -859,7 +859,8 @@ static void exit_robust_list64(struct task_struct *curr)
 	}
 }
 #else
-static void exit_robust_list64(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr,
+			      struct robust_list_head __user *head)
 {
 	pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT");
 	return;
@@ -897,9 +898,9 @@ fetch_robust_entry32(u32 *uentry, struct robust_list __user **entry,
  *
  * We silently return on any sign of list-walking problem.
  */
-static void exit_robust_list32(struct task_struct *curr)
+static void exit_robust_list32(struct task_struct *curr,
+			       struct robust_list_head32 __user *head)
 {
-	struct robust_list_head32 __user *head = curr->compat_robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
 	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
 	unsigned int next_pi;
@@ -965,6 +966,54 @@ static void exit_robust_list32(struct task_struct *curr)
 	}
 }
 
+long do_set_robust_list2(struct robust_list_head __user *head,
+			 int index, unsigned int type)
+{
+	struct list_head *list2 = &current->robust_list2;
+	struct robust_list2_entry *prev, *new = NULL;
+
+	if (index == -1) {
+		if (list_empty(list2)) {
+			index = 0;
+		} else {
+			prev = list_last_entry(list2, struct robust_list2_entry, list);
+			index = prev->index + 1;
+		}
+
+		if (index >= ROBUST_LISTS_PER_TASK)
+			return -EINVAL;
+
+		new = kmalloc(sizeof(struct robust_list2_entry), GFP_KERNEL);
+		if (!new)
+			return -ENOMEM;
+
+		list_add_tail(&new->list, list2);
+		new->index = index;
+
+	} else if (index >= 0) {
+		struct robust_list2_entry *curr;
+
+		if (list_empty(list2))
+			return -ENOENT;
+
+		list_for_each_entry(curr, list2, list) {
+			if (index == curr->index) {
+				new = curr;
+				break;
+			}
+		}
+
+		if (!new)
+			return -ENOENT;
+	}
+
+	BUG_ON(!new);
+	new->head = head;
+	new->list_type = type;
+
+	return index;
+}
+
 #ifdef CONFIG_FUTEX_PI
 
 /*
@@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
 
 static void futex_cleanup(struct task_struct *tsk)
 {
+	struct robust_list2_entry *curr, *n;
+	struct list_head *list2 = &tsk->robust_list2;
+
 #ifdef CONFIG_64BIT
 	if (unlikely(tsk->robust_list)) {
-		exit_robust_list64(tsk);
+		exit_robust_list64(tsk, tsk->robust_list);
 		tsk->robust_list = NULL;
 	}
 #else
 	if (unlikely(tsk->robust_list)) {
-		exit_robust_list32(tsk);
+		exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
 		tsk->robust_list = NULL;
 	}
 #endif
 
 #ifdef CONFIG_COMPAT
 	if (unlikely(tsk->compat_robust_list)) {
-		exit_robust_list32(tsk);
+		exit_robust_list32(tsk, tsk->compat_robust_list);
 		tsk->compat_robust_list = NULL;
 	}
 #endif
+	/*
+	 * Walk through the linked list, parsing robust lists and freeing the
+	 * allocated lists
+	 */
+	if (unlikely(!list_empty(list2))) {
+		list_for_each_entry_safe(curr, n, list2, list) {
+			if (curr->head != NULL) {
+				if (curr->list_type == ROBUST_LIST_64BIT)
+					exit_robust_list64(tsk, curr->head);
+				else if (curr->list_type == ROBUST_LIST_32BIT)
+					exit_robust_list32(tsk, curr->head);
+				curr->head = NULL;
+			}
+			list_del_init(&curr->list);
+			kfree(curr);
+		}
+	}
 
 	if (unlikely(!list_empty(&tsk->pi_state_list)))
 		exit_pi_state_list(tsk);
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index 8b195d06f4e8..7247d5c583d5 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -349,6 +349,9 @@ extern int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
 extern int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
 		      ktime_t *abs_time, u32 bitset);
 
+extern long do_set_robust_list2(struct robust_list_head __user *head,
+			 int index, unsigned int type);
+
 /**
  * struct futex_vector - Auxiliary struct for futex_waitv()
  * @w: Userspace provided data
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index dba193dfd216..ff61570bb9c8 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -39,6 +39,42 @@ SYSCALL_DEFINE2(set_robust_list, struct robust_list_head __user *, head,
 	return 0;
 }
 
+#define ROBUST_LIST_FLAGS ROBUST_LIST_TYPE_MASK
+
+/*
+ * sys_set_robust_list2()
+ *
+ * When index == -1, create a new list for user. When index >= 0, try to find
+ * the corresponding list and re-set the head there.
+ *
+ * Return values:
+ *  >= 0: success, index of the robust list
+ *  -EINVAL: invalid flags, invalid index
+ *  -ENOENT: requested index no where to be found
+ *  -ENOMEM: error allocating new list
+ *  -ESRCH: too many allocated lists
+ */
+SYSCALL_DEFINE3(set_robust_list2, struct robust_list_head __user *, head,
+		int, index, unsigned int, flags)
+{
+	unsigned int type;
+
+	type = flags & ROBUST_LIST_TYPE_MASK;
+
+	if (index < -1 || index >= ROBUST_LISTS_PER_TASK)
+		return -EINVAL;
+
+	if ((flags & ~ROBUST_LIST_FLAGS) != 0)
+		return -EINVAL;
+
+#ifndef CONFIG_64BIT
+	if (type == ROBUST_LIST_64BIT)
+		return -EINVAL;
+#endif
+
+	return do_set_robust_list2(head, index, type);
+}
+
 /**
  * sys_get_robust_list() - Get the robust-futex list head of a task
  * @pid:	pid of the process [zero for current task]
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
  2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
  2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
  2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
@ 2024-11-01 16:21 ` André Almeida
  2024-11-02  5:13   ` kernel test robot
  2024-11-02  6:05   ` kernel test robot
  2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
  3 siblings, 2 replies; 18+ messages in thread
From: André Almeida @ 2024-11-01 16:21 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: linux-kernel, kernel-dev, linux-api, Nathan Chancellor,
	André Almeida

Wire up the new set_robust_list2 syscall in all available architectures.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
 arch/arm/tools/syscall.tbl                  | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl     | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
 arch/s390/kernel/syscalls/syscall.tbl       | 1 +
 arch/sh/kernel/syscalls/syscall.tbl         | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl      | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
 kernel/sys_ni.c                             | 1 +
 scripts/syscall.tbl                         | 1 +
 17 files changed, 17 insertions(+)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 74720667fe09..e7f9d9befdd5 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -502,3 +502,4 @@
 570	common	lsm_set_self_attr		sys_lsm_set_self_attr
 571	common	lsm_list_modules		sys_lsm_list_modules
 572	common  mseal				sys_mseal
+573	common	set_robust_list2		sys_robust_list2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 23c98203c40f..31070d427ea2 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -477,3 +477,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 22a3cbd4c602..9c5d1fa7ca54 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -462,3 +462,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common  set_robust_list2		sys_set_robust_list2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 2b81a6bd78b2..c03933182b4d 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -468,3 +468,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 953f5b7dc723..8f12bcd55d26 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -401,3 +401,4 @@
 460	n32	lsm_set_self_attr		sys_lsm_set_self_attr
 461	n32	lsm_list_modules		sys_lsm_list_modules
 462	n32	mseal				sys_mseal
+463	n32	set_robust_list2		sys_set_robust_list2
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 1464c6be6eb3..5e500a32c980 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -377,3 +377,4 @@
 460	n64	lsm_set_self_attr		sys_lsm_set_self_attr
 461	n64	lsm_list_modules		sys_lsm_list_modules
 462	n64	mseal				sys_mseal
+463	n64	set_robust_list2		sys_set_robust_list2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 2439a2491cff..ea5be2805b3f 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -450,3 +450,4 @@
 460	o32	lsm_set_self_attr		sys_lsm_set_self_attr
 461	o32	lsm_list_modules		sys_lsm_list_modules
 462	o32	mseal				sys_mseal
+463	o32	set_robust_list2		sys_set_robust_list2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 66dc406b12e4..49adcdb392a2 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -461,3 +461,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index ebae8415dfbb..eff6641f35e6 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -553,3 +553,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 01071182763e..a2366aa6791e 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -465,3 +465,4 @@
 460  common	lsm_set_self_attr	sys_lsm_set_self_attr		sys_lsm_set_self_attr
 461  common	lsm_list_modules	sys_lsm_list_modules		sys_lsm_list_modules
 462  common	mseal			sys_mseal			sys_mseal
+463  common	set_robust_list2	sys_set_robust_list2		sys_set_robust_list2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index c55fd7696d40..e6d7e565b942 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -466,3 +466,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index cfdfb3707c16..176022f9a236 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -508,3 +508,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal 				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 534c74b14fab..8607563a5510 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -468,3 +468,4 @@
 460	i386	lsm_set_self_attr	sys_lsm_set_self_attr
 461	i386	lsm_list_modules	sys_lsm_list_modules
 462	i386	mseal 			sys_mseal
+463	i386	set_robust_list2	sys_set_robust_list2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 7093ee21c0d1..fbc0cef1a97c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -386,6 +386,7 @@
 460	common	lsm_set_self_attr	sys_lsm_set_self_attr
 461	common	lsm_list_modules	sys_lsm_list_modules
 462 	common  mseal			sys_mseal
+463	common	set_robust_list2	sys_set_robust_list2
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 67083fc1b2f5..9081b3bf8272 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -433,3 +433,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal 				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index c00a86931f8c..71fbac6176c8 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -195,6 +195,7 @@ COND_SYSCALL(move_pages);
 COND_SYSCALL(set_mempolicy_home_node);
 COND_SYSCALL(cachestat);
 COND_SYSCALL(mseal);
+COND_SYSCALL(set_robust_list2);
 
 COND_SYSCALL(perf_event_open);
 COND_SYSCALL(accept4);
diff --git a/scripts/syscall.tbl b/scripts/syscall.tbl
index 845e24eb372e..e174f6e2d521 100644
--- a/scripts/syscall.tbl
+++ b/scripts/syscall.tbl
@@ -403,3 +403,4 @@
 460	common	lsm_set_self_attr		sys_lsm_set_self_attr
 461	common	lsm_list_modules		sys_lsm_list_modules
 462	common	mseal				sys_mseal
+463	common	set_robust_list2		sys_set_robust_list2
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
  2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
@ 2024-11-02  5:13   ` kernel test robot
  2024-11-02  6:05   ` kernel test robot
  1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02  5:13 UTC (permalink / raw)
  To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor, André Almeida

Hi André,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/sched/core linus/master v6.12-rc5 next-20241101]
[cannot apply to tip/x86/asm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20241101162147.284993-4-andrealmeid%40igalia.com
patch subject: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20241102/202411021208.9UQz3ahR-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021208.9UQz3ahR-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021208.9UQz3ahR-lkp@intel.com/

All errors (new ones prefixed by >>):

         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:455:1: note: in expansion of macro '__SYSCALL'
     455 | __SYSCALL(454, sys_futex_wake)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
     456 | __SYSCALL(455, sys_futex_wait)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[455]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
     456 | __SYSCALL(455, sys_futex_wait)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
     457 | __SYSCALL(456, sys_futex_requeue)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[456]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
     457 | __SYSCALL(456, sys_futex_requeue)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
     458 | __SYSCALL(457, sys_statmount)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[457]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
     458 | __SYSCALL(457, sys_statmount)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
     459 | __SYSCALL(458, sys_listmount)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[458]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
     459 | __SYSCALL(458, sys_listmount)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
     460 | __SYSCALL(459, sys_lsm_get_self_attr)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[459]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
     460 | __SYSCALL(459, sys_lsm_get_self_attr)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
     461 | __SYSCALL(460, sys_lsm_set_self_attr)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[460]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
     461 | __SYSCALL(460, sys_lsm_set_self_attr)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
     462 | __SYSCALL(461, sys_lsm_list_modules)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[461]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
     462 | __SYSCALL(461, sys_lsm_list_modules)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: warning: initialized field overwritten [-Woverride-init]
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
     463 | __SYSCALL(462, sys_mseal)
         | ^~~~~~~~~
   arch/arc/kernel/sys.c:13:36: note: (near initialization for 'sys_call_table[462]')
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                    ^
   ./arch/arc/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
     463 | __SYSCALL(462, sys_mseal)
         | ^~~~~~~~~
>> ./arch/arc/include/generated/asm/syscall_table_32.h:464:16: error: 'sys_set_robust_list2' undeclared here (not in a function); did you mean 'sys_set_robust_list'?
     464 | __SYSCALL(463, sys_set_robust_list2)
         |                ^~~~~~~~~~~~~~~~~~~~
   arch/arc/kernel/sys.c:13:37: note: in definition of macro '__SYSCALL'
      13 | #define __SYSCALL(nr, call) [nr] = (call),
         |                                     ^~~~

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
  2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
@ 2024-11-02  5:44   ` kernel test robot
  2024-11-02 14:57   ` kernel test robot
  1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02  5:44 UTC (permalink / raw)
  To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor, André Almeida

Hi André,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/locking/core]
[also build test WARNING on tip/sched/core linus/master tip/x86/asm v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20241101162147.284993-2-andrealmeid%40igalia.com
patch subject: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
config: alpha-allnoconfig (https://download.01.org/0day-ci/archive/20241102/202411021349.tqg42lGq-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 13.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021349.tqg42lGq-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021349.tqg42lGq-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/net/compat.h:8,
                    from include/net/scm.h:13,
                    from include/linux/netlink.h:9,
                    from include/uapi/linux/neighbour.h:6,
                    from include/linux/netdevice.h:44,
                    from include/net/sock.h:46,
                    from include/linux/tcp.h:19,
                    from include/linux/ipv6.h:102,
                    from include/net/ipv6.h:12,
                    from include/linux/sunrpc/clnt.h:29,
                    from include/linux/nfs_fs.h:32,
                    from init/do_mounts.c:23:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
     665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
         |                                   ^~~~~~~~~~~~~~~~~~
--
   In file included from include/net/compat.h:8,
                    from include/net/scm.h:13,
                    from include/linux/netlink.h:9,
                    from include/uapi/linux/neighbour.h:6,
                    from include/linux/netdevice.h:44,
                    from include/net/sock.h:46,
                    from include/linux/tcp.h:19,
                    from include/linux/ipv6.h:102,
                    from include/net/addrconf.h:65,
                    from lib/vsprintf.c:41:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
     665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
         |                                   ^~~~~~~~~~~~~~~~~~
   lib/vsprintf.c: In function 'va_format':
   lib/vsprintf.c:1683:9: warning: function 'va_format' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
    1683 |         buf += vsnprintf(buf, end > buf ? end - buf : 0, va_fmt->fmt, va);
         |         ^~~
--
   In file included from kernel/time/hrtimer.c:44:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
     665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
         |                                   ^~~~~~~~~~~~~~~~~~
   kernel/time/hrtimer.c:121:35: warning: initialized field overwritten [-Woverride-init]
     121 |         [CLOCK_REALTIME]        = HRTIMER_BASE_REALTIME,
         |                                   ^~~~~~~~~~~~~~~~~~~~~
   kernel/time/hrtimer.c:121:35: note: (near initialization for 'hrtimer_clock_to_base_table[0]')
   kernel/time/hrtimer.c:122:35: warning: initialized field overwritten [-Woverride-init]
     122 |         [CLOCK_MONOTONIC]       = HRTIMER_BASE_MONOTONIC,
         |                                   ^~~~~~~~~~~~~~~~~~~~~~
   kernel/time/hrtimer.c:122:35: note: (near initialization for 'hrtimer_clock_to_base_table[1]')
   kernel/time/hrtimer.c:123:35: warning: initialized field overwritten [-Woverride-init]
     123 |         [CLOCK_BOOTTIME]        = HRTIMER_BASE_BOOTTIME,
         |                                   ^~~~~~~~~~~~~~~~~~~~~
   kernel/time/hrtimer.c:123:35: note: (near initialization for 'hrtimer_clock_to_base_table[7]')
   kernel/time/hrtimer.c:124:35: warning: initialized field overwritten [-Woverride-init]
     124 |         [CLOCK_TAI]             = HRTIMER_BASE_TAI,
         |                                   ^~~~~~~~~~~~~~~~
   kernel/time/hrtimer.c:124:35: note: (near initialization for 'hrtimer_clock_to_base_table[11]')
--
   In file included from kernel/futex/core.c:34:
>> include/linux/compat.h:665:35: warning: 'struct robust_list_head32' declared inside parameter list will not be visible outside of this definition or declaration
     665 | compat_sys_set_robust_list(struct robust_list_head32 __user *head,
         |                                   ^~~~~~~~~~~~~~~~~~
   kernel/futex/core.c: In function 'exit_robust_list32':
   kernel/futex/core.c:902:54: error: 'struct task_struct' has no member named 'compat_robust_list'
     902 |         struct robust_list_head32 __user *head = curr->compat_robust_list;
         |                                                      ^~
   kernel/futex/core.c: At top level:
   kernel/futex/core.c:900:13: warning: 'exit_robust_list32' defined but not used [-Wunused-function]
     900 | static void exit_robust_list32(struct task_struct *curr)
         |             ^~~~~~~~~~~~~~~~~~


vim +665 include/linux/compat.h

   621	
   622	#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64
   623	asmlinkage long compat_sys_pwritev64(unsigned long fd,
   624			const struct iovec __user *vec,
   625			unsigned long vlen, loff_t pos);
   626	#endif
   627	asmlinkage long compat_sys_sendfile(int out_fd, int in_fd,
   628					    compat_off_t __user *offset, compat_size_t count);
   629	asmlinkage long compat_sys_sendfile64(int out_fd, int in_fd,
   630					    compat_loff_t __user *offset, compat_size_t count);
   631	asmlinkage long compat_sys_pselect6_time32(int n, compat_ulong_t __user *inp,
   632					    compat_ulong_t __user *outp,
   633					    compat_ulong_t __user *exp,
   634					    struct old_timespec32 __user *tsp,
   635					    void __user *sig);
   636	asmlinkage long compat_sys_pselect6_time64(int n, compat_ulong_t __user *inp,
   637					    compat_ulong_t __user *outp,
   638					    compat_ulong_t __user *exp,
   639					    struct __kernel_timespec __user *tsp,
   640					    void __user *sig);
   641	asmlinkage long compat_sys_ppoll_time32(struct pollfd __user *ufds,
   642					 unsigned int nfds,
   643					 struct old_timespec32 __user *tsp,
   644					 const compat_sigset_t __user *sigmask,
   645					 compat_size_t sigsetsize);
   646	asmlinkage long compat_sys_ppoll_time64(struct pollfd __user *ufds,
   647					 unsigned int nfds,
   648					 struct __kernel_timespec __user *tsp,
   649					 const compat_sigset_t __user *sigmask,
   650					 compat_size_t sigsetsize);
   651	asmlinkage long compat_sys_signalfd4(int ufd,
   652					     const compat_sigset_t __user *sigmask,
   653					     compat_size_t sigsetsize, int flags);
   654	asmlinkage long compat_sys_newfstatat(unsigned int dfd,
   655					      const char __user *filename,
   656					      struct compat_stat __user *statbuf,
   657					      int flag);
   658	asmlinkage long compat_sys_newfstat(unsigned int fd,
   659					    struct compat_stat __user *statbuf);
   660	/* No generic prototype for sync_file_range and sync_file_range2 */
   661	asmlinkage long compat_sys_waitid(int, compat_pid_t,
   662			struct compat_siginfo __user *, int,
   663			struct compat_rusage __user *);
   664	asmlinkage long
 > 665	compat_sys_set_robust_list(struct robust_list_head32 __user *head,
   666				   compat_size_t len);
   667	asmlinkage long
   668	compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
   669				   compat_size_t __user *len_ptr);
   670	asmlinkage long compat_sys_getitimer(int which,
   671					     struct old_itimerval32 __user *it);
   672	asmlinkage long compat_sys_setitimer(int which,
   673					     struct old_itimerval32 __user *in,
   674					     struct old_itimerval32 __user *out);
   675	asmlinkage long compat_sys_kexec_load(compat_ulong_t entry,
   676					      compat_ulong_t nr_segments,
   677					      struct compat_kexec_segment __user *,
   678					      compat_ulong_t flags);
   679	asmlinkage long compat_sys_timer_create(clockid_t which_clock,
   680				struct compat_sigevent __user *timer_event_spec,
   681				timer_t __user *created_timer_id);
   682	asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
   683					  compat_long_t addr, compat_long_t data);
   684	asmlinkage long compat_sys_sched_setaffinity(compat_pid_t pid,
   685					     unsigned int len,
   686					     compat_ulong_t __user *user_mask_ptr);
   687	asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid,
   688					     unsigned int len,
   689					     compat_ulong_t __user *user_mask_ptr);
   690	asmlinkage long compat_sys_sigaltstack(const compat_stack_t __user *uss_ptr,
   691					       compat_stack_t __user *uoss_ptr);
   692	asmlinkage long compat_sys_rt_sigsuspend(compat_sigset_t __user *unewset,
   693						 compat_size_t sigsetsize);
   694	#ifndef CONFIG_ODD_RT_SIGACTION
   695	asmlinkage long compat_sys_rt_sigaction(int,
   696					 const struct compat_sigaction __user *,
   697					 struct compat_sigaction __user *,
   698					 compat_size_t);
   699	#endif
   700	asmlinkage long compat_sys_rt_sigprocmask(int how, compat_sigset_t __user *set,
   701						  compat_sigset_t __user *oset,
   702						  compat_size_t sigsetsize);
   703	asmlinkage long compat_sys_rt_sigpending(compat_sigset_t __user *uset,
   704						 compat_size_t sigsetsize);
   705	asmlinkage long compat_sys_rt_sigtimedwait_time32(compat_sigset_t __user *uthese,
   706			struct compat_siginfo __user *uinfo,
   707			struct old_timespec32 __user *uts, compat_size_t sigsetsize);
   708	asmlinkage long compat_sys_rt_sigtimedwait_time64(compat_sigset_t __user *uthese,
   709			struct compat_siginfo __user *uinfo,
   710			struct __kernel_timespec __user *uts, compat_size_t sigsetsize);
   711	asmlinkage long compat_sys_rt_sigqueueinfo(compat_pid_t pid, int sig,
   712					struct compat_siginfo __user *uinfo);
   713	/* No generic prototype for rt_sigreturn */
   714	asmlinkage long compat_sys_times(struct compat_tms __user *tbuf);
   715	asmlinkage long compat_sys_getrlimit(unsigned int resource,
   716					     struct compat_rlimit __user *rlim);
   717	asmlinkage long compat_sys_setrlimit(unsigned int resource,
   718					     struct compat_rlimit __user *rlim);
   719	asmlinkage long compat_sys_getrusage(int who, struct compat_rusage __user *ru);
   720	asmlinkage long compat_sys_gettimeofday(struct old_timeval32 __user *tv,
   721			struct timezone __user *tz);
   722	asmlinkage long compat_sys_settimeofday(struct old_timeval32 __user *tv,
   723			struct timezone __user *tz);
   724	asmlinkage long compat_sys_sysinfo(struct compat_sysinfo __user *info);
   725	asmlinkage long compat_sys_mq_open(const char __user *u_name,
   726				int oflag, compat_mode_t mode,
   727				struct compat_mq_attr __user *u_attr);
   728	asmlinkage long compat_sys_mq_notify(mqd_t mqdes,
   729				const struct compat_sigevent __user *u_notification);
   730	asmlinkage long compat_sys_mq_getsetattr(mqd_t mqdes,
   731				const struct compat_mq_attr __user *u_mqstat,
   732				struct compat_mq_attr __user *u_omqstat);
   733	asmlinkage long compat_sys_msgctl(int first, int second, void __user *uptr);
   734	asmlinkage long compat_sys_msgrcv(int msqid, compat_uptr_t msgp,
   735			compat_ssize_t msgsz, compat_long_t msgtyp, int msgflg);
   736	asmlinkage long compat_sys_msgsnd(int msqid, compat_uptr_t msgp,
   737			compat_ssize_t msgsz, int msgflg);
   738	asmlinkage long compat_sys_semctl(int semid, int semnum, int cmd, int arg);
   739	asmlinkage long compat_sys_shmctl(int first, int second, void __user *uptr);
   740	asmlinkage long compat_sys_shmat(int shmid, compat_uptr_t shmaddr, int shmflg);
   741	asmlinkage long compat_sys_recvfrom(int fd, void __user *buf, compat_size_t len,
   742				    unsigned flags, struct sockaddr __user *addr,
   743				    int __user *addrlen);
   744	asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg,
   745					   unsigned flags);
   746	asmlinkage long compat_sys_recvmsg(int fd, struct compat_msghdr __user *msg,
   747					   unsigned int flags);
   748	/* No generic prototype for readahead */
   749	asmlinkage long compat_sys_keyctl(u32 option,
   750				      u32 arg2, u32 arg3, u32 arg4, u32 arg5);
   751	asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr_t __user *argv,
   752			     const compat_uptr_t __user *envp);
   753	/* No generic prototype for fadvise64_64 */
   754	/* CONFIG_MMU only */
   755	asmlinkage long compat_sys_rt_tgsigqueueinfo(compat_pid_t tgid,
   756						compat_pid_t pid, int sig,
   757						struct compat_siginfo __user *uinfo);
   758	asmlinkage long compat_sys_recvmmsg_time64(int fd, struct compat_mmsghdr __user *mmsg,
   759					    unsigned vlen, unsigned int flags,
   760					    struct __kernel_timespec __user *timeout);
   761	asmlinkage long compat_sys_recvmmsg_time32(int fd, struct compat_mmsghdr __user *mmsg,
   762					    unsigned vlen, unsigned int flags,
   763					    struct old_timespec32 __user *timeout);
   764	asmlinkage long compat_sys_wait4(compat_pid_t pid,
   765					 compat_uint_t __user *stat_addr, int options,
   766					 struct compat_rusage __user *ru);
   767	asmlinkage long compat_sys_fanotify_mark(int, unsigned int, __u32, __u32,
   768						    int, const char __user *);
   769	asmlinkage long compat_sys_open_by_handle_at(int mountdirfd,
   770						     struct file_handle __user *handle,
   771						     int flags);
   772	asmlinkage long compat_sys_sendmmsg(int fd, struct compat_mmsghdr __user *mmsg,
   773					    unsigned vlen, unsigned int flags);
   774	asmlinkage long compat_sys_execveat(int dfd, const char __user *filename,
   775			     const compat_uptr_t __user *argv,
   776			     const compat_uptr_t __user *envp, int flags);
   777	asmlinkage ssize_t compat_sys_preadv2(compat_ulong_t fd,
   778			const struct iovec __user *vec,
   779			compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
   780	asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
   781			const struct iovec __user *vec,
   782			compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
   783	#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
   784	asmlinkage long  compat_sys_preadv64v2(unsigned long fd,
   785			const struct iovec __user *vec,
   786			unsigned long vlen, loff_t pos, rwf_t flags);
   787	#endif
   788	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
  2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
  2024-11-02  5:13   ` kernel test robot
@ 2024-11-02  6:05   ` kernel test robot
  1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02  6:05 UTC (permalink / raw)
  To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor, André Almeida

Hi André,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/sched/core linus/master v6.12-rc5 next-20241101]
[cannot apply to tip/x86/asm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20241101162147.284993-4-andrealmeid%40igalia.com
patch subject: [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall
config: csky-allnoconfig (https://download.01.org/0day-ci/archive/20241102/202411021323.fazJ8GOs-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021323.fazJ8GOs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021323.fazJ8GOs-lkp@intel.com/

All errors (new ones prefixed by >>):

         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:455:1: note: in expansion of macro '__SYSCALL'
     455 | __SYSCALL(454, sys_futex_wake)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
     456 | __SYSCALL(455, sys_futex_wait)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[455]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:456:1: note: in expansion of macro '__SYSCALL'
     456 | __SYSCALL(455, sys_futex_wait)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
     457 | __SYSCALL(456, sys_futex_requeue)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[456]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:457:1: note: in expansion of macro '__SYSCALL'
     457 | __SYSCALL(456, sys_futex_requeue)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
     458 | __SYSCALL(457, sys_statmount)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[457]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:458:1: note: in expansion of macro '__SYSCALL'
     458 | __SYSCALL(457, sys_statmount)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
     459 | __SYSCALL(458, sys_listmount)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[458]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:459:1: note: in expansion of macro '__SYSCALL'
     459 | __SYSCALL(458, sys_listmount)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
     460 | __SYSCALL(459, sys_lsm_get_self_attr)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[459]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:460:1: note: in expansion of macro '__SYSCALL'
     460 | __SYSCALL(459, sys_lsm_get_self_attr)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
     461 | __SYSCALL(460, sys_lsm_set_self_attr)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[460]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:461:1: note: in expansion of macro '__SYSCALL'
     461 | __SYSCALL(460, sys_lsm_set_self_attr)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
     462 | __SYSCALL(461, sys_lsm_list_modules)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[461]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:462:1: note: in expansion of macro '__SYSCALL'
     462 | __SYSCALL(461, sys_lsm_list_modules)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: warning: initialized field overwritten [-Woverride-init]
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
     463 | __SYSCALL(462, sys_mseal)
         | ^~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:35: note: (near initialization for 'sys_call_table[462]')
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                   ^
   ./arch/csky/include/generated/asm/syscall_table_32.h:463:1: note: in expansion of macro '__SYSCALL'
     463 | __SYSCALL(462, sys_mseal)
         | ^~~~~~~~~
>> ./arch/csky/include/generated/asm/syscall_table_32.h:464:16: error: 'sys_set_robust_list2' undeclared here (not in a function); did you mean 'sys_set_robust_list'?
     464 | __SYSCALL(463, sys_set_robust_list2)
         |                ^~~~~~~~~~~~~~~~~~~~
   arch/csky/kernel/syscall_table.c:8:36: note: in definition of macro '__SYSCALL'
       8 | #define __SYSCALL(nr, call)[nr] = (call),
         |                                    ^~~~

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
  2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
  2024-11-02  5:44   ` kernel test robot
@ 2024-11-02 14:57   ` kernel test robot
  1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 14:57 UTC (permalink / raw)
  To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor, André Almeida

Hi André,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/locking/core]
[also build test WARNING on tip/sched/core linus/master tip/x86/asm v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20241101162147.284993-2-andrealmeid%40igalia.com
patch subject: [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list
config: x86_64-randconfig-123-20241102 (https://download.01.org/0day-ci/archive/20241102/202411022242.XCJECOCz-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411022242.XCJECOCz-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411022242.XCJECOCz-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> kernel/futex/core.c:914:59: sparse: sparse: cast removes address space '__user' of expression
>> kernel/futex/core.c:914:59: sparse: sparse: incorrect type in argument 3 (different address spaces) @@     expected unsigned int [noderef] [usertype] __user *head @@     got unsigned int [usertype] * @@
   kernel/futex/core.c:914:59: sparse:     expected unsigned int [noderef] [usertype] __user *head
   kernel/futex/core.c:914:59: sparse:     got unsigned int [usertype] *

vim +/__user +914 kernel/futex/core.c

   893	
   894	/*
   895	 * Walk curr->robust_list (very carefully, it's a userspace list!)
   896	 * and mark any locks found there dead, and notify any waiters.
   897	 *
   898	 * We silently return on any sign of list-walking problem.
   899	 */
   900	static void exit_robust_list32(struct task_struct *curr)
   901	{
   902		struct robust_list_head32 __user *head = curr->compat_robust_list;
   903		struct robust_list __user *entry, *next_entry, *pending;
   904		unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
   905		unsigned int next_pi;
   906		u32 uentry, next_uentry, upending;
   907		s32 futex_offset;
   908		int rc;
   909	
   910		/*
   911		 * Fetch the list head (which was registered earlier, via
   912		 * sys_set_robust_list()):
   913		 */
 > 914		if (fetch_robust_entry32((u32 *)&uentry, &entry, (u32 *)&head->list.next, &pi))
   915			return;
   916		/*
   917		 * Fetch the relative futex offset:
   918		 */
   919		if (get_user(futex_offset, &head->futex_offset))
   920			return;
   921		/*
   922		 * Fetch any possibly pending lock-add first, and handle it
   923		 * if it exists:
   924		 */
   925		if (fetch_robust_entry32(&upending, &pending,
   926				       &head->list_op_pending, &pip))
   927			return;
   928	
   929		next_entry = NULL;	/* avoid warning with gcc */
   930		while (entry != (struct robust_list __user *) &head->list) {
   931			/*
   932			 * Fetch the next entry in the list before calling
   933			 * handle_futex_death:
   934			 */
   935			rc = fetch_robust_entry32(&next_uentry, &next_entry,
   936				(u32 __user *)&entry->next, &next_pi);
   937			/*
   938			 * A pending lock might already be on the list, so
   939			 * dont process it twice:
   940			 */
   941			if (entry != pending) {
   942				void __user *uaddr = futex_uaddr(entry, futex_offset);
   943	
   944				if (handle_futex_death(uaddr, curr, pi,
   945						       HANDLE_DEATH_LIST))
   946					return;
   947			}
   948			if (rc)
   949				return;
   950			uentry = next_uentry;
   951			entry = next_entry;
   952			pi = next_pi;
   953			/*
   954			 * Avoid excessively long or circular lists:
   955			 */
   956			if (!--limit)
   957				break;
   958	
   959			cond_resched();
   960		}
   961		if (pending) {
   962			void __user *uaddr = futex_uaddr(pending, futex_offset);
   963	
   964			handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
   965		}
   966	}
   967	
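
The cast at line 914 drops the __user annotation; a likely fix (just a
sketch, assuming the surrounding types stay as above) is to keep it:

	if (fetch_robust_entry32(&uentry, &entry,
				 (u32 __user *)&head->list.next, &pi))
		return;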

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/3] futex: Create set_robust_list2
  2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
@ 2024-11-02 16:40   ` kernel test robot
  2024-11-04 11:22   ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-11-02 16:40 UTC (permalink / raw)
  To: André Almeida, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Davidlohr Bueso, Arnd Bergmann, sonicadvance1
  Cc: oe-kbuild-all, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor, André Almeida

Hi André,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/locking/core]
[also build test WARNING on tip/sched/core linus/master v6.12-rc5 next-20241101]
[cannot apply to tip/x86/asm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_exit_robust_list/20241102-002419
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20241101162147.284993-3-andrealmeid%40igalia.com
patch subject: [PATCH v2 2/3] futex: Create set_robust_list2
config: i386-randconfig-062-20241102 (https://download.01.org/0day-ci/archive/20241103/202411030038.W5DgvCYP-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241103/202411030038.W5DgvCYP-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411030038.W5DgvCYP-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/futex/core.c:915:59: sparse: sparse: cast removes address space '__user' of expression
   kernel/futex/core.c:915:59: sparse: sparse: incorrect type in argument 3 (different address spaces) @@     expected unsigned int [noderef] [usertype] __user *head @@     got unsigned int [usertype] * @@
   kernel/futex/core.c:915:59: sparse:     expected unsigned int [noderef] [usertype] __user *head
   kernel/futex/core.c:915:59: sparse:     got unsigned int [usertype] *
   kernel/futex/core.c:1108:42: sparse: sparse: cast removes address space '__user' of expression
>> kernel/futex/core.c:1108:42: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct robust_list_head32 [noderef] __user *head @@     got struct robust_list_head32 * @@
   kernel/futex/core.c:1108:42: sparse:     expected struct robust_list_head32 [noderef] __user *head
   kernel/futex/core.c:1108:42: sparse:     got struct robust_list_head32 *
   kernel/futex/core.c: note: in included file (through include/linux/smp.h, include/linux/alloc_tag.h, include/linux/percpu.h, ...):
   include/linux/list.h:83:21: sparse: sparse: self-comparison always evaluates to true

vim +1108 kernel/futex/core.c

  1095	
  1096	static void futex_cleanup(struct task_struct *tsk)
  1097	{
  1098		struct robust_list2_entry *curr, *n;
  1099		struct list_head *list2 = &tsk->robust_list2;
  1100	
  1101	#ifdef CONFIG_64BIT
  1102		if (unlikely(tsk->robust_list)) {
  1103			exit_robust_list64(tsk, tsk->robust_list);
  1104			tsk->robust_list = NULL;
  1105		}
  1106	#else
  1107		if (unlikely(tsk->robust_list)) {
> 1108			exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
  1109			tsk->robust_list = NULL;
  1110		}
  1111	#endif
  1112	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] futex: Create set_robust_list2
  2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
                   ` (2 preceding siblings ...)
  2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
@ 2024-11-02 21:58 ` Florian Weimer
  2024-11-04 11:32   ` Peter Zijlstra
  2024-11-04 21:49   ` André Almeida
  3 siblings, 2 replies; 18+ messages in thread
From: Florian Weimer @ 2024-11-02 21:58 UTC (permalink / raw)
  To: André Almeida
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
	kernel-dev, linux-api, Nathan Chancellor

* André Almeida:

> 1) x86 apps can have 32bit pointers robust lists. For a x86-64 kernel
>    this is not a problem, because of the compat entry point. But there's
>    no such compat entry point for AArch64, so the kernel would do the
>    pointer arithmetic wrongly. Is also unviable to userspace to keep
>    track every addition/removal to the robust list and keep a 64bit
>    version of it somewhere else to feed the kernel. Thus, the new
>    interface has an option of telling the kernel if the list is filled
>    with 32bit or 64bit pointers.

The size is typically different for 32-bit and 64-bit mode (12 vs 24
bytes).  Why isn't this enough to disambiguate?
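
(For reference, a simplified sketch of the layout in question, with the
__user annotations omitted; three word-sized members, so 12 bytes for
ILP32 and 24 bytes for LP64:)

	struct robust_list {
		struct robust_list *next;		/* 4 or 8 bytes */
	};

	struct robust_list_head {
		struct robust_list list;		/* 4 or 8 bytes */
		long futex_offset;			/* 4 or 8 bytes */
		struct robust_list *list_op_pending;	/* 4 or 8 bytes */
	};						/* 12 or 24 bytes total */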

> 2) Apps can set just one robust list (in theory, x86-64 can set two if
>    they also use the compat entry point). That means that when a x86 app
>    asks FEX-Emu to call set_robust_list(), FEX have two options: to
>    overwrite their own robust list pointer and make the app robust, or
>    to ignore the app robust list and keep the emulator robust. The new
>    interface allows for multiple robust lists per application, solving
>    this.

Can't you avoid mixing emulated and general userspace code on the same
thread?  On emulator threads, you have full control over the TCB.

QEMU hints towards further problems (in linux-user/syscall.c):

    case TARGET_NR_set_robust_list:
    case TARGET_NR_get_robust_list:
        /* The ABI for supporting robust futexes has userspace pass
         * the kernel a pointer to a linked list which is updated by
         * userspace after the syscall; the list is walked by the kernel
         * when the thread exits. Since the linked list in QEMU guest
         * memory isn't a valid linked list for the host and we have
         * no way to reliably intercept the thread-death event, we can't
         * support these. Silently return ENOSYS so that guest userspace
         * falls back to a non-robust futex implementation (which should
         * be OK except in the corner case of the guest crashing while
         * holding a mutex that is shared with another process via
         * shared memory).
         */
        return -TARGET_ENOSYS;

The glibc implementation is not really prepared for this
(__ASSUME_SET_ROBUST_LIST is defined for most architectures).  But a
couple of years ago, we had a bunch of kernels that regressed robust
list support on POWER, and I think we found out only when we tested an
unrelated glibc update and saw unexpected glibc test suite failures …

Thanks,
Florian


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/3] futex: Create set_robust_list2
  2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
  2024-11-02 16:40   ` kernel test robot
@ 2024-11-04 11:22   ` Peter Zijlstra
  2024-11-04 21:55     ` André Almeida
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-04 11:22 UTC (permalink / raw)
  To: André Almeida
  Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	Arnd Bergmann, sonicadvance1, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor

On Fri, Nov 01, 2024 at 01:21:46PM -0300, André Almeida wrote:
> @@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
>  
>  static void futex_cleanup(struct task_struct *tsk)
>  {
> +	struct robust_list2_entry *curr, *n;
> +	struct list_head *list2 = &tsk->robust_list2;
> +
>  #ifdef CONFIG_64BIT
>  	if (unlikely(tsk->robust_list)) {
> -		exit_robust_list64(tsk);
> +		exit_robust_list64(tsk, tsk->robust_list);
>  		tsk->robust_list = NULL;
>  	}
>  #else
>  	if (unlikely(tsk->robust_list)) {
> -		exit_robust_list32(tsk);
> +		exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
>  		tsk->robust_list = NULL;
>  	}
>  #endif
>  
>  #ifdef CONFIG_COMPAT
>  	if (unlikely(tsk->compat_robust_list)) {
> -		exit_robust_list32(tsk);
> +		exit_robust_list32(tsk, tsk->compat_robust_list);
>  		tsk->compat_robust_list = NULL;
>  	}
>  #endif
> +	/*
> +	 * Walk through the linked list, parsing robust lists and freeing the
> +	 * allocated lists
> +	 */
> +	if (unlikely(!list_empty(list2))) {
> +		list_for_each_entry_safe(curr, n, list2, list) {
> +			if (curr->head != NULL) {
> +				if (curr->list_type == ROBUST_LIST_64BIT)
> +					exit_robust_list64(tsk, curr->head);
> +				else if (curr->list_type == ROBUST_LIST_32BIT)
> +					exit_robust_list32(tsk, curr->head);
> +				curr->head = NULL;
> +			}
> +			list_del_init(&curr->list);
> +			kfree(curr);
> +		}
> +	}
>  
>  	if (unlikely(!list_empty(&tsk->pi_state_list)))
>  		exit_pi_state_list(tsk);

I'm still digesting this, but the above seems particularly silly.

Should not the legacy lists also be on the list of lists? I mean, it
makes no sense to have two completely separate means of tracking lists.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] futex: Create set_robust_list2
  2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
@ 2024-11-04 11:32   ` Peter Zijlstra
  2024-11-04 11:56     ` Peter Zijlstra
  2024-11-04 12:36     ` Florian Weimer
  2024-11-04 21:49   ` André Almeida
  1 sibling, 2 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-04 11:32 UTC (permalink / raw)
  To: Florian Weimer
  Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
	kernel-dev, linux-api, Nathan Chancellor

On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:

> QEMU hints towards further problems (in linux-user/syscall.c):
> 
>     case TARGET_NR_set_robust_list:
>     case TARGET_NR_get_robust_list:
>         /* The ABI for supporting robust futexes has userspace pass
>          * the kernel a pointer to a linked list which is updated by
>          * userspace after the syscall; the list is walked by the kernel
>          * when the thread exits. Since the linked list in QEMU guest
>          * memory isn't a valid linked list for the host and we have
>          * no way to reliably intercept the thread-death event, we can't
>          * support these. Silently return ENOSYS so that guest userspace
>          * falls back to a non-robust futex implementation (which should
>          * be OK except in the corner case of the guest crashing while
>          * holding a mutex that is shared with another process via
>          * shared memory).
>          */
>         return -TARGET_ENOSYS;

I don't think we can sanely fix that. Can't QEMU track the robust thing
itself and use waitpid() to discover the thread is gone and fudge things
from there?



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] futex: Create set_robust_list2
  2024-11-04 11:32   ` Peter Zijlstra
@ 2024-11-04 11:56     ` Peter Zijlstra
  2024-11-04 12:36     ` Florian Weimer
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-04 11:56 UTC (permalink / raw)
  To: Florian Weimer
  Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
	kernel-dev, linux-api, Nathan Chancellor

On Mon, Nov 04, 2024 at 12:32:40PM +0100, Peter Zijlstra wrote:
> On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
> 
> > QEMU hints towards further problems (in linux-user/syscall.c):
> > 
> >     case TARGET_NR_set_robust_list:
> >     case TARGET_NR_get_robust_list:
> >         /* The ABI for supporting robust futexes has userspace pass
> >          * the kernel a pointer to a linked list which is updated by
> >          * userspace after the syscall; the list is walked by the kernel
> >          * when the thread exits. Since the linked list in QEMU guest
> >          * memory isn't a valid linked list for the host and we have
> >          * no way to reliably intercept the thread-death event, we can't
> >          * support these. Silently return ENOSYS so that guest userspace
> >          * falls back to a non-robust futex implementation (which should
> >          * be OK except in the corner case of the guest crashing while
> >          * holding a mutex that is shared with another process via
> >          * shared memory).
> >          */
> >         return -TARGET_ENOSYS;
> 
> I don't think we can sanely fix that. Can't QEMU track the robust thing
> itself and use waitpid() to discover the thread is gone and fudge things
> from there?

Hmm, what if we mandate 'natural' alignment of the structure such that
it is always inside a single page? Then QEMU can do the translation here
and hand the kernel the 'real' address.
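
One way to express that constraint, purely as a sketch (PAGE_SIZE here
stands for whatever page size the emulated process sees):

	/*
	 * "Naturally aligned" in the sense of never straddling a page:
	 * if the head is confined to one page, an emulator can translate
	 * the head address once and hand the result to the kernel.
	 */
	static bool head_in_single_page(const void *head, size_t size)
	{
		unsigned long off = (unsigned long)head & (PAGE_SIZE - 1);

		return off + size <= PAGE_SIZE;
	}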

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] futex: Create set_robust_list2
  2024-11-04 11:32   ` Peter Zijlstra
  2024-11-04 11:56     ` Peter Zijlstra
@ 2024-11-04 12:36     ` Florian Weimer
  2024-11-05 12:18       ` Peter Zijlstra
  1 sibling, 1 reply; 18+ messages in thread
From: Florian Weimer @ 2024-11-04 12:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
	kernel-dev, linux-api, Nathan Chancellor

* Peter Zijlstra:

> On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
>
>> QEMU hints towards further problems (in linux-user/syscall.c):
>> 
>>     case TARGET_NR_set_robust_list:
>>     case TARGET_NR_get_robust_list:
>>         /* The ABI for supporting robust futexes has userspace pass
>>          * the kernel a pointer to a linked list which is updated by
>>          * userspace after the syscall; the list is walked by the kernel
>>          * when the thread exits. Since the linked list in QEMU guest
>>          * memory isn't a valid linked list for the host and we have
>>          * no way to reliably intercept the thread-death event, we can't
>>          * support these. Silently return ENOSYS so that guest userspace
>>          * falls back to a non-robust futex implementation (which should
>>          * be OK except in the corner case of the guest crashing while
>>          * holding a mutex that is shared with another process via
>>          * shared memory).
>>          */
>>         return -TARGET_ENOSYS;
>
> I don't think we can sanely fix that. Can't QEMU track the robust thing
> itself and use waitpid() to discover the thread is gone and fudge things
> from there?

There are race conditions with munmap, I think, and they probably get a
lot worse if QEMU does that.

See Rich Felker's bug report:

| The corruption is performed by the kernel when it walks the robust
| list. The basic situation is the same as in PR #13690, except that
| here there's actually a potential write to the memory rather than just
| a read.
| 
| The sequence of events leading to corruption goes like this:
| 
| 1. Thread A unlocks the process-shared, robust mutex and is preempted
|    after the mutex is removed from the robust list and atomically
|    unlocked, but before it's removed from the list_op_pending field of
|    the robust list header.
| 
| 2. Thread B locks the mutex, and, knowing by program logic that it's
|    the last user of the mutex, unlocks and unmaps it, allocates/maps
|    something else that gets assigned the same address as the shared mutex
|    mapping, and then exits.
| 
| 3. The kernel destroys the process, which involves walking each
|   thread's robust list and processing each thread's list_op_pending
|   field of the robust list header. Since thread A has a list_op_pending
|   pointing at the address previously occupied by the mutex, the kernel
|   obliviously "unlocks the mutex" by writing a 0 to the address and
|   futex-waking it. However, the kernel has instead overwritten part of
|   whatever mapping thread A created. If this is private memory it
|   (probably) doesn't matter since the process is ending anyway (but are
|   there race conditions where this can be seen?). If this is shared
|   memory or a shared file mapping, however, the kernel corrupts it.
| 
| I suspect the race is difficult to hit since thread A has to get
| preempted at exactly the wrong time AND thread B has to do a fair
| amount of work without thread A getting scheduled again. So I'm not
| sure how much luck we'd have getting a test case.


<https://sourceware.org/bugzilla/show_bug.cgi?id=14485#c3>

We also have a silent unlocking failure because userspace does not know
about ROBUST_LIST_LIMIT:

  Bug 19089 - Robust mutexes do not take ROBUST_LIST_LIMIT into account
  <https://sourceware.org/bugzilla/show_bug.cgi?id=19089>

(I think we may have discussed this one before, and you may have
suggested to just hard-code 2048 in userspace because the constant is
not expected to change.)

So the in-mutex linked list has quite a few problems even outside of
emulation. 8-(

Thanks,
Florian


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] futex: Create set_robust_list2
  2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
  2024-11-04 11:32   ` Peter Zijlstra
@ 2024-11-04 21:49   ` André Almeida
  1 sibling, 0 replies; 18+ messages in thread
From: André Almeida @ 2024-11-04 21:49 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
	kernel-dev, linux-api, Nathan Chancellor

Hi Florian,

On 02/11/2024 18:58, Florian Weimer wrote:
> * André Almeida:
> 
>> 1) x86 apps can have 32bit pointers robust lists. For a x86-64 kernel
>>     this is not a problem, because of the compat entry point. But there's
>>     no such compat entry point for AArch64, so the kernel would do the
>>     pointer arithmetic wrongly. Is also unviable to userspace to keep
>>     track every addition/removal to the robust list and keep a 64bit
>>     version of it somewhere else to feed the kernel. Thus, the new
>>     interface has an option of telling the kernel if the list is filled
>>     with 32bit or 64bit pointers.
> 
> The size is typically different for 32-bit and 64-bit mode (12 vs 24
> bytes).  Why isn't this enough to disambiguate?
> 

Right, so the idea would be to use `size_t len` from the syscall 
arguments for that?

>> 2) Apps can set just one robust list (in theory, x86-64 can set two if
>>     they also use the compat entry point). That means that when a x86 app
>>     asks FEX-Emu to call set_robust_list(), FEX have two options: to
>>     overwrite their own robust list pointer and make the app robust, or
>>     to ignore the app robust list and keep the emulator robust. The new
>>     interface allows for multiple robust lists per application, solving
>>     this.
> 
> Can't you avoid mixing emulated and general userspace code on the same
> thread?  On emulator threads, you have full control over the TCB.
> 

FEX can't avoid that because it doesn't do full system emulation; it 
only translates instructions. FEX doesn't have full control over the 
TCB, which is still entirely managed by glibc (or whatever other libc / 
dynamic linker is in use).


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/3] futex: Create set_robust_list2
  2024-11-04 11:22   ` Peter Zijlstra
@ 2024-11-04 21:55     ` André Almeida
  2024-11-05 12:10       ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: André Almeida @ 2024-11-04 21:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	Arnd Bergmann, sonicadvance1, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor

Hi Peter,

On 04/11/2024 08:22, Peter Zijlstra wrote:
> On Fri, Nov 01, 2024 at 01:21:46PM -0300, André Almeida wrote:
>> @@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
>>   
>>   static void futex_cleanup(struct task_struct *tsk)
>>   {
>> +	struct robust_list2_entry *curr, *n;
>> +	struct list_head *list2 = &tsk->robust_list2;
>> +
>>   #ifdef CONFIG_64BIT
>>   	if (unlikely(tsk->robust_list)) {
>> -		exit_robust_list64(tsk);
>> +		exit_robust_list64(tsk, tsk->robust_list);
>>   		tsk->robust_list = NULL;
>>   	}
>>   #else
>>   	if (unlikely(tsk->robust_list)) {
>> -		exit_robust_list32(tsk);
>> +		exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
>>   		tsk->robust_list = NULL;
>>   	}
>>   #endif
>>   
>>   #ifdef CONFIG_COMPAT
>>   	if (unlikely(tsk->compat_robust_list)) {
>> -		exit_robust_list32(tsk);
>> +		exit_robust_list32(tsk, tsk->compat_robust_list);
>>   		tsk->compat_robust_list = NULL;
>>   	}
>>   #endif
>> +	/*
>> +	 * Walk through the linked list, parsing robust lists and freeing the
>> +	 * allocated lists
>> +	 */
>> +	if (unlikely(!list_empty(list2))) {
>> +		list_for_each_entry_safe(curr, n, list2, list) {
>> +			if (curr->head != NULL) {
>> +				if (curr->list_type == ROBUST_LIST_64BIT)
>> +					exit_robust_list64(tsk, curr->head);
>> +				else if (curr->list_type == ROBUST_LIST_32BIT)
>> +					exit_robust_list32(tsk, curr->head);
>> +				curr->head = NULL;
>> +			}
>> +			list_del_init(&curr->list);
>> +			kfree(curr);
>> +		}
>> +	}
>>   
>>   	if (unlikely(!list_empty(&tsk->pi_state_list)))
>>   		exit_pi_state_list(tsk);
> 
> I'm still digesting this, but the above seems particularly silly.
> 
> Should not the legacy lists also be on the list of lists? I mean, it
> makes no sense to have two completely separate means of tracking lists.
> 

You are asking whether, whenever someone calls set_robust_list() or 
compat_set_robust_list(), the list should be inserted into 
&current->robust_list2 instead of being stored in tsk->robust_list or 
tsk->compat_robust_list?

I was thinking of doing that, but my current implementation does a 
kmalloc() for every insertion, and I wasn't sure if I could add that 
latency to the old set_robust_list() syscall. Assuming it is usually 
called just once during thread initialization, it probably shouldn't 
cause much harm.
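
Something along these lines, say (only a sketch: re-registering an
existing head and the compat entry point are left out, and the
robust_list2_entry layout is assumed to be the one from this patch):

	SYSCALL_DEFINE2(set_robust_list, struct robust_list_head __user *, head,
			size_t, len)
	{
		struct robust_list2_entry *entry;

		if (unlikely(len != sizeof(*head)))
			return -EINVAL;

		entry = kmalloc(sizeof(*entry), GFP_KERNEL);
		if (!entry)
			return -ENOMEM;

		entry->head = head;
		entry->list_type = IS_ENABLED(CONFIG_64BIT) ?
				   ROBUST_LIST_64BIT : ROBUST_LIST_32BIT;
		list_add_tail(&entry->list, &current->robust_list2);

		return 0;
	}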


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/3] futex: Create set_robust_list2
  2024-11-04 21:55     ` André Almeida
@ 2024-11-05 12:10       ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-05 12:10 UTC (permalink / raw)
  To: André Almeida
  Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	Arnd Bergmann, sonicadvance1, linux-kernel, kernel-dev, linux-api,
	Nathan Chancellor

On Mon, Nov 04, 2024 at 06:55:45PM -0300, André Almeida wrote:
> Hi Peter,
> 
> On 04/11/2024 08:22, Peter Zijlstra wrote:
> > On Fri, Nov 01, 2024 at 01:21:46PM -0300, André Almeida wrote:
> > > @@ -1046,24 +1095,44 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
> > >   static void futex_cleanup(struct task_struct *tsk)
> > >   {
> > > +	struct robust_list2_entry *curr, *n;
> > > +	struct list_head *list2 = &tsk->robust_list2;
> > > +
> > >   #ifdef CONFIG_64BIT
> > >   	if (unlikely(tsk->robust_list)) {
> > > -		exit_robust_list64(tsk);
> > > +		exit_robust_list64(tsk, tsk->robust_list);
> > >   		tsk->robust_list = NULL;
> > >   	}
> > >   #else
> > >   	if (unlikely(tsk->robust_list)) {
> > > -		exit_robust_list32(tsk);
> > > +		exit_robust_list32(tsk, (struct robust_list_head32 *) tsk->robust_list);
> > >   		tsk->robust_list = NULL;
> > >   	}
> > >   #endif
> > >   #ifdef CONFIG_COMPAT
> > >   	if (unlikely(tsk->compat_robust_list)) {
> > > -		exit_robust_list32(tsk);
> > > +		exit_robust_list32(tsk, tsk->compat_robust_list);
> > >   		tsk->compat_robust_list = NULL;
> > >   	}
> > >   #endif
> > > +	/*
> > > +	 * Walk through the linked list, parsing robust lists and freeing the
> > > +	 * allocated lists
> > > +	 */
> > > +	if (unlikely(!list_empty(list2))) {
> > > +		list_for_each_entry_safe(curr, n, list2, list) {
> > > +			if (curr->head != NULL) {
> > > +				if (curr->list_type == ROBUST_LIST_64BIT)
> > > +					exit_robust_list64(tsk, curr->head);
> > > +				else if (curr->list_type == ROBUST_LIST_32BIT)
> > > +					exit_robust_list32(tsk, curr->head);
> > > +				curr->head = NULL;
> > > +			}
> > > +			list_del_init(&curr->list);
> > > +			kfree(curr);
> > > +		}
> > > +	}
> > >   	if (unlikely(!list_empty(&tsk->pi_state_list)))
> > >   		exit_pi_state_list(tsk);
> > 
> > I'm still digesting this, but the above seems particularly silly.
> > 
> > Should not the legacy lists also be on the list of lists? I mean, it
> > makes no sense to have two completely separate means of tracking lists.
> > 
> 
> You are asking whether, whenever someone calls set_robust_list() or
> compat_set_robust_list(), the list should be inserted into
> &current->robust_list2 instead of being stored in tsk->robust_list or
> tsk->compat_robust_list?

Yes, that.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] futex: Create set_robust_list2
  2024-11-04 12:36     ` Florian Weimer
@ 2024-11-05 12:18       ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-11-05 12:18 UTC (permalink / raw)
  To: Florian Weimer
  Cc: André Almeida, Thomas Gleixner, Ingo Molnar, Darren Hart,
	Davidlohr Bueso, Arnd Bergmann, sonicadvance1, linux-kernel,
	kernel-dev, linux-api, Nathan Chancellor, Mathieu Desnoyers

On Mon, Nov 04, 2024 at 01:36:43PM +0100, Florian Weimer wrote:
> * Peter Zijlstra:
> 
> > On Sat, Nov 02, 2024 at 10:58:42PM +0100, Florian Weimer wrote:
> >
> >> QEMU hints towards further problems (in linux-user/syscall.c):
> >> 
> >>     case TARGET_NR_set_robust_list:
> >>     case TARGET_NR_get_robust_list:
> >>         /* The ABI for supporting robust futexes has userspace pass
> >>          * the kernel a pointer to a linked list which is updated by
> >>          * userspace after the syscall; the list is walked by the kernel
> >>          * when the thread exits. Since the linked list in QEMU guest
> >>          * memory isn't a valid linked list for the host and we have
> >>          * no way to reliably intercept the thread-death event, we can't
> >>          * support these. Silently return ENOSYS so that guest userspace
> >>          * falls back to a non-robust futex implementation (which should
> >>          * be OK except in the corner case of the guest crashing while
> >>          * holding a mutex that is shared with another process via
> >>          * shared memory).
> >>          */
> >>         return -TARGET_ENOSYS;
> >
> > I don't think we can sanely fix that. Can't QEMU track the robust thing
> > itself and use waitpid() to discover the thread is gone and fudge things
> > from there?
> 
> There are race conditions with munmap, I think, and they probably get a
> lot of worse if QEMU does that.
> 
> See Rich Felker's bug report:
> 
> | The corruption is performed by the kernel when it walks the robust
> | list. The basic situation is the same as in PR #13690, except that
> | here there's actually a potential write to the memory rather than just
> | a read.
> | 
> | The sequence of events leading to corruption goes like this:
> | 
> | 1. Thread A unlocks the process-shared, robust mutex and is preempted
> |    after the mutex is removed from the robust list and atomically
> |    unlocked, but before it's removed from the list_op_pending field of
> |    the robust list header.
> | 
> | 2. Thread B locks the mutex, and, knowing by program logic that it's
> |    the last user of the mutex, unlocks and unmaps it, allocates/maps
> |    something else that gets assigned the same address as the shared mutex
> |    mapping, and then exits.
> | 
> | 3. The kernel destroys the process, which involves walking each
> |   thread's robust list and processing each thread's list_op_pending
> |   field of the robust list header. Since thread A has a list_op_pending
> |   pointing at the address previously occupied by the mutex, the kernel
> |   obliviously "unlocks the mutex" by writing a 0 to the address and
> |   futex-waking it. However, the kernel has instead overwritten part of
> |   whatever mapping thread A created. If this is private memory it
> |   (probably) doesn't matter since the process is ending anyway (but are
> |   there race conditions where this can be seen?). If this is shared
> |   memory or a shared file mapping, however, the kernel corrupts it.
> | 
> | I suspect the race is difficult to hit since thread A has to get
> | preempted at exactly the wrong time AND thread B has to do a fair
> | amount of work without thread A getting scheduled again. So I'm not
> | sure how much luck we'd have getting a test case.
> 
> 
> <https://sourceware.org/bugzilla/show_bug.cgi?id=14485#c3>

So I've only managed to conjure up two horrible solutions for this:

 - put the robust futex operations under user-space RCU, and mandate a
   matching synchronize_rcu() before any munmap() calls.

 - add a robust-barrier syscall that waits until all list_op_pending are
   either NULL or changed since invocation. And mandate this call before
   munmap().

Neither are particularly pretty I admit, but at least they should work.
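
A hypothetical usage sketch of the second option (the syscall name,
number and variables are invented here; nothing like it exists today):

	/* thread B: last user of a process-shared robust mutex */
	pthread_mutex_unlock(&shared->mutex);

	/*
	 * Wait until every other thread's list_op_pending is NULL or has
	 * moved on, so an exiting thread can no longer write into the
	 * mapping we are about to drop.
	 */
	syscall(__NR_robust_barrier);	/* hypothetical */

	munmap(shared, shared_size);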

But doing this and mandating the alignment thing should at least make
this qemu thing workable, no?

> We also have a silent unlocking failure because userspace does not know
> about ROBUST_LIST_LIMIT:
> 
>   Bug 19089 - Robust mutexes do not take ROBUST_LIST_LIMIT into account
>   <https://sourceware.org/bugzilla/show_bug.cgi?id=19089>
> 
> (I think we may have discussed this one before, and you may have
> suggested to just hard-code 2048 in userspace because the constant is
> not expected to change.)
> 
> So the in-mutex linked list has quite a few problems even outside of
> emulation. 8-(

It's futex, of course it's a pain in the arse :-)

And yeah, no better ideas on that limit for now...

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-11-05 12:18 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-01 16:21 [PATCH v2 0/3] futex: Create set_robust_list2 André Almeida
2024-11-01 16:21 ` [PATCH v2 1/3] futex: Use explicit sizes for compat_exit_robust_list André Almeida
2024-11-02  5:44   ` kernel test robot
2024-11-02 14:57   ` kernel test robot
2024-11-01 16:21 ` [PATCH v2 2/3] futex: Create set_robust_list2 André Almeida
2024-11-02 16:40   ` kernel test robot
2024-11-04 11:22   ` Peter Zijlstra
2024-11-04 21:55     ` André Almeida
2024-11-05 12:10       ` Peter Zijlstra
2024-11-01 16:21 ` [PATCH v2 3/3] futex: Wire up set_robust_list2 syscall André Almeida
2024-11-02  5:13   ` kernel test robot
2024-11-02  6:05   ` kernel test robot
2024-11-02 21:58 ` [PATCH v2 0/3] futex: Create set_robust_list2 Florian Weimer
2024-11-04 11:32   ` Peter Zijlstra
2024-11-04 11:56     ` Peter Zijlstra
2024-11-04 12:36     ` Florian Weimer
2024-11-05 12:18       ` Peter Zijlstra
2024-11-04 21:49   ` André Almeida

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).