From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 16 Mar 2026 18:12:56 +0100
Message-ID: <20260316164951.004423818@kernel.org>
User-Agent: quilt/0.68
From: Thomas Gleixner
To: LKML
Cc: Mathieu Desnoyers, André Almeida, Sebastian Andrzej Siewior,
 Carlos O'Donell, Peter Zijlstra, Florian Weimer, Rich Felker,
 Torvald Riegel, Darren Hart, Ingo Molnar, Davidlohr Bueso,
 Arnd Bergmann, "Liam R . Howlett"
Subject: [patch 1/8] futex: Move futex task related data into a struct
References: <20260316162316.356674433@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

Having all these members in task_struct along with the required
#ifdeffery is annoying, does not allow efficient initialization of the
data with memset() and makes extending it tedious.

Move it into a data structure and fix up all usage sites.

Signed-off-by: Thomas Gleixner
---
 Documentation/locking/robust-futexes.rst |    8 ++--
 include/linux/futex.h                    |   12 ++----
 include/linux/futex_types.h              |   34 +++++++++++++++++++
 include/linux/sched.h                    |   16 ++-------
 kernel/exit.c                            |    4 +-
 kernel/futex/core.c                      |   55 +++++++++++++++----------------
 kernel/futex/pi.c                        |   26 +++++++-------
 kernel/futex/syscalls.c                  |   23 ++++--------
 8 files changed, 97 insertions(+), 81 deletions(-)

--- a/Documentation/locking/robust-futexes.rst
+++ b/Documentation/locking/robust-futexes.rst
@@ -94,7 +94,7 @@ time, the kernel checks this user-space
 locks to be cleaned up?
 
 In the common case, at do_exit() time, there is no list registered, so
-the cost of robust futexes is just a simple current->robust_list != NULL
+the cost of robust futexes is just a current->futex.robust_list != NULL
 comparison. If the thread has registered a list, then normally the list
 is empty. If the thread/process crashed or terminated in some incorrect
 way then the list might be non-empty: in this case the kernel carefully
@@ -178,9 +178,9 @@ The patch adds two new syscalls: one to
        size_t __user *len_ptr);
 
 List registration is very fast: the pointer is simply stored in
-current->robust_list. [Note that in the future, if robust futexes become
-widespread, we could extend sys_clone() to register a robust-list head
-for new threads, without the need of another syscall.]
+current->futex.robust_list. [Note that in the future, if robust futexes
+become widespread, we could extend sys_clone() to register a robust-list
+head for new threads, without the need of another syscall.]
 
 So there is virtually zero overhead for tasks not using robust futexes,
 and even for robust futex users, there is only one extra syscall per
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -64,14 +64,10 @@ enum {
 
 static inline void futex_init_task(struct task_struct *tsk)
 {
-	tsk->robust_list = NULL;
-#ifdef CONFIG_COMPAT
-	tsk->compat_robust_list = NULL;
-#endif
-	INIT_LIST_HEAD(&tsk->pi_state_list);
-	tsk->pi_state_cache = NULL;
-	tsk->futex_state = FUTEX_STATE_OK;
-	mutex_init(&tsk->futex_exit_mutex);
+	memset(&tsk->futex, 0, sizeof(tsk->futex));
+	INIT_LIST_HEAD(&tsk->futex.pi_state_list);
+	tsk->futex.state = FUTEX_STATE_OK;
+	mutex_init(&tsk->futex.exit_mutex);
 }
 
 void futex_exit_recursive(struct task_struct *tsk);
--- /dev/null
+++ b/include/linux/futex_types.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_FUTEX_TYPES_H
+#define _LINUX_FUTEX_TYPES_H
+
+#ifdef CONFIG_FUTEX
+#include
+#include
+
+struct compat_robust_list_head;
+struct futex_pi_state;
+struct robust_list_head;
+
+/**
+ * struct futex_ctrl - Futex related per task data
+ * @robust_list:	User space registered robust list pointer
+ * @compat_robust_list:	User space registered robust list pointer for compat tasks
+ * @exit_mutex:		Mutex for serializing exit
+ * @state:		Futex handling state to handle exit races correctly
+ */
+struct futex_ctrl {
+	struct robust_list_head __user *robust_list;
+#ifdef CONFIG_COMPAT
+	struct compat_robust_list_head __user *compat_robust_list;
+#endif
+	struct list_head pi_state_list;
+	struct futex_pi_state *pi_state_cache;
+	struct mutex exit_mutex;
+	unsigned int state;
+};
+#else
+struct futex_ctrl { };
+#endif /* !CONFIG_FUTEX */
+
+#endif /* _LINUX_FUTEX_TYPES_H */
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -16,6 +16,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -64,7 +65,6 @@ struct bpf_net_context;
 struct capture_control;
 struct cfs_rq;
 struct fs_struct;
-struct futex_pi_state;
 struct io_context;
 struct io_uring_task;
 struct mempolicy;
@@ -76,7 +76,6 @@ struct pid_namespace;
 struct pipe_inode_info;
 struct rcu_node;
 struct reclaim_state;
-struct robust_list_head;
 struct root_domain;
 struct rq;
 struct sched_attr;
@@ -1329,16 +1328,9 @@ struct task_struct {
 	u32 closid;
 	u32 rmid;
 #endif
-#ifdef CONFIG_FUTEX
-	struct robust_list_head __user *robust_list;
-#ifdef CONFIG_COMPAT
-	struct compat_robust_list_head __user *compat_robust_list;
-#endif
-	struct list_head pi_state_list;
-	struct futex_pi_state *pi_state_cache;
-	struct mutex futex_exit_mutex;
-	unsigned int futex_state;
-#endif
+
+	struct futex_ctrl futex;
+
 #ifdef CONFIG_PERF_EVENTS
 	u8 perf_recursion[PERF_NR_CONTEXTS];
 	struct perf_event_context *perf_event_ctxp;
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -989,8 +989,8 @@ void __noreturn do_exit(long code)
 	proc_exit_connector(tsk);
 	mpol_put_task_policy(tsk);
 #ifdef CONFIG_FUTEX
-	if (unlikely(current->pi_state_cache))
-		kfree(current->pi_state_cache);
+	if (unlikely(current->futex.pi_state_cache))
+		kfree(current->futex.pi_state_cache);
 #endif
 	/*
 	 * Make sure we are holding no locks:
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -32,18 +32,19 @@
  * "But they come in a choice of three flavours!"
  */
 #include
-#include
-#include
 #include
-#include
 #include
-#include
 #include
-#include
-#include
-#include
 #include
 #include
+#include
+#include
+#include
+#include
+#include
+#include
 
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
@@ -829,7 +830,7 @@ void wait_for_owner_exiting(int ret, str
 	if (WARN_ON_ONCE(ret == -EBUSY && !exiting))
 		return;
 
-	mutex_lock(&exiting->futex_exit_mutex);
+	mutex_lock(&exiting->futex.exit_mutex);
 	/*
 	 * No point in doing state checking here. If the waiter got here
 	 * while the task was in exec()->exec_futex_release() then it can
@@ -838,7 +839,7 @@ void wait_for_owner_exiting(int ret, str
 	 * already. Highly unlikely and not a problem. Just one more round
 	 * through the futex maze.
 	 */
-	mutex_unlock(&exiting->futex_exit_mutex);
+	mutex_unlock(&exiting->futex.exit_mutex);
 
 	put_task_struct(exiting);
 }
@@ -1048,7 +1049,7 @@ static int handle_futex_death(u32 __user
 	 *
 	 * In both cases the following conditions are met:
 	 *
-	 *	1) task->robust_list->list_op_pending != NULL
+	 *	1) task->futex.robust_list->list_op_pending != NULL
 	 *	   @pending_op == true
 	 *	2) The owner part of user space futex value == 0
 	 *	3) Regular futex: @pi == false
@@ -1153,7 +1154,7 @@ static inline int fetch_robust_entry(str
  */
 static void exit_robust_list(struct task_struct *curr)
 {
-	struct robust_list_head __user *head = curr->robust_list;
+	struct robust_list_head __user *head = curr->futex.robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
 	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
 	unsigned int next_pi;
@@ -1247,7 +1248,7 @@ compat_fetch_robust_entry(compat_uptr_t
  */
 static void compat_exit_robust_list(struct task_struct *curr)
 {
-	struct compat_robust_list_head __user *head = curr->compat_robust_list;
+	struct compat_robust_list_head __user *head = curr->futex.compat_robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
 	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
 	unsigned int next_pi;
@@ -1323,7 +1324,7 @@ static void compat_exit_robust_list(stru
  */
 static void exit_pi_state_list(struct task_struct *curr)
 {
-	struct list_head *next, *head = &curr->pi_state_list;
+	struct list_head *next, *head = &curr->futex.pi_state_list;
 	struct futex_pi_state *pi_state;
 	union futex_key key = FUTEX_KEY_INIT;
@@ -1407,19 +1408,19 @@ static inline void exit_pi_state_list(st
 
 static void futex_cleanup(struct task_struct *tsk)
 {
-	if (unlikely(tsk->robust_list)) {
+	if (unlikely(tsk->futex.robust_list)) {
 		exit_robust_list(tsk);
-		tsk->robust_list = NULL;
+		tsk->futex.robust_list = NULL;
 	}
 
 #ifdef CONFIG_COMPAT
-	if (unlikely(tsk->compat_robust_list)) {
+	if (unlikely(tsk->futex.compat_robust_list)) {
 		compat_exit_robust_list(tsk);
-		tsk->compat_robust_list = NULL;
+		tsk->futex.compat_robust_list = NULL;
 	}
 #endif
 
-	if (unlikely(!list_empty(&tsk->pi_state_list)))
+	if (unlikely(!list_empty(&tsk->futex.pi_state_list)))
 		exit_pi_state_list(tsk);
 }
@@ -1442,10 +1443,10 @@ static void futex_cleanup(struct task_st
  */
 void futex_exit_recursive(struct task_struct *tsk)
 {
-	/* If the state is FUTEX_STATE_EXITING then futex_exit_mutex is held */
-	if (tsk->futex_state == FUTEX_STATE_EXITING)
-		mutex_unlock(&tsk->futex_exit_mutex);
-	tsk->futex_state = FUTEX_STATE_DEAD;
+	/* If the state is FUTEX_STATE_EXITING then futex.exit_mutex is held */
+	if (tsk->futex.state == FUTEX_STATE_EXITING)
+		mutex_unlock(&tsk->futex.exit_mutex);
+	tsk->futex.state = FUTEX_STATE_DEAD;
 }
 
 static void futex_cleanup_begin(struct task_struct *tsk)
@@ -1453,10 +1454,10 @@ static void futex_cleanup_begin(struct t
 	/*
	 * Prevent various race issues against a concurrent incoming waiter
	 * including live locks by forcing the waiter to block on
-	 * tsk->futex_exit_mutex when it observes FUTEX_STATE_EXITING in
+	 * tsk->futex.exit_mutex when it observes FUTEX_STATE_EXITING in
	 * attach_to_pi_owner().
	 */
-	mutex_lock(&tsk->futex_exit_mutex);
+	mutex_lock(&tsk->futex.exit_mutex);
 
	/*
	 * Switch the state to FUTEX_STATE_EXITING under tsk->pi_lock.
@@ -1470,7 +1471,7 @@ static void futex_cleanup_begin(struct t
	 * be observed in exit_pi_state_list().
	 */
	raw_spin_lock_irq(&tsk->pi_lock);
-	tsk->futex_state = FUTEX_STATE_EXITING;
+	tsk->futex.state = FUTEX_STATE_EXITING;
	raw_spin_unlock_irq(&tsk->pi_lock);
 }
@@ -1480,12 +1481,12 @@ static void futex_cleanup_end(struct tas
	 * Lockless store. The only side effect is that an observer might
	 * take another loop until it becomes visible.
	 */
-	tsk->futex_state = state;
+	tsk->futex.state = state;
	/*
	 * Drop the exit protection. This unblocks waiters which observed
	 * FUTEX_STATE_EXITING to reevaluate the state.
	 */
-	mutex_unlock(&tsk->futex_exit_mutex);
+	mutex_unlock(&tsk->futex.exit_mutex);
 }
 
 void futex_exec_release(struct task_struct *tsk)
--- a/kernel/futex/pi.c
+++ b/kernel/futex/pi.c
@@ -14,7 +14,7 @@ int refill_pi_state_cache(void)
 {
	struct futex_pi_state *pi_state;
 
-	if (likely(current->pi_state_cache))
+	if (likely(current->futex.pi_state_cache))
		return 0;
 
	pi_state = kzalloc_obj(*pi_state);
@@ -28,17 +28,17 @@ int refill_pi_state_cache(void)
	refcount_set(&pi_state->refcount, 1);
	pi_state->key = FUTEX_KEY_INIT;
 
-	current->pi_state_cache = pi_state;
+	current->futex.pi_state_cache = pi_state;
 
	return 0;
 }
 
 static struct futex_pi_state *alloc_pi_state(void)
 {
-	struct futex_pi_state *pi_state = current->pi_state_cache;
+	struct futex_pi_state *pi_state = current->futex.pi_state_cache;
 
	WARN_ON(!pi_state);
-	current->pi_state_cache = NULL;
+	current->futex.pi_state_cache = NULL;
 
	return pi_state;
 }
@@ -60,7 +60,7 @@ static void pi_state_update_owner(struct
	if (new_owner) {
		raw_spin_lock(&new_owner->pi_lock);
		WARN_ON(!list_empty(&pi_state->list));
-		list_add(&pi_state->list, &new_owner->pi_state_list);
+		list_add(&pi_state->list, &new_owner->futex.pi_state_list);
		pi_state->owner = new_owner;
		raw_spin_unlock(&new_owner->pi_lock);
	}
@@ -96,7 +96,7 @@ void put_pi_state(struct futex_pi_state
		raw_spin_unlock_irqrestore(&pi_state->pi_mutex.wait_lock, flags);
	}
 
-	if (current->pi_state_cache) {
+	if (current->futex.pi_state_cache) {
		kfree(pi_state);
	} else {
		/*
@@ -106,7 +106,7 @@ void put_pi_state(struct futex_pi_state
		 */
		pi_state->owner = NULL;
		refcount_set(&pi_state->refcount, 1);
-		current->pi_state_cache = pi_state;
+		current->futex.pi_state_cache = pi_state;
	}
 }
@@ -179,7 +179,7 @@ void put_pi_state(struct futex_pi_state
 *
 * p->pi_lock:
 *
- *	p->pi_state_list -> pi_state->list, relation
+ *	p->futex.pi_state_list -> pi_state->list, relation
 *	pi_mutex->owner -> pi_state->owner, relation
 *
 * pi_state->refcount:
@@ -327,7 +327,7 @@ static int handle_exit_race(u32 __user *
	 * If the futex exit state is not yet FUTEX_STATE_DEAD, tell the
	 * caller that the alleged owner is busy.
	 */
-	if (tsk && tsk->futex_state != FUTEX_STATE_DEAD)
+	if (tsk && tsk->futex.state != FUTEX_STATE_DEAD)
		return -EBUSY;
 
	/*
@@ -346,8 +346,8 @@ static int handle_exit_race(u32 __user *
	 *    *uaddr = 0xC0000000;	     tsk = get_task(PID);
	 *   }				     if (!tsk->flags & PF_EXITING) {
	 *  ...				       attach();
-	 *  tsk->futex_state =		     } else {
-	 *	FUTEX_STATE_DEAD;	       if (tsk->futex_state !=
+	 *  tsk->futex.state =		     } else {
+	 *	FUTEX_STATE_DEAD;	       if (tsk->futex.state !=
	 *					  FUTEX_STATE_DEAD)
	 *				         return -EAGAIN;
	 *				       return -ESRCH; <--- FAIL
@@ -395,7 +395,7 @@ static void __attach_to_pi_owner(struct
	pi_state->key = *key;
 
	WARN_ON(!list_empty(&pi_state->list));
-	list_add(&pi_state->list, &p->pi_state_list);
+	list_add(&pi_state->list, &p->futex.pi_state_list);
	/*
	 * Assignment without holding pi_state->pi_mutex.wait_lock is safe
	 * because there is no concurrency as the object is not published yet.
@@ -439,7 +439,7 @@ static int attach_to_pi_owner(u32 __user
	 * in futex_exit_release(), we do this protected by p->pi_lock:
	 */
	raw_spin_lock_irq(&p->pi_lock);
-	if (unlikely(p->futex_state != FUTEX_STATE_OK)) {
+	if (unlikely(p->futex.state != FUTEX_STATE_OK)) {
		/*
		 * The task is on the way out. When the futex state is
		 * FUTEX_STATE_DEAD, we know that the task has finished
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -25,17 +25,13 @@
 * @head:	pointer to the list-head
 * @len:	length of the list-head, as userspace expects
 */
-SYSCALL_DEFINE2(set_robust_list, struct robust_list_head __user *, head,
-		size_t, len)
+SYSCALL_DEFINE2(set_robust_list, struct robust_list_head __user *, head, size_t, len)
 {
-	/*
-	 * The kernel knows only one size for now:
-	 */
+	/* The kernel knows only one size for now. */
	if (unlikely(len != sizeof(*head)))
		return -EINVAL;
 
-	current->robust_list = head;
-
+	current->futex.robust_list = head;
	return 0;
 }
@@ -43,9 +39,9 @@ static inline void __user *futex_task_ro
 {
 #ifdef CONFIG_COMPAT
	if (compat)
-		return p->compat_robust_list;
+		return p->futex.compat_robust_list;
 #endif
-	return p->robust_list;
+	return p->futex.robust_list;
 }
 
 static void __user *futex_get_robust_list_common(int pid, bool compat)
@@ -467,15 +463,13 @@ SYSCALL_DEFINE4(futex_requeue,
 }
 
 #ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE2(set_robust_list,
-		struct compat_robust_list_head __user *, head,
-		compat_size_t, len)
+COMPAT_SYSCALL_DEFINE2(set_robust_list, struct compat_robust_list_head __user *, head,
+		       compat_size_t, len)
 {
	if (unlikely(len != sizeof(*head)))
		return -EINVAL;
 
-	current->compat_robust_list = head;
-
+	current->futex.compat_robust_list = head;
	return 0;
 }
@@ -515,4 +509,3 @@ SYSCALL_DEFINE6(futex_time32, u32 __user
	return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3);
 }
 #endif /* CONFIG_COMPAT_32BIT_TIME */
-