public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Alexey Gladkov <legion@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Howells <dhowells@redhat.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Cc: linux-kernel@vger.kernel.org
Subject: [RFC 2/1] kill task_ucounts()->rcu_read_lock(), add __task_ucounts()
Date: Sun, 26 Oct 2025 15:35:01 +0100	[thread overview]
Message-ID: <20251026143501.GA22472@redhat.com> (raw)
In-Reply-To: <20251026143140.GA22463@redhat.com>

On 10/26, Oleg Nesterov wrote:
>
> NOTE: task_ucounts() returns the pointer to another rcu-protected data,
> struct ucounts. So it should either be used when task->real_cred and thus
> task->real_cred->ucounts is stable (release_task, copy_process, copy_creds),
> or it should be called under rcu_read_lock(). In both cases it is pointless
> to take rcu_read_lock() to read the cred->ucounts pointer.

So I think task_ucounts() can just do

	/* The caller must ensure that ->real_cred is stable or take rcu_read_lock() */
	#define task_ucounts(task)	\
		rcu_dereference_check((task)->real_cred, 1)->ucounts

but this removes the lockdep checks altogether.

But, otoh, task_cred_xxx(t, ucounts) (or, say, task_cred_xxx(task, user_ns)) can
hide the problem. Lockdep won't complain if, for example, we remove rcu_read_lock()
in task_sig() around get_rlimit_value(task_ucounts(p)). So perhaps something like
below makes any sense?


diff --git a/include/linux/cred.h b/include/linux/cred.h
index 89ae50ad2ace..7078159486f0 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -347,7 +347,14 @@ DEFINE_FREE(put_cred, struct cred *, if (!IS_ERR_OR_NULL(_T)) put_cred(_T))
 
 #define task_uid(task)		(task_cred_xxx((task), uid))
 #define task_euid(task)		(task_cred_xxx((task), euid))
-#define task_ucounts(task)	(task_cred_xxx((task), ucounts))
+
+// ->real_cred must be stable
+#define __task_ucounts(task)	\
+	rcu_dereference_protected((task)->real_cred, 1)->ucounts
+
+// needs rcu_read_lock()
+#define task_ucounts(task)	\
+	rcu_dereference((task)->real_cred)->ucounts
 
 #define current_cred_xxx(xxx)			\
 ({						\
diff --git a/kernel/cred.c b/kernel/cred.c
index dbf6b687dc5c..edddecec82e5 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -305,7 +305,7 @@ int copy_creds(struct task_struct *p, u64 clone_flags)
 		p->real_cred = get_cred_many(p->cred, 2);
 		kdebug("share_creds(%p{%ld})",
 		       p->cred, atomic_long_read(&p->cred->usage));
-		inc_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
+		inc_rlimit_ucounts(__task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
 		return 0;
 	}
 
@@ -342,7 +342,7 @@ int copy_creds(struct task_struct *p, u64 clone_flags)
 #endif
 
 	p->cred = p->real_cred = get_cred(new);
-	inc_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
+	inc_rlimit_ucounts(__task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
 	return 0;
 
 error_put:
diff --git a/kernel/exit.c b/kernel/exit.c
index f041f0c05ebb..80b0f1114bd3 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -252,7 +252,7 @@ void release_task(struct task_struct *p)
 
 	/* don't need to get the RCU readlock here - the process is dead and
 	 * can't be modifying its own credentials. */
-	dec_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
+	dec_rlimit_ucounts(__task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
 
 	pidfs_exit(p);
 	cgroup_release(p);
diff --git a/kernel/fork.c b/kernel/fork.c
index 3da0f08615a9..f2a6a3cd14ef 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2048,7 +2048,7 @@ __latent_entropy struct task_struct *copy_process(
 		goto bad_fork_free;
 
 	retval = -EAGAIN;
-	if (is_rlimit_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
+	if (is_rlimit_overlimit(__task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
 		if (p->real_cred->user != INIT_USER &&
 		    !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
 			goto bad_fork_cleanup_count;
@@ -2486,7 +2486,7 @@ __latent_entropy struct task_struct *copy_process(
 bad_fork_cleanup_delayacct:
 	delayacct_tsk_free(p);
 bad_fork_cleanup_count:
-	dec_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
+	dec_rlimit_ucounts(__task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
 	exit_creds(p);
 bad_fork_free:
 	WRITE_ONCE(p->__state, TASK_DEAD);


  reply	other threads:[~2025-10-26 14:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-26 14:31 [PATCH 1/1] release_task: kill unnecessary rcu_read_lock() around dec_rlimit_ucounts() Oleg Nesterov
2025-10-26 14:35 ` Oleg Nesterov [this message]
2025-10-26 14:49   ` [RFC 2/1] kill task_ucounts()->rcu_read_lock(), add __task_ucounts() Mateusz Guzik
2025-10-27 14:55   ` David Howells
2025-10-27 15:18     ` Paul E. McKenney
2025-10-26 15:44 ` [PATCH 1/1] release_task: kill unnecessary rcu_read_lock() around dec_rlimit_ucounts() Alexey Gladkov
2025-10-27 14:51 ` David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251026143501.GA22472@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=legion@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mjguzik@gmail.com \
    --cc=paulmck@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox