From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhen Ni <zhen.ni@easystack.cn>,
	viro@zeniv.linux.org.uk, catalin.marinas@arm.com,
	brauner@kernel.org, zev@bewilderbeest.net,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org
Subject: Re: [PATCH] kernel/sys: Optimize do_prlimit lock scope to reduce contention
Date: Thu, 28 Nov 2024 08:39:11 +0100
Message-ID: <20241128073911.GB10998@redhat.com>
In-Reply-To: <20241128071351.GA10998@redhat.com>

On 11/28, Oleg Nesterov wrote:
>
> The problem is that task_lock(tsk->group_leader) doesn't look right with or
> without this patch. I'll try to make a fix on weekend.
>
> If the caller is sys_prlimit64() and tsk != current, then ->group_leader is
> not stable, do_prlimit() can race with mt exec and take the wrong lock.

... and task_unlock(tsk->group_leader) is simply unsafe.

Perhaps something like the patch below, but it doesn't look nice; I'll try
to think more. And grep: maybe there are more lockless users of
tsk->group_leader when tsk != current.

Oleg.

--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1464,6 +1464,7 @@ SYSCALL_DEFINE2(setdomainname, char __user *, name, int, len)
 static int do_prlimit(struct task_struct *tsk, unsigned int resource,
 		      struct rlimit *new_rlim, struct rlimit *old_rlim)
 {
+	struct task_struct *leader;
 	struct rlimit *rlim;
 	int retval = 0;
 
@@ -1481,7 +1482,14 @@ static int do_prlimit(struct task_struct *tsk, unsigned int resource,
 
 	/* Holding a refcount on tsk protects tsk->signal from disappearing. */
 	rlim = tsk->signal->rlim + resource;
-	task_lock(tsk->group_leader);
+
+	if (tsk != current)
+		read_lock(&tasklist_lock);
+	leader = READ_ONCE(tsk->group_leader);
+	task_lock(leader);
+	if (tsk != current)
+		read_unlock(&tasklist_lock);
+
 	if (new_rlim) {
 		/*
 		 * Keep the capable check against init_user_ns until cgroups can
@@ -1499,7 +1507,7 @@ static int do_prlimit(struct task_struct *tsk, unsigned int resource,
 		if (new_rlim)
 			*rlim = *new_rlim;
 	}
-	task_unlock(tsk->group_leader);
+	task_unlock(leader);
 
 	/*
 	 * RLIMIT_CPU handling. Arm the posix CPU timer if the limit is not
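
The pattern in the patch above (read ->group_leader once while the pointer is
stable, then lock and unlock through that saved pointer) can be sketched in
userspace. This is only an illustration, not kernel code: struct task,
lock_leader(), change_leader() and demo() are hypothetical names, pthread
mutexes stand in for task_lock() and tasklist_lock, and the sketch assumes
that the leader change is serialized by the same outer lock.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for task_struct; not kernel code. */
struct task {
	pthread_mutex_t alloc_lock;	/* plays the role of task_lock() */
	struct task *group_leader;	/* may be switched by another thread */
};

/* Stand-in for tasklist_lock (assumption: it serializes leader changes). */
static pthread_mutex_t tasklist_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Snapshot the leader while it cannot change, lock it, and hand the
 * snapshot back so the caller unlocks exactly what was locked.
 */
static struct task *lock_leader(struct task *tsk)
{
	struct task *leader;

	pthread_mutex_lock(&tasklist_lock);
	leader = tsk->group_leader;
	pthread_mutex_lock(&leader->alloc_lock);
	pthread_mutex_unlock(&tasklist_lock);
	return leader;
}

/* Models a multi-threaded exec switching the group leader. */
static void change_leader(struct task *tsk, struct task *new_leader)
{
	pthread_mutex_lock(&tasklist_lock);
	tsk->group_leader = new_leader;
	pthread_mutex_unlock(&tasklist_lock);
}

/*
 * Returns 1 if the leader changed while the lock was held and the
 * originally locked mutex was still released correctly.
 */
int demo(void)
{
	struct task a, b;
	struct task *leader;
	int changed, released;

	pthread_mutex_init(&a.alloc_lock, NULL);
	pthread_mutex_init(&b.alloc_lock, NULL);
	a.group_leader = &a;
	b.group_leader = &a;		/* b is a sub-thread of a */

	leader = lock_leader(&b);	/* locks a.alloc_lock */
	change_leader(&b, &b);		/* leader switches while lock is held */

	/* Re-reading b.group_leader here would name the wrong lock. */
	changed = (b.group_leader != leader);
	pthread_mutex_unlock(&leader->alloc_lock);

	/* trylock succeeds only if a.alloc_lock was really released. */
	released = (pthread_mutex_trylock(&a.alloc_lock) == 0);
	if (released)
		pthread_mutex_unlock(&a.alloc_lock);

	pthread_mutex_destroy(&a.alloc_lock);
	pthread_mutex_destroy(&b.alloc_lock);
	return changed && released;
}
```

Calling demo() exercises the case where the leader pointer changes between
lock and unlock; unlocking via the saved pointer still releases the mutex
that was actually taken.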



Thread overview: 5+ messages
2024-11-20 13:21 [PATCH] kernel/sys: Optimize do_prlimit lock scope to reduce contention Zhen Ni
2024-11-28  1:45 ` Andrew Morton
2024-11-28  7:13   ` Oleg Nesterov
2024-11-28  7:39     ` Oleg Nesterov [this message]
2024-11-28  8:08       ` Oleg Nesterov
