Re: [PATCH] kernel/sys: Optimize do_prlimit lock scope to reduce contention

From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhen Ni <zhen.ni@easystack.cn>,
	viro@zeniv.linux.org.uk, catalin.marinas@arm.com,
	brauner@kernel.org, zev@bewilderbeest.net,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org
Subject: Re: [PATCH] kernel/sys: Optimize do_prlimit lock scope to reduce contention
Date: Thu, 28 Nov 2024 08:13:52 +0100
Message-ID: <20241128071351.GA10998@redhat.com>
In-Reply-To: <20241127174536.752def18058e84487ab9ad65@linux-foundation.org>

On 11/27, Andrew Morton wrote:
>
> On Wed, 20 Nov 2024 21:21:56 +0800 Zhen Ni <zhen.ni@easystack.cn> wrote:
>
> > The security_task_setrlimit function is a Linux Security Module (LSM)
> > hook that evaluates resource limit changes based on security policies.
> > It does not alter the rlim data structure, as confirmed by existing
> > LSM implementations (e.g., SELinux and AppArmor). Thus, this function
> > does not require locking, ensuring correctness while improving
> > concurrency.
>
> Seems sane.
>
> Does any code call do_prlimit() frequently enough for this to matter?

I have the same question...

> > -	task_lock(tsk->group_leader);
> >  	if (new_rlim) {
> >  		/*
> >  		 * Keep the capable check against init_user_ns until cgroups can
> >  		 * contain all limits.
> >  		 */
> >  		if (new_rlim->rlim_max > rlim->rlim_max &&
> > -				!capable(CAP_SYS_RESOURCE))
> > -			retval = -EPERM;
> > -		if (!retval)
> > -			retval = security_task_setrlimit(tsk, resource, new_rlim);
> > +		    !capable(CAP_SYS_RESOURCE))
> > +			return -EPERM;
> > +		retval = security_task_setrlimit(tsk, resource, new_rlim);
> > +		if (retval)
> > +			return retval;
> >  	}
> > +
> > +	task_lock(tsk->group_leader);

The problem is that task_lock(tsk->group_leader) doesn't look right with or
without this patch. I'll try to make a fix over the weekend.

If the caller is sys_prlimit64() and tsk != current, then ->group_leader is
not stable: do_prlimit() can race with a multi-threaded exec and take the
wrong lock.

Oleg.
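
The hazard described above (a lock is chosen through a pointer that can change
before the lock is taken) can be modelled in user space. What follows is a
minimal sketch of the general "lock, then re-check" pattern and not the fix
proposed in this thread. All names (struct member, struct group,
lock_current_leader(), switch_leader(), set_limit()) are hypothetical, the
spinlock is a plain C11 atomic_flag rather than the kernel's task_lock(), and
the sketch assumes that members are never freed while another thread may still
hold a pointer to them.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-ins; these are not kernel types. */
struct member {
	atomic_flag lock;                 /* toy spinlock owned by this member */
};

struct group {
	_Atomic(struct member *) leader;  /* may be replaced concurrently */
	long limit;                       /* protected by the current leader's lock */
};

static void member_lock(struct member *m)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&m->lock, memory_order_acquire))
		;
}

static void member_unlock(struct member *m)
{
	atomic_flag_clear_explicit(&m->lock, memory_order_release);
}

/*
 * Lock whichever member is the leader, then confirm that it is still the
 * leader. If the leader changed between the load and the lock, the lock
 * just taken is the wrong one: drop it and retry.
 */
static struct member *lock_current_leader(struct group *g)
{
	for (;;) {
		struct member *m = atomic_load(&g->leader);

		member_lock(m);
		if (m == atomic_load(&g->leader))
			return m;
		member_unlock(m);
	}
}

/*
 * Model of a leader change. Making the switch under the old leader's lock
 * is an assumption of this sketch (it is what makes the re-check above
 * sufficient), not a description of the kernel's exec path.
 */
static void switch_leader(struct group *g, struct member *next)
{
	struct member *old = lock_current_leader(g);

	atomic_store(&g->leader, next);
	member_unlock(old);
}

/* Update the shared limit while holding the validated leader's lock. */
static void set_limit(struct group *g, long value)
{
	struct member *m = lock_current_leader(g);

	assert(m == atomic_load(&g->leader));  /* cannot change while m is locked */
	g->limit = value;
	member_unlock(m);
}
```

Without the re-check in lock_current_leader(), a thread that loaded the old
leader just before switch_leader() ran would update the limit under a lock that
no longer protects it, which is the user-space analogue of taking the wrong
lock. A caller would initialise each member with ATOMIC_FLAG_INIT, point
g.leader at one of them, and then call set_limit() from any thread.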

