public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Dylan Hatch <dylanbhatch@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Ard Biesheuvel <ardb@kernel.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Vincent Whitchurch <vincent.whitchurch@axis.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Mike Christie <michael.christie@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Stefan Roesch <shr@devkernel.io>, Joey Gouly <joey.gouly@arm.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Helge Deller <deller@gmx.de>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	Florent Revest <revest@chromium.org>,
	Miguel Ojeda <ojeda@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] getrusage: Use trylock when getting sighand lock.
Date: Wed, 17 Jan 2024 21:44:17 +0100	[thread overview]
Message-ID: <20240117204416.GB32526@redhat.com> (raw)
In-Reply-To: <20240117192534.1327608-1-dylanbhatch@google.com>

Heh ;)

getrusage() should not use ->siglock at all.

On my TODO list. I'll try to make a patch this week.

Oleg.

On 01/17, Dylan Hatch wrote:
>
> Processes with many threads run the risk of causing a hard lockup if
> too many threads are calling getrusage() at once. This is because a
> calling thread with RUSAGE_SELF spins on the sighand lock with irq
> disabled, and the critical section of getrusage scales linearly with the
> size of the process. All cpus may end up spinning on the sighand lock
> for a long time because another thread has the lock and is busy
> iterating over 250k+ threads.
>
> In order to mitigate this, periodically re-enable interrupts while
> waiting for the sighand lock. This approach lacks the FIFO fairness of a
> normal spinlock mechanism, but this effect is somewhat contained to
> different threads within the same process.
>
> -- Alternatives Considered --
>
> In an earlier version of the above approach, we added a cond_resched()
> call when disabling interrupts to prevent soft lockups. This solution
> turned out not to be workable on its own since getrusage() is called
> from a non-preemptible context in kernel/exit.c, but could possibly be
> adapted by having an alternate version of getrusage() that can be called
> from a preemptible context.
>
> Another alternative would be to have getruage() itself release the lock
> and enable interrupts periodically while iterating over large numbers of
> threads. However, this would be difficult to implement correctly, and
> the correctness/consistency of the data reported by getrusage() would be
> questionable.
>
> One final alternative might be to add a per-process mutex for callers of
> getrusage() to acquire before acquiring the sighand lock, or to be used
> as a total replacement of the sigahnd lock. We haven't fully explored
> what the implications of this might be.
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> ---
>  include/linux/sched/signal.h | 13 +++++++++++
>  kernel/signal.c              | 43 ++++++++++++++++++++++++++++++++++++
>  kernel/sys.c                 |  8 ++++++-
>  3 files changed, 63 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
> index 3499c1a8b9295..7852f16139965 100644
> --- a/include/linux/sched/signal.h
> +++ b/include/linux/sched/signal.h
> @@ -747,6 +747,19 @@ static inline struct sighand_struct *lock_task_sighand(struct task_struct *task,
>  	return ret;
>  }
>
> +extern struct sighand_struct *__lock_task_sighand_safe(struct task_struct *task,
> +							unsigned long *flags);
> +
> +static inline struct sighand_struct *lock_task_sighand_safe(struct task_struct *task,
> +						       unsigned long *flags)
> +{
> +	struct sighand_struct *ret;
> +
> +	ret = __lock_task_sighand_safe(task, flags);
> +	(void)__cond_lock(&task->sighand->siglock, ret);
> +	return ret;
> +}
> +
>  static inline void unlock_task_sighand(struct task_struct *task,
>  						unsigned long *flags)
>  {
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 47a7602dfe8df..6d60c73b7ab91 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1397,6 +1397,49 @@ int zap_other_threads(struct task_struct *p)
>  	return count;
>  }
>
> +struct sighand_struct *__lock_task_sighand_safe(struct task_struct *tsk,
> +						unsigned long *flags)
> +{
> +	struct sighand_struct *sighand;
> +	int n;
> +	bool lock = false;
> +
> +again:
> +	rcu_read_lock();
> +	local_irq_save(*flags);
> +	for (n = 0; n < 500; n++) {
> +		sighand = rcu_dereference(tsk->sighand);
> +		if (unlikely(sighand == NULL))
> +			break;
> +
> +		/*
> +		 * The downside of this approach is we loose the fairness of
> +		 * FIFO waiting because the acqusition order between multiple
> +		 * waiting tasks is effectively random.
> +		 */
> +		lock = spin_trylock(&sighand->siglock);
> +		if (!lock) {
> +			cpu_relax();
> +			continue;
> +		}
> +
> +		/* __lock_task_sighand has context explaining this check. */
> +		if (likely(sighand == rcu_access_pointer(tsk->sighand)))
> +			break;
> +		spin_unlock(&sighand->siglock);
> +		lock = false;
> +	}
> +	rcu_read_unlock();
> +
> +	/* Handle pending IRQ */
> +	if (!lock && sighand) {
> +		local_irq_restore(*flags);
> +		goto again;
> +	}
> +
> +	return sighand;
> +}
> +
>  struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>  					   unsigned long *flags)
>  {
> diff --git a/kernel/sys.c b/kernel/sys.c
> index e219fcfa112d8..1b1254a3c506b 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1798,7 +1798,13 @@ void getrusage(struct task_struct *p, int who, struct rusage *r)
>  		goto out;
>  	}
>
> -	if (!lock_task_sighand(p, &flags))
> +	/*
> +	 * We use lock_task_sighand_safe here instead of lock_task_sighand
> +	 * because one process with many threads all calling getrusage may
> +	 * otherwise cause an NMI watchdog timeout by disabling IRQs for too
> +	 * long.
> +	 */
> +	if (!lock_task_sighand_safe(p, &flags))
>  		return;
>
>  	switch (who) {
> --
> 2.43.0.381.gb435a96ce8-goog
>


  reply	other threads:[~2024-01-17 20:46 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-17 19:25 [RFC PATCH] getrusage: Use trylock when getting sighand lock Dylan Hatch
2024-01-17 20:44 ` Oleg Nesterov [this message]
2024-01-18 15:56   ` Oleg Nesterov
2024-01-19 14:15 ` [PATCH 1/2] getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand() Oleg Nesterov
2024-01-19 14:15   ` [PATCH 2/2] getrusage: use sig->stats_lock Oleg Nesterov
2024-01-20  3:27     ` Dylan Hatch
2024-01-21  4:45       ` Andrew Morton
2024-01-21 12:07         ` Oleg Nesterov
2024-01-23  2:53           ` Dylan Hatch
2024-01-21 22:32       ` Andrew Morton
2024-01-20  3:29   ` [PATCH 1/2] getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand() Dylan Hatch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240117204416.GB32526@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=deller@gmx.de \
    --cc=dvyukov@google.com \
    --cc=dylanbhatch@google.com \
    --cc=ebiederm@xmission.com \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=joey.gouly@arm.com \
    --cc=josh@joshtriplett.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=michael.christie@oracle.com \
    --cc=ojeda@kernel.org \
    --cc=omosnace@redhat.com \
    --cc=revest@chromium.org \
    --cc=shr@devkernel.io \
    --cc=tglx@linutronix.de \
    --cc=vincent.whitchurch@axis.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox