From: David Vernet <void@manifault.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
sched-ext@meta.com, Andrea Righi <arighi@nvidia.com>,
Changwoo Min <multics69@gmail.com>
Subject: Re: [PATCH sched_ext/for-6.13 1/2] sched_ext: Avoid live-locking bypass mode switching
Date: Tue, 5 Nov 2024 16:03:46 -0600 [thread overview]
Message-ID: <20241105220346.GA64119@maniforge> (raw)
In-Reply-To: <ZyqSm4B4NuzuHEbp@slm.duckdns.org>
[-- Attachment #1: Type: text/plain, Size: 2444 bytes --]
On Tue, Nov 05, 2024 at 11:48:11AM -1000, Tejun Heo wrote:
[...]
> static bool consume_dispatch_q(struct rq *rq, struct scx_dispatch_q *dsq)
> {
> struct task_struct *p;
> retry:
> /*
> + * This retry loop can repeatedly race against scx_ops_bypass()
> + * dequeueing tasks from @dsq trying to put the system into the bypass
> + * mode. On some multi-socket machines (e.g. 2x Intel 8480c), this can
> + * live-lock the machine into soft lockups. Give a breather.
> + */
> + scx_ops_breather(rq);
Should we move this to after the list_empty() check? Or before the goto retry
below so we can avoid having to do the atomic read on the typical hotpath?
> +
> + /*
> * The caller can't expect to successfully consume a task if the task's
> * addition to @dsq isn't guaranteed to be visible somehow. Test
> * @dsq->list without locking and skip if it seems empty.
> @@ -4550,10 +4587,11 @@ bool task_should_scx(struct task_struct
> */
> static void scx_ops_bypass(bool bypass)
> {
> + static DEFINE_RAW_SPINLOCK(bypass_lock);
> int cpu;
> unsigned long flags;
>
> - raw_spin_lock_irqsave(&__scx_ops_bypass_lock, flags);
> + raw_spin_lock_irqsave(&bypass_lock, flags);
> if (bypass) {
> scx_ops_bypass_depth++;
> WARN_ON_ONCE(scx_ops_bypass_depth <= 0);
> @@ -4566,6 +4604,8 @@ static void scx_ops_bypass(bool bypass)
> goto unlock;
> }
>
> + atomic_inc(&scx_ops_breather_depth);
> +
> /*
> * No task property is changing. We just need to make sure all currently
> * queued tasks are re-queued according to the new scx_rq_bypassing()
> @@ -4621,8 +4661,10 @@ static void scx_ops_bypass(bool bypass)
> /* resched to restore ticks and idle state */
> resched_cpu(cpu);
> }
> +
> + atomic_dec(&scx_ops_breather_depth);
> unlock:
> - raw_spin_unlock_irqrestore(&__scx_ops_bypass_lock, flags);
> + raw_spin_unlock_irqrestore(&bypass_lock, flags);
> }
>
> static void free_exit_info(struct scx_exit_info *ei)
> @@ -6275,6 +6317,13 @@ static bool scx_dispatch_from_dsq(struct
> raw_spin_rq_lock(src_rq);
> }
>
> + /*
> + * If the BPF scheduler keeps calling this function repeatedly, it can
> + * cause similar live-lock conditions as consume_dispatch_q(). Insert a
> + * breather if necessary.
> + */
> + scx_ops_breather(src_rq);
> +
> locked_rq = src_rq;
> raw_spin_lock(&src_dsq->lock);
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
next prev parent reply other threads:[~2024-11-05 22:03 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-05 21:48 [PATCH sched_ext/for-6.13 1/2] sched_ext: Avoid live-locking bypass mode switching Tejun Heo
2024-11-05 21:49 ` [PATCH sched_ext/for-6.13 2/2] sched_ext: Enable the ops breather and eject BPF scheduler on softlockup Tejun Heo
2024-11-06 21:32 ` Doug Anderson
2024-11-06 22:08 ` Tejun Heo
2024-11-06 23:02 ` Doug Anderson
2024-11-06 23:07 ` Tejun Heo
2024-11-06 23:20 ` Doug Anderson
2024-11-07 19:31 ` Tejun Heo
2024-11-08 20:38 ` Tejun Heo
2024-11-05 22:03 ` David Vernet [this message]
2024-11-05 23:02 ` [PATCH sched_ext/for-6.13 1/2] sched_ext: Avoid live-locking bypass mode switching Tejun Heo
2024-11-05 23:57 ` Andrea Righi
2024-11-06 0:26 ` Tejun Heo
2024-11-06 0:33 ` Andrea Righi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241105220346.GA64119@maniforge \
--to=void@manifault.com \
--cc=arighi@nvidia.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=multics69@gmail.com \
--cc=sched-ext@meta.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.