From: Tejun Heo <tj@kernel.org>
To: Doug Anderson <dianders@chromium.org>
Cc: David Vernet <void@manifault.com>,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
sched-ext@meta.com, Andrea Righi <arighi@nvidia.com>,
Changwoo Min <multics69@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH sched_ext/for-6.13 2/2] sched_ext: Enable the ops breather and eject BPF scheduler on softlockup
Date: Wed, 6 Nov 2024 12:08:46 -1000
Message-ID: <Zyvo7lFcnAddB9RT@slm.duckdns.org>
In-Reply-To: <CAD=FV=U7z-Lf_1T2cYyae3b6W5Joyp+oiRSp-iXe_3jz9Aqoaw@mail.gmail.com>
Hello, Doug.
On Wed, Nov 06, 2024 at 01:32:40PM -0800, Doug Anderson wrote:
...
> 1. It doesn't feel right to add knowledge of "sched-ext" to the
> softlockup detector. You're calling from a generic part of the kernel
> to a specific part and it just feels unexpected, like there should be
> some better boundaries between the two.
I suppose we can create notifier-like infrastructure if the direct call is
what's bothersome, but that's likely overkill at this point. The second
point is probably more important to discuss.
> 2. You're relying on a debug feature to enforce correctness. The
> softlockup detector isn't designed to _fix_ softlockups. It's designed
> to detect and report softlockups and then possibly reboot the machine.
> Someone would not expect that turning on this debug feature would
> cause the system to take the action of kicking out a BPF scheduler.
Softlockups can trigger panic and thus system reset, which is arguably also
a remediative action.
> It feels like sched-ext should fix its own watchdog so it detects and
> fixes the problem before the softlockup detector does.
sched_ext has its own watchdog with a configurable timeout, and softlockups
would eventually trigger that too. However, that watchdog looks at the time
between tasks waking up and running to detect stalls, and its (configurable)
timeout is usually longer than the softlockup detection threshold, which
makes sense given the different failure modes the two are looking at.
If sched_ext were to expand its watchdog to monitor softlockup-like
conditions, it would essentially look just like the softlockup watchdog and
we would still have the same problem of coordinating detection thresholds.
Having a notification mechanism which fires when a watchdog that can take
drastic action is about to trigger doesn't sound too odd to me. If I make it
use a notification chain so that the mechanism is more generic, would that
make it more acceptable to you?
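Roughly, the notification-chain shape could look like the sketch below. It
is a userspace model with made-up names (prelockup_register(),
watchdog_check(), scx_prelockup()), not the kernel's notifier API and not
code from this patch. The 3/4-of-threshold trigger point is likewise an
assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of a "pre-lockup" notification chain.
 * None of these names come from the kernel or from the patch.
 */
struct prelockup_notifier {
	/* Called with the stalled CPU and how long it has been stalled. */
	void (*call)(struct prelockup_notifier *nb, int cpu,
		     unsigned int stalled_secs);
	struct prelockup_notifier *next;
};

static struct prelockup_notifier *prelockup_chain;

/* Add a subscriber to the head of the chain. */
static void prelockup_register(struct prelockup_notifier *nb)
{
	assert(nb && nb->call);
	nb->next = prelockup_chain;
	prelockup_chain = nb;
}

/*
 * Generic watchdog side: it only walks the chain and has no knowledge
 * of who is listening. Returns 1 if subscribers were notified.
 */
static int watchdog_check(int cpu, unsigned int stalled_secs,
			  unsigned int thresh_secs)
{
	struct prelockup_notifier *nb;

	/* Assumed pre-trigger point: 3/4 of the detection threshold. */
	if (stalled_secs * 4 < thresh_secs * 3)
		return 0;

	for (nb = prelockup_chain; nb; nb = nb->next)
		nb->call(nb, cpu, stalled_secs);
	return 1;
}

/* Subscriber side: stand-in for sched_ext ejecting the BPF scheduler. */
static int scx_ejected;

static void scx_prelockup(struct prelockup_notifier *nb, int cpu,
			  unsigned int stalled_secs)
{
	(void)nb;
	(void)cpu;
	(void)stalled_secs;
	scx_ejected = 1;
}

static struct prelockup_notifier scx_prelockup_nb = {
	.call = scx_prelockup,
};
```

With this split, the watchdog side only owns the chain and the trigger
point, while the decision to eject the scheduler stays with the subscriber
that registered scx_prelockup_nb.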
Thanks.
--
tejun