From: Benjamin Segall <bsegall@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mel Gorman <mgorman@suse.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Valentin Schneider <vschneid@redhat.com>,
linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v2] sched: async unthrottling for cfs bandwidth
Date: Mon, 31 Oct 2022 14:56:13 -0700
Message-ID: <xm26fsf3wtc2.fsf@google.com>
In-Reply-To: <Y1/HzzA1FIawYM11@hirez.programming.kicks-ass.net> (Peter Zijlstra's message of "Mon, 31 Oct 2022 14:04:15 +0100")
Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, Oct 26, 2022 at 03:44:49PM -0700, Josh Don wrote:
>> CFS bandwidth currently distributes new runtime and unthrottles cfs_rq's
>> inline in an hrtimer callback. Runtime distribution is a per-cpu
>> operation, and unthrottling is a per-cgroup operation, since a tg walk
>> is required. On machines with a large number of cpus and large cgroup
>> hierarchies, this cpus*cgroups work can be too much to do in a single
>> hrtimer callback: since IRQ are disabled, hard lockups may easily occur.
>> Specifically, we've found this scalability issue on configurations with
>> 256 cpus, O(1000) cgroups in the hierarchy being throttled, and high
>> memory bandwidth usage.
>>
>> To fix this, we can instead unthrottle cfs_rq's asynchronously via a
>> CSD. Each cpu is responsible for unthrottling itself, thus sharding the
>> total work more fairly across the system, and avoiding hard lockups.
>
> So, TJ has been complaining about us throttling in kernel-space, causing
> grief when we also happen to hold a mutex or some other resource and has
> been prodding us to only throttle at the return-to-user boundary.
>
> Would this be an opportune moment to do this? That is, what if we
> replace this CSD with a task_work that's ran on the return-to-user path
> instead?
This is unthrottle, not throttle, but it would probably be
straightforward enough to do what you said for throttle. I'd expect this
to not help all that much though, because throttle hits the entire
cfs_rq, not individual threads.
I'm currently trying something more invasive, which doesn't throttle a
cfs_rq while it has any kernel tasks, and prioritizes kernel tasks / ses
containing kernel tasks when a cfs_rq "should" be throttled. "Invasive"
is a key word though, as it needs to do the sort of h_nr_kernel_tasks
tracking on put_prev/set_next in ways we currently only need to do on
enqueue/dequeue.
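
A rough userspace sketch of the "don't throttle while kernel tasks are
present" idea, for illustration only. This is not the patch under
discussion and not kernel code: struct toy_cfs_rq and every toy_*
helper are hypothetical stand-ins, and only the h_nr_kernel_tasks
counter name comes from the text above. The real tracking would have to
hook put_prev/set_next as well as enqueue/dequeue, which this model
does not attempt to show.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the scheduler's cfs_rq. */
struct toy_cfs_rq {
	long long runtime_remaining;	/* quota left in this period, in ns */
	int h_nr_kernel_tasks;		/* tasks in this hierarchy currently in kernel mode */
	bool throttled;
};

/* A task belonging to this cfs_rq entered kernel mode. */
static void toy_kernel_enter(struct toy_cfs_rq *rq)
{
	rq->h_nr_kernel_tasks++;
}

/* A task belonging to this cfs_rq is about to return to user mode. */
static void toy_kernel_exit(struct toy_cfs_rq *rq)
{
	rq->h_nr_kernel_tasks--;
}

/* The cfs_rq "should" be throttled once its quota is used up. */
static bool toy_should_throttle(const struct toy_cfs_rq *rq)
{
	return rq->runtime_remaining <= 0;
}

/*
 * Throttle only when the quota is exhausted AND no task in the
 * hierarchy is in kernel mode. Returns true if the cfs_rq was
 * throttled by this call; false means "not needed" or "deferred".
 */
static bool toy_try_throttle(struct toy_cfs_rq *rq)
{
	if (!toy_should_throttle(rq))
		return false;
	if (rq->h_nr_kernel_tasks > 0)
		return false;	/* deferred: a kernel task may hold a mutex */
	rq->throttled = true;
	return true;
}
```

In this model an over-quota cfs_rq stays runnable until the last kernel
task calls toy_kernel_exit(), after which toy_try_throttle() succeeds.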