From: Tvrtko Ursulin <tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
"Johannes Weiner"
<hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
"Zefan Li" <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
"Dave Airlie" <airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"Daniel Vetter" <daniel.vetter-/w4YWyX8dFk@public.gmane.org>,
"Rob Clark" <robdclark-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
"Stéphane Marchesin"
<marcheu-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
"T . J . Mercier"
<tjmercier-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Kenny.Ho-5C7GfCeVMHo@public.gmane.org,
"Christian König" <christian.koenig-5C7GfCeVMHo@public.gmane.org>,
"Brian Welty"
<brian.welty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
"Tvrtko Ursulin"
<tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control
Date: Thu, 2 Feb 2023 14:26:06 +0000
Message-ID: <27b7882e-1201-b173-6f56-9ececb5780e8@linux.intel.com>
In-Reply-To: <Y9R2N8sl+7f8Zacv-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>

On 28/01/2023 01:11, Tejun Heo wrote:
> On Thu, Jan 12, 2023 at 04:56:07PM +0000, Tvrtko Ursulin wrote:
> ...
>> +	/*
>> +	 * 1st pass - reset working values and update hierarchical weights and
>> +	 * GPU utilisation.
>> +	 */
>> +	if (!__start_scanning(root, period_us))
>> +		goto out_retry; /*
>> +				 * Always come back later if scanner races with
>> +				 * core cgroup management. (Repeated pattern.)
>> +				 */
>> +
>> +	css_for_each_descendant_pre(node, &root->css) {
>> +		struct drm_cgroup_state *drmcs = css_to_drmcs(node);
>> +		struct cgroup_subsys_state *css;
>> +		unsigned int over_weights = 0;
>> +		u64 unused_us = 0;
>> +
>> +		if (!css_tryget_online(node))
>> +			goto out_retry;
>> +
>> +		/*
>> +		 * 2nd pass - calculate initial budgets, mark over budget
>> +		 * siblings and add up unused budget for the group.
>> +		 */
>> +		css_for_each_child(css, &drmcs->css) {
>> +			struct drm_cgroup_state *sibling = css_to_drmcs(css);
>> +
>> +			if (!css_tryget_online(css)) {
>> +				css_put(node);
>> +				goto out_retry;
>> +			}
>> +
>> +			sibling->per_s_budget_us =
>> +				DIV_ROUND_UP_ULL(drmcs->per_s_budget_us *
>> +						 sibling->weight,
>> +						 drmcs->sum_children_weights);
>> +
>> +			sibling->over = sibling->active_us >
>> +					sibling->per_s_budget_us;
>> +			if (sibling->over)
>> +				over_weights += sibling->weight;
>> +			else
>> +				unused_us += sibling->per_s_budget_us -
>> +					     sibling->active_us;
>> +
>> +			css_put(css);
>> +		}
>> +
>> +		/*
>> +		 * 3rd pass - spread unused budget according to relative weights
>> +		 * of over budget siblings.
>> +		 */
>> +		css_for_each_child(css, &drmcs->css) {
>> +			struct drm_cgroup_state *sibling = css_to_drmcs(css);
>> +
>> +			if (!css_tryget_online(css)) {
>> +				css_put(node);
>> +				goto out_retry;
>> +			}
>> +
>> +			if (sibling->over) {
>> +				u64 budget_us =
>> +					DIV_ROUND_UP_ULL(unused_us *
>> +							 sibling->weight,
>> +							 over_weights);
>> +				sibling->per_s_budget_us += budget_us;
>> +				sibling->over = sibling->active_us >
>> +						sibling->per_s_budget_us;
>> +			}
>> +
>> +			css_put(css);
>> +		}
>> +
>> +		css_put(node);
>> +	}
>> +
>> +	/*
>> +	 * 4th pass - send out over/under budget notifications.
>> +	 */
>> +	css_for_each_descendant_post(node, &root->css) {
>> +		struct drm_cgroup_state *drmcs = css_to_drmcs(node);
>> +
>> +		if (!css_tryget_online(node))
>> +			goto out_retry;
>> +
>> +		if (drmcs->over || drmcs->over_budget)
>> +			signal_drm_budget(drmcs,
>> +					  drmcs->active_us,
>> +					  drmcs->per_s_budget_us);
>> +		drmcs->over_budget = drmcs->over;
>> +
>> +		css_put(node);
>> +	}
>
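The arithmetic of the 2nd and 3rd passes in the quoted hunk can be sketched in plain userspace C for one group of siblings. This is only an illustration: struct sibling, div_round_up() and distribute() are hypothetical stand-ins for the patch's drm_cgroup_state fields and DIV_ROUND_UP_ULL(), and the css reference handling and retry logic are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cgroup fields used in the quoted hunk. */
struct sibling {
	unsigned int weight;
	uint64_t active_us;       /* GPU time consumed in the scan period */
	uint64_t per_s_budget_us; /* budget computed by distribute() */
	int over;                 /* non-zero if active_us exceeds the budget */
};

/* Userspace equivalent of a rounding-up 64-bit division. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Split the parent's budget among n siblings, then hand unused budget
 * to the over budget siblings in proportion to their weights. */
static void distribute(uint64_t parent_budget_us, struct sibling *s, size_t n)
{
	unsigned int sum_weights = 0, over_weights = 0;
	uint64_t unused_us = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum_weights += s[i].weight;

	/* "2nd pass": initial budgets, over budget marking, unused total. */
	for (i = 0; i < n; i++) {
		s[i].per_s_budget_us =
			div_round_up(parent_budget_us * s[i].weight,
				     sum_weights);
		s[i].over = s[i].active_us > s[i].per_s_budget_us;
		if (s[i].over)
			over_weights += s[i].weight;
		else
			unused_us += s[i].per_s_budget_us - s[i].active_us;
	}

	/* "3rd pass": spread the unused budget over the over budget siblings. */
	for (i = 0; i < n; i++) {
		if (!s[i].over)
			continue;
		s[i].per_s_budget_us +=
			div_round_up(unused_us * s[i].weight, over_weights);
		s[i].over = s[i].active_us > s[i].per_s_budget_us;
	}
}
```

For example, with a parent budget of 1,000,000 us and three siblings of weights 100, 100 and 200 that used 800,000, 200,000 and 100,000 us, the initial budgets are 250,000, 250,000 and 500,000 us. The 450,000 us left unused all goes to the first sibling, which ends up with 700,000 us and is still over budget. Nothing carries over to the next period.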
> It keeps bothering me that the distribution logic has no memory. Maybe this
> is good enough for coarse control with long cycle durations but it likely
> will get in trouble if pushed to finer grained control. State keeping
> doesn't require a lot of complexity. The only state that needs tracking is
> each cgroup's vtime and then the core should be able to tell specific
> drivers how much each cgroup is over or under fairly accurately at any given
> time.
>
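As a rough sketch of what per-cgroup vtime tracking could look like: each group's virtual time advances by its GPU usage scaled inversely by its weight, and the distance from a reference vtime says how far over or under its fair share the group is. Everything below is an assumption made for illustration and not code from this series: the names vt_group, vt_charge(), vt_min() and vt_lag(), the VTIME_SCALE factor, and the choice of the minimum vtime among contending groups as the reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VTIME_SCALE 1024ULL /* hypothetical fixed-point scaling factor */

/* Hypothetical per-cgroup state: the only persistent value is vtime. */
struct vt_group {
	unsigned int weight;
	uint64_t vtime;
};

/* Charge consumed GPU time; a larger weight makes vtime advance slower. */
static void vt_charge(struct vt_group *g, uint64_t used_us)
{
	assert(g->weight != 0);
	g->vtime += used_us * VTIME_SCALE / g->weight;
}

/* Reference point: the smallest vtime among the contending groups. */
static uint64_t vt_min(const struct vt_group *g, size_t n)
{
	uint64_t min;
	size_t i;

	assert(n > 0);
	min = g[0].vtime;
	for (i = 1; i < n; i++)
		if (g[i].vtime < min)
			min = g[i].vtime;
	return min;
}

/* Positive: ahead of its fair share (over); zero or negative: not over.
 * The unsigned difference is reinterpreted as signed, as is common
 * for wrapping time comparisons. */
static int64_t vt_lag(const struct vt_group *g, uint64_t ref_vtime)
{
	return (int64_t)(g->vtime - ref_vtime);
}
```

Because vtime persists across scan periods, a group that overran in one period stays ahead of the reference until the others catch up, which gives the distribution the memory the stateless passes lack.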
> That said, this isn't a blocker. What's implemented can work well enough
> with coarse enough time grain and that might be enough for the time being
> and we can get back to it later. I think Michal already mentioned it but it
> might be a good idea to track active and inactive cgroups and build the
> weight tree with only active ones. There are machines with a lot of mostly
> idle cgroups (> tens of thousands) and tree wide scanning even at low
> frequency can become a pretty bad bottleneck.
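One possible shape for tracking only active cgroups is sketched below. It assumes, purely for illustration, that "active" means "has recent GPU usage itself or has an active child"; the thread has not settled that definition. The struct grp layout and the grp_update() and grp_set_busy() helpers are hypothetical. Each parent keeps the sum of weights of its active children only, so budget calculations never need to visit idle subtrees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node in the weight tree. */
struct grp {
	struct grp *parent;
	unsigned int weight;
	unsigned int active_children;    /* number of active direct children */
	unsigned int sum_active_weights; /* weights of active children only */
	bool self_active;                /* assumed: has recent GPU usage */
	bool active;                     /* self_active or any active child */
};

/* Propagate an activity change towards the root, stopping at the first
 * ancestor whose own active state does not change. */
static void grp_update(struct grp *g)
{
	for (; g; g = g->parent) {
		bool now = g->self_active || g->active_children > 0;

		if (now == g->active)
			break;
		g->active = now;

		if (!g->parent)
			break;
		if (now) {
			g->parent->active_children++;
			g->parent->sum_active_weights += g->weight;
		} else {
			assert(g->parent->active_children > 0);
			g->parent->active_children--;
			g->parent->sum_active_weights -= g->weight;
		}
	}
}

/* Mark a group as having (or no longer having) recent GPU usage. */
static void grp_set_busy(struct grp *g, bool busy)
{
	g->self_active = busy;
	grp_update(g);
}
```

The cost of a state change is bounded by the depth of the tree, and a periodic scan would only need to walk groups whose active flag is set, instead of every css in the hierarchy.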
Right, that's the kind of experience (tens of thousands) I was missing,
thank you. Another item for my TODO list then, but I have a question
first.

When you say active/inactive, what are you referring to in the cgroup
world? Offline/online? My understanding was that offline is a
temporary state while the css is being destroyed.

Also, I am postponing implementing those changes until I hear at
least something from the DRM community.

Regards,

Tvrtko