public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Tvrtko Ursulin <tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: Intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"Johannes Weiner"
	<hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	"Zefan Li" <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
	"Dave Airlie" <airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"Daniel Vetter" <daniel.vetter-/w4YWyX8dFk@public.gmane.org>,
	"Rob Clark" <robdclark-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	"Stéphane Marchesin"
	<marcheu-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	"T . J . Mercier"
	<tjmercier-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Kenny.Ho-5C7GfCeVMHo@public.gmane.org,
	"Christian König" <christian.koenig-5C7GfCeVMHo@public.gmane.org>,
	"Brian Welty"
	<brian.welty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"Tvrtko Ursulin"
	<tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control
Date: Fri, 27 Jan 2023 15:11:19 -1000	[thread overview]
Message-ID: <Y9R2N8sl+7f8Zacv@slm.duckdns.org> (raw)
In-Reply-To: <20230112165609.1083270-11-tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>

On Thu, Jan 12, 2023 at 04:56:07PM +0000, Tvrtko Ursulin wrote:
...
> +	/*
> +	 * 1st pass - reset working values and update hierarchical weights and
> +	 * GPU utilisation.
> +	 */
> +	if (!__start_scanning(root, period_us))
> +		goto out_retry; /*
> +				 * Always come back later if scanner races with
> +				 * core cgroup management. (Repeated pattern.)
> +				 */
> +
> +	css_for_each_descendant_pre(node, &root->css) {
> +		struct drm_cgroup_state *drmcs = css_to_drmcs(node);
> +		struct cgroup_subsys_state *css;
> +		unsigned int over_weights = 0;
> +		u64 unused_us = 0;
> +
> +		if (!css_tryget_online(node))
> +			goto out_retry;
> +
> +		/*
> +		 * 2nd pass - calculate initial budgets, mark over budget
> +		 * siblings and add up unused budget for the group.
> +		 */
> +		css_for_each_child(css, &drmcs->css) {
> +			struct drm_cgroup_state *sibling = css_to_drmcs(css);
> +
> +			if (!css_tryget_online(css)) {
> +				css_put(node);
> +				goto out_retry;
> +			}
> +
> +			sibling->per_s_budget_us  =
> +				DIV_ROUND_UP_ULL(drmcs->per_s_budget_us *
> +						 sibling->weight,
> +						 drmcs->sum_children_weights);
> +
> +			sibling->over = sibling->active_us >
> +					sibling->per_s_budget_us;
> +			if (sibling->over)
> +				over_weights += sibling->weight;
> +			else
> +				unused_us += sibling->per_s_budget_us -
> +					     sibling->active_us;
> +
> +			css_put(css);
> +		}
> +
> +		/*
> +		 * 3rd pass - spread unused budget according to relative weights
> +		 * of over budget siblings.
> +		 */
> +		css_for_each_child(css, &drmcs->css) {
> +			struct drm_cgroup_state *sibling = css_to_drmcs(css);
> +
> +			if (!css_tryget_online(css)) {
> +				css_put(node);
> +				goto out_retry;
> +			}
> +
> +			if (sibling->over) {
> +				u64 budget_us =
> +					DIV_ROUND_UP_ULL(unused_us *
> +							 sibling->weight,
> +							 over_weights);
> +				sibling->per_s_budget_us += budget_us;
> +				sibling->over = sibling->active_us  >
> +						sibling->per_s_budget_us;
> +			}
> +
> +			css_put(css);
> +		}
> +
> +		css_put(node);
> +	}
> +
> +	/*
> +	 * 4th pass - send out over/under budget notifications.
> +	 */
> +	css_for_each_descendant_post(node, &root->css) {
> +		struct drm_cgroup_state *drmcs = css_to_drmcs(node);
> +
> +		if (!css_tryget_online(node))
> +			goto out_retry;
> +
> +		if (drmcs->over || drmcs->over_budget)
> +			signal_drm_budget(drmcs,
> +					  drmcs->active_us,
> +					  drmcs->per_s_budget_us);
> +		drmcs->over_budget = drmcs->over;
> +
> +		css_put(node);
> +	}

It keeps bothering me that the distribution logic has no memory. Maybe this
is good enough for coarse control with long cycle durations but it likely
will get in trouble if pushed to finer grained control. State keeping
doesn't require a lot of complexity. The only state that needs tracking is
each cgroup's vtime and then the core should be able to tell specific
drivers how much each cgroup is over or under fairly accurately at any given
time.

That said, this isn't a blocker. What's implemented can work well enough
with coarse enough time grain and that might be enough for the time being
and we can get back to it later. I think Michal already mentioned it but it
might be a good idea to track active and inactive cgroups and build the
weight tree with only active ones. There are machines with a lot of mostly
idle cgroups (> tens of thousands) and tree wide scanning even at low
frequency can become a pretty bad bottleneck.

Thanks.

-- 
tejun

  parent reply	other threads:[~2023-01-28  1:11 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12 16:55 [RFC v3 00/12] DRM scheduling cgroup controller Tvrtko Ursulin
2023-01-12 16:55 ` [RFC 01/12] drm: Track clients by tgid and not tid Tvrtko Ursulin
2023-01-12 16:55 ` [RFC 02/12] drm: Update file owner during use Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 03/12] cgroup: Add the DRM cgroup controller Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 04/12] drm/cgroup: Track clients per owning process Tvrtko Ursulin
2023-01-17 16:03   ` Stanislaw Gruszka
2023-01-17 16:25     ` Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 05/12] drm/cgroup: Allow safe external access to file_priv Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 06/12] drm/cgroup: Add ability to query drm cgroup GPU time Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 07/12] drm/cgroup: Add over budget signalling callback Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 08/12] drm/cgroup: Only track clients which are providing drm_cgroup_ops Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 09/12] cgroup/drm: Client exit hook Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control Tvrtko Ursulin
     [not found]   ` <20230112165609.1083270-11-tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-27 13:01     ` Michal Koutný
     [not found]       ` <20230127130134.GA15846-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
2023-01-27 13:31         ` Tvrtko Ursulin
2023-01-27 14:11           ` Michal Koutný
     [not found]             ` <20230127141136.GG3527-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
2023-01-27 15:21               ` Tvrtko Ursulin
2023-01-28  1:11     ` Tejun Heo [this message]
     [not found]       ` <Y9R2N8sl+7f8Zacv-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
2023-02-02 14:26         ` Tvrtko Ursulin
     [not found]           ` <27b7882e-1201-b173-6f56-9ececb5780e8-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-02-02 20:00             ` Tejun Heo
2023-01-12 16:56 ` [RFC 11/12] drm/i915: Wire up with drm controller GPU time query Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 12/12] drm/i915: Implement cgroup controller over budget throttling Tvrtko Ursulin
     [not found] ` <20230112165609.1083270-1-tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-23 15:42   ` [RFC v3 00/12] DRM scheduling cgroup controller Michal Koutný
     [not found]     ` <20230123154239.GA24348-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
2023-01-25 18:11       ` Tvrtko Ursulin
     [not found]         ` <371f3ce5-3468-b91d-d688-7e89499ff347-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-26 13:00           ` Michal Koutný
2023-01-26 17:04             ` Tejun Heo
     [not found]               ` <Y9KyiCPYj2Mzym3Z-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
2023-01-26 17:57                 ` Tvrtko Ursulin
2023-01-26 18:14                   ` Tvrtko Ursulin
     [not found]                   ` <b8a0872c-fe86-b174-ca3b-0fc04a98e224-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-27 10:04                     ` Michal Koutný
2023-01-27 11:40                       ` Tvrtko Ursulin
2023-01-27 13:00                         ` Michal Koutný

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9R2N8sl+7f8Zacv@slm.duckdns.org \
    --to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
    --cc=Intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    --cc=Kenny.Ho-5C7GfCeVMHo@public.gmane.org \
    --cc=airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=brian.welty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=christian.koenig-5C7GfCeVMHo@public.gmane.org \
    --cc=daniel.vetter-/w4YWyX8dFk@public.gmane.org \
    --cc=dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org \
    --cc=marcheu-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=robdclark-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=tjmercier-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org \
    --cc=tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox