public inbox for intel-gfx@lists.freedesktop.org
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Tejun Heo <tj@kernel.org>
Cc: "Rob Clark" <robdclark@chromium.org>,
	Kenny.Ho@amd.com, "Dave Airlie" <airlied@redhat.com>,
	"Stéphane Marchesin" <marcheu@chromium.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	Intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org,
	"Christian König" <christian.koenig@amd.com>,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	cgroups@vger.kernel.org, "T . J . Mercier" <tjmercier@google.com>
Subject: Re: [Intel-gfx] [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control
Date: Thu, 2 Feb 2023 14:26:06 +0000
Message-ID: <27b7882e-1201-b173-6f56-9ececb5780e8@linux.intel.com>
In-Reply-To: <Y9R2N8sl+7f8Zacv@slm.duckdns.org>


On 28/01/2023 01:11, Tejun Heo wrote:
> On Thu, Jan 12, 2023 at 04:56:07PM +0000, Tvrtko Ursulin wrote:
> ...
>> +	/*
>> +	 * 1st pass - reset working values and update hierarchical weights and
>> +	 * GPU utilisation.
>> +	 */
>> +	if (!__start_scanning(root, period_us))
>> +		goto out_retry; /*
>> +				 * Always come back later if scanner races with
>> +				 * core cgroup management. (Repeated pattern.)
>> +				 */
>> +
>> +	css_for_each_descendant_pre(node, &root->css) {
>> +		struct drm_cgroup_state *drmcs = css_to_drmcs(node);
>> +		struct cgroup_subsys_state *css;
>> +		unsigned int over_weights = 0;
>> +		u64 unused_us = 0;
>> +
>> +		if (!css_tryget_online(node))
>> +			goto out_retry;
>> +
>> +		/*
>> +		 * 2nd pass - calculate initial budgets, mark over budget
>> +		 * siblings and add up unused budget for the group.
>> +		 */
>> +		css_for_each_child(css, &drmcs->css) {
>> +			struct drm_cgroup_state *sibling = css_to_drmcs(css);
>> +
>> +			if (!css_tryget_online(css)) {
>> +				css_put(node);
>> +				goto out_retry;
>> +			}
>> +
>> +			sibling->per_s_budget_us =
>> +				DIV_ROUND_UP_ULL(drmcs->per_s_budget_us *
>> +						 sibling->weight,
>> +						 drmcs->sum_children_weights);
>> +
>> +			sibling->over = sibling->active_us >
>> +					sibling->per_s_budget_us;
>> +			if (sibling->over)
>> +				over_weights += sibling->weight;
>> +			else
>> +				unused_us += sibling->per_s_budget_us -
>> +					     sibling->active_us;
>> +
>> +			css_put(css);
>> +		}
>> +
>> +		/*
>> +		 * 3rd pass - spread unused budget according to relative weights
>> +		 * of over budget siblings.
>> +		 */
>> +		css_for_each_child(css, &drmcs->css) {
>> +			struct drm_cgroup_state *sibling = css_to_drmcs(css);
>> +
>> +			if (!css_tryget_online(css)) {
>> +				css_put(node);
>> +				goto out_retry;
>> +			}
>> +
>> +			if (sibling->over) {
>> +				u64 budget_us =
>> +					DIV_ROUND_UP_ULL(unused_us *
>> +							 sibling->weight,
>> +							 over_weights);
>> +				sibling->per_s_budget_us += budget_us;
>> +				sibling->over = sibling->active_us >
>> +						sibling->per_s_budget_us;
>> +			}
>> +
>> +			css_put(css);
>> +		}
>> +
>> +		css_put(node);
>> +	}
>> +
>> +	/*
>> +	 * 4th pass - send out over/under budget notifications.
>> +	 */
>> +	css_for_each_descendant_post(node, &root->css) {
>> +		struct drm_cgroup_state *drmcs = css_to_drmcs(node);
>> +
>> +		if (!css_tryget_online(node))
>> +			goto out_retry;
>> +
>> +		if (drmcs->over || drmcs->over_budget)
>> +			signal_drm_budget(drmcs,
>> +					  drmcs->active_us,
>> +					  drmcs->per_s_budget_us);
>> +		drmcs->over_budget = drmcs->over;
>> +
>> +		css_put(node);
>> +	}
> 
> It keeps bothering me that the distribution logic has no memory. Maybe this
> is good enough for coarse control with long cycle durations but it likely
> will get in trouble if pushed to finer grained control. State keeping
> doesn't require a lot of complexity. The only state that needs tracking is
> each cgroup's vtime and then the core should be able to tell specific
> drivers how much each cgroup is over or under fairly accurately at any given
> time.
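
To make the vtime idea concrete, here is a minimal userspace sketch of
per-cgroup virtual time as it is commonly done in weighted fair
schedulers. It is only an illustration under my own assumptions, not
code from this series: struct vt_group, VTIME_SCALE, vt_charge() and
vt_lag() are hypothetical names, and using the smallest vtime among
siblings as the reference point is just one possible choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the weight at which vtime advances 1:1 with GPU time. */
#define VTIME_SCALE 100u

/* Hypothetical per-cgroup state; vtime is the only memory that is kept. */
struct vt_group {
	uint32_t weight;
	uint64_t vtime;
};

/* Charge consumed GPU time; heavier groups accumulate vtime more slowly. */
static void vt_charge(struct vt_group *g, uint64_t active_us)
{
	assert(g->weight != 0);
	g->vtime += active_us * VTIME_SCALE / g->weight;
}

/*
 * How far a group is ahead of (positive) or behind (negative) a reference
 * vtime, for example the smallest vtime among its active siblings.
 */
static int64_t vt_lag(const struct vt_group *g, uint64_t ref_vtime)
{
	return (int64_t)(g->vtime - ref_vtime);
}
```

With two groups of weight 100 and 200 each charged 1000us, the first
ends up 500 units of vtime ahead of the second, which is the sort of
"how much over or under" figure the core could then hand to drivers.
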
> 
> That said, this isn't a blocker. What's implemented can work well enough
> with coarse enough time grain and that might be enough for the time being
> and we can get back to it later. I think Michal already mentioned it but it
> might be a good idea to track active and inactive cgroups and build the
> weight tree with only active ones. There are machines with a lot of mostly
> idle cgroups (> tens of thousands) and tree wide scanning even at low
> frequency can become a pretty bad bottleneck.

Right, that kind of scale (tens of thousands of cgroups) is the
experience I was missing, thank you. That is one more item for my TODO
list, but I have a question first.

When you say active/inactive, what are you referring to in cgroup
terms? Offline/online? My understanding was that offline is a temporary
state a css passes through while it is being destroyed.

Also, I am postponing those changes until I hear at least some
feedback from the DRM community.
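
For anyone following the arithmetic in the quoted hunk, below is a
standalone userspace sketch of the 2nd and 3rd passes for a single
group of siblings. It is a simplification under stated assumptions:
struct sibling, div_round_up() and distribute() are stand-ins I made up
for illustration, the css iteration and tryget/put handling is dropped
entirely, and the sum of child weights is recomputed locally rather
than taken from the 1st pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cgroup fields used by the scanner. */
struct sibling {
	unsigned int weight;
	uint64_t active_us;       /* GPU time used in the period */
	uint64_t per_s_budget_us; /* computed budget */
	bool over;
};

/* Userspace equivalent of the rounding-up 64-bit division. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

static void distribute(struct sibling *s, size_t n, uint64_t parent_budget_us)
{
	unsigned int sum_weights = 0, over_weights = 0;
	uint64_t unused_us = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum_weights += s[i].weight;
	assert(sum_weights != 0);

	/* 2nd pass: initial budgets, mark over budget, add up unused time. */
	for (i = 0; i < n; i++) {
		s[i].per_s_budget_us =
			div_round_up(parent_budget_us * s[i].weight,
				     sum_weights);
		s[i].over = s[i].active_us > s[i].per_s_budget_us;
		if (s[i].over)
			over_weights += s[i].weight;
		else
			unused_us += s[i].per_s_budget_us - s[i].active_us;
	}

	/* 3rd pass: spread unused time by weight among over budget siblings. */
	for (i = 0; i < n; i++) {
		if (!s[i].over)
			continue;
		s[i].per_s_budget_us +=
			div_round_up(unused_us * s[i].weight, over_weights);
		s[i].over = s[i].active_us > s[i].per_s_budget_us;
	}
}
```

As an example, with a parent budget of 1000000us and three siblings of
weight 100, 100 and 200 that used 400000us, 100000us and 300000us, the
initial budgets are 250000us, 250000us and 500000us. Only the first
sibling is over, so it receives all 350000us of unused time and ends up
within its 600000us budget. Note that nothing carries over to the next
period, which is the "no memory" property Tejun pointed out.
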

Regards,

Tvrtko
