public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
To: "Tejun Heo" <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org>
Cc: Intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"Johannes Weiner"
	<hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	"Zefan Li" <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
	"Dave Airlie" <airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"Daniel Vetter" <daniel.vetter-/w4YWyX8dFk@public.gmane.org>,
	"Rob Clark" <robdclark-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	"Stéphane Marchesin"
	<marcheu-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	"T . J . Mercier"
	<tjmercier-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Kenny.Ho-5C7GfCeVMHo@public.gmane.org,
	"Christian König" <christian.koenig-5C7GfCeVMHo@public.gmane.org>,
	"Brian Welty"
	<brian.welty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"Tvrtko Ursulin"
	<tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC v3 00/12] DRM scheduling cgroup controller
Date: Thu, 26 Jan 2023 17:57:24 +0000	[thread overview]
Message-ID: <b8a0872c-fe86-b174-ca3b-0fc04a98e224@linux.intel.com> (raw)
In-Reply-To: <Y9KyiCPYj2Mzym3Z-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>


Hi,

(Two replies in one, hope you will manage to navigate it.)

On 26/01/2023 17:04, Tejun Heo wrote:
> Hello,
> 
> On Thu, Jan 26, 2023 at 02:00:50PM +0100, Michal Koutný wrote:
>> On Wed, Jan 25, 2023 at 06:11:35PM +0000, Tvrtko Ursulin <tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
>>> I don't immediately see how you envisage the half-userspace implementation
>>> would look like in terms of what functionality/new APIs would be provided by
>>> the kernel?
>>
>> Output:
>> 	drm.stat (with consumed time(s))
>>
>> Input:
>> 	drm.throttle (alternatives)
>> 	- a) writing 0,1 (in rough analogy to your proposed
>> 	     notifications)
>> 	- b) writing duration (in loose analogy to memory.reclaim)
>> 	     - for how long GPU work should be backed off
>>
>> An userspace agent sitting between these two and it'd do the measurement
>> and calculation depending on given policies (weighting, throttling) and
>> apply respective controls.

Right, I wouldn't recommend drm.throttle as ABI since my idea is to 
enable drivers to do as good job as they individually can. Eg. some may 
be able to be much smarter than simple throttling, or some may start of 
simpler and later gain a better implementation. Some may even have 
differing capability or granularity depending on the GPU model they are 
driving, like in the case of i915.

So even if the RFC shows just a simple i915 implementation, the 
controller itself shouldn't prevent a smarter approach (via exposed 
ABI). And neither this simple i915 implementation works equally well for 
all supported GPU generations! This will be a theme common for many DRM 
drivers.

Secondly, doing this in userspace would require the ability to get some 
sort of an atomic snapshot of the whole tree hierarchy to account for 
changes in layout of the tree and task migrations. Or some retry logic 
with some added ABI fields to enable it.

Even then I think the only thing we would be able to move to userspace 
is the tree-walking logic and that sounds like not that much kernel code 
saved to trade for increased inefficiency.

>> (In resemblance of e.g. https://denji.github.io/cpulimit/)
> 
> Yeah, things like this can be done from userspace but if we're gonna build
> the infrastructure to allow that in gpu drivers and so on, I don't see why
> we wouldn't add a generic in-kernel control layer if we can implement a
> proper weight based control. We can of course also expose .max style
> interface to allow userspace to do whatever they wanna do with it.

Yes agreed, and to re-stress out, the ABI as proposed does not preclude 
changing from scanning to charging or whatever. The idea was for it to 
be compatible in concept with the CPU controller and also avoid baking 
in the controlling method to individual drivers.

>>> Problem there is to find a suitable point to charge at. If for a moment we
>>> limit the discussion to i915, out of the box we could having charging
>>> happening at several thousand times per second to effectively never. This is
>>> to illustrate the GPU context execution dynamics which range from many small
>>> packets of work to multi-minute, or longer. For the latter to be accounted
>>> for we'd still need some periodic scanning, which would then perhaps go per
>>> driver. For the former we'd have thousands of needless updates per second.
>>>
>>> Hence my thinking was to pay both the cost of accounting and collecting the
>>> usage data once per actionable event, where the latter is controlled by some
>>> reasonable scanning period/frequency.
>>>
>>> In addition to that, a few DRM drivers already support GPU usage querying
>>> via fdinfo, so that being externally triggered, it is next to trivial to
>>> wire all those DRM drivers into such common DRM cgroup controller framework.
>>> All that every driver needs to implement on top is the "over budget"
>>> callback.
>>
>> I'd also like show comparison with CPU accounting and controller.
>> There is tick-based (~sampling) measurement of various components of CPU
>> time (task_group_account_field()). But the actual schedulling (weights)
>> or throttling is based on precise accounting (update_curr()).
>>
>> So, if the goal is to have precise and guaranteed limits, it shouldn't
>> (cannot) be based on sampling. OTOH, if it must be sampling based due to
>> variability of the device landscape, it could be advisory mechanism with
>> the userspace component.

I don't think precise and guaranteed limits are feasible given the 
heterogeneous nature of DRM driver capabilities, but I also don't think 
sticking an userspace component in the middle is the way to go.

> As for the specific control mechanism, yeah, charge based interface would be
> more conventional and my suspicion is that transposing the current
> implementation that way likely isn't too difficult. It just pushes "am I
> over the limit?" decisions to the specific drivers with the core layer
> telling them how much under/over budget they are. I'm curious what other 

As I have tried to explain in my previous reply, I don't think real time 
charging is feasible. Because frequency of charging events can both be 
too high and too low. Too high that it doesn't bring value apart from 
increased processing times, where it is not useful to send out 
notification at the same rate, and too low in the sense that some sort 
of periodic query would then be needed in every driver implementation to 
enable all classes of GPU clients to be properly handled.

I just don't see any positives to the charging approach in the DRM 
landscape, but for sure see some negatives. (If we ignore the positive 
of it being a more typical approach, but then I think that is not enough 
to outweigh the negatives.)

gpu
> driver folks think about the current RFC tho. Is at least AMD on board with
> the approach?

Yes I am keenly awaiting comments from the DRM colleagues as well.

Regards,

Tvrtko

P.S. Note that Michal's idea to simplify client tracking is on my TODO 
list. If that works out some patches, the series itself actually, would 
become even simpler.

  parent reply	other threads:[~2023-01-26 17:57 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12 16:55 [RFC v3 00/12] DRM scheduling cgroup controller Tvrtko Ursulin
2023-01-12 16:55 ` [RFC 01/12] drm: Track clients by tgid and not tid Tvrtko Ursulin
2023-01-12 16:55 ` [RFC 02/12] drm: Update file owner during use Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 03/12] cgroup: Add the DRM cgroup controller Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 04/12] drm/cgroup: Track clients per owning process Tvrtko Ursulin
2023-01-17 16:03   ` Stanislaw Gruszka
2023-01-17 16:25     ` Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 05/12] drm/cgroup: Allow safe external access to file_priv Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 06/12] drm/cgroup: Add ability to query drm cgroup GPU time Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 07/12] drm/cgroup: Add over budget signalling callback Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 08/12] drm/cgroup: Only track clients which are providing drm_cgroup_ops Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 09/12] cgroup/drm: Client exit hook Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control Tvrtko Ursulin
     [not found]   ` <20230112165609.1083270-11-tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-27 13:01     ` Michal Koutný
     [not found]       ` <20230127130134.GA15846-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
2023-01-27 13:31         ` Tvrtko Ursulin
2023-01-27 14:11           ` Michal Koutný
     [not found]             ` <20230127141136.GG3527-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
2023-01-27 15:21               ` Tvrtko Ursulin
2023-01-28  1:11     ` Tejun Heo
     [not found]       ` <Y9R2N8sl+7f8Zacv-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
2023-02-02 14:26         ` Tvrtko Ursulin
     [not found]           ` <27b7882e-1201-b173-6f56-9ececb5780e8-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-02-02 20:00             ` Tejun Heo
2023-01-12 16:56 ` [RFC 11/12] drm/i915: Wire up with drm controller GPU time query Tvrtko Ursulin
2023-01-12 16:56 ` [RFC 12/12] drm/i915: Implement cgroup controller over budget throttling Tvrtko Ursulin
     [not found] ` <20230112165609.1083270-1-tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-23 15:42   ` [RFC v3 00/12] DRM scheduling cgroup controller Michal Koutný
     [not found]     ` <20230123154239.GA24348-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
2023-01-25 18:11       ` Tvrtko Ursulin
     [not found]         ` <371f3ce5-3468-b91d-d688-7e89499ff347-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-26 13:00           ` Michal Koutný
2023-01-26 17:04             ` Tejun Heo
     [not found]               ` <Y9KyiCPYj2Mzym3Z-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
2023-01-26 17:57                 ` Tvrtko Ursulin [this message]
2023-01-26 18:14                   ` Tvrtko Ursulin
     [not found]                   ` <b8a0872c-fe86-b174-ca3b-0fc04a98e224-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2023-01-27 10:04                     ` Michal Koutný
2023-01-27 11:40                       ` Tvrtko Ursulin
2023-01-27 13:00                         ` Michal Koutný

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b8a0872c-fe86-b174-ca3b-0fc04a98e224@linux.intel.com \
    --to=tvrtko.ursulin-vuqaysv1563yd54fqh9/ca@public.gmane.org \
    --cc=Intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    --cc=Kenny.Ho-5C7GfCeVMHo@public.gmane.org \
    --cc=airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=brian.welty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=christian.koenig-5C7GfCeVMHo@public.gmane.org \
    --cc=daniel.vetter-/w4YWyX8dFk@public.gmane.org \
    --cc=dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org \
    --cc=marcheu-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=mkoutny-IBi9RG/b67k@public.gmane.org \
    --cc=robdclark-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=tjmercier-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox