From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tvrtko Ursulin
Subject: Re: [PATCH 16/17] cgroup/drm: Expose memory stats
Date: Fri, 28 Jul 2023 15:15:29 +0100
Message-ID: <0d51725f-d4d3-ae75-1d14-e9816cb8d33c@linux.intel.com>
References: <20230712114605.519432-1-tvrtko.ursulin@linux.intel.com>
 <20230712114605.519432-17-tvrtko.ursulin@linux.intel.com>
 <89d7181c-6830-ca6e-0c39-caa49d14d474@linux.intel.com>
 <5d65d387-2718-06c3-ee5d-8a7da6e3ddfd@linux.intel.com>
In-Reply-To: <5d65d387-2718-06c3-ee5d-8a7da6e3ddfd@linux.intel.com>
To: Maarten Lankhorst, Tejun Heo
Cc: Intel-gfx, dri-devel, cgroups, linux-kernel, Johannes Weiner, Zefan Li,
 Dave Airlie, Daniel Vetter, Rob Clark, Stéphane Marchesin, T. J. Mercier,
 Kenny Ho, Christian König, Brian Welty, Tvrtko Ursulin, Eero Tamminen

One additional thought on one sub-topic:

On 27/07/2023 18:08, Tvrtko Ursulin wrote:

[snip]

>>>> For something like this, you would probably want it to work inside
>>>> the drm scheduler first. Presumably, this can be done by setting a
>>>> weight on each runqueue, and perhaps adding a callback to update one
>>>> for a running queue. Calculating the weights hierarchically might be
>>>> fun..
>>>
>>> It is not needed to work in the drm scheduler first. In fact, drm
>>> scheduler based drivers can plug into what I have, since it already
>>> has the notion of scheduling priorities.
>>>
>>> They would only need to implement a hook which allows the cgroup
>>> controller to query client GPU utilisation, and another to receive
>>> the over-budget signal.
>>>
>>> Amdgpu and msm AFAIK could be easy candidates because they both
>>> support per-client utilisation and priorities.
>>>
>>> Looks like I need to put all this info back into the cover letter.
>>>
>>> Also, hierarchical weights and time budgets are all already there.
>>> What could be done later is to make this all smarter and respect the
>>> time budget with more precision. That would, however, in many cases
>>> including Intel, require co-operation with the firmware. In any case
>>> it is only work in the implementation, while the cgroup control
>>> interface remains the same.
>>>
>>>> I have taken a look at how the rest of the cgroup controllers change
>>>> ownership when moved to a different cgroup, and the answer was: not
>>>> at all. If we attempt to create the scheduler controls only the
>>>> first time the fd is used, you could probably get rid of all the
>>>> tracking.
>>>
>>> Can you send a CPU file descriptor from process A to process B and
>>> have CPU usage belonging to process B show up in process A's cgroup,
>>> or vice-versa? Nope, I am not making any sense, am I? My point being
>>> it is not like-to-like; the model is different.
>>>
>>> No ownership transfer would mean that in wide deployments all GPU
>>> utilisation would be assigned to Xorg, and so there would be no point
>>> to any of this. No way to throttle a cgroup with unimportant GPU
>>> clients, for instance.
>>
>> If you just grab the current process' cgroup when a drm_sched_entity
>> is created, you don't have everything charged to X.org. No need for
>> complicated ownership tracking in drm_file. The equivalent should
>> be done in i915 as well when a context is created, as it's not using
>> the drm scheduler.
>
> Okay, so essentially nuking the concept of DRM clients belonging to one
> cgroup and instead tracking at the context level. That is an interesting
> idea. I suspect the implementation could require somewhat generalising
> the concept of an "execution context", or at least expressing it via
> the DRM cgroup controller.
>
> I can give this a spin, or at least some more detailed thought, once we
> close on a few more details regarding charging in general.

I didn't get much time to brainstorm this just yet; only one downside
randomly came to mind later - with this approach, for i915, we wouldn't
correctly attribute any GPU activity done in the receiving process
against our default contexts. Those would still be accounted to the
sending process.

How much of a problem that would be in practice remains to be
investigated, including whether it applies to other drivers too. If
there is a good amount of deployed userspace which uses the default
context, then it would be a bit messy.

Regards,

Tvrtko

*) For non-DRM and non-i915 people, the default context is a GPU
submission context implicitly created during device node open.
It always remains valid, including in the receiving process if SCM_RIGHTS is used.