public inbox for cgroups@vger.kernel.org
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Maarten Lankhorst
	<maarten.lankhorst-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	intel-xe-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	David Airlie <airlied-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org>,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	Maxime Ripard <mripard-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Thomas Zimmermann <tzimmermann-l3A5Bk7waGM@public.gmane.org>,
	Tvrtko Ursulin
	<tvrtko.ursulin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC PATCH 0/4]  Add support for DRM cgroup memory accounting.
Date: Fri, 5 May 2023 09:50:59 -1000
Message-ID: <ZFVeI2DKQXddKDNl@slm.duckdns.org>
In-Reply-To: <20230503083500.645848-1-maarten.lankhorst-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>

Hello,

On Wed, May 03, 2023 at 10:34:56AM +0200, Maarten Lankhorst wrote:
> RFC as I'm looking for comments.
> 
> For long running compute, it can be beneficial to partition the GPU memory
> between cgroups, so each cgroup can use its maximum amount of memory without
> interfering with other scheduled jobs. Done properly, this can alleviate the
> need for eviction, which might result in a job being terminated if the GPU
> doesn't support mid-thread preemption or recoverable page faults.
> 
> This is done by adding a bunch of knobs to cgroup:
> drm.capacity: Shows maximum capacity of each resource region.
> drm.max: Display or limit max amount of memory.
> drm.current: Current amount of memory in use.
> 
> TTM has not been made cgroup aware yet, so instead of evicting from
> the current cgroup to stay within the cgroup limits, it simply returns
> the error -ENOSPC to userspace.
> 
> I've used Tvrtko's cgroup controller series as a base, but it implemented
> scheduling weight, not memory accounting, so I only ended up keeping the
> base patch.
> 
> Xe is not upstream yet, so the driver specific patch will only apply on
> https://gitlab.freedesktop.org/drm/xe/kernel

Some high-level feedback:

* There have been multiple attempts at this but the track record is kinda
  poor. People don't seem to agree on what should constitute DRM memory and
  how it should be accounted / controlled.

* I like Tvrtko's scheduling patchset because it exposes a generic interface
  which makes sense regardless of hardware details and then each driver can
  implement the configured control in whatever way they can. However, even
  for that, there doesn't seem to be much buy-in from other drivers.

* This proposal seems narrowly scoped, trying to solve a specific problem
  which may not translate to different hardware configurations. Please let
  me know if I got that wrong, but if that's the case, I think a better and
  easier approach might be to just make this a part of the misc controller.
  That doesn't require much extra code and should be able to provide
  everything necessary for statically limiting specific resources.
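
For readers unfamiliar with what "statically limiting" a resource amounts
to, here is a minimal user-space sketch in Python. It is not kernel code and
not the misc controller's implementation; the class and method names (Group,
try_charge, uncharge) are made up for illustration. It only models the idea:
a charge succeeds if the group and every ancestor stay within their
configured maximum, and is refused otherwise, which is the point where the
RFC returns -ENOSPC to userspace.

```python
class Group:
    """Hypothetical model of one cgroup with a static limit on one resource."""

    def __init__(self, name, parent=None, limit=None):
        self.name = name
        self.parent = parent
        self.limit = limit    # None models "max", i.e. no limit configured
        self.current = 0      # amount currently charged to this subtree

    def _self_and_ancestors(self):
        node = self
        while node is not None:
            yield node
            node = node.parent

    def try_charge(self, amount):
        """Charge `amount` here and in all ancestors, or change nothing."""
        # First pass: check every level before touching any counter.
        for node in self._self_and_ancestors():
            if node.limit is not None and node.current + amount > node.limit:
                return False  # the caller would see something like -ENOSPC
        # Second pass: commit the charge at every level.
        for node in self._self_and_ancestors():
            node.current += amount
        return True

    def uncharge(self, amount):
        """Release a previously successful charge at every level."""
        for node in self._self_and_ancestors():
            node.current -= amount


# The root limit plays the role of the device capacity (units are arbitrary).
root = Group("root", limit=1024)
job_a = Group("job_a", parent=root, limit=512)
job_b = Group("job_b", parent=root, limit=768)

print(job_a.try_charge(512))  # fits within job_a's own limit
print(job_a.try_charge(1))    # refused: job_a is at its limit
print(job_b.try_charge(512))  # fits, root is now full
print(job_b.try_charge(1))    # refused by the root limit, not by job_b's
```

Note that nothing here evicts or reclaims. A refused charge simply fails,
which matches the behaviour described in the cover letter for the case where
TTM is not cgroup aware.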

Thanks.

-- 
tejun

