Linux cgroups development
From: "Christian König" <christian.koenig@amd.com>
To: "Natalie Vock" <natalie.vock@gmx.de>,
	"Maarten Lankhorst" <dev@lankhorst.se>,
	"Maxime Ripard" <mripard@kernel.org>, "Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Huang Rui" <ray.huang@amd.com>,
	"Matthew Auld" <matthew.auld@intel.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>
Cc: cgroups@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 0/4] cgroup/dmem,drm/ttm: Improve protection in contended cases
Date: Mon, 15 Sep 2025 14:48:39 +0200
Message-ID: <01e2369e-1df2-446d-9f9d-59c86cc55a04@amd.com>
In-Reply-To: <20250915-dmemcg-aggressive-protect-v1-0-2f3353bfcdac@gmx.de>

On 15.09.25 14:36, Natalie Vock wrote:
> Hi all,
> 
> I've been looking into some cases where dmem protection fails to prevent
> allocations from ending up in GTT when VRAM gets scarce and apps start
> competing hard.
> 
> In short, this is because other (unprotected) applications end up
> filling VRAM before protected applications do. This causes TTM to back
> off and try allocating in GTT before anything else, and that is where
> the allocation is placed in the end. The existing eviction protection
> cannot prevent this, because no attempt at evicting is ever made
> (although you could consider the backing-off as an immediate eviction to
> GTT).

Well, depending on the GEM flags you passed from userspace, that is expected behavior.

For applications using RADV we usually give GTT|VRAM as placement, which basically tells the kernel that it shouldn't evict at all and should immediately fall back to GTT.
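
To illustrate the fallback described above, here is a minimal sketch in plain C. It is a toy model, not TTM code: the `pool` structure, the `domain` enum and `place_without_eviction()` are hypothetical names made up for this example. It only shows the idea that a full domain is skipped rather than evicted from when another placement is allowed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memory domains for this sketch; not the real TTM types. */
enum domain { DOMAIN_VRAM, DOMAIN_GTT, DOMAIN_COUNT };

struct pool {
	size_t capacity;	/* total bytes in the domain */
	size_t used;		/* bytes currently allocated */
};

/*
 * Walk the allowed placements in order and take the first domain that
 * still has room. No eviction is attempted: a full domain is skipped.
 * Returns the chosen domain, or -1 if the request fits nowhere.
 */
static int place_without_eviction(struct pool pools[DOMAIN_COUNT],
				  const enum domain *placements,
				  size_t num_placements, size_t size)
{
	assert(placements != NULL || num_placements == 0);

	for (size_t i = 0; i < num_placements; i++) {
		struct pool *p = &pools[placements[i]];

		assert(p->used <= p->capacity);
		if (p->capacity - p->used >= size) {
			p->used += size;
			return (int)placements[i];
		}
	}
	return -1;
}
```

With VRAM nearly full and both domains allowed, a request that no longer fits in VRAM lands in GTT even if the requesting application is protected, which matches the behavior reported in the cover letter.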

Regards,
Christian.

> 
> This series tries to alleviate this by adding a special case when the
> allocation is protected by cgroups: Instead of backing off immediately,
> TTM will try evicting unprotected buffers from the domain to make space
> for the protected one. This ensures that applications can actually use
> all the memory protection awarded to them by the system, without being
> prone to ping-ponging (only protected allocations can evict unprotected
> ones, never the other way around).
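
The one-way eviction rule in the paragraph above can be sketched as follows. This is a toy model in plain C under stated assumptions, not the code from the series: `buffer`, `domain_state` and `alloc_with_protection()` are hypothetical names, and protection is reduced to a single boolean instead of cgroup min/low accounting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct buffer {
	size_t size;
	bool is_protected;	/* assumed: covered by cgroup protection */
	bool resident;		/* currently placed in the contended domain */
};

struct domain_state {
	size_t capacity;
	size_t used;
};

/*
 * Try to place a new allocation of `size` bytes in the domain.
 * Only a protected request may evict, and only unprotected buffers are
 * eviction candidates, so two requests can never evict each other in
 * turn. Returns true if the allocation was placed; on false the caller
 * would fall back to another domain. Partial evictions are not rolled
 * back in this sketch.
 */
static bool alloc_with_protection(struct domain_state *d,
				  struct buffer *bufs, size_t n,
				  size_t size, bool req_protected)
{
	assert(d->used <= d->capacity);

	if (req_protected) {
		for (size_t i = 0; i < n && d->capacity - d->used < size; i++) {
			if (bufs[i].resident && !bufs[i].is_protected) {
				bufs[i].resident = false;
				d->used -= bufs[i].size;
			}
		}
	}

	if (d->capacity - d->used < size)
		return false;

	d->used += size;
	return true;
}
```

An unprotected request against a full domain fails immediately, while a protected request of the same size succeeds by displacing unprotected buffers and leaves protected ones in place.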
> 
> The first two patches just add to the dmem controller a few small
> utilities needed to implement this. The last two patches are the TTM
> implementation:
> 
> "drm/ttm: Be more aggressive..." decouples cgroup charging from resource
> allocation to allow us to hold on to the charge even if allocation fails
> on first try, and adds a path to call ttm_bo_evict_alloc when the
> charged allocation falls within min/low protection limits.
> 
> "drm/ttm: Use common ancestor..." is a more general improvement in
> correctly implementing cgroup protection semantics. With recursive
> protection rules, unused memory protection afforded to a parent node is
> transferred to children recursively, which helps protect entire
> subtrees from stealing each other's memory without needing to protect
> each cgroup individually. This doesn't apply when considering direct
> siblings inside the same subtree, so in order to not break
> prioritization between these siblings, we need to consider the
> relationship of evictor and evictee when calculating protection.
> In practice, this fixes cases where a protected cgroup cannot steal
> memory from unprotected siblings (which, in turn, leads to eviction
> failures and new allocations being placed in GTT).
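
As an illustration of the "common ancestor" idea, the sketch below finds the lowest common ancestor of two nodes in a tree with parent pointers. It is plain userspace C with hypothetical names (`cg_node`, `common_ancestor()`); it is not the `dmem_cgroup_common_ancestor` helper from patch 2, whose signature and implementation are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup node; only the parent link matters. */
struct cg_node {
	struct cg_node *parent;	/* NULL for the root */
};

/* Number of parent links between `n` and the root of its tree. */
static size_t depth(const struct cg_node *n)
{
	size_t d = 0;

	while (n->parent) {
		n = n->parent;
		d++;
	}
	return d;
}

/*
 * Return the deepest node that is an ancestor of (or equal to) both
 * `a` and `b`, or NULL if they belong to different trees.
 */
static const struct cg_node *common_ancestor(const struct cg_node *a,
					     const struct cg_node *b)
{
	assert(a != NULL && b != NULL);

	size_t da = depth(a), db = depth(b);

	/* Lift the deeper node until both are at the same depth. */
	while (da > db) {
		a = a->parent;
		da--;
	}
	while (db > da) {
		b = b->parent;
		db--;
	}

	/* Walk up in lockstep until the paths meet. */
	while (a != b) {
		a = a->parent;
		b = b->parent;
		if (!a || !b)
			return NULL;
	}
	return a;
}
```

For two direct siblings the result is their shared parent, so a protection check evaluated relative to that node would compare the siblings' own settings instead of the protection both inherit from further up the tree.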
> 
> Thanks,
> Natalie
> 
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> Natalie Vock (4):
>       cgroup/dmem: Add queries for protection values
>       cgroup/dmem: Add dmem_cgroup_common_ancestor helper
>       drm/ttm: Be more aggressive when allocating below protection limit
>       drm/ttm: Use common ancestor of evictor and evictee as limit pool
> 
>  drivers/gpu/drm/ttm/ttm_bo.c       | 79 ++++++++++++++++++++++++++++++++------
>  drivers/gpu/drm/ttm/ttm_resource.c | 48 ++++++++++++++++-------
>  include/drm/ttm/ttm_resource.h     |  6 ++-
>  include/linux/cgroup_dmem.h        | 25 ++++++++++++
>  kernel/cgroup/dmem.c               | 73 +++++++++++++++++++++++++++++++++++
>  5 files changed, 205 insertions(+), 26 deletions(-)
> ---
> base-commit: f3e82936857b3bd77b824ecd2fa7839dd99ec0c6
> change-id: 20250915-dmemcg-aggressive-protect-5cf37f717cdb
> 
> Best regards,

