[PATCH v7 0/6] Add reclaim to the dmem cgroup controller

dri-devel Archive on lore.kernel.org

* [PATCH v7 0/6] Add reclaim to the dmem cgroup controller
@ 2026-07-03 13:05 Thomas Hellström
  2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König,
	Thadeu Lima de Souza Cascardo, Alex Deucher, Rodrigo Vivi,
	dri-devel, amd-gfx, linux-kernel

When a "max" limit lower than the current usage was written, the
existing code silently failed to enforce it. This series improves
on that by synchronously reclaiming device memory to push the usage
under the new max limit. As of v2, no error is returned to user-space
if usage remains above the limit after reclaim (see patch 3).

Patch 1 fixes a pre-existing amdgpu_vram_mgr_init() error path bug.
Patch 2 introduces struct dmem_cgroup_init for extensible region
      registration.
Patch 3 implements and documents a reclaim callback interface
      for the dmem controller.
Patch 4 implements a TTM reclaim callback.
Patches 5-6 hook up the reclaim callback to the dmem cgroup-aware
      drivers xe and amdgpu.

v2:
- Remove the error propagation that was in a previous series (Maarten)
- A number of updates in patch 1. See its commit message for
  details (Maarten)

v3:
- Add patch 1 fixing a pre-existing amdgpu_vram_mgr_init() error path
  bug where drmm_cgroup_register_region() was called before
  INIT_LIST_HEAD() and gpu_buddy_init(), causing a kernel panic on
  failure. (Sashiko-bot)
- Use an rwsem to protect reclaim callback registration and region
  unregister against concurrent reclaim invocations. (Sashiko-bot)
- Fix ttm_resource_manager_set_dmem_region() storing an error pointer
  in man->cg unconditionally. (Sashiko-bot)
- Fix kernel-doc function name format for ttm_bo_evict_cgroup() and
  ttm_resource_manager_set_dmem_region().

v4:
- Rebased on drm-tip; dropped the XE_PL_STOLEN guard in the xe patch
  as stolen memory uses a separate TTM manager.

v5:
- Add patch 2 introducing struct dmem_cgroup_init to make the
  dmem_cgroup_register_region() API extensible without adding positional
  arguments in the future.
- Use nonblock=true in reset_all_resource_limits() to avoid sleeping
  inside rcu_read_lock() in dmemcs_offline(). (Sashiko-bot)
- Compare usage against the truncated limit stored in cnt.max, not the
  original u64. (Sashiko-bot)
- Use DMEM_MAX_RECLAIM_RETRIES (16) retry budget instead of 5, matching
  the memcg controller; only -ENOSPC (no progress) counts against the
  budget, other errors abort immediately.
- Handle NULL region in ttm_resource_manager_set_dmem_region() to clear
  the reclaim callback, preventing use-after-free when the manager is
  torn down while the dmem region outlives it. (Sashiko-bot)
- Return 0 on any eviction progress; reserve -ENOSPC for zero progress.
- Clear the reclaim callback in xe and amdgpu fini paths to prevent
  use-after-free after driver unbind with open DRM file descriptors.
  (Sashiko-bot)
- Register xe fini devres action before drmm_cgroup_register_region()
  so LIFO teardown runs unregister first, draining callbacks before the
  manager is destroyed. (Sashiko-bot)
- Switch amdgpu to explicit dmem_cgroup_unregister_region() at the top
  of amdgpu_vram_mgr_fini() before any manager teardown, since amdgpu's
  fini is called explicitly during driver unbind before drmm cleanup.
  (Sashiko-bot)
- Wrap the xe reclaim callback with drm_dev_enter()/drm_dev_exit() to
  prevent TTM reclaim from running after driver unbind.

v6:
- Move the ops check inside down_read() in set_resource_max(), guarded
  by region->unregistered, to close a UAF race against
  dmem_cgroup_unregister_region(). (Sashiko-bot)
- Fix dmem_cgroup_ops->reclaim docstring: -ENOSPC is retried up to
  DMEM_MAX_RECLAIM_RETRIES times, not an immediate stop. (Sashiko-bot)
- Fix mgr->cg_region never being assigned in amdgpu_vram_mgr_init(),
  causing dmem_cgroup_unregister_region() in fini to silently no-op.
  (Sashiko-bot)
- Reorder amdgpu_vram_mgr_fini() to call set_used(false) and
  evict_all() before dmem_cgroup_unregister_region(), so
  ttm_resource_free() can uncharge via man->cg during eviction; clear
  man->cg after unregister. (Sashiko-bot)

v7:
- Replace the per-region rw_semaphore with a static SRCU domain
  (dmemcg_srcu). SRCU is a better fit: it avoids per-region lock
  overhead on every reclaim call, and synchronize_srcu() at unregister
  time is a rare, shutdown-time operation. (Maarten)
- Trim in-function comments to focus on what rather than how.
- Switch amdgpu back to drmm_cgroup_register_region() with a
  drm_dev_enter()/drm_dev_exit() guard in the reclaim callback (matching
  xe), rather than manual register/unregister.  drm_dev_unplug() fires
  before vram_mgr_fini(), so drm_dev_enter() returning false prevents any
  reclaim from touching the manager during teardown.  This also fixes the
  "vram" name collision on multi-GPU systems, since
  drmm_cgroup_register_region() automatically prefixes the name with
  "drm/<pci-addr>/". (Sashiko-bot)

User-space tests are at
https://patchwork.freedesktop.org/series/163935/

Test-with: 20260428065411.4222-1-thomas.hellstrom@linux.intel.com

Thomas Hellström (6):
  drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
  cgroup/dmem: Introduce struct dmem_cgroup_init for region
    initialization
  cgroup/dmem: Add reclaim callback for lowering max below current usage
  drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem
    controller
  drm/xe: Wire up dmem cgroup reclaim for VRAM manager
  drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 38 +++++++-
 drivers/gpu/drm/drm_drv.c                    |  8 +-
 drivers/gpu/drm/ttm/ttm_bo.c                 | 95 +++++++++++++++++++-
 drivers/gpu/drm/ttm/ttm_bo_util.c            |  3 +-
 drivers/gpu/drm/ttm/ttm_resource.c           | 50 +++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c         | 53 +++++++++--
 include/drm/drm_drv.h                        |  4 +-
 include/drm/ttm/ttm_bo.h                     | 10 +++
 include/drm/ttm/ttm_resource.h               |  7 ++
 include/linux/cgroup_dmem.h                  | 38 +++++++-
 kernel/cgroup/dmem.c                         | 91 +++++++++++++++----
 12 files changed, 362 insertions(+), 37 deletions(-)

-- 
2.54.0



* [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
@ 2026-07-03 13:05 ` Thomas Hellström
  2026-07-03 13:08   ` Christian König
  2026-07-03 13:26   ` sashiko-bot
  2026-07-03 13:05 ` [PATCH v7 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization Thomas Hellström
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Sashiko-bot, Friedrich Vock,
	Maarten Lankhorst, Tejun Heo, Maxime Ripard, Christian König,
	Alex Deucher, amd-gfx, dri-devel, stable, Natalie Vock,
	Johannes Weiner, Michal Koutný, cgroups, Huang Rui,
	Matthew Brost, Matthew Auld, Maarten Lankhorst, Thomas Zimmermann,
	Simona Vetter, David Airlie, Thadeu Lima de Souza Cascardo,
	Rodrigo Vivi, linux-kernel

drmm_cgroup_register_region() is called before INIT_LIST_HEAD() and
gpu_buddy_init() in amdgpu_vram_mgr_init(). If it fails, the function
returns early and bypasses those initializations.

Since adev->mman.initialized is set to true before amdgpu_vram_mgr_init()
is called, a failure triggers amdgpu_ttm_fini(), which calls
amdgpu_vram_mgr_fini(), which then:

 - Calls list_for_each_entry_safe() on reservations_pending and
   reserved_pages, whose list_head::next pointers are zero-initialized
   (NULL). The loop does not recognize them as empty and dereferences NULL.

 - Calls gpu_buddy_fini(), which iterates free_trees[] unconditionally
   via for_each_free_tree(). Since mm->free_trees is NULL
   (never allocated), this dereferences NULL.

Both result in a kernel panic on the module load error path.

Fix by moving drmm_cgroup_register_region() to after the list and buddy
allocator are fully initialized, so the teardown path is safe to run.
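
To illustrate the first failure mode above, here is a minimal user-space
sketch. It is not kernel code: struct list_head is re-declared locally and
the helpers init_list_head(), list_is_empty() and first_step_hits_null()
are hypothetical stand-ins that only mimic what INIT_LIST_HEAD() and
list_empty() do. It shows why a zero-initialized head is not seen as an
empty list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

/* Mimics INIT_LIST_HEAD(): an empty list points back at its own head. */
static inline void init_list_head(struct list_head *head)
{
	assert(head != NULL);
	head->next = head;
	head->prev = head;
}

/* Mimics list_empty(): "empty" means next points at the head itself. */
static inline bool list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/*
 * True if an iterator starting at @head would follow a NULL pointer on its
 * first step, which is the situation described for the teardown path.
 */
static inline bool first_step_hits_null(const struct list_head *head)
{
	return head->next == NULL;
}
```

A zeroed head fails the emptiness check and would make the iterator follow
NULL, while an initialized head is recognized as empty. That is why the
list initialization has to run before anything that can fail and trigger
teardown.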

Reported-by: Sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260428073116.15687-1-thomas.hellstrom@linux.intel.com?part=4
Fixes: 2b624a2c1865 ("drm/ttm: Handle cgroup based eviction in TTM")
Cc: Friedrich Vock <friedrich.vock@gmx.de>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Cc: Tejun Heo <tj@kernel.org>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.14+
Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 2a241a5b12c4..ac3f71d77140 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -918,9 +918,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	struct ttm_resource_manager *man = &mgr->manager;
 	int err;
 
-	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
-	if (IS_ERR(man->cg))
-		return PTR_ERR(man->cg);
 	ttm_resource_manager_init(man, &adev->mman.bdev,
 				  adev->gmc.real_vram_size);
 
@@ -935,6 +932,10 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	if (err)
 		return err;
 
+	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
+	if (IS_ERR(man->cg))
+		return PTR_ERR(man->cg);
+
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
-- 
2.54.0



* [PATCH v7 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
  2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
@ 2026-07-03 13:05 ` Thomas Hellström
  2026-07-03 13:05 ` [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König,
	Thadeu Lima de Souza Cascardo, Alex Deucher, Rodrigo Vivi,
	dri-devel, amd-gfx, linux-kernel

Replace the bare u64 size argument to dmem_cgroup_register_region() and
drmm_cgroup_register_region() with a const struct dmem_cgroup_init *
pointer. The struct currently carries only the size field, but using a
struct makes the API extensible: future callers can supply additional
initialization parameters without adding more positional arguments.

Update all in-tree callers (amdgpu, xe) to use a compound-literal
initializer.
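
As a rough user-space sketch of the calling pattern (not the kernel API):
struct mock_dmem_init, mock_register_region(), register_vram() and
init_has_ops() below are hypothetical names used only for illustration.
The point is that a compound literal with designated initializers leaves
any field the caller does not name zero-initialized, so a field added to
the struct later does not require touching existing callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of an init struct; @ops stands in for a later addition. */
struct mock_dmem_init {
	uint64_t size;
	const void *ops;
};

/* Mock registration: returns the accepted size, or 0 if the input is rejected. */
static inline uint64_t mock_register_region(const struct mock_dmem_init *init)
{
	if (!init || !init->size)
		return 0;
	return init->size;
}

/* Reports whether the caller supplied the optional @ops field. */
static inline bool init_has_ops(const struct mock_dmem_init *init)
{
	assert(init != NULL);
	return init->ops != NULL;
}

/* Caller written before @ops existed: it only names .size. */
static inline uint64_t register_vram(uint64_t vram_size)
{
	return mock_register_region(&(struct mock_dmem_init){
		.size = vram_size,
	});
}
```

With positional arguments, adding a parameter would have meant editing
every call site; with the struct, register_vram() compiles unchanged and
the new field simply reads as NULL.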

v5:
- Commit introduced.

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c |  6 +++++-
 drivers/gpu/drm/drm_drv.c                    |  8 +++++---
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c         |  7 ++++++-
 include/drm/drm_drv.h                        |  4 +++-
 include/linux/cgroup_dmem.h                  | 16 +++++++++++++---
 kernel/cgroup/dmem.c                         | 10 ++++++----
 6 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index ac3f71d77140..08f05c3aed1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -23,6 +23,7 @@
  */
 
 #include <linux/dma-mapping.h>
+#include <linux/cgroup_dmem.h>
 #include <drm/ttm/ttm_range_manager.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_buddy.h>
@@ -932,7 +933,10 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	if (err)
 		return err;
 
-	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
+	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
+					      &(struct dmem_cgroup_init){
+						.size = adev->gmc.real_vram_size,
+					      });
 	if (IS_ERR(man->cg))
 		return PTR_ERR(man->cg);
 
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 1ff0bf7cba6a..3c570f9393b9 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -960,17 +960,19 @@ static void drmm_cg_unregister_region(struct drm_device *dev, void *arg)
  * drmm_cgroup_register_region - Register a region of a DRM device to cgroups
  * @dev: device for region
  * @region_name: Region name for registering
- * @size: Size of region in bytes
+ * @init: Initialization parameters for the region.
  *
  * This decreases the ref-count of @dev by one. The device is destroyed if the
  * ref-count drops to zero.
  */
-struct dmem_cgroup_region *drmm_cgroup_register_region(struct drm_device *dev, const char *region_name, u64 size)
+struct dmem_cgroup_region *
+drmm_cgroup_register_region(struct drm_device *dev, const char *region_name,
+			    const struct dmem_cgroup_init *init)
 {
 	struct dmem_cgroup_region *region;
 	int ret;
 
-	region = dmem_cgroup_register_region(size, "drm/%s/%s", dev->unique, region_name);
+	region = dmem_cgroup_register_region(init, "drm/%s/%s", dev->unique, region_name);
 	if (IS_ERR_OR_NULL(region))
 		return region;
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index b518f7dec680..308fda4248eb 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -4,6 +4,8 @@
  * Copyright (C) 2021-2022 Red Hat
  */
 
+#include <linux/cgroup_dmem.h>
+
 #include <drm/drm_managed.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_buddy.h>
@@ -303,7 +305,10 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	int err;
 
 	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
-	man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
+	man->cg = drmm_cgroup_register_region(&xe->drm, name,
+					      &(struct dmem_cgroup_init){
+						.size = size,
+					      });
 	if (IS_ERR(man->cg))
 		return PTR_ERR(man->cg);
 
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index e09559495c5b..b23830494ed4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -34,6 +34,7 @@
 
 #include <drm/drm_device.h>
 
+struct dmem_cgroup_init;
 struct dmem_cgroup_region;
 struct drm_fb_helper;
 struct drm_fb_helper_surface_size;
@@ -433,7 +434,8 @@ void *__devm_drm_dev_alloc(struct device *parent,
 
 struct dmem_cgroup_region *
 drmm_cgroup_register_region(struct drm_device *dev,
-			    const char *region_name, u64 size);
+			    const char *region_name,
+			    const struct dmem_cgroup_init *init);
 
 /**
  * devm_drm_dev_alloc - Resource managed allocation of a &drm_device instance
diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index dd4869f1d736..d9eab8a2c1ee 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -14,8 +14,18 @@ struct dmem_cgroup_pool_state;
 /* Opaque definition of a cgroup region, used internally */
 struct dmem_cgroup_region;
 
+/**
+ * struct dmem_cgroup_init - Initialization parameters for a dmem cgroup region.
+ * @size: Size of the region in bytes.
+ */
+struct dmem_cgroup_init {
+	u64 size;
+};
+
 #if IS_ENABLED(CONFIG_CGROUP_DMEM)
-struct dmem_cgroup_region *dmem_cgroup_register_region(u64 size, const char *name_fmt, ...) __printf(2,3);
+struct dmem_cgroup_region *
+dmem_cgroup_register_region(const struct dmem_cgroup_init *init,
+			    const char *name_fmt, ...) __printf(2, 3);
 void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region);
 int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
 			   struct dmem_cgroup_pool_state **ret_pool,
@@ -27,8 +37,8 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
 
 void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
 #else
-static inline __printf(2,3) struct dmem_cgroup_region *
-dmem_cgroup_register_region(u64 size, const char *name_fmt, ...)
+static inline __printf(2, 3) struct dmem_cgroup_region *
+dmem_cgroup_register_region(const struct dmem_cgroup_init *init, const char *name_fmt, ...)
 {
 	return NULL;
 }
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 6430c7ce1e03..d12c8543f3fe 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -502,7 +502,7 @@ EXPORT_SYMBOL_GPL(dmem_cgroup_unregister_region);
 
 /**
  * dmem_cgroup_register_region() - Register a regions for dev cgroup.
- * @size: Size of region to register, in bytes.
+ * @init: Initialization parameters for the region.
  * @fmt: Region parameters to register
  *
  * This function registers a node in the dmem cgroup with the
@@ -511,13 +511,15 @@ EXPORT_SYMBOL_GPL(dmem_cgroup_unregister_region);
  *
  * Return: NULL or a struct on success, PTR_ERR on failure.
  */
-struct dmem_cgroup_region *dmem_cgroup_register_region(u64 size, const char *fmt, ...)
+struct dmem_cgroup_region *
+dmem_cgroup_register_region(const struct dmem_cgroup_init *init,
+			    const char *fmt, ...)
 {
 	struct dmem_cgroup_region *ret;
 	char *region_name;
 	va_list ap;
 
-	if (!size)
+	if (!init || !init->size)
 		return NULL;
 
 	va_start(ap, fmt);
@@ -534,7 +536,7 @@ struct dmem_cgroup_region *dmem_cgroup_register_region(u64 size, const char *fmt
 
 	INIT_LIST_HEAD(&ret->pools);
 	ret->name = region_name;
-	ret->size = size;
+	ret->size = init->size;
 	kref_init(&ret->ref);
 
 	spin_lock(&dmemcg_lock);
-- 
2.54.0



* [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
  2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
  2026-07-03 13:05 ` [PATCH v7 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization Thomas Hellström
@ 2026-07-03 13:05 ` Thomas Hellström
  2026-07-03 13:24   ` sashiko-bot
  2026-07-03 13:05 ` [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König,
	Thadeu Lima de Souza Cascardo, Alex Deucher, Rodrigo Vivi,
	dri-devel, amd-gfx, linux-kernel

Add an optional reclaim callback to struct dmem_cgroup_region. When
dmem.max is set below the current usage of a cgroup pool, the new limit
is applied immediately (so that concurrent allocations are throttled
while reclaim is in progress) and then the driver is asked to evict
memory to bring usage back below the limit.

Reclaim is attempted up to a bounded number of times. No error is
returned to userspace if usage remains above the limit after reclaim,
and a pending signal will abort the reclaim loop early. This matches
the behavior of memory.max in the memory cgroup controller.

Also honor O_NONBLOCK so that if that flag is set during the
max value write, no reclaim is initiated. The idea is to avoid
charging the reclaim cost to the writer of the max value.
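
The control flow described above can be sketched in user space roughly as
follows. This is a simplified model under stated assumptions, not the
kernel implementation: struct mock_pool, mock_reclaim() and mock_set_max()
are hypothetical, the signal check and SRCU protection are omitted, and
the "driver" is a counter of evictable bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for a cgroup pool plus its backing driver state. */
struct mock_pool {
	uint64_t usage;		/* bytes currently charged */
	uint64_t max;		/* current limit */
	uint64_t evictable;	/* bytes the mock driver is still able to free */
	int calls;		/* number of reclaim invocations */
};

/* Mock reclaim: 0 if anything was freed, -ENOSPC if no progress was made. */
static inline int mock_reclaim(struct mock_pool *pool, uint64_t target)
{
	uint64_t freed = target < pool->evictable ? target : pool->evictable;

	pool->calls++;
	if (!freed)
		return -ENOSPC;
	pool->evictable -= freed;
	pool->usage -= freed;
	return 0;
}

/* Apply the limit first, then reclaim unless the writer asked for nonblock. */
static inline void mock_set_max(struct mock_pool *pool, uint64_t limit,
				bool nonblock)
{
	int retries = MOCK_MAX_RECLAIM_RETRIES;

	assert(pool != NULL);
	pool->max = limit;
	if (nonblock)
		return;

	while (pool->usage > limit) {
		int ret = mock_reclaim(pool, pool->usage - limit);

		/* Only -ENOSPC consumes the retry budget; other errors stop. */
		if (ret && (ret != -ENOSPC || !retries--))
			break;
	}
}
```

Note that the limit is stored even when reclaim cannot reach it, and that
no error is reported in that case, which mirrors the behavior the commit
message describes.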

v2:
- Write max before reclaim is attempted (Maarten)
- Let signals abort the reclaim without error (Maarten)
- If a new max value is written with the O_NONBLOCK flag,
  reclaim is not attempted (Maarten)
- Extract region from the pool parameter rather than
  passing it explicitly to set_resource_xxx().

v3:
- Use an rw_semaphore (unregister_sem) to protect reclaim callbacks
  against concurrent region unregistration: readers (reclaim) hold the
  read side; dmem_cgroup_unregister_region() takes the write side to
  drain in-flight callbacks before returning. (Sashiko-bot)

v5:
- Rebased on the introduction of struct dmem_cgroup_init.
- Use nonblock=true in reset_all_resource_limits() to avoid sleeping
  inside rcu_read_lock() in dmemcs_offline(). (Sashiko-bot)
- Compare usage against the truncated limit value stored in cnt.max,
  not the original u64. (Sashiko-bot)
- Use a DMEM_MAX_RECLAIM_RETRIES (16) retry budget instead of 5, matching
  the memcg controller's MAX_RECLAIM_RETRIES. Only -ENOSPC (no progress)
  counts against the retry budget; other errors terminate the loop
  immediately.

v6:
- Fix dmem_cgroup_ops->reclaim docstring: -ENOSPC does not stop reclaim
  immediately but is retried up to DMEM_MAX_RECLAIM_RETRIES times; only
  other negative errors terminate the loop. (Sashiko-bot)

v7:
- Replace the per-region rw_semaphore with a static SRCU domain
  (dmemcg_srcu). SRCU is a better fit than rwsem for this use: it
  avoids the per-region lock overhead on every reclaim call, and
  synchronize_srcu() at unregister time is a rare operation. (Maarten)
- Trim in-function comments to focus on what rather than how.

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 include/linux/cgroup_dmem.h | 22 ++++++++++
 kernel/cgroup/dmem.c        | 81 +++++++++++++++++++++++++++++++------
 2 files changed, 91 insertions(+), 12 deletions(-)

diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index d9eab8a2c1ee..8664321fa9f7 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -14,12 +14,34 @@ struct dmem_cgroup_pool_state;
 /* Opaque definition of a cgroup region, used internally */
 struct dmem_cgroup_region;
 
+/**
+ * struct dmem_cgroup_ops - Operations for a dmem cgroup region.
+ * @reclaim: Optional callback invoked when dmem.max is set below the current
+ *           usage of a pool. The driver should attempt to free at least
+ *           @target_bytes from @pool. May be called multiple times if usage
+ *           remains above the limit after returning.
+ *
+ *           Return: 0 if some progress was made (even if less than
+ *           @target_bytes was freed), -ENOSPC if no progress could be made
+ *           (the caller will retry up to a bounded number of times), or
+ *           another negative error code if a fatal error occurred (stops
+ *           further reclaim attempts immediately).
+ */
+struct dmem_cgroup_ops {
+	int (*reclaim)(struct dmem_cgroup_pool_state *pool,
+		       u64 target_bytes, void *priv);
+};
+
 /**
  * struct dmem_cgroup_init - Initialization parameters for a dmem cgroup region.
  * @size: Size of the region in bytes.
+ * @ops: Optional operations for this region. May be NULL.
+ * @reclaim_priv: Opaque pointer passed to @ops->reclaim. May be NULL.
  */
 struct dmem_cgroup_init {
 	u64 size;
+	const struct dmem_cgroup_ops *ops;
+	void *reclaim_priv;
 };
 
 #if IS_ENABLED(CONFIG_CGROUP_DMEM)
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index d12c8543f3fe..719d28dd1078 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -17,6 +17,13 @@
 #include <linux/refcount.h>
 #include <linux/rculist.h>
 #include <linux/slab.h>
+#include <linux/srcu.h>
+
+/* Maximum reclaim attempts before giving up when lowering dmem.max. */
+#define DMEM_MAX_RECLAIM_RETRIES 16
+
+/* SRCU domain serialising reclaim callbacks against region unregistration. */
+DEFINE_STATIC_SRCU(dmemcg_srcu);
 
 struct dmem_cgroup_region {
 	/**
@@ -48,9 +55,18 @@ struct dmem_cgroup_region {
 
 	/**
 	 * @unregistered: Whether the region is unregistered by its caller.
-	 * No new pools should be added to the region afterwards.
+	 * No new pools should be added to the region afterwards, and no new
+	 * reclaim callbacks should be invoked.
 	 */
 	bool unregistered;
+
+	/**
+	 * @ops: Optional driver operations for this region.
+	 */
+	const struct dmem_cgroup_ops *ops;
+
+	/** @reclaim_priv: Private data passed to @ops->reclaim. */
+	void *reclaim_priv;
 };
 
 struct dmemcg_state {
@@ -145,21 +161,52 @@ static void free_cg_pool(struct dmem_cgroup_pool_state *pool)
 }
 
 static void
-set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val)
+set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val, bool nonblock)
 {
 	page_counter_set_min(&pool->cnt, val);
 }
 
 static void
-set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val)
+set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val, bool nonblock)
 {
 	page_counter_set_low(&pool->cnt, val);
 }
 
 static void
-set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
+set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val, bool nonblock)
 {
-	page_counter_set_max(&pool->cnt, val);
+	struct dmem_cgroup_region *region = pool->region;
+	unsigned long limit = (unsigned long)val;
+
+	/* Apply the new limit immediately so concurrent allocations are throttled. */
+	xchg(&pool->cnt.max, limit);
+
+	if (nonblock)
+		return;
+
+	int srcu_idx = srcu_read_lock(&dmemcg_srcu);
+
+	if (!READ_ONCE(region->unregistered) && region->ops && region->ops->reclaim) {
+		for (int retries = DMEM_MAX_RECLAIM_RETRIES; ; ) {
+			u64 usage = page_counter_read(&pool->cnt);
+			int ret;
+
+			if (usage <= limit)
+				break;
+
+			if (signal_pending(current))
+				break;
+
+			ret = region->ops->reclaim(pool, usage - limit, region->reclaim_priv);
+
+			/* -ENOSPC means no progress; other errors are fatal. */
+			if (ret && (ret != -ENOSPC || !retries--))
+				break;
+
+			cond_resched();
+		}
+	}
+	srcu_read_unlock(&dmemcg_srcu, srcu_idx);
 }
 
 static u64 get_resource_low(struct dmem_cgroup_pool_state *pool)
@@ -189,9 +236,10 @@ static u64 get_resource_peak(struct dmem_cgroup_pool_state *pool)
 
 static void reset_all_resource_limits(struct dmem_cgroup_pool_state *rpool)
 {
-	set_resource_min(rpool, 0);
-	set_resource_low(rpool, 0);
-	set_resource_max(rpool, PAGE_COUNTER_MAX);
+	set_resource_min(rpool, 0, false);
+	set_resource_low(rpool, 0, false);
+	/* nonblock: raising to max makes reclaim a no-op; sleeping is forbidden here. */
+	set_resource_max(rpool, PAGE_COUNTER_MAX, true);
 }
 
 static void dmemcs_offline(struct cgroup_subsys_state *css)
@@ -468,7 +516,10 @@ static void dmemcg_free_region(struct kref *ref)
  * dmem_cgroup_unregister_region() - Unregister a previously registered region.
  * @region: The region to unregister.
  *
- * This function undoes dmem_cgroup_register_region.
+ * This function undoes dmem_cgroup_register_region.  It drains any
+ * in-flight reclaim callbacks before returning, so the caller may safely
+ * free the resources pointed to by the @reclaim_priv that was passed at
+ * registration time.
  */
 void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region)
 {
@@ -493,9 +544,11 @@ void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region)
 	 * no new pools should be added to the dead region
 	 * by get_cg_pool_unlocked.
 	 */
-	region->unregistered = true;
+	WRITE_ONCE(region->unregistered, true);
 	spin_unlock(&dmemcg_lock);
 
+	synchronize_srcu(&dmemcg_srcu);
+
 	kref_put(&region->ref, dmemcg_free_region);
 }
 EXPORT_SYMBOL_GPL(dmem_cgroup_unregister_region);
@@ -537,6 +590,8 @@ dmem_cgroup_register_region(const struct dmem_cgroup_init *init,
 	INIT_LIST_HEAD(&ret->pools);
 	ret->name = region_name;
 	ret->size = init->size;
+	ret->ops = init->ops;
+	ret->reclaim_priv = init->reclaim_priv;
 	kref_init(&ret->ref);
 
 	spin_lock(&dmemcg_lock);
@@ -733,9 +788,10 @@ static int dmemcg_parse_limit(char *options, u64 *new_limit)
 
 static ssize_t dmemcg_limit_write(struct kernfs_open_file *of,
 				 char *buf, size_t nbytes, loff_t off,
-				 void (*apply)(struct dmem_cgroup_pool_state *, u64))
+				 void (*apply)(struct dmem_cgroup_pool_state *, u64, bool))
 {
 	struct dmemcg_state *dmemcs = css_to_dmemcs(of_css(of));
+	bool nonblock = of->file->f_flags & O_NONBLOCK;
 	int err = 0;
 
 	while (buf && !err) {
@@ -780,7 +836,8 @@ static ssize_t dmemcg_limit_write(struct kernfs_open_file *of,
 		}
 
 		/* And commit */
-		apply(pool, new_limit);
+		apply(pool, new_limit, nonblock);
+
 		dmemcg_pool_put(pool);
 
 out_put:
-- 
2.54.0



* [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
                   ` (2 preceding siblings ...)
  2026-07-03 13:05 ` [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
@ 2026-07-03 13:05 ` Thomas Hellström
  2026-07-03 13:19   ` sashiko-bot
  2026-07-03 13:05 ` [PATCH v7 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König,
	Thadeu Lima de Souza Cascardo, Alex Deucher, Rodrigo Vivi,
	dri-devel, amd-gfx, linux-kernel

Add ttm_bo_evict_cgroup() to evict buffer objects charged to a specific
dmem cgroup pool from a resource manager's LRU until a byte target is
met.  Add ttm_resource_manager_set_dmem_region() to associate a dmem
cgroup region with a resource manager; drivers supply their own
dmem_cgroup_ops with ttm_resource_manager_dmem_reclaim as the reclaim
function and the manager pointer as reclaim_priv in the dmem_cgroup_init
to wire up TTM eviction as the reclaim callback.

The eviction context is interruptible; signals abort the operation and
propagate back through the write() syscall.

Introduce a new mode for the bo LRU walker so that sleeping locks
can be taken. This can be used when the caller doesn't hold any
previous dma_resv locks, and where it intends to hold at most
one lock at a time.

Like the rest of TTM eviction, this should sooner rather than later
be converted to full WW transactions.

v3:
- Fix ttm_resource_manager_set_dmem_region() storing an error pointer
  in man->cg unconditionally. (Sashiko-bot)
- Fix kernel-doc function name format for ttm_bo_evict_cgroup() and
  ttm_resource_manager_set_dmem_region().

v5:
- Rebased on the introduction of struct dmem_cgroup_init.
- Handle NULL region in ttm_resource_manager_set_dmem_region() to clear
  the reclaim callback, preventing use-after-free when the manager is
  torn down while the dmem region outlives it. (Sashiko-bot)
- Return 0 on any progress (even partial eviction), -ENOSPC only when
  nothing was freed; fixes callers that expected 0 on partial success.
- Document that the reclaim callback should return 0 if some progress
  was made, -ENOSPC if no progress at all, or another error for fatal
  failures.
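
The return convention described in the v5 notes can be sketched in
userspace C as follows.  reclaim_status() is a hypothetical name; it
only mirrors the mapping from the eviction result (bytes freed, or a
negative errno) to the status returned by the reclaim callback.

```c
#include <errno.h>
#include <stdint.h>

/*
 * "freed" models the s64 result of the eviction walk: a byte count on
 * success or a negative errno on fatal failure.
 */
int reclaim_status(int64_t freed)
{
	if (freed < 0)
		return (int)freed;	/* fatal error, passed through */

	/* Any progress, even partial, counts as success. */
	return freed > 0 ? 0 : -ENOSPC;
}
```

A caller can therefore tell "retry may help" (0), "nothing could be
freed" (-ENOSPC) and "give up" (any other negative value) apart without
knowing the byte count.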

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c       | 95 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/ttm/ttm_bo_util.c  |  3 +-
 drivers/gpu/drm/ttm/ttm_resource.c | 50 ++++++++++++++++
 include/drm/ttm/ttm_bo.h           | 10 ++++
 include/drm/ttm/ttm_resource.h     |  7 +++
 5 files changed, 161 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 3980f376e3ba..b2bbbb69add3 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -515,12 +515,20 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
 {
 	struct ttm_bo_evict_walk *evict_walk =
 		container_of(walk, typeof(*evict_walk), walk);
+	/* Capture size before eviction in case res is cleared. */
+	s64 bo_size = bo->base.size;
 	s64 lret;
 
 	if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
 					      evict_walk->try_low, &evict_walk->hit_low))
 		return 0;
 
+	/*
+	 * evict_walk->place is NULL in cgroup drain mode.  Drivers'
+	 * eviction_valuable() callbacks must handle a NULL place, treating it
+	 * as "any placement": the TTM base implementation already does so via
+	 * ttm_resource_intersects().
+	 */
 	if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
 		return 0;
 
@@ -536,11 +544,15 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
 		goto out;
 
 	evict_walk->evicted++;
-	if (evict_walk->res)
+	if (evict_walk->res) {
 		lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
 					  evict_walk->res, NULL);
-	if (lret == 0)
-		return 1;
+		if (lret == 0)
+			return 1;
+	} else {
+		/* Cgroup drain: return bytes freed for byte-denominated progress. */
+		return bo_size;
+	}
 out:
 	/* Errors that should terminate the walk. */
 	if (lret == -ENOSPC)
@@ -614,6 +626,83 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
 	return 0;
 }
 
+/**
+ * ttm_bo_evict_cgroup() - Evict buffer objects charged to a specific cgroup.
+ * @bdev: The TTM device.
+ * @man: The resource manager whose LRU to walk.
+ * @limit_pool: The cgroup pool state whose members should be evicted.
+ * @target_bytes: Number of bytes to free.
+ * @ctx: The TTM operation context.
+ *
+ * Walk the LRU of @man and evict buffer objects that are charged to the
+ * cgroup identified by @limit_pool, until at least @target_bytes have been
+ * freed.  Mirrors the two-pass (trylock -> sleeping-lock, low-watermark)
+ * strategy used by ttm_bo_evict_alloc().
+ *
+ * Return: >= @target_bytes on full success, 0..target_bytes-1 if partial,
+ *         negative error code on fatal error.
+ */
+s64 ttm_bo_evict_cgroup(struct ttm_device *bdev,
+			struct ttm_resource_manager *man,
+			struct dmem_cgroup_pool_state *limit_pool,
+			s64 target_bytes,
+			struct ttm_operation_ctx *ctx)
+{
+	struct ttm_bo_evict_walk evict_walk = {
+		.walk = {
+			.ops = &ttm_evict_walk_ops,
+			.arg = { .ctx = ctx },
+		},
+		.limit_pool = limit_pool,
+		/* place, evictor, res left NULL: selects cgroup drain mode */
+	};
+	s64 lret, pass;
+
+	evict_walk.walk.arg.trylock_only = true;
+	lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, target_bytes);
+	if (lret < 0 || lret >= target_bytes)
+		return lret;
+
+	/* Second pass: also evict BOs at the low watermark. */
+	if (evict_walk.hit_low) {
+		evict_walk.try_low = true;
+		pass = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man,
+					      target_bytes - lret);
+		if (pass < 0)
+			return pass;
+		lret += pass;
+		if (lret >= target_bytes)
+			return lret;
+	}
+
+	/* Full sleeping-lock pass for remaining target. */
+	evict_walk.try_low = evict_walk.hit_low = false;
+	evict_walk.walk.arg.trylock_only = false;
+
+retry:
+	evict_walk.walk.arg.sleeping_lock = true;
+	do {
+		evict_walk.evicted = 0;
+		pass = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man,
+					      target_bytes - lret);
+		if (pass < 0) {
+			lret = pass;
+			goto out;
+		}
+		lret += pass;
+	} while (lret < target_bytes && evict_walk.evicted);
+
+	/* One more attempt if we hit the low limit during sleeping-lock pass. */
+	if (lret < target_bytes && evict_walk.hit_low && !evict_walk.try_low) {
+		evict_walk.try_low = true;
+		goto retry;
+	}
+
+out:
+	return lret;
+}
+EXPORT_SYMBOL(ttm_bo_evict_cgroup);
+
 /**
  * ttm_bo_pin - Pin the buffer object.
  * @bo: The buffer object to pin
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 3e3c201a0222..bd0b23ac2cc4 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -999,7 +999,8 @@ __ttm_bo_lru_cursor_next(struct ttm_bo_lru_cursor *curs)
 		bo = res->bo;
 		if (ttm_lru_walk_trylock(curs, bo))
 			bo_locked = true;
-		else if (!arg->ticket || arg->ctx->no_wait_gpu || arg->trylock_only)
+		else if ((!arg->ticket && !arg->sleeping_lock) || arg->ctx->no_wait_gpu ||
+			 arg->trylock_only)
 			continue;
 
 		if (!ttm_bo_get_unless_zero(bo)) {
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 154d6739256f..ad00723e99ef 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -953,3 +953,53 @@ void ttm_resource_manager_create_debugfs(struct ttm_resource_manager *man,
 #endif
 }
 EXPORT_SYMBOL(ttm_resource_manager_create_debugfs);
+
+/**
+ * ttm_resource_manager_dmem_reclaim() - dmem cgroup reclaim callback for TTM
+ *                                       resource managers.
+ * @pool: The dmem cgroup pool state for the cgroup being reclaimed.
+ * @target_bytes: Number of bytes to try to free.
+ * @priv: The &ttm_resource_manager pointer, passed as @init.reclaim_priv to
+ *        dmem_cgroup_register_region().
+ *
+ * Drivers should use this as the @reclaim member of their own
+ * &struct dmem_cgroup_ops, with the &ttm_resource_manager pointer as
+ * @init.reclaim_priv.
+ *
+ * Return: 0 if some memory was freed, -ENOSPC if nothing was freed, or
+ *         another negative error code on fatal failure.
+ */
+int ttm_resource_manager_dmem_reclaim(struct dmem_cgroup_pool_state *pool,
+				      u64 target_bytes, void *priv)
+{
+	struct ttm_resource_manager *man = priv;
+	struct ttm_operation_ctx ctx = { .interruptible = true };
+	s64 freed;
+
+	freed = ttm_bo_evict_cgroup(man->bdev, man, pool, target_bytes, &ctx);
+	if (freed < 0)
+		return freed;
+
+	return freed > 0 ? 0 : -ENOSPC;
+}
+EXPORT_SYMBOL(ttm_resource_manager_dmem_reclaim);
+
+/**
+ * ttm_resource_manager_set_dmem_region() - Associate a dmem cgroup region with a
+ *                                        resource manager.
+ * @man: The resource manager.
+ * @region: The dmem cgroup region to associate, may be NULL or IS_ERR().
+ *
+ * When @region is valid, stores it in @man->cg so that TTM can look up the
+ * associated pool during charging and eviction-target selection.
+ * The reclaim callback must be wired up using ttm_resource_manager_dmem_reclaim()
+ * in the driver's own &struct dmem_cgroup_ops, with the manager pointer as
+ * @init.reclaim_priv.
+ */
+void ttm_resource_manager_set_dmem_region(struct ttm_resource_manager *man,
+					  struct dmem_cgroup_region *region)
+{
+	if (!IS_ERR_OR_NULL(region))
+		man->cg = region;
+}
+EXPORT_SYMBOL(ttm_resource_manager_set_dmem_region);
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 8310bc3d55f9..32791c4db2a9 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -226,6 +226,11 @@ struct ttm_lru_walk_arg {
 	struct ww_acquire_ctx *ticket;
 	/** @trylock_only: Only use trylock for locking. */
 	bool trylock_only;
+	/**
+	 * @sleeping_lock: Use sleeping locks even with %NULL @ticket.
+	 * @trylock_only has precedence over this field.
+	 */
+	bool sleeping_lock;
 };
 
 /**
@@ -431,6 +436,11 @@ void ttm_bo_unpin(struct ttm_buffer_object *bo);
 int ttm_bo_evict_first(struct ttm_device *bdev,
 		       struct ttm_resource_manager *man,
 		       struct ttm_operation_ctx *ctx);
+s64 ttm_bo_evict_cgroup(struct ttm_device *bdev,
+			struct ttm_resource_manager *man,
+			struct dmem_cgroup_pool_state *limit_pool,
+			s64 target_bytes,
+			struct ttm_operation_ctx *ctx);
 int ttm_bo_access(struct ttm_buffer_object *bo, unsigned long offset,
 		  void *buf, int len, int write);
 vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index a5d386583fb6..32e485fdce9a 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -39,6 +39,7 @@
 
 struct dentry;
 struct dmem_cgroup_device;
+struct dmem_cgroup_region;
 struct drm_printer;
 struct ttm_device;
 struct ttm_resource_manager;
@@ -477,6 +478,12 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,
 			       struct ttm_device *bdev,
 			       uint64_t size);
 
+void ttm_resource_manager_set_dmem_region(struct ttm_resource_manager *man,
+					  struct dmem_cgroup_region *region);
+
+int ttm_resource_manager_dmem_reclaim(struct dmem_cgroup_pool_state *pool,
+				      u64 target_bytes, void *priv);
+
 int ttm_resource_manager_evict_all(struct ttm_device *bdev,
 				   struct ttm_resource_manager *man);
 
-- 
2.54.0



* [PATCH v7 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
                   ` (3 preceding siblings ...)
  2026-07-03 13:05 ` [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
@ 2026-07-03 13:05 ` Thomas Hellström
  2026-07-03 13:05 ` [PATCH v7 6/6] drm/amdgpu: " Thomas Hellström
  2026-07-03 14:37 ` [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thadeu Lima de Souza Cascardo
  6 siblings, 0 replies; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König,
	Thadeu Lima de Souza Cascardo, Alex Deucher, Rodrigo Vivi,
	dri-devel, amd-gfx, linux-kernel

Register the VRAM manager with the dmem cgroup reclaim infrastructure
so that lowering dmem.max below current VRAM usage triggers TTM
eviction rather than failing with -EBUSY.

v4:
- Rebased on drm-tip; dropped the XE_PL_STOLEN guard as stolen memory
  uses a separate TTM manager and never calls __xe_ttm_vram_mgr_init().

v5:
- Rebased on the introduction of struct dmem_cgroup_init.
- Register the fini drmm action before drmm_cgroup_register_region() so
  that devres LIFO teardown runs unregister_region() first (draining any
  in-flight reclaim callbacks via the rwsem) and xe_ttm_vram_mgr_fini()
  second, ensuring the manager is never accessed by a reclaim callback
  after teardown. (Sashiko-bot)
- Wrap the reclaim callback in xe_ttm_vram_mgr_dmem_reclaim() using
  drm_dev_enter()/drm_dev_exit() to prevent TTM reclaim from running
  after driver unbind.
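
The LIFO ordering argument above can be illustrated with a small
userspace model.  Nothing here is the drmm API: struct action_list,
add_action() and release_all() are hypothetical stand-ins for a managed
release list, and the two callbacks only record their names.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ACTIONS 4

/* Hypothetical miniature of a managed-release list. */
struct action_list {
	void (*fn[MAX_ACTIONS])(char *log);
	size_t count;
};

static void log_step(char *log, const char *step)
{
	if (log[0])
		strcat(log, ",");
	strcat(log, step);
}

static void mgr_fini(char *log) { log_step(log, "mgr_fini"); }
static void unregister_region(char *log) { log_step(log, "unregister_region"); }

static int add_action(struct action_list *list, void (*fn)(char *log))
{
	if (list->count == MAX_ACTIONS)
		return -1;
	list->fn[list->count++] = fn;
	return 0;
}

/* Run the actions in reverse registration order (last in, first out). */
static void release_all(struct action_list *list, char *log)
{
	while (list->count)
		list->fn[--list->count](log);
}

/*
 * Register in the order used by the patch (fini first, region second)
 * and report the resulting teardown order in "log".
 */
const char *teardown_order(char *log, size_t len)
{
	struct action_list list = { .count = 0 };

	if (len < 64)
		return NULL;
	log[0] = '\0';
	add_action(&list, mgr_fini);
	add_action(&list, unregister_region);
	release_all(&list, log);
	return log;
}
```

Because the region is registered last, its unregister step runs first,
so the manager is still intact while in-flight reclaim is drained.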

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 54 +++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 308fda4248eb..b2500344cd57 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -276,6 +276,28 @@ static const struct ttm_resource_manager_func xe_ttm_vram_mgr_func = {
 	.debug	= xe_ttm_vram_mgr_debug
 };
 
+static const struct dmem_cgroup_ops xe_ttm_vram_mgr_dmem_ops;
+
+static int xe_ttm_vram_mgr_dmem_reclaim(struct dmem_cgroup_pool_state *pool,
+					 u64 target_bytes, void *priv)
+{
+	struct ttm_resource_manager *man = priv;
+	struct xe_device *xe = ttm_to_xe_device(man->bdev);
+	int ret, idx;
+
+	if (!drm_dev_enter(&xe->drm, &idx))
+		return -ENODEV;
+
+	ret = ttm_resource_manager_dmem_reclaim(pool, target_bytes, priv);
+
+	drm_dev_exit(idx);
+	return ret;
+}
+
+static const struct dmem_cgroup_ops xe_ttm_vram_mgr_dmem_ops = {
+	.reclaim = xe_ttm_vram_mgr_dmem_reclaim,
+};
+
 static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 {
 	struct xe_device *xe = to_xe_device(dev);
@@ -301,17 +323,10 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 			   u64 default_page_size)
 {
 	struct ttm_resource_manager *man = &mgr->manager;
+	struct dmem_cgroup_region *cg;
 	const char *name;
 	int err;
 
-	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
-	man->cg = drmm_cgroup_register_region(&xe->drm, name,
-					      &(struct dmem_cgroup_init){
-						.size = size,
-					      });
-	if (IS_ERR(man->cg))
-		return PTR_ERR(man->cg);
-
 	man->func = &xe_ttm_vram_mgr_func;
 	mgr->mem_type = mem_type;
 	err = drmm_mutex_init(&xe->drm, &mgr->lock);
@@ -330,7 +345,28 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	ttm_set_driver_manager(&xe->ttm, mem_type, &mgr->manager);
 	ttm_resource_manager_set_used(&mgr->manager, true);
 
-	return drmm_add_action_or_reset(&xe->drm, xe_ttm_vram_mgr_fini, mgr);
+	/*
+	 * Register the fini action before the cgroup region so that devres
+	 * LIFO teardown runs unregister_region first (draining any in-flight
+	 * reclaim callbacks) and the manager fini second.
+	 */
+	err = drmm_add_action_or_reset(&xe->drm, xe_ttm_vram_mgr_fini, mgr);
+	if (err)
+		return err;
+
+	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
+	cg = drmm_cgroup_register_region(&xe->drm, name,
+					 &(struct dmem_cgroup_init){
+						.size = size,
+						.ops = &xe_ttm_vram_mgr_dmem_ops,
+						.reclaim_priv = man,
+					 });
+	if (IS_ERR(cg))
+		return PTR_ERR(cg);
+
+	ttm_resource_manager_set_dmem_region(man, cg);
+
+	return 0;
 }
 
 /**
-- 
2.54.0



* [PATCH v7 6/6] drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
                   ` (4 preceding siblings ...)
  2026-07-03 13:05 ` [PATCH v7 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
@ 2026-07-03 13:05 ` Thomas Hellström
  2026-07-03 13:25   ` sashiko-bot
  2026-07-03 14:37 ` [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thadeu Lima de Souza Cascardo
  6 siblings, 1 reply; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:05 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König,
	Thadeu Lima de Souza Cascardo, Alex Deucher, Rodrigo Vivi,
	dri-devel, amd-gfx, linux-kernel

Register the VRAM manager with the dmem cgroup reclaim infrastructure
so that lowering dmem.max below current VRAM usage triggers TTM
eviction rather than failing with -EBUSY.

Guard place->flags in amdgpu_ttm_bo_eviction_valuable() against NULL,
as the TTM reclaim path passes a NULL place in cgroup drain mode.
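
The NULL guard amounts to the following pattern, shown here as a
userspace sketch.  struct place and PL_FLAG_CONTIGUOUS are hypothetical
stand-ins for struct ttm_place and TTM_PL_FLAG_CONTIGUOUS; the flag
value is chosen for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag value, not taken from the TTM headers. */
#define PL_FLAG_CONTIGUOUS (1u << 0)

struct place {
	uint32_t flags;
};

/* True only when a place was supplied and it asks for contiguous memory. */
bool wants_contiguous(const struct place *place)
{
	return place && (place->flags & PL_FLAG_CONTIGUOUS);
}
```

With a NULL place the helper returns false without dereferencing, so in
cgroup drain mode the KFD fence check alone decides whether the buffer
object is reported as not worth evicting.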

Use drmm_cgroup_register_region() so that the region is automatically
unregistered at DRM device release, after drm_dev_unplug() has already
made drm_dev_enter() return false.  The drm_dev_enter/exit guard in the
reclaim callback ensures no reclaim work touches the TTM manager after
driver unbind, closing the window between vram_mgr_fini() (called from
drm_driver.release) and the drmm cleanup that unregisters the region.
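
The effect of the guard can be modelled in userspace C as below.  This
is a deliberately simplified sketch: struct fake_dev, fake_reclaim() and
guarded_reclaim() are hypothetical, and a plain flag replaces the
SRCU-protected section that drm_dev_enter()/drm_dev_exit() provide in
the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model: "unplugged" stands for drm_dev_unplug(). */
struct fake_dev {
	bool unplugged;
	int reclaim_calls;	/* how often the manager was touched */
};

/* Stand-in for the TTM reclaim helper; only counts invocations. */
static int fake_reclaim(struct fake_dev *dev, uint64_t target_bytes)
{
	(void)target_bytes;
	dev->reclaim_calls++;
	return 0;
}

/* Refuse to touch the manager once the device has been unplugged. */
int guarded_reclaim(struct fake_dev *dev, uint64_t target_bytes)
{
	if (dev->unplugged)
		return -ENODEV;

	return fake_reclaim(dev, target_bytes);
}
```

After the unplug flag is set, every later call returns -ENODEV and the
manager is left alone, which is the property the commit message relies
on for the window before the drmm cleanup runs.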

v3:
- Rebased on fix for uninitialized list and buddy allocator on the
  drmm_cgroup_register_region() error path.

v5:
- Rebased on the introduction of struct dmem_cgroup_init.
- Clear the reclaim callback in amdgpu_vram_mgr_fini() to prevent
  use-after-free if cgroup reclaim is triggered after driver unbind
  while userspace holds an open DRM file descriptor. (Sashiko-bot)
- Switch from drmm_cgroup_register_region() to the raw
  dmem_cgroup_register_region() and store the region in
  amdgpu_vram_mgr.cg_region. Call dmem_cgroup_unregister_region()
  in amdgpu_vram_mgr_fini() after ttm_resource_manager_evict_all()
  to drain in-flight reclaim callbacks, and clear man->cg afterwards.
  This is required because amdgpu's vram manager fini is called
  explicitly during driver unbind, which may precede the DRM device
  release and thus precede any drmm-based cleanup. (Sashiko-bot)

v6:
- Fix mgr->cg_region never being assigned, so
  dmem_cgroup_unregister_region() in fini silently no-ops on NULL
  and leaks the region. (Sashiko-bot)
- Reorder fini to call set_used(false) and evict_all() before
  dmem_cgroup_unregister_region(), so ttm_resource_free() can
  uncharge via man->cg during eviction; clear man->cg after
  unregister. (Sashiko-bot)

v7:
- Move dmem_cgroup_unregister_region() before the early return on
  evict_all() failure; not doing so leaves a dangling reclaim callback
  pointing to the partially-torn-down VRAM manager, causing a
  use-after-free when the cgroup later triggers reclaim. (Sashiko-bot)
- Switch back to drmm_cgroup_register_region() with a drm_dev_enter/
  exit guard in the reclaim callback (matching xe), rather than manual
  register/unregister.  drm_dev_unplug() fires before vram_mgr_fini(),
  so drm_dev_enter() returning false prevents any reclaim from touching
  the manager during teardown.  This also fixes the "vram" name
  collision on multi-GPU systems, since drmm_cgroup_register_region()
  automatically prefixes with "drm/<pci-addr>/". (Sashiko-bot)

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 37 +++++++++++++++++---
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 025625e7e800..58bb21451826 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1507,7 +1507,7 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	dma_resv_for_each_fence(&resv_cursor, bo->base.resv,
 				DMA_RESV_USAGE_BOOKKEEP, f) {
 		if (amdkfd_fence_check_mm(f, current->mm) &&
-		    !(place->flags & TTM_PL_FLAG_CONTIGUOUS))
+		    !(place && (place->flags & TTM_PL_FLAG_CONTIGUOUS)))
 			return false;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 08f05c3aed1d..9b9d738ba794 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -906,6 +906,28 @@ static const struct ttm_resource_manager_func amdgpu_vram_mgr_func = {
 	.debug	= amdgpu_vram_mgr_debug
 };
 
+static const struct dmem_cgroup_ops amdgpu_vram_mgr_dmem_ops;
+
+static int amdgpu_vram_mgr_dmem_reclaim(struct dmem_cgroup_pool_state *pool,
+					u64 target_bytes, void *priv)
+{
+	struct ttm_resource_manager *man = priv;
+	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
+	int ret, idx;
+
+	if (!drm_dev_enter(adev_to_drm(adev), &idx))
+		return -ENODEV;
+
+	ret = ttm_resource_manager_dmem_reclaim(pool, target_bytes, priv);
+
+	drm_dev_exit(idx);
+	return ret;
+}
+
+static const struct dmem_cgroup_ops amdgpu_vram_mgr_dmem_ops = {
+	.reclaim = amdgpu_vram_mgr_dmem_reclaim,
+};
+
 /**
  * amdgpu_vram_mgr_init - init VRAM manager and DRM MM
  *
@@ -917,6 +939,7 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 {
 	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
+	struct dmem_cgroup_region *cg;
 	int err;
 
 	ttm_resource_manager_init(man, &adev->mman.bdev,
@@ -933,12 +956,16 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	if (err)
 		return err;
 
-	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
-					      &(struct dmem_cgroup_init){
+	cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
+					 &(struct dmem_cgroup_init){
 						.size = adev->gmc.real_vram_size,
-					      });
-	if (IS_ERR(man->cg))
-		return PTR_ERR(man->cg);
+						.ops = &amdgpu_vram_mgr_dmem_ops,
+						.reclaim_priv = man,
+					 });
+	if (IS_ERR(cg))
+		return PTR_ERR(cg);
+
+	ttm_resource_manager_set_dmem_region(man, cg);
 
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
-- 
2.54.0



* Re: [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
  2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
@ 2026-07-03 13:08   ` Christian König
  2026-07-03 13:11     ` Thomas Hellström
  2026-07-03 13:26   ` sashiko-bot
  1 sibling, 1 reply; 14+ messages in thread
From: Christian König @ 2026-07-03 13:08 UTC (permalink / raw)
  To: Thomas Hellström, intel-xe, Paneer Selvam, Arunpravin
  Cc: Sashiko-bot, Friedrich Vock, Maarten Lankhorst, Tejun Heo,
	Maxime Ripard, Alex Deucher, amd-gfx, dri-devel, stable,
	Natalie Vock, Johannes Weiner, Michal Koutný, cgroups,
	Huang Rui, Matthew Brost, Matthew Auld, Maarten Lankhorst,
	Thomas Zimmermann, Simona Vetter, David Airlie,
	Thadeu Lima de Souza Cascardo, Rodrigo Vivi, linux-kernel

Arun please take a look at this.

Thanks,
Christian.

On 7/3/26 15:05, Thomas Hellström wrote:
> drmm_cgroup_register_region() is called before INIT_LIST_HEAD() and
> gpu_buddy_init() in amdgpu_vram_mgr_init(). If it fails, the function
> returns early and bypasses those initializations.
> 
> Since adev->mman.initialized is set to true before amdgpu_vram_mgr_init()
> is called, a failure triggers amdgpu_ttm_fini(), which calls
> amdgpu_vram_mgr_fini(), which then:
> 
>  - Calls list_for_each_entry_safe() on reservations_pending and
>    reserved_pages, whose list_head::next pointers are zero-initialized
>    (NULL). The loop does not recognize them as empty and dereferences NULL.
> 
>  - Calls gpu_buddy_fini(), which iterates free_trees[] unconditionally
>    via for_each_free_tree(). Since mm->free_trees is NULL
>    (never allocated), this dereferences NULL.
> 
> Both result in a kernel panic on the module load error path.
> 
> Fix by moving drmm_cgroup_register_region() to after the list and buddy
> allocator are fully initialized, so the teardown path is safe to run.
> 
> Reported-by: Sashiko-bot <sashiko-bot@kernel.org>
> Closes: https://sashiko.dev/#/patchset/20260428073116.15687-1-thomas.hellstrom@linux.intel.com?part=4
> Fixes: 2b624a2c1865 ("drm/ttm: Handle cgroup based eviction in TTM")
> Cc: Friedrich Vock <friedrich.vock@gmx.de>
> Cc: Maarten Lankhorst <dev@lankhorst.se>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: <stable@vger.kernel.org> # v6.14+
> Assisted-by: GitHub_Copilot:claude-sonnet-4.6
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 2a241a5b12c4..ac3f71d77140 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -918,9 +918,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>  	struct ttm_resource_manager *man = &mgr->manager;
>  	int err;
>  
> -	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
> -	if (IS_ERR(man->cg))
> -		return PTR_ERR(man->cg);
>  	ttm_resource_manager_init(man, &adev->mman.bdev,
>  				  adev->gmc.real_vram_size);
>  
> @@ -935,6 +932,10 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>  	if (err)
>  		return err;
>  
> +	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
> +	if (IS_ERR(man->cg))
> +		return PTR_ERR(man->cg);
> +
>  	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
>  	ttm_resource_manager_set_used(man, true);
>  	return 0;



* Re: [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
  2026-07-03 13:08   ` Christian König
@ 2026-07-03 13:11     ` Thomas Hellström
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Hellström @ 2026-07-03 13:11 UTC (permalink / raw)
  To: Christian König, intel-xe, Paneer Selvam, Arunpravin
  Cc: Sashiko-bot, Friedrich Vock, Maarten Lankhorst, Tejun Heo,
	Maxime Ripard, Alex Deucher, amd-gfx, dri-devel, stable,
	Natalie Vock, Johannes Weiner, Michal Koutný, cgroups,
	Huang Rui, Matthew Brost, Matthew Auld, Maarten Lankhorst,
	Thomas Zimmermann, Simona Vetter, David Airlie,
	Thadeu Lima de Souza Cascardo, Rodrigo Vivi, linux-kernel

On Fri, 2026-07-03 at 15:08 +0200, Christian König wrote:
> Arun please take a look at this.
> 
> Thanks,
> Christian.

FWIW Sashiko claims there is yet another pre-existing bug WRT ordering
here, but since the fix wasn't needed for the rest of the series, I
focused on this one.

Thanks,
Thomas


> 
> On 7/3/26 15:05, Thomas Hellström wrote:
> > drmm_cgroup_register_region() is called before INIT_LIST_HEAD() and
> > gpu_buddy_init() in amdgpu_vram_mgr_init(). If it fails, the
> > function
> > returns early and bypasses those initializations.
> > 
> > Since adev->mman.initialized is set to true before
> > amdgpu_vram_mgr_init()
> > is called, a failure triggers amdgpu_ttm_fini(), which calls
> > amdgpu_vram_mgr_fini(), which then:
> > 
> >  - Calls list_for_each_entry_safe() on reservations_pending and
> >    reserved_pages, whose list_head::next pointers are zero-
> > initialized
> >    (NULL). The loop does not recognize them as empty and
> > dereferences NULL.
> > 
> >  - Calls gpu_buddy_fini(), which iterates free_trees[]
> > unconditionally
> >    via for_each_free_tree(). Since mm->free_trees is NULL
> >    (never allocated), this dereferences NULL.
> > 
> > Both result in a kernel panic on the module load error path.
> > 
> > Fix by moving drmm_cgroup_register_region() to after the list and
> > buddy
> > allocator are fully initialized, so the teardown path is safe to
> > run.
> > 
> > Reported-by: Sashiko-bot <sashiko-bot@kernel.org>
> > Closes:
> > https://sashiko.dev/#/patchset/20260428073116.15687-1-thomas.hellstrom@linux.intel.com?part=4
> > Fixes: 2b624a2c1865 ("drm/ttm: Handle cgroup based eviction in
> > TTM")
> > Cc: Friedrich Vock <friedrich.vock@gmx.de>
> > Cc: Maarten Lankhorst <dev@lankhorst.se>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: <stable@vger.kernel.org> # v6.14+
> > Assisted-by: GitHub_Copilot:claude-sonnet-4.6
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > index 2a241a5b12c4..ac3f71d77140 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > @@ -918,9 +918,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device
> > *adev)
> >  	struct ttm_resource_manager *man = &mgr->manager;
> >  	int err;
> >  
> > -	man->cg = drmm_cgroup_register_region(adev_to_drm(adev),
> > "vram", adev->gmc.real_vram_size);
> > -	if (IS_ERR(man->cg))
> > -		return PTR_ERR(man->cg);
> >  	ttm_resource_manager_init(man, &adev->mman.bdev,
> >  				  adev->gmc.real_vram_size);
> >  
> > @@ -935,6 +932,10 @@ int amdgpu_vram_mgr_init(struct amdgpu_device
> > *adev)
> >  	if (err)
> >  		return err;
> >  
> > +	man->cg = drmm_cgroup_register_region(adev_to_drm(adev),
> > "vram", adev->gmc.real_vram_size);
> > +	if (IS_ERR(man->cg))
> > +		return PTR_ERR(man->cg);
> > +
> >  	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM,
> > &mgr->manager);
> >  	ttm_resource_manager_set_used(man, true);
> >  	return 0;


* Re: [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller
  2026-07-03 13:05 ` [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
@ 2026-07-03 13:19   ` sashiko-bot
  0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-07-03 13:19 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: dri-devel, hannes, mkoutny, tj

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] `ttm_resource_manager_set_dmem_region()` fails to clear the cgroup region pointer when passed `NULL`, leaving a dangling pointer that can cause a use-after-free.
--

commit 02822c8c1770f6a971fca6d0ddb57472d4eea596
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller

This commit introduces functionality to evict buffer objects charged to a
specific dmem cgroup pool from a resource manager's LRU. It adds
ttm_resource_manager_set_dmem_region() to associate a dmem cgroup region
with a resource manager, and provides a reclaim callback that drivers can
wire up to TTM eviction.

> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 154d6739256f8..ad00723e99ef1 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c

[ ... ]

> +void ttm_resource_manager_set_dmem_region(struct ttm_resource_manager *man,
> +					  struct dmem_cgroup_region *region)
> +{
> +	if (!IS_ERR_OR_NULL(region))
> +		man->cg = region;
> +}
> +EXPORT_SYMBOL(ttm_resource_manager_set_dmem_region);

[Severity: High]
The v5 changelog mentions handling a NULL region here to clear the reclaim
callback and prevent use-after-free bugs. However, if region is NULL,
!IS_ERR_OR_NULL(region) evaluates to false. 

Does this skip the assignment and fail to clear man->cg, leaving a dangling
pointer if a driver tries to detach the region before teardown?

Should this code check !IS_ERR(region) instead, or unconditionally assign
man->cg = NULL when the passed region is NULL?
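
The behaviour in question can be reproduced with a userspace model of
the error-pointer helpers.  This is a sketch under assumptions:
is_err() and is_err_or_null() imitate the kernel's IS_ERR() and
IS_ERR_OR_NULL(), struct manager is a hypothetical stand-in for
struct ttm_resource_manager, and set_region_not_err() is only one of
the two alternatives raised above, not a proposed final fix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of IS_ERR() and IS_ERR_OR_NULL(). */
static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static bool is_err_or_null(const void *p)
{
	return !p || is_err(p);
}

struct manager {
	void *cg;
};

/* Same condition as the posted setter: NULL is silently ignored. */
void set_region_as_posted(struct manager *man, void *region)
{
	if (!is_err_or_null(region))
		man->cg = region;
}

/* Hypothetical variant: NULL clears, error pointers are ignored. */
void set_region_not_err(struct manager *man, void *region)
{
	if (!is_err(region))
		man->cg = region;
}
```

Calling set_region_as_posted() with NULL leaves the old pointer in
place, while set_region_not_err() stores NULL and still rejects error
pointers.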

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260703130541.2686-1-thomas.hellstrom@linux.intel.com?part=4


* Re: [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage
  2026-07-03 13:05 ` [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
@ 2026-07-03 13:24   ` sashiko-bot
  0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-07-03 13:24 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: tj, mkoutny, dri-devel, hannes

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Holding `srcu_read_lock` across a potentially unbounded reclaim loop causes `synchronize_srcu()` to hang indefinitely during region unregistration.
--

commit a6f9976e310be028f3b0c1a4f382c0d26ce5f305
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Subject: cgroup/dmem: Add reclaim callback for lowering max below current usage

This commit adds an optional reclaim callback to struct dmem_cgroup_region
that is invoked when dmem.max is lowered. It introduces an SRCU domain to
protect reclaim callbacks against concurrent region unregistrations.

> diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> index 373ad04d289d5..93bc9813903c8 100644
> --- a/kernel/cgroup/dmem.c
> +++ b/kernel/cgroup/dmem.c

[ ... ]

> @@ -145,21 +161,52 @@ static void free_cg_pool(struct dmem_cgroup_pool_state *pool)
>  }
>  
>  static void
> -set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
> +set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val, bool nonblock)
>  {
> -	page_counter_set_max(&pool->cnt, val);
> +	struct dmem_cgroup_region *region = pool->region;
> +	unsigned long limit = (unsigned long)val;
> +
> +	/* Apply the new limit immediately so concurrent allocations are throttled. */
> +	xchg(&pool->cnt.max, limit);
> +
> +	if (nonblock)
> +		return;
> +
> +	int srcu_idx = srcu_read_lock(&dmemcg_srcu);
> +
> +	if (!READ_ONCE(region->unregistered) && region->ops && region->ops->reclaim) {
> +		for (int retries = DMEM_MAX_RECLAIM_RETRIES; ; ) {
> +			u64 usage = page_counter_read(&pool->cnt);
> +			int ret;
> +
> +			if (usage <= limit)
> +				break;
> +
> +			if (signal_pending(current))
> +				break;
> +
> +			ret = region->ops->reclaim(pool, usage - limit, region->reclaim_priv);
> +
> +			/* -ENOSPC means no progress; other errors are fatal. */
> +			if (ret && (ret != -ENOSPC || !retries--))
> +				break;
> +
> +			cond_resched();
> +		}

[Severity: High]
Can this loop cause synchronize_srcu() in dmem_cgroup_unregister_region() to
hang indefinitely?

If the driver's reclaim() callback repeatedly returns 0 (indicating some
progress) but concurrent allocations keep usage above the limit, this loop
could run for a very long time.

Because srcu_read_lock() is held outside the loop, a concurrent call to
dmem_cgroup_unregister_region() would block in synchronize_srcu() waiting
for this reader to finish.

Would it be safer to scope the SRCU read-side critical section to each loop
iteration? That would let synchronize_srcu() complete between iterations, and
the loop could abort as soon as it sees region->unregistered set.
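
A user-space sketch of that per-iteration scoping, under stated assumptions:
the mock_srcu_*() helpers only count nesting depth, struct region is a toy
stand-in, and the region object is assumed to stay valid across iterations
(for example through a reference held elsewhere), which this sketch does not
model. It is not the dmem.c implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RECLAIM_RETRIES 16

/* Toy stand-ins: a counter models the SRCU read-side nesting depth. */
int srcu_depth;
int mock_srcu_read_lock(void) { return srcu_depth++; }
void mock_srcu_read_unlock(int idx) { (void)idx; srcu_depth--; }

struct region {
	bool unregistered;
	uint64_t usage;
	/* Returns 0 on progress, -ENOSPC on none, other -errno on fatal error. */
	int (*reclaim)(struct region *r, uint64_t target);
};

/* Hypothetical loop: the read-side section covers one iteration only. */
void reclaim_to_limit(struct region *r, uint64_t limit)
{
	int retries = MAX_RECLAIM_RETRIES;

	for (;;) {
		int idx = mock_srcu_read_lock();
		bool stop = r->unregistered || r->usage <= limit;
		int ret = 0;

		if (!stop)
			ret = r->reclaim(r, r->usage - limit);
		mock_srcu_read_unlock(idx);

		/* A waiting synchronize_srcu() could complete at this point. */
		if (stop || (ret && (ret != -ENOSPC || !retries--)))
			break;
	}
	assert(srcu_depth == 0);
}

/* Example callback: frees up to 10 units per call and reports progress. */
int reclaim_some(struct region *r, uint64_t target)
{
	r->usage -= target < 10 ? target : 10;
	return 0;
}

/* Example callback: models an unregister racing with the reclaim loop. */
int reclaim_then_unregister(struct region *r, uint64_t target)
{
	(void)target;
	r->unregistered = true;
	return 0;
}
```

The unregistered flag is re-read inside every read-side section, so a racing
unregister is noticed on the next iteration instead of only once before the
loop starts.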

> +	}
> +	srcu_read_unlock(&dmemcg_srcu, srcu_idx);
>  }

[ ... ]

> @@ -488,9 +539,11 @@ void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region)
>  	 * no new pools should be added to the dead region
>  	 * by get_cg_pool_unlocked.
>  	 */
> -	region->unregistered = true;
> +	WRITE_ONCE(region->unregistered, true);
>  	spin_unlock(&dmemcg_lock);
>  
> +	synchronize_srcu(&dmemcg_srcu);
> +
>  	kref_put(&region->ref, dmemcg_free_region);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260703130541.2686-1-thomas.hellstrom@linux.intel.com?part=3


* Re: [PATCH v7 6/6] drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
  2026-07-03 13:05 ` [PATCH v7 6/6] drm/amdgpu: " Thomas Hellström
@ 2026-07-03 13:25   ` sashiko-bot
  0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-07-03 13:25 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: mkoutny, dri-devel, tj, hannes

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
- [High] Use-After-Free due to concurrent cgroup reclaim during probe error teardown.
- [Medium] Memory leak of VRAM manager buddy allocator on cgroup region registration failure.
--

commit 7beb8b077a31ff08f80ed7810a3913fd3a463579
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager

Register the VRAM manager with the dmem cgroup reclaim infrastructure
so that lowering dmem.max below current VRAM usage triggers TTM
eviction rather than failing with -EBUSY.

[ ... ]

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 08f05c3aed1d5..9b9d738ba7941 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c

[ ... ]

> @@ -906,6 +906,28 @@ static const struct ttm_resource_manager_func amdgpu_vram_mgr_func = {
>  	.debug	= amdgpu_vram_mgr_debug
>  };
>  
> +static const struct dmem_cgroup_ops amdgpu_vram_mgr_dmem_ops;
> +
> +static int amdgpu_vram_mgr_dmem_reclaim(struct dmem_cgroup_pool_state *pool,
> +					u64 target_bytes, void *priv)
> +{
> +	struct ttm_resource_manager *man = priv;
> +	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
> +	int ret, idx;
> +
> +	if (!drm_dev_enter(adev_to_drm(adev), &idx))
> +		return -ENODEV;

[Severity: High]
Does this drm_dev_enter() check adequately protect against teardown during a
probe error?

If a failure occurs during amdgpu_pci_probe() (for instance in
amdgpu_driver_load_kms()), drm_dev_unplug() is never called, so
dev->unplugged remains false. The teardown path then proceeds to
drm_dev_release() which does:

drm_dev_release(dev) {
    if (dev->driver->release)
        dev->driver->release(dev);

    drm_managed_release(dev);
    ...
}

The driver release callback will destroy the VRAM manager and TTM structures.
Only after this does drm_managed_release() unregister the cgroup region. If a
cgroup memory reclaim occurs during this window, drm_dev_enter() will allow
execution to proceed, and ttm_resource_manager_dmem_reclaim() will
dereference the destroyed VRAM manager. Is there a way to prevent reclaim from
accessing the manager during probe failure teardown?

> +
> +	ret = ttm_resource_manager_dmem_reclaim(pool, target_bytes, priv);
> +
> +	drm_dev_exit(idx);
> +	return ret;
> +}

[ ... ]

> @@ -933,12 +956,16 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>  	if (err)
>  		return err;
>  
> -	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
> -					      &(struct dmem_cgroup_init){
> +	cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
> +					 &(struct dmem_cgroup_init){
>  						.size = adev->gmc.real_vram_size,
> -					      });
> -	if (IS_ERR(man->cg))
> -		return PTR_ERR(man->cg);
> +						.ops = &amdgpu_vram_mgr_dmem_ops,
> +						.reclaim_priv = man,
> +					 });
> +	if (IS_ERR(cg))
> +		return PTR_ERR(cg);

[Severity: Medium]
Does returning here leak the buddy allocator initialized earlier in this
function via gpu_buddy_init()?

If drmm_cgroup_register_region() fails, the error propagates out of
amdgpu_ttm_init(), causing the GMC IP block's sw_init to fail. During driver
teardown, amdgpu_device_ip_fini() skips calling sw_fini for IP blocks where
status.sw is false. Since amdgpu_vram_mgr_fini() is skipped, it seems the
buddy allocator memory allocated by gpu_buddy_init() will never be freed.

Should this error path call gpu_buddy_fini() before returning?
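
The usual goto-unwind shape for such an error path, as a user-space sketch.
The mock_*() helpers and struct vram_mgr are hypothetical stand-ins, not the
amdgpu or gpu_buddy API; the sketch only shows the ordering of init and
cleanup.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for the buddy allocator and the cgroup registration. */
struct vram_mgr { bool buddy_live; bool region_live; };

int mock_buddy_init(struct vram_mgr *mgr)
{
	mgr->buddy_live = true;
	return 0;
}

void mock_buddy_fini(struct vram_mgr *mgr)
{
	/* Tearing down an allocator that is not live would trip this. */
	assert(mgr->buddy_live);
	mgr->buddy_live = false;
}

int mock_register_region(struct vram_mgr *mgr, bool fail)
{
	if (fail)
		return -ENOMEM;
	mgr->region_live = true;
	return 0;
}

/* Hypothetical init: undo the buddy allocator if registration fails. */
int vram_mgr_init(struct vram_mgr *mgr, bool fail_register)
{
	int err;

	err = mock_buddy_init(mgr);
	if (err)
		return err;

	err = mock_register_region(mgr, fail_register);
	if (err)
		goto err_buddy_fini;

	return 0;

err_buddy_fini:
	mock_buddy_fini(mgr);
	return err;
}
```

Whether the real error path can do this depends on whether a later teardown
would call the fini function again; the sketch assumes it would not.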

> +
> +	ttm_resource_manager_set_dmem_region(man, cg);
>  
>  	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
>  	ttm_resource_manager_set_used(man, true);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260703130541.2686-1-thomas.hellstrom@linux.intel.com?part=6


* Re: [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
  2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
  2026-07-03 13:08   ` Christian König
@ 2026-07-03 13:26   ` sashiko-bot
  1 sibling, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-07-03 13:26 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: hannes, dri-devel, tj, mkoutny

Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:

Pre-existing issues:
- [High] The patch resolves a panic when `drmm_cgroup_register_region()` fails, but leaves an identical teardown vulnerability if `gpu_buddy_init()` fails.
--

commit 3650035d068659ae5cf730d3debae46e3ebe9589
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()

This commit fixes a kernel panic on the module load error path by moving
drmm_cgroup_register_region() after the list and buddy allocator are fully
initialized. This prevents the teardown path from dereferencing uninitialized
pointers if the cgroup registration fails.

Closes: https://sashiko.dev/#/patchset/20260428073116.15687-1-thomas.hellstrom@linux.intel.com?part=4

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c

[ ... ]

> @@ -935,6 +932,10 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>  	err = gpu_buddy_init(&mgr->mm, man->size, PAGE_SIZE);
>  	if (err)
>  		return err;

[Severity: High]
This isn't a bug introduced by this patch, but while fixing the teardown
vulnerability for the cgroup registration, does this leave an identical
teardown vulnerability if gpu_buddy_init() fails?

If gpu_buddy_init() fails, it frees the memory but leaves dangling pointers
and a non-zero mm->n_roots:

drivers/gpu/buddy.c:gpu_buddy_init() {
    ...
out_free_roots:
    while (root_count--)
        gpu_block_free(mm, mm->roots[root_count]);
    kfree(mm->roots);
out_free_tree:
    while (i--)
        kfree(mm->free_trees[i]);
    kfree(mm->free_trees);
    return -ENOMEM;
}

Since adev->mman.initialized is already set to true by the caller
(amdgpu_ttm_init()), returning this error triggers the driver's teardown path
which unconditionally calls gpu_buddy_fini():

drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c:amdgpu_vram_mgr_fini() {
    ...
    if (!adev->gmc.is_app_apu)
        gpu_buddy_fini(&mgr->mm);
    ...
}

This could lead to a use-after-free, a NULL pointer dereference, or a
double free when gpu_buddy_fini() iterates up to the non-zero mm->n_roots:

drivers/gpu/buddy.c:gpu_buddy_fini() {
    ...
    for (i = 0; i < mm->n_roots; ++i) {
        order = ilog2(size) - ilog2(mm->chunk_size);
        start = gpu_buddy_block_offset(mm->roots[i]);
    ...
}

Should gpu_buddy_init() clear mm->n_roots and zero the pointers on error,
or should amdgpu_vram_mgr_init() clear its state upon this failure?
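
A user-space sketch of the first option, where init resets its own state on
failure so that a later fini becomes a no-op. struct toy_buddy and both
functions are hypothetical simplifications, not the gpu_buddy code; the
fail_at parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy allocator state; the real allocator struct has more fields. */
struct toy_buddy {
	void **roots;
	unsigned int n_roots;
};

/* Hypothetical init: on failure, free what was built and reset the state. */
int toy_buddy_init(struct toy_buddy *mm, unsigned int n_roots,
		   unsigned int fail_at)
{
	unsigned int i;

	assert(n_roots > 0);

	mm->n_roots = n_roots;
	mm->roots = calloc(n_roots, sizeof(*mm->roots));
	if (!mm->roots)
		goto out_reset;

	for (i = 0; i < n_roots; i++) {
		/* fail_at simulates an allocation failure at that root index. */
		mm->roots[i] = (i == fail_at) ? NULL : malloc(16);
		if (!mm->roots[i])
			goto out_free_roots;
	}
	return 0;

out_free_roots:
	while (i--)
		free(mm->roots[i]);
	free(mm->roots);
out_reset:
	mm->roots = NULL;
	mm->n_roots = 0;
	return -ENOMEM;
}

/* Safe after a failed init: the loop runs zero times, free(NULL) is a no-op. */
void toy_buddy_fini(struct toy_buddy *mm)
{
	for (unsigned int i = 0; i < mm->n_roots; i++)
		free(mm->roots[i]);
	free(mm->roots);
	mm->roots = NULL;
	mm->n_roots = 0;
}
```

Resetting inside init keeps the invariant in one place, so every caller with
an unconditional fini benefits without needing its own cleanup.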

> +	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
> +	if (IS_ERR(man->cg))
> +		return PTR_ERR(man->cg);
> +
>  	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
>  	ttm_resource_manager_set_used(man, true);
>  	return 0;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260703130541.2686-1-thomas.hellstrom@linux.intel.com?part=1


* Re: [PATCH v7 0/6] Add reclaim to the dmem cgroup controller
  2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
                   ` (5 preceding siblings ...)
  2026-07-03 13:05 ` [PATCH v7 6/6] drm/amdgpu: " Thomas Hellström
@ 2026-07-03 14:37 ` Thadeu Lima de Souza Cascardo
  6 siblings, 0 replies; 14+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2026-07-03 14:37 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: intel-xe, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König, Alex Deucher,
	Rodrigo Vivi, dri-devel, amd-gfx, linux-kernel

On Fri, Jul 03, 2026 at 03:05:35PM +0200, Thomas Hellström wrote:
> When writing a "max" limit lower than the current usage, the
> existing code silently failed. This series aims to improve
> on that by returning -EBUSY on failure and also attempt
> to synchronously reclaim device memory to push the usage
> under the new max limit to avoid the error.
> 
> Patch 1 fixes a pre-existing amdgpu_vram_mgr_init() error path
> Patch 2 introduces struct dmem_cgroup_init for extensible region
>       registration.
> Patch 3 implements and documents a reclaim callback interface
>       for the dmem controller.
> Patch 4 implements a TTM reclaim callback.
> Patches 5-6 hook up the reclaim callback to the dmem cgroup-aware
>       drivers xe and amdgpu.
> 
> v2:
> - Remove the error propagation that was in a previous series (Maarten)
> - A number of updates in patch 1. See its commit message for
>   details (Maarten)
> 
> v3:
> - Add patch 1 fixing a pre-existing amdgpu_vram_mgr_init() error path
>   bug where drmm_cgroup_register_region() was called before
>   INIT_LIST_HEAD() and gpu_buddy_init(), causing a kernel panic on
>   failure. (Sashiko-bot)
> - Use an rwsem to protect reclaim callback registration and region
>   unregister against concurrent reclaim invocations. (Sashiko-bot)
> - Fix ttm_resource_manager_set_dmem_region() storing an error pointer
>   in man->cg unconditionally. (Sashiko-bot)
> - Fix kernel-doc function name format for ttm_bo_evict_cgroup() and
>   ttm_resource_manager_set_dmem_region().
> 
> v4:
> - Rebased on drm-tip; dropped the XE_PL_STOLEN guard in the xe patch
>   as stolen memory uses a separate TTM manager.
> 
> v5:
> - Add patch 2 introducing struct dmem_cgroup_init to make the
>   dmem_cgroup_register_region() API extensible without adding positional
>   arguments in the future.
> - Use nonblock=true in reset_all_resource_limits() to avoid sleeping
>   inside rcu_read_lock() in dmemcs_offline(). (Sashiko-bot)
> - Compare usage against the truncated limit stored in cnt.max, not the
>   original u64. (Sashiko-bot)
> - Use DMEM_MAX_RECLAIM_RETRIES (16) retry budget instead of 5, matching
>   the memcg controller; only -ENOSPC (no progress) counts against the
>   budget, other errors abort immediately.
> - Handle NULL region in ttm_resource_manager_set_dmem_region() to clear
>   the reclaim callback, preventing use-after-free when the manager is
>   torn down while the dmem region outlives it. (Sashiko-bot)
> - Return 0 on any eviction progress; reserve -ENOSPC for zero progress.
> - Clear the reclaim callback in xe and amdgpu fini paths to prevent
>   use-after-free after driver unbind with open DRM file descriptors.
>   (Sashiko-bot)
> - Register xe fini devres action before drmm_cgroup_register_region()
>   so LIFO teardown runs unregister first, draining callbacks before the
>   manager is destroyed. (Sashiko-bot)
> - Switch amdgpu to explicit dmem_cgroup_unregister_region() at the top
>   of amdgpu_vram_mgr_fini() before any manager teardown, since amdgpu's
>   fini is called explicitly during driver unbind before drmm cleanup.
>   (Sashiko-bot)
> - Wrap the xe reclaim callback with drm_dev_enter()/drm_dev_exit() to
>   prevent TTM reclaim from running after driver unbind.
> 
> v6:
> - Move the ops check inside down_read() in set_resource_max(), guarded
>   by region->unregistered, to close a UAF race against
>   dmem_cgroup_unregister_region(). (Sashiko-bot)
> - Fix dmem_cgroup_ops->reclaim docstring: -ENOSPC is retried up to
>   DMEM_MAX_RECLAIM_RETRIES times, not an immediate stop. (Sashiko-bot)
> - Fix mgr->cg_region never being assigned in amdgpu_vram_mgr_init(),
>   causing dmem_cgroup_unregister_region() in fini to silently no-op.
>   (Sashiko-bot)
> - Reorder amdgpu_vram_mgr_fini() to call set_used(false) and
>   evict_all() before dmem_cgroup_unregister_region(), so
>   ttm_resource_free() can uncharge via man->cg during eviction; clear
>   man->cg after unregister. (Sashiko-bot)
> 
> v7:
> - Replace the per-region rw_semaphore with a static SRCU domain
>   (dmemcg_srcu). SRCU is a better fit: it avoids per-region lock
>   overhead on every reclaim call, and synchronize_srcu() at unregister
>   time is a rare, shutdown-time operation. (Maarten)
> - Trim in-function comments to focus on what rather than how.
> - Switch back to drmm_cgroup_register_region() with a drm_dev_enter/
>   exit guard in the reclaim callback (matching xe), rather than manual
>   register/unregister.  drm_dev_unplug() fires before vram_mgr_fini(),
>   so drm_dev_enter() returning false prevents any reclaim from touching
>   the manager during teardown.  This also fixes the "vram" name
>   collision on multi-GPU systems, since drmm_cgroup_register_region()
>   automatically prefixes with "drm/<pci-addr>/". (Sashiko-bot)
> 
> User-space tests are at
> https://patchwork.freedesktop.org/series/163935/
> 
> Test-with: 20260428065411.4222-1-thomas.hellstrom@linux.intel.com
> 

I used the branch at [1] to run tests over amdgpu and they pass.

[1] https://gitlab.freedesktop.org/cascardo/igt-gpu-tools/-/commits/dmem_max?ref_type=heads

Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>

> Thomas Hellström (6):
>   drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
>   cgroup/dmem: Introduce struct dmem_cgroup_init for region
>     initialization
>   cgroup/dmem: Add reclaim callback for lowering max below current usage
>   drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem
>     controller
>   drm/xe: Wire up dmem cgroup reclaim for VRAM manager
>   drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c      |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 38 +++++++-
>  drivers/gpu/drm/drm_drv.c                    |  8 +-
>  drivers/gpu/drm/ttm/ttm_bo.c                 | 95 +++++++++++++++++++-
>  drivers/gpu/drm/ttm/ttm_bo_util.c            |  3 +-
>  drivers/gpu/drm/ttm/ttm_resource.c           | 50 +++++++++++
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c         | 53 +++++++++--
>  include/drm/drm_drv.h                        |  4 +-
>  include/drm/ttm/ttm_bo.h                     | 10 +++
>  include/drm/ttm/ttm_resource.h               |  7 ++
>  include/linux/cgroup_dmem.h                  | 38 +++++++-
>  kernel/cgroup/dmem.c                         | 91 +++++++++++++++----
>  12 files changed, 362 insertions(+), 37 deletions(-)
> 
> -- 
> 2.54.0
> 



Thread overview: 14+ messages
2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
2026-07-03 13:08   ` Christian König
2026-07-03 13:11     ` Thomas Hellström
2026-07-03 13:26   ` sashiko-bot
2026-07-03 13:05 ` [PATCH v7 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization Thomas Hellström
2026-07-03 13:05 ` [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
2026-07-03 13:24   ` sashiko-bot
2026-07-03 13:05 ` [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
2026-07-03 13:19   ` sashiko-bot
2026-07-03 13:05 ` [PATCH v7 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
2026-07-03 13:05 ` [PATCH v7 6/6] drm/amdgpu: " Thomas Hellström
2026-07-03 13:25   ` sashiko-bot
2026-07-03 14:37 ` [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thadeu Lima de Souza Cascardo
