Linux cgroups development
* [PATCH v6 0/6] Add reclaim to the dmem cgroup controller
@ 2026-06-11 17:32 Thomas Hellström
  2026-06-11 17:32 ` [PATCH v6 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Thomas Hellström @ 2026-06-11 17:32 UTC
  To: intel-xe
  Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König, Alex Deucher,
	Rodrigo Vivi, dri-devel, amd-gfx, linux-kernel

When writing a "max" limit lower than the current usage, the
existing code silently failed. This series improves on that by
attempting to synchronously reclaim device memory to push the
usage under the new max limit, and by returning -EBUSY if the
reclaim attempt fails.
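
From user space, the behaviour described above shows up as the write to
dmem.max either succeeding after reclaim or failing with EBUSY. A minimal
sketch of such a writer follows, assuming the "<region> <bytes>" line format
of the dmem.max file. The helper name dmem_write_max(), and the cgroup path
and region name in the usage note, are hypothetical and not part of this
series.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "<region> <bytes>\n" to a dmem.max file.
 * Returns 0 on success or a negative errno. With this series applied, a
 * limit that cannot be enforced is expected to surface here as -EBUSY.
 */
static int dmem_write_max(const char *path, const char *region,
			  unsigned long long bytes)
{
	char buf[256];
	int len, fd, ret = 0;

	assert(path && region);

	len = snprintf(buf, sizeof(buf), "%s %llu\n", region, bytes);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -EINVAL;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* The kernel handler reports failure through the write() result. */
	if (write(fd, buf, (size_t)len) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

A caller might use dmem_write_max("/sys/fs/cgroup/gpu-jobs/dmem.max",
"drm/0000:03:00.0/vram0", 1ULL << 30) and treat a -EBUSY return as "usage
could not be pushed under the new limit".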

Patch 1 fixes a pre-existing bug in the amdgpu_vram_mgr_init() error path.
Patch 2 introduces struct dmem_cgroup_init for extensible region
      registration.
Patch 3 implements and documents a reclaim callback interface
      for the dmem controller.
Patch 4 implements a TTM reclaim callback.
Patches 5-6 hook up the reclaim callback to the dmem cgroup-aware
      drivers xe and amdgpu.

v2:
- Remove the error propagation that was in a previous series (Maarten)
- A number of updates in patch 1. See its commit message for
  details (Maarten)

v3:
- Add patch 1 fixing a pre-existing amdgpu_vram_mgr_init() error path
  bug where drmm_cgroup_register_region() was called before
  INIT_LIST_HEAD() and gpu_buddy_init(), causing a kernel panic on
  failure. (Sashiko-bot)
- Use an rwsem to protect reclaim callback registration and region
  unregister against concurrent reclaim invocations. (Sashiko-bot)
- Fix ttm_resource_manager_set_dmem_region() storing an error pointer
  in man->cg unconditionally. (Sashiko-bot)
- Fix kernel-doc function name format for ttm_bo_evict_cgroup() and
  ttm_resource_manager_set_dmem_region().

v4:
- Rebased on drm-tip; dropped the XE_PL_STOLEN guard in the xe patch
  as stolen memory uses a separate TTM manager.

v5:
- Add patch 2 introducing struct dmem_cgroup_init to make the
  dmem_cgroup_register_region() API extensible without adding positional
  arguments in the future.
- Use nonblock=true in reset_all_resource_limits() to avoid sleeping
  inside rcu_read_lock() in dmemcs_offline(). (Sashiko-bot)
- Compare usage against the truncated limit stored in cnt.max, not the
  original u64. (Sashiko-bot)
- Use DMEM_MAX_RECLAIM_RETRIES (16) retry budget instead of 5, matching
  the memcg controller; only -ENOSPC (no progress) counts against the
  budget, other errors abort immediately.
- Handle NULL region in ttm_resource_manager_set_dmem_region() to clear
  the reclaim callback, preventing use-after-free when the manager is
  torn down while the dmem region outlives it. (Sashiko-bot)
- Return 0 on any eviction progress; reserve -ENOSPC for zero progress.
- Clear the reclaim callback in xe and amdgpu fini paths to prevent
  use-after-free after driver unbind with open DRM file descriptors.
  (Sashiko-bot)
- Register xe fini devres action before drmm_cgroup_register_region()
  so LIFO teardown runs unregister first, draining callbacks before the
  manager is destroyed. (Sashiko-bot)
- Switch amdgpu to explicit dmem_cgroup_unregister_region() at the top
  of amdgpu_vram_mgr_fini() before any manager teardown, since amdgpu's
  fini is called explicitly during driver unbind before drmm cleanup.
  (Sashiko-bot)
- Wrap the xe reclaim callback with drm_dev_enter()/drm_dev_exit() to
  prevent TTM reclaim from running after driver unbind.

v6:
- Move the ops check inside down_read() in set_resource_max(), guarded
  by region->unregistered, to close a UAF race against
  dmem_cgroup_unregister_region(). (Sashiko-bot)
- Fix dmem_cgroup_ops->reclaim docstring: -ENOSPC is retried up to
  DMEM_MAX_RECLAIM_RETRIES times, not an immediate stop. (Sashiko-bot)
- Fix mgr->cg_region never being assigned in amdgpu_vram_mgr_init(),
  causing dmem_cgroup_unregister_region() in fini to silently no-op.
  (Sashiko-bot)
- Reorder amdgpu_vram_mgr_fini() to call set_used(false) and
  evict_all() before dmem_cgroup_unregister_region(), so
  ttm_resource_free() can uncharge via man->cg during eviction; clear
  man->cg after unregister. (Sashiko-bot)
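
The retry semantics described in the v5 and v6 notes can be modelled in user
space as follows. This is a sketch under stated assumptions, not the code in
kernel/cgroup/dmem.c: struct model_pool, model_set_max() and the two sample
callbacks are hypothetical names. Applying the new limit before reclaiming,
and passing non-ENOSPC errors back to the caller unchanged, are also
assumptions. Only the retry budget of 16, the rule that just -ENOSPC counts
against it, and the -EBUSY result come from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Retry budget named in the changelog. */
#define DMEM_MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the per-cgroup state of one region. */
struct model_pool {
	uint64_t usage;	/* bytes currently charged */
	uint64_t max;	/* limit in effect */
};

/*
 * Hypothetical callback type: 0 on any eviction progress, -ENOSPC on
 * zero progress, another negative errno on a hard failure.
 */
typedef int (*model_reclaim_fn)(struct model_pool *pool, uint64_t target);

static int model_set_max(struct model_pool *pool, uint64_t new_max,
			 model_reclaim_fn reclaim)
{
	int retries = DMEM_MAX_RECLAIM_RETRIES;

	assert(pool);
	pool->max = new_max;	/* assumption: limit applies before reclaim */

	while (pool->usage > pool->max) {
		int err;

		if (!reclaim)
			return -EBUSY;	/* nothing can lower the usage */

		err = reclaim(pool, pool->usage - pool->max);
		if (err == -ENOSPC) {
			/* Only zero progress consumes the budget. */
			if (--retries == 0)
				return -EBUSY;
			continue;
		}
		if (err)
			return err;	/* other errors abort immediately */
	}
	return 0;
}

/* Sample callback: evicts at most 4096 bytes per call. */
static int model_reclaim_page(struct model_pool *pool, uint64_t target)
{
	uint64_t step = target < 4096 ? target : 4096;

	pool->usage -= step;
	return 0;
}

/* Sample callback: never makes progress; counts its invocations. */
static int model_stuck_calls;

static int model_reclaim_stuck(struct model_pool *pool, uint64_t target)
{
	(void)pool;
	(void)target;
	model_stuck_calls++;
	return -ENOSPC;
}
```

With model_reclaim_page, lowering the limit of a pool holding 16384 bytes to
8192 succeeds after two callback invocations. With model_reclaim_stuck, the
same request gives up with -EBUSY once the budget is exhausted.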

User-space tests are at
https://patchwork.freedesktop.org/series/163935/

Test-with: 20260428065411.4222-1-thomas.hellstrom@linux.intel.com

Thomas Hellström (6):
  drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
  cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization
  cgroup/dmem: Add reclaim callback for lowering max below current usage
  drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem
    controller
  drm/xe: Wire up dmem cgroup reclaim for VRAM manager
  drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c      |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c |  30 ++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h |   2 +
 drivers/gpu/drm/drm_drv.c                    |   8 +-
 drivers/gpu/drm/ttm/ttm_bo.c                 |  95 +++++++++++++++++++-
 drivers/gpu/drm/ttm/ttm_bo_util.c            |   3 +-
 drivers/gpu/drm/ttm/ttm_resource.c           |  50 +++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c         |  53 +++++++++--
 include/drm/drm_drv.h                        |   4 +-
 include/drm/ttm/ttm_bo.h                     |  10 +++
 include/drm/ttm/ttm_resource.h               |   7 ++
 include/linux/cgroup_dmem.h                  |  38 +++++++-
 kernel/cgroup/dmem.c                         | 129 ++++++++++++++++++++++++---
 13 files changed, 396 insertions(+), 35 deletions(-)

-- 
2.54.0




Thread overview: 8+ messages
2026-06-11 17:32 [PATCH v6 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
2026-06-11 17:32 ` [PATCH v6 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
2026-06-11 17:32 ` [PATCH v6 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization Thomas Hellström
2026-06-11 17:32 ` [PATCH v6 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
2026-06-11 17:32 ` [PATCH v6 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
2026-06-11 17:33 ` [PATCH v6 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
2026-06-11 17:33 ` [PATCH v6 6/6] drm/amdgpu: " Thomas Hellström
2026-06-11 19:41   ` Thomas Hellström
