From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Natalie Vock" <natalie.vock@gmx.de>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Tejun Heo" <tj@kernel.org>, "Michal Koutný" <mkoutny@suse.com>,
cgroups@vger.kernel.org, "Huang Rui" <ray.huang@amd.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Matthew Auld" <matthew.auld@intel.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Simona Vetter" <simona@ffwll.ch>,
"David Airlie" <airlied@gmail.com>,
"Christian König" <christian.koenig@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v4 0/5] Add reclaim to the dmem cgroup controller
Date: Tue, 12 May 2026 10:24:01 +0200
Message-ID: <20260512082406.44470-1-thomas.hellstrom@linux.intel.com>
Until now, writing a "max" limit lower than the current usage
failed silently. This series improves on that by first
attempting to synchronously reclaim device memory to push the
usage under the new max limit, and by returning -EBUSY if the
usage still exceeds the limit after reclaim.
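To illustrate the intended semantics, here is a minimal user-space sketch. It is a toy model, not code from the series: struct toy_pool, toy_reclaim() and toy_set_max() are hypothetical names, and leaving the old limit in place on failure is an assumption of the sketch, since the cover letter only states that -EBUSY is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy model of one cgroup's pool for a device memory region.
 * All names here are hypothetical and not taken from the series. */
struct toy_pool {
	uint64_t usage;     /* bytes currently charged */
	uint64_t max;       /* current "max" limit */
	uint64_t evictable; /* bytes a driver reclaim callback could free */
};

/* Stand-in for a driver reclaim callback: frees up to @target bytes
 * and returns the number of bytes actually freed. */
static uint64_t toy_reclaim(struct toy_pool *pool, uint64_t target)
{
	uint64_t freed;

	/* Evictable memory is a subset of what is charged. */
	assert(pool->evictable <= pool->usage);

	freed = target < pool->evictable ? target : pool->evictable;
	pool->evictable -= freed;
	pool->usage -= freed;
	return freed;
}

/* Lower (or raise) the limit. If usage exceeds the new limit, try a
 * synchronous reclaim first and return -EBUSY if that is not enough.
 * Assumption of this sketch: the old limit is kept on failure. */
static int toy_set_max(struct toy_pool *pool, uint64_t new_max)
{
	if (pool->usage > new_max)
		toy_reclaim(pool, pool->usage - new_max);

	if (pool->usage > new_max)
		return -EBUSY;

	pool->max = new_max;
	return 0;
}
```

With usage 100 and 60 bytes evictable, toy_set_max(&pool, 50) reclaims 50 bytes and returns 0. With only 30 bytes evictable, the same call reclaims what it can, leaves usage at 70 and returns -EBUSY.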
Patch 1 fixes a pre-existing bug in the amdgpu_vram_mgr_init()
error path.
Patch 2 implements and documents a reclaim callback interface
for the dmem controller.
Patch 3 implements a TTM reclaim callback.
Patches 4-5 hook up the reclaim callback to the dmem
cgroup-aware drivers xe and amdgpu.
v2:
- Remove the error propagation that was in a previous series (Maarten)
- A number of updates in patch 1. See its commit message for
details (Maarten)
v3:
- Add patch 1 fixing a pre-existing amdgpu_vram_mgr_init() error path
bug where drmm_cgroup_register_region() was called before
INIT_LIST_HEAD() and gpu_buddy_init(), causing a kernel panic on
failure. (Sashiko-bot)
- Use an rwsem to protect reclaim callback registration and region
unregister against concurrent reclaim invocations. (Sashiko-bot)
- Fix ttm_resource_manager_set_dmem_region() storing an error pointer
in man->cg unconditionally. (Sashiko-bot)
- Fix kernel-doc function name format for ttm_bo_evict_cgroup() and
ttm_resource_manager_set_dmem_region().
v4:
- Rebased on drm-tip; dropped the XE_PL_STOLEN guard in the xe patch
as stolen memory uses a separate TTM manager.
User-space tests are at
https://patchwork.freedesktop.org/series/163935/
Test-with: 20260428065411.4222-1-thomas.hellstrom@linux.intel.com
Thomas Hellström (5):
drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
cgroup/dmem: Add reclaim callback for lowering max below current usage
drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem
controller
drm/xe: Wire up dmem cgroup reclaim for VRAM manager
drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 10 +-
drivers/gpu/drm/ttm/ttm_bo.c | 95 ++++++++++++++++-
drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +-
drivers/gpu/drm/ttm/ttm_resource.c | 37 +++++++
drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 14 ++-
include/drm/ttm/ttm_bo.h | 10 ++
include/drm/ttm/ttm_resource.h | 4 +
include/linux/cgroup_dmem.h | 24 +++++
kernel/cgroup/dmem.c | 106 +++++++++++++++++--
10 files changed, 283 insertions(+), 22 deletions(-)
--
2.54.0
Thread overview:
2026-05-12 8:24 Thomas Hellström [this message]
2026-05-12 8:24 ` [PATCH v4 1/5] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
2026-05-12 8:24 ` [PATCH v4 2/5] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
2026-05-12 8:24 ` [PATCH v4 3/5] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
2026-05-12 8:24 ` [PATCH v4 4/5] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
2026-05-12 8:24 ` [PATCH v4 5/5] drm/amdgpu: " Thomas Hellström
2026-05-12 15:51 ` ✗ CI.checkpatch: warning for Add reclaim to the dmem cgroup controller (rev4) Patchwork
2026-05-12 15:53 ` ✓ CI.KUnit: success " Patchwork
2026-05-12 16:48 ` ✓ Xe.CI.BAT: " Patchwork
2026-05-13 4:27 ` ✗ Xe.CI.FULL: failure " Patchwork