* [PATCH v2 0/5] cgroup/dmem,drm/ttm: Improve protection in contended cases
From: Natalie Vock @ 2025-10-15 13:57 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
Hi all,
I've been looking into some cases where dmem protection fails to prevent
allocations from ending up in GTT when VRAM gets scarce and apps start
competing hard.
In short, this happens because other (unprotected) applications end up
filling VRAM before the protected applications do. This causes TTM to
back off and try allocating in GTT before anything else, and that is
where the allocation ends up. The existing eviction protection cannot
prevent this, because no attempt at evicting is ever made (although you
could consider the backing-off an immediate eviction to GTT).
This series tries to alleviate this by adding a special case when the
allocation is protected by cgroups: Instead of backing off immediately,
TTM will try evicting unprotected buffers from the domain to make space
for the protected one. This ensures that applications can actually use
all the memory protection awarded to them by the system, without being
prone to ping-ponging (only protected allocations can evict unprotected
ones, never the other way around).
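As a hypothetical example (cgroup names made up):

	user.slice                 (no dmem protection configured)
	+-- game.scope             dmem.min = 4G
	+-- background.scope       (unprotected)

With this series, an allocation from game.scope that still fits within
its 4G of protection can evict background.scope's buffers from VRAM
instead of silently falling back to GTT, while background.scope can
never evict game.scope's protected buffers in return.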
The first two patches add a few small utilities to the dmem controller
that are needed to implement this. The third patch factors the
per-place allocation attempt into a helper; the last two patches are
the TTM implementation proper:
"drm/ttm: Be more aggressive..." decouples cgroup charging from resource
allocation so we can hold on to the charge even if allocation fails on
the first try, and adds a path that calls ttm_bo_evict_alloc when the
charged allocation falls within the min/low protection limits.
"drm/ttm: Use common ancestor..." is a more general improvement in
correctly implementing cgroup protection semantics. With recursive
protection rules, unused memory protection afforded to a parent node is
passed down to its children recursively, which keeps entire subtrees
from stealing each other's memory without each cgroup needing to be
protected individually. This doesn't apply to direct siblings inside
the same subtree, so in order not to break prioritization between such
siblings, we need to consider the relationship of evictor and evictee
when calculating protection.
In practice, this fixes cases where a protected cgroup could not steal
memory from unprotected siblings (which, in turn, led to eviction
failures and new allocations being placed in GTT).
Thanks,
Natalie
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
Changes in v2:
- Factored out the ttm logic for charging/allocating/evicting into a
separate helper to keep things simpler
- Link to v1: https://lore.kernel.org/r/20250915-dmemcg-aggressive-protect-v1-0-2f3353bfcdac@gmx.de
---
Natalie Vock (5):
cgroup/dmem: Add queries for protection values
cgroup/dmem: Add dmem_cgroup_common_ancestor helper
drm/ttm: Make a helper for attempting allocation in a place
drm/ttm: Be more aggressive when allocating below protection limit
drm/ttm: Use common ancestor of evictor and evictee as limit pool
drivers/gpu/drm/ttm/ttm_bo.c | 147 ++++++++++++++++++++++++++++++-------
drivers/gpu/drm/ttm/ttm_resource.c | 48 ++++++++----
include/drm/ttm/ttm_resource.h | 6 +-
include/linux/cgroup_dmem.h | 25 +++++++
kernel/cgroup/dmem.c | 73 ++++++++++++++++++
5 files changed, 258 insertions(+), 41 deletions(-)
---
base-commit: f3e82936857b3bd77b824ecd2fa7839dd99ec0c6
change-id: 20250915-dmemcg-aggressive-protect-5cf37f717cdb
Best regards,
--
Natalie Vock <natalie.vock@gmx.de>
* [PATCH v2 1/5] cgroup/dmem: Add queries for protection values
From: Natalie Vock @ 2025-10-15 13:57 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
Callers can use this feedback to be more aggressive about making space
for a cgroup's allocations when they know the cgroup is protected.
These are counterparts to memcg's mem_cgroup_below_{min,low}.
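A rough usage sketch (hypothetical caller, not part of this patch;
passing NULL as @root evaluates protection against the global root, as
the TTM patches later in this series do):

	static bool alloc_is_protected(struct dmem_cgroup_pool_state *pool)
	{
		return dmem_cgroup_below_min(NULL, pool) ||
		       dmem_cgroup_below_low(NULL, pool);
	}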
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
include/linux/cgroup_dmem.h | 16 +++++++++++++++
kernel/cgroup/dmem.c | 48 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index dd4869f1d736e26847578e81377e40504bbba90f..1a88cd0c9eb00409ddd07d1f06eb63d2e55e8805 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -24,6 +24,10 @@ void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size);
bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
struct dmem_cgroup_pool_state *test_pool,
bool ignore_low, bool *ret_hit_low);
+bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test);
+bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test);
void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
#else
@@ -59,6 +63,18 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
return true;
}
+static inline bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ return false;
+}
+
+static inline bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ return false;
+}
+
static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
{ }
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 10b63433f05737cc43a87029f2306147283a77ff..ece23f77f197f1b2da3ee322ff176460801907c6 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -641,6 +641,54 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
}
EXPORT_SYMBOL_GPL(dmem_cgroup_try_charge);
+/**
+ * dmem_cgroup_below_min() - Tests whether current usage is within min limit.
+ *
+ * @root: Root of the subtree to calculate protection for, or NULL to calculate global protection.
+ * @test: The pool to test the usage/min limit of.
+ *
+ * Return: true if usage is below min and the cgroup is protected, false otherwise.
+ */
+bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ if (root == test || !pool_parent(test))
+ return false;
+
+ if (!root) {
+ for (root = test; pool_parent(root); root = pool_parent(root))
+ {}
+ }
+
+ dmem_cgroup_calculate_protection(root, test);
+ return page_counter_read(&test->cnt) <= READ_ONCE(test->cnt.emin);
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_below_min);
+
+/**
+ * dmem_cgroup_below_low() - Tests whether current usage is within low limit.
+ *
+ * @root: Root of the subtree to calculate protection for, or NULL to calculate global protection.
+ * @test: The pool to test the usage/low limit of.
+ *
+ * Return: true if usage is below low and the cgroup is protected, false otherwise.
+ */
+bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ if (root == test || !pool_parent(test))
+ return false;
+
+ if (!root) {
+ for (root = test; pool_parent(root); root = pool_parent(root))
+ {}
+ }
+
+ dmem_cgroup_calculate_protection(root, test);
+ return page_counter_read(&test->cnt) <= READ_ONCE(test->cnt.elow);
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_below_low);
+
static int dmem_cgroup_region_capacity_show(struct seq_file *sf, void *v)
{
struct dmem_cgroup_region *region;
--
2.51.0
* [PATCH v2 2/5] cgroup/dmem: Add dmem_cgroup_common_ancestor helper
From: Natalie Vock @ 2025-10-15 13:57 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
This helps to find the subtree common to two pools, which is important
when determining whether it's worthwhile to evict one resource in favor
of another.
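A minimal sketch of the intended use (condensed; the identifiers are
illustrative, the real wiring happens in the TTM eviction walk later in
this series):

	/* Evaluate the evictee's protection only within the subtree it
	 * shares with the evictor. */
	limit_pool = dmem_cgroup_common_ancestor(evictee_pool, evictor_pool);
	if (!dmem_cgroup_state_evict_valuable(limit_pool, evictee_pool,
					      try_low, &hit_low))
		return 0;	/* evictee is protected, skip it */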
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
include/linux/cgroup_dmem.h | 9 +++++++++
kernel/cgroup/dmem.c | 25 +++++++++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index 1a88cd0c9eb00409ddd07d1f06eb63d2e55e8805..444b84f4c253aded9e4e59d051e3ac34a851b9d7 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -28,6 +28,8 @@ bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
struct dmem_cgroup_pool_state *test);
bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
struct dmem_cgroup_pool_state *test);
+struct dmem_cgroup_pool_state *dmem_cgroup_common_ancestor(struct dmem_cgroup_pool_state *a,
+ struct dmem_cgroup_pool_state *b);
void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
#else
@@ -75,6 +77,13 @@ static inline bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
return false;
}
+static inline
+struct dmem_cgroup_pool_state *dmem_cgroup_common_ancestor(struct dmem_cgroup_pool_state *a,
+ struct dmem_cgroup_pool_state *b)
+{
+ return NULL;
+}
+
static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
{ }
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index ece23f77f197f1b2da3ee322ff176460801907c6..0914fc8fd97f49d246da0344abedaf9244e8fbad 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -689,6 +689,31 @@ bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
}
EXPORT_SYMBOL_GPL(dmem_cgroup_below_low);
+/**
+ * dmem_cgroup_common_ancestor() - Find the first common ancestor of two pools.
+ * @a: First pool to find the common ancestor of.
+ * @b: Second pool to find the common ancestor of.
+ *
+ * Return: The first pool that is a parent of both @a and @b, or NULL if either @a or @b are NULL.
+ */
+struct dmem_cgroup_pool_state *dmem_cgroup_common_ancestor(struct dmem_cgroup_pool_state *a,
+ struct dmem_cgroup_pool_state *b)
+{
+ struct dmem_cgroup_pool_state *parent;
+
+ while (a) {
+ parent = b;
+ while (parent) {
+ if (a == parent)
+ return a;
+ parent = pool_parent(parent);
+ }
+ a = pool_parent(a);
+ }
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_common_ancestor);
+
static int dmem_cgroup_region_capacity_show(struct seq_file *sf, void *v)
{
struct dmem_cgroup_region *region;
--
2.51.0
* [PATCH v2 3/5] drm/ttm: Make a helper for attempting allocation in a place
From: Natalie Vock @ 2025-10-15 13:57 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
ttm_bo_alloc_at_place is a new helper function that makes an attempt at
allocating a bo's resource in a specific place. It also decides whether
eviction should be attempted: -EBUSY means the caller should retry with
eviction, while -ENOSPC means allocation should not be retried.
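Condensed caller-side sketch of the contract (reference management of
the cgroup pools omitted; see ttm_bo_alloc_resource in this patch for
the full version):

	ret = ttm_bo_alloc_at_place(bo, place, ctx, force_space, res,
				    &alloc_state);
	if (ret == -ENOSPC)
		continue;	/* this place is full, try the next one */
	if (ret == -EBUSY)
		ret = ttm_bo_evict_alloc(bdev, man, place, bo, ctx,
					 ticket, res, &alloc_state);
	else if (ret)
		return ret;	/* fatal, e.g. -ERESTARTSYS */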
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 98 +++++++++++++++++++++++++++++++++-----------
1 file changed, 73 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index f4d9e68b21e70cb25d0db5e79391233e1dc72221..829d99479883594f8be5b9ceed4cc53c4864ace5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -489,6 +489,11 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
return ret;
}
+struct ttm_bo_alloc_state {
+ /** @limit_pool: Which pool limit we should test against */
+ struct dmem_cgroup_pool_state *limit_pool;
+};
+
/**
* struct ttm_bo_evict_walk - Parameters for the evict walk.
*/
@@ -504,12 +509,13 @@ struct ttm_bo_evict_walk {
/** @evicted: Number of successful evictions. */
unsigned long evicted;
- /** @limit_pool: Which pool limit we should test against */
- struct dmem_cgroup_pool_state *limit_pool;
/** @try_low: Whether we should attempt to evict BO's with low watermark threshold */
bool try_low;
/** @hit_low: If we cannot evict a bo when @try_low is false (first pass) */
bool hit_low;
+
+ /** @alloc_state: State of the allocation this eviction walk makes space for */
+ struct ttm_bo_alloc_state *alloc_state;
};
static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
@@ -518,8 +524,9 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
container_of(walk, typeof(*evict_walk), walk);
s64 lret;
- if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
- evict_walk->try_low, &evict_walk->hit_low))
+ if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
+ bo->resource->css, evict_walk->try_low,
+ &evict_walk->hit_low))
return 0;
if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
@@ -561,7 +568,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
struct ttm_operation_ctx *ctx,
struct ww_acquire_ctx *ticket,
struct ttm_resource **res,
- struct dmem_cgroup_pool_state *limit_pool)
+ struct ttm_bo_alloc_state *state)
{
struct ttm_bo_evict_walk evict_walk = {
.walk = {
@@ -574,7 +581,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
.place = place,
.evictor = evictor,
.res = res,
- .limit_pool = limit_pool,
+ .alloc_state = state,
};
s64 lret;
@@ -688,6 +695,47 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
return ret;
}
+
+/**
+ * ttm_bo_alloc_at_place - Attempt allocating a BO's backing store in a place
+ *
+ * @bo: The buffer to allocate the backing store of
+ * @place: The place to attempt allocation in
+ * @ctx: ttm_operation_ctx associated with this allocation
+ * @force_space: If we should evict buffers to force space
+ * @res: On allocation success, the resulting struct ttm_resource.
+ * @alloc_state: Object holding allocation state such as charged cgroups.
+ *
+ * Returns: 0 on success, or a negative error code:
+ * -EBUSY: No space available, but allocation should be retried with eviction.
+ * -ENOSPC: No space available, allocation should not be retried.
+ * -ERESTARTSYS: An interruptible sleep was interrupted by a signal.
+ *
+ */
+static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
+ const struct ttm_place *place,
+ struct ttm_operation_ctx *ctx,
+ bool force_space,
+ struct ttm_resource **res,
+ struct ttm_bo_alloc_state *alloc_state)
+{
+ bool may_evict;
+ int ret;
+
+ may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
+
+ ret = ttm_resource_alloc(bo, place, res,
+ force_space ? &alloc_state->limit_pool : NULL);
+
+ if (ret) {
+ if ((ret == -ENOSPC || ret == -EAGAIN) && may_evict)
+ ret = -EBUSY;
+ return ret;
+ }
+
+ return 0;
+}
+
/**
* ttm_bo_alloc_resource - Allocate backing store for a BO
*
@@ -713,7 +761,9 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
bool force_space,
struct ttm_resource **res)
{
+ struct ttm_bo_alloc_state alloc_state = {0};
struct ttm_device *bdev = bo->bdev;
+ struct ttm_resource_manager *man;
struct ww_acquire_ctx *ticket;
int i, ret;
@@ -724,9 +774,6 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
for (i = 0; i < placement->num_placement; ++i) {
const struct ttm_place *place = &placement->placement[i];
- struct dmem_cgroup_pool_state *limit_pool = NULL;
- struct ttm_resource_manager *man;
- bool may_evict;
man = ttm_manager_type(bdev, place->mem_type);
if (!man || !ttm_resource_manager_used(man))
@@ -736,25 +783,26 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
TTM_PL_FLAG_FALLBACK))
continue;
- may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
- ret = ttm_resource_alloc(bo, place, res, force_space ? &limit_pool : NULL);
- if (ret) {
- if (ret != -ENOSPC && ret != -EAGAIN) {
- dmem_cgroup_pool_state_put(limit_pool);
- return ret;
- }
- if (!may_evict) {
- dmem_cgroup_pool_state_put(limit_pool);
- continue;
- }
+ ret = ttm_bo_alloc_at_place(bo, place, ctx, force_space,
+ res, &alloc_state);
+ if (ret == -ENOSPC) {
+ dmem_cgroup_pool_state_put(alloc_state.limit_pool);
+ continue;
+ } else if (ret == -EBUSY) {
ret = ttm_bo_evict_alloc(bdev, man, place, bo, ctx,
- ticket, res, limit_pool);
- dmem_cgroup_pool_state_put(limit_pool);
- if (ret == -EBUSY)
+ ticket, res, &alloc_state);
+
+ dmem_cgroup_pool_state_put(alloc_state.limit_pool);
+
+ if (ret) {
+ if (ret != -ENOSPC && ret != -EBUSY)
+ return ret;
continue;
- if (ret)
- return ret;
+ }
+ } else if (ret) {
+ dmem_cgroup_pool_state_put(alloc_state.limit_pool);
+ return ret;
}
ret = ttm_bo_add_move_fence(bo, man, ctx->no_wait_gpu);
--
2.51.0
* [PATCH v2 4/5] drm/ttm: Be more aggressive when allocating below protection limit
From: Natalie Vock @ 2025-10-15 13:57 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
When the cgroup's memory usage is below the low/min limit and allocation
fails, try evicting some unprotected buffers to make space. Otherwise,
application buffers may be forced to go into GTT even though usage is
below the corresponding low/min limit, if other applications filled VRAM
with their allocations first.
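Condensed flow of the charge/allocate split this patch introduces
(error-code mapping and pool reference management omitted; see
ttm_bo_alloc_at_place for the full version):

	ret = ttm_resource_try_charge(bo, place, &state->charge_pool,
				      force_space ? &state->limit_pool : NULL);
	if (ret)
		return ret;

	ret = ttm_resource_alloc(bo, place, res, state->charge_pool);
	if (ret)
		return ret;	/* the caller still owns charge_pool */

	/* Ownership of the charge moved to the resource. */
	state->charge_pool = NULL;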
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 43 ++++++++++++++++++++++++++++------
drivers/gpu/drm/ttm/ttm_resource.c | 48 +++++++++++++++++++++++++++-----------
include/drm/ttm/ttm_resource.h | 6 ++++-
3 files changed, 76 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 829d99479883594f8be5b9ceed4cc53c4864ace5..7f7872ab2090cc8db188e08ddfdcd12fe924f743 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -490,8 +490,12 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
}
struct ttm_bo_alloc_state {
+ /** @charge_pool: The memory pool the resource is charged to */
+ struct dmem_cgroup_pool_state *charge_pool;
/** @limit_pool: Which pool limit we should test against */
struct dmem_cgroup_pool_state *limit_pool;
+ /** @only_evict_unprotected: If eviction should be restricted to unprotected BOs */
+ bool only_evict_unprotected;
};
/**
@@ -546,7 +550,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
evict_walk->evicted++;
if (evict_walk->res)
lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
- evict_walk->res, NULL);
+ evict_walk->res, evict_walk->alloc_state->charge_pool);
if (lret == 0)
return 1;
out:
@@ -589,7 +593,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
/* One more attempt if we hit low limit? */
- if (!lret && evict_walk.hit_low) {
+ if (!lret && evict_walk.hit_low && !state->only_evict_unprotected) {
evict_walk.try_low = true;
lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
}
@@ -610,7 +614,8 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
} while (!lret && evict_walk.evicted);
/* We hit the low limit? Try once more */
- if (!lret && evict_walk.hit_low && !evict_walk.try_low) {
+ if (!lret && evict_walk.hit_low && !evict_walk.try_low &&
+ !state->only_evict_unprotected) {
evict_walk.try_low = true;
goto retry;
}
@@ -719,20 +724,40 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
struct ttm_resource **res,
struct ttm_bo_alloc_state *alloc_state)
{
- bool may_evict;
+ bool may_evict, is_protected = false;
int ret;
may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
+ ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
+ force_space ? &alloc_state->limit_pool : NULL);
+ if (ret) {
+ /*
+ * -EAGAIN means the charge failed, which we treat like an
+ * allocation failure. Allocation failures are indicated
+ * by -ENOSPC, so return that instead.
+ */
+ if (ret == -EAGAIN && !may_evict)
+ ret = -ENOSPC;
+ return ret;
+ }
- ret = ttm_resource_alloc(bo, place, res,
- force_space ? &alloc_state->limit_pool : NULL);
+ is_protected = dmem_cgroup_below_min(NULL, alloc_state->charge_pool) ||
+ dmem_cgroup_below_low(NULL, alloc_state->charge_pool);
+ ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
+ alloc_state->only_evict_unprotected = !may_evict && is_protected;
if (ret) {
- if ((ret == -ENOSPC || ret == -EAGAIN) && may_evict)
+ if ((ret == -ENOSPC || ret == -EAGAIN) &&
+ (may_evict || is_protected))
ret = -EBUSY;
return ret;
}
+ /*
+ * Ownership of charge_pool has been transferred to the TTM resource,
+ * don't make the caller think we still hold a reference to it.
+ */
+ alloc_state->charge_pool = NULL;
return 0;
}
@@ -787,6 +812,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
res, &alloc_state);
if (ret == -ENOSPC) {
+ dmem_cgroup_pool_state_put(alloc_state.charge_pool);
dmem_cgroup_pool_state_put(alloc_state.limit_pool);
continue;
} else if (ret == -EBUSY) {
@@ -796,11 +822,14 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
dmem_cgroup_pool_state_put(alloc_state.limit_pool);
if (ret) {
+ dmem_cgroup_pool_state_put(
+ alloc_state.charge_pool);
if (ret != -ENOSPC && ret != -EBUSY)
return ret;
continue;
}
} else if (ret) {
+ dmem_cgroup_pool_state_put(alloc_state.charge_pool);
dmem_cgroup_pool_state_put(alloc_state.limit_pool);
return ret;
}
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index e2c82ad07eb44b5e88bf5b5db1ef54dd6d27823b..fcfa8b51b033745f46a01e40a9dc83e0c69165fc 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -372,30 +372,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
}
EXPORT_SYMBOL(ttm_resource_fini);
+/**
+ * ttm_resource_try_charge - charge a resource manager's cgroup pool
+ * @bo: buffer for which an allocation should be charged
+ * @place: where the allocation is attempted to be placed
+ * @ret_pool: on charge success, the pool that was charged
+ * @ret_limit_pool: on charge failure, the pool responsible for the failure
+ *
+ * Should be used to charge cgroups before attempting resource allocation.
+ * When charging succeeds, the value of ret_pool should be passed to
+ * ttm_resource_alloc.
+ *
+ * Returns: 0 on charge success, negative errno on failure.
+ */
+int ttm_resource_try_charge(struct ttm_buffer_object *bo,
+ const struct ttm_place *place,
+ struct dmem_cgroup_pool_state **ret_pool,
+ struct dmem_cgroup_pool_state **ret_limit_pool)
+{
+ struct ttm_resource_manager *man =
+ ttm_manager_type(bo->bdev, place->mem_type);
+
+ if (!man->cg) {
+ *ret_pool = NULL;
+ if (ret_limit_pool)
+ *ret_limit_pool = NULL;
+ return 0;
+ }
+
+ return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
+ ret_limit_pool);
+}
+
int ttm_resource_alloc(struct ttm_buffer_object *bo,
const struct ttm_place *place,
struct ttm_resource **res_ptr,
- struct dmem_cgroup_pool_state **ret_limit_pool)
+ struct dmem_cgroup_pool_state *charge_pool)
{
struct ttm_resource_manager *man =
ttm_manager_type(bo->bdev, place->mem_type);
- struct dmem_cgroup_pool_state *pool = NULL;
int ret;
- if (man->cg) {
- ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
- if (ret)
- return ret;
- }
-
ret = man->func->alloc(man, bo, place, res_ptr);
- if (ret) {
- if (pool)
- dmem_cgroup_uncharge(pool, bo->base.size);
+ if (ret)
return ret;
- }
- (*res_ptr)->css = pool;
+ (*res_ptr)->css = charge_pool;
spin_lock(&bo->bdev->lru_lock);
ttm_resource_add_bulk_move(*res_ptr, bo);
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index e52bba15012f78e352f392232ac2e89a83afd311..3aef7efdd7cfb8fd93071db85e632b975b53cf81 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -442,10 +442,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
void ttm_resource_fini(struct ttm_resource_manager *man,
struct ttm_resource *res);
+int ttm_resource_try_charge(struct ttm_buffer_object *bo,
+ const struct ttm_place *place,
+ struct dmem_cgroup_pool_state **ret_pool,
+ struct dmem_cgroup_pool_state **ret_limit_pool);
int ttm_resource_alloc(struct ttm_buffer_object *bo,
const struct ttm_place *place,
struct ttm_resource **res,
- struct dmem_cgroup_pool_state **ret_limit_pool);
+ struct dmem_cgroup_pool_state *charge_pool);
void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
bool ttm_resource_intersects(struct ttm_device *bdev,
struct ttm_resource *res,
--
2.51.0
* [PATCH v2 5/5] drm/ttm: Use common ancestor of evictor and evictee as limit pool
From: Natalie Vock @ 2025-10-15 13:57 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
When checking whether to skip certain buffers because they're protected
by dmem.low, we're checking the effective protection of the evictee's
cgroup, but depending on how the evictor's cgroup relates to the
evictee's, the semantics of effective protection values change.
When testing against cgroups from different subtrees, page_counter's
recursive protection propagates memory protection afforded to a parent
down to the child cgroups, even if the children were not explicitly
protected. This prevents cgroups whose parents were afforded no
protection from stealing memory from cgroups whose parents were afforded
more protection, without users having to explicitly propagate this
protection.
However, if we always calculate protection from the root cgroup, this
breaks prioritization of sibling cgroups: If one cgroup was explicitly
protected and its siblings were not, the protected cgroup should get
higher priority, i.e. the protected cgroup should be able to steal from
unprotected siblings. This only works if we restrict the protection
calculation to the subtree shared by evictor and evictee.
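As a hypothetical example:

	root
	+-- A          dmem.min = 8G
	|   +-- A1     (no explicit protection)
	|   +-- A2     dmem.min = 8G
	+-- B          (unprotected)

Calculated from the root, recursive protection lets A1 inherit a share
of A's unused protection, so B cannot evict A1's buffers: the desired
cross-subtree behavior. But if A2 is the evictor, calculating from the
root would wrongly treat A1 as protected as well; calculating from the
common ancestor A instead leaves A1 without effective protection, so
the explicitly protected sibling A2 wins.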
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 7f7872ab2090cc8db188e08ddfdcd12fe924f743..bc88941c0aadb9a1d6fbaa470ccdeae4f91c41fb 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -524,13 +524,29 @@ struct ttm_bo_evict_walk {
static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
{
+ struct dmem_cgroup_pool_state *limit_pool;
struct ttm_bo_evict_walk *evict_walk =
container_of(walk, typeof(*evict_walk), walk);
s64 lret;
- if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
- bo->resource->css, evict_walk->try_low,
- &evict_walk->hit_low))
+ /*
+ * If only_evict_unprotected is set, then we're trying to evict unprotected
+ * buffers in favor of a protected allocation for charge_pool. Explicitly skip
+ * buffers belonging to the same cgroup here - that cgroup is definitely protected,
+ * even though dmem_cgroup_state_evict_valuable would allow the eviction because a
+ * cgroup is always allowed to evict from itself even if it is protected.
+ */
+ if (evict_walk->alloc_state->only_evict_unprotected &&
+ bo->resource->css == evict_walk->alloc_state->charge_pool)
+ return 0;
+
+ limit_pool = evict_walk->alloc_state->limit_pool;
+ if (!limit_pool)
+ limit_pool = dmem_cgroup_common_ancestor(bo->resource->css,
+ evict_walk->alloc_state->charge_pool);
+
+ if (!dmem_cgroup_state_evict_valuable(limit_pool, bo->resource->css,
+ evict_walk->try_low, &evict_walk->hit_low))
return 0;
if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
--
2.51.0
* Re: [PATCH v2 1/5] cgroup/dmem: Add queries for protection values
From: Maarten Lankhorst @ 2025-10-24 11:44 UTC
To: Natalie Vock, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
Hey,
On 2025-10-15 at 15:57, Natalie Vock wrote:
> Callers can use this feedback to be more aggressive in making space for
> allocations of a cgroup if they know it is protected.
>
> These are counterparts to memcg's mem_cgroup_below_{min,low}.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
It could be useful to document why we are calculating protection here,
since memcg doesn't do that in their functions.
> ---
> include/linux/cgroup_dmem.h | 16 +++++++++++++++
> kernel/cgroup/dmem.c | 48 +++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 64 insertions(+)
>
> diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
> index dd4869f1d736e26847578e81377e40504bbba90f..1a88cd0c9eb00409ddd07d1f06eb63d2e55e8805 100644
> --- a/include/linux/cgroup_dmem.h
> +++ b/include/linux/cgroup_dmem.h
> @@ -24,6 +24,10 @@ void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size);
> bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
> struct dmem_cgroup_pool_state *test_pool,
> bool ignore_low, bool *ret_hit_low);
> +bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
> + struct dmem_cgroup_pool_state *test);
> +bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
> + struct dmem_cgroup_pool_state *test);
>
> void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
> #else
> @@ -59,6 +63,18 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
> return true;
> }
>
> +static inline bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
> + struct dmem_cgroup_pool_state *test)
> +{
> + return false;
> +}
> +
> +static inline bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
> + struct dmem_cgroup_pool_state *test)
> +{
> + return false;
> +}
> +
> static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
> { }
>
> diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> index 10b63433f05737cc43a87029f2306147283a77ff..ece23f77f197f1b2da3ee322ff176460801907c6 100644
> --- a/kernel/cgroup/dmem.c
> +++ b/kernel/cgroup/dmem.c
> @@ -641,6 +641,54 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
> }
> EXPORT_SYMBOL_GPL(dmem_cgroup_try_charge);
>
> +/**
> + * dmem_cgroup_below_min() - Tests whether current usage is within min limit.
> + *
> + * @root: Root of the subtree to calculate protection for, or NULL to calculate global protection.
> + * @test: The pool to test the usage/min limit of.
> + *
> + * Return: true if usage is below min and the cgroup is protected, false otherwise.
> + */
> +bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
> + struct dmem_cgroup_pool_state *test)
> +{
> + if (root == test || !pool_parent(test))
> + return false;
> +
> + if (!root) {
> + for (root = test; pool_parent(root); root = pool_parent(root))
> + {}
> + }
> +
> + dmem_cgroup_calculate_protection(root, test);
> + return page_counter_read(&test->cnt) <= READ_ONCE(test->cnt.emin);
> +}
> +EXPORT_SYMBOL_GPL(dmem_cgroup_below_min);
> +
> +/**
> + * dmem_cgroup_below_low() - Tests whether current usage is within low limit.
> + *
> + * @root: Root of the subtree to calculate protection for, or NULL to calculate global protection.
> + * @test: The pool to test the usage/low limit of.
> + *
> + * Return: true if usage is below low and the cgroup is protected, false otherwise.
> + */
> +bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
> + struct dmem_cgroup_pool_state *test)
> +{
> + if (root == test || !pool_parent(test))
> + return false;
> +
> + if (!root) {
> + for (root = test; pool_parent(root); root = pool_parent(root))
> + {}
> + }
> +
> + dmem_cgroup_calculate_protection(root, test);
> + return page_counter_read(&test->cnt) <= READ_ONCE(test->cnt.elow);
> +}
> +EXPORT_SYMBOL_GPL(dmem_cgroup_below_low);
> +
> static int dmem_cgroup_region_capacity_show(struct seq_file *sf, void *v)
> {
> struct dmem_cgroup_region *region;
>
Kind regards,
~Maarten lankhorst
* Re: [PATCH v2 4/5] drm/ttm: Be more aggressive when allocating below protection limit
From: Maarten Lankhorst @ 2025-10-24 12:14 UTC
To: Natalie Vock, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: cgroups, dri-devel
Hey,
On 2025-10-15 at 15:57, Natalie Vock wrote:
> When the cgroup's memory usage is below the low/min limit and allocation
> fails, try evicting some unprotected buffers to make space. Otherwise,
> application buffers may be forced to go into GTT even though usage is
> below the corresponding low/min limit, if other applications filled VRAM
> with their allocations first.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 43 ++++++++++++++++++++++++++++------
> drivers/gpu/drm/ttm/ttm_resource.c | 48 +++++++++++++++++++++++++++-----------
> include/drm/ttm/ttm_resource.h | 6 ++++-
> 3 files changed, 76 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 829d99479883594f8be5b9ceed4cc53c4864ace5..7f7872ab2090cc8db188e08ddfdcd12fe924f743 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -490,8 +490,12 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
> }
>
> struct ttm_bo_alloc_state {
> + /** @charge_pool: The memory pool the resource is charged to */
> + struct dmem_cgroup_pool_state *charge_pool;
> /** @limit_pool: Which pool limit we should test against */
> struct dmem_cgroup_pool_state *limit_pool;
> + /** @only_evict_unprotected: If eviction should be restricted to unprotected BOs */
> + bool only_evict_unprotected;
I'm not entirely sure we should put 'low' and 'min' limits together here.
> };
>
> /**
> @@ -546,7 +550,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
> evict_walk->evicted++;
> if (evict_walk->res)
> lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
> - evict_walk->res, NULL);
> + evict_walk->res, evict_walk->alloc_state->charge_pool);
> if (lret == 0)
> return 1;
> out:
> @@ -589,7 +593,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
>
> /* One more attempt if we hit low limit? */
> - if (!lret && evict_walk.hit_low) {
> + if (!lret && evict_walk.hit_low && !state->only_evict_unprotected) {
> evict_walk.try_low = true;
> lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
> }
> @@ -610,7 +614,8 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> } while (!lret && evict_walk.evicted);
>
> /* We hit the low limit? Try once more */
> - if (!lret && evict_walk.hit_low && !evict_walk.try_low) {
> + if (!lret && evict_walk.hit_low && !evict_walk.try_low &&
> + !state->only_evict_unprotected) {
> evict_walk.try_low = true;
> goto retry;
> }
> @@ -719,20 +724,40 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
> struct ttm_resource **res,
> struct ttm_bo_alloc_state *alloc_state)
> {
> - bool may_evict;
> + bool may_evict, is_protected = false;
> int ret;
>
> may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> + ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
> + force_space ? &alloc_state->limit_pool : NULL);
> + if (ret) {
> + /*
> + * -EAGAIN means the charge failed, which we treat like an
> + * allocation failure. Allocation failures are indicated
> + * by -ENOSPC, so return that instead.
> + */
> + if (ret == -EAGAIN && !may_evict)
> + ret = -ENOSPC;
> + return ret;
> + }
>
> - ret = ttm_resource_alloc(bo, place, res,
> - force_space ? &alloc_state->limit_pool : NULL);
> + is_protected = dmem_cgroup_below_min(NULL, alloc_state->charge_pool) ||
> + dmem_cgroup_below_low(NULL, alloc_state->charge_pool);
> + ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
> + alloc_state->only_evict_unprotected = !may_evict && is_protected;
This probably deserves a comment to explain that it's ok, when we haven't hit low/min yet, to evict from
those cgroups that did hit those limits already. It took me a bit of time to understand the idea.
>
> if (ret) {
> - if ((ret == -ENOSPC || ret == -EAGAIN) && may_evict)
> + if ((ret == -ENOSPC || ret == -EAGAIN) &&
> + (may_evict || is_protected))
> ret = -EBUSY;
> return ret;
> }
>
> + /*
> + * Ownership of charge_pool has been transferred to the TTM resource,
> + * don't make the caller think we still hold a reference to it.
> + */
> + alloc_state->charge_pool = NULL;
> return 0;
> }
>
> @@ -787,6 +812,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> res, &alloc_state);
>
> if (ret == -ENOSPC) {
> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> continue;
> } else if (ret == -EBUSY) {
> @@ -796,11 +822,14 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>
> if (ret) {
> + dmem_cgroup_pool_state_put(
> + alloc_state.charge_pool);
> if (ret != -ENOSPC && ret != -EBUSY)
> return ret;
> continue;
> }
> } else if (ret) {
> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> return ret;
> }
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index e2c82ad07eb44b5e88bf5b5db1ef54dd6d27823b..fcfa8b51b033745f46a01e40a9dc83e0c69165fc 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -372,30 +372,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
> }
> EXPORT_SYMBOL(ttm_resource_fini);
>
> +/**
> + * ttm_resource_try_charge - charge a resource manager's cgroup pool
> + * @bo: buffer for which an allocation should be charged
> + * @place: where the allocation is attempted to be placed
> + * @ret_pool: on charge success, the pool that was charged
> + * @ret_limit_pool: on charge failure, the pool responsible for the failure
> + *
> + * Should be used to charge cgroups before attempting resource allocation.
> + * When charging succeeds, the value of ret_pool should be passed to
> + * ttm_resource_alloc.
> + *
> + * Returns: 0 on charge success, negative errno on failure.
> + */
> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct dmem_cgroup_pool_state **ret_pool,
> + struct dmem_cgroup_pool_state **ret_limit_pool)
> +{
> + struct ttm_resource_manager *man =
> + ttm_manager_type(bo->bdev, place->mem_type);
> +
> + if (!man->cg) {
> + *ret_pool = NULL;
> + if (ret_limit_pool)
> + *ret_limit_pool = NULL;
> + return 0;
> + }
> +
> + return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
> + ret_limit_pool);
> +}
> +
> int ttm_resource_alloc(struct ttm_buffer_object *bo,
> const struct ttm_place *place,
> struct ttm_resource **res_ptr,
> - struct dmem_cgroup_pool_state **ret_limit_pool)
> + struct dmem_cgroup_pool_state *charge_pool)
> {
> struct ttm_resource_manager *man =
> ttm_manager_type(bo->bdev, place->mem_type);
> - struct dmem_cgroup_pool_state *pool = NULL;
> int ret;
>
> - if (man->cg) {
> - ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
> - if (ret)
> - return ret;
> - }
> -
> ret = man->func->alloc(man, bo, place, res_ptr);
> - if (ret) {
> - if (pool)
> - dmem_cgroup_uncharge(pool, bo->base.size);
> + if (ret)
> return ret;
> - }
>
> - (*res_ptr)->css = pool;
> + (*res_ptr)->css = charge_pool;
>
> spin_lock(&bo->bdev->lru_lock);
> ttm_resource_add_bulk_move(*res_ptr, bo);
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index e52bba15012f78e352f392232ac2e89a83afd311..3aef7efdd7cfb8fd93071db85e632b975b53cf81 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -442,10 +442,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> void ttm_resource_fini(struct ttm_resource_manager *man,
> struct ttm_resource *res);
>
> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct dmem_cgroup_pool_state **ret_pool,
> + struct dmem_cgroup_pool_state **ret_limit_pool);
> int ttm_resource_alloc(struct ttm_buffer_object *bo,
> const struct ttm_place *place,
> struct ttm_resource **res,
> - struct dmem_cgroup_pool_state **ret_limit_pool);
> + struct dmem_cgroup_pool_state *charge_pool);
> void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
> bool ttm_resource_intersects(struct ttm_device *bdev,
> struct ttm_resource *res,
>
* Re: [PATCH v2 5/5] drm/ttm: Use common ancestor of evictor and evictee as limit pool
From: Maarten Lankhorst @ 2025-10-24 12:21 UTC
To: Natalie Vock, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
Hey,
On 2025-10-15 at 15:57, Natalie Vock wrote:
> When checking whether to skip certain buffers because they're protected
> by dmem.low, we're checking the effective protection of the evictee's
> cgroup, but depending on how the evictor's cgroup relates to the
> evictee's, the semantics of effective protection values change.
>
> When testing against cgroups from different subtrees, page_counter's
> recursive protection propagates memory protection afforded to a parent
> down to the child cgroups, even if the children were not explicitly
> protected. This prevents cgroups whose parents were afforded no
> protection from stealing memory from cgroups whose parents were afforded
> more protection, without users having to explicitly propagate this
> protection.
>
> However, if we always calculate protection from the root cgroup, this
> breaks prioritization of sibling cgroups: If one cgroup was explicitly
> protected and its siblings were not, the protected cgroup should get
> higher priority, i.e. the protected cgroup should be able to steal from
> unprotected siblings. This only works if we restrict the protection
> calculation to the subtree shared by evictor and evictee.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 7f7872ab2090cc8db188e08ddfdcd12fe924f743..bc88941c0aadb9a1d6fbaa470ccdeae4f91c41fb 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -524,13 +524,29 @@ struct ttm_bo_evict_walk {
>
> static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
> {
> + struct dmem_cgroup_pool_state *limit_pool;
> struct ttm_bo_evict_walk *evict_walk =
> container_of(walk, typeof(*evict_walk), walk);
> s64 lret;
>
> - if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
> - bo->resource->css, evict_walk->try_low,
> - &evict_walk->hit_low))
> + /*
> + * If only_evict_unprotected is set, then we're trying to evict unprotected
> + * buffers in favor of a protected allocation for charge_pool. Explicitly skip
> + * buffers belonging to the same cgroup here - that cgroup is definitely protected,
> + * even though dmem_cgroup_state_evict_valuable would allow the eviction because a
> + * cgroup is always allowed to evict from itself even if it is protected.
> + */
> + if (evict_walk->alloc_state->only_evict_unprotected &&
> + bo->resource->css == evict_walk->alloc_state->charge_pool)
> + return 0;
> +
> + limit_pool = evict_walk->alloc_state->limit_pool;
> + if (!limit_pool)
> + limit_pool = dmem_cgroup_common_ancestor(bo->resource->css,
> + evict_walk->alloc_state->charge_pool);
> +
> + if (!dmem_cgroup_state_evict_valuable(limit_pool, bo->resource->css,
> + evict_walk->try_low, &evict_walk->hit_low))
> return 0;
>
> if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
>
The patches themselves look good. I think it would help to add a bit more documentation, since
cgroup-related dmem eviction is already complicated; while I believe those changes are
correct, extra documentation will help others understand the code in case bugs show up.
Perhaps even add a global overview of how dmem eviction interacts with TTM eviction.
This will need review from the TTM maintainers/reviewers too before being accepted.
With the extra documentation added:
Reviewed-by: Maarten Lankhorst <dev@lankhorst.se>
* Re: [PATCH v2 4/5] drm/ttm: Be more aggressive when allocating below protection limit
From: Natalie Vock @ 2025-10-26 12:56 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: cgroups, dri-devel
Hi,
On 10/24/25 14:14, Maarten Lankhorst wrote:
> Hey,
>
> Den 2025-10-15 kl. 15:57, skrev Natalie Vock:
>> When the cgroup's memory usage is below the low/min limit and allocation
>> fails, try evicting some unprotected buffers to make space. Otherwise,
>> application buffers may be forced to go into GTT even though usage is
>> below the corresponding low/min limit, if other applications filled VRAM
>> with their allocations first.
>>
>> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 43 ++++++++++++++++++++++++++++------
>> drivers/gpu/drm/ttm/ttm_resource.c | 48 +++++++++++++++++++++++++++-----------
>> include/drm/ttm/ttm_resource.h | 6 ++++-
>> 3 files changed, 76 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 829d99479883594f8be5b9ceed4cc53c4864ace5..7f7872ab2090cc8db188e08ddfdcd12fe924f743 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -490,8 +490,12 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
>> }
>>
>> struct ttm_bo_alloc_state {
>> + /** @charge_pool: The memory pool the resource is charged to */
>> + struct dmem_cgroup_pool_state *charge_pool;
>> /** @limit_pool: Which pool limit we should test against */
>> struct dmem_cgroup_pool_state *limit_pool;
>> + /** @only_evict_unprotected: If eviction should be restricted to unprotected BOs */
>> + bool only_evict_unprotected;
> I'm not entirely sure we should put 'low' and 'min' limits together here.
I think putting 'low' and 'min' together here is accurate. When the
allocation is covered by the 'low' limit, but not the 'min' limit, we
should evict only allocations that are covered by neither (which is what
this flag controls).
However maybe we should allow evicting allocations covered by 'low' when
the new allocation is covered by 'min' in ttm_bo_alloc_at_place
down below (because 'min' is a stronger guarantee). We could do this
simply by setting 'only_evict_unprotected' to false, since memory
covered by 'min' can never get evicted anyway.
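Untested sketch of that idea, reusing the names from this patch:

	/* 'min' is stronger than 'low': if the new allocation sits below
	 * its min limit, also allow evicting buffers that are merely
	 * below 'low' (min-covered buffers stay unevictable anyway). */
	alloc_state->only_evict_unprotected =
		!may_evict && is_protected &&
		!dmem_cgroup_below_min(NULL, alloc_state->charge_pool);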
>> };
>>
>> /**
>> @@ -546,7 +550,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
>> evict_walk->evicted++;
>> if (evict_walk->res)
>> lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
>> - evict_walk->res, NULL);
>> + evict_walk->res, evict_walk->alloc_state->charge_pool);
>> if (lret == 0)
>> return 1;
>> out:
>> @@ -589,7 +593,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
>> lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
>>
>> /* One more attempt if we hit low limit? */
>> - if (!lret && evict_walk.hit_low) {
>> + if (!lret && evict_walk.hit_low && !state->only_evict_unprotected) {
>> evict_walk.try_low = true;
>> lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
>> }
>> @@ -610,7 +614,8 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
>> } while (!lret && evict_walk.evicted);
>>
>> /* We hit the low limit? Try once more */
>> - if (!lret && evict_walk.hit_low && !evict_walk.try_low) {
>> + if (!lret && evict_walk.hit_low && !evict_walk.try_low &&
>> + !state->only_evict_unprotected) {
>> evict_walk.try_low = true;
>> goto retry;
>> }
>> @@ -719,20 +724,40 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
>> struct ttm_resource **res,
>> struct ttm_bo_alloc_state *alloc_state)
>> {
>> - bool may_evict;
>> + bool may_evict, is_protected = false;
>> int ret;
>>
>> may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
>> + ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
>> + force_space ? &alloc_state->limit_pool : NULL);
>> + if (ret) {
>> + /*
>> + * -EAGAIN means the charge failed, which we treat like an
>> + * allocation failure. Allocation failures are indicated
>> + * by -ENOSPC, so return that instead.
>> + */
>> + if (ret == -EAGAIN && !may_evict)
>> + ret = -ENOSPC;
>> + return ret;
>> + }
>>
>> - ret = ttm_resource_alloc(bo, place, res,
>> - force_space ? &alloc_state->limit_pool : NULL);
>> + is_protected = dmem_cgroup_below_min(NULL, alloc_state->charge_pool) ||
>> + dmem_cgroup_below_low(NULL, alloc_state->charge_pool);
>> + ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
>> + alloc_state->only_evict_unprotected = !may_evict && is_protected;
>
> This probably deserves a comment to explaing it's ok if we haven't hit low/min yet to evict from
> those cgroups that did those limits already. It took me a bit of time to understand the idea.
Yeah, that's a bit non-obvious. I'll add a comment.
Thanks,
Natalie
>
>>
>> if (ret) {
>> - if ((ret == -ENOSPC || ret == -EAGAIN) && may_evict)
>> + if ((ret == -ENOSPC || ret == -EAGAIN) &&
>> + (may_evict || is_protected))
>> ret = -EBUSY;
>> return ret;
>> }
>>
>> + /*
>> + * Ownership of charge_pool has been transferred to the TTM resource,
>> + * don't make the caller think we still hold a reference to it.
>> + */
>> + alloc_state->charge_pool = NULL;
>> return 0;
>> }
>>
>> @@ -787,6 +812,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> res, &alloc_state);
>>
>> if (ret == -ENOSPC) {
>> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
>> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> continue;
>> } else if (ret == -EBUSY) {
>> @@ -796,11 +822,14 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>>
>> if (ret) {
>> + dmem_cgroup_pool_state_put(
>> + alloc_state.charge_pool);
>> if (ret != -ENOSPC && ret != -EBUSY)
>> return ret;
>> continue;
>> }
>> } else if (ret) {
>> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
>> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> return ret;
>> }
>> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
>> index e2c82ad07eb44b5e88bf5b5db1ef54dd6d27823b..fcfa8b51b033745f46a01e40a9dc83e0c69165fc 100644
>> --- a/drivers/gpu/drm/ttm/ttm_resource.c
>> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
>> @@ -372,30 +372,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
>> }
>> EXPORT_SYMBOL(ttm_resource_fini);
>>
>> +/**
>> + * ttm_resource_try_charge - charge a resource manager's cgroup pool
>> + * @bo: buffer for which an allocation should be charged
>> + * @place: where the allocation is attempted to be placed
>> + * @ret_pool: on charge success, the pool that was charged
>> + * @ret_limit_pool: on charge failure, the pool responsible for the failure
>> + *
>> + * Should be used to charge cgroups before attempting resource allocation.
>> + * When charging succeeds, the value of ret_pool should be passed to
>> + * ttm_resource_alloc.
>> + *
>> + * Returns: 0 on charge success, negative errno on failure.
>> + */
>> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
>> + const struct ttm_place *place,
>> + struct dmem_cgroup_pool_state **ret_pool,
>> + struct dmem_cgroup_pool_state **ret_limit_pool)
>> +{
>> + struct ttm_resource_manager *man =
>> + ttm_manager_type(bo->bdev, place->mem_type);
>> +
>> + if (!man->cg) {
>> + *ret_pool = NULL;
>> + if (ret_limit_pool)
>> + *ret_limit_pool = NULL;
>> + return 0;
>> + }
>> +
>> + return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
>> + ret_limit_pool);
>> +}
>> +
>> int ttm_resource_alloc(struct ttm_buffer_object *bo,
>> const struct ttm_place *place,
>> struct ttm_resource **res_ptr,
>> - struct dmem_cgroup_pool_state **ret_limit_pool)
>> + struct dmem_cgroup_pool_state *charge_pool)
>> {
>> struct ttm_resource_manager *man =
>> ttm_manager_type(bo->bdev, place->mem_type);
>> - struct dmem_cgroup_pool_state *pool = NULL;
>> int ret;
>>
>> - if (man->cg) {
>> - ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
>> - if (ret)
>> - return ret;
>> - }
>> -
>> ret = man->func->alloc(man, bo, place, res_ptr);
>> - if (ret) {
>> - if (pool)
>> - dmem_cgroup_uncharge(pool, bo->base.size);
>> + if (ret)
>> return ret;
>> - }
>>
>> - (*res_ptr)->css = pool;
>> + (*res_ptr)->css = charge_pool;
>>
>> spin_lock(&bo->bdev->lru_lock);
>> ttm_resource_add_bulk_move(*res_ptr, bo);
>> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
>> index e52bba15012f78e352f392232ac2e89a83afd311..3aef7efdd7cfb8fd93071db85e632b975b53cf81 100644
>> --- a/include/drm/ttm/ttm_resource.h
>> +++ b/include/drm/ttm/ttm_resource.h
>> @@ -442,10 +442,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>> void ttm_resource_fini(struct ttm_resource_manager *man,
>> struct ttm_resource *res);
>>
>> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
>> + const struct ttm_place *place,
>> + struct dmem_cgroup_pool_state **ret_pool,
>> + struct dmem_cgroup_pool_state **ret_limit_pool);
>> int ttm_resource_alloc(struct ttm_buffer_object *bo,
>> const struct ttm_place *place,
>> struct ttm_resource **res,
>> - struct dmem_cgroup_pool_state **ret_limit_pool);
>> + struct dmem_cgroup_pool_state *charge_pool);
>> void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
>> bool ttm_resource_intersects(struct ttm_device *bdev,
>> struct ttm_resource *res,
>>
>
* Re: [PATCH v2 5/5] drm/ttm: Use common ancestor of evictor and evictee as limit pool
From: Natalie Vock @ 2025-11-10 12:43 UTC
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: cgroups, dri-devel
Hi,
On 10/24/25 14:21, Maarten Lankhorst wrote:
> Hey,
>
> Den 2025-10-15 kl. 15:57, skrev Natalie Vock:
>> When checking whether to skip certain buffers because they're protected
>> by dmem.low, we're checking the effective protection of the evictee's
>> cgroup, but depending on how the evictor's cgroup relates to the
>> evictee's, the semantics of effective protection values change.
>>
>> When testing against cgroups from different subtrees, page_counter's
>> recursive protection propagates memory protection afforded to a parent
>> down to the child cgroups, even if the children were not explicitly
>> protected. This prevents cgroups whose parents were afforded no
>> protection from stealing memory from cgroups whose parents were afforded
>> more protection, without users having to explicitly propagate this
>> protection.
>>
>> However, if we always calculate protection from the root cgroup, this
>> breaks prioritization of sibling cgroups: If one cgroup was explicitly
>> protected and its siblings were not, the protected cgroup should get
>> higher priority, i.e. the protected cgroup should be able to steal from
>> unprotected siblings. This only works if we restrict the protection
>> calculation to the subtree shared by evictor and evictee.
>>
>> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++++++++++---
>> 1 file changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 7f7872ab2090cc8db188e08ddfdcd12fe924f743..bc88941c0aadb9a1d6fbaa470ccdeae4f91c41fb 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -524,13 +524,29 @@ struct ttm_bo_evict_walk {
>>
>> static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
>> {
>> + struct dmem_cgroup_pool_state *limit_pool;
>> struct ttm_bo_evict_walk *evict_walk =
>> container_of(walk, typeof(*evict_walk), walk);
>> s64 lret;
>>
>> - if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
>> - bo->resource->css, evict_walk->try_low,
>> - &evict_walk->hit_low))
>> + /*
>> + * If only_evict_unprotected is set, then we're trying to evict unprotected
>> + * buffers in favor of a protected allocation for charge_pool. Explicitly skip
>> + * buffers belonging to the same cgroup here - that cgroup is definitely protected,
>> + * even though dmem_cgroup_state_evict_valuable would allow the eviction because a
>> + * cgroup is always allowed to evict from itself even if it is protected.
>> + */
>> + if (evict_walk->alloc_state->only_evict_unprotected &&
>> + bo->resource->css == evict_walk->alloc_state->charge_pool)
>> + return 0;
>> +
>> + limit_pool = evict_walk->alloc_state->limit_pool;
>> + if (!limit_pool)
>> + limit_pool = dmem_cgroup_common_ancestor(bo->resource->css,
>> + evict_walk->alloc_state->charge_pool);
>> +
>> + if (!dmem_cgroup_state_evict_valuable(limit_pool, bo->resource->css,
>> + evict_walk->try_low, &evict_walk->hit_low))
>> return 0;
>>
>> if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
>>
> Patches themselves look good, I think it would help to add a bit more documentation since
> cgroup related dmem eviction is already complicated, and while I believe those changes are
> correct, it will help others to understand the code in case bugs show up.
>
> Perhaps even add a global overview of how dmem eviction interacts with TTM eviction.
>
> This will need review from the TTM maintainers/reviewers too before being accepted.
>
> With the extra documentation added:
> Reviewed-by: Maarten Lankhorst <dev@lankhorst.se>
Sent out a v3 with extra documentation added[1].
Is this roughly in line with the documentation on cgroup-related dmem
eviction you expected? I skipped adding your R-b below the patches for
now to give you a chance to look over it again :)
Thanks,
Natalie
[1]
https://lore.kernel.org/dri-devel/20251110-dmemcg-aggressive-protect-v3-0-219ffcfc54e9@gmx.de/