* [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases
@ 2026-02-25 12:10 Natalie Vock
2026-02-25 12:10 ` [PATCH v4 1/6] cgroup/dmem: Add queries for protection values Natalie Vock
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Natalie Vock @ 2026-02-25 12:10 UTC (permalink / raw)
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter, Tvrtko Ursulin
Cc: cgroups, dri-devel, Natalie Vock
Hi all,
I've been looking into some cases where dmem protection fails to prevent
allocations from ending up in GTT when VRAM gets scarce and apps start
competing hard.
In short, this is because other (unprotected) applications end up
filling VRAM before protected applications do. This causes TTM to back
off and try allocating in GTT before anything else, and that is where
the allocation is placed in the end. The existing eviction protection
cannot prevent this, because no attempt at evicting is ever made
(although you could consider the backing-off as an immediate eviction to
GTT).
This series tries to alleviate this by adding a special case when the
allocation is protected by cgroups: Instead of backing off immediately,
TTM will try evicting unprotected buffers from the domain to make space
for the protected one. This ensures that applications can actually use
all the memory protection awarded to them by the system, without being
prone to ping-ponging (only protected allocations can evict unprotected
ones, never the other way around).
The first two patches add a few small utilities to the dmem controller
that are needed to implement this. The other patches are the TTM implementation:
"drm/ttm: Be more aggressive..." decouples cgroup charging from resource
allocation to allow us to hold on to the charge even if allocation fails
on first try, and adds a path to call ttm_bo_evict_alloc when the
charged allocation falls within min/low protection limits.
"drm/ttm: Use common ancestor..." is a more general improvement in
correctly implementing cgroup protection semantics. With recursive
protection rules, unused memory protection afforded to a parent node is
transferred to children recursively, which helps keep entire
subtrees from stealing each other's memory without needing to protect
each cgroup individually. This doesn't apply when considering direct
siblings inside the same subtree, so in order to not break
prioritization between these siblings, we need to consider the
relationship of evictor and evictee when calculating protection.
In practice, this fixes cases where a protected cgroup cannot steal
memory from unprotected siblings (which, in turn, leads to eviction
failures and new allocations being placed in GTT).
Thanks,
Natalie
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
Changes in v4:
- Split cgroup charge decoupling and eviction logic changes into
separate commits (Tvrtko)
- Fix two cases of errno handling in ttm_bo_alloc_at_place and its caller
(Tvrtko)
- Improve commit message/description of "drm/ttm: Make a helper..." (now
"drm/ttm: Extract code...") (Tvrtko)
- Documentation improvements for new TTM eviction logic (Tvrtko)
- Formatting fixes (Tvrtko)
- Link to v3: https://lore.kernel.org/r/20251110-dmemcg-aggressive-protect-v3-0-219ffcfc54e9@gmx.de
Changes in v3:
- Improved documentation around cgroup queries and TTM eviction helpers
(Maarten)
- Fixed up ttm_bo_alloc_at_place charge failure logic to return either
-EBUSY or -ENOSPC, not -EAGAIN (found this myself)
- Link to v2: https://lore.kernel.org/r/20251015-dmemcg-aggressive-protect-v2-0-36644fb4e37f@gmx.de
Changes in v2:
- Factored out the ttm logic for charging/allocating/evicting into a
separate helper to keep things simpler
- Link to v1: https://lore.kernel.org/r/20250915-dmemcg-aggressive-protect-v1-0-2f3353bfcdac@gmx.de
---
Natalie Vock (6):
cgroup/dmem: Add queries for protection values
cgroup/dmem: Add dmem_cgroup_common_ancestor helper
drm/ttm: Extract code for attempting allocation in a place
drm/ttm: Split cgroup charge and resource allocation
drm/ttm: Be more aggressive when allocating below protection limit
drm/ttm: Use common ancestor of evictor and evictee as limit pool
drivers/gpu/drm/ttm/ttm_bo.c | 198 +++++++++++++++++++++++++++++++------
drivers/gpu/drm/ttm/ttm_resource.c | 48 ++++++---
include/drm/ttm/ttm_resource.h | 6 +-
include/linux/cgroup_dmem.h | 25 +++++
kernel/cgroup/dmem.c | 87 ++++++++++++++++
5 files changed, 322 insertions(+), 42 deletions(-)
---
base-commit: 61c0f69a2ff79c8f388a9e973abb4853be467127
change-id: 20250915-dmemcg-aggressive-protect-5cf37f717cdb
Best regards,
--
Natalie Vock <natalie.vock@gmx.de>
* [PATCH v4 1/6] cgroup/dmem: Add queries for protection values
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
@ 2026-02-25 12:10 ` Natalie Vock
2026-02-25 12:10 ` [PATCH v4 2/6] cgroup/dmem: Add dmem_cgroup_common_ancestor helper Natalie Vock
` (4 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
Add dmem_cgroup_below_{min,low}() to query whether a pool's usage falls
within its min/low protection limits. Callers can use this feedback to
be more aggressive in making space for allocations of a cgroup if they
know it is protected.
These are counterparts to memcg's mem_cgroup_below_{min,low}.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
include/linux/cgroup_dmem.h | 16 ++++++++++++
kernel/cgroup/dmem.c | 62 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 78 insertions(+)
diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index dd4869f1d736e..1a88cd0c9eb00 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -24,6 +24,10 @@ void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size);
bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
struct dmem_cgroup_pool_state *test_pool,
bool ignore_low, bool *ret_hit_low);
+bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test);
+bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test);
void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
#else
@@ -59,6 +63,18 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
return true;
}
+static inline bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ return false;
+}
+
+static inline bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ return false;
+}
+
static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
{ }
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 9d95824dc6fa0..28227405f7cfe 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -694,6 +694,68 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
}
EXPORT_SYMBOL_GPL(dmem_cgroup_try_charge);
+/**
+ * dmem_cgroup_below_min() - Tests whether current usage is within min limit.
+ *
+ * @root: Root of the subtree to calculate protection for, or NULL to calculate global protection.
+ * @test: The pool to test the usage/min limit of.
+ *
+ * Return: true if usage is below min and the cgroup is protected, false otherwise.
+ */
+bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ if (root == test || !pool_parent(test))
+ return false;
+
+ if (!root) {
+ for (root = test; pool_parent(root); root = pool_parent(root))
+ {}
+ }
+
+ /*
+ * In mem_cgroup_below_min(), the memcg counterpart, this call is missing.
+ * mem_cgroup_below_min() gets called during traversal of the cgroup tree, where
+ * protection is already calculated as part of the traversal. dmem cgroup eviction
+ * does not traverse the cgroup tree, so we need to recalculate effective protection
+ * here.
+ */
+ dmem_cgroup_calculate_protection(root, test);
+ return page_counter_read(&test->cnt) <= READ_ONCE(test->cnt.emin);
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_below_min);
+
+/**
+ * dmem_cgroup_below_low() - Tests whether current usage is within low limit.
+ *
+ * @root: Root of the subtree to calculate protection for, or NULL to calculate global protection.
+ * @test: The pool to test the usage/low limit of.
+ *
+ * Return: true if usage is below low and the cgroup is protected, false otherwise.
+ */
+bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
+ struct dmem_cgroup_pool_state *test)
+{
+ if (root == test || !pool_parent(test))
+ return false;
+
+ if (!root) {
+ for (root = test; pool_parent(root); root = pool_parent(root))
+ {}
+ }
+
+ /*
+ * In mem_cgroup_below_low(), the memcg counterpart, this call is missing.
+ * mem_cgroup_below_low() gets called during traversal of the cgroup tree, where
+ * protection is already calculated as part of the traversal. dmem cgroup eviction
+ * does not traverse the cgroup tree, so we need to recalculate effective protection
+ * here.
+ */
+ dmem_cgroup_calculate_protection(root, test);
+ return page_counter_read(&test->cnt) <= READ_ONCE(test->cnt.elow);
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_below_low);
+
static int dmem_cgroup_region_capacity_show(struct seq_file *sf, void *v)
{
struct dmem_cgroup_region *region;
--
2.53.0
* [PATCH v4 2/6] cgroup/dmem: Add dmem_cgroup_common_ancestor helper
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
2026-02-25 12:10 ` [PATCH v4 1/6] cgroup/dmem: Add queries for protection values Natalie Vock
@ 2026-02-25 12:10 ` Natalie Vock
2026-02-25 17:16 ` Tejun Heo
2026-02-25 12:10 ` [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
` (3 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
This helps to find the common ancestor of the cgroup pools two resources
are charged to, which is important when determining whether it's helpful
to evict one resource in favor of another.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
include/linux/cgroup_dmem.h | 9 +++++++++
kernel/cgroup/dmem.c | 25 +++++++++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index 1a88cd0c9eb00..444b84f4c253a 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -28,6 +28,8 @@ bool dmem_cgroup_below_min(struct dmem_cgroup_pool_state *root,
struct dmem_cgroup_pool_state *test);
bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
struct dmem_cgroup_pool_state *test);
+struct dmem_cgroup_pool_state *dmem_cgroup_common_ancestor(struct dmem_cgroup_pool_state *a,
+ struct dmem_cgroup_pool_state *b);
void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
#else
@@ -75,6 +77,13 @@ static inline bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
return false;
}
+static inline
+struct dmem_cgroup_pool_state *dmem_cgroup_common_ancestor(struct dmem_cgroup_pool_state *a,
+ struct dmem_cgroup_pool_state *b)
+{
+ return NULL;
+}
+
static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
{ }
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 28227405f7cfe..26e794400c5c7 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -756,6 +756,31 @@ bool dmem_cgroup_below_low(struct dmem_cgroup_pool_state *root,
}
EXPORT_SYMBOL_GPL(dmem_cgroup_below_low);
+/**
+ * dmem_cgroup_common_ancestor(): Find the first common ancestor of two pools.
+ * @a: First pool to find the common ancestor of.
+ * @b: First pool to find the common ancestor of.
+ *
+ * Return: The first pool that is a parent of both @a and @b, or NULL if either @a or @b are NULL.
+ */
+struct dmem_cgroup_pool_state *dmem_cgroup_common_ancestor(struct dmem_cgroup_pool_state *a,
+ struct dmem_cgroup_pool_state *b)
+{
+ struct dmem_cgroup_pool_state *parent;
+
+ while (a) {
+ parent = b;
+ while (parent) {
+ if (a == parent)
+ return a;
+ parent = pool_parent(parent);
+ }
+ a = pool_parent(a);
+ }
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_common_ancestor);
+
static int dmem_cgroup_region_capacity_show(struct seq_file *sf, void *v)
{
struct dmem_cgroup_region *region;
--
2.53.0
* [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
2026-02-25 12:10 ` [PATCH v4 1/6] cgroup/dmem: Add queries for protection values Natalie Vock
2026-02-25 12:10 ` [PATCH v4 2/6] cgroup/dmem: Add dmem_cgroup_common_ancestor helper Natalie Vock
@ 2026-02-25 12:10 ` Natalie Vock
2026-02-25 15:18 ` Tvrtko Ursulin
2026-02-25 15:27 ` Tvrtko Ursulin
2026-02-25 12:10 ` [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation Natalie Vock
` (2 subsequent siblings)
5 siblings, 2 replies; 13+ messages in thread
Move all code for attempting allocation in a specific place to
ttm_bo_alloc_at_place. With subsequent patches, this logic is going to get
more complicated, so it helps readability to have this separate.
ttm_bo_alloc_at_place takes a pointer to a struct ttm_bo_alloc_state.
This struct holds various state produced by the allocation (e.g. cgroup
resource associated with the allocation) that the caller needs to keep
track of (and potentially dispose of). This is just the limiting cgroup
pool for now, but future patches will add more state needing to be tracked.
ttm_bo_alloc_at_place also communicates via its return code whether eviction
using ttm_bo_evict_alloc should be attempted. This is preparation for
attempting eviction in more cases than just force_space being set.
No functional change intended.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 109 +++++++++++++++++++++++++++++++++----------
1 file changed, 84 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index acb9197db8798..48dbaaa46824c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -489,6 +489,11 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
return ret;
}
+struct ttm_bo_alloc_state {
+ /** @limit_pool: Which pool limit we should test against */
+ struct dmem_cgroup_pool_state *limit_pool;
+};
+
/**
* struct ttm_bo_evict_walk - Parameters for the evict walk.
*/
@@ -504,12 +509,13 @@ struct ttm_bo_evict_walk {
/** @evicted: Number of successful evictions. */
unsigned long evicted;
- /** @limit_pool: Which pool limit we should test against */
- struct dmem_cgroup_pool_state *limit_pool;
/** @try_low: Whether we should attempt to evict BO's with low watermark threshold */
bool try_low;
/** @hit_low: If we cannot evict a bo when @try_low is false (first pass) */
bool hit_low;
+
+ /** @alloc_state: State associated with the allocation attempt. */
+ struct ttm_bo_alloc_state *alloc_state;
};
static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
@@ -518,8 +524,9 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
container_of(walk, typeof(*evict_walk), walk);
s64 lret;
- if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
- evict_walk->try_low, &evict_walk->hit_low))
+ if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
+ bo->resource->css, evict_walk->try_low,
+ &evict_walk->hit_low))
return 0;
if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
@@ -561,7 +568,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
struct ttm_operation_ctx *ctx,
struct ww_acquire_ctx *ticket,
struct ttm_resource **res,
- struct dmem_cgroup_pool_state *limit_pool)
+ struct ttm_bo_alloc_state *state)
{
struct ttm_bo_evict_walk evict_walk = {
.walk = {
@@ -574,7 +581,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
.place = place,
.evictor = evictor,
.res = res,
- .limit_pool = limit_pool,
+ .alloc_state = state,
};
s64 lret;
@@ -689,6 +696,58 @@ static int ttm_bo_add_pipelined_eviction_fences(struct ttm_buffer_object *bo,
return dma_resv_reserve_fences(bo->base.resv, 1);
}
+
+/**
+ * ttm_bo_alloc_at_place - Attempt allocating a BO's backing store in a place
+ *
+ * @bo: The buffer to allocate the backing store of
+ * @place: The place to attempt allocation in
+ * @ctx: ttm_operation_ctx associated with this allocation
+ * @force_space: If we should evict buffers to force space
+ * @res: On allocation success, the resulting struct ttm_resource.
+ * @alloc_state: Object holding allocation state such as charged cgroups.
+ *
+ * Returns:
+ * 0: Allocation succeeded.
+ * -EBUSY: No space available, but allocation should be retried with ttm_bo_evict_alloc.
+ * -ENOSPC: No space available, allocation should not be retried.
+ * -ERESTARTSYS: An interruptible sleep was interrupted by a signal.
+ *
+ */
+static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
+ const struct ttm_place *place,
+ struct ttm_operation_ctx *ctx,
+ bool force_space,
+ struct ttm_resource **res,
+ struct ttm_bo_alloc_state *alloc_state)
+{
+ bool may_evict;
+ int ret;
+
+ may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
+
+ ret = ttm_resource_alloc(bo, place, res,
+ force_space ? &alloc_state->limit_pool : NULL);
+
+ if (ret) {
+ /*
+ * -EAGAIN means the charge failed, which we treat like an
+ * allocation failure. Therefore, return an error code indicating
+ * the allocation failed - either -EBUSY if the allocation should
+ * be retried with eviction, or -ENOSPC if there should be no second
+ * attempt.
+ */
+ if (ret == -EAGAIN)
+ return may_evict ? -EBUSY : -ENOSPC;
+
+ if (ret == -ENOSPC && may_evict)
+ return -EBUSY;
+
+ return ret;
+ }
+
+ return 0;
+}
+
/**
* ttm_bo_alloc_resource - Allocate backing store for a BO
*
@@ -714,7 +773,9 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
bool force_space,
struct ttm_resource **res)
{
+ struct ttm_bo_alloc_state alloc_state = {0};
struct ttm_device *bdev = bo->bdev;
+ struct ttm_resource_manager *man;
struct ww_acquire_ctx *ticket;
int i, ret;
@@ -725,9 +786,6 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
for (i = 0; i < placement->num_placement; ++i) {
const struct ttm_place *place = &placement->placement[i];
- struct dmem_cgroup_pool_state *limit_pool = NULL;
- struct ttm_resource_manager *man;
- bool may_evict;
man = ttm_manager_type(bdev, place->mem_type);
if (!man || !ttm_resource_manager_used(man))
@@ -737,25 +795,26 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
TTM_PL_FLAG_FALLBACK))
continue;
- may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
- ret = ttm_resource_alloc(bo, place, res, force_space ? &limit_pool : NULL);
- if (ret) {
- if (ret != -ENOSPC && ret != -EAGAIN) {
- dmem_cgroup_pool_state_put(limit_pool);
- return ret;
- }
- if (!may_evict) {
- dmem_cgroup_pool_state_put(limit_pool);
- continue;
- }
+ ret = ttm_bo_alloc_at_place(bo, place, ctx, force_space,
+ res, &alloc_state);
+ if (ret == -ENOSPC) {
+ dmem_cgroup_pool_state_put(alloc_state.limit_pool);
+ continue;
+ } else if (ret == -EBUSY) {
ret = ttm_bo_evict_alloc(bdev, man, place, bo, ctx,
- ticket, res, limit_pool);
- dmem_cgroup_pool_state_put(limit_pool);
- if (ret == -EBUSY)
+ ticket, res, &alloc_state);
+
+ dmem_cgroup_pool_state_put(alloc_state.limit_pool);
+
+ if (ret) {
+ if (ret != -EBUSY)
+ return ret;
continue;
- if (ret)
- return ret;
+ }
+ } else if (ret) {
+ dmem_cgroup_pool_state_put(alloc_state.limit_pool);
+ return ret;
}
ret = ttm_bo_add_pipelined_eviction_fences(bo, man, ctx->no_wait_gpu);
--
2.53.0
* [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
` (2 preceding siblings ...)
2026-02-25 12:10 ` [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
@ 2026-02-25 12:10 ` Natalie Vock
2026-02-25 15:33 ` Tvrtko Ursulin
2026-02-25 12:10 ` [PATCH v4 5/6] drm/ttm: Be more aggressive when allocating below protection limit Natalie Vock
2026-02-25 12:10 ` [PATCH v4 6/6] drm/ttm: Use common ancestor of evictor and evictee as limit pool Natalie Vock
5 siblings, 1 reply; 13+ messages in thread
Coupling resource allocation and cgroup charging is racy when charging
succeeds, but subsequent resource allocation fails. Certain eviction
decisions are made on the basis of whether the allocating cgroup is
protected, i.e. within its min/low limits, but with the charge being
tied to resource allocation (and uncharged when the resource allocation
fails), this check is done at a point where the allocation is not actually
charged to the cgroup.
This is subtly wrong if the allocation were to cause the cgroup to exceed
the min/low protection, but it's even more wrong if the same cgroup tries
allocating multiple buffers concurrently: In this case, the min/low
protection may pass for all allocation attempts when the real min/low
protection covers only some, or potentially none of the allocated
buffers.
Instead, charge the allocation to the cgroup once and keep the charge
for as long as we try to allocate a ttm_resource, and only undo the charge
if allocating the resource is ultimately unsuccessful and we move on to
a different ttm_place.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 28 +++++++++++++++-------
drivers/gpu/drm/ttm/ttm_resource.c | 48 +++++++++++++++++++++++++++-----------
include/drm/ttm/ttm_resource.h | 6 ++++-
3 files changed, 60 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 48dbaaa46824c..a8914d20b0c32 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -490,6 +490,8 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
}
struct ttm_bo_alloc_state {
+ /** @charge_pool: The memory pool the resource is charged to */
+ struct dmem_cgroup_pool_state *charge_pool;
/** @limit_pool: Which pool limit we should test against */
struct dmem_cgroup_pool_state *limit_pool;
};
@@ -546,7 +548,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
evict_walk->evicted++;
if (evict_walk->res)
lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
- evict_walk->res, NULL);
+ evict_walk->res, evict_walk->alloc_state->charge_pool);
if (lret == 0)
return 1;
out:
@@ -724,10 +726,8 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
int ret;
may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
-
- ret = ttm_resource_alloc(bo, place, res,
- force_space ? &alloc_state->limit_pool : NULL);
-
+ ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
+ force_space ? &alloc_state->limit_pool : NULL);
if (ret) {
/*
* -EAGAIN means the charge failed, which we treat like an
@@ -737,14 +737,23 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
* attempt.
*/
if (ret == -EAGAIN)
- return may_evict ? -EBUSY : -ENOSPC;
+ ret = may_evict ? -EBUSY : -ENOSPC;
+ return ret;
+ }
- if (ret == -ENOSPC && may_evict)
- return -EBUSY;
+ ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
+ if (ret) {
+ if (ret == -ENOSPC && may_evict)
+ ret = -EBUSY;
return ret;
}
+ /*
+ * Ownership of charge_pool has been transferred to the TTM resource,
+ * don't make the caller think we still hold a reference to it.
+ */
+ alloc_state->charge_pool = NULL;
return 0;
}
@@ -799,6 +808,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
res, &alloc_state);
if (ret == -ENOSPC) {
+ dmem_cgroup_pool_state_put(alloc_state.charge_pool);
dmem_cgroup_pool_state_put(alloc_state.limit_pool);
continue;
} else if (ret == -EBUSY) {
@@ -808,11 +818,13 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
dmem_cgroup_pool_state_put(alloc_state.limit_pool);
if (ret) {
+ dmem_cgroup_pool_state_put(alloc_state.charge_pool);
if (ret != -EBUSY)
return ret;
continue;
}
} else if (ret) {
+ dmem_cgroup_pool_state_put(alloc_state.charge_pool);
dmem_cgroup_pool_state_put(alloc_state.limit_pool);
return ret;
}
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 192fca24f37e4..a8a836f6e376a 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -373,30 +373,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
}
EXPORT_SYMBOL(ttm_resource_fini);
+/**
+ * ttm_resource_try_charge - charge a resource manager's cgroup pool
+ * @bo: buffer for which an allocation should be charged
+ * @place: where the allocation is attempted to be placed
+ * @ret_pool: on charge success, the pool that was charged
+ * @ret_limit_pool: on charge failure, the pool responsible for the failure
+ *
+ * Should be used to charge cgroups before attempting resource allocation.
+ * When charging succeeds, the value of ret_pool should be passed to
+ * ttm_resource_alloc.
+ *
+ * Returns: 0 on charge success, negative errno on failure.
+ */
+int ttm_resource_try_charge(struct ttm_buffer_object *bo,
+ const struct ttm_place *place,
+ struct dmem_cgroup_pool_state **ret_pool,
+ struct dmem_cgroup_pool_state **ret_limit_pool)
+{
+ struct ttm_resource_manager *man =
+ ttm_manager_type(bo->bdev, place->mem_type);
+
+ if (!man->cg) {
+ *ret_pool = NULL;
+ if (ret_limit_pool)
+ *ret_limit_pool = NULL;
+ return 0;
+ }
+
+ return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
+ ret_limit_pool);
+}
+
int ttm_resource_alloc(struct ttm_buffer_object *bo,
const struct ttm_place *place,
struct ttm_resource **res_ptr,
- struct dmem_cgroup_pool_state **ret_limit_pool)
+ struct dmem_cgroup_pool_state *charge_pool)
{
struct ttm_resource_manager *man =
ttm_manager_type(bo->bdev, place->mem_type);
- struct dmem_cgroup_pool_state *pool = NULL;
int ret;
- if (man->cg) {
- ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
- if (ret)
- return ret;
- }
-
ret = man->func->alloc(man, bo, place, res_ptr);
- if (ret) {
- if (pool)
- dmem_cgroup_uncharge(pool, bo->base.size);
+ if (ret)
return ret;
- }
- (*res_ptr)->css = pool;
+ (*res_ptr)->css = charge_pool;
spin_lock(&bo->bdev->lru_lock);
ttm_resource_add_bulk_move(*res_ptr, bo);
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 33e80f30b8b82..549b5b796884d 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -456,10 +456,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
void ttm_resource_fini(struct ttm_resource_manager *man,
struct ttm_resource *res);
+int ttm_resource_try_charge(struct ttm_buffer_object *bo,
+ const struct ttm_place *place,
+ struct dmem_cgroup_pool_state **ret_pool,
+ struct dmem_cgroup_pool_state **ret_limit_pool);
int ttm_resource_alloc(struct ttm_buffer_object *bo,
const struct ttm_place *place,
struct ttm_resource **res,
- struct dmem_cgroup_pool_state **ret_limit_pool);
+ struct dmem_cgroup_pool_state *charge_pool);
void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
bool ttm_resource_intersects(struct ttm_device *bdev,
struct ttm_resource *res,
--
2.53.0
* [PATCH v4 5/6] drm/ttm: Be more aggressive when allocating below protection limit
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
` (3 preceding siblings ...)
2026-02-25 12:10 ` [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation Natalie Vock
@ 2026-02-25 12:10 ` Natalie Vock
2026-02-25 12:10 ` [PATCH v4 6/6] drm/ttm: Use common ancestor of evictor and evictee as limit pool Natalie Vock
5 siblings, 0 replies; 13+ messages in thread
When the cgroup's memory usage is below the low/min limit and allocation
fails, try evicting some unprotected buffers to make space. Otherwise,
application buffers may be forced to go into GTT even though usage is
below the corresponding low/min limit, if other applications filled VRAM
with their allocations first.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 52 +++++++++++++++++++++++++++++++++++++++-----
1 file changed, 47 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index a8914d20b0c32..401a6846b470f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -494,6 +494,10 @@ struct ttm_bo_alloc_state {
struct dmem_cgroup_pool_state *charge_pool;
/** @limit_pool: Which pool limit we should test against */
struct dmem_cgroup_pool_state *limit_pool;
+ /** @only_evict_unprotected: Whether only unprotected BOs, i.e. BOs whose cgroup
+ * is exceeding its dmem low/min protection, should be considered for eviction
+ */
+ bool only_evict_unprotected;
};
/**
@@ -590,8 +594,12 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
evict_walk.walk.arg.trylock_only = true;
lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
- /* One more attempt if we hit low limit? */
- if (!lret && evict_walk.hit_low) {
+ /* If we failed to find enough BOs to evict, but we skipped over
+ * some BOs because they were covered by dmem low protection, retry
+ * evicting these protected BOs too, except if we're told not to
+ * consider protected BOs at all.
+ */
+ if (!lret && evict_walk.hit_low && !state->only_evict_unprotected) {
evict_walk.try_low = true;
lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
}
@@ -612,7 +620,8 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
} while (!lret && evict_walk.evicted);
/* We hit the low limit? Try once more */
- if (!lret && evict_walk.hit_low && !evict_walk.try_low) {
+ if (!lret && evict_walk.hit_low && !evict_walk.try_low &&
+ !state->only_evict_unprotected) {
evict_walk.try_low = true;
goto retry;
}
@@ -722,7 +731,7 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
struct ttm_resource **res,
struct ttm_bo_alloc_state *alloc_state)
{
- bool may_evict;
+ bool may_evict, below_low = false;
int ret;
may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
@@ -741,10 +750,43 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
return ret;
}
+ /*
+ * cgroup protection plays a special role in eviction.
+ * Conceptually, protection of memory via the dmem cgroup controller
+ * entitles the protected cgroup to use a certain amount of memory.
+ * There are two types of protection - the 'low' limit is a
+ * "best-effort" protection, whereas the 'min' limit provides a hard
+ * guarantee that memory within the cgroup's allowance will not be
+ * evicted under any circumstance.
+ *
+ * To faithfully model this concept in TTM, we also need to take cgroup
+ * protection into account when allocating. When allocation in one
+ * place fails, TTM will default to trying other places first before
+ * evicting.
+ * If the allocation is covered by dmem cgroup protection, however,
+ * this prevents the allocation from using the memory it is "entitled"
+ * to. To make sure unprotected allocations cannot push new protected
+ * allocations out of places they are "entitled" to use, we should
+ * evict buffers not covered by any cgroup protection, if this
+ * allocation is covered by cgroup protection.
+ *
+ * Buffers covered by 'min' protection are a special case - the 'min'
+ * limit is a stronger guarantee than 'low', and thus buffers protected
+ * by 'low' but not 'min' should also be considered for eviction.
+ * Buffers protected by 'min' will never be considered for eviction
+ * anyway, so the regular eviction path should be triggered here.
+ * Buffers protected by 'low' but not 'min' will take a special
+ * eviction path that only evicts buffers covered by neither 'low' nor
+ * 'min' protection.
+ */
+ may_evict |= dmem_cgroup_below_min(NULL, alloc_state->charge_pool);
+ below_low = dmem_cgroup_below_low(NULL, alloc_state->charge_pool);
+ alloc_state->only_evict_unprotected = !may_evict && below_low;
+
ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
if (ret) {
- if (ret == -ENOSPC && may_evict)
+ if (ret == -ENOSPC && (may_evict || below_low))
ret = -EBUSY;
return ret;
}
--
2.53.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v4 6/6] drm/ttm: Use common ancestor of evictor and evictee as limit pool
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
` (4 preceding siblings ...)
2026-02-25 12:10 ` [PATCH v4 5/6] drm/ttm: Be more aggressive when allocating below protection limit Natalie Vock
@ 2026-02-25 12:10 ` Natalie Vock
5 siblings, 0 replies; 13+ messages in thread
From: Natalie Vock @ 2026-02-25 12:10 UTC (permalink / raw)
To: Maarten Lankhorst, Maxime Ripard, Tejun Heo, Johannes Weiner,
Michal Koutný, Christian Koenig, Huang Rui, Matthew Auld,
Matthew Brost, Maarten Lankhorst, Thomas Zimmermann, David Airlie,
Simona Vetter, Tvrtko Ursulin
Cc: cgroups, dri-devel, Natalie Vock
When checking whether to skip certain buffers because they're protected
by dmem.low, we're checking the effective protection of the evictee's
cgroup, but depending on how the evictor's cgroup relates to the
evictee's, the semantics of effective protection values change.
When testing against cgroups from different subtrees, page_counter's
recursive protection propagates memory protection afforded to a parent
down to the child cgroups, even if the children were not explicitly
protected. This prevents cgroups whose parents were afforded no
protection from stealing memory from cgroups whose parents were afforded
more protection, without users having to explicitly propagate this
protection.
However, if we always calculate protection from the root cgroup, this
breaks prioritization of sibling cgroups: If one cgroup was explicitly
protected and its siblings were not, the protected cgroup should get
higher priority, i.e. the protected cgroup should be able to steal from
unprotected siblings. This only works if we restrict the protection
calculation to the subtree shared by evictor and evictee.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
drivers/gpu/drm/ttm/ttm_bo.c | 35 ++++++++++++++++++++++++++++++++---
1 file changed, 32 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 401a6846b470f..12c3241704895 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -526,13 +526,42 @@ struct ttm_bo_evict_walk {
static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
{
+ struct dmem_cgroup_pool_state *limit_pool;
struct ttm_bo_evict_walk *evict_walk =
container_of(walk, typeof(*evict_walk), walk);
s64 lret;
- if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
- bo->resource->css, evict_walk->try_low,
- &evict_walk->hit_low))
+ /*
+ * If only_evict_unprotected is set, then we're trying to evict unprotected
+ * buffers in favor of a protected allocation for charge_pool. Explicitly skip
+ * buffers belonging to the same cgroup here - that cgroup is definitely protected,
+ * even though dmem_cgroup_state_evict_valuable would allow the eviction because a
+ * cgroup is always allowed to evict from itself even if it is protected.
+ */
+ if (evict_walk->alloc_state->only_evict_unprotected &&
+ bo->resource->css == evict_walk->alloc_state->charge_pool)
+ return 0;
+
+ limit_pool = evict_walk->alloc_state->limit_pool;
+ /*
+ * If there is no explicit limit pool, find the root of the shared subtree between
+ * evictor and evictee. This is important so that recursive protection rules can
+ * apply properly: Recursive protection distributes cgroup protection afforded
+ * to a parent cgroup but not used explicitly by a child cgroup between all child
+ * cgroups (see docs of effective_protection in mm/page_counter.c). However, when
+ * direct siblings compete for memory, siblings that were explicitly protected
+ * should get prioritized over siblings that weren't. This only happens correctly
+ * when the root of the shared subtree is passed to
+ * dmem_cgroup_state_evict_valuable. Otherwise, the effective-protection
+ * calculation cannot distinguish direct siblings from unrelated subtrees and the
+ * calculated protection ends up wrong.
+ */
+ if (!limit_pool)
+ limit_pool = dmem_cgroup_common_ancestor(bo->resource->css,
+ evict_walk->alloc_state->charge_pool);
+
+ if (!dmem_cgroup_state_evict_valuable(limit_pool, bo->resource->css,
+ evict_walk->try_low, &evict_walk->hit_low))
return 0;
if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
--
2.53.0
* Re: [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place
2026-02-25 12:10 ` [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
@ 2026-02-25 15:18 ` Tvrtko Ursulin
2026-02-25 15:27 ` Tvrtko Ursulin
1 sibling, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2026-02-25 15:18 UTC (permalink / raw)
To: Natalie Vock, Maarten Lankhorst, Maxime Ripard, Tejun Heo,
Johannes Weiner, Michal Koutný, Christian Koenig, Huang Rui,
Matthew Auld, Matthew Brost, Maarten Lankhorst, Thomas Zimmermann,
David Airlie, Simona Vetter
Cc: cgroups, dri-devel
On 25/02/2026 12:10, Natalie Vock wrote:
> Move all code for attempting allocation for a specific place to
> ttm_bo_alloc_place. With subsequent patches, this logic is going to get
> more complicated, so it helps readability to have this separate.
>
> ttm_bo_alloc_at_place takes a pointer to a struct ttm_bo_alloc_state.
> This struct holds various state produced by the allocation (e.g. cgroup
> resource associated with the allocation) that the caller needs to keep
> track of (and potentially dispose of). This is just the limiting cgroup
> pool for now, but future patches will add more state needing to be tracked.
>
> ttm_bo_alloc_at_place also communicates via return codes if eviction
> using ttm_bo_evict_alloc should be attempted. This is preparation for
> attempting eviction in more cases than just force_space being set.
>
> No functional change intended.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 109 +++++++++++++++++++++++++++++++++----------
> 1 file changed, 84 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index acb9197db8798..48dbaaa46824c 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -489,6 +489,11 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
> return ret;
> }
>
> +struct ttm_bo_alloc_state {
> + /** @limit_pool: Which pool limit we should test against */
> + struct dmem_cgroup_pool_state *limit_pool;
> +};
> +
> /**
> * struct ttm_bo_evict_walk - Parameters for the evict walk.
> */
> @@ -504,12 +509,13 @@ struct ttm_bo_evict_walk {
> /** @evicted: Number of successful evictions. */
> unsigned long evicted;
>
> - /** @limit_pool: Which pool limit we should test against */
> - struct dmem_cgroup_pool_state *limit_pool;
> /** @try_low: Whether we should attempt to evict BO's with low watermark threshold */
> bool try_low;
> /** @hit_low: If we cannot evict a bo when @try_low is false (first pass) */
> bool hit_low;
> +
> + /** @alloc_state: State associated with the allocation attempt. */
> + struct ttm_bo_alloc_state *alloc_state;
> };
>
> static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
> @@ -518,8 +524,9 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
> container_of(walk, typeof(*evict_walk), walk);
> s64 lret;
>
> - if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
> - evict_walk->try_low, &evict_walk->hit_low))
> + if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
> + bo->resource->css, evict_walk->try_low,
> + &evict_walk->hit_low))
> return 0;
>
> if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
> @@ -561,7 +568,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> struct ttm_operation_ctx *ctx,
> struct ww_acquire_ctx *ticket,
> struct ttm_resource **res,
> - struct dmem_cgroup_pool_state *limit_pool)
> + struct ttm_bo_alloc_state *state)
> {
> struct ttm_bo_evict_walk evict_walk = {
> .walk = {
> @@ -574,7 +581,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> .place = place,
> .evictor = evictor,
> .res = res,
> - .limit_pool = limit_pool,
> + .alloc_state = state,
> };
> s64 lret;
>
> @@ -689,6 +696,58 @@ static int ttm_bo_add_pipelined_eviction_fences(struct ttm_buffer_object *bo,
> return dma_resv_reserve_fences(bo->base.resv, 1);
> }
>
> +
> +/**
> + * ttm_bo_alloc_at_place - Attempt allocating a BO's backing store in a place
> + *
> + * @bo: The buffer to allocate the backing store of
> + * @place: The place to attempt allocation in
> + * @ctx: ttm_operation_ctx associated with this allocation
> + * @force_space: If we should evict buffers to force space
> + * @res: On allocation success, the resulting struct ttm_resource.
> + * @alloc_state: Object holding allocation state such as charged cgroups.
> + *
> + * Returns:
> + * -EBUSY: No space available, but allocation should be retried with ttm_bo_evict_alloc.
> + * -ENOSPC: No space available, allocation should not be retried.
> + * -ERESTARTSYS: An interruptible sleep was interrupted by a signal.
> + *
> + */
> +static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct ttm_operation_ctx *ctx,
> + bool force_space,
> + struct ttm_resource **res,
> + struct ttm_bo_alloc_state *alloc_state)
> +{
> + bool may_evict;
> + int ret;
> +
> + may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> +
> + ret = ttm_resource_alloc(bo, place, res,
> + force_space ? &alloc_state->limit_pool : NULL);
> +
> + if (ret) {
> + /*
> + * -EAGAIN means the charge failed, which we treat like an
> + * allocation failure. Therefore, return an error code indicating
> + * the allocation failed - either -EBUSY if the allocation should
> + * be retried with eviction, or -ENOSPC if there should be no second
> + * attempt.
> + */
> + if (ret == -EAGAIN)
> + return may_evict ? -EBUSY : -ENOSPC;
> +
> + if (ret == -ENOSPC && may_evict)
> + return -EBUSY;
> +
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> /**
> * ttm_bo_alloc_resource - Allocate backing store for a BO
> *
> @@ -714,7 +773,9 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> bool force_space,
> struct ttm_resource **res)
> {
> + struct ttm_bo_alloc_state alloc_state = {0};
= {};
> struct ttm_device *bdev = bo->bdev;
> + struct ttm_resource_manager *man;
I don't mind too much if you pull the above two out of the loop, but I
have to point it out again since I am sure you know the principle of not
making changes which are not strictly needed, especially if they are not
a clear win on readability or something.
> struct ww_acquire_ctx *ticket;
> int i, ret;
>
> @@ -725,9 +786,6 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>
> for (i = 0; i < placement->num_placement; ++i) {
> const struct ttm_place *place = &placement->placement[i];
> - struct dmem_cgroup_pool_state *limit_pool = NULL;
> - struct ttm_resource_manager *man;
> - bool may_evict;
>
> man = ttm_manager_type(bdev, place->mem_type);
> if (!man || !ttm_resource_manager_used(man))
> @@ -737,25 +795,26 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> TTM_PL_FLAG_FALLBACK))
> continue;
>
> - may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> - ret = ttm_resource_alloc(bo, place, res, force_space ? &limit_pool : NULL);
> - if (ret) {
> - if (ret != -ENOSPC && ret != -EAGAIN) {
> - dmem_cgroup_pool_state_put(limit_pool);
> - return ret;
> - }
> - if (!may_evict) {
> - dmem_cgroup_pool_state_put(limit_pool);
> - continue;
> - }
> + ret = ttm_bo_alloc_at_place(bo, place, ctx, force_space,
> + res, &alloc_state);
>
> + if (ret == -ENOSPC) {
> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> + continue;
> + } else if (ret == -EBUSY) {
> ret = ttm_bo_evict_alloc(bdev, man, place, bo, ctx,
> - ticket, res, limit_pool);
> - dmem_cgroup_pool_state_put(limit_pool);
> - if (ret == -EBUSY)
> + ticket, res, &alloc_state);
> +
> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> +
> + if (ret) {
> + if (ret != -EBUSY)
> + return ret;
> continue;
> - if (ret)
> - return ret;
> + }
Would keeping the ret checks at one level of indentation look better? Eg
like the current version:
if (ret == -EBUSY)
continue;
else if (ret)
return ret;
Up to you.
Btw, it is an interesting design that there are eviction errors which
prevent trying the next placement. A bit surprising to me but it is out
of scope here.
Anyway, I went back and forth a few times over the logic and it indeed
looks to me that there are no functional changes. Thanks for improving
the commit message as well; now it is completely clear what the patch is
about. With or without the nitpicks:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
> + } else if (ret) {
> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> + return ret;
> }
>
> ret = ttm_bo_add_pipelined_eviction_fences(bo, man, ctx->no_wait_gpu);
>
* Re: [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place
2026-02-25 12:10 ` [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
2026-02-25 15:18 ` Tvrtko Ursulin
@ 2026-02-25 15:27 ` Tvrtko Ursulin
2026-02-26 8:56 ` Tvrtko Ursulin
1 sibling, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2026-02-25 15:27 UTC (permalink / raw)
To: Natalie Vock, Maarten Lankhorst, Maxime Ripard, Tejun Heo,
Johannes Weiner, Michal Koutný, Christian Koenig, Huang Rui,
Matthew Auld, Matthew Brost, Maarten Lankhorst, Thomas Zimmermann,
David Airlie, Simona Vetter
Cc: cgroups, dri-devel
On 25/02/2026 12:10, Natalie Vock wrote:
> Move all code for attempting allocation for a specific place to
> ttm_bo_alloc_place. With subsequent patches, this logic is going to get
> more complicated, so it helps readability to have this separate.
>
> ttm_bo_alloc_at_place takes a pointer to a struct ttm_bo_alloc_state.
> This struct holds various state produced by the allocation (e.g. cgroup
> resource associated with the allocation) that the caller needs to keep
> track of (and potentially dispose of). This is just the limiting cgroup
> pool for now, but future patches will add more state needing to be tracked.
>
> ttm_bo_alloc_at_place also communicates via return codes if eviction
> using ttm_bo_evict_alloc should be attempted. This is preparation for
> attempting eviction in more cases than just force_space being set.
>
> No functional change intended.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 109 +++++++++++++++++++++++++++++++++----------
> 1 file changed, 84 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index acb9197db8798..48dbaaa46824c 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -489,6 +489,11 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
> return ret;
> }
>
> +struct ttm_bo_alloc_state {
> + /** @limit_pool: Which pool limit we should test against */
> + struct dmem_cgroup_pool_state *limit_pool;
> +};
> +
> /**
> * struct ttm_bo_evict_walk - Parameters for the evict walk.
> */
> @@ -504,12 +509,13 @@ struct ttm_bo_evict_walk {
> /** @evicted: Number of successful evictions. */
> unsigned long evicted;
>
> - /** @limit_pool: Which pool limit we should test against */
> - struct dmem_cgroup_pool_state *limit_pool;
> /** @try_low: Whether we should attempt to evict BO's with low watermark threshold */
> bool try_low;
> /** @hit_low: If we cannot evict a bo when @try_low is false (first pass) */
> bool hit_low;
> +
> + /** @alloc_state: State associated with the allocation attempt. */
> + struct ttm_bo_alloc_state *alloc_state;
> };
>
> static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
> @@ -518,8 +524,9 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
> container_of(walk, typeof(*evict_walk), walk);
> s64 lret;
>
> - if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
> - evict_walk->try_low, &evict_walk->hit_low))
> + if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
> + bo->resource->css, evict_walk->try_low,
> + &evict_walk->hit_low))
> return 0;
>
> if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
> @@ -561,7 +568,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> struct ttm_operation_ctx *ctx,
> struct ww_acquire_ctx *ticket,
> struct ttm_resource **res,
> - struct dmem_cgroup_pool_state *limit_pool)
> + struct ttm_bo_alloc_state *state)
> {
> struct ttm_bo_evict_walk evict_walk = {
> .walk = {
> @@ -574,7 +581,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> .place = place,
> .evictor = evictor,
> .res = res,
> - .limit_pool = limit_pool,
> + .alloc_state = state,
> };
> s64 lret;
>
> @@ -689,6 +696,58 @@ static int ttm_bo_add_pipelined_eviction_fences(struct ttm_buffer_object *bo,
> return dma_resv_reserve_fences(bo->base.resv, 1);
> }
>
> +
> +/**
> + * ttm_bo_alloc_at_place - Attempt allocating a BO's backing store in a place
> + *
> + * @bo: The buffer to allocate the backing store of
> + * @place: The place to attempt allocation in
> + * @ctx: ttm_operation_ctx associated with this allocation
> + * @force_space: If we should evict buffers to force space
> + * @res: On allocation success, the resulting struct ttm_resource.
> + * @alloc_state: Object holding allocation state such as charged cgroups.
> + *
> + * Returns:
> + * -EBUSY: No space available, but allocation should be retried with ttm_bo_evict_alloc.
> + * -ENOSPC: No space available, allocation should not be retried.
> + * -ERESTARTSYS: An interruptible sleep was interrupted by a signal.
> + *
> + */
> +static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct ttm_operation_ctx *ctx,
> + bool force_space,
> + struct ttm_resource **res,
> + struct ttm_bo_alloc_state *alloc_state)
> +{
> + bool may_evict;
> + int ret;
> +
> + may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> +
> + ret = ttm_resource_alloc(bo, place, res,
> + force_space ? &alloc_state->limit_pool : NULL);
> +
> + if (ret) {
> + /*
> + * -EAGAIN means the charge failed, which we treat like an
> + * allocation failure. Therefore, return an error code indicating
> + * the allocation failed - either -EBUSY if the allocation should
> + * be retried with eviction, or -ENOSPC if there should be no second
> + * attempt.
> + */
Ah, having started reading 4/6 I see this comment is actually one patch
premature. So please fix that and keep my r-b.
Regards,
Tvrtko
> + if (ret == -EAGAIN)
> + return may_evict ? -EBUSY : -ENOSPC;
> +
> + if (ret == -ENOSPC && may_evict)
> + return -EBUSY;
> +
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> /**
> * ttm_bo_alloc_resource - Allocate backing store for a BO
> *
> @@ -714,7 +773,9 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> bool force_space,
> struct ttm_resource **res)
> {
> + struct ttm_bo_alloc_state alloc_state = {0};
> struct ttm_device *bdev = bo->bdev;
> + struct ttm_resource_manager *man;
> struct ww_acquire_ctx *ticket;
> int i, ret;
>
> @@ -725,9 +786,6 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>
> for (i = 0; i < placement->num_placement; ++i) {
> const struct ttm_place *place = &placement->placement[i];
> - struct dmem_cgroup_pool_state *limit_pool = NULL;
> - struct ttm_resource_manager *man;
> - bool may_evict;
>
> man = ttm_manager_type(bdev, place->mem_type);
> if (!man || !ttm_resource_manager_used(man))
> @@ -737,25 +795,26 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> TTM_PL_FLAG_FALLBACK))
> continue;
>
> - may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> - ret = ttm_resource_alloc(bo, place, res, force_space ? &limit_pool : NULL);
> - if (ret) {
> - if (ret != -ENOSPC && ret != -EAGAIN) {
> - dmem_cgroup_pool_state_put(limit_pool);
> - return ret;
> - }
> - if (!may_evict) {
> - dmem_cgroup_pool_state_put(limit_pool);
> - continue;
> - }
> + ret = ttm_bo_alloc_at_place(bo, place, ctx, force_space,
> + res, &alloc_state);
>
> + if (ret == -ENOSPC) {
> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> + continue;
> + } else if (ret == -EBUSY) {
> ret = ttm_bo_evict_alloc(bdev, man, place, bo, ctx,
> - ticket, res, limit_pool);
> - dmem_cgroup_pool_state_put(limit_pool);
> - if (ret == -EBUSY)
> + ticket, res, &alloc_state);
> +
> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> +
> + if (ret) {
> + if (ret != -EBUSY)
> + return ret;
> continue;
> - if (ret)
> - return ret;
> + }
> + } else if (ret) {
> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> + return ret;
> }
>
> ret = ttm_bo_add_pipelined_eviction_fences(bo, man, ctx->no_wait_gpu);
>
* Re: [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation
2026-02-25 12:10 ` [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation Natalie Vock
@ 2026-02-25 15:33 ` Tvrtko Ursulin
2026-02-25 16:01 ` Tvrtko Ursulin
0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2026-02-25 15:33 UTC (permalink / raw)
To: Natalie Vock, Maarten Lankhorst, Maxime Ripard, Tejun Heo,
Johannes Weiner, Michal Koutný, Christian Koenig, Huang Rui,
Matthew Auld, Matthew Brost, Maarten Lankhorst, Thomas Zimmermann,
David Airlie, Simona Vetter
Cc: cgroups, dri-devel
On 25/02/2026 12:10, Natalie Vock wrote:
> Coupling resource allocation and cgroup charging is racy when charging
> succeeds, but subsequent resource allocation fails. Certain eviction
> decisions are made on the basis of whether the allocating cgroup is
> protected, i.e. within its min/low limits, but with the charge being
> tied to resource allocation (and uncharged when the resource allocation
> fails), this check is done at a poin where the allocation is not actually
s/poin/point/
> charged to the cgroup.
>
> This is subtly wrong if the allocation were to cause the cgroup to exceed
> the min/low protection, but it's even more wrong if the same cgroup tries
> allocating multiple buffers concurrently: In this case, the min/low
> protection may pass for all allocation attempts when the real min/low
> protection covers only some, or potentially none of the allocated
> buffers.
Interesting! Do I understand correctly that this would be a scenario with
multi-threaded buffer allocation, or is there another path to it?
In any case moving the charge to before allocation makes sense to me.
With a caveat that I wasn't involved in the dmem cgroup controller
design so may be missing something.
> Instead, charge the allocation to the cgroup once and keep the charge
> for as long as we try to allocate a ttm_resource, and only undo the charge
> if allocating the resource is ultimately unsuccessful and we move on to
> a different ttm_place.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 28 +++++++++++++++-------
> drivers/gpu/drm/ttm/ttm_resource.c | 48 +++++++++++++++++++++++++++-----------
> include/drm/ttm/ttm_resource.h | 6 ++++-
> 3 files changed, 60 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 48dbaaa46824c..a8914d20b0c32 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -490,6 +490,8 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
> }
>
> struct ttm_bo_alloc_state {
> + /** @charge_pool: The memory pool the resource is charged to */
> + struct dmem_cgroup_pool_state *charge_pool;
> /** @limit_pool: Which pool limit we should test against */
> struct dmem_cgroup_pool_state *limit_pool;
> };
> @@ -546,7 +548,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
> evict_walk->evicted++;
> if (evict_walk->res)
> lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
> - evict_walk->res, NULL);
> + evict_walk->res, evict_walk->alloc_state->charge_pool);
> if (lret == 0)
> return 1;
> out:
> @@ -724,10 +726,8 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
> int ret;
>
> may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> -
> - ret = ttm_resource_alloc(bo, place, res,
> - force_space ? &alloc_state->limit_pool : NULL);
> -
> + ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
> + force_space ? &alloc_state->limit_pool : NULL);
> if (ret) {
> /*
> * -EAGAIN means the charge failed, which we treat like an
> @@ -737,14 +737,23 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
> * attempt.
> */
> if (ret == -EAGAIN)
> - return may_evict ? -EBUSY : -ENOSPC;
> + ret = may_evict ? -EBUSY : -ENOSPC;
> + return ret;
> + }
>
> - if (ret == -ENOSPC && may_evict)
> - return -EBUSY;
> + ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
>
No need for a blank line here.
> + if (ret) {
> + if (ret == -ENOSPC && may_evict)
> + ret = -EBUSY;
Why did you remove EAGAIN handling from after ttm_resource_alloc()?
> return ret;
> }
>
> + /*
> + * Ownership of charge_pool has been transferred to the TTM resource,
> + * don't make the caller think we still hold a reference to it.
> + */
> + alloc_state->charge_pool = NULL;
> return 0;
> }
>
> @@ -799,6 +808,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> res, &alloc_state);
>
> if (ret == -ENOSPC) {
> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> continue;
> } else if (ret == -EBUSY) {
> @@ -808,11 +818,13 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>
> if (ret) {
> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
> if (ret != -EBUSY)
> return ret;
> continue;
> }
> } else if (ret) {
> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
Is uncharge in the failure case hidden in dmem_cgroup_pool_state_put()
somehow?
Regards,
Tvrtko
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> return ret;
> }
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 192fca24f37e4..a8a836f6e376a 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -373,30 +373,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
> }
> EXPORT_SYMBOL(ttm_resource_fini);
>
> +/**
> + * ttm_resource_try_charge - charge a resource manager's cgroup pool
> + * @bo: buffer for which an allocation should be charged
> + * @place: where the allocation is attempted to be placed
> + * @ret_pool: on charge success, the pool that was charged
> + * @ret_limit_pool: on charge failure, the pool responsible for the failure
> + *
> + * Should be used to charge cgroups before attempting resource allocation.
> + * When charging succeeds, the value of ret_pool should be passed to
> + * ttm_resource_alloc.
> + *
> + * Returns: 0 on charge success, negative errno on failure.
> + */
> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct dmem_cgroup_pool_state **ret_pool,
> + struct dmem_cgroup_pool_state **ret_limit_pool)
> +{
> + struct ttm_resource_manager *man =
> + ttm_manager_type(bo->bdev, place->mem_type);
> +
> + if (!man->cg) {
> + *ret_pool = NULL;
> + if (ret_limit_pool)
> + *ret_limit_pool = NULL;
> + return 0;
> + }
> +
> + return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
> + ret_limit_pool);
> +}
> +
> int ttm_resource_alloc(struct ttm_buffer_object *bo,
> const struct ttm_place *place,
> struct ttm_resource **res_ptr,
> - struct dmem_cgroup_pool_state **ret_limit_pool)
> + struct dmem_cgroup_pool_state *charge_pool)
> {
> struct ttm_resource_manager *man =
> ttm_manager_type(bo->bdev, place->mem_type);
> - struct dmem_cgroup_pool_state *pool = NULL;
> int ret;
>
> - if (man->cg) {
> - ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
> - if (ret)
> - return ret;
> - }
> -
> ret = man->func->alloc(man, bo, place, res_ptr);
> - if (ret) {
> - if (pool)
> - dmem_cgroup_uncharge(pool, bo->base.size);
> + if (ret)
> return ret;
> - }
>
> - (*res_ptr)->css = pool;
> + (*res_ptr)->css = charge_pool;
>
> spin_lock(&bo->bdev->lru_lock);
> ttm_resource_add_bulk_move(*res_ptr, bo);
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 33e80f30b8b82..549b5b796884d 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -456,10 +456,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> void ttm_resource_fini(struct ttm_resource_manager *man,
> struct ttm_resource *res);
>
> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct dmem_cgroup_pool_state **ret_pool,
> + struct dmem_cgroup_pool_state **ret_limit_pool);
> int ttm_resource_alloc(struct ttm_buffer_object *bo,
> const struct ttm_place *place,
> struct ttm_resource **res,
> - struct dmem_cgroup_pool_state **ret_limit_pool);
> + struct dmem_cgroup_pool_state *charge_pool);
> void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
> bool ttm_resource_intersects(struct ttm_device *bdev,
> struct ttm_resource *res,
>
* Re: [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation
2026-02-25 15:33 ` Tvrtko Ursulin
@ 2026-02-25 16:01 ` Tvrtko Ursulin
0 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2026-02-25 16:01 UTC (permalink / raw)
To: Natalie Vock, Maarten Lankhorst, Maxime Ripard, Tejun Heo,
Johannes Weiner, Michal Koutný, Christian Koenig, Huang Rui,
Matthew Auld, Matthew Brost, Maarten Lankhorst, Thomas Zimmermann,
David Airlie, Simona Vetter
Cc: cgroups, dri-devel
On 25/02/2026 15:33, Tvrtko Ursulin wrote:
>
> On 25/02/2026 12:10, Natalie Vock wrote:
>> Coupling resource allocation and cgroup charging is racy when charging
>> succeeds, but subsequent resource allocation fails. Certain eviction
>> decisions are made on the basis of whether the allocating cgroup is
>> protected, i.e. within its min/low limits, but with the charge being
>> tied to resource allocation (and uncharged when the resource allocation
>> fails), this check is done at a poin where the allocation is not actually
>
> s/poin/point/
>
>> charged to the cgroup.
>>
>> This is subtly wrong if the allocation were to cause the cgroup to exceed
>> the min/low protection, but it's even more wrong if the same cgroup tries
>> allocating multiple buffers concurrently: In this case, the min/low
>> protection may pass for all allocation attempts when the real min/low
>> protection covers only some, or potentially none of the allocated
>> buffers.
>
> Interesting! Do I understand correctly this would be a scenario with
> multi-threaded buffer allocation, or is there another path to it?
>
> In any case moving the charge to before allocation makes sense to me.
> With a caveat that I wasn't involved in the dmem cgroup controller
> design so may be missing something.
>
>> Instead, charge the allocation to the cgroup once and keep the charge
>> for as long as we try to allocate a ttm_resource, and only undo the
>> charge
>> if allocating the resource is ultimately unsuccessful and we move on to
>> a different ttm_place.
>>
>> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 28 +++++++++++++++-------
>> drivers/gpu/drm/ttm/ttm_resource.c | 48 ++++++++++++++++++++++++++-----------
>> include/drm/ttm/ttm_resource.h | 6 ++++-
>> 3 files changed, 60 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 48dbaaa46824c..a8914d20b0c32 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -490,6 +490,8 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
>> }
>> struct ttm_bo_alloc_state {
>> + /** @charge_pool: The memory pool the resource is charged to */
>> + struct dmem_cgroup_pool_state *charge_pool;
>> /** @limit_pool: Which pool limit we should test against */
>> struct dmem_cgroup_pool_state *limit_pool;
>> };
>> @@ -546,7 +548,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
>> evict_walk->evicted++;
>> if (evict_walk->res)
>> lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
>> - evict_walk->res, NULL);
>> + evict_walk->res, evict_walk->alloc_state->charge_pool);
>> if (lret == 0)
>> return 1;
>> out:
>> @@ -724,10 +726,8 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
>> int ret;
>> may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
>> -
>> - ret = ttm_resource_alloc(bo, place, res,
>> - force_space ? &alloc_state->limit_pool : NULL);
>> -
>> + ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
>> + force_space ? &alloc_state->limit_pool : NULL);
>> if (ret) {
>> /*
>> * -EAGAIN means the charge failed, which we treat like an
>> @@ -737,14 +737,23 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
>> * attempt.
>> */
>> if (ret == -EAGAIN)
>> - return may_evict ? -EBUSY : -ENOSPC;
>> + ret = may_evict ? -EBUSY : -ENOSPC;
>> + return ret;
>> + }
>> - if (ret == -ENOSPC && may_evict)
>> - return -EBUSY;
>> + ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
>
> No need for a blank line here.
>
>> + if (ret) {
>> + if (ret == -ENOSPC && may_evict)
>> + ret = -EBUSY;
>
> Why did you remove EAGAIN handling from after ttm_resource_alloc()?
I figured this part out. I guess EAGAIN can only come out of
dmem_cgroup_try_charge(), which is no longer here. Makes sense.
Return code handling changes look fine to me in this case. Just the
question of uncharging remains.
Regards,
Tvrtko
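
The concurrent-allocation hazard described in the commit message can be
illustrated with a standalone toy model (names like toy_pool are made up
for illustration; this is not the dmem API):

```c
#include <assert.h>

/* Toy pool model: an allocation counts as "protected" while pool usage
 * stays within the min limit.  Illustrative only, not the dmem code. */
struct toy_pool {
	long usage;
	long min;
};

static int toy_is_protected(const struct toy_pool *p)
{
	return p->usage <= p->min;
}

/* Racy ordering: the protection check runs before the charge lands,
 * so an allocation can be judged against usage that does not yet
 * include it (or, concurrently, any of its racing siblings). */
static int toy_alloc_check_then_charge(struct toy_pool *p, long size)
{
	int prot = toy_is_protected(p);

	p->usage += size;
	return prot;
}

/* Fixed ordering: charge first, then judge protection against usage
 * that actually includes this allocation. */
static int toy_alloc_charge_then_check(struct toy_pool *p, long size)
{
	p->usage += size;
	return toy_is_protected(p);
}
```

Even run sequentially this shows the boundary error: two 80-unit
allocations against min=100 are both treated as protected by the racy
ordering, although together they exceed the limit. With real
concurrency, every racing allocation can sample the same stale usage
before any charge lands.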
>
>> return ret;
>> }
>> + /*
>> + * Ownership of charge_pool has been transferred to the TTM resource,
>> + * don't make the caller think we still hold a reference to it.
>> + */
>> + alloc_state->charge_pool = NULL;
>> return 0;
>> }
>> @@ -799,6 +808,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> res, &alloc_state);
>> if (ret == -ENOSPC) {
>> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
>> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> continue;
>> } else if (ret == -EBUSY) {
>> @@ -808,11 +818,13 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> if (ret) {
>> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
>> if (ret != -EBUSY)
>> return ret;
>> continue;
>> }
>> } else if (ret) {
>> + dmem_cgroup_pool_state_put(alloc_state.charge_pool);
>
> Is uncharge in the failure case hidden in dmem_cgroup_pool_state_put()
> somehow?
>
> Regards,
>
> Tvrtko
>
>> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> return ret;
>> }
>> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
>> index 192fca24f37e4..a8a836f6e376a 100644
>> --- a/drivers/gpu/drm/ttm/ttm_resource.c
>> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
>> @@ -373,30 +373,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
>> }
>> EXPORT_SYMBOL(ttm_resource_fini);
>> +/**
>> + * ttm_resource_try_charge - charge a resource manager's cgroup pool
>> + * @bo: buffer for which an allocation should be charged
>> + * @place: where the allocation is attempted to be placed
>> + * @ret_pool: on charge success, the pool that was charged
>> + * @ret_limit_pool: on charge failure, the pool responsible for the failure
>> + *
>> + * Should be used to charge cgroups before attempting resource allocation.
>> + * When charging succeeds, the value of ret_pool should be passed to
>> + * ttm_resource_alloc.
>> + *
>> + * Returns: 0 on charge success, negative errno on failure.
>> + */
>> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
>> + const struct ttm_place *place,
>> + struct dmem_cgroup_pool_state **ret_pool,
>> + struct dmem_cgroup_pool_state **ret_limit_pool)
>> +{
>> + struct ttm_resource_manager *man =
>> + ttm_manager_type(bo->bdev, place->mem_type);
>> +
>> + if (!man->cg) {
>> + *ret_pool = NULL;
>> + if (ret_limit_pool)
>> + *ret_limit_pool = NULL;
>> + return 0;
>> + }
>> +
>> + return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
>> + ret_limit_pool);
>> +}
>> +
>> int ttm_resource_alloc(struct ttm_buffer_object *bo,
>> const struct ttm_place *place,
>> struct ttm_resource **res_ptr,
>> - struct dmem_cgroup_pool_state **ret_limit_pool)
>> + struct dmem_cgroup_pool_state *charge_pool)
>> {
>> struct ttm_resource_manager *man =
>> ttm_manager_type(bo->bdev, place->mem_type);
>> - struct dmem_cgroup_pool_state *pool = NULL;
>> int ret;
>> - if (man->cg) {
>> - ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
>> - if (ret)
>> - return ret;
>> - }
>> -
>> ret = man->func->alloc(man, bo, place, res_ptr);
>> - if (ret) {
>> - if (pool)
>> - dmem_cgroup_uncharge(pool, bo->base.size);
>> + if (ret)
>> return ret;
>> - }
>> - (*res_ptr)->css = pool;
>> + (*res_ptr)->css = charge_pool;
>> spin_lock(&bo->bdev->lru_lock);
>> ttm_resource_add_bulk_move(*res_ptr, bo);
>> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
>> index 33e80f30b8b82..549b5b796884d 100644
>> --- a/include/drm/ttm/ttm_resource.h
>> +++ b/include/drm/ttm/ttm_resource.h
>> @@ -456,10 +456,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>> void ttm_resource_fini(struct ttm_resource_manager *man,
>> struct ttm_resource *res);
>> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
>> + const struct ttm_place *place,
>> + struct dmem_cgroup_pool_state **ret_pool,
>> + struct dmem_cgroup_pool_state **ret_limit_pool);
>> int ttm_resource_alloc(struct ttm_buffer_object *bo,
>> const struct ttm_place *place,
>> struct ttm_resource **res,
>> - struct dmem_cgroup_pool_state **ret_limit_pool);
>> + struct dmem_cgroup_pool_state *charge_pool);
>> void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
>> bool ttm_resource_intersects(struct ttm_device *bdev,
>> struct ttm_resource *res,
>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 2/6] cgroup/dmem: Add dmem_cgroup_common_ancestor helper
2026-02-25 12:10 ` [PATCH v4 2/6] cgroup/dmem: Add dmem_cgroup_common_ancestor helper Natalie Vock
@ 2026-02-25 17:16 ` Tejun Heo
0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2026-02-25 17:16 UTC (permalink / raw)
To: Natalie Vock
Cc: Maarten Lankhorst, Maxime Ripard, Johannes Weiner, Michal Koutny,
Christian Koenig, Huang Rui, Matthew Auld, Matthew Brost,
Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
Tvrtko Ursulin, cgroups, dri-devel
Each cgroup already knows all its ancestors in cgrp->ancestors[] along with
its depth in cgrp->level (see cgroup_is_descendant() and cgroup_ancestor()).
This can be used to implement a generic cgroup_common_ancestor() a lot more
efficiently. Something like:
static inline struct cgroup *cgroup_common_ancestor(struct cgroup *a,
						    struct cgroup *b)
{
	int level;

	for (level = min(a->level, b->level); level >= 0; level--)
		if (a->ancestors[level] == b->ancestors[level])
			return a->ancestors[level];

	return NULL;
}
This is O(depth) instead of O(n*m). Can you add a helper like the above in
include/linux/cgroup.h and use it here?
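
A standalone sketch of how that lookup behaves (struct node is a
hypothetical stand-in for struct cgroup, carrying only the .level and
.ancestors[] fields relied on above; attach() is an invented helper
that fills them in):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8

/* Toy stand-in for struct cgroup: a node records its depth in .level
 * and every ancestor (including itself) indexed by level, mirroring
 * cgrp->level and cgrp->ancestors[]. */
struct node {
	int level;
	struct node *ancestors[MAX_DEPTH];
};

/* Link child under parent, inheriting the parent's ancestor slots. */
static void attach(struct node *child, struct node *parent)
{
	child->level = parent->level + 1;
	for (int i = 0; i <= parent->level; i++)
		child->ancestors[i] = parent->ancestors[i];
	child->ancestors[child->level] = child;
}

/* Same shape as the suggested helper: scan from the shallower node's
 * level toward the root; the first matching slot is the deepest common
 * ancestor, so the walk is O(depth). */
static struct node *common_ancestor(struct node *a, struct node *b)
{
	int level = a->level < b->level ? a->level : b->level;

	for (; level >= 0; level--)
		if (a->ancestors[level] == b->ancestors[level])
			return a->ancestors[level];
	return NULL;
}
```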
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place
2026-02-25 15:27 ` Tvrtko Ursulin
@ 2026-02-26 8:56 ` Tvrtko Ursulin
0 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2026-02-26 8:56 UTC (permalink / raw)
To: Natalie Vock, Maarten Lankhorst, Maxime Ripard, Tejun Heo,
Johannes Weiner, Michal Koutný, Christian Koenig, Huang Rui,
Matthew Auld, Matthew Brost, Maarten Lankhorst, Thomas Zimmermann,
David Airlie, Simona Vetter
Cc: cgroups, dri-devel
On 25/02/2026 15:27, Tvrtko Ursulin wrote:
>
> On 25/02/2026 12:10, Natalie Vock wrote:
>> Move all code for attempting allocation for a specific place to
>> ttm_bo_alloc_at_place. With subsequent patches, this logic is going to get
>> more complicated, so it helps readability to have this separate.
>>
>> ttm_bo_alloc_at_place takes a pointer to a struct ttm_bo_alloc_state.
>> This struct holds various state produced by the allocation (e.g. cgroup
>> resource associated with the allocation) that the caller needs to keep
>> track of (and potentially dispose of). This is just the limiting cgroup
>> pool for now, but future patches will add more state needing to be
>> tracked.
>>
>> ttm_bo_alloc_at_place also communicates via return codes if eviction
>> using ttm_bo_evict_alloc should be attempted. This is preparation for
>> attempting eviction in more cases than just force_space being set.
>>
>> No functional change intended.
>>
>> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 109 ++++++++++++++++++++++++++++++++----------
>> 1 file changed, 84 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index acb9197db8798..48dbaaa46824c 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -489,6 +489,11 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
>> return ret;
>> }
>> +struct ttm_bo_alloc_state {
>> + /** @limit_pool: Which pool limit we should test against */
>> + struct dmem_cgroup_pool_state *limit_pool;
>> +};
>> +
>> /**
>> * struct ttm_bo_evict_walk - Parameters for the evict walk.
>> */
>> @@ -504,12 +509,13 @@ struct ttm_bo_evict_walk {
>> /** @evicted: Number of successful evictions. */
>> unsigned long evicted;
>> - /** @limit_pool: Which pool limit we should test against */
>> - struct dmem_cgroup_pool_state *limit_pool;
>> /** @try_low: Whether we should attempt to evict BO's with low watermark threshold */
>> bool try_low;
>> /** @hit_low: If we cannot evict a bo when @try_low is false (first pass) */
>> bool hit_low;
>> +
>> + /** @alloc_state: State associated with the allocation attempt. */
>> + struct ttm_bo_alloc_state *alloc_state;
>> };
>> static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
>> @@ -518,8 +524,9 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
>> container_of(walk, typeof(*evict_walk), walk);
>> s64 lret;
>> - if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
>> - evict_walk->try_low, &evict_walk->hit_low))
>> + if (!dmem_cgroup_state_evict_valuable(evict_walk->alloc_state->limit_pool,
>> + bo->resource->css, evict_walk->try_low,
>> + &evict_walk->hit_low))
>> return 0;
>> if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
>> @@ -561,7 +568,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
>> struct ttm_operation_ctx *ctx,
>> struct ww_acquire_ctx *ticket,
>> struct ttm_resource **res,
>> - struct dmem_cgroup_pool_state *limit_pool)
>> + struct ttm_bo_alloc_state *state)
>> {
>> struct ttm_bo_evict_walk evict_walk = {
>> .walk = {
>> @@ -574,7 +581,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
>> .place = place,
>> .evictor = evictor,
>> .res = res,
>> - .limit_pool = limit_pool,
>> + .alloc_state = state,
>> };
>> s64 lret;
>> @@ -689,6 +696,58 @@ static int ttm_bo_add_pipelined_eviction_fences(struct ttm_buffer_object *bo,
>> return dma_resv_reserve_fences(bo->base.resv, 1);
>> }
>> +
>> +/**
>> + * ttm_bo_alloc_at_place - Attempt allocating a BO's backing store in a place
>> + *
>> + * @bo: The buffer to allocate the backing store of
>> + * @place: The place to attempt allocation in
>> + * @ctx: ttm_operation_ctx associated with this allocation
>> + * @force_space: If we should evict buffers to force space
>> + * @res: On allocation success, the resulting struct ttm_resource.
>> + * @alloc_state: Object holding allocation state such as charged cgroups.
>> + *
>> + * Returns:
>> + * -EBUSY: No space available, but allocation should be retried with ttm_bo_evict_alloc.
>> + * -ENOSPC: No space available, allocation should not be retried.
>> + * -ERESTARTSYS: An interruptible sleep was interrupted by a signal.
>> + *
>> + */
>> +static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
>> + const struct ttm_place *place,
>> + struct ttm_operation_ctx *ctx,
>> + bool force_space,
>> + struct ttm_resource **res,
>> + struct ttm_bo_alloc_state *alloc_state)
>> +{
>> + bool may_evict;
>> + int ret;
>> +
>> + may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
>> +
>> + ret = ttm_resource_alloc(bo, place, res,
>> + force_space ? &alloc_state->limit_pool : NULL);
>> +
>> + if (ret) {
>> + /*
>> + * -EAGAIN means the charge failed, which we treat like an
>> + * allocation failure. Therefore, return an error code indicating
>> + * the allocation failed - either -EBUSY if the allocation should
>> + * be retried with eviction, or -ENOSPC if there should be no second
>> + * attempt.
>> + */
>
> Ah having started reading 4/6 I see this comment actually is one patch
> premature. So please fix that and keep my r-b.
Or perhaps you are talking about the charge here: not because a later
patch puts the try-charge call right here, but because even now it
happens inside ttm_resource_alloc? I guess that's passable, although
not immediately obvious from just the context of this function.
Okay, I think the comment can stay as is since in the next patch it
becomes immediately obvious, sorry for the noise.
Regards,
Tvrtko
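
For reference, the return-code mapping in the hunk below condenses to
something like this userspace sketch (map_alloc_error is an invented
name; the real logic lives inline in ttm_bo_alloc_at_place):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Condensed model of the return-code handling quoted below: a cgroup
 * charge failure (-EAGAIN) is treated like an allocation failure, and
 * -ENOSPC becomes -EBUSY when the caller may evict to force space.
 * Userspace sketch using negative errno values as in the kernel. */
static int map_alloc_error(int err, bool may_evict)
{
	if (err == -EAGAIN)
		return may_evict ? -EBUSY : -ENOSPC;
	if (err == -ENOSPC && may_evict)
		return -EBUSY;	/* caller retries via ttm_bo_evict_alloc() */
	return err;
}
```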
>> + if (ret == -EAGAIN)
>> + return may_evict ? -EBUSY : -ENOSPC;
>> +
>> + if (ret == -ENOSPC && may_evict)
>> + return -EBUSY;
>> +
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> /**
>> * ttm_bo_alloc_resource - Allocate backing store for a BO
>> *
>> @@ -714,7 +773,9 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> bool force_space,
>> struct ttm_resource **res)
>> {
>> + struct ttm_bo_alloc_state alloc_state = {0};
>> struct ttm_device *bdev = bo->bdev;
>> + struct ttm_resource_manager *man;
>> struct ww_acquire_ctx *ticket;
>> int i, ret;
>> @@ -725,9 +786,6 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> for (i = 0; i < placement->num_placement; ++i) {
>> const struct ttm_place *place = &placement->placement[i];
>> - struct dmem_cgroup_pool_state *limit_pool = NULL;
>> - struct ttm_resource_manager *man;
>> - bool may_evict;
>> man = ttm_manager_type(bdev, place->mem_type);
>> if (!man || !ttm_resource_manager_used(man))
>> @@ -737,25 +795,26 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>> TTM_PL_FLAG_FALLBACK))
>> continue;
>> - may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
>> - ret = ttm_resource_alloc(bo, place, res, force_space ? &limit_pool : NULL);
>> - if (ret) {
>> - if (ret != -ENOSPC && ret != -EAGAIN) {
>> - dmem_cgroup_pool_state_put(limit_pool);
>> - return ret;
>> - }
>> - if (!may_evict) {
>> - dmem_cgroup_pool_state_put(limit_pool);
>> - continue;
>> - }
>> + ret = ttm_bo_alloc_at_place(bo, place, ctx, force_space,
>> + res, &alloc_state);
>> + if (ret == -ENOSPC) {
>> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> + continue;
>> + } else if (ret == -EBUSY) {
>> ret = ttm_bo_evict_alloc(bdev, man, place, bo, ctx,
>> - ticket, res, limit_pool);
>> - dmem_cgroup_pool_state_put(limit_pool);
>> - if (ret == -EBUSY)
>> + ticket, res, &alloc_state);
>> +
>> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> +
>> + if (ret) {
>> + if (ret != -EBUSY)
>> + return ret;
>> continue;
>> - if (ret)
>> - return ret;
>> + }
>> + } else if (ret) {
>> + dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>> + return ret;
>> }
>> ret = ttm_bo_add_pipelined_eviction_fences(bo, man, ctx->no_wait_gpu);
>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2026-02-26 8:56 UTC | newest]
Thread overview: 13+ messages
2026-02-25 12:10 [PATCH v4 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
2026-02-25 12:10 ` [PATCH v4 1/6] cgroup/dmem: Add queries for protection values Natalie Vock
2026-02-25 12:10 ` [PATCH v4 2/6] cgroup/dmem: Add dmem_cgroup_common_ancestor helper Natalie Vock
2026-02-25 17:16 ` Tejun Heo
2026-02-25 12:10 ` [PATCH v4 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
2026-02-25 15:18 ` Tvrtko Ursulin
2026-02-25 15:27 ` Tvrtko Ursulin
2026-02-26 8:56 ` Tvrtko Ursulin
2026-02-25 12:10 ` [PATCH v4 4/6] drm/ttm: Split cgroup charge and resource allocation Natalie Vock
2026-02-25 15:33 ` Tvrtko Ursulin
2026-02-25 16:01 ` Tvrtko Ursulin
2026-02-25 12:10 ` [PATCH v4 5/6] drm/ttm: Be more aggressive when allocating below protection limit Natalie Vock
2026-02-25 12:10 ` [PATCH v4 6/6] drm/ttm: Use common ancestor of evictor and evictee as limit pool Natalie Vock
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox