* Switching over to GEM refcounts and a bunch of cleanups
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
Hi guys,
so I hope Thomas is back from vacation while I will be on vacation for
the next two weeks.
Here is the patch set which cleans up TTM and XE in preparation for
switching to drm_exec.
Please take a look and let me know what you think about it.
Regards,
Christian.
* [PATCH 1/7] drm/ttm: replace TTMs refcount with the DRM refcount v3
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
Instead of keeping a separate reference count for the TTM object, use
the reference count of the DRM GEM object inside TTM as well.
Apart from avoiding two reference counts for one object, this approach
has the clear advantage of making it possible to use drm_exec inside TTM.
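Just to illustrate where this is going (a rough sketch, not part of this
patch): with the shared refcount a TTM BO can be handed directly to
drm_exec, e.g.:

	struct drm_exec exec;
	int ret;

	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
	drm_exec_until_all_locked(&exec) {
		/* bo->base is the GEM object embedded in the TTM BO */
		ret = drm_exec_lock_obj(&exec, &bo->base);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;
	}
	/* ... work with the locked BO ... */
	drm_exec_fini(&exec);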
v2: adjust XE assert as well and re-enable disabled test
v3: handle another case in i915
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 35 ++---
.../gpu/drm/ttm/tests/ttm_bo_validate_test.c | 8 +-
drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 2 -
drivers/gpu/drm/ttm/ttm_bo.c | 146 +++++++++---------
drivers/gpu/drm/ttm/ttm_bo_internal.h | 9 +-
drivers/gpu/drm/ttm/ttm_bo_util.c | 2 +-
drivers/gpu/drm/xe/xe_bo.c | 2 +-
include/drm/ttm/ttm_bo.h | 9 --
8 files changed, 100 insertions(+), 113 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 57bb111d65da..37175020f123 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -932,7 +932,7 @@ void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
* Don't manipulate the TTM LRUs while in TTM bo destruction.
* We're called through i915_ttm_delete_mem_notify().
*/
- if (!kref_read(&bo->kref))
+ if (!kref_read(&bo->base.refcount))
return;
/*
@@ -950,30 +950,21 @@ void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
*
* TODO: consider maybe also bumping the shrinker list here when we have
* already unpinned it, which should give us something more like an LRU.
- *
- * TODO: There is a small window of opportunity for this function to
- * get called from eviction after we've dropped the last GEM refcount,
- * but before the TTM deleted flag is set on the object. Avoid
- * adjusting the shrinker list in such cases, since the object is
- * not available to the shrinker anyway due to its zero refcount.
- * To fix this properly we should move to a TTM shrinker LRU list for
- * these objects.
*/
- if (kref_get_unless_zero(&obj->base.refcount)) {
- if (shrinkable != obj->mm.ttm_shrinkable) {
- if (shrinkable) {
- if (obj->mm.madv == I915_MADV_WILLNEED)
- __i915_gem_object_make_shrinkable(obj);
- else
- __i915_gem_object_make_purgeable(obj);
- } else {
- i915_gem_object_make_unshrinkable(obj);
- }
-
- obj->mm.ttm_shrinkable = shrinkable;
+ i915_gem_object_get(obj);
+ if (shrinkable != obj->mm.ttm_shrinkable) {
+ if (shrinkable) {
+ if (obj->mm.madv == I915_MADV_WILLNEED)
+ __i915_gem_object_make_shrinkable(obj);
+ else
+ __i915_gem_object_make_purgeable(obj);
+ } else {
+ i915_gem_object_make_unshrinkable(obj);
}
- i915_gem_object_put(obj);
+
+ obj->mm.ttm_shrinkable = shrinkable;
}
+ i915_gem_object_put(obj);
/*
* Put on the correct LRU list depending on the MADV status
diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
index 3a1eef83190c..65e31d171634 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
@@ -127,7 +127,7 @@ static void ttm_bo_init_reserved_sys_man(struct kunit *test)
dma_resv_unlock(bo->base.resv);
KUNIT_EXPECT_EQ(test, err, 0);
- KUNIT_EXPECT_EQ(test, kref_read(&bo->kref), 1);
+ KUNIT_EXPECT_EQ(test, kref_read(&bo->base.refcount), 1);
KUNIT_EXPECT_PTR_EQ(test, bo->bdev, priv->ttm_dev);
KUNIT_EXPECT_EQ(test, bo->type, bo_type);
KUNIT_EXPECT_EQ(test, bo->page_alignment, PAGE_SIZE);
@@ -176,7 +176,7 @@ static void ttm_bo_init_reserved_mock_man(struct kunit *test)
dma_resv_unlock(bo->base.resv);
KUNIT_EXPECT_EQ(test, err, 0);
- KUNIT_EXPECT_EQ(test, kref_read(&bo->kref), 1);
+ KUNIT_EXPECT_EQ(test, kref_read(&bo->base.refcount), 1);
KUNIT_EXPECT_PTR_EQ(test, bo->bdev, priv->ttm_dev);
KUNIT_EXPECT_EQ(test, bo->type, bo_type);
KUNIT_EXPECT_EQ(test, ctx.bytes_moved, size);
@@ -928,6 +928,8 @@ static void ttm_bo_validate_allowed_only_evict(struct kunit *test)
ttm_mock_manager_fini(priv->ttm_dev, mem_multihop);
}
+extern const struct drm_gem_object_funcs ttm_deleted_object_funcs;
+
static void ttm_bo_validate_deleted_evict(struct kunit *test)
{
struct ttm_operation_ctx ctx_init = { }, ctx_val = { };
@@ -958,7 +960,7 @@ static void ttm_bo_validate_deleted_evict(struct kunit *test)
KUNIT_EXPECT_EQ(test, ttm_resource_manager_usage(man), big);
dma_resv_unlock(bo_big->base.resv);
- bo_big->deleted = true;
+ bo_big->base.funcs = &ttm_deleted_object_funcs;
bo_small = ttm_bo_kunit_init(test, test->priv, small, NULL);
bo_small->type = bo_type;
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
index 7aaf0d1395ff..49016fd43e50 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
@@ -193,8 +193,6 @@ struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
bo->bdev = devs->ttm_dev;
bo->destroy = dummy_ttm_bo_destroy;
- kref_init(&bo->kref);
-
return bo;
}
EXPORT_SYMBOL_GPL(ttm_bo_kunit_init);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 9c9e132558d4..d456be1d6e1e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -31,6 +31,7 @@
#define pr_fmt(fmt) "[TTM] " fmt
+#include <drm/drm_util.h>
#include <drm/ttm/ttm_bo.h>
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_tt.h>
@@ -245,88 +246,92 @@ static void ttm_bo_delayed_delete(struct work_struct *work)
ttm_bo_put(bo);
}
-static void ttm_bo_release(struct kref *kref)
+static void ttm_bo_free(struct drm_gem_object *gobj)
+{
+ struct ttm_buffer_object *bo = container_of(gobj, typeof(*bo), base);
+
+ atomic_dec(&ttm_glob.bo_count);
+ bo->destroy(bo);
+}
+
+/*
+ * All other callbacks should never ever be called on a deleted TTM object.
+ */
+const struct drm_gem_object_funcs ttm_deleted_object_funcs = {
+ .free = ttm_bo_free
+};
+EXPORT_SYMBOL_FOR_TESTS_ONLY(ttm_deleted_object_funcs);
+
+/* Returns true if the BO is about to get deleted */
+static bool ttm_bo_is_zombie(struct ttm_buffer_object *bo)
+{
+ return bo->base.funcs == &ttm_deleted_object_funcs;
+}
+
+void ttm_bo_fini(struct ttm_buffer_object *bo)
{
- struct ttm_buffer_object *bo =
- container_of(kref, struct ttm_buffer_object, kref);
struct ttm_device *bdev = bo->bdev;
int ret;
WARN_ON_ONCE(bo->pin_count);
WARN_ON_ONCE(bo->bulk_move);
- if (!bo->deleted) {
- ret = ttm_bo_individualize_resv(bo);
- if (ret) {
- /* Last resort, if we fail to allocate memory for the
- * fences block for the BO to become idle
- */
- dma_resv_wait_timeout(bo->base.resv,
- DMA_RESV_USAGE_BOOKKEEP, false,
- 30 * HZ);
- }
+ ret = ttm_bo_individualize_resv(bo);
+ if (ret) {
+ /* Last resort, if we fail to allocate memory for the
+ * fences block for the BO to become idle
+ */
+ dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
+ false, 30 * HZ);
+ }
- if (bo->bdev->funcs->release_notify)
- bo->bdev->funcs->release_notify(bo);
-
- drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
- ttm_mem_io_free(bdev, bo->resource);
-
- if (!dma_resv_test_signaled(&bo->base._resv,
- DMA_RESV_USAGE_BOOKKEEP) ||
- (want_init_on_free() && (bo->ttm != NULL)) ||
- bo->type == ttm_bo_type_sg ||
- !dma_resv_trylock(bo->base.resv)) {
- /* The BO is not idle, resurrect it for delayed destroy */
- ttm_bo_flush_all_fences(bo);
- bo->deleted = true;
-
- spin_lock(&bo->bdev->lru_lock);
-
- /*
- * Make pinned bos immediately available to
- * shrinkers, now that they are queued for
- * destruction.
- *
- * FIXME: QXL is triggering this. Can be removed when the
- * driver is fixed.
- */
- if (bo->pin_count) {
- bo->pin_count = 0;
- ttm_resource_move_to_lru_tail(bo->resource);
- }
+ if (bo->bdev->funcs->release_notify)
+ bo->bdev->funcs->release_notify(bo);
- kref_init(&bo->kref);
- spin_unlock(&bo->bdev->lru_lock);
+ drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
+ ttm_mem_io_free(bdev, bo->resource);
- INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
+ if (!dma_resv_test_signaled(&bo->base._resv, DMA_RESV_USAGE_BOOKKEEP) ||
+ (want_init_on_free() && (bo->ttm != NULL)) ||
+ bo->type == ttm_bo_type_sg ||
+ !dma_resv_trylock(bo->base.resv)) {
+ /* The BO is not idle, resurrect it for delayed destroy */
+ ttm_bo_flush_all_fences(bo);
- /* Schedule the worker on the closest NUMA node. This
- * improves performance since system memory might be
- * cleared on free and that is best done on a CPU core
- * close to it.
- */
- queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
- return;
+ spin_lock(&bo->bdev->lru_lock);
+
+ /*
+ * Make pinned bos immediately available to
+ * shrinkers, now that they are queued for
+ * destruction.
+ *
+ * FIXME: QXL is triggering this. Can be removed when the
+ * driver is fixed.
+ */
+ if (bo->pin_count) {
+ bo->pin_count = 0;
+ ttm_resource_move_to_lru_tail(bo->resource);
}
- ttm_bo_cleanup_memtype_use(bo);
- dma_resv_unlock(bo->base.resv);
- }
+ kref_init(&bo->base.refcount);
+ bo->base.funcs = &ttm_deleted_object_funcs;
+ spin_unlock(&bo->bdev->lru_lock);
- atomic_dec(&ttm_glob.bo_count);
- bo->destroy(bo);
-}
+ INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-/* TODO: remove! */
-void ttm_bo_put(struct ttm_buffer_object *bo)
-{
- kref_put(&bo->kref, ttm_bo_release);
-}
+ /* Schedule the worker on the closest NUMA node. This
+ * improves performance since system memory might be
+ * cleared on free and that is best done on a CPU core
+ * close to it.
+ */
+ queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
+ } else {
+ ttm_bo_cleanup_memtype_use(bo);
+ dma_resv_unlock(bo->base.resv);
-void ttm_bo_fini(struct ttm_buffer_object *bo)
-{
- ttm_bo_put(bo);
+ atomic_dec(&ttm_glob.bo_count);
+ bo->destroy(bo);
+ }
}
EXPORT_SYMBOL(ttm_bo_fini);
@@ -470,7 +475,7 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
if (!bo->resource || bo->resource->mem_type != mem_type)
goto out_bo_moved;
- if (bo->deleted) {
+ if (ttm_bo_is_zombie(bo)) {
ret = ttm_bo_wait_ctx(bo, ctx);
if (!ret)
ttm_bo_cleanup_memtype_use(bo);
@@ -524,7 +529,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
return 0;
- if (bo->deleted) {
+ if (ttm_bo_is_zombie(bo)) {
lret = ttm_bo_wait_ctx(bo, walk->arg.ctx);
if (!lret)
ttm_bo_cleanup_memtype_use(bo);
@@ -624,7 +629,6 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
void ttm_bo_pin(struct ttm_buffer_object *bo)
{
dma_resv_assert_held(bo->base.resv);
- WARN_ON_ONCE(!kref_read(&bo->kref));
spin_lock(&bo->bdev->lru_lock);
if (bo->resource)
ttm_resource_del_bulk_move(bo->resource, bo);
@@ -643,7 +647,6 @@ EXPORT_SYMBOL(ttm_bo_pin);
void ttm_bo_unpin(struct ttm_buffer_object *bo)
{
dma_resv_assert_held(bo->base.resv);
- WARN_ON_ONCE(!kref_read(&bo->kref));
if (WARN_ON_ONCE(!bo->pin_count))
return;
@@ -932,7 +935,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
{
int ret;
- kref_init(&bo->kref);
bo->bdev = bdev;
bo->type = type;
bo->page_alignment = alignment;
@@ -1128,7 +1130,7 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
goto out;
}
- if (bo->deleted) {
+ if (ttm_bo_is_zombie(bo)) {
pgoff_t num_pages = bo->ttm->num_pages;
ret = ttm_bo_wait_ctx(bo, ctx);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_internal.h b/drivers/gpu/drm/ttm/ttm_bo_internal.h
index e0d48eac74b0..81327bc73834 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_internal.h
+++ b/drivers/gpu/drm/ttm/ttm_bo_internal.h
@@ -34,7 +34,7 @@
*/
static inline void ttm_bo_get(struct ttm_buffer_object *bo)
{
- kref_get(&bo->kref);
+ drm_gem_object_get(&bo->base);
}
/**
@@ -50,11 +50,14 @@ static inline void ttm_bo_get(struct ttm_buffer_object *bo)
static inline __must_check struct ttm_buffer_object *
ttm_bo_get_unless_zero(struct ttm_buffer_object *bo)
{
- if (!kref_get_unless_zero(&bo->kref))
+ if (!kref_get_unless_zero(&bo->base.refcount))
return NULL;
return bo;
}
-void ttm_bo_put(struct ttm_buffer_object *bo);
+static inline void ttm_bo_put(struct ttm_buffer_object *bo)
+{
+ drm_gem_object_put(&bo->base);
+}
#endif
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 6502ced6169d..ef1e42e005af 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -247,7 +247,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
atomic_inc(&ttm_glob.bo_count);
drm_vma_node_reset(&fbo->base.base.vma_node);
- kref_init(&fbo->base.kref);
+ kref_init(&fbo->base.base.refcount);
fbo->base.destroy = &ttm_transfered_destroy;
fbo->base.pin_count = 0;
if (bo->type != ttm_bo_type_sg)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index ea458b2f0bb0..9411114c6d5c 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1418,7 +1418,7 @@ static bool xe_ttm_bo_lock_in_destructor(struct ttm_buffer_object *ttm_bo)
struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
bool locked;
- xe_assert(xe, !kref_read(&ttm_bo->kref));
+ xe_assert(xe, !kref_read(&ttm_bo->base.refcount));
/*
* We can typically only race with TTM trylocking under the
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index da5c2e4971dc..5a16d304a849 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -78,11 +78,8 @@ enum ttm_bo_type {
* @type: The bo type.
* @page_alignment: Page alignment.
* @destroy: Destruction function. If NULL, kfree is used.
- * @kref: Reference count of this buffer object. When this refcount reaches
- * zero, the object is destroyed or put on the delayed delete list.
* @resource: structure describing current placement.
* @ttm: TTM structure holding system pages.
- * @deleted: True if the object is only a zombie and already deleted.
* @bulk_move: The bulk move object.
* @priority: Priority for LRU, BOs with lower priority are evicted first.
* @pin_count: Pin count.
@@ -109,17 +106,11 @@ struct ttm_buffer_object {
uint32_t page_alignment;
void (*destroy) (struct ttm_buffer_object *);
- /*
- * Members not needing protection.
- */
- struct kref kref;
-
/*
* Members protected by the bo::resv::reserved lock.
*/
struct ttm_resource *resource;
struct ttm_tt *ttm;
- bool deleted;
struct ttm_lru_bulk_move *bulk_move;
unsigned priority;
unsigned pin_count;
--
2.43.0
* [PATCH 2/7] drm/ttm: remove ttm_lru_walk_ops
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
It's just another layer of indirection.
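In other words, users of the walk now fill in the callback directly
instead of going through a separate ops table. A sketch (my_process_bo
is a placeholder name):

	/* before */
	static const struct ttm_lru_walk_ops my_ops = {
		.process_bo = my_process_bo,
	};
	struct ttm_lru_walk walk = { .ops = &my_ops };

	/* after */
	struct ttm_lru_walk walk = { .process_bo = my_process_bo };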
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 12 ++---------
drivers/gpu/drm/ttm/ttm_bo_util.c | 2 +-
include/drm/ttm/ttm_bo.h | 34 +++++++++++++------------------
3 files changed, 17 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d456be1d6e1e..ea963dc43b62 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -554,10 +554,6 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
return lret;
}
-static const struct ttm_lru_walk_ops ttm_evict_walk_ops = {
- .process_bo = ttm_bo_evict_cb,
-};
-
static int ttm_bo_evict_alloc(struct ttm_device *bdev,
struct ttm_resource_manager *man,
const struct ttm_place *place,
@@ -569,7 +565,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
{
struct ttm_bo_evict_walk evict_walk = {
.walk = {
- .ops = &ttm_evict_walk_ops,
+ .process_bo = ttm_bo_evict_cb,
.arg = {
.ctx = ctx,
.ticket = ticket,
@@ -1197,10 +1193,6 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
return ret;
}
-const struct ttm_lru_walk_ops ttm_swap_ops = {
- .process_bo = ttm_bo_swapout_cb,
-};
-
/**
* ttm_bo_swapout() - Swap out buffer objects on the LRU list to shmem.
* @bdev: The ttm device.
@@ -1219,7 +1211,7 @@ s64 ttm_bo_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
{
struct ttm_bo_swapout_walk swapout_walk = {
.walk = {
- .ops = &ttm_swap_ops,
+ .process_bo = ttm_bo_swapout_cb,
.arg = {
.ctx = ctx,
.trylock_only = true,
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index ef1e42e005af..dec60a41185d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -888,7 +888,7 @@ s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct ttm_device *bdev,
s64 lret;
ttm_bo_lru_for_each_reserved_guarded(&cursor, man, &walk->arg, bo) {
- lret = walk->ops->process_bo(walk, bo);
+ lret = walk->process_bo(walk, bo);
if (lret == -EBUSY || lret == -EALREADY)
lret = 0;
progress = (lret < 0) ? lret : progress + lret;
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 5a16d304a849..63331a4b37f9 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -179,24 +179,6 @@ struct ttm_operation_ctx {
uint64_t bytes_moved;
};
-struct ttm_lru_walk;
-
-/** struct ttm_lru_walk_ops - Operations for a LRU walk. */
-struct ttm_lru_walk_ops {
- /**
- * process_bo - Process this bo.
- * @walk: struct ttm_lru_walk describing the walk.
- * @bo: A locked and referenced buffer object.
- *
- * Return: Negative error code on error, User-defined positive value
- * (typically, but not always, size of the processed bo) on success.
- * On success, the returned values are summed by the walk and the
- * walk exits when its target is met.
- * 0 also indicates success, -EBUSY means this bo was skipped.
- */
- s64 (*process_bo)(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo);
-};
-
/**
* struct ttm_lru_walk_arg - Common part for the variants of BO LRU walk.
*/
@@ -213,8 +195,20 @@ struct ttm_lru_walk_arg {
* struct ttm_lru_walk - Structure describing a LRU walk.
*/
struct ttm_lru_walk {
- /** @ops: Pointer to the ops structure. */
- const struct ttm_lru_walk_ops *ops;
+ /**
+ * process_bo - Process this bo.
+ * @walk: struct ttm_lru_walk describing the walk.
+ * @bo: A locked and referenced buffer object.
+ *
+ * Return: Negative error code on error, User-defined positive value
+ * (typically, but not always, size of the processed bo) on success.
+ * On success, the returned values are summed by the walk and the
+ * walk exits when its target is met.
+ * 0 also indicates success, -EBUSY means this bo was skipped.
+ */
+ s64 (*process_bo)(struct ttm_lru_walk *walk,
+ struct ttm_buffer_object *bo);
+
/** @arg: Common bo LRU walk arguments. */
struct ttm_lru_walk_arg arg;
};
--
2.43.0
* [PATCH 3/7] drm/ttm: grab BO reference before locking it
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
Previously we always grabbed the BO reference after taking the lock, but
that isn't necessary any more.
So avoid doing that and clean up the handling here.
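The resulting order in the LRU cursor is reference first, trylock second.
Roughly, simplified from the hunk below:

	if (!ttm_bo_get_unless_zero(bo))
		continue;	/* BO is already dead, skip it */

	if (!ttm_lru_walk_trylock(curs, bo)) {
		/* drop the reference again, without holding the LRU lock */
		spin_unlock(lru_lock);
		ttm_bo_put(bo);
		spin_lock(lru_lock);
		continue;
	}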
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/ttm/ttm_bo_util.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index dec60a41185d..a903529c2b1f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -979,14 +979,17 @@ __ttm_bo_lru_cursor_next(struct ttm_bo_lru_cursor *curs)
break;
bo = res->bo;
- if (ttm_lru_walk_trylock(curs, bo))
- bo_locked = true;
- else if (!arg->ticket || arg->ctx->no_wait_gpu || arg->trylock_only)
+ if (!ttm_bo_get_unless_zero(bo))
continue;
- if (!ttm_bo_get_unless_zero(bo)) {
- if (curs->needs_unlock)
- dma_resv_unlock(bo->base.resv);
+ if (ttm_lru_walk_trylock(curs, bo)) {
+ bo_locked = true;
+
+ } else if (!arg->ticket || arg->ctx->no_wait_gpu ||
+ arg->trylock_only) {
+ spin_unlock(lru_lock);
+ ttm_bo_put(bo);
+ spin_lock(lru_lock);
continue;
}
--
2.43.0
* [PATCH 4/7] drm/ttm: switch to ttm_bo_lru_for_each_reserved_guarded for swapout
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
Instead of the walker wrapper, use the underlying foreach directly. This
saves us quite a bit of complexity and lines of code.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 57 ++++++--------------------------
drivers/gpu/drm/ttm/ttm_device.c | 19 ++++++++---
include/drm/ttm/ttm_bo.h | 5 ++-
3 files changed, 27 insertions(+), 54 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index ea963dc43b62..2a4b98bfde57 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1087,25 +1087,18 @@ int ttm_bo_wait_ctx(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx)
EXPORT_SYMBOL(ttm_bo_wait_ctx);
/**
- * struct ttm_bo_swapout_walk - Parameters for the swapout walk
+ * ttm_bo_swapout() - Swap out a buffer object to shmem.
+ * @bo: The buffer to swap out.
+ * @ctx: The ttm_operation_ctx governing the swapout operation.
+ * @gfp_flags: The gfp flags used for shmem page allocations.
+ *
+ * Return: The number of bytes actually swapped out, or negative error code
+ * on error.
*/
-struct ttm_bo_swapout_walk {
- /** @walk: The walk base parameters. */
- struct ttm_lru_walk walk;
- /** @gfp_flags: The gfp flags to use for ttm_tt_swapout() */
- gfp_t gfp_flags;
- /** @hit_low: Whether we should attempt to swap BO's with low watermark threshold */
- /** @evict_low: If we cannot swap a bo when @try_low is false (first pass) */
- bool hit_low, evict_low;
-};
-
-static s64
-ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
+s64 ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
+ gfp_t gfp_flags)
{
struct ttm_place place = {.mem_type = bo->resource->mem_type};
- struct ttm_bo_swapout_walk *swapout_walk =
- container_of(walk, typeof(*swapout_walk), walk);
- struct ttm_operation_ctx *ctx = walk->arg.ctx;
s64 ret;
/*
@@ -1176,7 +1169,7 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
ttm_resource_del_bulk_move(bo->resource, bo);
spin_unlock(&bo->bdev->lru_lock);
- ret = ttm_tt_swapout(bo->bdev, bo->ttm, swapout_walk->gfp_flags);
+ ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
spin_lock(&bo->bdev->lru_lock);
if (ret)
@@ -1193,36 +1186,6 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
return ret;
}
-/**
- * ttm_bo_swapout() - Swap out buffer objects on the LRU list to shmem.
- * @bdev: The ttm device.
- * @ctx: The ttm_operation_ctx governing the swapout operation.
- * @man: The resource manager whose resources / buffer objects are
- * goint to be swapped out.
- * @gfp_flags: The gfp flags used for shmem page allocations.
- * @target: The desired number of bytes to swap out.
- *
- * Return: The number of bytes actually swapped out, or negative error code
- * on error.
- */
-s64 ttm_bo_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
- struct ttm_resource_manager *man, gfp_t gfp_flags,
- s64 target)
-{
- struct ttm_bo_swapout_walk swapout_walk = {
- .walk = {
- .process_bo = ttm_bo_swapout_cb,
- .arg = {
- .ctx = ctx,
- .trylock_only = true,
- },
- },
- .gfp_flags = gfp_flags,
- };
-
- return ttm_lru_walk_for_evict(&swapout_walk.walk, bdev, man, target);
-}
-
void ttm_bo_tt_destroy(struct ttm_buffer_object *bo)
{
if (bo->ttm == NULL)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index c3e2fcbdd2cc..7ba18e782131 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -173,6 +173,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
gfp_t gfp_flags)
{
struct ttm_resource_manager *man;
+ struct ttm_bo_lru_cursor cursor;
+ struct ttm_buffer_object *bo;
+ struct ttm_lru_walk_arg arg = {
+ .ctx = ctx,
+ .trylock_only = true
+ };
unsigned i;
s64 lret;
@@ -181,10 +187,15 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
if (!man || !man->use_tt)
continue;
- lret = ttm_bo_swapout(bdev, ctx, man, gfp_flags, 1);
- /* Can be both positive (num_pages) and negative (error) */
- if (lret)
- return lret;
+ ttm_bo_lru_for_each_reserved_guarded(&cursor, man, &arg, bo) {
+ lret = ttm_bo_swapout(bo, ctx, gfp_flags);
+ /* Can be both positive (num_pages) and negative (error) */
+ if (lret && lret != -EBUSY && lret != -EALREADY)
+ return lret;
+ }
+ if (IS_ERR(bo))
+ return PTR_ERR(bo);
}
return 0;
}
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 63331a4b37f9..6e85f9e207ad 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -398,9 +398,8 @@ void *ttm_bo_kmap_try_from_panic(struct ttm_buffer_object *bo, unsigned long pag
int ttm_bo_vmap(struct ttm_buffer_object *bo, struct iosys_map *map);
void ttm_bo_vunmap(struct ttm_buffer_object *bo, struct iosys_map *map);
int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo);
-s64 ttm_bo_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
- struct ttm_resource_manager *man, gfp_t gfp_flags,
- s64 target);
+s64 ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
+ gfp_t gfp_flags);
void ttm_bo_pin(struct ttm_buffer_object *bo);
void ttm_bo_unpin(struct ttm_buffer_object *bo);
int ttm_bo_evict_first(struct ttm_device *bdev,
--
2.43.0
* [PATCH 5/7] drm/ttm: move zombie handling into ttm_bo_evict
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
Both callers do the same thing, so we can trivially unify that.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 24 +++++++++---------------
1 file changed, 9 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 2a4b98bfde57..87e81e36bbd4 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -368,6 +368,13 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
struct ttm_place hop;
int ret = 0;
+ if (ttm_bo_is_zombie(bo)) {
+ ret = ttm_bo_wait_ctx(bo, ctx);
+ if (!ret)
+ ttm_bo_cleanup_memtype_use(bo);
+ return ret;
+ }
+
memset(&hop, 0, sizeof(hop));
dma_resv_assert_held(bo->base.resv);
@@ -475,13 +482,7 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
if (!bo->resource || bo->resource->mem_type != mem_type)
goto out_bo_moved;
- if (ttm_bo_is_zombie(bo)) {
- ret = ttm_bo_wait_ctx(bo, ctx);
- if (!ret)
- ttm_bo_cleanup_memtype_use(bo);
- } else {
- ret = ttm_bo_evict(bo, ctx);
- }
+ ret = ttm_bo_evict(bo, ctx);
out_bo_moved:
dma_resv_unlock(bo->base.resv);
out_no_lock:
@@ -529,14 +530,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
return 0;
- if (ttm_bo_is_zombie(bo)) {
- lret = ttm_bo_wait_ctx(bo, walk->arg.ctx);
- if (!lret)
- ttm_bo_cleanup_memtype_use(bo);
- } else {
- lret = ttm_bo_evict(bo, walk->arg.ctx);
- }
-
+ lret = ttm_bo_evict(bo, walk->arg.ctx);
if (lret)
goto out;
--
2.43.0
* [PATCH 6/7] drm/ttm: use ttm_bo_lru_for_each_reserved_guarded in evict_all
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
Use the for_each loop to evict all BOs of a resource manager as well.
This greatly simplifies the handling and finally allows us to
remove ttm_bo_evict_first().
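The caller side pattern then looks roughly like this (sketch, see the
hunks below for the real thing):

	struct ttm_bo_lru_cursor cursor;
	struct ttm_buffer_object *bo;
	struct ttm_lru_walk_arg arg = {
		.ctx = &ctx,
		.trylock_only = true
	};

	ttm_bo_lru_for_each_reserved_guarded(&cursor, man, &arg, bo) {
		/* bo is referenced and reserved here */
		ret = ttm_bo_evict(bo, &ctx);
		if (ret)
			return ret;
	}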
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 51 +-----------------------------
drivers/gpu/drm/ttm/ttm_resource.c | 17 ++++++----
include/drm/ttm/ttm_bo.h | 1 +
3 files changed, 13 insertions(+), 56 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 87e81e36bbd4..5d5fffcf16c0 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -359,8 +359,7 @@ static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo,
return 0;
}
-static int ttm_bo_evict(struct ttm_buffer_object *bo,
- struct ttm_operation_ctx *ctx)
+int ttm_bo_evict(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx)
{
struct ttm_device *bdev = bo->bdev;
struct ttm_resource *evict_mem;
@@ -446,54 +445,6 @@ bool ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
}
EXPORT_SYMBOL(ttm_bo_eviction_valuable);
-/**
- * ttm_bo_evict_first() - Evict the first bo on the manager's LRU list.
- * @bdev: The ttm device.
- * @man: The manager whose bo to evict.
- * @ctx: The TTM operation ctx governing the eviction.
- *
- * Return: 0 if successful or the resource disappeared. Negative error code on error.
- */
-int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man,
- struct ttm_operation_ctx *ctx)
-{
- struct ttm_resource_cursor cursor;
- struct ttm_buffer_object *bo;
- struct ttm_resource *res;
- unsigned int mem_type;
- int ret = 0;
-
- spin_lock(&bdev->lru_lock);
- ttm_resource_cursor_init(&cursor, man);
- res = ttm_resource_manager_first(&cursor);
- ttm_resource_cursor_fini(&cursor);
- if (!res) {
- ret = -ENOENT;
- goto out_no_ref;
- }
- bo = res->bo;
- if (!ttm_bo_get_unless_zero(bo))
- goto out_no_ref;
- mem_type = res->mem_type;
- spin_unlock(&bdev->lru_lock);
- ret = ttm_bo_reserve(bo, ctx->interruptible, ctx->no_wait_gpu, NULL);
- if (ret)
- goto out_no_lock;
- if (!bo->resource || bo->resource->mem_type != mem_type)
- goto out_bo_moved;
-
- ret = ttm_bo_evict(bo, ctx);
-out_bo_moved:
- dma_resv_unlock(bo->base.resv);
-out_no_lock:
- ttm_bo_put(bo);
- return ret;
-
-out_no_ref:
- spin_unlock(&bdev->lru_lock);
- return ret;
-}
-
/**
* struct ttm_bo_evict_walk - Parameters for the evict walk.
*/
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index e2c82ad07eb4..9ee5a9f444f0 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -546,20 +546,25 @@ EXPORT_SYMBOL(ttm_resource_manager_init);
int ttm_resource_manager_evict_all(struct ttm_device *bdev,
struct ttm_resource_manager *man)
{
+ struct ttm_bo_lru_cursor cursor;
+ struct ttm_buffer_object *bo;
struct ttm_operation_ctx ctx = {
.interruptible = false,
.no_wait_gpu = false,
};
+ struct ttm_lru_walk_arg arg = {
+ .ctx = &ctx,
+ .trylock_only = true
+ };
struct dma_fence *fence;
int ret;
- do {
- ret = ttm_bo_evict_first(bdev, man, &ctx);
+ ttm_bo_lru_for_each_reserved_guarded(&cursor, man, &arg, bo) {
+ ret = ttm_bo_evict(bo, &ctx);
+ if (ret)
+ return ret;
cond_resched();
- } while (!ret);
-
- if (ret && ret != -ENOENT)
- return ret;
+ }
spin_lock(&man->move_lock);
fence = dma_fence_get(man->move);
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 6e85f9e207ad..2eaed0780d21 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -379,6 +379,7 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
void ttm_bo_fini(struct ttm_buffer_object *bo);
void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
struct ttm_lru_bulk_move *bulk);
+int ttm_bo_evict(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx);
bool ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
const struct ttm_place *place);
int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
--
2.43.0
* [PATCH 7/7] drm/xe: remove workaround for TTM internals
From: Christian König @ 2025-07-16 16:04 UTC
To: matthew.brost, thomas.hellstrom, dri-devel, intel-xe
This should no longer be necessary; TTM doesn't lock the BO without a
reference any more.
Only compile tested!
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/xe/xe_bo.c | 32 +++++---------------------------
1 file changed, 5 insertions(+), 27 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 9411114c6d5c..250ebd9ff184 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1409,31 +1409,6 @@ static unsigned long xe_ttm_io_mem_pfn(struct ttm_buffer_object *ttm_bo,
static void __xe_bo_vunmap(struct xe_bo *bo);
-/*
- * TODO: Move this function to TTM so we don't rely on how TTM does its
- * locking, thereby abusing TTM internals.
- */
-static bool xe_ttm_bo_lock_in_destructor(struct ttm_buffer_object *ttm_bo)
-{
- struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
- bool locked;
-
- xe_assert(xe, !kref_read(&ttm_bo->base.refcount));
-
- /*
- * We can typically only race with TTM trylocking under the
- * lru_lock, which will immediately be unlocked again since
- * the ttm_bo refcount is zero at this point. So trylocking *should*
- * always succeed here, as long as we hold the lru lock.
- */
- spin_lock(&ttm_bo->bdev->lru_lock);
- locked = dma_resv_trylock(ttm_bo->base.resv);
- spin_unlock(&ttm_bo->bdev->lru_lock);
- xe_assert(xe, locked);
-
- return locked;
-}
-
static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo)
{
struct dma_resv_iter cursor;
@@ -1454,8 +1429,11 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo)
if (ttm_bo->base.resv != &ttm_bo->base._resv)
return;
- if (!xe_ttm_bo_lock_in_destructor(ttm_bo))
- return;
+ /*
+ * This should never fail since there are no other references to the BO
+ * any more.
+ */
+ WARN_ON(!dma_resv_trylock(ttm_bo->base.resv));
/*
* Scrub the preempt fences if any. The unbind fence is already
--
2.43.0
* Re: Switching over to GEM refcounts and a bunch of cleanups
From: Thomas Hellström @ 2025-07-22 12:36 UTC
To: Christian König, matthew.brost, dri-devel, intel-xe
Hi, Christian,
On Wed, 2025-07-16 at 18:04 +0200, Christian König wrote:
> Hi guys,
>
> so I hope Thomas is back from vacation while I will be on vacation
> for
> the next two weeks.
>
> Here is the patch set which cleans up TTM and XE in preparation for
> switching to drm_exec.
>
> Please take a look and let me know what you think about it.
I'll take a look.
BTW, did you see my comments on patch 6/6 of v1? Essentially, I think
both i915 and xe reject some TTM callbacks if the object is a zombie
(the GEM refcount has reached 0).
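I.e. checks along these lines (sketch, based on what patch 1 touches in
i915 and xe):

	/* in a driver TTM callback, bail out for zombies */
	if (!kref_read(&bo->base.refcount))
		return;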
Thanks,
Thomas
>
> Regards,
> Christian.
>
* Re: Switching over to GEM refcounts and a bunch of cleanups
From: Christian König @ 2025-08-04 12:24 UTC
To: Thomas Hellström, matthew.brost, dri-devel, intel-xe
On 22.07.25 14:36, Thomas Hellström wrote:
> Hi, Christian,
>
> On Wed, 2025-07-16 at 18:04 +0200, Christian König wrote:
>> Hi guys,
>>
>> so I hope Thomas is back from vacation while I will be on vacation
>> for
>> the next two weeks.
>>
>> Here is the patch set which cleans up TTM and XE in preparation for
>> switching to drm_exec.
>>
>> Please take a look and let me know what you think about it.
>
> I'll take a look.
> BTW, did you see my comments on patch 6/6 of v1? Essentially, I think
> both i915 and xe reject some TTM callbacks if the object is a zombie
> (the GEM refcount has reached 0).
Sorry for the delay, I'm just back from vacation.
I have seen that XE and i915 are checking the GEM refcount, but I haven't seen your comment on that.
Going to take a look ASAP.
Regards,
Christian.
>
> Thanks,
> Thomas
>
>
>>
>> Regards,
>> Christian.
>>
>
* Re: Switching over to GEM refcounts and a bunch of cleanups
From: Christian König @ 2025-08-21 12:30 UTC
To: Thomas Hellström, matthew.brost, dri-devel, intel-xe
On 04.08.25 14:24, Christian König wrote:
> On 22.07.25 14:36, Thomas Hellström wrote:
>> Hi, Christian,
>>
>> On Wed, 2025-07-16 at 18:04 +0200, Christian König wrote:
>>> Hi guys,
>>>
>>> so I hope Thomas is back from vacation while I will be on vacation
>>> for
>>> the next two weeks.
>>>
>>> Here is the patch set which cleans up TTM and XE in preparation for
>>> switching to drm_exec.
>>>
>>> Please take a look and let me know what you think about it.
>>
>> I'll take a look.
>> BTW, did you see my comments on patch 6/6 of v1? Essentially, I think
>> both i915 and xe reject some TTM callbacks if the object is a zombie
>> (the GEM refcount has reached 0).
>
> Sorry for the delay I'm just back from vacation.
>
> I have seen that XE and i915 are checking the GEM refcount, but I haven't seen your comment on that.
>
> Going to take a look ASAP.
I'm not able to find that mail.
All I've seen is your comment on the VMWGFX stuff and that was already resolved by Zack.
What are you referring to?
Thanks,
Christian.
>
> Regards,
> Christian.
>
>>
>> Thanks,
>> Thomas
>>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>
>
* Re: Switching over to GEM refcounts and a bunch of cleanups
From: Thomas Hellström @ 2025-08-21 14:06 UTC
To: Christian König, matthew.brost, dri-devel, intel-xe
Hi Christian,
On Thu, 2025-08-21 at 14:30 +0200, Christian König wrote:
> On 04.08.25 14:24, Christian König wrote:
> > On 22.07.25 14:36, Thomas Hellström wrote:
> > > Hi, Christian,
> > >
> > > On Wed, 2025-07-16 at 18:04 +0200, Christian König wrote:
> > > > Hi guys,
> > > >
> > > > so I hope Thomas is back from vacation while I will be on
> > > > vacation
> > > > for
> > > > the next two weeks.
> > > >
> > > > Here is the patch set which cleans up TTM and XE in preparation
> > > > for
> > > > switching to drm_exec.
> > > >
> > > > Please take a look and let me know what you think about it.
> > >
> > > I'll take a look.
> > > BTW, did you see my comments on patch 6/6 of v1? Essentially, I
> > > think both i915 and xe reject some TTM callbacks if the object is
> > > a zombie (the GEM refcount has reached 0).
> >
> > Sorry for the delay I'm just back from vacation.
> >
> > I have seen that XE and i915 are checking the GEM refcount, but I
> > haven't seen your comment on that.
> >
> > Going to take a look ASAP.
>
> I'm not able to find that mail.
>
> All I've seen is your comment on the VMWGFX stuff and that was
> already resolved by Zack.
>
> What are you referring to?
https://lore.kernel.org/intel-xe/a004736315d77837172418eb196d5b5f80b74e6c.camel@linux.intel.com/
Thanks,
Thomas
>
> Thanks,
> Christian.
>
> >
> > Regards,
> > Christian.
> >
> > >
> > > Thanks,
> > > Thomas
> > >
> > >
> > > >
> > > > Regards,
> > > > Christian.
> > > >
> > >
> >
>
* Re: Switching over to GEM refcounts and a bunch of cleanups
From: Christian König @ 2025-08-21 14:59 UTC
To: Thomas Hellström, matthew.brost, dri-devel, intel-xe
On 21.08.25 16:06, Thomas Hellström wrote:
>> What are you referring to?
>
> https://lore.kernel.org/intel-xe/a004736315d77837172418eb196d5b5f80b74e6c.camel@linux.intel.com/
Thanks, that one never made it into my inbox as far as I can see.
> A couple of questions on the design direction here:
>
> IIRC both xe and i915 have checks to consider objects with a 0 gem
> refcount as zombies requiring special treatment or skipping, when
> encountered in TTM callbacks. We need to double-check that.
I think I've found all of those. The ones in i915 were actually not TTM-specific but try to catch the same problem on the GEM refcount.
> But I wonder,
> first this practice of resurrecting refcounts seems a bit unusual, I
> wonder if we can get rid of that somehow?
I was also going back and forth on whether that is a good idea.
The usual solution to such kinds of issues is to use two reference counts, so that you get a multi-stage cleanup approach, e.g. backing store and object, like what mm_struct is using as well.
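For reference, the mm_struct pattern is roughly (simplified, see
linux/sched/mm.h):

	mmgrab(mm);	/* mm->mm_count, keeps the structure alive */
	mmget(mm);	/* mm->mm_users, keeps the backing store alive */
	...
	mmput(mm);	/* may tear down the address space */
	mmdrop(mm);	/* may free the mm_struct itself */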
The problem was simply that TTM/GEM ended up having *four* reference counts for the same object; each was doing something different and they didn't work well together at all.
> Furthermore, it seems the problem with drm_exec is related only to the
> LRU walk. What about adding a struct completion to the object, that is
> signaled when the object has freed its final backing-store. The LRU
> walk would then check if the object is a zombie, and if so just wait on
> the struct completion. (Need of course to carefully set up locking
> orders). Then we wouldn't need to resurrect the gem refcount, nor use
> drm_exec locking for zombies.
I had a similar idea; waiting is already possible by waiting for the BO's work item.
But I abandoned that idea because I couldn't see how we could solve the locking.
> We would still need some form of refcounting while waiting on the
> struct completion, but if we restricted the TTM refcount to *only* be
> used internally for that sole purpose, and also replaced the final
> ttm_bo_put() with the ttm_bo_finalize() that you suggest we wouldn't
> need to resurrect that refcount since it wouldn't drop to zero until
> the object is ready for final free.
>
> Ideas, comments?
Ideally I think we would use the handle_count as the backing store reference and the drm_gem_object->refcount as the structure reference.
But that means a massive rework of the GEM handling/drivers/TTM.
Alternatively, we could just grab a reference to an unsignaled fence when we encounter a dead BO on the LRU.
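Rough sketch of what I mean, in the walk (error handling omitted):

	struct dma_fence *fence = NULL;

	if (ttm_bo_is_zombie(bo)) {
		/* wait for the zombie instead of locking/evicting it */
		if (!dma_resv_get_singleton(bo->base.resv,
					    DMA_RESV_USAGE_BOOKKEEP,
					    &fence) && fence) {
			dma_fence_wait(fence, true);
			dma_fence_put(fence);
		}
		continue;
	}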
What do you think of that idea?
Regards,
Christian.
* Re: Switching over to GEM refcounts and a bunch of cleanups
From: Thomas Hellström @ 2025-08-22 8:51 UTC
To: Christian König, matthew.brost, dri-devel, intel-xe
Hi
On Thu, 2025-08-21 at 16:59 +0200, Christian König wrote:
> On 21.08.25 16:06, Thomas Hellström wrote:
> > > What are you referring to?
> >
> > https://lore.kernel.org/intel-xe/a004736315d77837172418eb196d5b5f80b74e6c.camel@linux.intel.com/
>
> Thanks, that one never made it into my inbox as far as I can see.
>
> > A couple of questions on the design direction here:
> >
> > IIRC both xe and i915 have checks to consider objects with a 0 gem
> > refcount as zombies requiring special treatment or skipping, when
> > encountered in TTM callbacks. We need to double-check that.
>
> I think I've found all of those. The ones in i915 were actually not
> TTM-specific but try to catch the same problem on the GEM refcount.
>
> > But I wonder,
> > first this practice of resurrecting refcounts seems a bit unusual, I
> > wonder if we can get rid of that somehow?
>
> I was also going back and forth on whether that is a good idea.
>
> The usual solution to such kinds of issues is to use two reference
> counts, so that you get a multi-stage cleanup approach, e.g. backing
> store and object, like what mm_struct is using as well.
>
> The problem was simply that TTM/GEM ended up having *four* reference
> counts for the same object; each was doing something different and
> they didn't work well together at all.
>
> > Furthermore, it seems the problem with drm_exec is related only to
> > the
> > LRU walk. What about adding a struct completion to the object, that
> > is
> > signaled when the object has freed its final backing-store. The LRU
> > walk would then check if the object is a zombie, and if so just
> > wait on
> > the struct completion. (Need of course to carefully set up locking
> > orders). Then we wouldn't need to resurrect the gem refcount, nor
> > use
> > drm_exec locking for zombies.
>
> I had a similar idea; waiting is already possible by waiting for the
> BO's work item.
>
> But I abandoned that idea because I couldn't see how we could solve
> the locking.
>
> > We would still need some form of refcounting while waiting on the
> > struct completion, but if we restricted the TTM refcount to *only*
> > be
> > used internally for that sole purpose, and also replaced the final
> > ttm_bo_put() with the ttm_bo_finalize() that you suggest we
> > wouldn't
> > need to resurrect that refcount since it wouldn't drop to zero
> > until
> > the object is ready for final free.
> >
> > Ideas, comments?
>
> Ideally I think we would use the handle_count as the backing store
> reference and the drm_gem_object->refcount as the structure reference.
>
> But that means a massive rework of the GEM handling/drivers/TTM.
>
> Alternatively, we could just grab a reference to an unsignaled fence
> when we encounter a dead BO on the LRU.
>
> What do you think of that idea?
I think to be able to *guarantee* exhaustive eviction, we need
1) all unfreed resources to sit on an LRU, and
2) everything on the LRU needs to be able to have something to wait
for.
A fence can't really guarantee 2), but it's close. There is a time
interval between when the last fence signals and when we take the
resource from the LRU and free it.
A struct completion can be made to signal when the resource is freed.
I think the locking restriction in the struct completion case (the
struct completion is likely waited for under a dma-resv), is that
nothing except the object destructor may take an individualized resv of
a zombie gem object whose refcount has gone to zero. The destructor
should use an asserted trylock only to make lockdep happy. The struct
completion also needs a refcount to avoid destroying it while there are
waiters.
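Roughly, with made-up names:

	/* new members on the BO */
	struct completion backing_freed;	/* signaled on final free */
	struct kref zombie_ref;			/* internal-only, for waiters */

	/* LRU walk, on encountering a zombie */
	if (kref_get_unless_zero(&bo->zombie_ref)) {
		spin_unlock(&bo->bdev->lru_lock);
		wait_for_completion(&bo->backing_freed);
		kref_put(&bo->zombie_ref, ttm_bo_free_struct);
		spin_lock(&bo->bdev->lru_lock);
	}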
So what do you think about starting out with a fence, and if / when
that appears not to be sufficient, we have a backup plan to move to a
struct completion?
Thomas
>
> Regards,
> Christian.