dri-devel.lists.freedesktop.org archive mirror
* [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging
@ 2025-06-16 13:07 Christian König
  2025-06-16 13:07 ` [PATCH 2/6] drm/xe: stop asserting on the TTM refcount Christian König
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Christian König @ 2025-06-16 13:07 UTC (permalink / raw)
  To: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

This is TTM-internal state which is about to be dropped.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
index c55382167c1b..7057d852951b 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
@@ -284,11 +284,10 @@ static void vmw_bo_print_info(int id, struct vmw_bo *bo, struct seq_file *m)
 
 	seq_printf(m, "\t\t0x%08x: %12zu bytes %s, type = %s",
 		   id, bo->tbo.base.size, placement, type);
-	seq_printf(m, ", priority = %u, pin_count = %u, GEM refs = %d, TTM refs = %d",
+	seq_printf(m, ", priority = %u, pin_count = %u, GEM refs = %d",
 		   bo->tbo.priority,
 		   bo->tbo.pin_count,
-		   kref_read(&bo->tbo.base.refcount),
-		   kref_read(&bo->tbo.kref));
+		   kref_read(&bo->tbo.base.refcount));
 	seq_puts(m, "\n");
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/6] drm/xe: stop asserting on the TTM refcount
  2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
@ 2025-06-16 13:07 ` Christian König
  2025-06-23 14:54   ` Matthew Brost
  2025-06-16 13:07 ` [PATCH 3/6] drm/ttm: fix error handling in ttm_buffer_object_transfer Christian König
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2025-06-16 13:07 UTC (permalink / raw)
  To: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

The TTM refcount is about to be removed.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/xe/xe_bo.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 7aa2c17825da..2a8e7cb8c982 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1417,8 +1417,6 @@ static bool xe_ttm_bo_lock_in_destructor(struct ttm_buffer_object *ttm_bo)
 	struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
 	bool locked;
 
-	xe_assert(xe, !kref_read(&ttm_bo->kref));
-
 	/*
 	 * We can typically only race with TTM trylocking under the
 	 * lru_lock, which will immediately be unlocked again since
-- 
2.34.1



* [PATCH 3/6] drm/ttm: fix error handling in ttm_buffer_object_transfer
  2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
  2025-06-16 13:07 ` [PATCH 2/6] drm/xe: stop asserting on the TTM refcount Christian König
@ 2025-06-16 13:07 ` Christian König
  2025-06-24  2:24   ` Matthew Brost
  2025-06-16 13:07 ` [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini Christian König
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2025-06-16 13:07 UTC (permalink / raw)
  To: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

Unlocking the resv object was missing in the error path. Additionally,
we should move the resource over only after the fence slot has been
reserved.
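The fixed ordering can be sketched with a toy transfer function. This is a mock, not the TTM code: `mock_obj`, `mock_reserve_fences` and `mock_transfer` are illustrative stand-ins for the ghost BO, `dma_resv_reserve_fences()` and `ttm_buffer_object_transfer()`. The point is that the fallible reservation happens before any state is moved off the source object, and the lock is dropped on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Mock stand-ins for the BO and its reservation object. */
struct mock_obj {
	bool locked;
	int resource;		/* non-zero: a resource is attached */
};

static void mock_unlock(struct mock_obj *o)
{
	o->locked = false;
}

/* Stand-in for dma_resv_reserve_fences(); fails on demand. */
static int mock_reserve_fences(struct mock_obj *o, bool fail)
{
	(void)o;
	return fail ? -ENOMEM : 0;
}

/* Fixed ordering: reserve the fence slot first and unlock on error, so
 * the source object never loses its resource on a failed transfer. */
static int mock_transfer(struct mock_obj *src, struct mock_obj **out,
			 bool fail_reserve)
{
	struct mock_obj *ghost = calloc(1, sizeof(*ghost));
	int ret;

	if (!ghost)
		return -ENOMEM;
	ghost->locked = true;	/* dma_resv_trylock() */

	ret = mock_reserve_fences(ghost, fail_reserve);
	if (ret) {
		mock_unlock(ghost);	/* the unlock the old code missed */
		free(ghost);
		return ret;
	}

	/* Only after the slot is reserved, move the resource over. */
	ghost->resource = src->resource;
	src->resource = 0;
	*out = ghost;
	return 0;
}
```

With the old ordering a failed reservation would have leaked the lock and left the source BO already stripped of its resource; here a failure leaves the source untouched.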

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index b78365dc1fed..56f3152f34f5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -256,6 +256,13 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	ret = dma_resv_trylock(&fbo->base.base._resv);
 	WARN_ON(!ret);
 
+	ret = dma_resv_reserve_fences(&fbo->base.base._resv, 1);
+	if (ret) {
+		dma_resv_unlock(&fbo->base.base._resv);
+		kfree(fbo);
+		return ret;
+	}
+
 	if (fbo->base.resource) {
 		ttm_resource_set_bo(fbo->base.resource, &fbo->base);
 		bo->resource = NULL;
@@ -264,12 +271,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 		fbo->base.bulk_move = NULL;
 	}
 
-	ret = dma_resv_reserve_fences(&fbo->base.base._resv, 1);
-	if (ret) {
-		kfree(fbo);
-		return ret;
-	}
-
 	ttm_bo_get(bo);
 	fbo->bo = bo;
 
-- 
2.34.1



* [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini
  2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
  2025-06-16 13:07 ` [PATCH 2/6] drm/xe: stop asserting on the TTM refcount Christian König
  2025-06-16 13:07 ` [PATCH 3/6] drm/ttm: fix error handling in ttm_buffer_object_transfer Christian König
@ 2025-06-16 13:07 ` Christian König
  2025-06-17  2:08   ` kernel test robot
  2025-06-16 13:07 ` [PATCH 5/6] drm/ttm: disable ttm_bo_validate_deleted_evict for now Christian König
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2025-06-16 13:07 UTC (permalink / raw)
  To: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

Give TTM BOs a separate cleanup function.

This is the next step in removing the TTM BO reference counting and
replacing it with the GEM object reference counting.
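The shape of the rename can be sketched with a toy refcount. This mock (all `mock_*` names are illustrative, not the kernel API) shows the intermediate state this patch creates: `fini` becomes the driver-facing "destroy the BO" entry point while behaving, for now, exactly like the internal `put`.

```c
#include <assert.h>

/* Toy BO with an open-coded refcount standing in for the kref. */
struct mock_bo {
	int refcount;
	int released;
};

static void mock_bo_release(struct mock_bo *bo)
{
	bo->released = 1;
}

/* TTM-internal: drop one reference, release on zero. */
static void mock_bo_put(struct mock_bo *bo)
{
	if (--bo->refcount == 0)
		mock_bo_release(bo);
}

/* Driver-facing cleanup entry point; currently just a put, which is
 * why this patch is a pure rename for all callers. */
static void mock_bo_fini(struct mock_bo *bo)
{
	mock_bo_put(bo);
}
```

Separating the two names first lets the final patch change what `fini` does without touching every driver again.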

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  2 +-
 drivers/gpu/drm/drm_gem_vram_helper.c         |  6 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |  4 +-
 drivers/gpu/drm/loongson/lsdc_gem.c           |  2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c         |  2 +-
 drivers/gpu/drm/qxl/qxl_gem.c                 |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c           |  2 +-
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 12 ++--
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  | 64 +++++++++----------
 drivers/gpu/drm/ttm/ttm_bo.c                  | 21 +++---
 drivers/gpu/drm/ttm/ttm_bo_internal.h         |  2 +
 drivers/gpu/drm/vmwgfx/vmwgfx_gem.c           |  2 +-
 drivers/gpu/drm/xe/xe_bo.c                    |  2 +-
 include/drm/ttm/ttm_bo.h                      |  2 +-
 14 files changed, 64 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 0ecc88df7208..1d5756be974a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -198,7 +198,7 @@ static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
 	struct amdgpu_bo *aobj = gem_to_amdgpu_bo(gobj);
 
 	amdgpu_hmm_unregister(aobj);
-	ttm_bo_put(&aobj->tbo);
+	ttm_bo_fini(&aobj->tbo);
 }
 
 int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index ead50fef5e7d..3cb0acdd7bba 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -106,7 +106,7 @@ static const struct drm_gem_object_funcs drm_gem_vram_object_funcs;
 
 static void drm_gem_vram_cleanup(struct drm_gem_vram_object *gbo)
 {
-	/* We got here via ttm_bo_put(), which means that the
+	/* We got here via ttm_bo_fini(), which means that the
 	 * TTM buffer object in 'bo' has already been cleaned
 	 * up; only release the GEM object.
 	 */
@@ -233,11 +233,11 @@ EXPORT_SYMBOL(drm_gem_vram_create);
  * drm_gem_vram_put() - Releases a reference to a VRAM-backed GEM object
  * @gbo:	the GEM VRAM object
  *
- * See ttm_bo_put() for more information.
+ * See ttm_bo_fini() for more information.
  */
 void drm_gem_vram_put(struct drm_gem_vram_object *gbo)
 {
-	ttm_bo_put(&gbo->bo);
+	ttm_bo_fini(&gbo->bo);
 }
 EXPORT_SYMBOL(drm_gem_vram_put);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 1f4814968868..57bb111d65da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -1029,7 +1029,7 @@ static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj)
 {
 	GEM_BUG_ON(!obj->ttm.created);
 
-	ttm_bo_put(i915_gem_to_ttm(obj));
+	ttm_bo_fini(i915_gem_to_ttm(obj));
 }
 
 static vm_fault_t vm_fault_ttm(struct vm_fault *vmf)
@@ -1325,7 +1325,7 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
 	 * If this function fails, it will call the destructor, but
 	 * our caller still owns the object. So no freeing in the
 	 * destructor until obj->ttm.created is true.
-	 * Similarly, in delayed_destroy, we can't call ttm_bo_put()
+	 * Similarly, in delayed_destroy, we can't call ttm_bo_fini()
 	 * until successful initialization.
 	 */
 	ret = ttm_bo_init_reserved(&i915->bdev, i915_gem_to_ttm(obj), bo_type,
diff --git a/drivers/gpu/drm/loongson/lsdc_gem.c b/drivers/gpu/drm/loongson/lsdc_gem.c
index a720d8f53209..22d0eced95da 100644
--- a/drivers/gpu/drm/loongson/lsdc_gem.c
+++ b/drivers/gpu/drm/loongson/lsdc_gem.c
@@ -57,7 +57,7 @@ static void lsdc_gem_object_free(struct drm_gem_object *obj)
 	struct ttm_buffer_object *tbo = to_ttm_bo(obj);
 
 	if (tbo)
-		ttm_bo_put(tbo);
+		ttm_bo_fini(tbo);
 }
 
 static int lsdc_gem_object_vmap(struct drm_gem_object *obj, struct iosys_map *map)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 690e10fbf0bd..395d92ab6271 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -87,7 +87,7 @@ nouveau_gem_object_del(struct drm_gem_object *gem)
 		return;
 	}
 
-	ttm_bo_put(&nvbo->bo);
+	ttm_bo_fini(&nvbo->bo);
 
 	pm_runtime_mark_last_busy(dev);
 	pm_runtime_put_autosuspend(dev);
diff --git a/drivers/gpu/drm/qxl/qxl_gem.c b/drivers/gpu/drm/qxl/qxl_gem.c
index fc5e3763c359..d26043424e95 100644
--- a/drivers/gpu/drm/qxl/qxl_gem.c
+++ b/drivers/gpu/drm/qxl/qxl_gem.c
@@ -39,7 +39,7 @@ void qxl_gem_object_free(struct drm_gem_object *gobj)
 	qxl_surface_evict(qdev, qobj, false);
 
 	tbo = &qobj->tbo;
-	ttm_bo_put(tbo);
+	ttm_bo_fini(tbo);
 }
 
 int qxl_gem_object_create(struct qxl_device *qdev, int size,
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index f86773f3db20..18ca1bcfd2f9 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -86,7 +86,7 @@ static void radeon_gem_object_free(struct drm_gem_object *gobj)
 
 	if (robj) {
 		radeon_mn_unregister(robj);
-		ttm_bo_put(&robj->tbo);
+		ttm_bo_fini(&robj->tbo);
 	}
 }
 
diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
index 6c77550c51af..5426b435f702 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
@@ -379,7 +379,7 @@ static void ttm_bo_unreserve_bulk(struct kunit *test)
 	dma_resv_fini(resv);
 }
 
-static void ttm_bo_put_basic(struct kunit *test)
+static void ttm_bo_fini_basic(struct kunit *test)
 {
 	struct ttm_test_devices *priv = test->priv;
 	struct ttm_buffer_object *bo;
@@ -410,7 +410,7 @@ static void ttm_bo_put_basic(struct kunit *test)
 	dma_resv_unlock(bo->base.resv);
 	KUNIT_EXPECT_EQ(test, err, 0);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static const char *mock_name(struct dma_fence *f)
@@ -423,7 +423,7 @@ static const struct dma_fence_ops mock_fence_ops = {
 	.get_timeline_name = mock_name,
 };
 
-static void ttm_bo_put_shared_resv(struct kunit *test)
+static void ttm_bo_fini_shared_resv(struct kunit *test)
 {
 	struct ttm_test_devices *priv = test->priv;
 	struct ttm_buffer_object *bo;
@@ -463,7 +463,7 @@ static void ttm_bo_put_shared_resv(struct kunit *test)
 	bo->type = ttm_bo_type_device;
 	bo->base.resv = external_resv;
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static void ttm_bo_pin_basic(struct kunit *test)
@@ -616,8 +616,8 @@ static struct kunit_case ttm_bo_test_cases[] = {
 	KUNIT_CASE(ttm_bo_unreserve_basic),
 	KUNIT_CASE(ttm_bo_unreserve_pinned),
 	KUNIT_CASE(ttm_bo_unreserve_bulk),
-	KUNIT_CASE(ttm_bo_put_basic),
-	KUNIT_CASE(ttm_bo_put_shared_resv),
+	KUNIT_CASE(ttm_bo_fini_basic),
+	KUNIT_CASE(ttm_bo_fini_shared_resv),
 	KUNIT_CASE(ttm_bo_pin_basic),
 	KUNIT_CASE(ttm_bo_pin_unpin_resource),
 	KUNIT_CASE(ttm_bo_multiple_pin_one_unpin),
diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
index 3148f5d3dbd6..4553c4e0e0f1 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
@@ -144,7 +144,7 @@ static void ttm_bo_init_reserved_sys_man(struct kunit *test)
 				  drm_mm_node_allocated(&bo->base.vma_node.vm_node));
 
 	ttm_resource_free(bo, &bo->resource);
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static void ttm_bo_init_reserved_mock_man(struct kunit *test)
@@ -186,7 +186,7 @@ static void ttm_bo_init_reserved_mock_man(struct kunit *test)
 				  drm_mm_node_allocated(&bo->base.vma_node.vm_node));
 
 	ttm_resource_free(bo, &bo->resource);
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 }
 
@@ -221,7 +221,7 @@ static void ttm_bo_init_reserved_resv(struct kunit *test)
 	KUNIT_EXPECT_PTR_EQ(test, bo->base.resv, &resv);
 
 	ttm_resource_free(bo, &bo->resource);
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static void ttm_bo_validate_basic(struct kunit *test)
@@ -265,7 +265,7 @@ static void ttm_bo_validate_basic(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bo->resource->placement,
 			DRM_BUDDY_TOPDOWN_ALLOCATION);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 	ttm_mock_manager_fini(priv->ttm_dev, snd_mem);
 }
 
@@ -292,7 +292,7 @@ static void ttm_bo_validate_invalid_placement(struct kunit *test)
 
 	KUNIT_EXPECT_EQ(test, err, -ENOMEM);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static void ttm_bo_validate_failed_alloc(struct kunit *test)
@@ -321,7 +321,7 @@ static void ttm_bo_validate_failed_alloc(struct kunit *test)
 
 	KUNIT_EXPECT_EQ(test, err, -ENOMEM);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 	ttm_bad_manager_fini(priv->ttm_dev, mem_type);
 }
 
@@ -353,7 +353,7 @@ static void ttm_bo_validate_pinned(struct kunit *test)
 	ttm_bo_unpin(bo);
 	dma_resv_unlock(bo->base.resv);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static const struct ttm_bo_validate_test_case ttm_mem_type_cases[] = {
@@ -403,7 +403,7 @@ static void ttm_bo_validate_same_placement(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, err, 0);
 	KUNIT_EXPECT_EQ(test, ctx_val.bytes_moved, 0);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 
 	if (params->mem_type != TTM_PL_SYSTEM)
 		ttm_mock_manager_fini(priv->ttm_dev, params->mem_type);
@@ -452,7 +452,7 @@ static void ttm_bo_validate_busy_placement(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bo->resource->mem_type, snd_mem);
 	KUNIT_ASSERT_TRUE(test, list_is_singular(&man->lru[bo->priority]));
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 	ttm_bad_manager_fini(priv->ttm_dev, fst_mem);
 	ttm_mock_manager_fini(priv->ttm_dev, snd_mem);
 }
@@ -495,7 +495,7 @@ static void ttm_bo_validate_multihop(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, ctx_val.bytes_moved, size * 2);
 	KUNIT_EXPECT_EQ(test, bo->resource->mem_type, final_mem);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 
 	ttm_mock_manager_fini(priv->ttm_dev, fst_mem);
 	ttm_mock_manager_fini(priv->ttm_dev, tmp_mem);
@@ -566,7 +566,7 @@ static void ttm_bo_validate_no_placement_signaled(struct kunit *test)
 		KUNIT_ASSERT_TRUE(test, flags & TTM_TT_FLAG_ZERO_ALLOC);
 	}
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static int threaded_dma_resv_signal(void *arg)
@@ -634,7 +634,7 @@ static void ttm_bo_validate_no_placement_not_signaled(struct kunit *test)
 	/* Make sure we have an idle object at this point */
 	dma_resv_wait_timeout(bo->base.resv, usage, false, MAX_SCHEDULE_TIMEOUT);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 }
 
 static void ttm_bo_validate_move_fence_signaled(struct kunit *test)
@@ -667,7 +667,7 @@ static void ttm_bo_validate_move_fence_signaled(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bo->resource->mem_type, mem_type);
 	KUNIT_EXPECT_EQ(test, ctx.bytes_moved, size);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 	dma_fence_put(man->move);
 }
 
@@ -752,7 +752,7 @@ static void ttm_bo_validate_move_fence_not_signaled(struct kunit *test)
 	else
 		KUNIT_EXPECT_EQ(test, bo->resource->mem_type, fst_mem);
 
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo);
 	ttm_mock_manager_fini(priv->ttm_dev, fst_mem);
 	ttm_mock_manager_fini(priv->ttm_dev, snd_mem);
 }
@@ -801,8 +801,8 @@ static void ttm_bo_validate_swapout(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bo_small->resource->mem_type, TTM_PL_SYSTEM);
 	KUNIT_EXPECT_TRUE(test, bo_small->ttm->page_flags & TTM_TT_FLAG_SWAPPED);
 
-	ttm_bo_put(bo_big);
-	ttm_bo_put(bo_small);
+	ttm_bo_fini(bo_big);
+	ttm_bo_fini(bo_small);
 
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 }
@@ -856,8 +856,8 @@ static void ttm_bo_validate_happy_evict(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bos[1].resource->mem_type, mem_type);
 
 	for (i = 0; i < bo_no; i++)
-		ttm_bo_put(&bos[i]);
-	ttm_bo_put(bo_val);
+		ttm_bo_fini(&bos[i]);
+	ttm_bo_fini(bo_val);
 
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_multihop);
@@ -901,12 +901,12 @@ static void ttm_bo_validate_all_pinned_evict(struct kunit *test)
 
 	KUNIT_EXPECT_EQ(test, err, -ENOMEM);
 
-	ttm_bo_put(bo_small);
+	ttm_bo_fini(bo_small);
 
 	ttm_bo_reserve(bo_big, false, false, NULL);
 	ttm_bo_unpin(bo_big);
 	dma_resv_unlock(bo_big->base.resv);
-	ttm_bo_put(bo_big);
+	ttm_bo_fini(bo_big);
 
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_multihop);
@@ -965,13 +965,13 @@ static void ttm_bo_validate_allowed_only_evict(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bo_evictable->resource->mem_type, mem_type_evict);
 	KUNIT_EXPECT_EQ(test, ctx_val.bytes_moved, size * 2 + BO_SIZE);
 
-	ttm_bo_put(bo);
-	ttm_bo_put(bo_evictable);
+	ttm_bo_fini(bo);
+	ttm_bo_fini(bo_evictable);
 
 	ttm_bo_reserve(bo_pinned, false, false, NULL);
 	ttm_bo_unpin(bo_pinned);
 	dma_resv_unlock(bo_pinned->base.resv);
-	ttm_bo_put(bo_pinned);
+	ttm_bo_fini(bo_pinned);
 
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_multihop);
@@ -1022,8 +1022,8 @@ static void ttm_bo_validate_deleted_evict(struct kunit *test)
 	KUNIT_EXPECT_NULL(test, bo_big->ttm);
 	KUNIT_EXPECT_NULL(test, bo_big->resource);
 
-	ttm_bo_put(bo_small);
-	ttm_bo_put(bo_big);
+	ttm_bo_fini(bo_small);
+	ttm_bo_fini(bo_big);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 }
 
@@ -1074,8 +1074,8 @@ static void ttm_bo_validate_busy_domain_evict(struct kunit *test)
 	KUNIT_EXPECT_EQ(test, bo_init->resource->mem_type, mem_type);
 	KUNIT_EXPECT_NULL(test, bo_val->resource);
 
-	ttm_bo_put(bo_init);
-	ttm_bo_put(bo_val);
+	ttm_bo_fini(bo_init);
+	ttm_bo_fini(bo_val);
 
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 	ttm_bad_manager_fini(priv->ttm_dev, mem_type_evict);
@@ -1119,8 +1119,8 @@ static void ttm_bo_validate_evict_gutting(struct kunit *test)
 	KUNIT_ASSERT_NULL(test, bo_evict->resource);
 	KUNIT_ASSERT_TRUE(test, bo_evict->ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC);
 
-	ttm_bo_put(bo_evict);
-	ttm_bo_put(bo);
+	ttm_bo_fini(bo_evict);
+	ttm_bo_fini(bo);
 
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 }
@@ -1177,9 +1177,9 @@ static void ttm_bo_validate_recrusive_evict(struct kunit *test)
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type_evict);
 
-	ttm_bo_put(bo_val);
-	ttm_bo_put(bo_tt);
-	ttm_bo_put(bo_mock);
+	ttm_bo_fini(bo_val);
+	ttm_bo_fini(bo_tt);
+	ttm_bo_fini(bo_mock);
 }
 
 static struct kunit_case ttm_bo_validate_test_cases[] = {
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 0f874f1e2526..abcf45bc3930 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -48,6 +48,14 @@
 #include "ttm_module.h"
 #include "ttm_bo_internal.h"
 
+static void ttm_bo_release(struct kref *kref);
+
+/* TODO: remove! */
+void ttm_bo_put(struct ttm_buffer_object *bo)
+{
+	kref_put(&bo->kref, ttm_bo_release);
+}
+
 static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
 					struct ttm_placement *placement)
 {
@@ -317,18 +325,11 @@ static void ttm_bo_release(struct kref *kref)
 	bo->destroy(bo);
 }
 
-/**
- * ttm_bo_put
- *
- * @bo: The buffer object.
- *
- * Unreference a buffer object.
- */
-void ttm_bo_put(struct ttm_buffer_object *bo)
+void ttm_bo_fini(struct ttm_buffer_object *bo)
 {
-	kref_put(&bo->kref, ttm_bo_release);
+	ttm_bo_put(bo);
 }
-EXPORT_SYMBOL(ttm_bo_put);
+EXPORT_SYMBOL(ttm_bo_fini);
 
 static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo,
 				     struct ttm_operation_ctx *ctx,
diff --git a/drivers/gpu/drm/ttm/ttm_bo_internal.h b/drivers/gpu/drm/ttm/ttm_bo_internal.h
index 9d8b747a34db..e0d48eac74b0 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_internal.h
+++ b/drivers/gpu/drm/ttm/ttm_bo_internal.h
@@ -55,4 +55,6 @@ ttm_bo_get_unless_zero(struct ttm_buffer_object *bo)
 	return bo;
 }
 
+void ttm_bo_put(struct ttm_buffer_object *bo);
+
 #endif
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
index 7057d852951b..e564d071f40b 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
@@ -37,7 +37,7 @@ static void vmw_gem_object_free(struct drm_gem_object *gobj)
 {
 	struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gobj);
 	if (bo)
-		ttm_bo_put(bo);
+		ttm_bo_fini(bo);
 }
 
 static int vmw_gem_object_open(struct drm_gem_object *obj,
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 2a8e7cb8c982..a43b3f4f3da3 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1647,7 +1647,7 @@ static void xe_gem_object_free(struct drm_gem_object *obj)
 	 * refcount directly if needed.
 	 */
 	__xe_bo_vunmap(gem_to_xe_bo(obj));
-	ttm_bo_put(container_of(obj, struct ttm_buffer_object, base));
+	ttm_bo_fini(container_of(obj, struct ttm_buffer_object, base));
 }
 
 static void xe_gem_object_close(struct drm_gem_object *obj,
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 8ad6e2713625..4b0552d1bc55 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -383,7 +383,7 @@ int ttm_bo_wait_ctx(struct ttm_buffer_object *bo,
 int ttm_bo_validate(struct ttm_buffer_object *bo,
 		    struct ttm_placement *placement,
 		    struct ttm_operation_ctx *ctx);
-void ttm_bo_put(struct ttm_buffer_object *bo);
+void ttm_bo_fini(struct ttm_buffer_object *bo);
 void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
 			  struct ttm_lru_bulk_move *bulk);
 bool ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
-- 
2.34.1



* [PATCH 5/6] drm/ttm: disable ttm_bo_validate_deleted_evict for now
  2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
                   ` (2 preceding siblings ...)
  2025-06-16 13:07 ` [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini Christian König
@ 2025-06-16 13:07 ` Christian König
  2025-06-16 13:07 ` [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount Christian König
  2025-06-23 19:02 ` [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Ian Forbes
  5 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2025-06-16 13:07 UTC (permalink / raw)
  To: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

That test case relies on internal TTM behavior which is about to change.
We need a mock fence or similar to get it working again.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
index 4553c4e0e0f1..6766e1753343 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
@@ -977,6 +977,7 @@ static void ttm_bo_validate_allowed_only_evict(struct kunit *test)
 	ttm_mock_manager_fini(priv->ttm_dev, mem_multihop);
 }
 
+#if 0
 static void ttm_bo_validate_deleted_evict(struct kunit *test)
 {
 	struct ttm_operation_ctx ctx_init = { }, ctx_val  = { };
@@ -1026,6 +1027,7 @@ static void ttm_bo_validate_deleted_evict(struct kunit *test)
 	ttm_bo_fini(bo_big);
 	ttm_mock_manager_fini(priv->ttm_dev, mem_type);
 }
+#endif
 
 static void ttm_bo_validate_busy_domain_evict(struct kunit *test)
 {
@@ -1205,7 +1207,7 @@ static struct kunit_case ttm_bo_validate_test_cases[] = {
 	KUNIT_CASE(ttm_bo_validate_happy_evict),
 	KUNIT_CASE(ttm_bo_validate_all_pinned_evict),
 	KUNIT_CASE(ttm_bo_validate_allowed_only_evict),
-	KUNIT_CASE(ttm_bo_validate_deleted_evict),
+	/*KUNIT_CASE(ttm_bo_validate_deleted_evict),*/
 	KUNIT_CASE(ttm_bo_validate_busy_domain_evict),
 	KUNIT_CASE(ttm_bo_validate_evict_gutting),
 	KUNIT_CASE(ttm_bo_validate_recrusive_evict),
-- 
2.34.1



* [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount
  2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
                   ` (3 preceding siblings ...)
  2025-06-16 13:07 ` [PATCH 5/6] drm/ttm: disable ttm_bo_validate_deleted_evict for now Christian König
@ 2025-06-16 13:07 ` Christian König
  2025-06-17  4:27   ` kernel test robot
  2025-06-17 10:18   ` Thomas Hellström
  2025-06-23 19:02 ` [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Ian Forbes
  5 siblings, 2 replies; 12+ messages in thread
From: Christian König @ 2025-06-16 13:07 UTC (permalink / raw)
  To: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

Instead of keeping a separate reference count for the TTM object, use
the DRM GEM object's reference count inside TTM as well.

Apart from avoiding two reference counts for one object, this approach
has the clear advantage of making drm_exec usable inside TTM.
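The core trick in the diff below can be sketched in miniature. This is a toy model, not the kernel code: the GEM base carries the only refcount, and "deleted" is no longer a `bo->deleted` flag but is encoded by swapping the object's funcs pointer to a table whose only callback is `free`. All `mock_*` names are illustrative.

```c
#include <assert.h>

struct mock_funcs {
	void (*free)(void *bo);
};

/* GEM-like base object: the one and only refcount lives here. */
struct mock_gem {
	int refcount;
	const struct mock_funcs *funcs;
};

struct mock_bo {
	struct mock_gem base;
	int destroyed;
};

static void mock_bo_free(void *ptr)
{
	((struct mock_bo *)ptr)->destroyed = 1;
}

static const struct mock_funcs mock_live_funcs = { mock_bo_free };
/* Only .free is set: no other callback may run on a deleted BO. */
static const struct mock_funcs mock_deleted_funcs = { mock_bo_free };

/* Replaces the bo->deleted flag: zombie-ness is the funcs pointer. */
static int mock_bo_is_zombie(const struct mock_bo *bo)
{
	return bo->base.funcs == &mock_deleted_funcs;
}

static void mock_bo_put(struct mock_bo *bo)
{
	if (--bo->base.refcount == 0)
		bo->base.funcs->free(bo);
}

/* Delayed-destroy path: re-init the refcount for the worker's
 * reference and mark the BO as a zombie via the funcs pointer. */
static void mock_bo_fini_delayed(struct mock_bo *bo)
{
	bo->base.refcount = 1;	/* kref_init() */
	bo->base.funcs = &mock_deleted_funcs;
}
```

A BO that is still busy at `fini` time is resurrected exactly once for the delayed-delete worker, which later drops that last reference and triggers the free callback.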

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |   4 +-
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c |   2 -
 drivers/gpu/drm/ttm/ttm_bo.c                  | 146 +++++++++---------
 drivers/gpu/drm/ttm/ttm_bo_internal.h         |   9 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c             |   2 +-
 include/drm/ttm/ttm_bo.h                      |   9 --
 6 files changed, 81 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
index 6766e1753343..7b908920aae5 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
@@ -127,7 +127,7 @@ static void ttm_bo_init_reserved_sys_man(struct kunit *test)
 	dma_resv_unlock(bo->base.resv);
 
 	KUNIT_EXPECT_EQ(test, err, 0);
-	KUNIT_EXPECT_EQ(test, kref_read(&bo->kref), 1);
+	KUNIT_EXPECT_EQ(test, kref_read(&bo->base.refcount), 1);
 	KUNIT_EXPECT_PTR_EQ(test, bo->bdev, priv->ttm_dev);
 	KUNIT_EXPECT_EQ(test, bo->type, bo_type);
 	KUNIT_EXPECT_EQ(test, bo->page_alignment, PAGE_SIZE);
@@ -176,7 +176,7 @@ static void ttm_bo_init_reserved_mock_man(struct kunit *test)
 	dma_resv_unlock(bo->base.resv);
 
 	KUNIT_EXPECT_EQ(test, err, 0);
-	KUNIT_EXPECT_EQ(test, kref_read(&bo->kref), 1);
+	KUNIT_EXPECT_EQ(test, kref_read(&bo->base.refcount), 1);
 	KUNIT_EXPECT_PTR_EQ(test, bo->bdev, priv->ttm_dev);
 	KUNIT_EXPECT_EQ(test, bo->type, bo_type);
 	KUNIT_EXPECT_EQ(test, ctx.bytes_moved, size);
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
index b91c13f46225..7bc8eae77b8c 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
@@ -190,8 +190,6 @@ struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
 	bo->bdev = devs->ttm_dev;
 	bo->destroy = dummy_ttm_bo_destroy;
 
-	kref_init(&bo->kref);
-
 	return bo;
 }
 EXPORT_SYMBOL_GPL(ttm_bo_kunit_init);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index abcf45bc3930..071cbad2fe9e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -48,14 +48,6 @@
 #include "ttm_module.h"
 #include "ttm_bo_internal.h"
 
-static void ttm_bo_release(struct kref *kref);
-
-/* TODO: remove! */
-void ttm_bo_put(struct ttm_buffer_object *bo)
-{
-	kref_put(&bo->kref, ttm_bo_release);
-}
-
 static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
 					struct ttm_placement *placement)
 {
@@ -252,82 +244,91 @@ static void ttm_bo_delayed_delete(struct work_struct *work)
 	ttm_bo_put(bo);
 }
 
-static void ttm_bo_release(struct kref *kref)
+static void ttm_bo_free(struct drm_gem_object *gobj)
+{
+	struct ttm_buffer_object *bo = container_of(gobj, typeof(*bo), base);
+
+	atomic_dec(&ttm_glob.bo_count);
+	bo->destroy(bo);
+}
+
+/*
+ * All other callbacks should never ever be called on a deleted TTM object.
+ */
+static const struct drm_gem_object_funcs ttm_deleted_object_funcs = {
+	.free = ttm_bo_free
+};
+
+/* Returns true if the BO is about to get deleted */
+static bool ttm_bo_is_zombie(struct ttm_buffer_object *bo)
+{
+	return bo->base.funcs == &ttm_deleted_object_funcs;
+}
+
+void ttm_bo_fini(struct ttm_buffer_object *bo)
 {
-	struct ttm_buffer_object *bo =
-	    container_of(kref, struct ttm_buffer_object, kref);
 	struct ttm_device *bdev = bo->bdev;
 	int ret;
 
 	WARN_ON_ONCE(bo->pin_count);
 	WARN_ON_ONCE(bo->bulk_move);
 
-	if (!bo->deleted) {
-		ret = ttm_bo_individualize_resv(bo);
-		if (ret) {
-			/* Last resort, if we fail to allocate memory for the
-			 * fences block for the BO to become idle
-			 */
-			dma_resv_wait_timeout(bo->base.resv,
-					      DMA_RESV_USAGE_BOOKKEEP, false,
-					      30 * HZ);
-		}
+	ret = ttm_bo_individualize_resv(bo);
+	if (ret) {
+		/* Last resort, if we fail to allocate memory for the
+		 * fences block for the BO to become idle
+		 */
+		dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
+				      false, 30 * HZ);
+	}
 
-		if (bo->bdev->funcs->release_notify)
-			bo->bdev->funcs->release_notify(bo);
-
-		drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
-		ttm_mem_io_free(bdev, bo->resource);
-
-		if (!dma_resv_test_signaled(&bo->base._resv,
-					    DMA_RESV_USAGE_BOOKKEEP) ||
-		    (want_init_on_free() && (bo->ttm != NULL)) ||
-		    bo->type == ttm_bo_type_sg ||
-		    !dma_resv_trylock(bo->base.resv)) {
-			/* The BO is not idle, resurrect it for delayed destroy */
-			ttm_bo_flush_all_fences(bo);
-			bo->deleted = true;
-
-			spin_lock(&bo->bdev->lru_lock);
-
-			/*
-			 * Make pinned bos immediately available to
-			 * shrinkers, now that they are queued for
-			 * destruction.
-			 *
-			 * FIXME: QXL is triggering this. Can be removed when the
-			 * driver is fixed.
-			 */
-			if (bo->pin_count) {
-				bo->pin_count = 0;
-				ttm_resource_move_to_lru_tail(bo->resource);
-			}
+	if (bo->bdev->funcs->release_notify)
+		bo->bdev->funcs->release_notify(bo);
+
+	drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
+	ttm_mem_io_free(bdev, bo->resource);
 
-			kref_init(&bo->kref);
-			spin_unlock(&bo->bdev->lru_lock);
+	if (!dma_resv_test_signaled(&bo->base._resv, DMA_RESV_USAGE_BOOKKEEP) ||
+	    (want_init_on_free() && (bo->ttm != NULL)) ||
+	    bo->type == ttm_bo_type_sg ||
+	    !dma_resv_trylock(bo->base.resv)) {
+		/* The BO is not idle, resurrect it for delayed destroy */
+		ttm_bo_flush_all_fences(bo);
 
-			INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
+		spin_lock(&bo->bdev->lru_lock);
 
-			/* Schedule the worker on the closest NUMA node. This
-			 * improves performance since system memory might be
-			 * cleared on free and that is best done on a CPU core
-			 * close to it.
-			 */
-			queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
-			return;
+		/*
+		 * Make pinned bos immediately available to
+		 * shrinkers, now that they are queued for
+		 * destruction.
+		 *
+		 * FIXME: QXL is triggering this. Can be removed when the
+		 * driver is fixed.
+		 */
+		if (bo->pin_count) {
+			bo->pin_count = 0;
+			ttm_resource_move_to_lru_tail(bo->resource);
 		}
 
+		kref_init(&bo->base.refcount);
+		bo->base.funcs = &ttm_deleted_object_funcs;
+		spin_unlock(&bo->bdev->lru_lock);
+
+		INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
+
+		/* Schedule the worker on the closest NUMA node. This
+		 * improves performance since system memory might be
+		 * cleared on free and that is best done on a CPU core
+		 * close to it.
+		 */
+		queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
+	} else {
 		ttm_bo_cleanup_memtype_use(bo);
 		dma_resv_unlock(bo->base.resv);
-	}
 
-	atomic_dec(&ttm_glob.bo_count);
-	bo->destroy(bo);
-}
-
-void ttm_bo_fini(struct ttm_buffer_object *bo)
-{
-	ttm_bo_put(bo);
+		atomic_dec(&ttm_glob.bo_count);
+		bo->destroy(bo);
+	}
 }
 EXPORT_SYMBOL(ttm_bo_fini);
 
@@ -471,7 +472,7 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
 	if (!bo->resource || bo->resource->mem_type != mem_type)
 		goto out_bo_moved;
 
-	if (bo->deleted) {
+	if (ttm_bo_is_zombie(bo)) {
 		ret = ttm_bo_wait_ctx(bo, ctx);
 		if (!ret)
 			ttm_bo_cleanup_memtype_use(bo);
@@ -525,7 +526,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
 	if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
 		return 0;
 
-	if (bo->deleted) {
+	if (ttm_bo_is_zombie(bo)) {
 		lret = ttm_bo_wait_ctx(bo, walk->ctx);
 		if (!lret)
 			ttm_bo_cleanup_memtype_use(bo);
@@ -623,7 +624,6 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
 void ttm_bo_pin(struct ttm_buffer_object *bo)
 {
 	dma_resv_assert_held(bo->base.resv);
-	WARN_ON_ONCE(!kref_read(&bo->kref));
 	spin_lock(&bo->bdev->lru_lock);
 	if (bo->resource)
 		ttm_resource_del_bulk_move(bo->resource, bo);
@@ -642,7 +642,6 @@ EXPORT_SYMBOL(ttm_bo_pin);
 void ttm_bo_unpin(struct ttm_buffer_object *bo)
 {
 	dma_resv_assert_held(bo->base.resv);
-	WARN_ON_ONCE(!kref_read(&bo->kref));
 	if (WARN_ON_ONCE(!bo->pin_count))
 		return;
 
@@ -931,7 +930,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
 {
 	int ret;
 
-	kref_init(&bo->kref);
 	bo->bdev = bdev;
 	bo->type = type;
 	bo->page_alignment = alignment;
@@ -1127,7 +1125,7 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo)
 		goto out;
 	}
 
-	if (bo->deleted) {
+	if (ttm_bo_is_zombie(bo)) {
 		pgoff_t num_pages = bo->ttm->num_pages;
 
 		ret = ttm_bo_wait_ctx(bo, ctx);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_internal.h b/drivers/gpu/drm/ttm/ttm_bo_internal.h
index e0d48eac74b0..81327bc73834 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_internal.h
+++ b/drivers/gpu/drm/ttm/ttm_bo_internal.h
@@ -34,7 +34,7 @@
  */
 static inline void ttm_bo_get(struct ttm_buffer_object *bo)
 {
-	kref_get(&bo->kref);
+	drm_gem_object_get(&bo->base);
 }
 
 /**
@@ -50,11 +50,14 @@ static inline void ttm_bo_get(struct ttm_buffer_object *bo)
 static inline __must_check struct ttm_buffer_object *
 ttm_bo_get_unless_zero(struct ttm_buffer_object *bo)
 {
-	if (!kref_get_unless_zero(&bo->kref))
+	if (!kref_get_unless_zero(&bo->base.refcount))
 		return NULL;
 	return bo;
 }
 
-void ttm_bo_put(struct ttm_buffer_object *bo);
+static inline void ttm_bo_put(struct ttm_buffer_object *bo)
+{
+	drm_gem_object_put(&bo->base);
+}
 
 #endif
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 56f3152f34f5..56645039757e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -245,7 +245,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	atomic_inc(&ttm_glob.bo_count);
 	drm_vma_node_reset(&fbo->base.base.vma_node);
 
-	kref_init(&fbo->base.kref);
+	kref_init(&fbo->base.base.refcount);
 	fbo->base.destroy = &ttm_transfered_destroy;
 	fbo->base.pin_count = 0;
 	if (bo->type != ttm_bo_type_sg)
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 4b0552d1bc55..4fe4031f0165 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -78,11 +78,8 @@ enum ttm_bo_type {
  * @type: The bo type.
  * @page_alignment: Page alignment.
  * @destroy: Destruction function. If NULL, kfree is used.
- * @kref: Reference count of this buffer object. When this refcount reaches
- * zero, the object is destroyed or put on the delayed delete list.
  * @resource: structure describing current placement.
  * @ttm: TTM structure holding system pages.
- * @deleted: True if the object is only a zombie and already deleted.
  * @bulk_move: The bulk move object.
  * @priority: Priority for LRU, BOs with lower priority are evicted first.
  * @pin_count: Pin count.
@@ -109,17 +106,11 @@ struct ttm_buffer_object {
 	uint32_t page_alignment;
 	void (*destroy) (struct ttm_buffer_object *);
 
-	/*
-	* Members not needing protection.
-	*/
-	struct kref kref;
-
 	/*
 	 * Members protected by the bo::resv::reserved lock.
 	 */
 	struct ttm_resource *resource;
 	struct ttm_tt *ttm;
-	bool deleted;
 	struct ttm_lru_bulk_move *bulk_move;
 	unsigned priority;
 	unsigned pin_count;
-- 
2.34.1



* Re: [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini
  2025-06-16 13:07 ` [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini Christian König
@ 2025-06-17  2:08   ` kernel test robot
  0 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2025-06-17  2:08 UTC (permalink / raw)
  To: Christian König, thomas.hellstrom, intel-xe, dri-devel,
	matthew.brost, matthew.auld
  Cc: oe-kbuild-all

Hi Christian,

kernel test robot noticed the following build errors:

[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on next-20250616]
[cannot apply to drm-xe/drm-xe-next drm-exynos/exynos-drm-next linus/master drm/drm-next drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip v6.16-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Christian-K-nig/drm-xe-stop-asserting-on-the-TTM-refcount/20250616-210818
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:    https://lore.kernel.org/r/20250616130726.22863-4-christian.koenig%40amd.com
patch subject: [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini
config: riscv-randconfig-002-20250617 (https://download.01.org/0day-ci/archive/20250617/202506170945.w3e0EBJ6-lkp@intel.com/config)
compiler: riscv32-linux-gcc (GCC) 10.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250617/202506170945.w3e0EBJ6-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506170945.w3e0EBJ6-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/gpu/drm/nouveau/nouveau_dma.h:30,
                    from drivers/gpu/drm/nouveau/nouveau_chan.c:31:
   drivers/gpu/drm/nouveau/nouveau_bo.h: In function 'nouveau_bo_fini':
>> drivers/gpu/drm/nouveau/nouveau_bo.h:60:2: error: implicit declaration of function 'ttm_bo_put'; did you mean 'ttm_bo_pin'? [-Werror=implicit-function-declaration]
      60 |  ttm_bo_put(&bo->bo);
         |  ^~~~~~~~~~
         |  ttm_bo_pin
   cc1: some warnings being treated as errors


vim +60 drivers/gpu/drm/nouveau/nouveau_bo.h

8be21a6402baa5 Ben Skeggs       2012-07-18  56  
bf32a3a1268638 Danilo Krummrich 2024-07-18  57  static inline void
bf32a3a1268638 Danilo Krummrich 2024-07-18  58  nouveau_bo_fini(struct nouveau_bo *bo)
8be21a6402baa5 Ben Skeggs       2012-07-18  59  {
bf32a3a1268638 Danilo Krummrich 2024-07-18 @60  	ttm_bo_put(&bo->bo);
8be21a6402baa5 Ben Skeggs       2012-07-18  61  }
8be21a6402baa5 Ben Skeggs       2012-07-18  62  
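
[Editorial sketch, not part of the posted series: since this series renames the
final ttm_bo_put() to ttm_bo_fini(), the nouveau helper above would presumably
need the matching rename. A hypothetical fixup, assuming ttm_bo_fini() is the
intended replacement:]

	/* drivers/gpu/drm/nouveau/nouveau_bo.h, hypothetical fixup */
	static inline void
	nouveau_bo_fini(struct nouveau_bo *bo)
	{
		ttm_bo_fini(&bo->bo);
	}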

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount
  2025-06-16 13:07 ` [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount Christian König
@ 2025-06-17  4:27   ` kernel test robot
  2025-06-17 10:18   ` Thomas Hellström
  1 sibling, 0 replies; 12+ messages in thread
From: kernel test robot @ 2025-06-17  4:27 UTC (permalink / raw)
  To: Christian König, thomas.hellstrom, intel-xe, dri-devel,
	matthew.brost, matthew.auld
  Cc: oe-kbuild-all

Hi Christian,

kernel test robot noticed the following build errors:

[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on next-20250616]
[cannot apply to drm-xe/drm-xe-next drm-exynos/exynos-drm-next linus/master drm/drm-next drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip v6.16-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Christian-K-nig/drm-xe-stop-asserting-on-the-TTM-refcount/20250616-210818
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:    https://lore.kernel.org/r/20250616130726.22863-6-christian.koenig%40amd.com
patch subject: [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount
config: x86_64-buildonly-randconfig-006-20250617 (https://download.01.org/0day-ci/archive/20250617/202506171254.EsodeP9U-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250617/202506171254.EsodeP9U-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506171254.EsodeP9U-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/gem/i915_gem_ttm.c: In function 'i915_ttm_adjust_lru':
>> drivers/gpu/drm/i915/gem/i915_gem_ttm.c:935:27: error: 'struct ttm_buffer_object' has no member named 'kref'
     935 |         if (!kref_read(&bo->kref))
         |                           ^~


vim +935 drivers/gpu/drm/i915/gem/i915_gem_ttm.c

213d5092776345 Thomas Hellström 2021-06-10   918  
3589fdbd3b2085 Thomas Hellström 2021-11-04   919  /**
3589fdbd3b2085 Thomas Hellström 2021-11-04   920   * i915_ttm_adjust_lru - Adjust an object's position on relevant LRU lists.
3589fdbd3b2085 Thomas Hellström 2021-11-04   921   * @obj: The object
3589fdbd3b2085 Thomas Hellström 2021-11-04   922   */
3589fdbd3b2085 Thomas Hellström 2021-11-04   923  void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
213d5092776345 Thomas Hellström 2021-06-10   924  {
213d5092776345 Thomas Hellström 2021-06-10   925  	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
7ae034590ceaef Matthew Auld     2021-10-18   926  	struct i915_ttm_tt *i915_tt =
7ae034590ceaef Matthew Auld     2021-10-18   927  		container_of(bo->ttm, typeof(*i915_tt), ttm);
ebd4a8ec7799b1 Matthew Auld     2021-10-18   928  	bool shrinkable =
ebd4a8ec7799b1 Matthew Auld     2021-10-18   929  		bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm);
213d5092776345 Thomas Hellström 2021-06-10   930  
213d5092776345 Thomas Hellström 2021-06-10   931  	/*
213d5092776345 Thomas Hellström 2021-06-10   932  	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
213d5092776345 Thomas Hellström 2021-06-10   933  	 * We're called through i915_ttm_delete_mem_notify().
213d5092776345 Thomas Hellström 2021-06-10   934  	 */
213d5092776345 Thomas Hellström 2021-06-10  @935  	if (!kref_read(&bo->kref))
213d5092776345 Thomas Hellström 2021-06-10   936  		return;
213d5092776345 Thomas Hellström 2021-06-10   937  
ebd4a8ec7799b1 Matthew Auld     2021-10-18   938  	/*
ebd4a8ec7799b1 Matthew Auld     2021-10-18   939  	 * We skip managing the shrinker LRU in set_pages() and just manage
ebd4a8ec7799b1 Matthew Auld     2021-10-18   940  	 * everything here. This does at least solve the issue with having
ebd4a8ec7799b1 Matthew Auld     2021-10-18   941  	 * temporary shmem mappings(like with evicted lmem) not being visible to
ebd4a8ec7799b1 Matthew Auld     2021-10-18   942  	 * the shrinker. Only our shmem objects are shrinkable, everything else
ebd4a8ec7799b1 Matthew Auld     2021-10-18   943  	 * we keep as unshrinkable.
ebd4a8ec7799b1 Matthew Auld     2021-10-18   944  	 *
ebd4a8ec7799b1 Matthew Auld     2021-10-18   945  	 * To make sure everything plays nice we keep an extra shrink pin in TTM
ebd4a8ec7799b1 Matthew Auld     2021-10-18   946  	 * if the underlying pages are not currently shrinkable. Once we release
ebd4a8ec7799b1 Matthew Auld     2021-10-18   947  	 * our pin, like when the pages are moved to shmem, the pages will then
ebd4a8ec7799b1 Matthew Auld     2021-10-18   948  	 * be added to the shrinker LRU, assuming the caller isn't also holding
ebd4a8ec7799b1 Matthew Auld     2021-10-18   949  	 * a pin.
ebd4a8ec7799b1 Matthew Auld     2021-10-18   950  	 *
ebd4a8ec7799b1 Matthew Auld     2021-10-18   951  	 * TODO: consider maybe also bumping the shrinker list here when we have
ebd4a8ec7799b1 Matthew Auld     2021-10-18   952  	 * already unpinned it, which should give us something more like an LRU.
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   953  	 *
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   954  	 * TODO: There is a small window of opportunity for this function to
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   955  	 * get called from eviction after we've dropped the last GEM refcount,
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   956  	 * but before the TTM deleted flag is set on the object. Avoid
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   957  	 * adjusting the shrinker list in such cases, since the object is
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   958  	 * not available to the shrinker anyway due to its zero refcount.
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   959  	 * To fix this properly we should move to a TTM shrinker LRU list for
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   960  	 * these objects.
ebd4a8ec7799b1 Matthew Auld     2021-10-18   961  	 */
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   962  	if (kref_get_unless_zero(&obj->base.refcount)) {
ebd4a8ec7799b1 Matthew Auld     2021-10-18   963  		if (shrinkable != obj->mm.ttm_shrinkable) {
ebd4a8ec7799b1 Matthew Auld     2021-10-18   964  			if (shrinkable) {
ebd4a8ec7799b1 Matthew Auld     2021-10-18   965  				if (obj->mm.madv == I915_MADV_WILLNEED)
ebd4a8ec7799b1 Matthew Auld     2021-10-18   966  					__i915_gem_object_make_shrinkable(obj);
ebd4a8ec7799b1 Matthew Auld     2021-10-18   967  				else
ebd4a8ec7799b1 Matthew Auld     2021-10-18   968  					__i915_gem_object_make_purgeable(obj);
ebd4a8ec7799b1 Matthew Auld     2021-10-18   969  			} else {
ebd4a8ec7799b1 Matthew Auld     2021-10-18   970  				i915_gem_object_make_unshrinkable(obj);
ebd4a8ec7799b1 Matthew Auld     2021-10-18   971  			}
ebd4a8ec7799b1 Matthew Auld     2021-10-18   972  
ebd4a8ec7799b1 Matthew Auld     2021-10-18   973  			obj->mm.ttm_shrinkable = shrinkable;
ebd4a8ec7799b1 Matthew Auld     2021-10-18   974  		}
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   975  		i915_gem_object_put(obj);
d3cb30f8dcbcb7 Thomas Hellström 2021-11-10   976  	}
ebd4a8ec7799b1 Matthew Auld     2021-10-18   977  
213d5092776345 Thomas Hellström 2021-06-10   978  	/*
213d5092776345 Thomas Hellström 2021-06-10   979  	 * Put on the correct LRU list depending on the MADV status
213d5092776345 Thomas Hellström 2021-06-10   980  	 */
213d5092776345 Thomas Hellström 2021-06-10   981  	spin_lock(&bo->bdev->lru_lock);
ebd4a8ec7799b1 Matthew Auld     2021-10-18   982  	if (shrinkable) {
7ae034590ceaef Matthew Auld     2021-10-18   983  		/* Try to keep shmem_tt from being considered for shrinking. */
7ae034590ceaef Matthew Auld     2021-10-18   984  		bo->priority = TTM_MAX_BO_PRIORITY - 1;
7ae034590ceaef Matthew Auld     2021-10-18   985  	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
213d5092776345 Thomas Hellström 2021-06-10   986  		bo->priority = I915_TTM_PRIO_PURGE;
213d5092776345 Thomas Hellström 2021-06-10   987  	} else if (!i915_gem_object_has_pages(obj)) {
213d5092776345 Thomas Hellström 2021-06-10   988  		bo->priority = I915_TTM_PRIO_NO_PAGES;
ba2c5d15022a56 Matthew Auld     2022-02-09   989  	} else {
9373505967ffc1 Matthew Auld     2022-02-28   990  		struct ttm_resource_manager *man =
9373505967ffc1 Matthew Auld     2022-02-28   991  			ttm_manager_type(bo->bdev, bo->resource->mem_type);
9373505967ffc1 Matthew Auld     2022-02-28   992  
9373505967ffc1 Matthew Auld     2022-02-28   993  		/*
9373505967ffc1 Matthew Auld     2022-02-28   994  		 * If we need to place an LMEM resource which doesn't need CPU
9373505967ffc1 Matthew Auld     2022-02-28   995  		 * access then we should try not to victimize mappable objects
9373505967ffc1 Matthew Auld     2022-02-28   996  		 * first, since we likely end up stealing more of the mappable
54296aa4cfe718 Nitin Gote       2025-01-20   997  		 * portion. And likewise when we try to find space for a mappable
9373505967ffc1 Matthew Auld     2022-02-28   998  		 * object, we know not to ever victimize objects that don't
9373505967ffc1 Matthew Auld     2022-02-28   999  		 * occupy any mappable pages.
9373505967ffc1 Matthew Auld     2022-02-28  1000  		 */
9373505967ffc1 Matthew Auld     2022-02-28  1001  		if (i915_ttm_cpu_maps_iomem(bo->resource) &&
9373505967ffc1 Matthew Auld     2022-02-28  1002  		    i915_ttm_buddy_man_visible_size(man) < man->size &&
9373505967ffc1 Matthew Auld     2022-02-28  1003  		    !(obj->flags & I915_BO_ALLOC_GPU_ONLY))
9373505967ffc1 Matthew Auld     2022-02-28  1004  			bo->priority = I915_TTM_PRIO_NEEDS_CPU_ACCESS;
9373505967ffc1 Matthew Auld     2022-02-28  1005  		else
ba2c5d15022a56 Matthew Auld     2022-02-09  1006  			bo->priority = I915_TTM_PRIO_HAS_PAGES;
213d5092776345 Thomas Hellström 2021-06-10  1007  	}
213d5092776345 Thomas Hellström 2021-06-10  1008  
fee2ede155423b Christian König  2022-01-24  1009  	ttm_bo_move_to_lru_tail(bo);
213d5092776345 Thomas Hellström 2021-06-10  1010  	spin_unlock(&bo->bdev->lru_lock);
213d5092776345 Thomas Hellström 2021-06-10  1011  }
213d5092776345 Thomas Hellström 2021-06-10  1012  
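
[Editorial sketch, not part of the posted series: with patch 6 moving the
refcount into the embedded GEM object, the "in TTM bo destruction" check above
would presumably read the GEM refcount instead of the removed bo->kref. A
hypothetical fixup:]

	/* i915_ttm_adjust_lru(), hypothetical fixup: bo->kref is gone,
	 * so test the embedded GEM object's refcount instead.
	 */
	if (!kref_read(&bo->base.refcount))
		return;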

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount
  2025-06-16 13:07 ` [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount Christian König
  2025-06-17  4:27   ` kernel test robot
@ 2025-06-17 10:18   ` Thomas Hellström
  1 sibling, 0 replies; 12+ messages in thread
From: Thomas Hellström @ 2025-06-17 10:18 UTC (permalink / raw)
  To: Christian König, intel-xe, dri-devel, matthew.brost,
	matthew.auld

Hi, Christian,

On Mon, 2025-06-16 at 15:07 +0200, Christian König wrote:
> Instead of keeping a separate reference count for the TTM object also
> use
> the reference count for DRM GEM objects inside TTM.
> 
> Apart from avoiding two reference counts for one object this approach
> has
> the clear advantage of being able to use drm_exec inside TTM.

A couple of questions on the design direction here:

IIRC both xe and i915 have checks that treat objects with a zero GEM
refcount as zombies requiring special treatment or skipping when
encountered in TTM callbacks. We need to double-check that.

But I wonder: first, this practice of resurrecting refcounts seems a
bit unusual; can we get rid of it somehow?

Furthermore, it seems the problem with drm_exec is related only to the
LRU walk. What about adding a struct completion to the object that is
signaled when the object has freed its final backing store? The LRU
walk would then check whether the object is a zombie, and if so just
wait on the struct completion. (We would of course need to carefully
set up the locking orders.) Then we wouldn't need to resurrect the GEM
refcount, nor use drm_exec locking for zombies.

We would still need some form of refcounting while waiting on the
struct completion, but if we restricted the TTM refcount to *only* be
used internally for that sole purpose, and also replaced the final
ttm_bo_put() with the ttm_bo_finalize() that you suggest, we wouldn't
need to resurrect that refcount, since it wouldn't drop to zero until
the object is ready for its final free.

Ideas, comments?

Thomas



> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |   4 +-
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c |   2 -
>  drivers/gpu/drm/ttm/ttm_bo.c                  | 146 +++++++++-------
> --
>  drivers/gpu/drm/ttm/ttm_bo_internal.h         |   9 +-
>  drivers/gpu/drm/ttm/ttm_bo_util.c             |   2 +-
>  include/drm/ttm/ttm_bo.h                      |   9 --
>  6 files changed, 81 insertions(+), 91 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> index 6766e1753343..7b908920aae5 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> @@ -127,7 +127,7 @@ static void ttm_bo_init_reserved_sys_man(struct
> kunit *test)
>  	dma_resv_unlock(bo->base.resv);
>  
>  	KUNIT_EXPECT_EQ(test, err, 0);
> -	KUNIT_EXPECT_EQ(test, kref_read(&bo->kref), 1);
> +	KUNIT_EXPECT_EQ(test, kref_read(&bo->base.refcount), 1);
>  	KUNIT_EXPECT_PTR_EQ(test, bo->bdev, priv->ttm_dev);
>  	KUNIT_EXPECT_EQ(test, bo->type, bo_type);
>  	KUNIT_EXPECT_EQ(test, bo->page_alignment, PAGE_SIZE);
> @@ -176,7 +176,7 @@ static void ttm_bo_init_reserved_mock_man(struct
> kunit *test)
>  	dma_resv_unlock(bo->base.resv);
>  
>  	KUNIT_EXPECT_EQ(test, err, 0);
> -	KUNIT_EXPECT_EQ(test, kref_read(&bo->kref), 1);
> +	KUNIT_EXPECT_EQ(test, kref_read(&bo->base.refcount), 1);
>  	KUNIT_EXPECT_PTR_EQ(test, bo->bdev, priv->ttm_dev);
>  	KUNIT_EXPECT_EQ(test, bo->type, bo_type);
>  	KUNIT_EXPECT_EQ(test, ctx.bytes_moved, size);
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> index b91c13f46225..7bc8eae77b8c 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> @@ -190,8 +190,6 @@ struct ttm_buffer_object
> *ttm_bo_kunit_init(struct kunit *test,
>  	bo->bdev = devs->ttm_dev;
>  	bo->destroy = dummy_ttm_bo_destroy;
>  
> -	kref_init(&bo->kref);
> -
>  	return bo;
>  }
>  EXPORT_SYMBOL_GPL(ttm_bo_kunit_init);
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> b/drivers/gpu/drm/ttm/ttm_bo.c
> index abcf45bc3930..071cbad2fe9e 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -48,14 +48,6 @@
>  #include "ttm_module.h"
>  #include "ttm_bo_internal.h"
>  
> -static void ttm_bo_release(struct kref *kref);
> -
> -/* TODO: remove! */
> -void ttm_bo_put(struct ttm_buffer_object *bo)
> -{
> -	kref_put(&bo->kref, ttm_bo_release);
> -}
> -
>  static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
>  					struct ttm_placement
> *placement)
>  {
> @@ -252,82 +244,91 @@ static void ttm_bo_delayed_delete(struct
> work_struct *work)
>  	ttm_bo_put(bo);
>  }
>  
> -static void ttm_bo_release(struct kref *kref)
> +static void ttm_bo_free(struct drm_gem_object *gobj)
> +{
> +	struct ttm_buffer_object *bo = container_of(gobj,
> typeof(*bo), base);
> +
> +	atomic_dec(&ttm_glob.bo_count);
> +	bo->destroy(bo);
> +}
> +
> +/*
> + * All other callbacks should never ever be called on a deleted TTM
> object.
> + */
> +static const struct drm_gem_object_funcs ttm_deleted_object_funcs =
> {
> +	.free = ttm_bo_free
> +};
> +
> +/* Returns true if the BO is about to get deleted */
> +static bool ttm_bo_is_zombie(struct ttm_buffer_object *bo)
> +{
> +	return bo->base.funcs == &ttm_deleted_object_funcs;
> +}
> +
> +void ttm_bo_fini(struct ttm_buffer_object *bo)
>  {
> -	struct ttm_buffer_object *bo =
> -	    container_of(kref, struct ttm_buffer_object, kref);
>  	struct ttm_device *bdev = bo->bdev;
>  	int ret;
>  
>  	WARN_ON_ONCE(bo->pin_count);
>  	WARN_ON_ONCE(bo->bulk_move);
>  
> -	if (!bo->deleted) {
> -		ret = ttm_bo_individualize_resv(bo);
> -		if (ret) {
> -			/* Last resort, if we fail to allocate
> memory for the
> -			 * fences block for the BO to become idle
> -			 */
> -			dma_resv_wait_timeout(bo->base.resv,
> -					     
> DMA_RESV_USAGE_BOOKKEEP, false,
> -					      30 * HZ);
> -		}
> +	ret = ttm_bo_individualize_resv(bo);
> +	if (ret) {
> +		/* Last resort, if we fail to allocate memory for
> the
> +		 * fences block for the BO to become idle
> +		 */
> +		dma_resv_wait_timeout(bo->base.resv,
> DMA_RESV_USAGE_BOOKKEEP,
> +				      false, 30 * HZ);
> +	}
>  
> -		if (bo->bdev->funcs->release_notify)
> -			bo->bdev->funcs->release_notify(bo);
> -
> -		drm_vma_offset_remove(bdev->vma_manager, &bo-
> >base.vma_node);
> -		ttm_mem_io_free(bdev, bo->resource);
> -
> -		if (!dma_resv_test_signaled(&bo->base._resv,
> -					    DMA_RESV_USAGE_BOOKKEEP)
> ||
> -		    (want_init_on_free() && (bo->ttm != NULL)) ||
> -		    bo->type == ttm_bo_type_sg ||
> -		    !dma_resv_trylock(bo->base.resv)) {
> -			/* The BO is not idle, resurrect it for
> delayed destroy */
> -			ttm_bo_flush_all_fences(bo);
> -			bo->deleted = true;
> -
> -			spin_lock(&bo->bdev->lru_lock);
> -
> -			/*
> -			 * Make pinned bos immediately available to
> -			 * shrinkers, now that they are queued for
> -			 * destruction.
> -			 *
> -			 * FIXME: QXL is triggering this. Can be
> removed when the
> -			 * driver is fixed.
> -			 */
> -			if (bo->pin_count) {
> -				bo->pin_count = 0;
> -				ttm_resource_move_to_lru_tail(bo-
> >resource);
> -			}
> +	if (bo->bdev->funcs->release_notify)
> +		bo->bdev->funcs->release_notify(bo);
> +
> +	drm_vma_offset_remove(bdev->vma_manager, &bo-
> >base.vma_node);
> +	ttm_mem_io_free(bdev, bo->resource);
>  
> -			kref_init(&bo->kref);
> -			spin_unlock(&bo->bdev->lru_lock);
> +	if (!dma_resv_test_signaled(&bo->base._resv,
> DMA_RESV_USAGE_BOOKKEEP) ||
> +	    (want_init_on_free() && (bo->ttm != NULL)) ||
> +	    bo->type == ttm_bo_type_sg ||
> +	    !dma_resv_trylock(bo->base.resv)) {
> +		/* The BO is not idle, resurrect it for delayed
> destroy */
> +		ttm_bo_flush_all_fences(bo);
>  
> -			INIT_WORK(&bo->delayed_delete,
> ttm_bo_delayed_delete);
> +		spin_lock(&bo->bdev->lru_lock);
>  
> -			/* Schedule the worker on the closest NUMA
> node. This
> -			 * improves performance since system memory
> might be
> -			 * cleared on free and that is best done on
> a CPU core
> -			 * close to it.
> -			 */
> -			queue_work_node(bdev->pool.nid, bdev->wq,
> &bo->delayed_delete);
> -			return;
> +		/*
> +		 * Make pinned bos immediately available to
> +		 * shrinkers, now that they are queued for
> +		 * destruction.
> +		 *
> +		 * FIXME: QXL is triggering this. Can be removed
> when the
> +		 * driver is fixed.
> +		 */
> +		if (bo->pin_count) {
> +			bo->pin_count = 0;
> +			ttm_resource_move_to_lru_tail(bo->resource);
>  		}
>  
> +		kref_init(&bo->base.refcount);
> +		bo->base.funcs = &ttm_deleted_object_funcs;
> +		spin_unlock(&bo->bdev->lru_lock);
> +
> +		INIT_WORK(&bo->delayed_delete,
> ttm_bo_delayed_delete);
> +
> +		/* Schedule the worker on the closest NUMA node.
> This
> +		 * improves performance since system memory might be
> +		 * cleared on free and that is best done on a CPU
> core
> +		 * close to it.
> +		 */
> +		queue_work_node(bdev->pool.nid, bdev->wq, &bo-
> >delayed_delete);
> +	} else {
>  		ttm_bo_cleanup_memtype_use(bo);
>  		dma_resv_unlock(bo->base.resv);
> -	}
>  
> -	atomic_dec(&ttm_glob.bo_count);
> -	bo->destroy(bo);
> -}
> -
> -void ttm_bo_fini(struct ttm_buffer_object *bo)
> -{
> -	ttm_bo_put(bo);
> +		atomic_dec(&ttm_glob.bo_count);
> +		bo->destroy(bo);
> +	}
>  }
>  EXPORT_SYMBOL(ttm_bo_fini);
>  
> @@ -471,7 +472,7 @@ int ttm_bo_evict_first(struct ttm_device *bdev,
> struct ttm_resource_manager *man
>  	if (!bo->resource || bo->resource->mem_type != mem_type)
>  		goto out_bo_moved;
>  
> -	if (bo->deleted) {
> +	if (ttm_bo_is_zombie(bo)) {
>  		ret = ttm_bo_wait_ctx(bo, ctx);
>  		if (!ret)
>  			ttm_bo_cleanup_memtype_use(bo);
> @@ -525,7 +526,7 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk
> *walk, struct ttm_buffer_object *
>  	if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo,
> evict_walk->place))
>  		return 0;
>  
> -	if (bo->deleted) {
> +	if (ttm_bo_is_zombie(bo)) {
>  		lret = ttm_bo_wait_ctx(bo, walk->ctx);
>  		if (!lret)
>  			ttm_bo_cleanup_memtype_use(bo);
> @@ -623,7 +624,6 @@ static int ttm_bo_evict_alloc(struct ttm_device
> *bdev,
>  void ttm_bo_pin(struct ttm_buffer_object *bo)
>  {
>  	dma_resv_assert_held(bo->base.resv);
> -	WARN_ON_ONCE(!kref_read(&bo->kref));
>  	spin_lock(&bo->bdev->lru_lock);
>  	if (bo->resource)
>  		ttm_resource_del_bulk_move(bo->resource, bo);
> @@ -642,7 +642,6 @@ EXPORT_SYMBOL(ttm_bo_pin);
>  void ttm_bo_unpin(struct ttm_buffer_object *bo)
>  {
>  	dma_resv_assert_held(bo->base.resv);
> -	WARN_ON_ONCE(!kref_read(&bo->kref));
>  	if (WARN_ON_ONCE(!bo->pin_count))
>  		return;
>  
> @@ -931,7 +930,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
> struct ttm_buffer_object *bo,
>  {
>  	int ret;
>  
> -	kref_init(&bo->kref);
>  	bo->bdev = bdev;
>  	bo->type = type;
>  	bo->page_alignment = alignment;
> @@ -1127,7 +1125,7 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk,
> struct ttm_buffer_object *bo)
>  		goto out;
>  	}
>  
> -	if (bo->deleted) {
> +	if (ttm_bo_is_zombie(bo)) {
>  		pgoff_t num_pages = bo->ttm->num_pages;
>  
>  		ret = ttm_bo_wait_ctx(bo, ctx);
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_internal.h
> b/drivers/gpu/drm/ttm/ttm_bo_internal.h
> index e0d48eac74b0..81327bc73834 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_internal.h
> +++ b/drivers/gpu/drm/ttm/ttm_bo_internal.h
> @@ -34,7 +34,7 @@
>   */
>  static inline void ttm_bo_get(struct ttm_buffer_object *bo)
>  {
> -	kref_get(&bo->kref);
> +	drm_gem_object_get(&bo->base);
>  }
>  
>  /**
> @@ -50,11 +50,14 @@ static inline void ttm_bo_get(struct
> ttm_buffer_object *bo)
>  static inline __must_check struct ttm_buffer_object *
>  ttm_bo_get_unless_zero(struct ttm_buffer_object *bo)
>  {
> -	if (!kref_get_unless_zero(&bo->kref))
> +	if (!kref_get_unless_zero(&bo->base.refcount))
>  		return NULL;
>  	return bo;
>  }
>  
> -void ttm_bo_put(struct ttm_buffer_object *bo);
> +static inline void ttm_bo_put(struct ttm_buffer_object *bo)
> +{
> +	drm_gem_object_put(&bo->base);
> +}
>  
>  #endif
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
> b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 56f3152f34f5..56645039757e 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -245,7 +245,7 @@ static int ttm_buffer_object_transfer(struct
> ttm_buffer_object *bo,
>  	atomic_inc(&ttm_glob.bo_count);
>  	drm_vma_node_reset(&fbo->base.base.vma_node);
>  
> -	kref_init(&fbo->base.kref);
> +	kref_init(&fbo->base.base.refcount);
>  	fbo->base.destroy = &ttm_transfered_destroy;
>  	fbo->base.pin_count = 0;
>  	if (bo->type != ttm_bo_type_sg)
> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
> index 4b0552d1bc55..4fe4031f0165 100644
> --- a/include/drm/ttm/ttm_bo.h
> +++ b/include/drm/ttm/ttm_bo.h
> @@ -78,11 +78,8 @@ enum ttm_bo_type {
>   * @type: The bo type.
>   * @page_alignment: Page alignment.
>   * @destroy: Destruction function. If NULL, kfree is used.
> - * @kref: Reference count of this buffer object. When this refcount reaches
> - * zero, the object is destroyed or put on the delayed delete list.
>   * @resource: structure describing current placement.
>   * @ttm: TTM structure holding system pages.
> - * @deleted: True if the object is only a zombie and already deleted.
>   * @bulk_move: The bulk move object.
> - * @priority: Priority for LRU, BOs with lower priority are evicted first.
>   * @pin_count: Pin count.
> @@ -109,17 +106,11 @@ struct ttm_buffer_object {
>  	uint32_t page_alignment;
>  	void (*destroy) (struct ttm_buffer_object *);
>  
> -	/*
> -	* Members not needing protection.
> -	*/
> -	struct kref kref;
> -
>  	/*
>  	 * Members protected by the bo::resv::reserved lock.
>  	 */
>  	struct ttm_resource *resource;
>  	struct ttm_tt *ttm;
> -	bool deleted;
>  	struct ttm_lru_bulk_move *bulk_move;
>  	unsigned priority;
>  	unsigned pin_count;


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/6] drm/xe: stop asserting on the TTM refcount
  2025-06-16 13:07 ` [PATCH 2/6] drm/xe: stop asserting on the TTM refcount Christian König
@ 2025-06-23 14:54   ` Matthew Brost
  0 siblings, 0 replies; 12+ messages in thread
From: Matthew Brost @ 2025-06-23 14:54 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, intel-xe, dri-devel, matthew.auld

On Mon, Jun 16, 2025 at 03:07:22PM +0200, Christian König wrote:
> The TTM refcount is about to be removed.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/xe/xe_bo.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 7aa2c17825da..2a8e7cb8c982 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -1417,8 +1417,6 @@ static bool xe_ttm_bo_lock_in_destructor(struct ttm_buffer_object *ttm_bo)
>  	struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
>  	bool locked;
>  
> -	xe_assert(xe, !kref_read(&ttm_bo->kref));
> -

Maybe drop this patch and the last one in the series and replace them
with an assert(!gem.refcount). I think that would be okay and wouldn’t
break anything while retaining the spirit of this assert.

Matt

>  	/*
>  	 * We can typically only race with TTM trylocking under the
>  	 * lru_lock, which will immediately be unlocked again since
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging
  2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
                   ` (4 preceding siblings ...)
  2025-06-16 13:07 ` [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount Christian König
@ 2025-06-23 19:02 ` Ian Forbes
  5 siblings, 0 replies; 12+ messages in thread
From: Ian Forbes @ 2025-06-23 19:02 UTC (permalink / raw)
  To: Christian König
  Cc: thomas.hellstrom, intel-xe, dri-devel, matthew.brost,
	matthew.auld

On Mon, Jun 16, 2025 at 8:07 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> That is TTM-internal state which is about to be dropped.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

Reviewed-by: Ian Forbes <ian.forbes@broadcom.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/6] drm/ttm: fix error handling in ttm_buffer_object_transfer
  2025-06-16 13:07 ` [PATCH 3/6] drm/ttm: fix error handling in ttm_buffer_object_transfer Christian König
@ 2025-06-24  2:24   ` Matthew Brost
  0 siblings, 0 replies; 12+ messages in thread
From: Matthew Brost @ 2025-06-24  2:24 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, intel-xe, dri-devel, matthew.auld

On Mon, Jun 16, 2025 at 03:07:23PM +0200, Christian König wrote:
> Unlocking the resv object was missing in the error path. In addition,
> the resource should only be moved over after the fence slot has been
> reserved.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

Fixes tag?

You probably can merge this one by itself ahead of the rest of the series.

With that:
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index b78365dc1fed..56f3152f34f5 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -256,6 +256,13 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  	ret = dma_resv_trylock(&fbo->base.base._resv);
>  	WARN_ON(!ret);
>  
> +	ret = dma_resv_reserve_fences(&fbo->base.base._resv, 1);
> +	if (ret) {
> +		dma_resv_unlock(&fbo->base.base._resv);
> +		kfree(fbo);
> +		return ret;
> +	}
> +
>  	if (fbo->base.resource) {
>  		ttm_resource_set_bo(fbo->base.resource, &fbo->base);
>  		bo->resource = NULL;
> @@ -264,12 +271,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  		fbo->base.bulk_move = NULL;
>  	}
>  
> -	ret = dma_resv_reserve_fences(&fbo->base.base._resv, 1);
> -	if (ret) {
> -		kfree(fbo);
> -		return ret;
> -	}
> -
>  	ttm_bo_get(bo);
>  	fbo->bo = bo;
>  
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-06-24  2:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-16 13:07 [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Christian König
2025-06-16 13:07 ` [PATCH 2/6] drm/xe: stop asserting on the TTM refcount Christian König
2025-06-23 14:54   ` Matthew Brost
2025-06-16 13:07 ` [PATCH 3/6] drm/ttm: fix error handling in ttm_buffer_object_transfer Christian König
2025-06-24  2:24   ` Matthew Brost
2025-06-16 13:07 ` [PATCH 4/6] drm/ttm: rename ttm_bo_put to _fini Christian König
2025-06-17  2:08   ` kernel test robot
2025-06-16 13:07 ` [PATCH 5/6] drm/ttm: disable ttm_bo_validate_deleted_evict for now Christian König
2025-06-16 13:07 ` [PATCH 6/6] drm/ttm: replace TTMs refcount with the DRM refcount Christian König
2025-06-17  4:27   ` kernel test robot
2025-06-17 10:18   ` Thomas Hellström
2025-06-23 19:02 ` [PATCH 1/6] drm/vmwgfx: drop printing the TTM refcount for debugging Ian Forbes
