[PATCH v5 0/6] Improving the worst case TTM large allocation latency

* [PATCH v5 0/6] Improving the worst case TTM large allocation latency
@ 2025-10-20 11:54 Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 1/6] drm/ttm: Add getter for some pool properties Tvrtko Ursulin
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
	Danilo Krummrich, Dave Airlie, Gerd Hoffmann, Joonas Lahtinen,
	Lucas De Marchi, Lyude Paul, Maarten Lankhorst, Maxime Ripard,
	Rodrigo Vivi, Sui Jingfeng, Thadeu Lima de Souza Cascardo,
	Thomas Hellström, Thomas Zimmermann, Zack Rusin

Disclaimer:
Please note that, as this series includes a patch which touches a good number of
drivers, I will only copy everyone on the cover letter and the respective patch.
The assumption is that people are subscribed to dri-devel and so can look at the
whole series there. I know someone is bound to complain either way: about too
much email when everyone is copied on everything, and also when they are not.
So please be flexible.

Description:

All drivers which use the TTM pool allocator end up requesting large order
allocations when allocating large buffers. Those can be slow due to memory
pressure and so add latency to buffer creation. But there is often also a size
limit above which contiguous blocks do not bring any performance benefits. This
series allows drivers to tell TTM when it is okay to try a bit less hard.

We do this by allowing drivers to specify this cut-off point when creating the
TTM device and pools. Allocations above this size will skip direct reclaim, so
under memory pressure the worst case latency will improve. Background reclaim is
still kicked off, and both before and after the memory pressure all the TTM pool
buckets remain in use as they are today.
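
As an illustration of the cut-off described above, here is a minimal user-space
sketch. It is not the code from the series: the SKETCH_ flag values and the
helper name are made up for the example, and stand in for the kernel's
__GFP_DIRECT_RECLAIM and __GFP_KSWAPD_RECLAIM bits. It only shows the idea of
dropping direct reclaim above a driver-supplied order while leaving background
reclaim enabled.

```c
/* Hypothetical stand-ins for the kernel's reclaim bits (values are made up). */
#define SKETCH_GFP_DIRECT_RECLAIM	(1u << 0)
#define SKETCH_GFP_KSWAPD_RECLAIM	(1u << 1)

/*
 * Hypothetical helper: return the allocation flags to use for a given order.
 * Orders above the driver's maximum beneficial order lose direct reclaim,
 * background (kswapd) reclaim is left untouched.
 */
static unsigned int sketch_gfp_for_order(unsigned int gfp, unsigned int order,
					 unsigned int max_beneficial_order)
{
	if (order > max_beneficial_order)
		gfp &= ~SKETCH_GFP_DIRECT_RECLAIM;

	return gfp;
}
```

With a maximum beneficial order of 9, an order 10 request keeps only the
background reclaim bit, while order 9 and below are allocated exactly as before.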

This is especially interesting if someone has configured MAX_PAGE_ORDER to be
higher than the default. And even with the default, with amdgpu for example,
the amdgpu patch in the series (5/6) makes use of the new feature by telling TTM
that above 2MiB we do not expect performance benefits. This makes TTM not try
direct reclaim for the top bucket (4MiB).
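
To make the sizes above concrete, the following sketch converts a byte size to
an allocation order. It assumes 4KiB pages (SKETCH_PAGE_SHIFT of 12), and the
helper is a hypothetical user-space stand-in for the kernel's get_order(), not
code from the series.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4KiB pages */

/* Smallest order such that (1 << order) pages cover 'size' bytes. */
static unsigned int sketch_size_to_order(size_t size)
{
	size_t page_size = (size_t)1 << SKETCH_PAGE_SHIFT;
	size_t pages = (size + page_size - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;

	return order;
}
```

Under that assumption 2MiB maps to order 9 and 4MiB to order 10, so a 2MiB
threshold leaves only the 4MiB bucket above the cut-off.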

End result is that TTM drivers become a tiny bit nicer mm citizens and users
benefit from better worst case buffer creation latencies. As a side benefit we
get rid of two instances of those often very unreadable function signatures
with multiple nameless booleans.

If this sounds interesting and gets merged, the individual drivers can follow up
with patches configuring their thresholds.

v2:
 * Christian suggested to pass in the new data by changing the function signatures.

v3:
 * Moved ttm pool helpers into new ttm_pool_internal.h. (Christian)

v4:
 * Fixed TTM unit test build.

v5:
 * Renamed pool_flags to alloc_flags and moved to TTM_ALLOCATION_ namespace.
 * Added last patch (propagate ENOSPC) from Thomas' related series for reference.

v1 thread:
https://lore.kernel.org/dri-devel/20250919131127.90932-1-tvrtko.ursulin@igalia.com/

v3 thread:
https://lore.kernel.org/dri-devel/20251008115314.55438-1-tvrtko.ursulin@igalia.com/

v4 thread:
https://lore.kernel.org/dri-devel/20251013082240.55263-1-tvrtko.ursulin@igalia.com/

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sui Jingfeng <suijingfeng@loongson.cn>
Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Zack Rusin <zack.rusin@broadcom.com>

Tvrtko Ursulin (6):
  drm/ttm: Add getter for some pool properties
  drm/ttm: Replace multiple booleans with flags in pool init
  drm/ttm: Replace multiple booleans with flags in device init
  drm/ttm: Allow drivers to specify maximum beneficial TTM pool size
  drm/amdgpu: Configure max beneficial TTM pool allocation order
  drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  9 ++--
 drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
 drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
 drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 ++-
 drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
 drivers/gpu/drm/radeon/radeon_ttm.c           |  6 ++-
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++----
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
 drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 33 ++++++--------
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++-----
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 +--
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c     | 24 +++++-----
 drivers/gpu/drm/ttm/ttm_bo.c                  |  4 +-
 drivers/gpu/drm/ttm/ttm_device.c              |  9 ++--
 drivers/gpu/drm/ttm/ttm_pool.c                | 45 +++++++++++--------
 drivers/gpu/drm/ttm/ttm_pool_internal.h       | 25 +++++++++++
 drivers/gpu/drm/ttm/ttm_tt.c                  | 10 +++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 +-
 drivers/gpu/drm/xe/xe_device.c                |  2 +-
 include/drm/ttm/ttm_allocation.h              | 12 +++++
 include/drm/ttm/ttm_device.h                  |  8 +++-
 include/drm/ttm/ttm_pool.h                    |  8 ++--
 23 files changed, 154 insertions(+), 107 deletions(-)
 create mode 100644 drivers/gpu/drm/ttm/ttm_pool_internal.h
 create mode 100644 include/drm/ttm/ttm_allocation.h

-- 
2.48.0


* [PATCH v5 1/6] drm/ttm: Add getter for some pool properties
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
@ 2025-10-20 11:54 ` Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 2/6] drm/ttm: Replace multiple booleans with flags in pool init Tvrtko Ursulin
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Christian König,
	Thomas Hellström

No functional change, but this allows easier refactoring in the future.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c |  4 +++-
 drivers/gpu/drm/ttm/ttm_pool.c            | 29 ++++++++++++-----------
 drivers/gpu/drm/ttm/ttm_pool_internal.h   | 19 +++++++++++++++
 drivers/gpu/drm/ttm/ttm_tt.c              | 10 ++++----
 4 files changed, 43 insertions(+), 19 deletions(-)
 create mode 100644 drivers/gpu/drm/ttm/ttm_pool_internal.h

diff --git a/drivers/gpu/drm/ttm/tests/ttm_pool_test.c b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
index 8ade53371f72..17ebb9fbd688 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
@@ -8,6 +8,7 @@
 #include <drm/ttm/ttm_pool.h>
 
 #include "ttm_kunit_helpers.h"
+#include "../ttm_pool_internal.h"
 
 struct ttm_pool_test_case {
 	const char *description;
@@ -155,7 +156,8 @@ static void ttm_pool_alloc_basic(struct kunit *test)
 
 	KUNIT_ASSERT_PTR_EQ(test, pool->dev, devs->dev);
 	KUNIT_ASSERT_EQ(test, pool->nid, NUMA_NO_NODE);
-	KUNIT_ASSERT_EQ(test, pool->use_dma_alloc, params->use_dma_alloc);
+	KUNIT_ASSERT_EQ(test, ttm_pool_uses_dma_alloc(pool),
+			params->use_dma_alloc);
 
 	err = ttm_pool_alloc(pool, tt, &simple_ctx);
 	KUNIT_ASSERT_EQ(test, err, 0);
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index baf27c70a419..ff6fab4122bb 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -48,6 +48,7 @@
 #include <drm/ttm/ttm_bo.h>
 
 #include "ttm_module.h"
+#include "ttm_pool_internal.h"
 
 #ifdef CONFIG_FAULT_INJECTION
 #include <linux/fault-inject.h>
@@ -148,7 +149,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 		gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
 			__GFP_THISNODE;
 
-	if (!pool->use_dma_alloc) {
+	if (!ttm_pool_uses_dma_alloc(pool)) {
 		p = alloc_pages_node(pool->nid, gfp_flags, order);
 		if (p)
 			p->private = order;
@@ -200,7 +201,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 		set_pages_wb(p, 1 << order);
 #endif
 
-	if (!pool || !pool->use_dma_alloc) {
+	if (!pool || !ttm_pool_uses_dma_alloc(pool)) {
 		__free_pages(p, order);
 		return;
 	}
@@ -243,7 +244,7 @@ static int ttm_pool_map(struct ttm_pool *pool, unsigned int order,
 {
 	dma_addr_t addr;
 
-	if (pool->use_dma_alloc) {
+	if (ttm_pool_uses_dma_alloc(pool)) {
 		struct ttm_pool_dma *dma = (void *)p->private;
 
 		addr = dma->addr;
@@ -265,7 +266,7 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr,
 			   unsigned int num_pages)
 {
 	/* Unmapped while freeing the page */
-	if (pool->use_dma_alloc)
+	if (ttm_pool_uses_dma_alloc(pool))
 		return;
 
 	dma_unmap_page(pool->dev, dma_addr, (long)num_pages << PAGE_SHIFT,
@@ -339,7 +340,7 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
 						  enum ttm_caching caching,
 						  unsigned int order)
 {
-	if (pool->use_dma_alloc)
+	if (ttm_pool_uses_dma_alloc(pool))
 		return &pool->caching[caching].orders[order];
 
 #ifdef CONFIG_X86
@@ -348,7 +349,7 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
 		if (pool->nid != NUMA_NO_NODE)
 			return &pool->caching[caching].orders[order];
 
-		if (pool->use_dma32)
+		if (ttm_pool_uses_dma32(pool))
 			return &global_dma32_write_combined[order];
 
 		return &global_write_combined[order];
@@ -356,7 +357,7 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
 		if (pool->nid != NUMA_NO_NODE)
 			return &pool->caching[caching].orders[order];
 
-		if (pool->use_dma32)
+		if (ttm_pool_uses_dma32(pool))
 			return &global_dma32_uncached[order];
 
 		return &global_uncached[order];
@@ -396,7 +397,7 @@ static unsigned int ttm_pool_shrink(void)
 /* Return the allocation order based for a page */
 static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
 {
-	if (pool->use_dma_alloc) {
+	if (ttm_pool_uses_dma_alloc(pool)) {
 		struct ttm_pool_dma *dma = (void *)p->private;
 
 		return dma->vaddr & ~PAGE_MASK;
@@ -719,7 +720,7 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 	if (ctx->gfp_retry_mayfail)
 		gfp_flags |= __GFP_RETRY_MAYFAIL;
 
-	if (pool->use_dma32)
+	if (ttm_pool_uses_dma32(pool))
 		gfp_flags |= GFP_DMA32;
 	else
 		gfp_flags |= GFP_HIGHUSER;
@@ -977,7 +978,7 @@ long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *tt,
 		return -EINVAL;
 
 	if ((!ttm_backup_bytes_avail() && !flags->purge) ||
-	    pool->use_dma_alloc || ttm_tt_is_backed_up(tt))
+	    ttm_pool_uses_dma_alloc(pool) || ttm_tt_is_backed_up(tt))
 		return -EBUSY;
 
 #ifdef CONFIG_X86
@@ -1014,7 +1015,7 @@ long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *tt,
 	if (flags->purge)
 		return shrunken;
 
-	if (pool->use_dma32)
+	if (ttm_pool_uses_dma32(pool))
 		gfp = GFP_DMA32;
 	else
 		gfp = GFP_HIGHUSER;
@@ -1068,7 +1069,7 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
 {
 	unsigned int i, j;
 
-	WARN_ON(!dev && use_dma_alloc);
+	WARN_ON(!dev && ttm_pool_uses_dma_alloc(pool));
 
 	pool->dev = dev;
 	pool->nid = nid;
@@ -1239,7 +1240,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)
 {
 	unsigned int i;
 
-	if (!pool->use_dma_alloc && pool->nid == NUMA_NO_NODE) {
+	if (!ttm_pool_uses_dma_alloc(pool) && pool->nid == NUMA_NO_NODE) {
 		seq_puts(m, "unused\n");
 		return 0;
 	}
@@ -1250,7 +1251,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)
 	for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) {
 		if (!ttm_pool_select_type(pool, i, 0))
 			continue;
-		if (pool->use_dma_alloc)
+		if (ttm_pool_uses_dma_alloc(pool))
 			seq_puts(m, "DMA ");
 		else
 			seq_printf(m, "N%d ", pool->nid);
diff --git a/drivers/gpu/drm/ttm/ttm_pool_internal.h b/drivers/gpu/drm/ttm/ttm_pool_internal.h
new file mode 100644
index 000000000000..3e50d30bd95a
--- /dev/null
+++ b/drivers/gpu/drm/ttm/ttm_pool_internal.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/* Copyright (c) 2025 Valve Corporation */
+
+#ifndef _TTM_POOL_INTERNAL_H_
+#define _TTM_POOL_INTERNAL_H_
+
+#include <drm/ttm/ttm_pool.h>
+
+static inline bool ttm_pool_uses_dma_alloc(struct ttm_pool *pool)
+{
+	return pool->use_dma_alloc;
+}
+
+static inline bool ttm_pool_uses_dma32(struct ttm_pool *pool)
+{
+	return pool->use_dma32;
+}
+
+#endif
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 506e257dfba8..ced0875d0722 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -46,6 +46,7 @@
 #include <drm/ttm/ttm_tt.h>
 
 #include "ttm_module.h"
+#include "ttm_pool_internal.h"
 
 static unsigned long ttm_pages_limit;
 
@@ -93,7 +94,8 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
 	 * mapped TT pages need to be decrypted or otherwise the drivers
 	 * will end up sending encrypted mem to the gpu.
 	 */
-	if (bdev->pool.use_dma_alloc && cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
+	if (ttm_pool_uses_dma_alloc(&bdev->pool) &&
+	    cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
 		page_flags |= TTM_TT_FLAG_DECRYPTED;
 		drm_info_once(ddev, "TT memory decryption enabled.");
 	}
@@ -378,7 +380,7 @@ int ttm_tt_populate(struct ttm_device *bdev,
 
 	if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
-		if (bdev->pool.use_dma32)
+		if (ttm_pool_uses_dma32(&bdev->pool))
 			atomic_long_add(ttm->num_pages,
 					&ttm_dma32_pages_allocated);
 	}
@@ -416,7 +418,7 @@ int ttm_tt_populate(struct ttm_device *bdev,
 error:
 	if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
-		if (bdev->pool.use_dma32)
+		if (ttm_pool_uses_dma32(&bdev->pool))
 			atomic_long_sub(ttm->num_pages,
 					&ttm_dma32_pages_allocated);
 	}
@@ -439,7 +441,7 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
 
 	if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
-		if (bdev->pool.use_dma32)
+		if (ttm_pool_uses_dma32(&bdev->pool))
 			atomic_long_sub(ttm->num_pages,
 					&ttm_dma32_pages_allocated);
 	}
-- 
2.48.0



* [PATCH v5 2/6] drm/ttm: Replace multiple booleans with flags in pool init
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 1/6] drm/ttm: Add getter for some pool properties Tvrtko Ursulin
@ 2025-10-20 11:54 ` Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init Tvrtko Ursulin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
	Thomas Hellström

Multiple consecutive boolean function arguments are usually not very
readable.

Replace the ones in ttm_pool_init() with flags, with the additional
benefit of soon being able to pass in more data for just a one-off
code base churn cost.
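
For readers skimming the thread, here is a small user-space mock-up of the
pattern. The two TTM_ALLOCATION_POOL_ names and bit positions are taken from
the patch below; BIT() is redefined locally, and struct sketch_pool and the
sketch_ helpers are hypothetical simplifications, not the kernel code.

```c
#include <stdbool.h>

#define BIT(n)	(1u << (n))

/* Flag names and bit positions as introduced by this patch. */
#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(0)
#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(1)

/* Hypothetical cut-down pool, carrying only the flags word. */
struct sketch_pool {
	unsigned int alloc_flags;
};

/* Convert the old pair of booleans into the new flags word. */
static unsigned int sketch_flags_from_bools(bool use_dma_alloc, bool use_dma32)
{
	return (use_dma_alloc ? TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
	       (use_dma32 ? TTM_ALLOCATION_POOL_USE_DMA32 : 0);
}

/* Getters in the style of ttm_pool_internal.h. */
static bool sketch_pool_uses_dma_alloc(const struct sketch_pool *pool)
{
	return pool->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA_ALLOC;
}

static bool sketch_pool_uses_dma32(const struct sketch_pool *pool)
{
	return pool->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32;
}
```

A call site that used to read "true, false" then names what it asks for, and
new bits can be added later without touching every caller again.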

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
v2:
 * Rebase for rename and move of flags to alloc_flags / TTM_ALLOCATION_.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c     |  2 +-
 drivers/gpu/drm/ttm/tests/ttm_device_test.c | 25 +++++++++------------
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c   | 24 +++++++++-----------
 drivers/gpu/drm/ttm/ttm_device.c            |  5 ++++-
 drivers/gpu/drm/ttm/ttm_pool.c              |  8 +++----
 drivers/gpu/drm/ttm/ttm_pool_internal.h     |  5 +++--
 include/drm/ttm/ttm_allocation.h            | 10 +++++++++
 include/drm/ttm/ttm_pool.h                  |  8 +++----
 8 files changed, 45 insertions(+), 42 deletions(-)
 create mode 100644 include/drm/ttm/ttm_allocation.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index aa9ee5dffa45..8f6d331e1ea2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1837,7 +1837,7 @@ static int amdgpu_ttm_pools_init(struct amdgpu_device *adev)
 	for (i = 0; i < adev->gmc.num_mem_partitions; i++) {
 		ttm_pool_init(&adev->mman.ttm_pools[i], adev->dev,
 			      adev->gmc.mem_partitions[i].numa.node,
-			      false, false);
+			      0);
 	}
 	return 0;
 }
diff --git a/drivers/gpu/drm/ttm/tests/ttm_device_test.c b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
index 1621903818e5..98648d5f20e7 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_device_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
@@ -7,11 +7,11 @@
 #include <drm/ttm/ttm_placement.h>
 
 #include "ttm_kunit_helpers.h"
+#include "../ttm_pool_internal.h"
 
 struct ttm_device_test_case {
 	const char *description;
-	bool use_dma_alloc;
-	bool use_dma32;
+	unsigned int alloc_flags;
 	bool pools_init_expected;
 };
 
@@ -119,26 +119,22 @@ static void ttm_device_init_no_vma_man(struct kunit *test)
 static const struct ttm_device_test_case ttm_device_cases[] = {
 	{
 		.description = "No DMA allocations, no DMA32 required",
-		.use_dma_alloc = false,
-		.use_dma32 = false,
 		.pools_init_expected = false,
 	},
 	{
 		.description = "DMA allocations, DMA32 required",
-		.use_dma_alloc = true,
-		.use_dma32 = true,
+		.alloc_flags = TTM_ALLOCATION_POOL_USE_DMA_ALLOC |
+			       TTM_ALLOCATION_POOL_USE_DMA32,
 		.pools_init_expected = true,
 	},
 	{
 		.description = "No DMA allocations, DMA32 required",
-		.use_dma_alloc = false,
-		.use_dma32 = true,
+		.alloc_flags = TTM_ALLOCATION_POOL_USE_DMA32,
 		.pools_init_expected = false,
 	},
 	{
 		.description = "DMA allocations, no DMA32 required",
-		.use_dma_alloc = true,
-		.use_dma32 = false,
+		.alloc_flags = TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
 		.pools_init_expected = true,
 	},
 };
@@ -163,15 +159,14 @@ static void ttm_device_init_pools(struct kunit *test)
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
 	err = ttm_device_kunit_init(priv, ttm_dev,
-				    params->use_dma_alloc,
-				    params->use_dma32);
+				    params->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
+				    params->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32);
 	KUNIT_ASSERT_EQ(test, err, 0);
 
 	pool = &ttm_dev->pool;
 	KUNIT_ASSERT_NOT_NULL(test, pool);
 	KUNIT_EXPECT_PTR_EQ(test, pool->dev, priv->dev);
-	KUNIT_EXPECT_EQ(test, pool->use_dma_alloc, params->use_dma_alloc);
-	KUNIT_EXPECT_EQ(test, pool->use_dma32, params->use_dma32);
+	KUNIT_EXPECT_EQ(test, pool->alloc_flags, params->alloc_flags);
 
 	if (params->pools_init_expected) {
 		for (int i = 0; i < TTM_NUM_CACHING_TYPES; ++i) {
@@ -181,7 +176,7 @@ static void ttm_device_init_pools(struct kunit *test)
 				KUNIT_EXPECT_EQ(test, pt.caching, i);
 				KUNIT_EXPECT_EQ(test, pt.order, j);
 
-				if (params->use_dma_alloc)
+				if (ttm_pool_uses_dma_alloc(pool))
 					KUNIT_ASSERT_FALSE(test,
 							   list_empty(&pt.pages));
 			}
diff --git a/drivers/gpu/drm/ttm/tests/ttm_pool_test.c b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
index 17ebb9fbd688..11c92bd75779 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
@@ -13,7 +13,7 @@
 struct ttm_pool_test_case {
 	const char *description;
 	unsigned int order;
-	bool use_dma_alloc;
+	unsigned int alloc_flags;
 };
 
 struct ttm_pool_test_priv {
@@ -87,7 +87,7 @@ static struct ttm_pool *ttm_pool_pre_populated(struct kunit *test,
 	pool = kunit_kzalloc(test, sizeof(*pool), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, pool);
 
-	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, true, false);
+	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, TTM_ALLOCATION_POOL_USE_DMA_ALLOC);
 
 	err = ttm_pool_alloc(pool, tt, &simple_ctx);
 	KUNIT_ASSERT_EQ(test, err, 0);
@@ -114,12 +114,12 @@ static const struct ttm_pool_test_case ttm_pool_basic_cases[] = {
 	{
 		.description = "One page, with coherent DMA mappings enabled",
 		.order = 0,
-		.use_dma_alloc = true,
+		.alloc_flags = TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
 	},
 	{
 		.description = "Above the allocation limit, with coherent DMA mappings enabled",
 		.order = MAX_PAGE_ORDER + 1,
-		.use_dma_alloc = true,
+		.alloc_flags = TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
 	},
 };
 
@@ -151,13 +151,11 @@ static void ttm_pool_alloc_basic(struct kunit *test)
 	pool = kunit_kzalloc(test, sizeof(*pool), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, pool);
 
-	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, params->use_dma_alloc,
-		      false);
+	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, params->alloc_flags);
 
 	KUNIT_ASSERT_PTR_EQ(test, pool->dev, devs->dev);
 	KUNIT_ASSERT_EQ(test, pool->nid, NUMA_NO_NODE);
-	KUNIT_ASSERT_EQ(test, ttm_pool_uses_dma_alloc(pool),
-			params->use_dma_alloc);
+	KUNIT_ASSERT_EQ(test, pool->alloc_flags, params->alloc_flags);
 
 	err = ttm_pool_alloc(pool, tt, &simple_ctx);
 	KUNIT_ASSERT_EQ(test, err, 0);
@@ -167,14 +165,14 @@ static void ttm_pool_alloc_basic(struct kunit *test)
 	last_page = tt->pages[tt->num_pages - 1];
 
 	if (params->order <= MAX_PAGE_ORDER) {
-		if (params->use_dma_alloc) {
+		if (ttm_pool_uses_dma_alloc(pool)) {
 			KUNIT_ASSERT_NOT_NULL(test, (void *)fst_page->private);
 			KUNIT_ASSERT_NOT_NULL(test, (void *)last_page->private);
 		} else {
 			KUNIT_ASSERT_EQ(test, fst_page->private, params->order);
 		}
 	} else {
-		if (params->use_dma_alloc) {
+		if (ttm_pool_uses_dma_alloc(pool)) {
 			KUNIT_ASSERT_NOT_NULL(test, (void *)fst_page->private);
 			KUNIT_ASSERT_NULL(test, (void *)last_page->private);
 		} else {
@@ -220,7 +218,7 @@ static void ttm_pool_alloc_basic_dma_addr(struct kunit *test)
 	pool = kunit_kzalloc(test, sizeof(*pool), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, pool);
 
-	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, true, false);
+	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, TTM_ALLOCATION_POOL_USE_DMA_ALLOC);
 
 	err = ttm_pool_alloc(pool, tt, &simple_ctx);
 	KUNIT_ASSERT_EQ(test, err, 0);
@@ -350,7 +348,7 @@ static void ttm_pool_free_dma_alloc(struct kunit *test)
 	pool = kunit_kzalloc(test, sizeof(*pool), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, pool);
 
-	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, true, false);
+	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, TTM_ALLOCATION_POOL_USE_DMA_ALLOC);
 	ttm_pool_alloc(pool, tt, &simple_ctx);
 
 	pt = &pool->caching[caching].orders[order];
@@ -381,7 +379,7 @@ static void ttm_pool_free_no_dma_alloc(struct kunit *test)
 	pool = kunit_kzalloc(test, sizeof(*pool), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, pool);
 
-	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, false, false);
+	ttm_pool_init(pool, devs->dev, NUMA_NO_NODE, 0);
 	ttm_pool_alloc(pool, tt, &simple_ctx);
 
 	pt = &pool->caching[caching].orders[order];
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index c3e2fcbdd2cc..a97b1444536c 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -31,6 +31,7 @@
 #include <linux/export.h>
 #include <linux/mm.h>
 
+#include <drm/ttm/ttm_allocation.h>
 #include <drm/ttm/ttm_bo.h>
 #include <drm/ttm/ttm_device.h>
 #include <drm/ttm/ttm_tt.h>
@@ -236,7 +237,9 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
 	else
 		nid = NUMA_NO_NODE;
 
-	ttm_pool_init(&bdev->pool, dev, nid, use_dma_alloc, use_dma32);
+	ttm_pool_init(&bdev->pool, dev, nid,
+		      (use_dma_alloc ? TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
+		      (use_dma32 ? TTM_ALLOCATION_POOL_USE_DMA32 : 0));
 
 	bdev->vma_manager = vma_manager;
 	spin_lock_init(&bdev->lru_lock);
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index ff6fab4122bb..4fc69447060c 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -1059,13 +1059,12 @@ long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *tt,
  * @pool: the pool to initialize
  * @dev: device for DMA allocations and mappings
  * @nid: NUMA node to use for allocations
- * @use_dma_alloc: true if coherent DMA alloc should be used
- * @use_dma32: true if GFP_DMA32 should be used
+ * @alloc_flags: TTM_ALLOCATION_POOL_ flags
  *
  * Initialize the pool and its pool types.
  */
 void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
-		   int nid, bool use_dma_alloc, bool use_dma32)
+		   int nid, unsigned int alloc_flags)
 {
 	unsigned int i, j;
 
@@ -1073,8 +1072,7 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
 
 	pool->dev = dev;
 	pool->nid = nid;
-	pool->use_dma_alloc = use_dma_alloc;
-	pool->use_dma32 = use_dma32;
+	pool->alloc_flags = alloc_flags;
 
 	for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) {
 		for (j = 0; j < NR_PAGE_ORDERS; ++j) {
diff --git a/drivers/gpu/drm/ttm/ttm_pool_internal.h b/drivers/gpu/drm/ttm/ttm_pool_internal.h
index 3e50d30bd95a..96b7f21514fb 100644
--- a/drivers/gpu/drm/ttm/ttm_pool_internal.h
+++ b/drivers/gpu/drm/ttm/ttm_pool_internal.h
@@ -4,16 +4,17 @@
 #ifndef _TTM_POOL_INTERNAL_H_
 #define _TTM_POOL_INTERNAL_H_
 
+#include <drm/ttm/ttm_allocation.h>
 #include <drm/ttm/ttm_pool.h>
 
 static inline bool ttm_pool_uses_dma_alloc(struct ttm_pool *pool)
 {
-	return pool->use_dma_alloc;
+	return pool->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA_ALLOC;
 }
 
 static inline bool ttm_pool_uses_dma32(struct ttm_pool *pool)
 {
-	return pool->use_dma32;
+	return pool->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32;
 }
 
 #endif
diff --git a/include/drm/ttm/ttm_allocation.h b/include/drm/ttm/ttm_allocation.h
new file mode 100644
index 000000000000..7869dc32bd91
--- /dev/null
+++ b/include/drm/ttm/ttm_allocation.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/* Copyright (c) 2025 Valve Corporation */
+
+#ifndef _TTM_ALLOCATION_H_
+#define _TTM_ALLOCATION_H_
+
+#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(0) /* Use coherent DMA allocations. */
+#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(1) /* Use GFP_DMA32 allocations. */
+
+#endif
diff --git a/include/drm/ttm/ttm_pool.h b/include/drm/ttm/ttm_pool.h
index 54cd34a6e4c0..67c72de913bb 100644
--- a/include/drm/ttm/ttm_pool.h
+++ b/include/drm/ttm/ttm_pool.h
@@ -64,16 +64,14 @@ struct ttm_pool_type {
  *
  * @dev: the device we allocate pages for
  * @nid: which numa node to use
- * @use_dma_alloc: if coherent DMA allocations should be used
- * @use_dma32: if GFP_DMA32 should be used
+ * @alloc_flags: TTM_ALLOCATION_POOL_ flags
  * @caching: pools for each caching/order
  */
 struct ttm_pool {
 	struct device *dev;
 	int nid;
 
-	bool use_dma_alloc;
-	bool use_dma32;
+	unsigned int alloc_flags;
 
 	struct {
 		struct ttm_pool_type orders[NR_PAGE_ORDERS];
@@ -85,7 +83,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt);
 
 void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
-		   int nid, bool use_dma_alloc, bool use_dma32);
+		   int nid, unsigned int alloc_flags);
 void ttm_pool_fini(struct ttm_pool *pool);
 
 int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m);
-- 
2.48.0



* [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 1/6] drm/ttm: Add getter for some pool properties Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 2/6] drm/ttm: Replace multiple booleans with flags in pool init Tvrtko Ursulin
@ 2025-10-20 11:54 ` Tvrtko Ursulin
  2025-10-21 14:16   ` Thomas Hellström
  2025-10-22  3:56   ` Zack Rusin
  2025-10-20 11:54 ` [PATCH v5 4/6] drm/ttm: Allow drivers to specify maximum beneficial TTM pool size Tvrtko Ursulin
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
	Danilo Krummrich, Dave Airlie, Gerd Hoffmann, Joonas Lahtinen,
	Lucas De Marchi, Lyude Paul, Maarten Lankhorst, Maxime Ripard,
	Rodrigo Vivi, Sui Jingfeng, Thomas Hellström,
	Thomas Zimmermann, Zack Rusin

Multiple consecutive boolean function arguments are usually not very
readable.

Replace the ones in ttm_device_init() with flags, with the additional
benefit of soon being able to pass in more data for just a one-off
code base churn cost.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sui Jingfeng <suijingfeng@loongson.cn>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Acked-by: Christian König <christian.koenig@amd.com>
---
v2:
 * Rebase for rename and move of flags to alloc_flags / TTM_ALLOCATION_.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  6 +++--
 drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
 drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
 drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 ++-
 drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 +++--
 drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
 drivers/gpu/drm/radeon/radeon_ttm.c           |  6 +++--
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++++++-------
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
 drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 12 +++++-----
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++++++-----------
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 ++----
 drivers/gpu/drm/ttm/ttm_device.c              |  9 +++-----
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 ++--
 drivers/gpu/drm/xe/xe_device.c                |  2 +-
 include/drm/ttm/ttm_device.h                  |  3 ++-
 16 files changed, 50 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8f6d331e1ea2..7b144ddea268 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1930,8 +1930,10 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	r = ttm_device_init(&adev->mman.bdev, &amdgpu_bo_driver, adev->dev,
 			       adev_to_drm(adev)->anon_inode->i_mapping,
 			       adev_to_drm(adev)->vma_offset_manager,
-			       adev->need_swiotlb,
-			       dma_addressing_limited(adev->dev));
+			       (adev->need_swiotlb ?
+				TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
+			       (dma_addressing_limited(adev->dev) ?
+				TTM_ALLOCATION_POOL_USE_DMA32 : 0));
 	if (r) {
 		dev_err(adev->dev,
 			"failed initializing buffer object driver(%d).\n", r);
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index 0bec6f66682b..dd3292e57d64 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -859,7 +859,7 @@ static int drm_vram_mm_init(struct drm_vram_mm *vmm, struct drm_device *dev,
 	ret = ttm_device_init(&vmm->bdev, &bo_driver, dev->dev,
 				 dev->anon_inode->i_mapping,
 				 dev->vma_offset_manager,
-				 false, true);
+				 TTM_ALLOCATION_POOL_USE_DMA32);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c
index 04525d92bec5..47a69aad5c3f 100644
--- a/drivers/gpu/drm/i915/intel_region_ttm.c
+++ b/drivers/gpu/drm/i915/intel_region_ttm.c
@@ -34,7 +34,7 @@ int intel_region_ttm_device_init(struct drm_i915_private *dev_priv)
 
 	return ttm_device_init(&dev_priv->bdev, i915_ttm_driver(),
 			       drm->dev, drm->anon_inode->i_mapping,
-			       drm->vma_offset_manager, false, false);
+			       drm->vma_offset_manager, 0);
 }
 
 /**
diff --git a/drivers/gpu/drm/loongson/lsdc_ttm.c b/drivers/gpu/drm/loongson/lsdc_ttm.c
index 2e42c6970c9f..dca0d33e2cf2 100644
--- a/drivers/gpu/drm/loongson/lsdc_ttm.c
+++ b/drivers/gpu/drm/loongson/lsdc_ttm.c
@@ -544,7 +544,8 @@ int lsdc_ttm_init(struct lsdc_device *ldev)
 
 	ret = ttm_device_init(&ldev->bdev, &lsdc_bo_driver, ddev->dev,
 			      ddev->anon_inode->i_mapping,
-			      ddev->vma_offset_manager, false, true);
+			      ddev->vma_offset_manager,
+			      TTM_ALLOCATION_POOL_USE_DMA32);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 7d2436e5d50d..47b20cf80388 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -302,8 +302,10 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 	ret = ttm_device_init(&drm->ttm.bdev, &nouveau_bo_driver, drm->dev->dev,
 				  dev->anon_inode->i_mapping,
 				  dev->vma_offset_manager,
-				  drm_need_swiotlb(drm->client.mmu.dmabits),
-				  drm->client.mmu.dmabits <= 32);
+				  (drm_need_swiotlb(drm->client.mmu.dmabits) ?
+				   TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
+				  (drm->client.mmu.dmabits <= 32 ?
+				   TTM_ALLOCATION_POOL_USE_DMA32 : 0));
 	if (ret) {
 		NV_ERROR(drm, "error initialising bo driver, %d\n", ret);
 		return ret;
diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
index 765a144cea14..85d9df48affa 100644
--- a/drivers/gpu/drm/qxl/qxl_ttm.c
+++ b/drivers/gpu/drm/qxl/qxl_ttm.c
@@ -196,7 +196,7 @@ int qxl_ttm_init(struct qxl_device *qdev)
 	r = ttm_device_init(&qdev->mman.bdev, &qxl_bo_driver, NULL,
 			    qdev->ddev.anon_inode->i_mapping,
 			    qdev->ddev.vma_offset_manager,
-			    false, false);
+			    0);
 	if (r) {
 		DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 616d25c8c2de..51dffe23c0fc 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -683,8 +683,10 @@ int radeon_ttm_init(struct radeon_device *rdev)
 	r = ttm_device_init(&rdev->mman.bdev, &radeon_bo_driver, rdev->dev,
 			       rdev_to_drm(rdev)->anon_inode->i_mapping,
 			       rdev_to_drm(rdev)->vma_offset_manager,
-			       rdev->need_swiotlb,
-			       dma_addressing_limited(&rdev->pdev->dev));
+			       (rdev->need_swiotlb ?
+				TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
+			       (dma_addressing_limited(&rdev->pdev->dev) ?
+				TTM_ALLOCATION_POOL_USE_DMA32 : 0));
 	if (r) {
 		DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
 		return r;
diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
index 5426b435f702..d468f8322072 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
@@ -251,7 +251,7 @@ static void ttm_bo_unreserve_basic(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -290,7 +290,7 @@ static void ttm_bo_unreserve_pinned(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -342,7 +342,7 @@ static void ttm_bo_unreserve_bulk(struct kunit *test)
 	resv = kunit_kzalloc(test, sizeof(*resv), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, resv);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -394,7 +394,7 @@ static void ttm_bo_fini_basic(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -437,7 +437,7 @@ static void ttm_bo_fini_shared_resv(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -477,7 +477,7 @@ static void ttm_bo_pin_basic(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -512,7 +512,7 @@ static void ttm_bo_pin_unpin_resource(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
@@ -563,7 +563,7 @@ static void ttm_bo_multiple_pin_one_unpin(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 	priv->ttm_dev = ttm_dev;
 
diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
index 3a1eef83190c..17a570af296c 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
@@ -995,7 +995,7 @@ static void ttm_bo_validate_busy_domain_evict(struct kunit *test)
 	 */
 	ttm_device_fini(priv->ttm_dev);
 
-	err = ttm_device_kunit_init_bad_evict(test->priv, priv->ttm_dev, false, false);
+	err = ttm_device_kunit_init_bad_evict(test->priv, priv->ttm_dev);
 	KUNIT_ASSERT_EQ(test, err, 0);
 
 	ttm_mock_manager_init(priv->ttm_dev, mem_type, MANAGER_SIZE);
diff --git a/drivers/gpu/drm/ttm/tests/ttm_device_test.c b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
index 98648d5f20e7..2d55ad34fe48 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_device_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
@@ -25,7 +25,7 @@ static void ttm_device_init_basic(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 
 	KUNIT_EXPECT_PTR_EQ(test, ttm_dev->funcs, &ttm_dev_funcs);
@@ -55,7 +55,7 @@ static void ttm_device_init_multiple(struct kunit *test)
 	KUNIT_ASSERT_NOT_NULL(test, ttm_devs);
 
 	for (i = 0; i < num_dev; i++) {
-		err = ttm_device_kunit_init(priv, &ttm_devs[i], false, false);
+		err = ttm_device_kunit_init(priv, &ttm_devs[i], 0);
 		KUNIT_ASSERT_EQ(test, err, 0);
 
 		KUNIT_EXPECT_PTR_EQ(test, ttm_devs[i].dev_mapping,
@@ -81,7 +81,7 @@ static void ttm_device_fini_basic(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 
 	man = ttm_manager_type(ttm_dev, TTM_PL_SYSTEM);
@@ -109,7 +109,7 @@ static void ttm_device_init_no_vma_man(struct kunit *test)
 	vma_man = drm->vma_offset_manager;
 	drm->vma_offset_manager = NULL;
 
-	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+	err = ttm_device_kunit_init(priv, ttm_dev, 0);
 	KUNIT_EXPECT_EQ(test, err, -EINVAL);
 
 	/* Bring the manager back for a graceful cleanup */
@@ -158,9 +158,7 @@ static void ttm_device_init_pools(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(priv, ttm_dev,
-				    params->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
-				    params->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32);
+	err = ttm_device_kunit_init(priv, ttm_dev, params->alloc_flags);
 	KUNIT_ASSERT_EQ(test, err, 0);
 
 	pool = &ttm_dev->pool;
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
index 7aaf0d1395ff..7b533e4e1e04 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
@@ -117,8 +117,7 @@ static void bad_evict_flags(struct ttm_buffer_object *bo,
 
 static int ttm_device_kunit_init_with_funcs(struct ttm_test_devices *priv,
 					    struct ttm_device *ttm,
-					    bool use_dma_alloc,
-					    bool use_dma32,
+					    unsigned int alloc_flags,
 					    struct ttm_device_funcs *funcs)
 {
 	struct drm_device *drm = priv->drm;
@@ -127,7 +126,7 @@ static int ttm_device_kunit_init_with_funcs(struct ttm_test_devices *priv,
 	err = ttm_device_init(ttm, funcs, drm->dev,
 			      drm->anon_inode->i_mapping,
 			      drm->vma_offset_manager,
-			      use_dma_alloc, use_dma32);
+			      alloc_flags);
 
 	return err;
 }
@@ -143,11 +142,10 @@ EXPORT_SYMBOL_GPL(ttm_dev_funcs);
 
 int ttm_device_kunit_init(struct ttm_test_devices *priv,
 			  struct ttm_device *ttm,
-			  bool use_dma_alloc,
-			  bool use_dma32)
+			  unsigned int alloc_flags)
 {
-	return ttm_device_kunit_init_with_funcs(priv, ttm, use_dma_alloc,
-						use_dma32, &ttm_dev_funcs);
+	return ttm_device_kunit_init_with_funcs(priv, ttm, alloc_flags,
+						&ttm_dev_funcs);
 }
 EXPORT_SYMBOL_GPL(ttm_device_kunit_init);
 
@@ -161,12 +159,10 @@ struct ttm_device_funcs ttm_dev_funcs_bad_evict = {
 EXPORT_SYMBOL_GPL(ttm_dev_funcs_bad_evict);
 
 int ttm_device_kunit_init_bad_evict(struct ttm_test_devices *priv,
-				    struct ttm_device *ttm,
-				    bool use_dma_alloc,
-				    bool use_dma32)
+				    struct ttm_device *ttm)
 {
-	return ttm_device_kunit_init_with_funcs(priv, ttm, use_dma_alloc,
-						use_dma32, &ttm_dev_funcs_bad_evict);
+	return ttm_device_kunit_init_with_funcs(priv, ttm, 0,
+						&ttm_dev_funcs_bad_evict);
 }
 EXPORT_SYMBOL_GPL(ttm_device_kunit_init_bad_evict);
 
@@ -252,7 +248,7 @@ struct ttm_test_devices *ttm_test_devices_all(struct kunit *test)
 	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
 
-	err = ttm_device_kunit_init(devs, ttm_dev, false, false);
+	err = ttm_device_kunit_init(devs, ttm_dev, 0);
 	KUNIT_ASSERT_EQ(test, err, 0);
 
 	devs->ttm_dev = ttm_dev;
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
index c7da23232ffa..f8402b979d05 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
@@ -28,12 +28,9 @@ struct ttm_test_devices {
 /* Building blocks for test-specific init functions */
 int ttm_device_kunit_init(struct ttm_test_devices *priv,
 			  struct ttm_device *ttm,
-			  bool use_dma_alloc,
-			  bool use_dma32);
+			  unsigned int alloc_flags);
 int ttm_device_kunit_init_bad_evict(struct ttm_test_devices *priv,
-				    struct ttm_device *ttm,
-				    bool use_dma_alloc,
-				    bool use_dma32);
+				    struct ttm_device *ttm);
 struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
 					    struct ttm_test_devices *devs,
 					    size_t size,
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index a97b1444536c..87c85ccb21ac 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -199,8 +199,7 @@ EXPORT_SYMBOL(ttm_device_swapout);
  * @dev: The core kernel device pointer for DMA mappings and allocations.
  * @mapping: The address space to use for this bo.
  * @vma_manager: A pointer to a vma manager.
- * @use_dma_alloc: If coherent DMA allocation API should be used.
- * @use_dma32: If we should use GFP_DMA32 for device memory allocations.
+ * @alloc_flags: TTM_ALLOCATION_ flags.
  *
  * Initializes a struct ttm_device:
  * Returns:
@@ -209,7 +208,7 @@ EXPORT_SYMBOL(ttm_device_swapout);
 int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs,
 		    struct device *dev, struct address_space *mapping,
 		    struct drm_vma_offset_manager *vma_manager,
-		    bool use_dma_alloc, bool use_dma32)
+		    unsigned int alloc_flags)
 {
 	struct ttm_global *glob = &ttm_glob;
 	int ret, nid;
@@ -237,9 +236,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
 	else
 		nid = NUMA_NO_NODE;
 
-	ttm_pool_init(&bdev->pool, dev, nid,
-		      (use_dma_alloc ? TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
-		      (use_dma32 ? TTM_ALLOCATION_POOL_USE_DMA32 : 0));
+	ttm_pool_init(&bdev->pool, dev, nid, alloc_flags);
 
 	bdev->vma_manager = vma_manager;
 	spin_lock_init(&bdev->lru_lock);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 8ff958d119be..599052d07ae8 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1023,8 +1023,8 @@ static int vmw_driver_load(struct vmw_private *dev_priv, u32 pci_id)
 			      dev_priv->drm.dev,
 			      dev_priv->drm.anon_inode->i_mapping,
 			      dev_priv->drm.vma_offset_manager,
-			      dev_priv->map_mode == vmw_dma_alloc_coherent,
-			      false);
+			      (dev_priv->map_mode == vmw_dma_alloc_coherent) ?
+			      TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0);
 	if (unlikely(ret != 0)) {
 		drm_err(&dev_priv->drm,
 			"Failed initializing TTM buffer object driver.\n");
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 5f6a412b571c..58e7996160a0 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -437,7 +437,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 
 	err = ttm_device_init(&xe->ttm, &xe_ttm_funcs, xe->drm.dev,
 			      xe->drm.anon_inode->i_mapping,
-			      xe->drm.vma_offset_manager, false, false);
+			      xe->drm.vma_offset_manager, 0);
 	if (WARN_ON(err))
 		goto err;
 
diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 592b5f802859..074b98572275 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -27,6 +27,7 @@
 
 #include <linux/types.h>
 #include <linux/workqueue.h>
+#include <drm/ttm/ttm_allocation.h>
 #include <drm/ttm/ttm_resource.h>
 #include <drm/ttm/ttm_pool.h>
 
@@ -292,7 +293,7 @@ static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
 int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs,
 		    struct device *dev, struct address_space *mapping,
 		    struct drm_vma_offset_manager *vma_manager,
-		    bool use_dma_alloc, bool use_dma32);
+		    unsigned int alloc_flags);
 void ttm_device_fini(struct ttm_device *bdev);
 void ttm_device_clear_dma_mappings(struct ttm_device *bdev);
 
-- 
2.48.0
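
Editor's illustration of the conversion pattern used by the driver call sites in the patch above: two former boolean arguments are folded into one flags word. This is only a sketch under stated assumptions. The helper name example_alloc_flags() is hypothetical and not part of TTM, and the bit positions are the BIT(0)/BIT(1) values the flags have at this point in the series (a later patch moves them).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel definitions at this point in the series. */
#define BIT(n)					(1u << (n))
#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(0)
#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(1)

/* Hypothetical helper: fold the two former bool arguments into one mask. */
static unsigned int example_alloc_flags(bool use_dma_alloc, bool use_dma32)
{
	unsigned int flags = 0;

	if (use_dma_alloc)
		flags |= TTM_ALLOCATION_POOL_USE_DMA_ALLOC;
	if (use_dma32)
		flags |= TTM_ALLOCATION_POOL_USE_DMA32;

	/* Sanity check: only the two known bits may ever be set here. */
	assert(!(flags & ~(TTM_ALLOCATION_POOL_USE_DMA_ALLOC |
			   TTM_ALLOCATION_POOL_USE_DMA32)));

	return flags;
}
```

A caller that used to pass "false, true" would pass example_alloc_flags(false, true), which equals TTM_ALLOCATION_POOL_USE_DMA32. The patch itself open-codes the equivalent ternaries at each call site rather than adding such a helper.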



* [PATCH v5 4/6] drm/ttm: Allow drivers to specify maximum beneficial TTM pool size
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2025-10-20 11:54 ` [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init Tvrtko Ursulin
@ 2025-10-20 11:54 ` Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 5/6] drm/amdgpu: Configure max beneficial TTM pool allocation order Tvrtko Ursulin
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Christian König,
	Thadeu Lima de Souza Cascardo

GPUs typically benefit from contiguous memory via reduced TLB pressure and
improved caching performance, where the maximum size of a contiguous block
which adds a performance benefit depends on the hardware design.

The TTM pool allocator by default tries (hard) to allocate blocks of up to
the system MAX_PAGE_ORDER. This varies by CPU platform and can also be
configured via Kconfig.

If that limit is set higher than the GPU can make use of, let's allow
individual drivers to tell TTM above which allocation order the pool
allocator can afford to make a little less effort.

We implement this by disabling direct reclaim for those allocations, which
reduces the allocation latency and lowers the demands on the page
allocator, in cases where expending this effort is not critical for the
GPU in question.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
v2:
 * Rebase for rename and move of flags to alloc_flags / TTM_ALLOCATION_.
---
 drivers/gpu/drm/ttm/ttm_pool.c          | 8 ++++++++
 drivers/gpu/drm/ttm/ttm_pool_internal.h | 5 +++++
 include/drm/ttm/ttm_allocation.h        | 5 +++--
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 4fc69447060c..97e9ce505cf6 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -136,6 +136,7 @@ static DECLARE_RWSEM(pool_shrink_rwsem);
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 					unsigned int order)
 {
+	const unsigned int beneficial_order = ttm_pool_beneficial_order(pool);
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
 	struct ttm_pool_dma *dma;
 	struct page *p;
@@ -149,6 +150,13 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 		gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
 			__GFP_THISNODE;
 
+	/*
+	 * Do not add latency to the allocation path for allocation orders
+	 * which the device told us do not bring additional performance gains.
+	 */
+	if (beneficial_order && order > beneficial_order)
+		gfp_flags &= ~__GFP_DIRECT_RECLAIM;
+
 	if (!ttm_pool_uses_dma_alloc(pool)) {
 		p = alloc_pages_node(pool->nid, gfp_flags, order);
 		if (p)
diff --git a/drivers/gpu/drm/ttm/ttm_pool_internal.h b/drivers/gpu/drm/ttm/ttm_pool_internal.h
index 96b7f21514fb..82c4b7e56a99 100644
--- a/drivers/gpu/drm/ttm/ttm_pool_internal.h
+++ b/drivers/gpu/drm/ttm/ttm_pool_internal.h
@@ -17,4 +17,9 @@ static inline bool ttm_pool_uses_dma32(struct ttm_pool *pool)
 	return pool->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32;
 }
 
+static inline unsigned int ttm_pool_beneficial_order(struct ttm_pool *pool)
+{
+	return pool->alloc_flags & 0xff;
+}
+
 #endif
diff --git a/include/drm/ttm/ttm_allocation.h b/include/drm/ttm/ttm_allocation.h
index 7869dc32bd91..8f8544760306 100644
--- a/include/drm/ttm/ttm_allocation.h
+++ b/include/drm/ttm/ttm_allocation.h
@@ -4,7 +4,8 @@
 #ifndef _TTM_ALLOCATION_H_
 #define _TTM_ALLOCATION_H_
 
-#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(0) /* Use coherent DMA allocations. */
-#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(1) /* Use GFP_DMA32 allocations. */
+#define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff) /* Max order which caller can benefit from */
+#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC 	BIT(8) /* Use coherent DMA allocations. */
+#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9) /* Use GFP_DMA32 allocations. */
 
 #endif
-- 
2.48.0
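
Editor's illustration of the mechanism described in the patch above: the low eight bits of alloc_flags carry the maximum beneficial order, and allocations above that order have direct reclaim masked out. This is a userspace sketch, not the kernel code. EXAMPLE_GFP_DIRECT_RECLAIM is a placeholder bit standing in for the kernel's __GFP_DIRECT_RECLAIM, and the example_* function names are hypothetical. The getter in this sketch returns unsigned int so that the full order value, not just a truth value, reaches the comparison.

```c
#include <assert.h>

/* Userspace stand-ins mirroring the definitions added by the patch. */
#define BIT(n)					(1u << (n))
#define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff)
#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(8)
#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9)

/* Placeholder for __GFP_DIRECT_RECLAIM; the real value is kernel-internal. */
#define EXAMPLE_GFP_DIRECT_RECLAIM		BIT(0)

/* Extract the beneficial order from the low byte of the flags word. */
static unsigned int example_beneficial_order(unsigned int alloc_flags)
{
	return alloc_flags & 0xff;
}

/*
 * Return the gfp mask to use for an allocation of the given order.
 * A beneficial order of zero means "no limit configured".
 */
static unsigned int example_adjust_gfp(unsigned int alloc_flags,
				       unsigned int gfp, unsigned int order)
{
	unsigned int beneficial_order = example_beneficial_order(alloc_flags);

	/* Page allocation orders are small; anything larger is a caller bug. */
	assert(order < 64);

	if (beneficial_order && order > beneficial_order)
		gfp &= ~EXAMPLE_GFP_DIRECT_RECLAIM;

	return gfp;
}
```

With a beneficial order of 9, an order-9 request keeps the direct reclaim bit while an order-10 request loses it. A pool that sets no beneficial order behaves as before for every order.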



* [PATCH v5 5/6] drm/amdgpu: Configure max beneficial TTM pool allocation order
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2025-10-20 11:54 ` [PATCH v5 4/6] drm/ttm: Allow drivers to specify maximum beneficial TTM pool size Tvrtko Ursulin
@ 2025-10-20 11:54 ` Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM Tvrtko Ursulin
  2025-10-27 10:21 ` [PATCH v5 0/6] Improving the worst case TTM large allocation latency Christian König
  6 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
	Thadeu Lima de Souza Cascardo

Let the TTM pool allocator know that we can afford for it to expend less
effort on satisfying contiguous allocations larger than 2MiB. The latter
is the maximum relevant PTE size and the driver and hardware are happy to
get larger blocks only opportunistically.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
v2:
 * Rebase for rename and move of flags to alloc_flags / TTM_ALLOCATION_.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 7b144ddea268..8e82163981f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1837,7 +1837,7 @@ static int amdgpu_ttm_pools_init(struct amdgpu_device *adev)
 	for (i = 0; i < adev->gmc.num_mem_partitions; i++) {
 		ttm_pool_init(&adev->mman.ttm_pools[i], adev->dev,
 			      adev->gmc.mem_partitions[i].numa.node,
-			      0);
+			      TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(get_order(SZ_2M)));
 	}
 	return 0;
 }
@@ -1933,7 +1933,8 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 			       (adev->need_swiotlb ?
 				TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
 			       (dma_addressing_limited(adev->dev) ?
-				TTM_ALLOCATION_POOL_USE_DMA32 : 0));
+				TTM_ALLOCATION_POOL_USE_DMA32 : 0) |
+			       TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(get_order(SZ_2M)));
 	if (r) {
 		dev_err(adev->dev,
 			"failed initializing buffer object driver(%d).\n", r);
-- 
2.48.0
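
Editor's illustration of the arithmetic behind get_order(SZ_2M) in the patch above. This is a userspace approximation of the kernel's get_order(), written under the assumption of 4 KiB pages (page shift 12); other page sizes give a different order. The names example_get_order() and EXAMPLE_PAGE_SHIFT are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. Kernels built with larger pages differ. */
#define EXAMPLE_PAGE_SHIFT	12

/*
 * Smallest order such that (page size << order) >= size, which is what
 * the kernel's get_order() computes for a non-zero size.
 */
static unsigned int example_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);

	while (((size_t)1 << (EXAMPLE_PAGE_SHIFT + order)) < size)
		order++;

	return order;
}
```

Under the 4 KiB assumption, 2 MiB is 512 pages, so the order is 9 and only requests of order 10 and above would skip direct reclaim on amdgpu.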



* [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2025-10-20 11:54 ` [PATCH v5 5/6] drm/amdgpu: Configure max beneficial TTM pool allocation order Tvrtko Ursulin
@ 2025-10-20 11:54 ` Tvrtko Ursulin
  2025-10-21 14:11   ` Thomas Hellström
  2025-10-27 10:21 ` [PATCH v5 0/6] Improving the worst case TTM large allocation latency Christian König
  6 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Thomas Hellström,
	Christian König, Matthew Brost

Some graphics APIs differentiate between out-of-graphics-memory and
out-of-host-memory (system memory). Add a device init flag to have -ENOSPC
propagated from the resource managers instead of being converted to
-ENOMEM, to aid driver stacks in determining what error code to return or
whether corrective action can be taken at the driver level.

Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
Thomas, feel free to take the ownership if you end up liking this version.
As you can see I lifted your commit text as is and the implementation is
the same on the high level.
---
 drivers/gpu/drm/ttm/ttm_bo.c     | 4 +++-
 drivers/gpu/drm/ttm/ttm_device.c | 1 +
 include/drm/ttm/ttm_allocation.h | 1 +
 include/drm/ttm/ttm_device.h     | 5 +++++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index fba2a68a556e..15b3cb199d45 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -31,6 +31,7 @@
 
 #define pr_fmt(fmt) "[TTM] " fmt
 
+#include <drm/ttm/ttm_allocation.h>
 #include <drm/ttm/ttm_bo.h>
 #include <drm/ttm/ttm_placement.h>
 #include <drm/ttm/ttm_tt.h>
@@ -877,7 +878,8 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
 
 	/* For backward compatibility with userspace */
 	if (ret == -ENOSPC)
-		return -ENOMEM;
+		return bo->bdev->alloc_flags & TTM_ALLOCATION_PROPAGATE_ENOSPC ?
+		       ret : -ENOMEM;
 
 	/*
 	 * We might need to add a TTM.
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 87c85ccb21ac..5c10e5fbf43b 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -227,6 +227,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
 		return -ENOMEM;
 	}
 
+	bdev->alloc_flags = alloc_flags;
 	bdev->funcs = funcs;
 
 	ttm_sys_man_init(bdev);
diff --git a/include/drm/ttm/ttm_allocation.h b/include/drm/ttm/ttm_allocation.h
index 8f8544760306..655d1e44aba7 100644
--- a/include/drm/ttm/ttm_allocation.h
+++ b/include/drm/ttm/ttm_allocation.h
@@ -7,5 +7,6 @@
 #define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff) /* Max order which caller can benefit from */
 #define TTM_ALLOCATION_POOL_USE_DMA_ALLOC 	BIT(8) /* Use coherent DMA allocations. */
 #define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9) /* Use GFP_DMA32 allocations. */
+#define TTM_ALLOCATION_PROPAGATE_ENOSPC		BIT(10) /* Do not convert ENOSPC from resource managers to ENOMEM. */
 
 #endif
diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 074b98572275..d016360e5ceb 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -220,6 +220,11 @@ struct ttm_device {
 	 */
 	struct list_head device_list;
 
+	/**
+	 * @alloc_flags: TTM_ALLOCATION_ flags.
+	 */
+	unsigned int alloc_flags;
+
 	/**
 	 * @funcs: Function table for the device.
 	 * Constant after bo device init
-- 
2.48.0
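
Editor's illustration of the error handling changed by the patch above: -ENOSPC is converted to -ENOMEM unless the device was initialised with TTM_ALLOCATION_PROPAGATE_ENOSPC. This is a userspace sketch of that one decision only. The function name example_filter_validate_error() is hypothetical and the rest of ttm_bo_validate() is not modelled.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-ins mirroring the definition added by the patch. */
#define BIT(n)					(1u << (n))
#define TTM_ALLOCATION_PROPAGATE_ENOSPC		BIT(10)

/*
 * Map the result of a validation attempt to the value handed back to the
 * caller. Without the flag, -ENOSPC becomes -ENOMEM for backward
 * compatibility; with the flag, it is passed through unchanged.
 */
static int example_filter_validate_error(unsigned int alloc_flags, int ret)
{
	/* Kernel-style convention: zero or a negative errno value. */
	assert(ret <= 0);

	if (ret == -ENOSPC && !(alloc_flags & TTM_ALLOCATION_PROPAGATE_ENOSPC))
		return -ENOMEM;

	return ret;
}
```

A driver stack that opts in can then tell a full resource manager (-ENOSPC) apart from exhausted system memory (-ENOMEM) and pick its API error code or corrective action accordingly.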



* Re: [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM
  2025-10-20 11:54 ` [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM Tvrtko Ursulin
@ 2025-10-21 14:11   ` Thomas Hellström
  2025-10-23 13:37     ` Tvrtko Ursulin
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Hellström @ 2025-10-21 14:11 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx, dri-devel
  Cc: kernel-dev, Christian König, Matthew Brost

On Mon, 2025-10-20 at 12:54 +0100, Tvrtko Ursulin wrote:
> Some graphics APIs differentiate between out-of-graphics-memory and
> out-of-host-memory (system memory). Add a device init flag to have -ENOSPC
> propagated from the resource managers instead of being converted to
> -ENOMEM, to aid driver stacks in determining what error code to return or
> whether corrective action can be taken at the driver level.
> 
> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> ---
> Thomas, feel free to take the ownership if you end up liking this version.
> As you can see I lifted your commit text as is and the implementation is
> the same on the high level.

Let's keep it like this. Thanks for doing this. I'll follow up with
xeKMD change once this gets backmerged.

FWIW:
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>


> ---
>  drivers/gpu/drm/ttm/ttm_bo.c     | 4 +++-
>  drivers/gpu/drm/ttm/ttm_device.c | 1 +
>  include/drm/ttm/ttm_allocation.h | 1 +
>  include/drm/ttm/ttm_device.h     | 5 +++++
>  4 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index fba2a68a556e..15b3cb199d45 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -31,6 +31,7 @@
>  
>  #define pr_fmt(fmt) "[TTM] " fmt
>  
> +#include <drm/ttm/ttm_allocation.h>
>  #include <drm/ttm/ttm_bo.h>
>  #include <drm/ttm/ttm_placement.h>
>  #include <drm/ttm/ttm_tt.h>
> @@ -877,7 +878,8 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
>  
>  	/* For backward compatibility with userspace */
>  	if (ret == -ENOSPC)
> -		return -ENOMEM;
> +		return bo->bdev->alloc_flags & TTM_ALLOCATION_PROPAGATE_ENOSPC ?
> +		       ret : -ENOMEM;
>  
>  	/*
>  	 * We might need to add a TTM.
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index 87c85ccb21ac..5c10e5fbf43b 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -227,6 +227,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>  		return -ENOMEM;
>  	}
>  
> +	bdev->alloc_flags = alloc_flags;
>  	bdev->funcs = funcs;
>  
>  	ttm_sys_man_init(bdev);
> diff --git a/include/drm/ttm/ttm_allocation.h b/include/drm/ttm/ttm_allocation.h
> index 8f8544760306..655d1e44aba7 100644
> --- a/include/drm/ttm/ttm_allocation.h
> +++ b/include/drm/ttm/ttm_allocation.h
> @@ -7,5 +7,6 @@
>  #define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff) /* Max order which caller can benefit from */
>  #define TTM_ALLOCATION_POOL_USE_DMA_ALLOC 	BIT(8) /* Use coherent DMA allocations. */
>  #define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9) /* Use GFP_DMA32 allocations. */
> +#define TTM_ALLOCATION_PROPAGATE_ENOSPC		BIT(10) /* Do not convert ENOSPC from resource managers to ENOMEM. */
>  
>  #endif
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 074b98572275..d016360e5ceb 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -220,6 +220,11 @@ struct ttm_device {
>  	 */
>  	struct list_head device_list;
>  
> +	/**
> +	 * @alloc_flags: TTM_ALLOCATION_ flags.
> +	 */
> +	unsigned int alloc_flags;
> +
>  	/**
>  	 * @funcs: Function table for the device.
>  	 * Constant after bo device init



* Re: [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init
  2025-10-20 11:54 ` [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init Tvrtko Ursulin
@ 2025-10-21 14:16   ` Thomas Hellström
  2025-10-22  3:56   ` Zack Rusin
  1 sibling, 0 replies; 13+ messages in thread
From: Thomas Hellström @ 2025-10-21 14:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx, dri-devel
  Cc: kernel-dev, Alex Deucher, Christian König, Danilo Krummrich,
	Dave Airlie, Gerd Hoffmann, Joonas Lahtinen, Lucas De Marchi,
	Lyude Paul, Maarten Lankhorst, Maxime Ripard, Rodrigo Vivi,
	Sui Jingfeng, Thomas Zimmermann, Zack Rusin

On Mon, 2025-10-20 at 12:54 +0100, Tvrtko Ursulin wrote:
> Multiple consecutive boolean function arguments are usually not very
> readable.
> 
> Replace the ones in ttm_device_init() with flags with the additional
> benefit of soon being able to pass in more data with just a one off
> code base churning cost.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Sui Jingfeng <suijingfeng@loongson.cn>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Zack Rusin <zack.rusin@broadcom.com>
> Acked-by: Christian König <christian.koenig@amd.com>

Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> # For the xe changes.
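
As a side note on the conversion pattern itself, a minimal standalone sketch,
assuming the flag values from the TTM_ALLOCATION_ defines elsewhere in the
series: each old boolean maps to one flag bit, which is the same ternary
expression the driver call sites in the patch use. BIT() is a local stand-in
for the kernel macro and ttm_alloc_flags_from_bools() is a hypothetical helper
name for illustration; the patch open-codes the expression instead.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's BIT() macro (assumption for userspace). */
#define BIT(n) (1U << (n))

/* Values as defined in the series' ttm_allocation.h. */
#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(8)
#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9)

/*
 * Hypothetical helper: translate the two former boolean arguments of
 * ttm_device_init() into the single alloc_flags value it now takes.
 */
static unsigned int ttm_alloc_flags_from_bools(bool use_dma_alloc,
					       bool use_dma32)
{
	return (use_dma_alloc ? TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
	       (use_dma32 ? TTM_ALLOCATION_POOL_USE_DMA32 : 0);
}
```

With this mapping the old "false, false" call sites become 0 and the old
"false, true" ones become TTM_ALLOCATION_POOL_USE_DMA32, which matches what the
diff below does.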

> ---
> v2:
>  * Rebase for rename and move of flags to alloc_flags / TTM_ALLOCATION_.
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  6 +++--
>  drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
>  drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
>  drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 ++-
>  drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 +++--
>  drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
>  drivers/gpu/drm/radeon/radeon_ttm.c           |  6 +++--
>  drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++++++-------
>  .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
>  drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 12 +++++-----
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++++++-----------
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 ++----
>  drivers/gpu/drm/ttm/ttm_device.c              |  9 +++-----
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 ++--
>  drivers/gpu/drm/xe/xe_device.c                |  2 +-
>  include/drm/ttm/ttm_device.h                  |  3 ++-
>  16 files changed, 50 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 8f6d331e1ea2..7b144ddea268 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1930,8 +1930,10 @@ int amdgpu_ttm_init(struct amdgpu_device
> *adev)
>  	r = ttm_device_init(&adev->mman.bdev, &amdgpu_bo_driver,
> adev->dev,
>  			       adev_to_drm(adev)->anon_inode-
> >i_mapping,
>  			       adev_to_drm(adev)-
> >vma_offset_manager,
> -			       adev->need_swiotlb,
> -			       dma_addressing_limited(adev->dev));
> +			       (adev->need_swiotlb ?
> +				TTM_ALLOCATION_POOL_USE_DMA_ALLOC :
> 0) |
> +			       (dma_addressing_limited(adev->dev) ?
> +				TTM_ALLOCATION_POOL_USE_DMA32 : 0));
>  	if (r) {
>  		dev_err(adev->dev,
>  			"failed initializing buffer object
> driver(%d).\n", r);
> diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c
> b/drivers/gpu/drm/drm_gem_vram_helper.c
> index 0bec6f66682b..dd3292e57d64 100644
> --- a/drivers/gpu/drm/drm_gem_vram_helper.c
> +++ b/drivers/gpu/drm/drm_gem_vram_helper.c
> @@ -859,7 +859,7 @@ static int drm_vram_mm_init(struct drm_vram_mm
> *vmm, struct drm_device *dev,
>  	ret = ttm_device_init(&vmm->bdev, &bo_driver, dev->dev,
>  				 dev->anon_inode->i_mapping,
>  				 dev->vma_offset_manager,
> -				 false, true);
> +				 TTM_ALLOCATION_POOL_USE_DMA32);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c
> b/drivers/gpu/drm/i915/intel_region_ttm.c
> index 04525d92bec5..47a69aad5c3f 100644
> --- a/drivers/gpu/drm/i915/intel_region_ttm.c
> +++ b/drivers/gpu/drm/i915/intel_region_ttm.c
> @@ -34,7 +34,7 @@ int intel_region_ttm_device_init(struct
> drm_i915_private *dev_priv)
>  
>  	return ttm_device_init(&dev_priv->bdev, i915_ttm_driver(),
>  			       drm->dev, drm->anon_inode->i_mapping,
> -			       drm->vma_offset_manager, false,
> false);
> +			       drm->vma_offset_manager, 0);
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/loongson/lsdc_ttm.c
> b/drivers/gpu/drm/loongson/lsdc_ttm.c
> index 2e42c6970c9f..dca0d33e2cf2 100644
> --- a/drivers/gpu/drm/loongson/lsdc_ttm.c
> +++ b/drivers/gpu/drm/loongson/lsdc_ttm.c
> @@ -544,7 +544,8 @@ int lsdc_ttm_init(struct lsdc_device *ldev)
>  
>  	ret = ttm_device_init(&ldev->bdev, &lsdc_bo_driver, ddev-
> >dev,
>  			      ddev->anon_inode->i_mapping,
> -			      ddev->vma_offset_manager, false,
> true);
> +			      ddev->vma_offset_manager,
> +			      TTM_ALLOCATION_POOL_USE_DMA32);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> index 7d2436e5d50d..47b20cf80388 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> @@ -302,8 +302,10 @@ nouveau_ttm_init(struct nouveau_drm *drm)
>  	ret = ttm_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
> drm->dev->dev,
>  				  dev->anon_inode->i_mapping,
>  				  dev->vma_offset_manager,
> -				  drm_need_swiotlb(drm-
> >client.mmu.dmabits),
> -				  drm->client.mmu.dmabits <= 32);
> +				  (drm_need_swiotlb(drm-
> >client.mmu.dmabits) ?
> +				   TTM_ALLOCATION_POOL_USE_DMA_ALLOC
> : 0 ) |
> +				  (drm->client.mmu.dmabits <= 32 ?
> +				   TTM_ALLOCATION_POOL_USE_DMA32 :
> 0));
>  	if (ret) {
>  		NV_ERROR(drm, "error initialising bo driver, %d\n",
> ret);
>  		return ret;
> diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c
> b/drivers/gpu/drm/qxl/qxl_ttm.c
> index 765a144cea14..85d9df48affa 100644
> --- a/drivers/gpu/drm/qxl/qxl_ttm.c
> +++ b/drivers/gpu/drm/qxl/qxl_ttm.c
> @@ -196,7 +196,7 @@ int qxl_ttm_init(struct qxl_device *qdev)
>  	r = ttm_device_init(&qdev->mman.bdev, &qxl_bo_driver, NULL,
>  			    qdev->ddev.anon_inode->i_mapping,
>  			    qdev->ddev.vma_offset_manager,
> -			    false, false);
> +			    0);
>  	if (r) {
>  		DRM_ERROR("failed initializing buffer object
> driver(%d).\n", r);
>  		return r;
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
> b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 616d25c8c2de..51dffe23c0fc 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -683,8 +683,10 @@ int radeon_ttm_init(struct radeon_device *rdev)
>  	r = ttm_device_init(&rdev->mman.bdev, &radeon_bo_driver,
> rdev->dev,
>  			       rdev_to_drm(rdev)->anon_inode-
> >i_mapping,
>  			       rdev_to_drm(rdev)-
> >vma_offset_manager,
> -			       rdev->need_swiotlb,
> -			       dma_addressing_limited(&rdev->pdev-
> >dev));
> +			       (rdev->need_swiotlb ?
> +				TTM_ALLOCATION_POOL_USE_DMA_ALLOC :
> 0 ) |
> +			       (dma_addressing_limited(&rdev->pdev-
> >dev) ?
> +				TTM_ALLOCATION_POOL_USE_DMA32 : 0));
>  	if (r) {
>  		DRM_ERROR("failed initializing buffer object
> driver(%d).\n", r);
>  		return r;
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> index 5426b435f702..d468f8322072 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> @@ -251,7 +251,7 @@ static void ttm_bo_unreserve_basic(struct kunit
> *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -290,7 +290,7 @@ static void ttm_bo_unreserve_pinned(struct kunit
> *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -342,7 +342,7 @@ static void ttm_bo_unreserve_bulk(struct kunit
> *test)
>  	resv = kunit_kzalloc(test, sizeof(*resv), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, resv);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -394,7 +394,7 @@ static void ttm_bo_fini_basic(struct kunit *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -437,7 +437,7 @@ static void ttm_bo_fini_shared_resv(struct kunit
> *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -477,7 +477,7 @@ static void ttm_bo_pin_basic(struct kunit *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -512,7 +512,7 @@ static void ttm_bo_pin_unpin_resource(struct
> kunit *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> @@ -563,7 +563,7 @@ static void ttm_bo_multiple_pin_one_unpin(struct
> kunit *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  	priv->ttm_dev = ttm_dev;
>  
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> index 3a1eef83190c..17a570af296c 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> @@ -995,7 +995,7 @@ static void
> ttm_bo_validate_busy_domain_evict(struct kunit *test)
>  	 */
>  	ttm_device_fini(priv->ttm_dev);
>  
> -	err = ttm_device_kunit_init_bad_evict(test->priv, priv-
> >ttm_dev, false, false);
> +	err = ttm_device_kunit_init_bad_evict(test->priv, priv-
> >ttm_dev);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  
>  	ttm_mock_manager_init(priv->ttm_dev, mem_type,
> MANAGER_SIZE);
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> index 98648d5f20e7..2d55ad34fe48 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> @@ -25,7 +25,7 @@ static void ttm_device_init_basic(struct kunit
> *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  
>  	KUNIT_EXPECT_PTR_EQ(test, ttm_dev->funcs, &ttm_dev_funcs);
> @@ -55,7 +55,7 @@ static void ttm_device_init_multiple(struct kunit
> *test)
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_devs);
>  
>  	for (i = 0; i < num_dev; i++) {
> -		err = ttm_device_kunit_init(priv, &ttm_devs[i],
> false, false);
> +		err = ttm_device_kunit_init(priv, &ttm_devs[i], 0);
>  		KUNIT_ASSERT_EQ(test, err, 0);
>  
>  		KUNIT_EXPECT_PTR_EQ(test, ttm_devs[i].dev_mapping,
> @@ -81,7 +81,7 @@ static void ttm_device_fini_basic(struct kunit
> *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  
>  	man = ttm_manager_type(ttm_dev, TTM_PL_SYSTEM);
> @@ -109,7 +109,7 @@ static void ttm_device_init_no_vma_man(struct
> kunit *test)
>  	vma_man = drm->vma_offset_manager;
>  	drm->vma_offset_manager = NULL;
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(priv, ttm_dev, 0);
>  	KUNIT_EXPECT_EQ(test, err, -EINVAL);
>  
>  	/* Bring the manager back for a graceful cleanup */
> @@ -158,9 +158,7 @@ static void ttm_device_init_pools(struct kunit
> *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(priv, ttm_dev,
> -				    params->alloc_flags &
> TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
> -				    params->alloc_flags &
> TTM_ALLOCATION_POOL_USE_DMA32);
> +	err = ttm_device_kunit_init(priv, ttm_dev, params-
> >alloc_flags);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  
>  	pool = &ttm_dev->pool;
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> index 7aaf0d1395ff..7b533e4e1e04 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> @@ -117,8 +117,7 @@ static void bad_evict_flags(struct
> ttm_buffer_object *bo,
>  
>  static int ttm_device_kunit_init_with_funcs(struct ttm_test_devices
> *priv,
>  					    struct ttm_device *ttm,
> -					    bool use_dma_alloc,
> -					    bool use_dma32,
> +					    unsigned int
> alloc_flags,
>  					    struct ttm_device_funcs
> *funcs)
>  {
>  	struct drm_device *drm = priv->drm;
> @@ -127,7 +126,7 @@ static int
> ttm_device_kunit_init_with_funcs(struct ttm_test_devices *priv,
>  	err = ttm_device_init(ttm, funcs, drm->dev,
>  			      drm->anon_inode->i_mapping,
>  			      drm->vma_offset_manager,
> -			      use_dma_alloc, use_dma32);
> +			      alloc_flags);
>  
>  	return err;
>  }
> @@ -143,11 +142,10 @@ EXPORT_SYMBOL_GPL(ttm_dev_funcs);
>  
>  int ttm_device_kunit_init(struct ttm_test_devices *priv,
>  			  struct ttm_device *ttm,
> -			  bool use_dma_alloc,
> -			  bool use_dma32)
> +			  unsigned int alloc_flags)
>  {
> -	return ttm_device_kunit_init_with_funcs(priv, ttm,
> use_dma_alloc,
> -						use_dma32,
> &ttm_dev_funcs);
> +	return ttm_device_kunit_init_with_funcs(priv, ttm,
> alloc_flags,
> +						&ttm_dev_funcs);
>  }
>  EXPORT_SYMBOL_GPL(ttm_device_kunit_init);
>  
> @@ -161,12 +159,10 @@ struct ttm_device_funcs ttm_dev_funcs_bad_evict
> = {
>  EXPORT_SYMBOL_GPL(ttm_dev_funcs_bad_evict);
>  
>  int ttm_device_kunit_init_bad_evict(struct ttm_test_devices *priv,
> -				    struct ttm_device *ttm,
> -				    bool use_dma_alloc,
> -				    bool use_dma32)
> +				    struct ttm_device *ttm)
>  {
> -	return ttm_device_kunit_init_with_funcs(priv, ttm,
> use_dma_alloc,
> -						use_dma32,
> &ttm_dev_funcs_bad_evict);
> +	return ttm_device_kunit_init_with_funcs(priv, ttm, 0,
> +						&ttm_dev_funcs_bad_evict);
>  }
>  EXPORT_SYMBOL_GPL(ttm_device_kunit_init_bad_evict);
>  
> @@ -252,7 +248,7 @@ struct ttm_test_devices
> *ttm_test_devices_all(struct kunit *test)
>  	ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>  	KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>  
> -	err = ttm_device_kunit_init(devs, ttm_dev, false, false);
> +	err = ttm_device_kunit_init(devs, ttm_dev, 0);
>  	KUNIT_ASSERT_EQ(test, err, 0);
>  
>  	devs->ttm_dev = ttm_dev;
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> index c7da23232ffa..f8402b979d05 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> +++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> @@ -28,12 +28,9 @@ struct ttm_test_devices {
>  /* Building blocks for test-specific init functions */
>  int ttm_device_kunit_init(struct ttm_test_devices *priv,
>  			  struct ttm_device *ttm,
> -			  bool use_dma_alloc,
> -			  bool use_dma32);
> +			  unsigned int alloc_flags);
>  int ttm_device_kunit_init_bad_evict(struct ttm_test_devices *priv,
> -				    struct ttm_device *ttm,
> -				    bool use_dma_alloc,
> -				    bool use_dma32);
> +				    struct ttm_device *ttm);
>  struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
>  					    struct ttm_test_devices
> *devs,
>  					    size_t size,
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c
> b/drivers/gpu/drm/ttm/ttm_device.c
> index a97b1444536c..87c85ccb21ac 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -199,8 +199,7 @@ EXPORT_SYMBOL(ttm_device_swapout);
>   * @dev: The core kernel device pointer for DMA mappings and
> allocations.
>   * @mapping: The address space to use for this bo.
>   * @vma_manager: A pointer to a vma manager.
> - * @use_dma_alloc: If coherent DMA allocation API should be used.
> - * @use_dma32: If we should use GFP_DMA32 for device memory
> allocations.
> + * @alloc_flags: TTM_ALLOCATION_ flags.
>   *
>   * Initializes a struct ttm_device:
>   * Returns:
> @@ -209,7 +208,7 @@ EXPORT_SYMBOL(ttm_device_swapout);
>  int ttm_device_init(struct ttm_device *bdev, const struct
> ttm_device_funcs *funcs,
>  		    struct device *dev, struct address_space
> *mapping,
>  		    struct drm_vma_offset_manager *vma_manager,
> -		    bool use_dma_alloc, bool use_dma32)
> +		    unsigned int alloc_flags)
>  {
>  	struct ttm_global *glob = &ttm_glob;
>  	int ret, nid;
> @@ -237,9 +236,7 @@ int ttm_device_init(struct ttm_device *bdev,
> const struct ttm_device_funcs *func
>  	else
>  		nid = NUMA_NO_NODE;
>  
> -	ttm_pool_init(&bdev->pool, dev, nid,
> -		      (use_dma_alloc ?
> TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
> -		      (use_dma32 ? TTM_ALLOCATION_POOL_USE_DMA32 :
> 0));
> +	ttm_pool_init(&bdev->pool, dev, nid, alloc_flags);
>  
>  	bdev->vma_manager = vma_manager;
>  	spin_lock_init(&bdev->lru_lock);
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index 8ff958d119be..599052d07ae8 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -1023,8 +1023,8 @@ static int vmw_driver_load(struct vmw_private
> *dev_priv, u32 pci_id)
>  			      dev_priv->drm.dev,
>  			      dev_priv->drm.anon_inode->i_mapping,
>  			      dev_priv->drm.vma_offset_manager,
> -			      dev_priv->map_mode ==
> vmw_dma_alloc_coherent,
> -			      false);
> +			      (dev_priv->map_mode ==
> vmw_dma_alloc_coherent) ?
> +			      TTM_ALLOCATION_POOL_USE_DMA_ALLOC :
> 0);
>  	if (unlikely(ret != 0)) {
>  		drm_err(&dev_priv->drm,
>  			"Failed initializing TTM buffer object
> driver.\n");
> diff --git a/drivers/gpu/drm/xe/xe_device.c
> b/drivers/gpu/drm/xe/xe_device.c
> index 5f6a412b571c..58e7996160a0 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -437,7 +437,7 @@ struct xe_device *xe_device_create(struct pci_dev
> *pdev,
>  
>  	err = ttm_device_init(&xe->ttm, &xe_ttm_funcs, xe->drm.dev,
>  			      xe->drm.anon_inode->i_mapping,
> -			      xe->drm.vma_offset_manager, false,
> false);
> +			      xe->drm.vma_offset_manager, 0);
>  	if (WARN_ON(err))
>  		goto err;
>  
> diff --git a/include/drm/ttm/ttm_device.h
> b/include/drm/ttm/ttm_device.h
> index 592b5f802859..074b98572275 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -27,6 +27,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/workqueue.h>
> +#include <drm/ttm/ttm_allocation.h>
>  #include <drm/ttm/ttm_resource.h>
>  #include <drm/ttm/ttm_pool.h>
>  
> @@ -292,7 +293,7 @@ static inline void ttm_set_driver_manager(struct
> ttm_device *bdev, int type,
>  int ttm_device_init(struct ttm_device *bdev, const struct
> ttm_device_funcs *funcs,
>  		    struct device *dev, struct address_space
> *mapping,
>  		    struct drm_vma_offset_manager *vma_manager,
> -		    bool use_dma_alloc, bool use_dma32);
> +		    unsigned int alloc_flags);
>  void ttm_device_fini(struct ttm_device *bdev);
>  void ttm_device_clear_dma_mappings(struct ttm_device *bdev);
>  


* Re: [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init
  2025-10-20 11:54 ` [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init Tvrtko Ursulin
  2025-10-21 14:16   ` Thomas Hellström
@ 2025-10-22  3:56   ` Zack Rusin
  1 sibling, 0 replies; 13+ messages in thread
From: Zack Rusin @ 2025-10-22  3:56 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: amd-gfx, dri-devel, kernel-dev, Alex Deucher,
	Christian König, Danilo Krummrich, Dave Airlie,
	Gerd Hoffmann, Joonas Lahtinen, Lucas De Marchi, Lyude Paul,
	Maarten Lankhorst, Maxime Ripard, Rodrigo Vivi, Sui Jingfeng,
	Thomas Hellström, Thomas Zimmermann

On Mon, Oct 20, 2025 at 7:54 AM Tvrtko Ursulin
<tvrtko.ursulin@igalia.com> wrote:
>
> Multiple consecutive boolean function arguments are usually not very
> readable.
>
> Replace the ones in ttm_device_init() with flags with the additional
> benefit of soon being able to pass in more data with just a one off
> code base churning cost.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Sui Jingfeng <suijingfeng@loongson.cn>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Zack Rusin <zack.rusin@broadcom.com>
> Acked-by: Christian König <christian.koenig@amd.com>
> ---
> v2:
>  * Rebase for rename and move of flags to alloc_flags / TTM_ALLOCATION_.
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  6 +++--
>  drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
>  drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
>  drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 ++-
>  drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 +++--
>  drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
>  drivers/gpu/drm/radeon/radeon_ttm.c           |  6 +++--
>  drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++++++-------
>  .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
>  drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 12 +++++-----
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++++++-----------
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 ++----
>  drivers/gpu/drm/ttm/ttm_device.c              |  9 +++-----
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 ++--
>  drivers/gpu/drm/xe/xe_device.c                |  2 +-
>  include/drm/ttm/ttm_device.h                  |  3 ++-
>  16 files changed, 50 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 8f6d331e1ea2..7b144ddea268 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1930,8 +1930,10 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>         r = ttm_device_init(&adev->mman.bdev, &amdgpu_bo_driver, adev->dev,
>                                adev_to_drm(adev)->anon_inode->i_mapping,
>                                adev_to_drm(adev)->vma_offset_manager,
> -                              adev->need_swiotlb,
> -                              dma_addressing_limited(adev->dev));
> +                              (adev->need_swiotlb ?
> +                               TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
> +                              (dma_addressing_limited(adev->dev) ?
> +                               TTM_ALLOCATION_POOL_USE_DMA32 : 0));
>         if (r) {
>                 dev_err(adev->dev,
>                         "failed initializing buffer object driver(%d).\n", r);
> diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
> index 0bec6f66682b..dd3292e57d64 100644
> --- a/drivers/gpu/drm/drm_gem_vram_helper.c
> +++ b/drivers/gpu/drm/drm_gem_vram_helper.c
> @@ -859,7 +859,7 @@ static int drm_vram_mm_init(struct drm_vram_mm *vmm, struct drm_device *dev,
>         ret = ttm_device_init(&vmm->bdev, &bo_driver, dev->dev,
>                                  dev->anon_inode->i_mapping,
>                                  dev->vma_offset_manager,
> -                                false, true);
> +                                TTM_ALLOCATION_POOL_USE_DMA32);
>         if (ret)
>                 return ret;
>
> diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c
> index 04525d92bec5..47a69aad5c3f 100644
> --- a/drivers/gpu/drm/i915/intel_region_ttm.c
> +++ b/drivers/gpu/drm/i915/intel_region_ttm.c
> @@ -34,7 +34,7 @@ int intel_region_ttm_device_init(struct drm_i915_private *dev_priv)
>
>         return ttm_device_init(&dev_priv->bdev, i915_ttm_driver(),
>                                drm->dev, drm->anon_inode->i_mapping,
> -                              drm->vma_offset_manager, false, false);
> +                              drm->vma_offset_manager, 0);
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/loongson/lsdc_ttm.c b/drivers/gpu/drm/loongson/lsdc_ttm.c
> index 2e42c6970c9f..dca0d33e2cf2 100644
> --- a/drivers/gpu/drm/loongson/lsdc_ttm.c
> +++ b/drivers/gpu/drm/loongson/lsdc_ttm.c
> @@ -544,7 +544,8 @@ int lsdc_ttm_init(struct lsdc_device *ldev)
>
>         ret = ttm_device_init(&ldev->bdev, &lsdc_bo_driver, ddev->dev,
>                               ddev->anon_inode->i_mapping,
> -                             ddev->vma_offset_manager, false, true);
> +                             ddev->vma_offset_manager,
> +                             TTM_ALLOCATION_POOL_USE_DMA32);
>         if (ret)
>                 return ret;
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> index 7d2436e5d50d..47b20cf80388 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> @@ -302,8 +302,10 @@ nouveau_ttm_init(struct nouveau_drm *drm)
>         ret = ttm_device_init(&drm->ttm.bdev, &nouveau_bo_driver, drm->dev->dev,
>                                   dev->anon_inode->i_mapping,
>                                   dev->vma_offset_manager,
> -                                 drm_need_swiotlb(drm->client.mmu.dmabits),
> -                                 drm->client.mmu.dmabits <= 32);
> +                                 (drm_need_swiotlb(drm->client.mmu.dmabits) ?
> +                                  TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0 ) |
> +                                 (drm->client.mmu.dmabits <= 32 ?
> +                                  TTM_ALLOCATION_POOL_USE_DMA32 : 0));
>         if (ret) {
>                 NV_ERROR(drm, "error initialising bo driver, %d\n", ret);
>                 return ret;
> diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
> index 765a144cea14..85d9df48affa 100644
> --- a/drivers/gpu/drm/qxl/qxl_ttm.c
> +++ b/drivers/gpu/drm/qxl/qxl_ttm.c
> @@ -196,7 +196,7 @@ int qxl_ttm_init(struct qxl_device *qdev)
>         r = ttm_device_init(&qdev->mman.bdev, &qxl_bo_driver, NULL,
>                             qdev->ddev.anon_inode->i_mapping,
>                             qdev->ddev.vma_offset_manager,
> -                           false, false);
> +                           0);
>         if (r) {
>                 DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
>                 return r;
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 616d25c8c2de..51dffe23c0fc 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -683,8 +683,10 @@ int radeon_ttm_init(struct radeon_device *rdev)
>         r = ttm_device_init(&rdev->mman.bdev, &radeon_bo_driver, rdev->dev,
>                                rdev_to_drm(rdev)->anon_inode->i_mapping,
>                                rdev_to_drm(rdev)->vma_offset_manager,
> -                              rdev->need_swiotlb,
> -                              dma_addressing_limited(&rdev->pdev->dev));
> +                              (rdev->need_swiotlb ?
> +                               TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0 ) |
> +                              (dma_addressing_limited(&rdev->pdev->dev) ?
> +                               TTM_ALLOCATION_POOL_USE_DMA32 : 0));
>         if (r) {
>                 DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
>                 return r;
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> index 5426b435f702..d468f8322072 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
> @@ -251,7 +251,7 @@ static void ttm_bo_unreserve_basic(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -290,7 +290,7 @@ static void ttm_bo_unreserve_pinned(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -342,7 +342,7 @@ static void ttm_bo_unreserve_bulk(struct kunit *test)
>         resv = kunit_kzalloc(test, sizeof(*resv), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, resv);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -394,7 +394,7 @@ static void ttm_bo_fini_basic(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -437,7 +437,7 @@ static void ttm_bo_fini_shared_resv(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -477,7 +477,7 @@ static void ttm_bo_pin_basic(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -512,7 +512,7 @@ static void ttm_bo_pin_unpin_resource(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> @@ -563,7 +563,7 @@ static void ttm_bo_multiple_pin_one_unpin(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>         priv->ttm_dev = ttm_dev;
>
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> index 3a1eef83190c..17a570af296c 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c
> @@ -995,7 +995,7 @@ static void ttm_bo_validate_busy_domain_evict(struct kunit *test)
>          */
>         ttm_device_fini(priv->ttm_dev);
>
> -       err = ttm_device_kunit_init_bad_evict(test->priv, priv->ttm_dev, false, false);
> +       err = ttm_device_kunit_init_bad_evict(test->priv, priv->ttm_dev);
>         KUNIT_ASSERT_EQ(test, err, 0);
>
>         ttm_mock_manager_init(priv->ttm_dev, mem_type, MANAGER_SIZE);
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_device_test.c b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> index 98648d5f20e7..2d55ad34fe48 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
> @@ -25,7 +25,7 @@ static void ttm_device_init_basic(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>
>         KUNIT_EXPECT_PTR_EQ(test, ttm_dev->funcs, &ttm_dev_funcs);
> @@ -55,7 +55,7 @@ static void ttm_device_init_multiple(struct kunit *test)
>         KUNIT_ASSERT_NOT_NULL(test, ttm_devs);
>
>         for (i = 0; i < num_dev; i++) {
> -               err = ttm_device_kunit_init(priv, &ttm_devs[i], false, false);
> +               err = ttm_device_kunit_init(priv, &ttm_devs[i], 0);
>                 KUNIT_ASSERT_EQ(test, err, 0);
>
>                 KUNIT_EXPECT_PTR_EQ(test, ttm_devs[i].dev_mapping,
> @@ -81,7 +81,7 @@ static void ttm_device_fini_basic(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>
>         man = ttm_manager_type(ttm_dev, TTM_PL_SYSTEM);
> @@ -109,7 +109,7 @@ static void ttm_device_init_no_vma_man(struct kunit *test)
>         vma_man = drm->vma_offset_manager;
>         drm->vma_offset_manager = NULL;
>
> -       err = ttm_device_kunit_init(priv, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(priv, ttm_dev, 0);
>         KUNIT_EXPECT_EQ(test, err, -EINVAL);
>
>         /* Bring the manager back for a graceful cleanup */
> @@ -158,9 +158,7 @@ static void ttm_device_init_pools(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(priv, ttm_dev,
> -                                   params->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA_ALLOC,
> -                                   params->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32);
> +       err = ttm_device_kunit_init(priv, ttm_dev, params->alloc_flags);
>         KUNIT_ASSERT_EQ(test, err, 0);
>
>         pool = &ttm_dev->pool;
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> index 7aaf0d1395ff..7b533e4e1e04 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> +++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
> @@ -117,8 +117,7 @@ static void bad_evict_flags(struct ttm_buffer_object *bo,
>
>  static int ttm_device_kunit_init_with_funcs(struct ttm_test_devices *priv,
>                                             struct ttm_device *ttm,
> -                                           bool use_dma_alloc,
> -                                           bool use_dma32,
> +                                           unsigned int alloc_flags,
>                                             struct ttm_device_funcs *funcs)
>  {
>         struct drm_device *drm = priv->drm;
> @@ -127,7 +126,7 @@ static int ttm_device_kunit_init_with_funcs(struct ttm_test_devices *priv,
>         err = ttm_device_init(ttm, funcs, drm->dev,
>                               drm->anon_inode->i_mapping,
>                               drm->vma_offset_manager,
> -                             use_dma_alloc, use_dma32);
> +                             alloc_flags);
>
>         return err;
>  }
> @@ -143,11 +142,10 @@ EXPORT_SYMBOL_GPL(ttm_dev_funcs);
>
>  int ttm_device_kunit_init(struct ttm_test_devices *priv,
>                           struct ttm_device *ttm,
> -                         bool use_dma_alloc,
> -                         bool use_dma32)
> +                         unsigned int alloc_flags)
>  {
> -       return ttm_device_kunit_init_with_funcs(priv, ttm, use_dma_alloc,
> -                                               use_dma32, &ttm_dev_funcs);
> +       return ttm_device_kunit_init_with_funcs(priv, ttm, alloc_flags,
> +                                               &ttm_dev_funcs);
>  }
>  EXPORT_SYMBOL_GPL(ttm_device_kunit_init);
>
> @@ -161,12 +159,10 @@ struct ttm_device_funcs ttm_dev_funcs_bad_evict = {
>  EXPORT_SYMBOL_GPL(ttm_dev_funcs_bad_evict);
>
>  int ttm_device_kunit_init_bad_evict(struct ttm_test_devices *priv,
> -                                   struct ttm_device *ttm,
> -                                   bool use_dma_alloc,
> -                                   bool use_dma32)
> +                                   struct ttm_device *ttm)
>  {
> -       return ttm_device_kunit_init_with_funcs(priv, ttm, use_dma_alloc,
> -                                               use_dma32, &ttm_dev_funcs_bad_evict);
> +       return ttm_device_kunit_init_with_funcs(priv, ttm, 0,
> +                                               &ttm_dev_funcs_bad_evict);
>  }
>  EXPORT_SYMBOL_GPL(ttm_device_kunit_init_bad_evict);
>
> @@ -252,7 +248,7 @@ struct ttm_test_devices *ttm_test_devices_all(struct kunit *test)
>         ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
>         KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
>
> -       err = ttm_device_kunit_init(devs, ttm_dev, false, false);
> +       err = ttm_device_kunit_init(devs, ttm_dev, 0);
>         KUNIT_ASSERT_EQ(test, err, 0);
>
>         devs->ttm_dev = ttm_dev;
> diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> index c7da23232ffa..f8402b979d05 100644
> --- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> +++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
> @@ -28,12 +28,9 @@ struct ttm_test_devices {
>  /* Building blocks for test-specific init functions */
>  int ttm_device_kunit_init(struct ttm_test_devices *priv,
>                           struct ttm_device *ttm,
> -                         bool use_dma_alloc,
> -                         bool use_dma32);
> +                         unsigned int alloc_flags);
>  int ttm_device_kunit_init_bad_evict(struct ttm_test_devices *priv,
> -                                   struct ttm_device *ttm,
> -                                   bool use_dma_alloc,
> -                                   bool use_dma32);
> +                                   struct ttm_device *ttm);
>  struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
>                                             struct ttm_test_devices *devs,
>                                             size_t size,
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index a97b1444536c..87c85ccb21ac 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -199,8 +199,7 @@ EXPORT_SYMBOL(ttm_device_swapout);
>   * @dev: The core kernel device pointer for DMA mappings and allocations.
>   * @mapping: The address space to use for this bo.
>   * @vma_manager: A pointer to a vma manager.
> - * @use_dma_alloc: If coherent DMA allocation API should be used.
> - * @use_dma32: If we should use GFP_DMA32 for device memory allocations.
> + * @alloc_flags: TTM_ALLOCATION_ flags.
>   *
>   * Initializes a struct ttm_device:
>   * Returns:
> @@ -209,7 +208,7 @@ EXPORT_SYMBOL(ttm_device_swapout);
>  int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs,
>                     struct device *dev, struct address_space *mapping,
>                     struct drm_vma_offset_manager *vma_manager,
> -                   bool use_dma_alloc, bool use_dma32)
> +                   unsigned int alloc_flags)
>  {
>         struct ttm_global *glob = &ttm_glob;
>         int ret, nid;
> @@ -237,9 +236,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>         else
>                 nid = NUMA_NO_NODE;
>
> -       ttm_pool_init(&bdev->pool, dev, nid,
> -                     (use_dma_alloc ? TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0) |
> -                     (use_dma32 ? TTM_ALLOCATION_POOL_USE_DMA32 : 0));
> +       ttm_pool_init(&bdev->pool, dev, nid, alloc_flags);
>
>         bdev->vma_manager = vma_manager;
>         spin_lock_init(&bdev->lru_lock);
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index 8ff958d119be..599052d07ae8 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -1023,8 +1023,8 @@ static int vmw_driver_load(struct vmw_private *dev_priv, u32 pci_id)
>                               dev_priv->drm.dev,
>                               dev_priv->drm.anon_inode->i_mapping,
>                               dev_priv->drm.vma_offset_manager,
> -                             dev_priv->map_mode == vmw_dma_alloc_coherent,
> -                             false);
> +                             (dev_priv->map_mode == vmw_dma_alloc_coherent) ?
> +                             TTM_ALLOCATION_POOL_USE_DMA_ALLOC : 0);
>         if (unlikely(ret != 0)) {
>                 drm_err(&dev_priv->drm,
>                         "Failed initializing TTM buffer object driver.\n");
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 5f6a412b571c..58e7996160a0 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -437,7 +437,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
>
>         err = ttm_device_init(&xe->ttm, &xe_ttm_funcs, xe->drm.dev,
>                               xe->drm.anon_inode->i_mapping,
> -                             xe->drm.vma_offset_manager, false, false);
> +                             xe->drm.vma_offset_manager, 0);
>         if (WARN_ON(err))
>                 goto err;
>
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 592b5f802859..074b98572275 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -27,6 +27,7 @@
>
>  #include <linux/types.h>
>  #include <linux/workqueue.h>
> +#include <drm/ttm/ttm_allocation.h>
>  #include <drm/ttm/ttm_resource.h>
>  #include <drm/ttm/ttm_pool.h>
>
> @@ -292,7 +293,7 @@ static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
>  int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs,
>                     struct device *dev, struct address_space *mapping,
>                     struct drm_vma_offset_manager *vma_manager,
> -                   bool use_dma_alloc, bool use_dma32);
> +                   unsigned int alloc_flags);
>  void ttm_device_fini(struct ttm_device *bdev);
>  void ttm_device_clear_dma_mappings(struct ttm_device *bdev);
>

Looks good. I'm always happy to see booleans in function parameters
replaced by enums/defines.
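
To illustrate the readability point, here is a minimal user-space sketch of
the two calling styles. The EX_ names and helper functions are hypothetical
stand-ins, not TTM code; only the bit positions (8 and 9) mirror the
TTM_ALLOCATION_POOL_USE_DMA_ALLOC and TTM_ALLOCATION_POOL_USE_DMA32 defines
from this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; bit positions follow the defines in this series. */
#define EX_USE_DMA_ALLOC (1u << 8)
#define EX_USE_DMA32     (1u << 9)

/*
 * Old style: the caller passes two anonymous booleans, so a call site reads
 * init(..., false, true) and the meaning of each argument is not visible.
 * This helper shows the conversion the old ttm_device_init() did internally.
 */
static unsigned int ex_flags_from_bools(bool use_dma_alloc, bool use_dma32)
{
	return (use_dma_alloc ? EX_USE_DMA_ALLOC : 0) |
	       (use_dma32 ? EX_USE_DMA32 : 0);
}

/* New style: the caller passes named flags and the callee tests the bits. */
static bool ex_uses_dma_alloc(unsigned int alloc_flags)
{
	return alloc_flags & EX_USE_DMA_ALLOC;
}

static bool ex_uses_dma32(unsigned int alloc_flags)
{
	return alloc_flags & EX_USE_DMA32;
}
```

With the flag form a call site names what it asks for, for example
ex_uses_dma32(EX_USE_DMA32), and passing 0 plainly means "no special
allocation behaviour".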

Acked-by: Zack Rusin <zack.rusin@broadcom.com>

z



* Re: [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM
  2025-10-21 14:11   ` Thomas Hellström
@ 2025-10-23 13:37     ` Tvrtko Ursulin
  0 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-23 13:37 UTC (permalink / raw)
  To: Thomas Hellström, amd-gfx, dri-devel
  Cc: kernel-dev, Christian König, Matthew Brost


On 21/10/2025 15:11, Thomas Hellström wrote:
> On Mon, 2025-10-20 at 12:54 +0100, Tvrtko Ursulin wrote:
>> Some graphics APIs differentiate between out-of-graphics-memory and
>> out-of-host-memory (system memory). Add a device init flag to have
>> -ENOSPC propagated from the resource managers instead of being
>> converted to -ENOMEM, to aid driver stacks in determining what error
>> code to return or whether corrective action can be taken at the
>> driver level.
>>
>> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> ---
>> Thomas, feel free to take the ownership if you end up liking this
>> version. As you can see I lifted your commit text as is and the
>> implementation is the same on the high level.
> 
> Let's keep it like this. Thanks for doing this. I'll follow up with
> xeKMD change once this gets backmerged.
> 
> FWIW:
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Thanks!

Before being able to merge I will however need:

  * Someone to r-b patch 2/6.
  * Christian to check if I can upgrade his r-b to v2 on patches 2, 4 and 6.
  * Maybe not strictly required since all go via drm-misc, but 3/6 could 
use acks from more driver owners.

Regards,

Tvrtko

>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c     | 4 +++-
>>   drivers/gpu/drm/ttm/ttm_device.c | 1 +
>>   include/drm/ttm/ttm_allocation.h | 1 +
>>   include/drm/ttm/ttm_device.h     | 5 +++++
>>   4 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index fba2a68a556e..15b3cb199d45 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -31,6 +31,7 @@
>>   
>>   #define pr_fmt(fmt) "[TTM] " fmt
>>   
>> +#include <drm/ttm/ttm_allocation.h>
>>   #include <drm/ttm/ttm_bo.h>
>>   #include <drm/ttm/ttm_placement.h>
>>   #include <drm/ttm/ttm_tt.h>
>> @@ -877,7 +878,8 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
>>   
>>   	/* For backward compatibility with userspace */
>>   	if (ret == -ENOSPC)
>> -		return -ENOMEM;
>> +		return bo->bdev->alloc_flags & TTM_ALLOCATION_PROPAGATE_ENOSPC ?
>> +		       ret : -ENOMEM;
>>   
>>   	/*
>>   	 * We might need to add a TTM.
>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
>> index 87c85ccb21ac..5c10e5fbf43b 100644
>> --- a/drivers/gpu/drm/ttm/ttm_device.c
>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
>> @@ -227,6 +227,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>>   		return -ENOMEM;
>>   	}
>>   
>> +	bdev->alloc_flags = alloc_flags;
>>   	bdev->funcs = funcs;
>>   
>>   	ttm_sys_man_init(bdev);
>> diff --git a/include/drm/ttm/ttm_allocation.h b/include/drm/ttm/ttm_allocation.h
>> index 8f8544760306..655d1e44aba7 100644
>> --- a/include/drm/ttm/ttm_allocation.h
>> +++ b/include/drm/ttm/ttm_allocation.h
>> @@ -7,5 +7,6 @@
>>   #define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff) /* Max order which caller can benefit from */
>>   #define TTM_ALLOCATION_POOL_USE_DMA_ALLOC 	BIT(8) /* Use coherent DMA allocations. */
>>   #define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9) /* Use GFP_DMA32 allocations. */
>> +#define TTM_ALLOCATION_PROPAGATE_ENOSPC		BIT(10) /* Do not convert ENOSPC from resource managers to ENOMEM. */
>>   
>>   #endif
>> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
>> index 074b98572275..d016360e5ceb 100644
>> --- a/include/drm/ttm/ttm_device.h
>> +++ b/include/drm/ttm/ttm_device.h
>> @@ -220,6 +220,11 @@ struct ttm_device {
>>   	 */
>>   	struct list_head device_list;
>>   
>> +	/**
>> +	 * @alloc_flags: TTM_ALLOCATION_ flags.
>> +	 */
>> +	unsigned int alloc_flags;
>> +
>>   	/**
>>   	 * @funcs: Function table for the device.
>>   	 * Constant after bo device init
> 
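
For reference, the error translation this patch adds to ttm_bo_validate()
can be modelled in user space roughly as follows. This is only a sketch:
EX_PROPAGATE_ENOSPC and ex_translate_validate_error() are hypothetical names,
with the bit position taken from the TTM_ALLOCATION_PROPAGATE_ENOSPC define
in the quoted diff.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for TTM_ALLOCATION_PROPAGATE_ENOSPC (BIT(10)). */
#define EX_PROPAGATE_ENOSPC (1u << 10)

/*
 * Model of the return path: -ENOSPC from a resource manager is converted to
 * -ENOMEM for backward compatibility, unless the device opted in to seeing
 * the original error code. All other return values pass through unchanged.
 */
static int ex_translate_validate_error(unsigned int alloc_flags, int ret)
{
	if (ret == -ENOSPC)
		return (alloc_flags & EX_PROPAGATE_ENOSPC) ? ret : -ENOMEM;

	return ret;
}
```

A driver stack that sets the flag can then tell "the graphics memory
manager is full" (-ENOSPC) apart from "the system is out of memory"
(-ENOMEM), while drivers that leave the flag clear keep today's behaviour.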



* Re: [PATCH v5 0/6] Improving the worst case TTM large allocation latency
  2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2025-10-20 11:54 ` [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM Tvrtko Ursulin
@ 2025-10-27 10:21 ` Christian König
  2025-10-31  9:32   ` Tvrtko Ursulin
  6 siblings, 1 reply; 13+ messages in thread
From: Christian König @ 2025-10-27 10:21 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx, dri-devel
  Cc: kernel-dev, Alex Deucher, Christian König, Danilo Krummrich,
	Dave Airlie, Gerd Hoffmann, Joonas Lahtinen, Lucas De Marchi,
	Lyude Paul, Maarten Lankhorst, Maxime Ripard, Rodrigo Vivi,
	Sui Jingfeng, Thadeu Lima de Souza Cascardo,
	Thomas Hellström, Thomas Zimmermann, Zack Rusin

Where not yet applied or superseded by a newer version: Reviewed-by: Christian König <christian.koenig@amd.com> for the entire series.

Regards,
Christian.

On 10/20/25 13:54, Tvrtko Ursulin wrote:
> Disclaimer:
> Please note that as this series includes a patch which touches a good number of
> drivers I will only copy everyone in the cover letter and the respective patch.
> Assumption is people are subscribed to dri-devel so can look at the whole series
> there. I know someone is bound to complain for both the case when everyone is
> copied on everything for getting too much email, and also for this other case.
> So please be flexible.
> 
> Description:
> 
> All drivers which use the TTM pool allocator end up requesting large order
> allocations when allocating large buffers. Those can be slow due to memory pressure
> and so add latency to buffer creation. But there is often also a size limit
> above which contiguous blocks do not bring any performance benefits. This series
> allows drivers to say when it is okay for the TTM to try a bit less hard.
> 
> We do this by allowing drivers to specify this cut off point when creating the
> TTM device and pools. Allocations above this size will skip direct reclaim so
> under memory pressure worst case latency will improve. Background reclaim is
> still kicked off and both before and after the memory pressure all the TTM pool
> buckets remain to be used as they are today.
> 
> This is especially interesting if someone has configured MAX_PAGE_ORDER
> higher than the default. And even with the default, with amdgpu for example,
> patch 5/6 in the series makes use of the new feature by telling TTM that
> above 2MiB we do not expect performance benefits, which makes TTM not try
> direct reclaim for the top bucket (4MiB).
> 
> End result is TTM drivers become a tiny bit nicer mm citizens and users benefit
> from better worst case buffer creation latencies. As a side benefit we get rid
> of two instances of those often very unreadable multiple nameless boolean
> function signatures.
> 
> If this sounds interesting and gets merged, the individual drivers can follow up
> with patches configuring their thresholds.
> 
> v2:
>  * Christian suggested to pass in the new data by changing the function signatures.
> 
> v3:
>  * Moved ttm pool helpers into new ttm_pool_internal.h. (Christian)
> 
> v4:
>  * Fixed TTM unit test build.
> 
> v5:
>  * Renamed pool_flags to alloc_flags and moved to TTM_ALLOCATION_ namespace.
>  * Added last patch (propagate ENOSPC) from Thomas' related series for reference.
> 
> v1 thread:
> https://lore.kernel.org/dri-devel/20250919131127.90932-1-tvrtko.ursulin@igalia.com/
> 
> v3 thread:
> https://lore.kernel.org/dri-devel/20251008115314.55438-1-tvrtko.ursulin@igalia.com/
> 
> v4 thread:
> https://lore.kernel.org/dri-devel/20251013082240.55263-1-tvrtko.ursulin@igalia.com/
> 
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Sui Jingfeng <suijingfeng@loongson.cn>
> Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Zack Rusin <zack.rusin@broadcom.com>
> 
> Tvrtko Ursulin (6):
>   drm/ttm: Add getter for some pool properties
>   drm/ttm: Replace multiple booleans with flags in pool init
>   drm/ttm: Replace multiple booleans with flags in device init
>   drm/ttm: Allow drivers to specify maximum beneficial TTM pool size
>   drm/amdgpu: Configure max beneficial TTM pool allocation order
>   drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  9 ++--
>  drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
>  drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
>  drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 +-
>  drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 ++-
>  drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
>  drivers/gpu/drm/radeon/radeon_ttm.c           |  6 ++-
>  drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++----
>  .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
>  drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 33 ++++++--------
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++-----
>  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 +--
>  drivers/gpu/drm/ttm/tests/ttm_pool_test.c     | 24 +++++-----
>  drivers/gpu/drm/ttm/ttm_bo.c                  |  4 +-
>  drivers/gpu/drm/ttm/ttm_device.c              |  9 ++--
>  drivers/gpu/drm/ttm/ttm_pool.c                | 45 +++++++++++--------
>  drivers/gpu/drm/ttm/ttm_pool_internal.h       | 25 +++++++++++
>  drivers/gpu/drm/ttm/ttm_tt.c                  | 10 +++--
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 +-
>  drivers/gpu/drm/xe/xe_device.c                |  2 +-
>  include/drm/ttm/ttm_allocation.h              | 12 +++++
>  include/drm/ttm/ttm_device.h                  |  8 +++-
>  include/drm/ttm/ttm_pool.h                    |  8 ++--
>  23 files changed, 154 insertions(+), 107 deletions(-)
>  create mode 100644 drivers/gpu/drm/ttm/ttm_pool_internal.h
>  create mode 100644 include/drm/ttm/ttm_allocation.h
> 
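
As a reading aid for the cover letter quoted above, here is a minimal
user-space sketch of the "skip direct reclaim above the beneficial order"
decision. The EX_ names and both helpers are hypothetical; the low-8-bit
encoding mirrors the TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n) define from the
series, and treating an order of 0 as "no limit configured" is an assumption
of this sketch rather than something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the low 8 bits of the flags carry the order. */
#define EX_BENEFICIAL_ORDER(n) ((n) & 0xffu)
#define EX_USE_DMA_ALLOC       (1u << 8)

/* Extract the driver-provided maximum beneficial order from the flags. */
static unsigned int ex_beneficial_order(unsigned int alloc_flags)
{
	return alloc_flags & 0xffu;
}

/*
 * Decide whether an allocation of the given order may enter direct reclaim.
 * Assumption: an order of 0 means the driver did not configure a limit, so
 * every order keeps the existing behaviour.
 */
static bool ex_may_direct_reclaim(unsigned int alloc_flags, unsigned int order)
{
	unsigned int limit = ex_beneficial_order(alloc_flags);

	return !limit || order <= limit;
}
```

Assuming 4 KiB pages, a driver passing EX_BENEFICIAL_ORDER(9) declares 2MiB
as its cut-off point, so an order 10 (4MiB) request would skip direct
reclaim while order 9 and below would behave as before.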



* Re: [PATCH v5 0/6] Improving the worst case TTM large allocation latency
  2025-10-27 10:21 ` [PATCH v5 0/6] Improving the worst case TTM large allocation latency Christian König
@ 2025-10-31  9:32   ` Tvrtko Ursulin
  0 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-31  9:32 UTC (permalink / raw)
  To: Christian König, amd-gfx, dri-devel
  Cc: kernel-dev, Alex Deucher, Danilo Krummrich, Dave Airlie,
	Gerd Hoffmann, Joonas Lahtinen, Lucas De Marchi, Lyude Paul,
	Maarten Lankhorst, Maxime Ripard, Rodrigo Vivi, Sui Jingfeng,
	Thadeu Lima de Souza Cascardo, Thomas Hellström,
	Thomas Zimmermann, Zack Rusin


On 27/10/2025 10:21, Christian König wrote:
> Where not yet applied or superseded by a newer version: Reviewed-by: Christian König <christian.koenig@amd.com> for the entire series.

Thank you all for reviews and acks! I have now pushed this to drm-misc-next.

Xe can follow up with the ENOSPC, and probably 2M beneficial order, at 
their leisure.
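
For anyone picking a threshold, the size to order arithmetic can be sketched
as below. ex_order_for_size() is a hypothetical user-space helper written for
this note (it is not the kernel's get_order()), and the 4 KiB page size in
the usage note is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest order such that (page_size << order) >= size.
 * The second loop condition stops the doubling before span could overflow.
 */
static unsigned int ex_order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < size && span <= SIZE_MAX / 2) {
		span <<= 1;
		order++;
	}

	return order;
}
```

With 4 KiB pages, a 2MiB threshold corresponds to order 9 and the 4MiB top
bucket to order 10, which is the pair of values discussed in the cover
letter.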

Regards,

Tvrtko

> On 10/20/25 13:54, Tvrtko Ursulin wrote:
>> Disclaimer:
>> Please note that as this series includes a patch which touches a good number of
>> drivers I will only copy everyone in the cover letter and the respective patch.
>> Assumption is people are subscribed to dri-devel so can look at the whole series
>> there. I know someone is bound to complain for both the case when everyone is
>> copied on everything for getting too much email, and also for this other case.
>> So please be flexible.
>>
>> Description:
>>
>> All drivers which use the TTM pool allocator end up requesting large order
>> allocations when allocating large buffers. Those can be slow due to memory pressure
>> and so add latency to buffer creation. But there is often also a size limit
>> above which contiguous blocks do not bring any performance benefits. This series
>> allows drivers to say when it is okay for the TTM to try a bit less hard.
>>
>> We do this by allowing drivers to specify this cut off point when creating the
>> TTM device and pools. Allocations above this size will skip direct reclaim so
>> under memory pressure worst case latency will improve. Background reclaim is
>> still kicked off and both before and after the memory pressure all the TTM pool
>> buckets remain to be used as they are today.
>>
>> This is especially interesting if someone has configured MAX_PAGE_ORDER
>> higher than the default. And even with the default, with amdgpu for example,
>> patch 5/6 in the series makes use of the new feature by telling TTM that
>> above 2MiB we do not expect performance benefits, which makes TTM not try
>> direct reclaim for the top bucket (4MiB).
>>
>> End result is TTM drivers become a tiny bit nicer mm citizens and users benefit
>> from better worst case buffer creation latencies. As a side benefit we get rid
>> of two instances of those often very unreadable multiple nameless boolean
>> function signatures.
>>
>> If this sounds interesting and gets merged, the individual drivers can follow up
>> with patches configuring their thresholds.
>>
>> v2:
>>   * Christian suggested to pass in the new data by changing the function signatures.
>>
>> v3:
>>   * Moved ttm pool helpers into new ttm_pool_internal.h. (Christian)
>>
>> v4:
>>   * Fixed TTM unit test build.
>>
>> v5:
>>   * Renamed pool_flags to alloc_flags and moved to TTM_ALLOCATION_ namespace.
>>   * Added last patch (propagate ENOSPC) from Thomas' related series for reference.
>>
>> v1 thread:
>> https://lore.kernel.org/dri-devel/20250919131127.90932-1-tvrtko.ursulin@igalia.com/
>>
>> v3 thread:
>> https://lore.kernel.org/dri-devel/20251008115314.55438-1-tvrtko.ursulin@igalia.com/
>>
>> v4 thread:
>> https://lore.kernel.org/dri-devel/20251013082240.55263-1-tvrtko.ursulin@igalia.com/
>>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Dave Airlie <airlied@redhat.com>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>> Cc: Lyude Paul <lyude@redhat.com>
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Cc: Maxime Ripard <mripard@kernel.org>
>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Cc: Sui Jingfeng <suijingfeng@loongson.cn>
>> Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>> Cc: Zack Rusin <zack.rusin@broadcom.com>
>>
>> Tvrtko Ursulin (6):
>>    drm/ttm: Add getter for some pool properties
>>    drm/ttm: Replace multiple booleans with flags in pool init
>>    drm/ttm: Replace multiple booleans with flags in device init
>>    drm/ttm: Allow drivers to specify maximum beneficial TTM pool size
>>    drm/amdgpu: Configure max beneficial TTM pool allocation order
>>    drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  9 ++--
>>   drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
>>   drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
>>   drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 +-
>>   drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 ++-
>>   drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
>>   drivers/gpu/drm/radeon/radeon_ttm.c           |  6 ++-
>>   drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++----
>>   .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
>>   drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 33 ++++++--------
>>   drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++-----
>>   drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 +--
>>   drivers/gpu/drm/ttm/tests/ttm_pool_test.c     | 24 +++++-----
>>   drivers/gpu/drm/ttm/ttm_bo.c                  |  4 +-
>>   drivers/gpu/drm/ttm/ttm_device.c              |  9 ++--
>>   drivers/gpu/drm/ttm/ttm_pool.c                | 45 +++++++++++--------
>>   drivers/gpu/drm/ttm/ttm_pool_internal.h       | 25 +++++++++++
>>   drivers/gpu/drm/ttm/ttm_tt.c                  | 10 +++--
>>   drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 +-
>>   drivers/gpu/drm/xe/xe_device.c                |  2 +-
>>   include/drm/ttm/ttm_allocation.h              | 12 +++++
>>   include/drm/ttm/ttm_device.h                  |  8 +++-
>>   include/drm/ttm/ttm_pool.h                    |  8 ++--
>>   23 files changed, 154 insertions(+), 107 deletions(-)
>>   create mode 100644 drivers/gpu/drm/ttm/ttm_pool_internal.h
>>   create mode 100644 include/drm/ttm/ttm_allocation.h
>>
> 




Thread overview: 13+ messages
2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 1/6] drm/ttm: Add getter for some pool properties Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 2/6] drm/ttm: Replace multiple booleans with flags in pool init Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init Tvrtko Ursulin
2025-10-21 14:16   ` Thomas Hellström
2025-10-22  3:56   ` Zack Rusin
2025-10-20 11:54 ` [PATCH v5 4/6] drm/ttm: Allow drivers to specify maximum beneficial TTM pool size Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 5/6] drm/amdgpu: Configure max beneficial TTM pool allocation order Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM Tvrtko Ursulin
2025-10-21 14:11   ` Thomas Hellström
2025-10-23 13:37     ` Tvrtko Ursulin
2025-10-27 10:21 ` [PATCH v5 0/6] Improving the worst case TTM large allocation latency Christian König
2025-10-31  9:32   ` Tvrtko Ursulin
