AMD-GFX Archive on lore.kernel.org
* [PATCH v5 0/6] Improving the worst case TTM large allocation latency
@ 2025-10-20 11:54 Tvrtko Ursulin
  2025-10-20 11:54 ` [PATCH v5 1/6] drm/ttm: Add getter for some pool properties Tvrtko Ursulin
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2025-10-20 11:54 UTC
  To: amd-gfx, dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
	Danilo Krummrich, Dave Airlie, Gerd Hoffmann, Joonas Lahtinen,
	Lucas De Marchi, Lyude Paul, Maarten Lankhorst, Maxime Ripard,
	Rodrigo Vivi, Sui Jingfeng, Thadeu Lima de Souza Cascardo,
	Thomas Hellström, Thomas Zimmermann, Zack Rusin

Disclaimer:
Please note that as this series includes a patch which touches a good number of
drivers, I will only copy everyone on the cover letter and the respective patch.
The assumption is that people are subscribed to dri-devel and so can look at the
whole series there. I know someone is bound to complain either way: about too
much email when everyone is copied on everything, and about this approach too.
So please be flexible.

Description:

All drivers which use the TTM pool allocator end up requesting large order
allocations when allocating large buffers. Those can be slow due to memory
pressure and so add latency to buffer creation. But there is often also a size
limit above which contiguous blocks do not bring any performance benefits. This
series allows drivers to say when it is okay for TTM to try a bit less hard.

We do this by allowing drivers to specify this cut-off point when creating the
TTM device and pools. Allocations above this size will skip direct reclaim, so
under memory pressure the worst case latency will improve. Background reclaim is
still kicked off, and both before and after the memory pressure all the TTM pool
buckets remain in use as they are today.
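
As a rough illustration of the cut-off described above, here is a minimal
user-space sketch. It is not the code from the series: the SKETCH_GFP_* bits
and sketch_gfp_for_order() are hypothetical stand-ins for the kernel's reclaim
flags and for whatever helper the pool allocator uses internally.

```c
/*
 * Hypothetical stand-ins for the kernel's reclaim-related GFP bits.
 * The values are illustrative only and do not match the kernel's.
 */
#define SKETCH_GFP_DIRECT_RECLAIM	(1u << 0)	/* caller may reclaim synchronously */
#define SKETCH_GFP_KSWAPD_RECLAIM	(1u << 1)	/* wake up background reclaim */

/*
 * Choose the reclaim behaviour for one allocation attempt.
 *
 * order:            order of the block being requested
 * beneficial_order: largest order the driver says still helps performance
 */
static unsigned int sketch_gfp_for_order(unsigned int order,
					 unsigned int beneficial_order)
{
	unsigned int gfp = SKETCH_GFP_DIRECT_RECLAIM |
			   SKETCH_GFP_KSWAPD_RECLAIM;

	/* Above the cut-off: do not stall in direct reclaim. */
	if (order > beneficial_order)
		gfp &= ~SKETCH_GFP_DIRECT_RECLAIM;

	return gfp;
}
```

With a cut-off of order 9, a request for order 10 keeps only the background
reclaim bit, while requests at order 9 and below behave as they do today.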

This is especially interesting if someone has configured MAX_PAGE_ORDER to
higher than the default. And even with the default, with amdgpu for example,
patch 5 in the series makes use of the new feature by telling TTM that above
2MiB we do not expect performance benefits, which makes TTM not try direct
reclaim for the top bucket (4MiB).
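
The sizes above map to allocation orders as follows. This sketch assumes 4KiB
pages and the default MAX_PAGE_ORDER of 10; the helper names are made up for
illustration and are not part of the series.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT	12u	/* assumption: 4KiB pages */

/* Size in bytes of one block of the given order. */
static size_t sketch_size_for_order(unsigned int order)
{
	return (size_t)1 << (SKETCH_PAGE_SHIFT + order);
}

/* Smallest order whose block is at least `bytes` large. */
static unsigned int sketch_order_for_size(size_t bytes)
{
	unsigned int order = 0;

	while (sketch_size_for_order(order) < bytes)
		order++;

	return order;
}
```

Under those assumptions 2MiB corresponds to order 9 and the top bucket, order
10, is 4MiB. So with the amdgpu threshold only the order 10 bucket is affected.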

End result is TTM drivers become a tiny bit nicer mm citizens and users benefit
from better worst case buffer creation latencies. As a side benefit we get rid
of two instances of those often very unreadable function signatures with
multiple nameless booleans.
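
To show why a flags argument reads better at the call site than a pair of
booleans, here is a small sketch. All sketch_* and SKETCH_* names are
hypothetical and only mirror the general shape of the change; see patches 2
and 3 for the real signatures.

```c
#include <stdbool.h>

/* Hypothetical flag names, for illustration only. */
#define SKETCH_USE_DMA_ALLOC	(1u << 0)
#define SKETCH_USE_DMA32	(1u << 1)

struct sketch_pool {
	unsigned int alloc_flags;
};

/* Old style: the call site reads as init(pool, true, false). */
static void sketch_pool_init_bools(struct sketch_pool *pool,
				   bool use_dma_alloc, bool use_dma32)
{
	pool->alloc_flags = 0;
	if (use_dma_alloc)
		pool->alloc_flags |= SKETCH_USE_DMA_ALLOC;
	if (use_dma32)
		pool->alloc_flags |= SKETCH_USE_DMA32;
}

/* New style: the call site names every option it turns on. */
static void sketch_pool_init_flags(struct sketch_pool *pool,
				   unsigned int alloc_flags)
{
	pool->alloc_flags = alloc_flags;
}
```

A caller would then write sketch_pool_init_flags(&pool, SKETCH_USE_DMA_ALLOC)
instead of sketch_pool_init_bools(&pool, true, false), and further options can
be added later without touching the signature again.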

If this sounds interesting and gets merged, the individual drivers can follow up
with patches configuring their thresholds.

v2:
 * Christian suggested to pass in the new data by changing the function signatures.

v3:
 * Moved ttm pool helpers into new ttm_pool_internal.h. (Christian)

v4:
 * Fixed TTM unit test build.

v5:
 * Renamed pool_flags to alloc_flags and moved to TTM_ALLOCATION_ namespace.
 * Added last patch (propagate ENOSPC) from Thomas' related series for reference.

v1 thread:
https://lore.kernel.org/dri-devel/20250919131127.90932-1-tvrtko.ursulin@igalia.com/

v3 thread:
https://lore.kernel.org/dri-devel/20251008115314.55438-1-tvrtko.ursulin@igalia.com/

v4 thread:
https://lore.kernel.org/dri-devel/20251013082240.55263-1-tvrtko.ursulin@igalia.com/

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sui Jingfeng <suijingfeng@loongson.cn>
Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Zack Rusin <zack.rusin@broadcom.com>

Tvrtko Ursulin (6):
  drm/ttm: Add getter for some pool properties
  drm/ttm: Replace multiple booleans with flags in pool init
  drm/ttm: Replace multiple booleans with flags in device init
  drm/ttm: Allow drivers to specify maximum beneficial TTM pool size
  drm/amdgpu: Configure max beneficial TTM pool allocation order
  drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  9 ++--
 drivers/gpu/drm/drm_gem_vram_helper.c         |  2 +-
 drivers/gpu/drm/i915/intel_region_ttm.c       |  2 +-
 drivers/gpu/drm/loongson/lsdc_ttm.c           |  3 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c         |  6 ++-
 drivers/gpu/drm/qxl/qxl_ttm.c                 |  2 +-
 drivers/gpu/drm/radeon/radeon_ttm.c           |  6 ++-
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       | 16 +++----
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  2 +-
 drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 33 ++++++--------
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 22 ++++-----
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  7 +--
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c     | 24 +++++-----
 drivers/gpu/drm/ttm/ttm_bo.c                  |  4 +-
 drivers/gpu/drm/ttm/ttm_device.c              |  9 ++--
 drivers/gpu/drm/ttm/ttm_pool.c                | 45 +++++++++++--------
 drivers/gpu/drm/ttm/ttm_pool_internal.h       | 25 +++++++++++
 drivers/gpu/drm/ttm/ttm_tt.c                  | 10 +++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |  4 +-
 drivers/gpu/drm/xe/xe_device.c                |  2 +-
 include/drm/ttm/ttm_allocation.h              | 12 +++++
 include/drm/ttm/ttm_device.h                  |  8 +++-
 include/drm/ttm/ttm_pool.h                    |  8 ++--
 23 files changed, 154 insertions(+), 107 deletions(-)
 create mode 100644 drivers/gpu/drm/ttm/ttm_pool_internal.h
 create mode 100644 include/drm/ttm/ttm_allocation.h

-- 
2.48.0



end of thread, other threads:[~2025-10-31  9:32 UTC | newest]

Thread overview: 13+ messages
2025-10-20 11:54 [PATCH v5 0/6] Improving the worst case TTM large allocation latency Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 1/6] drm/ttm: Add getter for some pool properties Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 2/6] drm/ttm: Replace multiple booleans with flags in pool init Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 3/6] drm/ttm: Replace multiple booleans with flags in device init Tvrtko Ursulin
2025-10-21 14:16   ` Thomas Hellström
2025-10-22  3:56   ` Zack Rusin
2025-10-20 11:54 ` [PATCH v5 4/6] drm/ttm: Allow drivers to specify maximum beneficial TTM pool size Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 5/6] drm/amdgpu: Configure max beneficial TTM pool allocation order Tvrtko Ursulin
2025-10-20 11:54 ` [PATCH v5 6/6] drm/ttm: Add an allocation flag to propagate -ENOSPC on OOM Tvrtko Ursulin
2025-10-21 14:11   ` Thomas Hellström
2025-10-23 13:37     ` Tvrtko Ursulin
2025-10-27 10:21 ` [PATCH v5 0/6] Improving the worst case TTM large allocation latency Christian König
2025-10-31  9:32   ` Tvrtko Ursulin
