intel-xe.lists.freedesktop.org archive mirror
* [PATCH 00/15] Driver-managed exhaustive eviction
@ 2025-08-13 10:51 Thomas Hellström
From: Thomas Hellström @ 2025-08-13 10:51 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Joonas Lahtinen,
	Jani Nikula, Maarten Lankhorst, Matthew Auld

Exhaustive eviction means that every client should in theory be able to
allocate all graphics memory (minus pinned memory). This is done by
evicting other clients' memory.

Currently when TTM wants to evict a buffer object it will typically
trylock that buffer object. It may also optionally try a sleeping lock,
but if deadlock resolution kicks in while doing so (the locking
returns -EDEADLK), that is converted to an -ENOMEM and returned to the
caller. If there are multiple clients simultaneously wanting to evict
each other's buffer objects, there is a chance that clients end
up returning -ENOMEM.
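
To make the failure mode concrete, here is a minimal userspace sketch of
the locking order described above. It is a toy model, not TTM code: the
toy_* names and the would_deadlock flag are hypothetical, and returning
-EBUSY when the trylock fails and sleeping is not allowed is an
assumption made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object and its lock state. */
struct toy_bo {
	bool locked;		/* someone else currently holds the lock */
	bool would_deadlock;	/* a sleeping lock would hit deadlock resolution */
};

/* Non-blocking attempt, analogous to the trylock described above. */
static int toy_trylock(struct toy_bo *bo)
{
	if (bo->locked)
		return -EBUSY;
	bo->locked = true;
	return 0;
}

/* Sleeping lock: either deadlock resolution kicks in or we get the lock. */
static int toy_lock_sleeping(struct toy_bo *bo)
{
	if (bo->would_deadlock)
		return -EDEADLK;
	bo->locked = true;	/* pretend the previous holder let go */
	return 0;
}

/* Lock a bo for eviction: trylock first, optionally a sleeping lock. */
static int toy_evict_lock(struct toy_bo *bo, bool allow_sleep)
{
	int ret = toy_trylock(bo);

	if (!ret)
		return 0;
	if (!allow_sleep)
		return -EBUSY;	/* assumption: caller skips this bo */

	ret = toy_lock_sleeping(bo);
	if (ret == -EDEADLK)
		return -ENOMEM;	/* the conversion described in the text */
	return ret;
}

/* Small self-check showing the three outcomes. */
static void toy_evict_demo(void)
{
	struct toy_bo contended = { .locked = true, .would_deadlock = true };
	struct toy_bo idle = { 0 };

	assert(toy_evict_lock(&idle, false) == 0);
	assert(toy_evict_lock(&contended, false) == -EBUSY);
	assert(toy_evict_lock(&contended, true) == -ENOMEM);
}
```

The point of the sketch is the last branch: the caller only sees
-ENOMEM and cannot tell a genuine out-of-memory condition from a lock
conflict with another client.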

The key to resolving this is that on memory contention, lower
priority clients back off, releasing their buffer object locks and
thereby allowing their memory to be evicted. Eventually their priority
will elevate and they will succeed. TTM has long been intending to
implement this using full drm_exec locking during eviction. This means
that when that is implemented, clients wanting to validate memory must
pass the drm_exec context used to lock their buffer objects to TTM
validation. Most of this series is making sure that is done, both
on exec-type validation and buffer object creation. The big benefit of
this approach is that it can distinguish between memory types and
avoid lock release rollbacks until it is really necessary. One
drawback is that it can't handle system memory contention resolved
by a shrinker.

However, since TTM has still to implement drm_exec validation, this
series, while preparing for the TTM implementation, takes a different
approach with an outer rw semaphore on top of the drm_exec retry loop.
When a client wants to allocate graphics memory, the lock is taken in
non-exclusive mode. If an OOM is hit, the locks are released and the
outer lock is retaken in exclusive mode. That ensures that on memory
contention, the client will, when the exclusive lock is held, be
the only client trying to allocate memory. It requires, however,
that all clients adhere to the same scheme.

The idea is that when TTM implements drm_exec eviction, the driver-
managed scheme could be retired.

Patches 1 to 3 fix problems hit while testing.
Patch 4 identifies the code paths where we need a drm_exec transaction.
Patch 5 introduces the wrapper with the rw-semaphore.

The rest of the patches ensure that we wrap graphics memory
allocation in the combined rw-semaphore / drm-exec loop.

As a follow up, additional patches around suspend / resume will
be posted.

Thomas Hellström (15):
  drm/xe/vm: Don't use a pin the vm_resv during validation
  drm/xe/tests/xe_dma_buf: Set the drm_object::dma_buf member
  drm/xe/vm: Clear the scratch_pt pointer on error
  drm/xe: Pass down drm_exec context to validation
  drm/xe: Introduce an xe_validation wrapper around drm_exec
  drm/xe: Convert xe_bo_create_user() for exhaustive eviction
  drm/xe: Convert SVM validation for exhaustive eviction
  drm/xe: Convert existing drm_exec transactions for exhaustive eviction
  drm/xe: Convert the CPU fault handler for exhaustive eviction
  drm/xe/display: Convert __xe_pin_fb_vma()
  drm/xe: Convert xe_dma_buf.c for exhaustive eviction
  drm/xe: Rename ___xe_bo_create_locked()
  drm/xe: Convert xe_bo_create_pin_map_at() for exhaustive eviction
  drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction
  drm/xe: Convert pinned suspend eviction for exhaustive eviction

 drivers/gpu/drm/xe/Makefile                   |   1 +
 .../compat-i915-headers/gem/i915_gem_stolen.h |  24 +-
 drivers/gpu/drm/xe/display/intel_fbdev_fb.c   |  18 +-
 drivers/gpu/drm/xe/display/xe_dsb_buffer.c    |  10 +-
 drivers/gpu/drm/xe/display/xe_fb_pin.c        |  62 +-
 drivers/gpu/drm/xe/display/xe_hdcp_gsc.c      |   8 +-
 drivers/gpu/drm/xe/display/xe_plane_initial.c |   4 +-
 drivers/gpu/drm/xe/tests/xe_bo.c              |  36 +-
 drivers/gpu/drm/xe/tests/xe_dma_buf.c         |  24 +-
 drivers/gpu/drm/xe/tests/xe_migrate.c         |  66 +-
 drivers/gpu/drm/xe/xe_bo.c                    | 589 ++++++++++++------
 drivers/gpu/drm/xe/xe_bo.h                    |  56 +-
 drivers/gpu/drm/xe/xe_device.c                |   2 +
 drivers/gpu/drm/xe/xe_device_types.h          |   3 +
 drivers/gpu/drm/xe/xe_dma_buf.c               |  70 ++-
 drivers/gpu/drm/xe/xe_eu_stall.c              |   6 +-
 drivers/gpu/drm/xe/xe_exec.c                  |  26 +-
 drivers/gpu/drm/xe/xe_ggtt.c                  |  15 +-
 drivers/gpu/drm/xe/xe_ggtt.h                  |   5 +-
 drivers/gpu/drm/xe/xe_gsc.c                   |   8 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c          |  24 +-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c    |  22 +-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c |  24 +-
 drivers/gpu/drm/xe/xe_guc_engine_activity.c   |  13 +-
 drivers/gpu/drm/xe/xe_lmtt.c                  |  12 +-
 drivers/gpu/drm/xe/xe_lrc.c                   |   7 +-
 drivers/gpu/drm/xe/xe_migrate.c               |  20 +-
 drivers/gpu/drm/xe/xe_oa.c                    |   6 +-
 drivers/gpu/drm/xe/xe_pt.c                    |  10 +-
 drivers/gpu/drm/xe/xe_pt.h                    |   3 +-
 drivers/gpu/drm/xe/xe_pxp_submit.c            |  34 +-
 drivers/gpu/drm/xe/xe_svm.c                   |  65 +-
 drivers/gpu/drm/xe/xe_validation.c            | 248 ++++++++
 drivers/gpu/drm/xe/xe_validation.h            | 176 ++++++
 drivers/gpu/drm/xe/xe_vm.c                    | 287 +++++----
 drivers/gpu/drm/xe/xe_vm.h                    |  52 +-
 drivers/gpu/drm/xe/xe_vm_types.h              |  32 +-
 37 files changed, 1413 insertions(+), 655 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_validation.c
 create mode 100644 drivers/gpu/drm/xe/xe_validation.h

-- 
2.50.1



Thread overview: 66+ messages
2025-08-13 10:51 [PATCH 00/15] Driver-managed exhaustive eviction Thomas Hellström
2025-08-13 10:51 ` [PATCH 01/15] drm/xe/vm: Don't use a pin the vm_resv during validation Thomas Hellström
2025-08-13 14:28   ` Matthew Brost
2025-08-13 14:33     ` Thomas Hellström
2025-08-13 15:17       ` Matthew Brost
2025-08-13 10:51 ` [PATCH 02/15] drm/xe/tests/xe_dma_buf: Set the drm_object::dma_buf member Thomas Hellström
2025-08-14  2:52   ` Matthew Brost
2025-08-13 10:51 ` [PATCH 03/15] drm/xe/vm: Clear the scratch_pt pointer on error Thomas Hellström
2025-08-13 14:45   ` Matthew Brost
2025-08-13 10:51 ` [PATCH 04/15] drm/xe: Pass down drm_exec context to validation Thomas Hellström
2025-08-13 16:42   ` Matthew Brost
2025-08-14  7:49     ` Thomas Hellström
2025-08-14 19:09       ` Matthew Brost
2025-08-22  7:40     ` Thomas Hellström
2025-08-13 10:51 ` [PATCH 05/15] drm/xe: Introduce an xe_validation wrapper around drm_exec Thomas Hellström
2025-08-13 17:25   ` Matthew Brost
2025-08-15 15:04     ` Thomas Hellström
2025-08-14  2:33   ` Matthew Brost
2025-08-14  4:23     ` Matthew Brost
2025-08-15 15:23     ` Thomas Hellström
2025-08-15 19:01       ` Matthew Brost
2025-08-17 14:05   ` [05/15] " Simon Richter
2025-08-18  2:19     ` Matthew Brost
2025-08-18  5:24       ` Simon Richter
2025-08-18  9:19     ` Thomas Hellström
2025-08-13 10:51 ` [PATCH 06/15] drm/xe: Convert xe_bo_create_user() for exhaustive eviction Thomas Hellström
2025-08-14  2:23   ` Matthew Brost
2025-08-13 10:51 ` [PATCH 07/15] drm/xe: Convert SVM validation " Thomas Hellström
2025-08-13 15:32   ` Matthew Brost
2025-08-14 12:24     ` Thomas Hellström
2025-08-13 10:51 ` [PATCH 08/15] drm/xe: Convert existing drm_exec transactions " Thomas Hellström
2025-08-14  2:48   ` Matthew Brost
2025-08-13 10:51 ` [PATCH 09/15] drm/xe: Convert the CPU fault handler " Thomas Hellström
2025-08-13 22:06   ` Matthew Brost
2025-08-15 15:16     ` Thomas Hellström
2025-08-15 19:04       ` Matthew Brost
2025-08-18  9:11         ` Thomas Hellström
2025-08-13 10:51 ` [PATCH 10/15] drm/xe/display: Convert __xe_pin_fb_vma() Thomas Hellström
2025-08-14  2:35   ` Matthew Brost
2025-08-13 10:51 ` [PATCH 11/15] drm/xe: Convert xe_dma_buf.c for exhaustive eviction Thomas Hellström
2025-08-13 21:37   ` Matthew Brost
2025-08-15 15:05     ` Thomas Hellström
2025-08-14 20:37   ` Matthew Brost
2025-08-15  6:57     ` Thomas Hellström
2025-08-13 10:51 ` [PATCH 12/15] drm/xe: Rename ___xe_bo_create_locked() Thomas Hellström
2025-08-13 21:33   ` Matthew Brost
2025-08-13 10:51 ` [PATCH 13/15] drm/xe: Convert xe_bo_create_pin_map_at() for exhaustive eviction Thomas Hellström
2025-08-14  3:58   ` Matthew Brost
2025-08-15 15:25     ` Thomas Hellström
2025-08-14  4:05   ` Matthew Brost
2025-08-15 15:27     ` Thomas Hellström
2025-08-14 18:48   ` Matthew Brost
2025-08-15  9:37     ` Thomas Hellström
2025-08-13 10:51 ` [PATCH 14/15] drm/xe: Convert xe_bo_create_pin_map() " Thomas Hellström
2025-08-14  4:18   ` Matthew Brost
2025-08-14 13:14     ` Thomas Hellström
2025-08-14 18:39       ` Matthew Brost
2025-08-13 10:51 ` [PATCH 15/15] drm/xe: Convert pinned suspend eviction " Thomas Hellström
2025-08-13 12:13   ` Matthew Auld
2025-08-13 12:30     ` Thomas Hellström
2025-08-14 20:30   ` Matthew Brost
2025-08-15 15:29     ` Thomas Hellström
2025-08-13 11:54 ` ✗ CI.checkpatch: warning for Driver-managed " Patchwork
2025-08-13 11:55 ` ✓ CI.KUnit: success " Patchwork
2025-08-13 13:20 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-08-13 14:25 ` ✗ Xe.CI.Full: " Patchwork
