From: Matthew Brost <matthew.brost@intel.com>
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com,
jose.souza@intel.com, simona.vetter@ffwll.ch,
thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com,
airlied@gmail.com, christian.koenig@amd.com,
mihail.atanassov@arm.com, steven.price@arm.com,
shashank.sharma@amd.com
Subject: [RFC PATCH 00/29] UMD direct submission in Xe
Date: Mon, 18 Nov 2024 15:37:28 -0800
Message-ID: <20241118233757.2374041-1-matthew.brost@intel.com>
This is an RFC, or possibly even a proof of concept, for UMD (User Mode
Driver) direct submission in Xe. It is similar to AMD's design [1] [2]
or ARM's design [3], utilizing a uAPI to convert user-space syncs
(memory writes) to kernel-space syncs (DMA fences). It is built around
the existing Xe preemption fences for dynamic memory management, such as
userptr invalidation and buffer object (BO) eviction.
The series also enables mapping a PPGTT-bound submission ring in
non-privileged mode, as well as exposing indirect ring state (such as
ring head, tail, etc.) and the doorbell to user space, enabling UMD
direct submission.
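As a rough illustration of what direct submission means from the UMD side, here is a minimal user-space model, not code from this series. It assumes a power-of-two ring indexed in dwords, head and tail words living in the mapped indirect ring state, and a doorbell that is rung by a plain memory write. The struct `umd_queue`, its fields, the helper names, and the doorbell value are all hypothetical; the real layout is defined by the hardware and by the uAPI patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space view of a mapped exec queue (not the series' uAPI). */
struct umd_queue {
	uint32_t *ring;              /* mapped submission ring, in dwords */
	uint32_t ring_dwords;        /* ring size in dwords, power of two */
	volatile uint32_t *head;     /* indirect ring state: advanced by the GPU */
	volatile uint32_t *tail;     /* indirect ring state: advanced by the UMD */
	volatile uint32_t *doorbell; /* mapped doorbell */
};

/* Free dwords; one slot stays unused so head == tail means "empty". */
static uint32_t umd_ring_space(const struct umd_queue *q)
{
	return (*q->head - *q->tail - 1) & (q->ring_dwords - 1);
}

/* Copy n command dwords into the ring, publish the new tail, ring the doorbell. */
static int umd_submit(struct umd_queue *q, const uint32_t *cmds, uint32_t n)
{
	uint32_t mask = q->ring_dwords - 1;
	uint32_t tail = *q->tail;
	uint32_t i;

	assert((q->ring_dwords & mask) == 0); /* power-of-two assumption */

	if (n > umd_ring_space(q))
		return -1; /* not enough room; caller waits for head to advance */

	for (i = 0; i < n; i++)
		q->ring[(tail + i) & mask] = cmds[i];

	/* Commands must be visible before the tail update and doorbell write. */
	atomic_thread_fence(memory_order_release);

	*q->tail = (tail + n) & mask;
	*q->doorbell = 1; /* assumed doorbell value, for illustration only */
	return 0;
}
```

The point of the sketch is only that no IOCTL sits on the submission path: the UMD writes commands, bumps the tail, and rings the doorbell.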
The target for this series is Mesa, with the goal of enabling UMD direct
submission and removing the submission thread that currently handles
future fences. I've discussed this with Sima and the Intel Mesa team,
and it seems like a reachable target. Most synchronization will be
handled in user space via memory writes and semaphore wait ring
instructions, with only legacy cross-process synchronization (e.g.,
compositors) requiring kernel synchronization (DMA fences).
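To make "synchronization via memory writes" concrete, the following is a small CPU-side model, not code from this series. It assumes a sync point is a 64-bit memory location plus a target value, and that it counts as signaled once the stored value reaches the target. The struct `user_sync` and both helpers are hypothetical names; on real hardware the signal would be a GPU memory write and the wait a semaphore wait instruction in the ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space sync point: a memory location and a target value. */
struct user_sync {
	_Atomic uint64_t *addr; /* shared memory location */
	uint64_t value;         /* value that marks this sync as signaled */
};

/* Signal by writing the target value; release orders prior work before it. */
static void user_sync_signal(struct user_sync *s)
{
	assert(s->addr);
	atomic_store_explicit(s->addr, s->value, memory_order_release);
}

/* Signaled once the location holds a value at or past the target. */
static bool user_sync_signaled(const struct user_sync *s)
{
	assert(s->addr);
	return atomic_load_explicit(s->addr, memory_order_acquire) >= s->value;
}
```

Several sync points can share one location with increasing target values, so signaling a later point implicitly signals the earlier ones. Only when such a sync has to cross a process boundary would it need converting to a DMA fence through the kernel.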
The series includes some common patches at the beginning to implement
preemption fences and user fences. The idea of preemption
DMA-reservation slots [4] has been dropped in favor of attaching the
last exported DMA fence to the preemption fence as suggested by AMD.
This is a public checkpoint on the KMD (Kernel Mode Driver) work, which
will be tabled until Intel's Mesa team has the bandwidth to begin the
UMD work. That said, the uAPI is very preliminary and likely to change.
One idea that was discussed is a common user fence interface based
around DRM syncobjs, which will likely be explored further as UMD
engagement begins. Some work for syncing VM binds (kernel operation)
with UMD direct submission is also likely required.
Testing has been done with [5], and the main features (basic
submission, dynamic memory management, user-to-kernel sync conversion,
and protection against endless user fences) are working on BMG and LNL.
A GitLab branch [6] has also been pushed for reference.
Any early community feedback is always appreciated.
Matt
[1] https://patchwork.freedesktop.org/series/113675/
[2] https://patchwork.freedesktop.org/series/114385/
[3] https://patchwork.freedesktop.org/series/137924/
[4] https://patchwork.freedesktop.org/series/141129/
[5] https://patchwork.freedesktop.org/series/141518/
[6] https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-umd-submission-post/-/tree/post-11-18-24?ref_type=heads
Matthew Brost (28):
dma-fence: Add dma_fence_preempt base class
dma-fence: Add dma_fence_user_fence
drm/xe: Use dma_fence_preempt base class
drm/xe: Allocate doorbells for UMD exec queues
drm/xe: Add doorbell ID to snapshot capture
drm/xe: Break submission ring out into its own BO
drm/xe: Break indirect ring state out into its own BO
drm/xe: Clear GGTT in xe_bo_restore_kernel
FIXME: drm/xe: Add pad to ring and indirect state
drm/xe: Enable indirect ring on media GT
drm/xe: Don't add pinned mappings to VM bulk move
drm/xe: Add exec queue post init extension processing
drm/xe: Add support for mmapping doorbells to user space
drm/xe: Add support for mmapping submission ring and indirect ring
state to user space
drm/xe/uapi: Define UMD exec queue mapping uAPI
drm/xe: Add usermap exec queue extension
drm/xe: Drop EXEC_QUEUE_FLAG_UMD_SUBMISSION flag
drm/xe: Do not allow usermap exec queues in exec IOCTL
drm/xe: Teach GuC backend to kill usermap queues
drm/xe: Enable preempt fences on usermap queues
drm/xe/uapi: Add uAPI to convert user semaphore to / from drm syncobj
drm/xe: Add user fence IRQ handler
drm/xe: Add xe_hw_fence_user_init
drm/xe: Add a message lock to the Xe GPU scheduler
drm/xe: Always wait on preempt fences in vma_check_userptr
drm/xe: Teach xe_sync layer about drm_xe_semaphore
drm/xe: Add VM convert fence IOCTL
drm/xe: Add user fence TDR
Tejas Upadhyay (1):
drm/xe/mmap: Add mmap support for PCI memory barrier
drivers/dma-buf/Makefile | 2 +-
drivers/dma-buf/dma-fence-preempt.c | 134 ++++++
drivers/dma-buf/dma-fence-user-fence.c | 73 ++++
drivers/gpu/drm/xe/xe_bo.c | 29 +-
drivers/gpu/drm/xe/xe_bo.h | 5 +
drivers/gpu/drm/xe/xe_bo_evict.c | 8 +-
drivers/gpu/drm/xe/xe_device.c | 181 +++++++-
drivers/gpu/drm/xe/xe_device_types.h | 3 +
drivers/gpu/drm/xe/xe_exec.c | 3 +-
drivers/gpu/drm/xe/xe_exec_queue.c | 175 +++++++-
drivers/gpu/drm/xe/xe_exec_queue.h | 5 +
drivers/gpu/drm/xe/xe_exec_queue_types.h | 13 +
drivers/gpu/drm/xe/xe_execlist.c | 2 +-
drivers/gpu/drm/xe/xe_ggtt.c | 19 +-
drivers/gpu/drm/xe/xe_ggtt.h | 2 +
drivers/gpu/drm/xe/xe_gpu_scheduler.c | 19 +-
drivers/gpu/drm/xe/xe_gpu_scheduler.h | 12 +-
drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 +
drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 9 +-
drivers/gpu/drm/xe/xe_guc_submit.c | 177 +++++++-
drivers/gpu/drm/xe/xe_guc_submit_types.h | 2 +
drivers/gpu/drm/xe/xe_hw_engine.c | 4 +-
drivers/gpu/drm/xe/xe_hw_engine_group.c | 4 +-
drivers/gpu/drm/xe/xe_hw_fence.c | 17 +
drivers/gpu/drm/xe/xe_hw_fence.h | 3 +
drivers/gpu/drm/xe/xe_lrc.c | 176 ++++++--
drivers/gpu/drm/xe/xe_lrc.h | 4 +-
drivers/gpu/drm/xe/xe_lrc_types.h | 16 +-
drivers/gpu/drm/xe/xe_pci.c | 1 +
drivers/gpu/drm/xe/xe_preempt_fence.c | 89 ++--
drivers/gpu/drm/xe/xe_preempt_fence.h | 2 +-
drivers/gpu/drm/xe/xe_preempt_fence_types.h | 11 +-
drivers/gpu/drm/xe/xe_pt.c | 5 +-
drivers/gpu/drm/xe/xe_sync.c | 90 ++++
drivers/gpu/drm/xe/xe_sync.h | 8 +
drivers/gpu/drm/xe/xe_sync_types.h | 5 +-
drivers/gpu/drm/xe/xe_vm.c | 423 ++++++++++++++++++-
drivers/gpu/drm/xe/xe_vm.h | 4 +-
drivers/gpu/drm/xe/xe_vm_types.h | 26 ++
include/linux/dma-fence-preempt.h | 56 +++
include/linux/dma-fence-user-fence.h | 31 ++
include/uapi/drm/xe_drm.h | 147 ++++++-
42 files changed, 1798 insertions(+), 199 deletions(-)
create mode 100644 drivers/dma-buf/dma-fence-preempt.c
create mode 100644 drivers/dma-buf/dma-fence-user-fence.c
create mode 100644 include/linux/dma-fence-preempt.h
create mode 100644 include/linux/dma-fence-user-fence.h
--
2.34.1