Intel-XE Archive on lore.kernel.org
* [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5
@ 2025-10-06 11:16 Mika Kuoppala
From: Mika Kuoppala @ 2025-10-06 11:16 UTC
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

Hi,

This is the v5 patch series for Intel Xe GPU debug support (eudebug).

As the initial feedback on v4 was positive, we have brought the
remaining features from v3, namely page fault support, into
this series.

This series continues from the following previous submissions:
- v1: https://lists.freedesktop.org/archives/intel-xe/2024-July/043605.html
- v2: https://lists.freedesktop.org/archives/intel-xe/2024-October/052260.html
- v3: https://lists.freedesktop.org/archives/intel-xe/2024-December/061476.html
- v4: https://lists.freedesktop.org/archives/intel-xe/2025-August/091645.html

Known shortcomings: With multiple debug clients, page faults can
race with the stopping and starting of attention polling, which
can lead to missed attentions.

### Major Changes from v4

v4 omitted page fault support. It has been reworked from v3 and is
included in this series.

### Major Changes from v3

#### 1. Elimination of ptrace_may_access() and pid

In the previous series, the connection attempt was made using the process ID
(PID) as the target. Access was checked using the `ptrace_may_access()`
helper to achieve security parity with CPU-side debugging.

In v4, this has been changed to connect to a DRM client, using a file
descriptor as the target. This approach eliminates the need for the
`ptrace_may_access()` symbol export, as access control is now managed
through the debugger process's access to the file descriptor. For example,
accessing a remote DRM client requires the debugger process to
successfully call `pidfd_getfd()` to obtain a duplicate of the target
file descriptor. The 1:1 mapping between DRM clients and their debuggers
eliminates the need for `EVENT_OPEN` and simplifies overall connection
tracking.
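
As an illustration of the access path described above, the following is a
minimal userspace sketch of how a debugger could obtain a duplicate of a
file descriptor owned by another process. It uses only the generic
`pidfd_open()` and `pidfd_getfd()` syscalls; the helper name
`dup_remote_fd()` is hypothetical and not part of this series. How the
resulting descriptor is then passed to the eudebug connect interface is
not shown here. Per pidfd_getfd(2), the kernel permits the duplication
only if the caller passes a ptrace attach access check against the target.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback syscall numbers for older libc headers (x86-64 and arm64). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Hypothetical helper: duplicate file descriptor `target_fd` of process
 * `pid` into the calling process.
 *
 * Returns the new local fd on success, or -1 with errno set
 * (for example EPERM if the ptrace access check fails).
 */
int dup_remote_fd(pid_t pid, int target_fd)
{
	int pidfd, fd, saved_errno;

	/* Get a stable handle to the target process. */
	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -1;

	/* Ask the kernel to duplicate the target's fd into our fd table. */
	fd = (int)syscall(SYS_pidfd_getfd, pidfd, target_fd, 0);

	/* Preserve errno from pidfd_getfd across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;

	return fd;
}
```

A debugger would call `dup_remote_fd(target_pid, target_drm_fd)` and, on
success, use the returned descriptor as the connection target. If the call
fails, the debugger has no access to the DRM client.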

#### 2. ELF binaries not held in kernel memory

In v4, debug data is delivered as a VM bind 'OP_ADD_DEBUG_DATA' extension.
The ELF binaries are no longer stored within the Xe KMD but are instead
kept in a file. The file path is passed as part of an extension in
the newly introduced 'OP_ADD_DEBUG_DATA' VM bind operation. Alternatively,
pseudo-paths can be used to annotate special address ranges, similar to
/proc/<pid>/maps.

#### 3. Debug metadata not carried in VMA struct

Instead of attaching debug data to the VMA created by 'OP_MAP',
we introduce separate ops for managing the metadata.
Debug data is no longer held in the VMA struct. xe_vm contains a
list of all associated debug data.
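
The idea of a per-VM list of debug data, kept separate from the VMAs, can
be sketched in plain C as follows. This is a simplified illustration under
stated assumptions, not the driver's implementation: all struct and
function names here are hypothetical, and the real code uses kernel list
primitives and locking that are omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: one annotated GPU virtual address range. */
struct debug_data_entry {
	uint64_t addr;			/* start of the range */
	uint64_t range;			/* length of the range in bytes */
	const char *path;		/* file path or pseudo-path */
	struct debug_data_entry *next;	/* next entry in the per-VM list */
};

/* Hypothetical stand-in for the list head held by the VM. */
struct vm_debug_data {
	struct debug_data_entry *head;
};

/* Analogue of an "add debug data" op: link the entry into the VM's list. */
void debug_data_add(struct vm_debug_data *vm, struct debug_data_entry *e)
{
	e->next = vm->head;
	vm->head = e;
}

/*
 * Analogue of a "remove debug data" op: unlink the entry that exactly
 * matches the given range. Returns 0 on success, -1 if no entry matches.
 */
int debug_data_remove(struct vm_debug_data *vm, uint64_t addr, uint64_t range)
{
	struct debug_data_entry **pp;

	for (pp = &vm->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->addr == addr && (*pp)->range == range) {
			struct debug_data_entry *e = *pp;

			*pp = e->next;
			e->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Find the entry covering `addr`, or NULL if the address is not annotated. */
const struct debug_data_entry *
debug_data_lookup(const struct vm_debug_data *vm, uint64_t addr)
{
	const struct debug_data_entry *e;

	for (e = vm->head; e; e = e->next) {
		/* Written to avoid overflow in addr + range. */
		if (addr >= e->addr && addr - e->addr < e->range)
			return e;
	}
	return NULL;
}
```

Because the entries live in their own list, adding or removing an
annotation does not touch the mappings themselves, which is the separation
this change aims for.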

#### 4. Reading debug data via debugfs

This revision introduces the possibility of accessing debug data through
per-client debugfs entries. The intent is to provide an interface similar
to '/proc/<pid>/maps'.

### Supported Hardware with v5
- Lunarlake (LNL)
- Battlemage (BMG)
- Pantherlake (PTL)

The code for this submission can be found at:
https://gitlab.freedesktop.org/miku/kernel/-/tree/eudebug-v5

Christoph Manszewski (5):
  drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops
  drm/xe/eudebug: Introduce vm bind and vm bind debug data events
  drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test
  drm/xe: Implement SR-IOV and eudebug exclusivity
  drm/xe: Add xe_client_debugfs and introduce debug_data file

Dominik Grzegorzek (5):
  drm/xe/eudebug: Introduce exec_queue events
  drm/xe: Add EUDEBUG_ENABLE exec queue property
  drm/xe/eudebug: hw enablement for eudebug
  drm/xe/eudebug: Introduce EU control interface
  drm/xe/eudebug: Introduce per device attention scan worker

Gwan-gyeong Mun (4):
  drm/xe/eudebug: Add read/count/compare helper for eu attention
  drm/xe/eudebug: Introduce EU pagefault handling interface
  drm/xe/vm: Support for adding null page VMA to VM on request
  drm/xe/eudebug: Enable EU pagefault handling

Mika Kuoppala (6):
  drm/xe/eudebug: Introduce eudebug interface
  drm/xe/eudebug: Introduce discovery for resources
  drm/xe/eudebug: Add UFENCE events with acks
  drm/xe/eudebug: vm open/pread/pwrite
  drm/xe/eudebug: userptr vm pread/pwrite
  drm/xe/eudebug: Mark guc contexts as debuggable

 drivers/gpu/drm/xe/Kconfig                  |   10 +
 drivers/gpu/drm/xe/Makefile                 |    7 +-
 drivers/gpu/drm/xe/abi/guc_actions_abi.h    |    5 +
 drivers/gpu/drm/xe/regs/xe_engine_regs.h    |    7 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h        |   43 +
 drivers/gpu/drm/xe/tests/xe_eudebug.c       |  189 ++
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c |    5 +
 drivers/gpu/drm/xe/xe_client_debugfs.c      |  118 +
 drivers/gpu/drm/xe/xe_client_debugfs.h      |   19 +
 drivers/gpu/drm/xe/xe_debug_data.c          |  279 +++
 drivers/gpu/drm/xe/xe_debug_data.h          |   22 +
 drivers/gpu/drm/xe/xe_debug_data_types.h    |   25 +
 drivers/gpu/drm/xe/xe_device.c              |   30 +-
 drivers/gpu/drm/xe/xe_device.h              |   42 +
 drivers/gpu/drm/xe/xe_device_types.h        |   41 +-
 drivers/gpu/drm/xe/xe_eudebug.c             | 2360 +++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug.h             |  157 ++
 drivers/gpu/drm/xe/xe_eudebug_hw.c          |  798 +++++++
 drivers/gpu/drm/xe/xe_eudebug_hw.h          |   36 +
 drivers/gpu/drm/xe/xe_eudebug_pagefault.c   |  391 +++
 drivers/gpu/drm/xe/xe_eudebug_pagefault.h   |   15 +
 drivers/gpu/drm/xe/xe_eudebug_types.h       |  232 ++
 drivers/gpu/drm/xe/xe_eudebug_vm.c          |  434 ++++
 drivers/gpu/drm/xe/xe_eudebug_vm.h          |    8 +
 drivers/gpu/drm/xe/xe_exec.c                |    2 +-
 drivers/gpu/drm/xe/xe_exec_queue.c          |   51 +-
 drivers/gpu/drm/xe/xe_exec_queue.h          |    2 +
 drivers/gpu/drm/xe/xe_exec_queue_types.h    |    7 +
 drivers/gpu/drm/xe/xe_gt.c                  |    1 +
 drivers/gpu/drm/xe/xe_gt_debug.c            |  243 ++
 drivers/gpu/drm/xe/xe_gt_debug.h            |   47 +
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   80 +-
 drivers/gpu/drm/xe/xe_guc_submit.c          |    4 +
 drivers/gpu/drm/xe/xe_hw_engine.h           |   14 +
 drivers/gpu/drm/xe/xe_lrc.c                 |   10 +
 drivers/gpu/drm/xe/xe_oa.c                  |    3 +-
 drivers/gpu/drm/xe/xe_pci_sriov.c           |   10 +
 drivers/gpu/drm/xe/xe_reg_sr.c              |   21 +-
 drivers/gpu/drm/xe/xe_reg_sr.h              |    4 +-
 drivers/gpu/drm/xe/xe_reg_whitelist.c       |    2 +-
 drivers/gpu/drm/xe/xe_rtp.c                 |    2 +-
 drivers/gpu/drm/xe/xe_sync.c                |   47 +-
 drivers/gpu/drm/xe/xe_sync.h                |    8 +-
 drivers/gpu/drm/xe/xe_sync_types.h          |   28 +-
 drivers/gpu/drm/xe/xe_userptr.c             |    4 +
 drivers/gpu/drm/xe/xe_userptr.h             |   32 +
 drivers/gpu/drm/xe/xe_vm.c                  |  215 +-
 drivers/gpu/drm/xe/xe_vm.h                  |    2 +
 drivers/gpu/drm/xe/xe_vm_types.h            |   32 +
 drivers/gpu/drm/xe/xe_wa_oob.rules          |    4 +
 include/uapi/drm/xe_drm.h                   |   59 +
 include/uapi/drm/xe_drm_eudebug.h           |  229 ++
 52 files changed, 6382 insertions(+), 54 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/tests/xe_eudebug.c
 create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.c
 create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.h
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data.c
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data.h
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data_types.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_pagefault.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_pagefault.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_types.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.h
 create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.c
 create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.h
 create mode 100644 include/uapi/drm/xe_drm_eudebug.h

-- 
2.43.0



end of thread, other threads:[~2025-11-19 21:33 UTC | newest]

Thread overview: 31+ messages
2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
2025-10-06 11:16 ` [PATCH 01/20] drm/xe/eudebug: Introduce eudebug interface Mika Kuoppala
2025-10-06 11:16 ` [PATCH 02/20] drm/xe/eudebug: Introduce discovery for resources Mika Kuoppala
2025-10-06 11:16 ` [PATCH 03/20] drm/xe/eudebug: Introduce exec_queue events Mika Kuoppala
2025-10-06 11:16 ` [PATCH 04/20] drm/xe: Add EUDEBUG_ENABLE exec queue property Mika Kuoppala
2025-10-06 11:16 ` [PATCH 05/20] drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops Mika Kuoppala
2025-10-06 11:16 ` [PATCH 06/20] drm/xe/eudebug: Introduce vm bind and vm bind debug data events Mika Kuoppala
2025-10-06 11:16 ` [PATCH 07/20] drm/xe/eudebug: Add UFENCE events with acks Mika Kuoppala
2025-10-06 11:16 ` [PATCH 08/20] drm/xe/eudebug: vm open/pread/pwrite Mika Kuoppala
2025-10-06 11:16 ` [PATCH 09/20] drm/xe/eudebug: userptr vm pread/pwrite Mika Kuoppala
2025-10-06 11:17 ` [PATCH 10/20] drm/xe/eudebug: hw enablement for eudebug Mika Kuoppala
2025-10-06 11:17 ` [PATCH 11/20] drm/xe/eudebug: Introduce EU control interface Mika Kuoppala
2025-10-06 11:17 ` [PATCH 12/20] drm/xe/eudebug: Introduce per device attention scan worker Mika Kuoppala
2025-10-06 11:17 ` [PATCH 13/20] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Mika Kuoppala
2025-10-06 11:17 ` [PATCH 14/20] drm/xe: Implement SR-IOV and eudebug exclusivity Mika Kuoppala
2025-10-06 11:17 ` [PATCH 15/20] drm/xe: Add xe_client_debugfs and introduce debug_data file Mika Kuoppala
2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
2025-10-06 18:35   ` Matthew Brost
2025-10-20 12:56     ` Mika Kuoppala
2025-10-20 12:53   ` Mika Kuoppala
2025-11-18 14:48   ` Mika Kuoppala
2025-11-19 21:33     ` Daniele Ceraolo Spurio
2025-10-06 11:17 ` [PATCH 17/20] drm/xe/eudebug: Add read/count/compare helper for eu attention Mika Kuoppala
2025-10-06 11:17 ` [PATCH 18/20] drm/xe/eudebug: Introduce EU pagefault handling interface Mika Kuoppala
2025-10-06 11:17 ` [PATCH 19/20] drm/xe/vm: Support for adding null page VMA to VM on request Mika Kuoppala
2025-10-06 11:17 ` [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling Mika Kuoppala
2025-10-06 18:43   ` Matthew Brost
2025-10-06 12:30 ` ✗ CI.checkpatch: warning for Intel Xe GPU Debug Support (eudebug) v5 Patchwork
2025-10-06 12:31 ` ✓ CI.KUnit: success " Patchwork
2025-10-06 13:14 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-06 15:53 ` ✗ Xe.CI.Full: failure " Patchwork
