Intel-XE Archive on lore.kernel.org
* [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5
@ 2025-10-06 11:16 Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 01/20] drm/xe/eudebug: Introduce eudebug interface Mika Kuoppala
                   ` (23 more replies)
  0 siblings, 24 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

Hi,

This is the v5 patch series for Intel Xe GPU debug support (eudebug).

As the initial feedback on v4 was positive, we have brought the
remaining feature from v3, namely page fault support, into this
series.

This series continues from the following previous submissions:
- v1: https://lists.freedesktop.org/archives/intel-xe/2024-July/043605.html
- v2: https://lists.freedesktop.org/archives/intel-xe/2024-October/052260.html
- v3: https://lists.freedesktop.org/archives/intel-xe/2024-December/061476.html
- v4: https://lists.freedesktop.org/archives/intel-xe/2025-August/091645.html

Known shortcomings: with multiple debug clients, page faults can
race with how attention polling is stopped/started, which can lead
to missed attentions.

### Major Changes from v4

v4 omitted page fault support; it has been reworked from v3 and is
included in this series.

### Major Changes from v3

#### 1. Elimination of ptrace_may_access() and pid

In previous revisions, the connection attempt was made using the process ID
(PID) as the target. Access was checked using the `ptrace_may_access()`
helper to achieve security parity with CPU-side debugging.

In v4, this has been changed to connect to a DRM client, using a file
descriptor as the target. This approach eliminates the need for the
`ptrace_may_access()` symbol export, as access control is now managed
through the debugger process's access to the file descriptor. For example,
accessing a remote DRM client requires the debugger process to
successfully call `pidfd_getfd()` to obtain a duplicate of the target
file descriptor. The 1:1 mapping between DRM clients and their debuggers
eliminates the need for `EVENT_OPEN` and simplifies overall connection
tracking.
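
For illustration, a debugger-side connection could look roughly like the
sketch below. This is not part of the series and omits error handling;
the target's pid and DRM fd number are assumed to be already known to
the debugger, and the uapi header path may differ per installation.

```c
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <drm/xe_drm.h>	/* uapi header introduced by this series */

static int eudebug_connect(pid_t target_pid, int target_drm_fd)
{
	struct drm_xe_eudebug_connect conn = {};
	int pidfd, drm_fd, debug_fd;

	pidfd = syscall(SYS_pidfd_open, target_pid, 0);
	if (pidfd < 0)
		return -1;

	/* pidfd_getfd() enforces ptrace-like access to the target */
	drm_fd = syscall(SYS_pidfd_getfd, pidfd, target_drm_fd, 0);
	close(pidfd);
	if (drm_fd < 0)
		return -1;

	conn.fd = drm_fd;	/* target DRM client */
	conn.version = 0;	/* kernel reports the current ABI version */

	/* returns an anonymous fd used to read eudebug events */
	debug_fd = ioctl(drm_fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, &conn);

	return debug_fd;
}
```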

#### 2. ELF binaries not held in kernel memory

In v4, debug data is delivered via the newly introduced 'OP_ADD_DEBUG_DATA'
VM bind operation. The ELF binaries are no longer stored within the Xe KMD
but are instead kept in a file, whose path is passed as part of an extension
to the 'OP_ADD_DEBUG_DATA' operation. Alternatively, pseudo-paths can be
used to annotate special address ranges, similar to /proc/<pid>/maps.

#### 3. Debug metadata not carried in VMA struct

Instead of attaching debug data to the VMA created by 'OP_MAP',
we introduce separate ops for managing the metadata.
Debug data is no longer held in the VMA struct; instead, xe_vm holds a
list of all associated debug data.

#### 4. Reading debug data via debugfs

This revision introduces the ability to access debug data through
per-client debugfs entries. The intent is to provide an interface
similar to '/proc/<pid>/maps'.

### Supported Hardware with v5
- Lunarlake (LNL)
- Battlemage (BMG)
- Pantherlake (PTL)

The code for this submission can be found at:
https://gitlab.freedesktop.org/miku/kernel/-/tree/eudebug-v5

Christoph Manszewski (5):
  drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops
  drm/xe/eudebug: Introduce vm bind and vm bind debug data events
  drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test
  drm/xe: Implement SR-IOV and eudebug exclusivity
  drm/xe: Add xe_client_debugfs and introduce debug_data file

Dominik Grzegorzek (5):
  drm/xe/eudebug: Introduce exec_queue events
  drm/xe: Add EUDEBUG_ENABLE exec queue property
  drm/xe/eudebug: hw enablement for eudebug
  drm/xe/eudebug: Introduce EU control interface
  drm/xe/eudebug: Introduce per device attention scan worker

Gwan-gyeong Mun (4):
  drm/xe/eudebug: Add read/count/compare helper for eu attention
  drm/xe/eudebug: Introduce EU pagefault handling interface
  drm/xe/vm: Support for adding null page VMA to VM on request
  drm/xe/eudebug: Enable EU pagefault handling

Mika Kuoppala (6):
  drm/xe/eudebug: Introduce eudebug interface
  drm/xe/eudebug: Introduce discovery for resources
  drm/xe/eudebug: Add UFENCE events with acks
  drm/xe/eudebug: vm open/pread/pwrite
  drm/xe/eudebug: userptr vm pread/pwrite
  drm/xe/eudebug: Mark guc contexts as debuggable

 drivers/gpu/drm/xe/Kconfig                  |   10 +
 drivers/gpu/drm/xe/Makefile                 |    7 +-
 drivers/gpu/drm/xe/abi/guc_actions_abi.h    |    5 +
 drivers/gpu/drm/xe/regs/xe_engine_regs.h    |    7 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h        |   43 +
 drivers/gpu/drm/xe/tests/xe_eudebug.c       |  189 ++
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c |    5 +
 drivers/gpu/drm/xe/xe_client_debugfs.c      |  118 +
 drivers/gpu/drm/xe/xe_client_debugfs.h      |   19 +
 drivers/gpu/drm/xe/xe_debug_data.c          |  279 +++
 drivers/gpu/drm/xe/xe_debug_data.h          |   22 +
 drivers/gpu/drm/xe/xe_debug_data_types.h    |   25 +
 drivers/gpu/drm/xe/xe_device.c              |   30 +-
 drivers/gpu/drm/xe/xe_device.h              |   42 +
 drivers/gpu/drm/xe/xe_device_types.h        |   41 +-
 drivers/gpu/drm/xe/xe_eudebug.c             | 2360 +++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug.h             |  157 ++
 drivers/gpu/drm/xe/xe_eudebug_hw.c          |  798 +++++++
 drivers/gpu/drm/xe/xe_eudebug_hw.h          |   36 +
 drivers/gpu/drm/xe/xe_eudebug_pagefault.c   |  391 +++
 drivers/gpu/drm/xe/xe_eudebug_pagefault.h   |   15 +
 drivers/gpu/drm/xe/xe_eudebug_types.h       |  232 ++
 drivers/gpu/drm/xe/xe_eudebug_vm.c          |  434 ++++
 drivers/gpu/drm/xe/xe_eudebug_vm.h          |    8 +
 drivers/gpu/drm/xe/xe_exec.c                |    2 +-
 drivers/gpu/drm/xe/xe_exec_queue.c          |   51 +-
 drivers/gpu/drm/xe/xe_exec_queue.h          |    2 +
 drivers/gpu/drm/xe/xe_exec_queue_types.h    |    7 +
 drivers/gpu/drm/xe/xe_gt.c                  |    1 +
 drivers/gpu/drm/xe/xe_gt_debug.c            |  243 ++
 drivers/gpu/drm/xe/xe_gt_debug.h            |   47 +
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   80 +-
 drivers/gpu/drm/xe/xe_guc_submit.c          |    4 +
 drivers/gpu/drm/xe/xe_hw_engine.h           |   14 +
 drivers/gpu/drm/xe/xe_lrc.c                 |   10 +
 drivers/gpu/drm/xe/xe_oa.c                  |    3 +-
 drivers/gpu/drm/xe/xe_pci_sriov.c           |   10 +
 drivers/gpu/drm/xe/xe_reg_sr.c              |   21 +-
 drivers/gpu/drm/xe/xe_reg_sr.h              |    4 +-
 drivers/gpu/drm/xe/xe_reg_whitelist.c       |    2 +-
 drivers/gpu/drm/xe/xe_rtp.c                 |    2 +-
 drivers/gpu/drm/xe/xe_sync.c                |   47 +-
 drivers/gpu/drm/xe/xe_sync.h                |    8 +-
 drivers/gpu/drm/xe/xe_sync_types.h          |   28 +-
 drivers/gpu/drm/xe/xe_userptr.c             |    4 +
 drivers/gpu/drm/xe/xe_userptr.h             |   32 +
 drivers/gpu/drm/xe/xe_vm.c                  |  215 +-
 drivers/gpu/drm/xe/xe_vm.h                  |    2 +
 drivers/gpu/drm/xe/xe_vm_types.h            |   32 +
 drivers/gpu/drm/xe/xe_wa_oob.rules          |    4 +
 include/uapi/drm/xe_drm.h                   |   59 +
 include/uapi/drm/xe_drm_eudebug.h           |  229 ++
 52 files changed, 6382 insertions(+), 54 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/tests/xe_eudebug.c
 create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.c
 create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.h
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data.c
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data.h
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data_types.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_pagefault.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_pagefault.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_types.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.h
 create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.c
 create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.h
 create mode 100644 include/uapi/drm/xe_drm_eudebug.h

-- 
2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/20] drm/xe/eudebug: Introduce eudebug interface
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 02/20] drm/xe/eudebug: Introduce discovery for resources Mika Kuoppala
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala, Maarten Lankhorst,
	Dominik Grzegorzek, Andi Shyti, Matt Roper,
	Zbigniew Kempczyński, Jonathan Cavitt

This patch adds the eudebug interface to the Xe driver, enabling
user-space debuggers (e.g., GDB) to track and interact with GPU resources
of a DRM client. Debuggers can inspect or modify these resources,
for example, to locate ISA/ELF sections and install breakpoints in a
shader's instruction stream.

A debugger opens a connection to the Xe driver via a DRM ioctl, specifying
the target DRM client's file descriptor. This returns an anonymous file
descriptor for the connection, which can be used to listen for resource
creation/destruction events. The same file descriptor can also be used to
receive hardware state change events and control execution flow by
interrupting EU threads on the GPU (in follow-up patches).
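
To illustrate the intended usage (sketch only, not part of this patch;
error handling trimmed and the uapi header path is an assumption), a
debugger could read a single event from the returned fd roughly like:

#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/xe_drm.h>	/* pulls in xe_drm_eudebug.h added by this patch */

static struct drm_xe_eudebug_event *read_one_event(int debug_fd, __u32 max_len)
{
	struct drm_xe_eudebug_event *ev;

	if (max_len < sizeof(*ev))
		return NULL;

	ev = calloc(1, max_len);
	if (!ev)
		return NULL;

	/* type marks this as a read; len tells the kernel our buffer size */
	ev->type = DRM_XE_EUDEBUG_EVENT_READ;
	ev->len = max_len;

	if (ioctl(debug_fd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, ev)) {
		/* on -EMSGSIZE the required size is written back to ev->len */
		free(ev);
		return NULL;
	}

	/* e.g. a DRM_XE_EUDEBUG_EVENT_VM with the CREATE or DESTROY flag */
	return ev;
}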

This patch introduces the eudebug connection and event queuing,
adding VM create/destroy events as a baseline.
Additional events and hardware control for full debugger operation are
needed and will be introduced in follow-up patches.

The resource tracking components are inspired by Maciej Patelczyk's work on
resource handling for i915. Chris Wilson suggested a two-way mapping
approach, which simplifies using the resource map as definitive
bookkeeping for resources relayed to the debugger during the discovery
phase (in a follow-up patch).

v2: - Kconfig support (Matthew)
    - ptraced access control (Lucas)
    - pass expected event length to user (Zbigniew)
    - only track long running VMs
    - checkpatch (Tilak)
    - include order (Andrzej)
    - 32bit fixes (Andrzej)
    - cleaner get_task_struct
    - remove xa_array and use clients.list for tracking (Mika)

v3: - adapt to removal of clients.lock (Mika)
    - create_event cleanup (Christoph)

v4: - add proper header guards (Christoph)
    - better read_event fault handling (Christoph, Mika)
    - simplify attach (Mika)
    - connect using target file descriptors
    - avoid use of event->seqno after queueing as it can UAF (Mika)
    - use drmm for eudebug_fini (Maciej)
    - squash dynamic enable

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
---
 drivers/gpu/drm/xe/Kconfig            |   10 +
 drivers/gpu/drm/xe/Makefile           |    3 +
 drivers/gpu/drm/xe/xe_device.c        |   14 +
 drivers/gpu/drm/xe/xe_device_types.h  |   32 +-
 drivers/gpu/drm/xe/xe_eudebug.c       | 1038 +++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug.h       |   64 ++
 drivers/gpu/drm/xe/xe_eudebug_types.h |  128 +++
 drivers/gpu/drm/xe/xe_vm.c            |    7 +-
 include/uapi/drm/xe_drm.h             |   21 +
 include/uapi/drm/xe_drm_eudebug.h     |   77 ++
 10 files changed, 1392 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_types.h
 create mode 100644 include/uapi/drm/xe_drm_eudebug.h

diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index 7219f6b884b6..a8bfce19aeae 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -129,6 +129,16 @@ config DRM_XE_FORCE_PROBE
 
 	  Use "!*" to block the probe of the driver for all known devices.
 
+config DRM_XE_EUDEBUG
+	bool "Enable gdb debugger support (eudebug)"
+	depends on DRM_XE
+	default y
+	help
+	  Choose this option if you want to add support for a debugger (gdb)
+	  attaching to a process using Xe and debugging its gpu/gpgpu programs.
+	  With debugger support, Xe will provide an interface for a debugger
+	  process to track, inspect and modify resources.
+
 menu "drm/Xe Debugging"
 depends on DRM_XE
 depends on EXPERT
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 84321fad3265..bfc0f18ab50e 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -145,6 +145,9 @@ xe-$(CONFIG_I2C)	+= xe_i2c.o
 xe-$(CONFIG_DRM_XE_GPUSVM) += xe_svm.o
 xe-$(CONFIG_DRM_GPUSVM) += xe_userptr.o
 
+# debugging shaders with gdb (eudebug) support
+xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o
+
 # graphics hardware monitoring (HWMON) support
 xe-$(CONFIG_HWMON) += xe_hwmon.o
 
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 386940323630..55b7d5e064be 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -31,6 +31,7 @@
 #include "xe_dma_buf.h"
 #include "xe_drm_client.h"
 #include "xe_drv.h"
+#include "xe_eudebug.h"
 #include "xe_exec.h"
 #include "xe_exec_queue.h"
 #include "xe_force_wake.h"
@@ -104,6 +105,11 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 	mutex_init(&xef->exec_queue.lock);
 	xa_init_flags(&xef->exec_queue.xa, XA_FLAGS_ALLOC1);
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	mutex_init(&xef->eudebug.lock);
+	INIT_LIST_HEAD(&xef->eudebug.target_link);
+#endif
+
 	file->driver_priv = xef;
 	kref_init(&xef->refcount);
 
@@ -126,6 +132,9 @@ static void xe_file_destroy(struct kref *ref)
 	xa_destroy(&xef->vm.xa);
 	mutex_destroy(&xef->vm.lock);
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	mutex_destroy(&xef->eudebug.lock);
+#endif
 	xe_drm_client_put(xef->client);
 	kfree(xef->process_name);
 	kfree(xef);
@@ -167,6 +176,8 @@ static void xe_file_close(struct drm_device *dev, struct drm_file *file)
 
 	xe_pm_runtime_get(xe);
 
+	xe_eudebug_file_close(xef);
+
 	/*
 	 * No need for exec_queue.lock here as there is no contention for it
 	 * when FD is closing as IOCTLs presumably can't be modifying the
@@ -208,6 +219,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_QUERY_MEM_RANGE_ATTRS, xe_vm_query_vmas_attrs_ioctl,
 			  DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_EUDEBUG_CONNECT, xe_eudebug_connect_ioctl, DRM_RENDER_ALLOW),
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
@@ -965,6 +977,8 @@ int xe_device_probe(struct xe_device *xe)
 	if (err)
 		goto err_unregister_display;
 
+	xe_eudebug_init(xe);
+
 	return devm_add_action_or_reset(xe->drm.dev, xe_device_sanitize, xe);
 
 err_unregister_display:
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 1d2718b70a5c..a369e949885d 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -13,6 +13,7 @@
 #include <drm/ttm/ttm_device.h>
 
 #include "xe_devcoredump_types.h"
+#include "xe_eudebug_types.h"
 #include "xe_heci_gsc.h"
 #include "xe_late_bind_fw_types.h"
 #include "xe_lmtt_types.h"
@@ -614,13 +615,29 @@ struct xe_device {
 		u8 region_mask;
 	} psmi;
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	/** @eudebug: debugger connection list and globals for the device */
+	struct {
+		/** @eudebug.session_count: session counter to track connections */
+		u64 session_count;
+
+		/** @eudebug.state: current state of the debugging functionality */
+		enum xe_eudebug_state state;
+
+		/** @eudebug.targets: list of target xe_files */
+		struct list_head targets;
+
+		/** @eudebug.lock: protects state and targets */
+		struct mutex lock;
+	} eudebug;
+#endif
+
 #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
 	/** @g2g_test_array: for testing G2G communications */
 	u32 *g2g_test_array;
 	/** @g2g_test_count: for testing G2G communications */
 	atomic_t g2g_test_count;
 #endif
-
 	/* private: */
 
 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
@@ -706,6 +723,19 @@ struct xe_file {
 
 	/** @refcount: ref count of this xe file */
 	struct kref refcount;
+
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	struct {
+		/** @eudebug.debugger: the debugger connection into this xe_file */
+		struct xe_eudebug *debugger;
+
+		/** @eudebug.lock: protecting debugger */
+		struct mutex lock;
+
+		/** @eudebug.target_link: link into xe_device.eudebug.targets */
+		struct list_head target_link;
+	} eudebug;
+#endif
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
new file mode 100644
index 000000000000..4051c7548187
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -0,0 +1,1038 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/delay.h>
+#include <linux/poll.h>
+#include <linux/uaccess.h>
+
+#include <drm/drm_managed.h>
+#include <uapi/drm/xe_drm.h>
+
+#include "xe_assert.h"
+#include "xe_device.h"
+#include "xe_eudebug.h"
+#include "xe_eudebug_types.h"
+#include "xe_macros.h"
+#include "xe_vm.h"
+
+/*
+ * If no event read by userspace is detected during this period, assume a
+ * userspace problem and disconnect the debugger to allow forward progress.
+ */
+#define XE_EUDEBUG_NO_READ_DETECTED_TIMEOUT_MS (25 * 1000)
+
+#define cast_event(T, event) container_of((event), typeof(*(T)), base)
+
+static struct drm_xe_eudebug_event *
+event_fifo_pending(struct xe_eudebug *d)
+{
+	struct drm_xe_eudebug_event *event;
+
+	if (kfifo_peek(&d->events.fifo, &event))
+		return event;
+
+	return NULL;
+}
+
+/*
+ * This is racy as we don't take the lock for the read, but all the
+ * callsites can handle the race so we can live without the lock.
+ */
+__no_kcsan
+static unsigned int
+event_fifo_num_events_peek(const struct xe_eudebug * const d)
+{
+	return kfifo_len(&d->events.fifo);
+}
+
+static bool
+xe_eudebug_detached(struct xe_eudebug *d)
+{
+	bool connected;
+
+	spin_lock(&d->target.lock);
+	connected = !!d->target.xef;
+	spin_unlock(&d->target.lock);
+
+	return !connected;
+}
+
+static unsigned int
+event_fifo_has_events(struct xe_eudebug *d)
+{
+	/* Allow all waiters to proceed to check their state */
+	if (xe_eudebug_detached(d))
+		return 1;
+
+	return event_fifo_num_events_peek(d);
+}
+
+static const struct rhashtable_params rhash_res = {
+	.head_offset = offsetof(struct xe_eudebug_handle, rh_head),
+	.key_len = sizeof_field(struct xe_eudebug_handle, key),
+	.key_offset = offsetof(struct xe_eudebug_handle, key),
+	.automatic_shrinking = true,
+};
+
+static struct xe_eudebug_resource *
+resource_from_type(struct xe_eudebug_resources * const res, const int t)
+{
+	return &res->rt[t];
+}
+
+static struct xe_eudebug_resources *
+xe_eudebug_resources_alloc(void)
+{
+	struct xe_eudebug_resources *res;
+	int err;
+	int i;
+
+	res = kzalloc(sizeof(*res), GFP_ATOMIC);
+	if (!res)
+		return ERR_PTR(-ENOMEM);
+
+	mutex_init(&res->lock);
+
+	for (i = 0; i < XE_EUDEBUG_RES_TYPE_COUNT; i++) {
+		xa_init_flags(&res->rt[i].xa, XA_FLAGS_ALLOC1);
+		err = rhashtable_init(&res->rt[i].rh, &rhash_res);
+
+		if (err)
+			break;
+	}
+
+	if (err) {
+		while (i--) {
+			xa_destroy(&res->rt[i].xa);
+			rhashtable_destroy(&res->rt[i].rh);
+		}
+
+		kfree(res);
+		return ERR_PTR(err);
+	}
+
+	return res;
+}
+
+static void res_free_fn(void *ptr, void *arg)
+{
+	XE_WARN_ON(ptr);
+	kfree(ptr);
+}
+
+static void
+xe_eudebug_destroy_resources(struct xe_eudebug *d)
+{
+	struct xe_eudebug_resources *res = d->res;
+	struct xe_eudebug_handle *h;
+	unsigned long j;
+	int i;
+	int err;
+
+	mutex_lock(&res->lock);
+	for (i = 0; i < XE_EUDEBUG_RES_TYPE_COUNT; i++) {
+		struct xe_eudebug_resource *r = &res->rt[i];
+
+		xa_for_each(&r->xa, j, h) {
+			struct xe_eudebug_handle *t;
+
+			err = rhashtable_remove_fast(&r->rh,
+						     &h->rh_head,
+						     rhash_res);
+			xe_eudebug_assert(d, !err);
+			t = xa_erase(&r->xa, h->id);
+			xe_eudebug_assert(d, t == h);
+			kfree(t);
+		}
+	}
+	mutex_unlock(&res->lock);
+
+	for (i = 0; i < XE_EUDEBUG_RES_TYPE_COUNT; i++) {
+		struct xe_eudebug_resource *r = &res->rt[i];
+
+		rhashtable_free_and_destroy(&r->rh, res_free_fn, NULL);
+		xe_eudebug_assert(d, xa_empty(&r->xa));
+		xa_destroy(&r->xa);
+	}
+
+	mutex_destroy(&res->lock);
+
+	kfree(res);
+}
+
+static void xe_eudebug_free(struct kref *ref)
+{
+	struct xe_eudebug *d = container_of(ref, typeof(*d), ref);
+	struct drm_xe_eudebug_event *event;
+
+	xe_assert(d->xe, xe_eudebug_detached(d));
+
+	while (kfifo_get(&d->events.fifo, &event))
+		kfree(event);
+
+	xe_eudebug_destroy_resources(d);
+	XE_WARN_ON(d->target.xef);
+
+	xe_eudebug_assert(d, !kfifo_len(&d->events.fifo));
+
+	kfree(d);
+}
+
+static void xe_eudebug_put(struct xe_eudebug *d)
+{
+	kref_put(&d->ref, xe_eudebug_free);
+}
+
+static void remove_debugger(struct xe_file *xef)
+{
+	struct xe_eudebug *d;
+
+	if (XE_WARN_ON(!xef))
+		return;
+
+	mutex_lock(&xef->eudebug.lock);
+	d = xef->eudebug.debugger;
+	if (d)
+		xef->eudebug.debugger = NULL;
+	mutex_unlock(&xef->eudebug.lock);
+
+	if (d) {
+		struct xe_device *xe = d->xe;
+
+		mutex_lock(&xe->eudebug.lock);
+		list_del_init(&xef->eudebug.target_link);
+		mutex_unlock(&xe->eudebug.lock);
+
+		eu_dbg(d, "debugger removed");
+
+		xe_eudebug_put(d);
+	}
+}
+
+static bool xe_eudebug_detach(struct xe_device *xe,
+			      struct xe_eudebug *d,
+			      const int err)
+{
+	struct xe_file *target = NULL;
+
+	XE_WARN_ON(err > 0);
+
+	spin_lock(&d->target.lock);
+	if (d->target.xef) {
+		target = d->target.xef;
+		d->target.xef = NULL;
+		d->target.err = err;
+	}
+	spin_unlock(&d->target.lock);
+
+	if (!target)
+		return false;
+
+	eu_dbg(d, "session %lld detached with %d", d->session, err);
+
+	remove_debugger(target);
+	xe_file_put(target);
+
+	return true;
+}
+
+static int _xe_eudebug_disconnect(struct xe_eudebug *d,
+				  const int err)
+{
+	wake_up_all(&d->events.write_done);
+	wake_up_all(&d->events.read_done);
+
+	return xe_eudebug_detach(d->xe, d, err);
+}
+
+#define xe_eudebug_disconnect(_d, _err) ({ \
+	if (_xe_eudebug_disconnect((_d), (_err))) { \
+		if ((_err) == 0 || (_err) == -ETIMEDOUT) \
+			eu_dbg(d, "Session closed (%d)", (_err)); \
+		else \
+			eu_err(d, "Session disconnected, err = %d (%s:%d)", \
+			       (_err), __func__, __LINE__); \
+	} \
+})
+
+static struct xe_eudebug *
+xe_eudebug_get(struct xe_file *xef)
+{
+	struct xe_eudebug *d;
+
+	mutex_lock(&xef->eudebug.lock);
+	d = xef->eudebug.debugger;
+	if (d && !kref_get_unless_zero(&d->ref))
+		d = NULL;
+	mutex_unlock(&xef->eudebug.lock);
+
+	if (d && xe_eudebug_detached(d)) {
+		xe_eudebug_put(d);
+		return NULL;
+	}
+
+	return d;
+}
+
+static int xe_eudebug_queue_event(struct xe_eudebug *d,
+				  struct drm_xe_eudebug_event *event)
+{
+	const u64 wait_jiffies = msecs_to_jiffies(1000);
+	u64 last_read_detected_ts, last_head_seqno, start_ts;
+	const u64 event_seqno = event->seqno;
+
+	xe_eudebug_assert(d, event->len > sizeof(struct drm_xe_eudebug_event));
+	xe_eudebug_assert(d, event->type);
+	xe_eudebug_assert(d, event->type != DRM_XE_EUDEBUG_EVENT_READ);
+
+	start_ts = ktime_get();
+	last_read_detected_ts = start_ts;
+	last_head_seqno = 0;
+
+	do  {
+		struct drm_xe_eudebug_event *head;
+		u64 head_seqno;
+		bool was_queued;
+
+		if (xe_eudebug_detached(d))
+			break;
+
+		spin_lock(&d->events.lock);
+		head = event_fifo_pending(d);
+		if (head)
+			head_seqno = head->seqno;
+		else
+			head_seqno = 0;
+
+		was_queued = kfifo_in(&d->events.fifo, &event, 1);
+		spin_unlock(&d->events.lock);
+
+		wake_up_all(&d->events.write_done);
+
+		if (was_queued) {
+			eu_dbg(d, "queued event with seqno %lld (head %lld)\n",
+			       event_seqno, head_seqno);
+			event = NULL;
+			break;
+		}
+
+		XE_WARN_ON(!head_seqno);
+
+		/* If we detect progress, restart timeout */
+		if (last_head_seqno != head_seqno)
+			last_read_detected_ts = ktime_get();
+
+		last_head_seqno = head_seqno;
+
+		wait_event_interruptible_timeout(d->events.read_done,
+						 !kfifo_is_full(&d->events.fifo),
+						 wait_jiffies);
+
+	} while (ktime_ms_delta(ktime_get(), last_read_detected_ts) <
+		 XE_EUDEBUG_NO_READ_DETECTED_TIMEOUT_MS);
+
+	if (event) {
+		eu_dbg(d,
+		       "event %llu queue failed (blocked %lld ms, avail %d)",
+		       event ? event->seqno : 0,
+		       ktime_ms_delta(ktime_get(), start_ts),
+		       kfifo_avail(&d->events.fifo));
+
+		kfree(event);
+
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static struct xe_eudebug_handle *
+alloc_handle(const int type, const u64 key)
+{
+	struct xe_eudebug_handle *h;
+
+	h = kzalloc(sizeof(*h), GFP_ATOMIC);
+	if (!h)
+		return NULL;
+
+	h->key = key;
+
+	return h;
+}
+
+static struct xe_eudebug_handle *
+__find_handle(struct xe_eudebug_resource *r,
+	      const u64 key)
+{
+	struct xe_eudebug_handle *h;
+
+	h = rhashtable_lookup_fast(&r->rh,
+				   &key,
+				   rhash_res);
+	return h;
+}
+
+static int _xe_eudebug_add_handle(struct xe_eudebug *d,
+				  int type,
+				  void *p,
+				  u64 *seqno,
+				  int *handle)
+{
+	const u64 key = (uintptr_t)p;
+	struct xe_eudebug_resource *r;
+	struct xe_eudebug_handle *h, *o;
+	int err;
+
+	if (XE_WARN_ON(!p))
+		return -EINVAL;
+
+	if (xe_eudebug_detached(d))
+		return -ENOTCONN;
+
+	h = alloc_handle(type, key);
+	if (!h)
+		return -ENOMEM;
+
+	r = resource_from_type(d->res, type);
+
+	mutex_lock(&d->res->lock);
+	o = __find_handle(r, key);
+	if (!o) {
+		err = xa_alloc(&r->xa, &h->id, h, xa_limit_31b, GFP_KERNEL);
+
+		if (h->id >= INT_MAX) {
+			xa_erase(&r->xa, h->id);
+			err = -ENOSPC;
+		}
+
+		if (!err)
+			err = rhashtable_insert_fast(&r->rh,
+						     &h->rh_head,
+						     rhash_res);
+
+		if (err) {
+			xa_erase(&r->xa, h->id);
+		} else {
+			if (seqno)
+				*seqno = atomic_long_inc_return(&d->events.seqno);
+		}
+	} else {
+		xe_eudebug_assert(d, o->id);
+		err = -EEXIST;
+	}
+	mutex_unlock(&d->res->lock);
+
+	if (handle)
+		*handle = o ? o->id : h->id;
+
+	if (err) {
+		kfree(h);
+		XE_WARN_ON(err > 0);
+		return err;
+	}
+
+	xe_eudebug_assert(d, h->id);
+
+	return h->id;
+}
+
+static int xe_eudebug_add_handle(struct xe_eudebug *d,
+				 int type,
+				 void *p,
+				 u64 *seqno)
+{
+	int ret;
+
+	ret = _xe_eudebug_add_handle(d, type, p, seqno, NULL);
+
+	eu_dbg(d, "handle type %d handle %p added: %d\n", type, p, ret);
+
+	return ret;
+}
+
+static int _xe_eudebug_remove_handle(struct xe_eudebug *d, int type, void *p,
+				     u64 *seqno)
+{
+	const u64 key = (uintptr_t)p;
+	struct xe_eudebug_resource *r;
+	struct xe_eudebug_handle *h, *xa_h;
+	int ret;
+
+	if (XE_WARN_ON(!key))
+		return -EINVAL;
+
+	if (xe_eudebug_detached(d))
+		return -ENOTCONN;
+
+	r = resource_from_type(d->res, type);
+
+	mutex_lock(&d->res->lock);
+	h = __find_handle(r, key);
+	if (h) {
+		ret = rhashtable_remove_fast(&r->rh,
+					     &h->rh_head,
+					     rhash_res);
+		xe_eudebug_assert(d, !ret);
+		xa_h = xa_erase(&r->xa, h->id);
+		xe_eudebug_assert(d, xa_h == h);
+		if (!ret) {
+			ret = h->id;
+			if (seqno)
+				*seqno = atomic_long_inc_return(&d->events.seqno);
+		}
+	} else {
+		ret = -ENOENT;
+	}
+	mutex_unlock(&d->res->lock);
+
+	kfree(h);
+
+	xe_eudebug_assert(d, ret);
+
+	return ret;
+}
+
+static int xe_eudebug_remove_handle(struct xe_eudebug *d, int type, void *p,
+				    u64 *seqno)
+{
+	int ret;
+
+	ret = _xe_eudebug_remove_handle(d, type, p, seqno);
+
+	eu_dbg(d, "handle type %d handle %p removed: %d\n", type, p, ret);
+
+	return ret;
+}
+
+static struct drm_xe_eudebug_event *
+xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags,
+			u32 len)
+{
+	const u16 known_flags =
+		DRM_XE_EUDEBUG_EVENT_CREATE |
+		DRM_XE_EUDEBUG_EVENT_DESTROY |
+		DRM_XE_EUDEBUG_EVENT_STATE_CHANGE |
+		DRM_XE_EUDEBUG_EVENT_NEED_ACK;
+	struct drm_xe_eudebug_event *event;
+
+	BUILD_BUG_ON(type > XE_EUDEBUG_MAX_EVENT_TYPE);
+
+	xe_eudebug_assert(d, type <= XE_EUDEBUG_MAX_EVENT_TYPE);
+	xe_eudebug_assert(d, !(~known_flags & flags));
+	xe_eudebug_assert(d, len > sizeof(*event));
+
+	event = kzalloc(len, GFP_KERNEL);
+	if (!event)
+		return NULL;
+
+	event->len = len;
+	event->type = type;
+	event->flags = flags;
+	event->seqno = seqno;
+
+	return event;
+}
+
+static int send_vm_event(struct xe_eudebug *d, u32 flags,
+			 const u64 vm_handle,
+			 const u64 seqno)
+{
+	struct drm_xe_eudebug_event *event;
+	struct drm_xe_eudebug_event_vm *e;
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM,
+					seqno, flags, sizeof(*e));
+	if (!event)
+		return -ENOMEM;
+
+	e = cast_event(e, event);
+
+	e->vm_handle = vm_handle;
+
+	return xe_eudebug_queue_event(d, event);
+}
+
+static int vm_create_event(struct xe_eudebug *d,
+			   struct xe_file *xef, struct xe_vm *vm)
+{
+	int vm_id;
+	u64 seqno;
+	int ret;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return 0;
+
+	vm_id = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_VM, vm, &seqno);
+	if (vm_id < 0)
+		return vm_id;
+
+	ret = send_vm_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, vm_id, seqno);
+	if (ret)
+		eu_dbg(d, "send_vm_event create error %d\n", ret);
+
+	return ret;
+}
+
+static int vm_destroy_event(struct xe_eudebug *d,
+			    struct xe_file *xef, struct xe_vm *vm)
+{
+	int vm_id;
+	u64 seqno;
+	int ret;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return 0;
+
+	vm_id = xe_eudebug_remove_handle(d, XE_EUDEBUG_RES_TYPE_VM, vm, &seqno);
+	if (vm_id < 0)
+		return vm_id;
+
+	ret = send_vm_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY, vm_id, seqno);
+	if (ret)
+		eu_dbg(d, "send_vm_event destroy error %d\n", ret);
+
+	return ret;
+}
+
+#define xe_eudebug_event_put(_d, _err) ({ \
+	if ((_err)) \
+		xe_eudebug_disconnect((_d), (_err)); \
+	xe_eudebug_put((_d)); \
+	})
+
+void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm)
+{
+	struct xe_eudebug *d;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return;
+
+	d = xe_eudebug_get(xef);
+	if (!d)
+		return;
+
+	xe_eudebug_event_put(d, vm_create_event(d, xef, vm));
+}
+
+void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm)
+{
+	struct xe_eudebug *d;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return;
+
+	d = xe_eudebug_get(xef);
+	if (!d)
+		return;
+
+	xe_eudebug_event_put(d, vm_destroy_event(d, xef, vm));
+}
+
+static int add_debugger(struct xe_device *xe, struct xe_eudebug *d,
+			struct drm_file *target)
+{
+	struct xe_file *xef = target->driver_priv;
+	int ret = -EBUSY;
+
+	mutex_lock(&xef->eudebug.lock);
+	if (!xef->eudebug.debugger) {
+		d->target.xef = xe_file_get(xef);
+		d->target.pid = xef->pid;
+		kref_get(&d->ref);
+		xef->eudebug.debugger = d;
+		ret = 0;
+	}
+	mutex_unlock(&xef->eudebug.lock);
+
+	if (ret)
+		return ret;
+
+	mutex_lock(&xe->eudebug.lock);
+	XE_WARN_ON(!list_empty(&xef->eudebug.target_link));
+	list_add_tail(&xef->eudebug.target_link, &xef->xe->eudebug.targets);
+	mutex_unlock(&xe->eudebug.lock);
+
+	return 0;
+}
+
+static int
+xe_eudebug_attach(struct xe_device *xe, struct drm_file *parent_file,
+		  struct xe_eudebug *d, u64 target_pidfd)
+{
+	struct file *file __free(fput) = NULL;
+	struct drm_file *drm_file;
+	int ret;
+
+	file = fget(target_pidfd);
+	if (XE_IOCTL_DBG(xe, !file))
+		return -EBADFD;
+
+	drm_file = file->private_data;
+	if (XE_IOCTL_DBG(xe, !drm_file))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, parent_file->filp->f_op != file->f_op))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !drm_file->authenticated))
+		return -EACCES;
+
+	ret = add_debugger(xe, d, drm_file);
+	if (XE_IOCTL_DBG(xe, ret))
+		return ret;
+
+	d->xe = xe;
+	d->session = ++xe->eudebug.session_count ?: 1;
+
+	eu_dbg(d, "session %lld attached to %s", d->session,
+	       parent_file == drm_file ? "self" : "remote");
+
+	return 0;
+}
+
+static int xe_eudebug_release(struct inode *inode, struct file *file)
+{
+	struct xe_eudebug *d = file->private_data;
+
+	xe_eudebug_disconnect(d, 0);
+	xe_eudebug_put(d);
+
+	return 0;
+}
+
+static __poll_t xe_eudebug_poll(struct file *file, poll_table *wait)
+{
+	struct xe_eudebug * const d = file->private_data;
+	__poll_t ret = 0;
+
+	poll_wait(file, &d->events.write_done, wait);
+
+	if (xe_eudebug_detached(d)) {
+		ret |= EPOLLHUP;
+		if (d->target.err)
+			ret |= EPOLLERR;
+	}
+
+	if (event_fifo_num_events_peek(d))
+		ret |= EPOLLIN;
+
+	return ret;
+}
+
+static ssize_t xe_eudebug_read(struct file *file,
+			       char __user *buf,
+			       size_t count,
+			       loff_t *ppos)
+{
+	return -EINVAL;
+}
+
+static long xe_eudebug_read_event(struct xe_eudebug *d,
+				  const u64 arg,
+				  const bool wait)
+{
+	struct xe_device *xe = d->xe;
+	struct drm_xe_eudebug_event __user * const user_orig =
+		u64_to_user_ptr(arg);
+	struct drm_xe_eudebug_event user_event;
+	struct drm_xe_eudebug_event *pending, *event_out;
+	long ret = 0;
+
+	if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event))))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, !user_event.type))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, user_event.type > XE_EUDEBUG_MAX_EVENT_TYPE))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, user_event.type != DRM_XE_EUDEBUG_EVENT_READ))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, user_event.len < sizeof(*user_orig)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, user_event.flags))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, user_event.reserved))
+		return -EINVAL;
+
+	/* XXX: define wait time in connect arguments ? */
+	if (wait) {
+		ret = wait_event_interruptible_timeout(d->events.write_done,
+						       event_fifo_has_events(d),
+						       msecs_to_jiffies(5 * 1000));
+
+		if (XE_IOCTL_DBG(xe, ret < 0))
+			return ret;
+	}
+
+	if (XE_IOCTL_DBG(xe, xe_eudebug_detached(d)))
+		return -ENOTCONN;
+
+	event_out = NULL;
+	spin_lock(&d->events.lock);
+	pending = event_fifo_pending(d);
+	if (!pending)
+		ret = wait ? -ETIMEDOUT : -EAGAIN;
+	else if (user_event.len < pending->len)
+		ret = -EMSGSIZE;
+	else if (access_ok(user_orig, pending->len))
+		ret = kfifo_out(&d->events.fifo, &event_out, 1) == 1 ? 0 : -EIO;
+	else
+		ret = -EFAULT;
+
+	wake_up_all(&d->events.read_done);
+	spin_unlock(&d->events.lock);
+
+	if (!pending)
+		return ret;
+
+	if (ret == -EMSGSIZE) {
+		if (XE_IOCTL_DBG(xe, put_user(pending->len, &user_orig->len)))
+			return -EFAULT;
+
+		return -EMSGSIZE;
+	}
+
+	if (XE_IOCTL_DBG(xe, ret)) {
+		xe_eudebug_disconnect(d, (int)ret);
+		return ret;
+	}
+
+	XE_WARN_ON(pending != event_out);
+
+	if (__copy_to_user(user_orig, event_out, event_out->len)) {
+		ret = -EFAULT;
+		/* We can't rollback anymore, disconnect */
+		xe_eudebug_disconnect(d, -EFAULT);
+	}
+
+	eu_dbg(d, "event read=%ld: type=%u, flags=0x%x, seqno=%llu", ret,
+	       event_out->type, event_out->flags, event_out->seqno);
+
+	kfree(event_out);
+
+	return ret;
+}
+
+static long xe_eudebug_ioctl(struct file *file,
+			     unsigned int cmd,
+			     unsigned long arg)
+{
+	struct xe_eudebug * const d = file->private_data;
+	long ret;
+
+	switch (cmd) {
+	case DRM_XE_EUDEBUG_IOCTL_READ_EVENT:
+		ret = xe_eudebug_read_event(d, arg,
+					    !(file->f_flags & O_NONBLOCK));
+		break;
+
+	default:
+		ret = -EINVAL;
+	}
+
+	return ret;
+}
+
+static const struct file_operations fops = {
+	.owner		= THIS_MODULE,
+	.release	= xe_eudebug_release,
+	.poll		= xe_eudebug_poll,
+	.read		= xe_eudebug_read,
+	.unlocked_ioctl	= xe_eudebug_ioctl,
+};
+
+static int
+xe_eudebug_connect(struct xe_device *xe,
+		   struct drm_file *file,
+		   struct drm_xe_eudebug_connect *param)
+{
+	const u64 known_open_flags = 0;
+	unsigned long f_flags = 0;
+	struct xe_eudebug *d;
+	int fd, err;
+
+	if (XE_IOCTL_DBG(xe, param->extensions))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !param->fd))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, param->flags & ~known_open_flags))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, param->version &&
+			 param->version != DRM_XE_EUDEBUG_VERSION))
+		return -EINVAL;
+
+	param->version = DRM_XE_EUDEBUG_VERSION;
+
+	mutex_lock(&xe->eudebug.lock);
+	err = xe_eudebug_is_enabled(xe) ? 0 : -EOPNOTSUPP;
+	mutex_unlock(&xe->eudebug.lock);
+
+	if (XE_IOCTL_DBG(xe, err))
+		return err;
+
+	d = kzalloc(sizeof(*d), GFP_KERNEL);
+	if (XE_IOCTL_DBG(xe, !d))
+		return -ENOMEM;
+
+	kref_init(&d->ref);
+	spin_lock_init(&d->target.lock);
+	init_waitqueue_head(&d->events.write_done);
+	init_waitqueue_head(&d->events.read_done);
+
+	spin_lock_init(&d->events.lock);
+	INIT_KFIFO(d->events.fifo);
+
+	d->res = xe_eudebug_resources_alloc();
+	if (XE_IOCTL_DBG(xe, IS_ERR(d->res))) {
+		err = PTR_ERR(d->res);
+		goto err_free;
+	}
+
+	err = xe_eudebug_attach(xe, file, d, param->fd);
+	if (XE_IOCTL_DBG(xe, err))
+		goto err_free_res;
+
+	fd = anon_inode_getfd("[xe_eudebug]", &fops, d, f_flags);
+	if (XE_IOCTL_DBG(xe, fd < 0)) {
+		err = fd;
+		goto err_detach;
+	}
+
+	eu_dbg(d, "connected session %lld", d->session);
+
+	return fd;
+
+err_detach:
+	xe_eudebug_detach(xe, d, err);
+err_free_res:
+	xe_eudebug_destroy_resources(d);
+err_free:
+	kfree(d);
+
+	return err;
+}
+
+void xe_eudebug_file_close(struct xe_file *xef)
+{
+	remove_debugger(xef);
+}
+
+bool xe_eudebug_is_enabled(struct xe_device *xe)
+{
+	return READ_ONCE(xe->eudebug.state) == XE_EUDEBUG_ENABLED;
+}
+
+static int xe_eudebug_enable(struct xe_device *xe, bool enable)
+{
+	mutex_lock(&xe->eudebug.lock);
+
+	if (xe->eudebug.state == XE_EUDEBUG_NOT_SUPPORTED) {
+		mutex_unlock(&xe->eudebug.lock);
+		return -EPERM;
+	}
+
+	if (!enable && !list_empty(&xe->eudebug.targets)) {
+		mutex_unlock(&xe->eudebug.lock);
+		return -EBUSY;
+	}
+
+	if (enable == xe_eudebug_is_enabled(xe)) {
+		mutex_unlock(&xe->eudebug.lock);
+		return 0;
+	}
+
+	xe->eudebug.state = enable ?
+		XE_EUDEBUG_ENABLED : XE_EUDEBUG_DISABLED;
+	mutex_unlock(&xe->eudebug.lock);
+
+	return 0;
+}
+
+static ssize_t enable_eudebug_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev));
+
+	return sysfs_emit(buf, "%u\n", xe_eudebug_is_enabled(xe));
+}
+
+static ssize_t enable_eudebug_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev));
+	bool enable;
+	int ret;
+
+	ret = kstrtobool(buf, &enable);
+	if (ret)
+		return ret;
+
+	ret = xe_eudebug_enable(xe, enable);
+	if (ret)
+		return ret;
+
+	return count;
+}
+
+static DEVICE_ATTR_RW(enable_eudebug);
+
+static void xe_eudebug_sysfs_fini(void *arg)
+{
+	struct xe_device *xe = arg;
+	struct drm_device *dev = &xe->drm;
+
+	sysfs_remove_file(&dev->dev->kobj,
+			  &dev_attr_enable_eudebug.attr);
+}
+
+void xe_eudebug_init(struct xe_device *xe)
+{
+	struct drm_device *dev = &xe->drm;
+	int err;
+
+	INIT_LIST_HEAD(&xe->eudebug.targets);
+
+	xe->eudebug.state = XE_EUDEBUG_NOT_SUPPORTED;
+
+	err = drmm_mutex_init(dev, &xe->eudebug.lock);
+	if (err)
+		goto out_err;
+
+	err = sysfs_create_file(&dev->dev->kobj,
+				&dev_attr_enable_eudebug.attr);
+	if (err)
+		goto out_err;
+
+	err = devm_add_action_or_reset(dev->dev, xe_eudebug_sysfs_fini, xe);
+	if (err)
+		goto out_err;
+
+	xe->eudebug.state = XE_EUDEBUG_DISABLED;
+
+	return;
+
+out_err:
+	drm_warn(&xe->drm, "eudebug disabled, init fail: %d\n", err);
+}
+
+int xe_eudebug_connect_ioctl(struct drm_device *dev,
+			     void *data,
+			     struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(dev);
+	struct drm_xe_eudebug_connect * const param = data;
+
+	return xe_eudebug_connect(xe, file, param);
+}
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
new file mode 100644
index 000000000000..ba13dc35f161
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#ifndef _XE_EUDEBUG_H_
+#define _XE_EUDEBUG_H_
+
+#include <linux/types.h>
+
+struct drm_device;
+struct drm_file;
+struct xe_device;
+struct xe_file;
+struct xe_vm;
+
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+
+#define XE_EUDEBUG_DBG_STR "eudbg: %lld:%lu:%s (%d/%d) -> (%d): "
+#define XE_EUDEBUG_DBG_ARGS(d) (d)->session, \
+		atomic_long_read(&(d)->events.seqno), \
+		!READ_ONCE(d->target.xef) ? "disconnected" : "", \
+		current->pid, \
+		task_tgid_nr(current), \
+		READ_ONCE(d->target.xef) ? d->target.xef->pid : -1
+
+#define eu_err(d, fmt, ...) drm_err(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \
+				    XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__)
+#define eu_warn(d, fmt, ...) drm_warn(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \
+				      XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__)
+#define eu_dbg(d, fmt, ...) drm_dbg(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \
+				    XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__)
+
+#define xe_eudebug_assert(d, ...) xe_assert((d)->xe, ##__VA_ARGS__)
+
+int xe_eudebug_connect_ioctl(struct drm_device *dev,
+			     void *data,
+			     struct drm_file *file);
+
+void xe_eudebug_init(struct xe_device *xe);
+bool xe_eudebug_is_enabled(struct xe_device *xe);
+
+void xe_eudebug_file_close(struct xe_file *xef);
+
+void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm);
+void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm);
+
+#else
+
+static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
+					   void *data,
+					   struct drm_file *file) { return 0; }
+
+static inline void xe_eudebug_init(struct xe_device *xe) { }
+static inline bool xe_eudebug_is_enabled(struct xe_device *xe) { return false; }
+
+static inline void xe_eudebug_file_close(struct xe_file *xef) { }
+
+static inline void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm) { }
+static inline void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) { }
+
+#endif /* CONFIG_DRM_XE_EUDEBUG */
+
+#endif /* _XE_EUDEBUG_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
new file mode 100644
index 000000000000..1e673c934169
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#ifndef _XE_EUDEBUG_TYPES_H_
+#define _XE_EUDEBUG_TYPES_H_
+
+#include <linux/completion.h>
+#include <linux/kfifo.h>
+#include <linux/kref.h>
+#include <linux/mutex.h>
+#include <linux/rbtree.h>
+#include <linux/rhashtable.h>
+#include <linux/wait.h>
+#include <linux/xarray.h>
+
+struct xe_device;
+struct task_struct;
+
+/**
+ * enum xe_eudebug_state - eudebug capability state
+ *
+ * @XE_EUDEBUG_NOT_SUPPORTED: eudebug feature support off
+ * @XE_EUDEBUG_DISABLED: eudebug feature supported but disabled
+ * @XE_EUDEBUG_ENABLED: eudebug enabled
+ */
+enum xe_eudebug_state {
+	XE_EUDEBUG_NOT_SUPPORTED = 0,
+	XE_EUDEBUG_DISABLED,
+	XE_EUDEBUG_ENABLED,
+};
+
+#define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64
+#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_VM
+
+/**
+ * struct xe_eudebug_handle - eudebug resource handle
+ */
+struct xe_eudebug_handle {
+	/** @key: key value in rhashtable <key:id> */
+	u64 key;
+
+	/** @id: opaque handle id for xarray <id:key> */
+	int id;
+
+	/** @rh_head: rhashtable head */
+	struct rhash_head rh_head;
+};
+
+/**
+ * struct xe_eudebug_resource - Resource map for one resource
+ */
+struct xe_eudebug_resource {
+	/** @xa: xarrays for <id->key> */
+	struct xarray xa;
+
+	/** @rh: rhashtable for <key->id> */
+	struct rhashtable rh;
+};
+
+#define XE_EUDEBUG_RES_TYPE_VM		0
+#define XE_EUDEBUG_RES_TYPE_COUNT	(XE_EUDEBUG_RES_TYPE_VM + 1)
+
+/**
+ * struct xe_eudebug_resources - eudebug resources for all types
+ */
+struct xe_eudebug_resources {
+	/** @lock: guards access into rt */
+	struct mutex lock;
+
+	/** @rt: resource maps for all types */
+	struct xe_eudebug_resource rt[XE_EUDEBUG_RES_TYPE_COUNT];
+};
+
+/**
+ * struct xe_eudebug - Top level struct for eudebug: the connection
+ */
+struct xe_eudebug {
+	/** @ref: kref counter for this struct */
+	struct kref ref;
+
+	struct {
+		/** @xef: the target xe_file that we are debugging */
+		struct xe_file *xef;
+
+		/** @pid: pid of target */
+		pid_t pid;
+
+		/** @err: error code on disconnect */
+		int err;
+
+		/** @lock: guards access to xef and err */
+		spinlock_t lock;
+	} target;
+
+	/** @xe: the parent device we are serving */
+	struct xe_device *xe;
+
+	/** @res: the resource maps we track for the target */
+	struct xe_eudebug_resources *res;
+
+	/** @session: session number for this connection (for logs) */
+	u64 session;
+
+	/** @events: kfifo queue of to-be-delivered events */
+	struct {
+		/** @lock: guards access to fifo */
+		spinlock_t lock;
+
+		/** @fifo: queue of events pending */
+		DECLARE_KFIFO(fifo,
+			      struct drm_xe_eudebug_event *,
+			      CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE);
+
+		/** @write_done: waitqueue for signalling write to fifo */
+		wait_queue_head_t write_done;
+
+		/** @read_done: waitqueue for signalling read from fifo */
+		wait_queue_head_t read_done;
+
+		/** @seqno: seqno counter to stamp events for fifo */
+		atomic_long_t seqno;
+	} events;
+
+};
+
+#endif /* _XE_EUDEBUG_TYPES_H_ */
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 4e914928e0a9..85c7e1b8e232 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -26,6 +26,7 @@
 #include "xe_bo.h"
 #include "xe_device.h"
 #include "xe_drm_client.h"
+#include "xe_eudebug.h"
 #include "xe_exec_queue.h"
 #include "xe_gt_pagefault.h"
 #include "xe_migrate.h"
@@ -1939,6 +1940,8 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 
 	args->vm_id = id;
 
+	xe_eudebug_vm_create(xef, vm);
+
 	return 0;
 
 err_close_and_put:
@@ -1970,8 +1973,10 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data,
 		xa_erase(&xef->vm.xa, args->vm_id);
 	mutex_unlock(&xef->vm.lock);
 
-	if (!err)
+	if (!err) {
+		xe_eudebug_vm_destroy(xef, vm);
 		xe_vm_close_and_put(vm);
+	}
 
 	return err;
 }
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 40ff19f52a8d..54868095952b 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -106,6 +106,7 @@ extern "C" {
 #define DRM_XE_OBSERVATION		0x0b
 #define DRM_XE_MADVISE			0x0c
 #define DRM_XE_VM_QUERY_MEM_RANGE_ATTRS	0x0d
+#define DRM_XE_EUDEBUG_CONNECT		0x0e
 
 /* Must be kept compact -- no holes */
 
@@ -123,6 +124,7 @@ extern "C" {
 #define DRM_IOCTL_XE_OBSERVATION		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param)
 #define DRM_IOCTL_XE_MADVISE			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_MADVISE, struct drm_xe_madvise)
 #define DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_QUERY_MEM_RANGE_ATTRS, struct drm_xe_vm_query_mem_range_attr)
+#define DRM_IOCTL_XE_EUDEBUG_CONNECT		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EUDEBUG_CONNECT, struct drm_xe_eudebug_connect)
 
 /**
  * DOC: Xe IOCTL Extensions
@@ -2254,6 +2256,25 @@ struct drm_xe_vm_query_mem_range_attr {
 
 };
 
+/*
+ * Debugger ABI (ioctl and events) Version History:
+ * 0 - No debugger available
+ * 1 - Initial version
+ */
+#define DRM_XE_EUDEBUG_VERSION 1
+
+struct drm_xe_eudebug_connect {
+	/** @extensions: Pointer to the first extension struct, if any */
+	__u64 extensions;
+
+	__u64 fd; /* Target drm client fd */
+	__u32 flags; /* MBZ */
+
+	__u32 version; /* output: current ABI (ioctl / events) version */
+};
+
+#include "xe_drm_eudebug.h"
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
new file mode 100644
index 000000000000..fd2a0c911d02
--- /dev/null
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _UAPI_XE_DRM_EUDEBUG_H_
+#define _UAPI_XE_DRM_EUDEBUG_H_
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/**
+ * Do a eudebug event read for a debugger connection.
+ *
+ * This ioctl is available in debug version 1.
+ */
+#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0)
+
+/**
+ * struct drm_xe_eudebug_event - Base type of event delivered by xe_eudebug.
+ * @len: Length of the event, including this base struct.
+ * @type: Event type
+ * @flags: Flags for the event
+ * @seqno: Sequence number
+ * @reserved: MBZ
+ *
+ * Base event for the xe_eudebug interface. To initiate a read, type
+ * needs to be set to DRM_XE_EUDEBUG_EVENT_READ and len needs to be
+ * set by userspace to the maximum size it has allocated.
+ * On successful return the event is delivered, or -EMSGSIZE is
+ * returned if it does not fit (with the required len written back).
+ * Seqno can be used to form a timeline as event delivery order does
+ * not guarantee event creation order.
+ *
+ * flags will indicate if the resource was created, destroyed,
+ * or its state changed.
+ *
+ * If DRM_XE_EUDEBUG_EVENT_NEED_ACK is set, xe_eudebug will hold
+ * the resource until it is acked by userspace using an acking
+ * ioctl with the seqno of the event.
+ *
+ */
+struct drm_xe_eudebug_event {
+	__u32 len;
+
+	__u16 type;
+#define DRM_XE_EUDEBUG_EVENT_NONE		0
+#define DRM_XE_EUDEBUG_EVENT_READ		1
+#define DRM_XE_EUDEBUG_EVENT_VM			2
+
+	__u16 flags;
+#define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
+#define DRM_XE_EUDEBUG_EVENT_DESTROY		(1 << 1)
+#define DRM_XE_EUDEBUG_EVENT_STATE_CHANGE	(1 << 2)
+#define DRM_XE_EUDEBUG_EVENT_NEED_ACK		(1 << 3)
+	__u64 seqno;
+	__u64 reserved;
+};
+
+/**
+ * struct drm_xe_eudebug_event_vm - VM resource event
+ * @vm_handle: Handle of a vm that was created/destroyed
+ *
+ * Resource creation/destruction event for a VM.
+ */
+struct drm_xe_eudebug_event_vm {
+	struct drm_xe_eudebug_event base;
+
+	__u64 vm_handle;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _UAPI_XE_DRM_EUDEBUG_H_ */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/20] drm/xe/eudebug: Introduce discovery for resources
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 01/20] drm/xe/eudebug: Introduce eudebug interface Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 03/20] drm/xe/eudebug: Introduce exec_queue events Mika Kuoppala
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala, Dominik Grzegorzek

A debugger connection can occur after a client has created and destroyed an
arbitrary number of resources. To support this, we need to relay all
currently existing resources to the debugger. The client is held on selected
ioctls until this discovery process, executed by a workqueue, is complete.

This patch is based on discovery work by Maciej Patelczyk for the i915 driver.

v2: - use rw_semaphore to block DRM ioctls during discovery (Matthew)
    - only lock according to ioctl at play (Dominik)

v4: - s/discovery_lock/ioctl_lock
    - change lock to be per xe_file as is connections

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com> #locking
---
 drivers/gpu/drm/xe/xe_device.c        |  13 +++-
 drivers/gpu/drm/xe/xe_device.h        |  42 +++++++++++
 drivers/gpu/drm/xe/xe_device_types.h  |   6 ++
 drivers/gpu/drm/xe/xe_eudebug.c       | 104 +++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_eudebug_types.h |   7 ++
 5 files changed, 169 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 55b7d5e064be..ff9268ed6124 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -108,6 +108,7 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 	mutex_init(&xef->eudebug.lock);
 	INIT_LIST_HEAD(&xef->eudebug.target_link);
+	init_rwsem(&xef->eudebug.ioctl_lock);
 #endif
 
 	file->driver_priv = xef;
@@ -232,8 +233,12 @@ static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		return -ECANCELED;
 
 	ret = xe_pm_runtime_get_ioctl(xe);
-	if (ret >= 0)
+	if (ret >= 0) {
+		bool lock = xe_eudebug_discovery_lock(file, cmd);
 		ret = drm_ioctl(file, cmd, arg);
+		if (lock)
+			xe_eudebug_discovery_unlock(file, cmd);
+	}
 	xe_pm_runtime_put(xe);
 
 	return ret;
@@ -250,8 +255,12 @@ static long xe_drm_compat_ioctl(struct file *file, unsigned int cmd, unsigned lo
 		return -ECANCELED;
 
 	ret = xe_pm_runtime_get_ioctl(xe);
-	if (ret >= 0)
+	if (ret >= 0) {
+		bool lock = xe_eudebug_discovery_lock(file, cmd);
 		ret = drm_compat_ioctl(file, cmd, arg);
+		if (lock)
+			xe_eudebug_discovery_unlock(file, cmd);
+	}
 	xe_pm_runtime_put(xe);
 
 	return ret;
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 32cc6323b7f6..0cc454087d20 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -7,6 +7,7 @@
 #define _XE_DEVICE_H_
 
 #include <drm/drm_util.h>
+#include <drm/drm_ioctl.h>
 
 #include "xe_device_types.h"
 #include "xe_gt_types.h"
@@ -209,4 +210,45 @@ int xe_is_injection_active(void);
 #define LNL_FLUSH_WORK(wrk__) \
 	flush_work(wrk__)
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+static inline int xe_eudebug_needs_ioctl_lock(const unsigned int cmd)
+{
+	const unsigned int xe_cmd = DRM_IOCTL_NR(cmd) - DRM_COMMAND_BASE;
+
+	switch (xe_cmd) {
+	case DRM_XE_VM_CREATE:
+	case DRM_XE_VM_DESTROY:
+	case DRM_XE_VM_BIND:
+	case DRM_XE_EXEC_QUEUE_CREATE:
+	case DRM_XE_EXEC_QUEUE_DESTROY:
+		return 1;
+	}
+
+	return 0;
+}
+
+static inline bool xe_eudebug_discovery_lock(struct file *file, unsigned int cmd)
+{
+	struct drm_file *file_priv = file->private_data;
+	struct xe_file *xef = file_priv->driver_priv;
+
+	if (!xe_eudebug_needs_ioctl_lock(cmd))
+		return false;
+
+	down_read(&xef->eudebug.ioctl_lock);
+	return true;
+}
+
+static inline void xe_eudebug_discovery_unlock(struct file *file, unsigned int cmd)
+{
+	struct drm_file *file_priv = file->private_data;
+	struct xe_file *xef = file_priv->driver_priv;
+
+	up_read(&xef->eudebug.ioctl_lock);
+}
+#else
+static inline bool xe_eudebug_discovery_lock(struct file *file, unsigned int cmd) { return false; }
+static inline void xe_eudebug_discovery_unlock(struct file *file, unsigned int cmd) { }
+#endif /* CONFIG_DRM_XE_EUDEBUG */
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index a369e949885d..163305440fdf 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -629,6 +629,9 @@ struct xe_device {
 
 		/** @eudebug.lock: protects state and targets */
 		struct mutex lock;
+
+		/** @eudebug.wq: workqueue used for client discovery */
+		struct workqueue_struct *wq;
 	} eudebug;
 #endif
 
@@ -734,6 +737,9 @@ struct xe_file {
 
 		/** @target_link: link into xe_device.eudebug.targets */
 		struct list_head target_link;
+
+		/** @eudebug.ioctl_lock: syncs ioctl access against discovery */
+		struct rw_semaphore ioctl_lock;
 	} eudebug;
 #endif
 };
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 4051c7548187..8d172d001b1f 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -168,6 +168,8 @@ static void xe_eudebug_free(struct kref *ref)
 	struct xe_eudebug *d = container_of(ref, typeof(*d), ref);
 	struct drm_xe_eudebug_event *event;
 
+	WARN_ON(work_pending(&d->discovery_work));
+
 	xe_assert(d->xe, xe_eudebug_detached(d));
 
 	while (kfifo_get(&d->events.fifo, &event))
@@ -228,6 +230,8 @@ static bool xe_eudebug_detach(struct xe_device *xe,
 	}
 	spin_unlock(&d->target.lock);
 
+	flush_work(&d->discovery_work);
+
 	if (!target)
 		return false;
 
@@ -259,7 +263,7 @@ static int _xe_eudebug_disconnect(struct xe_eudebug *d,
 })
 
 static struct xe_eudebug *
-xe_eudebug_get(struct xe_file *xef)
+_xe_eudebug_get(struct xe_file *xef)
 {
 	struct xe_eudebug *d;
 
@@ -277,6 +281,25 @@ xe_eudebug_get(struct xe_file *xef)
 	return d;
 }
 
+static struct xe_eudebug *
+xe_eudebug_get(struct xe_file *xef)
+{
+	struct xe_eudebug *d;
+
+	lockdep_assert_held(&xef->eudebug.ioctl_lock);
+
+	d = _xe_eudebug_get(xef);
+	if (!d)
+		return NULL;
+
+	if (!completion_done(&d->discovery)) {
+		xe_eudebug_put(d);
+		return NULL;
+	}
+
+	return d;
+}
+
 static int xe_eudebug_queue_event(struct xe_eudebug *d,
 				  struct drm_xe_eudebug_event *event)
 {
@@ -500,6 +523,8 @@ static int xe_eudebug_remove_handle(struct xe_eudebug *d, int type, void *p,
 {
 	int ret;
 
+	XE_WARN_ON(!completion_done(&d->discovery));
+
 	ret = _xe_eudebug_remove_handle(d, type, p, seqno);
 
 	eu_dbg(d, "handle type %d handle %p removed: %d\n", type, p, ret);
@@ -631,6 +656,66 @@ void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm)
 	xe_eudebug_event_put(d, vm_destroy_event(d, xef, vm));
 }
 
+static struct xe_file *xe_eudebug_target_get(struct xe_eudebug *d)
+{
+	struct xe_file *xef = NULL;
+
+	spin_lock(&d->target.lock);
+	if (d->target.xef)
+		xef = xe_file_get(d->target.xef);
+	spin_unlock(&d->target.lock);
+
+	return xef;
+}
+
+static void discover_client(struct xe_eudebug *d)
+{
+	struct xe_file *xef;
+	struct xe_vm *vm;
+	unsigned long i;
+	unsigned int vm_count = 0;
+	int err = 0;
+
+	xef = xe_eudebug_target_get(d);
+	if (!xef)
+		return;
+
+	down_write(&xef->eudebug.ioctl_lock);
+
+	eu_dbg(d, "Discovery start for %lld", d->session);
+
+	xa_for_each(&xef->vm.xa, i, vm) {
+		err = vm_create_event(d, xef, vm);
+		if (err)
+			break;
+		vm_count++;
+	}
+
+	complete_all(&d->discovery);
+
+	eu_dbg(d, "Discovery end for %lld: %d", d->session, err);
+
+	up_write(&xef->eudebug.ioctl_lock);
+
+	if (vm_count)
+		eu_dbg(d, "Discovery found %u vms", vm_count);
+
+	xe_file_put(xef);
+}
+
+static void discovery_work_fn(struct work_struct *work)
+{
+	struct xe_eudebug *d = container_of(work, typeof(*d),
+					    discovery_work);
+
+	if (xe_eudebug_detached(d))
+		complete_all(&d->discovery);
+	else
+		discover_client(d);
+
+	xe_eudebug_put(d);
+}
+
 static int add_debugger(struct xe_device *xe, struct xe_eudebug *d,
 			struct drm_file *target)
 {
@@ -828,6 +913,10 @@ static long xe_eudebug_ioctl(struct file *file,
 	struct xe_eudebug * const d = file->private_data;
 	long ret;
 
+	if (cmd != DRM_XE_EUDEBUG_IOCTL_READ_EVENT &&
+	    !completion_done(&d->discovery))
+		return -EBUSY;
+
 	switch (cmd) {
 	case DRM_XE_EUDEBUG_IOCTL_READ_EVENT:
 		ret = xe_eudebug_read_event(d, arg,
@@ -889,9 +978,11 @@ xe_eudebug_connect(struct xe_device *xe,
 	spin_lock_init(&d->target.lock);
 	init_waitqueue_head(&d->events.write_done);
 	init_waitqueue_head(&d->events.read_done);
+	init_completion(&d->discovery);
 
 	spin_lock_init(&d->events.lock);
 	INIT_KFIFO(d->events.fifo);
+	INIT_WORK(&d->discovery_work, discovery_work_fn);
 
 	d->res = xe_eudebug_resources_alloc();
 	if (XE_IOCTL_DBG(xe, IS_ERR(d->res))) {
@@ -909,6 +1000,9 @@ xe_eudebug_connect(struct xe_device *xe,
 		goto err_detach;
 	}
 
+	kref_get(&d->ref);
+	queue_work(xe->eudebug.wq, &d->discovery_work);
+
 	eu_dbg(d, "connected session %lld", d->session);
 
 	return fd;
@@ -1000,6 +1094,7 @@ static void xe_eudebug_sysfs_fini(void *arg)
 void xe_eudebug_init(struct xe_device *xe)
 {
 	struct drm_device *dev = &xe->drm;
+	struct workqueue_struct *wq;
 	int err;
 
 	INIT_LIST_HEAD(&xe->eudebug.targets);
@@ -1010,6 +1105,13 @@ void xe_eudebug_init(struct xe_device *xe)
 	if (err)
 		goto out_err;
 
+	wq = drmm_alloc_ordered_workqueue(dev, "xe-eudebug", 0);
+	if (IS_ERR(wq)) {
+		err = PTR_ERR(wq);
+		goto out_err;
+	}
+	xe->eudebug.wq = wq;
+
 	err = sysfs_create_file(&dev->dev->kobj,
 				&dev_attr_enable_eudebug.attr);
 	if (err)
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 1e673c934169..55b71ddd92b6 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -17,6 +17,7 @@
 
 struct xe_device;
 struct task_struct;
+struct workqueue_struct;
 
 /**
  * enum xe_eudebug_state - eudebug capability state
@@ -103,6 +104,12 @@ struct xe_eudebug {
 	/** @session: session number for this connection (for logs) */
 	u64 session;
 
+	/** @discovery: completion to wait for discovery */
+	struct completion discovery;
+
+	/** @discovery_work: worker to discover resources of the target client */
+	struct work_struct discovery_work;
+
 	/** @events: kfifo queue of to-be-delivered events */
 	struct {
 		/** @lock: guards access to fifo */
-- 
2.43.0



* [PATCH 03/20] drm/xe/eudebug: Introduce exec_queue events
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 01/20] drm/xe/eudebug: Introduce eudebug interface Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 02/20] drm/xe/eudebug: Introduce discovery for resources Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 04/20] drm/xe: Add EUDEBUG_ENABLE exec queue property Mika Kuoppala
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek, Mika Kuoppala

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

Add events to inform the debugger about the creation and destruction of
exec_queues. Use user engine class types instead of the internal
xe_engine_class enum in exec_queue events. During discovery, only advertise
exec_queues with render or compute class, excluding others.
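
For illustration, a minimal debugger-side sketch that decodes such an event
once it has been read from the eudebug fd (the read path itself is not shown
here; the struct layout and defines are the uapi additions in this patch, and
the installed header path is assumed):

#include <stdio.h>
#include <drm/xe_drm_eudebug.h>	/* uapi header added by this series */

/*
 * Illustrative only: decode an exec_queue event that has already been
 * read from the eudebug fd.
 */
static void show_exec_queue_event(const struct drm_xe_eudebug_event *ev)
{
	const struct drm_xe_eudebug_event_exec_queue *eq;
	unsigned int i;

	if (ev->type != DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
		return;

	eq = (const struct drm_xe_eudebug_event_exec_queue *)ev;

	printf("%s: exec_queue %llu on vm %llu, class %u, width %u\n",
	       ev->flags & DRM_XE_EUDEBUG_EVENT_CREATE ? "create" : "destroy",
	       (unsigned long long)eq->exec_queue_handle,
	       (unsigned long long)eq->vm_handle,
	       eq->engine_class, eq->width);

	for (i = 0; i < eq->width; i++)
		printf("  lrc[%u]: handle %llu\n", i,
		       (unsigned long long)eq->lrc_handle[i]);
}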

v2: - Only track long running queues
    - Checkpatch (Tilak)
v3: __counted_by added
v4: - use helpers for filtering engines (Mika)

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_eudebug.c       | 209 +++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_eudebug.h       |   7 +
 drivers/gpu/drm/xe/xe_eudebug_types.h |   7 +-
 drivers/gpu/drm/xe/xe_exec_queue.c    |   5 +
 drivers/gpu/drm/xe/xe_hw_engine.h     |  14 ++
 include/uapi/drm/xe_drm_eudebug.h     |  11 ++
 6 files changed, 248 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 8d172d001b1f..2b8efa438716 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -15,6 +15,8 @@
 #include "xe_device.h"
 #include "xe_eudebug.h"
 #include "xe_eudebug_types.h"
+#include "xe_exec_queue.h"
+#include "xe_hw_engine.h"
 #include "xe_macros.h"
 #include "xe_vm.h"
 
@@ -398,6 +400,28 @@ __find_handle(struct xe_eudebug_resource *r,
 	return h;
 }
 
+static int find_handle(struct xe_eudebug_resources *res,
+		       const int type,
+		       const void *p)
+{
+	const u64 key = (uintptr_t)p;
+	struct xe_eudebug_resource *r;
+	struct xe_eudebug_handle *h;
+	int id;
+
+	if (XE_WARN_ON(!key))
+		return -EINVAL;
+
+	r = resource_from_type(res, type);
+
+	mutex_lock(&res->lock);
+	h = __find_handle(r, key);
+	id = h ? h->id : -ENOENT;
+	mutex_unlock(&res->lock);
+
+	return id;
+}
+
 static int _xe_eudebug_add_handle(struct xe_eudebug *d,
 				  int type,
 				  void *p,
@@ -656,6 +680,174 @@ void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm)
 	xe_eudebug_event_put(d, vm_destroy_event(d, xef, vm));
 }
 
+static const u16 xe_to_user_engine_class[] = {
+	[XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER,
+	[XE_ENGINE_CLASS_COPY] = DRM_XE_ENGINE_CLASS_COPY,
+	[XE_ENGINE_CLASS_VIDEO_DECODE] = DRM_XE_ENGINE_CLASS_VIDEO_DECODE,
+	[XE_ENGINE_CLASS_VIDEO_ENHANCE] = DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE,
+	[XE_ENGINE_CLASS_COMPUTE] = DRM_XE_ENGINE_CLASS_COMPUTE,
+};
+
+static int send_exec_queue_event(struct xe_eudebug *d, u32 flags,
+				 u64 vm_handle, u64 exec_queue_handle,
+				 enum xe_engine_class class,
+				 u32 width, u64 *lrc_handles, u64 seqno)
+{
+	struct drm_xe_eudebug_event *event;
+	struct drm_xe_eudebug_event_exec_queue *e;
+	const u32 sz = struct_size(e, lrc_handle, width);
+	const u32 xe_engine_class = xe_to_user_engine_class[class];
+
+	if (!xe_engine_supports_eudebug(class))
+		return -EINVAL;
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+					seqno, flags, sz);
+	if (!event)
+		return -ENOMEM;
+
+	e = cast_event(e, event);
+
+	e->vm_handle = vm_handle;
+	e->exec_queue_handle = exec_queue_handle;
+	e->engine_class = xe_engine_class;
+	e->width = width;
+
+	memcpy(e->lrc_handle, lrc_handles, width * sizeof(*lrc_handles));
+
+	return xe_eudebug_queue_event(d, event);
+}
+
+static int exec_queue_create_event(struct xe_eudebug *d,
+				   struct xe_file *xef, struct xe_exec_queue *q)
+{
+	int h_vm, h_queue;
+	u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE], seqno;
+	int i;
+	int ret;
+
+	if (!xe_exec_queue_is_lr(q))
+		return 0;
+
+	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
+	if (h_vm < 0)
+		return h_vm;
+
+	if (XE_WARN_ON(q->width >= XE_HW_ENGINE_MAX_INSTANCE))
+		return -EINVAL;
+
+	for (i = 0; i < q->width; i++) {
+		int h, ret;
+
+		ret = _xe_eudebug_add_handle(d,
+					     XE_EUDEBUG_RES_TYPE_LRC,
+					     q->lrc[i],
+					     NULL,
+					     &h);
+
+		if (ret < 0 && ret != -EEXIST)
+			return ret;
+
+		XE_WARN_ON(!h);
+
+		h_lrc[i] = h;
+	}
+
+	h_queue = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, q, &seqno);
+	if (h_queue <= 0)
+		return h_queue;
+
+	/*
+	 * No need to cleanup added handles on error as we disconnect.
+	 */
+
+	ret = send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_CREATE,
+				    h_vm, h_queue, q->class,
+				    q->width, h_lrc, seqno);
+
+	if (ret)
+		eu_dbg(d, "send_exec_queue_event create error %d", ret);
+
+	return ret;
+}
+
+static int exec_queue_destroy_event(struct xe_eudebug *d,
+				    struct xe_file *xef,
+				    struct xe_exec_queue *q)
+{
+	int h_vm, h_queue;
+	u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE], seqno;
+	int i;
+	int ret;
+
+	if (!xe_exec_queue_is_lr(q))
+		return 0;
+
+	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
+	if (h_vm < 0)
+		return h_vm;
+
+	if (XE_WARN_ON(q->width >= XE_HW_ENGINE_MAX_INSTANCE))
+		return -EINVAL;
+
+	h_queue = xe_eudebug_remove_handle(d,
+					   XE_EUDEBUG_RES_TYPE_EXEC_QUEUE,
+					   q,
+					   &seqno);
+	if (h_queue <= 0)
+		return h_queue;
+
+	for (i = 0; i < q->width; i++) {
+		ret = _xe_eudebug_remove_handle(d,
+						XE_EUDEBUG_RES_TYPE_LRC,
+						q->lrc[i],
+						NULL);
+		if (ret < 0 && ret != -ENOENT)
+			return ret;
+
+		XE_WARN_ON(!ret);
+
+		h_lrc[i] = ret;
+	}
+
+	ret = send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY,
+				    h_vm, h_queue, q->class,
+				    q->width, h_lrc, seqno);
+
+	if (ret)
+		eu_dbg(d, "send_exec_queue_event destroy error %d\n", ret);
+
+	return ret;
+}
+
+void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q)
+{
+	struct xe_eudebug *d;
+
+	if (!xe_engine_supports_eudebug(q->class))
+		return;
+
+	d = xe_eudebug_get(xef);
+	if (!d)
+		return;
+
+	xe_eudebug_event_put(d, exec_queue_create_event(d, xef, q));
+}
+
+void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q)
+{
+	struct xe_eudebug *d;
+
+	if (!xe_engine_supports_eudebug(q->class))
+		return;
+
+	d = xe_eudebug_get(xef);
+	if (!d)
+		return;
+
+	xe_eudebug_event_put(d, exec_queue_destroy_event(d, xef, q));
+}
+
 static struct xe_file *xe_eudebug_target_get(struct xe_eudebug *d)
 {
 	struct xe_file *xef = NULL;
@@ -671,9 +863,10 @@ static struct xe_file *xe_eudebug_target_get(struct xe_eudebug *d)
 static void discover_client(struct xe_eudebug *d)
 {
 	struct xe_file *xef;
+	struct xe_exec_queue *q;
 	struct xe_vm *vm;
 	unsigned long i;
-	unsigned int vm_count = 0;
+	unsigned int vm_count = 0, eq_count = 0;
 	int err = 0;
 
 	xef = xe_eudebug_target_get(d);
@@ -691,14 +884,24 @@ static void discover_client(struct xe_eudebug *d)
 		vm_count++;
 	}
 
+	xa_for_each(&xef->exec_queue.xa, i, q) {
+		if (!xe_engine_supports_eudebug(q->class))
+			continue;
+
+		err = exec_queue_create_event(d, xef, q);
+		if (err)
+			break;
+	}
+
 	complete_all(&d->discovery);
 
 	eu_dbg(d, "Discovery end for %lld: %d", d->session, err);
 
 	up_write(&xef->eudebug.ioctl_lock);
 
-	if (vm_count)
-		eu_dbg(d, "Discovery found %u vms", vm_count);
+	if (vm_count || eq_count)
+		eu_dbg(d, "Discovery found %u vms, %u exec_queues",
+		       vm_count, eq_count);
 
 	xe_file_put(xef);
 }
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index ba13dc35f161..39c9aca373f2 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -13,6 +13,7 @@ struct drm_file;
 struct xe_device;
 struct xe_file;
 struct xe_vm;
+struct xe_exec_queue;
 
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
@@ -45,6 +46,9 @@ void xe_eudebug_file_close(struct xe_file *xef);
 void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm);
 void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm);
 
+void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q);
+void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q);
+
 #else
 
 static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
@@ -59,6 +63,9 @@ static inline void xe_eudebug_file_close(struct xe_file *xef) { }
 static inline void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm) { }
 static inline void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) { }
 
+static inline void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q) { }
+static inline void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) { }
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif /* _XE_EUDEBUG_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 55b71ddd92b6..57bff7482163 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -33,7 +33,7 @@ enum xe_eudebug_state {
 };
 
 #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64
-#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_VM
+#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE
 
 /**
  * struct xe_eudebug_handle - eudebug resource handle
@@ -61,7 +61,9 @@ struct xe_eudebug_resource {
 };
 
 #define XE_EUDEBUG_RES_TYPE_VM		0
-#define XE_EUDEBUG_RES_TYPE_COUNT	(XE_EUDEBUG_RES_TYPE_VM + 1)
+#define XE_EUDEBUG_RES_TYPE_EXEC_QUEUE	1
+#define XE_EUDEBUG_RES_TYPE_LRC		2
+#define XE_EUDEBUG_RES_TYPE_COUNT	(XE_EUDEBUG_RES_TYPE_LRC + 1)
 
 /**
  * struct xe_eudebug_resources - eudebug resources for all types
@@ -133,3 +135,4 @@ struct xe_eudebug {
 };
 
 #endif /* _XE_EUDEBUG_TYPES_H_ */
+
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index df82463b19f6..dc049da6fac3 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -27,6 +27,7 @@
 #include "xe_trace.h"
 #include "xe_vm.h"
 #include "xe_pxp.h"
+#include "xe_eudebug.h"
 
 /**
  * DOC: Execution Queue
@@ -784,6 +785,8 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 
 	args->exec_queue_id = id;
 
+	xe_eudebug_exec_queue_create(xef, q);
+
 	return 0;
 
 kill_exec_queue:
@@ -988,6 +991,8 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
 	if (q->vm && q->hwe->hw_engine_group)
 		xe_hw_engine_group_del_exec_queue(q->hwe->hw_engine_group, q);
 
+	xe_eudebug_exec_queue_destroy(xef, q);
+
 	xe_exec_queue_kill(q);
 
 	trace_xe_exec_queue_close(q);
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h
index 6b5f9fa2a594..d8781bf79547 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -79,4 +79,18 @@ enum xe_force_wake_domains xe_hw_engine_to_fw_domain(struct xe_hw_engine *hwe);
 void xe_hw_engine_mmio_write32(struct xe_hw_engine *hwe, struct xe_reg reg, u32 val);
 u32 xe_hw_engine_mmio_read32(struct xe_hw_engine *hwe, struct xe_reg reg);
 
+static inline bool xe_engine_supports_eudebug(const enum xe_engine_class ec)
+{
+	if (ec == XE_ENGINE_CLASS_COMPUTE ||
+	    ec == XE_ENGINE_CLASS_RENDER)
+		return true;
+
+	return false;
+}
+
+static inline bool xe_hw_engine_has_eudebug(const struct xe_hw_engine *hwe)
+{
+	return xe_engine_supports_eudebug(hwe->class);
+}
+
 #endif
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index fd2a0c911d02..360d7a7ecb67 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -48,6 +48,7 @@ struct drm_xe_eudebug_event {
 #define DRM_XE_EUDEBUG_EVENT_NONE		0
 #define DRM_XE_EUDEBUG_EVENT_READ		1
 #define DRM_XE_EUDEBUG_EVENT_VM			2
+#define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE		3
 
 	__u16 flags;
 #define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
@@ -70,6 +71,16 @@ struct drm_xe_eudebug_event_vm {
 	__u64 vm_handle;
 };
 
+struct drm_xe_eudebug_event_exec_queue {
+	struct drm_xe_eudebug_event base;
+
+	__u64 vm_handle;
+	__u64 exec_queue_handle;
+	__u32 engine_class;
+	__u32 width;
+	__u64 lrc_handle[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0



* [PATCH 04/20] drm/xe: Add EUDEBUG_ENABLE exec queue property
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (2 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 03/20] drm/xe/eudebug: Introduce exec_queue events Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 05/20] drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops Mika Kuoppala
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek, Mika Kuoppala

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

This patch introduces an immutable eudebug property for exec_queues,
using a flags value to enable eudebug-specific features. For now, the
engine LRC uses this flag to enable the runalone hardware feature.
Runalone ensures that only one hardware engine in a group
[rcs0, ccs0-3] is active on a tile.
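
As an illustration of the uapi, a userspace sketch that opts a queue into
eudebug, reusing the generic set_property extension (drm_fd, vm_id and eci are
placeholders; the VM is assumed to have been created in LR mode and eci to
point at a render or compute engine instance; error handling omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/xe_drm.h>

/*
 * Illustrative only: chain the set_property extension into exec queue
 * creation to request eudebug (and thereby runalone) for this queue.
 */
static int create_debuggable_queue(int drm_fd, __u32 vm_id,
				   struct drm_xe_engine_class_instance *eci)
{
	struct drm_xe_ext_set_property ext = {
		.base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
		.property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
		.value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
	};
	struct drm_xe_exec_queue_create create = {
		.extensions = (uintptr_t)&ext,
		.width = 1,
		.num_placements = 1,
		.vm_id = vm_id,
		.instances = (uintptr_t)eci,
	};

	if (ioctl(drm_fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, &create))
		return -1;

	return (int)create.exec_queue_id;
}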

v2: - check CONFIG_DRM_XE_EUDEBUG and LR mode (Matthew)
    - disable preempt (Dominik)
    - lrc_create remove from engine init

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_eudebug.c          |  4 +--
 drivers/gpu/drm/xe/xe_exec_queue.c       | 43 +++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_exec_queue.h       |  2 ++
 drivers/gpu/drm/xe/xe_exec_queue_types.h |  7 ++++
 drivers/gpu/drm/xe/xe_lrc.c              | 10 ++++++
 include/uapi/drm/xe_drm.h                |  2 ++
 6 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 2b8efa438716..a6c0d2391e0e 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -726,7 +726,7 @@ static int exec_queue_create_event(struct xe_eudebug *d,
 	int i;
 	int ret;
 
-	if (!xe_exec_queue_is_lr(q))
+	if (!xe_exec_queue_is_debuggable(q))
 		return 0;
 
 	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
@@ -780,7 +780,7 @@ static int exec_queue_destroy_event(struct xe_eudebug *d,
 	int i;
 	int ret;
 
-	if (!xe_exec_queue_is_lr(q))
+	if (!xe_exec_queue_is_debuggable(q))
 		return 0;
 
 	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index dc049da6fac3..02f4e412fcdf 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -206,6 +206,9 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
 	if (!(exec_queue_flags & EXEC_QUEUE_FLAG_KERNEL))
 		flags |= XE_LRC_CREATE_USER_CTX;
 
+	if (q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)
+		flags |= XE_LRC_CREATE_RUNALONE;
+
 	for (i = 0; i < q->width; ++i) {
 		q->lrc[i] = xe_lrc_create(q->hwe, q->vm, SZ_16K, q->msix_vec, flags);
 		if (IS_ERR(q->lrc[i])) {
@@ -530,6 +533,42 @@ exec_queue_set_pxp_type(struct xe_device *xe, struct xe_exec_queue *q, u64 value
 	return xe_pxp_exec_queue_set_type(xe->pxp, q, DRM_XE_PXP_TYPE_HWDRM);
 }
 
+static int exec_queue_set_eudebug(struct xe_device *xe, struct xe_exec_queue *q,
+				  u64 value)
+{
+	const u64 known_flags = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE;
+
+	if (XE_IOCTL_DBG(xe, (q->class != XE_ENGINE_CLASS_RENDER &&
+			      q->class != XE_ENGINE_CLASS_COMPUTE)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, (value & ~known_flags)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)))
+		return -EOPNOTSUPP;
+
+	if (XE_IOCTL_DBG(xe, !xe_exec_queue_is_lr(q)))
+		return -EINVAL;
+	/*
+	 * We want to explicitly set the global feature if
+	 * property is set.
+	 */
+	if (XE_IOCTL_DBG(xe,
+			 !(value & DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)))
+		return -EINVAL;
+
+	q->eudebug_flags = EXEC_QUEUE_EUDEBUG_FLAG_ENABLE;
+	q->sched_props.preempt_timeout_us = 0;
+
+	return 0;
+}
+
+int xe_exec_queue_is_debuggable(struct xe_exec_queue *q)
+{
+	return q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE;
+}
+
 typedef int (*xe_exec_queue_set_property_fn)(struct xe_device *xe,
 					     struct xe_exec_queue *q,
 					     u64 value);
@@ -538,6 +577,7 @@ static const xe_exec_queue_set_property_fn exec_queue_set_property_funcs[] = {
 	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY] = exec_queue_set_priority,
 	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE] = exec_queue_set_timeslice,
 	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_PXP_TYPE] = exec_queue_set_pxp_type,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG] = exec_queue_set_eudebug,
 };
 
 static int exec_queue_user_ext_set_property(struct xe_device *xe,
@@ -558,7 +598,8 @@ static int exec_queue_user_ext_set_property(struct xe_device *xe,
 	    XE_IOCTL_DBG(xe, ext.pad) ||
 	    XE_IOCTL_DBG(xe, ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY &&
 			 ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE &&
-			 ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_PXP_TYPE))
+			 ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_PXP_TYPE &&
+			 ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG))
 		return -EINVAL;
 
 	idx = array_index_nospec(ext.property, ARRAY_SIZE(exec_queue_set_property_funcs));
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h
index 8821ceb838d0..cd4141f6ffbf 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -94,4 +94,6 @@ int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch);
 
 struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q);
 
+int xe_exec_queue_is_debuggable(struct xe_exec_queue *q);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
index 27b76cf9da89..52fd850c7ab0 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
@@ -96,6 +96,13 @@ struct xe_exec_queue {
 	 */
 	unsigned long flags;
 
+	/**
+	 * @eudebug_flags: immutable eudebug flags for this exec queue.
+	 * Set up with DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG.
+	 */
+#define EXEC_QUEUE_EUDEBUG_FLAG_ENABLE		BIT(0)
+	unsigned long eudebug_flags;
+
 	union {
 		/** @multi_gt_list: list head for VM bind engines if multi-GT */
 		struct list_head multi_gt_list;
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index af09f70f6e78..dc7405108261 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -1534,6 +1534,16 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	if (err)
 		goto err_lrc_finish;
 
+	if (init_flags & XE_LRC_CREATE_RUNALONE) {
+		u32 ctx_control = xe_lrc_read_ctx_reg(lrc, CTX_CONTEXT_CONTROL);
+
+		drm_dbg(&xe->drm, "read CTX_CONTEXT_CONTROL: 0x%x\n", ctx_control);
+		ctx_control |= _MASKED_BIT_ENABLE(CTX_CTRL_RUN_ALONE);
+		drm_dbg(&xe->drm, "written CTX_CONTEXT_CONTROL: 0x%x\n", ctx_control);
+
+		xe_lrc_write_ctx_reg(lrc, CTX_CONTEXT_CONTROL, ctx_control);
+	}
+
 	return 0;
 
 err_lrc_finish:
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 54868095952b..ba98da4320da 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1275,6 +1275,8 @@ struct drm_xe_exec_queue_create {
 #define   DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY		0
 #define   DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE		1
 #define   DRM_XE_EXEC_QUEUE_SET_PROPERTY_PXP_TYPE		2
+#define   DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG		3
+#define     DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE		(1 << 0)
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-- 
2.43.0



* [PATCH 05/20] drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (3 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 04/20] drm/xe: Add EUDEBUG_ENABLE exec queue property Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 06/20] drm/xe/eudebug: Introduce vm bind and vm bind debug data events Mika Kuoppala
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun

From: Christoph Manszewski <christoph.manszewski@intel.com>

Make it possible to add and remove per vm debug data, which can be used
to annotate vm ranges (using pseudopaths) or to associate them with
a file which can carry arbitrary debug data (e.g. binary instruction to
code line mapping). The debug data is kept separate from the vmas. Each
address can be associated with only one debug data entry, i.e. debug data
entries cannot overlap. Each entry is atomic, so removing it requires
passing the same address and range that were given at creation.

For debug data manipulation only the 'op' and 'extensions' fields of
'struct drm_xe_vm_bind_op' are used. All required parameters are passed
through 'struct drm_xe_vm_bind_op_ext_debug_data', and a valid instance
must be present in the extension chain pointed to by the 'extensions'
field.
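
A rough userspace sketch of the add operation (drm_fd, vm_id, addr and range
are placeholders; addr and range must be page aligned and must not overlap an
existing entry, and the same addr/range pair is later passed with
DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA to remove the entry):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/xe_drm.h>

/*
 * Illustrative only: associate a vm range with a debug data file
 * through the new extension. Error handling is omitted.
 */
static int add_debug_data(int drm_fd, __u32 vm_id, __u64 addr, __u64 range,
			  const char *path)
{
	struct drm_xe_vm_bind_op_ext_debug_data ext = {
		.base.name = XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA,
		.addr = addr,
		.range = range,
	};
	struct drm_xe_vm_bind bind = {
		.vm_id = vm_id,
		.num_binds = 1,
		.bind = {
			.op = DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA,
			.extensions = (uintptr_t)&ext,
		},
	};

	strncpy(ext.pathname, path, sizeof(ext.pathname) - 1);

	return ioctl(drm_fd, DRM_IOCTL_XE_VM_BIND, &bind);
}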

Debug data will be accessible through the eudebug event interface,
introduced in the following patch. An alternative way to access debug data
using debugfs, without relying on eudebug, will be proposed as a follow-up
to the eudebug series.

Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
---
 drivers/gpu/drm/xe/Makefile              |   1 +
 drivers/gpu/drm/xe/xe_debug_data.c       | 275 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_debug_data.h       |  22 ++
 drivers/gpu/drm/xe/xe_debug_data_types.h |  25 +++
 drivers/gpu/drm/xe/xe_vm.c               | 157 ++++++++++++-
 drivers/gpu/drm/xe/xe_vm_types.h         |  19 ++
 include/uapi/drm/xe_drm.h                |  36 +++
 7 files changed, 529 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data.c
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data.h
 create mode 100644 drivers/gpu/drm/xe/xe_debug_data_types.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index bfc0f18ab50e..fd79a28814bc 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -86,6 +86,7 @@ xe-y += xe_bb.o \
 	xe_irq.o \
 	xe_late_bind_fw.o \
 	xe_lrc.o \
+	xe_debug_data.o \
 	xe_migrate.o \
 	xe_mmio.o \
 	xe_mmio_gem.o \
diff --git a/drivers/gpu/drm/xe/xe_debug_data.c b/drivers/gpu/drm/xe/xe_debug_data.c
new file mode 100644
index 000000000000..99044dc477d5
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_debug_data.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_debug_data.h"
+#include "xe_debug_data_types.h"
+#include "xe_vm.h"
+
+const char *xe_debug_data_pseudo_path_to_string(u64 pseudopath)
+{
+	switch (pseudopath) {
+	case DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_MODULE_AREA:
+		return "[module_area]";
+	case DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_SBA_AREA:
+		return "[sba_area]";
+	case DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_SIP_AREA:
+		return "[sip_area]";
+	default:
+		return "[unknown]";
+	}
+}
+
+static int xe_debug_data_check_add(struct xe_vm *vm, struct drm_xe_vm_bind_op_ext_debug_data *ext)
+{
+	struct xe_debug_data *dd;
+	struct xe_device *xe = vm->xe;
+
+	mutex_lock(&vm->debug_data.lock);
+	list_for_each_entry(dd, &vm->debug_data.list, link) {
+		if (XE_IOCTL_DBG(xe, (dd->addr < ext->addr + ext->range) &&
+				     (ext->addr < dd->addr + dd->range))) {
+			mutex_unlock(&vm->debug_data.lock);
+			return -EINVAL;
+		}
+	}
+	mutex_unlock(&vm->debug_data.lock);
+
+	return 0;
+}
+
+static int xe_debug_data_check_remove(struct xe_vm *vm,
+				      struct drm_xe_vm_bind_op_ext_debug_data *ext)
+{
+	struct xe_debug_data *dd;
+	struct xe_device *xe = vm->xe;
+	bool found = false;
+
+	mutex_lock(&vm->debug_data.lock);
+	list_for_each_entry(dd, &vm->debug_data.list, link) {
+		if (dd->addr == ext->addr && dd->range == ext->range)
+			found = true;
+	}
+	mutex_unlock(&vm->debug_data.lock);
+
+	if (XE_IOCTL_DBG(xe, !found)) {
+		drm_dbg(&xe->drm, "Debug data to remove not found for addr 0x%llx, range 0x%llx\n",
+			ext->addr, ext->range);
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+int xe_debug_data_check_extension(struct xe_vm *vm, u32 operation, u64 extension)
+{
+	struct drm_xe_vm_bind_op_ext_debug_data *ext;
+	int ret = 0;
+
+	u64 __user *address = u64_to_user_ptr(extension);
+	struct xe_device *xe = vm->xe;
+
+	if (XE_IOCTL_DBG(xe, operation != DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA &&
+			     operation != DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA))
+		return -EINVAL;
+
+	ext = kzalloc(sizeof(*ext), GFP_KERNEL);
+	if (!ext)
+		return -ENOMEM;
+
+	if (copy_from_user(ext, address, sizeof(*ext))) {
+		kfree(ext);
+		return -EFAULT;
+	}
+
+	if (XE_IOCTL_DBG(xe, ext->flags & ~DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) ||
+	    XE_IOCTL_DBG(xe, ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO &&
+			     ext->offset != 0) ||
+	    XE_IOCTL_DBG(xe, ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO &&
+			     (ext->pseudopath < DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_MODULE_AREA ||
+			     ext->pseudopath > DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_SIP_AREA)) ||
+	    XE_IOCTL_DBG(xe, !(ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) &&
+			     strnlen(ext->pathname, PATH_MAX) >= PATH_MAX)) {
+		kfree(ext);
+		return -EINVAL;
+	}
+
+	ret = operation == DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA ?
+		xe_debug_data_check_add(vm, ext) :
+		xe_debug_data_check_remove(vm, ext);
+
+	kfree(ext);
+	return ret;
+}
+
+static int xe_debug_data_add(struct xe_vm *vm, struct xe_vma_op *vma_op,
+			     struct drm_xe_vm_bind_op_ext_debug_data *ext)
+{
+	struct xe_debug_data *dd;
+
+	vm_dbg(&vm->xe->drm,
+	       "ADD_DEBUG_DATA: addr=0x%016llx, range=0x%016llx, offset=0x%08x, flags=0x%016llx, path=%s\n",
+	       ext->addr, ext->range, ext->offset, ext->flags,
+	       (ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) ?
+	       xe_debug_data_pseudo_path_to_string(ext->pseudopath) : ext->pathname);
+
+	dd = kzalloc(sizeof(*dd), GFP_KERNEL);
+	if (!dd)
+		return -ENOMEM;
+
+	dd->addr = ext->addr;
+	dd->range = ext->range;
+	dd->flags = ext->flags;
+	dd->offset = ext->offset;
+
+	if (ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) {
+		dd->pseudopath = ext->pseudopath;
+	} else if (strscpy(dd->pathname, ext->pathname, PATH_MAX) < 0) {
+		kfree(dd);
+		return -EINVAL;
+	}
+
+	mutex_lock(&vm->debug_data.lock);
+	list_add_tail(&dd->link, &vm->debug_data.list);
+	mutex_unlock(&vm->debug_data.lock);
+
+	memcpy(&vma_op->modify_debug_data.debug_data, dd, sizeof(*dd));
+
+	return 0;
+}
+
+static int xe_debug_data_remove(struct xe_vm *vm, struct xe_vma_op *vma_op,
+				struct drm_xe_vm_bind_op_ext_debug_data *ext)
+{
+	struct xe_debug_data *dd;
+
+	vm_dbg(&vm->xe->drm,
+	       "REMOVE_DEBUG_DATA: addr=0x%016llx, range=0x%016llx, offset=0x%08x, flags=0x%016llx, path=%s\n",
+	       ext->addr, ext->range, ext->offset, ext->flags,
+	       (ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) ?
+	       xe_debug_data_pseudo_path_to_string(ext->pseudopath) : ext->pathname);
+
+	mutex_lock(&vm->debug_data.lock);
+	list_for_each_entry(dd, &vm->debug_data.list, link) {
+		if (dd->addr == ext->addr && dd->range == ext->range) {
+			list_del(&dd->link);
+			memcpy(&vma_op->modify_debug_data.debug_data, dd, sizeof(*dd));
+			kfree(dd);
+			break;
+		}
+	}
+	mutex_unlock(&vm->debug_data.lock);
+
+	return 0;
+}
+
+int xe_debug_data_process_extension(struct xe_vm *vm, struct drm_gpuva_ops *ops, u32 operation,
+				    u64 extension)
+{
+	struct drm_xe_vm_bind_op_ext_debug_data *ext;
+	struct xe_vma_op *vma_op;
+	struct drm_gpuva_op *op;
+
+	u64 __user *address = u64_to_user_ptr(extension);
+	int ret = 0;
+
+	ext = kzalloc(sizeof(*ext), GFP_KERNEL);
+	if (!ext)
+		return -ENOMEM;
+
+	if (copy_from_user(ext, address, sizeof(*ext))) {
+		kfree(ext);
+		return -EFAULT;
+	}
+
+	/* We expect only a single op for debug data */
+	op = drm_gpuva_first_op(ops);
+	if (op != drm_gpuva_last_op(ops))
+		drm_warn(&vm->xe->drm, "NOT POSSIBLE");
+
+	vma_op = gpuva_op_to_vma_op(op);
+
+	if (vma_op->subop == XE_VMA_SUBOP_ADD_DEBUG_DATA)
+		ret = xe_debug_data_add(vm, vma_op, ext);
+	else
+		ret = xe_debug_data_remove(vm, vma_op, ext);
+
+	kfree(ext);
+	return ret;
+}
+
+static int xe_debug_data_op_unwind_add(struct xe_vm *vm, struct xe_vma_op *vma_op)
+{
+	struct xe_debug_data *op_data = &vma_op->modify_debug_data.debug_data;
+	struct xe_debug_data *dd;
+
+	vm_dbg(&vm->xe->drm,
+	       "Reverting debug data add: addr=0x%016llx, range=0x%016llx, offset=0x%08x, flags=0x%016llx, path=%s\n",
+	       op_data->addr, op_data->range, op_data->offset, op_data->flags,
+	       (op_data->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) ?
+	       xe_debug_data_pseudo_path_to_string(op_data->pseudopath) : op_data->pathname);
+
+	mutex_lock(&vm->debug_data.lock);
+	list_for_each_entry(dd, &vm->debug_data.list, link) {
+		if (dd->addr == op_data->addr && dd->range == op_data->range) {
+			list_del(&dd->link);
+			kfree(dd);
+			break;
+		}
+	}
+	mutex_unlock(&vm->debug_data.lock);
+
+	return 0;
+}
+
+static int xe_debug_data_op_unwind_remove(struct xe_vm *vm, struct xe_vma_op *vma_op)
+{
+	struct xe_debug_data *op_data = &vma_op->modify_debug_data.debug_data;
+	struct xe_debug_data *dd;
+
+	vm_dbg(&vm->xe->drm,
+	       "Reverting debug data remove: addr=0x%016llx, range=0x%016llx, offset=0x%08x, flags=0x%016llx, path=%s\n",
+	       op_data->addr, op_data->range, op_data->offset, op_data->flags,
+	       (op_data->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) ?
+	       xe_debug_data_pseudo_path_to_string(op_data->pseudopath) : op_data->pathname);
+
+	dd = kzalloc(sizeof(*dd), GFP_KERNEL);
+	if (!dd)
+		return -ENOMEM;
+
+	memcpy(dd, op_data, sizeof(*dd));
+
+	mutex_lock(&vm->debug_data.lock);
+	list_add_tail(&dd->link, &vm->debug_data.list);
+	mutex_unlock(&vm->debug_data.lock);
+
+	return 0;
+}
+
+int xe_debug_data_op_unwind(struct xe_vm *vm, struct xe_vma_op *vma_op)
+{
+	switch (vma_op->subop) {
+	case XE_VMA_SUBOP_ADD_DEBUG_DATA:
+		return xe_debug_data_op_unwind_add(vm, vma_op);
+	case XE_VMA_SUBOP_REMOVE_DEBUG_DATA:
+		return xe_debug_data_op_unwind_remove(vm, vma_op);
+	default:
+		drm_err(&vm->xe->drm, "Invalid debug data subop %d\n", vma_op->subop);
+		return -EINVAL;
+	}
+}
+
+int xe_debug_data_destroy(struct xe_vm *vm)
+{
+	struct xe_debug_data *dd, *tmp;
+
+	mutex_lock(&vm->debug_data.lock);
+	list_for_each_entry_safe(dd, tmp, &vm->debug_data.list, link) {
+		list_del(&dd->link);
+		kfree(dd);
+	}
+	mutex_unlock(&vm->debug_data.lock);
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/xe/xe_debug_data.h b/drivers/gpu/drm/xe/xe_debug_data.h
new file mode 100644
index 000000000000..3436a7023920
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_debug_data.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_DEBUG_DATA_H_
+#define _XE_DEBUG_DATA_H_
+
+#include <linux/types.h>
+
+struct drm_gpuva_ops;
+struct xe_vm;
+struct xe_vma_op;
+
+const char *xe_debug_data_pseudo_path_to_string(u64 pseudopath);
+int xe_debug_data_check_extension(struct xe_vm *vm, u32 operation, u64 extension);
+int xe_debug_data_process_extension(struct xe_vm *vm, struct drm_gpuva_ops *ops, u32 operation,
+				    u64 extension);
+int xe_debug_data_op_unwind(struct xe_vm *vm, struct xe_vma_op *vma_op);
+int xe_debug_data_destroy(struct xe_vm *vm);
+
+#endif /* _XE_DEBUG_DATA_H_ */
diff --git a/drivers/gpu/drm/xe/xe_debug_data_types.h b/drivers/gpu/drm/xe/xe_debug_data_types.h
new file mode 100644
index 000000000000..a8b430af2275
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_debug_data_types.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_DEBUG_DATA_TYPES_H_
+#define _XE_DEBUG_DATA_TYPES_H_
+
+#include <linux/limits.h>
+#include <linux/list.h>
+#include <linux/types.h>
+
+struct xe_debug_data {
+	struct list_head link;
+	u64 addr;
+	u64 range;
+	u64 flags;
+	u32 offset;
+	union {
+		u64 pseudopath;
+		char pathname[PATH_MAX];
+	};
+};
+
+#endif /* _XE_DEBUG_DATA_TYPES_H_ */
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 85c7e1b8e232..79ad453c01f4 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -24,6 +24,7 @@
 #include "regs/xe_gtt_defs.h"
 #include "xe_assert.h"
 #include "xe_bo.h"
+#include "xe_debug_data.h"
 #include "xe_device.h"
 #include "xe_drm_client.h"
 #include "xe_eudebug.h"
@@ -1498,6 +1499,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags, struct xe_file *xef)
 	for_each_tile(tile, xe, id)
 		xe_range_fence_tree_init(&vm->rftree[id]);
 
+	INIT_LIST_HEAD(&vm->debug_data.list);
+	mutex_init(&vm->debug_data.lock);
+
 	vm->pt_ops = &xelp_pt_ops;
 
 	/*
@@ -1795,6 +1799,8 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	for_each_tile(tile, xe, id)
 		xe_range_fence_tree_fini(&vm->rftree[id]);
 
+	xe_debug_data_destroy(vm);
+
 	xe_vm_put(vm);
 }
 
@@ -2136,6 +2142,7 @@ static void prep_vma_destroy(struct xe_vm *vm, struct xe_vma *vma,
 #if IS_ENABLED(CONFIG_DRM_XE_DEBUG_VM)
 static void print_op(struct xe_device *xe, struct drm_gpuva_op *op)
 {
+	struct xe_vma_op *vma_op;
 	struct xe_vma *vma;
 
 	switch (op->op) {
@@ -2170,6 +2177,12 @@ static void print_op(struct xe_device *xe, struct drm_gpuva_op *op)
 		vm_dbg(&xe->drm, "PREFETCH: addr=0x%016llx, range=0x%016llx",
 		       (ULL)xe_vma_start(vma), (ULL)xe_vma_size(vma));
 		break;
+	case DRM_GPUVA_OP_DRIVER:
+		vma_op = gpuva_op_to_vma_op(op);
+		if (vma_op->subop != XE_VMA_SUBOP_ADD_DEBUG_DATA &&
+		    vma_op->subop != XE_VMA_SUBOP_REMOVE_DEBUG_DATA)
+			drm_warn(&xe->drm, "Unexpected vma sub op: %d", vma_op->subop);
+		break;
 	default:
 		drm_warn(&xe->drm, "NOT POSSIBLE");
 	}
@@ -2214,12 +2227,13 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_vma_ops *vops,
 			 struct xe_bo *bo, u64 bo_offset_or_userptr,
 			 u64 addr, u64 range,
 			 u32 operation, u32 flags,
-			 u32 prefetch_region, u16 pat_index)
+			 u32 prefetch_region, u16 pat_index, u64 extensions)
 {
 	struct drm_gem_object *obj = bo ? &bo->ttm.base : NULL;
 	struct drm_gpuva_ops *ops;
 	struct drm_gpuva_op *__op;
 	struct drm_gpuvm_bo *vm_bo;
+	struct xe_vma_op *vma_op;
 	u64 range_end = addr + range;
 	int err;
 
@@ -2266,6 +2280,24 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_vma_ops *vops,
 		drm_gpuvm_bo_put(vm_bo);
 		xe_bo_unlock(bo);
 		break;
+	case DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA:
+	case DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA:
+		ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+		if (!ops)
+			return ERR_PTR(-ENOMEM);
+
+		INIT_LIST_HEAD(&ops->list);
+		vma_op = kzalloc(sizeof(*vma_op), GFP_KERNEL);
+		if (!vma_op) {
+			kfree(ops);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		vma_op->base.op = DRM_GPUVA_OP_DRIVER;
+		vma_op->subop = operation == DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA ?
+			XE_VMA_SUBOP_ADD_DEBUG_DATA : XE_VMA_SUBOP_REMOVE_DEBUG_DATA;
+		list_add_tail(&vma_op->base.entry, &ops->list);
+		break;
 	default:
 		drm_warn(&vm->xe->drm, "NOT POSSIBLE");
 		ops = ERR_PTR(-EINVAL);
@@ -2531,6 +2563,11 @@ static int xe_vma_op_commit(struct xe_vm *vm, struct xe_vma_op *op)
 	case DRM_GPUVA_OP_PREFETCH:
 		op->flags |= XE_VMA_OP_COMMITTED;
 		break;
+	case DRM_GPUVA_OP_DRIVER:
+		if (op->subop != XE_VMA_SUBOP_ADD_DEBUG_DATA &&
+		    op->subop != XE_VMA_SUBOP_REMOVE_DEBUG_DATA)
+			drm_warn(&vm->xe->drm, "Unexpected vma sub op: %d", op->subop);
+		break;
 	default:
 		drm_warn(&vm->xe->drm, "NOT POSSIBLE");
 	}
@@ -2747,6 +2784,11 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops,
 				xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask, 1);
 
 			break;
+		case DRM_GPUVA_OP_DRIVER:
+			if (op->subop != XE_VMA_SUBOP_ADD_DEBUG_DATA &&
+			    op->subop != XE_VMA_SUBOP_REMOVE_DEBUG_DATA)
+				drm_warn(&vm->xe->drm, "Unexpected vma sub op: %d", op->subop);
+			break;
 		default:
 			drm_warn(&vm->xe->drm, "NOT POSSIBLE");
 		}
@@ -2809,6 +2851,13 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
 	case DRM_GPUVA_OP_PREFETCH:
 		/* Nothing to do */
 		break;
+	case DRM_GPUVA_OP_DRIVER:
+		if (op->subop == XE_VMA_SUBOP_ADD_DEBUG_DATA ||
+		    op->subop == XE_VMA_SUBOP_REMOVE_DEBUG_DATA)
+			xe_debug_data_op_unwind(vm, op);
+		else
+			drm_warn(&vm->xe->drm, "Unexpected vma sub op: %d", op->subop);
+		break;
 	default:
 		drm_warn(&vm->xe->drm, "NOT POSSIBLE");
 	}
@@ -2974,6 +3023,11 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
 					    exec);
 		break;
 	}
+	case DRM_GPUVA_OP_DRIVER:
+		if (op->subop != XE_VMA_SUBOP_ADD_DEBUG_DATA &&
+		    op->subop != XE_VMA_SUBOP_REMOVE_DEBUG_DATA)
+			drm_warn(&vm->xe->drm, "Unexpected vma sub op: %d", op->subop);
+		break;
 	default:
 		drm_warn(&vm->xe->drm, "NOT POSSIBLE");
 	}
@@ -3198,6 +3252,11 @@ static void op_add_ufence(struct xe_vm *vm, struct xe_vma_op *op,
 	case DRM_GPUVA_OP_PREFETCH:
 		vma_add_ufence(gpuva_to_vma(op->base.prefetch.va), ufence);
 		break;
+	case DRM_GPUVA_OP_DRIVER:
+		if (op->subop != XE_VMA_SUBOP_ADD_DEBUG_DATA &&
+		    op->subop != XE_VMA_SUBOP_REMOVE_DEBUG_DATA)
+			drm_warn(&vm->xe->drm, "Unexpected vma sub op: %d", op->subop);
+		break;
 	default:
 		drm_warn(&vm->xe->drm, "NOT POSSIBLE");
 	}
@@ -3285,6 +3344,79 @@ ALLOW_ERROR_INJECTION(vm_bind_ioctl_ops_execute, ERRNO);
 #define XE_64K_PAGE_MASK 0xffffull
 #define ALL_DRM_XE_SYNCS_FLAGS (DRM_XE_SYNCS_FLAG_WAIT_FOR_OP)
 
+#define MAX_USER_EXTENSIONS	16
+
+typedef int (*xe_vm_bind_user_extension_check_fn)(struct xe_vm *vm, u32 operation, u64 extension);
+
+typedef int (*xe_vm_bind_user_extension_process_fn)(struct xe_vm *vm, struct drm_gpuva_ops *ops,
+						    u32 operation, u64 extension);
+
+static const xe_vm_bind_user_extension_check_fn vm_bind_extension_check_funcs[] = {
+	[XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA] = xe_debug_data_check_extension,
+};
+
+static const xe_vm_bind_user_extension_process_fn vm_bind_extension_process_funcs[] = {
+	[XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA] = xe_debug_data_process_extension,
+};
+
+#define MAX_USER_EXTENSIONS 16
+static int __vm_bind_op_user_extensions(struct xe_vm *vm, struct drm_gpuva_ops *ops,
+					u32 operation, u64 extensions)
+{
+	struct xe_device *xe = vm->xe;
+	int debug_data_count = 0;
+	int ext_count = 0;
+	int err = -1;
+
+	struct drm_xe_user_extension ext;
+
+	while (extensions) {
+		u64 __user *address = u64_to_user_ptr(extensions);
+
+		if (XE_IOCTL_DBG(xe, ++ext_count >= MAX_USER_EXTENSIONS))
+			return -E2BIG;
+
+		err = copy_from_user(&ext, address, sizeof(ext));
+		if (XE_IOCTL_DBG(xe, err))
+			return -EFAULT;
+
+		if (XE_IOCTL_DBG(xe, operation != DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA &&
+				     operation != DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA &&
+				     ext.name == XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA) ||
+		    XE_IOCTL_DBG(xe, ext.name == XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA &&
+				     ++debug_data_count > 1))
+			return -EINVAL;
+
+		if (XE_IOCTL_DBG(xe, ext.pad) ||
+		    XE_IOCTL_DBG(xe, ext.name > XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA))
+			return -EINVAL;
+
+		if (!ops)
+			err = vm_bind_extension_check_funcs[ext.name](vm, operation, extensions);
+		else
+			err = vm_bind_extension_process_funcs[ext.name](vm, ops, operation,
+									extensions);
+
+		if (XE_IOCTL_DBG(xe, err))
+			return err;
+
+		extensions = ext.next_extension;
+	}
+
+	return 0;
+}
+
+static int vm_bind_ioctl_check_user_extensions(struct xe_vm *vm, u32 operation, u64 extensions)
+{
+	return __vm_bind_op_user_extensions(vm, NULL, operation, extensions);
+}
+
+static int vm_bind_ioctl_process_user_extensions(struct xe_vm *vm, struct drm_gpuva_ops *ops,
+						 u32 operation, u64 extensions)
+{
+	return __vm_bind_op_user_extensions(vm, ops, operation, extensions);
+}
+
 static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
 				    struct drm_xe_vm_bind *args,
 				    struct drm_xe_vm_bind_op **bind_ops)
@@ -3333,6 +3465,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
 		bool is_cpu_addr_mirror = flags &
 			DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR;
 		u16 pat_index = (*bind_ops)[i].pat_index;
+		u64 extensions = (*bind_ops)[i].extensions;
 		u16 coh_mode;
 
 		if (XE_IOCTL_DBG(xe, is_cpu_addr_mirror &&
@@ -3360,7 +3493,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
 			goto free_bind_ops;
 		}
 
-		if (XE_IOCTL_DBG(xe, op > DRM_XE_VM_BIND_OP_PREFETCH) ||
+		if (XE_IOCTL_DBG(xe, op > DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA) ||
 		    XE_IOCTL_DBG(xe, flags & ~SUPPORTED_FLAGS) ||
 		    XE_IOCTL_DBG(xe, obj && (is_null || is_cpu_addr_mirror)) ||
 		    XE_IOCTL_DBG(xe, obj_offset && (is_null ||
@@ -3398,10 +3531,16 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
 		    XE_IOCTL_DBG(xe, addr & ~PAGE_MASK) ||
 		    XE_IOCTL_DBG(xe, range & ~PAGE_MASK) ||
 		    XE_IOCTL_DBG(xe, !range &&
-				 op != DRM_XE_VM_BIND_OP_UNMAP_ALL)) {
+				 op != DRM_XE_VM_BIND_OP_UNMAP_ALL &&
+				 op != DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA &&
+				 op != DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA)) {
 			err = -EINVAL;
 			goto free_bind_ops;
 		}
+
+		err = vm_bind_ioctl_check_user_extensions(vm, op, extensions);
+		if (err)
+			goto free_bind_ops;
 	}
 
 	return 0;
@@ -3653,11 +3792,17 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u64 obj_offset = bind_ops[i].obj_offset;
 		u32 prefetch_region = bind_ops[i].prefetch_mem_region_instance;
 		u16 pat_index = bind_ops[i].pat_index;
+		u64 extensions = bind_ops[i].extensions;
 
 		ops[i] = vm_bind_ioctl_ops_create(vm, &vops, bos[i], obj_offset,
 						  addr, range, op, flags,
-						  prefetch_region, pat_index);
-		if (IS_ERR(ops[i])) {
+						  prefetch_region, pat_index, extensions);
+
+		if (!IS_ERR(ops[i]) && extensions) {
+			err = vm_bind_ioctl_process_user_extensions(vm, ops[i], op, extensions);
+			if (err)
+				goto unwind_ops;
+		} else if (IS_ERR(ops[i])) {
 			err = PTR_ERR(ops[i]);
 			ops[i] = NULL;
 			goto unwind_ops;
@@ -3765,7 +3910,7 @@ struct dma_fence *xe_vm_bind_kernel_bo(struct xe_vm *vm, struct xe_bo *bo,
 
 	ops = vm_bind_ioctl_ops_create(vm, &vops, bo, 0, addr, xe_bo_size(bo),
 				       DRM_XE_VM_BIND_OP_MAP, 0, 0,
-				       vm->xe->pat.idx[cache_lvl]);
+				       vm->xe->pat.idx[cache_lvl], 0);
 	if (IS_ERR(ops)) {
 		err = PTR_ERR(ops);
 		goto release_vm_lock;
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index da39940501d8..92dcae6a5996 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -14,6 +14,7 @@
 #include <linux/mmu_notifier.h>
 #include <linux/scatterlist.h>
 
+#include "xe_debug_data_types.h"
 #include "xe_device_types.h"
 #include "xe_pt_types.h"
 #include "xe_range_fence.h"
@@ -339,6 +340,12 @@ struct xe_vm {
 	bool batch_invalidate_tlb;
 	/** @xef: XE file handle for tracking this VM's drm client */
 	struct xe_file *xef;
+
+	/** @debug_data: track debug_data mapped to vm */
+	struct {
+		struct list_head list;
+		struct mutex lock;
+	} debug_data;
 };
 
 /** struct xe_vma_op_map - VMA map operation */
@@ -412,6 +419,12 @@ struct xe_vma_op_prefetch_range {
 	struct xe_tile *tile;
 };
 
+/** struct xe_vma_op_modify_debug_data - debug data altering operation */
+struct xe_vma_op_modify_debug_data {
+	/** @debug_data: debug data associated with that operation */
+	struct xe_debug_data debug_data;
+};
+
 /** enum xe_vma_op_flags - flags for VMA operation */
 enum xe_vma_op_flags {
 	/** @XE_VMA_OP_COMMITTED: VMA operation committed */
@@ -428,6 +441,10 @@ enum xe_vma_subop {
 	XE_VMA_SUBOP_MAP_RANGE,
 	/** @XE_VMA_SUBOP_UNMAP_RANGE: Unmap range */
 	XE_VMA_SUBOP_UNMAP_RANGE,
+	/** @XE_VMA_SUBOP_ADD_DEBUG_DATA: Add debug data to vm */
+	XE_VMA_SUBOP_ADD_DEBUG_DATA,
+	/** @XE_VMA_SUBOP_REMOVE_DEBUG_DATA: Remove debug data from vm */
+	XE_VMA_SUBOP_REMOVE_DEBUG_DATA,
 };
 
 /** struct xe_vma_op - VMA operation */
@@ -456,6 +473,8 @@ struct xe_vma_op {
 		struct xe_vma_op_unmap_range unmap_range;
 		/** @prefetch_range: VMA prefetch range operation specific data */
 		struct xe_vma_op_prefetch_range prefetch_range;
+		/** @debug_data: debug_data operation specific data */
+		struct xe_vma_op_modify_debug_data modify_debug_data;
 	};
 };
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index ba98da4320da..fba4b5d6f152 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -6,6 +6,8 @@
 #ifndef _UAPI_XE_DRM_H_
 #define _UAPI_XE_DRM_H_
 
+#include <linux/limits.h>
+
 #include "drm.h"
 
 #if defined(__cplusplus)
@@ -983,6 +985,35 @@ struct drm_xe_vm_destroy {
 	__u64 reserved[2];
 };
 
+struct drm_xe_vm_bind_op_ext_debug_data {
+	/** @base: base user extension */
+	struct drm_xe_user_extension base;
+
+	/** @addr: Address of the metadata mapping */
+	__u64 addr;
+
+	/** @range: Range of the metadata mapping */
+	__u64 range;
+
+#define DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO (1 << 0)
+	/** @flags: Debug metadata flags */
+	__u64 flags;
+
+	/** @offset: Offset into the debug data file, MBZ for DEBUG_PSEUDO */
+	__u32 offset;
+
+	/** @reserved: Reserved */
+	__u32 reserved;
+
+	union {
+#define DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_MODULE_AREA	0x1
+#define DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_SBA_AREA	0x2
+#define DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_SIP_AREA	0x3
+		__u64 pseudopath;
+		char pathname[PATH_MAX];
+	};
+};
+
 /**
  * struct drm_xe_vm_bind_op - run bind operations
  *
@@ -992,6 +1023,8 @@ struct drm_xe_vm_destroy {
  *  - %DRM_XE_VM_BIND_OP_MAP_USERPTR
  *  - %DRM_XE_VM_BIND_OP_UNMAP_ALL
  *  - %DRM_XE_VM_BIND_OP_PREFETCH
+ *  - %DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA
+ *  - %DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA
  *
  * and the @flags can be:
  *  - %DRM_XE_VM_BIND_FLAG_READONLY - Setup the page tables as read-only
@@ -1021,6 +1054,7 @@ struct drm_xe_vm_destroy {
  *    the memory region advised by madvise.
  */
 struct drm_xe_vm_bind_op {
+#define	XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA 0
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
@@ -1112,6 +1146,8 @@ struct drm_xe_vm_bind_op {
 #define DRM_XE_VM_BIND_OP_MAP_USERPTR	0x2
 #define DRM_XE_VM_BIND_OP_UNMAP_ALL	0x3
 #define DRM_XE_VM_BIND_OP_PREFETCH	0x4
+#define DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA	0x5
+#define DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA	0x6
 	/** @op: Bind operation to perform */
 	__u32 op;
 
-- 
2.43.0



* [PATCH 06/20] drm/xe/eudebug: Introduce vm bind and vm bind debug data events
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (4 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 05/20] drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 07/20] drm/xe/eudebug: Add UFENCE events with acks Mika Kuoppala
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

From: Christoph Manszewski <christoph.manszewski@intel.com>

This patch adds events to track the bind ioctl and associated debug data add
and remove operations. As a single bind can involve multiple operations and
may fail mid-process, events are stored until the full chain of operations
succeeds and only then relayed to the debugger. If no debug data operations occur,
no events are sent to avoid unnecessary debugger notifications.
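
On the debugger side the op events can be tied back to their parent bind via
the reference seqno. A rough sketch (assuming the base event exposes a seqno
member as used on the kernel side here, that events arrive in order, and that
only one bind is outstanding at a time; reading from the eudebug fd is not
shown):

#include <drm/xe_drm_eudebug.h>	/* uapi header added by this series */

/*
 * Illustrative only: count the debug data ops that belong to the most
 * recently seen vm_bind event.
 */
static void track_bind_events(const struct drm_xe_eudebug_event *ev)
{
	static unsigned long long bind_seqno;
	static unsigned int ops_left;

	if (ev->type == DRM_XE_EUDEBUG_EVENT_VM_BIND) {
		const struct drm_xe_eudebug_event_vm_bind *b =
			(const void *)ev;

		bind_seqno = ev->seqno;	/* base event seqno, assumed */
		ops_left = b->num_binds;
	} else if (ev->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA) {
		const struct drm_xe_eudebug_event_vm_bind_op_debug_data *op =
			(const void *)ev;

		if (op->vm_bind_ref_seqno == bind_seqno && ops_left)
			ops_left--;	/* part of the pending bind sequence */
	}
}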

v2: always end bind sequence on error (Mika, Maciej)

Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Co-developed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_debug_data.c    |   4 +
 drivers/gpu/drm/xe/xe_eudebug.c       | 383 +++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_eudebug.h       |  19 ++
 drivers/gpu/drm/xe/xe_eudebug_types.h |   2 +-
 drivers/gpu/drm/xe/xe_vm.c            |  13 +-
 drivers/gpu/drm/xe/xe_vm_types.h      |  13 +
 include/uapi/drm/xe_drm_eudebug.h     |  71 +++++
 7 files changed, 499 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_debug_data.c b/drivers/gpu/drm/xe/xe_debug_data.c
index 99044dc477d5..7952fc764815 100644
--- a/drivers/gpu/drm/xe/xe_debug_data.c
+++ b/drivers/gpu/drm/xe/xe_debug_data.c
@@ -3,6 +3,7 @@
  * Copyright © 2025 Intel Corporation
  */
 
+#include "xe_eudebug.h"
 #include "xe_debug_data.h"
 #include "xe_debug_data_types.h"
 #include "xe_vm.h"
@@ -136,6 +137,8 @@ static int xe_debug_data_add(struct xe_vm *vm, struct xe_vma_op *vma_op,
 
 	memcpy(&vma_op->modify_debug_data.debug_data, dd, sizeof(*dd));
 
+	xe_eudebug_vm_bind_op_add(vm, DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA, dd);
+
 	return 0;
 }
 
@@ -153,6 +156,7 @@ static int xe_debug_data_remove(struct xe_vm *vm, struct xe_vma_op *vma_op,
 	mutex_lock(&vm->debug_data.lock);
 	list_for_each_entry(dd, &vm->debug_data.list, link) {
 		if (dd->addr == ext->addr && dd->range == ext->range) {
+			xe_eudebug_vm_bind_op_add(vm, DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA, dd);
 			list_del(&dd->link);
 			memcpy(&vma_op->modify_debug_data.debug_data, dd, sizeof(*dd));
 			kfree(dd);
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index a6c0d2391e0e..0255f35924d8 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -12,6 +12,7 @@
 #include <uapi/drm/xe_drm.h>
 
 #include "xe_assert.h"
+#include "xe_debug_data_types.h"
 #include "xe_device.h"
 #include "xe_eudebug.h"
 #include "xe_eudebug_types.h"
@@ -848,6 +849,324 @@ void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q)
 	xe_eudebug_event_put(d, exec_queue_destroy_event(d, xef, q));
 }
 
+struct xe_eudebug_event_envelope {
+	struct list_head link;
+	struct drm_xe_eudebug_event *event;
+};
+
+static int xe_eudebug_queue_bind_event(struct xe_eudebug *d,
+				       struct xe_vm *vm,
+				       struct drm_xe_eudebug_event *event)
+{
+	struct xe_eudebug_event_envelope *env;
+
+	lockdep_assert_held_write(&vm->lock);
+
+	env = kmalloc(sizeof(*env), GFP_KERNEL);
+	if (!env)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&env->link);
+	env->event = event;
+
+	spin_lock(&vm->eudebug.lock);
+	list_add_tail(&env->link, &vm->eudebug.events);
+
+	if (event->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA)
+		++vm->eudebug.ops;
+	spin_unlock(&vm->eudebug.lock);
+
+	return 0;
+}
+
+static int queue_vm_bind_event(struct xe_eudebug *d,
+			       struct xe_vm *vm,
+			       u64 vm_handle,
+			       u32 bind_flags,
+			       u32 num_ops, u64 *seqno)
+{
+	struct drm_xe_eudebug_event_vm_bind *e;
+	struct drm_xe_eudebug_event *event;
+	const u32 sz = sizeof(*e);
+	const u32 base_flags = DRM_XE_EUDEBUG_EVENT_STATE_CHANGE;
+
+	*seqno = atomic_long_inc_return(&d->events.seqno);
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND,
+					*seqno, base_flags, sz);
+	if (!event)
+		return -ENOMEM;
+
+	e = cast_event(e, event);
+
+	e->vm_handle = vm_handle;
+	e->flags = bind_flags;
+	e->num_binds = num_ops;
+
+	/* If in discovery, no need to collect ops */
+	if (!completion_done(&d->discovery)) {
+		XE_WARN_ON(!num_ops);
+		return xe_eudebug_queue_event(d, event);
+	}
+
+	return xe_eudebug_queue_bind_event(d, vm, event);
+}
+
+static int vm_bind_event(struct xe_eudebug *d,
+			 struct xe_vm *vm,
+			 u32 num_ops,
+			 u64 *seqno)
+{
+	int h_vm;
+
+	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, vm);
+	if (h_vm < 0)
+		return h_vm;
+
+	return queue_vm_bind_event(d, vm, h_vm, 0,
+				   num_ops, seqno);
+}
+
+static int vm_bind_op_event(struct xe_eudebug *d,
+			    struct xe_vm *vm,
+			    const u32 flags,
+			    const u64 bind_ref_seqno,
+			    const u64 num_extensions,
+			    struct xe_debug_data *debug_data,
+			    u64 *op_seqno)
+{
+	struct drm_xe_eudebug_event_vm_bind_op_debug_data *e;
+	struct drm_xe_eudebug_event *event;
+	const u32 sz = sizeof(*e);
+
+	*op_seqno = atomic_long_inc_return(&d->events.seqno);
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA,
+					*op_seqno, flags, sz);
+	if (!event)
+		return -ENOMEM;
+
+	e = cast_event(e, event);
+
+	e->vm_bind_ref_seqno = bind_ref_seqno;
+	e->num_extensions = num_extensions;
+	e->addr = debug_data->addr;
+	e->range = debug_data->range;
+	e->flags = debug_data->flags;
+	e->offset = debug_data->offset;
+
+	if (debug_data->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO)
+		e->pseudopath = debug_data->pseudopath;
+	else
+		strscpy(e->pathname, debug_data->pathname, PATH_MAX);
+
+	/* If in discovery, no need to collect ops */
+	if (!completion_done(&d->discovery))
+		return xe_eudebug_queue_event(d, event);
+
+	return xe_eudebug_queue_bind_event(d, vm, event);
+}
+
+static int vm_bind_op(struct xe_eudebug *d, struct xe_vm *vm,
+		      const u32 flags, const u64 bind_ref_seqno,
+		      struct xe_debug_data *debug_data)
+{
+	u64 op_seqno = 0;
+	u64 num_extensions = 0;
+	int ret;
+
+	ret = vm_bind_op_event(d, vm, flags, bind_ref_seqno, num_extensions,
+			       debug_data, &op_seqno);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+void xe_eudebug_vm_init(struct xe_vm *vm)
+{
+	INIT_LIST_HEAD(&vm->eudebug.events);
+	spin_lock_init(&vm->eudebug.lock);
+	vm->eudebug.ops = 0;
+	vm->eudebug.ref_seqno = 0;
+}
+
+void xe_eudebug_vm_bind_start(struct xe_vm *vm)
+{
+	struct xe_eudebug *d;
+	u64 seqno = 0;
+	int err;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return;
+
+	d = xe_eudebug_get(vm->xef);
+	if (!d)
+		return;
+
+	lockdep_assert_held_write(&vm->lock);
+
+	if (XE_WARN_ON(!list_empty(&vm->eudebug.events)) ||
+	    XE_WARN_ON(vm->eudebug.ops) ||
+	    XE_WARN_ON(vm->eudebug.ref_seqno)) {
+		eu_err(d, "bind busy on %s",  __func__);
+		xe_eudebug_disconnect(d, -EINVAL);
+	}
+
+	err = vm_bind_event(d, vm, 0, &seqno);
+	if (err) {
+		eu_err(d, "error %d on %s", err, __func__);
+		xe_eudebug_disconnect(d, err);
+	}
+
+	spin_lock(&vm->eudebug.lock);
+	XE_WARN_ON(vm->eudebug.ref_seqno);
+	vm->eudebug.ref_seqno = seqno;
+	vm->eudebug.ops = 0;
+	spin_unlock(&vm->eudebug.lock);
+
+	xe_eudebug_put(d);
+}
+
+void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, struct xe_debug_data *debug_data)
+{
+	struct xe_eudebug *d;
+	u32 flags;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return;
+
+	switch (op) {
+	case DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA:
+		flags = DRM_XE_EUDEBUG_EVENT_CREATE;
+		break;
+	case DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA:
+		flags = DRM_XE_EUDEBUG_EVENT_DESTROY;
+		break;
+	default:
+		flags = 0;
+		break;
+	}
+
+	if (!flags)
+		return;
+
+	d = xe_eudebug_get(vm->xef);
+	if (!d)
+		return;
+
+	xe_eudebug_event_put(d, vm_bind_op(d, vm, flags, 0, debug_data));
+}
+
+static struct drm_xe_eudebug_event *fetch_bind_event(struct xe_vm * const vm)
+{
+	struct xe_eudebug_event_envelope *env;
+	struct drm_xe_eudebug_event *e = NULL;
+
+	spin_lock(&vm->eudebug.lock);
+	env = list_first_entry_or_null(&vm->eudebug.events,
+				       struct xe_eudebug_event_envelope, link);
+	if (env) {
+		e = env->event;
+		list_del(&env->link);
+	}
+	spin_unlock(&vm->eudebug.lock);
+
+	kfree(env);
+
+	return e;
+}
+
+static void fill_vm_bind_fields(struct xe_vm *vm,
+				struct drm_xe_eudebug_event *e,
+				bool ufence,
+				u32 bind_ops)
+{
+	struct drm_xe_eudebug_event_vm_bind *eb = cast_event(eb, e);
+
+	eb->flags = ufence ?
+		DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE : 0;
+	eb->num_binds = bind_ops;
+}
+
+static void fill_vm_bind_op_fields(struct xe_vm *vm,
+				   struct drm_xe_eudebug_event *e,
+				   u64 ref_seqno)
+{
+	struct drm_xe_eudebug_event_vm_bind_op_debug_data *op;
+
+	if (e->type != DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA)
+		return;
+
+	op = cast_event(op, e);
+	op->vm_bind_ref_seqno = ref_seqno;
+}
+
+void xe_eudebug_vm_bind_end(struct xe_vm *vm,
+			    struct xe_user_fence *ufence,
+			    int bind_err)
+{
+	struct drm_xe_eudebug_event *e;
+	struct xe_eudebug *d;
+	u32 bind_ops;
+	u64 ref;
+
+	if (!xe_vm_in_lr_mode(vm))
+		return;
+
+	spin_lock(&vm->eudebug.lock);
+	ref = vm->eudebug.ref_seqno;
+	vm->eudebug.ref_seqno = 0;
+	bind_ops = vm->eudebug.ops;
+	vm->eudebug.ops = 0;
+	spin_unlock(&vm->eudebug.lock);
+
+	XE_WARN_ON(ufence && bind_err);
+
+	e = fetch_bind_event(vm);
+	if (!e)
+		return;
+
+	d = NULL;
+	if (!bind_err && ref) {
+		d = xe_eudebug_get(vm->xef);
+		if (d) {
+			if (bind_ops) {
+				fill_vm_bind_fields(vm, e, ufence != NULL, bind_ops);
+			} else {
+				/*
+				 * If there was no ops we are interested in,
+				 * we can omit the whole sequence
+				 */
+				xe_eudebug_put(d);
+				d = NULL;
+			}
+		}
+	}
+
+	while (e) {
+		if (d) {
+			int err;
+
+			err = xe_eudebug_queue_event(d, e);
+			if (err) {
+				xe_eudebug_disconnect(d, err);
+				xe_eudebug_put(d);
+				d = NULL;
+			}
+		} else {
+			kfree(e);
+		}
+
+		e = fetch_bind_event(vm);
+		if (e && ref && d)
+			fill_vm_bind_op_fields(vm, e, ref);
+	}
+
+	if (d)
+		xe_eudebug_put(d);
+}
+
 static struct xe_file *xe_eudebug_target_get(struct xe_eudebug *d)
 {
 	struct xe_file *xef = NULL;
@@ -860,19 +1179,67 @@ static struct xe_file *xe_eudebug_target_get(struct xe_eudebug *d)
 	return xef;
 }
 
+static int vm_discover_binds(struct xe_eudebug *d, struct xe_vm *vm)
+{
+	struct xe_debug_data *dd;
+	struct list_head *pos;
+	unsigned int ops, count;
+	u64 ref_seqno;
+	int err;
+
+	if (list_empty(&vm->debug_data.list))
+		return 0;
+
+	count = 0;
+	list_for_each(pos, &vm->debug_data.list)
+		count++;
+
+	ops = count;
+	ref_seqno = 0;
+	err = vm_bind_event(d, vm, ops, &ref_seqno);
+	if (err) {
+		eu_dbg(d, "vm_bind_event error %d\n", err);
+		return err;
+	}
+
+	list_for_each_entry(dd, &vm->debug_data.list, link) {
+		err = vm_bind_op(d, vm, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno, dd);
+		if (err) {
+			eu_dbg(d, "vm_bind_op error %d\n", err);
+			return err;
+		}
+
+		ops--;
+	}
+
+	XE_WARN_ON(ops);
+
+	return ops ? -EIO : count;
+}
+
 static void discover_client(struct xe_eudebug *d)
 {
 	struct xe_file *xef;
 	struct xe_exec_queue *q;
 	struct xe_vm *vm;
 	unsigned long i;
-	unsigned int vm_count = 0, eq_count = 0;
+	unsigned int vm_count = 0, eq_count = 0, ops_count = 0;
 	int err = 0;
 
 	xef = xe_eudebug_target_get(d);
 	if (!xef)
 		return;
 
+	/*
+	 * xe_eudebug ref is taken for discovery worker. It will
+	 * hold target xe_file ref and xe_file holds vm and exec_queue
+	 * refs.
+	 *
+	 * The relevant ioctls through xe_file are through
+	 * down_read(&xef->eudebug.lock). That means we can peek inside
+	 * the resources without taking their respective locks by
+	 * taking write lock.
+	 */
 	down_write(&xef->eudebug.ioctl_lock);
 
 	eu_dbg(d, "Discovery start for %lld", d->session);
@@ -882,6 +1249,12 @@ static void discover_client(struct xe_eudebug *d)
 		if (err)
 			break;
 		vm_count++;
+
+		err = vm_discover_binds(d, vm);
+		if (err < 0)
+			break;
+
+		ops_count += err;
 	}
 
 	xa_for_each(&xef->exec_queue.xa, i, q) {
@@ -891,6 +1264,8 @@ static void discover_client(struct xe_eudebug *d)
 		err = exec_queue_create_event(d, xef, q);
 		if (err)
 			break;
+
+		eq_count++;
 	}
 
 	complete_all(&d->discovery);
@@ -899,9 +1274,9 @@ static void discover_client(struct xe_eudebug *d)
 
 	up_write(&xef->eudebug.ioctl_lock);
 
-	if (vm_count || eq_count)
-		eu_dbg(d, "Discovery found %u vms, %u exec_queues",
-		       vm_count, eq_count);
+	if (vm_count || eq_count || ops_count)
+		eu_dbg(d, "Discovery found %u vms, %u exec_queues, %u bind_ops",
+		       vm_count, eq_count, ops_count);
 
 	xe_file_put(xef);
 }
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index 39c9aca373f2..6eb8a683a8b9 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -10,10 +10,13 @@
 
 struct drm_device;
 struct drm_file;
+struct xe_debug_data;
 struct xe_device;
 struct xe_file;
 struct xe_vm;
+struct xe_vma;
 struct xe_exec_queue;
+struct xe_user_fence;
 
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
@@ -49,6 +52,13 @@ void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm);
 void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q);
 void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q);
 
+void xe_eudebug_vm_init(struct xe_vm *vm);
+void xe_eudebug_vm_bind_start(struct xe_vm *vm);
+void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, struct xe_debug_data *debug_data);
+void xe_eudebug_vm_bind_end(struct xe_vm *vm,
+			    struct xe_user_fence *ufence,
+			    int bind_err);
+
 #else
 
 static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
@@ -66,6 +76,15 @@ static inline void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm)
 static inline void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q) { }
 static inline void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) { }
 
+static inline void xe_eudebug_vm_init(struct xe_vm *vm) { }
+static inline void xe_eudebug_vm_bind_start(struct xe_vm *vm) { }
+static inline void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op,
+					     struct xe_debug_data *debug_data) { }
+
+static inline void xe_eudebug_vm_bind_end(struct xe_vm *vm,
+					  struct xe_user_fence *ufence,
+					  int bind_err) { }
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif /* _XE_EUDEBUG_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 57bff7482163..502b121114df 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -33,7 +33,7 @@ enum xe_eudebug_state {
 };
 
 #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64
-#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE
+#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA
 
 /**
  * struct xe_eudebug_handle - eudebug resource handle
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 79ad453c01f4..de2d1e0c8def 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1499,6 +1499,8 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags, struct xe_file *xef)
 	for_each_tile(tile, xe, id)
 		xe_range_fence_tree_init(&vm->rftree[id]);
 
+	xe_eudebug_vm_init(vm);
+
 	INIT_LIST_HEAD(&vm->debug_data.list);
 	mutex_init(&vm->debug_data.lock);
 
@@ -1812,6 +1814,8 @@ static void vm_destroy_work_func(struct work_struct *w)
 	struct xe_tile *tile;
 	u8 id;
 
+	xe_eudebug_vm_bind_end(vm, NULL, -ENOENT);
+
 	/* xe_vm_close_and_put was not called? */
 	xe_assert(xe, !vm->size);
 
@@ -3266,7 +3270,7 @@ static void vm_bind_ioctl_ops_fini(struct xe_vm *vm, struct xe_vma_ops *vops,
 				   struct dma_fence *fence)
 {
 	struct xe_exec_queue *wait_exec_queue = to_wait_exec_queue(vm, vops->q);
-	struct xe_user_fence *ufence;
+	struct xe_user_fence *ufence = NULL;
 	struct xe_vma_op *op;
 	int i;
 
@@ -3281,6 +3285,9 @@ static void vm_bind_ioctl_ops_fini(struct xe_vm *vm, struct xe_vma_ops *vops,
 			xe_vma_destroy(gpuva_to_vma(op->base.remap.unmap->va),
 				       fence);
 	}
+
+	xe_eudebug_vm_bind_end(vm, ufence, 0);
+
 	if (ufence)
 		xe_sync_ufence_put(ufence);
 	if (fence) {
@@ -3843,8 +3850,12 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		dma_fence_put(fence);
 
 unwind_ops:
+	if (err)
+		xe_eudebug_vm_bind_end(vm, NULL, err);
+
 	if (err && err != -ENODATA)
 		vm_bind_ioctl_ops_unwind(vm, ops, args->num_binds);
+
 	xe_vma_ops_fini(&vops);
 	for (i = args->num_binds - 1; i >= 0; --i)
 		if (ops[i])
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 92dcae6a5996..aa8d2d02d768 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -341,6 +341,19 @@ struct xe_vm {
 	/** @xef: XE file handle for tracking this VM's drm client */
 	struct xe_file *xef;
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	struct {
+		/** @lock: Lock for eudebug_bind members */
+		spinlock_t lock;
+		/** @events: List of vm bind ops gathered */
+		struct list_head events;
+		/** @ops: How many operations we have stored */
+		u32 ops;
+		/** @ref_seqno: Reference to the VM_BIND that the ops relate */
+		u64 ref_seqno;
+	} eudebug;
+#endif
+
 	/** @debug_data: track debug_data mapped to vm */
 	struct {
 		struct list_head list;
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index 360d7a7ecb67..b2b2b90bb3a7 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -49,6 +49,8 @@ struct drm_xe_eudebug_event {
 #define DRM_XE_EUDEBUG_EVENT_READ		1
 #define DRM_XE_EUDEBUG_EVENT_VM			2
 #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE		3
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND		4
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA	5
 
 	__u16 flags;
 #define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
@@ -81,6 +83,75 @@ struct drm_xe_eudebug_event_exec_queue {
 	__u64 lrc_handle[];
 };
 
+/*
+ * When the client (debuggee) calls the vm_bind_ioctl with the
+ * DRM_XE_VM_BIND_OP_[ADD|REMOVE]_DEBUG_DATA operation, the following event
+ * sequence will be created (for the debugger):
+ *
+ *  ┌───────────────────────┐
+ *  │  EVENT_VM_BIND        ├──────────────────┬─┬┄┐
+ *  └───────────────────────┘                  │ │ ┊
+ *      ┌──────────────────────────────────┐   │ │ ┊
+ *      │ EVENT_VM_BIND_OP_DEBUG_DATA #1   ├───┘ │ ┊
+ *      └──────────────────────────────────┘     │ ┊
+ *                      ...                      │ ┊
+ *      ┌──────────────────────────────────┐     │ ┊
+ *      │ EVENT_VM_BIND_OP_DEBUG_DATA #n   ├─────┘ ┊
+ *      └──────────────────────────────────┘       ┊
+ *                                                 ┊
+ *      ┌┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┐       ┊
+ *      ┊ EVENT_UFENCE                     ├┄┄┄┄┄┄┄┘
+ *      └┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┘
+ *
+ * All the events below VM_BIND will reference the VM_BIND
+ * they are associated with, via the .vm_bind_ref_seqno field.
+ * EVENT_UFENCE will only be included if the client attached
+ * a sync of type UFENCE to its vm_bind_ioctl().
+ *
+ * When EVENT_UFENCE is sent by the driver, all the OPs of
+ * the original VM_BIND are completed and the [addr, range]
+ * contained in them are present and modifiable through the
+ * vm accessors. Accessing [addr, range] before the related
+ * ufence event will lead to undefined results, as the actual
+ * bind operations are async and the backing storage might not
+ * be there at the moment the event is received.
+ *
+ * The client's UFENCE sync will be held by the driver: the client's
+ * drm_xe_wait_ufence will not complete and the value of the ufence
+ * won't appear until the ufence is acked by the debugger process calling
+ * DRM_XE_EUDEBUG_IOCTL_ACK_EVENT with the event_ufence.base.seqno.
+ * This will signal the fence, .value will be updated and the wait will
+ * complete, allowing the client to continue.
+ *
+ */
+
+struct drm_xe_eudebug_event_vm_bind {
+	struct drm_xe_eudebug_event base;
+
+	__u64 vm_handle;
+
+	__u32 flags;
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE (1 << 0)
+
+	__u32 num_binds;
+};
+
+struct drm_xe_eudebug_event_vm_bind_op_debug_data {
+	struct drm_xe_eudebug_event base;
+	__u64 vm_bind_ref_seqno; /* *_event_vm_bind.base.seqno */
+	__u64 num_extensions;
+
+	__u64 addr;
+	__u64 range;
+	__u64 flags;
+	__u32 offset;
+	__u32 reserved;
+	union {
+		__u64 pseudopath;
+		char pathname[PATH_MAX];
+	};
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/20] drm/xe/eudebug: Add UFENCE events with acks
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (5 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 06/20] drm/xe/eudebug: Introduce vm bind and vm bind debug data events Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 08/20] drm/xe/eudebug: vm open/pread/pwrite Mika Kuoppala
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

When the vma is in place, the debugger needs to intercept before
userspace proceeds with the workload, for example to install
a breakpoint in an EU shader.

Attach the debugger to xe_user_fence, send a UFENCE event
and stall the normal user fence signalling path if there is
a debugger attached to the ufence.

When an ack (ioctl) is received for the corresponding seqno,
signal the ufence.
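
For illustration only, a debugger receiving the resulting VM_BIND_UFENCE
event could ack it roughly as below; debugfd, ev and install_breakpoints()
are hypothetical and not part of this patch:

  /* Hypothetical debugger-side sketch, not part of this patch. */
  struct drm_xe_eudebug_event_vm_bind_ufence *uf = (void *)ev;
  struct drm_xe_eudebug_ack_event ack = {
  	.type  = DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
  	.flags = 0,			/* MBZ */
  	.seqno = uf->base.seqno,	/* seqno of the ufence event */
  };

  install_breakpoints();	/* do the work while the client is stalled */
  ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_ACK_EVENT, &ack);
  /* the client's drm_xe_wait_ufence() can now complete */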

v2: - return err instead of 0 to guarantee signalling (Dominik)
    - checkpatch (Tilak)
    - Kconfig (Mika, Andrzej)
    - use lock instead of cmpxchg (Mika)
v4: - improve ref handling and no ufences nodebug binds
v5: - remove overzealous warn_on on bind_ref_seqno (Christoph)
    - remove superfluous signalled (Mika)
    - fix double free on bind sequence (Mika)
    - Don't fill op fields if no debugger (Maciej)

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_eudebug.c       | 302 +++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_eudebug.h       |  15 ++
 drivers/gpu/drm/xe/xe_eudebug_types.h |   9 +-
 drivers/gpu/drm/xe/xe_exec.c          |   2 +-
 drivers/gpu/drm/xe/xe_oa.c            |   3 +-
 drivers/gpu/drm/xe/xe_sync.c          |  47 ++--
 drivers/gpu/drm/xe/xe_sync.h          |   8 +-
 drivers/gpu/drm/xe/xe_sync_types.h    |  28 ++-
 drivers/gpu/drm/xe/xe_vm.c            |   4 +-
 include/uapi/drm/xe_drm_eudebug.h     |  15 +-
 10 files changed, 402 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 0255f35924d8..85ffe417e492 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -19,6 +19,7 @@
 #include "xe_exec_queue.h"
 #include "xe_hw_engine.h"
 #include "xe_macros.h"
+#include "xe_sync.h"
 #include "xe_vm.h"
 
 /*
@@ -186,7 +187,7 @@ static void xe_eudebug_free(struct kref *ref)
 	kfree(d);
 }
 
-static void xe_eudebug_put(struct xe_eudebug *d)
+void xe_eudebug_put(struct xe_eudebug *d)
 {
 	kref_put(&d->ref, xe_eudebug_free);
 }
@@ -217,6 +218,114 @@ static void remove_debugger(struct xe_file *xef)
 	}
 }
 
+struct xe_eudebug_ack {
+	struct rb_node rb_node;
+	u64 seqno;
+	u64 ts_insert;
+	struct xe_user_fence *ufence;
+};
+
+#define fetch_ack(x) rb_entry(x, struct xe_eudebug_ack, rb_node)
+
+static int compare_ack(const u64 a, const u64 b)
+{
+	if (a < b)
+		return -1;
+	else if (a > b)
+		return 1;
+
+	return 0;
+}
+
+static int ack_insert_cmp(struct rb_node * const node,
+			  const struct rb_node * const p)
+{
+	return compare_ack(fetch_ack(node)->seqno,
+			   fetch_ack(p)->seqno);
+}
+
+static int ack_lookup_cmp(const void * const key,
+			  const struct rb_node * const node)
+{
+	return compare_ack(*(const u64 *)key,
+			   fetch_ack(node)->seqno);
+}
+
+static struct xe_eudebug_ack *remove_ack(struct xe_eudebug *d, u64 seqno)
+{
+	struct rb_root * const root = &d->acks.tree;
+	struct rb_node *node;
+
+	spin_lock(&d->acks.lock);
+	node = rb_find(&seqno, root, ack_lookup_cmp);
+	if (node)
+		rb_erase(node, root);
+	spin_unlock(&d->acks.lock);
+
+	if (!node)
+		return NULL;
+
+	return rb_entry_safe(node, struct xe_eudebug_ack, rb_node);
+}
+
+static void ufence_signal_worker(struct work_struct *w)
+{
+	struct xe_user_fence * const ufence =
+		container_of(w, struct xe_user_fence, eudebug.worker);
+
+	if (READ_ONCE(ufence->signalled))
+		xe_sync_ufence_signal(ufence);
+
+	xe_sync_ufence_put(ufence);
+}
+
+static void kick_ufence_worker(struct xe_user_fence *f)
+{
+	queue_work(f->xe->eudebug.wq, &f->eudebug.worker);
+}
+
+static void handle_ack(struct xe_eudebug *d, struct xe_eudebug_ack *ack,
+		       bool on_disconnect)
+{
+	struct xe_user_fence *f = ack->ufence;
+	u64 signalled_by;
+	bool signal = false;
+
+	spin_lock(&f->eudebug.lock);
+	if (!f->eudebug.signalled_seqno) {
+		f->eudebug.signalled_seqno = ack->seqno;
+		signal = true;
+	}
+	signalled_by = f->eudebug.signalled_seqno;
+	spin_unlock(&f->eudebug.lock);
+
+	if (signal)
+		kick_ufence_worker(f);
+	else
+		xe_sync_ufence_put(f);
+
+	eu_dbg(d, "ACK: seqno=%llu: signalled by %llu (%s) (held %lluus)",
+	       ack->seqno, signalled_by,
+	       on_disconnect ? "disconnect" : "debugger",
+	       ktime_us_delta(ktime_get(), ack->ts_insert));
+
+	kfree(ack);
+}
+
+static void release_acks(struct xe_eudebug *d)
+{
+	struct xe_eudebug_ack *ack, *n;
+	struct rb_root root;
+
+	spin_lock(&d->acks.lock);
+	root = d->acks.tree;
+	d->acks.tree = RB_ROOT;
+	spin_unlock(&d->acks.lock);
+
+	rbtree_postorder_for_each_entry_safe(ack, n, &root, rb_node)
+		handle_ack(d, ack, true);
+}
+
 static bool xe_eudebug_detach(struct xe_device *xe,
 			      struct xe_eudebug *d,
 			      const int err)
@@ -240,6 +349,8 @@ static bool xe_eudebug_detach(struct xe_device *xe,
 
 	eu_dbg(d, "session %lld detached with %d", d->session, err);
 
+	release_acks(d);
+
 	remove_debugger(target);
 	xe_file_put(target);
 
@@ -284,7 +395,7 @@ _xe_eudebug_get(struct xe_file *xef)
 	return d;
 }
 
-static struct xe_eudebug *
+struct xe_eudebug *
 xe_eudebug_get(struct xe_file *xef)
 {
 	struct xe_eudebug *d;
@@ -983,6 +1094,141 @@ static int vm_bind_op(struct xe_eudebug *d, struct xe_vm *vm,
 	return 0;
 }
 
+
+void xe_eudebug_ufence_init(struct xe_user_fence *ufence,
+			    struct xe_file *xef,
+			    struct xe_vm *vm)
+{
+	u64 bind_ref;
+
+	/* Drop if OA */
+	if (!vm)
+		return;
+
+	spin_lock(&vm->eudebug.lock);
+	bind_ref = vm->eudebug.ref_seqno;
+	spin_unlock(&vm->eudebug.lock);
+
+	spin_lock_init(&ufence->eudebug.lock);
+	INIT_WORK(&ufence->eudebug.worker, ufence_signal_worker);
+
+	ufence->eudebug.signalled_seqno = 0;
+
+	if (bind_ref) {
+		ufence->eudebug.debugger = xe_eudebug_get(xef);
+
+		if (ufence->eudebug.debugger)
+			ufence->eudebug.bind_ref_seqno = bind_ref;
+	}
+}
+
+void xe_eudebug_ufence_fini(struct xe_user_fence *ufence)
+{
+	XE_WARN_ON(ufence->eudebug.bind_ref_seqno);
+
+	if (!ufence->eudebug.debugger)
+		return;
+
+	xe_eudebug_put(ufence->eudebug.debugger);
+	ufence->eudebug.debugger = NULL;
+}
+
+static int xe_eudebug_track_ufence(struct xe_eudebug *d,
+				   struct xe_user_fence *f,
+				   u64 seqno)
+{
+	struct xe_eudebug_ack *ack;
+	struct rb_node *old;
+
+	ack = kzalloc(sizeof(*ack), GFP_KERNEL);
+	if (!ack)
+		return -ENOMEM;
+
+	ack->seqno = seqno;
+	ack->ts_insert = ktime_get();
+
+	__xe_sync_ufence_get(f);
+
+	spin_lock(&d->acks.lock);
+	old = rb_find_add(&ack->rb_node,
+			  &d->acks.tree, ack_insert_cmp);
+	if (!old)
+		ack->ufence = f;
+	spin_unlock(&d->acks.lock);
+
+	if (ack->ufence)
+		return 0;
+
+	xe_sync_ufence_put(f);
+	kfree(ack);
+
+	return -EEXIST;
+}
+
+static int vm_bind_ufence_event(struct xe_eudebug *d,
+				struct xe_user_fence *ufence,
+				u64 bind_ref_seqno)
+{
+	struct drm_xe_eudebug_event *event;
+	struct drm_xe_eudebug_event_vm_bind_ufence *e;
+	const u32 sz = sizeof(*e);
+	const u32 flags = DRM_XE_EUDEBUG_EVENT_CREATE |
+		DRM_XE_EUDEBUG_EVENT_NEED_ACK;
+	u64 seqno;
+	int ret;
+
+	seqno = atomic_long_inc_return(&d->events.seqno);
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+					seqno, flags, sz);
+	if (!event)
+		return -ENOMEM;
+
+	e = cast_event(e, event);
+	e->vm_bind_ref_seqno = bind_ref_seqno;
+
+	ret = xe_eudebug_track_ufence(d, ufence, seqno);
+	if (ret) {
+		kfree(event);
+
+		eu_dbg(d, "tracking of ufence %llu failed with %d\n", seqno, ret);
+
+		return ret;
+	}
+
+	return xe_eudebug_queue_event(d, event);
+}
+
+int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence)
+{
+	struct xe_eudebug *d;
+	u64 bind_ref_seqno;
+	int err;
+
+	spin_lock(&ufence->eudebug.lock);
+	d = ufence->eudebug.debugger;
+	bind_ref_seqno = ufence->eudebug.bind_ref_seqno;
+	ufence->eudebug.bind_ref_seqno = 0;
+	spin_unlock(&ufence->eudebug.lock);
+
+	if (!d || xe_eudebug_detached(d))
+		return -ENOTCONN;
+
+	/* If there is no bind ref, no need to track */
+	if (!bind_ref_seqno) {
+		eu_dbg(d, "ufence without bind_ref_seqno, omitting send");
+		return -ENOENT;
+	}
+
+	err = vm_bind_ufence_event(d, ufence, bind_ref_seqno);
+	if (err) {
+		eu_err(d, "error %d on %s", err, __func__);
+		xe_eudebug_disconnect(d, err);
+	}
+
+	return err;
+}
+
 void xe_eudebug_vm_init(struct xe_vm *vm)
 {
 	INIT_LIST_HEAD(&vm->eudebug.events);
@@ -1123,6 +1369,12 @@ void xe_eudebug_vm_bind_end(struct xe_vm *vm,
 
 	XE_WARN_ON(ufence && bind_err);
 
+	if (ufence && !bind_ops) {
+		spin_lock(&ufence->eudebug.lock);
+		ufence->eudebug.bind_ref_seqno = 0;
+		spin_unlock(&ufence->eudebug.lock);
+	}
+
 	e = fetch_bind_event(vm);
 	if (!e)
 		return;
@@ -1484,6 +1736,44 @@ static long xe_eudebug_read_event(struct xe_eudebug *d,
 	return ret;
 }
 
+static long
+xe_eudebug_ack_event_ioctl(struct xe_eudebug *d,
+			   const unsigned int cmd,
+			   const u64 arg)
+{
+	struct drm_xe_eudebug_ack_event __user * const user_ptr =
+		u64_to_user_ptr(arg);
+	struct drm_xe_eudebug_ack_event user_arg;
+	struct xe_eudebug_ack *ack;
+	struct xe_device *xe = d->xe;
+
+	if (XE_IOCTL_DBG(xe, _IOC_SIZE(cmd) < sizeof(user_arg)))
+		return -EINVAL;
+
+	/* Userland write */
+	if (XE_IOCTL_DBG(xe, !(_IOC_DIR(cmd) & _IOC_WRITE)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, copy_from_user(&user_arg,
+					    user_ptr,
+					    sizeof(user_arg))))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, user_arg.flags))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, xe_eudebug_detached(d)))
+		return -ENOTCONN;
+
+	ack = remove_ack(d, user_arg.seqno);
+	if (XE_IOCTL_DBG(xe, !ack))
+		return -EINVAL;
+
+	handle_ack(d, ack, false);
+
+	return 0;
+}
+
 static long xe_eudebug_ioctl(struct file *file,
 			     unsigned int cmd,
 			     unsigned long arg)
@@ -1500,7 +1790,10 @@ static long xe_eudebug_ioctl(struct file *file,
 		ret = xe_eudebug_read_event(d, arg,
 					    !(file->f_flags & O_NONBLOCK));
 		break;
-
+	case DRM_XE_EUDEBUG_IOCTL_ACK_EVENT:
+		ret = xe_eudebug_ack_event_ioctl(d, cmd, arg);
+		eu_dbg(d, "ioctl cmd=EVENT_ACK ret=%ld\n", ret);
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1562,6 +1855,9 @@ xe_eudebug_connect(struct xe_device *xe,
 	INIT_KFIFO(d->events.fifo);
 	INIT_WORK(&d->discovery_work, discovery_work_fn);
 
+	spin_lock_init(&d->acks.lock);
+	d->acks.tree = RB_ROOT;
+
 	d->res = xe_eudebug_resources_alloc();
 	if (XE_IOCTL_DBG(xe, IS_ERR(d->res))) {
 		err = PTR_ERR(d->res);
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index 6eb8a683a8b9..6be20140d5d4 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -59,6 +59,13 @@ void xe_eudebug_vm_bind_end(struct xe_vm *vm,
 			    struct xe_user_fence *ufence,
 			    int bind_err);
 
+int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence);
+void xe_eudebug_ufence_init(struct xe_user_fence *ufence, struct xe_file *xef, struct xe_vm *vm);
+void xe_eudebug_ufence_fini(struct xe_user_fence *ufence);
+
+struct xe_eudebug *xe_eudebug_get(struct xe_file *xef);
+void xe_eudebug_put(struct xe_eudebug *d);
+
 #else
 
 static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
@@ -85,6 +92,14 @@ static inline void xe_eudebug_vm_bind_end(struct xe_vm *vm,
 					  struct xe_user_fence *ufence,
 					  int bind_err) { }
 
+static inline int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence) { return 0; }
+static inline void xe_eudebug_ufence_init(struct xe_user_fence *ufence,
+					  struct xe_file *xef, struct xe_vm *vm) { }
+static inline void xe_eudebug_ufence_fini(struct xe_user_fence *ufence) { }
+
+static inline struct xe_eudebug *xe_eudebug_get(struct xe_file *xef) { return NULL; }
+static inline void xe_eudebug_put(struct xe_eudebug *d) { }
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif /* _XE_EUDEBUG_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 502b121114df..a294e2f4e7df 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -33,7 +33,7 @@ enum xe_eudebug_state {
 };
 
 #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64
-#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA
+#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE
 
 /**
  * struct xe_eudebug_handle - eudebug resource handle
@@ -132,6 +132,13 @@ struct xe_eudebug {
 		atomic_long_t seqno;
 	} events;
 
+	/* user fences tracked by this debugger */
+	struct {
+		/** @lock: guards access to tree */
+		spinlock_t lock;
+
+		struct rb_root tree;
+	} acks;
 };
 
 #endif /* _XE_EUDEBUG_TYPES_H_ */
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 83897950f0da..b1e69daf4c3c 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -165,7 +165,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	vm = q->vm;
 
 	for (num_syncs = 0; num_syncs < args->num_syncs; num_syncs++) {
-		err = xe_sync_entry_parse(xe, xef, &syncs[num_syncs],
+		err = xe_sync_entry_parse(xe, xef, vm, &syncs[num_syncs],
 					  &syncs_user[num_syncs], SYNC_PARSE_FLAG_EXEC |
 					  (xe_vm_in_lr_mode(vm) ?
 					   SYNC_PARSE_FLAG_LR_MODE : 0));
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
index a4894eb0d7f3..4ce0959c2858 100644
--- a/drivers/gpu/drm/xe/xe_oa.c
+++ b/drivers/gpu/drm/xe/xe_oa.c
@@ -1408,7 +1408,8 @@ static int xe_oa_parse_syncs(struct xe_oa *oa, struct xe_oa_open_param *param)
 	}
 
 	for (num_syncs = 0; num_syncs < param->num_syncs; num_syncs++) {
-		ret = xe_sync_entry_parse(oa->xe, param->xef, &param->syncs[num_syncs],
+		ret = xe_sync_entry_parse(oa->xe, param->xef, NULL,
+					  &param->syncs[num_syncs],
 					  &param->syncs_user[num_syncs], 0);
 		if (ret)
 			goto err_syncs;
diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c
index 82872a51f098..166c205352eb 100644
--- a/drivers/gpu/drm/xe/xe_sync.c
+++ b/drivers/gpu/drm/xe/xe_sync.c
@@ -15,27 +15,20 @@
 #include <uapi/drm/xe_drm.h>
 
 #include "xe_device_types.h"
+#include "xe_eudebug.h"
 #include "xe_exec_queue.h"
 #include "xe_macros.h"
 #include "xe_sched_job_types.h"
 
-struct xe_user_fence {
-	struct xe_device *xe;
-	struct kref refcount;
-	struct dma_fence_cb cb;
-	struct work_struct worker;
-	struct mm_struct *mm;
-	u64 __user *addr;
-	u64 value;
-	int signalled;
-};
-
 static void user_fence_destroy(struct kref *kref)
 {
 	struct xe_user_fence *ufence = container_of(kref, struct xe_user_fence,
 						 refcount);
 
 	mmdrop(ufence->mm);
+
+	xe_eudebug_ufence_fini(ufence);
+
 	kfree(ufence);
 }
 
@@ -49,7 +42,10 @@ static void user_fence_put(struct xe_user_fence *ufence)
 	kref_put(&ufence->refcount, user_fence_destroy);
 }
 
-static struct xe_user_fence *user_fence_create(struct xe_device *xe, u64 addr,
+static struct xe_user_fence *user_fence_create(struct xe_device *xe,
+					       struct xe_file *xef,
+					       struct xe_vm *vm,
+					       u64 addr,
 					       u64 value)
 {
 	struct xe_user_fence *ufence;
@@ -70,14 +66,15 @@ static struct xe_user_fence *user_fence_create(struct xe_device *xe, u64 addr,
 	ufence->mm = current->mm;
 	mmgrab(ufence->mm);
 
+	xe_eudebug_ufence_init(ufence, xef, vm);
+
 	return ufence;
 }
 
-static void user_fence_worker(struct work_struct *w)
+void xe_sync_ufence_signal(struct xe_user_fence *ufence)
 {
-	struct xe_user_fence *ufence = container_of(w, struct xe_user_fence, worker);
+	XE_WARN_ON(!ufence->signalled);
 
-	WRITE_ONCE(ufence->signalled, 1);
 	if (mmget_not_zero(ufence->mm)) {
 		kthread_use_mm(ufence->mm);
 		if (copy_to_user(ufence->addr, &ufence->value, sizeof(ufence->value)))
@@ -88,11 +85,25 @@ static void user_fence_worker(struct work_struct *w)
 		drm_dbg(&ufence->xe->drm, "mmget_not_zero() failed, ufence wasn't signaled\n");
 	}
 
+	wake_up_all(&ufence->xe->ufence_wq);
+}
+
+static void user_fence_worker(struct work_struct *w)
+{
+	struct xe_user_fence *ufence = container_of(w, struct xe_user_fence, worker);
+	int ret;
+
 	/*
 	 * Wake up waiters only after updating the ufence state, allowing the UMD
 	 * to safely reuse the same ufence without encountering -EBUSY errors.
 	 */
-	wake_up_all(&ufence->xe->ufence_wq);
+	WRITE_ONCE(ufence->signalled, 1);
+
+	/* Lets see if debugger wants to track this */
+	ret = xe_eudebug_vm_bind_ufence(ufence);
+	if (ret)
+		xe_sync_ufence_signal(ufence);
+
 	user_fence_put(ufence);
 }
 
@@ -111,6 +122,7 @@ static void user_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
 }
 
 int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
+			struct xe_vm *vm,
 			struct xe_sync_entry *sync,
 			struct drm_xe_sync __user *sync_user,
 			unsigned int flags)
@@ -192,7 +204,8 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		if (exec) {
 			sync->addr = sync_in.addr;
 		} else {
-			sync->ufence = user_fence_create(xe, sync_in.addr,
+			sync->ufence = user_fence_create(xe, xef, vm,
+							 sync_in.addr,
 							 sync_in.timeline_value);
 			if (XE_IOCTL_DBG(xe, IS_ERR(sync->ufence)))
 				return PTR_ERR(sync->ufence);
diff --git a/drivers/gpu/drm/xe/xe_sync.h b/drivers/gpu/drm/xe/xe_sync.h
index 256ffc1e54dc..f5bec2b1b4f6 100644
--- a/drivers/gpu/drm/xe/xe_sync.h
+++ b/drivers/gpu/drm/xe/xe_sync.h
@@ -9,8 +9,12 @@
 #include "xe_sync_types.h"
 
 struct xe_device;
-struct xe_exec_queue;
 struct xe_file;
+struct xe_exec_queue;
+struct drm_syncobj;
+struct dma_fence;
+struct dma_fence_chain;
+struct drm_xe_sync;
 struct xe_sched_job;
 struct xe_vm;
 
@@ -19,6 +23,7 @@ struct xe_vm;
 #define SYNC_PARSE_FLAG_DISALLOW_USER_FENCE	BIT(2)
 
 int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
+			struct xe_vm *vm,
 			struct xe_sync_entry *sync,
 			struct drm_xe_sync __user *sync_user,
 			unsigned int flags);
@@ -40,5 +45,6 @@ struct xe_user_fence *__xe_sync_ufence_get(struct xe_user_fence *ufence);
 struct xe_user_fence *xe_sync_ufence_get(struct xe_sync_entry *sync);
 void xe_sync_ufence_put(struct xe_user_fence *ufence);
 int xe_sync_ufence_get_status(struct xe_user_fence *ufence);
+void xe_sync_ufence_signal(struct xe_user_fence *ufence);
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_sync_types.h b/drivers/gpu/drm/xe/xe_sync_types.h
index 30ac3f51993b..dcd3165e66a7 100644
--- a/drivers/gpu/drm/xe/xe_sync_types.h
+++ b/drivers/gpu/drm/xe/xe_sync_types.h
@@ -6,13 +6,31 @@
 #ifndef _XE_SYNC_TYPES_H_
 #define _XE_SYNC_TYPES_H_
 
+#include <linux/dma-fence-array.h>
+#include <linux/kref.h>
+#include <linux/spinlock.h>
 #include <linux/types.h>
 
-struct drm_syncobj;
-struct dma_fence;
-struct dma_fence_chain;
-struct drm_xe_sync;
-struct user_fence;
+struct xe_user_fence {
+	struct xe_device *xe;
+	struct kref refcount;
+	struct dma_fence_cb cb;
+	struct work_struct worker;
+	struct mm_struct *mm;
+	u64 __user *addr;
+	u64 value;
+	int signalled;
+
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	struct {
+		spinlock_t lock;
+		struct xe_eudebug *debugger;
+		u64 bind_ref_seqno;
+		u64 signalled_seqno;
+		struct work_struct worker;
+	} eudebug;
+#endif
+};
 
 struct xe_sync_entry {
 	struct drm_syncobj *syncobj;
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index de2d1e0c8def..5f86e87d8458 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3765,9 +3765,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		}
 	}
 
+	xe_eudebug_vm_bind_start(vm);
+
 	syncs_user = u64_to_user_ptr(args->syncs);
 	for (num_syncs = 0; num_syncs < args->num_syncs; num_syncs++) {
-		err = xe_sync_entry_parse(xe, xef, &syncs[num_syncs],
+		err = xe_sync_entry_parse(xe, xef, vm, &syncs[num_syncs],
 					  &syncs_user[num_syncs],
 					  (xe_vm_in_lr_mode(vm) ?
 					   SYNC_PARSE_FLAG_LR_MODE : 0) |
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index b2b2b90bb3a7..e55fa52c2973 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -15,7 +15,8 @@ extern "C" {
  *
  * This ioctl is available in debug version 1.
  */
-#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0)
+#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT		_IO('j', 0x0)
+#define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT		_IOW('j', 0x1, struct drm_xe_eudebug_ack_event)
 
 /**
  * struct drm_xe_eudebug_event - Base type of event delivered by xe_eudebug.
@@ -51,6 +52,7 @@ struct drm_xe_eudebug_event {
 #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE		3
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND		4
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA	5
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE	6
 
 	__u16 flags;
 #define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
@@ -152,6 +154,17 @@ struct drm_xe_eudebug_event_vm_bind_op_debug_data {
 	};
 };
 
+struct drm_xe_eudebug_event_vm_bind_ufence {
+	struct drm_xe_eudebug_event base;
+	__u64 vm_bind_ref_seqno; /* *_event_vm_bind.base.seqno */
+};
+
+struct drm_xe_eudebug_ack_event {
+	__u32 type;
+	__u32 flags; /* MBZ */
+	__u64 seqno;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/20] drm/xe/eudebug: vm open/pread/pwrite
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (6 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 07/20] drm/xe/eudebug: Add UFENCE events with acks Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:16 ` [PATCH 09/20] drm/xe/eudebug: userptr vm pread/pwrite Mika Kuoppala
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

The debugger needs read and write access to the client's vm, for
example to inspect ISA/ELF and to set up breakpoints.

Add an ioctl to open the target vm, given the debugger client and
a vm_handle, and hook up pread/pwrite support.

Open takes a timeout argument so that standard fsync can be used
for explicit flushing between CPU and GPU for the target vm.

Implement this for bo backed storage; userptr will be done in a
following patch.
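
For illustration only, the intended debugger-side usage looks roughly
like the sketch below; debugfd, vm_handle, isa and isa_addr are
hypothetical placeholders:

  /* Hypothetical debugger-side sketch, not part of this patch. */
  struct drm_xe_eudebug_vm_open vo = {
  	.vm_handle  = vm_handle,		/* from the EVENT_VM event */
  	.flags      = 0,
  	.timeout_ns = 1000ull * 1000 * 1000,	/* per-flush timeout for fsync */
  };
  int vm_fd = ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);

  pread(vm_fd, isa, sizeof(isa), isa_addr);	/* read shader ISA */
  /* ... patch in a breakpoint ... */
  pwrite(vm_fd, isa, sizeof(isa), isa_addr);
  fsync(vm_fd);		/* flush GPU caches for the target vm */
  close(vm_fd);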

v2: - checkpatch (Maciej)
    - 32bit fixes (Andrzej)
    - bo_vmap (Mika)
    - fix vm leak if can't allocate k_buffer (Mika)
    - assert vm write held for vma (Matthew)

v3: - fw ref, ttm_bo_access
    - timeout boundary check (Dominik)
    - don't try to copy to user on zero bytes (Mika)

v4: - offset as unsigned long (Thomas)
    - check XE_VMA_DESTROYED

v5: drm_dev_put before releasing debugger (Mika)

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/Makefile           |   2 +-
 drivers/gpu/drm/xe/regs/xe_gt_regs.h  |  24 ++
 drivers/gpu/drm/xe/xe_eudebug.c       |  39 ++-
 drivers/gpu/drm/xe/xe_eudebug.h       |   8 +
 drivers/gpu/drm/xe/xe_eudebug_types.h |   5 +
 drivers/gpu/drm/xe/xe_eudebug_vm.c    | 418 ++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug_vm.h    |   8 +
 include/uapi/drm/xe_drm_eudebug.h     |  15 +
 8 files changed, 516 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index fd79a28814bc..c0b860511181 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -147,7 +147,7 @@ xe-$(CONFIG_DRM_XE_GPUSVM) += xe_svm.o
 xe-$(CONFIG_DRM_GPUSVM) += xe_userptr.o
 
 # debugging shaders with gdb (eudebug) support
-xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o
+xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o xe_eudebug_vm.o
 
 # graphics hardware monitoring (HWMON) support
 xe-$(CONFIG_HWMON) += xe_hwmon.o
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 06cb6b02ec64..ee64f69e784e 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -561,6 +561,30 @@
 #define   CCS_MODE_CSLICE(cslice, ccs) \
 	((ccs) << ((cslice) * CCS_MODE_CSLICE_WIDTH))
 
+#define RCU_ASYNC_FLUSH				XE_REG(0x149fc)
+#define   RCU_ASYNC_FLUSH_IN_PROGRESS	REG_BIT(31)
+#define   RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT	28
+#define   RCU_ASYNC_FLUSH_ENGINE_ID_DECODE1 REG_BIT(26)
+#define   RCU_ASYNC_FLUSH_AMFS		REG_BIT(8)
+#define   RCU_ASYNC_FLUSH_PREFETCH	REG_BIT(7)
+#define   RCU_ASYNC_FLUSH_DATA_PORT	REG_BIT(6)
+#define   RCU_ASYNC_FLUSH_DATA_CACHE	REG_BIT(5)
+#define   RCU_ASYNC_FLUSH_HDC_PIPELINE	REG_BIT(4)
+#define   RCU_ASYNC_INVALIDATE_HDC_PIPELINE REG_BIT(3)
+#define   RCU_ASYNC_INVALIDATE_CONSTANT_CACHE REG_BIT(2)
+#define   RCU_ASYNC_INVALIDATE_TEXTURE_CACHE REG_BIT(1)
+#define   RCU_ASYNC_INVALIDATE_INSTRUCTION_CACHE REG_BIT(0)
+#define   RCU_ASYNC_FLUSH_AND_INVALIDATE_ALL ( \
+	RCU_ASYNC_FLUSH_AMFS | \
+	RCU_ASYNC_FLUSH_PREFETCH | \
+	RCU_ASYNC_FLUSH_DATA_PORT | \
+	RCU_ASYNC_FLUSH_DATA_CACHE | \
+	RCU_ASYNC_FLUSH_HDC_PIPELINE | \
+	RCU_ASYNC_INVALIDATE_HDC_PIPELINE | \
+	RCU_ASYNC_INVALIDATE_CONSTANT_CACHE | \
+	RCU_ASYNC_INVALIDATE_TEXTURE_CACHE | \
+	RCU_ASYNC_INVALIDATE_INSTRUCTION_CACHE)
+
 #define FORCEWAKE_ACK_GT			XE_REG(0x130044)
 
 /* Applicable for all FORCEWAKE_DOMAIN and FORCEWAKE_ACK_DOMAIN regs */
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 85ffe417e492..3c4d1050cb82 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -16,6 +16,7 @@
 #include "xe_device.h"
 #include "xe_eudebug.h"
 #include "xe_eudebug_types.h"
+#include "xe_eudebug_vm.h"
 #include "xe_exec_queue.h"
 #include "xe_hw_engine.h"
 #include "xe_macros.h"
@@ -52,8 +53,7 @@ event_fifo_num_events_peek(const struct xe_eudebug * const d)
 	return kfifo_len(&d->events.fifo);
 }
 
-static bool
-xe_eudebug_detached(struct xe_eudebug *d)
+bool xe_eudebug_detached(struct xe_eudebug *d)
 {
 	bool connected;
 
@@ -668,6 +668,35 @@ static int xe_eudebug_remove_handle(struct xe_eudebug *d, int type, void *p,
 	return ret;
 }
 
+static void *find_resource__unlocked(struct xe_eudebug_resources *res,
+				     int type,
+				     u32 id)
+{
+
+	struct xe_eudebug_resource *r;
+	struct xe_eudebug_handle *h;
+
+	r = resource_from_type(res, type);
+	h = xa_load(&r->xa, id);
+
+	return h ? (void *)(uintptr_t)h->key : NULL;
+}
+
+struct xe_vm *xe_eudebug_vm_get(struct xe_eudebug *d, u32 id)
+{
+	struct xe_vm *vm;
+
+	mutex_lock(&d->res->lock);
+	vm = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_VM, id);
+	if (vm)
+		xe_vm_get(vm);
+
+	mutex_unlock(&d->res->lock);
+
+	return vm;
+}
+
+
 static struct drm_xe_eudebug_event *
 xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags,
 			u32 len)
@@ -1794,6 +1823,10 @@ static long xe_eudebug_ioctl(struct file *file,
 		ret = xe_eudebug_ack_event_ioctl(d, cmd, arg);
 		eu_dbg(d, "ioctl cmd=EVENT_ACK ret=%ld\n", ret);
 		break;
+	case DRM_XE_EUDEBUG_IOCTL_VM_OPEN:
+		ret = xe_eudebug_vm_open_ioctl(d, arg);
+		eu_dbg(d, "ioctl cmd=VM_OPEN ret=%ld\n", ret);
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1858,6 +1891,8 @@ xe_eudebug_connect(struct xe_device *xe,
 	spin_lock_init(&d->acks.lock);
 	d->acks.tree = RB_ROOT;
 
+	mutex_init(&d->hw.lock);
+
 	d->res = xe_eudebug_resources_alloc();
 	if (XE_IOCTL_DBG(xe, IS_ERR(d->res))) {
 		err = PTR_ERR(d->res);
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index 6be20140d5d4..212482d2de73 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -17,6 +17,7 @@ struct xe_vm;
 struct xe_vma;
 struct xe_exec_queue;
 struct xe_user_fence;
+struct xe_eudebug;
 
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
@@ -37,6 +38,10 @@ struct xe_user_fence;
 
 #define xe_eudebug_assert(d, ...) xe_assert((d)->xe, ##__VA_ARGS__)
 
+#define xe_eudebug_for_each_hw_engine(__hwe, __gt, __id) \
+	for_each_hw_engine(__hwe, __gt, __id)	       \
+		if (xe_hw_engine_has_eudebug(__hwe))
+
 int xe_eudebug_connect_ioctl(struct drm_device *dev,
 			     void *data,
 			     struct drm_file *file);
@@ -46,8 +51,11 @@ bool xe_eudebug_is_enabled(struct xe_device *xe);
 
 void xe_eudebug_file_close(struct xe_file *xef);
 
+bool xe_eudebug_detached(struct xe_eudebug *d);
+
 void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm);
 void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm);
+struct xe_vm *xe_eudebug_vm_get(struct xe_eudebug *d, u32 vm_id);
 
 void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q);
 void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q);
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index a294e2f4e7df..292e93c72a64 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -139,6 +139,11 @@ struct xe_eudebug {
 
 		struct rb_root tree;
 	} acks;
+
+	struct {
+		/** @lock: guards access to hw state */
+		struct mutex lock;
+	} hw;
 };
 
 #endif /* _XE_EUDEBUG_TYPES_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_vm.c b/drivers/gpu/drm/xe/xe_eudebug_vm.c
new file mode 100644
index 000000000000..4dd747680a9c
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_vm.c
@@ -0,0 +1,418 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#include "xe_eudebug_vm.h"
+
+#include <linux/anon_inodes.h>
+#include <linux/fs.h>
+#include <linux/vmalloc.h>
+
+#include <drm/drm_drv.h>
+
+#include "xe_bo.h"
+#include "xe_device.h"
+#include "xe_eudebug.h"
+#include "xe_eudebug_types.h"
+#include "xe_force_wake.h"
+#include "xe_gt.h"
+#include "xe_mmio.h"
+#include "xe_vm.h"
+
+#include "regs/xe_gt_regs.h"
+#include "regs/xe_engine_regs.h"
+
+static int xe_eudebug_vma_access(struct xe_vma *vma,
+				 unsigned long offset_in_vma,
+				 void *buf, unsigned long len, bool write)
+{
+	struct xe_bo *bo;
+	u64 bytes;
+
+	lockdep_assert_held_write(&xe_vma_vm(vma)->lock);
+
+	if (XE_WARN_ON(offset_in_vma >= xe_vma_size(vma)))
+		return -EINVAL;
+
+	if (vma->gpuva.flags & XE_VMA_DESTROYED)
+		return -EINVAL;
+
+	bytes = min_t(u64, len, xe_vma_size(vma) - offset_in_vma);
+	if (!bytes)
+		return 0;
+
+	bo = xe_bo_get(xe_vma_bo(vma));
+	if (bo) {
+		int ret;
+
+		ret = ttm_bo_access(&bo->ttm, offset_in_vma, buf, bytes, write);
+
+		xe_bo_put(bo);
+
+		return ret;
+	}
+
+	return -EINVAL;
+}
+
+static int xe_eudebug_vm_access(struct xe_vm *vm, unsigned long offset,
+				void *buf, unsigned long len, bool write)
+{
+	struct xe_vma *vma;
+	int ret;
+
+	down_write(&vm->lock);
+
+	vma = xe_vm_find_overlapping_vma(vm, offset, len);
+	if (vma) {
+		/* XXX: why find overlapping returns below start? */
+		if (offset < xe_vma_start(vma) ||
+		    offset >= (xe_vma_start(vma) + xe_vma_size(vma))) {
+			ret = -EINVAL;
+			goto out;
+		}
+
+		/* Offset into vma */
+		offset -= xe_vma_start(vma);
+		ret = xe_eudebug_vma_access(vma, offset, buf, len, write);
+	} else {
+		ret = -EINVAL;
+	}
+
+out:
+	up_write(&vm->lock);
+
+	return ret;
+}
+
+struct vm_file {
+	struct xe_eudebug *debugger;
+	struct xe_vm *vm;
+	u64 flags;
+	u64 vm_handle;
+	unsigned int timeout_us;
+};
+
+static ssize_t __vm_read_write(struct xe_vm *vm,
+			       void *bb,
+			       char __user *r_buffer,
+			       const char __user *w_buffer,
+			       unsigned long offset,
+			       unsigned long len,
+			       const bool write)
+{
+	ssize_t ret;
+
+	if (!len)
+		return 0;
+
+	if (write) {
+		ret = copy_from_user(bb, w_buffer, len);
+		if (ret)
+			return -EFAULT;
+
+		ret = xe_eudebug_vm_access(vm, offset, bb, len, true);
+		if (ret <= 0)
+			return ret;
+
+		len = ret;
+	} else {
+		ret = xe_eudebug_vm_access(vm, offset, bb, len, false);
+		if (ret <= 0)
+			return ret;
+
+		len = ret;
+
+		ret = copy_to_user(r_buffer, bb, len);
+		if (ret)
+			return -EFAULT;
+	}
+
+	return len;
+}
+
+static ssize_t __xe_eudebug_vm_access(struct file *file,
+				      char __user *r_buffer,
+				      const char __user *w_buffer,
+				      size_t count, loff_t *__pos)
+{
+	struct vm_file *vmf = file->private_data;
+	struct xe_eudebug * const d = vmf->debugger;
+	struct xe_device * const xe = d->xe;
+	const bool write = !!w_buffer;
+	struct xe_vm *vm;
+	ssize_t copied = 0;
+	ssize_t bytes_left = count;
+	ssize_t ret;
+	unsigned long alloc_len;
+	loff_t pos = *__pos;
+	void *k_buffer;
+
+	if (XE_IOCTL_DBG(xe, write && r_buffer))
+		return -EINVAL;
+
+	vm = xe_eudebug_vm_get(d, vmf->vm_handle);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, vm != vmf->vm)) {
+		eu_warn(d, "vm_access(%s): vm handle mismatch vm_handle=%llu, flags=0x%llx, pos=%llu, count=%zu\n",
+			write ? "write" : "read",
+			vmf->vm_handle, vmf->flags, pos, count);
+		xe_vm_put(vm);
+		return -EINVAL;
+	}
+
+	if (!count) {
+		xe_vm_put(vm);
+		return 0;
+	}
+
+	alloc_len = min_t(unsigned long, ALIGN(count, PAGE_SIZE), 64 * SZ_1M);
+	do  {
+		k_buffer = vmalloc(alloc_len);
+		if (k_buffer)
+			break;
+
+		alloc_len >>= 1;
+	} while (alloc_len > PAGE_SIZE);
+
+	if (XE_IOCTL_DBG(xe, !k_buffer)) {
+		xe_vm_put(vm);
+		return -ENOMEM;
+	}
+
+	do {
+		const ssize_t len = min_t(ssize_t, bytes_left, alloc_len);
+
+		ret = __vm_read_write(vm, k_buffer,
+				      write ? NULL : r_buffer + copied,
+				      write ? w_buffer + copied : NULL,
+				      pos + copied,
+				      len,
+				      write);
+		if (ret <= 0)
+			break;
+
+		bytes_left -= ret;
+		copied += ret;
+	} while (bytes_left > 0);
+
+	vfree(k_buffer);
+	xe_vm_put(vm);
+
+	if (XE_WARN_ON(copied < 0))
+		copied = 0;
+
+	*__pos += copied;
+
+	return copied ?: ret;
+}
+
+static ssize_t xe_eudebug_vm_read(struct file *file,
+				  char __user *buffer,
+				  size_t count, loff_t *pos)
+{
+	return __xe_eudebug_vm_access(file, buffer, NULL, count, pos);
+}
+
+static ssize_t xe_eudebug_vm_write(struct file *file,
+				   const char __user *buffer,
+				   size_t count, loff_t *pos)
+{
+	return __xe_eudebug_vm_access(file, NULL, buffer, count, pos);
+}
+
+static int engine_rcu_flush(struct xe_eudebug *d,
+			    struct xe_hw_engine *hwe,
+			    unsigned int timeout_us)
+{
+	const struct xe_reg psmi_addr = RING_PSMI_CTL(hwe->mmio_base);
+	struct xe_gt *gt = hwe->gt;
+	unsigned int fw_ref;
+	u32 mask = RCU_ASYNC_FLUSH_AND_INVALIDATE_ALL;
+	u32 psmi_ctrl;
+	u32 id;
+	int ret;
+
+	if (hwe->class == XE_ENGINE_CLASS_RENDER)
+		id = 0;
+	else if (hwe->class == XE_ENGINE_CLASS_COMPUTE)
+		id = hwe->instance + 1;
+	else
+		return -EINVAL;
+
+	if (id < 8)
+		mask |= id << RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT;
+	else
+		mask |= (id - 8) << RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT |
+			RCU_ASYNC_FLUSH_ENGINE_ID_DECODE1;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), hwe->domain);
+	if (!fw_ref)
+		return -ETIMEDOUT;
+
+	/* Prevent concurrent flushes */
+	mutex_lock(&d->hw.lock);
+	psmi_ctrl = xe_mmio_read32(&gt->mmio, psmi_addr);
+	if (!(psmi_ctrl & IDLE_MSG_DISABLE))
+		xe_mmio_write32(&gt->mmio, psmi_addr, _MASKED_BIT_ENABLE(IDLE_MSG_DISABLE));
+
+	/* XXX: Timeout is per operation but in here we flush previous */
+	ret = xe_mmio_wait32(&gt->mmio, RCU_ASYNC_FLUSH,
+			     RCU_ASYNC_FLUSH_IN_PROGRESS, 0,
+			     timeout_us, NULL, false);
+	if (ret)
+		goto out;
+
+	xe_mmio_write32(&gt->mmio, RCU_ASYNC_FLUSH, mask);
+
+	ret = xe_mmio_wait32(&gt->mmio, RCU_ASYNC_FLUSH,
+			     RCU_ASYNC_FLUSH_IN_PROGRESS, 0,
+			     timeout_us, NULL, false);
+out:
+	if (!(psmi_ctrl & IDLE_MSG_DISABLE))
+		xe_mmio_write32(&gt->mmio, psmi_addr, _MASKED_BIT_DISABLE(IDLE_MSG_DISABLE));
+
+	mutex_unlock(&d->hw.lock);
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+
+	return ret;
+}
+
+static int xe_eudebug_vm_fsync(struct file *file, loff_t start, loff_t end, int datasync)
+{
+	struct vm_file *vmf = file->private_data;
+	struct xe_eudebug *d = vmf->debugger;
+	struct xe_gt *gt;
+	int gt_id;
+	int ret = -EINVAL;
+
+	eu_dbg(d, "vm_fsync: vm_handle=%llu, flags=0x%llx, start=%llu, end=%llu datasync=%d\n",
+	       vmf->vm_handle, vmf->flags, start, end, datasync);
+
+	for_each_gt(gt, d->xe, gt_id) {
+		struct xe_hw_engine *hwe;
+		enum xe_hw_engine_id id;
+
+		/* XXX: vm open per engine? */
+		xe_eudebug_for_each_hw_engine(hwe, gt, id) {
+			ret = engine_rcu_flush(d, hwe, vmf->timeout_us);
+			if (ret)
+				break;
+		}
+	}
+
+	return ret;
+}
+
+static int xe_eudebug_vm_release(struct inode *inode, struct file *file)
+{
+	struct vm_file *vmf = file->private_data;
+	struct xe_eudebug *d = vmf->debugger;
+
+	eu_dbg(d, "vm_release: vm_handle=%llu, flags=0x%llx",
+	       vmf->vm_handle, vmf->flags);
+
+	xe_vm_put(vmf->vm);
+	drm_dev_put(&d->xe->drm);
+	xe_eudebug_put(d);
+
+	kfree(vmf);
+
+	return 0;
+}
+
+static const struct file_operations vm_fops = {
+	.owner   = THIS_MODULE,
+	.llseek  = generic_file_llseek,
+	.read    = xe_eudebug_vm_read,
+	.write   = xe_eudebug_vm_write,
+	.fsync   = xe_eudebug_vm_fsync,
+	.mmap    = NULL,
+	.release = xe_eudebug_vm_release,
+};
+
+long xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg)
+{
+	struct drm_xe_eudebug_vm_open param;
+	struct xe_device * const xe = d->xe;
+	struct vm_file *vmf = NULL;
+	struct xe_vm *vm;
+	struct file *file;
+	long ret = 0;
+	int fd;
+
+	if (XE_IOCTL_DBG(xe, _IOC_SIZE(DRM_XE_EUDEBUG_IOCTL_VM_OPEN) != sizeof(param)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !(_IOC_DIR(DRM_XE_EUDEBUG_IOCTL_VM_OPEN) & _IOC_WRITE)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, copy_from_user(&param, (void __user *)arg, sizeof(param))))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, param.flags))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, xe_eudebug_detached(d)))
+		return -ENOTCONN;
+
+	vm = xe_eudebug_vm_get(d, param.vm_handle);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -EINVAL;
+
+	vmf = kzalloc(sizeof(*vmf), GFP_KERNEL);
+	if (XE_IOCTL_DBG(xe, !vmf)) {
+		ret = -ENOMEM;
+		goto out_vm_put;
+	}
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (XE_IOCTL_DBG(xe, fd < 0)) {
+		ret = fd;
+		goto out_free;
+	}
+
+	kref_get(&d->ref);
+	vmf->debugger = d;
+	vmf->vm = vm;
+	vmf->flags = param.flags;
+	vmf->vm_handle = param.vm_handle;
+	vmf->timeout_us = div64_u64(param.timeout_ns, 1000ull);
+
+	file = anon_inode_getfile("[xe_eudebug.vm]", &vm_fops, vmf, O_RDWR);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		XE_IOCTL_DBG(xe, ret);
+		file = NULL;
+		goto out_fd_put;
+	}
+
+	drm_dev_get(&xe->drm);
+
+	file->f_mode |= FMODE_PREAD | FMODE_PWRITE |
+		FMODE_READ | FMODE_WRITE | FMODE_LSEEK;
+
+	fd_install(fd, file);
+
+	eu_dbg(d, "vm_open: handle=%llu, flags=0x%llx, fd=%d",
+	       vmf->vm_handle, vmf->flags, fd);
+
+	XE_WARN_ON(ret);
+
+	return fd;
+
+out_fd_put:
+	put_unused_fd(fd);
+	xe_eudebug_put(d);
+out_free:
+	kfree(vmf);
+out_vm_put:
+	xe_vm_put(vm);
+
+	XE_WARN_ON(ret >= 0);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/xe/xe_eudebug_vm.h b/drivers/gpu/drm/xe/xe_eudebug_vm.h
new file mode 100644
index 000000000000..b3dc5618a5e6
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_vm.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+struct xe_eudebug;
+
+long xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg);
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index e55fa52c2973..cc45ebd47143 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -17,6 +17,7 @@ extern "C" {
  */
 #define DRM_XE_EUDEBUG_IOCTL_READ_EVENT		_IO('j', 0x0)
 #define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT		_IOW('j', 0x1, struct drm_xe_eudebug_ack_event)
+#define DRM_XE_EUDEBUG_IOCTL_VM_OPEN		_IOW('j', 0x2, struct drm_xe_eudebug_vm_open)
 
 /**
  * struct drm_xe_eudebug_event - Base type of event delivered by xe_eudebug.
@@ -165,6 +166,20 @@ struct drm_xe_eudebug_ack_event {
 	__u64 seqno;
 };
 
+struct drm_xe_eudebug_vm_open {
+	/** @extensions: Pointer to the first extension struct, if any */
+	__u64 extensions;
+
+	/** @vm_handle: id of vm */
+	__u64 vm_handle;
+
+	/** @flags: flags */
+	__u64 flags;
+
+	/** @timeout_ns: Timeout value in nanoseconds for operations (fsync) */
+	__u64 timeout_ns;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0
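
A minimal userspace sketch of how a debugger might use the interface
added above; this is not part of the patch. The uapi header path, the
origin of debug_fd/vm_handle (an eudebug connection and a VM event) and
the use of the GPU virtual address as the file offset for pread() are
assumptions based on the fops and ioctl in this patch:

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <drm/xe_drm_eudebug.h>	/* assumed install path of the uapi header */

/* Sketch: debug_fd is a connected eudebug fd, vm_handle comes from a VM event. */
static int debugger_read_va(int debug_fd, uint64_t vm_handle,
			    uint64_t gpu_va, void *buf, size_t len)
{
	struct drm_xe_eudebug_vm_open vo;
	ssize_t n;
	int vm_fd;

	memset(&vo, 0, sizeof(vo));
	vo.vm_handle = vm_handle;
	vo.flags = 0;			/* must be zero */
	vo.timeout_ns = 1000000000ull;	/* bounds the flush waits done by fsync() */

	vm_fd = ioctl(debug_fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
	if (vm_fd < 0)
		return -1;

	/* Assumed: the file offset is the GPU virtual address in the target vm. */
	n = pread(vm_fd, buf, len, gpu_va);

	/* Writes would use pwrite(); fsync(vm_fd) flushes via RCU_ASYNC_FLUSH. */
	close(vm_fd);

	return n == (ssize_t)len ? 0 : -1;
}

The timeout_ns value is converted to microseconds by the kernel and is
applied per engine flush in xe_eudebug_vm_fsync().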


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/20] drm/xe/eudebug: userptr vm pread/pwrite
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (7 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 08/20] drm/xe/eudebug: vm open/pread/pwrite Mika Kuoppala
@ 2025-10-06 11:16 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 10/20] drm/xe/eudebug: hw enablement for eudebug Mika Kuoppala
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:16 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala, Simona Vetter, Dominik Grzegorzek

Implement debugger vm access for userptrs.

When a bind is done, take a reference to the current task so that
we know which address space the userptr was bound from. Then, during
debugger pread/pwrite, we use this target task as the parameter to
access the debuggee vm with access_process_vm().

This is based on suggestions from Simona, Thomas and Joonas.

v2: need to add offset into vma (Dominik)
v3: move code into xe_userptr.c (Mika)

Cc: Simona Vetter <simona@ffwll.ch>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_eudebug_vm.c | 16 +++++++++++++++
 drivers/gpu/drm/xe/xe_userptr.c    |  4 ++++
 drivers/gpu/drm/xe/xe_userptr.h    | 32 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.c         |  1 +
 4 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_eudebug_vm.c b/drivers/gpu/drm/xe/xe_eudebug_vm.c
index 4dd747680a9c..6d341bae4ffc 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_vm.c
+++ b/drivers/gpu/drm/xe/xe_eudebug_vm.c
@@ -51,6 +51,22 @@ static int xe_eudebug_vma_access(struct xe_vma *vma,
 		xe_bo_put(bo);
 
 		return ret;
+	} else if (xe_vma_is_userptr(vma)) {
+		struct xe_userptr *userptr = &to_userptr_vma(vma)->userptr;
+
+		if (XE_WARN_ON(!userptr->eudebug.task))
+			return -EINVAL;
+
+		/*
+		 * access_remote_vm() would fit as userptr notifier has
+		 * mm ref so we would not need to carry task ref at all.
+		 * But access_remote_vm is not exported. access_process_vm()
+		 * is exported so use it instead.
+		 */
+		return access_process_vm(userptr->eudebug.task,
+					 xe_vma_userptr(vma) + offset_in_vma,
+					 buf, bytes,
+					 write ? FOLL_WRITE : 0);
 	}
 
 	return -EINVAL;
diff --git a/drivers/gpu/drm/xe/xe_userptr.c b/drivers/gpu/drm/xe/xe_userptr.c
index f16e92cd8090..6cf9158f9e79 100644
--- a/drivers/gpu/drm/xe/xe_userptr.c
+++ b/drivers/gpu/drm/xe/xe_userptr.c
@@ -290,6 +290,8 @@ int xe_userptr_setup(struct xe_userptr_vma *uvma, unsigned long start,
 
 	userptr->pages.notifier_seq = LONG_MAX;
 
+	xe_eudebug_track_userptr_task(userptr);
+
 	return 0;
 }
 
@@ -298,6 +300,8 @@ void xe_userptr_remove(struct xe_userptr_vma *uvma)
 	struct xe_vm *vm = xe_vma_vm(&uvma->vma);
 	struct xe_userptr *userptr = &uvma->userptr;
 
+	xe_eudebug_untrack_userptr_task(userptr);
+
 	drm_gpusvm_free_pages(&vm->svm.gpusvm, &uvma->userptr.pages,
 			      xe_vma_size(&uvma->vma) >> PAGE_SHIFT);
 
diff --git a/drivers/gpu/drm/xe/xe_userptr.h b/drivers/gpu/drm/xe/xe_userptr.h
index ef801234991e..4af2569f1fa1 100644
--- a/drivers/gpu/drm/xe/xe_userptr.h
+++ b/drivers/gpu/drm/xe/xe_userptr.h
@@ -66,6 +66,12 @@ struct xe_userptr {
 #if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
 	u32 divisor;
 #endif
+
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	struct {
+		struct task_struct *task;
+	} eudebug;
+#endif
 };
 
 #if IS_ENABLED(CONFIG_DRM_GPUSVM)
@@ -104,4 +110,30 @@ static inline void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma)
 {
 }
 #endif
+
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+static inline void xe_eudebug_track_userptr_task(struct xe_userptr *userptr)
+{
+	/*
+	 * We could use the mm which is on notifier. But
+	 * the access_remote_vm() is not exported. Thus
+	 * we get reference to task for access_process_vm()
+	 */
+	userptr->eudebug.task = get_task_struct(current);
+}
+
+static inline void xe_eudebug_untrack_userptr_task(struct xe_userptr *userptr)
+{
+	put_task_struct(userptr->eudebug.task);
+}
+#else
+static inline void xe_eudebug_track_userptr_task(struct xe_userptr *userptr)
+{
+}
+
+static inline void xe_eudebug_untrack_userptr_task(struct xe_userptr *userptr)
+{
+}
+#endif /* CONFIG_DRM_XE_EUDEBUG */
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 5f86e87d8458..5a05563009b2 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1053,6 +1053,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		struct xe_userptr_vma *uvma = to_userptr_vma(vma);
 
 		xe_userptr_remove(uvma);
+
 		xe_vm_put(vm);
 	} else if (xe_vma_is_null(vma) || xe_vma_is_cpu_addr_mirror(vma)) {
 		xe_vm_put(vm);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/20] drm/xe/eudebug: hw enablement for eudebug
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (8 preceding siblings ...)
  2025-10-06 11:16 ` [PATCH 09/20] drm/xe/eudebug: userptr vm pread/pwrite Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 11/20] drm/xe/eudebug: Introduce EU control interface Mika Kuoppala
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek, Mika Kuoppala

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

In order to turn on debug capabilities (i.e. breakpoints), TD_CTL
and some other registers need to be programmed. Implement eudebug
mode enabling, including the eudebug related workarounds.

v2: Move workarounds to xe_wa_oob. Use reg_sr directly instead of
xe_rtp, as it is better suited to the dynamic manipulation of those
registers that we do later in the series.
v3: get rid of undefining XE_MCR_REG (Mika)

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/Makefile              |   2 +-
 drivers/gpu/drm/xe/regs/xe_engine_regs.h |   4 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h     |  19 +++
 drivers/gpu/drm/xe/xe_eudebug_hw.c       |  72 +++++++++
 drivers/gpu/drm/xe/xe_eudebug_hw.h       |  25 ++++
 drivers/gpu/drm/xe/xe_gt_debug.c         | 179 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_gt_debug.h         |  41 ++++++
 drivers/gpu/drm/xe/xe_reg_sr.c           |  21 ++-
 drivers/gpu/drm/xe/xe_reg_sr.h           |   4 +-
 drivers/gpu/drm/xe/xe_reg_whitelist.c    |   2 +-
 drivers/gpu/drm/xe/xe_rtp.c              |   2 +-
 drivers/gpu/drm/xe/xe_wa_oob.rules       |   4 +
 12 files changed, 365 insertions(+), 10 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.h
 create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.c
 create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index c0b860511181..ecbca68d3c2a 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -147,7 +147,7 @@ xe-$(CONFIG_DRM_XE_GPUSVM) += xe_svm.o
 xe-$(CONFIG_DRM_GPUSVM) += xe_userptr.o
 
 # debugging shaders with gdb (eudebug) support
-xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o xe_eudebug_vm.o
+xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o xe_eudebug_vm.o xe_eudebug_hw.o xe_gt_debug.o
 
 # graphics hardware monitoring (HWMON) support
 xe-$(CONFIG_HWMON) += xe_hwmon.o
diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
index f4c3e1187a00..07b0bc45d228 100644
--- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
@@ -123,6 +123,10 @@
 
 #define INDIRECT_RING_STATE(base)		XE_REG((base) + 0x108)
 
+#define CS_DEBUG_MODE2(base)			XE_REG((base) + 0xd8, XE_REG_OPTION_MASKED)
+#define   INST_STATE_CACHE_INVALIDATE		REG_BIT(6)
+#define   GLOBAL_DEBUG_ENABLE			REG_BIT(5)
+
 #define RING_BBADDR(base)			XE_REG((base) + 0x140)
 #define RING_BBADDR_UDW(base)			XE_REG((base) + 0x168)
 
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index ee64f69e784e..afa2924c9e0b 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -474,10 +474,20 @@
 #define   DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA	REG_BIT(15)
 #define   CLEAR_OPTIMIZATION_DISABLE			REG_BIT(6)
 
+#define TD_CTL					XE_REG_MCR(0xe400)
+#define   TD_CTL_FEH_AND_FEE_ENABLE		REG_BIT(7) /* forced halt and exception */
+#define   TD_CTL_FORCE_EXTERNAL_HALT		REG_BIT(6)
+#define   TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE	REG_BIT(4)
+#define   TD_CTL_FORCE_EXCEPTION		REG_BIT(3)
+#define   TD_CTL_BREAKPOINT_ENABLE		REG_BIT(2)
+#define   TD_CTL_GLOBAL_DEBUG_ENABLE		REG_BIT(0) /* XeHP */
+
 #define CACHE_MODE_SS				XE_REG_MCR(0xe420, XE_REG_OPTION_MASKED)
 #define   DISABLE_ECC				REG_BIT(5)
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 
+#define EU_ATT(reg, row)			XE_REG_MCR(((reg) ? 0xe478 : 0xe470) + (row) * 4)
+
 #define ROW_CHICKEN4				XE_REG_MCR(0xe48c, XE_REG_OPTION_MASKED)
 #define   DISABLE_GRF_CLEAR			REG_BIT(13)
 #define   XEHP_DIS_BBL_SYSPIPE			REG_BIT(11)
@@ -487,6 +497,8 @@
 #define   THREAD_EX_ARB_MODE			REG_GENMASK(3, 2)
 #define   THREAD_EX_ARB_MODE_RR_AFTER_DEP	REG_FIELD_PREP(THREAD_EX_ARB_MODE, 0x2)
 
+#define EU_ATT_CLR(reg, row)			XE_REG_MCR(((reg) ? 0xe698 : 0xe490) + (row) * 4)
+
 #define ROW_CHICKEN3				XE_REG_MCR(0xe49c, XE_REG_OPTION_MASKED)
 #define   XE2_EUPEND_CHK_FLUSH_DIS		REG_BIT(14)
 #define   DIS_FIX_EOT1_FLUSH			REG_BIT(9)
@@ -501,11 +513,13 @@
 #define   MDQ_ARBITRATION_MODE			REG_BIT(12)
 #define   STALL_DOP_GATING_DISABLE		REG_BIT(5)
 #define   EARLY_EOT_DIS				REG_BIT(1)
+#define   STALL_DOP_GATING_DISABLE		REG_BIT(5)
 
 #define ROW_CHICKEN2				XE_REG_MCR(0xe4f4, XE_REG_OPTION_MASKED)
 #define   DISABLE_READ_SUPPRESSION		REG_BIT(15)
 #define   DISABLE_EARLY_READ			REG_BIT(14)
 #define   ENABLE_LARGE_GRF_MODE			REG_BIT(12)
+#define   XEHPC_DISABLE_BTB			REG_BIT(11)
 #define   PUSH_CONST_DEREF_HOLD_DIS		REG_BIT(8)
 #define   DISABLE_TDL_SVHS_GATING		REG_BIT(1)
 #define   DISABLE_DOP_GATING			REG_BIT(0)
@@ -561,6 +575,11 @@
 #define   CCS_MODE_CSLICE(cslice, ccs) \
 	((ccs) << ((cslice) * CCS_MODE_CSLICE_WIDTH))
 
+#define RCU_DEBUG_1				XE_REG(0x14a00)
+#define   RCU_DEBUG_1_ENGINE_STATUS		REG_GENMASK(2, 0)
+#define   RCU_DEBUG_1_RUNALONE_ACTIVE		REG_BIT(2)
+#define   RCU_DEBUG_1_CONTEXT_ACTIVE		REG_BIT(0)
+
 #define RCU_ASYNC_FLUSH				XE_REG(0x149fc)
 #define   RCU_ASYNC_FLUSH_IN_PROGRESS	REG_BIT(31)
 #define   RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT	28
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
new file mode 100644
index 000000000000..5a0fa9f9bd86
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#include "xe_eudebug_hw.h"
+
+#include <linux/delay.h>
+#include <linux/pm_runtime.h>
+#include <generated/xe_wa_oob.h>
+
+#include "regs/xe_gt_regs.h"
+#include "regs/xe_engine_regs.h"
+
+#include "xe_eudebug.h"
+#include "xe_eudebug_types.h"
+#include "xe_exec_queue.h"
+#include "xe_exec_queue_types.h"
+#include "xe_force_wake.h"
+#include "xe_gt.h"
+#include "xe_gt_debug.h"
+#include "xe_gt_mcr.h"
+#include "xe_hw_engine.h"
+#include "xe_lrc.h"
+#include "xe_macros.h"
+#include "xe_mmio.h"
+#include "xe_reg_sr.h"
+#include "xe_rtp.h"
+#include "xe_wa.h"
+
+static void add_sr_entry(struct xe_hw_engine *hwe,
+			 struct xe_reg_mcr mcr_reg,
+			 u32 mask, bool enable)
+{
+	const struct xe_reg_sr_entry sr_entry = {
+		.reg = mcr_reg.__reg,
+		.clr_bits = mask,
+		.set_bits = enable ? mask : 0,
+		.read_mask = mask,
+	};
+
+	xe_reg_sr_add(&hwe->reg_sr, &sr_entry, hwe->gt, true);
+}
+
+void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable)
+{
+	struct xe_gt *gt = hwe->gt;
+	struct xe_device *xe = gt_to_xe(gt);
+
+	if (!xe_rtp_match_first_render_or_compute(gt, hwe))
+		return;
+
+	if (XE_GT_WA(gt, 18022722726))
+		add_sr_entry(hwe, ROW_CHICKEN,
+			     STALL_DOP_GATING_DISABLE, enable);
+
+	if (XE_GT_WA(gt, 14015474168))
+		add_sr_entry(hwe, ROW_CHICKEN2,
+			     XEHPC_DISABLE_BTB,
+			     enable);
+
+	if (xe->info.graphics_verx100 >= 1200)
+		add_sr_entry(hwe, TD_CTL,
+			     TD_CTL_BREAKPOINT_ENABLE |
+			     TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE |
+			     TD_CTL_FEH_AND_FEE_ENABLE,
+			     enable);
+
+	if (xe->info.graphics_verx100 >= 1250)
+		add_sr_entry(hwe, TD_CTL,
+			     TD_CTL_GLOBAL_DEBUG_ENABLE, enable);
+}
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
new file mode 100644
index 000000000000..7362ed9bde68
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#include <linux/types.h>
+
+#ifndef _XE_EUDEBUG_HW_H_
+#define _XE_EUDEBUG_HW_H_
+
+#include <linux/types.h>
+
+struct xe_eudebug;
+struct xe_hw_engine;
+struct xe_gt;
+
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+
+void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable);
+
+#else /* CONFIG_DRM_XE_EUDEBUG */
+
+#endif /* CONFIG_DRM_XE_EUDEBUG */
+
+#endif /* _XE_EUDEBUG_HW_H_ */
diff --git a/drivers/gpu/drm/xe/xe_gt_debug.c b/drivers/gpu/drm/xe/xe_gt_debug.c
new file mode 100644
index 000000000000..314eef6734c3
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_gt_debug.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include "regs/xe_gt_regs.h"
+#include "xe_device.h"
+#include "xe_force_wake.h"
+#include "xe_gt.h"
+#include "xe_gt_topology.h"
+#include "xe_gt_debug.h"
+#include "xe_gt_mcr.h"
+#include "xe_pm.h"
+#include "xe_macros.h"
+
+unsigned int xe_gt_eu_att_regs(struct xe_gt *gt)
+{
+	return (GRAPHICS_VERx100(gt_to_xe(gt)) >= 3000) ? 2u : 1u;
+}
+
+int xe_gt_foreach_dss_group_instance(struct xe_gt *gt,
+				     int (*fn)(struct xe_gt *gt,
+					       void *data,
+					       u16 group,
+					       u16 instance,
+					       bool present),
+				     void *data)
+{
+	const enum xe_force_wake_domains fw_domains = XE_FW_GT;
+	xe_dss_mask_t dss_mask;
+	unsigned int dss, fw_ref;
+	u16 group, instance;
+	int ret = 0;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), fw_domains);
+	if (!fw_ref)
+		return -ETIMEDOUT;
+
+	bitmap_or(dss_mask, gt->fuse_topo.g_dss_mask, gt->fuse_topo.c_dss_mask,
+		  XE_MAX_DSS_FUSE_BITS);
+
+	/*
+	 * Note: this removes terminating zeros when the last dss is fused out!
+	 * In order for the bitmask to be exactly the same as with i915 we
+	 * would need to figure out the max dss for a given platform, most
+	 * probably by querying hwconfig.
+	 */
+
+	for (dss = 0;
+	     dss <= find_last_bit(dss_mask, XE_MAX_DSS_FUSE_BITS);
+	     dss++) {
+		xe_gt_mcr_get_dss_steering(gt, dss, &group, &instance);
+
+		ret = fn(gt, data, group, instance, test_bit(dss, dss_mask));
+		if (ret)
+			break;
+	}
+
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+
+	return ret;
+}
+
+static int read_first_attention_mcr(struct xe_gt *gt, void *data,
+				    u16 group, u16 instance, bool present)
+{
+	unsigned int reg, row;
+
+	if (!present)
+		return 0;
+
+	for (reg = 0; reg < xe_gt_eu_att_regs(gt); reg++) {
+		for (row = 0; row < XE_GT_EU_ATT_ROWS; row++) {
+			u32 val;
+
+			val = xe_gt_mcr_unicast_read(gt, EU_ATT(reg, row), group, instance);
+
+			if (val)
+				return 1;
+		}
+	}
+
+	return 0;
+}
+
+#define MAX_EUS_PER_ROW 4u
+#define MAX_THREADS 8u
+
+/**
+ * xe_gt_eu_attention_bitmap_size - query size of the attention bitmask
+ *
+ * @gt: pointer to struct xe_gt
+ *
+ * Return: size in bytes.
+ */
+int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt)
+{
+	xe_dss_mask_t dss_mask;
+
+	bitmap_or(dss_mask, gt->fuse_topo.c_dss_mask,
+		  gt->fuse_topo.g_dss_mask, XE_MAX_DSS_FUSE_BITS);
+
+	return (find_last_bit(dss_mask, XE_MAX_DSS_FUSE_BITS) + 1) *
+		XE_GT_EU_ATT_ROWS * xe_gt_eu_att_regs(gt) * MAX_THREADS *
+		MAX_EUS_PER_ROW / 8;
+}
+
+struct attn_read_iter {
+	struct xe_gt *gt;
+	unsigned int i;
+	unsigned int size;
+	u8 *bits;
+};
+
+static int read_eu_attentions_mcr(struct xe_gt *gt, void *data,
+				  u16 group, u16 instance, bool present)
+{
+	struct attn_read_iter * const iter = data;
+	unsigned int reg, row;
+
+	for (reg = 0; reg < xe_gt_eu_att_regs(gt); reg++) {
+		for (row = 0; row < XE_GT_EU_ATT_ROWS; row++) {
+			u32 val;
+
+			if (iter->i >= iter->size)
+				return 0;
+
+			XE_WARN_ON(iter->i + sizeof(val) > xe_gt_eu_attention_bitmap_size(gt));
+
+			if (present)
+				val = xe_gt_mcr_unicast_read(gt, EU_ATT(reg, row), group, instance);
+			else
+				val = 0;
+
+			memcpy(&iter->bits[iter->i], &val, sizeof(val));
+			iter->i += sizeof(val);
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * xe_gt_eu_attention_bitmap - query host attention
+ * @gt: pointer to struct xe_gt
+ * @bits: buffer to fill with the attention bitmap
+ * @bitmap_size: size of @bits in bytes
+ * Return: 0 on success, negative otherwise.
+ */
+int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits,
+			      unsigned int bitmap_size)
+{
+	struct attn_read_iter iter = {
+		.gt = gt,
+		.i = 0,
+		.size = bitmap_size,
+		.bits = bits
+	};
+
+	return xe_gt_foreach_dss_group_instance(gt, read_eu_attentions_mcr, &iter);
+}
+
+/**
+ * xe_gt_eu_threads_needing_attention - Query host attention
+ *
+ * @gt: pointer to struct xe_gt
+ *
+ * Return: 1 if threads are waiting for host attention, 0 otherwise.
+ */
+int xe_gt_eu_threads_needing_attention(struct xe_gt *gt)
+{
+	int err;
+
+	err = xe_gt_foreach_dss_group_instance(gt, read_first_attention_mcr, NULL);
+
+	XE_WARN_ON(err < 0);
+
+	return err < 0 ? 0 : err;
+}
diff --git a/drivers/gpu/drm/xe/xe_gt_debug.h b/drivers/gpu/drm/xe/xe_gt_debug.h
new file mode 100644
index 000000000000..f882770e18d3
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_gt_debug.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef __XE_GT_DEBUG_
+#define __XE_GT_DEBUG_
+
+#include <linux/bits.h>
+#include <linux/math.h>
+
+struct xe_gt;
+
+#define XE_GT_ATTENTION_TIMEOUT_MS 100
+#define XE_GT_EU_ATT_ROWS 2u
+
+struct xe_eu_attentions {
+#define XE_MAX_EUS 1024
+#define XE_MAX_THREADS 10
+
+	u8 att[DIV_ROUND_UP(XE_MAX_EUS * XE_MAX_THREADS, BITS_PER_BYTE)];
+	unsigned int size;
+	ktime_t ts;
+};
+
+unsigned int xe_gt_eu_att_regs(struct xe_gt *gt);
+
+int xe_gt_eu_threads_needing_attention(struct xe_gt *gt);
+int xe_gt_foreach_dss_group_instance(struct xe_gt *gt,
+				     int (*fn)(struct xe_gt *gt,
+					       void *data,
+					       u16 group,
+					       u16 instance,
+					       bool present),
+				     void *data);
+
+int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt);
+int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits,
+			      unsigned int bitmap_size);
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_reg_sr.c b/drivers/gpu/drm/xe/xe_reg_sr.c
index fc8447a838c4..8f525d6c6765 100644
--- a/drivers/gpu/drm/xe/xe_reg_sr.c
+++ b/drivers/gpu/drm/xe/xe_reg_sr.c
@@ -73,22 +73,31 @@ static void reg_sr_inc_error(struct xe_reg_sr *sr)
 
 int xe_reg_sr_add(struct xe_reg_sr *sr,
 		  const struct xe_reg_sr_entry *e,
-		  struct xe_gt *gt)
+		  struct xe_gt *gt,
+		  bool overwrite)
 {
 	unsigned long idx = e->reg.addr;
 	struct xe_reg_sr_entry *pentry = xa_load(&sr->xa, idx);
 	int ret;
 
 	if (pentry) {
-		if (!compatible_entries(pentry, e)) {
+		if (overwrite && e->set_bits) {
+			pentry->clr_bits |= e->clr_bits;
+			pentry->set_bits |= e->set_bits;
+			pentry->read_mask |= e->read_mask;
+		} else if (overwrite && !e->set_bits) {
+			pentry->clr_bits |= e->clr_bits;
+			pentry->set_bits &= ~e->clr_bits;
+			pentry->read_mask |= e->read_mask;
+		} else if (!compatible_entries(pentry, e)) {
 			ret = -EINVAL;
 			goto fail;
+		} else {
+			pentry->clr_bits |= e->clr_bits;
+			pentry->set_bits |= e->set_bits;
+			pentry->read_mask |= e->read_mask;
 		}
 
-		pentry->clr_bits |= e->clr_bits;
-		pentry->set_bits |= e->set_bits;
-		pentry->read_mask |= e->read_mask;
-
 		return 0;
 	}
 
diff --git a/drivers/gpu/drm/xe/xe_reg_sr.h b/drivers/gpu/drm/xe/xe_reg_sr.h
index 51fbba423e27..d67fafdcd847 100644
--- a/drivers/gpu/drm/xe/xe_reg_sr.h
+++ b/drivers/gpu/drm/xe/xe_reg_sr.h
@@ -6,6 +6,8 @@
 #ifndef _XE_REG_SR_
 #define _XE_REG_SR_
 
+#include <linux/types.h>
+
 /*
  * Reg save/restore bookkeeping
  */
@@ -21,7 +23,7 @@ int xe_reg_sr_init(struct xe_reg_sr *sr, const char *name, struct xe_device *xe)
 void xe_reg_sr_dump(struct xe_reg_sr *sr, struct drm_printer *p);
 
 int xe_reg_sr_add(struct xe_reg_sr *sr, const struct xe_reg_sr_entry *e,
-		  struct xe_gt *gt);
+		  struct xe_gt *gt, bool overwrite);
 void xe_reg_sr_apply_mmio(struct xe_reg_sr *sr, struct xe_gt *gt);
 void xe_reg_sr_apply_whitelist(struct xe_hw_engine *hwe);
 
diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c
index 23f6c81d9994..166ca3d183db 100644
--- a/drivers/gpu/drm/xe/xe_reg_whitelist.c
+++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c
@@ -118,7 +118,7 @@ static void whitelist_apply_to_hwe(struct xe_hw_engine *hwe)
 		}
 
 		xe_reg_whitelist_print_entry(&p, 0, reg, entry);
-		xe_reg_sr_add(&hwe->reg_sr, &hwe_entry, hwe->gt);
+		xe_reg_sr_add(&hwe->reg_sr, &hwe_entry, hwe->gt, false);
 
 		slot++;
 	}
diff --git a/drivers/gpu/drm/xe/xe_rtp.c b/drivers/gpu/drm/xe/xe_rtp.c
index b5f430d59f80..0fbae14e7151 100644
--- a/drivers/gpu/drm/xe/xe_rtp.c
+++ b/drivers/gpu/drm/xe/xe_rtp.c
@@ -181,7 +181,7 @@ static void rtp_add_sr_entry(const struct xe_rtp_action *action,
 	};
 
 	sr_entry.reg.addr += mmio_base;
-	xe_reg_sr_add(sr, &sr_entry, gt);
+	xe_reg_sr_add(sr, &sr_entry, gt, false);
 }
 
 static bool rtp_process_one_sr(const struct xe_rtp_entry_sr *entry,
diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
index f3a6d5d239ce..e8f09ae7a67b 100644
--- a/drivers/gpu/drm/xe/xe_wa_oob.rules
+++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
@@ -81,3 +81,7 @@
 
 15015404425_disable	PLATFORM(PANTHERLAKE), MEDIA_STEP(B0, FOREVER)
 16026007364    MEDIA_VERSION(3000)
+
+#eudebug
+18022722726	GRAPHICS_VERSION_RANGE(1250, 1274)
+14015474168	PLATFORM(PVC)
-- 
2.43.0
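
As a worked example of the attention bitmask sizing above (the numbers
are illustrative, not tied to any specific SKU): with XE_GT_EU_ATT_ROWS
= 2, MAX_THREADS = 8 and MAX_EUS_PER_ROW = 4, a pre-Xe3 GT (where
xe_gt_eu_att_regs() returns 1) whose highest enabled DSS index is 31
gets

  (31 + 1) * 2 * 1 * 8 * 4 / 8 = 256 bytes

of bitmask, i.e. 8 bytes (two EU_ATT reads) per DSS slot, which matches
what read_eu_attentions_mcr() packs per subslice. On graphics version
30.00+ xe_gt_eu_att_regs() returns 2 and the size doubles.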


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 11/20] drm/xe/eudebug: Introduce EU control interface
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (9 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 10/20] drm/xe/eudebug: hw enablement for eudebug Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 12/20] drm/xe/eudebug: Introduce per device attention scan worker Mika Kuoppala
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek, Mika Kuoppala

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

Introduce EU control functionality, which allows the EU debugger
to interrupt and resume EU threads and to query their current
state during execution. Provide an abstraction layer, so that in
the future the GuC will only need to provide appropriate callbacks.

Based on an implementation created by the authors and other folks
within the i915 driver.

v2: - checkpatch (Maciej)
    - lrc index off by one fix (Mika)
    - checkpatch (Tilak)
    - 32bit fixes (Andrzej, Mika)
    - find_resource_get for client (Mika)

v3: - fw ref (Mika)
    - attention register naming

v4: - fused off handling (Dominik)
    - squash xe3 parts and ptl attentions (Mika)

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
---
 drivers/gpu/drm/xe/regs/xe_engine_regs.h |   3 +
 drivers/gpu/drm/xe/xe_eudebug.c          |  47 ++
 drivers/gpu/drm/xe/xe_eudebug.h          |   2 +
 drivers/gpu/drm/xe/xe_eudebug_hw.c       | 658 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug_hw.h       |   7 +
 drivers/gpu/drm/xe/xe_eudebug_types.h    |  25 +
 include/uapi/drm/xe_drm_eudebug.h        |  18 +
 7 files changed, 760 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
index 07b0bc45d228..35d40cee4dd4 100644
--- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
@@ -145,6 +145,9 @@
 #define   INHIBIT_SWITCH_UNTIL_PREEMPTED	REG_BIT(31)
 #define   IDLE_DELAY				REG_GENMASK(20, 0)
 
+#define RING_CURRENT_LRCA(base)			XE_REG((base) + 0x240)
+#define   CURRENT_LRCA_VALID			REG_BIT(0)
+
 #define RING_CONTEXT_CONTROL(base)		XE_REG((base) + 0x244, XE_REG_OPTION_MASKED)
 #define	  CTX_CTRL_PXP_ENABLE			REG_BIT(10)
 #define	  CTX_CTRL_OAC_CONTEXT_ENABLE		REG_BIT(8)
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 3c4d1050cb82..a71797b4e9dd 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -15,9 +15,11 @@
 #include "xe_debug_data_types.h"
 #include "xe_device.h"
 #include "xe_eudebug.h"
+#include "xe_eudebug_hw.h"
 #include "xe_eudebug_types.h"
 #include "xe_eudebug_vm.h"
 #include "xe_exec_queue.h"
+#include "xe_gt.h"
 #include "xe_hw_engine.h"
 #include "xe_macros.h"
 #include "xe_sync.h"
@@ -696,6 +698,29 @@ struct xe_vm *xe_eudebug_vm_get(struct xe_eudebug *d, u32 id)
 	return vm;
 }
 
+struct xe_exec_queue *xe_eudebug_exec_queue_get(struct xe_eudebug *d, u32 id)
+{
+	struct xe_exec_queue *q;
+
+	mutex_lock(&d->res->lock);
+	q = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, id);
+	if (q)
+		xe_exec_queue_get(q);
+	mutex_unlock(&d->res->lock);
+
+	return q;
+}
+
+struct xe_lrc *xe_eudebug_find_lrc(struct xe_eudebug *d, u32 id)
+{
+	struct xe_lrc *lrc;
+
+	mutex_lock(&d->res->lock);
+	lrc = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_LRC, id);
+	mutex_unlock(&d->res->lock);
+
+	return lrc;
+}
 
 static struct drm_xe_eudebug_event *
 xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags,
@@ -1827,6 +1852,10 @@ static long xe_eudebug_ioctl(struct file *file,
 		ret = xe_eudebug_vm_open_ioctl(d, arg);
 		eu_dbg(d, "ioctl cmd=VM_OPEN ret=%ld\n", ret);
 		break;
+	case DRM_XE_EUDEBUG_IOCTL_EU_CONTROL:
+		ret = xe_eudebug_eu_control(d, arg);
+		eu_dbg(d, "ioctl cmd=EU_CONTROL ret=%ld\n", ret);
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1909,6 +1938,8 @@ xe_eudebug_connect(struct xe_device *xe,
 		goto err_detach;
 	}
 
+	xe_eudebug_hw_init(d);
+
 	kref_get(&d->ref);
 	queue_work(xe->eudebug.wq, &d->discovery_work);
 
@@ -1938,6 +1969,10 @@ bool xe_eudebug_is_enabled(struct xe_device *xe)
 
 static int xe_eudebug_enable(struct xe_device *xe, bool enable)
 {
+	struct xe_gt *gt;
+	int i;
+	u8 id;
+
 	mutex_lock(&xe->eudebug.lock);
 
 	if (xe->eudebug.state == XE_EUDEBUG_NOT_SUPPORTED) {
@@ -1955,6 +1990,18 @@ static int xe_eudebug_enable(struct xe_device *xe, bool enable)
 		return 0;
 	}
 
+	for_each_gt(gt, xe, id) {
+		for (i = 0; i < ARRAY_SIZE(gt->hw_engines); i++) {
+			if (!(gt->info.engine_mask & BIT(i)))
+				continue;
+
+			xe_eudebug_init_hw_engine(&gt->hw_engines[i], enable);
+		}
+
+		xe_gt_reset_async(gt);
+		flush_work(&gt->reset.worker);
+	}
+
 	xe->eudebug.state = enable ?
 		XE_EUDEBUG_ENABLED : XE_EUDEBUG_DISABLED;
 	mutex_unlock(&xe->eudebug.lock);
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index 212482d2de73..208b18127603 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -59,6 +59,8 @@ struct xe_vm *xe_eudebug_vm_get(struct xe_eudebug *d, u32 vm_id);
 
 void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q);
 void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q);
+struct xe_exec_queue *xe_eudebug_exec_queue_get(struct xe_eudebug *d, u32 id);
+struct xe_lrc *xe_eudebug_find_lrc(struct xe_eudebug *d, u32 id);
 
 void xe_eudebug_vm_init(struct xe_vm *vm);
 void xe_eudebug_vm_bind_start(struct xe_vm *vm);
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
index 5a0fa9f9bd86..11e3fe0c05e0 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.c
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
@@ -70,3 +70,661 @@ void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable)
 		add_sr_entry(hwe, TD_CTL,
 			     TD_CTL_GLOBAL_DEBUG_ENABLE, enable);
 }
+
+static int __current_lrca(struct xe_hw_engine *hwe, u32 *lrc_hw)
+{
+	u32 lrc_reg;
+
+	lrc_reg = xe_hw_engine_mmio_read32(hwe, RING_CURRENT_LRCA(0));
+
+	if (!(lrc_reg & CURRENT_LRCA_VALID))
+		return -ENOENT;
+
+	*lrc_hw = lrc_reg & GENMASK(31, 12);
+
+	return 0;
+}
+
+static int current_lrca(struct xe_hw_engine *hwe, u32 *lrc_hw)
+{
+	unsigned int fw_ref;
+	int ret;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(hwe->gt), hwe->domain);
+	if (!fw_ref)
+		return -ETIMEDOUT;
+
+	ret = __current_lrca(hwe, lrc_hw);
+
+	xe_force_wake_put(gt_to_fw(hwe->gt), fw_ref);
+
+	return ret;
+}
+
+static bool lrca_equals(u32 a, u32 b)
+{
+	return (a & GENMASK(31, 12)) == (b & GENMASK(31, 12));
+}
+
+static int match_exec_queue_lrca(struct xe_exec_queue *q, u32 lrc_hw)
+{
+	int i;
+
+	for (i = 0; i < q->width; i++)
+		if (lrca_equals(lower_32_bits(xe_lrc_descriptor(q->lrc[i])), lrc_hw))
+			return i;
+
+	return -1;
+}
+
+static int rcu_debug1_engine_index(const struct xe_hw_engine * const hwe)
+{
+	if (hwe->class == XE_ENGINE_CLASS_RENDER) {
+		XE_WARN_ON(hwe->instance);
+		return 0;
+	}
+
+	XE_WARN_ON(hwe->instance > 3);
+
+	return hwe->instance + 1;
+}
+
+static u32 engine_status_xe1(const struct xe_hw_engine * const hwe,
+			     u32 rcu_debug1)
+{
+	const unsigned int first = 7;
+	const unsigned int incr = 3;
+	const unsigned int i = rcu_debug1_engine_index(hwe);
+	const unsigned int shift = first + (i * incr);
+
+	return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS;
+}
+
+static u32 engine_status_xe2(const struct xe_hw_engine * const hwe,
+			     u32 rcu_debug1)
+{
+	const unsigned int first = 7;
+	const unsigned int incr = 4;
+	const unsigned int i = rcu_debug1_engine_index(hwe);
+	const unsigned int shift = first + (i * incr);
+
+	return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS;
+}
+
+static u32 engine_status_xe3(const struct xe_hw_engine * const hwe,
+			     u32 rcu_debug1)
+{
+	const unsigned int first = 6;
+	const unsigned int incr = 4;
+	const unsigned int i = rcu_debug1_engine_index(hwe);
+	const unsigned int shift = first + (i * incr);
+
+	return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS;
+}
+
+static u32 engine_status(const struct xe_hw_engine * const hwe,
+			 u32 rcu_debug1)
+{
+	u32 status = 0;
+
+	if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 20)
+		status = engine_status_xe1(hwe, rcu_debug1);
+	else if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 30)
+		status = engine_status_xe2(hwe, rcu_debug1);
+	else if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 35)
+		status = engine_status_xe3(hwe, rcu_debug1);
+	else
+		XE_WARN_ON(GRAPHICS_VER(gt_to_xe(hwe->gt)));
+
+	return status;
+}
+
+static bool engine_has_runalone_set(const struct xe_hw_engine * const hwe,
+				   u32 rcu_debug1)
+{
+	return engine_status(hwe, rcu_debug1) & RCU_DEBUG_1_RUNALONE_ACTIVE;
+}
+
+static bool engine_has_context_set(const struct xe_hw_engine * const hwe,
+				  u32 rcu_debug1)
+{
+	return engine_status(hwe, rcu_debug1) & RCU_DEBUG_1_CONTEXT_ACTIVE;
+}
+
+static struct xe_hw_engine *get_runalone_active_hw_engine(struct xe_gt *gt)
+{
+	struct xe_hw_engine *hwe, *first = NULL;
+	unsigned int num_active, id, fw_ref;
+	u32 val;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
+	if (!fw_ref) {
+		drm_dbg(&gt_to_xe(gt)->drm, "eudbg: runalone failed to get force wake\n");
+		return NULL;
+	}
+
+	val = xe_mmio_read32(&gt->mmio, RCU_DEBUG_1);
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+
+	drm_dbg(&gt_to_xe(gt)->drm, "eudbg: runalone RCU_DEBUG_1 = 0x%08x\n", val);
+
+	num_active = 0;
+	xe_eudebug_for_each_hw_engine(hwe, gt, id) {
+		bool runalone, ctx;
+
+		runalone = engine_has_runalone_set(hwe, val);
+		ctx = engine_has_context_set(hwe, val);
+
+		drm_dbg(&gt_to_xe(gt)->drm, "eudbg: engine %s: runalone=%s, context=%s",
+			hwe->name, runalone ? "active" : "inactive",
+			ctx ? "active" : "inactive");
+
+		/*
+		 * On earlier gen12 the context status seems to be idle when
+		 * it has raised attention. We have to omit the active bit.
+		 */
+		if (IS_DGFX(gt_to_xe(gt)))
+			ctx = true;
+
+		if (runalone && ctx) {
+			num_active++;
+
+			drm_dbg(&gt_to_xe(gt)->drm, "eudbg: runalone engine %s %s",
+				hwe->name, first ? "selected" : "found");
+			if (!first)
+				first = hwe;
+		}
+	}
+
+	if (num_active > 1)
+		drm_err(&gt_to_xe(gt)->drm, "eudbg: %d runalone engines active!",
+			num_active);
+
+	return first;
+}
+
+static struct xe_exec_queue *active_hwe_to_exec_queue(struct xe_hw_engine *hwe,
+						      int *lrc_idx)
+{
+	struct xe_device *xe = gt_to_xe(hwe->gt);
+	struct xe_gt *gt = hwe->gt;
+	struct xe_exec_queue *q, *found = NULL;
+	struct xe_file *xef;
+	unsigned long i;
+	int idx, err;
+	u32 lrc_hw;
+
+	err = current_lrca(hwe, &lrc_hw);
+	if (err)
+		return ERR_PTR(err);
+
+	mutex_lock(&xe->eudebug.lock);
+	list_for_each_entry(xef, &xe->eudebug.targets, eudebug.target_link) {
+		down_write(&xef->eudebug.ioctl_lock);
+		xa_for_each(&xef->exec_queue.xa, i, q) {
+			if (q->gt != gt)
+				continue;
+
+			if (q->class != hwe->class)
+				continue;
+
+			if (xe_exec_queue_is_idle(q))
+				continue;
+
+			idx = match_exec_queue_lrca(q, lrc_hw);
+			if (idx < 0)
+				continue;
+
+			found = xe_exec_queue_get(q);
+
+			if (lrc_idx)
+				*lrc_idx = idx;
+
+			break;
+		}
+		up_write(&xef->eudebug.ioctl_lock);
+
+		if (found)
+			break;
+	}
+	mutex_unlock(&xe->eudebug.lock);
+
+	if (!found)
+		return ERR_PTR(-ENOENT);
+
+	if (XE_WARN_ON(current_lrca(hwe, &lrc_hw)) &&
+	    XE_WARN_ON(match_exec_queue_lrca(found, lrc_hw) < 0)) {
+		xe_exec_queue_put(found);
+		return ERR_PTR(-ENOENT);
+	}
+
+	return found;
+}
+
+static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx)
+{
+	struct xe_hw_engine *active;
+
+	active = get_runalone_active_hw_engine(gt);
+	if (!active) {
+		drm_dbg(&gt_to_xe(gt)->drm, "Runalone engine not found!");
+		return ERR_PTR(-ENOENT);
+	}
+
+	return active_hwe_to_exec_queue(active, lrc_idx);
+}
+
+static int do_eu_control(struct xe_eudebug *d,
+			 const struct drm_xe_eudebug_eu_control * const arg,
+			 struct drm_xe_eudebug_eu_control __user * const user_ptr)
+{
+	void __user * const bitmask_ptr = u64_to_user_ptr(arg->bitmask_ptr);
+	struct xe_device *xe = d->xe;
+	u8 *bits = NULL;
+	unsigned int hw_attn_size, attn_size;
+	struct xe_exec_queue *q;
+	struct xe_lrc *lrc;
+	u64 seqno;
+	int ret;
+
+	if (xe_eudebug_detached(d))
+		return -ENOTCONN;
+
+	/* Accept only hardware reg granularity mask */
+	if (XE_IOCTL_DBG(xe, !IS_ALIGNED(arg->bitmask_size, sizeof(u32))))
+		return -EINVAL;
+
+	q = xe_eudebug_exec_queue_get(d, arg->exec_queue_handle);
+	if (XE_IOCTL_DBG(xe, !q))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !xe_exec_queue_is_debuggable(q))) {
+		ret = -EINVAL;
+		goto queue_put;
+	}
+
+	lrc = xe_eudebug_find_lrc(d, arg->lrc_handle);
+	if (XE_IOCTL_DBG(xe, !lrc)) {
+		ret = -EINVAL;
+		goto queue_put;
+	}
+
+	hw_attn_size = xe_gt_eu_attention_bitmap_size(q->gt);
+	attn_size = arg->bitmask_size;
+
+	if (attn_size > hw_attn_size)
+		attn_size = hw_attn_size;
+
+	if (attn_size > 0) {
+		bits = kmalloc(attn_size, GFP_KERNEL);
+		if (!bits) {
+			ret = -ENOMEM;
+			goto queue_put;
+		}
+
+		if (copy_from_user(bits, bitmask_ptr, attn_size)) {
+			ret = -EFAULT;
+			goto out_free;
+		}
+	}
+
+	if (!pm_runtime_active(xe->drm.dev)) {
+		ret = -EIO;
+		goto out_free;
+	}
+
+	ret = -EINVAL;
+	mutex_lock(&d->hw.lock);
+
+	switch (arg->cmd) {
+	case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL:
+		/* Make sure we don't promise anything but interrupting all */
+		if (!attn_size)
+			ret = d->ops->interrupt_all(d, q, lrc);
+		break;
+	case DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED:
+		ret = d->ops->stopped(d, q, lrc, bits, attn_size);
+		break;
+	case DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME:
+		ret = d->ops->resume(d, q, lrc, bits, attn_size);
+		break;
+	default:
+		break;
+	}
+
+	if (ret == 0)
+		seqno = atomic_long_inc_return(&d->events.seqno);
+
+	mutex_unlock(&d->hw.lock);
+
+	if (ret)
+		goto out_free;
+
+	if (put_user(seqno, &user_ptr->seqno)) {
+		ret = -EFAULT;
+		goto out_free;
+	}
+
+	if (copy_to_user(bitmask_ptr, bits, attn_size)) {
+		ret = -EFAULT;
+		goto out_free;
+	}
+
+	if (hw_attn_size != arg->bitmask_size)
+		if (put_user(hw_attn_size, &user_ptr->bitmask_size))
+			ret = -EFAULT;
+
+out_free:
+	kfree(bits);
+queue_put:
+	xe_exec_queue_put(q);
+
+	return ret;
+}
+
+static int xe_eu_control_interrupt_all(struct xe_eudebug *d,
+				       struct xe_exec_queue *q,
+				       struct xe_lrc *lrc)
+{
+	struct xe_gt *gt = q->hwe->gt;
+	struct xe_device *xe = d->xe;
+	struct xe_exec_queue *active;
+	struct xe_hw_engine *hwe;
+	unsigned int fw_ref;
+	int lrc_idx, ret;
+	u32 lrc_hw;
+	u32 td_ctl;
+
+	hwe = get_runalone_active_hw_engine(gt);
+	if (XE_IOCTL_DBG(xe, !hwe)) {
+		drm_dbg(&gt_to_xe(gt)->drm, "Runalone engine not found!");
+		return -EINVAL;
+	}
+
+	active = active_hwe_to_exec_queue(hwe, &lrc_idx);
+	if (XE_IOCTL_DBG(xe, IS_ERR(active)))
+		return PTR_ERR(active);
+
+	if (XE_IOCTL_DBG(xe, q != active)) {
+		xe_exec_queue_put(active);
+		return -EINVAL;
+	}
+	xe_exec_queue_put(active);
+
+	if (XE_IOCTL_DBG(xe, lrc_idx >= q->width || q->lrc[lrc_idx] != lrc))
+		return -EINVAL;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), hwe->domain);
+	if (!fw_ref)
+		return -ETIMEDOUT;
+
+	/* Additional check just before issuing MMIO writes */
+	ret = __current_lrca(hwe, &lrc_hw);
+	if (ret)
+		goto put_fw;
+
+	if (!lrca_equals(lower_32_bits(xe_lrc_descriptor(lrc)), lrc_hw)) {
+		ret = -EBUSY;
+		goto put_fw;
+	}
+
+	td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+
+	/* Halt on next thread dispatch */
+	if (!(td_ctl & TD_CTL_FORCE_EXTERNAL_HALT))
+		xe_gt_mcr_multicast_write(gt, TD_CTL,
+					  td_ctl | TD_CTL_FORCE_EXTERNAL_HALT);
+	else
+		eu_warn(d, "TD_CTL force external halt bit already set!\n");
+
+	/*
+	 * The sleep is needed because some interrupts are ignored
+	 * by the HW, hence we allow the HW some time to acknowledge
+	 * that.
+	 */
+	usleep_range(100, 110);
+
+	/* Halt regardless of thread dependencies */
+	if (!(td_ctl & TD_CTL_FORCE_EXCEPTION))
+		xe_gt_mcr_multicast_write(gt, TD_CTL,
+					  td_ctl | TD_CTL_FORCE_EXCEPTION);
+	else
+		eu_warn(d, "TD_CTL force exception bit already set!\n");
+
+	usleep_range(100, 110);
+
+	xe_gt_mcr_multicast_write(gt, TD_CTL, td_ctl &
+				  ~(TD_CTL_FORCE_EXTERNAL_HALT | TD_CTL_FORCE_EXCEPTION));
+
+	/*
+	 * In case of stopping wrong ctx emit warning.
+	 * Nothing else we can do for now.
+	 */
+	ret = __current_lrca(hwe, &lrc_hw);
+	if (ret || !lrca_equals(lower_32_bits(xe_lrc_descriptor(lrc)), lrc_hw))
+		eu_warn(d, "xe_eudebug: interrupted wrong context.");
+
+put_fw:
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+
+	return ret;
+}
+
+struct ss_iter {
+	struct xe_eudebug *debugger;
+	unsigned int i;
+
+	unsigned int size;
+	u8 *bits;
+};
+
+static int check_attn_mcr(struct xe_gt *gt, void *data,
+			  u16 group, u16 instance, bool present)
+{
+	struct ss_iter *iter = data;
+	struct xe_eudebug *d = iter->debugger;
+	unsigned int reg, row;
+
+	for (reg = 0; reg < xe_gt_eu_att_regs(gt); reg++) {
+		for (row = 0; row < XE_GT_EU_ATT_ROWS; row++) {
+			u32 val, cur = 0;
+
+			if (iter->i >= iter->size)
+				return 0;
+
+			if (XE_WARN_ON((iter->i + sizeof(val)) >
+					(xe_gt_eu_attention_bitmap_size(gt))))
+				return -EIO;
+
+			memcpy(&val, &iter->bits[iter->i], sizeof(val));
+			iter->i += sizeof(val);
+
+			if (present)
+				cur = xe_gt_mcr_unicast_read(gt, EU_ATT(reg, row), group, instance);
+
+			if ((val | cur) != cur) {
+				eu_dbg(d,
+				       "WRONG CLEAR (%u:%u:%u:%u) EU_ATT_CLR: 0x%08x; EU_ATT: 0x%08x\n",
+				       group, instance, reg, row, val, cur);
+				return -EINVAL;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static int clear_attn_mcr(struct xe_gt *gt, void *data,
+			  u16 group, u16 instance, bool present)
+{
+	struct ss_iter *iter = data;
+	struct xe_eudebug *d = iter->debugger;
+	unsigned int reg, row;
+
+	for (reg = 0; reg < xe_gt_eu_att_regs(gt); reg++) {
+		for (row = 0; row < XE_GT_EU_ATT_ROWS; row++) {
+			u32 val;
+
+			if (iter->i >= iter->size)
+				return 0;
+
+			if (XE_WARN_ON((iter->i + sizeof(val)) >
+					(xe_gt_eu_attention_bitmap_size(gt))))
+				return -EIO;
+
+			memcpy(&val, &iter->bits[iter->i], sizeof(val));
+			iter->i += sizeof(val);
+
+			if (!val)
+				continue;
+
+			if (present) {
+				xe_gt_mcr_unicast_write(gt, EU_ATT_CLR(reg, row), val,
+							group, instance);
+
+				eu_dbg(d,
+				       "EU_ATT_CLR: (%u:%u:%u:%u): 0x%08x\n",
+				       group, instance, reg, row, val);
+			} else {
+				eu_warn(d,
+					"EU_ATT_CLR: (%u:%u:%u:%u): 0x%08x to fused off dss\n",
+					group, instance, reg, row, val);
+			}
+		}
+	}
+
+	return 0;
+}
+
+static int xe_eu_control_resume(struct xe_eudebug *d,
+				struct xe_exec_queue *q,
+				struct xe_lrc *lrc,
+				u8 *bits, unsigned int bitmask_size)
+{
+	struct xe_device *xe = d->xe;
+	struct ss_iter iter = {
+		.debugger = d,
+		.i = 0,
+		.size = bitmask_size,
+		.bits = bits
+	};
+	int ret = 0;
+	struct xe_exec_queue *active;
+	int lrc_idx;
+
+	active = runalone_active_queue_get(q->gt, &lrc_idx);
+	if (IS_ERR(active))
+		return PTR_ERR(active);
+
+	if (XE_IOCTL_DBG(xe, q != active)) {
+		xe_exec_queue_put(active);
+		return -EBUSY;
+	}
+	xe_exec_queue_put(active);
+
+	if (XE_IOCTL_DBG(xe, lrc_idx >= q->width || q->lrc[lrc_idx] != lrc))
+		return -EBUSY;
+
+	/*
+	 * hsdes: 18021122357
+	 * We need to avoid clearing attention bits that are not set
+	 * in order to avoid the EOT hang on PVC.
+	 */
+	if (GRAPHICS_VERx100(d->xe) == 1260) {
+		ret = xe_gt_foreach_dss_group_instance(q->gt, check_attn_mcr, &iter);
+		if (ret)
+			return ret;
+
+		iter.i = 0;
+	}
+
+	xe_gt_foreach_dss_group_instance(q->gt, clear_attn_mcr, &iter);
+	return 0;
+}
+
+static int xe_eu_control_stopped(struct xe_eudebug *d,
+				 struct xe_exec_queue *q,
+				 struct xe_lrc *lrc,
+				 u8 *bits, unsigned int bitmask_size)
+{
+	struct xe_device *xe = d->xe;
+	struct xe_exec_queue *active;
+	int lrc_idx;
+
+	if (XE_WARN_ON(!q) || XE_WARN_ON(!q->gt))
+		return -EINVAL;
+
+	active = runalone_active_queue_get(q->gt, &lrc_idx);
+	if (IS_ERR(active))
+		return PTR_ERR(active);
+
+	if (active) {
+		if (XE_IOCTL_DBG(xe, q != active)) {
+			xe_exec_queue_put(active);
+			return -EBUSY;
+		}
+
+		if (XE_IOCTL_DBG(xe, lrc_idx >= q->width || q->lrc[lrc_idx] != lrc)) {
+			xe_exec_queue_put(active);
+			return -EBUSY;
+		}
+	}
+
+	xe_exec_queue_put(active);
+
+	return xe_gt_eu_attention_bitmap(q->gt, bits, bitmask_size);
+}
+
+static struct xe_eudebug_eu_control_ops eu_control = {
+	.interrupt_all = xe_eu_control_interrupt_all,
+	.stopped = xe_eu_control_stopped,
+	.resume = xe_eu_control_resume,
+};
+
+void xe_eudebug_hw_init(struct xe_eudebug *d)
+{
+	d->ops = &eu_control;
+}
+
+long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg)
+{
+	struct drm_xe_eudebug_eu_control __user * const user_ptr =
+		u64_to_user_ptr(arg);
+	struct drm_xe_eudebug_eu_control user_arg;
+	struct xe_device *xe = d->xe;
+	int ret;
+
+	if (XE_IOCTL_DBG(xe, !(_IOC_DIR(DRM_XE_EUDEBUG_IOCTL_EU_CONTROL) & _IOC_WRITE)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, !(_IOC_DIR(DRM_XE_EUDEBUG_IOCTL_EU_CONTROL) & _IOC_READ)))
+		return -EINVAL;
+
+	if (XE_IOCTL_DBG(xe, _IOC_SIZE(DRM_XE_EUDEBUG_IOCTL_EU_CONTROL) != sizeof(user_arg)))
+		return -EINVAL;
+
+	if (copy_from_user(&user_arg,
+			   user_ptr,
+			   sizeof(user_arg)))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, user_arg.flags))
+		return -EINVAL;
+
+	if (!access_ok(u64_to_user_ptr(user_arg.bitmask_ptr), user_arg.bitmask_size))
+		return -EFAULT;
+
+	eu_dbg(d,
+	       "eu_control: cmd=%u, flags=0x%x, exec_queue_handle=%llu, bitmask_size=%u\n",
+	       user_arg.cmd, user_arg.flags, user_arg.exec_queue_handle,
+	       user_arg.bitmask_size);
+
+	ret = do_eu_control(d, &user_arg, user_ptr);
+
+	eu_dbg(d,
+	       "eu_control: cmd=%u, flags=0x%x, exec_queue_handle=%llu, bitmask_size=%u ret=%d\n",
+	       user_arg.cmd, user_arg.flags, user_arg.exec_queue_handle,
+	       user_arg.bitmask_size, ret);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
index 7362ed9bde68..8f59ec574e4e 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
@@ -16,10 +16,17 @@ struct xe_gt;
 
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
+void xe_eudebug_hw_init(struct xe_eudebug *d);
 void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable);
 
+long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg);
+
+struct xe_exec_queue *xe_gt_runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx);
+
 #else /* CONFIG_DRM_XE_EUDEBUG */
 
+static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable) { }
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif /* _XE_EUDEBUG_HW_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 292e93c72a64..205777a851a3 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -17,7 +17,11 @@
 
 struct xe_device;
 struct task_struct;
+struct xe_eudebug;
+struct xe_hw_engine;
 struct workqueue_struct;
+struct xe_exec_queue;
+struct xe_lrc;
 
 /**
  * enum xe_eudebug_state - eudebug capability state
@@ -76,6 +80,24 @@ struct xe_eudebug_resources {
 	struct xe_eudebug_resource rt[XE_EUDEBUG_RES_TYPE_COUNT];
 };
 
+/**
+ * struct xe_eudebug_eu_control_ops - interface for eu thread
+ * state control backend
+ */
+struct xe_eudebug_eu_control_ops {
+	/** @interrupt_all: interrupts workload active on given hwe */
+	int (*interrupt_all)(struct xe_eudebug *e, struct xe_exec_queue *q,
+			     struct xe_lrc *lrc);
+
+	/** @resume: resumes threads reflected by bitmask active on given hwe */
+	int (*resume)(struct xe_eudebug *e, struct xe_exec_queue *q,
+		      struct xe_lrc *lrc, u8 *bitmap, unsigned int bitmap_size);
+
+	/** @stopped: returns bitmap reflecting threads which signal attention */
+	int (*stopped)(struct xe_eudebug *e, struct xe_exec_queue *q,
+		       struct xe_lrc *lrc, u8 *bitmap, unsigned int bitmap_size);
+};
+
 /**
  * struct xe_eudebug - Top level struct for eudebug: the connection
  */
@@ -144,6 +166,9 @@ struct xe_eudebug {
 		/** @lock: guards access to hw state */
 		struct mutex lock;
 	} hw;
+
+	/** @ops: operations for eu_control */
+	struct xe_eudebug_eu_control_ops *ops;
 };
 
 #endif /* _XE_EUDEBUG_TYPES_H_ */
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index cc45ebd47143..24bf3887d556 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -18,6 +18,7 @@ extern "C" {
 #define DRM_XE_EUDEBUG_IOCTL_READ_EVENT		_IO('j', 0x0)
 #define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT		_IOW('j', 0x1, struct drm_xe_eudebug_ack_event)
 #define DRM_XE_EUDEBUG_IOCTL_VM_OPEN		_IOW('j', 0x2, struct drm_xe_eudebug_vm_open)
+#define DRM_XE_EUDEBUG_IOCTL_EU_CONTROL		_IOWR('j', 0x3, struct drm_xe_eudebug_eu_control)
 
 /**
  * struct drm_xe_eudebug_event - Base type of event delivered by xe_eudebug.
@@ -180,6 +181,23 @@ struct drm_xe_eudebug_vm_open {
 	__u64 timeout_ns;
 };
 
+struct drm_xe_eudebug_eu_control {
+
+#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL	0
+#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED		1
+#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME		2
+	__u32 cmd;
+	__u32 flags;
+
+	__u64 seqno;
+
+	__u64 exec_queue_handle;
+	__u64 lrc_handle;
+	__u32 reserved;
+	__u32 bitmask_size;
+	__u64 bitmask_ptr;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0
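
A rough userspace sketch of the stop/inspect/resume flow this ioctl
enables; it is not part of the patch. The uapi header path is assumed,
error handling is simplified, and the exec queue and lrc handles are
assumed to come from earlier eudebug events:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/xe_drm_eudebug.h>	/* assumed install path of the uapi header */

static int stop_inspect_resume(int debug_fd, uint64_t q_handle,
			       uint64_t lrc_handle, uint8_t *bits,
			       uint32_t bits_size)
{
	struct drm_xe_eudebug_eu_control c;

	memset(&c, 0, sizeof(c));
	c.exec_queue_handle = q_handle;
	c.lrc_handle = lrc_handle;

	/* INTERRUPT_ALL only succeeds with a zero bitmask_size. */
	c.cmd = DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL;
	if (ioctl(debug_fd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &c))
		return -1;

	/* Snapshot which EU threads are stopped in SIP. */
	c.cmd = DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED;
	c.bitmask_ptr = (uintptr_t)bits;
	c.bitmask_size = bits_size;	/* kernel reports the HW size back if it differs */
	if (ioctl(debug_fd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &c))
		return -1;

	/* ... inspect/modify the stopped threads here ... */

	/* Resume the threads whose bits are set in the bitmask. */
	c.cmd = DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME;
	if (ioctl(debug_fd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &c))
		return -1;

	return 0;
}

On success the kernel also writes a seqno into the struct, taken from
the same counter used for the event queue, so the control operation can
be ordered against attention events.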


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 12/20] drm/xe/eudebug: Introduce per device attention scan worker
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (10 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 11/20] drm/xe/eudebug: Introduce EU control interface Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 13/20] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Mika Kuoppala
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek, Mika Kuoppala

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

Scan for EU debugging attention bits periodically to detect whether
some EU thread has entered the system routine (SIP) due to an EU
thread exception.

Make the scanning interval roughly 10 times slower when there is no
debugger connection open. Send an attention event whenever we see
attention while a debugger is present. If there is no active debugger
connection, reset the GT.

Based on work by the authors and other folks who worked on attentions
in i915.
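
Concretely, the poll interval in attention_poll_work() is 100 ms while
at least one target is attached; with no targets the delay is
multiplied by 11 (~1.1 s) and, being over HZ, rounded with
round_jiffies_up_relative() so the wakeup can batch with other timers.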

v2: - use xa_array for files
    - null ptr deref fix for non-debugged context (Dominik)
    - checkpatch (Tilak)
    - use discovery_lock during list traversal

v3: - engine status per gen improvements, force_wake ref
    - __counted_by (Mika)

v4: - attention register naming (Dominik)

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h  |   3 +
 drivers/gpu/drm/xe/xe_eudebug.c       | 171 ++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug_hw.c    |   6 +-
 drivers/gpu/drm/xe/xe_eudebug_types.h |   3 +-
 include/uapi/drm/xe_drm_eudebug.h     |  12 ++
 5 files changed, 190 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 163305440fdf..9ea962dd4749 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -632,6 +632,9 @@ struct xe_device {
 
 		/** @wq: used for client discovery */
 		struct workqueue_struct *wq;
+
+		/** @attention_dwork: attention poll work */
+		struct delayed_work attention_dwork;
 	} eudebug;
 #endif
 
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index a71797b4e9dd..0dae4694b8a0 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -21,7 +21,10 @@
 #include "xe_exec_queue.h"
 #include "xe_gt.h"
 #include "xe_hw_engine.h"
+#include "xe_gt.h"
+#include "xe_gt_debug.h"
 #include "xe_macros.h"
+#include "xe_pm.h"
 #include "xe_sync.h"
 #include "xe_vm.h"
 
@@ -1871,6 +1874,154 @@ static const struct file_operations fops = {
 	.unlocked_ioctl	= xe_eudebug_ioctl,
 };
 
+static int send_attention_event(struct xe_eudebug *d, struct xe_exec_queue *q, int lrc_idx)
+{
+	struct drm_xe_eudebug_event_eu_attention *e;
+	struct drm_xe_eudebug_event *event;
+	const u32 size = xe_gt_eu_attention_bitmap_size(q->gt);
+	const u32 sz = struct_size(e, bitmask, size);
+	int h_queue, h_lrc;
+	int ret;
+
+	XE_WARN_ON(lrc_idx < 0 || lrc_idx >= q->width);
+
+	XE_WARN_ON(!xe_exec_queue_is_debuggable(q));
+
+	h_queue = find_handle(d->res, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, q);
+	if (h_queue < 0)
+		return h_queue;
+
+	h_lrc = find_handle(d->res, XE_EUDEBUG_RES_TYPE_LRC, q->lrc[lrc_idx]);
+	if (h_lrc < 0)
+		return h_lrc;
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION, 0,
+					DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, sz);
+
+	if (!event)
+		return -ENOSPC;
+
+	e = cast_event(e, event);
+	e->exec_queue_handle = h_queue;
+	e->lrc_handle = h_lrc;
+	e->bitmask_size = size;
+
+	mutex_lock(&d->hw.lock);
+	event->seqno = atomic_long_inc_return(&d->events.seqno);
+	ret = xe_gt_eu_attention_bitmap(q->gt, &e->bitmask[0], e->bitmask_size);
+	mutex_unlock(&d->hw.lock);
+
+	if (ret)
+		return ret;
+
+	return xe_eudebug_queue_event(d, event);
+}
+
+static int xe_send_gt_attention(struct xe_gt *gt)
+{
+	struct xe_eudebug *d;
+	struct xe_exec_queue *q;
+	int ret, lrc_idx;
+
+	q = xe_gt_runalone_active_queue_get(gt, &lrc_idx);
+	if (IS_ERR(q))
+		return PTR_ERR(q);
+
+	if (!xe_exec_queue_is_debuggable(q)) {
+		ret = -EPERM;
+		goto err_exec_queue_put;
+	}
+
+	d = _xe_eudebug_get(q->vm->xef);
+	if (!d) {
+		ret = -ENOTCONN;
+		goto err_exec_queue_put;
+	}
+
+	if (!completion_done(&d->discovery)) {
+		eu_dbg(d, "discovery not yet done\n");
+		ret = -EBUSY;
+		goto err_eudebug_put;
+	}
+
+	ret = send_attention_event(d, q, lrc_idx);
+	if (ret)
+		xe_eudebug_disconnect(d, ret);
+
+err_eudebug_put:
+	xe_eudebug_put(d);
+err_exec_queue_put:
+	xe_exec_queue_put(q);
+
+	return ret;
+}
+
+static int xe_eudebug_handle_gt_attention(struct xe_gt *gt)
+{
+	int ret;
+
+	ret = xe_gt_eu_threads_needing_attention(gt);
+	if (ret <= 0)
+		return ret;
+
+	ret = xe_send_gt_attention(gt);
+
+	/* Discovery in progress, fake it */
+	if (ret == -EBUSY)
+		return 0;
+
+	return ret;
+}
+
+static void attention_poll_work(struct work_struct *work)
+{
+	struct xe_device *xe = container_of(work, typeof(*xe),
+					    eudebug.attention_dwork.work);
+	const unsigned int poll_interval_ms = 100;
+	long delay = msecs_to_jiffies(poll_interval_ms);
+	struct xe_gt *gt;
+	u8 gt_id;
+
+	if (list_empty(&xe->eudebug.targets))
+		delay *= 11;
+
+	if (delay >= HZ)
+		delay = round_jiffies_up_relative(delay);
+
+	if (xe_pm_runtime_get_if_active(xe)) {
+		for_each_gt(gt, xe, gt_id) {
+			int ret;
+
+			if (gt->info.type != XE_GT_TYPE_MAIN)
+				continue;
+
+			ret = xe_eudebug_handle_gt_attention(gt);
+			if (ret) {
+				/* TODO: error capture */
+				drm_info(&gt_to_xe(gt)->drm,
+					 "gt:%d unable to handle eu attention ret=%d\n",
+					 gt_id, ret);
+
+				xe_gt_reset_async(gt);
+			}
+		}
+
+		xe_pm_runtime_put(xe);
+	}
+
+	schedule_delayed_work(&xe->eudebug.attention_dwork, delay);
+}
+
+static void attention_poll_stop(struct xe_device *xe)
+{
+	cancel_delayed_work_sync(&xe->eudebug.attention_dwork);
+}
+
+static void attention_poll_start(struct xe_device *xe)
+{
+	mod_delayed_work(system_wq, &xe->eudebug.attention_dwork, 0);
+}
+
 static int
 xe_eudebug_connect(struct xe_device *xe,
 		   struct drm_file *file,
@@ -1942,6 +2093,7 @@ xe_eudebug_connect(struct xe_device *xe,
 
 	kref_get(&d->ref);
 	queue_work(xe->eudebug.wq, &d->discovery_work);
+	attention_poll_start(xe);
 
 	eu_dbg(d, "connected session %lld", d->session);
 
@@ -2006,6 +2158,11 @@ static int xe_eudebug_enable(struct xe_device *xe, bool enable)
 		XE_EUDEBUG_ENABLED : XE_EUDEBUG_DISABLED;
 	mutex_unlock(&xe->eudebug.lock);
 
+	if (enable)
+		attention_poll_start(xe);
+	else
+		attention_poll_stop(xe);
+
 	return 0;
 }
 
@@ -2047,6 +2204,15 @@ static void xe_eudebug_sysfs_fini(void *arg)
 			  &dev_attr_enable_eudebug.attr);
 }
 
+static void xe_eudebug_fini(struct drm_device *dev, void *__unused)
+{
+	struct xe_device *xe = to_xe_device(dev);
+
+	xe_assert(xe, list_empty(&xe->eudebug.targets));
+
+	attention_poll_stop(xe);
+}
+
 void xe_eudebug_init(struct xe_device *xe)
 {
 	struct drm_device *dev = &xe->drm;
@@ -2054,6 +2220,7 @@ void xe_eudebug_init(struct xe_device *xe)
 	int err;
 
 	INIT_LIST_HEAD(&xe->eudebug.targets);
+	INIT_DELAYED_WORK(&xe->eudebug.attention_dwork, attention_poll_work);
 
 	xe->eudebug.state = XE_EUDEBUG_NOT_SUPPORTED;
 
@@ -2068,6 +2235,10 @@ void xe_eudebug_init(struct xe_device *xe)
 	}
 	xe->eudebug.wq = wq;
 
+	err = drmm_add_action_or_reset(&xe->drm, xe_eudebug_fini, NULL);
+	if (err)
+		goto out_err;
+
 	err = sysfs_create_file(&dev->dev->kobj,
 				&dev_attr_enable_eudebug.attr);
 	if (err)
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
index 11e3fe0c05e0..a62c4b439888 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.c
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
@@ -301,7 +301,7 @@ static struct xe_exec_queue *active_hwe_to_exec_queue(struct xe_hw_engine *hwe,
 	return found;
 }
 
-static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx)
+struct xe_exec_queue *xe_gt_runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx)
 {
 	struct xe_hw_engine *active;
 
@@ -612,7 +612,7 @@ static int xe_eu_control_resume(struct xe_eudebug *d,
 	struct xe_exec_queue *active;
 	int lrc_idx;
 
-	active = runalone_active_queue_get(q->gt, &lrc_idx);
+	active = xe_gt_runalone_active_queue_get(q->gt, &lrc_idx);
 	if (IS_ERR(active))
 		return PTR_ERR(active);
 
@@ -654,7 +654,7 @@ static int xe_eu_control_stopped(struct xe_eudebug *d,
 	if (XE_WARN_ON(!q) || XE_WARN_ON(!q->gt))
 		return -EINVAL;
 
-	active = runalone_active_queue_get(q->gt, &lrc_idx);
+	active = xe_gt_runalone_active_queue_get(q->gt, &lrc_idx);
 	if (IS_ERR(active))
 		return PTR_ERR(active);
 
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 205777a851a3..85fc321f8b0e 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -37,7 +37,7 @@ enum xe_eudebug_state {
 };
 
 #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64
-#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE
+#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_EU_ATTENTION
 
 /**
  * struct xe_eudebug_handle - eudebug resource handle
@@ -172,4 +172,3 @@ struct xe_eudebug {
 };
 
 #endif /* _XE_EUDEBUG_TYPES_H_ */
-
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index 24bf3887d556..1c797a8b4d32 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -55,12 +55,14 @@ struct drm_xe_eudebug_event {
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND		4
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA	5
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE	6
+#define DRM_XE_EUDEBUG_EVENT_EU_ATTENTION	7
 
 	__u16 flags;
 #define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
 #define DRM_XE_EUDEBUG_EVENT_DESTROY		(1 << 1)
 #define DRM_XE_EUDEBUG_EVENT_STATE_CHANGE	(1 << 2)
 #define DRM_XE_EUDEBUG_EVENT_NEED_ACK		(1 << 3)
+
 	__u64 seqno;
 	__u64 reserved;
 };
@@ -198,6 +200,16 @@ struct drm_xe_eudebug_eu_control {
 	__u64 bitmask_ptr;
 };
 
+struct drm_xe_eudebug_event_eu_attention {
+	struct drm_xe_eudebug_event base;
+
+	__u64 exec_queue_handle;
+	__u64 lrc_handle;
+	__u32 flags;
+	__u32 bitmask_size;
+	__u8 bitmask[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 13/20] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (11 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 12/20] drm/xe/eudebug: Introduce per device attention scan worker Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 14/20] drm/xe: Implement SR-IOV and eudebug exclusivity Mika Kuoppala
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

From: Christoph Manszewski <christoph.manszewski@intel.com>

Introduce a kunit test for eudebug. For now it checks the dynamic
application of WAs.

v2: adapt to removal of call_for_each_device (Mika)
v3: s/FW_RENDER/FORCEWAKE_ALL (Mika)

Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/tests/xe_eudebug.c       | 183 ++++++++++++++++++++
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c |   5 +
 drivers/gpu/drm/xe/xe_eudebug.c             |   4 +
 3 files changed, 192 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/tests/xe_eudebug.c

diff --git a/drivers/gpu/drm/xe/tests/xe_eudebug.c b/drivers/gpu/drm/xe/tests/xe_eudebug.c
new file mode 100644
index 000000000000..f839fb292b9b
--- /dev/null
+++ b/drivers/gpu/drm/xe/tests/xe_eudebug.c
@@ -0,0 +1,183 @@
+// SPDX-License-Identifier: GPL-2.0 AND MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include <kunit/visibility.h>
+
+#include "regs/xe_gt_regs.h"
+#include "regs/xe_engine_regs.h"
+
+#include "xe_force_wake.h"
+#include "xe_gt_mcr.h"
+#include "xe_mmio.h"
+
+#include "tests/xe_kunit_helpers.h"
+#include "tests/xe_pci_test.h"
+#include "tests/xe_test.h"
+
+#undef XE_REG_MCR
+#define XE_REG_MCR(r_, ...)	((const struct xe_reg_mcr){					\
+				 .__reg = XE_REG_INITIALIZER(r_,  ##__VA_ARGS__, .mcr = 1)	\
+				 })
+
+static const char *reg_to_str(struct xe_reg reg)
+{
+	if (reg.raw == TD_CTL.__reg.raw)
+		return "TD_CTL";
+	else if (reg.raw == CS_DEBUG_MODE2(RENDER_RING_BASE).raw)
+		return "CS_DEBUG_MODE2";
+	else if (reg.raw == ROW_CHICKEN.__reg.raw)
+		return "ROW_CHICKEN";
+	else if (reg.raw == ROW_CHICKEN2.__reg.raw)
+		return "ROW_CHICKEN2";
+	else if (reg.raw == ROW_CHICKEN3.__reg.raw)
+		return "ROW_CHICKEN3";
+	else
+		return "UNKNOWN REG";
+}
+
+static u32 get_reg_mask(struct xe_device *xe, struct xe_reg reg)
+{
+	struct kunit *test = kunit_get_current_test();
+	u32 val = 0;
+
+	if (reg.raw == TD_CTL.__reg.raw) {
+		val = TD_CTL_BREAKPOINT_ENABLE |
+		      TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE |
+		      TD_CTL_FEH_AND_FEE_ENABLE;
+
+		if (GRAPHICS_VERx100(xe) >= 1250)
+			val |= TD_CTL_GLOBAL_DEBUG_ENABLE;
+
+	} else if (reg.raw == CS_DEBUG_MODE2(RENDER_RING_BASE).raw) {
+		val = GLOBAL_DEBUG_ENABLE;
+	} else if (reg.raw == ROW_CHICKEN.__reg.raw) {
+		val = STALL_DOP_GATING_DISABLE;
+	} else if (reg.raw == ROW_CHICKEN2.__reg.raw) {
+		val = XEHPC_DISABLE_BTB;
+	} else if (reg.raw == ROW_CHICKEN3.__reg.raw) {
+		val = XE2_EUPEND_CHK_FLUSH_DIS;
+	} else {
+		kunit_warn(test, "Invalid register selection: %u\n", reg.raw);
+	}
+
+	return val;
+}
+
+static u32 get_reg_expected(struct xe_device *xe, struct xe_reg reg, bool enable_eudebug)
+{
+	u32 reg_mask = get_reg_mask(xe, reg);
+	u32 reg_bits = 0;
+
+	if (enable_eudebug || reg.raw == ROW_CHICKEN3.__reg.raw)
+		reg_bits = reg_mask;
+	else
+		reg_bits = 0;
+
+	return reg_bits;
+}
+
+static void check_reg(struct xe_gt *gt, bool enable_eudebug, struct xe_reg reg)
+{
+	struct kunit *test = kunit_get_current_test();
+	struct xe_device *xe = gt_to_xe(gt);
+	u32 reg_bits_expected = get_reg_expected(xe, reg, enable_eudebug);
+	u32 reg_mask = get_reg_mask(xe, reg);
+	u32 reg_bits = 0;
+
+	if (reg.mcr)
+		reg_bits = xe_gt_mcr_unicast_read_any(gt, (struct xe_reg_mcr){.__reg = reg});
+	else
+		reg_bits = xe_mmio_read32(&gt->mmio, reg);
+
+	reg_bits &= reg_mask;
+
+	kunit_printk(KERN_DEBUG, test, "%s bits: expected == 0x%x; actual == 0x%x\n",
+		     reg_to_str(reg), reg_bits_expected, reg_bits);
+	KUNIT_EXPECT_EQ_MSG(test, reg_bits_expected, reg_bits,
+			    "Invalid bits set for %s\n", reg_to_str(reg));
+}
+
+static void __check_regs(struct xe_gt *gt, bool enable_eudebug)
+{
+	struct xe_device *xe = gt_to_xe(gt);
+
+	if (GRAPHICS_VERx100(xe) >= 1200)
+		check_reg(gt, enable_eudebug, TD_CTL.__reg);
+
+	if (GRAPHICS_VERx100(xe) >= 1250 && GRAPHICS_VERx100(xe) <= 1274)
+		check_reg(gt, enable_eudebug, ROW_CHICKEN.__reg);
+
+	if (xe->info.platform == XE_PVC)
+		check_reg(gt, enable_eudebug, ROW_CHICKEN2.__reg);
+
+	if (GRAPHICS_VERx100(xe) >= 2000 && GRAPHICS_VERx100(xe) <= 2004)
+		check_reg(gt, enable_eudebug, ROW_CHICKEN3.__reg);
+}
+
+static void check_regs(struct xe_device *xe, bool enable_eudebug)
+{
+	struct kunit *test = kunit_get_current_test();
+	struct xe_gt *gt;
+	unsigned int fw_ref;
+	u8 id;
+
+	kunit_printk(KERN_DEBUG, test, "Check regs for eudebug %s\n",
+		     enable_eudebug ? "enabled" : "disabled");
+
+	xe_pm_runtime_get(xe);
+	for_each_gt(gt, xe, id) {
+		if (xe_gt_is_media_type(gt))
+			continue;
+
+		/* XXX: Figure out per platform proper domain */
+		fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
+		KUNIT_ASSERT_TRUE_MSG(test, fw_ref, "Forcewake failed.\n");
+
+		__check_regs(gt, enable_eudebug);
+
+		xe_force_wake_put(gt_to_fw(gt), fw_ref);
+	}
+	xe_pm_runtime_put(xe);
+}
+
+static int toggle_reg_value(struct xe_device *xe)
+{
+	struct kunit *test = kunit_get_current_test();
+	bool enable_eudebug = xe_eudebug_is_enabled(xe);
+
+	kunit_printk(KERN_DEBUG, test, "Test eudebug WAs for graphics version: %u\n",
+		     GRAPHICS_VERx100(xe));
+
+	check_regs(xe, enable_eudebug);
+
+	xe_eudebug_enable(xe, !enable_eudebug);
+	check_regs(xe, !enable_eudebug);
+
+	xe_eudebug_enable(xe, enable_eudebug);
+	check_regs(xe, enable_eudebug);
+
+	return 0;
+}
+
+static void xe_eudebug_toggle_reg_kunit(struct kunit *test)
+{
+	struct xe_device *xe = test->priv;
+
+	toggle_reg_value(xe);
+}
+
+static struct kunit_case xe_eudebug_tests[] = {
+	KUNIT_CASE_PARAM(xe_eudebug_toggle_reg_kunit,
+			 xe_pci_live_device_gen_param),
+	{}
+};
+
+VISIBLE_IF_KUNIT
+struct kunit_suite xe_eudebug_test_suite = {
+	.name = "xe_eudebug",
+	.test_cases = xe_eudebug_tests,
+	.init = xe_kunit_helper_xe_device_live_test_init,
+};
+EXPORT_SYMBOL_IF_KUNIT(xe_eudebug_test_suite);
diff --git a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c
index c55e46f1ae92..dc83bb6a892d 100644
--- a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c
+++ b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c
@@ -19,6 +19,11 @@ kunit_test_suite(xe_migrate_test_suite);
 kunit_test_suite(xe_mocs_test_suite);
 kunit_test_suite(xe_guc_g2g_test_suite);
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+extern struct kunit_suite xe_eudebug_test_suite;
+kunit_test_suite(xe_eudebug_test_suite);
+#endif
+
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("xe live kunit tests");
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 0dae4694b8a0..a20397229e65 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -2265,3 +2265,7 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev,
 
 	return xe_eudebug_connect(xe, file, param);
 }
+
+#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
+#include "tests/xe_eudebug.c"
+#endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 14/20] drm/xe: Implement SR-IOV and eudebug exclusivity
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (12 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 13/20] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 15/20] drm/xe: Add xe_client_debugfs and introduce debug_data file Mika Kuoppala
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun

From: Christoph Manszewski <christoph.manszewski@intel.com>

EU debug functionality relies on access to specific MMIO registers.
Since VFs don't have access to those registers, and in order to avoid
interfering with VFs, make SR-IOV and eudebug functionality mutually
exclusive: don't allow enabling eudebug in VF mode or while any VFs are
provisioned, and likewise don't allow provisioning VFs while eudebug is
enabled.

Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
---
 drivers/gpu/drm/xe/tests/xe_eudebug.c |  6 +++++
 drivers/gpu/drm/xe/xe_eudebug.c       | 33 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug.h       |  6 +++++
 drivers/gpu/drm/xe/xe_exec_queue.c    |  3 +++
 drivers/gpu/drm/xe/xe_gt.c            |  1 +
 drivers/gpu/drm/xe/xe_pci_sriov.c     | 10 ++++++++
 6 files changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/xe/tests/xe_eudebug.c b/drivers/gpu/drm/xe/tests/xe_eudebug.c
index f839fb292b9b..c1e5eb091fc4 100644
--- a/drivers/gpu/drm/xe/tests/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/tests/xe_eudebug.c
@@ -147,6 +147,12 @@ static int toggle_reg_value(struct xe_device *xe)
 	struct kunit *test = kunit_get_current_test();
 	bool enable_eudebug = xe_eudebug_is_enabled(xe);
 
+	if (IS_SRIOV_VF(xe))
+		kunit_skip(test, "eudebug not available in SR-IOV VF mode\n");
+
+	if (xe->eudebug.state == XE_EUDEBUG_NOT_SUPPORTED)
+		kunit_skip(test, "eudebug not supported\n");
+
 	kunit_printk(KERN_DEBUG, test, "Test eudebug WAs for graphics version: %u\n",
 		     GRAPHICS_VERx100(xe));
 
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index a20397229e65..5dc8e4cd7f6b 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -2119,6 +2119,34 @@ bool xe_eudebug_is_enabled(struct xe_device *xe)
 	return READ_ONCE(xe->eudebug.state) == XE_EUDEBUG_ENABLED;
 }
 
+static int __xe_eudebug_toggle_support(struct xe_device *xe,
+				       bool support_enable)
+{
+	mutex_lock(&xe->eudebug.lock);
+
+	if (xe_eudebug_is_enabled(xe)) {
+		mutex_unlock(&xe->eudebug.lock);
+		return -EPERM;
+	}
+
+	xe->eudebug.state = support_enable ?
+		XE_EUDEBUG_DISABLED : XE_EUDEBUG_NOT_SUPPORTED;
+
+	mutex_unlock(&xe->eudebug.lock);
+
+	return 0;
+}
+
+void xe_eudebug_support_enable(struct xe_device *xe)
+{
+	__xe_eudebug_toggle_support(xe, true);
+}
+
+int xe_eudebug_support_disable(struct xe_device *xe)
+{
+	return __xe_eudebug_toggle_support(xe, false);
+}
+
 static int xe_eudebug_enable(struct xe_device *xe, bool enable)
 {
 	struct xe_gt *gt;
@@ -2224,6 +2252,11 @@ void xe_eudebug_init(struct xe_device *xe)
 
 	xe->eudebug.state = XE_EUDEBUG_NOT_SUPPORTED;
 
+	if (IS_SRIOV_VF(xe)) {
+		drm_info(&xe->drm, "eudebug not available in SR-IOV VF mode\n");
+		return;
+	}
+
 	err = drmm_mutex_init(dev, &xe->eudebug.lock);
 	if (err)
 		goto out_err;
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index 208b18127603..ed3c2078e960 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -46,6 +46,9 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev,
 			     void *data,
 			     struct drm_file *file);
 
+void xe_eudebug_support_enable(struct xe_device *xe);
+int xe_eudebug_support_disable(struct xe_device *xe);
+
 void xe_eudebug_init(struct xe_device *xe);
 bool xe_eudebug_is_enabled(struct xe_device *xe);
 
@@ -82,6 +85,9 @@ static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
 					   void *data,
 					   struct drm_file *file) { return 0; }
 
+static inline void xe_eudebug_support_enable(struct xe_device *xe) { }
+static inline int xe_eudebug_support_disable(struct xe_device *xe) { return 0; }
+
 static inline void xe_eudebug_init(struct xe_device *xe) { }
 static inline bool xe_eudebug_is_enabled(struct xe_device *xe) { return false; }
 
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 02f4e412fcdf..c44a73038fd0 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -558,6 +558,9 @@ static int exec_queue_set_eudebug(struct xe_device *xe, struct xe_exec_queue *q,
 			 !(value & DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)))
 		return -EINVAL;
 
+	if (XE_IOCTL_DBG(xe, !xe_eudebug_is_enabled(xe)))
+		return -EPERM;
+
 	q->eudebug_flags = EXEC_QUEUE_EUDEBUG_FLAG_ENABLE;
 	q->sched_props.preempt_timeout_us = 0;
 
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index b77572a19548..4c89f9bb4767 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -21,6 +21,7 @@
 #include "xe_bb.h"
 #include "xe_bo.h"
 #include "xe_device.h"
+#include "xe_eudebug.h"
 #include "xe_eu_stall.h"
 #include "xe_exec_queue.h"
 #include "xe_execlist.h"
diff --git a/drivers/gpu/drm/xe/xe_pci_sriov.c b/drivers/gpu/drm/xe/xe_pci_sriov.c
index 9c1c9e669b04..49661c36ccf4 100644
--- a/drivers/gpu/drm/xe/xe_pci_sriov.c
+++ b/drivers/gpu/drm/xe/xe_pci_sriov.c
@@ -9,6 +9,7 @@
 #include "regs/xe_bars.h"
 #include "xe_assert.h"
 #include "xe_device.h"
+#include "xe_eudebug.h"
 #include "xe_gt_sriov_pf_config.h"
 #include "xe_gt_sriov_pf_control.h"
 #include "xe_gt_sriov_printk.h"
@@ -153,6 +154,10 @@ static int pf_enable_vfs(struct xe_device *xe, int num_vfs)
 	xe_assert(xe, num_vfs <= total_vfs);
 	xe_sriov_dbg(xe, "enabling %u VF%s\n", num_vfs, str_plural(num_vfs));
 
+	err = xe_eudebug_support_disable(xe);
+	if (err < 0)
+		goto failed_eudebug;
+
 	err = xe_sriov_pf_wait_ready(xe);
 	if (err)
 		goto out;
@@ -195,6 +200,9 @@ static int pf_enable_vfs(struct xe_device *xe, int num_vfs)
 	pf_unprovision_vfs(xe, num_vfs);
 	xe_pm_runtime_put(xe);
 out:
+	xe_eudebug_support_enable(xe);
+failed_eudebug:
+
 	xe_sriov_notice(xe, "Failed to enable %u VF%s (%pe)\n",
 			num_vfs, str_plural(num_vfs), ERR_PTR(err));
 	return err;
@@ -223,6 +231,8 @@ static int pf_disable_vfs(struct xe_device *xe)
 	/* not needed anymore - see pf_enable_vfs() */
 	xe_pm_runtime_put(xe);
 
+	xe_eudebug_support_enable(xe);
+
 	xe_sriov_info(xe, "Disabled %u VF%s\n", num_vfs, str_plural(num_vfs));
 	return 0;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 15/20] drm/xe: Add xe_client_debugfs and introduce debug_data file
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (13 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 14/20] drm/xe: Implement SR-IOV and eudebug exclusivity Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun

From: Christoph Manszewski <christoph.manszewski@intel.com>

Create a debug_data file for each xe file/client. It lists all mapped
debug data and mimics '/proc/<pid>/maps'. Each line represents a single
mapping and has the following format:

  <vm id> <begin>-<end> <flags> <offset> <pathname>
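
For illustration only (the vm id, addresses, flags, offset and path below
are invented, not taken from this series), a single entry could then look
like:

  3 0x1a0000-0x1b0000 0x0 0x0    /tmp/shader_debug.elf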

Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
---
 drivers/gpu/drm/xe/Makefile            |   3 +-
 drivers/gpu/drm/xe/xe_client_debugfs.c | 118 +++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_client_debugfs.h |  19 ++++
 drivers/gpu/drm/xe/xe_device.c         |   3 +
 4 files changed, 142 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.c
 create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index ecbca68d3c2a..16666f0a4c01 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -330,7 +330,8 @@ ifeq ($(CONFIG_DRM_FBDEV_EMULATION),y)
 endif
 
 ifeq ($(CONFIG_DEBUG_FS),y)
-	xe-y += xe_debugfs.o \
+	xe-y += xe_client_debugfs.o \
+		xe_debugfs.o \
 		xe_gt_debugfs.o \
 		xe_gt_sriov_vf_debugfs.o \
 		xe_gt_stats.o \
diff --git a/drivers/gpu/drm/xe/xe_client_debugfs.c b/drivers/gpu/drm/xe/xe_client_debugfs.c
new file mode 100644
index 000000000000..0b952038e698
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_client_debugfs.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_client_debugfs.h"
+
+#include <linux/debugfs.h>
+
+#include "xe_debug_data.h"
+#include "xe_debug_data_types.h"
+#include "xe_device_types.h"
+#include "xe_vm_types.h"
+
+#define MAX_LINE_LEN (64 + PATH_MAX)
+
+static ssize_t debug_data_read(struct file *file, char __user *buf, size_t count,
+			       loff_t *ppos)
+{
+	struct xe_debug_data *dd;
+	unsigned long vm_index;
+	const char *path;
+	char *kbuf;
+	struct xe_vm *vm;
+
+	struct xe_file *xef = file->private_data;
+	ssize_t total = 0;
+	loff_t pos = 0;
+
+	if (!xef || !buf)
+		return -EINVAL;
+
+	kbuf = kmalloc(MAX_LINE_LEN, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	mutex_lock(&xef->vm.lock);
+
+	xa_for_each(&xef->vm.xa, vm_index, vm) {
+		mutex_lock(&vm->debug_data.lock);
+		list_for_each_entry(dd, &vm->debug_data.list, link) {
+			int len;
+
+			path = dd->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO ?
+				xe_debug_data_pseudo_path_to_string(dd->pseudopath) :
+				dd->pathname;
+
+			/* Format: <vm id> <begin>-<end> <flags> <offset> <pathname> */
+			len = snprintf(kbuf, MAX_LINE_LEN, "%lu 0x%llx-0x%llx 0x%llx 0x%x\t%s\n",
+				vm_index,
+				dd->addr,
+				dd->addr + dd->range,
+				dd->flags,
+				dd->offset,
+				path);
+
+			if (pos + len <= *ppos) {
+				pos += len;
+				continue;
+			}
+
+			if (pos < *ppos) {
+				const int skip = *ppos - pos;
+
+				len -= skip;
+				memmove(kbuf, kbuf + skip, len);
+				pos = *ppos;
+			}
+
+			if (total + len > count)
+				len = count - total;
+
+			if (copy_to_user(buf + total, kbuf, len)) {
+				mutex_unlock(&vm->debug_data.lock);
+				mutex_unlock(&xef->vm.lock);
+				kfree(kbuf);
+				return -EFAULT;
+			}
+
+			total += len;
+			pos += len;
+
+			if (total >= count) {
+				mutex_unlock(&vm->debug_data.lock);
+				mutex_unlock(&xef->vm.lock);
+				kfree(kbuf);
+				*ppos = pos;
+				return total;
+			}
+		}
+		mutex_unlock(&vm->debug_data.lock);
+	}
+
+	mutex_unlock(&xef->vm.lock);
+	kfree(kbuf);
+	*ppos = pos;
+	return total;
+}
+
+static int debug_data_open(struct inode *inode, struct file *file)
+{
+	struct xe_file *xef = inode->i_private;
+
+	file->private_data = xef;
+	return 0;
+}
+
+static const struct file_operations maps_fops = {
+	.owner = THIS_MODULE,
+	.open = debug_data_open,
+	.read = debug_data_read,
+	.llseek = default_llseek,
+};
+
+void xe_client_debugfs_register(struct xe_file *xef)
+{
+	debugfs_create_file("debug_data", 0444, xef->drm->debugfs_client, xef, &maps_fops);
+}
diff --git a/drivers/gpu/drm/xe/xe_client_debugfs.h b/drivers/gpu/drm/xe/xe_client_debugfs.h
new file mode 100644
index 000000000000..9eace15c0a49
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_client_debugfs.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_CLIENT_DEBUGFS_H_
+#define _XE_CLIENT_DEBUGFS_H_
+
+#include <linux/debugfs.h>
+
+struct xe_file;
+
+#ifdef CONFIG_DEBUG_FS
+void xe_client_debugfs_register(struct xe_file *xef);
+#else
+static inline void xe_client_debugfs_register(struct xe_file *xef) { }
+#endif
+
+#endif // _XE_CLIENT_DEBUGFS_H_
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index ff9268ed6124..660fe5cdd3fb 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -25,6 +25,7 @@
 #include "regs/xe_regs.h"
 #include "xe_bo.h"
 #include "xe_bo_evict.h"
+#include "xe_client_debugfs.h"
 #include "xe_debugfs.h"
 #include "xe_devcoredump.h"
 #include "xe_device_sysfs.h"
@@ -121,6 +122,8 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 		put_task_struct(task);
 	}
 
+	xe_client_debugfs_register(xef);
+
 	return 0;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (14 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 15/20] drm/xe: Add xe_client_debugfs and introduce debug_data file Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 18:35   ` Matthew Brost
                     ` (2 more replies)
  2025-10-06 11:17 ` [PATCH 17/20] drm/xe/eudebug: Add read/count/compare helper for eu attention Mika Kuoppala
                   ` (7 subsequent siblings)
  23 siblings, 3 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala, Dominik Grzegorzek

We need to inform the GuC which contexts are debuggable,
as their handling differs from that of ordinary contexts.

Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/abi/guc_actions_abi.h |  5 +++
 drivers/gpu/drm/xe/xe_eudebug_hw.c       | 55 ++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug_hw.h       |  4 ++
 drivers/gpu/drm/xe/xe_guc_submit.c       |  4 ++
 4 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
index 47756e4674a1..32a5f680a6d2 100644
--- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
@@ -155,6 +155,7 @@ enum xe_guc_action {
 	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
 	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
 	XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
+	XE_GUC_ACTION_EU_KERNEL_DEBUG = 0x8006,
 	XE_GUC_ACTION_TEST_G2G_SEND = 0xF001,
 	XE_GUC_ACTION_TEST_G2G_RECV = 0xF002,
 	XE_GUC_ACTION_LIMIT
@@ -278,4 +279,8 @@ enum xe_guc_g2g_type {
 /* invalid type for XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR */
 #define XE_GUC_CAT_ERR_TYPE_INVALID 0xdeadbeef
 
+enum  xe_guc_eu_kernel_debug_request_type {
+	XE_GUC_EU_KERNEL_DEBUG_ENABLE = 0x3,
+};
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
index a62c4b439888..cd4627705b56 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.c
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
@@ -12,6 +12,7 @@
 #include "regs/xe_gt_regs.h"
 #include "regs/xe_engine_regs.h"
 
+#include "abi/guc_actions_abi.h"
 #include "xe_eudebug.h"
 #include "xe_eudebug_types.h"
 #include "xe_exec_queue.h"
@@ -20,6 +21,9 @@
 #include "xe_gt.h"
 #include "xe_gt_debug.h"
 #include "xe_gt_mcr.h"
+#include "xe_guc.h"
+#include "xe_guc_ct.h"
+#include "xe_guc_exec_queue_types.h"
 #include "xe_hw_engine.h"
 #include "xe_lrc.h"
 #include "xe_macros.h"
@@ -675,6 +679,57 @@ static int xe_eu_control_stopped(struct xe_eudebug *d,
 	return xe_gt_eu_attention_bitmap(q->gt, bits, bitmask_size);
 }
 
+static int xe_guc_action_eu_kernel_debug(struct xe_device *xe,
+					 struct xe_exec_queue *q,
+					 struct xe_lrc *lrc, u32 cmd)
+{
+	u32 action[] = {
+		XE_GUC_ACTION_EU_KERNEL_DEBUG,
+		q->guc->id,
+		cmd,
+		0, /* reserved */
+	};
+	int ret, i;
+
+	if (cmd != XE_GUC_EU_KERNEL_DEBUG_ENABLE)
+		return -EINVAL;
+
+	ret = -EINVAL;
+	for (i = 0; i < q->width; i++) {
+		if (lrc && q->lrc[i] != lrc)
+			continue;
+
+		action[1] = q->guc->id + i;
+		drm_dbg(&xe->drm, "Guc action[%u] for ctx=%d",
+			cmd, action[1]);
+
+		ret = xe_guc_ct_send(&q->gt->uc.guc.ct,
+				     action, ARRAY_SIZE(action), 0, 0);
+
+		if (ret)
+			drm_dbg(&xe->drm, "eudebug guc cmd %u failed with %d\n",
+				cmd, ret);
+	}
+
+	return ret;
+}
+
+static bool xe_guc_has_debug_contexts(struct xe_gt *gt)
+{
+	return GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 49, 0);
+}
+
+int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q)
+{
+	struct xe_device *xe = gt_to_xe(q->gt);
+
+	if (!xe_guc_has_debug_contexts(q->gt))
+		return 0;
+
+	return xe_guc_action_eu_kernel_debug(xe, q, NULL,
+					     XE_GUC_EU_KERNEL_DEBUG_ENABLE);
+}
+
 static struct xe_eudebug_eu_control_ops eu_control = {
 	.interrupt_all = xe_eu_control_interrupt_all,
 	.stopped = xe_eu_control_stopped,
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
index 8f59ec574e4e..5d1df5d7dc46 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
@@ -23,10 +23,14 @@ long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg);
 
 struct xe_exec_queue *xe_gt_runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx);
 
+int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q);
+
 #else /* CONFIG_DRM_XE_EUDEBUG */
 
 static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable) { }
 
+static inline int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q) { return 0; }
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif /* _XE_EUDEBUG_HW_H_ */
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 16f78376f196..da264c1cfe76 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -21,6 +21,7 @@
 #include "xe_assert.h"
 #include "xe_devcoredump.h"
 #include "xe_device.h"
+#include "xe_eudebug_hw.h"
 #include "xe_exec_queue.h"
 #include "xe_force_wake.h"
 #include "xe_gpu_scheduler.h"
@@ -655,6 +656,9 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
 	if (xe_exec_queue_is_lr(q))
 		xe_exec_queue_get(q);
 
+	if (q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)
+		xe_eudebug_exec_queue_enable(q);
+
 	set_exec_queue_registered(q);
 	trace_xe_exec_queue_register(q);
 	if (xe_exec_queue_is_parallel(q))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 17/20] drm/xe/eudebug: Add read/count/compare helper for eu attention
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (15 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 18/20] drm/xe/eudebug: Introduce EU pagefault handling interface Mika Kuoppala
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

From: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>

Add the xe_eu_attentions structure to capture and store EU attention bits.
Add a function to count the number of EU threads that have raised
attention in a snapshot, and a function to count the number of EU threads
whose attention state has changed between two snapshots.
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_debug.c | 64 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_gt_debug.h |  6 +++
 2 files changed, 70 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_debug.c b/drivers/gpu/drm/xe/xe_gt_debug.c
index 314eef6734c3..8386a527adf0 100644
--- a/drivers/gpu/drm/xe/xe_gt_debug.c
+++ b/drivers/gpu/drm/xe/xe_gt_debug.c
@@ -3,6 +3,7 @@
  * Copyright © 2023 Intel Corporation
  */
 
+#include <linux/delay.h>
 #include "regs/xe_gt_regs.h"
 #include "xe_device.h"
 #include "xe_force_wake.h"
@@ -177,3 +178,66 @@ int xe_gt_eu_threads_needing_attention(struct xe_gt *gt)
 
 	return err < 0 ? 0 : err;
 }
+
+static inline unsigned int
+xe_eu_attentions_count(const struct xe_eu_attentions *a)
+{
+	return bitmap_weight((void *)a->att, a->size * BITS_PER_BYTE);
+}
+
+void xe_gt_eu_attentions_read(struct xe_gt *gt,
+			      struct xe_eu_attentions *a,
+			      const unsigned int settle_time_ms)
+{
+	unsigned int prev = 0;
+	ktime_t end, now;
+
+	now = ktime_get_raw();
+	end = ktime_add_ms(now, settle_time_ms);
+
+	a->ts = 0;
+	a->size = min_t(int,
+			xe_gt_eu_attention_bitmap_size(gt),
+			sizeof(a->att));
+
+	do {
+		unsigned int attn;
+
+		xe_gt_eu_attention_bitmap(gt, a->att, a->size);
+		attn = xe_eu_attentions_count(a);
+
+		now = ktime_get_raw();
+
+		if (a->ts == 0)
+			a->ts = now;
+		else if (attn && attn != prev)
+			a->ts = now;
+
+		prev = attn;
+
+		if (settle_time_ms)
+			udelay(5);
+
+		/*
+		 * XXX We are gathering data for production SIP to find
+		 * the upper limit of settle time. For now, we wait full
+		 * timeout value regardless.
+		 */
+	} while (ktime_before(now, end));
+}
+
+unsigned int xe_eu_attentions_xor_count(const struct xe_eu_attentions *a,
+					const struct xe_eu_attentions *b)
+{
+	unsigned int count = 0;
+	unsigned int i;
+
+	if (XE_WARN_ON(a->size != b->size))
+		return -EINVAL;
+
+	for (i = 0; i < a->size; i++)
+		if (a->att[i] ^ b->att[i])
+			count++;
+
+	return count;
+}
diff --git a/drivers/gpu/drm/xe/xe_gt_debug.h b/drivers/gpu/drm/xe/xe_gt_debug.h
index f882770e18d3..aba36bc5f85d 100644
--- a/drivers/gpu/drm/xe/xe_gt_debug.h
+++ b/drivers/gpu/drm/xe/xe_gt_debug.h
@@ -38,4 +38,10 @@ int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt);
 int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits,
 			      unsigned int bitmap_size);
 
+void xe_gt_eu_attentions_read(struct xe_gt *gt,
+			      struct xe_eu_attentions *a,
+			      const unsigned int settle_time_ms);
+
+unsigned int xe_eu_attentions_xor_count(const struct xe_eu_attentions *a,
+					const struct xe_eu_attentions *b);
 #endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 18/20] drm/xe/eudebug: Introduce EU pagefault handling interface
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (16 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 17/20] drm/xe/eudebug: Add read/count/compare helper for eu attention Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 19/20] drm/xe/vm: Support for adding null page VMA to VM on request Mika Kuoppala
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Jan Maślak, Mika Kuoppala

From: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>

The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid
access will halt the corresponding EUs. To solve this problem, introduce
EU pagefault handling functionality, which allows unhalting the
pagefaulted EU threads and lets the EU debugger be informed about the
attention state of EU threads during execution.

If a pagefault occurs, send a DRM_XE_EUDEBUG_EVENT_PAGEFAULT event
after handling the pagefault. The pagefault eudebug event follows
the newly added drm_xe_eudebug_event_pagefault type.
While a pagefault is being handled, sending the
DRM_XE_EUDEBUG_EVENT_EU_ATTENTION event to the client is prevented.

The pagefault event delivery follows the policy below (see the sketch
after the list).
(1) If EU debugger discovery has completed and the pagefaulted EU threads
    turn on their attention bits, then the pagefault handler delivers the
    pagefault event directly.
(2) If a pagefault occurs during the EU debugger discovery process, the
    pagefault handler queues a pagefault event and sends the queued event
    once discovery has completed and the pagefaulted EU threads turn on
    their attention bits.
(3) If the pagefaulted EU thread struggles to turn on the attention bit
    within the specified time, the attention scan worker sends the
    pagefault event when it detects that the attention bit is turned on.
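
A minimal sketch of this policy, mirroring handle_pagefault() and the
helpers added later in this patch (error handling omitted):

  /* (1) try direct delivery first */
  ret = send_pagefault(gt, pf, false);
  /*
   * (2)/(3) discovery still running or attentions not settled yet:
   * queue the event and let the attention scan worker retry later
   * via send_queued_pagefault().
   */
  if (ret == -EBUSY)
          queue_pagefault(gt, pf);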

If multiple EU threads are running and pagefault on the same invalid
address, send a single pagefault event (of DRM_XE_EUDEBUG_EVENT_PAGEFAULT
type) to the user debugger instead of one pagefault event per EU thread.
If EU threads other than the one that caused the earlier pagefault access
new invalid addresses, send a new pagefault event.

As the attention scan worker sends the EU attention event whenever the
attention bit is turned on, the user debugger receives the attention
event immediately after the pagefault event.
In this case, the pagefault event always precedes the attention event.

When the user debugger receives an attention event after a pagefault
event, it can detect whether additional breakpoints or interrupts
occurred on top of the existing pagefault by comparing the EU threads
where the pagefault occurred with the EU threads whose attention bit is
newly enabled.
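
A hedged userspace-side sketch of that comparison (illustrative only, not
an API from this series; it assumes the uapi event added by this patch
carries the before/after/resolved attention bitmaps back to back in
->bitmask, as the kernel side below fills them in):

  static unsigned int count_new_stops(const struct drm_xe_eudebug_event_pagefault *pf,
                                      const struct drm_xe_eudebug_event_eu_attention *att)
  {
          const __u32 slice = pf->bitmask_size / 3; /* before|after|resolved */
          const __u8 *after = pf->bitmask + slice;
          unsigned int i, count = 0;

          /* Bytes raised in the new attention event but not right after the fault. */
          for (i = 0; i < slice && i < att->bitmask_size; i++)
                  if (att->bitmask[i] & ~after[i])
                          count++;

          return count;
  }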

v2: use only force exception (Joonas, Mika)
v3: rebased on v4 (Mika)

Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Jan Maślak <jan.maslak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/Makefile               |   2 +-
 drivers/gpu/drm/xe/xe_eudebug.c           | 124 +++++--
 drivers/gpu/drm/xe/xe_eudebug.h           |  36 ++
 drivers/gpu/drm/xe/xe_eudebug_hw.c        |  15 +-
 drivers/gpu/drm/xe/xe_eudebug_pagefault.c | 391 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug_pagefault.h |  15 +
 drivers/gpu/drm/xe/xe_eudebug_types.h     |  60 +++-
 include/uapi/drm/xe_drm_eudebug.h         |  12 +
 8 files changed, 618 insertions(+), 37 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_pagefault.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_pagefault.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 16666f0a4c01..97827ec36e59 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -147,7 +147,7 @@ xe-$(CONFIG_DRM_XE_GPUSVM) += xe_svm.o
 xe-$(CONFIG_DRM_GPUSVM) += xe_userptr.o
 
 # debugging shaders with gdb (eudebug) support
-xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o xe_eudebug_vm.o xe_eudebug_hw.o xe_gt_debug.o
+xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o xe_eudebug_vm.o xe_eudebug_hw.o xe_eudebug_pagefault.o xe_gt_debug.o
 
 # graphics hardware monitoring (HWMON) support
 xe-$(CONFIG_HWMON) += xe_hwmon.o
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
index 5dc8e4cd7f6b..c64898de85d8 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.c
+++ b/drivers/gpu/drm/xe/xe_eudebug.c
@@ -17,12 +17,16 @@
 #include "xe_eudebug.h"
 #include "xe_eudebug_hw.h"
 #include "xe_eudebug_types.h"
+#include "xe_eudebug_pagefault.h"
 #include "xe_eudebug_vm.h"
 #include "xe_exec_queue.h"
+#include "xe_force_wake.h"
 #include "xe_gt.h"
 #include "xe_hw_engine.h"
 #include "xe_gt.h"
 #include "xe_gt_debug.h"
+#include "xe_gt_mcr.h"
+#include "regs/xe_gt_regs.h"
 #include "xe_macros.h"
 #include "xe_pm.h"
 #include "xe_sync.h"
@@ -184,6 +188,8 @@ static void xe_eudebug_free(struct kref *ref)
 	while (kfifo_get(&d->events.fifo, &event))
 		kfree(event);
 
+	xe_eudebug_pagefault_fini(d);
+
 	xe_eudebug_destroy_resources(d);
 	XE_WARN_ON(d->target.xef);
 
@@ -381,8 +387,8 @@ static int _xe_eudebug_disconnect(struct xe_eudebug *d,
 	} \
 })
 
-static struct xe_eudebug *
-_xe_eudebug_get(struct xe_file *xef)
+struct xe_eudebug *
+xe_eudebug_get_nolock(struct xe_file *xef)
 {
 	struct xe_eudebug *d;
 
@@ -392,7 +398,11 @@ _xe_eudebug_get(struct xe_file *xef)
 		d = NULL;
 	mutex_unlock(&xef->eudebug.lock);
 
-	if (d && xe_eudebug_detached(d)) {
+	if (!d)
+		return NULL;
+
+	if (xe_eudebug_detached(d) ||
+	    !completion_done(&d->discovery)) {
 		xe_eudebug_put(d);
 		return NULL;
 	}
@@ -403,20 +413,9 @@ _xe_eudebug_get(struct xe_file *xef)
 struct xe_eudebug *
 xe_eudebug_get(struct xe_file *xef)
 {
-	struct xe_eudebug *d;
-
 	lockdep_assert_held(&xef->eudebug.ioctl_lock);
 
-	d = _xe_eudebug_get(xef);
-	if (!d)
-		return NULL;
-
-	if (!completion_done(&d->discovery)) {
-		xe_eudebug_put(d);
-		return NULL;
-	}
-
-	return d;
+	return xe_eudebug_get_nolock(xef);
 }
 
 static int xe_eudebug_queue_event(struct xe_eudebug *d,
@@ -1932,7 +1931,7 @@ static int xe_send_gt_attention(struct xe_gt *gt)
 		goto err_exec_queue_put;
 	}
 
-	d = _xe_eudebug_get(q->vm->xef);
+	d = xe_eudebug_get_nolock(q->vm->xef);
 	if (!d) {
 		ret = -ENOTCONN;
 		goto err_exec_queue_put;
@@ -1960,10 +1959,6 @@ static int xe_eudebug_handle_gt_attention(struct xe_gt *gt)
 {
 	int ret;
 
-	ret = xe_gt_eu_threads_needing_attention(gt);
-	if (ret <= 0)
-		return ret;
-
 	ret = xe_send_gt_attention(gt);
 
 	/* Discovery in progress, fake it */
@@ -1973,6 +1968,65 @@ static int xe_eudebug_handle_gt_attention(struct xe_gt *gt)
 	return ret;
 }
 
+int xe_eudebug_send_pagefault_event(struct xe_eudebug *d,
+				    struct xe_eudebug_pagefault *pf)
+{
+	struct drm_xe_eudebug_event_pagefault *ep;
+	struct drm_xe_eudebug_event *event;
+	int h_queue, h_lrc;
+	u32 size = xe_gt_eu_attention_bitmap_size(pf->q->gt) * 3;
+	u32 sz = struct_size(ep, bitmask, size);
+	int ret;
+
+	XE_WARN_ON(pf->lrc_idx < 0 || pf->lrc_idx >= pf->q->width);
+
+	XE_WARN_ON(!xe_exec_queue_is_debuggable(pf->q));
+
+	h_queue = find_handle(d->res, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, pf->q);
+	if (h_queue < 0)
+		return h_queue;
+
+	h_lrc = find_handle(d->res, XE_EUDEBUG_RES_TYPE_LRC, pf->q->lrc[pf->lrc_idx]);
+	if (h_lrc < 0)
+		return h_lrc;
+
+	event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_PAGEFAULT, 0,
+					DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, sz);
+
+	if (!event)
+		return -ENOSPC;
+
+	ep = cast_event(ep, event);
+	ep->exec_queue_handle = h_queue;
+	ep->lrc_handle = h_lrc;
+	ep->bitmask_size = size;
+	ep->pagefault_address = pf->fault.addr;
+
+	memcpy(ep->bitmask, pf->attentions.before.att, pf->attentions.before.size);
+	memcpy(ep->bitmask + pf->attentions.before.size,
+	       pf->attentions.after.att, pf->attentions.after.size);
+	memcpy(ep->bitmask + pf->attentions.before.size + pf->attentions.after.size,
+	       pf->attentions.resolved.att, pf->attentions.resolved.size);
+
+	event->seqno = atomic_long_inc_return(&d->events.seqno);
+
+	ret = xe_eudebug_queue_event(d, event);
+	if (ret)
+		xe_eudebug_disconnect(d, ret);
+
+	return ret;
+}
+
+static void handle_attention_fail(struct xe_gt *gt, int gt_id, int ret)
+{
+	/* TODO: error capture */
+	drm_info(&gt_to_xe(gt)->drm,
+		 "gt:%d unable to handle eu attention ret = %d\n",
+		 gt_id, ret);
+
+	xe_gt_reset_async(gt);
+}
+
 static void attention_poll_work(struct work_struct *work)
 {
 	struct xe_device *xe = container_of(work, typeof(*xe),
@@ -1995,15 +2049,15 @@ static void attention_poll_work(struct work_struct *work)
 			if (gt->info.type != XE_GT_TYPE_MAIN)
 				continue;
 
-			ret = xe_eudebug_handle_gt_attention(gt);
-			if (ret) {
-				/* TODO: error capture */
-				drm_info(&gt_to_xe(gt)->drm,
-					 "gt:%d unable to handle eu attention ret=%d\n",
-					 gt_id, ret);
+			if (!xe_gt_eu_threads_needing_attention(gt))
+				continue;
 
-				xe_gt_reset_async(gt);
-			}
+			ret = xe_eudebug_handle_pagefaults(gt);
+			if (!ret)
+				ret = xe_eudebug_handle_gt_attention(gt);
+
+			if (ret)
+				handle_attention_fail(gt, gt_id, ret);
 		}
 
 		xe_pm_runtime_put(xe);
@@ -2012,12 +2066,12 @@ static void attention_poll_work(struct work_struct *work)
 	schedule_delayed_work(&xe->eudebug.attention_dwork, delay);
 }
 
-static void attention_poll_stop(struct xe_device *xe)
+void xe_eudebug_attention_poll_stop(struct xe_device *xe)
 {
 	cancel_delayed_work_sync(&xe->eudebug.attention_dwork);
 }
 
-static void attention_poll_start(struct xe_device *xe)
+void xe_eudebug_attention_poll_start(struct xe_device *xe)
 {
 	mod_delayed_work(system_wq, &xe->eudebug.attention_dwork, 0);
 }
@@ -2060,6 +2114,8 @@ xe_eudebug_connect(struct xe_device *xe,
 
 	kref_init(&d->ref);
 	spin_lock_init(&d->target.lock);
+	mutex_init(&d->pf_lock);
+	INIT_LIST_HEAD(&d->pagefaults);
 	init_waitqueue_head(&d->events.write_done);
 	init_waitqueue_head(&d->events.read_done);
 	init_completion(&d->discovery);
@@ -2093,7 +2149,7 @@ xe_eudebug_connect(struct xe_device *xe,
 
 	kref_get(&d->ref);
 	queue_work(xe->eudebug.wq, &d->discovery_work);
-	attention_poll_start(xe);
+	xe_eudebug_attention_poll_start(xe);
 
 	eu_dbg(d, "connected session %lld", d->session);
 
@@ -2187,9 +2243,9 @@ static int xe_eudebug_enable(struct xe_device *xe, bool enable)
 	mutex_unlock(&xe->eudebug.lock);
 
 	if (enable)
-		attention_poll_start(xe);
+		xe_eudebug_attention_poll_start(xe);
 	else
-		attention_poll_stop(xe);
+		xe_eudebug_attention_poll_stop(xe);
 
 	return 0;
 }
@@ -2238,7 +2294,7 @@ static void xe_eudebug_fini(struct drm_device *dev, void *__unused)
 
 	xe_assert(xe, list_empty(&xe->eudebug.targets));
 
-	attention_poll_stop(xe);
+	xe_eudebug_attention_poll_stop(xe);
 }
 
 void xe_eudebug_init(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index ed3c2078e960..f5b02ee010c2 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -13,11 +13,13 @@ struct drm_file;
 struct xe_debug_data;
 struct xe_device;
 struct xe_file;
+struct xe_gt;
 struct xe_vm;
 struct xe_vma;
 struct xe_exec_queue;
 struct xe_user_fence;
 struct xe_eudebug;
+struct xe_eudebug_pagefault;
 
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
@@ -77,8 +79,23 @@ void xe_eudebug_ufence_init(struct xe_user_fence *ufence, struct xe_file *xef, s
 void xe_eudebug_ufence_fini(struct xe_user_fence *ufence);
 
 struct xe_eudebug *xe_eudebug_get(struct xe_file *xef);
+struct xe_eudebug *xe_eudebug_get_nolock(struct xe_file *xef);
 void xe_eudebug_put(struct xe_eudebug *d);
 
+int xe_eudebug_send_pagefault_event(struct xe_eudebug *d,
+				    struct xe_eudebug_pagefault *pf);
+
+struct xe_eudebug_pagefault *xe_eudebug_pagefault_create(struct xe_gt *gt, struct xe_vm *vm,
+							 u64 page_addr, u8 fault_type,
+							 u8 fault_level, u8 access_type);
+void xe_eudebug_pagefault_process(struct xe_gt *gt, struct xe_eudebug_pagefault *pf);
+void xe_eudebug_pagefault_destroy(struct xe_gt *gt, struct xe_vm *vm,
+				  struct xe_eudebug_pagefault *pf, bool send_event);
+
+
+void xe_eudebug_attention_poll_stop(struct xe_device *xe);
+void xe_eudebug_attention_poll_start(struct xe_device *xe);
+
 #else
 
 static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
@@ -116,6 +133,25 @@ static inline void xe_eudebug_ufence_fini(struct xe_user_fence *ufence) { }
 static inline struct xe_eudebug *xe_eudebug_get(struct xe_file *xef) { return NULL; }
 static inline void xe_eudebug_put(struct xe_eudebug *d) { }
 
+static inline struct xe_eudebug_pagefault *
+xe_eudebug_pagefault_create(struct xe_gt *gt, struct xe_vm *vm, u64 page_addr,
+			    u8 fault_type, u8 fault_level, u8 access_type)
+{
+	return NULL;
+}
+
+static inline void
+xe_eudebug_pagefault_process(struct xe_gt *gt, struct xe_eudebug_pagefault *pf)
+{
+}
+
+static inline void xe_eudebug_pagefault_destroy(struct xe_gt *gt,
+						struct xe_vm *vm,
+						struct xe_eudebug_pagefault *pf,
+						bool send_event)
+{
+}
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif /* _XE_EUDEBUG_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
index cd4627705b56..0d82542e03ce 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.c
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
@@ -326,6 +326,7 @@ static int do_eu_control(struct xe_eudebug *d,
 	struct xe_device *xe = d->xe;
 	u8 *bits = NULL;
 	unsigned int hw_attn_size, attn_size;
+	struct dma_fence *pf_fence;
 	struct xe_exec_queue *q;
 	struct xe_lrc *lrc;
 	u64 seqno;
@@ -377,8 +378,20 @@ static int do_eu_control(struct xe_eudebug *d,
 		goto out_free;
 	}
 
-	ret = -EINVAL;
 	mutex_lock(&d->hw.lock);
+	do {
+		pf_fence = dma_fence_get(d->pf_fence);
+		if (pf_fence) {
+			mutex_unlock(&d->hw.lock);
+			ret = dma_fence_wait(pf_fence, true);
+			dma_fence_put(pf_fence);
+			if (ret)
+				goto out_free;
+			mutex_lock(&d->hw.lock);
+		}
+	} while (pf_fence);
+
+	ret = -EINVAL;
 
 	switch (arg->cmd) {
 	case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL:
diff --git a/drivers/gpu/drm/xe/xe_eudebug_pagefault.c b/drivers/gpu/drm/xe/xe_eudebug_pagefault.c
new file mode 100644
index 000000000000..8d705d41a2aa
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_pagefault.c
@@ -0,0 +1,391 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#include "xe_eudebug_pagefault.h"
+
+#include <linux/delay.h>
+
+#include "xe_exec_queue.h"
+#include "xe_eudebug.h"
+#include "xe_eudebug_hw.h"
+#include "xe_force_wake.h"
+#include "xe_gt_mcr.h"
+#include "regs/xe_gt_regs.h"
+#include "xe_vm.h"
+
+static int queue_pagefault(struct xe_gt *gt, struct xe_eudebug_pagefault *pf)
+{
+	struct xe_eudebug *d;
+
+	d = xe_eudebug_get_nolock(pf->q->vm->xef);
+	if (!d)
+		return -EINVAL;
+
+	mutex_lock(&d->pf_lock);
+	list_add_tail(&pf->list, &d->pagefaults);
+	mutex_unlock(&d->pf_lock);
+
+	xe_eudebug_put(d);
+
+	return 0;
+}
+
+static int send_pagefault(struct xe_gt *gt, struct xe_eudebug_pagefault *pf,
+			  bool from_attention_scan)
+{
+	struct xe_eudebug *d;
+	struct xe_exec_queue *q;
+	int ret, lrc_idx;
+
+	q = xe_gt_runalone_active_queue_get(gt, &lrc_idx);
+	if (IS_ERR(q))
+		return PTR_ERR(q);
+
+	if (!xe_exec_queue_is_debuggable(q)) {
+		ret = -EPERM;
+		goto out_exec_queue_put;
+	}
+
+	d = xe_eudebug_get_nolock(q->vm->xef);
+	if (!d) {
+		ret = -ENOTCONN;
+		goto out_exec_queue_put;
+	}
+
+	if (pf->deferred_resolved) {
+		xe_gt_eu_attentions_read(gt, &pf->attentions.resolved,
+					 XE_GT_ATTENTION_TIMEOUT_MS);
+
+		if (!xe_eu_attentions_xor_count(&pf->attentions.after,
+						&pf->attentions.resolved) &&
+		    !from_attention_scan) {
+			eu_dbg(d, "xe attentions not yet updated\n");
+			ret = -EBUSY;
+			goto out_eudebug_put;
+		}
+	}
+
+	ret = xe_eudebug_send_pagefault_event(d, pf);
+
+out_eudebug_put:
+	xe_eudebug_put(d);
+out_exec_queue_put:
+	xe_exec_queue_put(q);
+
+	return ret;
+}
+
+static int handle_pagefault(struct xe_gt *gt, struct xe_eudebug_pagefault *pf)
+{
+	int ret;
+
+	ret = send_pagefault(gt, pf, false);
+
+	/*
+	 * if debugger discovery is not completed or resolved attentions are not
+	 * updated, then queue pagefault
+	 */
+	if (ret == -EBUSY) {
+		ret = queue_pagefault(gt, pf);
+		if (!ret)
+			goto out;
+	}
+
+	xe_exec_queue_put(pf->q);
+	kfree(pf);
+
+out:
+	return ret;
+}
+
+static const char *
+pagefault_get_driver_name(struct dma_fence *dma_fence)
+{
+	return "xe";
+}
+
+static const char *
+pagefault_fence_get_timeline_name(struct dma_fence *dma_fence)
+{
+	return "eudebug_pagefault_fence";
+}
+
+static const struct dma_fence_ops pagefault_fence_ops = {
+	.get_driver_name = pagefault_get_driver_name,
+	.get_timeline_name = pagefault_fence_get_timeline_name,
+};
+
+struct pagefault_fence {
+	struct dma_fence base;
+	spinlock_t lock;
+};
+
+static struct pagefault_fence *pagefault_fence_create(void)
+{
+	struct pagefault_fence *fence;
+
+	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+	if (fence == NULL)
+		return NULL;
+
+	spin_lock_init(&fence->lock);
+	dma_fence_init(&fence->base, &pagefault_fence_ops, &fence->lock,
+		       dma_fence_context_alloc(1), 1);
+
+	return fence;
+}
+
+struct xe_eudebug_pagefault *
+xe_eudebug_pagefault_create(struct xe_gt *gt, struct xe_vm *vm, u64 page_addr,
+			    u8 fault_type, u8 fault_level, u8 access_type)
+{
+	struct pagefault_fence *pf_fence;
+	struct xe_eudebug_pagefault *pf;
+	struct xe_vma *vma = NULL;
+	struct xe_exec_queue *q;
+	struct dma_fence *fence;
+	struct xe_eudebug *d;
+	unsigned int fw_ref;
+	int lrc_idx;
+	u32 td_ctl;
+
+	down_read(&vm->lock);
+	vma = xe_vm_find_vma_by_addr(vm, page_addr);
+	up_read(&vm->lock);
+
+	if (vma)
+		return NULL;
+
+	d = xe_eudebug_get_nolock(vm->xef);
+	if (!d)
+		return NULL;
+
+	q = xe_gt_runalone_active_queue_get(gt, &lrc_idx);
+	if (IS_ERR(q))
+		goto err_put_eudebug;
+
+	if (!xe_exec_queue_is_debuggable(q))
+		goto err_put_exec_queue;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), q->hwe->domain);
+	if (!fw_ref)
+		goto err_put_exec_queue;
+
+	/*
+	 * If there is no debug functionality (TD_CTL_GLOBAL_DEBUG_ENABLE, etc.),
+	 * don't proceed pagefault routine for eu debugger.
+	 */
+	td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+	if (!td_ctl)
+		goto err_put_fw;
+
+	pf = kzalloc(sizeof(*pf), GFP_KERNEL);
+	if (!pf)
+		goto err_put_fw;
+
+	xe_eudebug_attention_poll_stop(gt_to_xe(gt));
+
+	mutex_lock(&d->hw.lock);
+	fence = dma_fence_get(d->pf_fence);
+
+	if (fence) {
+		/*
+		 * TODO: If the new incoming pagefaulted address is different
+		 * from the pagefaulted address it is currently handling on the
+		 * same ASID, it needs a routine to wait here and then do the
+		 * following pagefault.
+		 */
+		dma_fence_put(fence);
+		goto err_unlock_hw_lock;
+	}
+
+	pf_fence = pagefault_fence_create();
+	if (!pf_fence)
+		goto err_unlock_hw_lock;
+
+	d->pf_fence = &pf_fence->base;
+
+	INIT_LIST_HEAD(&pf->list);
+
+	xe_gt_eu_attentions_read(gt, &pf->attentions.before, 0);
+
+	if (td_ctl & TD_CTL_FORCE_EXCEPTION)
+		eu_warn(d, "force exception already set!");
+
+	/* Halt regardless of thread dependencies */
+	while (!(td_ctl & TD_CTL_FORCE_EXCEPTION)) {
+		xe_gt_mcr_multicast_write(gt, TD_CTL,
+					  td_ctl | TD_CTL_FORCE_EXCEPTION);
+		udelay(200);
+		td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+	}
+
+	xe_gt_eu_attentions_read(gt, &pf->attentions.after,
+				 XE_GT_ATTENTION_TIMEOUT_MS);
+
+	mutex_unlock(&d->hw.lock);
+
+	/*
+	 * xe_exec_queue_put() will be called from xe_eudebug_pagefault_destroy()
+	 * or handle_pagefault()
+	 */
+	pf->q = q;
+	pf->lrc_idx = lrc_idx;
+	pf->fault.addr = page_addr;
+	pf->fault.type = fault_type;
+	pf->fault.level = fault_level;
+	pf->fault.access = access_type;
+
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+	xe_eudebug_put(d);
+
+	return pf;
+
+err_unlock_hw_lock:
+	mutex_unlock(&d->hw.lock);
+	xe_eudebug_attention_poll_start(gt_to_xe(gt));
+	kfree(pf);
+err_put_fw:
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+err_put_exec_queue:
+	xe_exec_queue_put(q);
+err_put_eudebug:
+	xe_eudebug_put(d);
+
+	return NULL;
+}
+
+void
+xe_eudebug_pagefault_process(struct xe_gt *gt, struct xe_eudebug_pagefault *pf)
+{
+	xe_gt_eu_attentions_read(gt, &pf->attentions.resolved,
+				 XE_GT_ATTENTION_TIMEOUT_MS);
+
+	if (!xe_eu_attentions_xor_count(&pf->attentions.after,
+					&pf->attentions.resolved))
+		pf->deferred_resolved = true;
+}
+
+void
+xe_eudebug_pagefault_destroy(struct xe_gt *gt, struct xe_vm *vm,
+			     struct xe_eudebug_pagefault *pf, bool send_event)
+{
+	struct xe_eudebug *d;
+	unsigned int fw_ref;
+	u32 td_ctl;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), pf->q->hwe->domain);
+	if (!fw_ref) {
+		struct xe_device *xe = gt_to_xe(gt);
+
+		drm_warn(&xe->drm, "Forcewake fail: Can not recover TD_CTL");
+	} else {
+		td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+		xe_gt_mcr_multicast_write(gt, TD_CTL, td_ctl &
+					  ~(TD_CTL_FORCE_EXCEPTION));
+		xe_force_wake_put(gt_to_fw(gt), fw_ref);
+	}
+
+	if (send_event)
+		handle_pagefault(gt, pf);
+
+	d = xe_eudebug_get_nolock(vm->xef);
+	if (d) {
+		struct dma_fence *fence;
+
+		mutex_lock(&d->hw.lock);
+		fence = dma_fence_get(d->pf_fence);
+
+		if (fence) {
+			if (send_event)
+				dma_fence_signal(fence);
+
+			dma_fence_put(fence); /* deref for dma_fence_get() */
+		dma_fence_put(fence); /* deref for dma_fence_init() */
+		}
+
+		d->pf_fence = NULL;
+		mutex_unlock(&d->hw.lock);
+
+		xe_eudebug_put(d);
+	}
+
+	if (!send_event) {
+		xe_exec_queue_put(pf->q);
+		kfree(pf);
+	}
+
+	xe_eudebug_attention_poll_start(gt_to_xe(gt));
+}
+
+static int send_queued_pagefault(struct xe_eudebug *d, bool from_attention_scan)
+{
+	struct xe_eudebug_pagefault *pf, *pf_temp;
+	int ret = 0;
+
+	mutex_lock(&d->pf_lock);
+	list_for_each_entry_safe(pf, pf_temp, &d->pagefaults, list) {
+		struct xe_gt *gt = pf->q->gt;
+
+		ret = send_pagefault(gt, pf, from_attention_scan);
+
+		/* if resolved attentions are not updated */
+		if (ret == -EBUSY)
+			break;
+
+		/* decrease the reference count of xe_exec_queue obtained from pagefault handler */
+		xe_exec_queue_put(pf->q);
+		list_del(&pf->list);
+		kfree(pf);
+
+		if (ret)
+			break;
+	}
+	mutex_unlock(&d->pf_lock);
+
+	return ret;
+}
+
+int xe_eudebug_handle_pagefaults(struct xe_gt *gt)
+{
+	struct xe_exec_queue *q;
+	struct xe_eudebug *d;
+	int ret, lrc_idx;
+
+	q = xe_gt_runalone_active_queue_get(gt, &lrc_idx);
+	if (IS_ERR(q))
+		return PTR_ERR(q);
+
+	if (!xe_exec_queue_is_debuggable(q)) {
+		ret = -EPERM;
+		goto out_exec_queue_put;
+	}
+
+	d = xe_eudebug_get_nolock(q->vm->xef);
+	if (!d) {
+		ret = -ENOTCONN;
+		goto out_exec_queue_put;
+	}
+
+	ret = send_queued_pagefault(d, true);
+
+	xe_eudebug_put(d);
+
+out_exec_queue_put:
+	xe_exec_queue_put(q);
+
+	return ret;
+}
+
+void xe_eudebug_pagefault_fini(struct xe_eudebug *d)
+{
+	struct xe_eudebug_pagefault *pf, *pf_temp;
+
+	/* Since it's the last reference, there is no race here */
+	list_for_each_entry_safe(pf, pf_temp, &d->pagefaults, list) {
+		xe_exec_queue_put(pf->q);
+		kfree(pf);
+	}
+}
diff --git a/drivers/gpu/drm/xe/xe_eudebug_pagefault.h b/drivers/gpu/drm/xe/xe_eudebug_pagefault.h
new file mode 100644
index 000000000000..0b22e91f4f85
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_eudebug_pagefault.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023-2025 Intel Corporation
+ */
+
+#ifndef _XE_EUDEBUG_PAGEFAULT_H_
+#define _XE_EUDEBUG_PAGEFAULT_H_
+
+struct xe_eudebug;
+struct xe_gt;
+
+void xe_eudebug_pagefault_fini(struct xe_eudebug *d);
+int xe_eudebug_handle_pagefaults(struct xe_gt *gt);
+
+#endif /* _XE_EUDEBUG_PAGEFAULT_H_ */
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index 85fc321f8b0e..c4debbb92838 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -15,6 +15,8 @@
 #include <linux/wait.h>
 #include <linux/xarray.h>
 
+#include "xe_gt_debug.h"
+
 struct xe_device;
 struct task_struct;
 struct xe_eudebug;
@@ -37,7 +39,7 @@ enum xe_eudebug_state {
 };
 
 #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64
-#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_EU_ATTENTION
+#define XE_EUDEBUG_MAX_EVENT_TYPE DRM_XE_EUDEBUG_EVENT_PAGEFAULT
 
 /**
  * struct xe_eudebug_handle - eudebug resource handle
@@ -169,6 +171,62 @@ struct xe_eudebug {
 
 	/** @ops operations for eu_control */
 	struct xe_eudebug_eu_control_ops *ops;
+
+	/** @pf_lock: guards access to the pagefaults list */
+	struct mutex pf_lock;
+	/** @pagefaults: xe_eudebug_pagefault list for pagefault event queuing */
+	struct list_head pagefaults;
+	/**
+	 * @pf_fence: fence gating eu operations (eu thread control and attention)
+	 * while page faults are being handled, protected by @hw.lock.
+	 */
+	struct dma_fence *pf_fence;
+};
+
+/**
+ * struct xe_eudebug_pagefault - eudebug structure for queuing pagefault
+ */
+struct xe_eudebug_pagefault {
+	/** @list: link into the xe_eudebug.pagefaults */
+	struct list_head list;
+	/** @q: exec_queue which raised pagefault */
+	struct xe_exec_queue *q;
+	/** @lrc_idx: lrc index of the workload which raised pagefault */
+	int lrc_idx;
+
+	/* pagefault raw partial data passed from the GuC */
+	struct {
+		/** @addr: ppgtt address where the pagefault occurred */
+		u64 addr;
+		int type;
+		int level;
+		int access;
+	} fault;
+
+	struct {
+		/** @before: state of attention bits before page fault WA processing */
+		struct xe_eu_attentions before;
+		/**
+		 * @after: status of attention bits during page fault WA processing.
+		 * It includes eu threads where attention bits are turned on for
+		 * reasons other than page fault WA (breakpoint, interrupt, etc.).
+		 */
+		struct xe_eu_attentions after;
+		/**
+		 * @resolved: state of the attention bits after page fault WA.
+		 * It includes the eu thread that caused the page fault.
+		 * To determine the eu thread that caused the page fault,
+		 * XOR attentions.after with attentions.resolved.
+		 */
+		struct xe_eu_attentions resolved;
+	} attentions;
+
+	/**
+	 * @deferred_resolved: set when the eu threads fail to raise attention
+	 * bits within the timeout after page fault WA processing, so that
+	 * attentions.resolved is updated again once the attention bits are ready.
+	 */
+	bool deferred_resolved;
 };
 
 #endif /* _XE_EUDEBUG_TYPES_H_ */
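
For reference, a minimal sketch of the set-difference idea described in the
@resolved comment above. The helper name and the flat bitmask walk are
illustrative only; the driver performs this comparison through
xe_eu_attentions_xor_count(). The sketch assumes the kernel's hweight8()
and a plain u8 bitmask layout for the attention snapshots:

  /*
   * Illustrative only: the EU threads that raised attention because of the
   * pagefault WA are the bits that differ between the "after" and
   * "resolved" snapshots.
   */
  static unsigned int faulting_attn_count(const u8 *after, const u8 *resolved,
                                          unsigned int size)
  {
          unsigned int i, count = 0;

          for (i = 0; i < size; i++)
                  count += hweight8(after[i] ^ resolved[i]);

          return count;
  }
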
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index 1c797a8b4d32..a6ee51aa0ede 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -56,6 +56,7 @@ struct drm_xe_eudebug_event {
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_DEBUG_DATA	5
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE	6
 #define DRM_XE_EUDEBUG_EVENT_EU_ATTENTION	7
+#define DRM_XE_EUDEBUG_EVENT_PAGEFAULT		8
 
 	__u16 flags;
 #define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
@@ -210,6 +211,17 @@ struct drm_xe_eudebug_event_eu_attention {
 	__u8 bitmask[];
 };
 
+struct drm_xe_eudebug_event_pagefault {
+	struct drm_xe_eudebug_event base;
+
+	__u64 exec_queue_handle;
+	__u64 lrc_handle;
+	__u32 flags;
+	__u32 bitmask_size;
+	__u64 pagefault_address;
+	__u8 bitmask[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 19/20] drm/xe/vm: Support for adding null page VMA to VM on request
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (17 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 18/20] drm/xe/eudebug: Introduce EU pagefault handling interface Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 11:17 ` [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling Mika Kuoppala
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Oak Zeng, Niranjana Vishwanathapura,
	Stuart Summers, Bruce Chang, Mika Kuoppala

From: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>

The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid
access will halt the corresponding EUs. So, in order to activate the
debugger, the KMD needs to install a temporary page to unhalt the EUs.
This is planned to be used for pagefault handling when the EU debugger
is running. The idea is to install a null page VMA if the pagefault comes
from an invalid access. After the null page PTE is installed, the user
debugger can continue to run/inspect without causing a fatal failure or
reset and stop.
Based on Bruce's implementation [1].

[1] https://lore.kernel.org/intel-xe/20230829231648.4438-1-yu.bruce.chang@intel.com/
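
For illustration, a minimal sketch of the intended call site, mirroring the
hunk added to the pagefault handler in the following patch of this series.
The eudebug_pf pointer and the surrounding error handling are assumptions
taken from that patch and are not part of this change:

  vma = xe_vm_find_vma_by_addr(vm, pf->page_addr);
  if (!vma && eudebug_pf)
          /* back the faulting address with a temporary null page VMA */
          vma = xe_vm_create_null_vma(vm, pf->page_addr);
  if (IS_ERR_OR_NULL(vma)) {
          err = -EINVAL;
          goto unlock_vm;
  }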

Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 33 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h |  2 ++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 5a05563009b2..444ef151431a 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -4534,3 +4534,36 @@ int xe_vm_alloc_cpu_addr_mirror_vma(struct xe_vm *vm, uint64_t start, uint64_t r
 
 	return xe_vm_alloc_vma(vm, &map_req, false);
 }
+
+struct xe_vma *xe_vm_create_null_vma(struct xe_vm *vm, u64 addr)
+{
+	struct xe_vma_mem_attr default_attr = {
+		.preferred_loc = {
+			.devmem_fd = DRM_XE_PREFERRED_LOC_DEFAULT_DEVICE,
+			.migration_policy = DRM_XE_MIGRATE_ALL_PAGES,
+		},
+		.atomic_access = DRM_XE_ATOMIC_UNDEFINED,
+		.default_pat_index = vm->xe->pat.idx[XE_CACHE_NONE],
+		.pat_index = vm->xe->pat.idx[XE_CACHE_NONE],
+	};
+	struct xe_vma *vma;
+	u32 page_size;
+	int err;
+
+	if (xe_vm_is_closed_or_banned(vm))
+		return ERR_PTR(-ENOENT);
+
+	page_size = vm->flags & XE_VM_FLAG_64K ? SZ_64K : SZ_4K;
+	vma = xe_vma_create(vm, NULL, 0, addr, addr + page_size - 1,
+			    &default_attr, VMA_CREATE_FLAG_IS_NULL);
+	if (IS_ERR_OR_NULL(vma))
+		return vma;
+
+	err = xe_vm_insert_vma(vm, vma);
+	if (err) {
+		xe_vma_destroy_late(vma);
+		return ERR_PTR(err);
+	}
+
+	return vma;
+}
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index ef8a5019574e..ebaaa855e231 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -411,4 +411,6 @@ static inline struct drm_exec *xe_vm_validation_exec(struct xe_vm *vm)
 #define xe_vm_has_valid_gpu_mapping(tile, tile_present, tile_invalidated)	\
 	((READ_ONCE(tile_present) & ~READ_ONCE(tile_invalidated)) & BIT((tile)->id))
 
+struct xe_vma *xe_vm_create_null_vma(struct xe_vm *vm, u64 addr);
+
 #endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (18 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 19/20] drm/xe/vm: Support for adding null page VMA to VM on request Mika Kuoppala
@ 2025-10-06 11:17 ` Mika Kuoppala
  2025-10-06 18:43   ` Matthew Brost
  2025-10-06 12:30 ` ✗ CI.checkpatch: warning for Intel Xe GPU Debug Support (eudebug) v5 Patchwork
                   ` (3 subsequent siblings)
  23 siblings, 1 reply; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-06 11:17 UTC (permalink / raw)
  To: intel-xe
  Cc: simona.vetter, matthew.brost, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Mika Kuoppala

From: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>

The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid
access will halt the corresponding EUs. To solve this problem, enable the
EU pagefault handling functionality, which allows pagefaulted EU threads
to be unhalted and lets the EU debugger be informed about the attention
state of the EU threads during execution.

If a pagefault occurs, send the DRM_XE_EUDEBUG_EVENT_PAGEFAULT event
after handling the pagefault.

The pagefault handling is a mechanism that allows a stalled EU thread to
enter SIP mode by installing a temporary null page in the page table
entry where the pagefault happened.

A brief description of the pagefault handling flow between the KMD and
the EU thread follows (a sketch of the KMD-side sequence is given after
this description):

(1) EU thread accesses an unallocated address
(2) pagefault happens and the EU thread stalls
(3) XE KMD forces an EU thread exception to allow the running EU threads
    to enter SIP mode (the KMD sets the ForceException /
    ForceExternalHalt bits of the TD_CTL register)
    Non-stalled (non-pagefaulted) EU threads enter SIP mode
(4) XE KMD installs a temporary null page in the page table entry of the
    address where the pagefault happened
(5) XE KMD replies with a pagefault successful message to the GuC
(6) the stalled EU thread resumes as the pagefault condition has been
    resolved
(7) the resumed EU thread enters SIP mode due to the force exception set
    by (3)

As this feature is designed to only work when eudebug is enabled, it
should have no impact on the regular recoverable pagefault code path.
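
A hedged sketch of the KMD-side sequence for steps (3)-(5), condensed from
xe_eudebug_pagefault_create(), xe_vm_create_null_vma() and the GuC reply
path in pf_queue_work_func(); locking, attention snapshots and error
handling are omitted, and the local variables are assumed to come from the
surrounding handler:

  /* (3) force an exception so the running EU threads enter SIP mode */
  td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
  xe_gt_mcr_multicast_write(gt, TD_CTL, td_ctl | TD_CTL_FORCE_EXCEPTION);

  /* (4) back the faulting address with a temporary null page VMA */
  vma = xe_vm_create_null_vma(vm, pf->page_addr);

  /*
   * (5) reply to the GuC that the fault was handled, as done in
   * pf_queue_work_func(); the stalled EU thread then resumes (6) and
   * enters SIP mode because of the exception forced in (3) (7).
   */
  send_pagefault_reply(&gt->uc.guc, &reply);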

Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 80 +++++++++++++++++++++++++---
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index a054d6010ae0..873ffd982030 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -13,6 +13,7 @@
 
 #include "abi/guc_actions_abi.h"
 #include "xe_bo.h"
+#include "xe_eudebug.h"
 #include "xe_gt.h"
 #include "xe_gt_printk.h"
 #include "xe_gt_stats.h"
@@ -173,10 +174,14 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
 	return vm;
 }
 
-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
+static int handle_pagefault_start(struct xe_gt *gt, struct pagefault *pf,
+				  struct xe_vm **pf_vm,
+				  struct xe_eudebug_pagefault **eudebug_pf_out)
 {
 	struct xe_device *xe = gt_to_xe(gt);
 	struct xe_vm *vm;
+	struct xe_eudebug_pagefault *eudebug_pf;
+	bool destroy_eudebug_pf = false;
 	struct xe_vma *vma = NULL;
 	int err;
 	bool atomic;
@@ -189,6 +194,10 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 	if (IS_ERR(vm))
 		return PTR_ERR(vm);
 
+	eudebug_pf = xe_eudebug_pagefault_create(gt, vm, pf->page_addr,
+						 pf->fault_type, pf->fault_level,
+						 pf->access_type);
+
 	/*
 	 * TODO: Change to read lock? Using write lock for simplicity.
 	 */
@@ -201,8 +210,27 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 
 	vma = xe_vm_find_vma_by_addr(vm, pf->page_addr);
 	if (!vma) {
-		err = -EINVAL;
-		goto unlock_vm;
+		if (eudebug_pf)
+			vma = xe_vm_create_null_vma(vm, pf->page_addr);
+
+		if (IS_ERR_OR_NULL(vma)) {
+			err = -EINVAL;
+			if (eudebug_pf)
+				destroy_eudebug_pf = true;
+
+			goto unlock_vm;
+		}
+	} else {
+		/*
+		 * When the eudebug_pagefault instance was created, there was
+		 * no vma containing the ppgtt address where the pagefault
+		 * occurred, but after reacquiring vm->lock there is one:
+		 * while this context was not holding vm->lock, another
+		 * context allocated a vma covering the faulting address.
+		 */
+		if (eudebug_pf)
+			destroy_eudebug_pf = true;
 	}
 
 	atomic = access_is_atomic(pf->access_type);
@@ -217,11 +245,43 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 	if (!err)
 		vm->usm.last_fault_vma = vma;
 	up_write(&vm->lock);
-	xe_vm_put(vm);
+
+	if (destroy_eudebug_pf) {
+		xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, false);
+		*eudebug_pf_out = NULL;
+	} else {
+		*eudebug_pf_out = eudebug_pf;
+	}
+
+	/* Keep the VM referenced for the lifetime of the eudebug pagefault instance. */
+	if (!*eudebug_pf_out) {
+		xe_vm_put(vm);
+		*pf_vm = NULL;
+	} else {
+		*pf_vm = vm;
+	}
 
 	return err;
 }
 
+static void handle_pagefault_end(struct xe_gt *gt, struct xe_vm *vm,
+				 struct xe_eudebug_pagefault *eudebug_pf)
+{
+	/* nothing to do if there is no eudebug_pagefault */
+	if (!eudebug_pf)
+		return;
+
+	xe_eudebug_pagefault_process(gt, eudebug_pf);
+
+	/*
+	 * TODO: Remove VMA added to handle eudebug pagefault
+	 */
+
+	xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, true);
+
+	xe_vm_put(vm);
+}
+
 static int send_pagefault_reply(struct xe_guc *guc,
 				struct xe_guc_pagefault_reply *reply)
 {
@@ -346,7 +406,10 @@ static void pf_queue_work_func(struct work_struct *w)
 	threshold = jiffies + msecs_to_jiffies(USM_QUEUE_MAX_RUNTIME_MS);
 
 	while (get_pagefault(pf_queue, &pf)) {
-		ret = handle_pagefault(gt, &pf);
+		struct xe_eudebug_pagefault *eudebug_pf = NULL;
+		struct xe_vm *vm = NULL;
+
+		ret = handle_pagefault_start(gt, &pf, &vm, &eudebug_pf);
 		if (unlikely(ret)) {
 			print_pagefault(gt, &pf);
 			pf.fault_unsuccessful = 1;
@@ -364,7 +427,12 @@ static void pf_queue_work_func(struct work_struct *w)
 			FIELD_PREP(PFR_ENG_CLASS, pf.engine_class) |
 			FIELD_PREP(PFR_PDATA, pf.pdata);
 
-		send_pagefault_reply(&gt->uc.guc, &reply);
+		ret = send_pagefault_reply(&gt->uc.guc, &reply);
+
+		if (unlikely(ret))
+			xe_gt_dbg(gt, "GuC Pagefault reply failed: %d\n", ret);
+
+		handle_pagefault_end(gt, vm, eudebug_pf);
 
 		if (time_after(jiffies, threshold) &&
 		    pf_queue->tail != pf_queue->head) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* ✗ CI.checkpatch: warning for Intel Xe GPU Debug Support (eudebug) v5
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (19 preceding siblings ...)
  2025-10-06 11:17 ` [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling Mika Kuoppala
@ 2025-10-06 12:30 ` Patchwork
  2025-10-06 12:31 ` ✓ CI.KUnit: success " Patchwork
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-06 12:30 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-xe

== Series Details ==

Series: Intel Xe GPU Debug Support (eudebug) v5
URL   : https://patchwork.freedesktop.org/series/155452/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
fbd08a78c3a3bb17964db2a326514c69c1dca660
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit e6a3eb9952045883e24bb470bcecc6f5715301d5
Author: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Date:   Mon Oct 6 14:17:10 2025 +0300

    drm/xe/eudebug: Enable EU pagefault handling
    
    The XE2 (and PVC) HW has a limitation that the pagefault due to invalid
    access will halt the corresponding EUs. To solve this problem, enable
    EU pagefault handling functionality, which allows to unhalt pagefaulted
    eu threads and to EU debugger to get inform about the eu attentions state
    of EU threads during execution.
    
    If a pagefault occurs, send the DRM_XE_EUDEBUG_EVENT_PAGEFAULT event
    after handling the pagefault.
    
    The pagefault handling is a mechanism that allows a stalled EU thread to
    enter SIP mode by installing a temporal null page to the page table entry
    where the pagefault happened.
    
    A brief description of the page fault handling mechanism flow between KMD
    and the eu thread is as follows
    
    (1) eu thread accesses unallocated address
    (2) pagefault happens and eu thread stalls
    (3) XE kmd set an force eu thread exception to allow the running eu thread
        to enter SIP mode (kmd set ForceException / ForceExternalHalt bit of
        TD_CTL register)
        Not stalled (none-pagefaulted) eu threads enter SIP mode
    (4) XE kmd installs temporal null page to the pagetable entry of the
        address where pagefault happened.
    (5) XE kmd replies pagefault successful message to GUC
    (6) stalled eu thread resumes as per pagefault condition has resolved
    (7) resumed eu thread enters SIP mode due to force exception set by (3)
    
    As designed this feature to only work when eudbug is enabled, it should
    have no impact to regular recoverable pagefault code path.
    
    Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
    Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
+ /mt/dim checkpatch 29dc3d947463e9e9756a253801e5cc4466536ecc drm-intel
5fb3fc4a2601 drm/xe/eudebug: Introduce eudebug interface
-:227: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#227: 
new file mode 100644

-:482: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_err' - possible side-effects?
#482: FILE: drivers/gpu/drm/xe/xe_eudebug.c:251:
+#define xe_eudebug_disconnect(_d, _err) ({ \
+	if (_xe_eudebug_disconnect((_d), (_err))) { \
+		if ((_err) == 0 || (_err) == -ETIMEDOUT) \
+			eu_dbg(d, "Session closed (%d)", (_err)); \
+		else \
+			eu_err(d, "Session disconnected, err = %d (%s:%d)", \
+			       (_err), __func__, __LINE__); \
+	} \
+})

-:831: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_d' - possible side-effects?
#831: FILE: drivers/gpu/drm/xe/xe_eudebug.c:600:
+#define xe_eudebug_event_put(_d, _err) ({ \
+	if ((_err)) \
+		xe_eudebug_disconnect((_d), (_err)); \
+	xe_eudebug_put((_d)); \
+	})

-:831: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_err' - possible side-effects?
#831: FILE: drivers/gpu/drm/xe/xe_eudebug.c:600:
+#define xe_eudebug_event_put(_d, _err) ({ \
+	if ((_err)) \
+		xe_eudebug_disconnect((_d), (_err)); \
+	xe_eudebug_put((_d)); \
+	})

-:1295: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#1295: FILE: drivers/gpu/drm/xe/xe_eudebug.h:20:
+#define XE_EUDEBUG_DBG_ARGS(d) (d)->session, \
+		atomic_long_read(&(d)->events.seqno), \
+		!READ_ONCE(d->target.xef) ? "disconnected" : "", \
+		current->pid, \
+		task_tgid_nr(current), \
+		READ_ONCE(d->target.xef) ? d->target.xef->pid : -1

BUT SEE:

   do {} while (0) advice is over-stated in a few situations:

   The more obvious case is macros, like MODULE_PARM_DESC, invoked at
   file-scope, where C disallows code (it must be in functions).  See
   $exceptions if you have one to add by name.

   More troublesome is declarative macros used at top of new scope,
   like DECLARE_PER_CPU.  These might just compile with a do-while-0
   wrapper, but would be incorrect.  Most of these are handled by
   detecting struct,union,etc declaration primitives in $exceptions.

   Theres also macros called inside an if (block), which "return" an
   expression.  These cannot do-while, and need a ({}) wrapper.

   Enjoy this qualification while we work to improve our heuristics.

-:1295: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'd' - possible side-effects?
#1295: FILE: drivers/gpu/drm/xe/xe_eudebug.h:20:
+#define XE_EUDEBUG_DBG_ARGS(d) (d)->session, \
+		atomic_long_read(&(d)->events.seqno), \
+		!READ_ONCE(d->target.xef) ? "disconnected" : "", \
+		current->pid, \
+		task_tgid_nr(current), \
+		READ_ONCE(d->target.xef) ? d->target.xef->pid : -1

-:1302: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'd' - possible side-effects?
#1302: FILE: drivers/gpu/drm/xe/xe_eudebug.h:27:
+#define eu_err(d, fmt, ...) drm_err(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \
+				    XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__)

-:1304: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'd' - possible side-effects?
#1304: FILE: drivers/gpu/drm/xe/xe_eudebug.h:29:
+#define eu_warn(d, fmt, ...) drm_warn(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \
+				      XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__)

-:1306: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'd' - possible side-effects?
#1306: FILE: drivers/gpu/drm/xe/xe_eudebug.h:31:
+#define eu_dbg(d, fmt, ...) drm_dbg(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \
+				    XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__)

-:1523: WARNING:LONG_LINE: line length of 130 exceeds 100 columns
#1523: FILE: include/uapi/drm/xe_drm.h:127:
+#define DRM_IOCTL_XE_EUDEBUG_CONNECT		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EUDEBUG_CONNECT, struct drm_xe_eudebug_connect)

total: 1 errors, 2 warnings, 7 checks, 1503 lines checked
15b82102f262 drm/xe/eudebug: Introduce discovery for resources
-:8: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#8: 
currently existing resources to the debugger. The client is held on selected

total: 0 errors, 1 warnings, 0 checks, 295 lines checked
68f923c76314 drm/xe/eudebug: Introduce exec_queue events
828f0be35045 drm/xe: Add EUDEBUG_ENABLE exec queue property
97d5924dfa95 drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops
-:41: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#41: 
new file mode 100644

-:77: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#77: FILE: drivers/gpu/drm/xe/xe_debug_data.c:32:
+		if (XE_IOCTL_DBG(xe, (dd->addr < ext->addr + ext->range) &&
+				     (ext->addr < dd->addr + dd->range))) {

-:119: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#119: FILE: drivers/gpu/drm/xe/xe_debug_data.c:74:
+	if (XE_IOCTL_DBG(xe, operation != DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA &&
+			     operation != DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA))

-:133: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#133: FILE: drivers/gpu/drm/xe/xe_debug_data.c:88:
+	    XE_IOCTL_DBG(xe, ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO &&
+			     ext->offset != 0) ||

-:135: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#135: FILE: drivers/gpu/drm/xe/xe_debug_data.c:90:
+	    XE_IOCTL_DBG(xe, ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO &&
+			     (ext->pseudopath < DRM_XE_VM_BIND_DEBUG_DATA_PSEUDO_MODULE_AREA ||

-:138: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#138: FILE: drivers/gpu/drm/xe/xe_debug_data.c:93:
+	    XE_IOCTL_DBG(xe, !(ext->flags & DRM_XE_VM_BIND_DEBUG_DATA_FLAG_PSEUDO) &&
+			     strnlen(ext->pathname, PATH_MAX) >= PATH_MAX)) {

-:575: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#575: FILE: drivers/gpu/drm/xe/xe_vm.c:3384:
+		if (XE_IOCTL_DBG(xe, operation != DRM_XE_VM_BIND_OP_ADD_DEBUG_DATA &&
+				     operation != DRM_XE_VM_BIND_OP_REMOVE_DEBUG_DATA &&

-:578: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#578: FILE: drivers/gpu/drm/xe/xe_vm.c:3387:
+		    XE_IOCTL_DBG(xe, ext.name == XE_VM_BIND_OP_EXTENSIONS_DEBUG_DATA &&
+				     ++debug_data_count > 1))

-:698: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#698: FILE: drivers/gpu/drm/xe/xe_vm_types.h:347:
+		struct mutex lock;

total: 0 errors, 1 warnings, 8 checks, 720 lines checked
dc1e51594e41 drm/xe/eudebug: Introduce vm bind and vm bind debug data events
-:7: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#7: 
This patch adds events to track the bind ioctl and associated debug data add

-:347: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "ufence"
#347: FILE: drivers/gpu/drm/xe/xe_eudebug.c:1135:
+				fill_vm_bind_fields(vm, e, ufence != NULL, bind_ops);

-:651: WARNING:LONG_LINE_COMMENT: line length of 102 exceeds 100 columns
#651: FILE: include/uapi/drm/xe_drm_eudebug.h:92:
+ *  │  EVENT_VM_BIND        ├──────────────────┬─┬┄┐

-:652: WARNING:LONG_LINE_COMMENT: line length of 108 exceeds 100 columns
#652: FILE: include/uapi/drm/xe_drm_eudebug.h:93:
+ *  └───────────────────────┘                  │ │ ┊

-:653: WARNING:LONG_LINE_COMMENT: line length of 130 exceeds 100 columns
#653: FILE: include/uapi/drm/xe_drm_eudebug.h:94:
+ *      ┌──────────────────────────────────┐   │ │ ┊

-:655: WARNING:LONG_LINE_COMMENT: line length of 128 exceeds 100 columns
#655: FILE: include/uapi/drm/xe_drm_eudebug.h:96:
+ *      └──────────────────────────────────┘     │ ┊

-:657: WARNING:LONG_LINE_COMMENT: line length of 128 exceeds 100 columns
#657: FILE: include/uapi/drm/xe_drm_eudebug.h:98:
+ *      ┌──────────────────────────────────┐     │ ┊

-:659: WARNING:LONG_LINE_COMMENT: line length of 126 exceeds 100 columns
#659: FILE: include/uapi/drm/xe_drm_eudebug.h:100:
+ *      └──────────────────────────────────┘       ┊

-:661: WARNING:LONG_LINE_COMMENT: line length of 126 exceeds 100 columns
#661: FILE: include/uapi/drm/xe_drm_eudebug.h:102:
+ *      ┌┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┐       ┊

-:663: WARNING:LONG_LINE_COMMENT: line length of 116 exceeds 100 columns
#663: FILE: include/uapi/drm/xe_drm_eudebug.h:104:
+ *      └┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┘

total: 0 errors, 9 warnings, 1 checks, 649 lines checked
8d3d0548a042 drm/xe/eudebug: Add UFENCE events with acks
-:188: CHECK:LINE_SPACING: Please don't use multiple blank lines
#188: FILE: drivers/gpu/drm/xe/xe_eudebug.c:1097:
 
+

-:668: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#668: FILE: drivers/gpu/drm/xe/xe_sync_types.h:26:
+		spinlock_t lock;

total: 0 errors, 0 warnings, 2 checks, 636 lines checked
b5039b9f92c6 drm/xe/eudebug: vm open/pread/pwrite
-:115: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#115: FILE: drivers/gpu/drm/xe/xe_eudebug.c:675:
+{
+

-:139: CHECK:LINE_SPACING: Please don't use multiple blank lines
#139: FILE: drivers/gpu/drm/xe/xe_eudebug.c:699:
+
+

-:179: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#179: FILE: drivers/gpu/drm/xe/xe_eudebug.h:41:
+#define xe_eudebug_for_each_hw_engine(__hwe, __gt, __id) \
+	for_each_hw_engine(__hwe, __gt, __id)	       \
+		if (xe_hw_engine_has_eudebug(__hwe))

BUT SEE:

   do {} while (0) advice is over-stated in a few situations:

   The more obvious case is macros, like MODULE_PARM_DESC, invoked at
   file-scope, where C disallows code (it must be in functions).  See
   $exceptions if you have one to add by name.

   More troublesome is declarative macros used at top of new scope,
   like DECLARE_PER_CPU.  These might just compile with a do-while-0
   wrapper, but would be incorrect.  Most of these are handled by
   detecting struct,union,etc declaration primitives in $exceptions.

   Theres also macros called inside an if (block), which "return" an
   expression.  These cannot do-while, and need a ({}) wrapper.

   Enjoy this qualification while we work to improve our heuristics.

-:179: CHECK:MACRO_ARG_REUSE: Macro argument reuse '__hwe' - possible side-effects?
#179: FILE: drivers/gpu/drm/xe/xe_eudebug.h:41:
+#define xe_eudebug_for_each_hw_engine(__hwe, __gt, __id) \
+	for_each_hw_engine(__hwe, __gt, __id)	       \
+		if (xe_hw_engine_has_eudebug(__hwe))

-:215: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#215: 
new file mode 100644

total: 1 errors, 1 warnings, 3 checks, 599 lines checked
b754e5aa56a6 drm/xe/eudebug: userptr vm pread/pwrite
d2771e0bc227 drm/xe/eudebug: hw enablement for eudebug
-:107: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#107: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 463 lines checked
d8994ebf26dd drm/xe/eudebug: Introduce EU control interface
-:272: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#272: FILE: drivers/gpu/drm/xe/xe_eudebug_hw.c:183:
+static bool engine_has_runalone_set(const struct xe_hw_engine * const hwe,
+				   u32 rcu_debug1)

-:278: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#278: FILE: drivers/gpu/drm/xe/xe_eudebug_hw.c:189:
+static bool engine_has_context_set(const struct xe_hw_engine * const hwe,
+				  u32 rcu_debug1)

-:910: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#910: FILE: include/uapi/drm/xe_drm_eudebug.h:185:
+struct drm_xe_eudebug_eu_control {
+

total: 0 errors, 0 warnings, 3 checks, 855 lines checked
8fc1933b11ce drm/xe/eudebug: Introduce per device attention scan worker
67ee594be7a7 drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test
-:16: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#16: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 201 lines checked
2ed66d3a05fb drm/xe: Implement SR-IOV and eudebug exclusivity
022dc3059031 drm/xe: Add xe_client_debugfs and introduce debug_data file
-:29: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#29: 
new file mode 100644

-:83: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#83: FILE: drivers/gpu/drm/xe/xe_client_debugfs.c:50:
+			len = snprintf(kbuf, MAX_LINE_LEN, "%lu 0x%llx-0x%llx 0x%llx 0x%x\t%s\n",
+				vm_index,

total: 0 errors, 1 warnings, 1 checks, 161 lines checked
576256a71788 drm/xe/eudebug: Mark guc contexts as debuggable
-:9: WARNING:BAD_SIGN_OFF: Co-developed-by: must be immediately followed by Signed-off-by:
#9: 
Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>

-:10: WARNING:BAD_SIGN_OFF: Co-developed-by and Signed-off-by: name/email do not match
#10: 
Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

total: 0 errors, 2 warnings, 0 checks, 118 lines checked
9cf1196625c6 drm/xe/eudebug: Add read/count/compare helper for eu attention
89d6e551ce50 drm/xe/eudebug: Introduce EU pagefault handling interface
-:347: CHECK:LINE_SPACING: Please don't use multiple blank lines
#347: FILE: drivers/gpu/drm/xe/xe_eudebug.h:95:
+
+

-:415: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#415: 
new file mode 100644

-:541: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#541: FILE: drivers/gpu/drm/xe/xe_eudebug_pagefault.c:122:
+	spinlock_t lock;

-:549: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!fence"
#549: FILE: drivers/gpu/drm/xe/xe_eudebug_pagefault.c:130:
+	if (fence == NULL)

-:640: CHECK:USLEEP_RANGE: usleep_range is preferred over udelay; see function description of usleep_range() and udelay().
#640: FILE: drivers/gpu/drm/xe/xe_eudebug_pagefault.c:221:
+		udelay(200);

total: 0 errors, 1 warnings, 4 checks, 827 lines checked
24fbc6d651c2 drm/xe/vm: Support for adding null page VMA to VM on request
-:15: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#15: 
[1] https://lore.kernel.org/intel-xe/20230829231648.4438-1-yu.bruce.chang@intel.com/

total: 0 errors, 1 warnings, 0 checks, 42 lines checked
e6a3eb995204 drm/xe/eudebug: Enable EU pagefault handling



^ permalink raw reply	[flat|nested] 31+ messages in thread

* ✓ CI.KUnit: success for Intel Xe GPU Debug Support (eudebug) v5
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (20 preceding siblings ...)
  2025-10-06 12:30 ` ✗ CI.checkpatch: warning for Intel Xe GPU Debug Support (eudebug) v5 Patchwork
@ 2025-10-06 12:31 ` Patchwork
  2025-10-06 13:14 ` ✓ Xe.CI.BAT: " Patchwork
  2025-10-06 15:53 ` ✗ Xe.CI.Full: failure " Patchwork
  23 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-06 12:31 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-xe

== Series Details ==

Series: Intel Xe GPU Debug Support (eudebug) v5
URL   : https://patchwork.freedesktop.org/series/155452/
State : success

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[12:30:36] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[12:30:40] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[12:31:09] Starting KUnit Kernel (1/1)...
[12:31:09] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[12:31:09] ================== guc_buf (11 subtests) ===================
[12:31:09] [PASSED] test_smallest
[12:31:09] [PASSED] test_largest
[12:31:09] [PASSED] test_granular
[12:31:09] [PASSED] test_unique
[12:31:09] [PASSED] test_overlap
[12:31:09] [PASSED] test_reusable
[12:31:09] [PASSED] test_too_big
[12:31:09] [PASSED] test_flush
[12:31:09] [PASSED] test_lookup
[12:31:09] [PASSED] test_data
[12:31:09] [PASSED] test_class
[12:31:09] ===================== [PASSED] guc_buf =====================
[12:31:09] =================== guc_dbm (7 subtests) ===================
[12:31:09] [PASSED] test_empty
[12:31:09] [PASSED] test_default
[12:31:09] ======================== test_size  ========================
[12:31:09] [PASSED] 4
[12:31:09] [PASSED] 8
[12:31:09] [PASSED] 32
[12:31:09] [PASSED] 256
[12:31:09] ==================== [PASSED] test_size ====================
[12:31:09] ======================= test_reuse  ========================
[12:31:09] [PASSED] 4
[12:31:09] [PASSED] 8
[12:31:09] [PASSED] 32
[12:31:09] [PASSED] 256
[12:31:09] =================== [PASSED] test_reuse ====================
[12:31:09] =================== test_range_overlap  ====================
[12:31:09] [PASSED] 4
[12:31:09] [PASSED] 8
[12:31:09] [PASSED] 32
[12:31:09] [PASSED] 256
[12:31:09] =============== [PASSED] test_range_overlap ================
[12:31:09] =================== test_range_compact  ====================
[12:31:09] [PASSED] 4
[12:31:09] [PASSED] 8
[12:31:09] [PASSED] 32
[12:31:09] [PASSED] 256
[12:31:09] =============== [PASSED] test_range_compact ================
[12:31:09] ==================== test_range_spare  =====================
[12:31:09] [PASSED] 4
[12:31:09] [PASSED] 8
[12:31:09] [PASSED] 32
[12:31:09] [PASSED] 256
[12:31:09] ================ [PASSED] test_range_spare =================
[12:31:09] ===================== [PASSED] guc_dbm =====================
[12:31:09] =================== guc_idm (6 subtests) ===================
[12:31:09] [PASSED] bad_init
[12:31:09] [PASSED] no_init
[12:31:09] [PASSED] init_fini
[12:31:09] [PASSED] check_used
[12:31:09] [PASSED] check_quota
[12:31:09] [PASSED] check_all
[12:31:09] ===================== [PASSED] guc_idm =====================
[12:31:09] ================== no_relay (3 subtests) ===================
[12:31:09] [PASSED] xe_drops_guc2pf_if_not_ready
[12:31:09] [PASSED] xe_drops_guc2vf_if_not_ready
[12:31:09] [PASSED] xe_rejects_send_if_not_ready
[12:31:09] ==================== [PASSED] no_relay =====================
[12:31:09] ================== pf_relay (14 subtests) ==================
[12:31:09] [PASSED] pf_rejects_guc2pf_too_short
[12:31:09] [PASSED] pf_rejects_guc2pf_too_long
[12:31:09] [PASSED] pf_rejects_guc2pf_no_payload
[12:31:09] [PASSED] pf_fails_no_payload
[12:31:09] [PASSED] pf_fails_bad_origin
[12:31:09] [PASSED] pf_fails_bad_type
[12:31:09] [PASSED] pf_txn_reports_error
[12:31:09] [PASSED] pf_txn_sends_pf2guc
[12:31:09] [PASSED] pf_sends_pf2guc
[12:31:09] [SKIPPED] pf_loopback_nop
[12:31:09] [SKIPPED] pf_loopback_echo
[12:31:09] [SKIPPED] pf_loopback_fail
[12:31:09] [SKIPPED] pf_loopback_busy
[12:31:09] [SKIPPED] pf_loopback_retry
[12:31:09] ==================== [PASSED] pf_relay =====================
[12:31:09] ================== vf_relay (3 subtests) ===================
[12:31:09] [PASSED] vf_rejects_guc2vf_too_short
[12:31:09] [PASSED] vf_rejects_guc2vf_too_long
[12:31:09] [PASSED] vf_rejects_guc2vf_no_payload
[12:31:09] ==================== [PASSED] vf_relay =====================
[12:31:09] ===================== lmtt (1 subtest) =====================
[12:31:09] ======================== test_ops  =========================
[12:31:09] [PASSED] 2-level
[12:31:09] [PASSED] multi-level
[12:31:09] ==================== [PASSED] test_ops =====================
[12:31:09] ====================== [PASSED] lmtt =======================
[12:31:09] ================= pf_service (11 subtests) =================
[12:31:09] [PASSED] pf_negotiate_any
[12:31:09] [PASSED] pf_negotiate_base_match
[12:31:09] [PASSED] pf_negotiate_base_newer
[12:31:09] [PASSED] pf_negotiate_base_next
[12:31:09] [SKIPPED] pf_negotiate_base_older
[12:31:09] [PASSED] pf_negotiate_base_prev
[12:31:09] [PASSED] pf_negotiate_latest_match
[12:31:09] [PASSED] pf_negotiate_latest_newer
[12:31:09] [PASSED] pf_negotiate_latest_next
[12:31:09] [SKIPPED] pf_negotiate_latest_older
[12:31:09] [SKIPPED] pf_negotiate_latest_prev
[12:31:09] =================== [PASSED] pf_service ====================
[12:31:09] ================== xe_eudebug (1 subtest) ==================
[12:31:09] =============== xe_eudebug_toggle_reg_kunit  ===============
[12:31:09] ========== [SKIPPED] xe_eudebug_toggle_reg_kunit ===========
[12:31:09] =================== [SKIPPED] xe_eudebug ===================
[12:31:09] ================= xe_guc_g2g (2 subtests) ==================
[12:31:09] ============== xe_live_guc_g2g_kunit_default  ==============
[12:31:09] ========= [SKIPPED] xe_live_guc_g2g_kunit_default ==========
[12:31:09] ============== xe_live_guc_g2g_kunit_allmem  ===============
[12:31:09] ========== [SKIPPED] xe_live_guc_g2g_kunit_allmem ==========
[12:31:09] =================== [SKIPPED] xe_guc_g2g ===================
[12:31:09] =================== xe_mocs (2 subtests) ===================
[12:31:09] ================ xe_live_mocs_kernel_kunit  ================
[12:31:09] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[12:31:09] ================ xe_live_mocs_reset_kunit  =================
[12:31:09] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[12:31:09] ==================== [SKIPPED] xe_mocs =====================
[12:31:09] ================= xe_migrate (2 subtests) ==================
[12:31:09] ================= xe_migrate_sanity_kunit  =================
[12:31:09] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[12:31:09] ================== xe_validate_ccs_kunit  ==================
[12:31:09] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[12:31:09] =================== [SKIPPED] xe_migrate ===================
[12:31:09] ================== xe_dma_buf (1 subtest) ==================
[12:31:09] ==================== xe_dma_buf_kunit  =====================
[12:31:09] ================ [SKIPPED] xe_dma_buf_kunit ================
[12:31:09] =================== [SKIPPED] xe_dma_buf ===================
[12:31:09] ================= xe_bo_shrink (1 subtest) =================
[12:31:09] =================== xe_bo_shrink_kunit  ====================
[12:31:09] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[12:31:09] ================== [SKIPPED] xe_bo_shrink ==================
[12:31:09] ==================== xe_bo (2 subtests) ====================
[12:31:09] ================== xe_ccs_migrate_kunit  ===================
[12:31:09] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[12:31:09] ==================== xe_bo_evict_kunit  ====================
[12:31:09] =============== [SKIPPED] xe_bo_evict_kunit ================
[12:31:09] ===================== [SKIPPED] xe_bo ======================
[12:31:09] ==================== args (11 subtests) ====================
[12:31:09] [PASSED] count_args_test
[12:31:09] [PASSED] call_args_example
[12:31:09] [PASSED] call_args_test
[12:31:09] [PASSED] drop_first_arg_example
[12:31:09] [PASSED] drop_first_arg_test
[12:31:09] [PASSED] first_arg_example
[12:31:09] [PASSED] first_arg_test
[12:31:09] [PASSED] last_arg_example
[12:31:09] [PASSED] last_arg_test
[12:31:09] [PASSED] pick_arg_example
[12:31:09] [PASSED] sep_comma_example
[12:31:09] ====================== [PASSED] args =======================
[12:31:09] =================== xe_pci (3 subtests) ====================
[12:31:09] ==================== check_graphics_ip  ====================
[12:31:09] [PASSED] 12.00 Xe_LP
[12:31:09] [PASSED] 12.10 Xe_LP+
[12:31:09] [PASSED] 12.55 Xe_HPG
[12:31:09] [PASSED] 12.60 Xe_HPC
[12:31:09] [PASSED] 12.70 Xe_LPG
[12:31:09] [PASSED] 12.71 Xe_LPG
[12:31:09] [PASSED] 12.74 Xe_LPG+
[12:31:09] [PASSED] 20.01 Xe2_HPG
[12:31:09] [PASSED] 20.02 Xe2_HPG
[12:31:09] [PASSED] 20.04 Xe2_LPG
[12:31:09] [PASSED] 30.00 Xe3_LPG
[12:31:09] [PASSED] 30.01 Xe3_LPG
[12:31:09] [PASSED] 30.03 Xe3_LPG
[12:31:09] ================ [PASSED] check_graphics_ip ================
[12:31:09] ===================== check_media_ip  ======================
[12:31:09] [PASSED] 12.00 Xe_M
[12:31:09] [PASSED] 12.55 Xe_HPM
[12:31:09] [PASSED] 13.00 Xe_LPM+
[12:31:09] [PASSED] 13.01 Xe2_HPM
[12:31:09] [PASSED] 20.00 Xe2_LPM
[12:31:09] [PASSED] 30.00 Xe3_LPM
[12:31:09] [PASSED] 30.02 Xe3_LPM
[12:31:09] ================= [PASSED] check_media_ip ==================
[12:31:09] ================= check_platform_gt_count  =================
[12:31:09] [PASSED] 0x9A60 (TIGERLAKE)
[12:31:09] [PASSED] 0x9A68 (TIGERLAKE)
[12:31:09] [PASSED] 0x9A70 (TIGERLAKE)
[12:31:09] [PASSED] 0x9A40 (TIGERLAKE)
[12:31:09] [PASSED] 0x9A49 (TIGERLAKE)
[12:31:09] [PASSED] 0x9A59 (TIGERLAKE)
[12:31:09] [PASSED] 0x9A78 (TIGERLAKE)
[12:31:09] [PASSED] 0x9AC0 (TIGERLAKE)
[12:31:09] [PASSED] 0x9AC9 (TIGERLAKE)
[12:31:09] [PASSED] 0x9AD9 (TIGERLAKE)
[12:31:09] [PASSED] 0x9AF8 (TIGERLAKE)
[12:31:09] [PASSED] 0x4C80 (ROCKETLAKE)
[12:31:09] [PASSED] 0x4C8A (ROCKETLAKE)
[12:31:09] [PASSED] 0x4C8B (ROCKETLAKE)
[12:31:09] [PASSED] 0x4C8C (ROCKETLAKE)
[12:31:09] [PASSED] 0x4C90 (ROCKETLAKE)
[12:31:09] [PASSED] 0x4C9A (ROCKETLAKE)
[12:31:09] [PASSED] 0x4680 (ALDERLAKE_S)
[12:31:09] [PASSED] 0x4682 (ALDERLAKE_S)
[12:31:09] [PASSED] 0x4688 (ALDERLAKE_S)
[12:31:09] [PASSED] 0x468A (ALDERLAKE_S)
[12:31:09] [PASSED] 0x468B (ALDERLAKE_S)
[12:31:09] [PASSED] 0x4690 (ALDERLAKE_S)
[12:31:09] [PASSED] 0x4692 (ALDERLAKE_S)
[12:31:09] [PASSED] 0x4693 (ALDERLAKE_S)
[12:31:09] [PASSED] 0x46A0 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46A1 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46A2 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46A3 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46A6 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46A8 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46AA (ALDERLAKE_P)
[12:31:09] [PASSED] 0x462A (ALDERLAKE_P)
[12:31:09] [PASSED] 0x4626 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x4628 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46B0 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46B1 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46B2 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46B3 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46C0 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46C1 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46C2 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46C3 (ALDERLAKE_P)
[12:31:09] [PASSED] 0x46D0 (ALDERLAKE_N)
[12:31:09] [PASSED] 0x46D1 (ALDERLAKE_N)
[12:31:09] [PASSED] 0x46D2 (ALDERLAKE_N)
[12:31:09] [PASSED] 0x46D3 (ALDERLAKE_N)
[12:31:09] [PASSED] 0x46D4 (ALDERLAKE_N)
[12:31:09] [PASSED] 0xA721 (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7A1 (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7A9 (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7AC (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7AD (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA720 (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7A0 (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7A8 (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7AA (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA7AB (ALDERLAKE_P)
[12:31:09] [PASSED] 0xA780 (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA781 (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA782 (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA783 (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA788 (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA789 (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA78A (ALDERLAKE_S)
[12:31:09] [PASSED] 0xA78B (ALDERLAKE_S)
[12:31:09] [PASSED] 0x4905 (DG1)
[12:31:09] [PASSED] 0x4906 (DG1)
[12:31:09] [PASSED] 0x4907 (DG1)
[12:31:09] [PASSED] 0x4908 (DG1)
[12:31:09] [PASSED] 0x4909 (DG1)
[12:31:09] [PASSED] 0x56C0 (DG2)
[12:31:09] [PASSED] 0x56C2 (DG2)
[12:31:09] [PASSED] 0x56C1 (DG2)
[12:31:09] [PASSED] 0x7D51 (METEORLAKE)
[12:31:09] [PASSED] 0x7DD1 (METEORLAKE)
[12:31:09] [PASSED] 0x7D41 (METEORLAKE)
[12:31:09] [PASSED] 0x7D67 (METEORLAKE)
[12:31:09] [PASSED] 0xB640 (METEORLAKE)
[12:31:09] [PASSED] 0x56A0 (DG2)
[12:31:09] [PASSED] 0x56A1 (DG2)
[12:31:09] [PASSED] 0x56A2 (DG2)
[12:31:09] [PASSED] 0x56BE (DG2)
[12:31:09] [PASSED] 0x56BF (DG2)
[12:31:09] [PASSED] 0x5690 (DG2)
[12:31:09] [PASSED] 0x5691 (DG2)
[12:31:09] [PASSED] 0x5692 (DG2)
[12:31:09] [PASSED] 0x56A5 (DG2)
[12:31:09] [PASSED] 0x56A6 (DG2)
[12:31:09] [PASSED] 0x56B0 (DG2)
[12:31:09] [PASSED] 0x56B1 (DG2)
[12:31:09] [PASSED] 0x56BA (DG2)
[12:31:09] [PASSED] 0x56BB (DG2)
[12:31:09] [PASSED] 0x56BC (DG2)
[12:31:09] [PASSED] 0x56BD (DG2)
[12:31:09] [PASSED] 0x5693 (DG2)
[12:31:09] [PASSED] 0x5694 (DG2)
[12:31:09] [PASSED] 0x5695 (DG2)
[12:31:09] [PASSED] 0x56A3 (DG2)
[12:31:09] [PASSED] 0x56A4 (DG2)
[12:31:09] [PASSED] 0x56B2 (DG2)
[12:31:09] [PASSED] 0x56B3 (DG2)
[12:31:09] [PASSED] 0x5696 (DG2)
[12:31:09] [PASSED] 0x5697 (DG2)
[12:31:09] [PASSED] 0xB69 (PVC)
[12:31:09] [PASSED] 0xB6E (PVC)
[12:31:09] [PASSED] 0xBD4 (PVC)
[12:31:09] [PASSED] 0xBD5 (PVC)
[12:31:09] [PASSED] 0xBD6 (PVC)
[12:31:09] [PASSED] 0xBD7 (PVC)
[12:31:09] [PASSED] 0xBD8 (PVC)
[12:31:09] [PASSED] 0xBD9 (PVC)
[12:31:09] [PASSED] 0xBDA (PVC)
[12:31:09] [PASSED] 0xBDB (PVC)
[12:31:09] [PASSED] 0xBE0 (PVC)
[12:31:09] [PASSED] 0xBE1 (PVC)
[12:31:09] [PASSED] 0xBE5 (PVC)
[12:31:09] [PASSED] 0x7D40 (METEORLAKE)
[12:31:09] [PASSED] 0x7D45 (METEORLAKE)
[12:31:09] [PASSED] 0x7D55 (METEORLAKE)
[12:31:09] [PASSED] 0x7D60 (METEORLAKE)
[12:31:09] [PASSED] 0x7DD5 (METEORLAKE)
[12:31:09] [PASSED] 0x6420 (LUNARLAKE)
[12:31:09] [PASSED] 0x64A0 (LUNARLAKE)
[12:31:09] [PASSED] 0x64B0 (LUNARLAKE)
[12:31:09] [PASSED] 0xE202 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE209 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE20B (BATTLEMAGE)
[12:31:09] [PASSED] 0xE20C (BATTLEMAGE)
[12:31:09] [PASSED] 0xE20D (BATTLEMAGE)
[12:31:09] [PASSED] 0xE210 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE211 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE212 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE216 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE220 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE221 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE222 (BATTLEMAGE)
[12:31:09] [PASSED] 0xE223 (BATTLEMAGE)
[12:31:09] [PASSED] 0xB080 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB081 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB082 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB083 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB084 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB085 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB086 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB087 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB08F (PANTHERLAKE)
[12:31:09] [PASSED] 0xB090 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB0A0 (PANTHERLAKE)
[12:31:09] [PASSED] 0xB0B0 (PANTHERLAKE)
[12:31:09] [PASSED] 0xFD80 (PANTHERLAKE)
[12:31:09] [PASSED] 0xFD81 (PANTHERLAKE)
[12:31:09] ============= [PASSED] check_platform_gt_count =============
[12:31:09] ===================== [PASSED] xe_pci ======================
[12:31:09] =================== xe_rtp (2 subtests) ====================
[12:31:09] =============== xe_rtp_process_to_sr_tests  ================
[12:31:09] [PASSED] coalesce-same-reg
[12:31:09] [PASSED] no-match-no-add
[12:31:09] [PASSED] match-or
[12:31:09] [PASSED] match-or-xfail
[12:31:09] [PASSED] no-match-no-add-multiple-rules
[12:31:09] [PASSED] two-regs-two-entries
[12:31:09] [PASSED] clr-one-set-other
[12:31:09] [PASSED] set-field
[12:31:09] [PASSED] conflict-duplicate
[12:31:09] [PASSED] conflict-not-disjoint
[12:31:09] [PASSED] conflict-reg-type
[12:31:09] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[12:31:09] ================== xe_rtp_process_tests  ===================
[12:31:09] [PASSED] active1
[12:31:09] [PASSED] active2
[12:31:09] [PASSED] active-inactive
[12:31:09] [PASSED] inactive-active
[12:31:09] [PASSED] inactive-1st_or_active-inactive
[12:31:09] [PASSED] inactive-2nd_or_active-inactive
[12:31:09] [PASSED] inactive-last_or_active-inactive
[12:31:09] [PASSED] inactive-no_or_active-inactive
[12:31:09] ============== [PASSED] xe_rtp_process_tests ===============
stty: 'standard input': Inappropriate ioctl for device
[12:31:09] ===================== [PASSED] xe_rtp ======================
[12:31:09] ==================== xe_wa (1 subtest) =====================
[12:31:09] ======================== xe_wa_gt  =========================
[12:31:09] [PASSED] TIGERLAKE B0
[12:31:09] [PASSED] DG1 A0
[12:31:09] [PASSED] DG1 B0
[12:31:09] [PASSED] ALDERLAKE_S A0
[12:31:09] [PASSED] ALDERLAKE_S B0
[12:31:09] [PASSED] ALDERLAKE_S C0
[12:31:09] [PASSED] ALDERLAKE_S D0
[12:31:09] [PASSED] ALDERLAKE_P A0
[12:31:09] [PASSED] ALDERLAKE_P B0
[12:31:09] [PASSED] ALDERLAKE_P C0
[12:31:09] [PASSED] ALDERLAKE_S RPLS D0
[12:31:09] [PASSED] ALDERLAKE_P RPLU E0
[12:31:09] [PASSED] DG2 G10 C0
[12:31:09] [PASSED] DG2 G11 B1
[12:31:09] [PASSED] DG2 G12 A1
[12:31:09] [PASSED] METEORLAKE 12.70(Xe_LPG) A0 13.00(Xe_LPM+) A0
[12:31:09] [PASSED] METEORLAKE 12.71(Xe_LPG) A0 13.00(Xe_LPM+) A0
[12:31:09] [PASSED] METEORLAKE 12.74(Xe_LPG+) A0 13.00(Xe_LPM+) A0
[12:31:09] [PASSED] LUNARLAKE 20.04(Xe2_LPG) A0 20.00(Xe2_LPM) A0
[12:31:09] [PASSED] LUNARLAKE 20.04(Xe2_LPG) B0 20.00(Xe2_LPM) A0
[12:31:09] [PASSED] BATTLEMAGE 20.01(Xe2_HPG) A0 13.01(Xe2_HPM) A1
[12:31:09] [PASSED] PANTHERLAKE 30.00(Xe3_LPG) A0 30.00(Xe3_LPM) A0
[12:31:09] ==================== [PASSED] xe_wa_gt =====================
[12:31:09] ====================== [PASSED] xe_wa ======================
[12:31:09] ============================================================
[12:31:09] Testing complete. Ran 307 tests: passed: 288, skipped: 19
[12:31:09] Elapsed time: 33.723s total, 4.142s configuring, 29.215s building, 0.326s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[12:31:09] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[12:31:11] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[12:31:34] Starting KUnit Kernel (1/1)...
[12:31:34] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[12:31:35] ============ drm_test_pick_cmdline (2 subtests) ============
[12:31:35] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[12:31:35] =============== drm_test_pick_cmdline_named  ===============
[12:31:35] [PASSED] NTSC
[12:31:35] [PASSED] NTSC-J
[12:31:35] [PASSED] PAL
[12:31:35] [PASSED] PAL-M
[12:31:35] =========== [PASSED] drm_test_pick_cmdline_named ===========
[12:31:35] ============== [PASSED] drm_test_pick_cmdline ==============
[12:31:35] == drm_test_atomic_get_connector_for_encoder (1 subtest) ===
[12:31:35] [PASSED] drm_test_drm_atomic_get_connector_for_encoder
[12:31:35] ==== [PASSED] drm_test_atomic_get_connector_for_encoder ====
[12:31:35] =========== drm_validate_clone_mode (2 subtests) ===========
[12:31:35] ============== drm_test_check_in_clone_mode  ===============
[12:31:35] [PASSED] in_clone_mode
[12:31:35] [PASSED] not_in_clone_mode
[12:31:35] ========== [PASSED] drm_test_check_in_clone_mode ===========
[12:31:35] =============== drm_test_check_valid_clones  ===============
[12:31:35] [PASSED] not_in_clone_mode
[12:31:35] [PASSED] valid_clone
[12:31:35] [PASSED] invalid_clone
[12:31:35] =========== [PASSED] drm_test_check_valid_clones ===========
[12:31:35] ============= [PASSED] drm_validate_clone_mode =============
[12:31:35] ============= drm_validate_modeset (1 subtest) =============
[12:31:35] [PASSED] drm_test_check_connector_changed_modeset
[12:31:35] ============== [PASSED] drm_validate_modeset ===============
[12:31:35] ====== drm_test_bridge_get_current_state (2 subtests) ======
[12:31:35] [PASSED] drm_test_drm_bridge_get_current_state_atomic
[12:31:35] [PASSED] drm_test_drm_bridge_get_current_state_legacy
[12:31:35] ======== [PASSED] drm_test_bridge_get_current_state ========
[12:31:35] ====== drm_test_bridge_helper_reset_crtc (3 subtests) ======
[12:31:35] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic
[12:31:35] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic_disabled
[12:31:35] [PASSED] drm_test_drm_bridge_helper_reset_crtc_legacy
[12:31:35] ======== [PASSED] drm_test_bridge_helper_reset_crtc ========
[12:31:35] ============== drm_bridge_alloc (2 subtests) ===============
[12:31:35] [PASSED] drm_test_drm_bridge_alloc_basic
[12:31:35] [PASSED] drm_test_drm_bridge_alloc_get_put
[12:31:35] ================ [PASSED] drm_bridge_alloc =================
[12:31:35] ================== drm_buddy (7 subtests) ==================
[12:31:35] [PASSED] drm_test_buddy_alloc_limit
[12:31:35] [PASSED] drm_test_buddy_alloc_optimistic
[12:31:35] [PASSED] drm_test_buddy_alloc_pessimistic
[12:31:35] [PASSED] drm_test_buddy_alloc_pathological
[12:31:35] [PASSED] drm_test_buddy_alloc_contiguous
[12:31:35] [PASSED] drm_test_buddy_alloc_clear
[12:31:35] [PASSED] drm_test_buddy_alloc_range_bias
[12:31:35] ==================== [PASSED] drm_buddy ====================
[12:31:35] ============= drm_cmdline_parser (40 subtests) =============
[12:31:35] [PASSED] drm_test_cmdline_force_d_only
[12:31:35] [PASSED] drm_test_cmdline_force_D_only_dvi
[12:31:35] [PASSED] drm_test_cmdline_force_D_only_hdmi
[12:31:35] [PASSED] drm_test_cmdline_force_D_only_not_digital
[12:31:35] [PASSED] drm_test_cmdline_force_e_only
[12:31:35] [PASSED] drm_test_cmdline_res
[12:31:35] [PASSED] drm_test_cmdline_res_vesa
[12:31:35] [PASSED] drm_test_cmdline_res_vesa_rblank
[12:31:35] [PASSED] drm_test_cmdline_res_rblank
[12:31:35] [PASSED] drm_test_cmdline_res_bpp
[12:31:35] [PASSED] drm_test_cmdline_res_refresh
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[12:31:35] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[12:31:35] [PASSED] drm_test_cmdline_res_margins_force_on
[12:31:35] [PASSED] drm_test_cmdline_res_vesa_margins
[12:31:35] [PASSED] drm_test_cmdline_name
[12:31:35] [PASSED] drm_test_cmdline_name_bpp
[12:31:35] [PASSED] drm_test_cmdline_name_option
[12:31:35] [PASSED] drm_test_cmdline_name_bpp_option
[12:31:35] [PASSED] drm_test_cmdline_rotate_0
[12:31:35] [PASSED] drm_test_cmdline_rotate_90
[12:31:35] [PASSED] drm_test_cmdline_rotate_180
[12:31:35] [PASSED] drm_test_cmdline_rotate_270
[12:31:35] [PASSED] drm_test_cmdline_hmirror
[12:31:35] [PASSED] drm_test_cmdline_vmirror
[12:31:35] [PASSED] drm_test_cmdline_margin_options
[12:31:35] [PASSED] drm_test_cmdline_multiple_options
[12:31:35] [PASSED] drm_test_cmdline_bpp_extra_and_option
[12:31:35] [PASSED] drm_test_cmdline_extra_and_option
[12:31:35] [PASSED] drm_test_cmdline_freestanding_options
[12:31:35] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[12:31:35] [PASSED] drm_test_cmdline_panel_orientation
[12:31:35] ================ drm_test_cmdline_invalid  =================
[12:31:35] [PASSED] margin_only
[12:31:35] [PASSED] interlace_only
[12:31:35] [PASSED] res_missing_x
[12:31:35] [PASSED] res_missing_y
[12:31:35] [PASSED] res_bad_y
[12:31:35] [PASSED] res_missing_y_bpp
[12:31:35] [PASSED] res_bad_bpp
[12:31:35] [PASSED] res_bad_refresh
[12:31:35] [PASSED] res_bpp_refresh_force_on_off
[12:31:35] [PASSED] res_invalid_mode
[12:31:35] [PASSED] res_bpp_wrong_place_mode
[12:31:35] [PASSED] name_bpp_refresh
[12:31:35] [PASSED] name_refresh
[12:31:35] [PASSED] name_refresh_wrong_mode
[12:31:35] [PASSED] name_refresh_invalid_mode
[12:31:35] [PASSED] rotate_multiple
[12:31:35] [PASSED] rotate_invalid_val
[12:31:35] [PASSED] rotate_truncated
[12:31:35] [PASSED] invalid_option
[12:31:35] [PASSED] invalid_tv_option
[12:31:35] [PASSED] truncated_tv_option
[12:31:35] ============ [PASSED] drm_test_cmdline_invalid =============
[12:31:35] =============== drm_test_cmdline_tv_options  ===============
[12:31:35] [PASSED] NTSC
[12:31:35] [PASSED] NTSC_443
[12:31:35] [PASSED] NTSC_J
[12:31:35] [PASSED] PAL
[12:31:35] [PASSED] PAL_M
[12:31:35] [PASSED] PAL_N
[12:31:35] [PASSED] SECAM
[12:31:35] [PASSED] MONO_525
[12:31:35] [PASSED] MONO_625
[12:31:35] =========== [PASSED] drm_test_cmdline_tv_options ===========
[12:31:35] =============== [PASSED] drm_cmdline_parser ================
[12:31:35] ========== drmm_connector_hdmi_init (20 subtests) ==========
[12:31:35] [PASSED] drm_test_connector_hdmi_init_valid
[12:31:35] [PASSED] drm_test_connector_hdmi_init_bpc_8
[12:31:35] [PASSED] drm_test_connector_hdmi_init_bpc_10
[12:31:35] [PASSED] drm_test_connector_hdmi_init_bpc_12
[12:31:35] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[12:31:35] [PASSED] drm_test_connector_hdmi_init_bpc_null
[12:31:35] [PASSED] drm_test_connector_hdmi_init_formats_empty
[12:31:35] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[12:31:35] === drm_test_connector_hdmi_init_formats_yuv420_allowed  ===
[12:31:35] [PASSED] supported_formats=0x9 yuv420_allowed=1
[12:31:35] [PASSED] supported_formats=0x9 yuv420_allowed=0
[12:31:35] [PASSED] supported_formats=0x3 yuv420_allowed=1
[12:31:35] [PASSED] supported_formats=0x3 yuv420_allowed=0
[12:31:35] === [PASSED] drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[12:31:35] [PASSED] drm_test_connector_hdmi_init_null_ddc
[12:31:35] [PASSED] drm_test_connector_hdmi_init_null_product
[12:31:35] [PASSED] drm_test_connector_hdmi_init_null_vendor
[12:31:35] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[12:31:35] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[12:31:35] [PASSED] drm_test_connector_hdmi_init_product_valid
[12:31:35] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[12:31:35] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[12:31:35] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[12:31:35] ========= drm_test_connector_hdmi_init_type_valid  =========
[12:31:35] [PASSED] HDMI-A
[12:31:35] [PASSED] HDMI-B
[12:31:35] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[12:31:35] ======== drm_test_connector_hdmi_init_type_invalid  ========
[12:31:35] [PASSED] Unknown
[12:31:35] [PASSED] VGA
[12:31:35] [PASSED] DVI-I
[12:31:35] [PASSED] DVI-D
[12:31:35] [PASSED] DVI-A
[12:31:35] [PASSED] Composite
[12:31:35] [PASSED] SVIDEO
[12:31:35] [PASSED] LVDS
[12:31:35] [PASSED] Component
[12:31:35] [PASSED] DIN
[12:31:35] [PASSED] DP
[12:31:35] [PASSED] TV
[12:31:35] [PASSED] eDP
[12:31:35] [PASSED] Virtual
[12:31:35] [PASSED] DSI
[12:31:35] [PASSED] DPI
[12:31:35] [PASSED] Writeback
[12:31:35] [PASSED] SPI
[12:31:35] [PASSED] USB
[12:31:35] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[12:31:35] ============ [PASSED] drmm_connector_hdmi_init =============
[12:31:35] ============= drmm_connector_init (3 subtests) =============
[12:31:35] [PASSED] drm_test_drmm_connector_init
[12:31:35] [PASSED] drm_test_drmm_connector_init_null_ddc
[12:31:35] ========= drm_test_drmm_connector_init_type_valid  =========
[12:31:35] [PASSED] Unknown
[12:31:35] [PASSED] VGA
[12:31:35] [PASSED] DVI-I
[12:31:35] [PASSED] DVI-D
[12:31:35] [PASSED] DVI-A
[12:31:35] [PASSED] Composite
[12:31:35] [PASSED] SVIDEO
[12:31:35] [PASSED] LVDS
[12:31:35] [PASSED] Component
[12:31:35] [PASSED] DIN
[12:31:35] [PASSED] DP
[12:31:35] [PASSED] HDMI-A
[12:31:35] [PASSED] HDMI-B
[12:31:35] [PASSED] TV
[12:31:35] [PASSED] eDP
[12:31:35] [PASSED] Virtual
[12:31:35] [PASSED] DSI
[12:31:35] [PASSED] DPI
[12:31:35] [PASSED] Writeback
[12:31:35] [PASSED] SPI
[12:31:35] [PASSED] USB
[12:31:35] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[12:31:35] =============== [PASSED] drmm_connector_init ===============
[12:31:35] ========= drm_connector_dynamic_init (6 subtests) ==========
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_init
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_init_null_ddc
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_init_not_added
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_init_properties
[12:31:35] ===== drm_test_drm_connector_dynamic_init_type_valid  ======
[12:31:35] [PASSED] Unknown
[12:31:35] [PASSED] VGA
[12:31:35] [PASSED] DVI-I
[12:31:35] [PASSED] DVI-D
[12:31:35] [PASSED] DVI-A
[12:31:35] [PASSED] Composite
[12:31:35] [PASSED] SVIDEO
[12:31:35] [PASSED] LVDS
[12:31:35] [PASSED] Component
[12:31:35] [PASSED] DIN
[12:31:35] [PASSED] DP
[12:31:35] [PASSED] HDMI-A
[12:31:35] [PASSED] HDMI-B
[12:31:35] [PASSED] TV
[12:31:35] [PASSED] eDP
[12:31:35] [PASSED] Virtual
[12:31:35] [PASSED] DSI
[12:31:35] [PASSED] DPI
[12:31:35] [PASSED] Writeback
[12:31:35] [PASSED] SPI
[12:31:35] [PASSED] USB
[12:31:35] = [PASSED] drm_test_drm_connector_dynamic_init_type_valid ==
[12:31:35] ======== drm_test_drm_connector_dynamic_init_name  =========
[12:31:35] [PASSED] Unknown
[12:31:35] [PASSED] VGA
[12:31:35] [PASSED] DVI-I
[12:31:35] [PASSED] DVI-D
[12:31:35] [PASSED] DVI-A
[12:31:35] [PASSED] Composite
[12:31:35] [PASSED] SVIDEO
[12:31:35] [PASSED] LVDS
[12:31:35] [PASSED] Component
[12:31:35] [PASSED] DIN
[12:31:35] [PASSED] DP
[12:31:35] [PASSED] HDMI-A
[12:31:35] [PASSED] HDMI-B
[12:31:35] [PASSED] TV
[12:31:35] [PASSED] eDP
[12:31:35] [PASSED] Virtual
[12:31:35] [PASSED] DSI
[12:31:35] [PASSED] DPI
[12:31:35] [PASSED] Writeback
[12:31:35] [PASSED] SPI
[12:31:35] [PASSED] USB
[12:31:35] ==== [PASSED] drm_test_drm_connector_dynamic_init_name =====
[12:31:35] =========== [PASSED] drm_connector_dynamic_init ============
[12:31:35] ==== drm_connector_dynamic_register_early (4 subtests) =====
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_early_on_list
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_early_defer
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_early_no_init
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_early_no_mode_object
[12:31:35] ====== [PASSED] drm_connector_dynamic_register_early =======
[12:31:35] ======= drm_connector_dynamic_register (7 subtests) ========
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_on_list
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_no_defer
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_no_init
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_mode_object
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_sysfs
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_sysfs_name
[12:31:35] [PASSED] drm_test_drm_connector_dynamic_register_debugfs
[12:31:35] ========= [PASSED] drm_connector_dynamic_register ==========
[12:31:35] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[12:31:35] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[12:31:35] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[12:31:35] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[12:31:35] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[12:31:35] ========== drm_test_get_tv_mode_from_name_valid  ===========
[12:31:35] [PASSED] NTSC
[12:31:35] [PASSED] NTSC-443
[12:31:35] [PASSED] NTSC-J
[12:31:35] [PASSED] PAL
[12:31:35] [PASSED] PAL-M
[12:31:35] [PASSED] PAL-N
[12:31:35] [PASSED] SECAM
[12:31:35] [PASSED] Mono
[12:31:35] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[12:31:35] [PASSED] drm_test_get_tv_mode_from_name_truncated
[12:31:35] ============ [PASSED] drm_get_tv_mode_from_name ============
[12:31:35] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[12:31:35] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[12:31:35] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[12:31:35] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[12:31:35] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[12:31:35] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[12:31:35] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[12:31:35] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid  =
[12:31:35] [PASSED] VIC 96
[12:31:35] [PASSED] VIC 97
[12:31:35] [PASSED] VIC 101
[12:31:35] [PASSED] VIC 102
[12:31:35] [PASSED] VIC 106
[12:31:35] [PASSED] VIC 107
[12:31:35] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[12:31:35] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[12:31:35] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[12:31:35] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[12:31:35] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[12:31:35] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[12:31:35] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[12:31:35] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[12:31:35] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name  ====
[12:31:35] [PASSED] Automatic
[12:31:35] [PASSED] Full
[12:31:35] [PASSED] Limited 16:235
[12:31:35] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[12:31:35] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[12:31:35] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[12:31:35] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[12:31:35] === drm_test_drm_hdmi_connector_get_output_format_name  ====
[12:31:35] [PASSED] RGB
[12:31:35] [PASSED] YUV 4:2:0
[12:31:35] [PASSED] YUV 4:2:2
[12:31:35] [PASSED] YUV 4:4:4
[12:31:35] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[12:31:35] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[12:31:35] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[12:31:35] ============= drm_damage_helper (21 subtests) ==============
[12:31:35] [PASSED] drm_test_damage_iter_no_damage
[12:31:35] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[12:31:35] [PASSED] drm_test_damage_iter_no_damage_src_moved
[12:31:35] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[12:31:35] [PASSED] drm_test_damage_iter_no_damage_not_visible
[12:31:35] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[12:31:35] [PASSED] drm_test_damage_iter_no_damage_no_fb
[12:31:35] [PASSED] drm_test_damage_iter_simple_damage
[12:31:35] [PASSED] drm_test_damage_iter_single_damage
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_outside_src
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_src_moved
[12:31:35] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[12:31:35] [PASSED] drm_test_damage_iter_damage
[12:31:35] [PASSED] drm_test_damage_iter_damage_one_intersect
[12:31:35] [PASSED] drm_test_damage_iter_damage_one_outside
[12:31:35] [PASSED] drm_test_damage_iter_damage_src_moved
[12:31:35] [PASSED] drm_test_damage_iter_damage_not_visible
[12:31:35] ================ [PASSED] drm_damage_helper ================
[12:31:35] ============== drm_dp_mst_helper (3 subtests) ==============
[12:31:35] ============== drm_test_dp_mst_calc_pbn_mode  ==============
[12:31:35] [PASSED] Clock 154000 BPP 30 DSC disabled
[12:31:35] [PASSED] Clock 234000 BPP 30 DSC disabled
[12:31:35] [PASSED] Clock 297000 BPP 24 DSC disabled
[12:31:35] [PASSED] Clock 332880 BPP 24 DSC enabled
[12:31:35] [PASSED] Clock 324540 BPP 24 DSC enabled
[12:31:35] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[12:31:35] ============== drm_test_dp_mst_calc_pbn_div  ===============
[12:31:35] [PASSED] Link rate 2000000 lane count 4
[12:31:35] [PASSED] Link rate 2000000 lane count 2
[12:31:35] [PASSED] Link rate 2000000 lane count 1
[12:31:35] [PASSED] Link rate 1350000 lane count 4
[12:31:35] [PASSED] Link rate 1350000 lane count 2
[12:31:35] [PASSED] Link rate 1350000 lane count 1
[12:31:35] [PASSED] Link rate 1000000 lane count 4
[12:31:35] [PASSED] Link rate 1000000 lane count 2
[12:31:35] [PASSED] Link rate 1000000 lane count 1
[12:31:35] [PASSED] Link rate 810000 lane count 4
[12:31:35] [PASSED] Link rate 810000 lane count 2
[12:31:35] [PASSED] Link rate 810000 lane count 1
[12:31:35] [PASSED] Link rate 540000 lane count 4
[12:31:35] [PASSED] Link rate 540000 lane count 2
[12:31:35] [PASSED] Link rate 540000 lane count 1
[12:31:35] [PASSED] Link rate 270000 lane count 4
[12:31:35] [PASSED] Link rate 270000 lane count 2
[12:31:35] [PASSED] Link rate 270000 lane count 1
[12:31:35] [PASSED] Link rate 162000 lane count 4
[12:31:35] [PASSED] Link rate 162000 lane count 2
[12:31:35] [PASSED] Link rate 162000 lane count 1
[12:31:35] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[12:31:35] ========= drm_test_dp_mst_sideband_msg_req_decode  =========
[12:31:35] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[12:31:35] [PASSED] DP_POWER_UP_PHY with port number
[12:31:35] [PASSED] DP_POWER_DOWN_PHY with port number
[12:31:35] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[12:31:35] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[12:31:35] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[12:31:35] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[12:31:35] [PASSED] DP_QUERY_PAYLOAD with port number
[12:31:35] [PASSED] DP_QUERY_PAYLOAD with VCPI
[12:31:35] [PASSED] DP_REMOTE_DPCD_READ with port number
[12:31:35] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[12:31:35] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[12:31:35] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[12:31:35] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[12:31:35] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[12:31:35] [PASSED] DP_REMOTE_I2C_READ with port number
[12:31:35] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[12:31:35] [PASSED] DP_REMOTE_I2C_READ with transactions array
[12:31:35] [PASSED] DP_REMOTE_I2C_WRITE with port number
[12:31:35] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[12:31:35] [PASSED] DP_REMOTE_I2C_WRITE with data array
[12:31:35] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[12:31:35] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[12:31:35] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[12:31:35] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[12:31:35] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[12:31:35] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[12:31:35] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[12:31:35] ================ [PASSED] drm_dp_mst_helper ================
[12:31:35] ================== drm_exec (7 subtests) ===================
[12:31:35] [PASSED] sanitycheck
[12:31:35] [PASSED] test_lock
[12:31:35] [PASSED] test_lock_unlock
[12:31:35] [PASSED] test_duplicates
[12:31:35] [PASSED] test_prepare
[12:31:35] [PASSED] test_prepare_array
[12:31:35] [PASSED] test_multiple_loops
[12:31:35] ==================== [PASSED] drm_exec =====================
[12:31:35] =========== drm_format_helper_test (17 subtests) ===========
[12:31:35] ============== drm_test_fb_xrgb8888_to_gray8  ==============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[12:31:35] ============= drm_test_fb_xrgb8888_to_rgb332  ==============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[12:31:35] ============= drm_test_fb_xrgb8888_to_rgb565  ==============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[12:31:35] ============ drm_test_fb_xrgb8888_to_xrgb1555  =============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[12:31:35] ============ drm_test_fb_xrgb8888_to_argb1555  =============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[12:31:35] ============ drm_test_fb_xrgb8888_to_rgba5551  =============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[12:31:35] ============= drm_test_fb_xrgb8888_to_rgb888  ==============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[12:31:35] ============= drm_test_fb_xrgb8888_to_bgr888  ==============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ========= [PASSED] drm_test_fb_xrgb8888_to_bgr888 ==========
[12:31:35] ============ drm_test_fb_xrgb8888_to_argb8888  =============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[12:31:35] =========== drm_test_fb_xrgb8888_to_xrgb2101010  ===========
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[12:31:35] =========== drm_test_fb_xrgb8888_to_argb2101010  ===========
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[12:31:35] ============== drm_test_fb_xrgb8888_to_mono  ===============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[12:31:35] ==================== drm_test_fb_swab  =====================
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ================ [PASSED] drm_test_fb_swab =================
[12:31:35] ============ drm_test_fb_xrgb8888_to_xbgr8888  =============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[12:31:35] ============ drm_test_fb_xrgb8888_to_abgr8888  =============
[12:31:35] [PASSED] single_pixel_source_buffer
[12:31:35] [PASSED] single_pixel_clip_rectangle
[12:31:35] [PASSED] well_known_colors
[12:31:35] [PASSED] destination_pitch
[12:31:35] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[12:31:35] ================= drm_test_fb_clip_offset  =================
[12:31:35] [PASSED] pass through
[12:31:35] [PASSED] horizontal offset
[12:31:35] [PASSED] vertical offset
[12:31:35] [PASSED] horizontal and vertical offset
[12:31:35] [PASSED] horizontal offset (custom pitch)
[12:31:35] [PASSED] vertical offset (custom pitch)
[12:31:35] [PASSED] horizontal and vertical offset (custom pitch)
[12:31:35] ============= [PASSED] drm_test_fb_clip_offset =============
[12:31:35] =================== drm_test_fb_memcpy  ====================
[12:31:35] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[12:31:35] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[12:31:35] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[12:31:35] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[12:31:35] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[12:31:35] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[12:31:35] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[12:31:35] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[12:31:35] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[12:31:35] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[12:31:35] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[12:31:35] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[12:31:35] =============== [PASSED] drm_test_fb_memcpy ================
[12:31:35] ============= [PASSED] drm_format_helper_test ==============
[12:31:35] ================= drm_format (18 subtests) =================
[12:31:35] [PASSED] drm_test_format_block_width_invalid
[12:31:35] [PASSED] drm_test_format_block_width_one_plane
[12:31:35] [PASSED] drm_test_format_block_width_two_plane
[12:31:35] [PASSED] drm_test_format_block_width_three_plane
[12:31:35] [PASSED] drm_test_format_block_width_tiled
[12:31:35] [PASSED] drm_test_format_block_height_invalid
[12:31:35] [PASSED] drm_test_format_block_height_one_plane
[12:31:35] [PASSED] drm_test_format_block_height_two_plane
[12:31:35] [PASSED] drm_test_format_block_height_three_plane
[12:31:35] [PASSED] drm_test_format_block_height_tiled
[12:31:35] [PASSED] drm_test_format_min_pitch_invalid
[12:31:35] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[12:31:35] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[12:31:35] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[12:31:35] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[12:31:35] [PASSED] drm_test_format_min_pitch_two_plane
[12:31:35] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[12:31:35] [PASSED] drm_test_format_min_pitch_tiled
[12:31:35] =================== [PASSED] drm_format ====================
[12:31:35] ============== drm_framebuffer (10 subtests) ===============
[12:31:35] ========== drm_test_framebuffer_check_src_coords  ==========
[12:31:35] [PASSED] Success: source fits into fb
[12:31:35] [PASSED] Fail: overflowing fb with x-axis coordinate
[12:31:35] [PASSED] Fail: overflowing fb with y-axis coordinate
[12:31:35] [PASSED] Fail: overflowing fb with source width
[12:31:35] [PASSED] Fail: overflowing fb with source height
[12:31:35] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[12:31:35] [PASSED] drm_test_framebuffer_cleanup
[12:31:35] =============== drm_test_framebuffer_create  ===============
[12:31:35] [PASSED] ABGR8888 normal sizes
[12:31:35] [PASSED] ABGR8888 max sizes
[12:31:35] [PASSED] ABGR8888 pitch greater than min required
[12:31:35] [PASSED] ABGR8888 pitch less than min required
[12:31:35] [PASSED] ABGR8888 Invalid width
[12:31:35] [PASSED] ABGR8888 Invalid buffer handle
[12:31:35] [PASSED] No pixel format
[12:31:35] [PASSED] ABGR8888 Width 0
[12:31:35] [PASSED] ABGR8888 Height 0
[12:31:35] [PASSED] ABGR8888 Out of bound height * pitch combination
[12:31:35] [PASSED] ABGR8888 Large buffer offset
[12:31:35] [PASSED] ABGR8888 Buffer offset for inexistent plane
[12:31:35] [PASSED] ABGR8888 Invalid flag
[12:31:35] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[12:31:35] [PASSED] ABGR8888 Valid buffer modifier
[12:31:35] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[12:31:35] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] NV12 Normal sizes
[12:31:35] [PASSED] NV12 Max sizes
[12:31:35] [PASSED] NV12 Invalid pitch
[12:31:35] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[12:31:35] [PASSED] NV12 different  modifier per-plane
[12:31:35] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[12:31:35] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] NV12 Modifier for inexistent plane
[12:31:35] [PASSED] NV12 Handle for inexistent plane
[12:31:35] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[12:31:35] [PASSED] YVU420 Normal sizes
[12:31:35] [PASSED] YVU420 Max sizes
[12:31:35] [PASSED] YVU420 Invalid pitch
[12:31:35] [PASSED] YVU420 Different pitches
[12:31:35] [PASSED] YVU420 Different buffer offsets/pitches
[12:31:35] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[12:31:35] [PASSED] YVU420 Valid modifier
[12:31:35] [PASSED] YVU420 Different modifiers per plane
[12:31:35] [PASSED] YVU420 Modifier for inexistent plane
[12:31:35] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[12:31:35] [PASSED] X0L2 Normal sizes
[12:31:35] [PASSED] X0L2 Max sizes
[12:31:35] [PASSED] X0L2 Invalid pitch
[12:31:35] [PASSED] X0L2 Pitch greater than minimum required
[12:31:35] [PASSED] X0L2 Handle for inexistent plane
[12:31:35] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[12:31:35] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[12:31:35] [PASSED] X0L2 Valid modifier
[12:31:35] [PASSED] X0L2 Modifier for inexistent plane
[12:31:35] =========== [PASSED] drm_test_framebuffer_create ===========
[12:31:35] [PASSED] drm_test_framebuffer_free
[12:31:35] [PASSED] drm_test_framebuffer_init
[12:31:35] [PASSED] drm_test_framebuffer_init_bad_format
[12:31:35] [PASSED] drm_test_framebuffer_init_dev_mismatch
[12:31:35] [PASSED] drm_test_framebuffer_lookup
[12:31:35] [PASSED] drm_test_framebuffer_lookup_inexistent
[12:31:35] [PASSED] drm_test_framebuffer_modifiers_not_supported
[12:31:35] ================= [PASSED] drm_framebuffer =================
[12:31:35] ================ drm_gem_shmem (8 subtests) ================
[12:31:35] [PASSED] drm_gem_shmem_test_obj_create
[12:31:35] [PASSED] drm_gem_shmem_test_obj_create_private
[12:31:35] [PASSED] drm_gem_shmem_test_pin_pages
[12:31:35] [PASSED] drm_gem_shmem_test_vmap
[12:31:35] [PASSED] drm_gem_shmem_test_get_pages_sgt
[12:31:35] [PASSED] drm_gem_shmem_test_get_sg_table
[12:31:35] [PASSED] drm_gem_shmem_test_madvise
[12:31:35] [PASSED] drm_gem_shmem_test_purge
[12:31:35] ================== [PASSED] drm_gem_shmem ==================
[12:31:35] === drm_atomic_helper_connector_hdmi_check (27 subtests) ===
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[12:31:35] ====== drm_test_check_broadcast_rgb_cea_mode_yuv420  =======
[12:31:35] [PASSED] Automatic
[12:31:35] [PASSED] Full
[12:31:35] [PASSED] Limited 16:235
[12:31:35] == [PASSED] drm_test_check_broadcast_rgb_cea_mode_yuv420 ===
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[12:31:35] [PASSED] drm_test_check_disable_connector
[12:31:35] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[12:31:35] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_rgb
[12:31:35] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_yuv420
[12:31:35] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422
[12:31:35] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420
[12:31:35] [PASSED] drm_test_check_driver_unsupported_fallback_yuv420
[12:31:35] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[12:31:35] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[12:31:35] [PASSED] drm_test_check_output_bpc_dvi
[12:31:35] [PASSED] drm_test_check_output_bpc_format_vic_1
[12:31:35] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[12:31:35] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[12:31:35] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[12:31:35] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[12:31:35] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[12:31:35] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[12:31:35] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[12:31:35] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[12:31:35] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[12:31:35] [PASSED] drm_test_check_broadcast_rgb_value
[12:31:35] [PASSED] drm_test_check_bpc_8_value
[12:31:35] [PASSED] drm_test_check_bpc_10_value
[12:31:35] [PASSED] drm_test_check_bpc_12_value
[12:31:35] [PASSED] drm_test_check_format_value
[12:31:35] [PASSED] drm_test_check_tmds_char_value
[12:31:35] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[12:31:35] = drm_atomic_helper_connector_hdmi_mode_valid (4 subtests) =
[12:31:35] [PASSED] drm_test_check_mode_valid
[12:31:35] [PASSED] drm_test_check_mode_valid_reject
[12:31:35] [PASSED] drm_test_check_mode_valid_reject_rate
[12:31:35] [PASSED] drm_test_check_mode_valid_reject_max_clock
[12:31:35] === [PASSED] drm_atomic_helper_connector_hdmi_mode_valid ===
[12:31:35] ================= drm_managed (2 subtests) =================
[12:31:35] [PASSED] drm_test_managed_release_action
[12:31:35] [PASSED] drm_test_managed_run_action
[12:31:35] =================== [PASSED] drm_managed ===================
[12:31:35] =================== drm_mm (6 subtests) ====================
[12:31:35] [PASSED] drm_test_mm_init
[12:31:35] [PASSED] drm_test_mm_debug
[12:31:35] [PASSED] drm_test_mm_align32
[12:31:35] [PASSED] drm_test_mm_align64
[12:31:35] [PASSED] drm_test_mm_lowest
[12:31:35] [PASSED] drm_test_mm_highest
[12:31:35] ===================== [PASSED] drm_mm ======================
[12:31:35] ============= drm_modes_analog_tv (5 subtests) =============
[12:31:35] [PASSED] drm_test_modes_analog_tv_mono_576i
[12:31:35] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[12:31:35] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[12:31:35] [PASSED] drm_test_modes_analog_tv_pal_576i
[12:31:35] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[12:31:35] =============== [PASSED] drm_modes_analog_tv ===============
[12:31:35] ============== drm_plane_helper (2 subtests) ===============
[12:31:35] =============== drm_test_check_plane_state  ================
[12:31:35] [PASSED] clipping_simple
[12:31:35] [PASSED] clipping_rotate_reflect
[12:31:35] [PASSED] positioning_simple
[12:31:35] [PASSED] upscaling
[12:31:35] [PASSED] downscaling
[12:31:35] [PASSED] rounding1
[12:31:35] [PASSED] rounding2
[12:31:35] [PASSED] rounding3
[12:31:35] [PASSED] rounding4
[12:31:35] =========== [PASSED] drm_test_check_plane_state ============
[12:31:35] =========== drm_test_check_invalid_plane_state  ============
[12:31:35] [PASSED] positioning_invalid
[12:31:35] [PASSED] upscaling_invalid
[12:31:35] [PASSED] downscaling_invalid
[12:31:35] ======= [PASSED] drm_test_check_invalid_plane_state ========
[12:31:35] ================ [PASSED] drm_plane_helper =================
[12:31:35] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[12:31:35] ====== drm_test_connector_helper_tv_get_modes_check  =======
[12:31:35] [PASSED] None
[12:31:35] [PASSED] PAL
[12:31:35] [PASSED] NTSC
[12:31:35] [PASSED] Both, NTSC Default
[12:31:35] [PASSED] Both, PAL Default
[12:31:35] [PASSED] Both, NTSC Default, with PAL on command-line
[12:31:35] [PASSED] Both, PAL Default, with NTSC on command-line
[12:31:35] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[12:31:35] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[12:31:35] ================== drm_rect (9 subtests) ===================
[12:31:35] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[12:31:35] [PASSED] drm_test_rect_clip_scaled_not_clipped
[12:31:35] [PASSED] drm_test_rect_clip_scaled_clipped
[12:31:35] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[12:31:35] ================= drm_test_rect_intersect  =================
[12:31:35] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[12:31:35] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[12:31:35] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[12:31:35] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[12:31:35] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[12:31:35] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[12:31:35] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[12:31:35] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[12:31:35] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[12:31:35] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[12:31:35] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[12:31:35] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[12:31:35] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[12:31:35] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[12:31:35] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[12:31:35] ============= [PASSED] drm_test_rect_intersect =============
[12:31:35] ================ drm_test_rect_calc_hscale  ================
[12:31:35] [PASSED] normal use
[12:31:35] [PASSED] out of max range
[12:31:35] [PASSED] out of min range
[12:31:35] [PASSED] zero dst
[12:31:35] [PASSED] negative src
[12:31:35] [PASSED] negative dst
[12:31:35] ============ [PASSED] drm_test_rect_calc_hscale ============
[12:31:35] ================ drm_test_rect_calc_vscale  ================
[12:31:35] [PASSED] normal use
[12:31:35] [PASSED] out of max range
[12:31:35] [PASSED] out of min range
[12:31:35] [PASSED] zero dst
[12:31:35] [PASSED] negative src
[12:31:35] [PASSED] negative dst
[12:31:35] ============ [PASSED] drm_test_rect_calc_vscale ============
[12:31:35] ================== drm_test_rect_rotate  ===================
[12:31:35] [PASSED] reflect-x
[12:31:35] [PASSED] reflect-y
[12:31:35] [PASSED] rotate-0
[12:31:35] [PASSED] rotate-90
[12:31:35] [PASSED] rotate-180
[12:31:35] [PASSED] rotate-270
[12:31:35] ============== [PASSED] drm_test_rect_rotate ===============
[12:31:35] ================ drm_test_rect_rotate_inv  =================
[12:31:35] [PASSED] reflect-x
[12:31:35] [PASSED] reflect-y
[12:31:35] [PASSED] rotate-0
[12:31:35] [PASSED] rotate-90
[12:31:35] [PASSED] rotate-180
[12:31:35] [PASSED] rotate-270
[12:31:35] ============ [PASSED] drm_test_rect_rotate_inv =============
[12:31:35] ==================== [PASSED] drm_rect =====================
[12:31:35] ============ drm_sysfb_modeset_test (1 subtest) ============
[12:31:35] ============ drm_test_sysfb_build_fourcc_list  =============
[12:31:35] [PASSED] no native formats
[12:31:35] [PASSED] XRGB8888 as native format
[12:31:35] [PASSED] remove duplicates
[12:31:35] [PASSED] convert alpha formats
[12:31:35] [PASSED] random formats
[12:31:35] ======== [PASSED] drm_test_sysfb_build_fourcc_list =========
[12:31:35] ============= [PASSED] drm_sysfb_modeset_test ==============
[12:31:35] ============================================================
[12:31:35] Testing complete. Ran 621 tests: passed: 621
[12:31:35] Elapsed time: 25.346s total, 1.703s configuring, 23.425s building, 0.194s running
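For local reproduction, the same DRM KUnit suite can be run from any kernel checkout with the in-tree kunit.py wrapper; the /kernel prefix in the trace above appears to be just the CI container's source path, so a relative path works equally well. A minimal sketch, assuming a recent tree where kunit.py accepts --kunitconfig, --arch and --jobs:

  $ ./tools/testing/kunit/kunit.py run \
        --kunitconfig=drivers/gpu/drm/tests/.kunitconfig \
        --arch=um --jobs=$(nproc)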

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[12:31:35] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[12:31:37] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[12:31:45] Starting KUnit Kernel (1/1)...
[12:31:45] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[12:31:46] ================= ttm_device (5 subtests) ==================
[12:31:46] [PASSED] ttm_device_init_basic
[12:31:46] [PASSED] ttm_device_init_multiple
[12:31:46] [PASSED] ttm_device_fini_basic
[12:31:46] [PASSED] ttm_device_init_no_vma_man
[12:31:46] ================== ttm_device_init_pools  ==================
[12:31:46] [PASSED] No DMA allocations, no DMA32 required
[12:31:46] [PASSED] DMA allocations, DMA32 required
[12:31:46] [PASSED] No DMA allocations, DMA32 required
[12:31:46] [PASSED] DMA allocations, no DMA32 required
[12:31:46] ============== [PASSED] ttm_device_init_pools ==============
[12:31:46] =================== [PASSED] ttm_device ====================
[12:31:46] ================== ttm_pool (8 subtests) ===================
[12:31:46] ================== ttm_pool_alloc_basic  ===================
[12:31:46] [PASSED] One page
[12:31:46] [PASSED] More than one page
[12:31:46] [PASSED] Above the allocation limit
[12:31:46] [PASSED] One page, with coherent DMA mappings enabled
[12:31:46] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[12:31:46] ============== [PASSED] ttm_pool_alloc_basic ===============
[12:31:46] ============== ttm_pool_alloc_basic_dma_addr  ==============
[12:31:46] [PASSED] One page
[12:31:46] [PASSED] More than one page
[12:31:46] [PASSED] Above the allocation limit
[12:31:46] [PASSED] One page, with coherent DMA mappings enabled
[12:31:46] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[12:31:46] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[12:31:46] [PASSED] ttm_pool_alloc_order_caching_match
[12:31:46] [PASSED] ttm_pool_alloc_caching_mismatch
[12:31:46] [PASSED] ttm_pool_alloc_order_mismatch
[12:31:46] [PASSED] ttm_pool_free_dma_alloc
[12:31:46] [PASSED] ttm_pool_free_no_dma_alloc
[12:31:46] [PASSED] ttm_pool_fini_basic
[12:31:46] ==================== [PASSED] ttm_pool =====================
[12:31:46] ================ ttm_resource (8 subtests) =================
[12:31:46] ================= ttm_resource_init_basic  =================
[12:31:46] [PASSED] Init resource in TTM_PL_SYSTEM
[12:31:46] [PASSED] Init resource in TTM_PL_VRAM
[12:31:46] [PASSED] Init resource in a private placement
[12:31:46] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[12:31:46] ============= [PASSED] ttm_resource_init_basic =============
[12:31:46] [PASSED] ttm_resource_init_pinned
[12:31:46] [PASSED] ttm_resource_fini_basic
[12:31:46] [PASSED] ttm_resource_manager_init_basic
[12:31:46] [PASSED] ttm_resource_manager_usage_basic
[12:31:46] [PASSED] ttm_resource_manager_set_used_basic
[12:31:46] [PASSED] ttm_sys_man_alloc_basic
[12:31:46] [PASSED] ttm_sys_man_free_basic
[12:31:46] ================== [PASSED] ttm_resource ===================
[12:31:46] =================== ttm_tt (15 subtests) ===================
[12:31:46] ==================== ttm_tt_init_basic  ====================
[12:31:46] [PASSED] Page-aligned size
[12:31:46] [PASSED] Extra pages requested
[12:31:46] ================ [PASSED] ttm_tt_init_basic ================
[12:31:46] [PASSED] ttm_tt_init_misaligned
[12:31:46] [PASSED] ttm_tt_fini_basic
[12:31:46] [PASSED] ttm_tt_fini_sg
[12:31:46] [PASSED] ttm_tt_fini_shmem
[12:31:46] [PASSED] ttm_tt_create_basic
[12:31:46] [PASSED] ttm_tt_create_invalid_bo_type
[12:31:46] [PASSED] ttm_tt_create_ttm_exists
[12:31:46] [PASSED] ttm_tt_create_failed
[12:31:46] [PASSED] ttm_tt_destroy_basic
[12:31:46] [PASSED] ttm_tt_populate_null_ttm
[12:31:46] [PASSED] ttm_tt_populate_populated_ttm
[12:31:46] [PASSED] ttm_tt_unpopulate_basic
[12:31:46] [PASSED] ttm_tt_unpopulate_empty_ttm
[12:31:46] [PASSED] ttm_tt_swapin_basic
[12:31:46] ===================== [PASSED] ttm_tt ======================
[12:31:46] =================== ttm_bo (14 subtests) ===================
[12:31:46] =========== ttm_bo_reserve_optimistic_no_ticket  ===========
[12:31:46] [PASSED] Cannot be interrupted and sleeps
[12:31:46] [PASSED] Cannot be interrupted, locks straight away
[12:31:46] [PASSED] Can be interrupted, sleeps
[12:31:46] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[12:31:46] [PASSED] ttm_bo_reserve_locked_no_sleep
[12:31:46] [PASSED] ttm_bo_reserve_no_wait_ticket
[12:31:46] [PASSED] ttm_bo_reserve_double_resv
[12:31:46] [PASSED] ttm_bo_reserve_interrupted
[12:31:46] [PASSED] ttm_bo_reserve_deadlock
[12:31:46] [PASSED] ttm_bo_unreserve_basic
[12:31:46] [PASSED] ttm_bo_unreserve_pinned
[12:31:46] [PASSED] ttm_bo_unreserve_bulk
[12:31:46] [PASSED] ttm_bo_fini_basic
[12:31:46] [PASSED] ttm_bo_fini_shared_resv
[12:31:46] [PASSED] ttm_bo_pin_basic
[12:31:46] [PASSED] ttm_bo_pin_unpin_resource
[12:31:46] [PASSED] ttm_bo_multiple_pin_one_unpin
[12:31:46] ===================== [PASSED] ttm_bo ======================
[12:31:46] ============== ttm_bo_validate (21 subtests) ===============
[12:31:46] ============== ttm_bo_init_reserved_sys_man  ===============
[12:31:46] [PASSED] Buffer object for userspace
[12:31:46] [PASSED] Kernel buffer object
[12:31:46] [PASSED] Shared buffer object
[12:31:46] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[12:31:46] ============== ttm_bo_init_reserved_mock_man  ==============
[12:31:46] [PASSED] Buffer object for userspace
[12:31:46] [PASSED] Kernel buffer object
[12:31:46] [PASSED] Shared buffer object
[12:31:46] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[12:31:46] [PASSED] ttm_bo_init_reserved_resv
[12:31:46] ================== ttm_bo_validate_basic  ==================
[12:31:46] [PASSED] Buffer object for userspace
[12:31:46] [PASSED] Kernel buffer object
[12:31:46] [PASSED] Shared buffer object
[12:31:46] ============== [PASSED] ttm_bo_validate_basic ==============
[12:31:46] [PASSED] ttm_bo_validate_invalid_placement
[12:31:46] ============= ttm_bo_validate_same_placement  ==============
[12:31:46] [PASSED] System manager
[12:31:46] [PASSED] VRAM manager
[12:31:46] ========= [PASSED] ttm_bo_validate_same_placement ==========
[12:31:46] [PASSED] ttm_bo_validate_failed_alloc
[12:31:46] [PASSED] ttm_bo_validate_pinned
[12:31:46] [PASSED] ttm_bo_validate_busy_placement
[12:31:46] ================ ttm_bo_validate_multihop  =================
[12:31:46] [PASSED] Buffer object for userspace
[12:31:46] [PASSED] Kernel buffer object
[12:31:46] [PASSED] Shared buffer object
[12:31:46] ============ [PASSED] ttm_bo_validate_multihop =============
[12:31:46] ========== ttm_bo_validate_no_placement_signaled  ==========
[12:31:46] [PASSED] Buffer object in system domain, no page vector
[12:31:46] [PASSED] Buffer object in system domain with an existing page vector
[12:31:46] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[12:31:46] ======== ttm_bo_validate_no_placement_not_signaled  ========
[12:31:46] [PASSED] Buffer object for userspace
[12:31:46] [PASSED] Kernel buffer object
[12:31:46] [PASSED] Shared buffer object
[12:31:46] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[12:31:46] [PASSED] ttm_bo_validate_move_fence_signaled
[12:31:46] ========= ttm_bo_validate_move_fence_not_signaled  =========
[12:31:46] [PASSED] Waits for GPU
[12:31:46] [PASSED] Tries to lock straight away
[12:31:46] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[12:31:46] [PASSED] ttm_bo_validate_happy_evict
[12:31:46] [PASSED] ttm_bo_validate_all_pinned_evict
[12:31:46] [PASSED] ttm_bo_validate_allowed_only_evict
[12:31:46] [PASSED] ttm_bo_validate_deleted_evict
[12:31:46] [PASSED] ttm_bo_validate_busy_domain_evict
[12:31:46] [PASSED] ttm_bo_validate_evict_gutting
[12:31:46] [PASSED] ttm_bo_validate_recrusive_evict
[12:31:46] ================= [PASSED] ttm_bo_validate =================
[12:31:46] ============================================================
[12:31:46] Testing complete. Ran 101 tests: passed: 101
[12:31:46] Elapsed time: 10.887s total, 1.708s configuring, 8.962s building, 0.172s running

+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
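The second invocation above exercises the TTM KUnit suite in the same way, pointing --kunitconfig at drivers/gpu/drm/ttm/tests/.kunitconfig instead. As a rough sketch of what such a config fragment typically enables (exact contents may differ per tree, and the Kconfig symbol names below are the usual DRM/TTM KUnit ones, not quoted from this series):

  CONFIG_KUNIT=y
  CONFIG_DRM=y
  CONFIG_DRM_TTM_KUNIT_TEST=y

kunit.py merges the fragment into .kunit/.config via "make ARCH=um O=.kunit olddefconfig", which is the "Populating config with" step visible in both traces.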



^ permalink raw reply	[flat|nested] 31+ messages in thread

* ✓ Xe.CI.BAT: success for Intel Xe GPU Debug Support (eudebug) v5
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (21 preceding siblings ...)
  2025-10-06 12:31 ` ✓ CI.KUnit: success " Patchwork
@ 2025-10-06 13:14 ` Patchwork
  2025-10-06 15:53 ` ✗ Xe.CI.Full: failure " Patchwork
  23 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-06 13:14 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-xe

[-- Attachment #1: Type: text/plain, Size: 4685 bytes --]

== Series Details ==

Series: Intel Xe GPU Debug Support (eudebug) v5
URL   : https://patchwork.freedesktop.org/series/155452/
State : success

== Summary ==

CI Bug Log - changes from xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d_BAT -> xe-pw-155452v1_BAT
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (10 -> 11)
------------------------------

  Additional (1): bat-adlp-7 

Known issues
------------

  Here are the changes found in xe-pw-155452v1_BAT that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_dsc@dsc-basic:
    - bat-adlp-7:         NOTRUN -> [SKIP][1] ([Intel XE#455])
   [1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@kms_dsc@dsc-basic.html

  * igt@xe_evict@evict-beng-small:
    - bat-adlp-7:         NOTRUN -> [SKIP][2] ([Intel XE#261] / [Intel XE#5564] / [Intel XE#688]) +9 other tests skip
   [2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_evict@evict-beng-small.html

  * igt@xe_evict_ccs@evict-overcommit-parallel-nofree-samefd:
    - bat-adlp-7:         NOTRUN -> [SKIP][3] ([Intel XE#688]) +1 other test skip
   [3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_evict_ccs@evict-overcommit-parallel-nofree-samefd.html

  * igt@xe_exec_fault_mode@twice-userptr-invalidate-prefetch:
    - bat-adlp-7:         NOTRUN -> [SKIP][4] ([Intel XE#288] / [Intel XE#5561]) +32 other tests skip
   [4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_exec_fault_mode@twice-userptr-invalidate-prefetch.html

  * igt@xe_live_ktest@xe_bo@xe_bo_evict_kunit:
    - bat-adlp-7:         NOTRUN -> [SKIP][5] ([Intel XE#2229] / [Intel XE#455]) +2 other tests skip
   [5]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_live_ktest@xe_bo@xe_bo_evict_kunit.html

  * igt@xe_live_ktest@xe_migrate@xe_validate_ccs_kunit:
    - bat-adlp-7:         NOTRUN -> [SKIP][6] ([Intel XE#2229] / [Intel XE#5488])
   [6]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_live_ktest@xe_migrate@xe_validate_ccs_kunit.html

  * igt@xe_mmap@vram:
    - bat-adlp-7:         NOTRUN -> [SKIP][7] ([Intel XE#1008] / [Intel XE#5591])
   [7]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_mmap@vram.html

  * igt@xe_pat@pat-index-xe2:
    - bat-adlp-7:         NOTRUN -> [SKIP][8] ([Intel XE#977])
   [8]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_pat@pat-index-xe2.html

  * igt@xe_pat@pat-index-xehpc:
    - bat-adlp-7:         NOTRUN -> [SKIP][9] ([Intel XE#2838] / [Intel XE#979])
   [9]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_pat@pat-index-xehpc.html

  * igt@xe_pat@pat-index-xelpg:
    - bat-adlp-7:         NOTRUN -> [SKIP][10] ([Intel XE#979])
   [10]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/bat-adlp-7/igt@xe_pat@pat-index-xelpg.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [Intel XE#1008]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1008
  [Intel XE#2229]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2229
  [Intel XE#261]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/261
  [Intel XE#2838]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2838
  [Intel XE#288]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/288
  [Intel XE#455]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/455
  [Intel XE#5488]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5488
  [Intel XE#5561]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5561
  [Intel XE#5564]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5564
  [Intel XE#5591]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5591
  [Intel XE#6287]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6287
  [Intel XE#688]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/688
  [Intel XE#977]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/977
  [Intel XE#979]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/979


Build changes
-------------

  * Linux: xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d -> xe-pw-155452v1

  IGT_8574: 44a15713124663a622c6eddf7c6ee5ba732e0d41 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d: dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d
  xe-pw-155452v1: 155452v1

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/index.html

[-- Attachment #2: Type: text/html, Size: 5664 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* ✗ Xe.CI.Full: failure for Intel Xe GPU Debug Support (eudebug) v5
  2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
                   ` (22 preceding siblings ...)
  2025-10-06 13:14 ` ✓ Xe.CI.BAT: " Patchwork
@ 2025-10-06 15:53 ` Patchwork
  23 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-06 15:53 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-xe

[-- Attachment #1: Type: text/plain, Size: 65459 bytes --]

== Series Details ==

Series: Intel Xe GPU Debug Support (eudebug) v5
URL   : https://patchwork.freedesktop.org/series/155452/
State : failure

== Summary ==

CI Bug Log - changes from xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d_FULL -> xe-pw-155452v1_FULL
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with xe-pw-155452v1_FULL absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in xe-pw-155452v1_FULL, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (4 -> 4)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in xe-pw-155452v1_FULL:

### IGT changes ###

#### Possible regressions ####

  * igt@sriov_basic@enable-vfs-autoprobe-off:
    - shard-adlp:         [PASS][1] -> [SKIP][2] +1 other test skip
   [1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@sriov_basic@enable-vfs-autoprobe-off.html
   [2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-3/igt@sriov_basic@enable-vfs-autoprobe-off.html

  * igt@xe_eudebug_online@basic-breakpoint@drm_xe_engine_class_render0:
    - shard-lnl:          NOTRUN -> [FAIL][3] +25 other tests fail
   [3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-2/igt@xe_eudebug_online@basic-breakpoint@drm_xe_engine_class_render0.html

  * igt@xe_eudebug_online@breakpoint-not-in-debug-mode@drm_xe_engine_class_render0:
    - shard-bmg:          NOTRUN -> [FAIL][4] +31 other tests fail
   [4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-1/igt@xe_eudebug_online@breakpoint-not-in-debug-mode@drm_xe_engine_class_render0.html

  * igt@xe_eudebug_online@debugger-reopen@drm_xe_engine_class_render0:
    - shard-dg2-set2:     NOTRUN -> [FAIL][5] +32 other tests fail
   [5]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@xe_eudebug_online@debugger-reopen@drm_xe_engine_class_render0.html

  * igt@xe_eudebug_online@preempt-breakpoint@drm_xe_engine_class_render0:
    - shard-adlp:         NOTRUN -> [FAIL][6] +18 other tests fail
   [6]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-1/igt@xe_eudebug_online@preempt-breakpoint@drm_xe_engine_class_render0.html

  * igt@xe_fault_injection@inject-fault-probe-function-xe_wopcm_init:
    - shard-dg2-set2:     NOTRUN -> [DMESG-WARN][7]
   [7]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@xe_fault_injection@inject-fault-probe-function-xe_wopcm_init.html

  * igt@xe_oa@syncs-ufence-wait-cfg@ccs-0:
    - shard-bmg:          NOTRUN -> [ABORT][8] +2 other tests abort
   [8]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@xe_oa@syncs-ufence-wait-cfg@ccs-0.html

  * igt@xe_oa@syncs-ufence-wait@ccs-0:
    - shard-bmg:          [PASS][9] -> [ABORT][10] +4 other tests abort
   [9]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-2/igt@xe_oa@syncs-ufence-wait@ccs-0.html
   [10]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@xe_oa@syncs-ufence-wait@ccs-0.html

  * igt@xe_oa@syncs-ufence-wait@rcs-0:
    - shard-lnl:          NOTRUN -> [ABORT][11]
   [11]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-4/igt@xe_oa@syncs-ufence-wait@rcs-0.html

  * igt@xe_oa@syncs-userptr-wait-cfg:
    - shard-lnl:          [PASS][12] -> [ABORT][13] +6 other tests abort
   [12]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-7/igt@xe_oa@syncs-userptr-wait-cfg.html
   [13]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-3/igt@xe_oa@syncs-userptr-wait-cfg.html

  
#### Warnings ####

  * igt@xe_eudebug@basic-close:
    - shard-dg2-set2:     [SKIP][14] ([Intel XE#4837]) -> [FAIL][15] +53 other tests fail
   [14]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-463/igt@xe_eudebug@basic-close.html
   [15]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@xe_eudebug@basic-close.html

  * igt@xe_eudebug@basic-connect:
    - shard-lnl:          [SKIP][16] ([Intel XE#4837]) -> [FAIL][17] +57 other tests fail
   [16]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-3/igt@xe_eudebug@basic-connect.html
   [17]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-4/igt@xe_eudebug@basic-connect.html

  * igt@xe_eudebug@basic-vm-access-faultable:
    - shard-adlp:         [SKIP][18] ([Intel XE#4837] / [Intel XE#5565]) -> [SKIP][19] +6 other tests skip
   [18]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-1/igt@xe_eudebug@basic-vm-access-faultable.html
   [19]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-8/igt@xe_eudebug@basic-vm-access-faultable.html

  * igt@xe_eudebug@basic-vm-bind-metadata-discovery:
    - shard-bmg:          [SKIP][20] ([Intel XE#4837]) -> [FAIL][21] +52 other tests fail
   [20]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@xe_eudebug@basic-vm-bind-metadata-discovery.html
   [21]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@xe_eudebug@basic-vm-bind-metadata-discovery.html

  * igt@xe_eudebug@vm-bind-clear-faultable:
    - shard-dg2-set2:     [SKIP][22] ([Intel XE#4837]) -> [SKIP][23] +6 other tests skip
   [22]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-433/igt@xe_eudebug@vm-bind-clear-faultable.html
   [23]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-434/igt@xe_eudebug@vm-bind-clear-faultable.html

  * igt@xe_eudebug_online@set-breakpoint-sigint-debugger:
    - shard-adlp:         [SKIP][24] ([Intel XE#4837] / [Intel XE#5565]) -> [FAIL][25] +48 other tests fail
   [24]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-3/igt@xe_eudebug_online@set-breakpoint-sigint-debugger.html
   [25]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-4/igt@xe_eudebug_online@set-breakpoint-sigint-debugger.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@xe_eudebug_online@set-breakpoint-faultable@drm_xe_engine_class_compute0}:
    - shard-lnl:          NOTRUN -> [FAIL][26] +2 other tests fail
   [26]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-7/igt@xe_eudebug_online@set-breakpoint-faultable@drm_xe_engine_class_compute0.html

  * {igt@xe_eudebug_online@set-breakpoint-sigint-debugger@drm_xe_engine_class_compute0}:
    - shard-bmg:          NOTRUN -> [FAIL][27] +2 other tests fail
   [27]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@xe_eudebug_online@set-breakpoint-sigint-debugger@drm_xe_engine_class_compute0.html
    - shard-dg2-set2:     NOTRUN -> [FAIL][28]
   [28]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-434/igt@xe_eudebug_online@set-breakpoint-sigint-debugger@drm_xe_engine_class_compute0.html

  * {igt@xe_eudebug_online@set-breakpoint-sigint-debugger@drm_xe_engine_class_render0}:
    - shard-adlp:         NOTRUN -> [FAIL][29]
   [29]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-4/igt@xe_eudebug_online@set-breakpoint-sigint-debugger@drm_xe_engine_class_render0.html

  * {igt@xe_fault_injection@exec-queue-create-fail-xe_pxp_exec_queue_add}:
    - shard-dg2-set2:     [SKIP][30] ([Intel XE#6281]) -> [SKIP][31]
   [30]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-466/igt@xe_fault_injection@exec-queue-create-fail-xe_pxp_exec_queue_add.html
   [31]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@xe_fault_injection@exec-queue-create-fail-xe_pxp_exec_queue_add.html

  * {igt@xe_noexec_ping_pong@basic}:
    - shard-bmg:          NOTRUN -> [INCOMPLETE][32]
   [32]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@xe_noexec_ping_pong@basic.html

  * {igt@xe_pmu@engine-activity-accuracy-50}:
    - shard-adlp:         [PASS][33] -> [FAIL][34] +4 other tests fail
   [33]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@xe_pmu@engine-activity-accuracy-50.html
   [34]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-3/igt@xe_pmu@engine-activity-accuracy-50.html

  
New tests
---------

  New tests have been introduced between xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d_FULL and xe-pw-155452v1_FULL:

### New IGT tests (1) ###

  * igt@xe_eudebug_online@interrupt-other@drm_xe_engine_class_compute0:
    - Statuses : 1 fail(s)
    - Exec time: [0.09] s

  

Known issues
------------

  Here are the changes found in xe-pw-155452v1_FULL that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_addfb_basic@addfb25-y-tiled-small-legacy:
    - shard-bmg:          NOTRUN -> [SKIP][35] ([Intel XE#2233])
   [35]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_addfb_basic@addfb25-y-tiled-small-legacy.html
    - shard-dg2-set2:     NOTRUN -> [SKIP][36] ([Intel XE#623])
   [36]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_addfb_basic@addfb25-y-tiled-small-legacy.html

  * igt@kms_big_fb@x-tiled-8bpp-rotate-270:
    - shard-bmg:          NOTRUN -> [SKIP][37] ([Intel XE#2327])
   [37]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-1/igt@kms_big_fb@x-tiled-8bpp-rotate-270.html

  * igt@kms_big_fb@y-tiled-16bpp-rotate-0:
    - shard-bmg:          NOTRUN -> [SKIP][38] ([Intel XE#1124]) +4 other tests skip
   [38]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_big_fb@y-tiled-16bpp-rotate-0.html

  * igt@kms_big_fb@yf-tiled-32bpp-rotate-180:
    - shard-dg2-set2:     NOTRUN -> [SKIP][39] ([Intel XE#1124]) +4 other tests skip
   [39]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_big_fb@yf-tiled-32bpp-rotate-180.html

  * igt@kms_big_fb@yf-tiled-addfb:
    - shard-bmg:          NOTRUN -> [SKIP][40] ([Intel XE#2328])
   [40]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_big_fb@yf-tiled-addfb.html

  * igt@kms_bw@connected-linear-tiling-3-displays-3840x2160p:
    - shard-bmg:          NOTRUN -> [SKIP][41] ([Intel XE#2314] / [Intel XE#2894])
   [41]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_bw@connected-linear-tiling-3-displays-3840x2160p.html
    - shard-dg2-set2:     NOTRUN -> [SKIP][42] ([Intel XE#2191])
   [42]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_bw@connected-linear-tiling-3-displays-3840x2160p.html

  * igt@kms_bw@linear-tiling-2-displays-1920x1080p:
    - shard-dg2-set2:     NOTRUN -> [SKIP][43] ([Intel XE#367])
   [43]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_bw@linear-tiling-2-displays-1920x1080p.html

  * igt@kms_bw@linear-tiling-2-displays-2560x1440p:
    - shard-bmg:          NOTRUN -> [SKIP][44] ([Intel XE#367]) +1 other test skip
   [44]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_bw@linear-tiling-2-displays-2560x1440p.html

  * igt@kms_ccs@bad-pixel-format-4-tiled-mtl-rc-ccs-cc@pipe-a-hdmi-a-6:
    - shard-dg2-set2:     NOTRUN -> [SKIP][45] ([Intel XE#787]) +125 other tests skip
   [45]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-464/igt@kms_ccs@bad-pixel-format-4-tiled-mtl-rc-ccs-cc@pipe-a-hdmi-a-6.html

  * igt@kms_ccs@bad-pixel-format-yf-tiled-ccs:
    - shard-dg2-set2:     NOTRUN -> [SKIP][46] ([Intel XE#455] / [Intel XE#787]) +22 other tests skip
   [46]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_ccs@bad-pixel-format-yf-tiled-ccs.html

  * igt@kms_ccs@crc-primary-basic-4-tiled-mtl-mc-ccs:
    - shard-bmg:          NOTRUN -> [SKIP][47] ([Intel XE#2887]) +8 other tests skip
   [47]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_ccs@crc-primary-basic-4-tiled-mtl-mc-ccs.html

  * igt@kms_ccs@crc-primary-rotation-180-4-tiled-bmg-ccs:
    - shard-dg2-set2:     NOTRUN -> [SKIP][48] ([Intel XE#2907])
   [48]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@kms_ccs@crc-primary-rotation-180-4-tiled-bmg-ccs.html

  * igt@kms_ccs@crc-primary-suspend-y-tiled-gen12-rc-ccs:
    - shard-bmg:          NOTRUN -> [SKIP][49] ([Intel XE#3432])
   [49]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_ccs@crc-primary-suspend-y-tiled-gen12-rc-ccs.html

  * igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-b-hdmi-a-6:
    - shard-dg2-set2:     [PASS][50] -> [INCOMPLETE][51] ([Intel XE#1727] / [Intel XE#3113] / [Intel XE#4345] / [Intel XE#6168]) +1 other test incomplete
   [50]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-436/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-b-hdmi-a-6.html
   [51]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-b-hdmi-a-6.html

  * igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-b-dp-4:
    - shard-dg2-set2:     NOTRUN -> [INCOMPLETE][52] ([Intel XE#6168])
   [52]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-435/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-b-dp-4.html

  * igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-b-hdmi-a-6:
    - shard-dg2-set2:     NOTRUN -> [DMESG-WARN][53] ([Intel XE#1727] / [Intel XE#3113])
   [53]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-435/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-b-hdmi-a-6.html

  * igt@kms_chamelium_color@ctm-negative:
    - shard-bmg:          NOTRUN -> [SKIP][54] ([Intel XE#2325])
   [54]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_chamelium_color@ctm-negative.html

  * igt@kms_chamelium_hpd@dp-hpd:
    - shard-bmg:          NOTRUN -> [SKIP][55] ([Intel XE#2252]) +3 other tests skip
   [55]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_chamelium_hpd@dp-hpd.html

  * igt@kms_chamelium_hpd@hdmi-hpd-for-each-pipe:
    - shard-dg2-set2:     NOTRUN -> [SKIP][56] ([Intel XE#373]) +2 other tests skip
   [56]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_chamelium_hpd@hdmi-hpd-for-each-pipe.html

  * igt@kms_content_protection@atomic-dpms@pipe-a-dp-2:
    - shard-bmg:          NOTRUN -> [FAIL][57] ([Intel XE#1178])
   [57]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_content_protection@atomic-dpms@pipe-a-dp-2.html

  * igt@kms_content_protection@atomic@pipe-a-dp-2:
    - shard-dg2-set2:     NOTRUN -> [FAIL][58] ([Intel XE#1178]) +1 other test fail
   [58]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_content_protection@atomic@pipe-a-dp-2.html

  * igt@kms_content_protection@dp-mst-lic-type-0:
    - shard-dg2-set2:     NOTRUN -> [SKIP][59] ([Intel XE#307])
   [59]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_content_protection@dp-mst-lic-type-0.html

  * igt@kms_content_protection@lic-type-0@pipe-a-dp-4:
    - shard-dg2-set2:     NOTRUN -> [FAIL][60] ([Intel XE#3304])
   [60]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-464/igt@kms_content_protection@lic-type-0@pipe-a-dp-4.html

  * igt@kms_content_protection@lic-type-1:
    - shard-bmg:          NOTRUN -> [SKIP][61] ([Intel XE#2341])
   [61]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_content_protection@lic-type-1.html

  * igt@kms_content_protection@uevent@pipe-a-dp-4:
    - shard-dg2-set2:     NOTRUN -> [FAIL][62] ([Intel XE#1188])
   [62]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-435/igt@kms_content_protection@uevent@pipe-a-dp-4.html

  * igt@kms_cursor_crc@cursor-sliding-32x10:
    - shard-bmg:          NOTRUN -> [SKIP][63] ([Intel XE#2320])
   [63]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_cursor_crc@cursor-sliding-32x10.html

  * igt@kms_cursor_crc@cursor-sliding-512x512:
    - shard-dg2-set2:     NOTRUN -> [SKIP][64] ([Intel XE#308])
   [64]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_cursor_crc@cursor-sliding-512x512.html

  * igt@kms_cursor_legacy@cursorb-vs-flipb-varying-size:
    - shard-bmg:          [PASS][65] -> [SKIP][66] ([Intel XE#2291])
   [65]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@kms_cursor_legacy@cursorb-vs-flipb-varying-size.html
   [66]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_cursor_legacy@cursorb-vs-flipb-varying-size.html

  * igt@kms_cursor_legacy@torture-move:
    - shard-dg2-set2:     NOTRUN -> [INCOMPLETE][67] ([Intel XE#3226]) +1 other test incomplete
   [67]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@kms_cursor_legacy@torture-move.html

  * igt@kms_dither@fb-8bpc-vs-panel-6bpc@pipe-a-hdmi-a-2:
    - shard-dg2-set2:     NOTRUN -> [SKIP][68] ([Intel XE#4494])
   [68]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_dither@fb-8bpc-vs-panel-6bpc@pipe-a-hdmi-a-2.html

  * igt@kms_dsc@dsc-with-output-formats-with-bpc:
    - shard-dg2-set2:     NOTRUN -> [SKIP][69] ([Intel XE#455]) +15 other tests skip
   [69]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_dsc@dsc-with-output-formats-with-bpc.html
    - shard-bmg:          NOTRUN -> [SKIP][70] ([Intel XE#2244])
   [70]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_dsc@dsc-with-output-formats-with-bpc.html

  * igt@kms_fbc_dirty_rect@fbc-dirty-rectangle-out-visible-area:
    - shard-bmg:          NOTRUN -> [SKIP][71] ([Intel XE#4422])
   [71]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_fbc_dirty_rect@fbc-dirty-rectangle-out-visible-area.html

  * igt@kms_flip@2x-flip-vs-panning-vs-hang:
    - shard-bmg:          [PASS][72] -> [SKIP][73] ([Intel XE#2316]) +1 other test skip
   [72]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@kms_flip@2x-flip-vs-panning-vs-hang.html
   [73]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_flip@2x-flip-vs-panning-vs-hang.html

  * igt@kms_flip@dpms-off-confusion@c-hdmi-a1:
    - shard-adlp:         [PASS][74] -> [DMESG-WARN][75] ([Intel XE#4543]) +8 other tests dmesg-warn
   [74]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@kms_flip@dpms-off-confusion@c-hdmi-a1.html
   [75]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-6/igt@kms_flip@dpms-off-confusion@c-hdmi-a1.html

  * igt@kms_flip@nonexisting-fb:
    - shard-adlp:         [PASS][76] -> [DMESG-WARN][77] ([Intel XE#2953] / [Intel XE#4173])
   [76]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-4/igt@kms_flip@nonexisting-fb.html
   [77]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-9/igt@kms_flip@nonexisting-fb.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling:
    - shard-bmg:          NOTRUN -> [SKIP][78] ([Intel XE#2293] / [Intel XE#2380]) +2 other tests skip
   [78]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-valid-mode:
    - shard-bmg:          NOTRUN -> [SKIP][79] ([Intel XE#2293]) +2 other tests skip
   [79]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-valid-mode.html

  * igt@kms_frontbuffer_tracking@drrs-1p-primscrn-cur-indfb-onoff:
    - shard-dg2-set2:     NOTRUN -> [SKIP][80] ([Intel XE#651]) +7 other tests skip
   [80]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_frontbuffer_tracking@drrs-1p-primscrn-cur-indfb-onoff.html

  * igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-indfb-draw-blt:
    - shard-bmg:          NOTRUN -> [SKIP][81] ([Intel XE#2311]) +8 other tests skip
   [81]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-indfb-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-indfb-draw-mmap-wc:
    - shard-bmg:          NOTRUN -> [SKIP][82] ([Intel XE#5390]) +5 other tests skip
   [82]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt:
    - shard-dg2-set2:     NOTRUN -> [SKIP][83] ([Intel XE#653]) +20 other tests skip
   [83]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-render:
    - shard-bmg:          NOTRUN -> [SKIP][84] ([Intel XE#2313]) +15 other tests skip
   [84]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-render.html

  * igt@kms_hdr@invalid-metadata-sizes:
    - shard-bmg:          [PASS][85] -> [SKIP][86] ([Intel XE#1503])
   [85]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@kms_hdr@invalid-metadata-sizes.html
   [86]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_hdr@invalid-metadata-sizes.html

  * igt@kms_joiner@basic-ultra-joiner:
    - shard-bmg:          NOTRUN -> [SKIP][87] ([Intel XE#2927])
   [87]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_joiner@basic-ultra-joiner.html
    - shard-dg2-set2:     NOTRUN -> [SKIP][88] ([Intel XE#2927])
   [88]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@kms_joiner@basic-ultra-joiner.html

  * igt@kms_joiner@invalid-modeset-force-big-joiner:
    - shard-adlp:         NOTRUN -> [SKIP][89] ([Intel XE#3012])
   [89]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-4/igt@kms_joiner@invalid-modeset-force-big-joiner.html
    - shard-bmg:          [PASS][90] -> [SKIP][91] ([Intel XE#3012])
   [90]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-7/igt@kms_joiner@invalid-modeset-force-big-joiner.html
   [91]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_joiner@invalid-modeset-force-big-joiner.html

  * igt@kms_multipipe_modeset@basic-max-pipe-crc-check:
    - shard-bmg:          NOTRUN -> [SKIP][92] ([Intel XE#2501])
   [92]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html
    - shard-dg2-set2:     NOTRUN -> [SKIP][93] ([Intel XE#356])
   [93]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html

  * igt@kms_plane_cursor@viewport@pipe-a-hdmi-a-6-size-64:
    - shard-dg2-set2:     NOTRUN -> [FAIL][94] ([Intel XE#616])
   [94]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-464/igt@kms_plane_cursor@viewport@pipe-a-hdmi-a-6-size-64.html

  * igt@kms_plane_multiple@2x-tiling-y:
    - shard-bmg:          NOTRUN -> [SKIP][95] ([Intel XE#5021])
   [95]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_plane_multiple@2x-tiling-y.html
    - shard-dg2-set2:     NOTRUN -> [SKIP][96] ([Intel XE#5021])
   [96]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@kms_plane_multiple@2x-tiling-y.html

  * igt@kms_pm_dc@deep-pkgc:
    - shard-bmg:          NOTRUN -> [SKIP][97] ([Intel XE#2505])
   [97]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-1/igt@kms_pm_dc@deep-pkgc.html

  * igt@kms_pm_rpm@modeset-lpsp-stress-no-wait:
    - shard-bmg:          NOTRUN -> [SKIP][98] ([Intel XE#1439] / [Intel XE#3141] / [Intel XE#836])
   [98]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_pm_rpm@modeset-lpsp-stress-no-wait.html

  * igt@kms_psr2_sf@fbc-psr2-cursor-plane-move-continuous-sf:
    - shard-dg2-set2:     NOTRUN -> [SKIP][99] ([Intel XE#1406] / [Intel XE#1489]) +1 other test skip
   [99]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@kms_psr2_sf@fbc-psr2-cursor-plane-move-continuous-sf.html

  * igt@kms_psr2_sf@psr2-primary-plane-update-sf-dmg-area-big-fb:
    - shard-bmg:          NOTRUN -> [SKIP][100] ([Intel XE#1406] / [Intel XE#1489]) +5 other tests skip
   [100]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-1/igt@kms_psr2_sf@psr2-primary-plane-update-sf-dmg-area-big-fb.html

  * igt@kms_psr@fbc-pr-no-drrs:
    - shard-bmg:          NOTRUN -> [SKIP][101] ([Intel XE#1406] / [Intel XE#2234] / [Intel XE#2850]) +7 other tests skip
   [101]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_psr@fbc-pr-no-drrs.html

  * igt@kms_psr@fbc-pr-primary-page-flip:
    - shard-adlp:         NOTRUN -> [SKIP][102] ([Intel XE#1406] / [Intel XE#2850] / [Intel XE#929])
   [102]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-4/igt@kms_psr@fbc-pr-primary-page-flip.html

  * igt@kms_psr@fbc-psr2-primary-render:
    - shard-dg2-set2:     NOTRUN -> [SKIP][103] ([Intel XE#1406] / [Intel XE#2850] / [Intel XE#929]) +2 other tests skip
   [103]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_psr@fbc-psr2-primary-render.html

  * igt@kms_psr_stress_test@invalidate-primary-flip-overlay:
    - shard-bmg:          NOTRUN -> [SKIP][104] ([Intel XE#1406] / [Intel XE#2414])
   [104]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_psr_stress_test@invalidate-primary-flip-overlay.html
    - shard-dg2-set2:     NOTRUN -> [SKIP][105] ([Intel XE#1406] / [Intel XE#2939])
   [105]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_psr_stress_test@invalidate-primary-flip-overlay.html

  * igt@kms_rotation_crc@multiplane-rotation:
    - shard-dg2-set2:     NOTRUN -> [INCOMPLETE][106] ([Intel XE#6171])
   [106]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@kms_rotation_crc@multiplane-rotation.html

  * igt@kms_rotation_crc@sprite-rotation-90:
    - shard-dg2-set2:     NOTRUN -> [SKIP][107] ([Intel XE#3414])
   [107]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@kms_rotation_crc@sprite-rotation-90.html

  * igt@xe_create@multigpu-create-massive-size:
    - shard-bmg:          NOTRUN -> [SKIP][108] ([Intel XE#2504])
   [108]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@xe_create@multigpu-create-massive-size.html

  * igt@xe_eudebug_online@writes-caching-sram-bb-sram-target-vram@drm_xe_engine_class_render0:
    - shard-adlp:         NOTRUN -> [SKIP][109] ([Intel XE#455]) +6 other tests skip
   [109]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-9/igt@xe_eudebug_online@writes-caching-sram-bb-sram-target-vram@drm_xe_engine_class_render0.html

  * igt@xe_eudebug_online@writes-caching-vram-bb-sram-target-vram@drm_xe_engine_class_compute0:
    - shard-lnl:          NOTRUN -> [SKIP][110] ([Intel XE#2825]) +5 other tests skip
   [110]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-7/igt@xe_eudebug_online@writes-caching-vram-bb-sram-target-vram@drm_xe_engine_class_compute0.html

  * igt@xe_exec_basic@multigpu-no-exec-basic-defer-bind:
    - shard-bmg:          NOTRUN -> [SKIP][111] ([Intel XE#2322]) +2 other tests skip
   [111]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@xe_exec_basic@multigpu-no-exec-basic-defer-bind.html

  * igt@xe_exec_basic@multigpu-no-exec-bindexecqueue:
    - shard-dg2-set2:     [PASS][112] -> [SKIP][113] ([Intel XE#1392]) +5 other tests skip
   [112]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-435/igt@xe_exec_basic@multigpu-no-exec-bindexecqueue.html
   [113]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@xe_exec_basic@multigpu-no-exec-bindexecqueue.html

  * igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-invalidate:
    - shard-dg2-set2:     NOTRUN -> [SKIP][114] ([Intel XE#1392])
   [114]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-invalidate.html

  * igt@xe_exec_fault_mode@many-execqueues-bindexecqueue-userptr-invalidate:
    - shard-dg2-set2:     NOTRUN -> [SKIP][115] ([Intel XE#288]) +11 other tests skip
   [115]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@xe_exec_fault_mode@many-execqueues-bindexecqueue-userptr-invalidate.html

  * igt@xe_exec_reset@parallel-gt-reset:
    - shard-adlp:         [PASS][116] -> [DMESG-WARN][117] ([Intel XE#3876])
   [116]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-2/igt@xe_exec_reset@parallel-gt-reset.html
   [117]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-9/igt@xe_exec_reset@parallel-gt-reset.html

  * igt@xe_exec_system_allocator@evict-malloc:
    - shard-bmg:          [PASS][118] -> [ABORT][119] ([Intel XE#3970])
   [118]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-4/igt@xe_exec_system_allocator@evict-malloc.html
   [119]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-2/igt@xe_exec_system_allocator@evict-malloc.html

  * igt@xe_exec_system_allocator@once-large-mmap-huge-nomemset:
    - shard-bmg:          NOTRUN -> [SKIP][120] ([Intel XE#4943]) +12 other tests skip
   [120]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@xe_exec_system_allocator@once-large-mmap-huge-nomemset.html

  * igt@xe_exec_system_allocator@threads-many-execqueues-new-busy:
    - shard-dg2-set2:     NOTRUN -> [INCOMPLETE][121] ([Intel XE#2594])
   [121]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@xe_exec_system_allocator@threads-many-execqueues-new-busy.html

  * igt@xe_exec_system_allocator@threads-shared-vm-many-stride-mmap-remap-eocheck:
    - shard-dg2-set2:     NOTRUN -> [SKIP][122] ([Intel XE#4915]) +95 other tests skip
   [122]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@xe_exec_system_allocator@threads-shared-vm-many-stride-mmap-remap-eocheck.html

  * igt@xe_exec_system_allocator@twice-malloc-race-nomemset:
    - shard-adlp:         NOTRUN -> [SKIP][123] ([Intel XE#4915]) +1 other test skip
   [123]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-4/igt@xe_exec_system_allocator@twice-malloc-race-nomemset.html

  * igt@xe_oa@disabled-read-error:
    - shard-dg2-set2:     NOTRUN -> [SKIP][124] ([Intel XE#3573]) +1 other test skip
   [124]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@xe_oa@disabled-read-error.html

  * igt@xe_oa@mmio-triggered-reports-read:
    - shard-dg2-set2:     NOTRUN -> [SKIP][125] ([Intel XE#6032])
   [125]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@xe_oa@mmio-triggered-reports-read.html

  * igt@xe_pm@d3hot-i2c:
    - shard-dg2-set2:     NOTRUN -> [SKIP][126] ([Intel XE#5742])
   [126]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@xe_pm@d3hot-i2c.html

  * igt@xe_pm@s2idle-vm-bind-prefetch:
    - shard-adlp:         [PASS][127] -> [DMESG-WARN][128] ([Intel XE#2953] / [Intel XE#4173] / [Intel XE#4504])
   [127]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-1/igt@xe_pm@s2idle-vm-bind-prefetch.html
   [128]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-8/igt@xe_pm@s2idle-vm-bind-prefetch.html

  * igt@xe_pm@s4-d3cold-basic-exec:
    - shard-dg2-set2:     NOTRUN -> [SKIP][129] ([Intel XE#2284] / [Intel XE#366])
   [129]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@xe_pm@s4-d3cold-basic-exec.html

  * igt@xe_pmu@fn-engine-activity-sched-if-idle:
    - shard-dg2-set2:     NOTRUN -> [SKIP][130] ([Intel XE#4650])
   [130]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@xe_pmu@fn-engine-activity-sched-if-idle.html

  * igt@xe_query@multigpu-query-uc-fw-version-guc:
    - shard-bmg:          NOTRUN -> [SKIP][131] ([Intel XE#944])
   [131]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@xe_query@multigpu-query-uc-fw-version-guc.html

  * igt@xe_render_copy@render-stress-0-copies:
    - shard-dg2-set2:     NOTRUN -> [SKIP][132] ([Intel XE#4814])
   [132]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@xe_render_copy@render-stress-0-copies.html

  * igt@xe_sriov_scheduling@nonpreempt-engine-resets@numvfs-random:
    - shard-adlp:         [PASS][133] -> [DMESG-FAIL][134] ([Intel XE#5213] / [Intel XE#5545]) +1 other test dmesg-fail
   [133]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@xe_sriov_scheduling@nonpreempt-engine-resets@numvfs-random.html
   [134]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-3/igt@xe_sriov_scheduling@nonpreempt-engine-resets@numvfs-random.html

  
#### Possible fixes ####

  * igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs:
    - shard-bmg:          [INCOMPLETE][135] ([Intel XE#3862]) -> [PASS][136] +1 other test pass
   [135]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-3/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html
   [136]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-1/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html

  * igt@kms_ccs@crc-primary-suspend-4-tiled-dg2-mc-ccs@pipe-d-dp-4:
    - shard-dg2-set2:     [INCOMPLETE][137] ([Intel XE#3862]) -> [PASS][138] +1 other test pass
   [137]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-463/igt@kms_ccs@crc-primary-suspend-4-tiled-dg2-mc-ccs@pipe-d-dp-4.html
   [138]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_ccs@crc-primary-suspend-4-tiled-dg2-mc-ccs@pipe-d-dp-4.html

  * igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc:
    - shard-dg2-set2:     [INCOMPLETE][139] ([Intel XE#1727] / [Intel XE#3113] / [Intel XE#4345] / [Intel XE#6168]) -> [PASS][140]
   [139]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-466/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc.html
   [140]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc.html

  * igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc@pipe-b-hdmi-a-6:
    - shard-dg2-set2:     [INCOMPLETE][141] ([Intel XE#1727] / [Intel XE#3113] / [Intel XE#6168]) -> [PASS][142]
   [141]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-466/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc@pipe-b-hdmi-a-6.html
   [142]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-463/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc@pipe-b-hdmi-a-6.html

  * igt@kms_cursor_legacy@cursora-vs-flipb-legacy:
    - shard-bmg:          [SKIP][143] ([Intel XE#2291]) -> [PASS][144] +1 other test pass
   [143]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_cursor_legacy@cursora-vs-flipb-legacy.html
   [144]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_cursor_legacy@cursora-vs-flipb-legacy.html

  * igt@kms_feature_discovery@display-2x:
    - shard-bmg:          [SKIP][145] ([Intel XE#2373]) -> [PASS][146]
   [145]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_feature_discovery@display-2x.html
   [146]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_feature_discovery@display-2x.html

  * igt@kms_flip@2x-flip-vs-dpms-off-vs-modeset-interruptible:
    - shard-bmg:          [SKIP][147] ([Intel XE#2316]) -> [PASS][148] +4 other tests pass
   [147]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_flip@2x-flip-vs-dpms-off-vs-modeset-interruptible.html
   [148]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_flip@2x-flip-vs-dpms-off-vs-modeset-interruptible.html

  * igt@kms_flip@basic-flip-vs-dpms@c-hdmi-a1:
    - shard-adlp:         [DMESG-WARN][149] ([Intel XE#4543]) -> [PASS][150] +5 other tests pass
   [149]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-9/igt@kms_flip@basic-flip-vs-dpms@c-hdmi-a1.html
   [150]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-2/igt@kms_flip@basic-flip-vs-dpms@c-hdmi-a1.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-bmg:          [INCOMPLETE][151] ([Intel XE#2049] / [Intel XE#2597]) -> [PASS][152] +1 other test pass
   [151]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-8/igt@kms_flip@flip-vs-suspend-interruptible.html
   [152]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@kms_flip@flip-vs-suspend-interruptible.html
    - shard-dg2-set2:     [INCOMPLETE][153] ([Intel XE#2049] / [Intel XE#2597]) -> [PASS][154] +3 other tests pass
   [153]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-433/igt@kms_flip@flip-vs-suspend-interruptible.html
   [154]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-436/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_flip@flip-vs-suspend@d-hdmi-a1:
    - shard-adlp:         [DMESG-WARN][155] ([Intel XE#2953] / [Intel XE#4173]) -> [PASS][156] +9 other tests pass
   [155]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-6/igt@kms_flip@flip-vs-suspend@d-hdmi-a1.html
   [156]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-1/igt@kms_flip@flip-vs-suspend@d-hdmi-a1.html

  * igt@kms_flip_tiling@flip-change-tiling:
    - shard-adlp:         [DMESG-FAIL][157] ([Intel XE#4543]) -> [PASS][158] +1 other test pass
   [157]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@kms_flip_tiling@flip-change-tiling.html
   [158]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-6/igt@kms_flip_tiling@flip-change-tiling.html

  * igt@kms_plane_scaling@2x-scaler-multi-pipe:
    - shard-bmg:          [SKIP][159] ([Intel XE#2571]) -> [PASS][160]
   [159]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_plane_scaling@2x-scaler-multi-pipe.html
   [160]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_plane_scaling@2x-scaler-multi-pipe.html

  * igt@xe_eudebug_sriov@deny-sriov:
    - shard-adlp:         [SKIP][161] ([Intel XE#4519]) -> [PASS][162] +1 other test pass
   [161]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-6/igt@xe_eudebug_sriov@deny-sriov.html
   [162]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-1/igt@xe_eudebug_sriov@deny-sriov.html
    - shard-bmg:          [SKIP][163] ([Intel XE#5793]) -> [PASS][164]
   [163]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-7/igt@xe_eudebug_sriov@deny-sriov.html
   [164]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-5/igt@xe_eudebug_sriov@deny-sriov.html

  * igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-rebind:
    - shard-dg2-set2:     [SKIP][165] ([Intel XE#1392]) -> [PASS][166] +5 other tests pass
   [165]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-432/igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-rebind.html
   [166]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-464/igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-rebind.html

  * igt@xe_exec_sip_eudebug@breakpoint-writesip-nodebug:
    - shard-bmg:          [SKIP][167] ([Intel XE#4837]) -> [PASS][168] +2 other tests pass
   [167]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@xe_exec_sip_eudebug@breakpoint-writesip-nodebug.html
   [168]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@xe_exec_sip_eudebug@breakpoint-writesip-nodebug.html
    - shard-adlp:         [SKIP][169] ([Intel XE#4837] / [Intel XE#5565]) -> [PASS][170] +6 other tests pass
   [169]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-2/igt@xe_exec_sip_eudebug@breakpoint-writesip-nodebug.html
   [170]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-6/igt@xe_exec_sip_eudebug@breakpoint-writesip-nodebug.html

  * igt@xe_exec_sip_eudebug@wait-writesip-nodebug:
    - shard-dg2-set2:     [SKIP][171] ([Intel XE#4837]) -> [PASS][172] +2 other tests pass
   [171]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-466/igt@xe_exec_sip_eudebug@wait-writesip-nodebug.html
   [172]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-432/igt@xe_exec_sip_eudebug@wait-writesip-nodebug.html
    - shard-lnl:          [SKIP][173] ([Intel XE#4837]) -> [PASS][174] +2 other tests pass
   [173]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-8/igt@xe_exec_sip_eudebug@wait-writesip-nodebug.html
   [174]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-1/igt@xe_exec_sip_eudebug@wait-writesip-nodebug.html

  * {igt@xe_exec_system_allocator@pat-index-madvise-pat-idx-uc-single-vma}:
    - shard-lnl:          [FAIL][175] ([Intel XE#6267]) -> [PASS][176]
   [175]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-4/igt@xe_exec_system_allocator@pat-index-madvise-pat-idx-uc-single-vma.html
   [176]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-2/igt@xe_exec_system_allocator@pat-index-madvise-pat-idx-uc-single-vma.html

  * igt@xe_live_ktest@xe_eudebug:
    - shard-bmg:          [SKIP][177] ([Intel XE#2833]) -> [PASS][178]
   [177]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-3/igt@xe_live_ktest@xe_eudebug.html
   [178]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-1/igt@xe_live_ktest@xe_eudebug.html
    - shard-adlp:         [SKIP][179] ([Intel XE#455] / [Intel XE#5712]) -> [PASS][180]
   [179]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-9/igt@xe_live_ktest@xe_eudebug.html
   [180]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-2/igt@xe_live_ktest@xe_eudebug.html
    - shard-dg2-set2:     [SKIP][181] ([Intel XE#455]) -> [PASS][182]
   [181]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-432/igt@xe_live_ktest@xe_eudebug.html
   [182]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-464/igt@xe_live_ktest@xe_eudebug.html
    - shard-lnl:          [SKIP][183] ([Intel XE#2833]) -> [PASS][184]
   [183]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-1/igt@xe_live_ktest@xe_eudebug.html
   [184]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-5/igt@xe_live_ktest@xe_eudebug.html

  * igt@xe_pm@s2idle-basic:
    - shard-adlp:         [DMESG-WARN][185] ([Intel XE#2953] / [Intel XE#4173] / [Intel XE#4504]) -> [PASS][186]
   [185]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-9/igt@xe_pm@s2idle-basic.html
   [186]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-2/igt@xe_pm@s2idle-basic.html

  * igt@xe_pmu@gt-frequency:
    - shard-dg2-set2:     [FAIL][187] ([Intel XE#4819]) -> [PASS][188] +1 other test pass
   [187]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-434/igt@xe_pmu@gt-frequency.html
   [188]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@xe_pmu@gt-frequency.html

  
#### Warnings ####

  * igt@kms_content_protection@atomic-dpms:
    - shard-bmg:          [SKIP][189] ([Intel XE#2341]) -> [FAIL][190] ([Intel XE#1178])
   [189]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_content_protection@atomic-dpms.html
   [190]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_flip@flip-vs-panning-vs-hang@d-hdmi-a1:
    - shard-adlp:         [TIMEOUT][191] ([Intel XE#4543]) -> [DMESG-WARN][192] ([Intel XE#4543]) +1 other test dmesg-warn
   [191]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-4/igt@kms_flip@flip-vs-panning-vs-hang@d-hdmi-a1.html
   [192]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-1/igt@kms_flip@flip-vs-panning-vs-hang@d-hdmi-a1.html

  * igt@kms_flip@flip-vs-suspend@b-hdmi-a1:
    - shard-adlp:         [DMESG-WARN][193] ([Intel XE#2953] / [Intel XE#4173] / [Intel XE#4543]) -> [DMESG-WARN][194] ([Intel XE#4543]) +1 other test dmesg-warn
   [193]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-6/igt@kms_flip@flip-vs-suspend@b-hdmi-a1.html
   [194]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-1/igt@kms_flip@flip-vs-suspend@b-hdmi-a1.html

  * igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-indfb-draw-mmap-wc:
    - shard-bmg:          [SKIP][195] ([Intel XE#2312]) -> [SKIP][196] ([Intel XE#2311]) +7 other tests skip
   [195]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-indfb-draw-mmap-wc.html
   [196]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-mmap-wc:
    - shard-bmg:          [SKIP][197] ([Intel XE#2312]) -> [SKIP][198] ([Intel XE#5390]) +6 other tests skip
   [197]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-mmap-wc.html
   [198]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-7/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move:
    - shard-bmg:          [SKIP][199] ([Intel XE#5390]) -> [SKIP][200] ([Intel XE#2312]) +2 other tests skip
   [199]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move.html
   [200]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move.html

  * igt@kms_frontbuffer_tracking@fbcdrrs-2p-scndscrn-cur-indfb-draw-mmap-wc:
    - shard-bmg:          [SKIP][201] ([Intel XE#2311]) -> [SKIP][202] ([Intel XE#2312]) +8 other tests skip
   [201]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-7/igt@kms_frontbuffer_tracking@fbcdrrs-2p-scndscrn-cur-indfb-draw-mmap-wc.html
   [202]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcdrrs-2p-scndscrn-cur-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-shrfb-msflip-blt:
    - shard-bmg:          [SKIP][203] ([Intel XE#2313]) -> [SKIP][204] ([Intel XE#2312]) +6 other tests skip
   [203]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-7/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-shrfb-msflip-blt.html
   [204]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-shrfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-cur-indfb-draw-render:
    - shard-bmg:          [SKIP][205] ([Intel XE#2312]) -> [SKIP][206] ([Intel XE#2313]) +7 other tests skip
   [205]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-6/igt@kms_frontbuffer_tracking@psr-2p-primscrn-cur-indfb-draw-render.html
   [206]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-8/igt@kms_frontbuffer_tracking@psr-2p-primscrn-cur-indfb-draw-render.html

  * igt@kms_pm_dc@dc9-dpms:
    - shard-adlp:         [SKIP][207] ([Intel XE#734]) -> [FAIL][208] ([Intel XE#3325])
   [207]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-1/igt@kms_pm_dc@dc9-dpms.html
   [208]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-8/igt@kms_pm_dc@dc9-dpms.html

  * igt@kms_tiled_display@basic-test-pattern:
    - shard-bmg:          [FAIL][209] ([Intel XE#1729]) -> [SKIP][210] ([Intel XE#2426])
   [209]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@kms_tiled_display@basic-test-pattern.html
   [210]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-2/igt@kms_tiled_display@basic-test-pattern.html

  * igt@kms_tiled_display@basic-test-pattern-with-chamelium:
    - shard-bmg:          [SKIP][211] ([Intel XE#2426]) -> [SKIP][212] ([Intel XE#2509])
   [211]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-7/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html
   [212]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-5/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html

  * igt@xe_eudebug@multigpu-basic-client:
    - shard-bmg:          [SKIP][213] ([Intel XE#4837]) -> [SKIP][214] ([Intel XE#3894]) +1 other test skip
   [213]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-bmg-5/igt@xe_eudebug@multigpu-basic-client.html
   [214]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-bmg-3/igt@xe_eudebug@multigpu-basic-client.html

  * igt@xe_eudebug@multigpu-basic-client-many:
    - shard-lnl:          [SKIP][215] ([Intel XE#4837]) -> [SKIP][216] ([Intel XE#5132]) +1 other test skip
   [215]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-8/igt@xe_eudebug@multigpu-basic-client-many.html
   [216]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-1/igt@xe_eudebug@multigpu-basic-client-many.html

  * igt@xe_eudebug@read-metadata:
    - shard-adlp:         [SKIP][217] ([Intel XE#4837] / [Intel XE#5565]) -> [SKIP][218] ([Intel XE#5565]) +1 other test skip
   [217]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@xe_eudebug@read-metadata.html
   [218]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-3/igt@xe_eudebug@read-metadata.html

  * igt@xe_eudebug_online@breakpoint-many-sessions-tiles:
    - shard-adlp:         [SKIP][219] ([Intel XE#4837] / [Intel XE#5565]) -> [SKIP][220] ([Intel XE#2846])
   [219]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-1/igt@xe_eudebug_online@breakpoint-many-sessions-tiles.html
   [220]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-8/igt@xe_eudebug_online@breakpoint-many-sessions-tiles.html
    - shard-dg2-set2:     [SKIP][221] ([Intel XE#4837]) -> [SKIP][222] ([Intel XE#2846])
   [221]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-463/igt@xe_eudebug_online@breakpoint-many-sessions-tiles.html
   [222]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-435/igt@xe_eudebug_online@breakpoint-many-sessions-tiles.html

  * igt@xe_eudebug_online@interrupt-all-set-breakpoint-faultable:
    - shard-dg2-set2:     [SKIP][223] ([Intel XE#4837]) -> [SKIP][224] ([Intel XE#455]) +1 other test skip
   [223]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-dg2-463/igt@xe_eudebug_online@interrupt-all-set-breakpoint-faultable.html
   [224]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-dg2-466/igt@xe_eudebug_online@interrupt-all-set-breakpoint-faultable.html

  * igt@xe_eudebug_online@writes-caching-sram-bb-vram-target-sram:
    - shard-lnl:          [SKIP][225] ([Intel XE#4837]) -> [SKIP][226] ([Intel XE#2825]) +5 other tests skip
   [225]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-lnl-4/igt@xe_eudebug_online@writes-caching-sram-bb-vram-target-sram.html
   [226]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-lnl-2/igt@xe_eudebug_online@writes-caching-sram-bb-vram-target-sram.html

  * igt@xe_eudebug_online@writes-caching-vram-bb-vram-target-vram:
    - shard-adlp:         [SKIP][227] ([Intel XE#4837] / [Intel XE#5565]) -> [SKIP][228] ([Intel XE#455]) +7 other tests skip
   [227]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-1/igt@xe_eudebug_online@writes-caching-vram-bb-vram-target-vram.html
   [228]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-8/igt@xe_eudebug_online@writes-caching-vram-bb-vram-target-vram.html

  * igt@xe_fault_injection@probe-fail-guc-xe_guc_ct_send_recv:
    - shard-adlp:         [ABORT][229] ([Intel XE#5530]) -> [ABORT][230] ([Intel XE#4917] / [Intel XE#5530])
   [229]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@xe_fault_injection@probe-fail-guc-xe_guc_ct_send_recv.html
   [230]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-6/igt@xe_fault_injection@probe-fail-guc-xe_guc_ct_send_recv.html

  * igt@xe_query@multigpu-query-uc-fw-version-huc:
    - shard-adlp:         [SKIP][231] ([Intel XE#944]) -> [FAIL][232] ([Intel XE#6249])
   [231]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d/shard-adlp-8/igt@xe_query@multigpu-query-uc-fw-version-huc.html
   [232]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/shard-adlp-3/igt@xe_query@multigpu-query-uc-fw-version-huc.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [Intel XE#1124]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1124
  [Intel XE#1178]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1178
  [Intel XE#1188]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1188
  [Intel XE#1392]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1392
  [Intel XE#1406]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1406
  [Intel XE#1439]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1439
  [Intel XE#1489]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1489
  [Intel XE#1503]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1503
  [Intel XE#1727]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1727
  [Intel XE#1729]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1729
  [Intel XE#2049]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2049
  [Intel XE#2191]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2191
  [Intel XE#2233]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2233
  [Intel XE#2234]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2234
  [Intel XE#2244]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2244
  [Intel XE#2252]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2252
  [Intel XE#2284]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2284
  [Intel XE#2291]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2291
  [Intel XE#2293]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2293
  [Intel XE#2311]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2311
  [Intel XE#2312]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2312
  [Intel XE#2313]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2313
  [Intel XE#2314]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2314
  [Intel XE#2316]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2316
  [Intel XE#2320]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2320
  [Intel XE#2322]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2322
  [Intel XE#2325]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2325
  [Intel XE#2327]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2327
  [Intel XE#2328]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2328
  [Intel XE#2341]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2341
  [Intel XE#2373]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2373
  [Intel XE#2380]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2380
  [Intel XE#2414]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2414
  [Intel XE#2426]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2426
  [Intel XE#2501]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2501
  [Intel XE#2504]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2504
  [Intel XE#2505]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2505
  [Intel XE#2509]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2509
  [Intel XE#2571]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2571
  [Intel XE#2594]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2594
  [Intel XE#2597]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2597
  [Intel XE#2825]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2825
  [Intel XE#2833]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2833
  [Intel XE#2846]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2846
  [Intel XE#2850]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2850
  [Intel XE#288]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/288
  [Intel XE#2887]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2887
  [Intel XE#2894]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2894
  [Intel XE#2907]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2907
  [Intel XE#2927]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2927
  [Intel XE#2939]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2939
  [Intel XE#2953]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2953
  [Intel XE#3012]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3012
  [Intel XE#307]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/307
  [Intel XE#308]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/308
  [Intel XE#3113]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3113
  [Intel XE#3141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3141
  [Intel XE#3226]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3226
  [Intel XE#3304]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3304
  [Intel XE#3325]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3325
  [Intel XE#3414]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3414
  [Intel XE#3432]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3432
  [Intel XE#356]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/356
  [Intel XE#3573]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3573
  [Intel XE#366]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/366
  [Intel XE#367]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/367
  [Intel XE#373]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/373
  [Intel XE#3862]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3862
  [Intel XE#3876]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3876
  [Intel XE#3894]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3894
  [Intel XE#3970]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3970
  [Intel XE#4173]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4173
  [Intel XE#4345]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4345
  [Intel XE#4422]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4422
  [Intel XE#4494]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4494
  [Intel XE#4504]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4504
  [Intel XE#4519]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4519
  [Intel XE#4543]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4543
  [Intel XE#455]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/455
  [Intel XE#4650]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4650
  [Intel XE#4814]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4814
  [Intel XE#4819]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4819
  [Intel XE#4837]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4837
  [Intel XE#4915]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4915
  [Intel XE#4917]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4917
  [Intel XE#4943]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4943
  [Intel XE#5007]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5007
  [Intel XE#5021]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5021
  [Intel XE#5132]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5132
  [Intel XE#5213]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5213
  [Intel XE#5300]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5300
  [Intel XE#5390]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5390
  [Intel XE#5530]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5530
  [Intel XE#5545]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5545
  [Intel XE#5565]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5565
  [Intel XE#5712]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5712
  [Intel XE#5742]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5742
  [Intel XE#5793]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5793
  [Intel XE#6032]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6032
  [Intel XE#616]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/616
  [Intel XE#6168]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6168
  [Intel XE#6171]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6171
  [Intel XE#623]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/623
  [Intel XE#6249]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6249
  [Intel XE#6267]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6267
  [Intel XE#6281]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6281
  [Intel XE#651]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/651
  [Intel XE#653]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/653
  [Intel XE#734]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/734
  [Intel XE#787]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/787
  [Intel XE#836]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/836
  [Intel XE#929]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/929
  [Intel XE#944]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/944


Build changes
-------------

  * Linux: xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d -> xe-pw-155452v1

  IGT_8574: 44a15713124663a622c6eddf7c6ee5ba732e0d41 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  xe-3870-dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d: dba1fd9754c6ee58b05564ffa50bbe7be5ddf37d
  xe-pw-155452v1: 155452v1

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155452v1/index.html

[-- Attachment #2: Type: text/html, Size: 75921 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable
  2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
@ 2025-10-06 18:35   ` Matthew Brost
  2025-10-20 12:56     ` Mika Kuoppala
  2025-10-20 12:53   ` Mika Kuoppala
  2025-11-18 14:48   ` Mika Kuoppala
  2 siblings, 1 reply; 31+ messages in thread
From: Matthew Brost @ 2025-10-06 18:35 UTC (permalink / raw)
  To: Mika Kuoppala
  Cc: intel-xe, simona.vetter, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek

On Mon, Oct 06, 2025 at 02:17:06PM +0300, Mika Kuoppala wrote:
> We need to inform to guc which contexts are debuggable
> as their handling is different from ordinary contexts.
> 
> Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/xe/abi/guc_actions_abi.h |  5 +++
>  drivers/gpu/drm/xe/xe_eudebug_hw.c       | 55 ++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_eudebug_hw.h       |  4 ++
>  drivers/gpu/drm/xe/xe_guc_submit.c       |  4 ++
>  4 files changed, 68 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> index 47756e4674a1..32a5f680a6d2 100644
> --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> @@ -155,6 +155,7 @@ enum xe_guc_action {
>  	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
>  	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
>  	XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
> +	XE_GUC_ACTION_EU_KERNEL_DEBUG = 0x8006,
>  	XE_GUC_ACTION_TEST_G2G_SEND = 0xF001,
>  	XE_GUC_ACTION_TEST_G2G_RECV = 0xF002,
>  	XE_GUC_ACTION_LIMIT
> @@ -278,4 +279,8 @@ enum xe_guc_g2g_type {
>  /* invalid type for XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR */
>  #define XE_GUC_CAT_ERR_TYPE_INVALID 0xdeadbeef
>  
> +enum  xe_guc_eu_kernel_debug_request_type {
> +	XE_GUC_EU_KERNEL_DEBUG_ENABLE = 0x3,
> +};
> +
>  #endif
> diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
> index a62c4b439888..cd4627705b56 100644
> --- a/drivers/gpu/drm/xe/xe_eudebug_hw.c
> +++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
> @@ -12,6 +12,7 @@
>  #include "regs/xe_gt_regs.h"
>  #include "regs/xe_engine_regs.h"
>  
> +#include "abi/guc_actions_abi.h"
>  #include "xe_eudebug.h"
>  #include "xe_eudebug_types.h"
>  #include "xe_exec_queue.h"
> @@ -20,6 +21,9 @@
>  #include "xe_gt.h"
>  #include "xe_gt_debug.h"
>  #include "xe_gt_mcr.h"
> +#include "xe_guc.h"
> +#include "xe_guc_ct.h"
> +#include "xe_guc_exec_queue_types.h"
>  #include "xe_hw_engine.h"
>  #include "xe_lrc.h"
>  #include "xe_macros.h"
> @@ -675,6 +679,57 @@ static int xe_eu_control_stopped(struct xe_eudebug *d,
>  	return xe_gt_eu_attention_bitmap(q->gt, bits, bitmask_size);
>  }
>  
> +static int xe_guc_action_eu_kernel_debug(struct xe_device *xe,
> +					 struct xe_exec_queue *q,
> +					 struct xe_lrc *lrc, u32 cmd)
> +{
> +	u32 action[] = {
> +		XE_GUC_ACTION_EU_KERNEL_DEBUG,
> +		q->guc->id,
> +		cmd,
> +		0, /* reserved */
> +	};
> +	int ret, i;
> +
> +	if (cmd != XE_GUC_EU_KERNEL_DEBUG_ENABLE)

Maybe an xe_gt_assert here instead.

> +		return -EINVAL;
> +
> +	ret = -EINVAL;
> +	for (i = 0; i < q->width; i++) {

I would double check with the GuC team whether you have to enable EU
debugging on each guc_id in multi-lrc contexts. I suspect not, given
register, scheduling toggles, and deregister H2G operate only on the
main GuC ID. I am however unsure, as H2G 8006 is not in the GuC spec I'm
looking at.
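
For reference, a tiny standalone sketch (made-up values, userspace C, not
driver code) of the consecutive guc_id layout the quoted loop relies on
for a width-N multi-LRC queue; whether the EU kernel debug enable must be
sent for every one of these ids is exactly the open question above:

/* Illustration only: a width-N (parallel / multi-LRC) queue owns N
 * consecutive guc_ids starting from the main one, which is the layout
 * the 'q->guc->id + i' loop assumes.
 */
#include <stdio.h>

int main(void)
{
	unsigned int main_guc_id = 0x120;	/* hypothetical main guc_id */
	int width = 4;				/* hypothetical multi-LRC width */

	for (int i = 0; i < width; i++)
		printf("H2G EU_KERNEL_DEBUG for guc_id 0x%x%s\n",
		       main_guc_id + i, i ? "" : " (main)");

	return 0;
}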

> +		if (lrc && q->lrc[i] != lrc)
> +			continue;
> +

The above code looks unused as LRC is always NULL.

> +		action[1] = q->guc->id + i;
> +		drm_dbg(&xe->drm, "Guc action[%u] for ctx=%d",
> +			cmd, action[1]);

Prefer xe_gt_dbg.

> +
> +		ret = xe_guc_ct_send(&q->gt->uc.guc.ct,
> +				     action, ARRAY_SIZE(action), 0, 0);
> +
> +		if (ret)
> +			drm_dbg(&xe->drm, "eudebug guc cmd %u failed with %d\n",
> +				cmd, ret);

Prefer xe_gt_dbg.

> +	}
> +
> +	return ret;
> +}
> +
> +static bool xe_guc_has_debug_contexts(struct xe_gt *gt)
> +{
> +	return GUC_FIRMWARE_VER(&gt->uc.guc) >=	MAKE_GUC_VER(70, 49, 0);
> +}
> +
> +int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q)
> +{

The return value is not checked at the caller, so a void return would be
a better choice.

> +	struct xe_device *xe = gt_to_xe(q->gt);
> +
> +	if (!xe_guc_has_debug_contexts(q->gt))
> +		return 0;
> +
> +	return xe_guc_action_eu_kernel_debug(xe, q, NULL,
> +					     XE_GUC_EU_KERNEL_DEBUG_ENABLE);
> +}
> +
>  static struct xe_eudebug_eu_control_ops eu_control = {
>  	.interrupt_all = xe_eu_control_interrupt_all,
>  	.stopped = xe_eu_control_stopped,
> diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
> index 8f59ec574e4e..5d1df5d7dc46 100644
> --- a/drivers/gpu/drm/xe/xe_eudebug_hw.h
> +++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
> @@ -23,10 +23,14 @@ long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg);
>  
>  struct xe_exec_queue *xe_gt_runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx);
>  
> +int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q);
> +

I'd probably stick this implementation in xe_guc_submit.c. Even though
this is EU debug specific, IMO you don't really need to compile this part
out, as surely if the EU debug kconfig is unset,
EXEC_QUEUE_EUDEBUG_FLAG_ENABLE would be clear. Maybe ask others about
where to stick this, as my opinion on this isn't strong either way.

Matt

>  #else /* CONFIG_DRM_XE_EUDEBUG */
>  
>  static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable) { }
>  
> +static inline int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q) { return 0; }
> +
>  #endif /* CONFIG_DRM_XE_EUDEBUG */
>  
>  #endif /* _XE_EUDEBUG_HW_H_ */
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 16f78376f196..da264c1cfe76 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -21,6 +21,7 @@
>  #include "xe_assert.h"
>  #include "xe_devcoredump.h"
>  #include "xe_device.h"
> +#include "xe_eudebug_hw.h"
>  #include "xe_exec_queue.h"
>  #include "xe_force_wake.h"
>  #include "xe_gpu_scheduler.h"
> @@ -655,6 +656,9 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
>  	if (xe_exec_queue_is_lr(q))
>  		xe_exec_queue_get(q);
>  
> +	if (q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)
> +		xe_eudebug_exec_queue_enable(q);
> +
>  	set_exec_queue_registered(q);
>  	trace_xe_exec_queue_register(q);
>  	if (xe_exec_queue_is_parallel(q))
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling
  2025-10-06 11:17 ` [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling Mika Kuoppala
@ 2025-10-06 18:43   ` Matthew Brost
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-10-06 18:43 UTC (permalink / raw)
  To: Mika Kuoppala
  Cc: intel-xe, simona.vetter, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun

On Mon, Oct 06, 2025 at 02:17:10PM +0300, Mika Kuoppala wrote:
> From: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
> 
> The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid
> access will halt the corresponding EUs. To solve this problem, enable
> EU pagefault handling, which allows unhalting pagefaulted EU threads and
> lets the EU debugger be informed about the attention state of EU threads
> during execution.
> 
> If a pagefault occurs, send the DRM_XE_EUDEBUG_EVENT_PAGEFAULT event
> after handling the pagefault.
> 
> The pagefault handling is a mechanism that allows a stalled EU thread to
> enter SIP mode by installing a temporary null page in the page table entry
> where the pagefault happened.
> 
> A brief description of the pagefault handling flow between the KMD and
> the EU thread is as follows:
> 
> (1) an EU thread accesses an unallocated address
> (2) a pagefault happens and the EU thread stalls
> (3) the Xe KMD forces an EU thread exception to allow the running EU
>     threads to enter SIP mode (the KMD sets the ForceException /
>     ForceExternalHalt bits of the TD_CTL register);
>     non-stalled (non-pagefaulted) EU threads enter SIP mode
> (4) the Xe KMD installs a temporary null page in the pagetable entry for
>     the address where the pagefault happened
> (5) the Xe KMD replies to the GuC that the pagefault was handled
> (6) the stalled EU thread resumes as the pagefault condition has been
>     resolved
> (7) the resumed EU thread enters SIP mode due to the force exception set
>     in (3)
> 
> As this feature is designed to only work when eudebug is enabled, it
> should have no impact on the regular recoverable pagefault code path.
> 
> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt_pagefault.c | 80 +++++++++++++++++++++++++---
>  1 file changed, 74 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> index a054d6010ae0..873ffd982030 100644
> --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> @@ -13,6 +13,7 @@
>  
>  #include "abi/guc_actions_abi.h"
>  #include "xe_bo.h"
> +#include "xe_eudebug.h"
>  #include "xe_gt.h"
>  #include "xe_gt_printk.h"
>  #include "xe_gt_stats.h"
> @@ -173,10 +174,14 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
>  	return vm;
>  }
>  
> -static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> +static int handle_pagefault_start(struct xe_gt *gt, struct pagefault *pf,
> +				  struct xe_vm **pf_vm,
> +				  struct xe_eudebug_pagefault **eudebug_pf_out)
>  {
>  	struct xe_device *xe = gt_to_xe(gt);
>  	struct xe_vm *vm;
> +	struct xe_eudebug_pagefault *eudebug_pf;
> +	bool  destroy_eudebug_pf = false;
>  	struct xe_vma *vma = NULL;
>  	int err;
>  	bool atomic;
> @@ -189,6 +194,10 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>  	if (IS_ERR(vm))
>  		return PTR_ERR(vm);
>  
> +	eudebug_pf = xe_eudebug_pagefault_create(gt, vm, pf->page_addr,
> +						 pf->fault_type, pf->fault_level,
> +						 pf->access_type);
> +
>  	/*
>  	 * TODO: Change to read lock? Using write lock for simplicity.
>  	 */
> @@ -201,8 +210,27 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>  
>  	vma = xe_vm_find_vma_by_addr(vm, pf->page_addr);
>  	if (!vma) {
> -		err = -EINVAL;
> -		goto unlock_vm;

Not a full review; it will take me a minute to wrap my head around what
this is doing. But if a user application has SVM enabled, the VMA lookup
will never fail, as the entire VMA space is always populated by either
valid VMAs with bindings or VMAs which indicate the CPU address space is
being mirrored. The latter doesn't mean that the CPU address exists
though - the xe_svm.c layer figures that out. I suspect you are missing
some logic to properly handle EU debug + SVM.

Let me try to wrap my head around exactly what the EU debug pagefault
code is doing.

Matt

> +		if (eudebug_pf)
> +			vma = xe_vm_create_null_vma(vm, pf->page_addr);
> +
> +		if (IS_ERR_OR_NULL(vma)) {
> +			err = -EINVAL;
> +			if (eudebug_pf)
> +				destroy_eudebug_pf = true;
> +
> +			goto unlock_vm;
> +		}
> +	} else {
> +		/*
> +		 * When the eudebug_pagefault instance was created, there was
> +		 * no vma containing the ppgtt address where the pagefault
> +		 * occurred, but after reacquiring vm->lock there is one:
> +		 * while this context was not holding vm->lock, another
> +		 * context allocated a vma covering the faulting address,
> +		 * so the eudebug pagefault instance can be dropped.
> +		 */
> +		if (eudebug_pf)
> +			destroy_eudebug_pf = true;
>  	}
>  
>  	atomic = access_is_atomic(pf->access_type);
> @@ -217,11 +245,43 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>  	if (!err)
>  		vm->usm.last_fault_vma = vma;
>  	up_write(&vm->lock);
> -	xe_vm_put(vm);
> +
> +	if (destroy_eudebug_pf) {
> +		xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, false);
> +		*eudebug_pf_out = NULL;
> +	} else {
> +		*eudebug_pf_out = eudebug_pf;
> +	}
> +
> +	/* Keep the VM instance alive for the lifetime of the eudebug pagefault instance. */
> +	if (!*eudebug_pf_out) {
> +		xe_vm_put(vm);
> +		*pf_vm = NULL;
> +	} else {
> +		*pf_vm = vm;
> +	}
>  
>  	return err;
>  }
>  
> +static void handle_pagefault_end(struct xe_gt *gt, struct xe_vm *vm,
> +				 struct xe_eudebug_pagefault *eudebug_pf)
> +{
> +	/* if there is no eudebug_pagefault then return */
> +	if (!eudebug_pf)
> +		return;
> +
> +	xe_eudebug_pagefault_process(gt, eudebug_pf);
> +
> +	/*
> +	 * TODO: Remove VMA added to handle eudebug pagefault
> +	 */
> +
> +	xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, true);
> +
> +	xe_vm_put(vm);
> +}
> +
>  static int send_pagefault_reply(struct xe_guc *guc,
>  				struct xe_guc_pagefault_reply *reply)
>  {
> @@ -346,7 +406,10 @@ static void pf_queue_work_func(struct work_struct *w)
>  	threshold = jiffies + msecs_to_jiffies(USM_QUEUE_MAX_RUNTIME_MS);
>  
>  	while (get_pagefault(pf_queue, &pf)) {
> -		ret = handle_pagefault(gt, &pf);
> +		struct xe_eudebug_pagefault *eudebug_pf = NULL;
> +		struct xe_vm *vm = NULL;
> +
> +		ret = handle_pagefault_start(gt, &pf, &vm, &eudebug_pf);
>  		if (unlikely(ret)) {
>  			print_pagefault(gt, &pf);
>  			pf.fault_unsuccessful = 1;
> @@ -364,7 +427,12 @@ static void pf_queue_work_func(struct work_struct *w)
>  			FIELD_PREP(PFR_ENG_CLASS, pf.engine_class) |
>  			FIELD_PREP(PFR_PDATA, pf.pdata);
>  
> -		send_pagefault_reply(&gt->uc.guc, &reply);
> +		ret = send_pagefault_reply(&gt->uc.guc, &reply);
> +
> +		if (unlikely(ret))
> +			xe_gt_dbg(gt, "GuC Pagefault reply failed: %d\n", ret);
> +
> +		handle_pagefault_end(gt, vm, eudebug_pf);
>  
>  		if (time_after(jiffies, threshold) &&
>  		    pf_queue->tail != pf_queue->head) {
> -- 
> 2.43.0
> 
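
As an orientation aid for the diff above, here is a minimal standalone
sketch (userspace C, stand-in names such as fake_vm and the sketch_*
helpers) of the VM reference lifetime that the split into
handle_pagefault_start()/handle_pagefault_end() establishes around the
GuC reply; it deliberately ignores the null-VMA creation and attention
handling details:

/* Sketch only: when an eudebug pagefault object is created, it keeps the
 * VM referenced until the "end" half runs after the GuC reply; otherwise
 * the reference is dropped immediately.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_vm { int refs; };			/* stand-in for struct xe_vm */

static void vm_get(struct fake_vm *vm) { vm->refs++; }
static void vm_put(struct fake_vm *vm) { vm->refs--; }

static void sketch_handle_pagefault_start(struct fake_vm *vm,
					  bool debugger_attached,
					  bool *eudebug_pf)
{
	vm_get(vm);			/* the ASID lookup takes a VM reference */
	*eudebug_pf = debugger_attached;
	if (!*eudebug_pf)
		vm_put(vm);		/* no eudebug object: drop it right away */
}

static void sketch_handle_pagefault_end(struct fake_vm *vm, bool eudebug_pf)
{
	if (!eudebug_pf)
		return;
	/* the real code processes and destroys the eudebug pagefault here */
	vm_put(vm);			/* the eudebug object held the VM until now */
}

int main(void)
{
	struct fake_vm vm = { .refs = 1 };
	bool eudebug_pf;

	sketch_handle_pagefault_start(&vm, true, &eudebug_pf);
	/* the pagefault reply to the GuC is sent between start and end */
	sketch_handle_pagefault_end(&vm, eudebug_pf);
	printf("vm refs after handling: %d\n", vm.refs);	/* prints 1 */

	return 0;
}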

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable
  2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
  2025-10-06 18:35   ` Matthew Brost
@ 2025-10-20 12:53   ` Mika Kuoppala
  2025-11-18 14:48   ` Mika Kuoppala
  2 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-20 12:53 UTC (permalink / raw)
  To: intel-xe; +Cc: Mika Kuoppala, Matthew Brost, Dominik Grzegorzek,
	Maciej Patelczyk

We need to inform the GuC which contexts are debuggable,
as their handling is different from ordinary contexts.

v2: void return, use xe_gt_dbg, no need for lrc (Matt)

Cc: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/abi/guc_actions_abi.h |  5 ++++
 drivers/gpu/drm/xe/xe_guc_submit.c       | 34 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
index 47756e4674a1..32a5f680a6d2 100644
--- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
@@ -155,6 +155,7 @@ enum xe_guc_action {
 	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
 	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
 	XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
+	XE_GUC_ACTION_EU_KERNEL_DEBUG = 0x8006,
 	XE_GUC_ACTION_TEST_G2G_SEND = 0xF001,
 	XE_GUC_ACTION_TEST_G2G_RECV = 0xF002,
 	XE_GUC_ACTION_LIMIT
@@ -278,4 +279,8 @@ enum xe_guc_g2g_type {
 /* invalid type for XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR */
 #define XE_GUC_CAT_ERR_TYPE_INVALID 0xdeadbeef
 
+enum  xe_guc_eu_kernel_debug_request_type {
+	XE_GUC_EU_KERNEL_DEBUG_ENABLE = 0x3,
+};
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 0ef67d3523a7..2ba576521d7a 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -21,6 +21,7 @@
 #include "xe_assert.h"
 #include "xe_devcoredump.h"
 #include "xe_device.h"
+#include "xe_eudebug_hw.h"
 #include "xe_exec_queue.h"
 #include "xe_force_wake.h"
 #include "xe_gpu_scheduler.h"
@@ -651,6 +652,36 @@ static void __register_exec_queue(struct xe_guc *guc,
 	xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
 }
 
+static bool xe_guc_has_debug_contexts(struct xe_guc *guc)
+{
+	return GUC_FIRMWARE_VER(guc) >=	MAKE_GUC_VER(70, 49, 0);
+}
+
+static void xe_guc_action_eu_kernel_debug_enable(struct xe_exec_queue *q)
+{
+	struct xe_gt *gt = q->hwe->gt;
+	u32 action[] = {
+		XE_GUC_ACTION_EU_KERNEL_DEBUG,
+		q->guc->id,
+		XE_GUC_EU_KERNEL_DEBUG_ENABLE,
+		0, /* reserved */
+	};
+	int i;
+
+	for (i = 0; i < q->width; i++) {
+		int ret;
+
+		action[1] = q->guc->id + i;
+
+		ret = xe_guc_ct_send(&q->gt->uc.guc.ct,
+				     action, ARRAY_SIZE(action), 0, 0);
+
+		if (ret)
+			xe_gt_dbg(gt, "Guc action[%u] for ctx=%d, failed with %d",
+				  action[2], action[1], ret);
+	}
+}
+
 static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
 {
 	struct xe_guc *guc = exec_queue_to_guc(q);
@@ -698,6 +729,9 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
 	if (xe_exec_queue_is_lr(q))
 		xe_exec_queue_get(q);
 
+	if (xe_exec_queue_is_debuggable(q) && xe_guc_has_debug_contexts(guc))
+		xe_guc_action_eu_kernel_debug_enable(q);
+
 	set_exec_queue_registered(q);
 	trace_xe_exec_queue_register(q);
 	if (xe_exec_queue_is_parallel(q))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable
  2025-10-06 18:35   ` Matthew Brost
@ 2025-10-20 12:56     ` Mika Kuoppala
  0 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2025-10-20 12:56 UTC (permalink / raw)
  To: Matthew Brost
  Cc: intel-xe, simona.vetter, christian.koenig, thomas.hellstrom,
	joonas.lahtinen, christoph.manszewski, rodrigo.vivi,
	lucas.demarchi, andrzej.hajda, matthew.auld, maciej.patelczyk,
	gwan-gyeong.mun, Dominik Grzegorzek

Matthew Brost <matthew.brost@intel.com> writes:

> On Mon, Oct 06, 2025 at 02:17:06PM +0300, Mika Kuoppala wrote:
>> We need to inform to guc which contexts are debuggable
>> as their handling is different from ordinary contexts.
>> 
>> Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>> Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> ---
>>  drivers/gpu/drm/xe/abi/guc_actions_abi.h |  5 +++
>>  drivers/gpu/drm/xe/xe_eudebug_hw.c       | 55 ++++++++++++++++++++++++
>>  drivers/gpu/drm/xe/xe_eudebug_hw.h       |  4 ++
>>  drivers/gpu/drm/xe/xe_guc_submit.c       |  4 ++
>>  4 files changed, 68 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
>> index 47756e4674a1..32a5f680a6d2 100644
>> --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
>> +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
>> @@ -155,6 +155,7 @@ enum xe_guc_action {
>>  	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
>>  	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
>>  	XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
>> +	XE_GUC_ACTION_EU_KERNEL_DEBUG = 0x8006,
>>  	XE_GUC_ACTION_TEST_G2G_SEND = 0xF001,
>>  	XE_GUC_ACTION_TEST_G2G_RECV = 0xF002,
>>  	XE_GUC_ACTION_LIMIT
>> @@ -278,4 +279,8 @@ enum xe_guc_g2g_type {
>>  /* invalid type for XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR */
>>  #define XE_GUC_CAT_ERR_TYPE_INVALID 0xdeadbeef
>>  
>> +enum  xe_guc_eu_kernel_debug_request_type {
>> +	XE_GUC_EU_KERNEL_DEBUG_ENABLE = 0x3,
>> +};
>> +
>>  #endif
>> diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.c b/drivers/gpu/drm/xe/xe_eudebug_hw.c
>> index a62c4b439888..cd4627705b56 100644
>> --- a/drivers/gpu/drm/xe/xe_eudebug_hw.c
>> +++ b/drivers/gpu/drm/xe/xe_eudebug_hw.c
>> @@ -12,6 +12,7 @@
>>  #include "regs/xe_gt_regs.h"
>>  #include "regs/xe_engine_regs.h"
>>  
>> +#include "abi/guc_actions_abi.h"
>>  #include "xe_eudebug.h"
>>  #include "xe_eudebug_types.h"
>>  #include "xe_exec_queue.h"
>> @@ -20,6 +21,9 @@
>>  #include "xe_gt.h"
>>  #include "xe_gt_debug.h"
>>  #include "xe_gt_mcr.h"
>> +#include "xe_guc.h"
>> +#include "xe_guc_ct.h"
>> +#include "xe_guc_exec_queue_types.h"
>>  #include "xe_hw_engine.h"
>>  #include "xe_lrc.h"
>>  #include "xe_macros.h"
>> @@ -675,6 +679,57 @@ static int xe_eu_control_stopped(struct xe_eudebug *d,
>>  	return xe_gt_eu_attention_bitmap(q->gt, bits, bitmask_size);
>>  }
>>  
>> +static int xe_guc_action_eu_kernel_debug(struct xe_device *xe,
>> +					 struct xe_exec_queue *q,
>> +					 struct xe_lrc *lrc, u32 cmd)
>> +{
>> +	u32 action[] = {
>> +		XE_GUC_ACTION_EU_KERNEL_DEBUG,
>> +		q->guc->id,
>> +		cmd,
>> +		0, /* reserved */
>> +	};
>> +	int ret, i;
>> +
>> +	if (cmd != XE_GUC_EU_KERNEL_DEBUG_ENABLE)
>
> Maybe an xe_gt_assert here instead.
>
>> +		return -EINVAL;
>> +
>> +	ret = -EINVAL;
>> +	for (i = 0; i < q->width; i++) {
>
> I would double check with the GuC team if you have have enable EU
> debugging on each guc_id in multi-lrc contexts. I suspect not given
> register, scheduling toggles, and deregister H2G operate only on the
> main GuC ID. I'm am however unsure as H2G 8006 is not in GuC spec I'm
> looking at.
>

I have not yet had a definite answer on whether the main GuC id is enough.

>> +		if (lrc && q->lrc[i] != lrc)
>> +			continue;
>> +
>
> The above code looks unused as LRC is always NULL.
>
>> +		action[1] = q->guc->id + i;
>> +		drm_dbg(&xe->drm, "Guc action[%u] for ctx=%d",
>> +			cmd, action[1]);
>
> Prefer xe_gt_dbg.
>
>> +
>> +		ret = xe_guc_ct_send(&q->gt->uc.guc.ct,
>> +				     action, ARRAY_SIZE(action), 0, 0);
>> +
>> +		if (ret)
>> +			drm_dbg(&xe->drm, "eudebug guc cmd %u failed with %d\n",
>> +				cmd, ret);
>
> Prefer xe_gt_dbg.
>
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static bool xe_guc_has_debug_contexts(struct xe_gt *gt)
>> +{
>> +	return GUC_FIRMWARE_VER(&gt->uc.guc) >=	MAKE_GUC_VER(70, 49, 0);
>> +}
>> +
>> +int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q)
>> +{
>
> The return value is not checked at the caller, so NULL return would be
> better choice.
>
>> +	struct xe_device *xe = gt_to_xe(q->gt);
>> +
>> +	if (!xe_guc_has_debug_contexts(q->gt))
>> +		return 0;
>> +
>> +	return xe_guc_action_eu_kernel_debug(xe, q, NULL,
>> +					     XE_GUC_EU_KERNEL_DEBUG_ENABLE);
>> +}
>> +
>>  static struct xe_eudebug_eu_control_ops eu_control = {
>>  	.interrupt_all = xe_eu_control_interrupt_all,
>>  	.stopped = xe_eu_control_stopped,
>> diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
>> index 8f59ec574e4e..5d1df5d7dc46 100644
>> --- a/drivers/gpu/drm/xe/xe_eudebug_hw.h
>> +++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
>> @@ -23,10 +23,14 @@ long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg);
>>  
>>  struct xe_exec_queue *xe_gt_runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx);
>>  
>> +int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q);
>> +
>
> I'd probably stick this implementation xe_guc_submit.c as even though
> this EU debug specific, IMO you don't really need to compile this part
> out as surely if EU kconfig is unset EXEC_QUEUE_EUDEBUG_FLAG_ENABLE
> would be clear. Maybe ask others about where to stick this, as my
> opinion on this isn't strong either way.

I tucked it into xe_guc_submit.c without a config switch. Much more
straightforward. Thanks.
-Mika

>
> Matt
>
>>  #else /* CONFIG_DRM_XE_EUDEBUG */
>>  
>>  static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe, bool enable) { }
>>  
>> +static inline int xe_eudebug_exec_queue_enable(struct xe_exec_queue *q) { return 0; }
>> +
>>  #endif /* CONFIG_DRM_XE_EUDEBUG */
>>  
>>  #endif /* _XE_EUDEBUG_HW_H_ */
>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>> index 16f78376f196..da264c1cfe76 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>> @@ -21,6 +21,7 @@
>>  #include "xe_assert.h"
>>  #include "xe_devcoredump.h"
>>  #include "xe_device.h"
>> +#include "xe_eudebug_hw.h"
>>  #include "xe_exec_queue.h"
>>  #include "xe_force_wake.h"
>>  #include "xe_gpu_scheduler.h"
>> @@ -655,6 +656,9 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
>>  	if (xe_exec_queue_is_lr(q))
>>  		xe_exec_queue_get(q);
>>  
>> +	if (q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)
>> +		xe_eudebug_exec_queue_enable(q);
>> +
>>  	set_exec_queue_registered(q);
>>  	trace_xe_exec_queue_register(q);
>>  	if (xe_exec_queue_is_parallel(q))
>> -- 
>> 2.43.0
>> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable
  2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
  2025-10-06 18:35   ` Matthew Brost
  2025-10-20 12:53   ` Mika Kuoppala
@ 2025-11-18 14:48   ` Mika Kuoppala
  2025-11-19 21:33     ` Daniele Ceraolo Spurio
  2 siblings, 1 reply; 31+ messages in thread
From: Mika Kuoppala @ 2025-11-18 14:48 UTC (permalink / raw)
  To: intel-xe
  Cc: Mika Kuoppala, Matthew Brost, Lucas De Marchi,
	Daniele Ceraolo Spurio, Jan Sokolowski, Dominik Grzegorzek,
	Maciej Patelczyk

We need to inform the GuC which contexts are debuggable,
as their handling is different from ordinary contexts.

v2: void return, use xe_gt_dbg, no need for lrc (Matt)
v3: add the workaround enabling (Daniele)
v4: version needed to 70.49.4

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jan Sokolowski <jan.sokolowski@intel.com>
Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/xe/abi/guc_actions_abi.h |  5 +++
 drivers/gpu/drm/xe/abi/guc_klvs_abi.h    |  1 +
 drivers/gpu/drm/xe/xe_eudebug_hw.h       |  7 ++++
 drivers/gpu/drm/xe/xe_guc_ads.c          | 18 ++++++++++
 drivers/gpu/drm/xe/xe_guc_submit.c       | 45 ++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_wa_oob.rules       |  2 ++
 6 files changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
index 47756e4674a1..32a5f680a6d2 100644
--- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
@@ -155,6 +155,7 @@ enum xe_guc_action {
 	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
 	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
 	XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
+	XE_GUC_ACTION_EU_KERNEL_DEBUG = 0x8006,
 	XE_GUC_ACTION_TEST_G2G_SEND = 0xF001,
 	XE_GUC_ACTION_TEST_G2G_RECV = 0xF002,
 	XE_GUC_ACTION_LIMIT
@@ -278,4 +279,8 @@ enum xe_guc_g2g_type {
 /* invalid type for XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR */
 #define XE_GUC_CAT_ERR_TYPE_INVALID 0xdeadbeef
 
+enum  xe_guc_eu_kernel_debug_request_type {
+	XE_GUC_EU_KERNEL_DEBUG_ENABLE = 0x3,
+};
+
 #endif
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index 265a135e7061..fba190d4f84b 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -423,6 +423,7 @@ enum xe_guc_klv_ids {
 	GUC_WA_KLV_WAKE_POWER_DOMAINS_FOR_OUTBOUND_MMIO					= 0x900a,
 	GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH					= 0x900b,
 	GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG					= 0x900c,
+	GUC_WA_KLV_RESET_DEP_ENGINES_ON_DEBUG_CTX_SWITCH				= 0x900d,
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
index 8f59ec574e4e..7c5df12859ac 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_hw.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
@@ -14,6 +14,13 @@ struct xe_eudebug;
 struct xe_hw_engine;
 struct xe_gt;
 
+#define XE_EUDEBUG_GUC_VER_MAJOR 70
+#define XE_EUDEBUG_GUC_VER_MINOR 49
+#define XE_EUDEBUG_GUC_VER_PATCH 4
+#define XE_EUDEBUG_GUC_VER_MIN MAKE_GUC_VER(XE_EUDEBUG_GUC_VER_MAJOR, \
+					    XE_EUDEBUG_GUC_VER_MINOR, \
+					    XE_EUDEBUG_GUC_VER_PATCH)
+
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
 void xe_eudebug_hw_init(struct xe_eudebug *d);
diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
index 58e0b0294a5b..8ff90437c976 100644
--- a/drivers/gpu/drm/xe/xe_guc_ads.c
+++ b/drivers/gpu/drm/xe/xe_guc_ads.c
@@ -16,6 +16,7 @@
 #include "regs/xe_gt_regs.h"
 #include "regs/xe_guc_regs.h"
 #include "xe_bo.h"
+#include "xe_eudebug_hw.h"
 #include "xe_gt.h"
 #include "xe_gt_ccs_mode.h"
 #include "xe_gt_printk.h"
@@ -363,6 +364,23 @@ static void guc_waklv_init(struct xe_guc_ads *ads)
 		guc_waklv_enable(ads, NULL, 0, &offset, &remain,
 				 GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT);
 
+#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
+	if (XE_GT_WA(gt, 14022766366)) {
+		const struct xe_uc_fw_version needed = {
+			.major = XE_EUDEBUG_GUC_VER_MAJOR,
+			.minor = XE_EUDEBUG_GUC_VER_MINOR,
+			.patch = XE_EUDEBUG_GUC_VER_PATCH,
+		};
+
+		if (GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER_STRUCT(needed))
+			guc_waklv_enable(ads, NULL, 0, &offset, &remain,
+					 GUC_WA_KLV_RESET_DEP_ENGINES_ON_DEBUG_CTX_SWITCH);
+		else
+			xe_gt_warn(gt, "eudebug needs GuC version %u.%u.%u or greater\n",
+				   needed.major, needed.minor, needed.patch);
+	}
+#endif
+
 	size = guc_ads_waklv_size(ads) - remain;
 	if (!size)
 		return;
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 16f78376f196..26dc4a4a67f0 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -21,6 +21,7 @@
 #include "xe_assert.h"
 #include "xe_devcoredump.h"
 #include "xe_device.h"
+#include "xe_eudebug_hw.h"
 #include "xe_exec_queue.h"
 #include "xe_force_wake.h"
 #include "xe_gpu_scheduler.h"
@@ -608,6 +609,47 @@ static void __register_exec_queue(struct xe_guc *guc,
 	xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
 }
 
+static bool xe_guc_has_debug_contexts(struct xe_guc *guc)
+{
+	return GUC_FIRMWARE_VER(guc) >=	XE_EUDEBUG_GUC_VER_MIN;
+}
+
+static int xe_guc_action_eu_kernel_debug_enable(struct xe_guc *guc,
+						struct xe_exec_queue *q)
+{
+	struct xe_gt *gt = q->hwe->gt;
+	const u32 action[] = {
+		XE_GUC_ACTION_EU_KERNEL_DEBUG,
+		q->guc->id,
+		XE_GUC_EU_KERNEL_DEBUG_ENABLE,
+		0, /* reserved */
+	};
+	int ret;
+
+	ret = xe_guc_ct_send(&guc->ct, action,
+			     ARRAY_SIZE(action), 0, 0);
+
+	if (ret)
+		xe_gt_dbg(gt, "GuC ctx=%d debug enabling failed with %d",
+			  action[1], ret);
+	else
+		xe_gt_dbg(gt, "GuC ctx=%d enabled for debug", action[1]);
+
+	return ret;
+}
+
+static void set_debug(struct xe_guc *guc, struct xe_exec_queue *q)
+{
+	int ret;
+
+	if (!xe_guc_has_debug_contexts(guc))
+		return;
+
+	ret = xe_guc_action_eu_kernel_debug_enable(guc, q);
+	if (ret)
+		xe_gt_warn(q->gt, "Failed to set eu kernel debug enable");
+}
+
 static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
 {
 	struct xe_guc *guc = exec_queue_to_guc(q);
@@ -662,6 +704,9 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
 	else
 		__register_exec_queue(guc, &info);
 	init_policies(guc, q);
+
+	if (xe_exec_queue_is_debuggable(q))
+		set_debug(guc, q);
 }
 
 static u32 wq_space_until_wrap(struct xe_exec_queue *q)
diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
index e8f09ae7a67b..31272c83c6e0 100644
--- a/drivers/gpu/drm/xe/xe_wa_oob.rules
+++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
@@ -85,3 +85,5 @@
 #eudebug
 18022722726	GRAPHICS_VERSION_RANGE(1250, 1274)
 14015474168	PLATFORM(PVC)
+14022766366	GRAPHICS_VERSION_RANGE(2000, 2004)
+		GRAPHICS_VERSION_RANGE(3000, 3005)
-- 
2.43.0
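
As an aside for readers unfamiliar with the packed compare used above
(GUC_FIRMWARE_VER(guc) >= XE_EUDEBUG_GUC_VER_MIN), here is a minimal
standalone sketch of how a major.minor.patch triple can be packed into a
single comparable integer; the 8-bit field widths and the SKETCH_GUC_VER
name are assumptions for illustration, not the driver's actual macros:

/* Standalone sketch of a packed firmware-version gate; the field layout
 * is illustrative only and may differ from MAKE_GUC_VER() in the driver.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_GUC_VER(maj, min, pat) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(pat))

static bool fw_supports_eu_debug(uint32_t fw_ver)
{
	/* this series requires at least GuC 70.49.4 */
	return fw_ver >= SKETCH_GUC_VER(70, 49, 4);
}

int main(void)
{
	printf("70.48.9 -> %d\n", fw_supports_eu_debug(SKETCH_GUC_VER(70, 48, 9)));	/* 0 */
	printf("70.49.4 -> %d\n", fw_supports_eu_debug(SKETCH_GUC_VER(70, 49, 4)));	/* 1 */
	printf("71.0.0  -> %d\n", fw_supports_eu_debug(SKETCH_GUC_VER(71, 0, 0)));	/* 1 */

	return 0;
}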


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable
  2025-11-18 14:48   ` Mika Kuoppala
@ 2025-11-19 21:33     ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 31+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-11-19 21:33 UTC (permalink / raw)
  To: Mika Kuoppala, intel-xe
  Cc: Matthew Brost, Lucas De Marchi, Jan Sokolowski,
	Dominik Grzegorzek, Maciej Patelczyk



On 11/18/2025 6:48 AM, Mika Kuoppala wrote:
> We need to inform to guc which contexts are debuggable
> as their handling is different from ordinary contexts.
>
> v2: void return, use xe_gt_dbg, no need for lrc (Matt)
> v3: add the workaround enabling (Daniele)
> v4: version needed to 70.49.4
>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Jan Sokolowski <jan.sokolowski@intel.com>
> Co-developed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Co-developed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/xe/abi/guc_actions_abi.h |  5 +++
>   drivers/gpu/drm/xe/abi/guc_klvs_abi.h    |  1 +
>   drivers/gpu/drm/xe/xe_eudebug_hw.h       |  7 ++++
>   drivers/gpu/drm/xe/xe_guc_ads.c          | 18 ++++++++++
>   drivers/gpu/drm/xe/xe_guc_submit.c       | 45 ++++++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_wa_oob.rules       |  2 ++
>   6 files changed, 78 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> index 47756e4674a1..32a5f680a6d2 100644
> --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> @@ -155,6 +155,7 @@ enum xe_guc_action {
>   	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
>   	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
>   	XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
> +	XE_GUC_ACTION_EU_KERNEL_DEBUG = 0x8006,
>   	XE_GUC_ACTION_TEST_G2G_SEND = 0xF001,
>   	XE_GUC_ACTION_TEST_G2G_RECV = 0xF002,
>   	XE_GUC_ACTION_LIMIT
> @@ -278,4 +279,8 @@ enum xe_guc_g2g_type {
>   /* invalid type for XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR */
>   #define XE_GUC_CAT_ERR_TYPE_INVALID 0xdeadbeef
>   
> +enum  xe_guc_eu_kernel_debug_request_type {
> +	XE_GUC_EU_KERNEL_DEBUG_ENABLE = 0x3,
> +};
> +
>   #endif
> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> index 265a135e7061..fba190d4f84b 100644
> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> @@ -423,6 +423,7 @@ enum xe_guc_klv_ids {
>   	GUC_WA_KLV_WAKE_POWER_DOMAINS_FOR_OUTBOUND_MMIO					= 0x900a,
>   	GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH					= 0x900b,
>   	GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG					= 0x900c,
> +	GUC_WA_KLV_RESET_DEP_ENGINES_ON_DEBUG_CTX_SWITCH				= 0x900d,
>   };
>   
>   #endif
> diff --git a/drivers/gpu/drm/xe/xe_eudebug_hw.h b/drivers/gpu/drm/xe/xe_eudebug_hw.h
> index 8f59ec574e4e..7c5df12859ac 100644
> --- a/drivers/gpu/drm/xe/xe_eudebug_hw.h
> +++ b/drivers/gpu/drm/xe/xe_eudebug_hw.h
> @@ -14,6 +14,13 @@ struct xe_eudebug;
>   struct xe_hw_engine;
>   struct xe_gt;
>   
> +#define XE_EUDEBUG_GUC_VER_MAJOR 70
> +#define XE_EUDEBUG_GUC_VER_MINOR 49
> +#define XE_EUDEBUG_GUC_VER_PATCH 4
> +#define XE_EUDEBUG_GUC_VER_MIN MAKE_GUC_VER(XE_EUDEBUG_GUC_VER_MAJOR, \
> +					    XE_EUDEBUG_GUC_VER_MINOR, \
> +					    XE_EUDEBUG_GUC_VER_PATCH)
> +
>   #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
>   
>   void xe_eudebug_hw_init(struct xe_eudebug *d);
> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
> index 58e0b0294a5b..8ff90437c976 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
> @@ -16,6 +16,7 @@
>   #include "regs/xe_gt_regs.h"
>   #include "regs/xe_guc_regs.h"
>   #include "xe_bo.h"
> +#include "xe_eudebug_hw.h"
>   #include "xe_gt.h"
>   #include "xe_gt_ccs_mode.h"
>   #include "xe_gt_printk.h"
> @@ -363,6 +364,23 @@ static void guc_waklv_init(struct xe_guc_ads *ads)
>   		guc_waklv_enable(ads, NULL, 0, &offset, &remain,
>   				 GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT);
>   
> +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
> +	if (XE_GT_WA(gt, 14022766366)) {
> +		const struct xe_uc_fw_version needed = {
> +			.major = XE_EUDEBUG_GUC_VER_MAJOR,
> +			.minor = XE_EUDEBUG_GUC_VER_MINOR,
> +			.patch = XE_EUDEBUG_GUC_VER_PATCH,
> +		};
> +
> +		if (GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER_STRUCT(needed))
> +			guc_waklv_enable(ads, NULL, 0, &offset, &remain,
> +					 GUC_WA_KLV_RESET_DEP_ENGINES_ON_DEBUG_CTX_SWITCH);
> +		else
> +			xe_gt_warn(gt, "eudebug needs GuC version %u.%u.%u or greater\n",
> +				   needed.major, needed.minor, needed.patch);

We usually prefer to use gt_info for specific features needing a certain 
GuC version (see e.g. is_engine_activity_supported()), since there is 
already a warn-level message at GuC fetch time recommending a new enough 
version of the GuC.

> +	}
> +#endif
> +
>   	size = guc_ads_waklv_size(ads) - remain;
>   	if (!size)
>   		return;
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 16f78376f196..26dc4a4a67f0 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -21,6 +21,7 @@
>   #include "xe_assert.h"
>   #include "xe_devcoredump.h"
>   #include "xe_device.h"
> +#include "xe_eudebug_hw.h"
>   #include "xe_exec_queue.h"
>   #include "xe_force_wake.h"
>   #include "xe_gpu_scheduler.h"
> @@ -608,6 +609,47 @@ static void __register_exec_queue(struct xe_guc *guc,
>   	xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
>   }
>   
> +static bool xe_guc_has_debug_contexts(struct xe_guc *guc)

nit: we usually don't use the "xe_" prefix for static functions (below 
as well)

> +{
> +	return GUC_FIRMWARE_VER(guc) >=	XE_EUDEBUG_GUC_VER_MIN;
> +}
> +
> +static int xe_guc_action_eu_kernel_debug_enable(struct xe_guc *guc,
> +						struct xe_exec_queue *q)
> +{
> +	struct xe_gt *gt = q->hwe->gt;
> +	const u32 action[] = {
> +		XE_GUC_ACTION_EU_KERNEL_DEBUG,
> +		q->guc->id,
> +		XE_GUC_EU_KERNEL_DEBUG_ENABLE,
> +		0, /* reserved */
> +	};
> +	int ret;
> +
> +	ret = xe_guc_ct_send(&guc->ct, action,
> +			     ARRAY_SIZE(action), 0, 0);
> +
> +	if (ret)
> +		xe_gt_dbg(gt, "GuC ctx=%d debug enabling failed with %d",
> +			  action[1], ret);

I think this should be gt_err and not dbg because it has info that we 
would want even if debug logs are disabled.

> +	else
> +		xe_gt_dbg(gt, "GuC ctx=%d enabled for debug", action[1]);
> +
> +	return ret;
> +}
> +
> +static void set_debug(struct xe_guc *guc, struct xe_exec_queue *q)
> +{
> +	int ret;
> +
> +	if (!xe_guc_has_debug_contexts(guc))
> +		return;

This seems to be silently dropping the setting. If I'm not missing 
anything, this means that if an app tries to enable debug on a context 
on a too-old GuC, it won't receive any actual error return, and the only 
way the user has to know that something went wrong is to look at the 
driver load logs in dmesg. IMO we should either escalate an error from 
here (which would require de-registering the queue) or have a GuC 
version check much earlier in the debuggable queue creation flow and 
fail there.
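
For illustration, one possible shape of the early check suggested here,
done at debuggable-queue creation time instead of at registration; every
name and type below (sketch_guc, sketch_check_debuggable_queue,
SKETCH_GUC_VER) is a stand-in for this sketch, not the driver's API:

/* Hypothetical early gate: reject a debuggable queue when the GuC
 * firmware is too old, so userspace gets an error at creation time
 * instead of a context that silently is not debuggable.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_GUC_VER(maj, min, pat) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(pat))

struct sketch_guc { uint32_t fw_ver; };		/* stand-in for struct xe_guc */

static int sketch_check_debuggable_queue(const struct sketch_guc *guc,
					 bool wants_eu_debug)
{
	if (!wants_eu_debug)
		return 0;

	if (guc->fw_ver < SKETCH_GUC_VER(70, 49, 4))
		return -EOPNOTSUPP;	/* fail creation, not registration */

	return 0;
}

int main(void)
{
	struct sketch_guc old_fw = { .fw_ver = SKETCH_GUC_VER(70, 45, 0) };
	struct sketch_guc new_fw = { .fw_ver = SKETCH_GUC_VER(70, 49, 4) };

	printf("old fw: %d\n", sketch_check_debuggable_queue(&old_fw, true));	/* -EOPNOTSUPP */
	printf("new fw: %d\n", sketch_check_debuggable_queue(&new_fw, true));	/* 0 */

	return 0;
}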

> +
> +	ret = xe_guc_action_eu_kernel_debug_enable(guc, q);
> +	if (ret)
> +		xe_gt_warn(q->gt, "Failed to set eu kernel debug enable");

If the other log is raised to gt_err then this one can be dropped.

> +}
> +
>   static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
>   {
>   	struct xe_guc *guc = exec_queue_to_guc(q);
> @@ -662,6 +704,9 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
>   	else
>   		__register_exec_queue(guc, &info);
>   	init_policies(guc, q);
> +
> +	if (xe_exec_queue_is_debuggable(q))
> +		set_debug(guc, q);
>   }
>   
>   static u32 wq_space_until_wrap(struct xe_exec_queue *q)
> diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
> index e8f09ae7a67b..31272c83c6e0 100644
> --- a/drivers/gpu/drm/xe/xe_wa_oob.rules
> +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
> @@ -85,3 +85,5 @@
>   #eudebug
>   18022722726	GRAPHICS_VERSION_RANGE(1250, 1274)
>   14015474168	PLATFORM(PVC)
> +14022766366	GRAPHICS_VERSION_RANGE(2000, 2004)

Looking at the WA details this seems to start at 2001.

Daniele

> +		GRAPHICS_VERSION_RANGE(3000, 3005)


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-11-19 21:33 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-06 11:16 [PATCH 00/20] Intel Xe GPU Debug Support (eudebug) v5 Mika Kuoppala
2025-10-06 11:16 ` [PATCH 01/20] drm/xe/eudebug: Introduce eudebug interface Mika Kuoppala
2025-10-06 11:16 ` [PATCH 02/20] drm/xe/eudebug: Introduce discovery for resources Mika Kuoppala
2025-10-06 11:16 ` [PATCH 03/20] drm/xe/eudebug: Introduce exec_queue events Mika Kuoppala
2025-10-06 11:16 ` [PATCH 04/20] drm/xe: Add EUDEBUG_ENABLE exec queue property Mika Kuoppala
2025-10-06 11:16 ` [PATCH 05/20] drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops Mika Kuoppala
2025-10-06 11:16 ` [PATCH 06/20] drm/xe/eudebug: Introduce vm bind and vm bind debug data events Mika Kuoppala
2025-10-06 11:16 ` [PATCH 07/20] drm/xe/eudebug: Add UFENCE events with acks Mika Kuoppala
2025-10-06 11:16 ` [PATCH 08/20] drm/xe/eudebug: vm open/pread/pwrite Mika Kuoppala
2025-10-06 11:16 ` [PATCH 09/20] drm/xe/eudebug: userptr vm pread/pwrite Mika Kuoppala
2025-10-06 11:17 ` [PATCH 10/20] drm/xe/eudebug: hw enablement for eudebug Mika Kuoppala
2025-10-06 11:17 ` [PATCH 11/20] drm/xe/eudebug: Introduce EU control interface Mika Kuoppala
2025-10-06 11:17 ` [PATCH 12/20] drm/xe/eudebug: Introduce per device attention scan worker Mika Kuoppala
2025-10-06 11:17 ` [PATCH 13/20] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Mika Kuoppala
2025-10-06 11:17 ` [PATCH 14/20] drm/xe: Implement SR-IOV and eudebug exclusivity Mika Kuoppala
2025-10-06 11:17 ` [PATCH 15/20] drm/xe: Add xe_client_debugfs and introduce debug_data file Mika Kuoppala
2025-10-06 11:17 ` [PATCH 16/20] drm/xe/eudebug: Mark guc contexts as debuggable Mika Kuoppala
2025-10-06 18:35   ` Matthew Brost
2025-10-20 12:56     ` Mika Kuoppala
2025-10-20 12:53   ` Mika Kuoppala
2025-11-18 14:48   ` Mika Kuoppala
2025-11-19 21:33     ` Daniele Ceraolo Spurio
2025-10-06 11:17 ` [PATCH 17/20] drm/xe/eudebug: Add read/count/compare helper for eu attention Mika Kuoppala
2025-10-06 11:17 ` [PATCH 18/20] drm/xe/eudebug: Introduce EU pagefault handling interface Mika Kuoppala
2025-10-06 11:17 ` [PATCH 19/20] drm/xe/vm: Support for adding null page VMA to VM on request Mika Kuoppala
2025-10-06 11:17 ` [PATCH 20/20] drm/xe/eudebug: Enable EU pagefault handling Mika Kuoppala
2025-10-06 18:43   ` Matthew Brost
2025-10-06 12:30 ` ✗ CI.checkpatch: warning for Intel Xe GPU Debug Support (eudebug) v5 Patchwork
2025-10-06 12:31 ` ✓ CI.KUnit: success " Patchwork
2025-10-06 13:14 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-06 15:53 ` ✗ Xe.CI.Full: failure " Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox