* [RFC PATCH 0/4] Xe driver asynchronous notification mechanism
@ 2026-06-12 13:53 Thomas Hellström
  2026-06-12 13:53 ` [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL Thomas Hellström
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

There is a need to inform user-space clients when a rebind worker
has run out of memory so that they can react, adjust their working set
and restart the job. This patch series aims to start a discussion
about the best way to accomplish this.

The series builds on the core "general notification mechanism" or
"watch_queue", and attaches a watch queue to each xe drm file.

The watch_queue is extremely flexible and allows each listener to
select the events it is interested in at the kernel level. There can
be multiple listeners.
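
As a rough illustration of what kernel-level filtering means for a
listener: a filter entry names a notification type, a bitmap of accepted
subtypes and a mask/value pair for the info word, and a record is only
queued if some entry accepts it. The sketch below is a userspace model of
that match as I understand it, not kernel code. struct filter_model,
filter_allow_subtype() and record_passes() are made-up names, and the
subtype numbers in the usage note are arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one watch_queue type filter entry. */
struct filter_model {
	uint32_t type;              /* notification type to accept */
	uint32_t info_filter;       /* expected value of (info & info_mask) */
	uint32_t info_mask;         /* bits of the info word to compare */
	uint32_t subtype_filter[8]; /* bitmap over the 256 possible subtypes */
};

/* Mark @subtype as accepted in the filter's bitmap. */
void filter_allow_subtype(struct filter_model *f, uint8_t subtype)
{
	f->subtype_filter[subtype / 32] |= UINT32_C(1) << (subtype % 32);
}

/* Return true if a record with the given header passes any filter entry. */
bool record_passes(const struct filter_model *filters, size_t nr,
		   uint32_t type, uint8_t subtype, uint32_t info)
{
	for (size_t i = 0; i < nr; i++) {
		const struct filter_model *f = &filters[i];
		uint32_t bit = UINT32_C(1) << (subtype % 32);

		if (type == f->type &&
		    (f->subtype_filter[subtype / 32] & bit) &&
		    (info & f->info_mask) == f->info_filter)
			return true;
	}

	/* In this model, a record that no entry accepts is rejected. */
	return false;
}
```

With a single entry for type 2 (the value patch 3 proposes for
WATCH_TYPE_DRM_XE_NOTIFY) that allows only subtype 1, a record of subtype 1
passes, while subtype 0 or any other type is dropped before it reaches the
listener.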

Patch 1 Implements a restart IOCTL for rebind-workers
      paused on OOM.
Patch 2 Adds fault-injection into the rebind worker for
      testing.
Patch 3 Adds a DRM_XE_NOTIFY watch_type.
Patch 4 Implements watch_queue event sending from within
      xe.

igt series:
Test-with: https://patchwork.freedesktop.org/series/168429/

Compute UMD side is not available yet. Will be available before
final review.

Thomas Hellström (4):
  drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL
  drm/xe: Add fault injection for rebind worker -ENOSPC
  watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch()
  drm/xe: Add watch_queue-based device event notification

 MAINTAINERS                          |   1 +
 drivers/gpu/drm/xe/Kconfig           |   1 +
 drivers/gpu/drm/xe/Makefile          |   1 +
 drivers/gpu/drm/xe/xe_debugfs.c      |   4 +-
 drivers/gpu/drm/xe/xe_device.c       |   8 ++
 drivers/gpu/drm/xe/xe_device_types.h |   6 ++
 drivers/gpu/drm/xe/xe_vm.c           | 135 ++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_vm.h           |  13 ++-
 drivers/gpu/drm/xe/xe_vm_types.h     |   3 +
 drivers/gpu/drm/xe/xe_watch_queue.c  | 111 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_watch_queue.h  |  20 ++++
 include/uapi/drm/xe_drm.h            |  91 +++++++++++++++++-
 include/uapi/drm/xe_drm_events.h     |  62 ++++++++++++
 include/uapi/linux/watch_queue.h     |   3 +-
 kernel/watch_queue.c                 |  13 ++-
 15 files changed, 462 insertions(+), 10 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.c
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.h
 create mode 100644 include/uapi/drm/xe_drm_events.h

-- 
2.54.0



* [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  2026-06-12 13:53 ` [PATCH 2/4] drm/xe: Add fault injection for rebind worker -ENOSPC Thomas Hellström
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add an async VM restart IOCTL that allows userspace to re-queue the
preempt-rebind worker for a VM that has been paused after a recoverable
error.

Add xe_vm_restart_ioctl() which:
- Looks up the VM by id via xe_vm_lookup()
- Returns -EINVAL if the VM is not in preempt-fence mode or not restartable
- Returns -EALREADY if the VM is not currently paused
- Queues the rebind worker via xe_vm_reactivate_rebind() and returns 0

If the optional @timestamp_ns field is non-zero, logs the latency
between that timestamp and the point the vm lock is taken, regardless
of whether the VM was paused.

Add DRM_XE_VM_CREATE_FLAG_RESTARTABLE to opt a VM in to the restartable
behaviour: on recoverable errors (-ENOMEM, -ENOSPC) the rebind worker
is deactivated rather than the VM being killed. Requires
DRM_XE_VM_CREATE_FLAG_LR_MODE and may not be used with
DRM_XE_VM_CREATE_FLAG_FAULT_MODE.
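
The flag rule above can be sketched as a small predicate. This is an
illustration only: restartable_flags_ok() is a hypothetical helper, not
part of the patch. The flag values are copied from the xe_drm.h hunk, and
the other checks the create IOCTL performs are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in this patch's xe_drm.h hunk. */
#define DRM_XE_VM_CREATE_FLAG_LR_MODE      (1u << 1)
#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE   (1u << 2)
#define DRM_XE_VM_CREATE_FLAG_RESTARTABLE  (1u << 4)

/*
 * Hypothetical helper: true if @flags satisfies the RESTARTABLE rule,
 * i.e. RESTARTABLE implies LR_MODE and excludes FAULT_MODE.
 */
bool restartable_flags_ok(uint32_t flags)
{
	if (!(flags & DRM_XE_VM_CREATE_FLAG_RESTARTABLE))
		return true; /* the rule only applies when the flag is set */

	return (flags & DRM_XE_VM_CREATE_FLAG_LR_MODE) &&
	       !(flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE);
}
```

So LR_MODE | RESTARTABLE is accepted, while RESTARTABLE on its own or
combined with FAULT_MODE would be rejected with -EINVAL.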

Add the UAPI struct drm_xe_vm_restart with vm_id, pad, timestamp_ns
and reserved fields, and register the IOCTL at slot 0x10.
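
For reference, a minimal userspace sketch of how a client might prepare
the request. It carries a local copy of the proposed struct so that it
builds without the patched header. XE_VM_RESTART_REQUEST and
xe_vm_restart_fill() are made-up names; 'd' and 0x40 are DRM_IOCTL_BASE
and DRM_COMMAND_BASE from drm.h.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <time.h>

/* Local copy of the struct proposed in this patch. */
struct drm_xe_vm_restart {
	uint32_t vm_id;
	uint32_t pad;
	uint64_t timestamp_ns;
	uint64_t reserved;
};

static_assert(sizeof(struct drm_xe_vm_restart) == 24, "unexpected uAPI size");

/* Slot 0x10 on top of DRM_COMMAND_BASE (0x40), write-only direction. */
#define XE_VM_RESTART_REQUEST \
	_IOW('d', 0x40 + 0x10, struct drm_xe_vm_restart)

/* Hypothetical helper: fill in a restart request for @vm_id. */
void xe_vm_restart_fill(struct drm_xe_vm_restart *args, uint32_t vm_id,
			int want_latency_log)
{
	struct timespec ts;

	memset(args, 0, sizeof(*args)); /* pad and reserved must be zero */
	args->vm_id = vm_id;

	/* A non-zero CLOCK_MONOTONIC timestamp enables the latency log. */
	if (want_latency_log && clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
		args->timestamp_ns = (uint64_t)ts.tv_sec * 1000000000ull +
				     (uint64_t)ts.tv_nsec;
}
```

A caller would then issue ioctl(fd, XE_VM_RESTART_REQUEST, &args) on the
xe render node and expect errno to be EALREADY, EINVAL or ENOENT in the
failure cases listed above.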

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

---
 drivers/gpu/drm/xe/xe_device.c   |  1 +
 drivers/gpu/drm/xe/xe_vm.c       | 99 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_vm.h       |  8 ++-
 drivers/gpu/drm/xe/xe_vm_types.h |  1 +
 include/uapi/drm/xe_drm.h        | 46 ++++++++++++++-
 5 files changed, 150 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 51e3a2dd7b22..867d7c55dc03 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -215,6 +215,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
 			  DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_VM_RESTART, xe_vm_restart_ioctl, DRM_RENDER_ALLOW),
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 75841f3e9afa..86ed8f31a219 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -563,8 +563,14 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	}
 
 	if (err) {
-		drm_warn(&vm->xe->drm, "VM worker error: %d\n", err);
-		xe_vm_kill(vm, true);
+		if ((err == -ENOMEM || err == -ENOSPC) && xe_vm_is_restartable(vm)) {
+			vm->preempt.rebind_deactivated = true;
+			drm_dbg(&vm->xe->drm, "Rebind deactivated VM on error %pe\n",
+				ERR_PTR(err));
+		} else {
+			drm_warn(&vm->xe->drm, "VM worker error: %d\n", err);
+			xe_vm_kill(vm, true);
+		}
 	}
 	up_write(&vm->lock);
 
@@ -573,6 +579,85 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	trace_xe_vm_rebind_worker_exit(vm);
 }
 
+/**
+ * xe_vm_restart_ioctl() - Queue the preempt-rebind worker for a paused VM
+ * @dev: DRM device
+ * @data: pointer to &struct drm_xe_vm_restart from userspace
+ * @file: DRM file handle
+ *
+ * Looks up the VM identified by @vm_id and, if it is currently paused (its
+ * rebind worker was deactivated after a recoverable error), clears the paused
+ * state and queues the rebind worker.  Only valid for VMs in preempt-fence
+ * mode.
+ *
+ * If @timestamp_ns is non-zero, logs the latency between that timestamp and
+ * the point the vm lock is taken, regardless of whether the VM was paused.
+ *
+ * Return: 0 if the worker was queued, -EALREADY if the VM is not paused,
+ *         -EINVAL if the VM is not in preempt-fence mode or not restartable,
+ *         -ENOENT if the VM was not found.
+ */
+int xe_vm_restart_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(dev);
+	struct xe_file *xef = to_xe_file(file);
+	struct drm_xe_vm_restart *args = data;
+	struct xe_vm *vm;
+	int err = 0;
+
+	if (XE_IOCTL_DBG(xe, args->reserved || args->pad))
+		return -EINVAL;
+
+	vm = xe_vm_lookup(xef, args->vm_id);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -ENOENT;
+
+	if (XE_IOCTL_DBG(xe, !xe_vm_in_preempt_fence_mode(vm))) {
+		xe_vm_put(vm);
+		return -EINVAL;
+	}
+
+	if (XE_IOCTL_DBG(xe, !xe_vm_is_restartable(vm))) {
+		xe_vm_put(vm);
+		return -EINVAL;
+	}
+
+	err = down_read_interruptible(&vm->lock);
+	if (err)
+		goto out;
+
+	if (XE_IOCTL_DBG(xe, xe_vm_is_closed_or_banned(vm))) {
+		err = -ENOENT;
+		goto out_unlock_read;
+	}
+
+	if (args->timestamp_ns) {
+		u64 delay_us = (ktime_get_ns() - args->timestamp_ns) / NSEC_PER_USEC;
+
+		drm_dbg(&xe->drm, "VM %u restart latency: %llu us\n",
+			args->vm_id, delay_us);
+	}
+
+	err = xe_vm_lock(vm, true);
+	if (err)
+		goto out_unlock_read;
+
+	if (!vm->preempt.rebind_deactivated) {
+		err = -EALREADY;
+		goto out_unlock_resv;
+	}
+
+	xe_vm_reactivate_rebind(vm);
+out_unlock_resv:
+	xe_vm_unlock(vm);
+out_unlock_read:
+	up_read(&vm->lock);
+out:
+	xe_vm_put(vm);
+	return err;
+}
+
 /**
  * xe_vm_add_fault_entry_pf() - Add pagefault to vm fault list
  * @vm: The VM.
@@ -2049,7 +2134,8 @@ find_ufence_get(struct xe_sync_entry *syncs, u32 num_syncs)
 #define ALL_DRM_XE_VM_CREATE_FLAGS (DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | \
 				    DRM_XE_VM_CREATE_FLAG_LR_MODE | \
 				    DRM_XE_VM_CREATE_FLAG_FAULT_MODE | \
-				    DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT)
+				    DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT | \
+				    DRM_XE_VM_CREATE_FLAG_RESTARTABLE)
 
 int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		       struct drm_file *file)
@@ -2092,6 +2178,11 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 			 args->flags & DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT))
 		return -EINVAL;
 
+	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_RESTARTABLE &&
+			 (!(args->flags & DRM_XE_VM_CREATE_FLAG_LR_MODE) ||
+			  args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE)))
+		return -EINVAL;
+
 	if (args->flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE)
 		flags |= XE_VM_FLAG_SCRATCH_PAGE;
 	if (args->flags & DRM_XE_VM_CREATE_FLAG_LR_MODE)
@@ -2100,6 +2191,8 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		flags |= XE_VM_FLAG_FAULT_MODE;
 	if (args->flags & DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT)
 		flags |= XE_VM_FLAG_NO_VM_OVERCOMMIT;
+	if (args->flags & DRM_XE_VM_CREATE_FLAG_RESTARTABLE)
+		flags |= XE_VM_FLAG_RESTARTABLE;
 
 	vm = xe_vm_create(xe, flags, xef);
 	if (IS_ERR(vm))
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index c5b900f38ded..9ee44599cacd 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -212,7 +212,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
 int xe_vm_query_vmas_attrs_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
 int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file);
-
+int xe_vm_restart_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file);
 void xe_vm_close_and_put(struct xe_vm *vm);
 
 static inline bool xe_vm_in_fault_mode(struct xe_vm *vm)
@@ -237,6 +238,11 @@ static inline bool xe_vm_allow_vm_eviction(struct xe_vm *vm)
 		 !(vm->flags & XE_VM_FLAG_NO_VM_OVERCOMMIT));
 }
 
+static inline bool xe_vm_is_restartable(struct xe_vm *vm)
+{
+	return vm->flags & XE_VM_FLAG_RESTARTABLE;
+}
+
 int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q);
 void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q);
 
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 635ed29b9a69..7d295c3b8456 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -264,6 +264,7 @@ struct xe_vm {
 #define XE_VM_FLAG_SET_TILE_ID(tile)	FIELD_PREP(GENMASK(7, 6), (tile)->id)
 #define XE_VM_FLAG_GSC			BIT(8)
 #define XE_VM_FLAG_NO_VM_OVERCOMMIT     BIT(9)
+#define XE_VM_FLAG_RESTARTABLE          BIT(10)
 	unsigned long flags;
 
 	/**
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 48e9f1fdb78d..bebb0167bd31 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -85,6 +85,7 @@ extern "C" {
  *  - &DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS
  *  - &DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY
  *  - &DRM_IOCTL_XE_VM_GET_PROPERTY
+ *  - &DRM_IOCTL_XE_VM_RESTART
  */
 
 /*
@@ -110,6 +111,7 @@ extern "C" {
 #define DRM_XE_VM_QUERY_MEM_RANGE_ATTRS	0x0d
 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x0e
 #define DRM_XE_VM_GET_PROPERTY		0x0f
+#define DRM_XE_VM_RESTART		0x10
 
 /* Must be kept compact -- no holes */
 
@@ -129,6 +131,7 @@ extern "C" {
 #define DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_QUERY_MEM_RANGE_ATTRS, struct drm_xe_vm_query_mem_range_attr)
 #define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_VM_GET_PROPERTY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_GET_PROPERTY, struct drm_xe_vm_get_property)
+#define DRM_IOCTL_XE_VM_RESTART			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_RESTART, struct drm_xe_vm_restart)
 
 /**
  * DOC: Xe IOCTL Extensions
@@ -985,6 +988,10 @@ struct drm_xe_gem_mmap_offset {
  *    but only during a &DRM_IOCTL_XE_VM_BIND operation with the
  *    %DRM_XE_VM_BIND_FLAG_IMMEDIATE flag set. This may be useful for
  *    user-space naively probing the amount of available memory.
+ *  - %DRM_XE_VM_CREATE_FLAG_RESTARTABLE - Requires also
+ *    DRM_XE_VM_CREATE_FLAG_LR_MODE. Marks the VM as restartable, enabling
+ *    use of &DRM_IOCTL_XE_VM_RESTART to resume the preempt-rebind worker
+ *    after an error has paused it.
  */
 struct drm_xe_vm_create {
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -994,6 +1001,7 @@ struct drm_xe_vm_create {
 #define DRM_XE_VM_CREATE_FLAG_LR_MODE	        (1 << 1)
 #define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(1 << 2)
 #define DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT  (1 << 3)
+#define DRM_XE_VM_CREATE_FLAG_RESTARTABLE       (1 << 4)
 	/** @flags: Flags */
 	__u32 flags;
 
@@ -2531,8 +2539,44 @@ struct drm_xe_exec_queue_set_property {
 };
 
 /**
- * DOC: Xe DRM RAS
+ * DOC: DRM_XE_VM_RESTART
+ *
+ * Restart a paused VM by queuing its preempt-rebind worker.  The VM must be
+ * in preempt-fence mode and must currently be paused (i.e. its rebind worker
+ * was deactivated after a recoverable error such as -ENOMEM or -ENOSPC).
+ *
+ * Returns 0 if the rebind worker was successfully queued.  Returns -EALREADY
+ * if the VM is not currently paused.  Returns -EINVAL if the VM is not in
+ * preempt-fence mode or not restartable.
  *
+ * An optional @timestamp_ns can be provided to measure the latency between
+ * event delivery and the point the worker is queued; the driver logs this
+ * once all sanity checks have passed.
+ */
+
+/**
+ * struct drm_xe_vm_restart - restart a VM's preempt-rebind worker
+ *
+ * Used with %DRM_IOCTL_XE_VM_RESTART.
+ */
+struct drm_xe_vm_restart {
+	/** @vm_id: ID of the VM to restart */
+	__u32 vm_id;
+	/** @pad: reserved, must be zero */
+	__u32 pad;
+	/**
+	 * @timestamp_ns: optional CLOCK_MONOTONIC timestamp in nanoseconds.
+	 * When non-zero, the driver logs the delay between this timestamp and
+	 * the point the vm lock is taken, regardless of whether the VM is
+	 * currently paused.  Pass zero to disable the logging.
+	 */
+	__u64 timestamp_ns;
+	/** @reserved: reserved, must be zero */
+	__u64 reserved;
+};
+
+/**
+ * DOC: Xe DRM RAS
  * The enums and strings defined below map to the attributes of the DRM RAS Netlink Interface.
  * Refer to Documentation/netlink/specs/drm_ras.yaml for complete interface specification.
  *
-- 
2.54.0



* [PATCH 2/4] drm/xe: Add fault injection for rebind worker -ENOSPC
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
  2026-06-12 13:53 ` [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  2026-06-12 13:53 ` [PATCH 3/4] watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch() Thomas Hellström
  2026-06-12 13:53 ` [PATCH 4/4] drm/xe: Add watch_queue-based device event notification Thomas Hellström
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Use the kernel fault-injection infrastructure to inject -ENOSPC early
in the success path of preempt_rebind_work_func(), before
xe_svm_notifier_lock() is taken. This allows testing the error-handling
paths without interference from real resource exhaustion.

Injection is restricted to restartable VMs. When triggered, the
worker deactivates the VM (rebind_deactivated).
Upcoming patches will then also post an error event to userspace.

Enable via debugfs:

  echo 1 > /sys/kernel/debug/dri/0/fail_rebind/times
  echo 100 > /sys/kernel/debug/dri/0/fail_rebind/probability
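
A test program could arm the same knobs programmatically. The sketch
below is one possible shape for that and is not taken from the igt series.
fault_knob_write() and arm_rebind_failure() are hypothetical helpers, and
the directory is passed in rather than hard-coded.

```c
#include <stdio.h>

/* Write a decimal value to <dir>/<knob>; returns 0 on success, -1 on error. */
int fault_knob_write(const char *dir, const char *knob, unsigned long value)
{
	char path[512];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, knob);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%lu\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/* Arm one guaranteed injection, mirroring the two echo commands. */
int arm_rebind_failure(const char *dir)
{
	if (fault_knob_write(dir, "times", 1))
		return -1;

	return fault_knob_write(dir, "probability", 100);
}
```

Called as arm_rebind_failure("/sys/kernel/debug/dri/0/fail_rebind"), this
would make the next rebind worker run on a restartable VM fail once with
-ENOSPC.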

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_debugfs.c |  4 +++-
 drivers/gpu/drm/xe/xe_vm.c      | 32 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h      |  5 +++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index 22b471303984..1a92c52ccd83 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -35,8 +35,8 @@
 #ifdef CONFIG_DRM_XE_DEBUG
 #include "xe_bo_evict.h"
 #include "xe_migrate.h"
-#include "xe_vm.h"
 #endif
+#include "xe_vm.h"
 
 DECLARE_FAULT_ATTR(gt_reset_failure);
 DECLARE_FAULT_ATTR(inject_csc_hw_error);
@@ -612,6 +612,8 @@ void xe_debugfs_register(struct xe_device *xe)
 
 	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
 
+	xe_vm_debugfs_register(root);
+
 	if (IS_SRIOV_PF(xe))
 		xe_sriov_pf_debugfs_register(xe, root);
 	else if (IS_SRIOV_VF(xe))
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 86ed8f31a219..b69a2e5bd9c9 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -18,6 +18,9 @@
 #include <linux/kthread.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#endif
 
 #include <generated/xe_wa_oob.h>
 
@@ -43,6 +46,17 @@
 #include "xe_vm_madvise.h"
 #include "xe_wa.h"
 
+#ifdef CONFIG_FAULT_INJECTION
+static DECLARE_FAULT_ATTR(rebind_enospc);
+
+static void xe_vm_register_fault_attrs(struct dentry *root)
+{
+	fault_create_debugfs_attr("fail_rebind", root, &rebind_enospc);
+}
+#else
+static inline void xe_vm_register_fault_attrs(struct dentry *root) {}
+#endif
+
 static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
 {
 	return vm->gpuvm.r_obj;
@@ -529,6 +543,13 @@ static void preempt_rebind_work_func(struct work_struct *w)
 		goto out_unlock;
 	}
 
+#ifdef CONFIG_FAULT_INJECTION
+	if (xe_vm_is_restartable(vm) && should_fail(&rebind_enospc, 1)) {
+		err = -ENOSPC;
+		goto out_unlock;
+	}
+#endif
+
 #define retry_required(__tries, __vm) \
 	(IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT) ? \
 	(!(__tries)++ || __xe_vm_userptr_needs_repin(__vm)) : \
@@ -5042,3 +5063,14 @@ void xe_vm_remove_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
 	}
 	up_write(&vm->exec_queues.lock);
 }
+
+#ifdef CONFIG_DEBUG_FS
+/**
+ * xe_vm_debugfs_register() - Register xe_vm debugfs entries
+ * @root: debugfs root dentry for this device
+ */
+void xe_vm_debugfs_register(struct dentry *root)
+{
+	xe_vm_register_fault_attrs(root);
+}
+#endif
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 9ee44599cacd..0f9a38d97bf6 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -216,6 +216,11 @@ int xe_vm_restart_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
 void xe_vm_close_and_put(struct xe_vm *vm);
 
+#ifdef CONFIG_DEBUG_FS
+struct dentry;
+void xe_vm_debugfs_register(struct dentry *root);
+#endif
+
 static inline bool xe_vm_in_fault_mode(struct xe_vm *vm)
 {
 	return vm->flags & XE_VM_FLAG_FAULT_MODE;
-- 
2.54.0



* [PATCH 3/4] watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch()
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
  2026-06-12 13:53 ` [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL Thomas Hellström
  2026-06-12 13:53 ` [PATCH 2/4] drm/xe: Add fault injection for rebind worker -ENOSPC Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  2026-06-12 13:53 ` [PATCH 4/4] drm/xe: Add watch_queue-based device event notification Thomas Hellström
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add a DRM_XE_NOTIFY watch type for asynchronous error notifications
from the DRM_XE kernel module.

The reason for not registering a DRM-wide notification type is that,
while the notification type is 24 bits wide, the subtype is only 8 bits
wide. If this is a concern, one could instead define the DRM-wide
subtypes to be per driver, not common across DRM.
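
To make the asymmetry concrete, here is a small userspace model of the
record header. struct notification_header_model and xe_notify_header() are
illustrative names, and the struct is a local mirror rather than the uapi
definition. The type value 2 is the one this patch proposes for
WATCH_TYPE_DRM_XE_NOTIFY.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the record header with the field widths named above. */
struct notification_header_model {
	uint32_t type:24;   /* notification type */
	uint32_t subtype:8; /* type-specific subtype */
	uint32_t info;      /* remaining per-record metadata */
};

static_assert(sizeof(struct notification_header_model) == 8,
	      "header is expected to pack into two 32-bit words");

/* Value proposed for the new type in this patch. */
#define XE_NOTIFY_TYPE_MODEL 2u

/* Number of distinct values each field can carry. */
#define NOTIFY_TYPE_SPACE    (UINT32_C(1) << 24)
#define NOTIFY_SUBTYPE_SPACE (UINT32_C(1) << 8)

/* Hypothetical helper: build a header for an xe event with @subtype. */
struct notification_header_model xe_notify_header(uint8_t subtype)
{
	struct notification_header_model h = {
		.type = XE_NOTIFY_TYPE_MODEL,
		.subtype = subtype,
		.info = 0,
	};

	return h;
}
```

With roughly 16.7 million type values but only 256 subtypes per type, a
driver-specific type leaves the whole subtype space to xe, whereas a shared
DRM type would have to split those 256 values between all drivers.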

Also export the init_watch() function for use from kernel drivers.
Use EXPORT_SYMBOL() to align with other exports from the same file.

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 include/uapi/drm/xe_drm.h        |  4 ++--
 include/uapi/linux/watch_queue.h |  3 ++-
 kernel/watch_queue.c             | 13 ++++++++++---
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index bebb0167bd31..8d5e3f06b8d4 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -2550,8 +2550,8 @@ struct drm_xe_exec_queue_set_property {
  * preempt-fence mode or not restartable.
  *
  * An optional @timestamp_ns can be provided to measure the latency between
- * event delivery and the point the worker is queued; the driver logs this
- * once all sanity checks have passed.
+ * event delivery and locking; the driver logs this regardless of whether the
+ * VM was paused.
  */
 
 /**
diff --git a/include/uapi/linux/watch_queue.h b/include/uapi/linux/watch_queue.h
index c3d8320b5d3a..c800c153989d 100644
--- a/include/uapi/linux/watch_queue.h
+++ b/include/uapi/linux/watch_queue.h
@@ -14,7 +14,8 @@
 enum watch_notification_type {
 	WATCH_TYPE_META		= 0,	/* Special record */
 	WATCH_TYPE_KEY_NOTIFY	= 1,	/* Key change event notification */
-	WATCH_TYPE__NR		= 2
+	WATCH_TYPE_DRM_XE_NOTIFY	= 2,	/* DRM device event notification */
+	WATCH_TYPE__NR		= 3
 };
 
 enum watch_meta_notification_subtype {
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index 538520861e8b..701b5c388808 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -445,11 +445,17 @@ static void put_watch(struct watch *watch)
 }
 
 /**
- * init_watch - Initialise a watch
+ * init_watch() - Initialise a watch subscription
  * @watch: The watch to initialise.
- * @wqueue: The queue to assign.
+ * @wqueue: The watch queue (notification pipe) to associate with the watch.
  *
- * Initialise a watch and set the watch queue.
+ * Initialise a newly allocated watch object and associate it with @wqueue.
+ * The caller must subsequently set @watch->id and @watch->info_id before
+ * calling add_watch_to_object() to subscribe the watch to a notification
+ * source.
+ *
+ * The watch queue reference is held internally; call put_watch_queue() if
+ * the watch is not successfully passed to add_watch_to_object().
  */
 void init_watch(struct watch *watch, struct watch_queue *wqueue)
 {
@@ -458,6 +464,7 @@ void init_watch(struct watch *watch, struct watch_queue *wqueue)
 	INIT_HLIST_NODE(&watch->queue_node);
 	rcu_assign_pointer(watch->queue, wqueue);
 }
+EXPORT_SYMBOL(init_watch);
 
 static int add_one_watch(struct watch *watch, struct watch_list *wlist, struct watch_queue *wqueue)
 {
-- 
2.54.0



* [PATCH 4/4] drm/xe: Add watch_queue-based device event notification
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
                   ` (2 preceding siblings ...)
  2026-06-12 13:53 ` [PATCH 3/4] watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch() Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add a watch_queue notification channel tied to struct xe_file so that
userspace can subscribe to asynchronous GPU device events via the
general kernel notification mechanism.

Introduce DRM_IOCTL_XE_WATCH_QUEUE to let userspace subscribe a
notification pipe (opened with pipe2(O_NOTIFICATION_PIPE)) to the device
event stream.  Embed the watch_id field (0-255) in the WATCH_INFO_ID
field of every notification, allowing multiple watches to share a single
pipe and be told apart by the reader.

Deliver notifications as struct drm_xe_watch_notification records, with
type always set to WATCH_TYPE_DRM_XE_NOTIFY and subtype drawn from enum
drm_xe_watch_event.  Define DRM_XE_WATCH_EVENT_VM_ERR as the first
event, posted by the preempt-rebind worker when a VM encounters an
unrecoverable error.  Expose xe_watch_queue_post_vm_err_event() as the
in-kernel posting API.

Add event definitions in a separate uapi header, <drm/xe_drm_events.h>.
The main reason is that the header needs to include
<linux/watch_queue.h>, which in turn includes <linux/fcntl.h>, and that
may conflict with the system <fcntl.h>. User-space must therefore take
care when including this file.
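
For illustration, a reader of the notification pipe might decode a
record roughly as below. This is a sketch under stated assumptions, not
the UMD implementation: the demo_* structs and constants are
hypothetical local mirrors of <linux/watch_queue.h> and the structs
added by this patch, so that the fragment builds without the new
headers. A real client would include <drm/xe_drm_events.h> and obtain
the buffer by calling read() on the subscribed pipe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirrors of the uapi constants (assumed values). */
#define DEMO_WATCH_INFO_LENGTH		0x0000007fu	/* record length, bytes */
#define DEMO_WATCH_INFO_ID		0x00ff0000u	/* watch_id of the watch */
#define DEMO_WATCH_INFO_ID_SHIFT	16
#define DEMO_WATCH_TYPE_DRM_XE_NOTIFY	2u
#define DEMO_XE_WATCH_EVENT_VM_ERR	0u

struct demo_watch_notification {
	uint32_t type:24;
	uint32_t subtype:8;
	uint32_t info;
};

/* Mirrors the layout of struct drm_xe_watch_notification_vm_err. */
struct demo_vm_err_record {
	struct demo_watch_notification base;
	uint32_t vm_id;
	int32_t error_code;
	uint64_t timestamp_ns;
};

/* Decoded view handed back to the caller. */
struct demo_vm_err {
	uint32_t watch_id;
	uint32_t vm_id;
	int32_t error_code;
	uint64_t timestamp_ns;
};

/*
 * Decode the record at the start of buf. Returns the record length in
 * bytes so the caller can advance, or 0 if buf does not hold a complete,
 * well-formed record. *matched is set to 1 only when a VM error record
 * was decoded into *out; other records are skipped.
 */
static size_t demo_decode_record(const unsigned char *buf, size_t avail,
				 struct demo_vm_err *out, int *matched)
{
	struct demo_watch_notification hdr;
	struct demo_vm_err_record rec;
	size_t rec_len;

	assert(buf && out && matched);
	*matched = 0;
	if (avail < sizeof(hdr))
		return 0;
	memcpy(&hdr, buf, sizeof(hdr));

	rec_len = hdr.info & DEMO_WATCH_INFO_LENGTH;
	if (rec_len < sizeof(hdr) || rec_len > avail)
		return 0;

	if (hdr.type != DEMO_WATCH_TYPE_DRM_XE_NOTIFY ||
	    hdr.subtype != DEMO_XE_WATCH_EVENT_VM_ERR ||
	    rec_len < sizeof(rec))
		return rec_len;	/* not a VM error record: skip it */

	memcpy(&rec, buf, sizeof(rec));
	out->watch_id = (hdr.info & DEMO_WATCH_INFO_ID) >>
			DEMO_WATCH_INFO_ID_SHIFT;
	out->vm_id = rec.vm_id;
	out->error_code = rec.error_code;
	out->timestamp_ns = rec.timestamp_ns;
	*matched = 1;
	return rec_len;
}
```

A read loop would call this repeatedly, advancing by the returned
length until it returns 0 or the buffer is exhausted. The watch_id
recovered from the info field is what lets several watches share one
pipe.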

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 MAINTAINERS                          |   1 +
 drivers/gpu/drm/xe/Kconfig           |   1 +
 drivers/gpu/drm/xe/Makefile          |   1 +
 drivers/gpu/drm/xe/xe_device.c       |   7 ++
 drivers/gpu/drm/xe/xe_device_types.h |   6 ++
 drivers/gpu/drm/xe/xe_vm.c           |   4 +
 drivers/gpu/drm/xe/xe_vm_types.h     |   2 +
 drivers/gpu/drm/xe/xe_watch_queue.c  | 111 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_watch_queue.h  |  20 +++++
 include/uapi/drm/xe_drm.h            |  45 +++++++++++
 include/uapi/drm/xe_drm_events.h     |  62 +++++++++++++++
 11 files changed, 260 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.c
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.h
 create mode 100644 include/uapi/drm/xe_drm_events.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 8c0d9965c636..b7e02cfa692b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12900,6 +12900,7 @@ F:	Documentation/gpu/xe/
 F:	drivers/gpu/drm/xe/
 F:	include/drm/intel/
 F:	include/uapi/drm/xe_drm.h
+F:	include/uapi/drm/xe_drm_events.h
 
 INTEL ELKHART LAKE PSE I/O DRIVER
 M:	Raag Jadav <raag.jadav@intel.com>
diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index 4d7dcaff2b91..dbdc2fb49c53 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -25,6 +25,7 @@ config DRM_XE
 	select DRM_MIPI_DSI
 	select RELAY
 	select IRQ_WORK
+	select WATCH_QUEUE
 	# xe depends on ACPI_VIDEO when ACPI is enabled
 	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
 	select BACKLIGHT_CLASS_DEVICE if ACPI
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 8e7b146880f4..fc8b4023a044 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -150,6 +150,7 @@ xe-y += xe_bb.o \
 	xe_vsec.o \
 	xe_wa.o \
 	xe_wait_user_fence.o \
+	xe_watch_queue.o \
 	xe_wopcm.o
 
 xe-$(CONFIG_I2C)	+= xe_i2c.o
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 867d7c55dc03..788ef2fbd6e5 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -9,6 +9,7 @@
 #include <linux/delay.h>
 #include <linux/fault-inject.h>
 #include <linux/units.h>
+#include <linux/watch_queue.h>
 
 #include <drm/drm_client.h>
 #include <drm/drm_gem_ttm_helper.h>
@@ -77,6 +78,7 @@
 #include "xe_vsec.h"
 #include "xe_wait_user_fence.h"
 #include "xe_wa.h"
+#include "xe_watch_queue.h"
 
 #include <generated/xe_device_wa_oob.h>
 #include <generated/xe_wa_oob.h>
@@ -112,6 +114,8 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 	file->driver_priv = xef;
 	kref_init(&xef->refcount);
 
+	init_watch_list(&xef->watch_list, NULL);
+
 	task = get_pid_task(rcu_access_pointer(file->pid), PIDTYPE_PID);
 	if (task) {
 		xef->process_name = kstrdup(task->comm, GFP_KERNEL);
@@ -126,6 +130,8 @@ static void xe_file_destroy(struct kref *ref)
 {
 	struct xe_file *xef = container_of(ref, struct xe_file, refcount);
 
+	remove_watch_from_object(&xef->watch_list, NULL, 0, true);
+
 	xa_destroy(&xef->exec_queue.xa);
 	mutex_destroy(&xef->exec_queue.lock);
 	xa_destroy(&xef->vm.xa);
@@ -216,6 +222,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_RESTART, xe_vm_restart_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_WATCH_QUEUE, xe_watch_queue_ioctl, DRM_RENDER_ALLOW),
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 32dd2ffbc796..ca726ada30a7 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -11,6 +11,7 @@
 #include <drm/drm_device.h>
 #include <drm/drm_file.h>
 #include <drm/ttm/ttm_device.h>
+#include <linux/watch_queue.h>
 
 #include "xe_devcoredump_types.h"
 #include "xe_drm_ras_types.h"
@@ -632,6 +633,11 @@ struct xe_file {
 
 	/** @refcount: ref count of this xe file */
 	struct kref refcount;
+
+#ifdef CONFIG_WATCH_QUEUE
+	/** @watch_list: per-file notification source for device events */
+	struct watch_list watch_list;
+#endif
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index b69a2e5bd9c9..232de0d948d2 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -13,6 +13,7 @@
 #include <drm/drm_print.h>
 #include <drm/ttm/ttm_tt.h>
 #include <uapi/drm/xe_drm.h>
+#include <uapi/drm/xe_drm_events.h>
 #include <linux/ascii85.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
@@ -45,6 +46,7 @@
 #include "xe_trace_bo.h"
 #include "xe_vm_madvise.h"
 #include "xe_wa.h"
+#include "xe_watch_queue.h"
 
 #ifdef CONFIG_FAULT_INJECTION
 static DECLARE_FAULT_ATTR(rebind_enospc);
@@ -584,6 +586,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	}
 
 	if (err) {
+		xe_watch_queue_post_vm_err_event(vm->xef, vm->id, err);
 		if ((err == -ENOMEM || err == -ENOSPC) && xe_vm_is_restartable(vm)) {
 			vm->preempt.rebind_deactivated = true;
 			drm_dbg(&vm->xe->drm, "Rebind deactivated VM on error %pe\n",
@@ -2229,6 +2232,7 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (err)
 		goto err_close_and_put;
 
+	vm->id = id;
 	args->vm_id = id;
 
 	return 0;
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 7d295c3b8456..19a673099588 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -407,6 +407,8 @@ struct xe_vm {
 	bool batch_invalidate_tlb;
 	/** @xef: Xe file handle for tracking this VM's drm client */
 	struct xe_file *xef;
+	/** @id: The id of the VM in the VM table of @xef. */
+	u32 id;
 };
 
 /** struct xe_vma_op_map - VMA map operation */
diff --git a/drivers/gpu/drm/xe/xe_watch_queue.c b/drivers/gpu/drm/xe/xe_watch_queue.c
new file mode 100644
index 000000000000..32763591075b
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_watch_queue.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#include <linux/slab.h>
+#include <linux/timekeeping.h>
+#include <linux/watch_queue.h>
+
+#include <uapi/drm/xe_drm.h>
+#include <uapi/drm/xe_drm_events.h>
+
+#include "xe_device.h"
+#include "xe_device_types.h"
+#include "xe_macros.h"
+#include "xe_watch_queue.h"
+
+/**
+ * struct xe_watch_notification_vm_err - kernel-side VM error event notification
+ * @base: common watch notification header; type is %WATCH_TYPE_DRM_XE_NOTIFY,
+ *        subtype is %DRM_XE_WATCH_EVENT_VM_ERR
+ * @vm_id: ID of the VM that hit error
+ * @error_code: error code describing the error condition (negative errno)
+ * @timestamp_ns: CLOCK_MONOTONIC timestamp in nanoseconds at the point the
+ *                error was detected
+ *
+ * Layout mirrors &struct drm_xe_watch_notification_vm_err.
+ */
+struct xe_watch_notification_vm_err {
+	struct watch_notification base;
+	u32 vm_id;
+	s32 error_code;
+	u64 timestamp_ns;
+};
+
+/**
+ * xe_watch_queue_ioctl() - Subscribe a pipe to per-file device event notifications
+ * @dev: DRM device
+ * @data: pointer to &struct drm_xe_watch_queue from userspace
+ * @file: DRM file handle of the subscribing process
+ *
+ * Subscribes a notification pipe to receive Xe device events for the calling
+ * process's file handle.  Only events scoped to this file (e.g. VM error on a
+ * VM owned by this file) are delivered.  The pipe must have been opened with
+ * O_NOTIFICATION_PIPE and sized with %IOC_WATCH_QUEUE_SET_SIZE before calling
+ * this IOCTL.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int xe_watch_queue_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct xe_file *xef = file->driver_priv;
+	struct xe_device *xe = to_xe_device(dev);
+	struct drm_xe_watch_queue *args = data;
+	struct watch_queue *wqueue;
+	struct watch *watch;
+	int ret;
+
+	if (XE_IOCTL_DBG(xe, args->flags || args->pad))
+		return -EINVAL;
+	if (XE_IOCTL_DBG(xe, args->watch_id > 0xff))
+		return -EINVAL;
+
+	wqueue = get_watch_queue(args->fd);
+	if (XE_IOCTL_DBG(xe, IS_ERR(wqueue)))
+		return PTR_ERR(wqueue);
+
+	watch = kzalloc_obj(*watch, GFP_KERNEL | __GFP_ACCOUNT);
+	if (XE_IOCTL_DBG(xe, !watch)) {
+		ret = -ENOMEM;
+		goto out_put_queue;
+	}
+
+	init_watch(watch, wqueue);
+	watch->id = 0;
+	watch->info_id = (u32)args->watch_id << WATCH_INFO_ID__SHIFT;
+
+	ret = add_watch_to_object(watch, &xef->watch_list);
+	if (XE_IOCTL_DBG(xe, ret))
+		kfree(watch);
+
+out_put_queue:
+	put_watch_queue(wqueue);
+	return ret;
+}
+
+/**
+ * xe_watch_queue_post_vm_err_event() - Post a VM error event
+ * @xef: xe file handle that owns the VM
+ * @vm_id: userspace ID of the VM that hit error
+ * @error_code: error code describing the error condition (negative errno)
+ *
+ * Posts a %DRM_XE_WATCH_EVENT_VM_ERR notification carrying @vm_id and
+ * @error_code to every pipe that @xef has subscribed via
+ * %DRM_IOCTL_XE_WATCH_QUEUE.  Only the owning process is notified,
+ * preventing information leaks to other clients.
+ */
+void xe_watch_queue_post_vm_err_event(struct xe_file *xef, u32 vm_id,
+				      int error_code)
+{
+	struct xe_watch_notification_vm_err n = {};
+
+	n.base.type    = WATCH_TYPE_DRM_XE_NOTIFY;
+	n.base.subtype = DRM_XE_WATCH_EVENT_VM_ERR;
+	n.base.info    = watch_sizeof(struct xe_watch_notification_vm_err);
+	n.vm_id        = vm_id;
+	n.error_code   = error_code;
+	n.timestamp_ns = ktime_get_ns();
+
+	post_watch_notification(&xef->watch_list, &n.base, current_cred(), 0);
+}
diff --git a/drivers/gpu/drm/xe/xe_watch_queue.h b/drivers/gpu/drm/xe/xe_watch_queue.h
new file mode 100644
index 000000000000..ad199ee68205
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_watch_queue.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#ifndef _XE_WATCH_QUEUE_H_
+#define _XE_WATCH_QUEUE_H_
+
+#include <linux/types.h>
+
+struct drm_device;
+struct drm_file;
+struct xe_file;
+
+int xe_watch_queue_ioctl(struct drm_device *dev, void *data,
+			 struct drm_file *file);
+void xe_watch_queue_post_vm_err_event(struct xe_file *xef, u32 vm_id,
+				      int error_code);
+
+#endif /* _XE_WATCH_QUEUE_H_ */
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 8d5e3f06b8d4..0083dd712f7e 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -86,6 +86,7 @@ extern "C" {
  *  - &DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY
  *  - &DRM_IOCTL_XE_VM_GET_PROPERTY
  *  - &DRM_IOCTL_XE_VM_RESTART
+ *  - &DRM_IOCTL_XE_WATCH_QUEUE
  */
 
 /*
@@ -112,6 +113,7 @@ extern "C" {
 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x0e
 #define DRM_XE_VM_GET_PROPERTY		0x0f
 #define DRM_XE_VM_RESTART		0x10
+#define DRM_XE_WATCH_QUEUE		0x11
 
 /* Must be kept compact -- no holes */
 
@@ -132,6 +134,7 @@ extern "C" {
 #define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_VM_GET_PROPERTY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_GET_PROPERTY, struct drm_xe_vm_get_property)
 #define DRM_IOCTL_XE_VM_RESTART			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_RESTART, struct drm_xe_vm_restart)
+#define DRM_IOCTL_XE_WATCH_QUEUE		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_WATCH_QUEUE, struct drm_xe_watch_queue)
 
 /**
  * DOC: Xe IOCTL Extensions
@@ -2653,6 +2656,48 @@ enum drm_xe_ras_error_component {
 	[DRM_XE_RAS_ERR_COMP_SOC_INTERNAL] = "soc-internal"		\
 }
 
+/**
+ * DOC: DRM_XE_WATCH_QUEUE
+ *
+ * Subscribe a notification pipe to receive device events for the calling
+ * process's DRM file handle.  Events are scoped to the subscribing file:
+ * only events that belong to that file (for example, VM error on a VM created
+ * through the same file) are delivered, preventing information leaks between
+ * processes sharing the same GPU device.
+ *
+ * The pipe must first be opened with O_NOTIFICATION_PIPE (i.e. O_EXCL passed
+ * to pipe2()) and sized via %IOC_WATCH_QUEUE_SET_SIZE before subscribing.
+ *
+ * Events are delivered as notification records read from the pipe.  The
+ * @watch_id field is embedded in the notification info field and can be used
+ * to distinguish multiple watches sharing a pipe.
+ *
+ * Currently defined event subtypes:
+ *  - %DRM_XE_WATCH_EVENT_VM_ERR - a VM owned by this file has encountered an error
+ */
+
+/**
+ * struct drm_xe_watch_queue - subscribe to device event notifications
+ *
+ * Used with %DRM_IOCTL_XE_WATCH_QUEUE.  Notifications are scoped to the
+ * DRM file handle used to issue this IOCTL.
+ */
+struct drm_xe_watch_queue {
+	/** @fd: file descriptor of pipe opened with O_NOTIFICATION_PIPE */
+	__u32 fd;
+
+	/**
+	 * @watch_id: identifier (0–255) embedded in the watch notification
+	 * info field; allows multiplexing several watches on one pipe
+	 */
+	__u32 watch_id;
+
+	/** @flags: must be zero */
+	__u32 flags;
+
+	/** @pad: reserved, must be zero */
+	__u32 pad;
+};
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/uapi/drm/xe_drm_events.h b/include/uapi/drm/xe_drm_events.h
new file mode 100644
index 000000000000..6cc7528bfb9b
--- /dev/null
+++ b/include/uapi/drm/xe_drm_events.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#ifndef _UAPI_XE_DRM_EVENTS_H_
+#define _UAPI_XE_DRM_EVENTS_H_
+
+#include <linux/types.h>
+#include <linux/watch_queue.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/**
+ * enum drm_xe_watch_event - Xe device watch event subtypes
+ *
+ * Subtypes for notifications delivered via %WATCH_TYPE_DRM_XE_NOTIFY when
+ * reading from a pipe subscribed with %DRM_IOCTL_XE_WATCH_QUEUE.
+ */
+enum drm_xe_watch_event {
+	/**
+	 * @DRM_XE_WATCH_EVENT_VM_ERR: a VM has encountered an error.
+	 *
+	 * Indicates that a fatal or resource error occurred within the given
+	 * VM.  The vm_id of the affected VM is carried in the
+	 * @drm_xe_watch_notification_vm_err::vm_id field of the extended
+	 * notification record.
+	 */
+	DRM_XE_WATCH_EVENT_VM_ERR = 0,
+};
+
+/**
+ * struct drm_xe_watch_notification_vm_err - VM error event notification
+ *
+ * Notification record delivered for %DRM_XE_WATCH_EVENT_VM_ERR.
+ * The record type is always %WATCH_TYPE_DRM_XE_NOTIFY and the subtype is
+ * %DRM_XE_WATCH_EVENT_VM_ERR.
+ */
+struct drm_xe_watch_notification_vm_err {
+	/** @base: common watch notification header */
+	struct watch_notification base;
+
+	/** @vm_id: ID of the VM that encountered an error */
+	__u32 vm_id;
+
+	/** @error_code: error code describing the error condition (negative errno) */
+	__s32 error_code;
+
+	/**
+	 * @timestamp_ns: CLOCK_MONOTONIC timestamp in nanoseconds at the
+	 * point the error was detected
+	 */
+	__u64 timestamp_ns;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _UAPI_XE_DRM_EVENTS_H_ */
-- 
2.54.0

