The Linux Kernel Mailing List
* [RFC PATCH 0/4] Xe driver asynchronous notification mechanism
@ 2026-06-12 13:53 Thomas Hellström
  2026-06-12 13:53 ` [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL Thomas Hellström
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

There is a need to inform user-space clients when a rebind worker
has run out of memory so that they can react, adjust their working
set and restart the job. This patch series aims to start a discussion
about the best way to accomplish this.

The series builds on the core "general notification mechanism" or
"watch_queue", and attaches a watch queue to each xe drm file.

The watch_queue is extremely flexible and allows filtering out
events of interest at the kernel level. There can be multiple
listeners.
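
As a rough illustration of the kernel-level filtering mentioned above,
the sketch below builds a one-entry watch_queue filter in user-space.
It relies only on the existing <linux/watch_queue.h> uapi structures.
The watch type value 2 is the one proposed in patch 3 and is not in
released headers, so it is defined locally here as an assumption, and
build_single_subtype_filter() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <linux/watch_queue.h>

/* Assumption: value proposed by patch 3; not in released uapi headers. */
#define XE_WATCH_TYPE_SKETCH 2

/*
 * Allocate a one-entry filter that accepts a single subtype of the given
 * watch type. The caller frees the result with free().
 */
static struct watch_notification_filter *
build_single_subtype_filter(unsigned int type, unsigned int subtype)
{
	struct watch_notification_filter *f;
	size_t size = sizeof(*f) + sizeof(f->filters[0]);

	if (subtype > 255)
		return NULL;

	f = calloc(1, size);
	if (!f)
		return NULL;

	f->nr_filters = 1;
	f->filters[0].type = type;
	/* info_filter and info_mask stay zero, so the info word is not filtered. */
	/* subtype_filter is a 256-bit bitmap, one bit per subtype. */
	f->filters[0].subtype_filter[subtype / 32] |= 1u << (subtype % 32);
	return f;
}
```

Such a filter would normally be installed with
ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, f) on a pipe opened with
O_NOTIFICATION_PIPE. How the pipe is attached to an xe drm file is
defined by patch 4 and is not shown here.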

Patch 1 Implements a restart IOCTL for rebind-workers
      paused on OOM.
Patch 2 Adds fault-injection into the rebind worker for
      testing.
Patch 3 Adds a DRM_XE_NOTIFY watch_type.
Patch 4 Implements watch_queue event sending from within
      xe.

igt series:
Test-with: https://patchwork.freedesktop.org/series/168429/

Compute UMD side is not available yet. Will be available before
final review.

Thomas Hellström (4):
  drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL
  drm/xe: Add fault injection for rebind worker -ENOSPC
  watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch()
  drm/xe: Add watch_queue-based device event notification

 MAINTAINERS                          |   1 +
 drivers/gpu/drm/xe/Kconfig           |   1 +
 drivers/gpu/drm/xe/Makefile          |   1 +
 drivers/gpu/drm/xe/xe_debugfs.c      |   4 +-
 drivers/gpu/drm/xe/xe_device.c       |   8 ++
 drivers/gpu/drm/xe/xe_device_types.h |   6 ++
 drivers/gpu/drm/xe/xe_vm.c           | 135 ++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_vm.h           |  13 ++-
 drivers/gpu/drm/xe/xe_vm_types.h     |   3 +
 drivers/gpu/drm/xe/xe_watch_queue.c  | 111 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_watch_queue.h  |  20 ++++
 include/uapi/drm/xe_drm.h            |  91 +++++++++++++++++-
 include/uapi/drm/xe_drm_events.h     |  62 ++++++++++++
 include/uapi/linux/watch_queue.h     |   3 +-
 kernel/watch_queue.c                 |  13 ++-
 15 files changed, 462 insertions(+), 10 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.c
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.h
 create mode 100644 include/uapi/drm/xe_drm_events.h

-- 
2.54.0



* [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  2026-06-12 13:53 ` [PATCH 2/4] drm/xe: Add fault injection for rebind worker -ENOSPC Thomas Hellström
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add an async VM restart IOCTL that allows userspace to re-queue the
preempt-rebind worker for a VM that has been paused after a recoverable
error.

Add xe_vm_restart_ioctl() which:
- Looks up the VM by id via xe_vm_lookup()
- Returns -EINVAL if the VM is not in preempt-fence mode or not restartable
- Returns -EALREADY if the VM is not currently paused
- Queues the rebind worker via xe_vm_reactivate_rebind() and returns 0
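
A user-space caller could map these return codes to actions roughly as
sketched below. The enum and classify_restart() are hypothetical names
used only for illustration. The -ENOENT case comes from the kernel-doc
in the diff, and treating EINTR as "retry" is an assumption based on the
interruptible locks taken by the IOCTL.

```c
#include <assert.h>
#include <errno.h>

enum restart_outcome {
	RESTART_QUEUED,		/* 0: rebind worker was queued */
	RESTART_NOT_PAUSED,	/* EALREADY: VM is not paused, nothing to do */
	RESTART_BAD_VM,		/* EINVAL or ENOENT: wrong kind of VM, or no such VM */
	RESTART_RETRY,		/* EINTR: interrupted while waiting for a lock */
	RESTART_OTHER,		/* anything else */
};

/* ioctl_ret is the ioctl() return value, saved_errno the errno seen after it. */
static enum restart_outcome classify_restart(int ioctl_ret, int saved_errno)
{
	if (ioctl_ret == 0)
		return RESTART_QUEUED;

	switch (saved_errno) {
	case EALREADY:
		return RESTART_NOT_PAUSED;
	case EINVAL:
	case ENOENT:
		return RESTART_BAD_VM;
	case EINTR:
		return RESTART_RETRY;
	default:
		return RESTART_OTHER;
	}
}
```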

If the optional @timestamp_ns field is non-zero, logs the latency
between that timestamp and the point the vm lock is taken, regardless
of whether the VM was paused.
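
The logged value is an unsigned nanosecond difference divided down to
microseconds. The stand-alone sketch below mirrors that arithmetic in
user-space. restart_delay_us() and the constant name are hypothetical.
It also shows that a timestamp later than the current clock value wraps
to a very large delay instead of going negative.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC_SKETCH 1000ull

/* Same arithmetic as the debug log: (now - timestamp) / NSEC_PER_USEC. */
static uint64_t restart_delay_us(uint64_t now_ns, uint64_t timestamp_ns)
{
	/* Unsigned subtraction: wraps around if timestamp_ns > now_ns. */
	return (now_ns - timestamp_ns) / NSEC_PER_USEC_SKETCH;
}
```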

Add DRM_XE_VM_CREATE_FLAG_RESTARTABLE to opt a VM in to the restartable
behaviour: on recoverable errors (-ENOMEM, -ENOSPC) the rebind worker
is deactivated rather than the VM being killed. Requires
DRM_XE_VM_CREATE_FLAG_LR_MODE and may not be used with
DRM_XE_VM_CREATE_FLAG_FAULT_MODE.
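
The flag rule can be restated as a small predicate. The sketch below is
a user-space restatement, not the kernel code. The flag values are
copied from the uapi hunk of this patch under shortened local names,
and restartable_flags_valid() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the uapi hunk in this patch, under local names. */
#define VM_CREATE_FLAG_LR_MODE		(1u << 1)
#define VM_CREATE_FLAG_FAULT_MODE	(1u << 2)
#define VM_CREATE_FLAG_RESTARTABLE	(1u << 4)

/* RESTARTABLE is only accepted together with LR_MODE and without FAULT_MODE. */
static bool restartable_flags_valid(uint32_t flags)
{
	if (!(flags & VM_CREATE_FLAG_RESTARTABLE))
		return true;	/* the rule only applies when the flag is set */

	return (flags & VM_CREATE_FLAG_LR_MODE) &&
	       !(flags & VM_CREATE_FLAG_FAULT_MODE);
}
```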

Add struct drm_xe_vm_restart UAPI struct with vm_id, pad, timestamp_ns
and reserved fields, and register the IOCTL at slot 0x10.
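
For illustration, the sketch below shows how user-space might populate
the argument struct. The struct is a local mirror of the layout proposed
in this patch, since the real definition is not in released headers, and
fill_restart_args() is a hypothetical helper. The timestamp is taken
from CLOCK_MONOTONIC, as the uapi documentation for @timestamp_ns asks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Local mirror of the proposed struct drm_xe_vm_restart (assumption). */
struct vm_restart_args_sketch {
	uint32_t vm_id;
	uint32_t pad;		/* must be zero */
	uint64_t timestamp_ns;	/* optional, zero disables latency logging */
	uint64_t reserved;	/* must be zero */
};

static_assert(sizeof(struct vm_restart_args_sketch) == 24,
	      "layout has no implicit padding");
static_assert(offsetof(struct vm_restart_args_sketch, timestamp_ns) == 8,
	      "timestamp_ns is naturally aligned");

/* Zero the struct, set the VM id and optionally stamp the current time. */
static void fill_restart_args(struct vm_restart_args_sketch *args,
			      uint32_t vm_id, int want_latency_log)
{
	struct timespec ts;

	memset(args, 0, sizeof(*args));
	args->vm_id = vm_id;

	if (want_latency_log && clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
		args->timestamp_ns = (uint64_t)ts.tv_sec * 1000000000ull +
				     (uint64_t)ts.tv_nsec;
}
```

Zeroing the whole struct first keeps @pad and @reserved at zero, which
the IOCTL checks before doing anything else.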

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

---
 drivers/gpu/drm/xe/xe_device.c   |  1 +
 drivers/gpu/drm/xe/xe_vm.c       | 99 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_vm.h       |  8 ++-
 drivers/gpu/drm/xe/xe_vm_types.h |  1 +
 include/uapi/drm/xe_drm.h        | 46 ++++++++++++++-
 5 files changed, 150 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 51e3a2dd7b22..867d7c55dc03 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -215,6 +215,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
 			  DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_VM_RESTART, xe_vm_restart_ioctl, DRM_RENDER_ALLOW),
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 75841f3e9afa..86ed8f31a219 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -563,8 +563,14 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	}
 
 	if (err) {
-		drm_warn(&vm->xe->drm, "VM worker error: %d\n", err);
-		xe_vm_kill(vm, true);
+		if ((err == -ENOMEM || err == -ENOSPC) && xe_vm_is_restartable(vm)) {
+			vm->preempt.rebind_deactivated = true;
+			drm_dbg(&vm->xe->drm, "Rebind deactivated VM on error %pe\n",
+				ERR_PTR(err));
+		} else {
+			drm_warn(&vm->xe->drm, "VM worker error: %d\n", err);
+			xe_vm_kill(vm, true);
+		}
 	}
 	up_write(&vm->lock);
 
@@ -573,6 +579,85 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	trace_xe_vm_rebind_worker_exit(vm);
 }
 
+/**
+ * xe_vm_restart_ioctl() - Queue the preempt-rebind worker for a paused VM
+ * @dev: DRM device
+ * @data: pointer to &struct drm_xe_vm_restart from userspace
+ * @file: DRM file handle
+ *
+ * Looks up the VM identified by @vm_id and, if it is currently paused (its
+ * rebind worker was deactivated after a recoverable error), clears the paused
+ * state and queues the rebind worker.  Only valid for VMs in preempt-fence
+ * mode.
+ *
+ * If @timestamp_ns is non-zero, logs the latency between that timestamp and
+ * the point the vm lock is taken, regardless of whether the VM was paused.
+ *
+ * Return: 0 if the worker was queued, -EALREADY if the VM is not paused,
+ *         -EINVAL if the VM is not in preempt-fence mode or not restartable,
+ *         -ENOENT if the VM was not found.
+ */
+int xe_vm_restart_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(dev);
+	struct xe_file *xef = to_xe_file(file);
+	struct drm_xe_vm_restart *args = data;
+	struct xe_vm *vm;
+	int err = 0;
+
+	if (XE_IOCTL_DBG(xe, args->reserved || args->pad))
+		return -EINVAL;
+
+	vm = xe_vm_lookup(xef, args->vm_id);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -ENOENT;
+
+	if (XE_IOCTL_DBG(xe, !xe_vm_in_preempt_fence_mode(vm))) {
+		xe_vm_put(vm);
+		return -EINVAL;
+	}
+
+	if (XE_IOCTL_DBG(xe, !xe_vm_is_restartable(vm))) {
+		xe_vm_put(vm);
+		return -EINVAL;
+	}
+
+	err = down_read_interruptible(&vm->lock);
+	if (err)
+		goto out;
+
+	if (XE_IOCTL_DBG(xe, xe_vm_is_closed_or_banned(vm))) {
+		err = -ENOENT;
+		goto out_unlock_read;
+	}
+
+	if (args->timestamp_ns) {
+		u64 delay_us = (ktime_get_ns() - args->timestamp_ns) / NSEC_PER_USEC;
+
+		drm_dbg(&xe->drm, "VM %u restart latency: %llu us\n",
+			args->vm_id, delay_us);
+	}
+
+	err = xe_vm_lock(vm, true);
+	if (err)
+		goto out_unlock_read;
+
+	if (!vm->preempt.rebind_deactivated) {
+		err = -EALREADY;
+		goto out_unlock_resv;
+	}
+
+	xe_vm_reactivate_rebind(vm);
+out_unlock_resv:
+	xe_vm_unlock(vm);
+out_unlock_read:
+	up_read(&vm->lock);
+out:
+	xe_vm_put(vm);
+	return err;
+}
+
 /**
  * xe_vm_add_fault_entry_pf() - Add pagefault to vm fault list
  * @vm: The VM.
@@ -2049,7 +2134,8 @@ find_ufence_get(struct xe_sync_entry *syncs, u32 num_syncs)
 #define ALL_DRM_XE_VM_CREATE_FLAGS (DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | \
 				    DRM_XE_VM_CREATE_FLAG_LR_MODE | \
 				    DRM_XE_VM_CREATE_FLAG_FAULT_MODE | \
-				    DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT)
+				    DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT | \
+				    DRM_XE_VM_CREATE_FLAG_RESTARTABLE)
 
 int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		       struct drm_file *file)
@@ -2092,6 +2178,11 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 			 args->flags & DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT))
 		return -EINVAL;
 
+	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_RESTARTABLE &&
+			 (!(args->flags & DRM_XE_VM_CREATE_FLAG_LR_MODE) ||
+			  args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE)))
+		return -EINVAL;
+
 	if (args->flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE)
 		flags |= XE_VM_FLAG_SCRATCH_PAGE;
 	if (args->flags & DRM_XE_VM_CREATE_FLAG_LR_MODE)
@@ -2100,6 +2191,8 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		flags |= XE_VM_FLAG_FAULT_MODE;
 	if (args->flags & DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT)
 		flags |= XE_VM_FLAG_NO_VM_OVERCOMMIT;
+	if (args->flags & DRM_XE_VM_CREATE_FLAG_RESTARTABLE)
+		flags |= XE_VM_FLAG_RESTARTABLE;
 
 	vm = xe_vm_create(xe, flags, xef);
 	if (IS_ERR(vm))
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index c5b900f38ded..9ee44599cacd 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -212,7 +212,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
 int xe_vm_query_vmas_attrs_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
 int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file);
-
+int xe_vm_restart_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file);
 void xe_vm_close_and_put(struct xe_vm *vm);
 
 static inline bool xe_vm_in_fault_mode(struct xe_vm *vm)
@@ -237,6 +238,11 @@ static inline bool xe_vm_allow_vm_eviction(struct xe_vm *vm)
 		 !(vm->flags & XE_VM_FLAG_NO_VM_OVERCOMMIT));
 }
 
+static inline bool xe_vm_is_restartable(struct xe_vm *vm)
+{
+	return vm->flags & XE_VM_FLAG_RESTARTABLE;
+}
+
 int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q);
 void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q);
 
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 635ed29b9a69..7d295c3b8456 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -264,6 +264,7 @@ struct xe_vm {
 #define XE_VM_FLAG_SET_TILE_ID(tile)	FIELD_PREP(GENMASK(7, 6), (tile)->id)
 #define XE_VM_FLAG_GSC			BIT(8)
 #define XE_VM_FLAG_NO_VM_OVERCOMMIT     BIT(9)
+#define XE_VM_FLAG_RESTARTABLE          BIT(10)
 	unsigned long flags;
 
 	/**
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 48e9f1fdb78d..bebb0167bd31 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -85,6 +85,7 @@ extern "C" {
  *  - &DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS
  *  - &DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY
  *  - &DRM_IOCTL_XE_VM_GET_PROPERTY
+ *  - &DRM_IOCTL_XE_VM_RESTART
  */
 
 /*
@@ -110,6 +111,7 @@ extern "C" {
 #define DRM_XE_VM_QUERY_MEM_RANGE_ATTRS	0x0d
 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x0e
 #define DRM_XE_VM_GET_PROPERTY		0x0f
+#define DRM_XE_VM_RESTART		0x10
 
 /* Must be kept compact -- no holes */
 
@@ -129,6 +131,7 @@ extern "C" {
 #define DRM_IOCTL_XE_VM_QUERY_MEM_RANGE_ATTRS	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_QUERY_MEM_RANGE_ATTRS, struct drm_xe_vm_query_mem_range_attr)
 #define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_VM_GET_PROPERTY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_GET_PROPERTY, struct drm_xe_vm_get_property)
+#define DRM_IOCTL_XE_VM_RESTART			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_RESTART, struct drm_xe_vm_restart)
 
 /**
  * DOC: Xe IOCTL Extensions
@@ -985,6 +988,10 @@ struct drm_xe_gem_mmap_offset {
  *    but only during a &DRM_IOCTL_XE_VM_BIND operation with the
  *    %DRM_XE_VM_BIND_FLAG_IMMEDIATE flag set. This may be useful for
  *    user-space naively probing the amount of available memory.
+ *  - %DRM_XE_VM_CREATE_FLAG_RESTARTABLE - Requires also
+ *    DRM_XE_VM_CREATE_FLAG_LR_MODE. Marks the VM as restartable, enabling
+ *    use of &DRM_IOCTL_XE_VM_RESTART to resume the preempt-rebind worker
+ *    after an error has paused it.
  */
 struct drm_xe_vm_create {
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -994,6 +1001,7 @@ struct drm_xe_vm_create {
 #define DRM_XE_VM_CREATE_FLAG_LR_MODE	        (1 << 1)
 #define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(1 << 2)
 #define DRM_XE_VM_CREATE_FLAG_NO_VM_OVERCOMMIT  (1 << 3)
+#define DRM_XE_VM_CREATE_FLAG_RESTARTABLE       (1 << 4)
 	/** @flags: Flags */
 	__u32 flags;
 
@@ -2531,8 +2539,44 @@ struct drm_xe_exec_queue_set_property {
 };
 
 /**
- * DOC: Xe DRM RAS
+ * DOC: DRM_XE_VM_RESTART
+ *
+ * Restart a paused VM by queuing its preempt-rebind worker.  The VM must be
+ * in preempt-fence mode and must currently be paused (i.e. its rebind worker
+ * was deactivated after a recoverable error such as -ENOMEM or -ENOSPC).
+ *
+ * Returns 0 if the rebind worker was successfully queued.  Returns -EALREADY
+ * if the VM is not currently paused.  Returns -EINVAL if the VM is not in
+ * preempt-fence mode or not restartable.
  *
+ * An optional @timestamp_ns can be provided to measure the latency between
+ * event delivery and the point the worker is queued; the driver logs this
+ * once all sanity checks have passed.
+ */
+
+/**
+ * struct drm_xe_vm_restart - restart a VM's preempt-rebind worker
+ *
+ * Used with %DRM_IOCTL_XE_VM_RESTART.
+ */
+struct drm_xe_vm_restart {
+	/** @vm_id: ID of the VM to restart */
+	__u32 vm_id;
+	/** @pad: reserved, must be zero */
+	__u32 pad;
+	/**
+	 * @timestamp_ns: optional CLOCK_MONOTONIC timestamp in nanoseconds.
+	 * When non-zero, the driver logs the delay between this timestamp and
+	 * the point the vm lock is taken, regardless of whether the VM is
+	 * currently paused.  Pass zero to disable the logging.
+	 */
+	__u64 timestamp_ns;
+	/** @reserved: reserved, must be zero */
+	__u64 reserved;
+};
+
+/**
+ * DOC: Xe DRM RAS
  * The enums and strings defined below map to the attributes of the DRM RAS Netlink Interface.
  * Refer to Documentation/netlink/specs/drm_ras.yaml for complete interface specification.
  *
-- 
2.54.0



* [PATCH 2/4] drm/xe: Add fault injection for rebind worker -ENOSPC
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
  2026-06-12 13:53 ` [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  2026-06-12 13:53 ` [PATCH 3/4] watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch() Thomas Hellström
  2026-06-12 13:53 ` [PATCH 4/4] drm/xe: Add watch_queue-based device event notification Thomas Hellström
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add fault injection support using the kernel fault injection
infrastructure to inject -ENOSPC early in the success path of
preempt_rebind_work_func(), before xe_svm_notifier_lock() is taken,
testing the error handling paths without interference from real
resource exhaustion.

Injection is restricted to restartable VMs. When triggered, the
worker deactivates the VM (rebind_deactivated).
Upcoming patches will then also post an error event to userspace.

Enable via debugfs:

  echo 1 > /sys/kernel/debug/dri/0/fail_rebind/times
  echo 100 > /sys/kernel/debug/dri/0/fail_rebind/probability
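
A test written in C could set the same attributes with a small helper
like the one sketched below. write_fault_attr() is a hypothetical name.
The directory argument would be the fail_rebind debugfs directory shown
above, and the attribute names are the standard fault-injection ones
used in the echo commands.

```c
#include <assert.h>
#include <stdio.h>

/* Write "<value>\n" to <dir>/<name>. Returns 0 on success, -1 on error. */
static int write_fault_attr(const char *dir, const char *name, long value)
{
	char path[512];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	n = fprintf(f, "%ld\n", value);
	/* fclose() flushes, so a failed write is reported here at the latest. */
	if (fclose(f) != 0 || n < 0)
		return -1;

	return 0;
}
```

For example, write_fault_attr("/sys/kernel/debug/dri/0/fail_rebind",
"times", 1) followed by the same call with "probability" and 100 would
correspond to the two echo commands.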

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_debugfs.c |  4 +++-
 drivers/gpu/drm/xe/xe_vm.c      | 32 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h      |  5 +++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index 22b471303984..1a92c52ccd83 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -35,8 +35,8 @@
 #ifdef CONFIG_DRM_XE_DEBUG
 #include "xe_bo_evict.h"
 #include "xe_migrate.h"
-#include "xe_vm.h"
 #endif
+#include "xe_vm.h"
 
 DECLARE_FAULT_ATTR(gt_reset_failure);
 DECLARE_FAULT_ATTR(inject_csc_hw_error);
@@ -612,6 +612,8 @@ void xe_debugfs_register(struct xe_device *xe)
 
 	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
 
+	xe_vm_debugfs_register(root);
+
 	if (IS_SRIOV_PF(xe))
 		xe_sriov_pf_debugfs_register(xe, root);
 	else if (IS_SRIOV_VF(xe))
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 86ed8f31a219..b69a2e5bd9c9 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -18,6 +18,9 @@
 #include <linux/kthread.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#endif
 
 #include <generated/xe_wa_oob.h>
 
@@ -43,6 +46,17 @@
 #include "xe_vm_madvise.h"
 #include "xe_wa.h"
 
+#ifdef CONFIG_FAULT_INJECTION
+static DECLARE_FAULT_ATTR(rebind_enospc);
+
+static void xe_vm_register_fault_attrs(struct dentry *root)
+{
+	fault_create_debugfs_attr("fail_rebind", root, &rebind_enospc);
+}
+#else
+static inline void xe_vm_register_fault_attrs(struct dentry *root) {}
+#endif
+
 static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
 {
 	return vm->gpuvm.r_obj;
@@ -529,6 +543,13 @@ static void preempt_rebind_work_func(struct work_struct *w)
 		goto out_unlock;
 	}
 
+#ifdef CONFIG_FAULT_INJECTION
+	if (xe_vm_is_restartable(vm) && should_fail(&rebind_enospc, 1)) {
+		err = -ENOSPC;
+		goto out_unlock;
+	}
+#endif
+
 #define retry_required(__tries, __vm) \
 	(IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT) ? \
 	(!(__tries)++ || __xe_vm_userptr_needs_repin(__vm)) : \
@@ -5042,3 +5063,14 @@ void xe_vm_remove_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
 	}
 	up_write(&vm->exec_queues.lock);
 }
+
+#ifdef CONFIG_DEBUG_FS
+/**
+ * xe_vm_debugfs_register() - Register xe_vm debugfs entries
+ * @root: debugfs root dentry for this device
+ */
+void xe_vm_debugfs_register(struct dentry *root)
+{
+	xe_vm_register_fault_attrs(root);
+}
+#endif
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 9ee44599cacd..0f9a38d97bf6 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -216,6 +216,11 @@ int xe_vm_restart_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
 void xe_vm_close_and_put(struct xe_vm *vm);
 
+#ifdef CONFIG_DEBUG_FS
+struct dentry;
+void xe_vm_debugfs_register(struct dentry *root);
+#endif
+
 static inline bool xe_vm_in_fault_mode(struct xe_vm *vm)
 {
 	return vm->flags & XE_VM_FLAG_FAULT_MODE;
-- 
2.54.0



* [PATCH 3/4] watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch()
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
  2026-06-12 13:53 ` [PATCH 1/4] drm/xe: Add DRM_IOCTL_XE_VM_RESTART IOCTL Thomas Hellström
  2026-06-12 13:53 ` [PATCH 2/4] drm/xe: Add fault injection for rebind worker -ENOSPC Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  2026-06-12 13:53 ` [PATCH 4/4] drm/xe: Add watch_queue-based device event notification Thomas Hellström
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add a DRM_XE_NOTIFY watch type for asynchronous error notifications
from the DRM_XE kernel module.

The reason for not registering a DRM-wide notification type is
that, while the notification type is 24 bits wide, the subtype is
only 8 bits wide. If this is a concern, one could define the DRM-wide
subtypes to be per driver, not common across DRM.
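
The field widths referred to above come from the existing
struct watch_notification header in <linux/watch_queue.h>. The sketch
below decodes the info word of such a header. The two helper names are
hypothetical, and nothing here depends on the new watch type.

```c
#include <assert.h>
#include <linux/watch_queue.h>

/* The header is two 32-bit words: type:24 + subtype:8, then info. */
static_assert(sizeof(struct watch_notification) == 8,
	      "notification header is 8 bytes");

/* Record length in bytes, including the header (at most 127). */
static unsigned int notification_length(const struct watch_notification *n)
{
	return (n->info & WATCH_INFO_LENGTH) >> WATCH_INFO_LENGTH__SHIFT;
}

/* The 8-bit watch id the subscriber chose when the watch was set up. */
static unsigned int notification_watch_id(const struct watch_notification *n)
{
	return (n->info & WATCH_INFO_ID) >> WATCH_INFO_ID__SHIFT;
}
```

With only 8 bits of subtype, a single watch type can distinguish at most
256 event kinds, which is the constraint the paragraph above weighs.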

Also export the init_watch() function for use from kernel drivers.
Use EXPORT_SYMBOL() to align with other exports from the same file.

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 include/uapi/drm/xe_drm.h        |  4 ++--
 include/uapi/linux/watch_queue.h |  3 ++-
 kernel/watch_queue.c             | 13 ++++++++++---
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index bebb0167bd31..8d5e3f06b8d4 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -2550,8 +2550,8 @@ struct drm_xe_exec_queue_set_property {
  * preempt-fence mode or not restartable.
  *
  * An optional @timestamp_ns can be provided to measure the latency between
- * event delivery and the point the worker is queued; the driver logs this
- * once all sanity checks have passed.
+ * event delivery and locking; the driver logs this regardless of whether the
+ * VM was paused.
  */
 
 /**
diff --git a/include/uapi/linux/watch_queue.h b/include/uapi/linux/watch_queue.h
index c3d8320b5d3a..c800c153989d 100644
--- a/include/uapi/linux/watch_queue.h
+++ b/include/uapi/linux/watch_queue.h
@@ -14,7 +14,8 @@
 enum watch_notification_type {
 	WATCH_TYPE_META		= 0,	/* Special record */
 	WATCH_TYPE_KEY_NOTIFY	= 1,	/* Key change event notification */
-	WATCH_TYPE__NR		= 2
+	WATCH_TYPE_DRM_XE_NOTIFY	= 2,	/* DRM device event notification */
+	WATCH_TYPE__NR		= 3
 };
 
 enum watch_meta_notification_subtype {
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index 538520861e8b..701b5c388808 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -445,11 +445,17 @@ static void put_watch(struct watch *watch)
 }
 
 /**
- * init_watch - Initialise a watch
+ * init_watch() - Initialise a watch subscription
  * @watch: The watch to initialise.
- * @wqueue: The queue to assign.
+ * @wqueue: The watch queue (notification pipe) to associate with the watch.
  *
- * Initialise a watch and set the watch queue.
+ * Initialise a newly allocated watch object and associate it with @wqueue.
+ * The caller must subsequently set @watch->id and @watch->info_id before
+ * calling add_watch_to_object() to subscribe the watch to a notification
+ * source.
+ *
+ * The watch queue reference is held internally; call put_watch_queue() if
+ * the watch is not successfully passed to add_watch_to_object().
  */
 void init_watch(struct watch *watch, struct watch_queue *wqueue)
 {
@@ -458,6 +464,7 @@ void init_watch(struct watch *watch, struct watch_queue *wqueue)
 	INIT_HLIST_NODE(&watch->queue_node);
 	rcu_assign_pointer(watch->queue, wqueue);
 }
+EXPORT_SYMBOL(init_watch);
 
 static int add_one_watch(struct watch *watch, struct watch_list *wlist, struct watch_queue *wqueue)
 {
-- 
2.54.0



* [PATCH 4/4] drm/xe: Add watch_queue-based device event notification
  2026-06-12 13:53 [RFC PATCH 0/4] Xe driver asynchronous notification mechanism Thomas Hellström
                   ` (2 preceding siblings ...)
  2026-06-12 13:53 ` [PATCH 3/4] watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch() Thomas Hellström
@ 2026-06-12 13:53 ` Thomas Hellström
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Hellström @ 2026-06-12 13:53 UTC (permalink / raw)
  To: intel-xe
  Cc: Thomas Hellström, Matthew Brost, Maarten Lankhorst,
	Michal Mrozek, John Falkowski, Rodrigo Vivi, Lahtinen Joonas,
	David Howells, Christian Brauner, Kees Cook, Davidlohr Bueso,
	Christian König, Dave Airlie, Simona Vetter, dri-devel, LMKL

Add a watch_queue notification channel tied to struct xe_file so that
userspace can subscribe to asynchronous GPU device events via the
general kernel notification mechanism.

Introduce DRM_IOCTL_XE_WATCH_QUEUE to let userspace subscribe a
notification pipe (opened with pipe2(O_NOTIFICATION_PIPE)) to the device
event stream.  Embed the watch_id field (0-255) in the WATCH_INFO_ID
field of every notification, allowing multiple watches to share a single
pipe and be told apart by the reader.
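
As an illustration of the watch_id handling, a reader could recover the
ID from the info word of each record roughly as follows. The mask and
shift values are copies of WATCH_INFO_LENGTH, WATCH_INFO_ID and
WATCH_INFO_ID__SHIFT from <linux/watch_queue.h>, redefined under
hypothetical SKETCH_ names so the snippet builds without kernel
headers. The helper names are assumptions for this sketch, not API
added by the series.

```c
#include <stdint.h>

/* Local copies of the <linux/watch_queue.h> info-field layout. */
#define SKETCH_INFO_LENGTH	0x0000007fu
#define SKETCH_INFO_ID		0x0000ff00u
#define SKETCH_INFO_ID_SHIFT	8

/* Value stored in watch->info_id for a given watch_id (0-255). */
static inline uint32_t sketch_info_id(uint32_t watch_id)
{
	return watch_id << SKETCH_INFO_ID_SHIFT;
}

/* Recover the watch_id from the info word of a received record. */
static inline uint32_t sketch_watch_id(uint32_t info)
{
	return (info & SKETCH_INFO_ID) >> SKETCH_INFO_ID_SHIFT;
}

/* Record length in bytes, header included. */
static inline uint32_t sketch_record_len(uint32_t info)
{
	return info & SKETCH_INFO_LENGTH;
}
```

A reader handling several subscriptions on one pipe would typically
switch on sketch_watch_id(n->info) before looking at the subtype.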

Deliver notifications as struct drm_xe_watch_notification records, with
type always set to WATCH_TYPE_DRM_XE_NOTIFY and subtype drawn from enum
drm_xe_watch_event.  Define DRM_XE_WATCH_EVENT_VM_ERR as the first
event, posted by the preempt-rebind worker when a VM encounters an
unrecoverable error.  Expose xe_watch_queue_post_vm_err_event() as the
in-kernel posting API.
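
For reference, a user-space consumer might decode one record from a
buffer filled by read() on the notification pipe along these lines.
This is a sketch under stated assumptions: the structs are local
mirrors of struct watch_notification and struct
drm_xe_watch_notification_vm_err, the constants copy the values
proposed in this series, and every sketch_/SKETCH_ name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TYPE_DRM_XE_NOTIFY	2	/* WATCH_TYPE_DRM_XE_NOTIFY */
#define SKETCH_EVENT_VM_ERR		0	/* DRM_XE_WATCH_EVENT_VM_ERR */
#define SKETCH_INFO_LENGTH		0x0000007fu /* WATCH_INFO_LENGTH */

/* Local mirror of struct watch_notification. */
struct sketch_watch_notification {
	uint32_t type:24;
	uint32_t subtype:8;
	uint32_t info;
};

/* Local mirror of struct drm_xe_watch_notification_vm_err. */
struct sketch_vm_err {
	struct sketch_watch_notification base;
	uint32_t vm_id;
	int32_t error_code;	/* negative errno */
	uint64_t timestamp_ns;	/* CLOCK_MONOTONIC */
};

/*
 * Decode the record at @buf. Returns the record length in bytes so the
 * caller can advance to the next record, or 0 if @buf does not hold a
 * complete, well-formed record. *@matched is set to 1 only when the
 * record is a VM error event, in which case @out is filled in.
 */
static size_t sketch_decode(const unsigned char *buf, size_t avail,
			    struct sketch_vm_err *out, int *matched)
{
	struct sketch_watch_notification hdr;
	size_t len;

	*matched = 0;
	if (avail < sizeof(hdr))
		return 0;
	memcpy(&hdr, buf, sizeof(hdr));

	len = hdr.info & SKETCH_INFO_LENGTH;
	if (len < sizeof(hdr) || len > avail)
		return 0;

	if (hdr.type == SKETCH_TYPE_DRM_XE_NOTIFY &&
	    hdr.subtype == SKETCH_EVENT_VM_ERR && len >= sizeof(*out)) {
		memcpy(out, buf, sizeof(*out));
		*matched = 1;
	}
	return len;
}
```

Records of other types are skipped by length rather than rejected, so
the same loop keeps working if further subtypes are added later.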

Add event definitions in a separate uapi header, <drm/xe_drm_events.h>.
The main reason is that the header needs to include <linux/watch_queue.h>,
which in turn includes <linux/fcntl.h>, and that may conflict with the
system <fcntl.h>. Hence user-space must pay special attention when
including this file.

Assisted-by: GitHub_Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 MAINTAINERS                          |   1 +
 drivers/gpu/drm/xe/Kconfig           |   1 +
 drivers/gpu/drm/xe/Makefile          |   1 +
 drivers/gpu/drm/xe/xe_device.c       |   7 ++
 drivers/gpu/drm/xe/xe_device_types.h |   6 ++
 drivers/gpu/drm/xe/xe_vm.c           |   4 +
 drivers/gpu/drm/xe/xe_vm_types.h     |   2 +
 drivers/gpu/drm/xe/xe_watch_queue.c  | 111 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_watch_queue.h  |  20 +++++
 include/uapi/drm/xe_drm.h            |  45 +++++++++++
 include/uapi/drm/xe_drm_events.h     |  62 +++++++++++++++
 11 files changed, 260 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.c
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.h
 create mode 100644 include/uapi/drm/xe_drm_events.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 8c0d9965c636..b7e02cfa692b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12900,6 +12900,7 @@ F:	Documentation/gpu/xe/
 F:	drivers/gpu/drm/xe/
 F:	include/drm/intel/
 F:	include/uapi/drm/xe_drm.h
+F:	include/uapi/drm/xe_drm_events.h
 
 INTEL ELKHART LAKE PSE I/O DRIVER
 M:	Raag Jadav <raag.jadav@intel.com>
diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index 4d7dcaff2b91..dbdc2fb49c53 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -25,6 +25,7 @@ config DRM_XE
 	select DRM_MIPI_DSI
 	select RELAY
 	select IRQ_WORK
+	select WATCH_QUEUE
 	# xe depends on ACPI_VIDEO when ACPI is enabled
 	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
 	select BACKLIGHT_CLASS_DEVICE if ACPI
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 8e7b146880f4..fc8b4023a044 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -150,6 +150,7 @@ xe-y += xe_bb.o \
 	xe_vsec.o \
 	xe_wa.o \
 	xe_wait_user_fence.o \
+	xe_watch_queue.o \
 	xe_wopcm.o
 
 xe-$(CONFIG_I2C)	+= xe_i2c.o
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 867d7c55dc03..788ef2fbd6e5 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -9,6 +9,7 @@
 #include <linux/delay.h>
 #include <linux/fault-inject.h>
 #include <linux/units.h>
+#include <linux/watch_queue.h>
 
 #include <drm/drm_client.h>
 #include <drm/drm_gem_ttm_helper.h>
@@ -77,6 +78,7 @@
 #include "xe_vsec.h"
 #include "xe_wait_user_fence.h"
 #include "xe_wa.h"
+#include "xe_watch_queue.h"
 
 #include <generated/xe_device_wa_oob.h>
 #include <generated/xe_wa_oob.h>
@@ -112,6 +114,8 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 	file->driver_priv = xef;
 	kref_init(&xef->refcount);
 
+	init_watch_list(&xef->watch_list, NULL);
+
 	task = get_pid_task(rcu_access_pointer(file->pid), PIDTYPE_PID);
 	if (task) {
 		xef->process_name = kstrdup(task->comm, GFP_KERNEL);
@@ -126,6 +130,8 @@ static void xe_file_destroy(struct kref *ref)
 {
 	struct xe_file *xef = container_of(ref, struct xe_file, refcount);
 
+	remove_watch_from_object(&xef->watch_list, NULL, 0, true);
+
 	xa_destroy(&xef->exec_queue.xa);
 	mutex_destroy(&xef->exec_queue.lock);
 	xa_destroy(&xef->vm.xa);
@@ -216,6 +222,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_RESTART, xe_vm_restart_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_WATCH_QUEUE, xe_watch_queue_ioctl, DRM_RENDER_ALLOW),
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 32dd2ffbc796..ca726ada30a7 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -11,6 +11,7 @@
 #include <drm/drm_device.h>
 #include <drm/drm_file.h>
 #include <drm/ttm/ttm_device.h>
+#include <linux/watch_queue.h>
 
 #include "xe_devcoredump_types.h"
 #include "xe_drm_ras_types.h"
@@ -632,6 +633,11 @@ struct xe_file {
 
 	/** @refcount: ref count of this xe file */
 	struct kref refcount;
+
+#ifdef CONFIG_WATCH_QUEUE
+	/** @watch_list: per-file notification source for device events */
+	struct watch_list watch_list;
+#endif
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index b69a2e5bd9c9..232de0d948d2 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -13,6 +13,7 @@
 #include <drm/drm_print.h>
 #include <drm/ttm/ttm_tt.h>
 #include <uapi/drm/xe_drm.h>
+#include <uapi/drm/xe_drm_events.h>
 #include <linux/ascii85.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
@@ -45,6 +46,7 @@
 #include "xe_trace_bo.h"
 #include "xe_vm_madvise.h"
 #include "xe_wa.h"
+#include "xe_watch_queue.h"
 
 #ifdef CONFIG_FAULT_INJECTION
 static DECLARE_FAULT_ATTR(rebind_enospc);
@@ -584,6 +586,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	}
 
 	if (err) {
+		xe_watch_queue_post_vm_err_event(vm->xef, vm->id, err);
 		if ((err == -ENOMEM || err == -ENOSPC) && xe_vm_is_restartable(vm)) {
 			vm->preempt.rebind_deactivated = true;
 			drm_dbg(&vm->xe->drm, "Rebind deactivated VM on error %pe\n",
@@ -2229,6 +2232,7 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (err)
 		goto err_close_and_put;
 
+	vm->id = id;
 	args->vm_id = id;
 
 	return 0;
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 7d295c3b8456..19a673099588 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -407,6 +407,8 @@ struct xe_vm {
 	bool batch_invalidate_tlb;
 	/** @xef: Xe file handle for tracking this VM's drm client */
 	struct xe_file *xef;
+	/** @id: The id of the VM in the VM table of @xef. */
+	u32 id;
 };
 
 /** struct xe_vma_op_map - VMA map operation */
diff --git a/drivers/gpu/drm/xe/xe_watch_queue.c b/drivers/gpu/drm/xe/xe_watch_queue.c
new file mode 100644
index 000000000000..32763591075b
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_watch_queue.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#include <linux/slab.h>
+#include <linux/timekeeping.h>
+#include <linux/watch_queue.h>
+
+#include <uapi/drm/xe_drm.h>
+#include <uapi/drm/xe_drm_events.h>
+
+#include "xe_device.h"
+#include "xe_device_types.h"
+#include "xe_macros.h"
+#include "xe_watch_queue.h"
+
+/**
+ * struct xe_watch_notification_vm_err - kernel-side VM error event notification
+ * @base: common watch notification header; type is %WATCH_TYPE_DRM_XE_NOTIFY,
+ *        subtype is %DRM_XE_WATCH_EVENT_VM_ERR
+ * @vm_id: ID of the VM that hit an error
+ * @error_code: error code describing the error condition (negative errno)
+ * @timestamp_ns: CLOCK_MONOTONIC timestamp in nanoseconds at the point the
+ *                error was detected
+ *
+ * Layout mirrors &struct drm_xe_watch_notification_vm_err.
+ */
+struct xe_watch_notification_vm_err {
+	struct watch_notification base;
+	u32 vm_id;
+	s32 error_code;
+	u64 timestamp_ns;
+};
+
+/**
+ * xe_watch_queue_ioctl() - Subscribe a pipe to per-file device event notifications
+ * @dev: DRM device
+ * @data: pointer to &struct drm_xe_watch_queue from userspace
+ * @file: DRM file handle of the subscribing process
+ *
+ * Subscribes a notification pipe to receive Xe device events for the calling
+ * process's file handle.  Only events scoped to this file (e.g. VM error on a
+ * VM owned by this file) are delivered.  The pipe must have been opened with
+ * O_NOTIFICATION_PIPE and sized with %IOC_WATCH_QUEUE_SET_SIZE before calling
+ * this IOCTL.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int xe_watch_queue_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct xe_file *xef = file->driver_priv;
+	struct xe_device *xe = to_xe_device(dev);
+	struct drm_xe_watch_queue *args = data;
+	struct watch_queue *wqueue;
+	struct watch *watch;
+	int ret;
+
+	if (XE_IOCTL_DBG(xe, args->flags || args->pad))
+		return -EINVAL;
+	if (XE_IOCTL_DBG(xe, args->watch_id > 0xff))
+		return -EINVAL;
+
+	wqueue = get_watch_queue(args->fd);
+	if (XE_IOCTL_DBG(xe, IS_ERR(wqueue)))
+		return PTR_ERR(wqueue);
+
+	watch = kzalloc_obj(*watch, GFP_KERNEL | __GFP_ACCOUNT);
+	if (XE_IOCTL_DBG(xe, !watch)) {
+		ret = -ENOMEM;
+		goto out_put_queue;
+	}
+
+	init_watch(watch, wqueue);
+	watch->id = 0;
+	watch->info_id = (u32)args->watch_id << WATCH_INFO_ID__SHIFT;
+
+	ret = add_watch_to_object(watch, &xef->watch_list);
+	if (XE_IOCTL_DBG(xe, ret))
+		kfree(watch);
+
+out_put_queue:
+	put_watch_queue(wqueue);
+	return ret;
+}
+
+/**
+ * xe_watch_queue_post_vm_err_event() - Post a VM error event
+ * @xef: xe file handle that owns the VM
+ * @vm_id: userspace ID of the VM that hit an error
+ * @error_code: error code describing the error condition (negative errno)
+ *
+ * Posts a %DRM_XE_WATCH_EVENT_VM_ERR notification carrying @vm_id and
+ * @error_code to every pipe that @xef has subscribed via
+ * %DRM_IOCTL_XE_WATCH_QUEUE.  Only the owning process is notified,
+ * preventing information leaks to other clients.
+ */
+void xe_watch_queue_post_vm_err_event(struct xe_file *xef, u32 vm_id,
+				      int error_code)
+{
+	struct xe_watch_notification_vm_err n = {};
+
+	n.base.type    = WATCH_TYPE_DRM_XE_NOTIFY;
+	n.base.subtype = DRM_XE_WATCH_EVENT_VM_ERR;
+	n.base.info    = watch_sizeof(struct xe_watch_notification_vm_err);
+	n.vm_id        = vm_id;
+	n.error_code   = error_code;
+	n.timestamp_ns = ktime_get_ns();
+
+	post_watch_notification(&xef->watch_list, &n.base, current_cred(), 0);
+}
diff --git a/drivers/gpu/drm/xe/xe_watch_queue.h b/drivers/gpu/drm/xe/xe_watch_queue.h
new file mode 100644
index 000000000000..ad199ee68205
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_watch_queue.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#ifndef _XE_WATCH_QUEUE_H_
+#define _XE_WATCH_QUEUE_H_
+
+#include <linux/types.h>
+
+struct drm_device;
+struct drm_file;
+struct xe_file;
+
+int xe_watch_queue_ioctl(struct drm_device *dev, void *data,
+			 struct drm_file *file);
+void xe_watch_queue_post_vm_err_event(struct xe_file *xef, u32 vm_id,
+				      int error_code);
+
+#endif /* _XE_WATCH_QUEUE_H_ */
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 8d5e3f06b8d4..0083dd712f7e 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -86,6 +86,7 @@ extern "C" {
  *  - &DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY
  *  - &DRM_IOCTL_XE_VM_GET_PROPERTY
  *  - &DRM_IOCTL_XE_VM_RESTART
+ *  - &DRM_IOCTL_XE_WATCH_QUEUE
  */
 
 /*
@@ -112,6 +113,7 @@ extern "C" {
 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x0e
 #define DRM_XE_VM_GET_PROPERTY		0x0f
 #define DRM_XE_VM_RESTART		0x10
+#define DRM_XE_WATCH_QUEUE		0x11
 
 /* Must be kept compact -- no holes */
 
@@ -132,6 +134,7 @@ extern "C" {
 #define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_VM_GET_PROPERTY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_GET_PROPERTY, struct drm_xe_vm_get_property)
 #define DRM_IOCTL_XE_VM_RESTART			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_RESTART, struct drm_xe_vm_restart)
+#define DRM_IOCTL_XE_WATCH_QUEUE		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_WATCH_QUEUE, struct drm_xe_watch_queue)
 
 /**
  * DOC: Xe IOCTL Extensions
@@ -2653,6 +2656,48 @@ enum drm_xe_ras_error_component {
 	[DRM_XE_RAS_ERR_COMP_SOC_INTERNAL] = "soc-internal"		\
 }
 
+/**
+ * DOC: DRM_XE_WATCH_QUEUE
+ *
+ * Subscribe a notification pipe to receive device events for the calling
+ * process's DRM file handle.  Events are scoped to the subscribing file:
+ * only events that belong to that file (for example, VM error on a VM created
+ * through the same file) are delivered, preventing information leaks between
+ * processes sharing the same GPU device.
+ *
+ * The pipe must first be opened with O_NOTIFICATION_PIPE (i.e. O_EXCL passed
+ * to pipe2()) and sized via %IOC_WATCH_QUEUE_SET_SIZE before subscribing.
+ *
+ * Events are delivered as notification records read from the pipe.  The
+ * @watch_id field is embedded in the notification info field and can be used
+ * to distinguish multiple watches sharing a pipe.
+ *
+ * Currently defined event subtypes:
+ *  - %DRM_XE_WATCH_EVENT_VM_ERR - a VM owned by this file has encountered an error
+ */
+
+/**
+ * struct drm_xe_watch_queue - subscribe to device event notifications
+ *
+ * Used with %DRM_IOCTL_XE_WATCH_QUEUE.  Notifications are scoped to the
+ * DRM file handle used to issue this IOCTL.
+ */
+struct drm_xe_watch_queue {
+	/** @fd: file descriptor of pipe opened with O_NOTIFICATION_PIPE */
+	__u32 fd;
+
+	/**
+	 * @watch_id: identifier (0–255) embedded in the watch notification
+	 * info field; allows multiplexing several watches on one pipe
+	 */
+	__u32 watch_id;
+
+	/** @flags: must be zero */
+	__u32 flags;
+
+	/** @pad: reserved, must be zero */
+	__u32 pad;
+};
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/uapi/drm/xe_drm_events.h b/include/uapi/drm/xe_drm_events.h
new file mode 100644
index 000000000000..6cc7528bfb9b
--- /dev/null
+++ b/include/uapi/drm/xe_drm_events.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#ifndef _UAPI_XE_DRM_EVENTS_H_
+#define _UAPI_XE_DRM_EVENTS_H_
+
+#include <linux/types.h>
+#include <linux/watch_queue.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/**
+ * enum drm_xe_watch_event - Xe device watch event subtypes
+ *
+ * Subtypes for notifications delivered via %WATCH_TYPE_DRM_XE_NOTIFY when
+ * reading from a pipe subscribed with %DRM_IOCTL_XE_WATCH_QUEUE.
+ */
+enum drm_xe_watch_event {
+	/**
+	 * @DRM_XE_WATCH_EVENT_VM_ERR: a VM has encountered an error.
+	 *
+	 * Indicates that a fatal or resource error occurred within the given
+	 * VM.  The vm_id of the affected VM is carried in the
+	 * @drm_xe_watch_notification_vm_err::vm_id field of the extended
+	 * notification record.
+	 */
+	DRM_XE_WATCH_EVENT_VM_ERR = 0,
+};
+
+/**
+ * struct drm_xe_watch_notification_vm_err - VM error event notification
+ *
+ * Notification record delivered for %DRM_XE_WATCH_EVENT_VM_ERR.
+ * The record type is always %WATCH_TYPE_DRM_XE_NOTIFY and the subtype is
+ * %DRM_XE_WATCH_EVENT_VM_ERR.
+ */
+struct drm_xe_watch_notification_vm_err {
+	/** @base: common watch notification header */
+	struct watch_notification base;
+
+	/** @vm_id: ID of the VM that encountered an error */
+	__u32 vm_id;
+
+	/** @error_code: error code describing the error condition (negative errno) */
+	__s32 error_code;
+
+	/**
+	 * @timestamp_ns: CLOCK_MONOTONIC timestamp in nanoseconds at the
+	 * point the error was detected
+	 */
+	__u64 timestamp_ns;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _UAPI_XE_DRM_EVENTS_H_ */
-- 
2.54.0

