dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
@ 2025-04-24 20:49 Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 1/5] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs Jonathan Cavitt
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Jonathan Cavitt @ 2025-04-24 20:49 UTC (permalink / raw)
  To: intel-xe
  Cc: saurabhg.gupta, alex.zuo, jonathan.cavitt, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

Add additional information to each VM so they can report up to the first
50 seen faults.  Only pagefaults are saved this way currently, though in
the future, all faults should be tracked by the VM for future reporting.

Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.

To allow userspace to access these faults, a new ioctl -
xe_vm_get_property_ioct - was created.

v2: (Matt Brost)
- Break full ban list request into a separate property.
- Reformat drm_xe_vm_get_property struct.
- Remove need for drm_xe_faults helper struct.
- Separate data pointer and scalar return value in ioctl.
- Get address type on pagefault report and save it to the pagefault.
- Correctly reject writes to read-only VMAs.
- Miscellaneous formatting fixes.

v3: (Matt Brost)
- Only allow querying of failed pagefaults

v4:
- Remove unnecessary size parameter from helper function, as it
  is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jainxun)
- Set address_precision to 1 (Jainxun)
- Report max size instead of dynamic size for memory allocation
  purposes.  Total memory usage is reported separately.

v5:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)

v6:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)

v7:
- Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt)
- Fix double-locking error (jcavitt)
- Assert kmemdump is successful (Shuicheng)
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
  (jcavitt)

v8:
- Revert back to using drm_xe_vm_get_property_ioctl
- s/Migrate/Move (Michal)
- s/xe_pagefault/xe_gt_pagefault (Michal)
- Create new header file, xe_gt_pagefault_types.h (Michal)
- Add and fix kernel docs (Michal)
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Micahl M)
- Apply better copy_to_user logic (jcavitt)

v9:
- More kernel doc fixes (Michal W, Jianxun)
- Better error handling (jcavitt)

v10:
- Convert enums to defines in regs folder (Michal W)
- Move xe_guc_pagefault_desc to regs folder (Michal W)
- Future-proof size logic for zero-size properties (jcavitt)
- Replace address type extern with access type (Jianxun)
- Add fault type to xe_drm_fault (Jianxun)

v11:
- Remove unnecessary switch case logic (Raag)
- Compress size get, size validation, and property fill functions into a
  single helper function (jcavitt)
- Assert valid size (jcavitt)
- Store pagefaults in non-fault-mode VMs as well (Jianxun)

v12:
- Remove unnecessary else condition
- Correct backwards helper function size logic (jcavitt)
- Fix kernel docs and comments (Michal W)

v13:
- Move xe and user engine class mapping arrays to header (John H)

v14:
- Fix double locking issue (Jianxun)
- Use size_t instead of int (Raag)
- Remove unnecessary includes (jcavitt)

v15:
- Do not report faults from reserved engines (Jianxun)

v16:
- Remove engine class and instance (Ivan)

v17:
- Map access type, fault type, and fault level to user macros (Matt
  Brost, Ivan)

v18:
- Add uAPI merge request to this cover letter

v19:
- Perform kzalloc outside of lock (Auld)

v20:
- Fix inconsistent use of whitespace in defines

v21:
- Remove unnecessary size assertion (jcavitt)

v22:
- Fix xe_vm_fault_entry kernel docs (Shuicheng)

v23:
- Nit fixes (Matt Brost)

uAPI: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32987
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Cc: Zhang Jianxun <jianxun.zhang@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>

Jonathan Cavitt (5):
  drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs
  drm/xe/xe_gt_pagefault: Move pagefault struct to header
  drm/xe/uapi: Define drm_xe_vm_get_property
  drm/xe/xe_vm: Add per VM fault info
  drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h |  49 +++++
 drivers/gpu/drm/xe/xe_device.c              |   2 +
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |  72 +++----
 drivers/gpu/drm/xe/xe_gt_pagefault_types.h  |  42 +++++
 drivers/gpu/drm/xe/xe_guc_fwif.h            |  28 ---
 drivers/gpu/drm/xe/xe_vm.c                  | 196 ++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h                  |  11 ++
 drivers/gpu/drm/xe/xe_vm_types.h            |  29 +++
 include/uapi/drm/xe_drm.h                   |  86 +++++++++
 9 files changed, 454 insertions(+), 61 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
 create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h

-- 
2.43.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v23 1/5] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs
  2025-04-24 20:49 [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
@ 2025-04-24 20:49 ` Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header Jonathan Cavitt
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Cavitt @ 2025-04-24 20:49 UTC (permalink / raw)
  To: intel-xe
  Cc: saurabhg.gupta, alex.zuo, jonathan.cavitt, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

The page fault handler should reject write/atomic access to read only
VMAs.  Add code to handle this in handle_pagefault after the VMA lookup.

Fixes: 3d420e9fa848 ("drm/xe: Rework GPU page fault handling")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 10622ca471a2..d4e3b7eb165a 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -237,6 +237,11 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 		goto unlock_vm;
 	}
 
+	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
+		err = -EPERM;
+		goto unlock_vm;
+	}
+
 	atomic = access_is_atomic(pf->access_type);
 
 	if (xe_vma_is_cpu_addr_mirror(vma))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-24 20:49 [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 1/5] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs Jonathan Cavitt
@ 2025-04-24 20:49 ` Jonathan Cavitt
  2025-04-27 11:57   ` Dafna Hirschfeld
  2025-04-29 14:22   ` Dafna Hirschfeld
  2025-04-24 20:49 ` [PATCH v23 3/5] drm/xe/uapi: Define drm_xe_vm_get_property Jonathan Cavitt
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Jonathan Cavitt @ 2025-04-24 20:49 UTC (permalink / raw)
  To: intel-xe
  Cc: saurabhg.gupta, alex.zuo, jonathan.cavitt, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

Move the pagefault struct from xe_gt_pagefault.c to the
xe_gt_pagefault_types.h header file, and move the associated enum values
into the regs folder under xe_pagefault_desc.h

Since xe_pagefault_desc.h is being initialized here, also move the
xe_guc_pagefault_desc hardware formats to the new file.

v2:
- Normalize names for common header (Matt Brost)

v3:
- s/Migrate/Move (Michal W)
- s/xe_pagefault/xe_gt_pagefault (Michal W)
- Create new header file, xe_gt_pagefault_types.h (Michal W)
- Add kernel docs (Michal W)

v4:
- Fix includes usage (Michal W)
- Reference Bspec (Michal W)

v5:
- Convert enums to defines in regs folder (Michal W)
- Move xe_guc_pagefault_desc to regs folder (Michal W)

Bspec: 77412
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
---
 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
 drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
 4 files changed, 100 insertions(+), 62 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
 create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h

diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
new file mode 100644
index 000000000000..a169ac274e14
--- /dev/null
+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PAGEFAULT_DESC_H_
+#define _XE_PAGEFAULT_DESC_H_
+
+#include <linux/bits.h>
+#include <linux/types.h>
+
+struct xe_guc_pagefault_desc {
+	u32 dw0;
+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
+#define PFD_SRC_ID		GENMASK(10, 3)
+#define PFD_RSVD_0		GENMASK(17, 11)
+#define XE2_PFD_TRVA_FAULT	BIT(18)
+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
+#define PFD_ENG_CLASS		GENMASK(27, 25)
+#define PFD_PDATA_LO		GENMASK(31, 28)
+
+	u32 dw1;
+#define PFD_PDATA_HI		GENMASK(11, 0)
+#define PFD_PDATA_HI_SHIFT	4
+#define PFD_ASID		GENMASK(31, 12)
+
+	u32 dw2;
+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
+#define PFD_FAULT_TYPE		GENMASK(3, 2)
+#define PFD_VFID		GENMASK(9, 4)
+#define PFD_RSVD_1		GENMASK(11, 10)
+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
+
+	u32 dw3;
+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
+} __packed;
+
+#define FLT_ACCESS_TYPE_READ		0u
+#define FLT_ACCESS_TYPE_WRITE		1u
+#define FLT_ACCESS_TYPE_ATOMIC		2u
+#define FLT_ACCESS_TYPE_RESERVED	3u
+
+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index d4e3b7eb165a..93afa54c8780 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -12,8 +12,10 @@
 #include <drm/drm_managed.h>
 
 #include "abi/guc_actions_abi.h"
+#include "regs/xe_pagefault_desc.h"
 #include "xe_bo.h"
 #include "xe_gt.h"
+#include "xe_gt_pagefault_types.h"
 #include "xe_gt_stats.h"
 #include "xe_gt_tlb_invalidation.h"
 #include "xe_guc.h"
@@ -23,33 +25,6 @@
 #include "xe_trace_bo.h"
 #include "xe_vm.h"
 
-struct pagefault {
-	u64 page_addr;
-	u32 asid;
-	u16 pdata;
-	u8 vfid;
-	u8 access_type;
-	u8 fault_type;
-	u8 fault_level;
-	u8 engine_class;
-	u8 engine_instance;
-	u8 fault_unsuccessful;
-	bool trva_fault;
-};
-
-enum access_type {
-	ACCESS_TYPE_READ = 0,
-	ACCESS_TYPE_WRITE = 1,
-	ACCESS_TYPE_ATOMIC = 2,
-	ACCESS_TYPE_RESERVED = 3,
-};
-
-enum fault_type {
-	NOT_PRESENT = 0,
-	WRITE_ACCESS_VIOLATION = 1,
-	ATOMIC_ACCESS_VIOLATION = 2,
-};
-
 struct acc {
 	u64 va_range_base;
 	u32 asid;
@@ -61,9 +36,9 @@ struct acc {
 	u8 engine_instance;
 };
 
-static bool access_is_atomic(enum access_type access_type)
+static bool access_is_atomic(u32 access_type)
 {
-	return access_type == ACCESS_TYPE_ATOMIC;
+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
 }
 
 static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
 	return vm;
 }
 
-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
 {
 	struct xe_device *xe = gt_to_xe(gt);
 	struct xe_vm *vm;
@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 		goto unlock_vm;
 	}
 
-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
 		err = -EPERM;
 		goto unlock_vm;
 	}
@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
 }
 
-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
 {
 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
 		 "\tVFID: %d\n"
@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
 
 #define PF_MSG_LEN_DW	4
 
-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
 {
 	const struct xe_guc_pagefault_desc *desc;
 	bool ret = false;
@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
 	struct xe_gt *gt = pf_queue->gt;
 	struct xe_device *xe = gt_to_xe(gt);
 	struct xe_guc_pagefault_reply reply = {};
-	struct pagefault pf = {};
+	struct xe_gt_pagefault pf = {};
 	unsigned long threshold;
 	int ret;
 
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
new file mode 100644
index 000000000000..b7d41b558de3
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022-2025 Intel Corporation
+ */
+
+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
+#define _XE_GT_PAGEFAULT_TYPES_H_
+
+#include <linux/types.h>
+
+/**
+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
+ * pagefault handler
+ */
+struct xe_gt_pagefault {
+	/** @page_addr: faulted address of this pagefault */
+	u64 page_addr;
+	/** @asid: ASID of this pagefault */
+	u32 asid;
+	/** @pdata: PDATA of this pagefault */
+	u16 pdata;
+	/** @vfid: VFID of this pagefault */
+	u8 vfid;
+	/** @access_type: access type of this pagefault */
+	u8 access_type;
+	/** @fault_type: fault type of this pagefault */
+	u8 fault_type;
+	/** @fault_level: fault level of this pagefault */
+	u8 fault_level;
+	/** @engine_class: engine class this pagefault was reported on */
+	u8 engine_class;
+	/** @engine_instance: engine instance this pagefault was reported on */
+	u8 engine_instance;
+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
+	u8 fault_unsuccessful;
+	/** @prefetch: unused */
+	bool prefetch;
+	/** @trva_fault: is set if this is a TRTT fault */
+	bool trva_fault;
+};
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
index 6f57578b07cb..30ac21bb4f15 100644
--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
 	FAULT_RESPONSE_DESC
 };
 
-struct xe_guc_pagefault_desc {
-	u32 dw0;
-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
-#define PFD_SRC_ID		GENMASK(10, 3)
-#define PFD_RSVD_0		GENMASK(17, 11)
-#define XE2_PFD_TRVA_FAULT	BIT(18)
-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
-#define PFD_ENG_CLASS		GENMASK(27, 25)
-#define PFD_PDATA_LO		GENMASK(31, 28)
-
-	u32 dw1;
-#define PFD_PDATA_HI		GENMASK(11, 0)
-#define PFD_PDATA_HI_SHIFT	4
-#define PFD_ASID		GENMASK(31, 12)
-
-	u32 dw2;
-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
-#define PFD_FAULT_TYPE		GENMASK(3, 2)
-#define PFD_VFID		GENMASK(9, 4)
-#define PFD_RSVD_1		GENMASK(11, 10)
-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
-
-	u32 dw3;
-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
-} __packed;
-
 struct xe_guc_pagefault_reply {
 	u32 dw0;
 #define PFR_VALID		BIT(0)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v23 3/5] drm/xe/uapi: Define drm_xe_vm_get_property
  2025-04-24 20:49 [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 1/5] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header Jonathan Cavitt
@ 2025-04-24 20:49 ` Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 4/5] drm/xe/xe_vm: Add per VM fault info Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Cavitt @ 2025-04-24 20:49 UTC (permalink / raw)
  To: intel-xe
  Cc: saurabhg.gupta, alex.zuo, jonathan.cavitt, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

Add initial declarations for the drm_xe_vm_get_property ioctl.

v2:
- Expand kernel docs for drm_xe_vm_get_property (Jianxun)

v3:
- Remove address type external definitions (Jianxun)
- Add fault type to xe_drm_fault struct (Jianxun)

v4:
- Remove engine class and instance (Ivan)

v5:
- Add declares for fault type, access type, and fault level (Matt Brost,
  Ivan)

v6:
- Fix inconsistent use of whitespace in defines

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Cc: Zhang Jianxun <jianxun.zhang@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 include/uapi/drm/xe_drm.h | 86 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 9c08738c3b91..556fc360a076 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -81,6 +81,7 @@ extern "C" {
  *  - &DRM_IOCTL_XE_EXEC
  *  - &DRM_IOCTL_XE_WAIT_USER_FENCE
  *  - &DRM_IOCTL_XE_OBSERVATION
+ *  - &DRM_IOCTL_XE_VM_GET_PROPERTY
  */
 
 /*
@@ -102,6 +103,7 @@ extern "C" {
 #define DRM_XE_EXEC			0x09
 #define DRM_XE_WAIT_USER_FENCE		0x0a
 #define DRM_XE_OBSERVATION		0x0b
+#define DRM_XE_VM_GET_PROPERTY		0x0c
 
 /* Must be kept compact -- no holes */
 
@@ -117,6 +119,7 @@ extern "C" {
 #define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
 #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
 #define DRM_IOCTL_XE_OBSERVATION		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param)
+#define DRM_IOCTL_XE_VM_GET_PROPERTY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_GET_PROPERTY, struct drm_xe_vm_get_property)
 
 /**
  * DOC: Xe IOCTL Extensions
@@ -1193,6 +1196,89 @@ struct drm_xe_vm_bind {
 	__u64 reserved[2];
 };
 
+/** struct xe_vm_fault - Describes faults for %DRM_XE_VM_GET_PROPERTY_FAULTS */
+struct xe_vm_fault {
+	/** @address: Address of the fault */
+	__u64 address;
+	/** @address_precision: Precision of faulted address */
+	__u32 address_precision;
+	/** @access_type: Type of address access that resulted in fault */
+#define FAULT_ACCESS_TYPE_READ		0
+#define FAULT_ACCESS_TYPE_WRITE		1
+#define FAULT_ACCESS_TYPE_ATOMIC	2
+	__u8 access_type;
+	/** @fault_type: Type of fault reported */
+#define FAULT_TYPE_NOT_PRESENT		0
+#define FAULT_TYPE_WRITE_ACCESS		1
+#define FAULT_TYPE_ATOMIC_ACCESS	2
+	__u8 fault_type;
+	/** @fault_level: fault level of the fault */
+#define FAULT_LEVEL_PTE		0
+#define FAULT_LEVEL_PDE		1
+#define FAULT_LEVEL_PDP		2
+#define FAULT_LEVEL_PML4	3
+#define FAULT_LEVEL_PML5	4
+	__u8 fault_level;
+	/** @pad: MBZ */
+	__u8 pad;
+	/** @reserved: MBZ */
+	__u64 reserved[4];
+};
+
+/**
+ * struct drm_xe_vm_get_property - Input of &DRM_IOCTL_XE_VM_GET_PROPERTY
+ *
+ * The user provides a VM and a property to query among DRM_XE_VM_GET_PROPERTY_*,
+ * and sets the values in the vm_id and property members, respectively.  This
+ * determines both the VM to get the property of, as well as the property to
+ * report.
+ *
+ * If size is set to 0, the driver fills it with the required size for the
+ * requested property.  The user is expected here to allocate memory for the
+ * property structure and to provide a pointer to the allocated memory using the
+ * data member.  For some properties, this may be zero, in which case, the
+ * value of the property will be saved to the value member and size will remain
+ * zero on return.
+ *
+ * If size is not zero, then the IOCTL will attempt to copy the requested
+ * property into the data member.
+ *
+ * The IOCTL will return -ENOENT if the VM could not be identified from the
+ * provided VM ID, or -EINVAL if the IOCTL fails for any other reason, such as
+ * providing an invalid size for the given property or if the property data
+ * could not be copied to the memory allocated to the data member.
+ *
+ * The property member can be:
+ *  - %DRM_XE_VM_GET_PROPERTY_FAULTS
+ */
+struct drm_xe_vm_get_property {
+	/** @extensions: Pointer to the first extension struct, if any */
+	__u64 extensions;
+
+	/** @vm_id: The ID of the VM to query the properties of */
+	__u32 vm_id;
+
+#define DRM_XE_VM_GET_PROPERTY_FAULTS		0
+	/** @property: property to get */
+	__u32 property;
+
+	/** @size: Size to allocate for @data */
+	__u32 size;
+
+	/** @pad: MBZ */
+	__u32 pad;
+
+	union {
+		/** @data: Pointer to user-defined array of flexible size and type */
+		__u64 data;
+		/** @value: Return value for scalar queries */
+		__u64 value;
+	};
+
+	/** @reserved: MBZ */
+	__u64 reserved[3];
+};
+
 /**
  * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE
  *
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v23 4/5] drm/xe/xe_vm: Add per VM fault info
  2025-04-24 20:49 [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
                   ` (2 preceding siblings ...)
  2025-04-24 20:49 ` [PATCH v23 3/5] drm/xe/uapi: Define drm_xe_vm_get_property Jonathan Cavitt
@ 2025-04-24 20:49 ` Jonathan Cavitt
  2025-04-24 20:49 ` [PATCH v23 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Cavitt @ 2025-04-24 20:49 UTC (permalink / raw)
  To: intel-xe
  Cc: saurabhg.gupta, alex.zuo, jonathan.cavitt, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

Add additional information to each VM so they can report up to the first
50 seen faults.  Only pagefaults are saved this way currently, though in
the future, all faults should be tracked by the VM for future reporting.

Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.

v2:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)

v3:
- Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt)
- Fix double-locking error (jcavitt)
- Assert kmemdump is successful (Shuicheng)

v4:
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Add and fix kernel docs (Michal W)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Micahl M)

v5:
- Store fault and access type instead of address type (Jianxun)

v6:
- Store pagefaults in non-fault-mode VMs as well (Jianxun)

v7:
- Fix kernel docs and comments (Michal W)

v8:
- Fix double-locking issue (Jianxun)

v9:
- Do not report faults from reserved engines (Jianxun)

v10:
- Remove engine class and instance (Ivan)

v11:
- Perform kzalloc outside of lock (Auld)

v12:
- Fix xe_vm_fault_entry kernel docs (Shuicheng)

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Jianxun Zhang <jianxun.zhang@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
Cc: Michal Mzorek <michal.mzorek@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 26 ++++++++
 drivers/gpu/drm/xe/xe_vm.c           | 88 ++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h           |  9 +++
 drivers/gpu/drm/xe/xe_vm_types.h     | 29 +++++++++
 4 files changed, 152 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 93afa54c8780..a84f6247f8a2 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -345,6 +345,31 @@ int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len)
 	return full ? -ENOSPC : 0;
 }
 
+static void save_pagefault_to_vm(struct xe_device *xe, struct xe_gt_pagefault *pf)
+{
+	struct xe_vm *vm;
+
+	/*
+	 * Pagefault may be associated to VM that is not in fault mode.
+	 * Perform asid_to_vm behavior, except if vm is not in fault
+	 * mode, return the VM anyways.
+	 */
+	down_read(&xe->usm.lock);
+	vm = xa_load(&xe->usm.asid_to_vm, pf->asid);
+	if (vm)
+		xe_vm_get(vm);
+	else
+		vm = ERR_PTR(-EINVAL);
+	up_read(&xe->usm.lock);
+
+	if (IS_ERR(vm))
+		return;
+
+	xe_vm_add_fault_entry_pf(vm, pf);
+
+	xe_vm_put(vm);
+}
+
 #define USM_QUEUE_MAX_RUNTIME_MS	20
 
 static void pf_queue_work_func(struct work_struct *w)
@@ -364,6 +389,7 @@ static void pf_queue_work_func(struct work_struct *w)
 		if (unlikely(ret)) {
 			print_pagefault(xe, &pf);
 			pf.fault_unsuccessful = 1;
+			save_pagefault_to_vm(xe, &pf);
 			drm_dbg(&xe->drm, "Fault response: Unsuccessful %d\n", ret);
 		}
 
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 0c69ef6b5ec5..107e397b4987 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -27,7 +27,9 @@
 #include "xe_device.h"
 #include "xe_drm_client.h"
 #include "xe_exec_queue.h"
+#include "xe_gt.h"
 #include "xe_gt_pagefault.h"
+#include "xe_gt_pagefault_types.h"
 #include "xe_gt_tlb_invalidation.h"
 #include "xe_migrate.h"
 #include "xe_pat.h"
@@ -778,6 +780,87 @@ int xe_vm_userptr_check_repin(struct xe_vm *vm)
 		list_empty_careful(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
 }
 
+static struct xe_hw_engine *
+hw_engine_lookup_class_instance(struct xe_vm *vm,
+				enum xe_engine_class class,
+				u16 instance)
+{
+	struct xe_device *xe = vm->xe;
+	struct xe_hw_engine *hwe;
+	enum xe_hw_engine_id id;
+	struct xe_gt *gt;
+	u8 gt_id;
+
+	for_each_gt(gt, xe, gt_id)
+		for_each_hw_engine(hwe, gt, id)
+			if (hwe->class == class && hwe->instance == instance)
+				return hwe;
+	return NULL;
+}
+
+/**
+ * xe_vm_add_fault_entry_pf() - Add pagefault to vm fault list
+ * @vm: The VM.
+ * @pf: The pagefault.
+ *
+ * This function takes the data from the pagefault @pf and saves it to @vm->faults.list.
+ *
+ * The function exits silently if the list is full, and reports a warning if the pagefault
+ * could not be saved to the list.
+ */
+void xe_vm_add_fault_entry_pf(struct xe_vm *vm, struct xe_gt_pagefault *pf)
+{
+	struct xe_vm_fault_entry *e = NULL;
+	struct xe_hw_engine *hwe;
+
+	/* Do not report faults on reserved engines */
+	hwe = hw_engine_lookup_class_instance(vm, pf->engine_class, pf->engine_instance);
+	if (!hwe || xe_hw_engine_is_reserved(hwe))
+		return;
+
+	e = kzalloc(sizeof(*e), GFP_KERNEL);
+	if (!e) {
+		drm_warn(&vm->xe->drm,
+			 "Could not allocate memory for fault!\n");
+		return;
+	}
+
+	spin_lock(&vm->faults.lock);
+
+	/*
+	 * Limit the number of faults in the fault list to prevent
+	 * memory overuse.
+	 */
+	if (vm->faults.len >= MAX_FAULTS_SAVED_PER_VM) {
+		kfree(e);
+		goto out;
+	}
+
+	e->address = pf->page_addr;
+	e->address_precision = 1;
+	e->access_type = pf->access_type;
+	e->fault_type = pf->fault_type;
+	e->fault_level = pf->fault_level;
+
+	list_add_tail(&e->list, &vm->faults.list);
+	vm->faults.len++;
+out:
+	spin_unlock(&vm->faults.lock);
+}
+
+static void xe_vm_clear_fault_entries(struct xe_vm *vm)
+{
+	struct xe_vm_fault_entry *e, *tmp;
+
+	spin_lock(&vm->faults.lock);
+	list_for_each_entry_safe(e, tmp, &vm->faults.list, list) {
+		list_del(&e->list);
+		kfree(e);
+	}
+	vm->faults.len = 0;
+	spin_unlock(&vm->faults.lock);
+}
+
 static int xe_vma_ops_alloc(struct xe_vma_ops *vops, bool array_of_binds)
 {
 	int i;
@@ -1660,6 +1743,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	init_rwsem(&vm->userptr.notifier_lock);
 	spin_lock_init(&vm->userptr.invalidated_lock);
 
+	INIT_LIST_HEAD(&vm->faults.list);
+	spin_lock_init(&vm->faults.lock);
+
 	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
 
 	INIT_WORK(&vm->destroy_work, vm_destroy_work_func);
@@ -1930,6 +2016,8 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	}
 	up_write(&xe->usm.lock);
 
+	xe_vm_clear_fault_entries(vm);
+
 	for_each_tile(tile, xe, id)
 		xe_range_fence_tree_fini(&vm->rftree[id]);
 
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 0ef811fc2bde..9bd7e93824da 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -12,6 +12,12 @@
 #include "xe_map.h"
 #include "xe_vm_types.h"
 
+/**
+ * MAX_FAULTS_SAVED_PER_VM - Maximum number of faults each vm can store before future
+ * faults are discarded to prevent memory overuse
+ */
+#define MAX_FAULTS_SAVED_PER_VM	50
+
 struct drm_device;
 struct drm_printer;
 struct drm_file;
@@ -22,6 +28,7 @@ struct dma_fence;
 
 struct xe_exec_queue;
 struct xe_file;
+struct xe_gt_pagefault;
 struct xe_sync_entry;
 struct xe_svm_range;
 struct drm_exec;
@@ -257,6 +264,8 @@ int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma);
 
 int xe_vma_userptr_check_repin(struct xe_userptr_vma *uvma);
 
+void xe_vm_add_fault_entry_pf(struct xe_vm *vm, struct xe_gt_pagefault *pf);
+
 bool xe_vm_validate_should_retry(struct drm_exec *exec, int err, ktime_t *end);
 
 int xe_vm_lock_vma(struct drm_exec *exec, struct xe_vma *vma);
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 1662604c4486..38e9f5d66386 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -19,6 +19,7 @@
 #include "xe_range_fence.h"
 
 struct xe_bo;
+struct xe_pagefault;
 struct xe_svm_range;
 struct xe_sync_entry;
 struct xe_user_fence;
@@ -142,6 +143,24 @@ struct xe_userptr_vma {
 
 struct xe_device;
 
+/**
+ * struct xe_vm_fault_entry - Elements of vm->faults.list
+ * @list: link into @xe_vm.faults.list
+ * @address: address of the fault
+ * @address_precision: precision of faulted address
+ * @access_type: type of address access that resulted in fault
+ * @fault_type: type of fault reported
+ * @fault_level: fault level of the fault
+ */
+struct xe_vm_fault_entry {
+	struct list_head list;
+	u64 address;
+	u32 address_precision;
+	u8 access_type;
+	u8 fault_type;
+	u8 fault_level;
+};
+
 struct xe_vm {
 	/** @gpuvm: base GPUVM used to track VMAs */
 	struct drm_gpuvm gpuvm;
@@ -305,6 +324,16 @@ struct xe_vm {
 		bool capture_once;
 	} error_capture;
 
+	/** @faults: List of all faults associated with this VM */
+	struct {
+		/** @faults.lock: lock protecting @faults.list */
+		spinlock_t lock;
+		/** @faults.list: list of xe_vm_fault_entry entries */
+		struct list_head list;
+		/** @faults.len: length of @faults.list */
+		unsigned int len;
+	} faults;
+
 	/**
 	 * @tlb_flush_seqno: Required TLB flush seqno for the next exec.
 	 * protected by the vm resv.
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v23 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
  2025-04-24 20:49 [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
                   ` (3 preceding siblings ...)
  2025-04-24 20:49 ` [PATCH v23 4/5] drm/xe/xe_vm: Add per VM fault info Jonathan Cavitt
@ 2025-04-24 20:49 ` Jonathan Cavitt
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Cavitt @ 2025-04-24 20:49 UTC (permalink / raw)
  To: intel-xe
  Cc: saurabhg.gupta, alex.zuo, jonathan.cavitt, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

Add support for userspace to request a list of observed faults
from a specified VM.

v2:
- Only allow querying of failed pagefaults (Matt Brost)

v3:
- Remove unnecessary size parameter from helper function, as it
  is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jainxun)
- Set address_precision to 1 (Jainxun)
- Report max size instead of dynamic size for memory allocation
  purposes.  Total memory usage is reported separately.

v4:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)

v5:
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
- Update fill_property_pfs to eliminate need for kzalloc (Jianxun)

v6:
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
  (jcavitt)

v7:
- Revert back to using xe_vm_get_property_ioctl
- Apply better copy_to_user logic (jcavitt)

v8:
- Fix and clean up error value handling in ioctl (jcavitt)
- Reapply return mask for fill_faults (jcavitt)

v9:
- Future-proof size logic for zero-size properties (jcavitt)
- Add access and fault types (Jianxun)
- Remove address type (Jianxun)

v10:
- Remove unnecessary switch case logic (Raag)
- Compress size get, size validation, and property fill functions into a
  single helper function (jcavitt)
- Assert valid size (jcavitt)

v11:
- Remove unnecessary else condition
- Correct backwards helper function size logic (jcavitt)

v12:
- Use size_t instead of int (Raag)

v13:
- Remove engine class and instance (Ivan)

v14:
- Map access type, fault type, and fault level to user macros (Matt
  Brost, Ivan)

v15:
- Remove unnecessary size assertion (jcavitt)

v16:
- Nit fixes (Matt Brost)

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Cc: Jainxun Zhang <jianxun.zhang@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c |   2 +
 drivers/gpu/drm/xe/xe_vm.c     | 108 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h     |   2 +
 3 files changed, 112 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 75e753e0a682..319768733504 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -196,6 +196,8 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl,
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
+			  DRM_RENDER_ALLOW),
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 107e397b4987..5e79de592dd3 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3600,6 +3600,114 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	return err;
 }
 
+/*
+ * Map access type, fault type, and fault level from current bspec
+ * specification to user spec abstraction.  The current mapping is
+ * 1-to-1, but if there is ever a hardware change, we will need
+ * this abstraction layer to maintain API stability through the
+ * hardware change.
+ */
+static u8 xe_to_user_access_type(u8 access_type)
+{
+	return access_type;
+}
+
+static u8 xe_to_user_fault_type(u8 fault_type)
+{
+	return fault_type;
+}
+
+static u8 xe_to_user_fault_level(u8 fault_level)
+{
+	return fault_level;
+}
+
+static int fill_faults(struct xe_vm *vm,
+		       struct drm_xe_vm_get_property *args)
+{
+	struct xe_vm_fault __user *usr_ptr = u64_to_user_ptr(args->data);
+	struct xe_vm_fault store = { 0 };
+	struct xe_vm_fault_entry *entry;
+	int ret = 0, i = 0, count, entry_size;
+
+	entry_size = sizeof(struct xe_vm_fault);
+	count = args->size / entry_size;
+
+	spin_lock(&vm->faults.lock);
+	list_for_each_entry(entry, &vm->faults.list, list) {
+		if (i++ == count)
+			break;
+
+		memset(&store, 0, entry_size);
+
+		store.address = entry->address;
+		store.address_precision = entry->address_precision;
+
+		store.access_type = xe_to_user_access_type(entry->access_type);
+		store.fault_type = xe_to_user_fault_type(entry->fault_type);
+		store.fault_level = xe_to_user_fault_level(entry->fault_level);
+
+		ret = copy_to_user(usr_ptr, &store, entry_size);
+		if (ret)
+			break;
+
+		usr_ptr++;
+	}
+	spin_unlock(&vm->faults.lock);
+
+	return ret ? -EFAULT : 0;
+}
+
+static int xe_vm_get_property_helper(struct xe_vm *vm,
+				     struct drm_xe_vm_get_property *args)
+{
+	size_t size;
+
+	switch (args->property) {
+	case DRM_XE_VM_GET_PROPERTY_FAULTS:
+		spin_lock(&vm->faults.lock);
+		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
+		spin_unlock(&vm->faults.lock);
+
+		if (args->size)
+			/*
+			 * Number of faults may increase between calls to
+			 * xe_vm_get_property_ioctl, so just report the
+			 * number of faults the user requests if it's less
+			 * than or equal to the number of faults in the VM
+			 * fault array.
+			 */
+			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
+
+		args->size = size;
+		return 0;
+	}
+	return -EINVAL;
+}
+
+int xe_vm_get_property_ioctl(struct drm_device *drm, void *data,
+			     struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(drm);
+	struct xe_file *xef = to_xe_file(file);
+	struct drm_xe_vm_get_property *args = data;
+	struct xe_vm *vm;
+	int ret = 0;
+
+	if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1] ||
+			     args->reserved[2]))
+		return -EINVAL;
+
+	vm = xe_vm_lookup(xef, args->vm_id);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -ENOENT;
+
+	ret = xe_vm_get_property_helper(vm, args);
+
+	xe_vm_put(vm);
+	return ret;
+}
+
 /**
  * xe_vm_bind_kernel_bo - bind a kernel BO to a VM
  * @vm: VM to bind the BO to
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 9bd7e93824da..63ec22458e04 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -196,6 +196,8 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
 int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file);
+int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file);
 
 void xe_vm_close_and_put(struct xe_vm *vm);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-24 20:49 ` [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header Jonathan Cavitt
@ 2025-04-27 11:57   ` Dafna Hirschfeld
  2025-04-28  1:51     ` Matthew Brost
                       ` (2 more replies)
  2025-04-29 14:22   ` Dafna Hirschfeld
  1 sibling, 3 replies; 17+ messages in thread
From: Dafna Hirschfeld @ 2025-04-27 11:57 UTC (permalink / raw)
  To: Jonathan Cavitt
  Cc: intel-xe, saurabhg.gupta, alex.zuo, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

On 24.04.2025 20:49, Jonathan Cavitt wrote:
>Move the pagefault struct from xe_gt_pagefault.c to the
>xe_gt_pagefault_types.h header file, and move the associated enum values
>into the regs folder under xe_pagefault_desc.h
>
>Since xe_pagefault_desc.h is being initialized here, also move the
>xe_guc_pagefault_desc hardware formats to the new file.

Hi, currently page fault is assumed to be handled by guc only.
In order to make the descriptor more general, I sent two patches moving
the descriptor to xe_uc_fw_types.h. [1]
Maybe my patches could be used instead of this one?
[1] https://lore.kernel.org/intel-xe/20250209121217.1604458-1-dafna.hirschfeld@intel.com/

Thanks,
Dafna


>
>v2:
>- Normalize names for common header (Matt Brost)
>
>v3:
>- s/Migrate/Move (Michal W)
>- s/xe_pagefault/xe_gt_pagefault (Michal W)
>- Create new header file, xe_gt_pagefault_types.h (Michal W)
>- Add kernel docs (Michal W)
>
>v4:
>- Fix includes usage (Michal W)
>- Reference Bspec (Michal W)
>
>v5:
>- Convert enums to defines in regs folder (Michal W)
>- Move xe_guc_pagefault_desc to regs folder (Michal W)
>
>Bspec: 77412
>Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
>Acked-by: Matthew Brost <matthew.brost@intel.com>
>Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
>---
> drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> 4 files changed, 100 insertions(+), 62 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>new file mode 100644
>index 000000000000..a169ac274e14
>--- /dev/null
>+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>@@ -0,0 +1,49 @@
>+/* SPDX-License-Identifier: MIT */
>+/*
>+ * Copyright © 2025 Intel Corporation
>+ */
>+
>+#ifndef _XE_PAGEFAULT_DESC_H_
>+#define _XE_PAGEFAULT_DESC_H_
>+
>+#include <linux/bits.h>
>+#include <linux/types.h>
>+
>+struct xe_guc_pagefault_desc {
>+	u32 dw0;
>+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
>+#define PFD_SRC_ID		GENMASK(10, 3)
>+#define PFD_RSVD_0		GENMASK(17, 11)
>+#define XE2_PFD_TRVA_FAULT	BIT(18)
>+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
>+#define PFD_ENG_CLASS		GENMASK(27, 25)
>+#define PFD_PDATA_LO		GENMASK(31, 28)
>+
>+	u32 dw1;
>+#define PFD_PDATA_HI		GENMASK(11, 0)
>+#define PFD_PDATA_HI_SHIFT	4
>+#define PFD_ASID		GENMASK(31, 12)
>+
>+	u32 dw2;
>+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
>+#define PFD_FAULT_TYPE		GENMASK(3, 2)
>+#define PFD_VFID		GENMASK(9, 4)
>+#define PFD_RSVD_1		GENMASK(11, 10)
>+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
>+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
>+
>+	u32 dw3;
>+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
>+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
>+} __packed;
>+
>+#define FLT_ACCESS_TYPE_READ		0u
>+#define FLT_ACCESS_TYPE_WRITE		1u
>+#define FLT_ACCESS_TYPE_ATOMIC		2u
>+#define FLT_ACCESS_TYPE_RESERVED	3u
>+
>+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
>+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
>+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
>+
>+#endif
>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>index d4e3b7eb165a..93afa54c8780 100644
>--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>@@ -12,8 +12,10 @@
> #include <drm/drm_managed.h>
>
> #include "abi/guc_actions_abi.h"
>+#include "regs/xe_pagefault_desc.h"
> #include "xe_bo.h"
> #include "xe_gt.h"
>+#include "xe_gt_pagefault_types.h"
> #include "xe_gt_stats.h"
> #include "xe_gt_tlb_invalidation.h"
> #include "xe_guc.h"
>@@ -23,33 +25,6 @@
> #include "xe_trace_bo.h"
> #include "xe_vm.h"
>
>-struct pagefault {
>-	u64 page_addr;
>-	u32 asid;
>-	u16 pdata;
>-	u8 vfid;
>-	u8 access_type;
>-	u8 fault_type;
>-	u8 fault_level;
>-	u8 engine_class;
>-	u8 engine_instance;
>-	u8 fault_unsuccessful;
>-	bool trva_fault;
>-};
>-
>-enum access_type {
>-	ACCESS_TYPE_READ = 0,
>-	ACCESS_TYPE_WRITE = 1,
>-	ACCESS_TYPE_ATOMIC = 2,
>-	ACCESS_TYPE_RESERVED = 3,
>-};
>-
>-enum fault_type {
>-	NOT_PRESENT = 0,
>-	WRITE_ACCESS_VIOLATION = 1,
>-	ATOMIC_ACCESS_VIOLATION = 2,
>-};
>-
> struct acc {
> 	u64 va_range_base;
> 	u32 asid;
>@@ -61,9 +36,9 @@ struct acc {
> 	u8 engine_instance;
> };
>
>-static bool access_is_atomic(enum access_type access_type)
>+static bool access_is_atomic(u32 access_type)
> {
>-	return access_type == ACCESS_TYPE_ATOMIC;
>+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> }
>
> static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
>@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> 	return vm;
> }
>
>-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> {
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct xe_vm *vm;
>@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> 		goto unlock_vm;
> 	}
>
>-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
>+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> 		err = -EPERM;
> 		goto unlock_vm;
> 	}
>@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> }
>
>-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
>+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> {
> 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> 		 "\tVFID: %d\n"
>@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
>
> #define PF_MSG_LEN_DW	4
>
>-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
>+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> {
> 	const struct xe_guc_pagefault_desc *desc;
> 	bool ret = false;
>@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> 	struct xe_gt *gt = pf_queue->gt;
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct xe_guc_pagefault_reply reply = {};
>-	struct pagefault pf = {};
>+	struct xe_gt_pagefault pf = {};
> 	unsigned long threshold;
> 	int ret;
>
>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>new file mode 100644
>index 000000000000..b7d41b558de3
>--- /dev/null
>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>@@ -0,0 +1,42 @@
>+/* SPDX-License-Identifier: MIT */
>+/*
>+ * Copyright © 2022-2025 Intel Corporation
>+ */
>+
>+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
>+#define _XE_GT_PAGEFAULT_TYPES_H_
>+
>+#include <linux/types.h>
>+
>+/**
>+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
>+ * pagefault handler
>+ */
>+struct xe_gt_pagefault {
>+	/** @page_addr: faulted address of this pagefault */
>+	u64 page_addr;
>+	/** @asid: ASID of this pagefault */
>+	u32 asid;
>+	/** @pdata: PDATA of this pagefault */
>+	u16 pdata;
>+	/** @vfid: VFID of this pagefault */
>+	u8 vfid;
>+	/** @access_type: access type of this pagefault */
>+	u8 access_type;
>+	/** @fault_type: fault type of this pagefault */
>+	u8 fault_type;
>+	/** @fault_level: fault level of this pagefault */
>+	u8 fault_level;
>+	/** @engine_class: engine class this pagefault was reported on */
>+	u8 engine_class;
>+	/** @engine_instance: engine instance this pagefault was reported on */
>+	u8 engine_instance;
>+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
>+	u8 fault_unsuccessful;
>+	/** @prefetch: unused */
>+	bool prefetch;
>+	/** @trva_fault: is set if this is a TRTT fault */
>+	bool trva_fault;
>+};
>+
>+#endif
>diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
>index 6f57578b07cb..30ac21bb4f15 100644
>--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> 	FAULT_RESPONSE_DESC
> };
>
>-struct xe_guc_pagefault_desc {
>-	u32 dw0;
>-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
>-#define PFD_SRC_ID		GENMASK(10, 3)
>-#define PFD_RSVD_0		GENMASK(17, 11)
>-#define XE2_PFD_TRVA_FAULT	BIT(18)
>-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
>-#define PFD_ENG_CLASS		GENMASK(27, 25)
>-#define PFD_PDATA_LO		GENMASK(31, 28)
>-
>-	u32 dw1;
>-#define PFD_PDATA_HI		GENMASK(11, 0)
>-#define PFD_PDATA_HI_SHIFT	4
>-#define PFD_ASID		GENMASK(31, 12)
>-
>-	u32 dw2;
>-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
>-#define PFD_FAULT_TYPE		GENMASK(3, 2)
>-#define PFD_VFID		GENMASK(9, 4)
>-#define PFD_RSVD_1		GENMASK(11, 10)
>-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
>-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
>-
>-	u32 dw3;
>-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
>-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
>-} __packed;
>-
> struct xe_guc_pagefault_reply {
> 	u32 dw0;
> #define PFR_VALID		BIT(0)
>-- 
>2.43.0
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-27 11:57   ` Dafna Hirschfeld
@ 2025-04-28  1:51     ` Matthew Brost
  2025-04-28  2:09       ` Matthew Brost
  2025-04-28  2:14     ` Matthew Brost
  2025-04-28 14:17     ` Cavitt, Jonathan
  2 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-28  1:51 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Jonathan Cavitt, intel-xe, saurabhg.gupta, alex.zuo,
	joonas.lahtinen, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

On Sun, Apr 27, 2025 at 02:57:22PM +0300, Dafna Hirschfeld wrote:
> On 24.04.2025 20:49, Jonathan Cavitt wrote:
> > Move the pagefault struct from xe_gt_pagefault.c to the
> > xe_gt_pagefault_types.h header file, and move the associated enum values
> > into the regs folder under xe_pagefault_desc.h
> > 
> > Since xe_pagefault_desc.h is being initialized here, also move the
> > xe_guc_pagefault_desc hardware formats to the new file.
> 
> Hi, currently page fault is assumed to be handled by guc only.

Yea, this is layered incorrectly - this is my bad. So in our TLB
invalidation code, also my bad.

> In order to make the descriptor more general, I sent two patches moving
> the descriptor to xe_uc_fw_types.h. [1]
> Maybe my patches could be used instead of this one?
> [1] https://lore.kernel.org/intel-xe/20250209121217.1604458-1-dafna.hirschfeld@intel.com/
>

I think this patch is merged, or will be very shortly. I don't this is
worth holding up, but in general I think the flow should be adjusted to
roughly...

- GuC (or any other PF client) parses PF into HW agnostic format and
  passes it to GT page fault component - I think we drop any fields from
  the current GuC format that are unused by the GT page fault component
- A vfunc is attached too the HW agnostic format which is used to signal
  PF done, in the case of the GuC this issues H2G
- All GuC specific code is removed from GT page fault component

This should allow other firmwares to hook into the generic page fault
code.

Matt
 
> Thanks,
> Dafna
> 
> 
> > 
> > v2:
> > - Normalize names for common header (Matt Brost)
> > 
> > v3:
> > - s/Migrate/Move (Michal W)
> > - s/xe_pagefault/xe_gt_pagefault (Michal W)
> > - Create new header file, xe_gt_pagefault_types.h (Michal W)
> > - Add kernel docs (Michal W)
> > 
> > v4:
> > - Fix includes usage (Michal W)
> > - Reference Bspec (Michal W)
> > 
> > v5:
> > - Convert enums to defines in regs folder (Michal W)
> > - Move xe_guc_pagefault_desc to regs folder (Michal W)
> > 
> > Bspec: 77412
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> > Acked-by: Matthew Brost <matthew.brost@intel.com>
> > Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> > ---
> > drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> > drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> > 4 files changed, 100 insertions(+), 62 deletions(-)
> > create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > 
> > diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > new file mode 100644
> > index 000000000000..a169ac274e14
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_PAGEFAULT_DESC_H_
> > +#define _XE_PAGEFAULT_DESC_H_
> > +
> > +#include <linux/bits.h>
> > +#include <linux/types.h>
> > +
> > +struct xe_guc_pagefault_desc {
> > +	u32 dw0;
> > +#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > +#define PFD_SRC_ID		GENMASK(10, 3)
> > +#define PFD_RSVD_0		GENMASK(17, 11)
> > +#define XE2_PFD_TRVA_FAULT	BIT(18)
> > +#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > +#define PFD_ENG_CLASS		GENMASK(27, 25)
> > +#define PFD_PDATA_LO		GENMASK(31, 28)
> > +
> > +	u32 dw1;
> > +#define PFD_PDATA_HI		GENMASK(11, 0)
> > +#define PFD_PDATA_HI_SHIFT	4
> > +#define PFD_ASID		GENMASK(31, 12)
> > +
> > +	u32 dw2;
> > +#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > +#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > +#define PFD_VFID		GENMASK(9, 4)
> > +#define PFD_RSVD_1		GENMASK(11, 10)
> > +#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > +#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > +
> > +	u32 dw3;
> > +#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > +#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > +} __packed;
> > +
> > +#define FLT_ACCESS_TYPE_READ		0u
> > +#define FLT_ACCESS_TYPE_WRITE		1u
> > +#define FLT_ACCESS_TYPE_ATOMIC		2u
> > +#define FLT_ACCESS_TYPE_RESERVED	3u
> > +
> > +#define FLT_TYPE_NOT_PRESENT_FAULT		0u
> > +#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
> > +#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > index d4e3b7eb165a..93afa54c8780 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > @@ -12,8 +12,10 @@
> > #include <drm/drm_managed.h>
> > 
> > #include "abi/guc_actions_abi.h"
> > +#include "regs/xe_pagefault_desc.h"
> > #include "xe_bo.h"
> > #include "xe_gt.h"
> > +#include "xe_gt_pagefault_types.h"
> > #include "xe_gt_stats.h"
> > #include "xe_gt_tlb_invalidation.h"
> > #include "xe_guc.h"
> > @@ -23,33 +25,6 @@
> > #include "xe_trace_bo.h"
> > #include "xe_vm.h"
> > 
> > -struct pagefault {
> > -	u64 page_addr;
> > -	u32 asid;
> > -	u16 pdata;
> > -	u8 vfid;
> > -	u8 access_type;
> > -	u8 fault_type;
> > -	u8 fault_level;
> > -	u8 engine_class;
> > -	u8 engine_instance;
> > -	u8 fault_unsuccessful;
> > -	bool trva_fault;
> > -};
> > -
> > -enum access_type {
> > -	ACCESS_TYPE_READ = 0,
> > -	ACCESS_TYPE_WRITE = 1,
> > -	ACCESS_TYPE_ATOMIC = 2,
> > -	ACCESS_TYPE_RESERVED = 3,
> > -};
> > -
> > -enum fault_type {
> > -	NOT_PRESENT = 0,
> > -	WRITE_ACCESS_VIOLATION = 1,
> > -	ATOMIC_ACCESS_VIOLATION = 2,
> > -};
> > -
> > struct acc {
> > 	u64 va_range_base;
> > 	u32 asid;
> > @@ -61,9 +36,9 @@ struct acc {
> > 	u8 engine_instance;
> > };
> > 
> > -static bool access_is_atomic(enum access_type access_type)
> > +static bool access_is_atomic(u32 access_type)
> > {
> > -	return access_type == ACCESS_TYPE_ATOMIC;
> > +	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> > }
> > 
> > static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> > @@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> > 	return vm;
> > }
> > 
> > -static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > +static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> > {
> > 	struct xe_device *xe = gt_to_xe(gt);
> > 	struct xe_vm *vm;
> > @@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > 		goto unlock_vm;
> > 	}
> > 
> > -	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
> > +	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> > 		err = -EPERM;
> > 		goto unlock_vm;
> > 	}
> > @@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> > 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> > }
> > 
> > -static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > +static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> > {
> > 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> > 		 "\tVFID: %d\n"
> > @@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > 
> > #define PF_MSG_LEN_DW	4
> > 
> > -static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
> > +static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> > {
> > 	const struct xe_guc_pagefault_desc *desc;
> > 	bool ret = false;
> > @@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> > 	struct xe_gt *gt = pf_queue->gt;
> > 	struct xe_device *xe = gt_to_xe(gt);
> > 	struct xe_guc_pagefault_reply reply = {};
> > -	struct pagefault pf = {};
> > +	struct xe_gt_pagefault pf = {};
> > 	unsigned long threshold;
> > 	int ret;
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > new file mode 100644
> > index 000000000000..b7d41b558de3
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > @@ -0,0 +1,42 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2022-2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_GT_PAGEFAULT_TYPES_H_
> > +#define _XE_GT_PAGEFAULT_TYPES_H_
> > +
> > +#include <linux/types.h>
> > +
> > +/**
> > + * struct xe_gt_pagefault - Structure of pagefaults returned by the
> > + * pagefault handler
> > + */
> > +struct xe_gt_pagefault {
> > +	/** @page_addr: faulted address of this pagefault */
> > +	u64 page_addr;
> > +	/** @asid: ASID of this pagefault */
> > +	u32 asid;
> > +	/** @pdata: PDATA of this pagefault */
> > +	u16 pdata;
> > +	/** @vfid: VFID of this pagefault */
> > +	u8 vfid;
> > +	/** @access_type: access type of this pagefault */
> > +	u8 access_type;
> > +	/** @fault_type: fault type of this pagefault */
> > +	u8 fault_type;
> > +	/** @fault_level: fault level of this pagefault */
> > +	u8 fault_level;
> > +	/** @engine_class: engine class this pagefault was reported on */
> > +	u8 engine_class;
> > +	/** @engine_instance: engine instance this pagefault was reported on */
> > +	u8 engine_instance;
> > +	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
> > +	u8 fault_unsuccessful;
> > +	/** @prefetch: unused */
> > +	bool prefetch;
> > +	/** @trva_fault: is set if this is a TRTT fault */
> > +	bool trva_fault;
> > +};
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > index 6f57578b07cb..30ac21bb4f15 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > @@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> > 	FAULT_RESPONSE_DESC
> > };
> > 
> > -struct xe_guc_pagefault_desc {
> > -	u32 dw0;
> > -#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > -#define PFD_SRC_ID		GENMASK(10, 3)
> > -#define PFD_RSVD_0		GENMASK(17, 11)
> > -#define XE2_PFD_TRVA_FAULT	BIT(18)
> > -#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > -#define PFD_ENG_CLASS		GENMASK(27, 25)
> > -#define PFD_PDATA_LO		GENMASK(31, 28)
> > -
> > -	u32 dw1;
> > -#define PFD_PDATA_HI		GENMASK(11, 0)
> > -#define PFD_PDATA_HI_SHIFT	4
> > -#define PFD_ASID		GENMASK(31, 12)
> > -
> > -	u32 dw2;
> > -#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > -#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > -#define PFD_VFID		GENMASK(9, 4)
> > -#define PFD_RSVD_1		GENMASK(11, 10)
> > -#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > -#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > -
> > -	u32 dw3;
> > -#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > -#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > -} __packed;
> > -
> > struct xe_guc_pagefault_reply {
> > 	u32 dw0;
> > #define PFR_VALID		BIT(0)
> > -- 
> > 2.43.0
> > 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-28  1:51     ` Matthew Brost
@ 2025-04-28  2:09       ` Matthew Brost
  0 siblings, 0 replies; 17+ messages in thread
From: Matthew Brost @ 2025-04-28  2:09 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Jonathan Cavitt, intel-xe, saurabhg.gupta, alex.zuo,
	joonas.lahtinen, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

On Sun, Apr 27, 2025 at 06:51:30PM -0700, Matthew Brost wrote:

Ignore this one... I screwed up the reply list as intel-xe is not
include. Replying again to include intel-xe.

Matt 

> On Sun, Apr 27, 2025 at 02:57:22PM +0300, Dafna Hirschfeld wrote:
> > On 24.04.2025 20:49, Jonathan Cavitt wrote:
> > > Move the pagefault struct from xe_gt_pagefault.c to the
> > > xe_gt_pagefault_types.h header file, and move the associated enum values
> > > into the regs folder under xe_pagefault_desc.h
> > > 
> > > Since xe_pagefault_desc.h is being initialized here, also move the
> > > xe_guc_pagefault_desc hardware formats to the new file.
> > 
> > Hi, currently page fault is assumed to be handled by guc only.
> 
> Yea, this is layered incorrectly - this is my bad. So in our TLB
> invalidation code, also my bad.
> 
> > In order to make the descriptor more general, I sent two patches moving
> > the descriptor to xe_uc_fw_types.h. [1]
> > Maybe my patches could be used instead of this one?
> > [1] https://lore.kernel.org/intel-xe/20250209121217.1604458-1-dafna.hirschfeld@intel.com/
> >
> 
> I think this patch is merged, or will be very shortly. I don't this is
> worth holding up, but in general I think the flow should be adjusted to
> roughly...
> 
> - GuC (or any other PF client) parses PF into HW agnostic format and
>   passes it to GT page fault component - I think we drop any fields from
>   the current GuC format that are unused by the GT page fault component
> - A vfunc is attached too the HW agnostic format which is used to signal
>   PF done, in the case of the GuC this issues H2G
> - All GuC specific code is removed from GT page fault component
> 
> This should allow other firmwares to hook into the generic page fault
> code.
> 
> Matt
>  
> > Thanks,
> > Dafna
> > 
> > 
> > > 
> > > v2:
> > > - Normalize names for common header (Matt Brost)
> > > 
> > > v3:
> > > - s/Migrate/Move (Michal W)
> > > - s/xe_pagefault/xe_gt_pagefault (Michal W)
> > > - Create new header file, xe_gt_pagefault_types.h (Michal W)
> > > - Add kernel docs (Michal W)
> > > 
> > > v4:
> > > - Fix includes usage (Michal W)
> > > - Reference Bspec (Michal W)
> > > 
> > > v5:
> > > - Convert enums to defines in regs folder (Michal W)
> > > - Move xe_guc_pagefault_desc to regs folder (Michal W)
> > > 
> > > Bspec: 77412
> > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> > > Acked-by: Matthew Brost <matthew.brost@intel.com>
> > > Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> > > drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> > > drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> > > drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> > > 4 files changed, 100 insertions(+), 62 deletions(-)
> > > create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > > create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > > 
> > > diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > > new file mode 100644
> > > index 000000000000..a169ac274e14
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > > @@ -0,0 +1,49 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright © 2025 Intel Corporation
> > > + */
> > > +
> > > +#ifndef _XE_PAGEFAULT_DESC_H_
> > > +#define _XE_PAGEFAULT_DESC_H_
> > > +
> > > +#include <linux/bits.h>
> > > +#include <linux/types.h>
> > > +
> > > +struct xe_guc_pagefault_desc {
> > > +	u32 dw0;
> > > +#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > > +#define PFD_SRC_ID		GENMASK(10, 3)
> > > +#define PFD_RSVD_0		GENMASK(17, 11)
> > > +#define XE2_PFD_TRVA_FAULT	BIT(18)
> > > +#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > > +#define PFD_ENG_CLASS		GENMASK(27, 25)
> > > +#define PFD_PDATA_LO		GENMASK(31, 28)
> > > +
> > > +	u32 dw1;
> > > +#define PFD_PDATA_HI		GENMASK(11, 0)
> > > +#define PFD_PDATA_HI_SHIFT	4
> > > +#define PFD_ASID		GENMASK(31, 12)
> > > +
> > > +	u32 dw2;
> > > +#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > > +#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > > +#define PFD_VFID		GENMASK(9, 4)
> > > +#define PFD_RSVD_1		GENMASK(11, 10)
> > > +#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > > +#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > > +
> > > +	u32 dw3;
> > > +#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > > +#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > > +} __packed;
> > > +
> > > +#define FLT_ACCESS_TYPE_READ		0u
> > > +#define FLT_ACCESS_TYPE_WRITE		1u
> > > +#define FLT_ACCESS_TYPE_ATOMIC		2u
> > > +#define FLT_ACCESS_TYPE_RESERVED	3u
> > > +
> > > +#define FLT_TYPE_NOT_PRESENT_FAULT		0u
> > > +#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
> > > +#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
> > > +
> > > +#endif
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > index d4e3b7eb165a..93afa54c8780 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > @@ -12,8 +12,10 @@
> > > #include <drm/drm_managed.h>
> > > 
> > > #include "abi/guc_actions_abi.h"
> > > +#include "regs/xe_pagefault_desc.h"
> > > #include "xe_bo.h"
> > > #include "xe_gt.h"
> > > +#include "xe_gt_pagefault_types.h"
> > > #include "xe_gt_stats.h"
> > > #include "xe_gt_tlb_invalidation.h"
> > > #include "xe_guc.h"
> > > @@ -23,33 +25,6 @@
> > > #include "xe_trace_bo.h"
> > > #include "xe_vm.h"
> > > 
> > > -struct pagefault {
> > > -	u64 page_addr;
> > > -	u32 asid;
> > > -	u16 pdata;
> > > -	u8 vfid;
> > > -	u8 access_type;
> > > -	u8 fault_type;
> > > -	u8 fault_level;
> > > -	u8 engine_class;
> > > -	u8 engine_instance;
> > > -	u8 fault_unsuccessful;
> > > -	bool trva_fault;
> > > -};
> > > -
> > > -enum access_type {
> > > -	ACCESS_TYPE_READ = 0,
> > > -	ACCESS_TYPE_WRITE = 1,
> > > -	ACCESS_TYPE_ATOMIC = 2,
> > > -	ACCESS_TYPE_RESERVED = 3,
> > > -};
> > > -
> > > -enum fault_type {
> > > -	NOT_PRESENT = 0,
> > > -	WRITE_ACCESS_VIOLATION = 1,
> > > -	ATOMIC_ACCESS_VIOLATION = 2,
> > > -};
> > > -
> > > struct acc {
> > > 	u64 va_range_base;
> > > 	u32 asid;
> > > @@ -61,9 +36,9 @@ struct acc {
> > > 	u8 engine_instance;
> > > };
> > > 
> > > -static bool access_is_atomic(enum access_type access_type)
> > > +static bool access_is_atomic(u32 access_type)
> > > {
> > > -	return access_type == ACCESS_TYPE_ATOMIC;
> > > +	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> > > }
> > > 
> > > static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> > > @@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> > > 	return vm;
> > > }
> > > 
> > > -static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > > +static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> > > {
> > > 	struct xe_device *xe = gt_to_xe(gt);
> > > 	struct xe_vm *vm;
> > > @@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > > 		goto unlock_vm;
> > > 	}
> > > 
> > > -	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
> > > +	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> > > 		err = -EPERM;
> > > 		goto unlock_vm;
> > > 	}
> > > @@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> > > 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> > > }
> > > 
> > > -static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > > +static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> > > {
> > > 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> > > 		 "\tVFID: %d\n"
> > > @@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > > 
> > > #define PF_MSG_LEN_DW	4
> > > 
> > > -static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
> > > +static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> > > {
> > > 	const struct xe_guc_pagefault_desc *desc;
> > > 	bool ret = false;
> > > @@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> > > 	struct xe_gt *gt = pf_queue->gt;
> > > 	struct xe_device *xe = gt_to_xe(gt);
> > > 	struct xe_guc_pagefault_reply reply = {};
> > > -	struct pagefault pf = {};
> > > +	struct xe_gt_pagefault pf = {};
> > > 	unsigned long threshold;
> > > 	int ret;
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > > new file mode 100644
> > > index 000000000000..b7d41b558de3
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > > @@ -0,0 +1,42 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright © 2022-2025 Intel Corporation
> > > + */
> > > +
> > > +#ifndef _XE_GT_PAGEFAULT_TYPES_H_
> > > +#define _XE_GT_PAGEFAULT_TYPES_H_
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +/**
> > > + * struct xe_gt_pagefault - Structure of pagefaults returned by the
> > > + * pagefault handler
> > > + */
> > > +struct xe_gt_pagefault {
> > > +	/** @page_addr: faulted address of this pagefault */
> > > +	u64 page_addr;
> > > +	/** @asid: ASID of this pagefault */
> > > +	u32 asid;
> > > +	/** @pdata: PDATA of this pagefault */
> > > +	u16 pdata;
> > > +	/** @vfid: VFID of this pagefault */
> > > +	u8 vfid;
> > > +	/** @access_type: access type of this pagefault */
> > > +	u8 access_type;
> > > +	/** @fault_type: fault type of this pagefault */
> > > +	u8 fault_type;
> > > +	/** @fault_level: fault level of this pagefault */
> > > +	u8 fault_level;
> > > +	/** @engine_class: engine class this pagefault was reported on */
> > > +	u8 engine_class;
> > > +	/** @engine_instance: engine instance this pagefault was reported on */
> > > +	u8 engine_instance;
> > > +	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
> > > +	u8 fault_unsuccessful;
> > > +	/** @prefetch: unused */
> > > +	bool prefetch;
> > > +	/** @trva_fault: is set if this is a TRTT fault */
> > > +	bool trva_fault;
> > > +};
> > > +
> > > +#endif
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > index 6f57578b07cb..30ac21bb4f15 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > @@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> > > 	FAULT_RESPONSE_DESC
> > > };
> > > 
> > > -struct xe_guc_pagefault_desc {
> > > -	u32 dw0;
> > > -#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > > -#define PFD_SRC_ID		GENMASK(10, 3)
> > > -#define PFD_RSVD_0		GENMASK(17, 11)
> > > -#define XE2_PFD_TRVA_FAULT	BIT(18)
> > > -#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > > -#define PFD_ENG_CLASS		GENMASK(27, 25)
> > > -#define PFD_PDATA_LO		GENMASK(31, 28)
> > > -
> > > -	u32 dw1;
> > > -#define PFD_PDATA_HI		GENMASK(11, 0)
> > > -#define PFD_PDATA_HI_SHIFT	4
> > > -#define PFD_ASID		GENMASK(31, 12)
> > > -
> > > -	u32 dw2;
> > > -#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > > -#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > > -#define PFD_VFID		GENMASK(9, 4)
> > > -#define PFD_RSVD_1		GENMASK(11, 10)
> > > -#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > > -#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > > -
> > > -	u32 dw3;
> > > -#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > > -#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > > -} __packed;
> > > -
> > > struct xe_guc_pagefault_reply {
> > > 	u32 dw0;
> > > #define PFR_VALID		BIT(0)
> > > -- 
> > > 2.43.0
> > > 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-27 11:57   ` Dafna Hirschfeld
  2025-04-28  1:51     ` Matthew Brost
@ 2025-04-28  2:14     ` Matthew Brost
  2025-04-28 14:17     ` Cavitt, Jonathan
  2 siblings, 0 replies; 17+ messages in thread
From: Matthew Brost @ 2025-04-28  2:14 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Jonathan Cavitt, intel-xe, saurabhg.gupta, alex.zuo,
	joonas.lahtinen, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

On Sun, Apr 27, 2025 at 02:57:22PM +0300, Dafna Hirschfeld wrote:
> On 24.04.2025 20:49, Jonathan Cavitt wrote:
> > Move the pagefault struct from xe_gt_pagefault.c to the
> > xe_gt_pagefault_types.h header file, and move the associated enum values
> > into the regs folder under xe_pagefault_desc.h
> > 
> > Since xe_pagefault_desc.h is being initialized here, also move the
> > xe_guc_pagefault_desc hardware formats to the new file.
> 
> Hi, currently page fault is assumed to be handled by guc only.

Yea, this is layered incorrectly - this is my bad. So in our TLB
invalidation code, also my bad.

> In order to make the descriptor more general, I sent two patches moving
> the descriptor to xe_uc_fw_types.h. [1]
> Maybe my patches could be used instead of this one?
> [1] https://lore.kernel.org/intel-xe/20250209121217.1604458-1-dafna.hirschfeld@intel.com/
>

I think this patch is merged, or will be very shortly. I don't this is
worth holding up, but in general I think the flow should be adjusted to
roughly...

- GuC (or any other PF client) parses PF into HW agnostic format and
  passes it to GT page fault component - I think we drop any fields from
  the current GuC format that are unused by the GT page fault component
- A vfunc is attached too the HW agnostic format which is used to signal
  PF done, in the case of the GuC this issues H2G
- All GuC specific code is removed from GT page fault component

This should allow other firmwares to hook into the generic page fault
code.

Matt
 
> Thanks,
> Dafna
> 
> 
> > 
> > v2:
> > - Normalize names for common header (Matt Brost)
> > 
> > v3:
> > - s/Migrate/Move (Michal W)
> > - s/xe_pagefault/xe_gt_pagefault (Michal W)
> > - Create new header file, xe_gt_pagefault_types.h (Michal W)
> > - Add kernel docs (Michal W)
> > 
> > v4:
> > - Fix includes usage (Michal W)
> > - Reference Bspec (Michal W)
> > 
> > v5:
> > - Convert enums to defines in regs folder (Michal W)
> > - Move xe_guc_pagefault_desc to regs folder (Michal W)
> > 
> > Bspec: 77412
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> > Acked-by: Matthew Brost <matthew.brost@intel.com>
> > Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> > ---
> > drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> > drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> > 4 files changed, 100 insertions(+), 62 deletions(-)
> > create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > 
> > diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > new file mode 100644
> > index 000000000000..a169ac274e14
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_PAGEFAULT_DESC_H_
> > +#define _XE_PAGEFAULT_DESC_H_
> > +
> > +#include <linux/bits.h>
> > +#include <linux/types.h>
> > +
> > +struct xe_guc_pagefault_desc {
> > +	u32 dw0;
> > +#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > +#define PFD_SRC_ID		GENMASK(10, 3)
> > +#define PFD_RSVD_0		GENMASK(17, 11)
> > +#define XE2_PFD_TRVA_FAULT	BIT(18)
> > +#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > +#define PFD_ENG_CLASS		GENMASK(27, 25)
> > +#define PFD_PDATA_LO		GENMASK(31, 28)
> > +
> > +	u32 dw1;
> > +#define PFD_PDATA_HI		GENMASK(11, 0)
> > +#define PFD_PDATA_HI_SHIFT	4
> > +#define PFD_ASID		GENMASK(31, 12)
> > +
> > +	u32 dw2;
> > +#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > +#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > +#define PFD_VFID		GENMASK(9, 4)
> > +#define PFD_RSVD_1		GENMASK(11, 10)
> > +#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > +#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > +
> > +	u32 dw3;
> > +#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > +#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > +} __packed;
> > +
> > +#define FLT_ACCESS_TYPE_READ		0u
> > +#define FLT_ACCESS_TYPE_WRITE		1u
> > +#define FLT_ACCESS_TYPE_ATOMIC		2u
> > +#define FLT_ACCESS_TYPE_RESERVED	3u
> > +
> > +#define FLT_TYPE_NOT_PRESENT_FAULT		0u
> > +#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
> > +#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > index d4e3b7eb165a..93afa54c8780 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > @@ -12,8 +12,10 @@
> > #include <drm/drm_managed.h>
> > 
> > #include "abi/guc_actions_abi.h"
> > +#include "regs/xe_pagefault_desc.h"
> > #include "xe_bo.h"
> > #include "xe_gt.h"
> > +#include "xe_gt_pagefault_types.h"
> > #include "xe_gt_stats.h"
> > #include "xe_gt_tlb_invalidation.h"
> > #include "xe_guc.h"
> > @@ -23,33 +25,6 @@
> > #include "xe_trace_bo.h"
> > #include "xe_vm.h"
> > 
> > -struct pagefault {
> > -	u64 page_addr;
> > -	u32 asid;
> > -	u16 pdata;
> > -	u8 vfid;
> > -	u8 access_type;
> > -	u8 fault_type;
> > -	u8 fault_level;
> > -	u8 engine_class;
> > -	u8 engine_instance;
> > -	u8 fault_unsuccessful;
> > -	bool trva_fault;
> > -};
> > -
> > -enum access_type {
> > -	ACCESS_TYPE_READ = 0,
> > -	ACCESS_TYPE_WRITE = 1,
> > -	ACCESS_TYPE_ATOMIC = 2,
> > -	ACCESS_TYPE_RESERVED = 3,
> > -};
> > -
> > -enum fault_type {
> > -	NOT_PRESENT = 0,
> > -	WRITE_ACCESS_VIOLATION = 1,
> > -	ATOMIC_ACCESS_VIOLATION = 2,
> > -};
> > -
> > struct acc {
> > 	u64 va_range_base;
> > 	u32 asid;
> > @@ -61,9 +36,9 @@ struct acc {
> > 	u8 engine_instance;
> > };
> > 
> > -static bool access_is_atomic(enum access_type access_type)
> > +static bool access_is_atomic(u32 access_type)
> > {
> > -	return access_type == ACCESS_TYPE_ATOMIC;
> > +	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> > }
> > 
> > static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> > @@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> > 	return vm;
> > }
> > 
> > -static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > +static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> > {
> > 	struct xe_device *xe = gt_to_xe(gt);
> > 	struct xe_vm *vm;
> > @@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > 		goto unlock_vm;
> > 	}
> > 
> > -	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
> > +	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> > 		err = -EPERM;
> > 		goto unlock_vm;
> > 	}
> > @@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> > 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> > }
> > 
> > -static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > +static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> > {
> > 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> > 		 "\tVFID: %d\n"
> > @@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > 
> > #define PF_MSG_LEN_DW	4
> > 
> > -static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
> > +static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> > {
> > 	const struct xe_guc_pagefault_desc *desc;
> > 	bool ret = false;
> > @@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> > 	struct xe_gt *gt = pf_queue->gt;
> > 	struct xe_device *xe = gt_to_xe(gt);
> > 	struct xe_guc_pagefault_reply reply = {};
> > -	struct pagefault pf = {};
> > +	struct xe_gt_pagefault pf = {};
> > 	unsigned long threshold;
> > 	int ret;
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > new file mode 100644
> > index 000000000000..b7d41b558de3
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > @@ -0,0 +1,42 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2022-2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_GT_PAGEFAULT_TYPES_H_
> > +#define _XE_GT_PAGEFAULT_TYPES_H_
> > +
> > +#include <linux/types.h>
> > +
> > +/**
> > + * struct xe_gt_pagefault - Structure of pagefaults returned by the
> > + * pagefault handler
> > + */
> > +struct xe_gt_pagefault {
> > +	/** @page_addr: faulted address of this pagefault */
> > +	u64 page_addr;
> > +	/** @asid: ASID of this pagefault */
> > +	u32 asid;
> > +	/** @pdata: PDATA of this pagefault */
> > +	u16 pdata;
> > +	/** @vfid: VFID of this pagefault */
> > +	u8 vfid;
> > +	/** @access_type: access type of this pagefault */
> > +	u8 access_type;
> > +	/** @fault_type: fault type of this pagefault */
> > +	u8 fault_type;
> > +	/** @fault_level: fault level of this pagefault */
> > +	u8 fault_level;
> > +	/** @engine_class: engine class this pagefault was reported on */
> > +	u8 engine_class;
> > +	/** @engine_instance: engine instance this pagefault was reported on */
> > +	u8 engine_instance;
> > +	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
> > +	u8 fault_unsuccessful;
> > +	/** @prefetch: unused */
> > +	bool prefetch;
> > +	/** @trva_fault: is set if this is a TRTT fault */
> > +	bool trva_fault;
> > +};
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > index 6f57578b07cb..30ac21bb4f15 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > @@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> > 	FAULT_RESPONSE_DESC
> > };
> > 
> > -struct xe_guc_pagefault_desc {
> > -	u32 dw0;
> > -#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > -#define PFD_SRC_ID		GENMASK(10, 3)
> > -#define PFD_RSVD_0		GENMASK(17, 11)
> > -#define XE2_PFD_TRVA_FAULT	BIT(18)
> > -#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > -#define PFD_ENG_CLASS		GENMASK(27, 25)
> > -#define PFD_PDATA_LO		GENMASK(31, 28)
> > -
> > -	u32 dw1;
> > -#define PFD_PDATA_HI		GENMASK(11, 0)
> > -#define PFD_PDATA_HI_SHIFT	4
> > -#define PFD_ASID		GENMASK(31, 12)
> > -
> > -	u32 dw2;
> > -#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > -#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > -#define PFD_VFID		GENMASK(9, 4)
> > -#define PFD_RSVD_1		GENMASK(11, 10)
> > -#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > -#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > -
> > -	u32 dw3;
> > -#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > -#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > -} __packed;
> > -
> > struct xe_guc_pagefault_reply {
> > 	u32 dw0;
> > #define PFR_VALID		BIT(0)
> > -- 
> > 2.43.0
> > 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-27 11:57   ` Dafna Hirschfeld
  2025-04-28  1:51     ` Matthew Brost
  2025-04-28  2:14     ` Matthew Brost
@ 2025-04-28 14:17     ` Cavitt, Jonathan
  2 siblings, 0 replies; 17+ messages in thread
From: Cavitt, Jonathan @ 2025-04-28 14:17 UTC (permalink / raw)
  To: Hirschfeld, Dafna, Ausmus, James
  Cc: intel-xe@listxe_uc_fw_types.hs.freedesktop.org, Gupta, saurabhg,
	Zuo, Alex, joonas.lahtinen@linux.intel.com, Brost, Matthew,
	Zhang, Jianxun, Lin, Shuicheng, dri-devel@lists.freedesktop.org,
	Wajdeczko, Michal, Mrozek, Michal, Jadav, Raag, Harrison, John C,
	Briano, Ivan, Auld, Matthew

-----Original Message-----
From: Hirschfeld, Dafna <dafna.hirschfeld@intel.com> 
Sent: Sunday, April 27, 2025 4:57 AM
To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
Cc: intel-xe@listxe_uc_fw_types.hs.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Wajdeczko, Michal <Michal.Wajdeczko@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>; Jadav, Raag <raag.jadav@intel.com>; Harrison, John C <john.c.harrison@intel.com>; Briano, Ivan <ivan.briano@intel.com>; Auld, Matthew <matthew.auld@intel.com>
Subject: Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
> 
> On 24.04.2025 20:49, Jonathan Cavitt wrote:
> >Move the pagefault struct from xe_gt_pagefault.c to the
> >xe_gt_pagefault_types.h header file, and move the associated enum values
> >into the regs folder under xe_pagefault_desc.h
> >
> >Since xe_pagefault_desc.h is being initialized here, also move the
> >xe_guc_pagefault_desc hardware formats to the new file.
> 
> Hi, currently page fault is assumed to be handled by guc only.
> In order to make the descriptor more general, I sent two patches moving
> the descriptor to xe_uc_fw_types.h. [1]
> Maybe my patches could be used instead of this one?
> [1] https://lore.kernel.org/intel-xe/20250209121217.1604458-1-dafna.hirschfeld@intel.com/

I'm fine proceeding with either method, though I suspect @Ausmus, James may want to
be informed about that patch series before it's pushed.
-Jonathan Cavitt

> 
> Thanks,
> Dafna
> 
> 
> >
> >v2:
> >- Normalize names for common header (Matt Brost)
> >
> >v3:
> >- s/Migrate/Move (Michal W)
> >- s/xe_pagefault/xe_gt_pagefault (Michal W)
> >- Create new header file, xe_gt_pagefault_types.h (Michal W)
> >- Add kernel docs (Michal W)
> >
> >v4:
> >- Fix includes usage (Michal W)
> >- Reference Bspec (Michal W)
> >
> >v5:
> >- Convert enums to defines in regs folder (Michal W)
> >- Move xe_guc_pagefault_desc to regs folder (Michal W)
> >
> >Bspec: 77412
> >Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> >Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> >Acked-by: Matthew Brost <matthew.brost@intel.com>
> >Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> >---
> > drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> > drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> > 4 files changed, 100 insertions(+), 62 deletions(-)
> > create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >
> >diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> >new file mode 100644
> >index 000000000000..a169ac274e14
> >--- /dev/null
> >+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> >@@ -0,0 +1,49 @@
> >+/* SPDX-License-Identifier: MIT */
> >+/*
> >+ * Copyright © 2025 Intel Corporation
> >+ */
> >+
> >+#ifndef _XE_PAGEFAULT_DESC_H_
> >+#define _XE_PAGEFAULT_DESC_H_
> >+
> >+#include <linux/bits.h>
> >+#include <linux/types.h>
> >+
> >+struct xe_guc_pagefault_desc {
> >+	u32 dw0;
> >+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> >+#define PFD_SRC_ID		GENMASK(10, 3)
> >+#define PFD_RSVD_0		GENMASK(17, 11)
> >+#define XE2_PFD_TRVA_FAULT	BIT(18)
> >+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> >+#define PFD_ENG_CLASS		GENMASK(27, 25)
> >+#define PFD_PDATA_LO		GENMASK(31, 28)
> >+
> >+	u32 dw1;
> >+#define PFD_PDATA_HI		GENMASK(11, 0)
> >+#define PFD_PDATA_HI_SHIFT	4
> >+#define PFD_ASID		GENMASK(31, 12)
> >+
> >+	u32 dw2;
> >+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> >+#define PFD_FAULT_TYPE		GENMASK(3, 2)
> >+#define PFD_VFID		GENMASK(9, 4)
> >+#define PFD_RSVD_1		GENMASK(11, 10)
> >+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> >+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> >+
> >+	u32 dw3;
> >+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> >+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> >+} __packed;
> >+
> >+#define FLT_ACCESS_TYPE_READ		0u
> >+#define FLT_ACCESS_TYPE_WRITE		1u
> >+#define FLT_ACCESS_TYPE_ATOMIC		2u
> >+#define FLT_ACCESS_TYPE_RESERVED	3u
> >+
> >+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
> >+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
> >+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
> >+
> >+#endif
> >diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> >index d4e3b7eb165a..93afa54c8780 100644
> >--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> >+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> >@@ -12,8 +12,10 @@
> > #include <drm/drm_managed.h>
> >
> > #include "abi/guc_actions_abi.h"
> >+#include "regs/xe_pagefault_desc.h"
> > #include "xe_bo.h"
> > #include "xe_gt.h"
> >+#include "xe_gt_pagefault_types.h"
> > #include "xe_gt_stats.h"
> > #include "xe_gt_tlb_invalidation.h"
> > #include "xe_guc.h"
> >@@ -23,33 +25,6 @@
> > #include "xe_trace_bo.h"
> > #include "xe_vm.h"
> >
> >-struct pagefault {
> >-	u64 page_addr;
> >-	u32 asid;
> >-	u16 pdata;
> >-	u8 vfid;
> >-	u8 access_type;
> >-	u8 fault_type;
> >-	u8 fault_level;
> >-	u8 engine_class;
> >-	u8 engine_instance;
> >-	u8 fault_unsuccessful;
> >-	bool trva_fault;
> >-};
> >-
> >-enum access_type {
> >-	ACCESS_TYPE_READ = 0,
> >-	ACCESS_TYPE_WRITE = 1,
> >-	ACCESS_TYPE_ATOMIC = 2,
> >-	ACCESS_TYPE_RESERVED = 3,
> >-};
> >-
> >-enum fault_type {
> >-	NOT_PRESENT = 0,
> >-	WRITE_ACCESS_VIOLATION = 1,
> >-	ATOMIC_ACCESS_VIOLATION = 2,
> >-};
> >-
> > struct acc {
> > 	u64 va_range_base;
> > 	u32 asid;
> >@@ -61,9 +36,9 @@ struct acc {
> > 	u8 engine_instance;
> > };
> >
> >-static bool access_is_atomic(enum access_type access_type)
> >+static bool access_is_atomic(u32 access_type)
> > {
> >-	return access_type == ACCESS_TYPE_ATOMIC;
> >+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> > }
> >
> > static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> >@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> > 	return vm;
> > }
> >
> >-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> >+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> > {
> > 	struct xe_device *xe = gt_to_xe(gt);
> > 	struct xe_vm *vm;
> >@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > 		goto unlock_vm;
> > 	}
> >
> >-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
> >+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> > 		err = -EPERM;
> > 		goto unlock_vm;
> > 	}
> >@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> > 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> > }
> >
> >-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> >+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> > {
> > 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> > 		 "\tVFID: %d\n"
> >@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> >
> > #define PF_MSG_LEN_DW	4
> >
> >-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
> >+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> > {
> > 	const struct xe_guc_pagefault_desc *desc;
> > 	bool ret = false;
> >@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> > 	struct xe_gt *gt = pf_queue->gt;
> > 	struct xe_device *xe = gt_to_xe(gt);
> > 	struct xe_guc_pagefault_reply reply = {};
> >-	struct pagefault pf = {};
> >+	struct xe_gt_pagefault pf = {};
> > 	unsigned long threshold;
> > 	int ret;
> >
> >diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >new file mode 100644
> >index 000000000000..b7d41b558de3
> >--- /dev/null
> >+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >@@ -0,0 +1,42 @@
> >+/* SPDX-License-Identifier: MIT */
> >+/*
> >+ * Copyright © 2022-2025 Intel Corporation
> >+ */
> >+
> >+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
> >+#define _XE_GT_PAGEFAULT_TYPES_H_
> >+
> >+#include <linux/types.h>
> >+
> >+/**
> >+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
> >+ * pagefault handler
> >+ */
> >+struct xe_gt_pagefault {
> >+	/** @page_addr: faulted address of this pagefault */
> >+	u64 page_addr;
> >+	/** @asid: ASID of this pagefault */
> >+	u32 asid;
> >+	/** @pdata: PDATA of this pagefault */
> >+	u16 pdata;
> >+	/** @vfid: VFID of this pagefault */
> >+	u8 vfid;
> >+	/** @access_type: access type of this pagefault */
> >+	u8 access_type;
> >+	/** @fault_type: fault type of this pagefault */
> >+	u8 fault_type;
> >+	/** @fault_level: fault level of this pagefault */
> >+	u8 fault_level;
> >+	/** @engine_class: engine class this pagefault was reported on */
> >+	u8 engine_class;
> >+	/** @engine_instance: engine instance this pagefault was reported on */
> >+	u8 engine_instance;
> >+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
> >+	u8 fault_unsuccessful;
> >+	/** @prefetch: unused */
> >+	bool prefetch;
> >+	/** @trva_fault: is set if this is a TRTT fault */
> >+	bool trva_fault;
> >+};
> >+
> >+#endif
> >diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> >index 6f57578b07cb..30ac21bb4f15 100644
> >--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> >+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> >@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> > 	FAULT_RESPONSE_DESC
> > };
> >
> >-struct xe_guc_pagefault_desc {
> >-	u32 dw0;
> >-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> >-#define PFD_SRC_ID		GENMASK(10, 3)
> >-#define PFD_RSVD_0		GENMASK(17, 11)
> >-#define XE2_PFD_TRVA_FAULT	BIT(18)
> >-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> >-#define PFD_ENG_CLASS		GENMASK(27, 25)
> >-#define PFD_PDATA_LO		GENMASK(31, 28)
> >-
> >-	u32 dw1;
> >-#define PFD_PDATA_HI		GENMASK(11, 0)
> >-#define PFD_PDATA_HI_SHIFT	4
> >-#define PFD_ASID		GENMASK(31, 12)
> >-
> >-	u32 dw2;
> >-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> >-#define PFD_FAULT_TYPE		GENMASK(3, 2)
> >-#define PFD_VFID		GENMASK(9, 4)
> >-#define PFD_RSVD_1		GENMASK(11, 10)
> >-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> >-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> >-
> >-	u32 dw3;
> >-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> >-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> >-} __packed;
> >-
> > struct xe_guc_pagefault_reply {
> > 	u32 dw0;
> > #define PFR_VALID		BIT(0)
> >-- 
> >2.43.0
> >
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-24 20:49 ` [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header Jonathan Cavitt
  2025-04-27 11:57   ` Dafna Hirschfeld
@ 2025-04-29 14:22   ` Dafna Hirschfeld
  2025-04-30  7:06     ` Dafna Hirschfeld
  2025-05-07 16:34     ` Michal Wajdeczko
  1 sibling, 2 replies; 17+ messages in thread
From: Dafna Hirschfeld @ 2025-04-29 14:22 UTC (permalink / raw)
  To: Jonathan Cavitt
  Cc: intel-xe, saurabhg.gupta, alex.zuo, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

On 24.04.2025 20:49, Jonathan Cavitt wrote:
>Move the pagefault struct from xe_gt_pagefault.c to the
>xe_gt_pagefault_types.h header file, and move the associated enum values
>into the regs folder under xe_pagefault_desc.h
>
>Since xe_pagefault_desc.h is being initialized here, also move the
>xe_guc_pagefault_desc hardware formats to the new file.
>
>v2:
>- Normalize names for common header (Matt Brost)
>
>v3:
>- s/Migrate/Move (Michal W)
>- s/xe_pagefault/xe_gt_pagefault (Michal W)
>- Create new header file, xe_gt_pagefault_types.h (Michal W)
>- Add kernel docs (Michal W)
>
>v4:
>- Fix includes usage (Michal W)
>- Reference Bspec (Michal W)
>
>v5:
>- Convert enums to defines in regs folder (Michal W)
>- Move xe_guc_pagefault_desc to regs folder (Michal W)
>
>Bspec: 77412
>Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
>Acked-by: Matthew Brost <matthew.brost@intel.com>
>Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
>---
> drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> 4 files changed, 100 insertions(+), 62 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>new file mode 100644
>index 000000000000..a169ac274e14
>--- /dev/null
>+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h

Maybe change the file name to xe_guc_pagefault_desc.h ,
since this is currently guc specific.

Also, the define 'PF_MSG_LEN_DW	4' relates to the
length of this struct so should move here.

Thanks,
Dafna

>@@ -0,0 +1,49 @@
>+/* SPDX-License-Identifier: MIT */
>+/*
>+ * Copyright © 2025 Intel Corporation
>+ */
>+
>+#ifndef _XE_PAGEFAULT_DESC_H_
>+#define _XE_PAGEFAULT_DESC_H_
>+
>+#include <linux/bits.h>
>+#include <linux/types.h>
>+
>+struct xe_guc_pagefault_desc {
>+	u32 dw0;
>+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
>+#define PFD_SRC_ID		GENMASK(10, 3)
>+#define PFD_RSVD_0		GENMASK(17, 11)
>+#define XE2_PFD_TRVA_FAULT	BIT(18)
>+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
>+#define PFD_ENG_CLASS		GENMASK(27, 25)
>+#define PFD_PDATA_LO		GENMASK(31, 28)
>+
>+	u32 dw1;
>+#define PFD_PDATA_HI		GENMASK(11, 0)
>+#define PFD_PDATA_HI_SHIFT	4
>+#define PFD_ASID		GENMASK(31, 12)
>+
>+	u32 dw2;
>+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
>+#define PFD_FAULT_TYPE		GENMASK(3, 2)
>+#define PFD_VFID		GENMASK(9, 4)
>+#define PFD_RSVD_1		GENMASK(11, 10)
>+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
>+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
>+
>+	u32 dw3;
>+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
>+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
>+} __packed;
>+
>+#define FLT_ACCESS_TYPE_READ		0u
>+#define FLT_ACCESS_TYPE_WRITE		1u
>+#define FLT_ACCESS_TYPE_ATOMIC		2u
>+#define FLT_ACCESS_TYPE_RESERVED	3u
>+
>+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
>+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
>+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
>+
>+#endif
>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>index d4e3b7eb165a..93afa54c8780 100644
>--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>@@ -12,8 +12,10 @@
> #include <drm/drm_managed.h>
>
> #include "abi/guc_actions_abi.h"
>+#include "regs/xe_pagefault_desc.h"
> #include "xe_bo.h"
> #include "xe_gt.h"
>+#include "xe_gt_pagefault_types.h"
> #include "xe_gt_stats.h"
> #include "xe_gt_tlb_invalidation.h"
> #include "xe_guc.h"
>@@ -23,33 +25,6 @@
> #include "xe_trace_bo.h"
> #include "xe_vm.h"
>
>-struct pagefault {
>-	u64 page_addr;
>-	u32 asid;
>-	u16 pdata;
>-	u8 vfid;
>-	u8 access_type;
>-	u8 fault_type;
>-	u8 fault_level;
>-	u8 engine_class;
>-	u8 engine_instance;
>-	u8 fault_unsuccessful;
>-	bool trva_fault;
>-};
>-
>-enum access_type {
>-	ACCESS_TYPE_READ = 0,
>-	ACCESS_TYPE_WRITE = 1,
>-	ACCESS_TYPE_ATOMIC = 2,
>-	ACCESS_TYPE_RESERVED = 3,
>-};
>-
>-enum fault_type {
>-	NOT_PRESENT = 0,
>-	WRITE_ACCESS_VIOLATION = 1,
>-	ATOMIC_ACCESS_VIOLATION = 2,
>-};
>-
> struct acc {
> 	u64 va_range_base;
> 	u32 asid;
>@@ -61,9 +36,9 @@ struct acc {
> 	u8 engine_instance;
> };
>
>-static bool access_is_atomic(enum access_type access_type)
>+static bool access_is_atomic(u32 access_type)
> {
>-	return access_type == ACCESS_TYPE_ATOMIC;
>+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> }
>
> static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
>@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> 	return vm;
> }
>
>-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> {
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct xe_vm *vm;
>@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> 		goto unlock_vm;
> 	}
>
>-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
>+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> 		err = -EPERM;
> 		goto unlock_vm;
> 	}
>@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> 	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> }
>
>-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
>+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> {
> 	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> 		 "\tVFID: %d\n"
>@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
>
> #define PF_MSG_LEN_DW	4
>
>-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
>+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> {
> 	const struct xe_guc_pagefault_desc *desc;
> 	bool ret = false;
>@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> 	struct xe_gt *gt = pf_queue->gt;
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct xe_guc_pagefault_reply reply = {};
>-	struct pagefault pf = {};
>+	struct xe_gt_pagefault pf = {};
> 	unsigned long threshold;
> 	int ret;
>
>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>new file mode 100644
>index 000000000000..b7d41b558de3
>--- /dev/null
>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>@@ -0,0 +1,42 @@
>+/* SPDX-License-Identifier: MIT */
>+/*
>+ * Copyright © 2022-2025 Intel Corporation
>+ */
>+
>+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
>+#define _XE_GT_PAGEFAULT_TYPES_H_
>+
>+#include <linux/types.h>
>+
>+/**
>+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
>+ * pagefault handler
>+ */
>+struct xe_gt_pagefault {
>+	/** @page_addr: faulted address of this pagefault */
>+	u64 page_addr;
>+	/** @asid: ASID of this pagefault */
>+	u32 asid;
>+	/** @pdata: PDATA of this pagefault */
>+	u16 pdata;
>+	/** @vfid: VFID of this pagefault */
>+	u8 vfid;
>+	/** @access_type: access type of this pagefault */
>+	u8 access_type;
>+	/** @fault_type: fault type of this pagefault */
>+	u8 fault_type;
>+	/** @fault_level: fault level of this pagefault */
>+	u8 fault_level;
>+	/** @engine_class: engine class this pagefault was reported on */
>+	u8 engine_class;
>+	/** @engine_instance: engine instance this pagefault was reported on */
>+	u8 engine_instance;
>+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
>+	u8 fault_unsuccessful;
>+	/** @prefetch: unused */
>+	bool prefetch;
>+	/** @trva_fault: is set if this is a TRTT fault */
>+	bool trva_fault;
>+};
>+
>+#endif
>diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
>index 6f57578b07cb..30ac21bb4f15 100644
>--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> 	FAULT_RESPONSE_DESC
> };
>
>-struct xe_guc_pagefault_desc {
>-	u32 dw0;
>-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
>-#define PFD_SRC_ID		GENMASK(10, 3)
>-#define PFD_RSVD_0		GENMASK(17, 11)
>-#define XE2_PFD_TRVA_FAULT	BIT(18)
>-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
>-#define PFD_ENG_CLASS		GENMASK(27, 25)
>-#define PFD_PDATA_LO		GENMASK(31, 28)
>-
>-	u32 dw1;
>-#define PFD_PDATA_HI		GENMASK(11, 0)
>-#define PFD_PDATA_HI_SHIFT	4
>-#define PFD_ASID		GENMASK(31, 12)
>-
>-	u32 dw2;
>-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
>-#define PFD_FAULT_TYPE		GENMASK(3, 2)
>-#define PFD_VFID		GENMASK(9, 4)
>-#define PFD_RSVD_1		GENMASK(11, 10)
>-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
>-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
>-
>-	u32 dw3;
>-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
>-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
>-} __packed;
>-
> struct xe_guc_pagefault_reply {
> 	u32 dw0;
> #define PFR_VALID		BIT(0)
>-- 
>2.43.0
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-29 14:22   ` Dafna Hirschfeld
@ 2025-04-30  7:06     ` Dafna Hirschfeld
  2025-04-30 18:42       ` Cavitt, Jonathan
  2025-05-07 16:34     ` Michal Wajdeczko
  1 sibling, 1 reply; 17+ messages in thread
From: Dafna Hirschfeld @ 2025-04-30  7:06 UTC (permalink / raw)
  To: Jonathan Cavitt
  Cc: intel-xe, saurabhg.gupta, alex.zuo, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	Michal.Wajdeczko, michal.mrozek, raag.jadav, john.c.harrison,
	ivan.briano, matthew.auld

On 29.04.2025 17:22, Dafna Hirschfeld wrote:
>On 24.04.2025 20:49, Jonathan Cavitt wrote:
>>Move the pagefault struct from xe_gt_pagefault.c to the
>>xe_gt_pagefault_types.h header file, and move the associated enum values
>>into the regs folder under xe_pagefault_desc.h
>>
>>Since xe_pagefault_desc.h is being initialized here, also move the
>>xe_guc_pagefault_desc hardware formats to the new file.
>>
>>v2:
>>- Normalize names for common header (Matt Brost)
>>
>>v3:
>>- s/Migrate/Move (Michal W)
>>- s/xe_pagefault/xe_gt_pagefault (Michal W)
>>- Create new header file, xe_gt_pagefault_types.h (Michal W)
>>- Add kernel docs (Michal W)
>>
>>v4:
>>- Fix includes usage (Michal W)
>>- Reference Bspec (Michal W)
>>
>>v5:
>>- Convert enums to defines in regs folder (Michal W)
>>- Move xe_guc_pagefault_desc to regs folder (Michal W)
>>
>>Bspec: 77412
>>Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>>Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
>>Acked-by: Matthew Brost <matthew.brost@intel.com>
>>Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
>>---
>>drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
>>drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
>>drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
>>drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
>>4 files changed, 100 insertions(+), 62 deletions(-)
>>create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>>create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>>
>>diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>>new file mode 100644
>>index 000000000000..a169ac274e14
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>
>Maybe change the file name to xe_guc_pagefault_desc.h ,
>since this is currently guc specific.
>
>Also, the define 'PF_MSG_LEN_DW	4' relates to the
>length of this struct so should move here.

Actually, the struct 'xe_guc_acc_desc' and its defines should
also move here.

Thanks,
Dafna

>
>Thanks,
>Dafna
>
>>@@ -0,0 +1,49 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2025 Intel Corporation
>>+ */
>>+
>>+#ifndef _XE_PAGEFAULT_DESC_H_
>>+#define _XE_PAGEFAULT_DESC_H_
>>+
>>+#include <linux/bits.h>
>>+#include <linux/types.h>
>>+
>>+struct xe_guc_pagefault_desc {
>>+	u32 dw0;
>>+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
>>+#define PFD_SRC_ID		GENMASK(10, 3)
>>+#define PFD_RSVD_0		GENMASK(17, 11)
>>+#define XE2_PFD_TRVA_FAULT	BIT(18)
>>+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
>>+#define PFD_ENG_CLASS		GENMASK(27, 25)
>>+#define PFD_PDATA_LO		GENMASK(31, 28)
>>+
>>+	u32 dw1;
>>+#define PFD_PDATA_HI		GENMASK(11, 0)
>>+#define PFD_PDATA_HI_SHIFT	4
>>+#define PFD_ASID		GENMASK(31, 12)
>>+
>>+	u32 dw2;
>>+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
>>+#define PFD_FAULT_TYPE		GENMASK(3, 2)
>>+#define PFD_VFID		GENMASK(9, 4)
>>+#define PFD_RSVD_1		GENMASK(11, 10)
>>+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
>>+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
>>+
>>+	u32 dw3;
>>+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
>>+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
>>+} __packed;
>>+
>>+#define FLT_ACCESS_TYPE_READ		0u
>>+#define FLT_ACCESS_TYPE_WRITE		1u
>>+#define FLT_ACCESS_TYPE_ATOMIC		2u
>>+#define FLT_ACCESS_TYPE_RESERVED	3u
>>+
>>+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
>>+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
>>+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
>>+
>>+#endif
>>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>>index d4e3b7eb165a..93afa54c8780 100644
>>--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
>>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>>@@ -12,8 +12,10 @@
>>#include <drm/drm_managed.h>
>>
>>#include "abi/guc_actions_abi.h"
>>+#include "regs/xe_pagefault_desc.h"
>>#include "xe_bo.h"
>>#include "xe_gt.h"
>>+#include "xe_gt_pagefault_types.h"
>>#include "xe_gt_stats.h"
>>#include "xe_gt_tlb_invalidation.h"
>>#include "xe_guc.h"
>>@@ -23,33 +25,6 @@
>>#include "xe_trace_bo.h"
>>#include "xe_vm.h"
>>
>>-struct pagefault {
>>-	u64 page_addr;
>>-	u32 asid;
>>-	u16 pdata;
>>-	u8 vfid;
>>-	u8 access_type;
>>-	u8 fault_type;
>>-	u8 fault_level;
>>-	u8 engine_class;
>>-	u8 engine_instance;
>>-	u8 fault_unsuccessful;
>>-	bool trva_fault;
>>-};
>>-
>>-enum access_type {
>>-	ACCESS_TYPE_READ = 0,
>>-	ACCESS_TYPE_WRITE = 1,
>>-	ACCESS_TYPE_ATOMIC = 2,
>>-	ACCESS_TYPE_RESERVED = 3,
>>-};
>>-
>>-enum fault_type {
>>-	NOT_PRESENT = 0,
>>-	WRITE_ACCESS_VIOLATION = 1,
>>-	ATOMIC_ACCESS_VIOLATION = 2,
>>-};
>>-
>>struct acc {
>>	u64 va_range_base;
>>	u32 asid;
>>@@ -61,9 +36,9 @@ struct acc {
>>	u8 engine_instance;
>>};
>>
>>-static bool access_is_atomic(enum access_type access_type)
>>+static bool access_is_atomic(u32 access_type)
>>{
>>-	return access_type == ACCESS_TYPE_ATOMIC;
>>+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
>>}
>>
>>static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
>>@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
>>	return vm;
>>}
>>
>>-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>>+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
>>{
>>	struct xe_device *xe = gt_to_xe(gt);
>>	struct xe_vm *vm;
>>@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>>		goto unlock_vm;
>>	}
>>
>>-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
>>+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
>>		err = -EPERM;
>>		goto unlock_vm;
>>	}
>>@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
>>	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
>>}
>>
>>-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
>>+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
>>{
>>	drm_dbg(&xe->drm, "\n\tASID: %d\n"
>>		 "\tVFID: %d\n"
>>@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
>>
>>#define PF_MSG_LEN_DW	4
>>
>>-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
>>+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
>>{
>>	const struct xe_guc_pagefault_desc *desc;
>>	bool ret = false;
>>@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
>>	struct xe_gt *gt = pf_queue->gt;
>>	struct xe_device *xe = gt_to_xe(gt);
>>	struct xe_guc_pagefault_reply reply = {};
>>-	struct pagefault pf = {};
>>+	struct xe_gt_pagefault pf = {};
>>	unsigned long threshold;
>>	int ret;
>>
>>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>>new file mode 100644
>>index 000000000000..b7d41b558de3
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>>@@ -0,0 +1,42 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2022-2025 Intel Corporation
>>+ */
>>+
>>+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
>>+#define _XE_GT_PAGEFAULT_TYPES_H_
>>+
>>+#include <linux/types.h>
>>+
>>+/**
>>+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
>>+ * pagefault handler
>>+ */
>>+struct xe_gt_pagefault {
>>+	/** @page_addr: faulted address of this pagefault */
>>+	u64 page_addr;
>>+	/** @asid: ASID of this pagefault */
>>+	u32 asid;
>>+	/** @pdata: PDATA of this pagefault */
>>+	u16 pdata;
>>+	/** @vfid: VFID of this pagefault */
>>+	u8 vfid;
>>+	/** @access_type: access type of this pagefault */
>>+	u8 access_type;
>>+	/** @fault_type: fault type of this pagefault */
>>+	u8 fault_type;
>>+	/** @fault_level: fault level of this pagefault */
>>+	u8 fault_level;
>>+	/** @engine_class: engine class this pagefault was reported on */
>>+	u8 engine_class;
>>+	/** @engine_instance: engine instance this pagefault was reported on */
>>+	u8 engine_instance;
>>+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
>>+	u8 fault_unsuccessful;
>>+	/** @prefetch: unused */
>>+	bool prefetch;
>>+	/** @trva_fault: is set if this is a TRTT fault */
>>+	bool trva_fault;
>>+};
>>+
>>+#endif
>>diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>index 6f57578b07cb..30ac21bb4f15 100644
>>--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>>+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
>>	FAULT_RESPONSE_DESC
>>};
>>
>>-struct xe_guc_pagefault_desc {
>>-	u32 dw0;
>>-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
>>-#define PFD_SRC_ID		GENMASK(10, 3)
>>-#define PFD_RSVD_0		GENMASK(17, 11)
>>-#define XE2_PFD_TRVA_FAULT	BIT(18)
>>-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
>>-#define PFD_ENG_CLASS		GENMASK(27, 25)
>>-#define PFD_PDATA_LO		GENMASK(31, 28)
>>-
>>-	u32 dw1;
>>-#define PFD_PDATA_HI		GENMASK(11, 0)
>>-#define PFD_PDATA_HI_SHIFT	4
>>-#define PFD_ASID		GENMASK(31, 12)
>>-
>>-	u32 dw2;
>>-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
>>-#define PFD_FAULT_TYPE		GENMASK(3, 2)
>>-#define PFD_VFID		GENMASK(9, 4)
>>-#define PFD_RSVD_1		GENMASK(11, 10)
>>-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
>>-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
>>-
>>-	u32 dw3;
>>-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
>>-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
>>-} __packed;
>>-
>>struct xe_guc_pagefault_reply {
>>	u32 dw0;
>>#define PFR_VALID		BIT(0)
>>-- 
>>2.43.0
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-30  7:06     ` Dafna Hirschfeld
@ 2025-04-30 18:42       ` Cavitt, Jonathan
  2025-05-06 21:30         ` Cavitt, Jonathan
  0 siblings, 1 reply; 17+ messages in thread
From: Cavitt, Jonathan @ 2025-04-30 18:42 UTC (permalink / raw)
  To: Hirschfeld, Dafna, Brost, Matthew, Wajdeczko, Michal
  Cc: intel-xe@lists.freedesktop.org, Gupta,  saurabhg, Zuo, Alex,
	joonas.lahtinen@linux.intel.com, Zhang, Jianxun, Lin, Shuicheng,
	dri-devel@lists.freedesktop.org, Mrozek, Michal, Jadav, Raag,
	Harrison, John C, Briano, Ivan, Auld, Matthew, Cavitt, Jonathan

-----Original Message-----
From: Hirschfeld, Dafna <dafna.hirschfeld@intel.com> 
Sent: Wednesday, April 30, 2025 12:07 AM
To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
Cc: intel-xe@lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Wajdeczko, Michal <Michal.Wajdeczko@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>; Jadav, Raag <raag.jadav@intel.com>; Harrison, John C <john.c.harrison@intel.com>; Briano, Ivan <ivan.briano@intel.com>; Auld, Matthew <matthew.auld@intel.com>
Subject: Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
> 
> On 29.04.2025 17:22, Dafna Hirschfeld wrote:
> >On 24.04.2025 20:49, Jonathan Cavitt wrote:
> >>Move the pagefault struct from xe_gt_pagefault.c to the
> >>xe_gt_pagefault_types.h header file, and move the associated enum values
> >>into the regs folder under xe_pagefault_desc.h
> >>
> >>Since xe_pagefault_desc.h is being initialized here, also move the
> >>xe_guc_pagefault_desc hardware formats to the new file.
> >>
> >>v2:
> >>- Normalize names for common header (Matt Brost)
> >>
> >>v3:
> >>- s/Migrate/Move (Michal W)
> >>- s/xe_pagefault/xe_gt_pagefault (Michal W)
> >>- Create new header file, xe_gt_pagefault_types.h (Michal W)
> >>- Add kernel docs (Michal W)
> >>
> >>v4:
> >>- Fix includes usage (Michal W)
> >>- Reference Bspec (Michal W)
> >>
> >>v5:
> >>- Convert enums to defines in regs folder (Michal W)
> >>- Move xe_guc_pagefault_desc to regs folder (Michal W)
> >>
> >>Bspec: 77412
> >>Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> >>Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> >>Acked-by: Matthew Brost <matthew.brost@intel.com>
> >>Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> >>---
> >>drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> >>drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> >>drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> >>drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> >>4 files changed, 100 insertions(+), 62 deletions(-)
> >>create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> >>create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >>
> >>diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> >>new file mode 100644
> >>index 000000000000..a169ac274e14
> >>--- /dev/null
> >>+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> >
> >Maybe change the file name to xe_guc_pagefault_desc.h ,
> >since this is currently guc specific.

On paper, this shouldn't be a difficult change, though I'd like to ask
@Brost, Matthew and @Wajdeczko, Michal if this request is
amenable.

> >
> >Also, the define 'PF_MSG_LEN_DW	4' relates to the
> >length of this struct so should move here.

I don't see the harm in moving it, though I'd like to wait for a second
opinion before proceeding.

> 
> Actually, the struct 'xe_guc_acc_desc' and its defines should
> also move here.

IMO, I think that if xe_guc_acc_desc needed to be extracted from the
xe_gt_pagefault.c file, it should be put in its own header file in the regs
folder, rather than sharing a file with xe_guc_pagefault_desc.h.  Say,
regs/xe_guc_acc_desc.h?

Though at that point, I think the migration would be out of scope for
this series.
-Jonathan Cavitt

> 
> Thanks,
> Dafna
> 
> >
> >Thanks,
> >Dafna
> >
> >>@@ -0,0 +1,49 @@
> >>+/* SPDX-License-Identifier: MIT */
> >>+/*
> >>+ * Copyright (c) 2025 Intel Corporation
> >>+ */
> >>+
> >>+#ifndef _XE_PAGEFAULT_DESC_H_
> >>+#define _XE_PAGEFAULT_DESC_H_
> >>+
> >>+#include <linux/bits.h>
> >>+#include <linux/types.h>
> >>+
> >>+struct xe_guc_pagefault_desc {
> >>+	u32 dw0;
> >>+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> >>+#define PFD_SRC_ID		GENMASK(10, 3)
> >>+#define PFD_RSVD_0		GENMASK(17, 11)
> >>+#define XE2_PFD_TRVA_FAULT	BIT(18)
> >>+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> >>+#define PFD_ENG_CLASS		GENMASK(27, 25)
> >>+#define PFD_PDATA_LO		GENMASK(31, 28)
> >>+
> >>+	u32 dw1;
> >>+#define PFD_PDATA_HI		GENMASK(11, 0)
> >>+#define PFD_PDATA_HI_SHIFT	4
> >>+#define PFD_ASID		GENMASK(31, 12)
> >>+
> >>+	u32 dw2;
> >>+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> >>+#define PFD_FAULT_TYPE		GENMASK(3, 2)
> >>+#define PFD_VFID		GENMASK(9, 4)
> >>+#define PFD_RSVD_1		GENMASK(11, 10)
> >>+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> >>+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> >>+
> >>+	u32 dw3;
> >>+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> >>+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> >>+} __packed;
> >>+
> >>+#define FLT_ACCESS_TYPE_READ		0u
> >>+#define FLT_ACCESS_TYPE_WRITE		1u
> >>+#define FLT_ACCESS_TYPE_ATOMIC		2u
> >>+#define FLT_ACCESS_TYPE_RESERVED	3u
> >>+
> >>+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
> >>+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
> >>+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
> >>+
> >>+#endif
> >>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> >>index d4e3b7eb165a..93afa54c8780 100644
> >>--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> >>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> >>@@ -12,8 +12,10 @@
> >>#include <drm/drm_managed.h>
> >>
> >>#include "abi/guc_actions_abi.h"
> >>+#include "regs/xe_pagefault_desc.h"
> >>#include "xe_bo.h"
> >>#include "xe_gt.h"
> >>+#include "xe_gt_pagefault_types.h"
> >>#include "xe_gt_stats.h"
> >>#include "xe_gt_tlb_invalidation.h"
> >>#include "xe_guc.h"
> >>@@ -23,33 +25,6 @@
> >>#include "xe_trace_bo.h"
> >>#include "xe_vm.h"
> >>
> >>-struct pagefault {
> >>-	u64 page_addr;
> >>-	u32 asid;
> >>-	u16 pdata;
> >>-	u8 vfid;
> >>-	u8 access_type;
> >>-	u8 fault_type;
> >>-	u8 fault_level;
> >>-	u8 engine_class;
> >>-	u8 engine_instance;
> >>-	u8 fault_unsuccessful;
> >>-	bool trva_fault;
> >>-};
> >>-
> >>-enum access_type {
> >>-	ACCESS_TYPE_READ = 0,
> >>-	ACCESS_TYPE_WRITE = 1,
> >>-	ACCESS_TYPE_ATOMIC = 2,
> >>-	ACCESS_TYPE_RESERVED = 3,
> >>-};
> >>-
> >>-enum fault_type {
> >>-	NOT_PRESENT = 0,
> >>-	WRITE_ACCESS_VIOLATION = 1,
> >>-	ATOMIC_ACCESS_VIOLATION = 2,
> >>-};
> >>-
> >>struct acc {
> >>	u64 va_range_base;
> >>	u32 asid;
> >>@@ -61,9 +36,9 @@ struct acc {
> >>	u8 engine_instance;
> >>};
> >>
> >>-static bool access_is_atomic(enum access_type access_type)
> >>+static bool access_is_atomic(u32 access_type)
> >>{
> >>-	return access_type == ACCESS_TYPE_ATOMIC;
> >>+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> >>}
> >>
> >>static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> >>@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> >>	return vm;
> >>}
> >>
> >>-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> >>+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> >>{
> >>	struct xe_device *xe = gt_to_xe(gt);
> >>	struct xe_vm *vm;
> >>@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> >>		goto unlock_vm;
> >>	}
> >>
> >>-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
> >>+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> >>		err = -EPERM;
> >>		goto unlock_vm;
> >>	}
> >>@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> >>	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> >>}
> >>
> >>-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> >>+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> >>{
> >>	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> >>		 "\tVFID: %d\n"
> >>@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> >>
> >>#define PF_MSG_LEN_DW	4
> >>
> >>-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
> >>+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> >>{
> >>	const struct xe_guc_pagefault_desc *desc;
> >>	bool ret = false;
> >>@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> >>	struct xe_gt *gt = pf_queue->gt;
> >>	struct xe_device *xe = gt_to_xe(gt);
> >>	struct xe_guc_pagefault_reply reply = {};
> >>-	struct pagefault pf = {};
> >>+	struct xe_gt_pagefault pf = {};
> >>	unsigned long threshold;
> >>	int ret;
> >>
> >>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >>new file mode 100644
> >>index 000000000000..b7d41b558de3
> >>--- /dev/null
> >>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >>@@ -0,0 +1,42 @@
> >>+/* SPDX-License-Identifier: MIT */
> >>+/*
> >>+ * Copyright (c) 2022-2025 Intel Corporation
> >>+ */
> >>+
> >>+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
> >>+#define _XE_GT_PAGEFAULT_TYPES_H_
> >>+
> >>+#include <linux/types.h>
> >>+
> >>+/**
> >>+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
> >>+ * pagefault handler
> >>+ */
> >>+struct xe_gt_pagefault {
> >>+	/** @page_addr: faulted address of this pagefault */
> >>+	u64 page_addr;
> >>+	/** @asid: ASID of this pagefault */
> >>+	u32 asid;
> >>+	/** @pdata: PDATA of this pagefault */
> >>+	u16 pdata;
> >>+	/** @vfid: VFID of this pagefault */
> >>+	u8 vfid;
> >>+	/** @access_type: access type of this pagefault */
> >>+	u8 access_type;
> >>+	/** @fault_type: fault type of this pagefault */
> >>+	u8 fault_type;
> >>+	/** @fault_level: fault level of this pagefault */
> >>+	u8 fault_level;
> >>+	/** @engine_class: engine class this pagefault was reported on */
> >>+	u8 engine_class;
> >>+	/** @engine_instance: engine instance this pagefault was reported on */
> >>+	u8 engine_instance;
> >>+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
> >>+	u8 fault_unsuccessful;
> >>+	/** @prefetch: unused */
> >>+	bool prefetch;
> >>+	/** @trva_fault: is set if this is a TRTT fault */
> >>+	bool trva_fault;
> >>+};
> >>+
> >>+#endif
> >>diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> >>index 6f57578b07cb..30ac21bb4f15 100644
> >>--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> >>+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> >>@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> >>	FAULT_RESPONSE_DESC
> >>};
> >>
> >>-struct xe_guc_pagefault_desc {
> >>-	u32 dw0;
> >>-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> >>-#define PFD_SRC_ID		GENMASK(10, 3)
> >>-#define PFD_RSVD_0		GENMASK(17, 11)
> >>-#define XE2_PFD_TRVA_FAULT	BIT(18)
> >>-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> >>-#define PFD_ENG_CLASS		GENMASK(27, 25)
> >>-#define PFD_PDATA_LO		GENMASK(31, 28)
> >>-
> >>-	u32 dw1;
> >>-#define PFD_PDATA_HI		GENMASK(11, 0)
> >>-#define PFD_PDATA_HI_SHIFT	4
> >>-#define PFD_ASID		GENMASK(31, 12)
> >>-
> >>-	u32 dw2;
> >>-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> >>-#define PFD_FAULT_TYPE		GENMASK(3, 2)
> >>-#define PFD_VFID		GENMASK(9, 4)
> >>-#define PFD_RSVD_1		GENMASK(11, 10)
> >>-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> >>-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> >>-
> >>-	u32 dw3;
> >>-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> >>-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> >>-} __packed;
> >>-
> >>struct xe_guc_pagefault_reply {
> >>	u32 dw0;
> >>#define PFR_VALID		BIT(0)
> >>-- 
> >>2.43.0
> >>
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-30 18:42       ` Cavitt, Jonathan
@ 2025-05-06 21:30         ` Cavitt, Jonathan
  0 siblings, 0 replies; 17+ messages in thread
From: Cavitt, Jonathan @ 2025-05-06 21:30 UTC (permalink / raw)
  To: Hirschfeld, Dafna, Brost, Matthew, Wajdeczko, Michal
  Cc: intel-xe@lists.freedesktop.org, Gupta,  saurabhg, Zuo, Alex,
	joonas.lahtinen@linux.intel.com, Zhang, Jianxun, Lin, Shuicheng,
	dri-devel@lists.freedesktop.org, Mrozek, Michal, Jadav, Raag,
	Harrison, John C, Briano, Ivan, Auld, Matthew

I think this patch series is still blocked on these requests.  Do you want me to implement these
suggested changes, or should they be dropped?
@Brost, Matthew and @Wajdeczko, Michal
-Jonathan Cavitt

-----Original Message-----
From: Cavitt, Jonathan <jonathan.cavitt@intel.com> 
Sent: Wednesday, April 30, 2025 11:42 AM
To: Hirschfeld, Dafna <dafna.hirschfeld@intel.com>; Brost, Matthew <matthew.brost@intel.com>; Wajdeczko, Michal <Michal.Wajdeczko@intel.com>
Cc: intel-xe@lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Mrozek, Michal <michal.mrozek@intel.com>; Jadav, Raag <raag.jadav@intel.com>; Harrison, John C <john.c.harrison@intel.com>; Briano, Ivan <ivan.briano@intel.com>; Auld, Matthew <matthew.auld@intel.com>; Cavitt, Jonathan <jonathan.cavitt@intel.com>
Subject: RE: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
> 
> -----Original Message-----
> From: Hirschfeld, Dafna <dafna.hirschfeld@intel.com> 
> Sent: Wednesday, April 30, 2025 12:07 AM
> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
> Cc: intel-xe@lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Wajdeczko, Michal <Michal.Wajdeczko@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>; Jadav, Raag <raag.jadav@intel.com>; Harrison, John C <john.c.harrison@intel.com>; Briano, Ivan <ivan.briano@intel.com>; Auld, Matthew <matthew.auld@intel.com>
> Subject: Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
> > 
> > On 29.04.2025 17:22, Dafna Hirschfeld wrote:
> > >On 24.04.2025 20:49, Jonathan Cavitt wrote:
> > >>Move the pagefault struct from xe_gt_pagefault.c to the
> > >>xe_gt_pagefault_types.h header file, and move the associated enum values
> > >>into the regs folder under xe_pagefault_desc.h
> > >>
> > >>Since xe_pagefault_desc.h is being initialized here, also move the
> > >>xe_guc_pagefault_desc hardware formats to the new file.
> > >>
> > >>v2:
> > >>- Normalize names for common header (Matt Brost)
> > >>
> > >>v3:
> > >>- s/Migrate/Move (Michal W)
> > >>- s/xe_pagefault/xe_gt_pagefault (Michal W)
> > >>- Create new header file, xe_gt_pagefault_types.h (Michal W)
> > >>- Add kernel docs (Michal W)
> > >>
> > >>v4:
> > >>- Fix includes usage (Michal W)
> > >>- Reference Bspec (Michal W)
> > >>
> > >>v5:
> > >>- Convert enums to defines in regs folder (Michal W)
> > >>- Move xe_guc_pagefault_desc to regs folder (Michal W)
> > >>
> > >>Bspec: 77412
> > >>Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > >>Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> > >>Acked-by: Matthew Brost <matthew.brost@intel.com>
> > >>Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> > >>---
> > >>drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> > >>drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> > >>drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> > >>drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> > >>4 files changed, 100 insertions(+), 62 deletions(-)
> > >>create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > >>create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > >>
> > >>diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > >>new file mode 100644
> > >>index 000000000000..a169ac274e14
> > >>--- /dev/null
> > >>+++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > >
> > >Maybe change the file name to xe_guc_pagefault_desc.h ,
> > >since this is currently guc specific.
> 
> On paper, this shouldn't be a difficult change, though I'd like to ask
> @Brost, Matthew and @Wajdeczko, Michal if this request is
> amenable.
> 
> > >
> > >Also, the define 'PF_MSG_LEN_DW	4' relates to the
> > >length of this struct so should move here.
> 
> I don't see the harm in moving it, though I'd like to wait for a second
> opinion before proceeding.
> 
> > 
> > Actually, the struct 'xe_guc_acc_desc' and its defines should
> > also move here.
> 
> IMO, I think that if xe_guc_acc_desc needed to be extracted from the
> xe_gt_pagefault.c file, it should be put in its own header file in the regs
> folder, rather than sharing a file with xe_guc_pagefault_desc.h.  Say,
> regs/xe_guc_acc_desc.h?
> 
> Though at that point, I think the migration would be out of scope for
> this series.
> -Jonathan Cavitt
> 
> > 
> > Thanks,
> > Dafna
> > 
> > >
> > >Thanks,
> > >Dafna
> > >
> > >>@@ -0,0 +1,49 @@
> > >>+/* SPDX-License-Identifier: MIT */
> > >>+/*
> > >>+ * Copyright (c) 2025 Intel Corporation
> > >>+ */
> > >>+
> > >>+#ifndef _XE_PAGEFAULT_DESC_H_
> > >>+#define _XE_PAGEFAULT_DESC_H_
> > >>+
> > >>+#include <linux/bits.h>
> > >>+#include <linux/types.h>
> > >>+
> > >>+struct xe_guc_pagefault_desc {
> > >>+	u32 dw0;
> > >>+#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > >>+#define PFD_SRC_ID		GENMASK(10, 3)
> > >>+#define PFD_RSVD_0		GENMASK(17, 11)
> > >>+#define XE2_PFD_TRVA_FAULT	BIT(18)
> > >>+#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > >>+#define PFD_ENG_CLASS		GENMASK(27, 25)
> > >>+#define PFD_PDATA_LO		GENMASK(31, 28)
> > >>+
> > >>+	u32 dw1;
> > >>+#define PFD_PDATA_HI		GENMASK(11, 0)
> > >>+#define PFD_PDATA_HI_SHIFT	4
> > >>+#define PFD_ASID		GENMASK(31, 12)
> > >>+
> > >>+	u32 dw2;
> > >>+#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > >>+#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > >>+#define PFD_VFID		GENMASK(9, 4)
> > >>+#define PFD_RSVD_1		GENMASK(11, 10)
> > >>+#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > >>+#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > >>+
> > >>+	u32 dw3;
> > >>+#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > >>+#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > >>+} __packed;
> > >>+
> > >>+#define FLT_ACCESS_TYPE_READ		0u
> > >>+#define FLT_ACCESS_TYPE_WRITE		1u
> > >>+#define FLT_ACCESS_TYPE_ATOMIC		2u
> > >>+#define FLT_ACCESS_TYPE_RESERVED	3u
> > >>+
> > >>+#define FLT_TYPE_NOT_PRESENT_FAULT		0u
> > >>+#define FLT_TYPE_WRITE_ACCESS_VIOLATION		1u
> > >>+#define FLT_TYPE_ATOMIC_ACCESS_VIOLATION	2u
> > >>+
> > >>+#endif
> > >>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > >>index d4e3b7eb165a..93afa54c8780 100644
> > >>--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > >>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > >>@@ -12,8 +12,10 @@
> > >>#include <drm/drm_managed.h>
> > >>
> > >>#include "abi/guc_actions_abi.h"
> > >>+#include "regs/xe_pagefault_desc.h"
> > >>#include "xe_bo.h"
> > >>#include "xe_gt.h"
> > >>+#include "xe_gt_pagefault_types.h"
> > >>#include "xe_gt_stats.h"
> > >>#include "xe_gt_tlb_invalidation.h"
> > >>#include "xe_guc.h"
> > >>@@ -23,33 +25,6 @@
> > >>#include "xe_trace_bo.h"
> > >>#include "xe_vm.h"
> > >>
> > >>-struct pagefault {
> > >>-	u64 page_addr;
> > >>-	u32 asid;
> > >>-	u16 pdata;
> > >>-	u8 vfid;
> > >>-	u8 access_type;
> > >>-	u8 fault_type;
> > >>-	u8 fault_level;
> > >>-	u8 engine_class;
> > >>-	u8 engine_instance;
> > >>-	u8 fault_unsuccessful;
> > >>-	bool trva_fault;
> > >>-};
> > >>-
> > >>-enum access_type {
> > >>-	ACCESS_TYPE_READ = 0,
> > >>-	ACCESS_TYPE_WRITE = 1,
> > >>-	ACCESS_TYPE_ATOMIC = 2,
> > >>-	ACCESS_TYPE_RESERVED = 3,
> > >>-};
> > >>-
> > >>-enum fault_type {
> > >>-	NOT_PRESENT = 0,
> > >>-	WRITE_ACCESS_VIOLATION = 1,
> > >>-	ATOMIC_ACCESS_VIOLATION = 2,
> > >>-};
> > >>-
> > >>struct acc {
> > >>	u64 va_range_base;
> > >>	u32 asid;
> > >>@@ -61,9 +36,9 @@ struct acc {
> > >>	u8 engine_instance;
> > >>};
> > >>
> > >>-static bool access_is_atomic(enum access_type access_type)
> > >>+static bool access_is_atomic(u32 access_type)
> > >>{
> > >>-	return access_type == ACCESS_TYPE_ATOMIC;
> > >>+	return access_type == FLT_ACCESS_TYPE_ATOMIC;
> > >>}
> > >>
> > >>static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> > >>@@ -205,7 +180,7 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
> > >>	return vm;
> > >>}
> > >>
> > >>-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > >>+static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf)
> > >>{
> > >>	struct xe_device *xe = gt_to_xe(gt);
> > >>	struct xe_vm *vm;
> > >>@@ -237,7 +212,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > >>		goto unlock_vm;
> > >>	}
> > >>
> > >>-	if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) {
> > >>+	if (xe_vma_read_only(vma) && pf->access_type != FLT_ACCESS_TYPE_READ) {
> > >>		err = -EPERM;
> > >>		goto unlock_vm;
> > >>	}
> > >>@@ -271,7 +246,7 @@ static int send_pagefault_reply(struct xe_guc *guc,
> > >>	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
> > >>}
> > >>
> > >>-static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > >>+static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf)
> > >>{
> > >>	drm_dbg(&xe->drm, "\n\tASID: %d\n"
> > >>		 "\tVFID: %d\n"
> > >>@@ -291,7 +266,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
> > >>
> > >>#define PF_MSG_LEN_DW	4
> > >>
> > >>-static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
> > >>+static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf)
> > >>{
> > >>	const struct xe_guc_pagefault_desc *desc;
> > >>	bool ret = false;
> > >>@@ -378,7 +353,7 @@ static void pf_queue_work_func(struct work_struct *w)
> > >>	struct xe_gt *gt = pf_queue->gt;
> > >>	struct xe_device *xe = gt_to_xe(gt);
> > >>	struct xe_guc_pagefault_reply reply = {};
> > >>-	struct pagefault pf = {};
> > >>+	struct xe_gt_pagefault pf = {};
> > >>	unsigned long threshold;
> > >>	int ret;
> > >>
> > >>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > >>new file mode 100644
> > >>index 000000000000..b7d41b558de3
> > >>--- /dev/null
> > >>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> > >>@@ -0,0 +1,42 @@
> > >>+/* SPDX-License-Identifier: MIT */
> > >>+/*
> > >>+ * Copyright (c) 2022-2025 Intel Corporation
> > >>+ */
> > >>+
> > >>+#ifndef _XE_GT_PAGEFAULT_TYPES_H_
> > >>+#define _XE_GT_PAGEFAULT_TYPES_H_
> > >>+
> > >>+#include <linux/types.h>
> > >>+
> > >>+/**
> > >>+ * struct xe_gt_pagefault - Structure of pagefaults returned by the
> > >>+ * pagefault handler
> > >>+ */
> > >>+struct xe_gt_pagefault {
> > >>+	/** @page_addr: faulted address of this pagefault */
> > >>+	u64 page_addr;
> > >>+	/** @asid: ASID of this pagefault */
> > >>+	u32 asid;
> > >>+	/** @pdata: PDATA of this pagefault */
> > >>+	u16 pdata;
> > >>+	/** @vfid: VFID of this pagefault */
> > >>+	u8 vfid;
> > >>+	/** @access_type: access type of this pagefault */
> > >>+	u8 access_type;
> > >>+	/** @fault_type: fault type of this pagefault */
> > >>+	u8 fault_type;
> > >>+	/** @fault_level: fault level of this pagefault */
> > >>+	u8 fault_level;
> > >>+	/** @engine_class: engine class this pagefault was reported on */
> > >>+	u8 engine_class;
> > >>+	/** @engine_instance: engine instance this pagefault was reported on */
> > >>+	u8 engine_instance;
> > >>+	/** @fault_unsuccessful: flag for if the pagefault recovered or not */
> > >>+	u8 fault_unsuccessful;
> > >>+	/** @prefetch: unused */
> > >>+	bool prefetch;
> > >>+	/** @trva_fault: is set if this is a TRTT fault */
> > >>+	bool trva_fault;
> > >>+};
> > >>+
> > >>+#endif
> > >>diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > >>index 6f57578b07cb..30ac21bb4f15 100644
> > >>--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > >>+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > >>@@ -290,34 +290,6 @@ enum xe_guc_response_desc_type {
> > >>	FAULT_RESPONSE_DESC
> > >>};
> > >>
> > >>-struct xe_guc_pagefault_desc {
> > >>-	u32 dw0;
> > >>-#define PFD_FAULT_LEVEL		GENMASK(2, 0)
> > >>-#define PFD_SRC_ID		GENMASK(10, 3)
> > >>-#define PFD_RSVD_0		GENMASK(17, 11)
> > >>-#define XE2_PFD_TRVA_FAULT	BIT(18)
> > >>-#define PFD_ENG_INSTANCE	GENMASK(24, 19)
> > >>-#define PFD_ENG_CLASS		GENMASK(27, 25)
> > >>-#define PFD_PDATA_LO		GENMASK(31, 28)
> > >>-
> > >>-	u32 dw1;
> > >>-#define PFD_PDATA_HI		GENMASK(11, 0)
> > >>-#define PFD_PDATA_HI_SHIFT	4
> > >>-#define PFD_ASID		GENMASK(31, 12)
> > >>-
> > >>-	u32 dw2;
> > >>-#define PFD_ACCESS_TYPE		GENMASK(1, 0)
> > >>-#define PFD_FAULT_TYPE		GENMASK(3, 2)
> > >>-#define PFD_VFID		GENMASK(9, 4)
> > >>-#define PFD_RSVD_1		GENMASK(11, 10)
> > >>-#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> > >>-#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> > >>-
> > >>-	u32 dw3;
> > >>-#define PFD_VIRTUAL_ADDR_HI	GENMASK(31, 0)
> > >>-#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
> > >>-} __packed;
> > >>-
> > >>struct xe_guc_pagefault_reply {
> > >>	u32 dw0;
> > >>#define PFR_VALID		BIT(0)
> > >>-- 
> > >>2.43.0
> > >>
> > 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-04-29 14:22   ` Dafna Hirschfeld
  2025-04-30  7:06     ` Dafna Hirschfeld
@ 2025-05-07 16:34     ` Michal Wajdeczko
  2025-05-07 16:45       ` Cavitt, Jonathan
  1 sibling, 1 reply; 17+ messages in thread
From: Michal Wajdeczko @ 2025-05-07 16:34 UTC (permalink / raw)
  To: Dafna Hirschfeld, Jonathan Cavitt
  Cc: intel-xe, saurabhg.gupta, alex.zuo, joonas.lahtinen,
	matthew.brost, jianxun.zhang, shuicheng.lin, dri-devel,
	michal.mrozek, raag.jadav, john.c.harrison, ivan.briano,
	matthew.auld



On 29.04.2025 16:22, Dafna Hirschfeld wrote:
> On 24.04.2025 20:49, Jonathan Cavitt wrote:
>> Move the pagefault struct from xe_gt_pagefault.c to the
>> xe_gt_pagefault_types.h header file, and move the associated enum values
>> into the regs folder under xe_pagefault_desc.h
>>
>> Since xe_pagefault_desc.h is being initialized here, also move the
>> xe_guc_pagefault_desc hardware formats to the new file.
>>
>> v2:
>> - Normalize names for common header (Matt Brost)
>>
>> v3:
>> - s/Migrate/Move (Michal W)
>> - s/xe_pagefault/xe_gt_pagefault (Michal W)
>> - Create new header file, xe_gt_pagefault_types.h (Michal W)
>> - Add kernel docs (Michal W)
>>
>> v4:
>> - Fix includes usage (Michal W)
>> - Reference Bspec (Michal W)
>>
>> v5:
>> - Convert enums to defines in regs folder (Michal W)
>> - Move xe_guc_pagefault_desc to regs folder (Michal W)
>>
>> Bspec: 77412

maybe also mention 59654 here?

>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
>> Acked-by: Matthew Brost <matthew.brost@intel.com>
>> Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
>> drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
>> drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
>> 4 files changed, 100 insertions(+), 62 deletions(-)
>> create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
>> create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
>>
>> diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/
>> gpu/drm/xe/regs/xe_pagefault_desc.h
>> new file mode 100644
>> index 000000000000..a169ac274e14
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> 
> Maybe change the file name to xe_guc_pagefault_desc.h ,
> since this is currently guc specific.

IMO 'guc' tag is not applicable here

my understanding was that GuC sends data as it was read from the HW
regs, so the origin of the struct layout is HW, hence we put it in regs/
where we have all the HW defs, not in the abi/ folder which is for FW defs.

unless we want to make this struct part of the stable GuC pagefault ABI,
and then even if HW definition/layout would change, we will stick with
current layout. Then agree, we should move all this to abi/ and also
drop the Bspec reference as n/a

> 
> Also, the define 'PF_MSG_LEN_DW    4' relates to the
> length of this struct so should move here.

any G2H message related definitions should be in the abi/ GuC files, not
here in regs/ where we keep HW definitions.

and please don't start define name with "PF", use "GUC" instead

and btw, in many places by message length we assume also HXG header
length, so the actual length of the G2H 6002 message is 5 as it includes
1 DW of header with DATA0 and 4 DWs of payload with DATA1..4

#define GUC2HOST_NOTIFY_PAGE_FAULT_REQ_DESC_MSG_LEN \
	(GUC_HXG_REQUEST_MSG_MIN_LEN + 4u)

> 
> Thanks,
> Dafna
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
  2025-05-07 16:34     ` Michal Wajdeczko
@ 2025-05-07 16:45       ` Cavitt, Jonathan
  0 siblings, 0 replies; 17+ messages in thread
From: Cavitt, Jonathan @ 2025-05-07 16:45 UTC (permalink / raw)
  To: Wajdeczko, Michal, Hirschfeld, Dafna
  Cc: intel-xe@lists.freedesktop.org, Gupta,  saurabhg, Zuo, Alex,
	joonas.lahtinen@linux.intel.com, Brost, Matthew, Zhang, Jianxun,
	Lin, Shuicheng, dri-devel@lists.freedesktop.org, Mrozek,  Michal,
	Jadav, Raag, Harrison, John C, Briano, Ivan, Auld, Matthew

-----Original Message-----
From: Wajdeczko, Michal <Michal.Wajdeczko@intel.com> 
Sent: Wednesday, May 7, 2025 9:34 AM
To: Hirschfeld, Dafna <dafna.hirschfeld@intel.com>; Cavitt, Jonathan <jonathan.cavitt@intel.com>
Cc: intel-xe@lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Mrozek, Michal <michal.mrozek@intel.com>; Jadav, Raag <raag.jadav@intel.com>; Harrison, John C <john.c.harrison@intel.com>; Briano, Ivan <ivan.briano@intel.com>; Auld, Matthew <matthew.auld@intel.com>
Subject: Re: [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header
> 
> On 29.04.2025 16:22, Dafna Hirschfeld wrote:
> > On 24.04.2025 20:49, Jonathan Cavitt wrote:
> >> Move the pagefault struct from xe_gt_pagefault.c to the
> >> xe_gt_pagefault_types.h header file, and move the associated enum values
> >> into the regs folder under xe_pagefault_desc.h
> >>
> >> Since xe_pagefault_desc.h is being initialized here, also move the
> >> xe_guc_pagefault_desc hardware formats to the new file.
> >>
> >> v2:
> >> - Normalize names for common header (Matt Brost)
> >>
> >> v3:
> >> - s/Migrate/Move (Michal W)
> >> - s/xe_pagefault/xe_gt_pagefault (Michal W)
> >> - Create new header file, xe_gt_pagefault_types.h (Michal W)
> >> - Add kernel docs (Michal W)
> >>
> >> v4:
> >> - Fix includes usage (Michal W)
> >> - Reference Bspec (Michal W)
> >>
> >> v5:
> >> - Convert enums to defines in regs folder (Michal W)
> >> - Move xe_guc_pagefault_desc to regs folder (Michal W)
> >>
> >> Bspec: 77412
> 
> maybe also mention 59654 here?
> 
> >> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> >> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> >> Acked-by: Matthew Brost <matthew.brost@intel.com>
> >> Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
> >> ---
> >> drivers/gpu/drm/xe/regs/xe_pagefault_desc.h | 49 +++++++++++++++++++++
> >> drivers/gpu/drm/xe/xe_gt_pagefault.c        | 43 ++++--------------
> >> drivers/gpu/drm/xe/xe_gt_pagefault_types.h  | 42 ++++++++++++++++++
> >> drivers/gpu/drm/xe/xe_guc_fwif.h            | 28 ------------
> >> 4 files changed, 100 insertions(+), 62 deletions(-)
> >> create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> >> create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h
> >>
> >> diff --git a/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h b/drivers/
> >> gpu/drm/xe/regs/xe_pagefault_desc.h
> >> new file mode 100644
> >> index 000000000000..a169ac274e14
> >> --- /dev/null
> >> +++ b/drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
> > 
> > Maybe change the file name to xe_guc_pagefault_desc.h ,
> > since this is currently guc specific.
> 
> IMO 'guc' tag is not applicable here
> 
> my understanding was that GuC sends data as it was read from the HW
> regs, so the origin of the struct layout is HW, hence we put it in regs/
> where we have all the HW defs, not in the abi/ folder which is for FW defs.
> 
> unless we want to make this struct part of the stable GuC pagefault ABI,
> and then even if HW definition/layout would change, we will stick with
> current layout. Then agree, we should move all this to abi/ and also
> drop the Bspec reference as n/a

So, to summarize, you want me to *not* change the name of the regs/xe_pagefault_desc.h file?

I'll get on reverting that change shortly.

> 
> > 
> > Also, the define 'PF_MSG_LEN_DW    4' relates to the
> > length of this struct so should move here.
> 
> any G2H message related definitions should be in the abi/ GuC files, not
> here in regs/ where we keep HW definitions.
> 
> and please don't start define name with "PF", use "GUC" instead
> 
> and btw, in many places by message length we assume also HXG header
> length, so the actual length of the G2H 6002 message is 5 as it includes
> 1 DW of header with DATA0 and 4 DWs of payload with DATA1..4
> 
> #define GUC2HOST_NOTIFY_PAGE_FAULT_REQ_DESC_MSG_LEN \
> 	(GUC_HXG_REQUEST_MSG_MIN_LEN + 4u)

Should I replace all instances of PF_MSG_LEN_DW with
GUC2HOST_NOTIFY_PAGE_FAULT_REQ_DESC_MSG_LEN, instead of attempting
to relocate and rename the value?

If not, which abi  file should I relocate the value to,
and what should the value be renamed to?
GUC_PF_MSG_LEN_DW?

-Jonathan Cavitt

> 
> > 
> > Thanks,
> > Dafna
> > 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-05-07 16:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-24 20:49 [PATCH v23 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
2025-04-24 20:49 ` [PATCH v23 1/5] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs Jonathan Cavitt
2025-04-24 20:49 ` [PATCH v23 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header Jonathan Cavitt
2025-04-27 11:57   ` Dafna Hirschfeld
2025-04-28  1:51     ` Matthew Brost
2025-04-28  2:09       ` Matthew Brost
2025-04-28  2:14     ` Matthew Brost
2025-04-28 14:17     ` Cavitt, Jonathan
2025-04-29 14:22   ` Dafna Hirschfeld
2025-04-30  7:06     ` Dafna Hirschfeld
2025-04-30 18:42       ` Cavitt, Jonathan
2025-05-06 21:30         ` Cavitt, Jonathan
2025-05-07 16:34     ` Michal Wajdeczko
2025-05-07 16:45       ` Cavitt, Jonathan
2025-04-24 20:49 ` [PATCH v23 3/5] drm/xe/uapi: Define drm_xe_vm_get_property Jonathan Cavitt
2025-04-24 20:49 ` [PATCH v23 4/5] drm/xe/xe_vm: Add per VM fault info Jonathan Cavitt
2025-04-24 20:49 ` [PATCH v23 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).