intel-xe.lists.freedesktop.org archive mirror
* [PATCH v3 00/13] drm/xe: Add psmi support
@ 2025-08-08 17:29 Lucas De Marchi
  2025-08-08 17:29 ` [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI Lucas De Marchi
                   ` (12 more replies)
  0 siblings, 13 replies; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison, Vinay Belgaumkar,
	Brian Welty, Badal Nilawar, Michal Wajdeczko

Add PSMI support to aid debugging. More information about PSMI is in the
first and second patches.

Expose the toggle to enable it via configfs, which allows debugging just
one of the attached cards.

The buffer allocation request is done via debugfs.

To apply WAs only when PSMI is in use, a new RTP match is also
added.

The rest of the patches are improvements to our configfs integration
that I've been collecting while reviewing other semi-related patches.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
Changes in v3:
- Rebase and refactor on multiple configfs changes
- Add patches to log to dmesg when non-default configfs value is found
- Block runtime changes of configfs attributes as they don't have any
  effect
- Improve documentation
- Link to v2: https://lore.kernel.org/r/20250723-psmi-v2-0-84a04b5a3c04@intel.com

Changes in v2:
- configfs
- some refactors as noted on individual patches
- Link to v1: https://lore.kernel.org/r/20250716-psmi-v1-0-674c13d7028e@intel.com
---
Badal Nilawar (1):
      drm/xe/psmi: Add Wa_14020001231

Lucas De Marchi (11):
      drm/xe/psmi: Add GuC flag to enable PSMI
      drm/xe/psmi: Add debugfs interface for PSMI
      drm/xe/rtp: Add match for psmi
      drm/xe/configfs: Simplify kernel doc
      drm/xe/configfs: Allow to enable PSMI
      drm/xe/configfs: Use guard() for dev->lock
      drm/xe/configfs: Block runtime attribute changes
      drm/xe/configfs: Use tree-like output in documentation
      drm/xe/configfs: Improve documentation steps
      drm/xe/configfs: Minor fixes to documentation
      drm/xe/configfs: Dump custom settings when binding

Vinay Belgaumkar (1):
      drm/xe/psmi: Add Wa_16023683509

 drivers/gpu/drm/xe/Makefile           |   1 +
 drivers/gpu/drm/xe/abi/guc_klvs_abi.h |   1 +
 drivers/gpu/drm/xe/xe_configfs.c      | 208 ++++++++++++++++++----
 drivers/gpu/drm/xe/xe_configfs.h      |   4 +
 drivers/gpu/drm/xe/xe_debugfs.c       |   3 +
 drivers/gpu/drm/xe/xe_device.c        |   5 +
 drivers/gpu/drm/xe/xe_device_types.h  |   8 +
 drivers/gpu/drm/xe/xe_guc.c           |  10 +-
 drivers/gpu/drm/xe/xe_guc_ads.c       |   4 +
 drivers/gpu/drm/xe/xe_guc_fwif.h      |   2 +
 drivers/gpu/drm/xe/xe_pci.c           |   3 +
 drivers/gpu/drm/xe/xe_psmi.c          | 313 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_psmi.h          |  14 ++
 drivers/gpu/drm/xe/xe_rtp.c           |   7 +
 drivers/gpu/drm/xe/xe_rtp.h           |   3 +
 drivers/gpu/drm/xe/xe_wa_oob.rules    |   6 +
 16 files changed, 559 insertions(+), 33 deletions(-)

base-commit: 2632d3c5d7f7c1b80996fc74d27bed6612a0ff9b
change-id: 20250618-psmi-9f270bf67895

Lucas De Marchi



* [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13  0:38   ` Belgaumkar, Vinay
  2025-08-08 17:29 ` [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI Lucas De Marchi
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison

PSMI allows capturing data from the GPU that is useful for early
validation. There isn't much to do on the kernel side, just a few
things:

	1) Toggle the feature support in GuC
	2) Enable some additional WAs
	3) Allocate buffers

Here is the first step, with the next ones to follow. For now everything
is disabled, since the configfs check is currently hardcoded to return
false.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
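For illustration only, here is a minimal userspace sketch of how the
feature-flag word ends up composed after this patch. The bit positions are
copied from xe_guc_fwif.h as shown in the diff below; the function name
guc_feature_flags_sketch() and its two boolean parameters are hypothetical
stand-ins for xe->info.skip_guc_pc and the configfs lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from xe_guc_fwif.h as shown in this patch. */
#define GUC_CTL_ENABLE_SLPC		(UINT32_C(1) << 2)
#define GUC_CTL_ENABLE_LITE_RESTORE	(UINT32_C(1) << 4)
#define GUC_CTL_ENABLE_PSMI		(UINT32_C(1) << 7)

/*
 * Hypothetical stand-in for guc_ctl_feature_flags(): the booleans replace
 * the device lookups done by the real driver code.
 */
static uint32_t guc_feature_flags_sketch(bool skip_guc_pc, bool psmi_enabled)
{
	uint32_t flags = GUC_CTL_ENABLE_LITE_RESTORE;

	if (!skip_guc_pc)
		flags |= GUC_CTL_ENABLE_SLPC;

	if (psmi_enabled)
		flags |= GUC_CTL_ENABLE_PSMI;

	/* lite restore is requested unconditionally */
	assert(flags & GUC_CTL_ENABLE_LITE_RESTORE);

	return flags;
}
```

With the helper hardcoded to false, as in this patch, only the first two
bits can ever be set; bit 7 appears once the configfs attribute is wired up.
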
 drivers/gpu/drm/xe/xe_configfs.h | 2 ++
 drivers/gpu/drm/xe/xe_guc.c      | 7 ++++++-
 drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
index fb87640080896..c14588b86e833 100644
--- a/drivers/gpu/drm/xe/xe_configfs.h
+++ b/drivers/gpu/drm/xe/xe_configfs.h
@@ -16,12 +16,14 @@ void xe_configfs_exit(void);
 bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
 void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
+static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
 #else
 static inline int xe_configfs_init(void) { return 0; }
 static inline void xe_configfs_exit(void) { }
 static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
 static inline void xe_configfs_clear_survivability_mode(struct pci_dev *pdev) { }
 static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
+static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
 #endif
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index 9e34401e4489f..cb757a53de856 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -16,6 +16,7 @@
 #include "regs/xe_guc_regs.h"
 #include "regs/xe_irq_regs.h"
 #include "xe_bo.h"
+#include "xe_configfs.h"
 #include "xe_device.h"
 #include "xe_force_wake.h"
 #include "xe_gt.h"
@@ -81,11 +82,15 @@ static u32 guc_ctl_debug_flags(struct xe_guc *guc)
 
 static u32 guc_ctl_feature_flags(struct xe_guc *guc)
 {
+	struct xe_device *xe = guc_to_xe(guc);
 	u32 flags = GUC_CTL_ENABLE_LITE_RESTORE;
 
-	if (!guc_to_xe(guc)->info.skip_guc_pc)
+	if (!xe->info.skip_guc_pc)
 		flags |= GUC_CTL_ENABLE_SLPC;
 
+	if (xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev)))
+		flags |= GUC_CTL_ENABLE_PSMI;
+
 	return flags;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
index ca9f999d38d1e..4dc000c977faf 100644
--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
@@ -112,6 +112,7 @@ struct guc_update_exec_queue_policy {
 #define GUC_CTL_FEATURE			2
 #define   GUC_CTL_ENABLE_SLPC		BIT(2)
 #define   GUC_CTL_ENABLE_LITE_RESTORE	BIT(4)
+#define   GUC_CTL_ENABLE_PSMI		BIT(7)
 #define   GUC_CTL_DISABLE_SCHEDULER	BIT(14)
 
 #define GUC_CTL_DEBUG			3

-- 
2.50.1



* [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
  2025-08-08 17:29 ` [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13  1:41   ` Belgaumkar, Vinay
  2025-08-13 10:42   ` Matthew Auld
  2025-08-08 17:29 ` [PATCH v3 03/13] drm/xe/rtp: Add match for psmi Lucas De Marchi
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Vinay Belgaumkar, Brian Welty

PSMI capture requires a physically contiguous buffer. All the needed
configuration is done by the userspace tool directly on the GPU via MMIO
access.

This interface only supports allocating from VRAM regions. For integrated
devices, the PSMI buffer is in SYSTEM memory and should be allocated by
userspace using hugetlbfs.

Here we add the ability to allocate a region of physically contiguous
memory by writing to a debugfs file (listed below). For multi-tile devices,
the capture tool needs to allocate a capture buffer per tile (VRAM
region), so the user can specify a region_mask. The tool can then mmap
the buffers directly through the PCI BAR exposed in sysfs.

To support the capture tool, 3 new debugfs entries are added:

   psmi_capture_addr - physical address of each VRAM region's capture buffer
   psmi_capture_region_mask - select the region(s) in which to allocate a buffer
   psmi_capture_size - size of current capture buffer

Writing psmi_capture_size allocates a new buffer of the requested size in
each selected region, after freeing any current buffers.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Original-author: Brian Welty <brian.welty@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
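For illustration only, the write-side checks of the two writable debugfs
files can be modeled in userspace as below. This is a sketch that mirrors
the error codes visible in the diff; struct psmi_model and the model_*()
names are hypothetical, and no memory is actually allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the state kept in xe->psmi. */
struct psmi_model {
	uint64_t valid_regions;	/* stands in for xe->info.mem_region_mask */
	uint8_t region_mask;	/* regions selected by the user */
	uint64_t size;		/* current buffer size, 0 when unallocated */
};

static bool is_pow2(uint64_t v)
{
	return v && !(v & (v - 1));
}

/* Mirrors the checks in psmi_debugfs_capture_region_mask_set(). */
static int model_set_region_mask(struct psmi_model *m, uint64_t mask)
{
	assert(m);

	if (mask & 0x1)			/* bit 0 is system memory */
		return -EOPNOTSUPP;
	if (!mask || (mask & ~m->valid_regions))
		return -EINVAL;
	if (m->size)			/* buffers already allocated */
		return -EBUSY;

	m->region_mask = (uint8_t)mask;
	return 0;
}

/* Mirrors the checks on a psmi_capture_size write, minus the allocation. */
static int model_set_size(struct psmi_model *m, uint64_t size)
{
	assert(m);

	if (!m->region_mask)		/* a region must be selected first */
		return -EINVAL;
	if (size && !is_pow2(size))	/* only power-of-2 sizes accepted */
		return -EINVAL;

	m->size = size;			/* size 0 frees everything */
	return 0;
}
```

The ordering this implies for a tool: write the region mask first, then the
size; to change the mask later, write a size of 0 before selecting new
regions.
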
v2:
 - Fix kernel-doc
 - Do not walk all region_mask on cleanup: it should never be needed
 - Replace sysmem checks by asserts as they should never be set
 - s/debugfs_create/debugfs_register/ and do not pass the root dir:
   this makes it similar to other parts registering debugfs
 - Do not export a cleanup function, rather use an init that registers
   a devm action if needed
 - Drop modparam in favor of configfs whose attribute will be
   implemented when everything is ready
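
For illustration only, a capture tool could parse psmi_capture_addr along
these lines. The line layout ("<region id>: 0x<address>") comes from the
seq_printf() call in the diff; the function name parse_capture_addr_line()
is hypothetical, and reading the file itself is left to the caller.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line of psmi_capture_addr, which the patch formats as
 * "<region id>: 0x<physical address>".
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_capture_addr_line(const char *line, unsigned long *id,
				   unsigned long long *addr)
{
	assert(line && id && addr);

	/* %llx accepts an optional "0x" prefix, like strtoull() in base 16 */
	if (sscanf(line, "%lu: %llx", id, addr) != 2)
		return -1;

	return 0;
}
```

A tool would call this once per line read from the file, getting one
address per selected VRAM region.
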
---
 drivers/gpu/drm/xe/Makefile          |   1 +
 drivers/gpu/drm/xe/xe_debugfs.c      |   3 +
 drivers/gpu/drm/xe/xe_device.c       |   5 +
 drivers/gpu/drm/xe/xe_device_types.h |   8 +
 drivers/gpu/drm/xe/xe_psmi.c         | 313 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_psmi.h         |  14 ++
 6 files changed, 344 insertions(+)

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 8e0c3412a757c..85b8d3a59ef07 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -98,6 +98,7 @@ xe-y += xe_bb.o \
 	xe_pcode.o \
 	xe_pm.o \
 	xe_preempt_fence.o \
+	xe_psmi.o \
 	xe_pt.o \
 	xe_pt_walk.o \
 	xe_pxp.o \
diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index 0b4a532f7c45c..bc717519502dd 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -20,6 +20,7 @@
 #include "xe_guc_ads.h"
 #include "xe_mmio.h"
 #include "xe_pm.h"
+#include "xe_psmi.h"
 #include "xe_pxp_debugfs.h"
 #include "xe_sriov.h"
 #include "xe_sriov_pf.h"
@@ -400,6 +401,8 @@ void xe_debugfs_register(struct xe_device *xe)
 
 	xe_pxp_debugfs_register(xe->pxp);
 
+	xe_psmi_debugfs_register(xe);
+
 	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
 
 	if (IS_SRIOV_PF(xe))
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 57edbc63da6f4..62edb39b61fb0 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -54,6 +54,7 @@
 #include "xe_pcode.h"
 #include "xe_pm.h"
 #include "xe_pmu.h"
+#include "xe_psmi.h"
 #include "xe_pxp.h"
 #include "xe_query.h"
 #include "xe_shrinker.h"
@@ -908,6 +909,10 @@ int xe_device_probe(struct xe_device *xe)
 	if (err)
 		return err;
 
+	err = xe_psmi_init(xe);
+	if (err)
+		return err;
+
 	err = drm_dev_register(&xe->drm, 0);
 	if (err)
 		return err;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 01e8fa0d2f9f7..bf9af8d0b84ae 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -576,6 +576,14 @@ struct xe_device {
 	atomic64_t global_total_pages;
 #endif
 
+	/** @psmi: GPU debugging via additional validation HW */
+	struct {
+		/** @psmi.capture_obj: PSMI buffer for VRAM */
+		struct xe_bo *capture_obj[XE_MAX_TILES_PER_DEVICE + 1];
+		/** @psmi.region_mask: Mask of valid memory regions */
+		u8 region_mask;
+	} psmi;
+
 	/* private: */
 
 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
diff --git a/drivers/gpu/drm/xe/xe_psmi.c b/drivers/gpu/drm/xe/xe_psmi.c
new file mode 100644
index 0000000000000..e6a67e85e1bb2
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_psmi.c
@@ -0,0 +1,313 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/debugfs.h>
+
+#include "xe_bo.h"
+#include "xe_device.h"
+#include "xe_configfs.h"
+#include "xe_psmi.h"
+
+/*
+ * PSMI capture support
+ *
+ * Requirement for PSMI capture is to have a physically contiguous buffer.  The
+ * PSMI tool owns doing all necessary configuration (MMIO register writes are
+ * done from user-space). However, KMD needs to provide the PSMI tool with the
+ * required physical address of the base of PSMI buffer in case of VRAM.
+ *
+ * VRAM backed PSMI buffer:
+ * Buffer is allocated as a GEM object with the XE_BO_FLAG_PINNED flag, which
+ * creates a contiguous allocation. The physical address is returned from
+ * psmi_debugfs_capture_addr_show(). PSMI tool can mmap the buffer via the
+ * PCIBAR through sysfs.
+ *
+ * SYSTEM memory backed PSMI buffer:
+ * Interface here does not support allocating from SYSTEM memory region.  The
+ * PSMI tool needs to allocate the memory itself using hugetlbfs. In order to
+ * get the physical address, user-space can query /proc/[pid]/pagemap. As an
+ * alternative, CMA debugfs could also be used to allocate reserved CMA memory.
+ */
+
+static bool psmi_enabled(struct xe_device *xe)
+{
+	return xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev));
+}
+
+static void psmi_free_object(struct xe_bo *bo)
+{
+	xe_bo_lock(bo, NULL);
+	xe_bo_unpin(bo);
+	xe_bo_unlock(bo);
+	xe_bo_put(bo);
+}
+
+/*
+ * Free PSMI capture buffer objects.
+ */
+static void psmi_cleanup(struct xe_device *xe)
+{
+	unsigned long id, region_mask = xe->psmi.region_mask;
+	struct xe_bo *bo;
+
+	for_each_set_bit(id, &region_mask,
+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
+		/* smem should never be set */
+		xe_assert(xe, id);
+
+		bo = xe->psmi.capture_obj[id];
+		if (bo) {
+			psmi_free_object(bo);
+			xe->psmi.capture_obj[id] = NULL;
+		}
+	}
+}
+
+static struct xe_bo *psmi_alloc_object(struct xe_device *xe,
+				       unsigned int id, size_t bo_size)
+{
+	struct xe_bo *bo = NULL;
+	struct xe_tile *tile;
+	int err;
+
+	if (!id || !bo_size)
+		return NULL;
+
+	tile = &xe->tiles[id - 1];
+
+	/* VRAM: Allocate GEM object for the capture buffer */
+	bo = xe_bo_create_locked(xe, tile, NULL, bo_size,
+				 ttm_bo_type_kernel,
+				 XE_BO_FLAG_VRAM_IF_DGFX(tile) |
+				 XE_BO_FLAG_PINNED |
+				 XE_BO_FLAG_NEEDS_CPU_ACCESS);
+
+	if (!IS_ERR(bo)) {
+		/* Buffer written by HW, ensure stays resident */
+		err = xe_bo_pin(bo);
+		xe_bo_unlock(bo);
+		if (err)
+			bo = ERR_PTR(err);
+	}
+
+	return bo;
+}
+
+/*
+ * Allocate PSMI capture buffer objects (via debugfs set function), based on
+ * which regions the user has selected in region_mask.  @size: size in bytes
+ * (should be power of 2)
+ *
+ * Always release/free the current buffer objects before attempting to allocate
+ * new ones.  Size == 0 will free all current buffers.
+ *
+ * Note, we don't write any registers as the capture tool is already configuring
+ * all PSMI registers itself via mmio space.
+ */
+static int psmi_resize_object(struct xe_device *xe, size_t size)
+{
+	unsigned long id, region_mask = xe->psmi.region_mask;
+	struct xe_bo *bo = NULL;
+	int err = 0;
+
+	/*
+	 * Buddy allocator anyway will roundup to next power of 2,
+	 * so rather than waste unused pages, require user to ask for
+	 * power of 2 sized PSMI buffers.
+	 */
+	if (size && !is_power_of_2(size))
+		return -EINVAL;
+
+	/* if resizing, free currently allocated buffers first */
+	psmi_cleanup(xe);
+
+	/* can set size to 0, in which case, now done */
+	if (!size)
+		return 0;
+
+	for_each_set_bit(id, &region_mask,
+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
+		/* smem should never be set */
+		xe_assert(xe, id);
+
+		bo = psmi_alloc_object(xe, id, size);
+		if (IS_ERR(bo)) {
+			err = PTR_ERR(bo);
+			break;
+		}
+		xe->psmi.capture_obj[id] = bo;
+
+		drm_info(&xe->drm,
+			 "PSMI capture size requested: %zu bytes, allocated: %lu:%zu\n",
+			 size, id, bo ? xe_bo_size(bo) : 0);
+	}
+
+	/* on error, reverse what was allocated */
+	if (err)
+		psmi_cleanup(xe);
+
+	return err;
+}
+
+/*
+ * Returns an address for the capture tool to use to find start of capture
+ * buffer. Capture tool requires the capability to have a buffer allocated per
+ * each tile (VRAM region), thus we return an address for each region.
+ */
+static int psmi_debugfs_capture_addr_show(struct seq_file *m, void *data)
+{
+	struct xe_device *xe = m->private;
+	unsigned long id, region_mask;
+	struct xe_bo *bo;
+	u64 val;
+
+	region_mask = xe->psmi.region_mask;
+	for_each_set_bit(id, &region_mask,
+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
+		/* smem should never be set */
+		xe_assert(xe, id);
+
+		/* VRAM region */
+		bo = xe->psmi.capture_obj[id];
+		if (!bo)
+			continue;
+
+		/* pinned, so don't need bo_lock */
+		val = __xe_bo_addr(bo, 0, PAGE_SIZE);
+		seq_printf(m, "%lu: 0x%llx\n", id, val);
+	}
+
+	return 0;
+}
+
+/*
+ * Return capture buffer size, using the size from first allocated object that
+ * is found. This works because all objects must be of the same size.
+ */
+static int psmi_debugfs_capture_size_get(void *data, u64 *val)
+{
+	unsigned long id, region_mask;
+	struct xe_device *xe = data;
+	struct xe_bo *bo;
+
+	region_mask = xe->psmi.region_mask;
+	for_each_set_bit(id, &region_mask,
+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
+		/* smem should never be set */
+		xe_assert(xe, id);
+
+		bo = xe->psmi.capture_obj[id];
+		if (bo) {
+			*val = xe_bo_size(bo);
+			return 0;
+		}
+	}
+
+	/* no capture objects are allocated */
+	*val = 0;
+
+	return 0;
+}
+
+/*
+ * Set size of PSMI capture buffer. This triggers the allocation of capture
+ * buffer in each memory region as specified with prior write to
+ * psmi_capture_region_mask.
+ */
+static int psmi_debugfs_capture_size_set(void *data, u64 val)
+{
+	struct xe_device *xe = data;
+
+	/* user must have specified at least one region */
+	if (!xe->psmi.region_mask)
+		return -EINVAL;
+
+	return psmi_resize_object(xe, val);
+}
+
+static int psmi_debugfs_capture_region_mask_get(void *data, u64 *val)
+{
+	struct xe_device *xe = data;
+
+	*val = xe->psmi.region_mask;
+
+	return 0;
+}
+
+/*
+ * Select VRAM regions for multi-tile devices, only allowed when buffer is not
+ * currently allocated.
+ */
+static int psmi_debugfs_capture_region_mask_set(void *data, u64 region_mask)
+{
+	struct xe_device *xe = data;
+	u64 size = 0;
+
+	/* SMEM is not supported (see comments at top of file) */
+	if (region_mask & 0x1)
+		return -EOPNOTSUPP;
+
+	/* input bitmask should contain only valid TTM regions */
+	if (!region_mask || region_mask & ~xe->info.mem_region_mask)
+		return -EINVAL;
+
+	/* only allow setting mask if buffer is not yet allocated */
+	psmi_debugfs_capture_size_get(xe, &size);
+	if (size)
+		return -EBUSY;
+
+	xe->psmi.region_mask = region_mask;
+
+	return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(psmi_debugfs_capture_addr);
+
+DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_region_mask_fops,
+			 psmi_debugfs_capture_region_mask_get,
+			 psmi_debugfs_capture_region_mask_set,
+			 "0x%llx\n");
+
+DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_size_fops,
+			 psmi_debugfs_capture_size_get,
+			 psmi_debugfs_capture_size_set,
+			 "%llu\n");
+
+void xe_psmi_debugfs_register(struct xe_device *xe)
+{
+	struct drm_minor *minor;
+
+	if (!psmi_enabled(xe))
+		return;
+
+	minor = xe->drm.primary;
+	if (!minor->debugfs_root)
+		return;
+
+	debugfs_create_file("psmi_capture_addr",
+			    0400, minor->debugfs_root, xe,
+			    &psmi_debugfs_capture_addr_fops);
+
+	debugfs_create_file("psmi_capture_region_mask",
+			    0600, minor->debugfs_root, xe,
+			    &psmi_debugfs_capture_region_mask_fops);
+
+	debugfs_create_file("psmi_capture_size",
+			    0600, minor->debugfs_root, xe,
+			    &psmi_debugfs_capture_size_fops);
+}
+
+static void psmi_fini(void *arg)
+{
+	psmi_cleanup(arg);
+}
+
+int xe_psmi_init(struct xe_device *xe)
+{
+	if (!psmi_enabled(xe))
+		return 0;
+
+	return devm_add_action(xe->drm.dev, psmi_fini, xe);
+}
diff --git a/drivers/gpu/drm/xe/xe_psmi.h b/drivers/gpu/drm/xe/xe_psmi.h
new file mode 100644
index 0000000000000..b1dfba80d893d
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_psmi.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PSMI_H_
+#define _XE_PSMI_H_
+
+struct xe_device;
+
+int xe_psmi_init(struct xe_device *xe);
+void xe_psmi_debugfs_register(struct xe_device *xe);
+
+#endif

-- 
2.50.1



* [PATCH v3 03/13] drm/xe/rtp: Add match for psmi
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
  2025-08-08 17:29 ` [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI Lucas De Marchi
  2025-08-08 17:29 ` [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-14 21:28   ` Belgaumkar, Vinay
  2025-08-08 17:29 ` [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231 Lucas De Marchi
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

Add a match to be used in WA rules so that workarounds are only enabled
when PSMI is intended to be used.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_rtp.c | 7 +++++++
 drivers/gpu/drm/xe/xe_rtp.h | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_rtp.c b/drivers/gpu/drm/xe/xe_rtp.c
index 95571b87aa73c..47ea1521dc80c 100644
--- a/drivers/gpu/drm/xe/xe_rtp.c
+++ b/drivers/gpu/drm/xe/xe_rtp.c
@@ -9,6 +9,7 @@
 
 #include <uapi/drm/xe_drm.h>
 
+#include "xe_configfs.h"
 #include "xe_gt.h"
 #include "xe_gt_topology.h"
 #include "xe_macros.h"
@@ -363,3 +364,9 @@ bool xe_rtp_match_not_sriov_vf(const struct xe_gt *gt,
 {
 	return !IS_SRIOV_VF(gt_to_xe(gt));
 }
+
+bool xe_rtp_match_psmi_enabled(const struct xe_gt *gt,
+			       const struct xe_hw_engine *hwe)
+{
+	return xe_configfs_get_psmi_enabled(to_pci_dev(gt_to_xe(gt)->drm.dev));
+}
diff --git a/drivers/gpu/drm/xe/xe_rtp.h b/drivers/gpu/drm/xe/xe_rtp.h
index 5ed6c14b9ae34..7951fefdbe044 100644
--- a/drivers/gpu/drm/xe/xe_rtp.h
+++ b/drivers/gpu/drm/xe/xe_rtp.h
@@ -477,4 +477,7 @@ bool xe_rtp_match_first_render_or_compute(const struct xe_gt *gt,
 bool xe_rtp_match_not_sriov_vf(const struct xe_gt *gt,
 			       const struct xe_hw_engine *hwe);
 
+bool xe_rtp_match_psmi_enabled(const struct xe_gt *gt,
+			       const struct xe_hw_engine *hwe);
+
 #endif

-- 
2.50.1



* [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (2 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 03/13] drm/xe/rtp: Add match for psmi Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13 17:44   ` Riana Tauro
  2025-08-08 17:29 ` [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509 Lucas De Marchi
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane,
	Badal Nilawar

From: Badal Nilawar <badal.nilawar@intel.com>

Enable Wa_14020001231 to block PSMI interrupts during the C6 entry/exit
flow. It is only enabled if PSMI is enabled at runtime.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
v2:
 - Enable only when PSMI is enabled
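
For illustration only, the rule lines added to xe_wa_oob.rules below can be
read as the predicate sketched here. It assumes that conditions on one line
are ANDed, that separate lines are ORed, and that version ranges are
inclusive on both ends. struct gt_model and wa_14020001231_applies() are
hypothetical, and the model simplifies a GT to a single IP kind and version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a GT: one IP kind and one version. */
struct gt_model {
	bool is_media;		/* media GT vs. graphics (primary) GT */
	unsigned int verx100;	/* IP version times 100, e.g. 2004 */
	bool psmi_enabled;	/* what xe_rtp_match_psmi_enabled() would return */
};

/* Sketch of the four 14020001231 rule lines added by this patch. */
static bool wa_14020001231_applies(const struct gt_model *gt)
{
	bool ip_match;

	assert(gt);

	if (gt->is_media)
		ip_match = gt->verx100 == 2000 || gt->verx100 == 3000 ||
			   gt->verx100 == 3002;
	else
		ip_match = gt->verx100 >= 2001 && gt->verx100 <= 2004;

	/* every rule line also carries FUNC(xe_rtp_match_psmi_enabled) */
	return ip_match && gt->psmi_enabled;
}
```

Since every line ends with the PSMI match, the workaround never applies on
a device where PSMI was not requested, regardless of IP version.
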
---
 drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 1 +
 drivers/gpu/drm/xe/xe_guc_ads.c       | 4 ++++
 drivers/gpu/drm/xe/xe_wa_oob.rules    | 4 ++++
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index 31dbfeee289e7..0e78351c6ef5a 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -390,6 +390,7 @@ enum  {
  */
 enum xe_guc_klv_ids {
 	GUC_WORKAROUND_KLV_BLOCK_INTERRUPTS_WHEN_MGSR_BLOCKED				= 0x9002,
+	GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT		= 0x9004,
 	GUC_WORKAROUND_KLV_ID_GAM_PFQ_SHADOW_TAIL_POLLING				= 0x9005,
 	GUC_WORKAROUND_KLV_ID_DISABLE_MTP_DURING_ASYNC_COMPUTE				= 0x9007,
 	GUC_WA_KLV_NP_RD_WRITE_TO_CLEAR_RCSM_AT_CGP_LATE_RESTORE			= 0x9008,
diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
index 2ceaa197cb2f0..c42fc78798ca0 100644
--- a/drivers/gpu/drm/xe/xe_guc_ads.c
+++ b/drivers/gpu/drm/xe/xe_guc_ads.c
@@ -359,6 +359,10 @@ static void guc_waklv_init(struct xe_guc_ads *ads)
 				 GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG);
 	}
 
+	if (XE_WA(gt, 14020001231))
+		guc_waklv_enable(ads, NULL, 0, &offset, &remain,
+				 GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT);
+
 	size = guc_ads_waklv_size(ads) - remain;
 	if (!size)
 		return;
diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
index 8d0aabab67773..303a5e05d9932 100644
--- a/drivers/gpu/drm/xe/xe_wa_oob.rules
+++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
@@ -68,6 +68,10 @@ no_media_l3	MEDIA_VERSION(3000)
 		MEDIA_VERSION_RANGE(1300, 3000)
 		MEDIA_VERSION(3002)
 		GRAPHICS_VERSION(3003)
+14020001231	GRAPHICS_VERSION_RANGE(2001,2004), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(3000), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(3002), FUNC(xe_rtp_match_psmi_enabled)
 
 # SoC workaround - currently applies to all platforms with the following
 # primary GT GMDID

-- 
2.50.1



* [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (3 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231 Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13 11:15   ` Bhadane, Dnyaneshwar
  2025-08-08 17:29 ` [PATCH v3 06/13] drm/xe/configfs: Simplify kernel doc Lucas De Marchi
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane,
	Vinay Belgaumkar

From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>

This WA ensures GuC will restore the media MCFG registers at C6
exit.

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
v2:
 - Enable only when PSMI is enabled
---
 drivers/gpu/drm/xe/xe_guc.c        | 3 +++
 drivers/gpu/drm/xe/xe_guc_fwif.h   | 1 +
 drivers/gpu/drm/xe/xe_wa_oob.rules | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index cb757a53de856..f55c4a37cfed1 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -219,6 +219,9 @@ static u32 guc_ctl_wa_flags(struct xe_guc *guc)
 	if (XE_WA(gt, 14018913170))
 		flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6;
 
+	if (XE_WA(gt, 16023683509))
+		flags |= GUC_WA_SAVE_RESTORE_MCFG_REG_AT_MC6;
+
 	return flags;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
index 4dc000c977faf..a169ad0da0d47 100644
--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
@@ -108,6 +108,7 @@ struct guc_update_exec_queue_policy {
 #define   GUC_WA_RENDER_RST_RC6_EXIT	BIT(19)
 #define   GUC_WA_RCS_REGS_IN_CCS_REGS_LIST	BIT(21)
 #define   GUC_WA_ENABLE_TSC_CHECK_ON_RC6	BIT(22)
+#define   GUC_WA_SAVE_RESTORE_MCFG_REG_AT_MC6	BIT(25)
 
 #define GUC_CTL_FEATURE			2
 #define   GUC_CTL_ENABLE_SLPC		BIT(2)
diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
index 303a5e05d9932..fe369e8a01012 100644
--- a/drivers/gpu/drm/xe/xe_wa_oob.rules
+++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
@@ -72,6 +72,8 @@ no_media_l3	MEDIA_VERSION(3000)
 		MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
 		MEDIA_VERSION(3000), FUNC(xe_rtp_match_psmi_enabled)
 		MEDIA_VERSION(3002), FUNC(xe_rtp_match_psmi_enabled)
+16023683509	MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
+		MEDIA_VERSION(3000), GRAPHICS_STEP(A0, B0), FUNC(xe_rtp_match_psmi_enabled)
 
 # SoC workaround - currently applies to all platforms with the following
 # primary GT GMDID

-- 
2.50.1



* [PATCH v3 06/13] drm/xe/configfs: Simplify kernel doc
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (4 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509 Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13  6:23   ` Riana Tauro
  2025-08-08 17:29 ` [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI Lucas De Marchi
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

From the perspective of a caller reading the documentation, there's no
need to be so specific about everything the function is doing/checking.
Just document the functionality a caller cares about.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 853da2ee837ac..17b1d6ae1ff6a 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -389,10 +389,7 @@ static struct xe_config_group_device *find_xe_config_group_device(struct pci_dev
  * xe_configfs_get_survivability_mode - get configfs survivability mode attribute
  * @pdev: pci device
  *
- * find the configfs group that belongs to the pci device and return
- * the survivability mode attribute
- *
- * Return: survivability mode if config group is found, false otherwise
+ * Return: survivability_mode attribute in configfs
  */
 bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
 {
@@ -409,11 +406,8 @@ bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
 }
 
 /**
- * xe_configfs_clear_survivability_mode - clear configfs survivability mode attribute
+ * xe_configfs_clear_survivability_mode - clear configfs survivability mode
  * @pdev: pci device
- *
- * find the configfs group that belongs to the pci device and clear survivability
- * mode attribute
  */
 void xe_configfs_clear_survivability_mode(struct pci_dev *pdev)
 {
@@ -433,10 +427,7 @@ void xe_configfs_clear_survivability_mode(struct pci_dev *pdev)
  * xe_configfs_get_engines_allowed - get engine allowed mask from configfs
  * @pdev: pci device
  *
- * Find the configfs group that belongs to the pci device and return
- * the mask of engines allowed to be used.
- *
- * Return: engine mask with allowed engines
+ * Return: engine mask with allowed engines set in configfs
  */
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
 {

-- 
2.50.1



* [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (5 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 06/13] drm/xe/configfs: Simplify kernel doc Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13  6:58   ` Riana Tauro
  2025-08-08 17:29 ` [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock Lucas De Marchi
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison

Now that additional WAs are in place and it's possible to allocate
buffers through debugfs, add the configfs attribute to turn PSMI on.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 66 +++++++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/xe/xe_configfs.h |  2 +-
 2 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 17b1d6ae1ff6a..8cf6b1375b7d4 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -77,6 +77,16 @@
  * available for migrations, but it's disabled. This is intended for debugging
  * purposes only.
  *
+ * PSMI
+ * ----
+ *
+ * Enable extra debugging capabilities to trace engine execution. Only useful
+ * during early platform enabling and requiring additional hardware connected.
+ * Once it's enabled, additional WAs are added and runtime configuration is
+ * done via debugfs. Example to enable it::
+ *
+ *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
+ *
  * Remove devices
  * ==============
  *
@@ -89,8 +99,9 @@ struct xe_config_group_device {
 	struct config_group group;
 
 	struct xe_config_device {
-		bool survivability_mode;
 		u64 engines_allowed;
+		bool survivability_mode;
+		bool enable_psmi;
 	} config;
 
 	/* protects attributes */
@@ -98,8 +109,9 @@ struct xe_config_group_device {
 };
 
 static const struct xe_config_device device_defaults = {
-	.survivability_mode = false,
 	.engines_allowed = U64_MAX,
+	.survivability_mode = false,
+	.enable_psmi = false,
 };
 
 static void set_device_defaults(struct xe_config_device *config)
@@ -243,12 +255,38 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
 	return len;
 }
 
-CONFIGFS_ATTR(, survivability_mode);
+static ssize_t enable_psmi_show(struct config_item *item, char *page)
+{
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
+
+	return sprintf(page, "%d\n", dev->config.enable_psmi);
+}
+
+static ssize_t enable_psmi_store(struct config_item *item, const char *page, size_t len)
+{
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
+	bool val;
+	int ret;
+
+	ret = kstrtobool(page, &val);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->lock);
+	dev->config.enable_psmi = val;
+	mutex_unlock(&dev->lock);
+
+	return len;
+}
+
 CONFIGFS_ATTR(, engines_allowed);
+CONFIGFS_ATTR(, survivability_mode);
+CONFIGFS_ATTR(, enable_psmi);
 
 static struct configfs_attribute *xe_config_device_attrs[] = {
-	&attr_survivability_mode,
 	&attr_engines_allowed,
+	&attr_survivability_mode,
+	&attr_enable_psmi,
 	NULL,
 };
 
@@ -443,6 +481,26 @@ u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
 	return engines_allowed;
 }
 
+/**
+ * xe_configfs_get_psmi_enabled - get configfs enable_psmi setting
+ * @pdev: pci device
+ *
+ * Return: enable_psmi setting in configfs
+ */
+bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)
+{
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
+	bool ret;
+
+	if (!dev)
+		return false;
+
+	ret = dev->config.enable_psmi;
+	config_item_put(&dev->group.cg_item);
+
+	return ret;
+}
+
 int __init xe_configfs_init(void)
 {
 	int ret;
diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
index c14588b86e833..603dd7796c8b2 100644
--- a/drivers/gpu/drm/xe/xe_configfs.h
+++ b/drivers/gpu/drm/xe/xe_configfs.h
@@ -16,7 +16,7 @@ void xe_configfs_exit(void);
 bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
 void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
-static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
+bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
 #else
 static inline int xe_configfs_init(void) { return 0; }
 static inline void xe_configfs_exit(void) { }

-- 
2.50.1



* [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (6 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-12 10:23   ` Bhadane, Dnyaneshwar
  2025-08-08 17:29 ` [PATCH v3 09/13] drm/xe/configfs: Block runtime attribute changes Lucas De Marchi
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

Instead of the manual lock()/unlock() pattern, use guard() which will
make things easier for handling errors or early returns.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 8cf6b1375b7d4..4c2d4ff6a70f5 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -5,6 +5,7 @@
 
 #include <linux/bitops.h>
 #include <linux/configfs.h>
+#include <linux/cleanup.h>
 #include <linux/find.h>
 #include <linux/init.h>
 #include <linux/module.h>
@@ -164,9 +165,8 @@ static ssize_t survivability_mode_store(struct config_item *item, const char *pa
 	if (ret)
 		return ret;
 
-	mutex_lock(&dev->lock);
+	guard(mutex)(&dev->lock);
 	dev->config.survivability_mode = survivability_mode;
-	mutex_unlock(&dev->lock);
 
 	return len;
 }
@@ -248,9 +248,8 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
 		val |= mask;
 	}
 
-	mutex_lock(&dev->lock);
+	guard(mutex)(&dev->lock);
 	dev->config.engines_allowed = val;
-	mutex_unlock(&dev->lock);
 
 	return len;
 }
@@ -272,9 +271,8 @@ static ssize_t enable_psmi_store(struct config_item *item, const char *page, siz
 	if (ret)
 		return ret;
 
-	mutex_lock(&dev->lock);
+	guard(mutex)(&dev->lock);
 	dev->config.enable_psmi = val;
-	mutex_unlock(&dev->lock);
 
 	return len;
 }
@@ -454,9 +452,8 @@ void xe_configfs_clear_survivability_mode(struct pci_dev *pdev)
 	if (!dev)
 		return;
 
-	mutex_lock(&dev->lock);
+	guard(mutex)(&dev->lock);
 	dev->config.survivability_mode = 0;
-	mutex_unlock(&dev->lock);
 
 	config_group_put(&dev->group);
 }

-- 
2.50.1
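
For readers less familiar with guard(), here is a minimal userspace sketch of the scope-based pattern the patch switches to. It assumes GCC or Clang (for the cleanup attribute) and POSIX threads. GUARD_MUTEX, demo_unlock, struct demo_dev, demo_store and demo_lock_is_free are hypothetical names invented for illustration; this is not the kernel's <linux/cleanup.h> implementation, only the same idea in a form that builds outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void demo_unlock(pthread_mutex_t **m)
{
	int err = pthread_mutex_unlock(*m);

	assert(err == 0);
	(void)err;
}

/* Hypothetical stand-in for guard(mutex)(): lock now, unlock at scope exit. */
#define GUARD_MUTEX(m) \
	pthread_mutex_t *guard_ __attribute__((cleanup(demo_unlock), unused)) = \
		(pthread_mutex_lock(m), (m))

struct demo_dev {
	pthread_mutex_t lock;
	bool bound;
	bool enable_psmi;
};

static int demo_store(struct demo_dev *dev, bool val)
{
	GUARD_MUTEX(&dev->lock);

	if (dev->bound)
		return -EBUSY;	/* early return: the handler still drops the lock */

	dev->enable_psmi = val;
	return 0;
}

/* Helper for checking that no path leaked the lock. */
static bool demo_lock_is_free(struct demo_dev *dev)
{
	if (pthread_mutex_trylock(&dev->lock))
		return false;
	pthread_mutex_unlock(&dev->lock);
	return true;
}
```

The early return needs no explicit unlock, which is the property the commit message refers to. One consequence of scope-based release is that every statement after the guard in the same scope runs with the lock held, including the last one before the closing brace.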



* [PATCH v3 09/13] drm/xe/configfs: Block runtime attribute changes
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (7 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13 11:03   ` Riana Tauro
  2025-08-08 17:29 ` [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in documentation Lucas De Marchi
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

Although it's possible to change the attributes at runtime, they have no
effect after the driver is already bound to the device. Check for that
and return -EBUSY in that case.

This should help users understand what's going on when the behavior does
not change even though the value in configfs is "right": it was set too
late.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 4c2d4ff6a70f5..489c5c67001dc 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -54,6 +54,8 @@
  *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
  *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind  (Enters survivability mode if supported)
  *
+ * This attribute can only be set before binding to the device.
+ *
  * Allowed engines:
  * ----------------
  *
@@ -78,6 +80,8 @@
  * available for migrations, but it's disabled. This is intended for debugging
  * purposes only.
  *
+ * This attribute can only be set before binding to the device.
+ *
  * PSMI
  * ----
  *
@@ -88,6 +92,8 @@
  *
  *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
  *
+ * This attribute can only be set before binding to the device.
+ *
  * Remove devices
  * ==============
  *
@@ -148,6 +154,29 @@ static struct xe_config_device *to_xe_config_device(struct config_item *item)
 	return &to_xe_config_group_device(item)->config;
 }
 
+static bool is_bound(struct xe_config_group_device *dev)
+{
+	unsigned int domain, bus, slot, function;
+	struct pci_dev *pdev;
+	const char *name;
+	bool ret;
+
+	lockdep_assert_held(&dev->lock);
+
+	name = dev->group.cg_item.ci_name;
+	if (sscanf(name, "%x:%x:%x.%x", &domain, &bus, &slot, &function) != 4)
+		return false;
+
+	pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, function));
+	if (!pdev)
+		return false;
+
+	ret = pci_get_drvdata(pdev);
+	pci_dev_put(pdev);
+
+	return ret;
+}
+
 static ssize_t survivability_mode_show(struct config_item *item, char *page)
 {
 	struct xe_config_device *dev = to_xe_config_device(item);
@@ -166,6 +195,9 @@ static ssize_t survivability_mode_store(struct config_item *item, const char *pa
 		return ret;
 
 	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
 	dev->config.survivability_mode = survivability_mode;
 
 	return len;
@@ -249,6 +281,9 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
 	}
 
 	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
 	dev->config.engines_allowed = val;
 
 	return len;
@@ -272,6 +307,9 @@ static ssize_t enable_psmi_store(struct config_item *item, const char *page, siz
 		return ret;
 
 	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
 	dev->config.enable_psmi = val;
 
 	return len;

-- 
2.50.1
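
The name parsing in is_bound() can be tried outside the kernel. The following is a rough userspace sketch, assuming the configfs directory name has the domain:bus:slot.function form used throughout the documentation. DEMO_PCI_DEVFN, struct demo_bdf and demo_parse_bdf are hypothetical names; the macro mirrors the slot/function packing of the kernel's PCI_DEVFN() (5 bits of slot, 3 bits of function).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Same packing as PCI_DEVFN(): slot in bits 7:3, function in bits 2:0. */
#define DEMO_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

struct demo_bdf {
	unsigned int domain;
	unsigned int bus;
	unsigned int devfn;
};

/* Parse a name such as "0000:03:00.0"; return false if it does not match. */
static bool demo_parse_bdf(const char *name, struct demo_bdf *out)
{
	unsigned int domain, bus, slot, function;

	assert(name && out);

	/* All four fields are hexadecimal, as in the patch. */
	if (sscanf(name, "%x:%x:%x.%x", &domain, &bus, &slot, &function) != 4)
		return false;

	out->domain = domain;
	out->bus = bus;
	out->devfn = DEMO_PCI_DEVFN(slot, function);
	return true;
}
```

With this sketch, "0000:00:02.1" yields devfn 0x11 (slot 2, function 1). Note that sscanf() stops after the fourth field, so trailing characters in the name are not rejected by this check alone.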



* [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in documentation
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (8 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 09/13] drm/xe/configfs: Block runtime attribute changes Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-14 21:31   ` Bhadane, Dnyaneshwar
  2025-08-08 17:29 ` [PATCH v3 11/13] drm/xe/configfs: Improve documentation steps Lucas De Marchi
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

When documenting the directories, use an output similar to the `tree`
command and add VFs and missing attributes.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 489c5c67001dc..7c92d293ba733 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -39,8 +39,16 @@
  * used to configure it::
  *
  *	/sys/kernel/config/xe/
- *		.. 0000:03:00.0/
- *			... survivability_mode
+ *	├── 0000:00:02.0
+ *	│   └── ...
+ *	├── 0000:00:02.1
+ *	│   └── ...
+ *	:
+ *	└── 0000:03:00.0
+ *	    ├── survivability_mode
+ *	    ├── engines_allowed
+ *	    └── enable_psmi
+ *
  *
  * Configure Attributes
  * ====================

-- 
2.50.1



* [PATCH v3 11/13] drm/xe/configfs: Improve documentation steps
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (9 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in documentation Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-13 11:08   ` Riana Tauro
  2025-08-08 17:29 ` [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation Lucas De Marchi
  2025-08-08 17:29 ` [PATCH v3 13/13] drm/xe/configfs: Dump custom settings when binding Lucas De Marchi
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

The steps are roughly:

1. Load the module without binding to the device
2. Configure the desired device
3. Bind the device

Move the binding part to the "Create devices" section, since it's not
exclusive to the survivability_mode attribute, and better document the steps.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 7c92d293ba733..58196b7571239 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -29,11 +29,18 @@
  * See Documentation/filesystems/configfs.rst for more information about how configfs works.
  *
  * Create devices
- * ===============
+ * ==============
+ *
+ * To create a device, the ``xe`` module should already be loaded, but some
+ * attributes can only be set before binding the device. It can be accomplished
+ * by blocking the driver autoprobe::
  *
- * In order to create a device, the user has to create a directory inside ``'xe'``::
+ *	# echo 0 > /sys/bus/pci/drivers_autoprobe
+ *	# modprobe xe
  *
- *	mkdir /sys/kernel/config/xe/0000:03:00.0/
+ * In order to create a device, the user has to create a directory inside ``xe``::
+ *
+ *	# mkdir /sys/kernel/config/xe/0000:03:00.0/
  *
  * Every device created is populated by the driver with entries that can be
  * used to configure it::
@@ -49,6 +56,12 @@
  *	    ├── engines_allowed
  *	    └── enable_psmi
  *
+ * After configuring the attributes as per the next section, the device can be
+ * probed with::
+ *
+ *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind
+ *	# # or
+ *	# echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
  *
  * Configure Attributes
  * ====================
@@ -60,7 +73,6 @@
  * effect when probing the device. Example to enable it::
  *
  *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
- *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind  (Enters survivability mode if supported)
  *
  * This attribute can only be set before binding to the device.
  *

-- 
2.50.1



* [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (10 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 11/13] drm/xe/configfs: Improve documentation steps Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-12 10:24   ` Bhadane, Dnyaneshwar
  2025-08-08 17:29 ` [PATCH v3 13/13] drm/xe/configfs: Dump custom settings when binding Lucas De Marchi
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane

Add missing punctuation and line breaks, and make the syntax for code
snippets consistent across all of them.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 58196b7571239..3b9d24c0bb588 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -24,9 +24,10 @@
  * =========
  *
  * Configfs is a filesystem-based manager of kernel objects. XE KMD registers a
- * configfs subsystem called ``'xe'`` that creates a directory in the mounted configfs directory
- * The user can create devices under this directory and configure them as necessary
- * See Documentation/filesystems/configfs.rst for more information about how configfs works.
+ * configfs subsystem called ``xe`` that creates a directory in the mounted
+ * configfs directory. The user can create devices under this directory and
+ * configure them as necessary. See Documentation/filesystems/configfs.rst for
+ * more information about how configfs works.
  *
  * Create devices
  * ==============
@@ -119,7 +120,7 @@
  *
  * The created device directories can be removed using ``rmdir``::
  *
- *	rmdir /sys/kernel/config/xe/0000:03:00.0/
+ *	# rmdir /sys/kernel/config/xe/0000:03:00.0/
  */
 
 struct xe_config_group_device {

-- 
2.50.1



* [PATCH v3 13/13] drm/xe/configfs: Dump custom settings when binding
  2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
                   ` (11 preceding siblings ...)
  2025-08-08 17:29 ` [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation Lucas De Marchi
@ 2025-08-08 17:29 ` Lucas De Marchi
  2025-08-15  0:48   ` Belgaumkar, Vinay
  12 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-08 17:29 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, prashanth.kumar, dnyaneshwar.bhadane,
	Michal Wajdeczko, John Harrison

Device configuration using configfs can be prepared a long time before
the driver loads. Currently all the xe configfs entries are for things
that are important to have in the log if a non-default value is being
used. Add an info-level message about that, listing the individual
entries that differ from the default.

Based on previous patch by Michal Wajdeczko.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c | 39 +++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_configfs.h |  2 ++
 drivers/gpu/drm/xe/xe_pci.c      |  3 +++
 3 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 3b9d24c0bb588..9a283b713ff9d 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -480,6 +480,45 @@ static struct xe_config_group_device *find_xe_config_group_device(struct pci_dev
 	return to_xe_config_group_device(item);
 }
 
+static void dump_custom_dev_config(struct pci_dev *pdev,
+				   struct xe_config_group_device *dev)
+{
+#define PRI_CUSTOM_ATTR(fmt_, attr_) do { \
+		if (dev->config.attr_ != device_defaults.attr_) \
+			pci_info(pdev, "configfs: " __stringify(attr_) " = " fmt_ "\n", \
+				 dev->config.attr_); \
+	} while (0)
+
+	PRI_CUSTOM_ATTR("%llx", engines_allowed);
+	PRI_CUSTOM_ATTR("%d", enable_psmi);
+	PRI_CUSTOM_ATTR("%d", survivability_mode);
+
+#undef PRI_CUSTOM_ATTR
+}
+
+/**
+ * xe_configfs_check_device() - Test if device was configured by configfs
+ * @pdev: the &pci_dev device to test
+ *
+ * Try to find the configfs group that belongs to the specified pci device
+ * and print a diagnostic message if found.
+ */
+void xe_configfs_check_device(struct pci_dev *pdev)
+{
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
+
+	if (!dev)
+		return;
+
+	/* memcmp here is safe as both are zero-initialized */
+	if (memcmp(&dev->config, &device_defaults, sizeof(dev->config))) {
+		pci_info(pdev, "Found custom settings in configfs\n");
+		dump_custom_dev_config(pdev, dev);
+	}
+
+	config_group_put(&dev->group);
+}
+
 /**
  * xe_configfs_get_survivability_mode - get configfs survivability mode attribute
  * @pdev: pci device
diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
index 603dd7796c8b2..58c8c31640008 100644
--- a/drivers/gpu/drm/xe/xe_configfs.h
+++ b/drivers/gpu/drm/xe/xe_configfs.h
@@ -13,6 +13,7 @@ struct pci_dev;
 #if IS_ENABLED(CONFIG_CONFIGFS_FS)
 int xe_configfs_init(void);
 void xe_configfs_exit(void);
+void xe_configfs_check_device(struct pci_dev *pdev);
 bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
 void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
@@ -20,6 +21,7 @@ bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
 #else
 static inline int xe_configfs_init(void) { return 0; }
 static inline void xe_configfs_exit(void) { }
+static inline void xe_configfs_check_device(struct pci_dev *pdev) { }
 static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
 static inline void xe_configfs_clear_survivability_mode(struct pci_dev *pdev) { }
 static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 52d46c66ae1eb..9ce6e6dca5bc7 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -17,6 +17,7 @@
 
 #include "display/xe_display.h"
 #include "regs/xe_gt_regs.h"
+#include "xe_configfs.h"
 #include "xe_device.h"
 #include "xe_drv.h"
 #include "xe_gt.h"
@@ -771,6 +772,8 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	struct xe_device *xe;
 	int err;
 
+	xe_configfs_check_device(pdev);
+
 	if (desc->require_force_probe && !id_forced(pdev->device)) {
 		dev_info(&pdev->dev,
 			 "Your graphics device %04x is not officially supported\n"

-- 
2.50.1
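
The compare-against-defaults macro from dump_custom_dev_config() can be exercised in userspace as well. The sketch below is a loose adaptation, assuming printf() in place of pci_info() and an added return value so the result is easy to check. struct demo_config, demo_defaults, DEMO_STRINGIFY, DEMO_PRI_CUSTOM_ATTR and demo_dump_custom are hypothetical names, and the explicit cast argument is an addition that the patch does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct demo_config {
	uint64_t engines_allowed;
	bool survivability_mode;
	bool enable_psmi;
};

static const struct demo_config demo_defaults = {
	.engines_allowed = UINT64_MAX,
	.survivability_mode = false,
	.enable_psmi = false,
};

#define DEMO_STRINGIFY(x) #x

/* Print each attribute that differs from its default; return how many did. */
static int demo_dump_custom(const struct demo_config *cfg)
{
	int count = 0;

	assert(cfg);

#define DEMO_PRI_CUSTOM_ATTR(fmt_, type_, attr_) do { \
		if (cfg->attr_ != demo_defaults.attr_) { \
			printf("configfs: " DEMO_STRINGIFY(attr_) " = " fmt_ "\n", \
			       (type_)cfg->attr_); \
			count++; \
		} \
	} while (0)

	DEMO_PRI_CUSTOM_ATTR("%llx", unsigned long long, engines_allowed);
	DEMO_PRI_CUSTOM_ATTR("%d", int, enable_psmi);
	DEMO_PRI_CUSTOM_ATTR("%d", int, survivability_mode);

#undef DEMO_PRI_CUSTOM_ATTR

	return count;
}
```

The per-field comparison does not depend on padding bytes, unlike a whole-struct memcmp(): with a 64-bit member followed by two bools, the struct typically carries trailing padding, which is why the patch comments that both objects are zero-initialized before relying on memcmp().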



* RE: [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock
  2025-08-08 17:29 ` [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock Lucas De Marchi
@ 2025-08-12 10:23   ` Bhadane, Dnyaneshwar
  0 siblings, 0 replies; 33+ messages in thread
From: Bhadane, Dnyaneshwar @ 2025-08-12 10:23 UTC (permalink / raw)
  To: De Marchi, Lucas, intel-xe@lists.freedesktop.org; +Cc: Kumar, Prashanth



> -----Original Message-----
> From: De Marchi, Lucas <lucas.demarchi@intel.com>
> Sent: Friday, August 8, 2025 11:00 PM
> To: intel-xe@lists.freedesktop.org
> Cc: De Marchi, Lucas <lucas.demarchi@intel.com>; Kumar, Prashanth
> <prashanth.kumar@intel.com>; Bhadane, Dnyaneshwar
> <dnyaneshwar.bhadane@intel.com>
> Subject: [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock
> 
> Instead of the manual lock()/unlock() pattern, use guard() which will make
> things easier for handling errors or early returns.
> 

> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

LGTM,
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_configfs.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c
> b/drivers/gpu/drm/xe/xe_configfs.c
> index 8cf6b1375b7d4..4c2d4ff6a70f5 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -5,6 +5,7 @@
> 
>  #include <linux/bitops.h>
>  #include <linux/configfs.h>
> +#include <linux/cleanup.h>
>  #include <linux/find.h>
>  #include <linux/init.h>
>  #include <linux/module.h>
> @@ -164,9 +165,8 @@ static ssize_t survivability_mode_store(struct
> config_item *item, const char *pa
>  	if (ret)
>  		return ret;
> 
> -	mutex_lock(&dev->lock);
> +	guard(mutex)(&dev->lock);
>  	dev->config.survivability_mode = survivability_mode;
> -	mutex_unlock(&dev->lock);
> 
>  	return len;
>  }
> @@ -248,9 +248,8 @@ static ssize_t engines_allowed_store(struct
> config_item *item, const char *page,
>  		val |= mask;
>  	}
> 
> -	mutex_lock(&dev->lock);
> +	guard(mutex)(&dev->lock);
>  	dev->config.engines_allowed = val;
> -	mutex_unlock(&dev->lock);
> 
>  	return len;
>  }
> @@ -272,9 +271,8 @@ static ssize_t enable_psmi_store(struct config_item
> *item, const char *page, siz
>  	if (ret)
>  		return ret;
> 
> -	mutex_lock(&dev->lock);
> +	guard(mutex)(&dev->lock);
>  	dev->config.enable_psmi = val;
> -	mutex_unlock(&dev->lock);
> 
>  	return len;
>  }
> @@ -454,9 +452,8 @@ void xe_configfs_clear_survivability_mode(struct
> pci_dev *pdev)
>  	if (!dev)
>  		return;
> 
> -	mutex_lock(&dev->lock);
> +	guard(mutex)(&dev->lock);
>  	dev->config.survivability_mode = 0;
> -	mutex_unlock(&dev->lock);
> 
>  	config_group_put(&dev->group);
>  }
> 
> --
> 2.50.1



* RE: [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation
  2025-08-08 17:29 ` [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation Lucas De Marchi
@ 2025-08-12 10:24   ` Bhadane, Dnyaneshwar
  0 siblings, 0 replies; 33+ messages in thread
From: Bhadane, Dnyaneshwar @ 2025-08-12 10:24 UTC (permalink / raw)
  To: De Marchi, Lucas, intel-xe@lists.freedesktop.org; +Cc: Kumar, Prashanth



> -----Original Message-----
> From: De Marchi, Lucas <lucas.demarchi@intel.com>
> Sent: Friday, August 8, 2025 11:00 PM
> To: intel-xe@lists.freedesktop.org
> Cc: De Marchi, Lucas <lucas.demarchi@intel.com>; Kumar, Prashanth
> <prashanth.kumar@intel.com>; Bhadane, Dnyaneshwar
> <dnyaneshwar.bhadane@intel.com>
> Subject: [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation
> 
> Add a few missing punctuation and line breaks and make the syntax for code
> snippets common to all of them.
> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

LGTM,
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>


> ---
>  drivers/gpu/drm/xe/xe_configfs.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c
> b/drivers/gpu/drm/xe/xe_configfs.c
> index 58196b7571239..3b9d24c0bb588 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -24,9 +24,10 @@
>   * =========
>   *
>   * Configfs is a filesystem-based manager of kernel objects. XE KMD registers
> a
> - * configfs subsystem called ``'xe'`` that creates a directory in the mounted
> configfs directory
> - * The user can create devices under this directory and configure them as
> necessary
> - * See Documentation/filesystems/configfs.rst for more information about
> how configfs works.
> + * configfs subsystem called ``xe`` that creates a directory in the
> + mounted
> + * configfs directory. The user can create devices under this directory
> + and
> + * configure them as necessary. See
> + Documentation/filesystems/configfs.rst for
> + * more information about how configfs works.
>   *
>   * Create devices
>   * ==============
> @@ -119,7 +120,7 @@
>   *
>   * The created device directories can be removed using ``rmdir``::
>   *
> - *	rmdir /sys/kernel/config/xe/0000:03:00.0/
> + *	# rmdir /sys/kernel/config/xe/0000:03:00.0/
>   */
> 
>  struct xe_config_group_device {
> 
> --
> 2.50.1



* Re: [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI
  2025-08-08 17:29 ` [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI Lucas De Marchi
@ 2025-08-13  0:38   ` Belgaumkar, Vinay
  2025-08-15 21:34     ` Lucas De Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Belgaumkar, Vinay @ 2025-08-13  0:38 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe
  Cc: prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison


On 8/8/2025 10:29 AM, Lucas De Marchi wrote:
> PSMI allows to capture data from the GPU useful for early
> validation. From the kernel side there isn't much to be done, just a few
> things:
>
> 	1) Toggle the feature support in GuC
> 	2) Enable some additional WAs
> 	3) Allocate buffers
>
> Here is the first step, with the next ones to follow. For now everything
> is disabled through a check in configfs that is currently hardcoded to
> disabled.
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_configfs.h | 2 ++
>   drivers/gpu/drm/xe/xe_guc.c      | 7 ++++++-
>   drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
>   3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
> index fb87640080896..c14588b86e833 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.h
> +++ b/drivers/gpu/drm/xe/xe_configfs.h
> @@ -16,12 +16,14 @@ void xe_configfs_exit(void);
>   bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>   void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
>   u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
> +static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
>   #else
>   static inline int xe_configfs_init(void) { return 0; }
>   static inline void xe_configfs_exit(void) { }
>   static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
>   static inline void xe_configfs_clear_survivability_mode(struct pci_dev *pdev) { }
>   static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
> +static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
>   #endif
>   
>   #endif
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index 9e34401e4489f..cb757a53de856 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -16,6 +16,7 @@
>   #include "regs/xe_guc_regs.h"
>   #include "regs/xe_irq_regs.h"
>   #include "xe_bo.h"
> +#include "xe_configfs.h"
>   #include "xe_device.h"
>   #include "xe_force_wake.h"
>   #include "xe_gt.h"
> @@ -81,11 +82,15 @@ static u32 guc_ctl_debug_flags(struct xe_guc *guc)
>   
>   static u32 guc_ctl_feature_flags(struct xe_guc *guc)
>   {
> +	struct xe_device *xe = guc_to_xe(guc);
>   	u32 flags = GUC_CTL_ENABLE_LITE_RESTORE;
>   
> -	if (!guc_to_xe(guc)->info.skip_guc_pc)
> +	if (!xe->info.skip_guc_pc)
>   		flags |= GUC_CTL_ENABLE_SLPC;
>   
> +	if (xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev)))
> +		flags |= GUC_CTL_ENABLE_PSMI;
> +
>   	return flags;
>   }
>   
> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> index ca9f999d38d1e..4dc000c977faf 100644
> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> @@ -112,6 +112,7 @@ struct guc_update_exec_queue_policy {
>   #define GUC_CTL_FEATURE			2
>   #define   GUC_CTL_ENABLE_SLPC		BIT(2)
>   #define   GUC_CTL_ENABLE_LITE_RESTORE	BIT(4)
> +#define   GUC_CTL_ENABLE_PSMI		BIT(7)

Should we have this as GUC_CTL_ENABLE_PSMI_LOGGING to match the GuC 
nomenclature?

Thanks,

Vinay.

>   #define   GUC_CTL_DISABLE_SCHEDULER	BIT(14)
>   
>   #define GUC_CTL_DEBUG			3
>


* Re: [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI
  2025-08-08 17:29 ` [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI Lucas De Marchi
@ 2025-08-13  1:41   ` Belgaumkar, Vinay
  2025-08-13 10:42   ` Matthew Auld
  1 sibling, 0 replies; 33+ messages in thread
From: Belgaumkar, Vinay @ 2025-08-13  1:41 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe
  Cc: prashanth.kumar, dnyaneshwar.bhadane, Matt Roper, Brian Welty


On 8/8/2025 10:29 AM, Lucas De Marchi wrote:
> Requirement for PSMI capture is to have a physically contiguous buffer.
> All the needed configuration is done by the userspace tool directly to
> the GPU via mmio access.
>
> This interface only support allocating from VRAM regions. For integrated

NIT: s/support/supports

Other than that, LGTM,

Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>

> devices, the PSMI buffer is in SYSTEM memory and should be allocated by
> userspace using hugetlbfs.
>
> Here we add the ability to allocate a region of physically contiguous
> memory by writing to debugfs file (listed below). For multi-tile devices,
> the capture tool requires ability to allocate a capture buffer per tile
> (VRAM region) and so user can specify a region_mask. The tool then
> can mmap the buffers via direct mmap of the PCIBAR via sysfs.
>
> To support the capture tool, 3 new debugfs entries are added:
>
>     psmi_capture_addr - physical address per VRAM region's capture buffer
>     psmi_capture_region_mask - select which region(s) to allocate a buffer
>     psmi_capture_size - size of current capture buffer
>
> Writing psmi_capture_size will allocate new buffer of requested size per
> region after freeing any current buffers.
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Original-author: Brian Welty <brian.welty@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> v2:
>   - Fix kernel-doc
>   - Do not walk all region_mask on cleanup: it should never be needed
>   - Replace sysmem checks by asserts as they should never be set
>   - s/debugfs_create/debugfs_register/ and do not pass the root dir:
>     this makes it similar to other parts registering debugfs
>   - Do not export a cleanup function, rather use a init that registers
>     a devm action if needed
>   - Drop modparam in favor of configfs whose attribute will be
>     implemented when everything is ready
> ---
>   drivers/gpu/drm/xe/Makefile          |   1 +
>   drivers/gpu/drm/xe/xe_debugfs.c      |   3 +
>   drivers/gpu/drm/xe/xe_device.c       |   5 +
>   drivers/gpu/drm/xe/xe_device_types.h |   8 +
>   drivers/gpu/drm/xe/xe_psmi.c         | 313 +++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_psmi.h         |  14 ++
>   6 files changed, 344 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 8e0c3412a757c..85b8d3a59ef07 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -98,6 +98,7 @@ xe-y += xe_bb.o \
>   	xe_pcode.o \
>   	xe_pm.o \
>   	xe_preempt_fence.o \
> +	xe_psmi.o \
>   	xe_pt.o \
>   	xe_pt_walk.o \
>   	xe_pxp.o \
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index 0b4a532f7c45c..bc717519502dd 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -20,6 +20,7 @@
>   #include "xe_guc_ads.h"
>   #include "xe_mmio.h"
>   #include "xe_pm.h"
> +#include "xe_psmi.h"
>   #include "xe_pxp_debugfs.h"
>   #include "xe_sriov.h"
>   #include "xe_sriov_pf.h"
> @@ -400,6 +401,8 @@ void xe_debugfs_register(struct xe_device *xe)
>   
>   	xe_pxp_debugfs_register(xe->pxp);
>   
> +	xe_psmi_debugfs_register(xe);
> +
>   	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
>   
>   	if (IS_SRIOV_PF(xe))
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 57edbc63da6f4..62edb39b61fb0 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -54,6 +54,7 @@
>   #include "xe_pcode.h"
>   #include "xe_pm.h"
>   #include "xe_pmu.h"
> +#include "xe_psmi.h"
>   #include "xe_pxp.h"
>   #include "xe_query.h"
>   #include "xe_shrinker.h"
> @@ -908,6 +909,10 @@ int xe_device_probe(struct xe_device *xe)
>   	if (err)
>   		return err;
>   
> +	err = xe_psmi_init(xe);
> +	if (err)
> +		return err;
> +
>   	err = drm_dev_register(&xe->drm, 0);
>   	if (err)
>   		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 01e8fa0d2f9f7..bf9af8d0b84ae 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -576,6 +576,14 @@ struct xe_device {
>   	atomic64_t global_total_pages;
>   #endif
>   
> +	/** @psmi: GPU debugging via additional validation HW */
> +	struct {
> +		/** @psmi.capture_obj: PSMI buffer for VRAM */
> +		struct xe_bo *capture_obj[XE_MAX_TILES_PER_DEVICE + 1];
> +		/** @psmi.region_mask: Mask of valid memory regions */
> +		u8 region_mask;
> +	} psmi;
> +
>   	/* private: */
>   
>   #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_psmi.c b/drivers/gpu/drm/xe/xe_psmi.c
> new file mode 100644
> index 0000000000000..e6a67e85e1bb2
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_psmi.c
> @@ -0,0 +1,313 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <linux/debugfs.h>
> +
> +#include "xe_bo.h"
> +#include "xe_device.h"
> +#include "xe_configfs.h"
> +#include "xe_psmi.h"
> +
> +/*
> + * PSMI capture support
> + *
> + * Requirement for PSMI capture is to have a physically contiguous buffer.  The
> + * PSMI tool owns doing all necessary configuration (MMIO register writes are
> + * done from user-space). However, KMD needs to provide the PSMI tool with the
> + * required physical address of the base of PSMI buffer in case of VRAM.
> + *
> + * VRAM backed PSMI buffer:
> + * Buffer is allocated as GEM object and with XE_BO_CREATE_PINNED_BIT flag which
> + * creates a contiguous allocation. The physical address is returned from
> + * psmi_debugfs_capture_addr_show(). PSMI tool can mmap the buffer via the
> + * PCIBAR through sysfs.
> + *
> + * SYSTEM memory backed PSMI buffer:
> + * Interface here does not support allocating from SYSTEM memory region.  The
> + * PSMI tool needs to allocate memory themselves using hugetlbfs. In order to
> + * get the physical address, user-space can query /proc/[pid]/pagemap. As an
> + * alternative, CMA debugfs could also be used to allocate reserved CMA memory.
> + */
> +
> +static bool psmi_enabled(struct xe_device *xe)
> +{
> +	return xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev));
> +}
> +
> +static void psmi_free_object(struct xe_bo *bo)
> +{
> +	xe_bo_lock(bo, NULL);
> +	xe_bo_unpin(bo);
> +	xe_bo_unlock(bo);
> +	xe_bo_put(bo);
> +}
> +
> +/*
> + * Free PSMI capture buffer objects.
> + */
> +static void psmi_cleanup(struct xe_device *xe)
> +{
> +	unsigned long id, region_mask = xe->psmi.region_mask;
> +	struct xe_bo *bo;
> +
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		bo = xe->psmi.capture_obj[id];
> +		if (bo) {
> +			psmi_free_object(bo);
> +			xe->psmi.capture_obj[id] = NULL;
> +		}
> +	}
> +}
> +
> +static struct xe_bo *psmi_alloc_object(struct xe_device *xe,
> +				       unsigned int id, size_t bo_size)
> +{
> +	struct xe_bo *bo = NULL;
> +	struct xe_tile *tile;
> +	int err;
> +
> +	if (!id || !bo_size)
> +		return NULL;
> +
> +	tile = &xe->tiles[id - 1];
> +
> +	/* VRAM: Allocate GEM object for the capture buffer */
> +	bo = xe_bo_create_locked(xe, tile, NULL, bo_size,
> +				 ttm_bo_type_kernel,
> +				 XE_BO_FLAG_VRAM_IF_DGFX(tile) |
> +				 XE_BO_FLAG_PINNED |
> +				 XE_BO_FLAG_NEEDS_CPU_ACCESS);
> +
> +	if (!IS_ERR(bo)) {
> +		/* Buffer written by HW, ensure stays resident */
> +		err = xe_bo_pin(bo);
> +		if (err)
> +			bo = ERR_PTR(err);
> +		xe_bo_unlock(bo);
> +	}
> +
> +	return bo;
> +}
> +
> +/*
> + * Allocate PSMI capture buffer objects (via debugfs set function), based on
> + * which regions the user has selected in region_mask.  @size: size in bytes
> + * (should be power of 2)
> + *
> + * Always release/free the current buffer objects before attempting to allocate
> + * new ones.  Size == 0 will free all current buffers.
> + *
> + * Note, we don't write any registers as the capture tool is already configuring
> + * all PSMI registers itself via mmio space.
> + */
> +static int psmi_resize_object(struct xe_device *xe, size_t size)
> +{
> +	unsigned long id, region_mask = xe->psmi.region_mask;
> +	struct xe_bo *bo = NULL;
> +	int err = 0;
> +
> +	/*
> +	 * Buddy allocator anyway will roundup to next power of 2,
> +	 * so rather than waste unused pages, require user to ask for
> +	 * power of 2 sized PSMI buffers.
> +	 */
> +	if (size && !is_power_of_2(size))
> +		return -EINVAL;
> +
> +	/* if resizing, free currently allocated buffers first */
> +	psmi_cleanup(xe);
> +
> +	/* can set size to 0, in which case, now done */
> +	if (!size)
> +		return 0;
> +
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		bo = psmi_alloc_object(xe, id, size);
> +		if (IS_ERR(bo)) {
> +			err = PTR_ERR(bo);
> +			break;
> +		}
> +		xe->psmi.capture_obj[id] = bo;
> +
> +		drm_info(&xe->drm,
> +			 "PSMI capture size requested: %zu bytes, allocated: %lu:%zu\n",
> +			 size, id, bo ? xe_bo_size(bo) : 0);
> +	}
> +
> +	/* on error, reverse what was allocated */
> +	if (err)
> +		psmi_cleanup(xe);
> +
> +	return err;
> +}
> +
> +/*
> + * Returns an address for the capture tool to use to find start of capture
> + * buffer. Capture tool requires the capability to have a buffer allocated per
> + * each tile (VRAM region), thus we return an address for each region.
> + */
> +static int psmi_debugfs_capture_addr_show(struct seq_file *m, void *data)
> +{
> +	struct xe_device *xe = m->private;
> +	unsigned long id, region_mask;
> +	struct xe_bo *bo;
> +	u64 val;
> +
> +	region_mask = xe->psmi.region_mask;
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		/* VRAM region */
> +		bo = xe->psmi.capture_obj[id];
> +		if (!bo)
> +			continue;
> +
> +		/* pinned, so don't need bo_lock */
> +		val = __xe_bo_addr(bo, 0, PAGE_SIZE);
> +		seq_printf(m, "%ld: 0x%llx\n", id, val);
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Return capture buffer size, using the size from first allocated object that
> + * is found. This works because all objects must be of the same size.
> + */
> +static int psmi_debugfs_capture_size_get(void *data, u64 *val)
> +{
> +	unsigned long id, region_mask;
> +	struct xe_device *xe = data;
> +	struct xe_bo *bo;
> +
> +	region_mask = xe->psmi.region_mask;
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		bo = xe->psmi.capture_obj[id];
> +		if (bo) {
> +			*val = xe_bo_size(bo);
> +			return 0;
> +		}
> +	}
> +
> +	/* no capture objects are allocated */
> +	*val = 0;
> +
> +	return 0;
> +}
> +
> +/*
> + * Set size of PSMI capture buffer. This triggers the allocation of capture
> + * buffer in each memory region as specified with prior write to
> + * psmi_capture_region_mask.
> + */
> +static int psmi_debugfs_capture_size_set(void *data, u64 val)
> +{
> +	struct xe_device *xe = data;
> +
> +	/* user must have specified at least one region */
> +	if (!xe->psmi.region_mask)
> +		return -EINVAL;
> +
> +	return psmi_resize_object(xe, val);
> +}
> +
> +static int psmi_debugfs_capture_region_mask_get(void *data, u64 *val)
> +{
> +	struct xe_device *xe = data;
> +
> +	*val = xe->psmi.region_mask;
> +
> +	return 0;
> +}
> +
> +/*
> + * Select VRAM regions for multi-tile devices, only allowed when buffer is not
> + * currently allocated.
> + */
> +static int psmi_debugfs_capture_region_mask_set(void *data, u64 region_mask)
> +{
> +	struct xe_device *xe = data;
> +	u64 size = 0;
> +
> +	/* SMEM is not supported (see comments at top of file) */
> +	if (region_mask & 0x1)
> +		return -EOPNOTSUPP;
> +
> +	/* input bitmask should contain only valid TTM regions */
> +	if (!region_mask || region_mask & ~xe->info.mem_region_mask)
> +		return -EINVAL;
> +
> +	/* only allow setting mask if buffer is not yet allocated */
> +	psmi_debugfs_capture_size_get(xe, &size);
> +	if (size)
> +		return -EBUSY;
> +
> +	xe->psmi.region_mask = region_mask;
> +
> +	return 0;
> +}
> +
> +DEFINE_SHOW_ATTRIBUTE(psmi_debugfs_capture_addr);
> +
> +DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_region_mask_fops,
> +			 psmi_debugfs_capture_region_mask_get,
> +			 psmi_debugfs_capture_region_mask_set,
> +			 "0x%llx\n");
> +
> +DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_size_fops,
> +			 psmi_debugfs_capture_size_get,
> +			 psmi_debugfs_capture_size_set,
> +			 "%lld\n");
> +
> +void xe_psmi_debugfs_register(struct xe_device *xe)
> +{
> +	struct drm_minor *minor;
> +
> +	if (!psmi_enabled(xe))
> +		return;
> +
> +	minor = xe->drm.primary;
> +	if (!minor->debugfs_root)
> +		return;
> +
> +	debugfs_create_file("psmi_capture_addr",
> +			    0400, minor->debugfs_root, xe,
> +			    &psmi_debugfs_capture_addr_fops);
> +
> +	debugfs_create_file("psmi_capture_region_mask",
> +			    0600, minor->debugfs_root, xe,
> +			    &psmi_debugfs_capture_region_mask_fops);
> +
> +	debugfs_create_file("psmi_capture_size",
> +			    0600, minor->debugfs_root, xe,
> +			    &psmi_debugfs_capture_size_fops);
> +}
> +
> +static void psmi_fini(void *arg)
> +{
> +	psmi_cleanup(arg);
> +}
> +
> +int xe_psmi_init(struct xe_device *xe)
> +{
> +	if (!psmi_enabled(xe))
> +		return 0;
> +
> +	return devm_add_action(xe->drm.dev, psmi_fini, xe);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_psmi.h b/drivers/gpu/drm/xe/xe_psmi.h
> new file mode 100644
> index 0000000000000..b1dfba80d893d
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_psmi.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PSMI_H_
> +#define _XE_PSMI_H_
> +
> +struct xe_device;
> +
> +int xe_psmi_init(struct xe_device *xe);
> +void xe_psmi_debugfs_register(struct xe_device *xe);
> +
> +#endif
>


* Re: [PATCH v3 06/13] drm/xe/configfs: Simplify kernel doc
  2025-08-08 17:29 ` [PATCH v3 06/13] drm/xe/configfs: Simplify kernel doc Lucas De Marchi
@ 2025-08-13  6:23   ` Riana Tauro
  0 siblings, 0 replies; 33+ messages in thread
From: Riana Tauro @ 2025-08-13  6:23 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe; +Cc: prashanth.kumar, dnyaneshwar.bhadane



On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
>  From the caller perspective reading the documentation, there's no need
> to be so specific about everything the function is doing/checking. Just
> document the functionality a caller cares about.
> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

Looks good to me
Reviewed-by: Riana Tauro <riana.tauro@intel.com>

> ---
>   drivers/gpu/drm/xe/xe_configfs.c | 15 +++------------
>   1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 853da2ee837ac..17b1d6ae1ff6a 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -389,10 +389,7 @@ static struct xe_config_group_device *find_xe_config_group_device(struct pci_dev
>    * xe_configfs_get_survivability_mode - get configfs survivability mode attribute
>    * @pdev: pci device
>    *
> - * find the configfs group that belongs to the pci device and return
> - * the survivability mode attribute
> - *
> - * Return: survivability mode if config group is found, false otherwise
> + * Return: survivability_mode attribute in configfs
>    */
>   bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
>   {
> @@ -409,11 +406,8 @@ bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
>   }
>   
>   /**
> - * xe_configfs_clear_survivability_mode - clear configfs survivability mode attribute
> + * xe_configfs_clear_survivability_mode - clear configfs survivability mode
>    * @pdev: pci device
> - *
> - * find the configfs group that belongs to the pci device and clear survivability
> - * mode attribute
>    */
>   void xe_configfs_clear_survivability_mode(struct pci_dev *pdev)
>   {
> @@ -433,10 +427,7 @@ void xe_configfs_clear_survivability_mode(struct pci_dev *pdev)
>    * xe_configfs_get_engines_allowed - get engine allowed mask from configfs
>    * @pdev: pci device
>    *
> - * Find the configfs group that belongs to the pci device and return
> - * the mask of engines allowed to be used.
> - *
> - * Return: engine mask with allowed engines
> + * Return: engine mask with allowed engines set in configfs
>    */
>   u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
>   {
> 



* Re: [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI
  2025-08-08 17:29 ` [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI Lucas De Marchi
@ 2025-08-13  6:58   ` Riana Tauro
  2025-08-13 11:23     ` Lucas De Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Riana Tauro @ 2025-08-13  6:58 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe
  Cc: prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison



On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
> Now that additional WAs are in place and it's possible to allocate
> buffers through debugfs, add the configfs attribute to turn PSMI on.
> 
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_configfs.c | 66 +++++++++++++++++++++++++++++++++++++---
>   drivers/gpu/drm/xe/xe_configfs.h |  2 +-
>   2 files changed, 63 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 17b1d6ae1ff6a..8cf6b1375b7d4 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -77,6 +77,16 @@
>    * available for migrations, but it's disabled. This is intended for debugging
>    * purposes only.
>    *
> + * PSMI
> + * ----
> + *
> + * Enable extra debugging capabilities to trace engine execution. Only useful
> + * during early platform enabling and requiring additional hardware connected.

%s/requiring/requires

> + * Once it's enabled, additionals WAs are added and runtime configuration is
> + * done via debugfs. Example to enable it::
> + *
> + *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
> + *
>    * Remove devices
>    * ==============
>    *
> @@ -89,8 +99,9 @@ struct xe_config_group_device {
>   	struct config_group group;
>   
>   	struct xe_config_device {
> -		bool survivability_mode;
>   		u64 engines_allowed;
> +		bool survivability_mode;
> +		bool enable_psmi;
>   	} config;
>   
>   	/* protects attributes */
> @@ -98,8 +109,9 @@ struct xe_config_group_device {
>   };
>   
>   static const struct xe_config_device device_defaults = {
> -	.survivability_mode = false,
>   	.engines_allowed = U64_MAX,
> +	.survivability_mode = false,
> +	.enable_psmi = false,
>   };
>   
>   static void set_device_defaults(struct xe_config_device *config)
> @@ -243,12 +255,38 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
>   	return len;
>   }
>   
> -CONFIGFS_ATTR(, survivability_mode);
> +static ssize_t enable_psmi_show(struct config_item *item, char *page)
> +{
> +	struct xe_config_group_device *dev = to_xe_config_group_device(item);

To have consistency with other functions, we can use

struct xe_config_device *dev = to_xe_config_device(item);

> +
> +	return sprintf(page, "%d\n", dev->config.enable_psmi);
> +}
> +
> +static ssize_t enable_psmi_store(struct config_item *item, const char *page, size_t len)
> +{
> +	struct xe_config_group_device *dev = to_xe_config_group_device(item);
> +	bool val;
> +	int ret;
> +
> +	ret = kstrtobool(page, &val);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&dev->lock);
> +	dev->config.enable_psmi = val;
> +	mutex_unlock(&dev->lock);
> +
> +	return len;
> +}
> +
>   CONFIGFS_ATTR(, engines_allowed);
> +CONFIGFS_ATTR(, survivability_mode);
> +CONFIGFS_ATTR(, enable_psmi);

alphabetical?

Thanks
Riana

>   
>   static struct configfs_attribute *xe_config_device_attrs[] = {
> -	&attr_survivability_mode,
>   	&attr_engines_allowed,
> +	&attr_survivability_mode,
> +	&attr_enable_psmi,
>   	NULL,
>   };
>   
> @@ -443,6 +481,26 @@ u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
>   	return engines_allowed;
>   }
>   
> +/**
> + * xe_configfs_get_psmi_enabled - get configfs enable_psmi setting
> + * @pdev: pci device
> + *
> + * Return: enable_psmi setting in configfs
> + */
> +bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)
> +{
> +	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
> +	bool ret;
> +
> +	if (!dev)
> +		return false;
> +
> +	ret = dev->config.enable_psmi;
> +	config_item_put(&dev->group.cg_item);
> +
> +	return ret;
> +}
> +
>   int __init xe_configfs_init(void)
>   {
>   	int ret;
> diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
> index c14588b86e833..603dd7796c8b2 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.h
> +++ b/drivers/gpu/drm/xe/xe_configfs.h
> @@ -16,7 +16,7 @@ void xe_configfs_exit(void);
>   bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>   void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
>   u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
> -static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
> +bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
>   #else
>   static inline int xe_configfs_init(void) { return 0; }
>   static inline void xe_configfs_exit(void) { }
> 



* Re: [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI
  2025-08-08 17:29 ` [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI Lucas De Marchi
  2025-08-13  1:41   ` Belgaumkar, Vinay
@ 2025-08-13 10:42   ` Matthew Auld
  2025-08-15 21:35     ` Lucas De Marchi
  1 sibling, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2025-08-13 10:42 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe
  Cc: prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Vinay Belgaumkar, Brian Welty

On 08/08/2025 18:29, Lucas De Marchi wrote:
> Requirement for PSMI capture is to have a physically contiguous buffer.
> All the needed configuration is done by the userspace tool directly to
> the GPU via mmio access.
> 
> This interface only support allocating from VRAM regions. For integrated
> devices, the PSMI buffer is in SYSTEM memory and should be allocated by
> userspace using hugetlbfs.
> 
> Here we add the ability to allocate a region of physically contiguous
> memory by writing to debugfs file (listed below). For multi-tile devices,
> the capture tool requires ability to allocate a capture buffer per tile
> (VRAM region) and so user can specify a region_mask. The tool then
> can mmap the buffers via direct mmap of the PCIBAR via sysfs.
> 
> To support the capture tool, 3 new debugfs entries are added:
> 
>     psmi_capture_addr - physical address per VRAM region's capture buffer
>     psmi_capture_region_mask - select which region(s) to allocate a buffer
>     psmi_capture_size - size of current capture buffer
> 
> Writing psmi_capture_size will allocate new buffer of requested size per
> region after freeing any current buffers.
> 
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Original-author: Brian Welty <brian.welty@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> v2:
>   - Fix kernel-doc
>   - Do not walk all region_mask on cleanup: it should never be needed
>   - Replace sysmem checks by asserts as they should never be set
>   - s/debugfs_create/debugfs_register/ and do not pass the root dir:
>     this makes it similar to other parts registering debugfs
>   - Do not export a cleanup function, rather use a init that registers
>     a devm action if needed
>   - Drop modparam in favor of configfs whose attribute will be
>     implemented when everything is ready
> ---
>   drivers/gpu/drm/xe/Makefile          |   1 +
>   drivers/gpu/drm/xe/xe_debugfs.c      |   3 +
>   drivers/gpu/drm/xe/xe_device.c       |   5 +
>   drivers/gpu/drm/xe/xe_device_types.h |   8 +
>   drivers/gpu/drm/xe/xe_psmi.c         | 313 +++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_psmi.h         |  14 ++
>   6 files changed, 344 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 8e0c3412a757c..85b8d3a59ef07 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -98,6 +98,7 @@ xe-y += xe_bb.o \
>   	xe_pcode.o \
>   	xe_pm.o \
>   	xe_preempt_fence.o \
> +	xe_psmi.o \
>   	xe_pt.o \
>   	xe_pt_walk.o \
>   	xe_pxp.o \
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index 0b4a532f7c45c..bc717519502dd 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -20,6 +20,7 @@
>   #include "xe_guc_ads.h"
>   #include "xe_mmio.h"
>   #include "xe_pm.h"
> +#include "xe_psmi.h"
>   #include "xe_pxp_debugfs.h"
>   #include "xe_sriov.h"
>   #include "xe_sriov_pf.h"
> @@ -400,6 +401,8 @@ void xe_debugfs_register(struct xe_device *xe)
>   
>   	xe_pxp_debugfs_register(xe->pxp);
>   
> +	xe_psmi_debugfs_register(xe);
> +
>   	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
>   
>   	if (IS_SRIOV_PF(xe))
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 57edbc63da6f4..62edb39b61fb0 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -54,6 +54,7 @@
>   #include "xe_pcode.h"
>   #include "xe_pm.h"
>   #include "xe_pmu.h"
> +#include "xe_psmi.h"
>   #include "xe_pxp.h"
>   #include "xe_query.h"
>   #include "xe_shrinker.h"
> @@ -908,6 +909,10 @@ int xe_device_probe(struct xe_device *xe)
>   	if (err)
>   		return err;
>   
> +	err = xe_psmi_init(xe);
> +	if (err)
> +		return err;
> +
>   	err = drm_dev_register(&xe->drm, 0);
>   	if (err)
>   		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 01e8fa0d2f9f7..bf9af8d0b84ae 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -576,6 +576,14 @@ struct xe_device {
>   	atomic64_t global_total_pages;
>   #endif
>   
> +	/** @psmi: GPU debugging via additional validation HW */
> +	struct {
> +		/** @psmi.capture_obj: PSMI buffer for VRAM */
> +		struct xe_bo *capture_obj[XE_MAX_TILES_PER_DEVICE + 1];
> +		/** @psmi.region_mask: Mask of valid memory regions */
> +		u8 region_mask;
> +	} psmi;
> +
>   	/* private: */
>   
>   #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_psmi.c b/drivers/gpu/drm/xe/xe_psmi.c
> new file mode 100644
> index 0000000000000..e6a67e85e1bb2
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_psmi.c
> @@ -0,0 +1,313 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <linux/debugfs.h>
> +
> +#include "xe_bo.h"
> +#include "xe_device.h"
> +#include "xe_configfs.h"
> +#include "xe_psmi.h"
> +
> +/*
> + * PSMI capture support
> + *
> + * Requirement for PSMI capture is to have a physically contiguous buffer.  The
> + * PSMI tool owns doing all necessary configuration (MMIO register writes are
> + * done from user-space). However, KMD needs to provide the PSMI tool with the
> + * required physical address of the base of PSMI buffer in case of VRAM.
> + *
> + * VRAM backed PSMI buffer:
> + * Buffer is allocated as GEM object and with XE_BO_CREATE_PINNED_BIT flag which
> + * creates a contiguous allocation. The physical address is returned from
> + * psmi_debugfs_capture_addr_show(). PSMI tool can mmap the buffer via the
> + * PCIBAR through sysfs.
> + *
> + * SYSTEM memory backed PSMI buffer:
> + * Interface here does not support allocating from SYSTEM memory region.  The
> + * PSMI tool needs to allocate memory themselves using hugetlbfs. In order to
> + * get the physical address, user-space can query /proc/[pid]/pagemap. As an
> + * alternative, CMA debugfs could also be used to allocate reserved CMA memory.
> + */
> +
> +static bool psmi_enabled(struct xe_device *xe)
> +{
> +	return xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev));
> +}
> +
> +static void psmi_free_object(struct xe_bo *bo)
> +{
> +	xe_bo_lock(bo, NULL);
> +	xe_bo_unpin(bo);
> +	xe_bo_unlock(bo);
> +	xe_bo_put(bo);
> +}
> +
> +/*
> + * Free PSMI capture buffer objects.
> + */
> +static void psmi_cleanup(struct xe_device *xe)
> +{
> +	unsigned long id, region_mask = xe->psmi.region_mask;
> +	struct xe_bo *bo;
> +
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		bo = xe->psmi.capture_obj[id];
> +		if (bo) {
> +			psmi_free_object(bo);
> +			xe->psmi.capture_obj[id] = NULL;
> +		}
> +	}
> +}
> +
> +static struct xe_bo *psmi_alloc_object(struct xe_device *xe,
> +				       unsigned int id, size_t bo_size)
> +{
> +	struct xe_bo *bo = NULL;
> +	struct xe_tile *tile;
> +	int err;
> +
> +	if (!id || !bo_size)
> +		return NULL;
> +
> +	tile = &xe->tiles[id - 1];
> +
> +	/* VRAM: Allocate GEM object for the capture buffer */
> +	bo = xe_bo_create_locked(xe, tile, NULL, bo_size,
> +				 ttm_bo_type_kernel,
> +				 XE_BO_FLAG_VRAM_IF_DGFX(tile) |
> +				 XE_BO_FLAG_PINNED |
> +				 XE_BO_FLAG_NEEDS_CPU_ACCESS);

It might make sense to add XE_BO_FLAG_PINNED_LATE_RESTORE, assuming this 
memory needs to be saved and restored for the VRAM case. Since it 
could be ~large, it might benefit from using the blitter instead of the CPU?

Also do you want the memory to be pre-cleared here? Currently it just 
gives you back uncleared memory, also with uncleared CCS. If uncleared 
is fine, then should that be documented somewhere for the user so that 
there are no surprises?

> +
> +	if (!IS_ERR(bo)) {
> +		/* Buffer written by HW, ensure stays resident */
> +		err = xe_bo_pin(bo);
> +		if (err)
> +			bo = ERR_PTR(err);
> +		xe_bo_unlock(bo);
> +	}
> +
> +	return bo;
> +}
> +
> +/*
> + * Allocate PSMI capture buffer objects (via debugfs set function), based on
> + * which regions the user has selected in region_mask.  @size: size in bytes
> + * (should be power of 2)
> + *
> + * Always release/free the current buffer objects before attempting to allocate
> + * new ones.  Size == 0 will free all current buffers.
> + *
> + * Note, we don't write any registers as the capture tool is already configuring
> + * all PSMI registers itself via mmio space.
> + */
> +static int psmi_resize_object(struct xe_device *xe, size_t size)
> +{
> +	unsigned long id, region_mask = xe->psmi.region_mask;
> +	struct xe_bo *bo = NULL;
> +	int err = 0;
> +
> +	/*
> +	 * Buddy allocator anyway will roundup to next power of 2,
> +	 * so rather than waste unused pages, require user to ask for
> +	 * power of 2 sized PSMI buffers.

The allocator will internally trim the allocation and give back any 
excess if the size is not a power-of-two, so it might be beneficial to 
drop this restriction. Probably doesn't matter all that much though.
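
For illustration only, here is a minimal userspace sketch of the two size
policies under discussion: the power-of-two check the patch currently applies,
and a relaxed check that accepts any page-aligned size. The helper names and
SKETCH_PAGE_SIZE are hypothetical and not part of the driver, and page
alignment as the relaxed rule is an assumption, not something the patch or
the allocator mandates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page size, used only for this illustration. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Mirrors the check in psmi_resize_object(): zero is accepted (it frees
 * the current buffers), otherwise exactly one bit must be set.
 */
static bool size_ok_pow2(size_t size)
{
	return size == 0 || (size & (size - 1)) == 0;
}

/*
 * Assumed relaxed alternative: accept any multiple of the page size and
 * leave trimming of the excess to the allocator.
 */
static bool size_ok_page_aligned(size_t size)
{
	return (size % SKETCH_PAGE_SIZE) == 0;
}
```

With these helpers, a 12 KiB request (12288 bytes) is rejected by the first
check but accepted by the second.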

> +	 */
> +	if (size && !is_power_of_2(size))
> +		return -EINVAL;
> +
> +	/* if resizing, free currently allocated buffers first */
> +	psmi_cleanup(xe);
> +
> +	/* can set size to 0, in which case, now done */
> +	if (!size)
> +		return 0;
> +
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		bo = psmi_alloc_object(xe, id, size);
> +		if (IS_ERR(bo)) {
> +			err = PTR_ERR(bo);
> +			break;
> +		}
> +		xe->psmi.capture_obj[id] = bo;
> +
> +		drm_info(&xe->drm,
> +			 "PSMI capture size requested: %zu bytes, allocated: %lu:%zu\n",
> +			 size, id, bo ? xe_bo_size(bo) : 0);
> +	}
> +
> +	/* on error, reverse what was allocated */
> +	if (err)
> +		psmi_cleanup(xe);
> +
> +	return err;
> +}
> +
> +/*
> + * Returns an address for the capture tool to use to find start of capture
> + * buffer. Capture tool requires the capability to have a buffer allocated per
> + * each tile (VRAM region), thus we return an address for each region.
> + */
> +static int psmi_debugfs_capture_addr_show(struct seq_file *m, void *data)
> +{
> +	struct xe_device *xe = m->private;
> +	unsigned long id, region_mask;
> +	struct xe_bo *bo;
> +	u64 val;
> +
> +	region_mask = xe->psmi.region_mask;
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		/* VRAM region */
> +		bo = xe->psmi.capture_obj[id];
> +		if (!bo)
> +			continue;
> +
> +		/* pinned, so don't need bo_lock */
> +		val = __xe_bo_addr(bo, 0, PAGE_SIZE);
> +		seq_printf(m, "%ld: 0x%llx\n", id, val);
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Return capture buffer size, using the size from first allocated object that
> + * is found. This works because all objects must be of the same size.
> + */
> +static int psmi_debugfs_capture_size_get(void *data, u64 *val)
> +{
> +	unsigned long id, region_mask;
> +	struct xe_device *xe = data;
> +	struct xe_bo *bo;
> +
> +	region_mask = xe->psmi.region_mask;
> +	for_each_set_bit(id, &region_mask,
> +			 ARRAY_SIZE(xe->psmi.capture_obj)) {
> +		/* smem should never be set */
> +		xe_assert(xe, id);
> +
> +		bo = xe->psmi.capture_obj[id];
> +		if (bo) {
> +			*val = xe_bo_size(bo);
> +			return 0;
> +		}
> +	}
> +
> +	/* no capture objects are allocated */
> +	*val = 0;
> +
> +	return 0;
> +}
> +
> +/*
> + * Set size of PSMI capture buffer. This triggers the allocation of capture
> + * buffer in each memory region as specified with prior write to
> + * psmi_capture_region_mask.
> + */
> +static int psmi_debugfs_capture_size_set(void *data, u64 val)
> +{
> +	struct xe_device *xe = data;
> +
> +	/* user must have specified at least one region */
> +	if (!xe->psmi.region_mask)
> +		return -EINVAL;
> +
> +	return psmi_resize_object(xe, val);
> +}
> +
> +static int psmi_debugfs_capture_region_mask_get(void *data, u64 *val)
> +{
> +	struct xe_device *xe = data;
> +
> +	*val = xe->psmi.region_mask;
> +
> +	return 0;
> +}
> +
> +/*
> + * Select VRAM regions for multi-tile devices, only allowed when buffer is not
> + * currently allocated.
> + */
> +static int psmi_debugfs_capture_region_mask_set(void *data, u64 region_mask)
> +{
> +	struct xe_device *xe = data;
> +	u64 size = 0;
> +
> +	/* SMEM is not supported (see comments at top of file) */
> +	if (region_mask & 0x1)
> +		return -EOPNOTSUPP;
> +
> +	/* input bitmask should contain only valid TTM regions */
> +	if (!region_mask || region_mask & ~xe->info.mem_region_mask)
> +		return -EINVAL;
> +
> +	/* only allow setting mask if buffer is not yet allocated */
> +	psmi_debugfs_capture_size_get(xe, &size);
> +	if (size)
> +		return -EBUSY;
> +
> +	xe->psmi.region_mask = region_mask;
> +
> +	return 0;
> +}
> +
> +DEFINE_SHOW_ATTRIBUTE(psmi_debugfs_capture_addr);
> +
> +DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_region_mask_fops,
> +			 psmi_debugfs_capture_region_mask_get,
> +			 psmi_debugfs_capture_region_mask_set,
> +			 "0x%llx\n");
> +
> +DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_size_fops,
> +			 psmi_debugfs_capture_size_get,
> +			 psmi_debugfs_capture_size_set,
> +			 "%lld\n");
> +
> +void xe_psmi_debugfs_register(struct xe_device *xe)
> +{
> +	struct drm_minor *minor;
> +
> +	if (!psmi_enabled(xe))
> +		return;
> +
> +	minor = xe->drm.primary;
> +	if (!minor->debugfs_root)
> +		return;
> +
> +	debugfs_create_file("psmi_capture_addr",
> +			    0400, minor->debugfs_root, xe,
> +			    &psmi_debugfs_capture_addr_fops);
> +
> +	debugfs_create_file("psmi_capture_region_mask",
> +			    0600, minor->debugfs_root, xe,
> +			    &psmi_debugfs_capture_region_mask_fops);
> +
> +	debugfs_create_file("psmi_capture_size",
> +			    0600, minor->debugfs_root, xe,
> +			    &psmi_debugfs_capture_size_fops);
> +}
> +
> +static void psmi_fini(void *arg)
> +{
> +	psmi_cleanup(arg);
> +}
> +
> +int xe_psmi_init(struct xe_device *xe)
> +{
> +	if (!psmi_enabled(xe))
> +		return 0;
> +
> +	return devm_add_action(xe->drm.dev, psmi_fini, xe);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_psmi.h b/drivers/gpu/drm/xe/xe_psmi.h
> new file mode 100644
> index 0000000000000..b1dfba80d893d
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_psmi.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PSMI_H_
> +#define _XE_PSMI_H_
> +
> +struct xe_device;
> +
> +int xe_psmi_init(struct xe_device *xe);
> +void xe_psmi_debugfs_register(struct xe_device *xe);
> +
> +#endif
> 
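
For readers following the debugfs interface, the mask checks in
psmi_debugfs_capture_region_mask_set() can be exercised in isolation. A
minimal userspace sketch, assuming bit 0 is SMEM as in the quoted patch;
check_region_mask() and its valid_mask parameter are hypothetical
stand-ins for the driver function and xe->info.mem_region_mask:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the checks in the quoted patch: returns 0 if
 * region_mask is acceptable, or a negative errno otherwise.
 */
static int check_region_mask(uint64_t region_mask, uint64_t valid_mask)
{
	/* bit 0 is SMEM, which the patch rejects */
	if (region_mask & 0x1)
		return -EOPNOTSUPP;

	/* must select at least one region, and only valid ones */
	if (!region_mask || (region_mask & ~valid_mask))
		return -EINVAL;

	return 0;
}
```

With valid_mask 0x7, a mask of 0x2 or 0x6 is accepted, 0x3 is rejected
because it includes SMEM, and 0x0 or 0x8 are rejected as invalid. The
-EBUSY check for already-allocated buffers is left out of this sketch.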



* Re: [PATCH v3 09/13] drm/xe/configfs: Block runtime attribute changes
  2025-08-08 17:29 ` [PATCH v3 09/13] drm/xe/configfs: Block runtime attribute changes Lucas De Marchi
@ 2025-08-13 11:03   ` Riana Tauro
  0 siblings, 0 replies; 33+ messages in thread
From: Riana Tauro @ 2025-08-13 11:03 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe; +Cc: prashanth.kumar, dnyaneshwar.bhadane



On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
> Although it's possible to change the attributes in runtime, they have no
> effect after the driver is already bound to the device. Check for that
> and return -EBUSY in that case.
> 
> This should help users understand what's going on when the behavior is
> not changing even if the value from the configfs is "right", but it got
> to that state too late.
> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_configfs.c | 38 ++++++++++++++++++++++++++++++++++++++
>   1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 4c2d4ff6a70f5..489c5c67001dc 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -54,6 +54,8 @@
>    *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
>    *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind  (Enters survivability mode if supported)
>    *
> + * This attribute can only be set before binding to the device.
> + *
>    * Allowed engines:
>    * ----------------
>    *
> @@ -78,6 +80,8 @@
>    * available for migrations, but it's disabled. This is intended for debugging
>    * purposes only.
>    *
> + * This attribute can only be set before binding to the device.
> + *
>    * PSMI
>    * ----
>    *
> @@ -88,6 +92,8 @@
>    *
>    *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
>    *
> + * This attribute can only be set before binding to the device.
> + *
>    * Remove devices
>    * ==============
>    *
> @@ -148,6 +154,29 @@ static struct xe_config_device *to_xe_config_device(struct config_item *item)
>   	return &to_xe_config_group_device(item)->config;
>   }
>   
> +static bool is_bound(struct xe_config_group_device *dev)
> +{
> +	unsigned int domain, bus, slot, function;
> +	struct pci_dev *pdev;
> +	const char *name;
> +	bool ret;
> +
> +	lockdep_assert_held(&dev->lock);
> +
> +	name = dev->group.cg_item.ci_name;
> +	if (sscanf(name, "%x:%x:%x.%x", &domain, &bus, &slot, &function) != 4)
> +		return false;
> +
> +	pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, function));
> +	if (!pdev)
> +		return false;
> +
> +	ret = pci_get_drvdata(pdev);
> +	pci_dev_put(pdev);

It would be useful to have a debug log here to tell users what is 
expected when the device is already bound.

With that,
Reviewed-by: Riana Tauro <riana.tauro@intel.com>

> +
> +	return ret;
> +}
> +
>   static ssize_t survivability_mode_show(struct config_item *item, char *page)
>   {
>   	struct xe_config_device *dev = to_xe_config_device(item);
> @@ -166,6 +195,9 @@ static ssize_t survivability_mode_store(struct config_item *item, const char *pa
>   		return ret;
>   
>   	guard(mutex)(&dev->lock);
> +	if (is_bound(dev))
> +		return -EBUSY;
> +
>   	dev->config.survivability_mode = survivability_mode;
>   
>   	return len;
> @@ -249,6 +281,9 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
>   	}
>   
>   	guard(mutex)(&dev->lock);
> +	if (is_bound(dev))
> +		return -EBUSY;
> +
>   	dev->config.engines_allowed = val;
>   
>   	return len;
> @@ -272,6 +307,9 @@ static ssize_t enable_psmi_store(struct config_item *item, const char *page, siz
>   		return ret;
>   
>   	guard(mutex)(&dev->lock);
> +	if (is_bound(dev))
> +		return -EBUSY;
> +
>   	dev->config.enable_psmi = val;
>   
>   	return len;
> 
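
As a side note on is_bound(): the name parsing it relies on can be tried
in userspace. A minimal sketch, assuming the configfs directory name has
the usual domain:bus:slot.function form; parse_bdf() and the DEVFN()
macro are hypothetical names, with DEVFN() mirroring the kernel's
PCI_DEVFN() encoding:

```c
#include <stdbool.h>
#include <stdio.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/*
 * Hypothetical helper: parse "DDDD:BB:SS.F" with the same format string
 * is_bound() uses. Returns true if all four fields were parsed; the
 * outputs are only meaningful in that case.
 */
static bool parse_bdf(const char *name, unsigned int *domain,
		      unsigned int *bus, unsigned int *devfn)
{
	unsigned int slot, function;

	if (sscanf(name, "%x:%x:%x.%x", domain, bus, &slot, &function) != 4)
		return false;

	*devfn = DEVFN(slot, function);
	return true;
}
```

A name such as "0000:03:00.0" parses to domain 0, bus 3, devfn 0, while a
name missing the function part is rejected, which is the case where
is_bound() returns false.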



* Re: [PATCH v3 11/13] drm/xe/configfs: Improve documentation steps
  2025-08-08 17:29 ` [PATCH v3 11/13] drm/xe/configfs: Improve documentation steps Lucas De Marchi
@ 2025-08-13 11:08   ` Riana Tauro
  0 siblings, 0 replies; 33+ messages in thread
From: Riana Tauro @ 2025-08-13 11:08 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe; +Cc: prashanth.kumar, dnyaneshwar.bhadane



On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
> The steps are roughly:
> 
> 1. Load the module without binding to the device
> 2. Configure the desired device
> 3. Bind the device
> 
> Move the binding part to the "Create devices" since it's not exclusive
> to the survivability_mode attribute and better document the steps.
> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

Reviewed-by: Riana Tauro <riana.tauro@intel.com>

> ---
>   drivers/gpu/drm/xe/xe_configfs.c | 20 ++++++++++++++++----
>   1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 7c92d293ba733..58196b7571239 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -29,11 +29,18 @@
>    * See Documentation/filesystems/configfs.rst for more information about how configfs works.
>    *
>    * Create devices
> - * ===============
> + * ==============
> + *
> + * To create a device, the ``xe`` module should already be loaded, but some
> + * attributes can only be set before binding the device. It can be accomplished
> + * by blocking the driver autoprobe:
>    *
> - * In order to create a device, the user has to create a directory inside ``'xe'``::
> + *	# echo 0 > /sys/bus/pci/drivers_autoprobe
> + *	# modprobe xe
>    *
> - *	mkdir /sys/kernel/config/xe/0000:03:00.0/
> + * In order to create a device, the user has to create a directory inside ``xe``::
> + *
> + *	# mkdir /sys/kernel/config/xe/0000:03:00.0/
>    *
>    * Every device created is populated by the driver with entries that can be
>    * used to configure it::
> @@ -49,6 +56,12 @@
>    *	    ├── engines_allowed
>    *	    └── enable_psmi
>    *
> + * After configuring the attributes as per next section, the device can be
> + * probed with::
> + *
> + *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind
> + *	# # or
> + *	# echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
>    *
>    * Configure Attributes
>    * ====================
> @@ -60,7 +73,6 @@
>    * effect when probing the device. Example to enable it::
>    *
>    *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
> - *	# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind  (Enters survivability mode if supported)
>    *
>    * This attribute can only be set before binding to the device.
>    *
> 



* RE: [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509
  2025-08-08 17:29 ` [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509 Lucas De Marchi
@ 2025-08-13 11:15   ` Bhadane, Dnyaneshwar
  0 siblings, 0 replies; 33+ messages in thread
From: Bhadane, Dnyaneshwar @ 2025-08-13 11:15 UTC (permalink / raw)
  To: De Marchi, Lucas, intel-xe@lists.freedesktop.org
  Cc: Kumar, Prashanth, Belgaumkar, Vinay



> -----Original Message-----
> From: De Marchi, Lucas <lucas.demarchi@intel.com>
> Sent: Friday, August 8, 2025 11:00 PM
> To: intel-xe@lists.freedesktop.org
> Cc: De Marchi, Lucas <lucas.demarchi@intel.com>; Kumar, Prashanth
> <prashanth.kumar@intel.com>; Bhadane, Dnyaneshwar
> <dnyaneshwar.bhadane@intel.com>; Belgaumkar, Vinay
> <vinay.belgaumkar@intel.com>
> Subject: [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509
> 
> From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> 
> This WA ensures GuC will restore the media MCFG registers at C6 exit.
> 
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>


There is a minor conflict when applying this patch.

Dnyaneshwar 

> ---
> v2:
>  - Enable only when PSMI is enabled
> ---
>  drivers/gpu/drm/xe/xe_guc.c        | 3 +++
>  drivers/gpu/drm/xe/xe_guc_fwif.h   | 1 +
>  drivers/gpu/drm/xe/xe_wa_oob.rules | 2 ++
>  3 files changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index cb757a53de856..f55c4a37cfed1 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -219,6 +219,9 @@ static u32 guc_ctl_wa_flags(struct xe_guc *guc)
>  	if (XE_WA(gt, 14018913170))
>  		flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6;
> 
> +	if (XE_WA(gt, 16023683509))
> +		flags |= GUC_WA_SAVE_RESTORE_MCFG_REG_AT_MC6;
> +
>  	return flags;
>  }
> 
> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> index 4dc000c977faf..a169ad0da0d47 100644
> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> @@ -108,6 +108,7 @@ struct guc_update_exec_queue_policy {
>  #define   GUC_WA_RENDER_RST_RC6_EXIT	BIT(19)
>  #define   GUC_WA_RCS_REGS_IN_CCS_REGS_LIST	BIT(21)
>  #define   GUC_WA_ENABLE_TSC_CHECK_ON_RC6	BIT(22)
> +#define   GUC_WA_SAVE_RESTORE_MCFG_REG_AT_MC6	BIT(25)
> 
>  #define GUC_CTL_FEATURE			2
>  #define   GUC_CTL_ENABLE_SLPC		BIT(2)
> diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
> index 303a5e05d9932..fe369e8a01012 100644
> --- a/drivers/gpu/drm/xe/xe_wa_oob.rules
> +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
> @@ -72,6 +72,8 @@ no_media_l3	MEDIA_VERSION(3000)
>  		MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
>  		MEDIA_VERSION(3000), FUNC(xe_rtp_match_psmi_enabled)
>  		MEDIA_VERSION(3002), FUNC(xe_rtp_match_psmi_enabled)
> +16023683509	MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
> +		MEDIA_VERSION(3000), GRAPHICS_STEP(A0, B0), FUNC(xe_rtp_match_psmi_enabled)
> 
>  # SoC workaround - currently applies to all platforms with the following
>  # primary GT GMDID
> 
> --
> 2.50.1



* Re: [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI
  2025-08-13  6:58   ` Riana Tauro
@ 2025-08-13 11:23     ` Lucas De Marchi
  2025-08-13 17:38       ` Riana Tauro
  0 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-13 11:23 UTC (permalink / raw)
  To: Riana Tauro
  Cc: intel-xe, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison

On Wed, Aug 13, 2025 at 12:28:07PM +0530, Riana Tauro wrote:
>
>
>On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
>>Now that additional WAs are in place and it's possible to allocate
>>buffers through debugfs, add the configfs attribute to turn PSMI on.
>>
>>Cc: Matt Roper <matthew.d.roper@intel.com>
>>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>Cc: John Harrison <John.C.Harrison@Intel.com>
>>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>---
>>  drivers/gpu/drm/xe/xe_configfs.c | 66 +++++++++++++++++++++++++++++++++++++---
>>  drivers/gpu/drm/xe/xe_configfs.h |  2 +-
>>  2 files changed, 63 insertions(+), 5 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
>>index 17b1d6ae1ff6a..8cf6b1375b7d4 100644
>>--- a/drivers/gpu/drm/xe/xe_configfs.c
>>+++ b/drivers/gpu/drm/xe/xe_configfs.c
>>@@ -77,6 +77,16 @@
>>   * available for migrations, but it's disabled. This is intended for debugging
>>   * purposes only.
>>   *
>>+ * PSMI
>>+ * ----
>>+ *
>>+ * Enable extra debugging capabilities to trace engine execution. Only useful
>>+ * during early platform enabling and requiring additional hardware connected.
>
>%s/requiring/requires
>
>>+ * Once it's enabled, additionals WAs are added and runtime configuration is
>>+ * done via debugfs. Example to enable it::
>>+ *
>>+ *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
>>+ *
>>   * Remove devices
>>   * ==============
>>   *
>>@@ -89,8 +99,9 @@ struct xe_config_group_device {
>>  	struct config_group group;
>>  	struct xe_config_device {
>>-		bool survivability_mode;
>>  		u64 engines_allowed;
>>+		bool survivability_mode;
>>+		bool enable_psmi;
>>  	} config;
>>  	/* protects attributes */
>>@@ -98,8 +109,9 @@ struct xe_config_group_device {
>>  };
>>  static const struct xe_config_device device_defaults = {
>>-	.survivability_mode = false,
>>  	.engines_allowed = U64_MAX,
>>+	.survivability_mode = false,
>>+	.enable_psmi = false,
>>  };
>>  static void set_device_defaults(struct xe_config_device *config)
>>@@ -243,12 +255,38 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
>>  	return len;
>>  }
>>-CONFIGFS_ATTR(, survivability_mode);
>>+static ssize_t enable_psmi_show(struct config_item *item, char *page)
>>+{
>>+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
>
>To have consistency with other functions, we can use
>
>struct xe_config_device *dev = to_xe_config_device(item);

The struct and function got renamed a week or so ago. I think you don't
have this commit in the codebase you are looking at:

88df7939d728 ("drm/xe/configfs: Rename struct xe_config_device")

thanks
Lucas De Marchi

>
>>+
>>+	return sprintf(page, "%d\n", dev->config.enable_psmi);
>>+}
>>+
>>+static ssize_t enable_psmi_store(struct config_item *item, const char *page, size_t len)
>>+{
>>+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
>>+	bool val;
>>+	int ret;
>>+
>>+	ret = kstrtobool(page, &val);
>>+	if (ret)
>>+		return ret;
>>+
>>+	mutex_lock(&dev->lock);
>>+	dev->config.enable_psmi = val;
>>+	mutex_unlock(&dev->lock);
>>+
>>+	return len;
>>+}
>>+
>>  CONFIGFS_ATTR(, engines_allowed);
>>+CONFIGFS_ATTR(, survivability_mode);
>>+CONFIGFS_ATTR(, enable_psmi);
>
>alphabetical?
>
>Thanks
>Riana
>
>>  static struct configfs_attribute *xe_config_device_attrs[] = {
>>-	&attr_survivability_mode,
>>  	&attr_engines_allowed,
>>+	&attr_survivability_mode,
>>+	&attr_enable_psmi,
>>  	NULL,
>>  };
>>@@ -443,6 +481,26 @@ u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
>>  	return engines_allowed;
>>  }
>>+/**
>>+ * xe_configfs_get_psmi_enabled - get configfs enable_psmi setting
>>+ * @pdev: pci device
>>+ *
>>+ * Return: enable_psmi setting in configfs
>>+ */
>>+bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)
>>+{
>>+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
>>+	bool ret;
>>+
>>+	if (!dev)
>>+		return false;
>>+
>>+	ret = dev->config.enable_psmi;
>>+	config_item_put(&dev->group.cg_item);
>>+
>>+	return ret;
>>+}
>>+
>>  int __init xe_configfs_init(void)
>>  {
>>  	int ret;
>>diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
>>index c14588b86e833..603dd7796c8b2 100644
>>--- a/drivers/gpu/drm/xe/xe_configfs.h
>>+++ b/drivers/gpu/drm/xe/xe_configfs.h
>>@@ -16,7 +16,7 @@ void xe_configfs_exit(void);
>>  bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>>  void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
>>  u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
>>-static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
>>+bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
>>  #else
>>  static inline int xe_configfs_init(void) { return 0; }
>>  static inline void xe_configfs_exit(void) { }
>>
>


* Re: [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI
  2025-08-13 11:23     ` Lucas De Marchi
@ 2025-08-13 17:38       ` Riana Tauro
  0 siblings, 0 replies; 33+ messages in thread
From: Riana Tauro @ 2025-08-13 17:38 UTC (permalink / raw)
  To: Lucas De Marchi
  Cc: intel-xe, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison

Hi Lucas

On 8/13/2025 4:53 PM, Lucas De Marchi wrote:
> On Wed, Aug 13, 2025 at 12:28:07PM +0530, Riana Tauro wrote:
>>
>>
>> On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
>>> Now that additional WAs are in place and it's possible to allocate
>>> buffers through debugfs, add the configfs attribute to turn PSMI on.
>>>
>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>> ---
>>>  drivers/gpu/drm/xe/xe_configfs.c | 66 ++++++++++++++++++++++++++++++ 
>>> +++++++---
>>>  drivers/gpu/drm/xe/xe_configfs.h |  2 +-
>>>  2 files changed, 63 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
>>> index 17b1d6ae1ff6a..8cf6b1375b7d4 100644
>>> --- a/drivers/gpu/drm/xe/xe_configfs.c
>>> +++ b/drivers/gpu/drm/xe/xe_configfs.c
>>> @@ -77,6 +77,16 @@
>>>   * available for migrations, but it's disabled. This is intended for debugging
>>>   * purposes only.
>>>   *
>>> + * PSMI
>>> + * ----
>>> + *
>>> + * Enable extra debugging capabilities to trace engine execution. Only useful
>>> + * during early platform enabling and requiring additional hardware connected.
>>
>> %s/requiring/requires
>>
>>> + * Once it's enabled, additionals WAs are added and runtime configuration is
>>> + * done via debugfs. Example to enable it::
>>> + *
>>> + *    # echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi
>>> + *
>>>   * Remove devices
>>>   * ==============
>>>   *
>>> @@ -89,8 +99,9 @@ struct xe_config_group_device {
>>>      struct config_group group;
>>>      struct xe_config_device {
>>> -        bool survivability_mode;
>>>          u64 engines_allowed;
>>> +        bool survivability_mode;
>>> +        bool enable_psmi;
>>>      } config;
>>>      /* protects attributes */
>>> @@ -98,8 +109,9 @@ struct xe_config_group_device {
>>>  };
>>>  static const struct xe_config_device device_defaults = {
>>> -    .survivability_mode = false,
>>>      .engines_allowed = U64_MAX,
>>> +    .survivability_mode = false,
>>> +    .enable_psmi = false,
>>>  };
>>>  static void set_device_defaults(struct xe_config_device *config)
>>> @@ -243,12 +255,38 @@ static ssize_t engines_allowed_store(struct config_item *item, const char *page,
>>>      return len;
>>>  }
>>> -CONFIGFS_ATTR(, survivability_mode);
>>> +static ssize_t enable_psmi_show(struct config_item *item, char *page)
>>> +{
>>> +    struct xe_config_group_device *dev = to_xe_config_group_device(item);
>>
>> To have consistency with other functions, we can use
>>
>> struct xe_config_device *dev = to_xe_config_device(item);
> 
> The struct and function got renamed a week or so ago. I think you don't
> have this commit in the codebase you are looking at:
> 

I do have that commit. It was changed again in

3c643f621621 ("drm/xe/configfs: Reintroduce struct xe_config_device")

My suggestion is trivial, so it is not necessary; the code works fine as is.

Thanks
Riana

> 88df7939d728 ("drm/xe/configfs: Rename struct xe_config_device")
> 
> thanks
> Lucas De Marchi
> 
>>
>>> +
>>> +    return sprintf(page, "%d\n", dev->config.enable_psmi);
>>> +}
>>> +
>>> +static ssize_t enable_psmi_store(struct config_item *item, const char *page, size_t len)
>>> +{
>>> +    struct xe_config_group_device *dev = to_xe_config_group_device(item);
>>> +    bool val;
>>> +    int ret;
>>> +
>>> +    ret = kstrtobool(page, &val);
>>> +    if (ret)
>>> +        return ret;
>>> +
>>> +    mutex_lock(&dev->lock);
>>> +    dev->config.enable_psmi = val;
>>> +    mutex_unlock(&dev->lock);
>>> +
>>> +    return len;
>>> +}
>>> +
>>>  CONFIGFS_ATTR(, engines_allowed);
>>> +CONFIGFS_ATTR(, survivability_mode);
>>> +CONFIGFS_ATTR(, enable_psmi);
>>
>> alphabetical?
>>
>> Thanks
>> Riana
>>
>>>  static struct configfs_attribute *xe_config_device_attrs[] = {
>>> -    &attr_survivability_mode,
>>>      &attr_engines_allowed,
>>> +    &attr_survivability_mode,
>>> +    &attr_enable_psmi,
>>>      NULL,
>>>  };
>>> @@ -443,6 +481,26 @@ u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)
>>>      return engines_allowed;
>>>  }
>>> +/**
>>> + * xe_configfs_get_psmi_enabled - get configfs enable_psmi setting
>>> + * @pdev: pci device
>>> + *
>>> + * Return: enable_psmi setting in configfs
>>> + */
>>> +bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)
>>> +{
>>> +    struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
>>> +    bool ret;
>>> +
>>> +    if (!dev)
>>> +        return false;
>>> +
>>> +    ret = dev->config.enable_psmi;
>>> +    config_item_put(&dev->group.cg_item);
>>> +
>>> +    return ret;
>>> +}
>>> +
>>>  int __init xe_configfs_init(void)
>>>  {
>>>      int ret;
>>> diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
>>> index c14588b86e833..603dd7796c8b2 100644
>>> --- a/drivers/gpu/drm/xe/xe_configfs.h
>>> +++ b/drivers/gpu/drm/xe/xe_configfs.h
>>> @@ -16,7 +16,7 @@ void xe_configfs_exit(void);
>>>  bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>>>  void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
>>>  u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
>>> -static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
>>> +bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
>>>  #else
>>>  static inline int xe_configfs_init(void) { return 0; }
>>>  static inline void xe_configfs_exit(void) { }
>>>
>>



* Re: [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231
  2025-08-08 17:29 ` [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231 Lucas De Marchi
@ 2025-08-13 17:44   ` Riana Tauro
  2025-08-14 11:13     ` Lucas De Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Riana Tauro @ 2025-08-13 17:44 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe
  Cc: prashanth.kumar, dnyaneshwar.bhadane, Badal Nilawar



On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
> From: Badal Nilawar <badal.nilawar@intel.com>
> 
> Enable Wa 14020001231 to block psmi interrupts during C6 entry exit
> flow. It's only enabled if PSMI is enabled in runtime.
> 
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> v2:
>   - Enable only when PSMI is enabled
> ---
>   drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 1 +
>   drivers/gpu/drm/xe/xe_guc_ads.c       | 4 ++++
>   drivers/gpu/drm/xe/xe_wa_oob.rules    | 4 ++++
>   3 files changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> index 31dbfeee289e7..0e78351c6ef5a 100644
> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> @@ -390,6 +390,7 @@ enum  {
>    */
>   enum xe_guc_klv_ids {
>   	GUC_WORKAROUND_KLV_BLOCK_INTERRUPTS_WHEN_MGSR_BLOCKED				= 0x9002,
> +	GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT		= 0x9004,
>   	GUC_WORKAROUND_KLV_ID_GAM_PFQ_SHADOW_TAIL_POLLING				= 0x9005,
>   	GUC_WORKAROUND_KLV_ID_DISABLE_MTP_DURING_ASYNC_COMPUTE				= 0x9007,
>   	GUC_WA_KLV_NP_RD_WRITE_TO_CLEAR_RCSM_AT_CGP_LATE_RESTORE			= 0x9008,
> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
> index 2ceaa197cb2f0..c42fc78798ca0 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
> @@ -359,6 +359,10 @@ static void guc_waklv_init(struct xe_guc_ads *ads)
>   				 GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG);
>   	}
>   
> +	if (XE_WA(gt, 14020001231))


The commit below renamed XE_WA to XE_GT_WA, so both WA patches need to be
updated accordingly:
4d5c98eb77fe ("drm/xe: rename XE_WA to XE_GT_WA")

Thanks
Riana

> +		guc_waklv_enable(ads, NULL, 0, &offset, &remain,
> +				 GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT);
> +
>   	size = guc_ads_waklv_size(ads) - remain;
>   	if (!size)
>   		return;
> diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
> index 8d0aabab67773..303a5e05d9932 100644
> --- a/drivers/gpu/drm/xe/xe_wa_oob.rules
> +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
> @@ -68,6 +68,10 @@ no_media_l3	MEDIA_VERSION(3000)
>   		MEDIA_VERSION_RANGE(1300, 3000)
>   		MEDIA_VERSION(3002)
>   		GRAPHICS_VERSION(3003)
> +14020001231	GRAPHICS_VERSION_RANGE(2001,2004), FUNC(xe_rtp_match_psmi_enabled)
> +		MEDIA_VERSION(2000), FUNC(xe_rtp_match_psmi_enabled)
> +		MEDIA_VERSION(3000), FUNC(xe_rtp_match_psmi_enabled)
> +		MEDIA_VERSION(3002), FUNC(xe_rtp_match_psmi_enabled)
>   
>   # SoC workaround - currently applies to all platforms with the following
>   # primary GT GMDID
> 



* Re: [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231
  2025-08-13 17:44   ` Riana Tauro
@ 2025-08-14 11:13     ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-14 11:13 UTC (permalink / raw)
  To: Riana Tauro; +Cc: intel-xe, prashanth.kumar, dnyaneshwar.bhadane, Badal Nilawar

On Wed, Aug 13, 2025 at 11:14:57PM +0530, Riana Tauro wrote:
>
>
>On 8/8/2025 10:59 PM, Lucas De Marchi wrote:
>>From: Badal Nilawar <badal.nilawar@intel.com>
>>
>>Enable Wa 14020001231 to block psmi interrupts during C6 entry exit
>>flow. It's only enabled if PSMI is enabled in runtime.
>>
>>Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
>>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>---
>>v2:
>>  - Enable only when PSMI is enabled
>>---
>>  drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 1 +
>>  drivers/gpu/drm/xe/xe_guc_ads.c       | 4 ++++
>>  drivers/gpu/drm/xe/xe_wa_oob.rules    | 4 ++++
>>  3 files changed, 9 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>>index 31dbfeee289e7..0e78351c6ef5a 100644
>>--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>>+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>>@@ -390,6 +390,7 @@ enum  {
>>   */
>>  enum xe_guc_klv_ids {
>>  	GUC_WORKAROUND_KLV_BLOCK_INTERRUPTS_WHEN_MGSR_BLOCKED				= 0x9002,
>>+	GUC_WORKAROUND_KLV_DISABLE_PSMI_INTERRUPTS_AT_C6_ENTRY_RESTORE_AT_EXIT		= 0x9004,
>>  	GUC_WORKAROUND_KLV_ID_GAM_PFQ_SHADOW_TAIL_POLLING				= 0x9005,
>>  	GUC_WORKAROUND_KLV_ID_DISABLE_MTP_DURING_ASYNC_COMPUTE				= 0x9007,
>>  	GUC_WA_KLV_NP_RD_WRITE_TO_CLEAR_RCSM_AT_CGP_LATE_RESTORE			= 0x9008,
>>diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
>>index 2ceaa197cb2f0..c42fc78798ca0 100644
>>--- a/drivers/gpu/drm/xe/xe_guc_ads.c
>>+++ b/drivers/gpu/drm/xe/xe_guc_ads.c
>>@@ -359,6 +359,10 @@ static void guc_waklv_init(struct xe_guc_ads *ads)
>>  				 GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG);
>>  	}
>>+	if (XE_WA(gt, 14020001231))
>
>
>This commit has changed XE_WA to XE_GT_WA. Both WA patches need this change
>4d5c98eb77fe ("drm/xe: rename XE_WA to XE_GT_WA")

yep, I was waiting for reviews and comments to settle/resolve before
spinning a new version.

Lucas De Marchi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 03/13] drm/xe/rtp: Add match for psmi
  2025-08-08 17:29 ` [PATCH v3 03/13] drm/xe/rtp: Add match for psmi Lucas De Marchi
@ 2025-08-14 21:28   ` Belgaumkar, Vinay
  0 siblings, 0 replies; 33+ messages in thread
From: Belgaumkar, Vinay @ 2025-08-14 21:28 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe; +Cc: prashanth.kumar, dnyaneshwar.bhadane


On 8/8/2025 10:29 AM, Lucas De Marchi wrote:
> Add match to be used on WAs for only enabling workarounds if psmi is
> intended to be used.
>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

LGTM,

Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>

> ---
>   drivers/gpu/drm/xe/xe_rtp.c | 7 +++++++
>   drivers/gpu/drm/xe/xe_rtp.h | 3 +++
>   2 files changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_rtp.c b/drivers/gpu/drm/xe/xe_rtp.c
> index 95571b87aa73c..47ea1521dc80c 100644
> --- a/drivers/gpu/drm/xe/xe_rtp.c
> +++ b/drivers/gpu/drm/xe/xe_rtp.c
> @@ -9,6 +9,7 @@
>   
>   #include <uapi/drm/xe_drm.h>
>   
> +#include "xe_configfs.h"
>   #include "xe_gt.h"
>   #include "xe_gt_topology.h"
>   #include "xe_macros.h"
> @@ -363,3 +364,9 @@ bool xe_rtp_match_not_sriov_vf(const struct xe_gt *gt,
>   {
>   	return !IS_SRIOV_VF(gt_to_xe(gt));
>   }
> +
> +bool xe_rtp_match_psmi_enabled(const struct xe_gt *gt,
> +			       const struct xe_hw_engine *hwe)
> +{
> +	return xe_configfs_get_psmi_enabled(to_pci_dev(gt_to_xe(gt)->drm.dev));
> +}
> diff --git a/drivers/gpu/drm/xe/xe_rtp.h b/drivers/gpu/drm/xe/xe_rtp.h
> index 5ed6c14b9ae34..7951fefdbe044 100644
> --- a/drivers/gpu/drm/xe/xe_rtp.h
> +++ b/drivers/gpu/drm/xe/xe_rtp.h
> @@ -477,4 +477,7 @@ bool xe_rtp_match_first_render_or_compute(const struct xe_gt *gt,
>   bool xe_rtp_match_not_sriov_vf(const struct xe_gt *gt,
>   			       const struct xe_hw_engine *hwe);
>   
> +bool xe_rtp_match_psmi_enabled(const struct xe_gt *gt,
> +			       const struct xe_hw_engine *hwe);
> +
>   #endif
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in documentation
  2025-08-08 17:29 ` [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in documentation Lucas De Marchi
@ 2025-08-14 21:31   ` Bhadane, Dnyaneshwar
  0 siblings, 0 replies; 33+ messages in thread
From: Bhadane, Dnyaneshwar @ 2025-08-14 21:31 UTC (permalink / raw)
  To: De Marchi, Lucas, intel-xe@lists.freedesktop.org; +Cc: Kumar, Prashanth



> -----Original Message-----
> From: De Marchi, Lucas <lucas.demarchi@intel.com>
> Sent: Friday, August 8, 2025 11:00 PM
> To: intel-xe@lists.freedesktop.org
> Cc: De Marchi, Lucas <lucas.demarchi@intel.com>; Kumar, Prashanth
> <prashanth.kumar@intel.com>; Bhadane, Dnyaneshwar
> <dnyaneshwar.bhadane@intel.com>
> Subject: [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in
> documentation
> 
> When documenting the directories, use an output similar to the `tree`
> command and add VFs and missing attributes.
> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
LGTM,
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>


>  drivers/gpu/drm/xe/xe_configfs.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c
> b/drivers/gpu/drm/xe/xe_configfs.c
> index 489c5c67001dc..7c92d293ba733 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -39,8 +39,16 @@
>   * used to configure it::
>   *
>   *	/sys/kernel/config/xe/
> - *		.. 0000:03:00.0/
> - *			... survivability_mode
> + *	├── 0000:00:02.0
> + *	│   └── ...
> + *	├── 0000:00:02.1
> + *	│   └── ...
> + *	:
> + *	└── 0000:03:00.0
> + *	    ├── survivability_mode
> + *	    ├── engines_allowed
> + *	    └── enable_psmi
> + *
>   *
>   * Configure Attributes
>   * ====================
> 
> --
> 2.50.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 13/13] drm/xe/configfs: Dump custom settings when binding
  2025-08-08 17:29 ` [PATCH v3 13/13] drm/xe/configfs: Dump custom settings when binding Lucas De Marchi
@ 2025-08-15  0:48   ` Belgaumkar, Vinay
  0 siblings, 0 replies; 33+ messages in thread
From: Belgaumkar, Vinay @ 2025-08-15  0:48 UTC (permalink / raw)
  To: Lucas De Marchi, intel-xe
  Cc: prashanth.kumar, dnyaneshwar.bhadane, Michal Wajdeczko,
	John Harrison


On 8/8/2025 10:29 AM, Lucas De Marchi wrote:
> Device configuration using configfs could be prepared a long time prior to
> the driver load. Currently all the xe configfs entries are for things
> that are important to have in the log if a non-default value is being
> used. Add an info-level message about that with the individual entries
> that differ from the default.
>
> Based on previous patch by Michal Wajdeczko.
>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_configfs.c | 39 +++++++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_configfs.h |  2 ++
>   drivers/gpu/drm/xe/xe_pci.c      |  3 +++
>   3 files changed, 44 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 3b9d24c0bb588..9a283b713ff9d 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -480,6 +480,45 @@ static struct xe_config_group_device *find_xe_config_group_device(struct pci_dev
>   	return to_xe_config_group_device(item);
>   }
>   
> +static void dump_custom_dev_config(struct pci_dev *pdev,
> +				   struct xe_config_group_device *dev)
> +{
> +#define PRI_CUSTOM_ATTR(fmt_, attr_) do { \
> +		if (dev->config.attr_ != device_defaults.attr_) \
> +			pci_info(pdev, "configfs: " __stringify(attr_) " = " fmt_ "\n", \
> +				 dev->config.attr_); \
> +	} while (0)
> +
> +	PRI_CUSTOM_ATTR("%llx", engines_allowed);
> +	PRI_CUSTOM_ATTR("%d", enable_psmi);
> +	PRI_CUSTOM_ATTR("%d", survivability_mode);
> +
> +#undef PRI_CUSTOM_ATTR
> +}
> +
> +/**
> + * xe_configfs_check_device() - Test if device was configured by configfs
> + * @pdev: the &pci_dev device to test
> + *
> + * Try to find the configfs group that belongs to the specified pci device
> + * and print a diagnostic message if found.
We print only if custom ones are found right? Not otherwise.
> + */
> +void xe_configfs_check_device(struct pci_dev *pdev)
> +{
> +	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
> +
> +	if (!dev)
> +		return;
> +
> +	/* memcmp here is safe as both as zero-initialized */

s/as/are?

With the above nits resolved,

Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>

> +	if (memcmp(&dev->config, &device_defaults, sizeof(dev->config))) {
> +		pci_info(pdev, "Found custom settings in configfs\n");
> +		dump_custom_dev_config(pdev, dev);
> +	}
> +
> +	config_group_put(&dev->group);
> +}
> +
>   /**
>    * xe_configfs_get_survivability_mode - get configfs survivability mode attribute
>    * @pdev: pci device
> diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
> index 603dd7796c8b2..58c8c31640008 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.h
> +++ b/drivers/gpu/drm/xe/xe_configfs.h
> @@ -13,6 +13,7 @@ struct pci_dev;
>   #if IS_ENABLED(CONFIG_CONFIGFS_FS)
>   int xe_configfs_init(void);
>   void xe_configfs_exit(void);
> +void xe_configfs_check_device(struct pci_dev *pdev);
>   bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>   void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
>   u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
> @@ -20,6 +21,7 @@ bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
>   #else
>   static inline int xe_configfs_init(void) { return 0; }
>   static inline void xe_configfs_exit(void) { }
> +static inline void xe_configfs_check_device(struct pci_dev *pdev) { }
>   static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
>   static inline void xe_configfs_clear_survivability_mode(struct pci_dev *pdev) { }
>   static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index 52d46c66ae1eb..9ce6e6dca5bc7 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -17,6 +17,7 @@
>   
>   #include "display/xe_display.h"
>   #include "regs/xe_gt_regs.h"
> +#include "xe_configfs.h"
>   #include "xe_device.h"
>   #include "xe_drv.h"
>   #include "xe_gt.h"
> @@ -771,6 +772,8 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	struct xe_device *xe;
>   	int err;
>   
> +	xe_configfs_check_device(pdev);
> +
>   	if (desc->require_force_probe && !id_forced(pdev->device)) {
>   		dev_info(&pdev->dev,
>   			 "Your graphics device %04x is not officially supported\n"
>
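
The reporting pattern in the quoted dump_custom_dev_config() can be tried outside the kernel. What follows is only a user-space sketch under stated assumptions: struct dev_config, dump_custom_config() and the default values are hypothetical stand-ins (the engines_allowed default mirrors the U64_MAX returned by the stub in xe_configfs.h), and fprintf() replaces pci_info(). It compares field by field instead of using memcmp(), so it does not depend on padding bytes being zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the per-device configfs settings. */
struct dev_config {
	uint64_t engines_allowed;
	bool enable_psmi;
	bool survivability_mode;
};

/* Assumed defaults, for illustration only. */
static const struct dev_config device_defaults = {
	.engines_allowed = UINT64_MAX,
	.enable_psmi = false,
	.survivability_mode = false,
};

/*
 * Print every attribute that differs from its default and return how
 * many lines were printed. The macro stringifies the field name so the
 * log line and the comparison cannot drift apart.
 */
static int dump_custom_config(FILE *out, const struct dev_config *cfg)
{
	int printed = 0;

	assert(out && cfg);

#define PRI_CUSTOM_ATTR(fmt_, cast_, attr_) do { \
		if (cfg->attr_ != device_defaults.attr_) { \
			fprintf(out, "configfs: " #attr_ " = " fmt_ "\n", \
				(cast_)cfg->attr_); \
			printed++; \
		} \
	} while (0)

	PRI_CUSTOM_ATTR("%llx", unsigned long long, engines_allowed);
	PRI_CUSTOM_ATTR("%d", int, enable_psmi);
	PRI_CUSTOM_ATTR("%d", int, survivability_mode);

#undef PRI_CUSTOM_ATTR

	return printed;
}
```

A configuration equal to the defaults prints nothing, which matches the review comment above that only custom settings are reported.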

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI
  2025-08-13  0:38   ` Belgaumkar, Vinay
@ 2025-08-15 21:34     ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-15 21:34 UTC (permalink / raw)
  To: Belgaumkar, Vinay
  Cc: intel-xe, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Daniele Ceraolo Spurio, John Harrison

On Tue, Aug 12, 2025 at 05:38:14PM -0700, Belgaumkar, Vinay wrote:
>
>On 8/8/2025 10:29 AM, Lucas De Marchi wrote:
>>PSMI allows to capture data from the GPU useful for early
>>validation. From the kernel side there isn't much to be done, just a few
>>things:
>>
>>	1) Toggle the feature support in GuC
>>	2) Enable some additional WAs
>>	3) Allocate buffers
>>
>>Here is the first step, with the next ones to follow. For now everything
>>is disabled through a check in configfs that is currently hardcoded to
>>disabled.
>>
>>Cc: Matt Roper <matthew.d.roper@intel.com>
>>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>Cc: John Harrison <John.C.Harrison@Intel.com>
>>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>---
>>  drivers/gpu/drm/xe/xe_configfs.h | 2 ++
>>  drivers/gpu/drm/xe/xe_guc.c      | 7 ++++++-
>>  drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
>>  3 files changed, 9 insertions(+), 1 deletion(-)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
>>index fb87640080896..c14588b86e833 100644
>>--- a/drivers/gpu/drm/xe/xe_configfs.h
>>+++ b/drivers/gpu/drm/xe/xe_configfs.h
>>@@ -16,12 +16,14 @@ void xe_configfs_exit(void);
>>  bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>>  void xe_configfs_clear_survivability_mode(struct pci_dev *pdev);
>>  u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
>>+static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
>>  #else
>>  static inline int xe_configfs_init(void) { return 0; }
>>  static inline void xe_configfs_exit(void) { }
>>  static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
>>  static inline void xe_configfs_clear_survivability_mode(struct pci_dev *pdev) { }
>>  static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
>>+static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
>>  #endif
>>  #endif
>>diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>>index 9e34401e4489f..cb757a53de856 100644
>>--- a/drivers/gpu/drm/xe/xe_guc.c
>>+++ b/drivers/gpu/drm/xe/xe_guc.c
>>@@ -16,6 +16,7 @@
>>  #include "regs/xe_guc_regs.h"
>>  #include "regs/xe_irq_regs.h"
>>  #include "xe_bo.h"
>>+#include "xe_configfs.h"
>>  #include "xe_device.h"
>>  #include "xe_force_wake.h"
>>  #include "xe_gt.h"
>>@@ -81,11 +82,15 @@ static u32 guc_ctl_debug_flags(struct xe_guc *guc)
>>  static u32 guc_ctl_feature_flags(struct xe_guc *guc)
>>  {
>>+	struct xe_device *xe = guc_to_xe(guc);
>>  	u32 flags = GUC_CTL_ENABLE_LITE_RESTORE;
>>-	if (!guc_to_xe(guc)->info.skip_guc_pc)
>>+	if (!xe->info.skip_guc_pc)
>>  		flags |= GUC_CTL_ENABLE_SLPC;
>>+	if (xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev)))
>>+		flags |= GUC_CTL_ENABLE_PSMI;
>>+
>>  	return flags;
>>  }
>>diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>index ca9f999d38d1e..4dc000c977faf 100644
>>--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>>+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>@@ -112,6 +112,7 @@ struct guc_update_exec_queue_policy {
>>  #define GUC_CTL_FEATURE			2
>>  #define   GUC_CTL_ENABLE_SLPC		BIT(2)
>>  #define   GUC_CTL_ENABLE_LITE_RESTORE	BIT(4)
>>+#define   GUC_CTL_ENABLE_PSMI		BIT(7)
>
>Should we have this as GUC_CTL_ENABLE_PSMI_LOGGING to match the GuC 
>nomenclature?

done, thanks.

Lucas De Marchi

>
>Thanks,
>
>Vinay.
>
>>  #define   GUC_CTL_DISABLE_SCHEDULER	BIT(14)
>>  #define GUC_CTL_DEBUG			3
>>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI
  2025-08-13 10:42   ` Matthew Auld
@ 2025-08-15 21:35     ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2025-08-15 21:35 UTC (permalink / raw)
  To: Matthew Auld
  Cc: intel-xe, prashanth.kumar, dnyaneshwar.bhadane, Matt Roper,
	Vinay Belgaumkar, Brian Welty

On Wed, Aug 13, 2025 at 11:42:07AM +0100, Matthew Auld wrote:
>On 08/08/2025 18:29, Lucas De Marchi wrote:
>>Requirement for PSMI capture is to have a physically contiguous buffer.
>>All the needed configuration is done by the userspace tool directly to
>>the GPU via mmio access.
>>
>>This interface only supports allocating from VRAM regions. For integrated
>>devices, the PSMI buffer is in SYSTEM memory and should be allocated by
>>userspace using hugetlbfs.
>>
>>Here we add the ability to allocate a region of physically contiguous
>>memory by writing to debugfs file (listed below). For multi-tile devices,
>>the capture tool requires ability to allocate a capture buffer per tile
>>(VRAM region) and so user can specify a region_mask. The tool then
>>can mmap the buffers via direct mmap of the PCIBAR via sysfs.
>>
>>To support the capture tool, 3 new debugfs entries are added:
>>
>>    psmi_capture_addr - physical address per VRAM region's capture buffer
>>    psmi_capture_region_mask - select which region(s) to allocate a buffer
>>    psmi_capture_size - size of current capture buffer
>>
>>Writing psmi_capture_size will allocate new buffer of requested size per
>>region after freeing any current buffers.
>>
>>Cc: Matt Roper <matthew.d.roper@intel.com>
>>Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>>Original-author: Brian Welty <brian.welty@intel.com>
>>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>---
>>v2:
>>  - Fix kernel-doc
>>  - Do not walk all region_mask on cleanup: it should never be needed
>>  - Replace sysmem checks by asserts as they should never be set
>>  - s/debugfs_create/debugfs_register/ and do not pass the root dir:
>>    this makes it similar to other parts registering debugfs
>>  - Do not export a cleanup function, rather use a init that registers
>>    a devm action if needed
>>  - Drop modparam in favor of configfs whose attribute will be
>>    implemented when everything is ready
>>---
>>  drivers/gpu/drm/xe/Makefile          |   1 +
>>  drivers/gpu/drm/xe/xe_debugfs.c      |   3 +
>>  drivers/gpu/drm/xe/xe_device.c       |   5 +
>>  drivers/gpu/drm/xe/xe_device_types.h |   8 +
>>  drivers/gpu/drm/xe/xe_psmi.c         | 313 +++++++++++++++++++++++++++++++++++
>>  drivers/gpu/drm/xe/xe_psmi.h         |  14 ++
>>  6 files changed, 344 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>>index 8e0c3412a757c..85b8d3a59ef07 100644
>>--- a/drivers/gpu/drm/xe/Makefile
>>+++ b/drivers/gpu/drm/xe/Makefile
>>@@ -98,6 +98,7 @@ xe-y += xe_bb.o \
>>  	xe_pcode.o \
>>  	xe_pm.o \
>>  	xe_preempt_fence.o \
>>+	xe_psmi.o \
>>  	xe_pt.o \
>>  	xe_pt_walk.o \
>>  	xe_pxp.o \
>>diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
>>index 0b4a532f7c45c..bc717519502dd 100644
>>--- a/drivers/gpu/drm/xe/xe_debugfs.c
>>+++ b/drivers/gpu/drm/xe/xe_debugfs.c
>>@@ -20,6 +20,7 @@
>>  #include "xe_guc_ads.h"
>>  #include "xe_mmio.h"
>>  #include "xe_pm.h"
>>+#include "xe_psmi.h"
>>  #include "xe_pxp_debugfs.h"
>>  #include "xe_sriov.h"
>>  #include "xe_sriov_pf.h"
>>@@ -400,6 +401,8 @@ void xe_debugfs_register(struct xe_device *xe)
>>  	xe_pxp_debugfs_register(xe->pxp);
>>+	xe_psmi_debugfs_register(xe);
>>+
>>  	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
>>  	if (IS_SRIOV_PF(xe))
>>diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>index 57edbc63da6f4..62edb39b61fb0 100644
>>--- a/drivers/gpu/drm/xe/xe_device.c
>>+++ b/drivers/gpu/drm/xe/xe_device.c
>>@@ -54,6 +54,7 @@
>>  #include "xe_pcode.h"
>>  #include "xe_pm.h"
>>  #include "xe_pmu.h"
>>+#include "xe_psmi.h"
>>  #include "xe_pxp.h"
>>  #include "xe_query.h"
>>  #include "xe_shrinker.h"
>>@@ -908,6 +909,10 @@ int xe_device_probe(struct xe_device *xe)
>>  	if (err)
>>  		return err;
>>+	err = xe_psmi_init(xe);
>>+	if (err)
>>+		return err;
>>+
>>  	err = drm_dev_register(&xe->drm, 0);
>>  	if (err)
>>  		return err;
>>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>>index 01e8fa0d2f9f7..bf9af8d0b84ae 100644
>>--- a/drivers/gpu/drm/xe/xe_device_types.h
>>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>>@@ -576,6 +576,14 @@ struct xe_device {
>>  	atomic64_t global_total_pages;
>>  #endif
>>+	/** @psmi: GPU debugging via additional validation HW */
>>+	struct {
>>+		/** @psmi.capture_obj: PSMI buffer for VRAM */
>>+		struct xe_bo *capture_obj[XE_MAX_TILES_PER_DEVICE + 1];
>>+		/** @psmi.region_mask: Mask of valid memory regions */
>>+		u8 region_mask;
>>+	} psmi;
>>+
>>  	/* private: */
>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>>diff --git a/drivers/gpu/drm/xe/xe_psmi.c b/drivers/gpu/drm/xe/xe_psmi.c
>>new file mode 100644
>>index 0000000000000..e6a67e85e1bb2
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/xe_psmi.c
>>@@ -0,0 +1,313 @@
>>+// SPDX-License-Identifier: MIT
>>+/*
>>+ * Copyright © 2025 Intel Corporation
>>+ */
>>+
>>+#include <linux/debugfs.h>
>>+
>>+#include "xe_bo.h"
>>+#include "xe_device.h"
>>+#include "xe_configfs.h"
>>+#include "xe_psmi.h"
>>+
>>+/*
>>+ * PSMI capture support
>>+ *
>>+ * Requirement for PSMI capture is to have a physically contiguous buffer.  The
>>+ * PSMI tool owns doing all necessary configuration (MMIO register writes are
>>+ * done from user-space). However, KMD needs to provide the PSMI tool with the
>>+ * required physical address of the base of PSMI buffer in case of VRAM.
>>+ *
>>+ * VRAM backed PSMI buffer:
>>+ * Buffer is allocated as GEM object and with XE_BO_CREATE_PINNED_BIT flag which
>>+ * creates a contiguous allocation. The physical address is returned from
>>+ * psmi_debugfs_capture_addr_show(). PSMI tool can mmap the buffer via the
>>+ * PCIBAR through sysfs.
>>+ *
>>+ * SYSTEM memory backed PSMI buffer:
>>+ * Interface here does not support allocating from SYSTEM memory region.  The
>>+ * PSMI tool needs to allocate memory themselves using hugetlbfs. In order to
>>+ * get the physical address, user-space can query /proc/[pid]/pagemap. As an
>>+ * alternative, CMA debugfs could also be used to allocate reserved CMA memory.
>>+ */
>>+
>>+static bool psmi_enabled(struct xe_device *xe)
>>+{
>>+	return xe_configfs_get_psmi_enabled(to_pci_dev(xe->drm.dev));
>>+}
>>+
>>+static void psmi_free_object(struct xe_bo *bo)
>>+{
>>+	xe_bo_lock(bo, NULL);
>>+	xe_bo_unpin(bo);
>>+	xe_bo_unlock(bo);
>>+	xe_bo_put(bo);
>>+}
>>+
>>+/*
>>+ * Free PSMI capture buffer objects.
>>+ */
>>+static void psmi_cleanup(struct xe_device *xe)
>>+{
>>+	unsigned long id, region_mask = xe->psmi.region_mask;
>>+	struct xe_bo *bo;
>>+
>>+	for_each_set_bit(id, &region_mask,
>>+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
>>+		/* smem should never be set */
>>+		xe_assert(xe, id);
>>+
>>+		bo = xe->psmi.capture_obj[id];
>>+		if (bo) {
>>+			psmi_free_object(bo);
>>+			xe->psmi.capture_obj[id] = NULL;
>>+		}
>>+	}
>>+}
>>+
>>+static struct xe_bo *psmi_alloc_object(struct xe_device *xe,
>>+				       unsigned int id, size_t bo_size)
>>+{
>>+	struct xe_bo *bo = NULL;
>>+	struct xe_tile *tile;
>>+	int err;
>>+
>>+	if (!id || !bo_size)
>>+		return NULL;
>>+
>>+	tile = &xe->tiles[id - 1];
>>+
>>+	/* VRAM: Allocate GEM object for the capture buffer */
>>+	bo = xe_bo_create_locked(xe, tile, NULL, bo_size,
>>+				 ttm_bo_type_kernel,
>>+				 XE_BO_FLAG_VRAM_IF_DGFX(tile) |
>>+				 XE_BO_FLAG_PINNED |
>>+				 XE_BO_FLAG_NEEDS_CPU_ACCESS);
>
>It might make sense to add XE_BO_FLAG_PINNED_LATE_RESTORE, assuming 
>this memory needs to be saved and restored for the VRAM case. Since it 
>could be fairly large, it might benefit from using the blitter instead 
>of the CPU?

ok

>
>Also do you want the memory to be pre-cleared here? Currently it just 
>gives you back uncleared memory, also with uncleared CCS. If uncleared 
>is fine, then should that be documented somewhere for the user so that 
>there are no surprises?

I don't think it needs to be cleared - for example for the system memory
case it's coming from hugetlbfs. I will double check that... may not
change anything in next version regarding that though as I want to
resolve other comments.

>
>>+
>>+	if (!IS_ERR(bo)) {
>>+		/* Buffer written by HW, ensure stays resident */
>>+		err = xe_bo_pin(bo);
>>+		if (err)
>>+			bo = ERR_PTR(err);
>>+		xe_bo_unlock(bo);
>>+	}
>>+
>>+	return bo;
>>+}
>>+
>>+/*
>>+ * Allocate PSMI capture buffer objects (via debugfs set function), based on
>>+ * which regions the user has selected in region_mask.  @size: size in bytes
>>+ * (should be power of 2)
>>+ *
>>+ * Always release/free the current buffer objects before attempting to allocate
>>+ * new ones.  Size == 0 will free all current buffers.
>>+ *
>>+ * Note, we don't write any registers as the capture tool is already configuring
>>+ * all PSMI registers itself via mmio space.
>>+ */
>>+static int psmi_resize_object(struct xe_device *xe, size_t size)
>>+{
>>+	unsigned long id, region_mask = xe->psmi.region_mask;
>>+	struct xe_bo *bo = NULL;
>>+	int err = 0;
>>+
>>+	/*
>>+	 * Buddy allocator anyway will roundup to next power of 2,
>>+	 * so rather than waste unused pages, require user to ask for
>>+	 * power of 2 sized PSMI buffers.
>
>It will internally do a trim for you to give back any excess, if not a 
>power-of-two, so might be beneficial to drop this restriction. 
>Probably doesn't matter all that much though.

ok

thanks
Lucas De Marchi

>
>>+	 */
>>+	if (size && !is_power_of_2(size))
>>+		return -EINVAL;
>>+
>>+	/* if resizing, free currently allocated buffers first */
>>+	psmi_cleanup(xe);
>>+
>>+	/* can set size to 0, in which case, now done */
>>+	if (!size)
>>+		return 0;
>>+
>>+	for_each_set_bit(id, &region_mask,
>>+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
>>+		/* smem should never be set */
>>+		xe_assert(xe, id);
>>+
>>+		bo = psmi_alloc_object(xe, id, size);
>>+		if (IS_ERR(bo)) {
>>+			err = PTR_ERR(bo);
>>+			break;
>>+		}
>>+		xe->psmi.capture_obj[id] = bo;
>>+
>>+		drm_info(&xe->drm,
>>+			 "PSMI capture size requested: %zu bytes, allocated: %lu:%zu\n",
>>+			 size, id, bo ? xe_bo_size(bo) : 0);
>>+	}
>>+
>>+	/* on error, reverse what was allocated */
>>+	if (err)
>>+		psmi_cleanup(xe);
>>+
>>+	return err;
>>+}
>>+
>>+/*
>>+ * Returns an address for the capture tool to use to find start of capture
>>+ * buffer. Capture tool requires the capability to have a buffer allocated per
>>+ * each tile (VRAM region), thus we return an address for each region.
>>+ */
>>+static int psmi_debugfs_capture_addr_show(struct seq_file *m, void *data)
>>+{
>>+	struct xe_device *xe = m->private;
>>+	unsigned long id, region_mask;
>>+	struct xe_bo *bo;
>>+	u64 val;
>>+
>>+	region_mask = xe->psmi.region_mask;
>>+	for_each_set_bit(id, &region_mask,
>>+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
>>+		/* smem should never be set */
>>+		xe_assert(xe, id);
>>+
>>+		/* VRAM region */
>>+		bo = xe->psmi.capture_obj[id];
>>+		if (!bo)
>>+			continue;
>>+
>>+		/* pinned, so don't need bo_lock */
>>+		val = __xe_bo_addr(bo, 0, PAGE_SIZE);
>>+		seq_printf(m, "%ld: 0x%llx\n", id, val);
>>+	}
>>+
>>+	return 0;
>>+}
>>+
>>+/*
>>+ * Return capture buffer size, using the size from first allocated object that
>>+ * is found. This works because all objects must be of the same size.
>>+ */
>>+static int psmi_debugfs_capture_size_get(void *data, u64 *val)
>>+{
>>+	unsigned long id, region_mask;
>>+	struct xe_device *xe = data;
>>+	struct xe_bo *bo;
>>+
>>+	region_mask = xe->psmi.region_mask;
>>+	for_each_set_bit(id, &region_mask,
>>+			 ARRAY_SIZE(xe->psmi.capture_obj)) {
>>+		/* smem should never be set */
>>+		xe_assert(xe, id);
>>+
>>+		bo = xe->psmi.capture_obj[id];
>>+		if (bo) {
>>+			*val = xe_bo_size(bo);
>>+			return 0;
>>+		}
>>+	}
>>+
>>+	/* no capture objects are allocated */
>>+	*val = 0;
>>+
>>+	return 0;
>>+}
>>+
>>+/*
>>+ * Set size of PSMI capture buffer. This triggers the allocation of capture
>>+ * buffer in each memory region as specified with prior write to
>>+ * psmi_capture_region_mask.
>>+ */
>>+static int psmi_debugfs_capture_size_set(void *data, u64 val)
>>+{
>>+	struct xe_device *xe = data;
>>+
>>+	/* user must have specified at least one region */
>>+	if (!xe->psmi.region_mask)
>>+		return -EINVAL;
>>+
>>+	return psmi_resize_object(xe, val);
>>+}
>>+
>>+static int psmi_debugfs_capture_region_mask_get(void *data, u64 *val)
>>+{
>>+	struct xe_device *xe = data;
>>+
>>+	*val = xe->psmi.region_mask;
>>+
>>+	return 0;
>>+}
>>+
>>+/*
>>+ * Select VRAM regions for multi-tile devices, only allowed when buffer is not
>>+ * currently allocated.
>>+ */
>>+static int psmi_debugfs_capture_region_mask_set(void *data, u64 region_mask)
>>+{
>>+	struct xe_device *xe = data;
>>+	u64 size = 0;
>>+
>>+	/* SMEM is not supported (see comments at top of file) */
>>+	if (region_mask & 0x1)
>>+		return -EOPNOTSUPP;
>>+
>>+	/* input bitmask should contain only valid TTM regions */
>>+	if (!region_mask || region_mask & ~xe->info.mem_region_mask)
>>+		return -EINVAL;
>>+
>>+	/* only allow setting mask if buffer is not yet allocated */
>>+	psmi_debugfs_capture_size_get(xe, &size);
>>+	if (size)
>>+		return -EBUSY;
>>+
>>+	xe->psmi.region_mask = region_mask;
>>+
>>+	return 0;
>>+}
>>+
>>+DEFINE_SHOW_ATTRIBUTE(psmi_debugfs_capture_addr);
>>+
>>+DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_region_mask_fops,
>>+			 psmi_debugfs_capture_region_mask_get,
>>+			 psmi_debugfs_capture_region_mask_set,
>>+			 "0x%llx\n");
>>+
>>+DEFINE_DEBUGFS_ATTRIBUTE(psmi_debugfs_capture_size_fops,
>>+			 psmi_debugfs_capture_size_get,
>>+			 psmi_debugfs_capture_size_set,
>>+			 "%lld\n");
>>+
>>+void xe_psmi_debugfs_register(struct xe_device *xe)
>>+{
>>+	struct drm_minor *minor;
>>+
>>+	if (!psmi_enabled(xe))
>>+		return;
>>+
>>+	minor = xe->drm.primary;
>>+	if (!minor->debugfs_root)
>>+		return;
>>+
>>+	debugfs_create_file("psmi_capture_addr",
>>+			    0400, minor->debugfs_root, xe,
>>+			    &psmi_debugfs_capture_addr_fops);
>>+
>>+	debugfs_create_file("psmi_capture_region_mask",
>>+			    0600, minor->debugfs_root, xe,
>>+			    &psmi_debugfs_capture_region_mask_fops);
>>+
>>+	debugfs_create_file("psmi_capture_size",
>>+			    0600, minor->debugfs_root, xe,
>>+			    &psmi_debugfs_capture_size_fops);
>>+}
>>+
>>+static void psmi_fini(void *arg)
>>+{
>>+	psmi_cleanup(arg);
>>+}
>>+
>>+int xe_psmi_init(struct xe_device *xe)
>>+{
>>+	if (!psmi_enabled(xe))
>>+		return 0;
>>+
>>+	return devm_add_action(xe->drm.dev, psmi_fini, xe);
>>+}
>>diff --git a/drivers/gpu/drm/xe/xe_psmi.h b/drivers/gpu/drm/xe/xe_psmi.h
>>new file mode 100644
>>index 0000000000000..b1dfba80d893d
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/xe_psmi.h
>>@@ -0,0 +1,14 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2025 Intel Corporation
>>+ */
>>+
>>+#ifndef _XE_PSMI_H_
>>+#define _XE_PSMI_H_
>>+
>>+struct xe_device;
>>+
>>+int xe_psmi_init(struct xe_device *xe);
>>+void xe_psmi_debugfs_register(struct xe_device *xe);
>>+
>>+#endif
>>
>
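
The input checks in the quoted psmi_debugfs_capture_region_mask_set(), psmi_debugfs_capture_size_set() and psmi_resize_object() can be summarised in a small user-space sketch. The function names below are hypothetical and the code only models the v3 patch as quoted: bit 0 of the mask is system memory, bit N (N >= 1) selects the VRAM of tile N - 1, and sizes must be zero or a power of two. The replies above suggest the power-of-two restriction may be dropped in a later revision.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Region id 0 is system memory; VRAM region id N maps to tile N - 1. */
static unsigned int region_to_tile(unsigned int id)
{
	assert(id >= 1);
	return id - 1;
}

/*
 * Model of the region mask write: reject system memory, reject empty or
 * unknown regions, and refuse changes while a buffer is allocated.
 */
static int check_region_mask(uint64_t requested, uint64_t valid_regions,
			     uint64_t current_size)
{
	if (requested & 0x1)
		return -EOPNOTSUPP;

	if (!requested || (requested & ~valid_regions))
		return -EINVAL;

	if (current_size)
		return -EBUSY;

	return 0;
}

/*
 * Model of the size write: a region must have been selected first, and
 * a non-zero size must be a power of two. Size 0 means "free buffers".
 */
static int check_capture_size(uint64_t region_mask, uint64_t size)
{
	if (!region_mask)
		return -EINVAL;

	if (size && !is_power_of_2(size))
		return -EINVAL;

	return 0;
}
```

The ordering matters for a capture tool: the region mask has to be written before the size, and the size has to be reset to 0 before the mask can be changed again.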

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2025-08-15 21:35 UTC | newest]

Thread overview: 33+ messages
2025-08-08 17:29 [PATCH v3 00/13] drm/xe: Add psmi support Lucas De Marchi
2025-08-08 17:29 ` [PATCH v3 01/13] drm/xe/psmi: Add GuC flag to enable PSMI Lucas De Marchi
2025-08-13  0:38   ` Belgaumkar, Vinay
2025-08-15 21:34     ` Lucas De Marchi
2025-08-08 17:29 ` [PATCH v3 02/13] drm/xe/psmi: Add debugfs interface for PSMI Lucas De Marchi
2025-08-13  1:41   ` Belgaumkar, Vinay
2025-08-13 10:42   ` Matthew Auld
2025-08-15 21:35     ` Lucas De Marchi
2025-08-08 17:29 ` [PATCH v3 03/13] drm/xe/rtp: Add match for psmi Lucas De Marchi
2025-08-14 21:28   ` Belgaumkar, Vinay
2025-08-08 17:29 ` [PATCH v3 04/13] drm/xe/psmi: Add Wa_14020001231 Lucas De Marchi
2025-08-13 17:44   ` Riana Tauro
2025-08-14 11:13     ` Lucas De Marchi
2025-08-08 17:29 ` [PATCH v3 05/13] drm/xe/psmi: Add Wa_16023683509 Lucas De Marchi
2025-08-13 11:15   ` Bhadane, Dnyaneshwar
2025-08-08 17:29 ` [PATCH v3 06/13] drm/xe/configfs: Simplify kernel doc Lucas De Marchi
2025-08-13  6:23   ` Riana Tauro
2025-08-08 17:29 ` [PATCH v3 07/13] drm/xe/configfs: Allow to enable PSMI Lucas De Marchi
2025-08-13  6:58   ` Riana Tauro
2025-08-13 11:23     ` Lucas De Marchi
2025-08-13 17:38       ` Riana Tauro
2025-08-08 17:29 ` [PATCH v3 08/13] drm/xe/configfs: Use guard() for dev->lock Lucas De Marchi
2025-08-12 10:23   ` Bhadane, Dnyaneshwar
2025-08-08 17:29 ` [PATCH v3 09/13] drm/xe/configfs: Block runtime attribute changes Lucas De Marchi
2025-08-13 11:03   ` Riana Tauro
2025-08-08 17:29 ` [PATCH v3 10/13] drm/xe/configfs: Use tree-like output in documentation Lucas De Marchi
2025-08-14 21:31   ` Bhadane, Dnyaneshwar
2025-08-08 17:29 ` [PATCH v3 11/13] drm/xe/configfs: Improve documentation steps Lucas De Marchi
2025-08-13 11:08   ` Riana Tauro
2025-08-08 17:29 ` [PATCH v3 12/13] drm/xe/configfs: Minor fixes to documentation Lucas De Marchi
2025-08-12 10:24   ` Bhadane, Dnyaneshwar
2025-08-08 17:29 ` [PATCH v3 13/13] drm/xe/configfs: Dump custom settings when binding Lucas De Marchi
2025-08-15  0:48   ` Belgaumkar, Vinay
