* [RFC 00/34] Kill mem_access v2
@ 2024-01-26 20:30 Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 01/34] Revert "drm/xe/uc: Store firmware binary in system-memory backed BO" Rodrigo Vivi
                   ` (36 more replies)
  0 siblings, 37 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Hi all,

First of all, thank you so much for the good feedback and ideas on the
v1 of this RFC series:

v1: lore.kernel.org/all/20231228021232.2366249-1-rodrigo.vivi@intel.com

On top of that, this v2 has better organized/split patches. So I'd like
to ask you to start reviewing the simpler ones already, so I can
try to split the series and merge it little by little.

I have 2 pending issues in this series that I couldn't solve yet.
Matt Brost is already helping me on these, but any help is welcome.

But as I said, I'd like the review of the simplest ones to start
first anyway, please!

Details of the current issues:

1. A runtime PM usage count underflow on a GPU-hang test coming from
D3cold. The culprit is the pm_runtime_{get,put} pairing around
g2h_outstanding.

[  476.450482] [IGT] xe_exec_threads: starting subtest threads-hang-basic
[  476.507039] xe 0000:03:00.0: [drm] Engine reset: guc_id=78
[snip]
[  476.518814] xe 0000:03:00.0: [drm] Timedout job: seqno=4294967169, guc_id=78, flags=0x8
[  476.524595] xe 0000:03:00.0: [drm] Engine reset: guc_id=78
[  476.532615] xe 0000:03:00.0: [drm] Xe device coredump has been created
[  476.801805] [IGT] xe_exec_threads: finished subtest threads-hang-basic, SUCCESS
[  476.808391] xe 0000:03:00.0: Runtime PM usage count underflow!
[  476.813455] [IGT] xe_exec_threads: exiting, ret=0
[  476.819146] xe 0000:03:00.0: Runtime PM usage count underflow!
[  476.829816] xe 0000:03:00.0: Runtime PM usage count underflow!
[and on, and on]
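
For context, a rough sketch of the suspect pairing (function names here
are hypothetical, not the actual xe_guc_ct.c code): one runtime PM ref
is held per outstanding G2H message.

	/* Hypothetical sketch: one RPM ref per outstanding G2H message */
	static void g2h_sent(struct xe_guc_ct *ct)
	{
		xe_pm_runtime_get(ct_to_xe(ct));	/* ref taken per message */
		ct->g2h_outstanding++;
	}

	static void g2h_done(struct xe_guc_ct *ct)
	{
		--ct->g2h_outstanding;
		xe_pm_runtime_put(ct_to_xe(ct));	/* ref dropped per response */
	}

If a reset path zeroes g2h_outstanding while the per-message puts still
run (or run twice), the puts outnumber the gets and the usage count
underflows, which would explain the log above.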

2. A failing rmmod due to a TLB invalidation that happens at xe_pci_removal
(a case that fails only when coming from D3cold and with display enabled,
but on an idle/blank screen)

[  326.857464] xe 0000:03:00.0: [drm] GT0: resumed
[  327.135455] show_signal_msg: 126 callbacks suppressed
[  327.135467] gnome-shell[2488]: segfault at 0 ip 00007fad50e315cc sp 00007ffd6f04a360 error 4 in libmutter-clutter-11.so.0.0.0[7fad50dbd000+97000] likely on CPU 15 (core 28, socket 0)
[  327.157020] Code: e9 6f ff ff ff 66 0f 1f 84 00 00 00 00 00 48 8b 05 49 12 07 00 48 85 c0 74 4c 48 8b 38 e8 fc 49 f9 ff 48 89 c5 e8 14 75 f9 ff <48> 8b 55 00 48 8b 92 d0 00 00 00 48 85 d2 74 07 89 c6 48 89 ef ff
[  328.160905] xe 0000:03:00.0: [drm] Xe device coredump has been deleted.
[  329.099696] pci 0000:03:00.0: [drm] *ERROR* GT0: TLB invalidation time'd out, seqno=1668, recv=1667
[snip]
[  329.312763] ------------[ cut here ]------------
[  329.317417] pci 0000:03:00.0: [drm] Assertion `ct->g2h_outstanding == 0 || state == XE_GUC_CT_STATE_STOPPED` failed!
               platform: 7 subplatform: 4
               graphics: Xe_HPG 12.55 step C0
               media: Xe_HPM 12.55 step C0
               tile: 0 VRAM 8.00 GiB
               GT: 0 type 1
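
The direction patch 12 ("drm/xe: Ensure device is awake before removing
it") takes is to hold a runtime PM ref across the whole removal, so the
final TLB invalidations can reach the GuC instead of timing out from
D3cold. A simplified sketch (assuming the current xe_pci_remove() shape;
the actual patch may differ):

	static void xe_pci_remove(struct pci_dev *pdev)
	{
		struct xe_device *xe = pci_get_drvdata(pdev);

		xe_pm_runtime_get(xe);		/* keep D0 across teardown */
		xe_device_remove(xe);
		xe_pm_runtime_fini(xe);
		pci_set_drvdata(pdev, NULL);
	}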

Thanks in advance,
Rodrigo.

Rodrigo Vivi (34):
  Revert "drm/xe/uc: Store firmware binary in system-memory backed BO"
  drm/xe: Document Xe PM component
  drm/xe: Fix display runtime_pm handling
  drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
  drm/xe: Convert xe_pm_runtime_{get,put} to void and protect from
    recursion
  drm/xe: Prepare display for D3Cold
  drm/xe: Convert mem_access assertion towards the runtime_pm state
  drm/xe: Runtime PM wake on every IOCTL
  drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
  drm/xe: Convert scheduler towards direct pm_runtime
  drm/xe: Runtime PM wake on every sysfs call
  drm/xe: Ensure device is awake before removing it
  drm/xe: Remove mem_access from guc_pc calls
  drm/xe: Runtime PM wake on every debugfs call
  drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
  drm/xe: Removing extra mem_access protection from runtime pm
  drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
  drm/xe: Move lockdep protection from mem_access to xe_pm_runtime
  drm/xe: Remove pm_runtime lockdep
  drm/xe: Stop checking for power_lost on D3Cold
  drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
  drm/xe: Keep D0 for the entire duration of a LR VM
  drm/xe: Ensure D0 on TLB invalidation
  drm/xe: Remove useless mem_access protection for query ioctls
  drm/xe: Convert gsc_work from mem_access to xe_pm_runtime
  drm/xe: VMs don't need the mem_access protection anymore
  drm/xe: Remove useless mem_access during probe
  drm/xe: Remove mem_access from suspend and resume functions
  drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
  drm/xe: Remove useless mem_access on PAT dumps
  drm/xe: Remove inner mem_access protections
  drm/xe: Kill xe_device_mem_access_{get*,put}
  drm/xe: Remove unused runtime pm helper
  drm/xe: Enable D3Cold on 'low' VRAM utilization

 .../gpu/drm/xe/compat-i915-headers/i915_drv.h |   8 +-
 drivers/gpu/drm/xe/display/xe_fb_pin.c        |   7 +-
 drivers/gpu/drm/xe/tests/xe_bo.c              |   8 +-
 drivers/gpu/drm/xe/tests/xe_migrate.c         |   7 +-
 drivers/gpu/drm/xe/tests/xe_mocs.c            |  14 +-
 drivers/gpu/drm/xe/xe_bo.c                    |  10 +-
 drivers/gpu/drm/xe/xe_debugfs.c               |  13 +-
 drivers/gpu/drm/xe/xe_device.c                | 129 ++++------
 drivers/gpu/drm/xe/xe_device.h                |   9 -
 drivers/gpu/drm/xe/xe_device_sysfs.c          |   4 +
 drivers/gpu/drm/xe/xe_device_types.h          |   6 -
 drivers/gpu/drm/xe/xe_display.c               |  22 ++
 drivers/gpu/drm/xe/xe_display.h               |   2 +
 drivers/gpu/drm/xe/xe_dma_buf.c               |   5 +-
 drivers/gpu/drm/xe/xe_exec_queue.c            |  19 --
 drivers/gpu/drm/xe/xe_ggtt.c                  |   6 -
 drivers/gpu/drm/xe/xe_gpu_scheduler.c         |   8 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler.h         |   3 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler_types.h   |   2 +
 drivers/gpu/drm/xe/xe_gsc.c                   |   5 +-
 drivers/gpu/drm/xe/xe_gt.c                    |  21 +-
 drivers/gpu/drm/xe/xe_gt_debugfs.c            |  53 ++++-
 drivers/gpu/drm/xe/xe_gt_freq.c               |  38 ++-
 drivers/gpu/drm/xe/xe_gt_idle.c               |  23 +-
 drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c     |   3 +
 drivers/gpu/drm/xe/xe_guc_ct.c                |  79 +++----
 drivers/gpu/drm/xe/xe_guc_ct_types.h          |   2 +
 drivers/gpu/drm/xe/xe_guc_debugfs.c           |   9 +-
 drivers/gpu/drm/xe/xe_guc_pc.c                |  62 +----
 drivers/gpu/drm/xe/xe_guc_submit.c            |   2 +-
 drivers/gpu/drm/xe/xe_huc_debugfs.c           |   5 +-
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  58 ++++-
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h |   7 +
 drivers/gpu/drm/xe/xe_hwmon.c                 |  25 +-
 drivers/gpu/drm/xe/xe_pat.c                   |  10 -
 drivers/gpu/drm/xe/xe_pci.c                   |   2 +-
 drivers/gpu/drm/xe/xe_pm.c                    | 223 +++++++++++++-----
 drivers/gpu/drm/xe/xe_pm.h                    |  16 +-
 drivers/gpu/drm/xe/xe_pt.c                    |   3 +
 drivers/gpu/drm/xe/xe_query.c                 |   4 -
 drivers/gpu/drm/xe/xe_sched_job.c             |  12 +-
 drivers/gpu/drm/xe/xe_tile.c                  |  10 +-
 drivers/gpu/drm/xe/xe_tile_sysfs.c            |   1 +
 drivers/gpu/drm/xe/xe_ttm_sys_mgr.c           |   5 +-
 drivers/gpu/drm/xe/xe_uc_fw.c                 |   4 +-
 drivers/gpu/drm/xe/xe_vm.c                    |  10 +-
 46 files changed, 565 insertions(+), 409 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [RFC 01/34] Revert "drm/xe/uc: Store firmware binary in system-memory backed BO"
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 02/34] drm/xe: Document Xe PM component Rodrigo Vivi
                   ` (35 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Matt Roper, Michał Winiarski, Rodrigo Vivi

This reverts commit 48cd5ebcb645ac1bd48480cf11dced902108d306.

This is breaking the GuC load after suspend cycles where the
power was lost, such as S3 and D3cold on DGFX.

xe 0000:03:00.0: [drm] GuC load failed: status = 0x400000A0
xe 0000:03:00.0: [drm] GuC load failed: status: Reset = 0,
   BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01

Currently our suspend/resume flows only evict and restore the BOs
living in VRAM itself, since the system memory power won't be
lost. However, it seems that we might lose the PTEs for those
entries. Until we have that handled, let's ensure that we
keep the uc images in VRAM if possible.

Another concern is the impact on the resume latency in this
case. So if we end up moving this uc_fw_copy to an earlier
stage where LMEM is really not yet available, maybe we will
need to later switch the uc_fw->bo to VRAM for the final
access and better resume times.

Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_uc_fw.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_uc_fw.c b/drivers/gpu/drm/xe/xe_uc_fw.c
index 73d6938c921d..2c77322f103e 100644
--- a/drivers/gpu/drm/xe/xe_uc_fw.c
+++ b/drivers/gpu/drm/xe/xe_uc_fw.c
@@ -752,6 +752,8 @@ static int uc_fw_copy(struct xe_uc_fw *uc_fw, const void *data, size_t size, u32
 int xe_uc_fw_init(struct xe_uc_fw *uc_fw)
 {
 	const struct firmware *fw = NULL;
+	struct xe_gt *gt = uc_fw_to_gt(uc_fw);
+	struct xe_tile *tile = gt_to_tile(gt);
 	int err;
 
 	err = uc_fw_request(uc_fw, &fw);
@@ -763,7 +765,7 @@ int xe_uc_fw_init(struct xe_uc_fw *uc_fw)
 		return 0;
 
 	err = uc_fw_copy(uc_fw, fw->data, fw->size,
-			 XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_GGTT_BIT);
+			 XE_BO_CREATE_VRAM_IF_DGFX(tile) | XE_BO_CREATE_GGTT_BIT);
 
 	uc_fw_release(fw);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 02/34] drm/xe: Document Xe PM component
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 01/34] Revert "drm/xe/uc: Store firmware binary in system-memory backed BO" Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-29 10:38   ` Francois Dugast
  2024-01-26 20:30 ` [RFC 03/34] drm/xe: Fix display runtime_pm handling Rodrigo Vivi
                   ` (34 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Replace the outdated information with proper PM documentation, and
already establish the rules for the runtime PM get and put calls
that Xe needs to follow.

Also add missing function documentation to all the "exported" functions.
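
As an illustration of the rule being documented, here is a minimal
sketch (not taken from the series; do_hw_work() is a made-up
placeholder, and the void get/put shape that the later patches adopt
is assumed) where the ref is taken at the outer bound:

	static int xe_some_ioctl(struct drm_device *dev, void *data,
				 struct drm_file *file)
	{
		struct xe_device *xe = to_xe_device(dev);
		int ret;

		xe_pm_runtime_get(xe);		/* outer bound: wake and hold D0 */
		ret = do_hw_work(xe, data);	/* inner paths only assert awake */
		xe_pm_runtime_put(xe);

		return ret;
	}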

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 107 +++++++++++++++++++++++++++++++++----
 1 file changed, 96 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index d5f219796d7e..bd35fe9f6227 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -25,21 +25,45 @@
 /**
  * DOC: Xe Power Management
  *
- * Xe PM shall be guided by the simplicity.
- * Use the simplest hook options whenever possible.
- * Let's not reinvent the runtime_pm references and hooks.
- * Shall have a clear separation of display and gt underneath this component.
+ * Xe PM implements the main routines for both system level suspend states and
+ * for the opportunistic runtime suspend states.
  *
- * What's next:
+ * System Level Suspend (S-States) - In general this is OS initiated suspend
+ * driven by ACPI for achieving S0ix (a.k.a. S2idle, freeze), S3 (suspend to ram),
+ * S4 (disk). The main functions here are `xe_pm_suspend` and `xe_pm_resume`,
+ * which are the entry points for suspending to and resuming from these states.
  *
- * For now s2idle and s3 are only working in integrated devices. The next step
- * is to iterate through all VRAM's BO backing them up into the system memory
- * before allowing the system suspend.
+ * Runtime Suspend (D-States) - This is the opportunistic PCIe device low power
+ * state D3. Xe PM component provides `xe_pm_runtime_suspend` and
+ * `xe_pm_runtime_resume` functions that the PCI subsystem will call before the
+ * transition to D3. Also, Xe PM provides get and put functions that the Xe
+ * driver will use to indicate activity. In order to avoid locking complications
+ * with the memory management, whenever possible, these get and put functions
+ * need to be called from the higher/outer levels.
  *
- * Also runtime_pm needs to be here from the beginning.
+ * The main cases that need to be protected from the outer levels are: IOCTL,
+ * sysfs, debugfs, dma-buf sharing, GPU execution.
  *
- * RC6/RPS are also critical PM features. Let's start with GuCRC and GuC SLPC
- * and no wait boost. Frequency optimizations should come on a next stage.
+ * PCI D3 is special and can mean D3hot, where Vcc power is on for keeping memory
+ * alive and a quicker, low latency resume, or D3cold, where Vcc power is off for
+ * better power savings.
+ * The Vcc of the PCI hierarchy can only be controlled at the PCI root port
+ * level, while the device driver can be behind multiple bridges/switches and
+ * paired with other devices. For this reason, the PCI subsystem cannot perform
+ * the transition towards D3Cold. The lowest runtime PM possible from the PCI
+ * subsystem is D3hot. Then, if all these paired devices in the same root port
+ * are in D3hot, ACPI will assist here and run its _PR3 and _OFF methods to
+ * perform the transition from D3hot to D3cold. Xe may disallow this transition
+ * based on runtime conditions such as VRAM usage, for instance to preserve a
+ * quick and low latency resume.
+ *
+ * Intel systems are capable of taking the system to S0ix when devices are on
+ * D3hot through runtime PM. This is also called 'opportunistic S0ix'.
+ * But in this case, the `xe_pm_suspend` and `xe_pm_resume` won't be called for
+ * S0ix.
+ *
+ * This component is not responsible for GT idleness (RC6) nor for GT
+ * frequency management (RPS).
  */
 
 /**
@@ -169,6 +193,12 @@ void xe_pm_init_early(struct xe_device *xe)
 	drmm_mutex_init(&xe->drm, &xe->mem_access.vram_userfault.lock);
 }
 
+/**
+ * xe_pm_init - Initialize Xe Power Management
+ * @xe: xe device instance
+ *
+ * This component is responsible for System and Device sleep states.
+ */
 void xe_pm_init(struct xe_device *xe)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
@@ -189,6 +219,10 @@ void xe_pm_init(struct xe_device *xe)
 	xe_pm_runtime_init(xe);
 }
 
+/**
+ * xe_pm_runtime_fini - Finalize Runtime PM
+ * @xe: xe device instance
+ */
 void xe_pm_runtime_fini(struct xe_device *xe)
 {
 	struct device *dev = xe->drm.dev;
@@ -218,6 +252,12 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe)
 	return READ_ONCE(xe->pm_callback_task);
 }
 
+/**
+ * xe_pm_runtime_suspend - Prepare our device for D3hot/D3Cold
+ * @xe: xe device instance
+ *
+ * Returns 0 for success, negative error code otherwise.
+ */
 int xe_pm_runtime_suspend(struct xe_device *xe)
 {
 	struct xe_bo *bo, *on;
@@ -283,6 +323,12 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	return err;
 }
 
+/**
+ * xe_pm_runtime_resume - Waking up from D3hot/D3Cold
+ * @xe: xe device instance
+ *
+ * Returns 0 for success, negative error code otherwise.
+ */
 int xe_pm_runtime_resume(struct xe_device *xe)
 {
 	struct xe_gt *gt;
@@ -334,22 +380,47 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	return err;
 }
 
+/**
+ * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
+ * @xe: xe device instance
+ *
+ * Returns: Any number greater than or equal to 0 for success, negative error
+ * code otherwise.
+ */
 int xe_pm_runtime_get(struct xe_device *xe)
 {
 	return pm_runtime_get_sync(xe->drm.dev);
 }
 
+/**
+ * xe_pm_runtime_put - Put the runtime_pm reference back and mark as idle
+ * @xe: xe device instance
+ *
+ * Returns: Any number greater than or equal to 0 for success, negative error
+ * code otherwise.
+ */
 int xe_pm_runtime_put(struct xe_device *xe)
 {
 	pm_runtime_mark_last_busy(xe->drm.dev);
 	return pm_runtime_put(xe->drm.dev);
 }
 
+/**
+ * xe_pm_runtime_get_if_active - Get a runtime_pm reference if device active
+ * @xe: xe device instance
+ *
+ * Returns: Any number greater than or equal to 0 for success, negative error
+ * code otherwise.
+ */
 int xe_pm_runtime_get_if_active(struct xe_device *xe)
 {
 	return pm_runtime_get_if_active(xe->drm.dev, true);
 }
 
+/**
+ * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
+ * @xe: xe device instance
+ */
 void xe_pm_assert_unbounded_bridge(struct xe_device *xe)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
@@ -364,6 +435,13 @@ void xe_pm_assert_unbounded_bridge(struct xe_device *xe)
 	}
 }
 
+/**
+ * xe_pm_set_vram_threshold - Set a vram threshold for allowing/blocking D3Cold
+ * @xe: xe device instance
+ * @threshold: VRAM size in bytes for the D3cold threshold
+ *
+ * Returns 0 for success, negative error code otherwise.
+ */
 int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold)
 {
 	struct ttm_resource_manager *man;
@@ -388,6 +466,13 @@ int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold)
 	return 0;
 }
 
+/**
+ * xe_pm_d3cold_allowed_toggle - Check conditions to toggle d3cold.allowed
+ * @xe: xe device instance
+ *
+ * To be called during runtime_pm idle callback.
+ * Check for all the D3Cold conditions ahead of runtime suspend.
+ */
 void xe_pm_d3cold_allowed_toggle(struct xe_device *xe)
 {
 	struct ttm_resource_manager *man;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 03/34] drm/xe: Fix display runtime_pm handling
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 01/34] Revert "drm/xe/uc: Store firmware binary in system-memory backed BO" Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 02/34] drm/xe: Document Xe PM component Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05  9:11   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 04/34] drm/xe: Create a xe_pm_runtime_resume_and_get variant for display Rodrigo Vivi
                   ` (33 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

i915's intel_runtime_pm_get_if_in_use actually calls the
pm_runtime_get_if_active() with ign_usage_count = false, but Xe
was erroneously calling it with true because of the mem_access cases.
This can lead to unbalanced references.

Let's directly use the 'if_in_use' function provided by linux/pm_runtime.h.

Also, start this new function already protected from the runtime
recursion, since runtime_pm will need to call display functions
for a proper D3Cold flow.
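
For reference, the semantics of the two pm_runtime helpers at the time
of this series (when pm_runtime_get_if_active() still took the bool
argument):

	/* Takes a ref whenever the device is RPM_ACTIVE, even if the
	 * usage count is 0 -- what Xe was erroneously using: */
	pm_runtime_get_if_active(dev, true);

	/* Equivalent to pm_runtime_get_if_active(dev, false): succeeds
	 * only if the device is RPM_ACTIVE *and* already in use
	 * (usage count > 0) -- the i915 'if_in_use' semantics: */
	pm_runtime_get_if_in_use(dev);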

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  2 +-
 drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
 drivers/gpu/drm/xe/xe_pm.h                      |  1 +
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
index 420eba0e4be0..ad5864d1dd74 100644
--- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
+++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
@@ -177,7 +177,7 @@ static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_p
 {
 	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
 
-	return xe_pm_runtime_get_if_active(xe);
+	return xe_pm_runtime_get_if_in_use(xe);
 }
 
 static inline void intel_runtime_pm_put_unchecked(struct xe_runtime_pm *pm)
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index bd35fe9f6227..19f88cb7715b 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -417,6 +417,23 @@ int xe_pm_runtime_get_if_active(struct xe_device *xe)
 	return pm_runtime_get_if_active(xe->drm.dev, true);
 }
 
+/**
+ * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference if device is in use
+ * @xe: xe device instance
+ *
+ * Returns: True if device is awake and the reference was taken, false otherwise.
+ */
+bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
+{
+	if (xe_pm_read_callback_task(xe) == current) {
+		/* The device is awake, grab the ref and move on */
+		pm_runtime_get_noresume(xe->drm.dev);
+		return true;
+	}
+
+	return pm_runtime_get_if_in_use(xe->drm.dev) > 0;
+}
+
 /**
  * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
  * @xe: xe device instance
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 64a97c6726a7..9d372cbf388b 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -28,6 +28,7 @@ int xe_pm_runtime_resume(struct xe_device *xe);
 int xe_pm_runtime_get(struct xe_device *xe);
 int xe_pm_runtime_put(struct xe_device *xe);
 int xe_pm_runtime_get_if_active(struct xe_device *xe);
+bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
 void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
 int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
 void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 04/34] drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 03/34] drm/xe: Fix display runtime_pm handling Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 05/34] drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion Rodrigo Vivi
                   ` (32 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Introduce a resume-and-get function to fulfill the display's need for
checking whether the device was actually resumed (or is awake) and the
reference was taken.

Then we can convert the remaining cases to void functions and have
individual functions for the individual cases.

Also, start this new function already protected from the runtime
recursion, since runtime_pm will need to call display functions
for a proper D3Cold flow.
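
A sketch of the intended display-side usage (hypothetical caller):

	if (!xe_pm_runtime_resume_and_get(xe))
		return;			/* not resumed, no ref taken */
	/* ... touch the hw ... */
	xe_pm_runtime_put(xe);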

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  6 +-----
 drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
 drivers/gpu/drm/xe/xe_pm.h                      |  1 +
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
index ad5864d1dd74..fef969112b1d 100644
--- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
+++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
@@ -166,11 +166,7 @@ static inline intel_wakeref_t intel_runtime_pm_get(struct xe_runtime_pm *pm)
 {
 	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
 
-	if (xe_pm_runtime_get(xe) < 0) {
-		xe_pm_runtime_put(xe);
-		return 0;
-	}
-	return 1;
+	return xe_pm_runtime_resume_and_get(xe);
 }
 
 static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_pm *pm)
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 19f88cb7715b..2264276dc431 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -434,6 +434,23 @@ bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
 	return pm_runtime_get_if_in_use(xe->drm.dev) > 0;
 }
 
+/**
+ * xe_pm_runtime_resume_and_get - Resume, then get a runtime_pm ref if awake.
+ * @xe: xe device instance
+ *
+ * Returns: True if device is awake and the reference was taken, false otherwise.
+ */
+bool xe_pm_runtime_resume_and_get(struct xe_device *xe)
+{
+	if (xe_pm_read_callback_task(xe) == current) {
+		/* The device is awake, grab the ref and move on */
+		pm_runtime_get_noresume(xe->drm.dev);
+		return true;
+	}
+
+	return pm_runtime_resume_and_get(xe->drm.dev) >= 0;
+}
+
 /**
  * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
  * @xe: xe device instance
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 9d372cbf388b..59aabb44a350 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -29,6 +29,7 @@ int xe_pm_runtime_get(struct xe_device *xe);
 int xe_pm_runtime_put(struct xe_device *xe);
 int xe_pm_runtime_get_if_active(struct xe_device *xe);
 bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
+bool xe_pm_runtime_resume_and_get(struct xe_device *xe);
 void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
 int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
 void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 05/34] drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (3 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 04/34] drm/xe: Create a xe_pm_runtime_resume_and_get variant for display Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 06/34] drm/xe: Prepare display for D3Cold Rodrigo Vivi
                   ` (31 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

The mem_access helpers will go away and pm_runtime calls will be made
instead. So, we need to protect these against recursion.

For D3cold, the TTM migration helpers will trigger job execution.
Job execution will be protected by direct runtime_pm calls, but those
cannot be made again if we are already inside a runtime suspend/resume
transition when evicting/restoring memory for D3Cold. So, we will
check for the xe_pm_read_callback_task.

The put is asynchronous so there's no need to block it. However, for a
proper balance, we need to ensure that the references are taken and
released regardless of the flow. So, let's convert them all to void and
use the direct linux/pm_runtime functions.

Cases that need to check the references or the runtime status will
be handled separately from these main get and put functions.
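
Roughly, the recursion being guarded against (call chain simplified):

	pm_runtime_suspend(dev)
	  -> xe_pm_runtime_suspend(xe)	/* pm_callback_task = current */
	    -> evict VRAM for D3cold (TTM migration)
	      -> job submission helpers
	        -> xe_pm_runtime_get(xe)
	           /* sees callback_task == current, so it only takes a
	            * noresume ref instead of trying to resume from
	            * inside the suspend callback itself */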

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 25 ++++++++++++++-----------
 drivers/gpu/drm/xe/xe_pm.h |  4 ++--
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 2264276dc431..28a4231ddd8a 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -383,26 +383,29 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 /**
  * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
  * @xe: xe device instance
- *
- * Returns: Any number grater than or equal to 0 for success, negative error
- * code otherwise.
  */
-int xe_pm_runtime_get(struct xe_device *xe)
+void xe_pm_runtime_get(struct xe_device *xe)
 {
-	return pm_runtime_get_sync(xe->drm.dev);
+	pm_runtime_get_noresume(xe->drm.dev);
+
+	if (xe_pm_read_callback_task(xe) == current)
+		return;
+
+	pm_runtime_resume(xe->drm.dev);
 }
 
 /**
  * xe_pm_runtime_put - Put the runtime_pm reference back and mark as idle
  * @xe: xe device instance
- *
- * Returns: Any number grater than or equal to 0 for success, negative error
- * code otherwise.
  */
-int xe_pm_runtime_put(struct xe_device *xe)
+void xe_pm_runtime_put(struct xe_device *xe)
 {
-	pm_runtime_mark_last_busy(xe->drm.dev);
-	return pm_runtime_put(xe->drm.dev);
+	if (xe_pm_read_callback_task(xe) == current) {
+		pm_runtime_put_noidle(xe->drm.dev);
+	} else {
+		pm_runtime_mark_last_busy(xe->drm.dev);
+		pm_runtime_put(xe->drm.dev);
+	}
 }
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 59aabb44a350..ecb19ee10db6 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -25,8 +25,8 @@ void xe_pm_init(struct xe_device *xe);
 void xe_pm_runtime_fini(struct xe_device *xe);
 int xe_pm_runtime_suspend(struct xe_device *xe);
 int xe_pm_runtime_resume(struct xe_device *xe);
-int xe_pm_runtime_get(struct xe_device *xe);
-int xe_pm_runtime_put(struct xe_device *xe);
+void xe_pm_runtime_get(struct xe_device *xe);
+void xe_pm_runtime_put(struct xe_device *xe);
 int xe_pm_runtime_get_if_active(struct xe_device *xe);
 bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
 bool xe_pm_runtime_resume_and_get(struct xe_device *xe);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 06/34] drm/xe: Prepare display for D3Cold
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (4 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 05/34] drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state Rodrigo Vivi
                   ` (30 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Prepare the power-well and DC handling for a full power
loss during D3Cold, then sanitize it upon D3->D0.
Otherwise we get a bunch of state mismatches.

Ideally we could leave DC9 enabled and wouldn't need
to move DC9->DC0 on every runtime resume. However,
the DC disable is part of the power-well checks and
intrinsic to the dc_off power well. In the future that
can be untangled so we can have even bigger power savings.
But for now, let's focus on getting D3Cold working, which
saves much more power by itself.

v2: We cannot do the full suspend-resume path or we end up
with a deadlock between xe_gem_fault and the modeset ioctl.
Reduce this to only the power-well & DC sanitization.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_display.c | 22 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_display.h |  2 ++
 drivers/gpu/drm/xe/xe_pm.c      |  5 +++++
 3 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
index 74391d9b11ae..3ab23c67c145 100644
--- a/drivers/gpu/drm/xe/xe_display.c
+++ b/drivers/gpu/drm/xe/xe_display.c
@@ -368,6 +368,28 @@ void xe_display_pm_suspend_late(struct xe_device *xe)
 	intel_display_power_suspend_late(xe);
 }
 
+void xe_display_pm_runtime_suspend(struct xe_device *xe)
+{
+	if (!xe->info.enable_display)
+		return;
+
+	intel_power_domains_disable(xe);
+	intel_opregion_suspend(xe, PCI_D3cold);
+	intel_dmc_suspend(xe);
+	intel_power_domains_suspend(xe, PCI_D3cold);
+}
+
+void xe_display_pm_runtime_resume(struct xe_device *xe)
+{
+	if (!xe->info.enable_display)
+		return;
+
+	intel_power_domains_resume(xe);
+	intel_dmc_resume(xe);
+	intel_opregion_resume(xe);
+	intel_power_domains_enable(xe);
+}
+
 void xe_display_pm_resume_early(struct xe_device *xe)
 {
 	if (!xe->info.enable_display)
diff --git a/drivers/gpu/drm/xe/xe_display.h b/drivers/gpu/drm/xe/xe_display.h
index 710e56180b52..cd5a2c4de9de 100644
--- a/drivers/gpu/drm/xe/xe_display.h
+++ b/drivers/gpu/drm/xe/xe_display.h
@@ -38,6 +38,8 @@ void xe_display_pm_suspend(struct xe_device *xe);
 void xe_display_pm_suspend_late(struct xe_device *xe);
 void xe_display_pm_resume_early(struct xe_device *xe);
 void xe_display_pm_resume(struct xe_device *xe);
+void xe_display_pm_runtime_suspend(struct xe_device *xe);
+void xe_display_pm_runtime_resume(struct xe_device *xe);
 
 #else
 
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 28a4231ddd8a..9910b748adab 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -317,6 +317,9 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	}
 
 	xe_irq_suspend(xe);
+
+	if (xe->d3cold.allowed)
+		xe_display_pm_runtime_suspend(xe);
 out:
 	lock_map_release(&xe_device_mem_access_lockdep_map);
 	xe_pm_write_callback_task(xe, NULL);
@@ -355,6 +358,8 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 				goto out;
 		}
 
+		xe_display_pm_runtime_resume(xe);
+
 		/*
 		 * This only restores pinned memory which is the memory
 		 * required for the GT(s) to resume.
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (5 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 06/34] drm/xe: Prepare display for D3Cold Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05  9:55   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL Rodrigo Vivi
                   ` (29 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

The mem_access helpers are going away, getting replaced by direct
calls to the xe_pm_runtime_{get,put} functions. However, an
assertion with a warning splat is desired for when we hit the worst
case of a memory access while the device is really in the 'suspended'
state.

Also, this needs to be the first step. Otherwise, the upcoming
conversion would be really noisy, with warn splats about missing
mem_access gets.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c | 13 ++++++++++++-
 drivers/gpu/drm/xe/xe_pm.c     | 16 ++++++++++++++++
 drivers/gpu/drm/xe/xe_pm.h     |  1 +
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 6faa7865b1aa..01db34f06a7d 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -653,9 +653,20 @@ bool xe_device_mem_access_ongoing(struct xe_device *xe)
 	return atomic_read(&xe->mem_access.ref);
 }
 
+/**
+ * xe_device_assert_mem_access - Inspect the current runtime_pm state.
+ * @xe: xe device instance
+ *
+ * To be used before any kind of memory access. It will splat a debug warning
+ * if the device is currently sleeping. But it doesn't guarantee in any way
+ * that the device is going to continue awake. Xe PM runtime get and put
+ * functions might be added to the outer bound of the memory access, while
+ * this check is intended for inner usage to splat some warning if the worst
+ * case has just happened.
+ */
 void xe_device_assert_mem_access(struct xe_device *xe)
 {
-	XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
+	XE_WARN_ON(xe_pm_runtime_suspended(xe));
 }
 
 bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 9910b748adab..3ef14937d5d2 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -252,6 +252,22 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe)
 	return READ_ONCE(xe->pm_callback_task);
 }
 
+/**
+ * xe_pm_runtime_suspended - Inspect the current runtime_pm state.
+ * @xe: xe device instance
+ *
+ * This does not provide any guarantee that the device is going to remain
+ * suspended as it might be racing with the runtime state transitions.
+ * It can be used only as a non-reliable assertion, to ensure that we are not in
+ * the sleep state while trying to access some memory for instance.
+ *
+ * Returns true if PCI device is suspended, false otherwise.
+ */
+bool xe_pm_runtime_suspended(struct xe_device *xe)
+{
+	return pm_runtime_suspended(xe->drm.dev);
+}
+
 /**
  * xe_pm_runtime_suspend - Prepare our device for D3hot/D3Cold
  * @xe: xe device instance
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index ecb19ee10db6..a672adffd0e1 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -23,6 +23,7 @@ int xe_pm_resume(struct xe_device *xe);
 void xe_pm_init_early(struct xe_device *xe);
 void xe_pm_init(struct xe_device *xe);
 void xe_pm_runtime_fini(struct xe_device *xe);
+bool xe_pm_runtime_suspended(struct xe_device *xe);
 int xe_pm_runtime_suspend(struct xe_device *xe);
 int xe_pm_runtime_resume(struct xe_device *xe);
 void xe_pm_runtime_get(struct xe_device *xe);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (6 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05  9:39   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime Rodrigo Vivi
                   ` (28 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's ensure our PCI device is awake on every IOCTL entry.
Let's increase the runtime_pm protection and start moving
it to the outer bounds.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c | 32 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/xe/xe_pm.c     | 15 +++++++++++++++
 drivers/gpu/drm/xe/xe_pm.h     |  1 +
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 01db34f06a7d..ab41202ecaf8 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -141,15 +141,43 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 			  DRM_RENDER_ALLOW),
 };
 
+static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct drm_file *file_priv = file->private_data;
+	struct xe_device *xe = to_xe_device(file_priv->minor->dev);
+	long ret;
+
+	ret = xe_pm_runtime_get_sync(xe);
+	if (ret >= 0)
+		ret = drm_ioctl(file, cmd, arg);
+	xe_pm_runtime_put(xe);
+
+	return ret;
+}
+
+static long xe_drm_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct drm_file *file_priv = file->private_data;
+	struct xe_device *xe = to_xe_device(file_priv->minor->dev);
+	long ret;
+
+	ret = xe_pm_runtime_get_sync(xe);
+	if (ret >= 0)
+		ret = drm_compat_ioctl(file, cmd, arg);
+	xe_pm_runtime_put(xe);
+
+	return ret;
+}
+
 static const struct file_operations xe_driver_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_open,
 	.release = drm_release_noglobal,
-	.unlocked_ioctl = drm_ioctl,
+	.unlocked_ioctl = xe_drm_ioctl,
 	.mmap = drm_gem_mmap,
 	.poll = drm_poll,
 	.read = drm_read,
-	.compat_ioctl = drm_compat_ioctl,
+	.compat_ioctl = xe_drm_compat_ioctl,
 	.llseek = noop_llseek,
 #ifdef CONFIG_PROC_FS
 	.show_fdinfo = drm_show_fdinfo,
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 3ef14937d5d2..d98f4bb3ad02 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -429,6 +429,21 @@ void xe_pm_runtime_put(struct xe_device *xe)
 	}
 }
 
+/**
+ * xe_pm_runtime_get_sync - Get a runtime_pm reference and resume synchronously
+ * @xe: xe device instance
+ *
+ * Returns: Any number greater than or equal to 0 for success, negative error
+ * code otherwise.
+ */
+int xe_pm_runtime_get_sync(struct xe_device *xe)
+{
+	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
+		return -ELOOP;
+
+	return pm_runtime_get_sync(xe->drm.dev);
+}
+
 /**
  * xe_pm_runtime_get_if_active - Get a runtime_pm reference if device active
  * @xe: xe device instance
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index a672adffd0e1..7d8cf87b95d2 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -27,6 +27,7 @@ bool xe_pm_runtime_suspended(struct xe_device *xe);
 int xe_pm_runtime_suspend(struct xe_device *xe);
 int xe_pm_runtime_resume(struct xe_device *xe);
 void xe_pm_runtime_get(struct xe_device *xe);
+int xe_pm_runtime_get_sync(struct xe_device *xe);
 void xe_pm_runtime_put(struct xe_device *xe);
 int xe_pm_runtime_get_if_active(struct xe_device *xe);
 bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (7 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05  9:57   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime Rodrigo Vivi
                   ` (27 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's convert the kunit tests that currently rely on
xe_device_mem_access_{get,put} to the direct xe_pm_runtime_{get,put} calls.
While doing this we need to move the get/put calls to the outer
bounds of the tests, to ensure consistency with the other usages of
pm_runtime on the regular paths.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/tests/xe_bo.c      |  8 ++++----
 drivers/gpu/drm/xe/tests/xe_migrate.c |  7 +++++--
 drivers/gpu/drm/xe/tests/xe_mocs.c    | 14 ++++++++++----
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index 3436fd9cf2b2..0926a1c2eb86 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -163,7 +163,7 @@ static int ccs_test_run_device(struct xe_device *xe)
 		return 0;
 	}
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 
 	for_each_tile(tile, xe, id) {
 		/* For igfx run only for primary tile */
@@ -172,7 +172,7 @@ static int ccs_test_run_device(struct xe_device *xe)
 		ccs_test_run_tile(xe, tile, test);
 	}
 
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 
 	return 0;
 }
@@ -335,12 +335,12 @@ static int evict_test_run_device(struct xe_device *xe)
 		return 0;
 	}
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 
 	for_each_tile(tile, xe, id)
 		evict_test_run_tile(xe, tile, test);
 
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
index a6523df0f1d3..ce531498f57f 100644
--- a/drivers/gpu/drm/xe/tests/xe_migrate.c
+++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
@@ -10,6 +10,7 @@
 #include "tests/xe_pci_test.h"
 
 #include "xe_pci.h"
+#include "xe_pm.h"
 
 static bool sanity_fence_failed(struct xe_device *xe, struct dma_fence *fence,
 				const char *str, struct kunit *test)
@@ -423,17 +424,19 @@ static int migrate_test_run_device(struct xe_device *xe)
 	struct xe_tile *tile;
 	int id;
 
+	xe_pm_runtime_get(xe);
+
 	for_each_tile(tile, xe, id) {
 		struct xe_migrate *m = tile->migrate;
 
 		kunit_info(test, "Testing tile id %d.\n", id);
 		xe_vm_lock(m->q->vm, true);
-		xe_device_mem_access_get(xe);
 		xe_migrate_sanity_test(m, test);
-		xe_device_mem_access_put(xe);
 		xe_vm_unlock(m->q->vm);
 	}
 
+	xe_pm_runtime_put(xe);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/xe/tests/xe_mocs.c b/drivers/gpu/drm/xe/tests/xe_mocs.c
index df5c36b70ab4..d561582f8063 100644
--- a/drivers/gpu/drm/xe/tests/xe_mocs.c
+++ b/drivers/gpu/drm/xe/tests/xe_mocs.c
@@ -45,7 +45,6 @@ static void read_l3cc_table(struct xe_gt *gt,
 
 	struct kunit *test = xe_cur_kunit();
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	ret = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	KUNIT_ASSERT_EQ_MSG(test, ret, 0, "Forcewake Failed.\n");
 	mocs_dbg(&gt_to_xe(gt)->drm, "L3CC entries:%d\n", info->n_entries);
@@ -65,7 +64,6 @@ static void read_l3cc_table(struct xe_gt *gt,
 				   XELP_LNCFCMOCS(i).addr);
 	}
 	xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
-	xe_device_mem_access_put(gt_to_xe(gt));
 }
 
 static void read_mocs_table(struct xe_gt *gt,
@@ -80,7 +78,6 @@ static void read_mocs_table(struct xe_gt *gt,
 
 	struct kunit *test = xe_cur_kunit();
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	ret = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	KUNIT_ASSERT_EQ_MSG(test, ret, 0, "Forcewake Failed.\n");
 	mocs_dbg(&gt_to_xe(gt)->drm, "Global MOCS entries:%d\n", info->n_entries);
@@ -100,7 +97,6 @@ static void read_mocs_table(struct xe_gt *gt,
 				   XELP_GLOBAL_MOCS(i).addr);
 	}
 	xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
-	xe_device_mem_access_put(gt_to_xe(gt));
 }
 
 static int mocs_kernel_test_run_device(struct xe_device *xe)
@@ -113,6 +109,8 @@ static int mocs_kernel_test_run_device(struct xe_device *xe)
 	unsigned int flags;
 	int id;
 
+	xe_pm_runtime_get(xe);
+
 	for_each_gt(gt, xe, id) {
 		flags = live_mocs_init(&mocs, gt);
 		if (flags & HAS_GLOBAL_MOCS)
@@ -120,6 +118,9 @@ static int mocs_kernel_test_run_device(struct xe_device *xe)
 		if (flags & HAS_LNCF_MOCS)
 			read_l3cc_table(gt, &mocs.table);
 	}
+
+	xe_pm_runtime_put(xe);
+
 	return 0;
 }
 
@@ -139,6 +140,8 @@ static int mocs_reset_test_run_device(struct xe_device *xe)
 	int id;
 	struct kunit *test = xe_cur_kunit();
 
+	xe_pm_runtime_get(xe);
+
 	for_each_gt(gt, xe, id) {
 		flags = live_mocs_init(&mocs, gt);
 		kunit_info(test, "mocs_reset_test before reset\n");
@@ -156,6 +159,9 @@ static int mocs_reset_test_run_device(struct xe_device *xe)
 		if (flags & HAS_LNCF_MOCS)
 			read_l3cc_table(gt, &mocs.table);
 	}
+
+	xe_pm_runtime_put(xe);
+
 	return 0;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (8 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 10:46   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call Rodrigo Vivi
                   ` (26 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's ensure our PCI device is awake from the start of every GT
execution until the end of the execution.
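
The resulting pairing is, roughly (see the diff below):

	xe_sched_job_create()  ->  xe_pm_runtime_get_if_in_use()  /* ref taken */
	  ... job executes ...
	job_free()             ->  xe_pm_runtime_put()            /* ref dropped */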

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_exec_queue.c          | 19 -------------------
 drivers/gpu/drm/xe/xe_gpu_scheduler.c       |  8 ++++++--
 drivers/gpu/drm/xe/xe_gpu_scheduler.h       |  3 ++-
 drivers/gpu/drm/xe/xe_gpu_scheduler_types.h |  2 ++
 drivers/gpu/drm/xe/xe_guc_submit.c          |  2 +-
 drivers/gpu/drm/xe/xe_sched_job.c           | 12 +++++++-----
 6 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index c0b7434e78f1..04bd6a639d41 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -111,7 +111,6 @@ static void __xe_exec_queue_free(struct xe_exec_queue *q)
 
 static int __xe_exec_queue_init(struct xe_exec_queue *q)
 {
-	struct xe_device *xe = gt_to_xe(q->gt);
 	int i, err;
 
 	for (i = 0; i < q->width; ++i) {
@@ -124,17 +123,6 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q)
 	if (err)
 		goto err_lrc;
 
-	/*
-	 * Normally the user vm holds an rpm ref to keep the device
-	 * awake, and the context holds a ref for the vm, however for
-	 * some engines we use the kernels migrate vm underneath which offers no
-	 * such rpm ref, or we lack a vm. Make sure we keep a ref here, so we
-	 * can perform GuC CT actions when needed. Caller is expected to have
-	 * already grabbed the rpm ref outside any sensitive locks.
-	 */
-	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
-		drm_WARN_ON(&xe->drm, !xe_device_mem_access_get_if_ongoing(xe));
-
 	return 0;
 
 err_lrc:
@@ -221,8 +209,6 @@ void xe_exec_queue_fini(struct xe_exec_queue *q)
 
 	for (i = 0; i < q->width; ++i)
 		xe_lrc_finish(q->lrc + i);
-	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
-		xe_device_mem_access_put(gt_to_xe(q->gt));
 	__xe_exec_queue_free(q);
 }
 
@@ -704,9 +690,6 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 			if (XE_IOCTL_DBG(xe, !hwe))
 				return -EINVAL;
 
-			/* The migration vm doesn't hold rpm ref */
-			xe_device_mem_access_get(xe);
-
 			flags = EXEC_QUEUE_FLAG_PERSISTENT | EXEC_QUEUE_FLAG_VM |
 				(id ? EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD : 0);
 
@@ -715,8 +698,6 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 						   args->width, hwe, flags,
 						   args->extensions);
 
-			xe_device_mem_access_put(xe); /* now held by engine */
-
 			xe_vm_put(migrate_vm);
 			if (IS_ERR(new)) {
 				err = PTR_ERR(new);
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
index e4ad1d6ce1d5..457f814bfee8 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
@@ -4,6 +4,7 @@
  */
 
 #include "xe_gpu_scheduler.h"
+#include "xe_pm.h"
 
 static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
 {
@@ -48,13 +49,15 @@ static void xe_sched_process_msg_work(struct work_struct *w)
 
 	msg = xe_sched_get_msg(sched);
 	if (msg) {
+		xe_pm_runtime_get(sched->xe);
 		sched->ops->process_msg(msg);
-
 		xe_sched_process_msg_queue_if_ready(sched);
+		xe_pm_runtime_put(sched->xe);
 	}
 }
 
-int xe_sched_init(struct xe_gpu_scheduler *sched,
+int xe_sched_init(struct xe_device *xe,
+		  struct xe_gpu_scheduler *sched,
 		  const struct drm_sched_backend_ops *ops,
 		  const struct xe_sched_backend_ops *xe_ops,
 		  struct workqueue_struct *submit_wq,
@@ -63,6 +66,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
 		  atomic_t *score, const char *name,
 		  struct device *dev)
 {
+	sched->xe = xe;
 	sched->ops = xe_ops;
 	INIT_LIST_HEAD(&sched->msgs);
 	INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work);
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
index 10c6bb9c9386..bac37c311859 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
@@ -9,7 +9,8 @@
 #include "xe_gpu_scheduler_types.h"
 #include "xe_sched_job_types.h"
 
-int xe_sched_init(struct xe_gpu_scheduler *sched,
+int xe_sched_init(struct xe_device *xe,
+		  struct xe_gpu_scheduler *sched,
 		  const struct drm_sched_backend_ops *ops,
 		  const struct xe_sched_backend_ops *xe_ops,
 		  struct workqueue_struct *submit_wq,
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
index 6731b13da8bb..f14b88264025 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
@@ -43,6 +43,8 @@ struct xe_sched_backend_ops {
 struct xe_gpu_scheduler {
 	/** @base: DRM GPU scheduler */
 	struct drm_gpu_scheduler		base;
+	/** @xe: back pointer to Xe Device */
+	struct xe_device			*xe;
 	/** @ops: Xe scheduler ops */
 	const struct xe_sched_backend_ops	*ops;
 	/** @msgs: list of messages to be processed in @work_process_msg */
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 2b008ec1b6de..6b313b9b351a 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1219,7 +1219,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
 
 	timeout = (q->vm && xe_vm_in_lr_mode(q->vm)) ? MAX_SCHEDULE_TIMEOUT :
 		  q->sched_props.job_timeout_ms;
-	err = xe_sched_init(&ge->sched, &drm_sched_ops, &xe_sched_ops,
+	err = xe_sched_init(xe, &ge->sched, &drm_sched_ops, &xe_sched_ops,
 			    get_submit_wq(guc),
 			    q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES, 64,
 			    timeout, guc_to_gt(guc)->ordered_wq, NULL,
diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
index 01106a1156ad..c93284ec66c5 100644
--- a/drivers/gpu/drm/xe/xe_sched_job.c
+++ b/drivers/gpu/drm/xe/xe_sched_job.c
@@ -15,6 +15,7 @@
 #include "xe_hw_fence.h"
 #include "xe_lrc.h"
 #include "xe_macros.h"
+#include "xe_pm.h"
 #include "xe_trace.h"
 #include "xe_vm.h"
 
@@ -67,6 +68,8 @@ static void job_free(struct xe_sched_job *job)
 	struct xe_exec_queue *q = job->q;
 	bool is_migration = xe_sched_job_is_migration(q);
 
+	xe_pm_runtime_put(gt_to_xe(q->gt));
+
 	kmem_cache_free(xe_exec_queue_is_parallel(job->q) || is_migration ?
 			xe_sched_job_parallel_slab : xe_sched_job_slab, job);
 }
@@ -79,6 +82,7 @@ static struct xe_device *job_to_xe(struct xe_sched_job *job)
 struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 					 u64 *batch_addr)
 {
+	struct xe_device *xe = gt_to_xe(q->gt);
 	struct xe_sched_job *job;
 	struct dma_fence **fences;
 	bool is_migration = xe_sched_job_is_migration(q);
@@ -86,6 +90,9 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 	int i, j;
 	u32 width;
 
+	if (!xe_pm_runtime_get_if_in_use(xe))
+		drm_err(&xe->drm, "Failed to grab RPM ref for sched_job\n");
+
 	/* only a kernel context can submit a vm-less job */
 	XE_WARN_ON(!q->vm && !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
 
@@ -155,9 +162,6 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 	for (i = 0; i < width; ++i)
 		job->batch_addr[i] = batch_addr[i];
 
-	/* All other jobs require a VM to be open which has a ref */
-	if (unlikely(q->flags & EXEC_QUEUE_FLAG_KERNEL))
-		xe_device_mem_access_get(job_to_xe(job));
 	xe_device_assert_mem_access(job_to_xe(job));
 
 	trace_xe_sched_job_create(job);
@@ -189,8 +193,6 @@ void xe_sched_job_destroy(struct kref *ref)
 	struct xe_sched_job *job =
 		container_of(ref, struct xe_sched_job, refcount);
 
-	if (unlikely(job->q->flags & EXEC_QUEUE_FLAG_KERNEL))
-		xe_device_mem_access_put(job_to_xe(job));
 	xe_exec_queue_put(job->q);
 	dma_fence_put(job->fence);
 	drm_sched_job_cleanup(&job->drm);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (9 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 10:55   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 12/34] drm/xe: Ensure device is awake before removing it Rodrigo Vivi
                   ` (25 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's ensure our PCI device is awake on every sysfs call.
Let's increase the runtime_pm protection and start moving
it to the outer bounds.

For now, for the files with a small number of attr functions,
let's just call the runtime PM functions directly.
For the hw_engines entries with many files, let's add
a sysfs_ops wrapper (sketched below).
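
A hypothetical sketch of that wrapper (kobj_to_xe() is an assumed
helper; the actual patch wraps the kobj_attribute show/store similarly):

	static ssize_t xe_pm_sysfs_show(struct kobject *kobj,
					struct attribute *attr, char *buf)
	{
		struct kobj_attribute *kattr =
			container_of(attr, struct kobj_attribute, attr);
		struct xe_device *xe = kobj_to_xe(kobj);	/* assumed */
		ssize_t ret = -EIO;

		xe_pm_runtime_get(xe);
		if (kattr->show)
			ret = kattr->show(kobj, kattr, buf);
		xe_pm_runtime_put(xe);

		return ret;
	}

	static const struct sysfs_ops xe_pm_sysfs_ops = {
		.show = xe_pm_sysfs_show,
		/* .store wrapped the same way */
	};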

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device_sysfs.c          |  4 ++
 drivers/gpu/drm/xe/xe_gt_freq.c               | 38 +++++++++++-
 drivers/gpu/drm/xe/xe_gt_idle.c               | 23 +++++++-
 drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c     |  3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 58 ++++++++++++++++++-
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h |  7 +++
 drivers/gpu/drm/xe/xe_tile_sysfs.c            |  1 +
 7 files changed, 129 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device_sysfs.c b/drivers/gpu/drm/xe/xe_device_sysfs.c
index 99113a5a2b84..e47c8ad1bb17 100644
--- a/drivers/gpu/drm/xe/xe_device_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_device_sysfs.c
@@ -35,7 +35,9 @@ vram_d3cold_threshold_show(struct device *dev,
 	if (!xe)
 		return -EINVAL;
 
+	xe_pm_runtime_get(xe);
 	ret = sysfs_emit(buf, "%d\n", xe->d3cold.vram_threshold);
+	xe_pm_runtime_put(xe);
 
 	return ret;
 }
@@ -58,7 +60,9 @@ vram_d3cold_threshold_store(struct device *dev, struct device_attribute *attr,
 
 	drm_dbg(&xe->drm, "vram_d3cold_threshold: %u\n", vram_d3cold_threshold);
 
+	xe_pm_runtime_get(xe);
 	ret = xe_pm_set_vram_threshold(xe, vram_d3cold_threshold);
+	xe_pm_runtime_put(xe);
 
 	return ret ?: count;
 }
diff --git a/drivers/gpu/drm/xe/xe_gt_freq.c b/drivers/gpu/drm/xe/xe_gt_freq.c
index e5b0f4ecdbe8..32b9a743629c 100644
--- a/drivers/gpu/drm/xe/xe_gt_freq.c
+++ b/drivers/gpu/drm/xe/xe_gt_freq.c
@@ -15,6 +15,7 @@
 #include "xe_gt_sysfs.h"
 #include "xe_gt_throttle_sysfs.h"
 #include "xe_guc_pc.h"
+#include "xe_pm.h"
 
 /**
  * DOC: Xe GT Frequency Management
@@ -49,12 +50,23 @@ dev_to_pc(struct device *dev)
 	return &kobj_to_gt(dev->kobj.parent)->uc.guc.pc;
 }
 
+static struct xe_device *
+dev_to_xe(struct device *dev)
+{
+	return gt_to_xe(kobj_to_gt(dev->kobj.parent));
+}
+
 static ssize_t act_freq_show(struct device *dev,
 			     struct device_attribute *attr, char *buf)
 {
 	struct xe_guc_pc *pc = dev_to_pc(dev);
+	u32 freq;
+
+	xe_pm_runtime_get(dev_to_xe(dev));
+	freq = xe_guc_pc_get_act_freq(pc);
+	xe_pm_runtime_put(dev_to_xe(dev));
 
-	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_act_freq(pc));
+	return sysfs_emit(buf, "%d\n", freq);
 }
 static DEVICE_ATTR_RO(act_freq);
 
@@ -65,7 +77,9 @@ static ssize_t cur_freq_show(struct device *dev,
 	u32 freq;
 	ssize_t ret;
 
+	xe_pm_runtime_get(dev_to_xe(dev));
 	ret = xe_guc_pc_get_cur_freq(pc, &freq);
+	xe_pm_runtime_put(dev_to_xe(dev));
 	if (ret)
 		return ret;
 
@@ -77,8 +91,13 @@ static ssize_t rp0_freq_show(struct device *dev,
 			     struct device_attribute *attr, char *buf)
 {
 	struct xe_guc_pc *pc = dev_to_pc(dev);
+	u32 freq;
+
+	xe_pm_runtime_get(dev_to_xe(dev));
+	freq = xe_guc_pc_get_rp0_freq(pc);
+	xe_pm_runtime_put(dev_to_xe(dev));
 
-	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_rp0_freq(pc));
+	return sysfs_emit(buf, "%d\n", freq);
 }
 static DEVICE_ATTR_RO(rp0_freq);
 
@@ -86,8 +105,13 @@ static ssize_t rpe_freq_show(struct device *dev,
 			     struct device_attribute *attr, char *buf)
 {
 	struct xe_guc_pc *pc = dev_to_pc(dev);
+	u32 freq;
+
+	xe_pm_runtime_get(dev_to_xe(dev));
+	freq = xe_guc_pc_get_rpe_freq(pc);
+	xe_pm_runtime_put(dev_to_xe(dev));
 
-	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_rpe_freq(pc));
+	return sysfs_emit(buf, "%d\n", freq);
 }
 static DEVICE_ATTR_RO(rpe_freq);
 
@@ -107,7 +131,9 @@ static ssize_t min_freq_show(struct device *dev,
 	u32 freq;
 	ssize_t ret;
 
+	xe_pm_runtime_get(dev_to_xe(dev));
 	ret = xe_guc_pc_get_min_freq(pc, &freq);
+	xe_pm_runtime_put(dev_to_xe(dev));
 	if (ret)
 		return ret;
 
@@ -125,7 +151,9 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
 	if (ret)
 		return ret;
 
+	xe_pm_runtime_get(dev_to_xe(dev));
 	ret = xe_guc_pc_set_min_freq(pc, freq);
+	xe_pm_runtime_put(dev_to_xe(dev));
 	if (ret)
 		return ret;
 
@@ -140,7 +168,9 @@ static ssize_t max_freq_show(struct device *dev,
 	u32 freq;
 	ssize_t ret;
 
+	xe_pm_runtime_get(dev_to_xe(dev));
 	ret = xe_guc_pc_get_max_freq(pc, &freq);
+	xe_pm_runtime_put(dev_to_xe(dev));
 	if (ret)
 		return ret;
 
@@ -158,7 +188,9 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
 	if (ret)
 		return ret;
 
+	xe_pm_runtime_get(dev_to_xe(dev));
 	ret = xe_guc_pc_set_max_freq(pc, freq);
+	xe_pm_runtime_put(dev_to_xe(dev));
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c
index 9358f7336889..824ff458011f 100644
--- a/drivers/gpu/drm/xe/xe_gt_idle.c
+++ b/drivers/gpu/drm/xe/xe_gt_idle.c
@@ -12,6 +12,7 @@
 #include "xe_guc_pc.h"
 #include "regs/xe_gt_regs.h"
 #include "xe_mmio.h"
+#include "xe_pm.h"
 
 /**
  * DOC: Xe GT Idle
@@ -40,6 +41,15 @@ static struct xe_guc_pc *gtidle_to_pc(struct xe_gt_idle *gtidle)
 	return &gtidle_to_gt(gtidle)->uc.guc.pc;
 }
 
+static struct xe_device *
+pc_to_xe(struct xe_guc_pc *pc)
+{
+	struct xe_guc *guc = container_of(pc, struct xe_guc, pc);
+	struct xe_gt *gt = container_of(guc, struct xe_gt, uc.guc);
+
+	return gt_to_xe(gt);
+}
+
 static const char *gt_idle_state_to_string(enum xe_gt_idle_state state)
 {
 	switch (state) {
@@ -86,8 +96,14 @@ static ssize_t name_show(struct device *dev,
 			 struct device_attribute *attr, char *buff)
 {
 	struct xe_gt_idle *gtidle = dev_to_gtidle(dev);
+	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
+	ssize_t ret;
+
+	xe_pm_runtime_get(pc_to_xe(pc));
+	ret = sysfs_emit(buff, "%s\n", gtidle->name);
+	xe_pm_runtime_put(pc_to_xe(pc));
 
-	return sysfs_emit(buff, "%s\n", gtidle->name);
+	return ret;
 }
 static DEVICE_ATTR_RO(name);
 
@@ -98,7 +114,9 @@ static ssize_t idle_status_show(struct device *dev,
 	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
 	enum xe_gt_idle_state state;
 
+	xe_pm_runtime_get(pc_to_xe(pc));
 	state = gtidle->idle_status(pc);
+	xe_pm_runtime_put(pc_to_xe(pc));
 
 	return sysfs_emit(buff, "%s\n", gt_idle_state_to_string(state));
 }
@@ -111,7 +129,10 @@ static ssize_t idle_residency_ms_show(struct device *dev,
 	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
 	u64 residency;
 
+	xe_pm_runtime_get(pc_to_xe(pc));
 	residency = gtidle->idle_residency(pc);
+	xe_pm_runtime_put(pc_to_xe(pc));
+
 	return sysfs_emit(buff, "%llu\n", get_residency_ms(gtidle, residency));
 }
 static DEVICE_ATTR_RO(idle_residency_ms);
diff --git a/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c b/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
index 63d640591a52..9c33045ff1ef 100644
--- a/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
@@ -11,6 +11,7 @@
 #include "xe_gt_sysfs.h"
 #include "xe_gt_throttle_sysfs.h"
 #include "xe_mmio.h"
+#include "xe_pm.h"
 
 /**
  * DOC: Xe GT Throttle
@@ -38,10 +39,12 @@ static u32 read_perf_limit_reasons(struct xe_gt *gt)
 {
 	u32 reg;
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	if (xe_gt_is_media_type(gt))
 		reg = xe_mmio_read32(gt, MTL_MEDIA_PERF_LIMIT_REASONS);
 	else
 		reg = xe_mmio_read32(gt, GT0_PERF_LIMIT_REASONS);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return reg;
 }
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
index 2345fb42fa39..9e23ca7f45ad 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
@@ -9,6 +9,7 @@
 
 #include "xe_gt.h"
 #include "xe_hw_engine_class_sysfs.h"
+#include "xe_pm.h"
 
 #define MAX_ENGINE_CLASS_NAME_LEN    16
 static int xe_add_hw_engine_class_defaults(struct xe_device *xe,
@@ -513,6 +514,7 @@ kobj_xe_hw_engine_class(struct xe_device *xe, struct kobject *parent, char *name
 		kobject_put(&keclass->base);
 		return NULL;
 	}
+	keclass->xe = xe;
 
 	err = drmm_add_action_or_reset(&xe->drm, kobj_xe_hw_engine_class_fini,
 				       &keclass->base);
@@ -567,9 +569,63 @@ static void xe_hw_engine_sysfs_kobj_release(struct kobject *kobj)
 	kfree(kobj);
 }
 
+#include "xe_pm.h"
+
+static inline struct xe_device *pdev_to_xe_device(struct pci_dev *pdev)
+{
+	return pci_get_drvdata(pdev);
+}
+
+static inline struct xe_device *to_xe_device(const struct drm_device *dev)
+{
+	return container_of(dev, struct xe_device, drm);
+}
+
+static ssize_t xe_hw_engine_class_sysfs_attr_show(struct kobject *kobj,
+						  struct attribute *attr,
+						  char *buf)
+{
+	struct xe_device *xe = kobj_to_xe(kobj);
+	struct kobj_attribute *kattr;
+	ssize_t ret = -EIO;
+
+	kattr = container_of(attr, struct kobj_attribute, attr);
+	if (kattr->show) {
+		xe_pm_runtime_get(xe);
+		ret = kattr->show(kobj, kattr, buf);
+		xe_pm_runtime_put(xe);
+	}
+
+	return ret;
+}
+
+static ssize_t xe_hw_engine_class_sysfs_attr_store(struct kobject *kobj,
+						   struct attribute *attr,
+						   const char *buf,
+						   size_t count)
+{
+	struct xe_device *xe = kobj_to_xe(kobj);
+	struct kobj_attribute *kattr;
+	ssize_t ret = -EIO;
+
+	kattr = container_of(attr, struct kobj_attribute, attr);
+	if (kattr->store) {
+		xe_pm_runtime_get(xe);
+		ret = kattr->store(kobj, kattr, buf, count);
+		xe_pm_runtime_put(xe);
+	}
+
+	return ret;
+}
+
+static const struct sysfs_ops xe_hw_engine_class_sysfs_ops = {
+	.show = xe_hw_engine_class_sysfs_attr_show,
+	.store = xe_hw_engine_class_sysfs_attr_store,
+};
+
 static const struct kobj_type xe_hw_engine_sysfs_kobj_type = {
 	.release = xe_hw_engine_sysfs_kobj_release,
-	.sysfs_ops = &kobj_sysfs_ops,
+	.sysfs_ops = &xe_hw_engine_class_sysfs_ops,
 };
 
 static void hw_engine_class_sysfs_fini(struct drm_device *drm, void *arg)
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
index ec5ba673b314..28a0d7c909c0 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
@@ -26,6 +26,8 @@ struct kobj_eclass {
 	struct kobject base;
 	/** @eclass: A pointer to the hw engine class interface */
 	struct xe_hw_engine_class_intf *eclass;
+	/** @xe: A pointer to the xe device */
+	struct xe_device *xe;
 };
 
 static inline struct xe_hw_engine_class_intf *kobj_to_eclass(struct kobject *kobj)
@@ -33,4 +35,9 @@ static inline struct xe_hw_engine_class_intf *kobj_to_eclass(struct kobject *kob
 	return container_of(kobj, struct kobj_eclass, base)->eclass;
 }
 
+static inline struct xe_device *kobj_to_xe(struct kobject *kobj)
+{
+	return container_of(kobj, struct kobj_eclass, base)->xe;
+}
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_tile_sysfs.c b/drivers/gpu/drm/xe/xe_tile_sysfs.c
index 0662968d7bcb..237a0761d3ad 100644
--- a/drivers/gpu/drm/xe/xe_tile_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_tile_sysfs.c
@@ -7,6 +7,7 @@
 #include <linux/sysfs.h>
 #include <drm/drm_managed.h>
 
+#include "xe_pm.h"
 #include "xe_tile.h"
 #include "xe_tile_sysfs.h"
 #include "xe_vram_freq.h"
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 12/34] drm/xe: Ensure device is awake before removing it
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (10 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:05   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls Rodrigo Vivi
                   ` (24 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

If the device is suspended, module unload might run into trouble.
So, let's ensure that the very first step of the "unprobe" (remove)
flow is to wake the device.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index df6b3b31f419..0f296df729c4 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -686,8 +686,8 @@ static void xe_pci_remove(struct pci_dev *pdev)
 	if (!xe) /* driver load aborted, nothing to cleanup */
 		return;
 
-	xe_device_remove(xe);
 	xe_pm_runtime_fini(xe);
+	xe_device_remove(xe);
 	pci_set_drvdata(pdev, NULL);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (11 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 12/34] drm/xe: Ensure device is awake before removing it Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:08   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call Rodrigo Vivi
                   ` (23 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

These paths are now protected at their outer bounds by init,
sysfs, or removal, so we don't need these mem_access protections
around the GuC PC calls anymore.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_guc_pc.c | 62 ++++++----------------------------
 1 file changed, 10 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_pc.c b/drivers/gpu/drm/xe/xe_guc_pc.c
index f71085228cb3..ce39eac7c8f5 100644
--- a/drivers/gpu/drm/xe/xe_guc_pc.c
+++ b/drivers/gpu/drm/xe/xe_guc_pc.c
@@ -381,8 +381,6 @@ u32 xe_guc_pc_get_act_freq(struct xe_guc_pc *pc)
 	struct xe_device *xe = gt_to_xe(gt);
 	u32 freq;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
-
 	/* When in RC6, actual frequency reported will be 0. */
 	if (GRAPHICS_VERx100(xe) >= 1270) {
 		freq = xe_mmio_read32(gt, MTL_MIRROR_TARGET_WP1);
@@ -394,8 +392,6 @@ u32 xe_guc_pc_get_act_freq(struct xe_guc_pc *pc)
 
 	freq = decode_freq(freq);
 
-	xe_device_mem_access_put(gt_to_xe(gt));
-
 	return freq;
 }
 
@@ -412,14 +408,13 @@ int xe_guc_pc_get_cur_freq(struct xe_guc_pc *pc, u32 *freq)
 	struct xe_gt *gt = pc_to_gt(pc);
 	int ret;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	/*
 	 * GuC SLPC plays with cur freq request when GuCRC is enabled
 	 * Block RC6 for a more reliable read.
 	 */
 	ret = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (ret)
-		goto out;
+		return ret;
 
 	*freq = xe_mmio_read32(gt, RPNSWREQ);
 
@@ -427,9 +422,7 @@ int xe_guc_pc_get_cur_freq(struct xe_guc_pc *pc, u32 *freq)
 	*freq = decode_freq(*freq);
 
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-out:
-	xe_device_mem_access_put(gt_to_xe(gt));
-	return ret;
+	return 0;
 }
 
 /**
@@ -451,12 +444,7 @@ u32 xe_guc_pc_get_rp0_freq(struct xe_guc_pc *pc)
  */
 u32 xe_guc_pc_get_rpe_freq(struct xe_guc_pc *pc)
 {
-	struct xe_gt *gt = pc_to_gt(pc);
-	struct xe_device *xe = gt_to_xe(gt);
-
-	xe_device_mem_access_get(xe);
 	pc_update_rp_values(pc);
-	xe_device_mem_access_put(xe);
 
 	return pc->rpe_freq;
 }
@@ -485,7 +473,6 @@ int xe_guc_pc_get_min_freq(struct xe_guc_pc *pc, u32 *freq)
 	struct xe_gt *gt = pc_to_gt(pc);
 	int ret;
 
-	xe_device_mem_access_get(pc_to_xe(pc));
 	mutex_lock(&pc->freq_lock);
 	if (!pc->freq_ready) {
 		/* Might be in the middle of a gt reset */
@@ -511,7 +498,6 @@ int xe_guc_pc_get_min_freq(struct xe_guc_pc *pc, u32 *freq)
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 out:
 	mutex_unlock(&pc->freq_lock);
-	xe_device_mem_access_put(pc_to_xe(pc));
 	return ret;
 }
 
@@ -528,7 +514,6 @@ int xe_guc_pc_set_min_freq(struct xe_guc_pc *pc, u32 freq)
 {
 	int ret;
 
-	xe_device_mem_access_get(pc_to_xe(pc));
 	mutex_lock(&pc->freq_lock);
 	if (!pc->freq_ready) {
 		/* Might be in the middle of a gt reset */
@@ -544,8 +529,6 @@ int xe_guc_pc_set_min_freq(struct xe_guc_pc *pc, u32 freq)
 
 out:
 	mutex_unlock(&pc->freq_lock);
-	xe_device_mem_access_put(pc_to_xe(pc));
-
 	return ret;
 }
 
@@ -561,7 +544,6 @@ int xe_guc_pc_get_max_freq(struct xe_guc_pc *pc, u32 *freq)
 {
 	int ret;
 
-	xe_device_mem_access_get(pc_to_xe(pc));
 	mutex_lock(&pc->freq_lock);
 	if (!pc->freq_ready) {
 		/* Might be in the middle of a gt reset */
@@ -577,7 +559,6 @@ int xe_guc_pc_get_max_freq(struct xe_guc_pc *pc, u32 *freq)
 
 out:
 	mutex_unlock(&pc->freq_lock);
-	xe_device_mem_access_put(pc_to_xe(pc));
 	return ret;
 }
 
@@ -594,7 +575,6 @@ int xe_guc_pc_set_max_freq(struct xe_guc_pc *pc, u32 freq)
 {
 	int ret;
 
-	xe_device_mem_access_get(pc_to_xe(pc));
 	mutex_lock(&pc->freq_lock);
 	if (!pc->freq_ready) {
 		/* Might be in the middle of a gt reset */
@@ -610,7 +590,6 @@ int xe_guc_pc_set_max_freq(struct xe_guc_pc *pc, u32 freq)
 
 out:
 	mutex_unlock(&pc->freq_lock);
-	xe_device_mem_access_put(pc_to_xe(pc));
 	return ret;
 }
 
@@ -623,8 +602,6 @@ enum xe_gt_idle_state xe_guc_pc_c_status(struct xe_guc_pc *pc)
 	struct xe_gt *gt = pc_to_gt(pc);
 	u32 reg, gt_c_state;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
-
 	if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270) {
 		reg = xe_mmio_read32(gt, MTL_MIRROR_TARGET_WP1);
 		gt_c_state = REG_FIELD_GET(MTL_CC_MASK, reg);
@@ -633,8 +610,6 @@ enum xe_gt_idle_state xe_guc_pc_c_status(struct xe_guc_pc *pc)
 		gt_c_state = REG_FIELD_GET(RCN_MASK, reg);
 	}
 
-	xe_device_mem_access_put(gt_to_xe(gt));
-
 	switch (gt_c_state) {
 	case GT_C6:
 		return GT_IDLE_C6;
@@ -654,9 +629,7 @@ u64 xe_guc_pc_rc6_residency(struct xe_guc_pc *pc)
 	struct xe_gt *gt = pc_to_gt(pc);
 	u32 reg;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	reg = xe_mmio_read32(gt, GT_GFX_RC6);
-	xe_device_mem_access_put(gt_to_xe(gt));
 
 	return reg;
 }
@@ -670,9 +643,7 @@ u64 xe_guc_pc_mc6_residency(struct xe_guc_pc *pc)
 	struct xe_gt *gt = pc_to_gt(pc);
 	u64 reg;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	reg = xe_mmio_read32(gt, MTL_MEDIA_MC6);
-	xe_device_mem_access_put(gt_to_xe(gt));
 
 	return reg;
 }
@@ -801,23 +772,19 @@ int xe_guc_pc_gucrc_disable(struct xe_guc_pc *pc)
 	if (xe->info.skip_guc_pc)
 		return 0;
 
-	xe_device_mem_access_get(pc_to_xe(pc));
-
 	ret = pc_action_setup_gucrc(pc, XE_GUCRC_HOST_CONTROL);
 	if (ret)
-		goto out;
+		return ret;
 
 	ret = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (ret)
-		goto out;
+		return ret;
 
 	xe_gt_idle_disable_c6(gt);
 
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 
-out:
-	xe_device_mem_access_put(pc_to_xe(pc));
-	return ret;
+	return 0;
 }
 
 static void pc_init_pcode_freq(struct xe_guc_pc *pc)
@@ -870,11 +837,9 @@ int xe_guc_pc_start(struct xe_guc_pc *pc)
 
 	xe_gt_assert(gt, xe_device_uc_enabled(xe));
 
-	xe_device_mem_access_get(pc_to_xe(pc));
-
 	ret = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (ret)
-		goto out_fail_force_wake;
+		return ret;
 
 	if (xe->info.skip_guc_pc) {
 		if (xe->info.platform != XE_PVC)
@@ -914,8 +879,6 @@ int xe_guc_pc_start(struct xe_guc_pc *pc)
 
 out:
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-out_fail_force_wake:
-	xe_device_mem_access_put(pc_to_xe(pc));
 	return ret;
 }
 
@@ -928,12 +891,9 @@ int xe_guc_pc_stop(struct xe_guc_pc *pc)
 	struct xe_device *xe = pc_to_xe(pc);
 	int ret;
 
-	xe_device_mem_access_get(pc_to_xe(pc));
-
 	if (xe->info.skip_guc_pc) {
 		xe_gt_idle_disable_c6(pc_to_gt(pc));
-		ret = 0;
-		goto out;
+		return 0;
 	}
 
 	mutex_lock(&pc->freq_lock);
@@ -942,16 +902,14 @@ int xe_guc_pc_stop(struct xe_guc_pc *pc)
 
 	ret = pc_action_shutdown(pc);
 	if (ret)
-		goto out;
+		return ret;
 
 	if (wait_for_pc_state(pc, SLPC_GLOBAL_STATE_NOT_RUNNING)) {
 		drm_err(&pc_to_xe(pc)->drm, "GuC PC Shutdown failed\n");
-		ret = -EIO;
+		return -EIO;
 	}
 
-out:
-	xe_device_mem_access_put(pc_to_xe(pc));
-	return ret;
+	return 0;
 }
 
 /**
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (12 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:10   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls Rodrigo Vivi
                   ` (22 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's ensure our PCI device is awake on every debugfs call,
increasing the runtime_pm protection and moving it towards
the outer bounds of the driver.

Also convert the mem_access get/put calls in these paths, now
that they are not needed anymore.
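
The debugfs pattern is analogous to the sysfs one. A minimal
sketch, assuming an illustrative node_to_xe() helper and dump
function:

	static int some_info(struct seq_file *m, void *data)
	{
		struct xe_device *xe = node_to_xe(m->private);	/* assumed */
		struct drm_printer p = drm_seq_file_printer(m);

		xe_pm_runtime_get(xe);
		dump_some_state(xe, &p);	/* placeholder for the dump */
		xe_pm_runtime_put(xe);

		return 0;
	}

The forcewake_open/forcewake_release pair is the exception: there
the wakeref is taken at open time and only dropped at release.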

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_debugfs.c     | 10 +++---
 drivers/gpu/drm/xe/xe_gt_debugfs.c  | 53 ++++++++++++++++++++++++++---
 drivers/gpu/drm/xe/xe_guc_debugfs.c |  9 ++---
 drivers/gpu/drm/xe/xe_huc_debugfs.c |  5 +--
 drivers/gpu/drm/xe/xe_ttm_sys_mgr.c |  5 ++-
 5 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index 01db5b27bec5..8abdf3c17e1d 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -12,6 +12,7 @@
 #include "xe_bo.h"
 #include "xe_device.h"
 #include "xe_gt_debugfs.h"
+#include "xe_pm.h"
 #include "xe_step.h"
 
 #ifdef CONFIG_DRM_XE_DEBUG
@@ -37,6 +38,8 @@ static int info(struct seq_file *m, void *data)
 	struct xe_gt *gt;
 	u8 id;
 
+	xe_pm_runtime_get(xe);
+
 	drm_printf(&p, "graphics_verx100 %d\n", xe->info.graphics_verx100);
 	drm_printf(&p, "media_verx100 %d\n", xe->info.media_verx100);
 	drm_printf(&p, "stepping G:%s M:%s D:%s B:%s\n",
@@ -63,6 +66,7 @@ static int info(struct seq_file *m, void *data)
 			   gt->info.engine_mask);
 	}
 
+	xe_pm_runtime_put(xe);
 	return 0;
 }
 
@@ -76,8 +80,7 @@ static int forcewake_open(struct inode *inode, struct file *file)
 	struct xe_gt *gt;
 	u8 id;
 
-	xe_device_mem_access_get(xe);
-
+	xe_pm_runtime_get(xe);
 	for_each_gt(gt, xe, id)
 		XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 
@@ -92,8 +95,7 @@ static int forcewake_release(struct inode *inode, struct file *file)
 
 	for_each_gt(gt, xe, id)
 		XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
index c4b67cf09f8f..6b4dc2927727 100644
--- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
@@ -18,6 +18,7 @@
 #include "xe_lrc.h"
 #include "xe_macros.h"
 #include "xe_pat.h"
+#include "xe_pm.h"
 #include "xe_reg_sr.h"
 #include "xe_reg_whitelist.h"
 #include "xe_uc_debugfs.h"
@@ -37,10 +38,10 @@ static int hw_engines(struct seq_file *m, void *data)
 	enum xe_hw_engine_id id;
 	int err;
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (err) {
-		xe_device_mem_access_put(xe);
+		xe_pm_runtime_put(xe);
 		return err;
 	}
 
@@ -48,7 +49,7 @@ static int hw_engines(struct seq_file *m, void *data)
 		xe_hw_engine_print(hwe, &p);
 
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 	if (err)
 		return err;
 
@@ -59,18 +60,23 @@ static int force_reset(struct seq_file *m, void *data)
 {
 	struct xe_gt *gt = node_to_gt(m->private);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_gt_reset_async(gt);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return 0;
 }
 
 static int sa_info(struct seq_file *m, void *data)
 {
-	struct xe_tile *tile = gt_to_tile(node_to_gt(m->private));
+	struct xe_gt *gt = node_to_gt(m->private);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	drm_suballoc_dump_debug_info(&tile->mem.kernel_bb_pool->base, &p,
 				     tile->mem.kernel_bb_pool->gpu_addr);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return 0;
 }
@@ -80,7 +86,9 @@ static int topology(struct seq_file *m, void *data)
 	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_gt_topology_dump(gt, &p);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return 0;
 }
@@ -90,7 +98,9 @@ static int steering(struct seq_file *m, void *data)
 	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_gt_mcr_steering_dump(gt, &p);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return 0;
 }
@@ -99,8 +109,13 @@ static int ggtt(struct seq_file *m, void *data)
 {
 	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
+	int ret;
+
+	xe_pm_runtime_get(gt_to_xe(gt));
+	ret = xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
-	return xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
+	return ret;
 }
 
 static int register_save_restore(struct seq_file *m, void *data)
@@ -110,6 +125,8 @@ static int register_save_restore(struct seq_file *m, void *data)
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 
+	xe_pm_runtime_get(gt_to_xe(gt));
+
 	xe_reg_sr_dump(&gt->reg_sr, &p);
 	drm_printf(&p, "\n");
 
@@ -127,6 +144,8 @@ static int register_save_restore(struct seq_file *m, void *data)
 	for_each_hw_engine(hwe, gt, id)
 		xe_reg_whitelist_dump(&hwe->reg_whitelist, &p);
 
+	xe_pm_runtime_put(gt_to_xe(gt));
+
 	return 0;
 }
 
@@ -135,7 +154,9 @@ static int workarounds(struct seq_file *m, void *data)
 	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_wa_dump(gt, &p);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return 0;
 }
@@ -145,48 +166,70 @@ static int pat(struct seq_file *m, void *data)
 	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_pat_dump(gt, &p);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	return 0;
 }
 
 static int rcs_default_lrc(struct seq_file *m, void *data)
 {
+	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_RENDER);
+	xe_pm_runtime_put(gt_to_xe(gt));
+
 	return 0;
 }
 
 static int ccs_default_lrc(struct seq_file *m, void *data)
 {
+	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_COMPUTE);
+	xe_pm_runtime_put(gt_to_xe(gt));
+
 	return 0;
 }
 
 static int bcs_default_lrc(struct seq_file *m, void *data)
 {
+	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_COPY);
+	xe_pm_runtime_put(gt_to_xe(gt));
+
 	return 0;
 }
 
 static int vcs_default_lrc(struct seq_file *m, void *data)
 {
+	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_VIDEO_DECODE);
+	xe_pm_runtime_put(gt_to_xe(gt));
+
 	return 0;
 }
 
 static int vecs_default_lrc(struct seq_file *m, void *data)
 {
+	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_VIDEO_ENHANCE);
+	xe_pm_runtime_put(gt_to_xe(gt));
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_debugfs.c b/drivers/gpu/drm/xe/xe_guc_debugfs.c
index ffd7d53bcc42..d3822cbea273 100644
--- a/drivers/gpu/drm/xe/xe_guc_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_guc_debugfs.c
@@ -14,6 +14,7 @@
 #include "xe_guc_ct.h"
 #include "xe_guc_log.h"
 #include "xe_macros.h"
+#include "xe_pm.h"
 
 static struct xe_guc *node_to_guc(struct drm_info_node *node)
 {
@@ -26,9 +27,9 @@ static int guc_info(struct seq_file *m, void *data)
 	struct xe_device *xe = guc_to_xe(guc);
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 	xe_guc_print_info(guc, &p);
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 
 	return 0;
 }
@@ -39,9 +40,9 @@ static int guc_log(struct seq_file *m, void *data)
 	struct xe_device *xe = guc_to_xe(guc);
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 	xe_guc_log_print(&guc->log, &p);
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_huc_debugfs.c b/drivers/gpu/drm/xe/xe_huc_debugfs.c
index 18585a7eeb9d..3a888a40188b 100644
--- a/drivers/gpu/drm/xe/xe_huc_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_huc_debugfs.c
@@ -12,6 +12,7 @@
 #include "xe_gt.h"
 #include "xe_huc.h"
 #include "xe_macros.h"
+#include "xe_pm.h"
 
 static struct xe_gt *
 huc_to_gt(struct xe_huc *huc)
@@ -36,9 +37,9 @@ static int huc_info(struct seq_file *m, void *data)
 	struct xe_device *xe = huc_to_xe(huc);
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 	xe_huc_print_info(huc, &p);
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
index 3e1fa0c832ca..9844a8edbfe1 100644
--- a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
@@ -73,7 +73,10 @@ static void xe_ttm_sys_mgr_del(struct ttm_resource_manager *man,
 static void xe_ttm_sys_mgr_debug(struct ttm_resource_manager *man,
 				 struct drm_printer *printer)
 {
-
+	/*
+	 * This function is called by debugfs entry and would require
+	 * pm_runtime_{get,put} wrappers around any operation.
+	 */
 }
 
 static const struct ttm_resource_manager_func xe_ttm_sys_mgr_func = {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (13 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:15   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm Rodrigo Vivi
                   ` (21 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Continue on the path of entirely removing the mem_access helpers
in favor of direct xe_pm_runtime calls. The dma-buf attach and
detach points are one of the outer bounds of this protection.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_dma_buf.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index da2627ed6ae7..5b26af21e029 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -16,6 +16,7 @@
 #include "tests/xe_test.h"
 #include "xe_bo.h"
 #include "xe_device.h"
+#include "xe_pm.h"
 #include "xe_ttm_vram_mgr.h"
 #include "xe_vm.h"
 
@@ -33,7 +34,7 @@ static int xe_dma_buf_attach(struct dma_buf *dmabuf,
 	if (!attach->peer2peer && !xe_bo_can_migrate(gem_to_xe_bo(obj), XE_PL_TT))
 		return -EOPNOTSUPP;
 
-	xe_device_mem_access_get(to_xe_device(obj->dev));
+	xe_pm_runtime_get(to_xe_device(obj->dev));
 	return 0;
 }
 
@@ -42,7 +43,7 @@ static void xe_dma_buf_detach(struct dma_buf *dmabuf,
 {
 	struct drm_gem_object *obj = attach->dmabuf->priv;
 
-	xe_device_mem_access_put(to_xe_device(obj->dev));
+	xe_pm_runtime_put(to_xe_device(obj->dev));
 }
 
 static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (14 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:23   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls Rodrigo Vivi
                   ` (20 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

This is not needed any longer, now that we have all the protection
in place with the runtime pm itself.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c | 8 --------
 drivers/gpu/drm/xe/xe_device.h | 1 -
 drivers/gpu/drm/xe/xe_pm.c     | 3 ---
 3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index ab41202ecaf8..1711d9064ecc 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -673,14 +673,6 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
 		DIV_ROUND_UP_ULL(size, NUM_BYTES_PER_CCS_BYTE(xe)) : 0;
 }
 
-bool xe_device_mem_access_ongoing(struct xe_device *xe)
-{
-	if (xe_pm_read_callback_task(xe) != NULL)
-		return true;
-
-	return atomic_read(&xe->mem_access.ref);
-}
-
 /**
  * xe_device_assert_mem_access - Inspect the current runtime_pm state.
  * @xe: xe device instance
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 270124da1e00..74074939d157 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -146,7 +146,6 @@ bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe);
 void xe_device_mem_access_put(struct xe_device *xe);
 
 void xe_device_assert_mem_access(struct xe_device *xe);
-bool xe_device_mem_access_ongoing(struct xe_device *xe);
 
 static inline bool xe_device_in_fault_mode(struct xe_device *xe)
 {
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index d98f4bb3ad02..967d3dc0ded5 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -281,9 +281,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	u8 id;
 	int err = 0;
 
-	if (xe->d3cold.allowed && xe_device_mem_access_ongoing(xe))
-		return -EBUSY;
-
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (15 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:25   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime Rodrigo Vivi
                   ` (19 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Continue the work to kill the mem_access in favor of a pure runtime pm.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_hwmon.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index 89c6f7f84b5a..bcb96469b81a 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -16,6 +16,7 @@
 #include "xe_mmio.h"
 #include "xe_pcode.h"
 #include "xe_pcode_api.h"
+#include "xe_pm.h"
 
 enum xe_hwmon_reg {
 	REG_PKG_RAPL_LIMIT,
@@ -264,7 +265,7 @@ xe_hwmon_power1_max_interval_show(struct device *dev, struct device_attribute *a
 	u32 x, y, x_w = 2; /* 2 bits */
 	u64 r, tau4, out;
 
-	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_get(gt_to_xe(hwmon->gt));
 
 	mutex_lock(&hwmon->hwmon_lock);
 
@@ -273,7 +274,7 @@ xe_hwmon_power1_max_interval_show(struct device *dev, struct device_attribute *a
 
 	mutex_unlock(&hwmon->hwmon_lock);
 
-	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_put(gt_to_xe(hwmon->gt));
 
 	x = REG_FIELD_GET(PKG_PWR_LIM_1_TIME_X, r);
 	y = REG_FIELD_GET(PKG_PWR_LIM_1_TIME_Y, r);
@@ -352,7 +353,7 @@ xe_hwmon_power1_max_interval_store(struct device *dev, struct device_attribute *
 
 	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
 
-	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_get(gt_to_xe(hwmon->gt));
 
 	mutex_lock(&hwmon->hwmon_lock);
 
@@ -361,7 +362,7 @@ xe_hwmon_power1_max_interval_store(struct device *dev, struct device_attribute *
 
 	mutex_unlock(&hwmon->hwmon_lock);
 
-	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_put(gt_to_xe(hwmon->gt));
 
 	return count;
 }
@@ -382,12 +383,12 @@ static umode_t xe_hwmon_attributes_visible(struct kobject *kobj,
 	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
 	int ret = 0;
 
-	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_get(gt_to_xe(hwmon->gt));
 
 	if (attr == &sensor_dev_attr_power1_max_interval.dev_attr.attr)
 		ret = xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT) ? attr->mode : 0;
 
-	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_put(gt_to_xe(hwmon->gt));
 
 	return ret;
 }
@@ -608,7 +609,7 @@ xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
 	int ret;
 
-	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_get(gt_to_xe(hwmon->gt));
 
 	switch (type) {
 	case hwmon_power:
@@ -628,7 +629,7 @@ xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 		break;
 	}
 
-	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_put(gt_to_xe(hwmon->gt));
 
 	return ret;
 }
@@ -640,7 +641,7 @@ xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
 	int ret;
 
-	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_get(gt_to_xe(hwmon->gt));
 
 	switch (type) {
 	case hwmon_power:
@@ -660,7 +661,7 @@ xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 		break;
 	}
 
-	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_put(gt_to_xe(hwmon->gt));
 
 	return ret;
 }
@@ -672,7 +673,7 @@ xe_hwmon_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
 	int ret;
 
-	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_get(gt_to_xe(hwmon->gt));
 
 	switch (type) {
 	case hwmon_power:
@@ -686,7 +687,7 @@ xe_hwmon_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 		break;
 	}
 
-	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+	xe_pm_runtime_put(gt_to_xe(hwmon->gt));
 
 	return ret;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (16 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:31   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 19/34] drm/xe: Remove pm_runtime lockdep Rodrigo Vivi
                   ` (18 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

The mem_access itself is not holding any lock; it is only
training lockdep on the potentially scary locks that can be
taken during runtime pm. We are soon going to kill the
mem_access get and put helpers in favor of direct xe_pm_runtime
calls, so let's just move this dummy lock to where it now
belongs.
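
The dummy-lock training technique itself is generic. A minimal
sketch of the idea (names here are illustrative, not the actual
xe code):

	#ifdef CONFIG_LOCKDEP
	static struct lockdep_map resume_lockdep_map = {
		.name = "resume_lockdep_map"
	};
	#endif

	static void outer_get(void)
	{
		/*
		 * Acquire/release while the callers' locks are held, so
		 * lockdep records (callers_locks) -> resume_lockdep_map.
		 */
		lock_map_acquire(&resume_lockdep_map);
		lock_map_release(&resume_lockdep_map);
		/* ... then trigger the actual synchronous resume ... */
	}

	static void resume_callback(void)
	{
		/*
		 * Hold the map across the callback body, so lockdep
		 * records resume_lockdep_map -> (locks taken on resume).
		 */
		lock_map_acquire(&resume_lockdep_map);
		/* ... resume work, possibly taking locks ... */
		lock_map_release(&resume_lockdep_map);
	}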

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c | 23 -----------------
 drivers/gpu/drm/xe/xe_device.h |  4 ---
 drivers/gpu/drm/xe/xe_pm.c     | 45 ++++++++++++++++++++++++++++------
 3 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 1711d9064ecc..0b27e09b3db9 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -45,12 +45,6 @@
 #include "xe_wait_user_fence.h"
 #include "xe_hwmon.h"
 
-#ifdef CONFIG_LOCKDEP
-struct lockdep_map xe_device_mem_access_lockdep_map = {
-	.name = "xe_device_mem_access_lockdep_map"
-};
-#endif
-
 static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct xe_device *xe = to_xe_device(dev);
@@ -722,23 +716,6 @@ void xe_device_mem_access_get(struct xe_device *xe)
 	if (xe_pm_read_callback_task(xe) == current)
 		return;
 
-	/*
-	 * Since the resume here is synchronous it can be quite easy to deadlock
-	 * if we are not careful. Also in practice it might be quite timing
-	 * sensitive to ever see the 0 -> 1 transition with the callers locks
-	 * held, so deadlocks might exist but are hard for lockdep to ever see.
-	 * With this in mind, help lockdep learn about the potentially scary
-	 * stuff that can happen inside the runtime_resume callback by acquiring
-	 * a dummy lock (it doesn't protect anything and gets compiled out on
-	 * non-debug builds).  Lockdep then only needs to see the
-	 * mem_access_lockdep_map -> runtime_resume callback once, and then can
-	 * hopefully validate all the (callers_locks) -> mem_access_lockdep_map.
-	 * For example if the (callers_locks) are ever grabbed in the
-	 * runtime_resume callback, lockdep should give us a nice splat.
-	 */
-	lock_map_acquire(&xe_device_mem_access_lockdep_map);
-	lock_map_release(&xe_device_mem_access_lockdep_map);
-
 	xe_pm_runtime_get(xe);
 	ref = atomic_inc_return(&xe->mem_access.ref);
 
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 74074939d157..050d4b6cdc65 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -16,10 +16,6 @@ struct xe_file;
 #include "xe_force_wake.h"
 #include "xe_macros.h"
 
-#ifdef CONFIG_LOCKDEP
-extern struct lockdep_map xe_device_mem_access_lockdep_map;
-#endif
-
 static inline struct xe_device *to_xe_device(const struct drm_device *dev)
 {
 	return container_of(dev, struct xe_device, drm);
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 967d3dc0ded5..86bf225dba02 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -66,6 +66,13 @@
  * management (RPS).
  */
 
+
+#ifdef CONFIG_LOCKDEP
+struct lockdep_map xe_pm_runtime_lockdep_map = {
+	.name = "xe_pm_runtime_lockdep_map"
+};
+#endif
+
 /**
  * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
  * @xe: xe device instance
@@ -285,11 +292,11 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	xe_pm_write_callback_task(xe, current);
 
 	/*
-	 * The actual xe_device_mem_access_put() is always async underneath, so
+	 * The actual xe_pm_runtime_put() is always async underneath, so
 	 * exactly where that is called should makes no difference to us. However
 	 * we still need to be very careful with the locks that this callback
 	 * acquires and the locks that are acquired and held by any callers of
-	 * xe_device_mem_access_get(). We already have the matching annotation
+	 * xe_runtime_pm_get(). We already have the matching annotation
 	 * on that side, but we also need it here. For example lockdep should be
 	 * able to tell us if the following scenario is in theory possible:
 	 *
@@ -297,15 +304,15 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	 * lock(A)                       |
 	 *                               | xe_pm_runtime_suspend()
 	 *                               |      lock(A)
-	 * xe_device_mem_access_get()    |
+	 * xe_pm_runtime_get()           |
 	 *
 	 * This will clearly deadlock since rpm core needs to wait for
 	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
 	 * on CPU0 which prevents CPU1 making forward progress.  With the
-	 * annotation here and in xe_device_mem_access_get() lockdep will see
+	 * annotation here and in xe_pm_runtime_get() lockdep will see
 	 * the potential lock inversion and give us a nice splat.
 	 */
-	lock_map_acquire(&xe_device_mem_access_lockdep_map);
+	lock_map_acquire(&xe_pm_runtime_lockdep_map);
 
 	/*
 	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
@@ -334,7 +341,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	if (xe->d3cold.allowed)
 		xe_display_pm_runtime_suspend(xe);
 out:
-	lock_map_release(&xe_device_mem_access_lockdep_map);
+	lock_map_release(&xe_pm_runtime_lockdep_map);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
 }
@@ -354,7 +361,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);
 
-	lock_map_acquire(&xe_device_mem_access_lockdep_map);
+	lock_map_acquire(&xe_pm_runtime_lockdep_map);
 
 	/*
 	 * It can be possible that xe has allowed d3cold but other pcie devices
@@ -393,11 +400,31 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 			goto out;
 	}
 out:
-	lock_map_release(&xe_device_mem_access_lockdep_map);
+	lock_map_release(&xe_pm_runtime_lockdep_map);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
 }
 
+/*
+ * For places where resume is synchronous it can be quite easy to deadlock
+ * if we are not careful. Also in practice it might be quite timing
+ * sensitive to ever see the 0 -> 1 transition with the callers locks
+ * held, so deadlocks might exist but are hard for lockdep to ever see.
+ * With this in mind, help lockdep learn about the potentially scary
+ * stuff that can happen inside the runtime_resume callback by acquiring
+ * a dummy lock (it doesn't protect anything and gets compiled out on
+ * non-debug builds).  Lockdep then only needs to see the
+ * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
+ * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
+ * For example if the (callers_locks) are ever grabbed in the
+ * runtime_resume callback, lockdep should give us a nice splat.
+ */
+static void pm_runtime_lockdep_training(void)
+{
+	lock_map_acquire(&xe_pm_runtime_lockdep_map);
+	lock_map_release(&xe_pm_runtime_lockdep_map);
+}
+
 /**
  * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
  * @xe: xe device instance
@@ -409,6 +436,7 @@ void xe_pm_runtime_get(struct xe_device *xe)
 	if (xe_pm_read_callback_task(xe) == current)
 		return;
 
+	pm_runtime_lockdep_training();
 	pm_runtime_resume(xe->drm.dev);
 }
 
@@ -438,6 +466,7 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
 	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
 		return -ELOOP;
 
+	pm_runtime_lockdep_training();
 	return pm_runtime_get_sync(xe->drm.dev);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 19/34] drm/xe: Remove pm_runtime lockdep
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (17 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 11:54   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 20/34] drm/xe: Stop checking for power_lost on D3Cold Rodrigo Vivi
                   ` (17 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

This lockdep annotation was initially designed for the
mem_access, which needed to run the resume synchronously from
the inner bounds and count the references.

With the runtime pm moving to the outer bounds of the driver
and the mem_access replaced by pure rpm get/put references, it
is no longer needed and is, as a matter of fact, just splatting
false positives such as the following:

               -> #1 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
[  384.778761]        xe_pm_runtime_get+0xa3/0x100 [xe]
[  384.783871]        invalidation_fence_work_func+0x7f/0x2b0 [xe]
[  384.789942]        invalidation_fence_init+0x8c2/0xce0 [xe]
[  384.795671]        __xe_pt_unbind_vma+0x4a7/0x1be0 [xe]
[  384.801050]        xe_vm_unbind+0x22f/0xc70 [xe]
[  384.805821]        __xe_vma_op_execute+0xc67/0x1af0 [xe]
[  384.811286]        xe_vm_bind_ioctl+0x3a36/0x66c0 [xe]
[  384.816579]        drm_ioctl_kernel+0x14a/0x2c0
[  384.821132]        drm_ioctl+0x4c6/0xab0
[  384.825073]        xe_drm_ioctl+0xa1/0xe0 [xe]
[  384.829651]        __x64_sys_ioctl+0x130/0x1a0
[  384.834115]        do_syscall_64+0x5c/0xe0
[  384.838232]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  384.843829]
               -> #0 (reservation_ww_class_mutex){+.+.}-{4:4}:
[  384.850911]        __lock_acquire+0x3261/0x6330
[  384.855462]        lock_acquire+0x19b/0x4d0
[  384.859666]        __ww_mutex_lock.constprop.0+0x1d8/0x3500
[  384.865263]        ww_mutex_lock+0x38/0x150
[  384.869465]        xe_bo_lock+0x41/0x70 [xe]
[  384.873869]        xe_bo_evict_all+0x7ad/0xa40 [xe]
[  384.878883]        xe_pm_runtime_suspend+0x297/0x340 [xe]
[  384.884431]        xe_pci_runtime_suspend+0x3b/0x1e0 [xe]
[  384.889975]        pci_pm_runtime_suspend+0x168/0x540
[  384.895052]        __rpm_callback+0xa9/0x390
[  384.899343]        rpm_callback+0x1aa/0x210
[  384.903543]        rpm_suspend+0x2ea/0x14c0
[  384.907746]        pm_runtime_work+0x133/0x170
[  384.912213]        process_one_work+0x73b/0x1230
[  384.916853]        worker_thread+0x726/0x1320
[  384.921237]        kthread+0x2ee/0x3d0
[  384.925005]        ret_from_fork+0x2d/0x70
[  384.929120]        ret_from_fork_asm+0x1b/0x30
[  384.933585]
               other info that might help us debug this:

[  384.941625]  Possible unsafe locking scenario:

[  384.947572]        CPU0                    CPU1
[  384.952123]        ----                    ----
[  384.956676]   lock(xe_pm_runtime_lockdep_map);
[  384.961140]                                lock(reservation_ww_class_mutex);
[  384.968220]                                lock(xe_pm_runtime_lockdep_map);
[  384.975214]   lock(reservation_ww_class_mutex);
[  384.979765]
                *** DEADLOCK ***

As a matter of fact, there's actually a third lock that is not
in this picture:
spin_lock_irq(&dev->power.lock);
and INIT_WORK(&dev->power.work, pm_runtime_work);

The pm_callback_task will ensure that there are no recursive
calls of the resume function, and it will increase the
reference counter anyway.

Then, the pm_runtime workqueue and spin lock ensure that no
resume or suspend operation ever runs in parallel with another
resume or suspend operation.

With that, the only thing we are actually doing here is wrongly
training lockdep, basically telling it that we will acquire
some locks on resume and on suspend concurrently, entirely
ignoring their serialization and protection.

The above scenario is simply not possible because each operation
is serialized by spin_lock_irq(&dev->power.lock). However, we
are telling lockdep that lock(xe_pm_runtime_lockdep_map) occurs
before &dev->power.lock, and lockdep is not capable of seeing
that other protection.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 55 --------------------------------------
 1 file changed, 55 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 86bf225dba02..f49e449d9fb7 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -67,12 +67,6 @@
  */
 
 
-#ifdef CONFIG_LOCKDEP
-struct lockdep_map xe_pm_runtime_lockdep_map = {
-	.name = "xe_pm_runtime_lockdep_map"
-};
-#endif
-
 /**
  * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
  * @xe: xe device instance
@@ -291,29 +285,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);
 
-	/*
-	 * The actual xe_pm_runtime_put() is always async underneath, so
-	 * exactly where that is called should makes no difference to us. However
-	 * we still need to be very careful with the locks that this callback
-	 * acquires and the locks that are acquired and held by any callers of
-	 * xe_runtime_pm_get(). We already have the matching annotation
-	 * on that side, but we also need it here. For example lockdep should be
-	 * able to tell us if the following scenario is in theory possible:
-	 *
-	 * CPU0                          | CPU1 (kworker)
-	 * lock(A)                       |
-	 *                               | xe_pm_runtime_suspend()
-	 *                               |      lock(A)
-	 * xe_pm_runtime_get()           |
-	 *
-	 * This will clearly deadlock since rpm core needs to wait for
-	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
-	 * on CPU0 which prevents CPU1 making forward progress.  With the
-	 * annotation here and in xe_pm_runtime_get() lockdep will see
-	 * the potential lock inversion and give us a nice splat.
-	 */
-	lock_map_acquire(&xe_pm_runtime_lockdep_map);
-
 	/*
 	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
 	 * also checks and delets bo entry from user fault list.
@@ -341,7 +312,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	if (xe->d3cold.allowed)
 		xe_display_pm_runtime_suspend(xe);
 out:
-	lock_map_release(&xe_pm_runtime_lockdep_map);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
 }
@@ -361,8 +331,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);
 
-	lock_map_acquire(&xe_pm_runtime_lockdep_map);
-
 	/*
 	 * It can be possible that xe has allowed d3cold but other pcie devices
 	 * in gfx card soc would have blocked d3cold, therefore card has not
@@ -400,31 +368,10 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 			goto out;
 	}
 out:
-	lock_map_release(&xe_pm_runtime_lockdep_map);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
 }
 
-/*
- * For places where resume is synchronous it can be quite easy to deadlock
- * if we are not careful. Also in practice it might be quite timing
- * sensitive to ever see the 0 -> 1 transition with the callers locks
- * held, so deadlocks might exist but are hard for lockdep to ever see.
- * With this in mind, help lockdep learn about the potentially scary
- * stuff that can happen inside the runtime_resume callback by acquiring
- * a dummy lock (it doesn't protect anything and gets compiled out on
- * non-debug builds).  Lockdep then only needs to see the
- * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
- * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
- * For example if the (callers_locks) are ever grabbed in the
- * runtime_resume callback, lockdep should give us a nice splat.
- */
-static void pm_runtime_lockdep_training(void)
-{
-	lock_map_acquire(&xe_pm_runtime_lockdep_map);
-	lock_map_release(&xe_pm_runtime_lockdep_map);
-}
-
 /**
  * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
  * @xe: xe device instance
@@ -436,7 +383,6 @@ void xe_pm_runtime_get(struct xe_device *xe)
 	if (xe_pm_read_callback_task(xe) == current)
 		return;
 
-	pm_runtime_lockdep_training();
 	pm_runtime_resume(xe->drm.dev);
 }
 
@@ -466,7 +412,6 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
 	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
 		return -ELOOP;
 
-	pm_runtime_lockdep_training();
 	return pm_runtime_get_sync(xe->drm.dev);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 20/34] drm/xe: Stop checking for power_lost on D3Cold
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (18 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 19/34] drm/xe: Remove pm_runtime lockdep Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime Rodrigo Vivi
                   ` (16 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

The GuC reset status is not reliable for this purpose: every
once in a while we end up in a D3Cold situation where power_lost
is false, and without the proper memory restoration the GuC
reload and display will fail to come back from D3Cold.

So, let's do a full restoration of everything whenever there is
a risk of losing power, without further optimizations.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h |  3 ---
 drivers/gpu/drm/xe/xe_pm.c           | 12 ++----------
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index eb2b806a1d23..6914cab9191d 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -431,9 +431,6 @@ struct xe_device {
 		/** @d3cold.allowed: Indicates if d3cold is a valid device state */
 		bool allowed;
 
-		/** @d3cold.power_lost: Indicates if card has really lost power. */
-		bool power_lost;
-
 		/**
 		 * @d3cold.vram_threshold:
 		 *
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index f49e449d9fb7..8bec1f175c9d 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -331,15 +331,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);
 
-	/*
-	 * It can be possible that xe has allowed d3cold but other pcie devices
-	 * in gfx card soc would have blocked d3cold, therefore card has not
-	 * really lost power. Detecting primary Gt power is sufficient.
-	 */
-	gt = xe_device_get_gt(xe, 0);
-	xe->d3cold.power_lost = xe_guc_in_reset(&gt->uc.guc);
-
-	if (xe->d3cold.allowed && xe->d3cold.power_lost) {
+	if (xe->d3cold.allowed) {
 		for_each_gt(gt, xe, id) {
 			err = xe_pcode_init(gt);
 			if (err)
@@ -362,7 +354,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	for_each_gt(gt, xe, id)
 		xe_gt_resume(gt);
 
-	if (xe->d3cold.allowed && xe->d3cold.power_lost) {
+	if (xe->d3cold.allowed) {
 		err = xe_bo_restore_user(xe);
 		if (err)
 			goto out;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (19 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 20/34] drm/xe: Stop checking for power_lost on D3Cold Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 12:23   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 22/34] drm/xe: Keep D0 for the entire duration of a LR VM Rodrigo Vivi
                   ` (15 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

For the G2H and TLB invalidation cases, the outer-bounds protection
of the pm_runtime is not going to work. So we need to have the
inner-side protections here.
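
In essence this is a per-message conditional reference. A minimal
sketch of the pattern, using only the xe_pm_runtime_get_if_in_use()
helper from the diff below (the function and counter names here are
illustrative, not the exact patch):

	/* Take a runtime PM ref for an expected G2H, but only if the
	 * device is still awake, so that processing the G2H cannot
	 * race a D3 transition. Paired with xe_pm_runtime_put() once
	 * the G2H is consumed or the CT state is cleared. */
	static bool track_outstanding_g2h(struct xe_device *xe, u32 *pm_refs)
	{
		if (!xe_pm_runtime_get_if_in_use(xe))
			return false;	/* device suspending or suspended */

		(*pm_refs)++;
		return true;
	}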

Cc: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_debugfs.c      |  3 ++
 drivers/gpu/drm/xe/xe_guc_ct.c       | 79 +++++++++++++---------------
 drivers/gpu/drm/xe/xe_guc_ct_types.h |  2 +
 3 files changed, 41 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index 8abdf3c17e1d..b3669eac23c9 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -64,6 +64,9 @@ static int info(struct seq_file *m, void *data)
 			   xe_force_wake_ref(gt_to_fw(gt), XE_FW_GT));
 		drm_printf(&p, "gt%d engine_mask 0x%llx\n", id,
 			   gt->info.engine_mask);
+		drm_printf(&p, "gt%d g2h_outstanding %d g2h_pm_refs %d\n", id,
+			   gt->uc.guc.ct.g2h_outstanding,
+			   gt->uc.guc.ct.g2h_pm_refs);
 	}
 
 	xe_pm_runtime_put(xe);
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index f3d356383ced..139cfe733661 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -287,16 +287,49 @@ static int guc_ct_control_toggle(struct xe_guc_ct *ct, bool enable)
 	return ret > 0 ? -EPROTO : ret;
 }
 
+static void guc_ct_g2h_outstanding_clear(struct xe_guc_ct *ct)
+{
+	int i;
+
+	for (i = 0; i < ct->g2h_pm_refs; i++)
+		xe_pm_runtime_put(ct_to_xe(ct));
+	ct->g2h_outstanding = 0;
+	ct->g2h_pm_refs = 0;
+}
+
+static void guc_ct_g2h_outstanding_dec(struct xe_guc_ct *ct)
+{
+	if (ct->g2h_pm_refs > 0)
+		xe_pm_runtime_put(ct_to_xe(ct));
+	ct->g2h_pm_refs--;
+	ct->g2h_outstanding--;
+}
+
+static void guc_ct_g2h_outstanding_add(struct xe_guc_ct *ct, int num_g2h)
+{
+	struct xe_device *xe = ct_to_xe(ct);
+	int i;
+
+	for (i = 0; i < num_g2h; i++) {
+		if (xe_pm_runtime_get_if_in_use(xe))
+			ct->g2h_pm_refs++;
+		else
+			drm_err(&xe->drm, "Failed to grab RPM ref for outstanding g2h\n");
+	}
+	ct->g2h_outstanding += num_g2h;
+}
+
 static void xe_guc_ct_set_state(struct xe_guc_ct *ct,
 				enum xe_guc_ct_state state)
 {
 	mutex_lock(&ct->lock);		/* Serialise dequeue_one_g2h() */
 	spin_lock_irq(&ct->fast_lock);	/* Serialise CT fast-path */
 
+
 	xe_gt_assert(ct_to_gt(ct), ct->g2h_outstanding == 0 ||
 		     state == XE_GUC_CT_STATE_STOPPED);
 
-	ct->g2h_outstanding = 0;
+	guc_ct_g2h_outstanding_clear(ct);
 	ct->state = state;
 
 	spin_unlock_irq(&ct->fast_lock);
@@ -428,7 +461,7 @@ static void __g2h_reserve_space(struct xe_guc_ct *ct, u32 g2h_len, u32 num_g2h)
 		lockdep_assert_held(&ct->fast_lock);
 
 		ct->ctbs.g2h.info.space -= g2h_len;
-		ct->g2h_outstanding += num_g2h;
+		guc_ct_g2h_outstanding_add(ct, num_g2h);
 	}
 }
 
@@ -439,7 +472,7 @@ static void __g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
 		  ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space);
 
 	ct->ctbs.g2h.info.space += g2h_len;
-	--ct->g2h_outstanding;
+	guc_ct_g2h_outstanding_dec(ct);
 }
 
 static void g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
@@ -1199,14 +1232,8 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
  */
 void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
 {
-	struct xe_device *xe = ct_to_xe(ct);
-	bool ongoing;
 	int len;
 
-	ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
-	if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
-		return;
-
 	spin_lock(&ct->fast_lock);
 	do {
 		len = g2h_read(ct, ct->fast_msg, true);
@@ -1214,9 +1241,6 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
 			g2h_fast_path(ct, ct->fast_msg, len);
 	} while (len > 0);
 	spin_unlock(&ct->fast_lock);
-
-	if (ongoing)
-		xe_device_mem_access_put(xe);
 }
 
 /* Returns less than zero on error, 0 on done, 1 on more available */
@@ -1247,36 +1271,8 @@ static int dequeue_one_g2h(struct xe_guc_ct *ct)
 static void g2h_worker_func(struct work_struct *w)
 {
 	struct xe_guc_ct *ct = container_of(w, struct xe_guc_ct, g2h_worker);
-	bool ongoing;
 	int ret;
 
-	/*
-	 * Normal users must always hold mem_access.ref around CT calls. However
-	 * during the runtime pm callbacks we rely on CT to talk to the GuC, but
-	 * at this stage we can't rely on mem_access.ref and even the
-	 * callback_task will be different than current.  For such cases we just
-	 * need to ensure we always process the responses from any blocking
-	 * ct_send requests or where we otherwise expect some response when
-	 * initiated from those callbacks (which will need to wait for the below
-	 * dequeue_one_g2h()).  The dequeue_one_g2h() will gracefully fail if
-	 * the device has suspended to the point that the CT communication has
-	 * been disabled.
-	 *
-	 * If we are inside the runtime pm callback, we can be the only task
-	 * still issuing CT requests (since that requires having the
-	 * mem_access.ref).  It seems like it might in theory be possible to
-	 * receive unsolicited events from the GuC just as we are
-	 * suspending-resuming, but those will currently anyway be lost when
-	 * eventually exiting from suspend, hence no need to wake up the device
-	 * here. If we ever need something stronger than get_if_ongoing() then
-	 * we need to be careful with blocking the pm callbacks from getting CT
-	 * responses, if the worker here is blocked on those callbacks
-	 * completing, creating a deadlock.
-	 */
-	ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
-	if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
-		return;
-
 	do {
 		mutex_lock(&ct->lock);
 		ret = dequeue_one_g2h(ct);
@@ -1290,9 +1286,6 @@ static void g2h_worker_func(struct work_struct *w)
 			kick_reset(ct);
 		}
 	} while (ret == 1);
-
-	if (ongoing)
-		xe_device_mem_access_put(ct_to_xe(ct));
 }
 
 static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h b/drivers/gpu/drm/xe/xe_guc_ct_types.h
index d29144c9f20b..4220230e7be4 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
@@ -108,6 +108,8 @@ struct xe_guc_ct {
 	} ctbs;
 	/** @g2h_outstanding: number of outstanding G2H */
 	u32 g2h_outstanding;
+	/** @g2h_pm_refs: number of pm_refs for pending G2H */
+	u32 g2h_pm_refs;
 	/** @g2h_worker: worker to process G2H messages */
 	struct work_struct g2h_worker;
 	/** @state: CT state */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 22/34] drm/xe: Keep D0 for the entire duration of a LR VM
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (20 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation Rodrigo Vivi
                   ` (14 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's keep this extra protection for Long Running (LR) VMs, since an
LR job's hw fence is signalled immediately after the job is scheduled
to the hardware. Once the hw fence is signalled, the job can
typically be freed, so the exec/sched protection might not be enough
there.
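
A minimal illustration of the lifetime at play, matching the pairing
in the diff below (the creation flags are illustrative):

	vm = xe_vm_create(xe, flags);	/* LR mode: takes a runtime PM ref */

	/* LR jobs' hw fences signal right after submission, so they
	 * cannot be relied upon to keep the device in D0 ... */

	xe_vm_close_and_put(vm);	/* ... the ref is only dropped here */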

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index d096a8c00bd4..9ad154a01ad7 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1330,6 +1330,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 		INIT_WORK(&vm->preempt.rebind_work, preempt_rebind_work_func);
 		vm->flags |= XE_VM_FLAG_LR_MODE;
 		vm->batch_invalidate_tlb = false;
+		xe_pm_runtime_get(xe);
 	}
 
 	/* Fill pt_root after allocating scratch tables */
@@ -1496,6 +1497,8 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	for_each_tile(tile, xe, id)
 		xe_range_fence_tree_fini(&vm->rftree[id]);
 
+	if (vm->flags & XE_VM_FLAG_LR_MODE)
+		xe_pm_runtime_put(xe);
 	xe_vm_put(vm);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (21 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 22/34] drm/xe: Keep D0 for the entire duration of a LR VM Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 12:41   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls Rodrigo Vivi
                   ` (13 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's grab the runtime references around TLB invalidation
to ensure that the hardware is awake in D0.
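
The key point is that the invalidation worker can run after the
context that queued it has already dropped its own reference, so it
must assert D0 itself. A rough sketch of that shape, using only the
calls from the diff below (surrounding context elided):

	static void invalidation_fence_work_func(struct work_struct *w)
	{
		struct invalidation_fence *ifence =
			container_of(w, struct invalidation_fence, work);

		/* The queueing context may already have dropped its PM
		 * ref by now, so re-assert D0 for the invalidation. */
		xe_pm_runtime_get(gt_to_xe(ifence->gt));
		xe_gt_tlb_invalidation_vma(ifence->gt, &ifence->base, ifence->vma);
		xe_pm_runtime_put(gt_to_xe(ifence->gt));
	}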

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c | 5 +++--
 drivers/gpu/drm/xe/xe_pt.c | 3 +++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 686d716c5581..17d3b7f69580 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -22,6 +22,7 @@
 #include "xe_gt.h"
 #include "xe_map.h"
 #include "xe_migrate.h"
+#include "xe_pm.h"
 #include "xe_preempt_fence.h"
 #include "xe_res_cursor.h"
 #include "xe_trace.h"
@@ -1136,7 +1137,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
 	int idx, r = 0;
 
 	if (needs_rpm)
-		xe_device_mem_access_get(xe);
+		xe_pm_runtime_get(xe);
 
 	ret = ttm_bo_vm_reserve(tbo, vmf);
 	if (ret)
@@ -1176,7 +1177,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
 	dma_resv_unlock(tbo->base.resv);
 out:
 	if (needs_rpm)
-		xe_device_mem_access_put(xe);
+		xe_pm_runtime_put(xe);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index de1030a47588..8d1d4bea7323 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -11,6 +11,7 @@
 #include "xe_gt.h"
 #include "xe_gt_tlb_invalidation.h"
 #include "xe_migrate.h"
+#include "xe_pm.h"
 #include "xe_pt_types.h"
 #include "xe_pt_walk.h"
 #include "xe_res_cursor.h"
@@ -1104,8 +1105,10 @@ static void invalidation_fence_work_func(struct work_struct *w)
 	struct invalidation_fence *ifence =
 		container_of(w, struct invalidation_fence, work);
 
+	xe_pm_runtime_get(gt_to_xe(ifence->gt));
 	trace_xe_gt_tlb_invalidation_fence_work_func(&ifence->base);
 	xe_gt_tlb_invalidation_vma(ifence->gt, &ifence->base, ifence->vma);
+	xe_pm_runtime_put(gt_to_xe(ifence->gt));
 }
 
 static int invalidation_fence_init(struct xe_gt *gt,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (22 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 12:43   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime Rodrigo Vivi
                   ` (12 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Every IOCTL is already protected on its outer bounds by
xe_pm_runtime_{get,put} calls, so these inner mem_access
protections can now be removed.
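
For reference, the outer-bounds protection this relies on (added
earlier in the series by "drm/xe: Runtime PM wake on every IOCTL")
boils down to wrapping the ioctl entry point. A simplified sketch,
with the wrapper shape illustrative rather than the exact patch:

	static long xe_drm_ioctl(struct file *file, unsigned int cmd,
				 unsigned long arg)
	{
		struct drm_file *file_priv = file->private_data;
		struct xe_device *xe = to_xe_device(file_priv->minor->dev);
		long ret;

		xe_pm_runtime_get(xe);
		ret = drm_ioctl(file, cmd, arg);
		xe_pm_runtime_put(xe);

		return ret;
	}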

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 9b35673b286c..86222c80a874 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -147,7 +147,6 @@ query_engine_cycles(struct xe_device *xe,
 	if (!hwe)
 		return -EINVAL;
 
-	xe_device_mem_access_get(xe);
 	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 
 	__read_timestamps(gt,
@@ -159,7 +158,6 @@ query_engine_cycles(struct xe_device *xe,
 			  cpu_clock);
 
 	xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
-	xe_device_mem_access_put(xe);
 	resp.width = 36;
 
 	/* Only write to the output fields of user query */
@@ -437,9 +435,7 @@ static int query_hwconfig(struct xe_device *xe,
 	if (!hwconfig)
 		return -ENOMEM;
 
-	xe_device_mem_access_get(xe);
 	xe_guc_hwconfig_copy(&gt->uc.guc, hwconfig);
-	xe_device_mem_access_put(xe);
 
 	if (copy_to_user(query_ptr, hwconfig, size)) {
 		kfree(hwconfig);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (23 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 13:11   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore Rodrigo Vivi
                   ` (11 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Let's directly use xe_pm_runtime_{get,put} instead of the
mem_access helpers that are going away soon.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_gsc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gsc.c b/drivers/gpu/drm/xe/xe_gsc.c
index 0b90fd9ef63a..904026634991 100644
--- a/drivers/gpu/drm/xe/xe_gsc.c
+++ b/drivers/gpu/drm/xe/xe_gsc.c
@@ -20,6 +20,7 @@
 #include "xe_huc.h"
 #include "xe_map.h"
 #include "xe_mmio.h"
+#include "xe_pm.h"
 #include "xe_sched_job.h"
 #include "xe_uc_fw.h"
 #include "xe_wa.h"
@@ -284,7 +285,7 @@ static void gsc_work(struct work_struct *work)
 	gsc->work_actions = 0;
 	spin_unlock_irq(&gsc->lock);
 
-	xe_device_mem_access_get(xe);
+	xe_pm_runtime_get(xe);
 	xe_force_wake_get(gt_to_fw(gt), XE_FW_GSC);
 
 	if (actions & GSC_ACTION_FW_LOAD) {
@@ -299,7 +300,7 @@ static void gsc_work(struct work_struct *work)
 		xe_gsc_proxy_request_handler(gsc);
 
 	xe_force_wake_put(gt_to_fw(gt), XE_FW_GSC);
-	xe_device_mem_access_put(xe);
+	xe_pm_runtime_put(xe);
 }
 
 int xe_gsc_init(struct xe_gsc *gsc)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (24 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 13:29   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 27/34] drm/xe: Remove useless mem_access during probe Rodrigo Vivi
                   ` (10 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

All the VM operations are now protected either by the IOCTL calls,
by sched/exec operations, or by the specific xe_pm_runtime_{get,put}
held for the entire duration of the LR VMs.

So, these inner mem_access protections can be removed.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 9ad154a01ad7..f314ff128028 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1280,9 +1280,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 
 	vm->pt_ops = &xelp_pt_ops;
 
-	if (!(flags & XE_VM_FLAG_MIGRATION))
-		xe_device_mem_access_get(xe);
-
 	vm_resv_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
 	if (!vm_resv_obj) {
 		err = -ENOMEM;
@@ -1391,8 +1388,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	for_each_tile(tile, xe, id)
 		xe_range_fence_tree_fini(&vm->rftree[id]);
 	kfree(vm);
-	if (!(flags & XE_VM_FLAG_MIGRATION))
-		xe_device_mem_access_put(xe);
 	return ERR_PTR(err);
 }
 
@@ -1515,8 +1510,6 @@ static void vm_destroy_work_func(struct work_struct *w)
 	xe_assert(xe, !vm->size);
 
 	if (!(vm->flags & XE_VM_FLAG_MIGRATION)) {
-		xe_device_mem_access_put(xe);
-
 		if (xe->info.has_asid && vm->usm.asid) {
 			mutex_lock(&xe->usm.lock);
 			lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 27/34] drm/xe: Remove useless mem_access during probe
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (25 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 13:18   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions Rodrigo Vivi
                   ` (9 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

xe_pm_init is the very last thing done during xe_pci_probe(), hence
these protections are useless from the point of view of ensuring
that the device is awake.

Let's remove them so we continue towards the goal of killing
xe_device_mem_access.
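
A simplified view of the ordering argument (not the full probe
function, only the relevant tail of it):

	static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
	{
		/* ... tile/gt/ggtt init runs here, with the device awake ... */

		err = xe_device_probe(xe);
		if (err)
			return err;

		xe_pm_init(xe);	/* runtime PM is only armed at the very end */

		return 0;
	}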

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_ggtt.c |  2 --
 drivers/gpu/drm/xe/xe_gt.c   |  6 ------
 drivers/gpu/drm/xe/xe_tile.c | 10 +++-------
 3 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 6fdf830678b3..7faf02523b03 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -201,14 +201,12 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
 	u64 start, end;
 
 	/* Display may have allocated inside ggtt, so be careful with clearing here */
-	xe_device_mem_access_get(tile_to_xe(ggtt->tile));
 	mutex_lock(&ggtt->lock);
 	drm_mm_for_each_hole(hole, &ggtt->mm, start, end)
 		xe_ggtt_clear(ggtt, start, end - start);
 
 	xe_ggtt_invalidate(ggtt);
 	mutex_unlock(&ggtt->lock);
-	xe_device_mem_access_put(tile_to_xe(ggtt->tile));
 }
 
 int xe_ggtt_init(struct xe_ggtt *ggtt)
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 675a2927a19e..c6c4292c21f5 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -349,7 +349,6 @@ static int gt_fw_domain_init(struct xe_gt *gt)
 {
 	int err, i;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	if (err)
 		goto err_hw_fence_irq;
@@ -407,7 +406,6 @@ static int gt_fw_domain_init(struct xe_gt *gt)
 
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 	XE_WARN_ON(err);
-	xe_device_mem_access_put(gt_to_xe(gt));
 
 	return 0;
 
@@ -417,7 +415,6 @@ static int gt_fw_domain_init(struct xe_gt *gt)
 err_hw_fence_irq:
 	for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
 		xe_hw_fence_irq_finish(&gt->fence_irq[i]);
-	xe_device_mem_access_put(gt_to_xe(gt));
 
 	return err;
 }
@@ -426,7 +423,6 @@ static int all_fw_domain_init(struct xe_gt *gt)
 {
 	int err, i;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (err)
 		goto err_hw_fence_irq;
@@ -489,7 +485,6 @@ static int all_fw_domain_init(struct xe_gt *gt)
 
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	XE_WARN_ON(err);
-	xe_device_mem_access_put(gt_to_xe(gt));
 
 	return 0;
 
@@ -498,7 +493,6 @@ static int all_fw_domain_init(struct xe_gt *gt)
 err_hw_fence_irq:
 	for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
 		xe_hw_fence_irq_finish(&gt->fence_irq[i]);
-	xe_device_mem_access_put(gt_to_xe(gt));
 
 	return err;
 }
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
index 044c20881de7..74ecb5f39438 100644
--- a/drivers/gpu/drm/xe/xe_tile.c
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -160,23 +160,19 @@ int xe_tile_init_noalloc(struct xe_tile *tile)
 {
 	int err;
 
-	xe_device_mem_access_get(tile_to_xe(tile));
-
 	err = tile_ttm_mgr_init(tile);
 	if (err)
-		goto err_mem_access;
+		return err;
 
 	tile->mem.kernel_bb_pool = xe_sa_bo_manager_init(tile, SZ_1M, 16);
 	if (IS_ERR(tile->mem.kernel_bb_pool))
-		err = PTR_ERR(tile->mem.kernel_bb_pool);
+		return PTR_ERR(tile->mem.kernel_bb_pool);
 
 	xe_wa_apply_tile_workarounds(tile);
 
 	xe_tile_sysfs_init(tile);
 
-err_mem_access:
-	xe_device_mem_access_put(tile_to_xe(tile));
-	return err;
+	return 0;
 }
 
 void xe_tile_migrate_wait(struct xe_tile *tile)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (26 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 27/34] drm/xe: Remove useless mem_access during probe Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 13:30   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime Rodrigo Vivi
                   ` (8 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

At these points, we are sure that the device is awake in D0: likely
in the middle of a transition, but awake nonetheless. So these extra
protections are useless. Let's remove them and continue with the
killing of xe_device_mem_access.
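
For context, a simplified view of the runtime-resume caller (per the
xe_pm.c hunks earlier in this series): by the time xe_gt_resume()
runs we are inside the PM callback, so the device cannot be anywhere
but D0:

	int xe_pm_runtime_resume(struct xe_device *xe)
	{
		xe_pm_write_callback_task(xe, current);	/* we are the PM callback */
		/* ... */
		for_each_gt(gt, xe, id)
			xe_gt_resume(gt);	/* device necessarily in D0 here */
		/* ... */
	}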

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_gt.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index c6c4292c21f5..9565ab9807bb 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -684,13 +684,11 @@ void xe_gt_reset_async(struct xe_gt *gt)
 
 void xe_gt_suspend_prepare(struct xe_gt *gt)
 {
-	xe_device_mem_access_get(gt_to_xe(gt));
 	XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 
 	xe_uc_stop_prepare(&gt->uc);
 
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-	xe_device_mem_access_put(gt_to_xe(gt));
 }
 
 int xe_gt_suspend(struct xe_gt *gt)
@@ -699,7 +697,6 @@ int xe_gt_suspend(struct xe_gt *gt)
 
 	xe_gt_sanitize(gt);
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (err)
 		goto err_msg;
@@ -709,7 +706,6 @@ int xe_gt_suspend(struct xe_gt *gt)
 		goto err_force_wake;
 
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-	xe_device_mem_access_put(gt_to_xe(gt));
 	xe_gt_info(gt, "suspended\n");
 
 	return 0;
@@ -717,7 +713,6 @@ int xe_gt_suspend(struct xe_gt *gt)
 err_force_wake:
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 err_msg:
-	xe_device_mem_access_put(gt_to_xe(gt));
 	xe_gt_err(gt, "suspend failed (%pe)\n", ERR_PTR(err));
 
 	return err;
@@ -727,7 +722,6 @@ int xe_gt_resume(struct xe_gt *gt)
 {
 	int err;
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (err)
 		goto err_msg;
@@ -737,7 +731,6 @@ int xe_gt_resume(struct xe_gt *gt)
 		goto err_force_wake;
 
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-	xe_device_mem_access_put(gt_to_xe(gt));
 	xe_gt_info(gt, "resumed\n");
 
 	return 0;
@@ -745,7 +738,6 @@ int xe_gt_resume(struct xe_gt *gt)
 err_force_wake:
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 err_msg:
-	xe_device_mem_access_put(gt_to_xe(gt));
 	xe_gt_err(gt, "resume failed (%pe)\n", ERR_PTR(err));
 
 	return err;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (27 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 13:33   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps Rodrigo Vivi
                   ` (7 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

We need to ensure that the device is in D0 on any kind of GT reset.
We are likely already protected by outer bounds like exec, but if
the exec/sched reference gets dropped on a hang, we might transition
to D3 before we are able to perform the gt_reset and recover.
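
A rough sketch of the window being closed, assuming the exec/sched
reference is the last one held when the hang is handled:

	/*
	 * 1. A job hangs; the scheduler cleans it up and the exec path
	 *    drops its runtime PM reference.
	 * 2. The device goes idle and may start autosuspending to D3.
	 * 3. The gt_reset() worker runs; without its own get()/put()
	 *    pair it could now race that D3 transition.
	 */
	xe_pm_runtime_get(gt_to_xe(gt));
	/* ... sanitize, grab forcewake, perform the reset itself ... */
	xe_pm_runtime_put(gt_to_xe(gt));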

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_gt.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 9565ab9807bb..ad4cf0a8adaf 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -43,6 +43,7 @@
 #include "xe_migrate.h"
 #include "xe_mmio.h"
 #include "xe_pat.h"
+#include "xe_pm.h"
 #include "xe_mocs.h"
 #include "xe_reg_sr.h"
 #include "xe_ring_ops.h"
@@ -617,9 +618,9 @@ static int gt_reset(struct xe_gt *gt)
 		goto err_fail;
 	}
 
+	xe_pm_runtime_get(gt_to_xe(gt));
 	xe_gt_sanitize(gt);
 
-	xe_device_mem_access_get(gt_to_xe(gt));
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	if (err)
 		goto err_msg;
@@ -643,8 +644,8 @@ static int gt_reset(struct xe_gt *gt)
 		goto err_out;
 
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
-	xe_device_mem_access_put(gt_to_xe(gt));
 	XE_WARN_ON(err);
+	xe_pm_runtime_put(gt_to_xe(gt));
 
 	xe_gt_info(gt, "reset done\n");
 
@@ -654,7 +655,7 @@ static int gt_reset(struct xe_gt *gt)
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 err_msg:
 	XE_WARN_ON(xe_uc_start(&gt->uc));
-	xe_device_mem_access_put(gt_to_xe(gt));
+	xe_pm_runtime_put(gt_to_xe(gt));
 err_fail:
 	xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err));
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (28 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-02-05 13:34   ` Matthew Auld
  2024-01-26 20:30 ` [RFC 31/34] drm/xe: Remove inner mem_access protections Rodrigo Vivi
                   ` (6 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

PAT dumps are already protected by the xe_pm_runtime_{get,put} around
the debugfs call. So, these inner mem_access calls can be removed.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pat.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
index 1ff6bc79e7d4..b253ef863c27 100644
--- a/drivers/gpu/drm/xe/xe_pat.c
+++ b/drivers/gpu/drm/xe/xe_pat.c
@@ -173,7 +173,6 @@ static void xelp_dump(struct xe_gt *gt, struct drm_printer *p)
 	struct xe_device *xe = gt_to_xe(gt);
 	int i, err;
 
-	xe_device_mem_access_get(xe);
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	if (err)
 		goto err_fw;
@@ -191,7 +190,6 @@ static void xelp_dump(struct xe_gt *gt, struct drm_printer *p)
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 err_fw:
 	xe_assert(xe, !err);
-	xe_device_mem_access_put(xe);
 }
 
 static const struct xe_pat_ops xelp_pat_ops = {
@@ -204,7 +202,6 @@ static void xehp_dump(struct xe_gt *gt, struct drm_printer *p)
 	struct xe_device *xe = gt_to_xe(gt);
 	int i, err;
 
-	xe_device_mem_access_get(xe);
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	if (err)
 		goto err_fw;
@@ -224,7 +221,6 @@ static void xehp_dump(struct xe_gt *gt, struct drm_printer *p)
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 err_fw:
 	xe_assert(xe, !err);
-	xe_device_mem_access_put(xe);
 }
 
 static const struct xe_pat_ops xehp_pat_ops = {
@@ -237,7 +233,6 @@ static void xehpc_dump(struct xe_gt *gt, struct drm_printer *p)
 	struct xe_device *xe = gt_to_xe(gt);
 	int i, err;
 
-	xe_device_mem_access_get(xe);
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	if (err)
 		goto err_fw;
@@ -255,7 +250,6 @@ static void xehpc_dump(struct xe_gt *gt, struct drm_printer *p)
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 err_fw:
 	xe_assert(xe, !err);
-	xe_device_mem_access_put(xe);
 }
 
 static const struct xe_pat_ops xehpc_pat_ops = {
@@ -268,7 +262,6 @@ static void xelpg_dump(struct xe_gt *gt, struct drm_printer *p)
 	struct xe_device *xe = gt_to_xe(gt);
 	int i, err;
 
-	xe_device_mem_access_get(xe);
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	if (err)
 		goto err_fw;
@@ -291,7 +284,6 @@ static void xelpg_dump(struct xe_gt *gt, struct drm_printer *p)
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 err_fw:
 	xe_assert(xe, !err);
-	xe_device_mem_access_put(xe);
 }
 
 /*
@@ -324,7 +316,6 @@ static void xe2_dump(struct xe_gt *gt, struct drm_printer *p)
 	int i, err;
 	u32 pat;
 
-	xe_device_mem_access_get(xe);
 	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
 	if (err)
 		goto err_fw;
@@ -369,7 +360,6 @@ static void xe2_dump(struct xe_gt *gt, struct drm_printer *p)
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 err_fw:
 	xe_assert(xe, !err);
-	xe_device_mem_access_put(xe);
 }
 
 static const struct xe_pat_ops xe2_pat_ops = {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 31/34] drm/xe: Remove inner mem_access protections
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (29 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 32/34] drm/xe: Kill xe_device_mem_access_{get*,put} Rodrigo Vivi
                   ` (5 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

At this point, all the outer-bounds callers of the direct
xe_pm_runtime_{get,put} should already be ensuring that the device
is in D0 at these points of BO creation and manipulation. So, it is
time to remove the very last users of xe_device_mem_access_{get,put}.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/display/xe_fb_pin.c | 7 ++-----
 drivers/gpu/drm/xe/xe_bo.c             | 5 -----
 drivers/gpu/drm/xe/xe_ggtt.c           | 4 ----
 3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
index 722c84a56607..077294ec50ec 100644
--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
+++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
@@ -190,10 +190,9 @@ static int __xe_pin_fb_vma_ggtt(struct intel_framebuffer *fb,
 	/* TODO: Consider sharing framebuffer mapping?
 	 * embed i915_vma inside intel_framebuffer
 	 */
-	xe_device_mem_access_get(tile_to_xe(ggtt->tile));
 	ret = mutex_lock_interruptible(&ggtt->lock);
 	if (ret)
-		goto out;
+		return ret;
 
 	align = XE_PAGE_SIZE;
 	if (xe_bo_is_vram(bo) && ggtt->flags & XE_GGTT_FLAGS_64K)
@@ -241,8 +240,6 @@ static int __xe_pin_fb_vma_ggtt(struct intel_framebuffer *fb,
 	xe_ggtt_invalidate(ggtt);
 out_unlock:
 	mutex_unlock(&ggtt->lock);
-out:
-	xe_device_mem_access_put(tile_to_xe(ggtt->tile));
 	return ret;
 }
 
@@ -381,4 +378,4 @@ struct i915_address_space *intel_dpt_create(struct intel_framebuffer *fb)
 void intel_dpt_destroy(struct i915_address_space *vm)
 {
 	return;
-}
\ No newline at end of file
+}
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 17d3b7f69580..99c6c5452f84 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -730,7 +730,6 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 	xe_assert(xe, migrate);
 
 	trace_xe_bo_move(bo);
-	xe_device_mem_access_get(xe);
 
 	if (xe_bo_is_pinned(bo) && !xe_bo_is_user(bo)) {
 		/*
@@ -754,7 +753,6 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 
 				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
 					ret = -EINVAL;
-					xe_device_mem_access_put(xe);
 					goto out;
 				}
 
@@ -772,7 +770,6 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 						new_mem, handle_system_ccs);
 		if (IS_ERR(fence)) {
 			ret = PTR_ERR(fence);
-			xe_device_mem_access_put(xe);
 			goto out;
 		}
 		if (!move_lacks_source) {
@@ -797,8 +794,6 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 		dma_fence_put(fence);
 	}
 
-	xe_device_mem_access_put(xe);
-
 out:
 	return ret;
 
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 7faf02523b03..1bbc5250f363 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -435,14 +435,12 @@ static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
 	if (err)
 		return err;
 
-	xe_device_mem_access_get(tile_to_xe(ggtt->tile));
 	mutex_lock(&ggtt->lock);
 	err = drm_mm_insert_node_in_range(&ggtt->mm, &bo->ggtt_node, bo->size,
 					  alignment, 0, start, end, 0);
 	if (!err)
 		xe_ggtt_map_bo(ggtt, bo);
 	mutex_unlock(&ggtt->lock);
-	xe_device_mem_access_put(tile_to_xe(ggtt->tile));
 
 	return err;
 }
@@ -460,7 +458,6 @@ int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
 
 void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node)
 {
-	xe_device_mem_access_get(tile_to_xe(ggtt->tile));
 	mutex_lock(&ggtt->lock);
 
 	xe_ggtt_clear(ggtt, node->start, node->size);
@@ -470,7 +467,6 @@ void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node)
 	xe_ggtt_invalidate(ggtt);
 
 	mutex_unlock(&ggtt->lock);
-	xe_device_mem_access_put(tile_to_xe(ggtt->tile));
 }
 
 void xe_ggtt_remove_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 32/34] drm/xe: Kill xe_device_mem_access_{get*,put}
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (30 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 31/34] drm/xe: Remove inner mem_access protections Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 33/34] drm/xe: Remove unused runtime pm helper Rodrigo Vivi
                   ` (4 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

No extra callers, so let's simply kill them for good.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c       | 53 ----------------------------
 drivers/gpu/drm/xe/xe_device.h       |  4 ---
 drivers/gpu/drm/xe/xe_device_types.h |  3 --
 3 files changed, 60 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 0b27e09b3db9..c76e12ec673a 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -683,59 +683,6 @@ void xe_device_assert_mem_access(struct xe_device *xe)
 	XE_WARN_ON(xe_pm_runtime_suspended(xe));
 }
 
-bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe)
-{
-	bool active;
-
-	if (xe_pm_read_callback_task(xe) == current)
-		return true;
-
-	active = xe_pm_runtime_get_if_active(xe);
-	if (active) {
-		int ref = atomic_inc_return(&xe->mem_access.ref);
-
-		xe_assert(xe, ref != S32_MAX);
-	}
-
-	return active;
-}
-
-void xe_device_mem_access_get(struct xe_device *xe)
-{
-	int ref;
-
-	/*
-	 * This looks racy, but should be fine since the pm_callback_task only
-	 * transitions from NULL -> current (and back to NULL again), during the
-	 * runtime_resume() or runtime_suspend() callbacks, for which there can
-	 * only be a single one running for our device. We only need to prevent
-	 * recursively calling the runtime_get or runtime_put from those
-	 * callbacks, as well as preventing triggering any access_ongoing
-	 * asserts.
-	 */
-	if (xe_pm_read_callback_task(xe) == current)
-		return;
-
-	xe_pm_runtime_get(xe);
-	ref = atomic_inc_return(&xe->mem_access.ref);
-
-	xe_assert(xe, ref != S32_MAX);
-
-}
-
-void xe_device_mem_access_put(struct xe_device *xe)
-{
-	int ref;
-
-	if (xe_pm_read_callback_task(xe) == current)
-		return;
-
-	ref = atomic_dec_return(&xe->mem_access.ref);
-	xe_pm_runtime_put(xe);
-
-	xe_assert(xe, ref >= 0);
-}
-
 void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
 {
 	struct xe_gt *gt;
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 050d4b6cdc65..2761ee741e7b 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -137,10 +137,6 @@ static inline struct xe_force_wake *gt_to_fw(struct xe_gt *gt)
 	return &gt->mmio.fw;
 }
 
-void xe_device_mem_access_get(struct xe_device *xe);
-bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe);
-void xe_device_mem_access_put(struct xe_device *xe);
-
 void xe_device_assert_mem_access(struct xe_device *xe);
 
 static inline bool xe_device_in_fault_mode(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 6914cab9191d..78817778234b 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -385,9 +385,6 @@ struct xe_device {
 	 * triggering additional actions when they occur.
 	 */
 	struct {
-		/** @mem_access.ref: ref count of memory accesses */
-		atomic_t ref;
-
 		/**
 		 * @mem_access.vram_userfault: Encapsulate vram_userfault
 		 * related stuff
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 33/34] drm/xe: Remove unused runtime pm helper
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (31 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 32/34] drm/xe: Kill xe_device_mem_access_{get*,put} Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-26 20:30 ` [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization Rodrigo Vivi
                   ` (3 subsequent siblings)
  36 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Now that the 'ongoing' variants of mem_access are gone, this helper
is not needed anymore.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 12 ------------
 drivers/gpu/drm/xe/xe_pm.h |  1 -
 2 files changed, 13 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 8bec1f175c9d..e5b0455d2922 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -407,18 +407,6 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
 	return pm_runtime_get_sync(xe->drm.dev);
 }
 
-/**
- * xe_pm_runtime_get_if_active - Get a runtime_pm reference if device active
- * @xe: xe device instance
- *
- * Returns: Any number grater than or equal to 0 for success, negative error
- * code otherwise.
- */
-int xe_pm_runtime_get_if_active(struct xe_device *xe)
-{
-	return pm_runtime_get_if_active(xe->drm.dev, true);
-}
-
 /**
  * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference and resume if needed
  * @xe: xe device instance
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 7d8cf87b95d2..0ae4f6a60c6d 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -29,7 +29,6 @@ int xe_pm_runtime_resume(struct xe_device *xe);
 void xe_pm_runtime_get(struct xe_device *xe);
 int xe_pm_runtime_get_sync(struct xe_device *xe);
 void xe_pm_runtime_put(struct xe_device *xe);
-int xe_pm_runtime_get_if_active(struct xe_device *xe);
 bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
 bool xe_pm_runtime_resume_and_get(struct xe_device *xe);
 void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (32 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 33/34] drm/xe: Remove unused runtime pm helper Rodrigo Vivi
@ 2024-01-26 20:30 ` Rodrigo Vivi
  2024-01-29 12:12   ` Matthew Auld
  2024-01-26 20:39 ` ✓ CI.Patch_applied: success for Kill mem_access v2 Patchwork
                   ` (2 subsequent siblings)
  36 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-01-26 20:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

Now that we have eliminated all the mem_access get/put calls, with
their locking issues, from the inner calls of migration, we can
allow D3Cold.

Enable it when VRAM utilization is lower than 300 MB. On higher
utilization we only allow D3hot, so that we don't increase the
runtime resume latency too much due to the memory restoration.
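
A sketch of how the threshold is expected to gate the D3Cold
decision, assuming a policy hook along these lines (the helper
computing VRAM usage is illustrative; only d3cold.allowed and
d3cold.vram_threshold come from the series):

	void xe_pm_d3cold_allowed_toggle(struct xe_device *xe)
	{
		u64 vram_used_mb = xe_vram_used_bytes(xe) >> 20;	/* illustrative */

		/* Below the threshold the restore cost on resume is
		 * acceptable, so the deepest state is allowed. */
		xe->d3cold.allowed = vram_used_mb < xe->d3cold.vram_threshold;
	}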

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 0ae4f6a60c6d..bc6bd2a01189 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -8,12 +8,7 @@
 
 #include <linux/pm_runtime.h>
 
-/*
- * TODO: Threshold = 0 will block D3Cold.
- *       Before we can move this to a higher value (like 300), we need to:
- *           1. rewrite the VRAM save / restore to avoid buffer object locks
- */
-#define DEFAULT_VRAM_THRESHOLD 0 /* in MB */
+#define DEFAULT_VRAM_THRESHOLD 300 /* in MB */
 
 struct xe_device;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* ✓ CI.Patch_applied: success for Kill mem_access v2
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (33 preceding siblings ...)
  2024-01-26 20:30 ` [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization Rodrigo Vivi
@ 2024-01-26 20:39 ` Patchwork
  2024-01-26 20:40 ` ✗ CI.checkpatch: warning " Patchwork
  2024-01-26 20:40 ` ✗ CI.KUnit: failure " Patchwork
  36 siblings, 0 replies; 77+ messages in thread
From: Patchwork @ 2024-01-26 20:39 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-xe

== Series Details ==

Series: Kill mem_access v2
URL   : https://patchwork.freedesktop.org/series/129217/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: eeedb1b43 drm-tip: 2024y-01m-26d-20h-14m-43s UTC integration manifest
=== git am output follows ===
Applying: Revert "drm/xe/uc: Store firmware binary in system-memory backed BO"
Applying: drm/xe: Document Xe PM component
Applying: drm/xe: Fix display runtime_pm handling
Applying: drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
Applying: drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion
Applying: drm/xe: Prepare display for D3Cold
Applying: drm/xe: Convert mem_access assertion towards the runtime_pm state
Applying: drm/xe: Runtime PM wake on every IOCTL
Applying: drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
Applying: drm/xe: Convert scheduler towards direct pm_runtime
Applying: drm/xe: Runtime PM wake on every sysfs call
Applying: drm/xe: Ensure device is awake before removing it
Applying: drm/xe: Remove mem_access from guc_pc calls
Applying: drm/xe: Runtime PM wake on every debugfs call
Applying: drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
Applying: drm/xe: Removing extra mem_access protection from runtime pm
Applying: drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
Applying: drm/xe: Move lockdep protection from mem_access to xe_pm_runtime
Applying: drm/xe: Remove pm_runtime lockdep
Applying: drm/xe: Stop checking for power_lost on D3Cold
Applying: drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
Applying: drm/xe: Keep D0 for the entire duration of a LR VM
Applying: drm/xe: Ensure D0 on TLB invalidation
Applying: drm/xe: Remove useless mem_access protection for query ioctls
Applying: drm/xe: Convert gsc_work from mem_access to xe_pm_runtime
Applying: drm/xe: VMs don't need the mem_access protection anymore
Applying: drm/xe: Remove useless mem_access during probe
Applying: drm/xe: Remove mem_access from suspend and resume functions
Applying: drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
Applying: drm/xe: Remove useless mem_access on PAT dumps
Applying: drm/xe: Remove inner mem_access protections
Applying: drm/xe: Kill xe_device_mem_access_{get*,put}
Applying: drm/xe: Remove unused runtime pm helper
Applying: drm/xe: Enable D3Cold on 'low' VRAM utilization



^ permalink raw reply	[flat|nested] 77+ messages in thread

* ✗ CI.checkpatch: warning for Kill mem_access v2
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (34 preceding siblings ...)
  2024-01-26 20:39 ` ✓ CI.Patch_applied: success for Kill mem_access v2 Patchwork
@ 2024-01-26 20:40 ` Patchwork
  2024-01-26 20:40 ` ✗ CI.KUnit: failure " Patchwork
  36 siblings, 0 replies; 77+ messages in thread
From: Patchwork @ 2024-01-26 20:40 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-xe

== Series Details ==

Series: Kill mem_access v2
URL   : https://patchwork.freedesktop.org/series/129217/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
2d919ac662b2798c053d68d1cc16b758c61b55ca
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 93e0c6fa97e358d4566f0b4d2a49f817e9f46b1d
Author: Rodrigo Vivi <rodrigo.vivi@intel.com>
Date:   Fri Jan 26 15:30:43 2024 -0500

    drm/xe: Enable D3Cold on 'low' VRAM utilization
    
    Now that we eliminated all the mem_access get/put with its
    locking issues from the inner calls of migration, we can
    allow D3Cold.
    
    Enable it when VRAM utilization is lower then 300Mb. On
    higher utilization we only allow D3hot so we don't increase
    so much the latency on runtime resume due to the memory
    restoration.
    
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
+ /mt/dim checkpatch eeedb1b4360b100e4fa15a9c5c968a8f5a8de7ed drm-intel
7d02b670f Revert "drm/xe/uc: Store firmware binary in system-memory backed BO"
-:10: WARNING:UNKNOWN_COMMIT_ID: Unknown commit id '48cd5ebcb645ac1bd48480cf11dced902108d306', maybe rebased or not pulled?
#10: 
This reverts commit 48cd5ebcb645ac1bd48480cf11dced902108d306.

total: 0 errors, 1 warnings, 0 checks, 16 lines checked
92dc66bd3 drm/xe: Document Xe PM component
44c3714b3 drm/xe: Fix display runtime_pm handling
dec1d19dd drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
b1e8011df drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion
951bc6233 drm/xe: Prepare display for D3Cold
f4cc91996 drm/xe: Convert mem_access assertion towards the runtime_pm state
2bb486a64 drm/xe: Runtime PM wake on every IOCTL
37accbef8 drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
90153fcc8 drm/xe: Convert scheduler towards direct pm_runtime
38f847a8c drm/xe: Runtime PM wake on every sysfs call
2474ee7ee drm/xe: Ensure device is awake before removing it
506ba97bc drm/xe: Remove mem_access from guc_pc calls
51f5f8cb0 drm/xe: Runtime PM wake on every debugfs call
708cfafb7 drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
ec9f9856d drm/xe: Removing extra mem_access protection from runtime pm
eb17f8d5d drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
921c9f272 drm/xe: Move lockdep protection from mem_access to xe_pm_runtime
-:79: CHECK:LINE_SPACING: Please don't use multiple blank lines
#79: FILE: drivers/gpu/drm/xe/xe_pm.c:69:
 
+

total: 0 errors, 0 warnings, 1 checks, 151 lines checked
4898153f8 drm/xe: Remove pm_runtime lockdep
d4058f95b drm/xe: Stop checking for power_lost on D3Cold
b5e40341e drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
-:74: CHECK:LINE_SPACING: Please don't use multiple blank lines
#74: FILE: drivers/gpu/drm/xe/xe_guc_ct.c:328:
 
+

total: 0 errors, 0 warnings, 1 checks, 151 lines checked
58692c1ec drm/xe: Keep D0 for the entire duration of a LR VM
96049aa8b drm/xe: Ensure D0 on TLB invalidation
c5de9716c drm/xe: Remove useless mem_access protection for query ioctls
bfd882e54 drm/xe: Convert gsc_work from mem_access to xe_pm_runtime
53a9c0067 drm/xe: VMs don't need the mem_access protection anymore
5fdef087c drm/xe: Remove useless mem_access during probe
924d62305 drm/xe: Remove mem_access from suspend and resume functions
b0d058d67 drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
aa9950c99 drm/xe: Remove useless mem_access on PAT dumps
74a70e377 drm/xe: Remove inner mem_access protections
1f5011daf drm/xe: Kill xe_device_mem_access_{get*,put}
e5e705461 drm/xe: Remove unused runtime pm helper
93e0c6fa9 drm/xe: Enable D3Cold on 'low' VRAM utilization



^ permalink raw reply	[flat|nested] 77+ messages in thread

* ✗ CI.KUnit: failure for Kill mem_access v2
  2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
                   ` (35 preceding siblings ...)
  2024-01-26 20:40 ` ✗ CI.checkpatch: warning " Patchwork
@ 2024-01-26 20:40 ` Patchwork
  36 siblings, 0 replies; 77+ messages in thread
From: Patchwork @ 2024-01-26 20:40 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-xe

== Series Details ==

Series: Kill mem_access v2
URL   : https://patchwork.freedesktop.org/series/129217/
State : failure

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:../arch/x86/um/user-offsets.c:17:6: warning: no previous prototype for ‘foo’ [-Wmissing-prototypes]
   17 | void foo(void)
      |      ^~~
In file included from ../arch/um/kernel/asm-offsets.c:1:
../arch/x86/um/shared/sysdep/kernel-offsets.h:9:6: warning: no previous prototype for ‘foo’ [-Wmissing-prototypes]
    9 | void foo(void)
      |      ^~~
../arch/x86/um/bugs_64.c:9:6: warning: no previous prototype for ‘arch_check_bugs’ [-Wmissing-prototypes]
    9 | void arch_check_bugs(void)
      |      ^~~~~~~~~~~~~~~
../arch/x86/um/bugs_64.c:13:6: warning: no previous prototype for ‘arch_examine_signal’ [-Wmissing-prototypes]
   13 | void arch_examine_signal(int sig, struct uml_pt_regs *regs)
      |      ^~~~~~~~~~~~~~~~~~~
../arch/x86/um/fault.c:18:5: warning: no previous prototype for ‘arch_fixup’ [-Wmissing-prototypes]
   18 | int arch_fixup(unsigned long address, struct uml_pt_regs *regs)
      |     ^~~~~~~~~~
../arch/x86/um/os-Linux/registers.c:146:15: warning: no previous prototype for ‘get_thread_reg’ [-Wmissing-prototypes]
  146 | unsigned long get_thread_reg(int reg, jmp_buf *buf)
      |               ^~~~~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:16:5: warning: no previous prototype for ‘__vdso_clock_gettime’ [-Wmissing-prototypes]
   16 | int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts)
      |     ^~~~~~~~~~~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:30:5: warning: no previous prototype for ‘__vdso_gettimeofday’ [-Wmissing-prototypes]
   30 | int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
      |     ^~~~~~~~~~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:44:21: warning: no previous prototype for ‘__vdso_time’ [-Wmissing-prototypes]
   44 | __kernel_old_time_t __vdso_time(__kernel_old_time_t *t)
      |                     ^~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:57:1: warning: no previous prototype for ‘__vdso_getcpu’ [-Wmissing-prototypes]
   57 | __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused)
      | ^~~~~~~~~~~~~
../arch/um/os-Linux/skas/process.c:107:6: warning: no previous prototype for ‘wait_stub_done’ [-Wmissing-prototypes]
  107 | void wait_stub_done(int pid)
      |      ^~~~~~~~~~~~~~
../arch/um/os-Linux/skas/process.c:683:6: warning: no previous prototype for ‘__switch_mm’ [-Wmissing-prototypes]
  683 | void __switch_mm(struct mm_id *mm_idp)
      |      ^~~~~~~~~~~
../arch/x86/um/os-Linux/mcontext.c:7:6: warning: no previous prototype for ‘get_regs_from_mc’ [-Wmissing-prototypes]
    7 | void get_regs_from_mc(struct uml_pt_regs *regs, mcontext_t *mc)
      |      ^~~~~~~~~~~~~~~~
../arch/um/kernel/skas/process.c:36:12: warning: no previous prototype for ‘start_uml’ [-Wmissing-prototypes]
   36 | int __init start_uml(void)
      |            ^~~~~~~~~
../arch/um/kernel/skas/mmu.c:17:5: warning: no previous prototype for ‘init_new_context’ [-Wmissing-prototypes]
   17 | int init_new_context(struct task_struct *task, struct mm_struct *mm)
      |     ^~~~~~~~~~~~~~~~
../arch/um/kernel/skas/mmu.c:60:6: warning: no previous prototype for ‘destroy_context’ [-Wmissing-prototypes]
   60 | void destroy_context(struct mm_struct *mm)
      |      ^~~~~~~~~~~~~~~
../arch/um/os-Linux/main.c:187:7: warning: no previous prototype for ‘__wrap_malloc’ [-Wmissing-prototypes]
  187 | void *__wrap_malloc(int size)
      |       ^~~~~~~~~~~~~
../arch/um/os-Linux/main.c:208:7: warning: no previous prototype for ‘__wrap_calloc’ [-Wmissing-prototypes]
  208 | void *__wrap_calloc(int n, int size)
      |       ^~~~~~~~~~~~~
../arch/um/os-Linux/main.c:222:6: warning: no previous prototype for ‘__wrap_free’ [-Wmissing-prototypes]
  222 | void __wrap_free(void *ptr)
      |      ^~~~~~~~~~~
../arch/x86/um/ptrace_64.c:111:5: warning: no previous prototype for ‘poke_user’ [-Wmissing-prototypes]
  111 | int poke_user(struct task_struct *child, long addr, long data)
      |     ^~~~~~~~~
../arch/x86/um/ptrace_64.c:171:5: warning: no previous prototype for ‘peek_user’ [-Wmissing-prototypes]
  171 | int peek_user(struct task_struct *child, long addr, long data)
      |     ^~~~~~~~~
../arch/um/os-Linux/mem.c:28:6: warning: no previous prototype for ‘kasan_map_memory’ [-Wmissing-prototypes]
   28 | void kasan_map_memory(void *start, size_t len)
      |      ^~~~~~~~~~~~~~~~
../arch/um/os-Linux/mem.c:212:13: warning: no previous prototype for ‘check_tmpexec’ [-Wmissing-prototypes]
  212 | void __init check_tmpexec(void)
      |             ^~~~~~~~~~~~~
../arch/um/os-Linux/signal.c:75:6: warning: no previous prototype for ‘sig_handler’ [-Wmissing-prototypes]
   75 | void sig_handler(int sig, struct siginfo *si, mcontext_t *mc)
      |      ^~~~~~~~~~~
../arch/um/os-Linux/signal.c:111:6: warning: no previous prototype for ‘timer_alarm_handler’ [-Wmissing-prototypes]
  111 | void timer_alarm_handler(int sig, struct siginfo *unused_si, mcontext_t *mc)
      |      ^~~~~~~~~~~~~~~~~~~
../arch/um/os-Linux/start_up.c:301:12: warning: no previous prototype for ‘parse_iomem’ [-Wmissing-prototypes]
  301 | int __init parse_iomem(char *str, int *add)
      |            ^~~~~~~~~~~
../arch/x86/um/signal.c:560:6: warning: no previous prototype for ‘sys_rt_sigreturn’ [-Wmissing-prototypes]
  560 | long sys_rt_sigreturn(void)
      |      ^~~~~~~~~~~~~~~~
../arch/um/kernel/mem.c:202:8: warning: no previous prototype for ‘pgd_alloc’ [-Wmissing-prototypes]
  202 | pgd_t *pgd_alloc(struct mm_struct *mm)
      |        ^~~~~~~~~
../arch/um/kernel/mem.c:215:7: warning: no previous prototype for ‘uml_kmalloc’ [-Wmissing-prototypes]
  215 | void *uml_kmalloc(int size, int flags)
      |       ^~~~~~~~~~~
../arch/um/kernel/process.c:51:5: warning: no previous prototype for ‘pid_to_processor_id’ [-Wmissing-prototypes]
   51 | int pid_to_processor_id(int pid)
      |     ^~~~~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:87:7: warning: no previous prototype for ‘__switch_to’ [-Wmissing-prototypes]
   87 | void *__switch_to(struct task_struct *from, struct task_struct *to)
      |       ^~~~~~~~~~~
../arch/um/kernel/process.c:140:6: warning: no previous prototype for ‘fork_handler’ [-Wmissing-prototypes]
  140 | void fork_handler(void)
      |      ^~~~~~~~~~~~
../arch/um/kernel/process.c:217:6: warning: no previous prototype for ‘arch_cpu_idle’ [-Wmissing-prototypes]
  217 | void arch_cpu_idle(void)
      |      ^~~~~~~~~~~~~
../arch/um/kernel/process.c:253:5: warning: no previous prototype for ‘copy_to_user_proc’ [-Wmissing-prototypes]
  253 | int copy_to_user_proc(void __user *to, void *from, int size)
      |     ^~~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:263:5: warning: no previous prototype for ‘clear_user_proc’ [-Wmissing-prototypes]
  263 | int clear_user_proc(void __user *buf, int size)
      |     ^~~~~~~~~~~~~~~
../arch/um/kernel/process.c:271:6: warning: no previous prototype for ‘set_using_sysemu’ [-Wmissing-prototypes]
  271 | void set_using_sysemu(int value)
      |      ^~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:278:5: warning: no previous prototype for ‘get_using_sysemu’ [-Wmissing-prototypes]
  278 | int get_using_sysemu(void)
      |     ^~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:316:12: warning: no previous prototype for ‘make_proc_sysemu’ [-Wmissing-prototypes]
  316 | int __init make_proc_sysemu(void)
      |            ^~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:348:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes]
  348 | unsigned long arch_align_stack(unsigned long sp)
      |               ^~~~~~~~~~~~~~~~
../arch/um/kernel/reboot.c:45:6: warning: no previous prototype for ‘machine_restart’ [-Wmissing-prototypes]
   45 | void machine_restart(char * __unused)
      |      ^~~~~~~~~~~~~~~
../arch/um/kernel/reboot.c:51:6: warning: no previous prototype for ‘machine_power_off’ [-Wmissing-prototypes]
   51 | void machine_power_off(void)
      |      ^~~~~~~~~~~~~~~~~
../arch/um/kernel/reboot.c:57:6: warning: no previous prototype for ‘machine_halt’ [-Wmissing-prototypes]
   57 | void machine_halt(void)
      |      ^~~~~~~~~~~~
../drivers/gpu/drm/xe/xe_bo.c:42:3: error: ‘struct ttm_placement’ has no member named ‘num_busy_placement’; did you mean ‘num_placement’?
   42 |  .num_busy_placement = 1,
      |   ^~~~~~~~~~~~~~~~~~
      |   num_placement
../drivers/gpu/drm/xe/xe_bo.c:42:24: warning: excess elements in struct initializer
   42 |  .num_busy_placement = 1,
      |                        ^
../drivers/gpu/drm/xe/xe_bo.c:42:24: note: (near initialization for ‘sys_placement’)
../drivers/gpu/drm/xe/xe_bo.c:43:3: error: ‘struct ttm_placement’ has no member named ‘busy_placement’; did you mean ‘num_placement’?
   43 |  .busy_placement = &sys_placement_flags,
      |   ^~~~~~~~~~~~~~
      |   num_placement
../drivers/gpu/drm/xe/xe_bo.c:43:20: warning: excess elements in struct initializer
   43 |  .busy_placement = &sys_placement_flags,
      |                    ^
../drivers/gpu/drm/xe/xe_bo.c:43:20: note: (near initialization for ‘sys_placement’)
../drivers/gpu/drm/xe/xe_bo.c:56:3: error: ‘struct ttm_placement’ has no member named ‘num_busy_placement’; did you mean ‘num_placement’?
   56 |  .num_busy_placement = 1,
      |   ^~~~~~~~~~~~~~~~~~
      |   num_placement
../drivers/gpu/drm/xe/xe_bo.c:56:24: warning: excess elements in struct initializer
   56 |  .num_busy_placement = 1,
      |                        ^
../drivers/gpu/drm/xe/xe_bo.c:56:24: note: (near initialization for ‘tt_placement’)
../drivers/gpu/drm/xe/xe_bo.c:57:3: error: ‘struct ttm_placement’ has no member named ‘busy_placement’; did you mean ‘num_placement’?
   57 |  .busy_placement = &sys_placement_flags,
      |   ^~~~~~~~~~~~~~
      |   num_placement
../drivers/gpu/drm/xe/xe_bo.c:57:20: warning: excess elements in struct initializer
   57 |  .busy_placement = &sys_placement_flags,
      |                    ^
../drivers/gpu/drm/xe/xe_bo.c:57:20: note: (near initialization for ‘tt_placement’)
../drivers/gpu/drm/xe/xe_bo.c: In function ‘__xe_bo_placement_for_flags’:
../drivers/gpu/drm/xe/xe_bo.c:234:4: error: ‘struct ttm_placement’ has no member named ‘num_busy_placement’; did you mean ‘num_placement’?
  234 |   .num_busy_placement = c,
      |    ^~~~~~~~~~~~~~~~~~
      |    num_placement
../drivers/gpu/drm/xe/xe_bo.c:234:25: warning: excess elements in struct initializer
  234 |   .num_busy_placement = c,
      |                         ^
../drivers/gpu/drm/xe/xe_bo.c:234:25: note: (near initialization for ‘(anonymous)’)
../drivers/gpu/drm/xe/xe_bo.c:235:4: error: ‘struct ttm_placement’ has no member named ‘busy_placement’; did you mean ‘num_placement’?
  235 |   .busy_placement = bo->placements,
      |    ^~~~~~~~~~~~~~
      |    num_placement
../drivers/gpu/drm/xe/xe_bo.c:235:21: warning: excess elements in struct initializer
  235 |   .busy_placement = bo->placements,
      |                     ^~
../drivers/gpu/drm/xe/xe_bo.c:235:21: note: (near initialization for ‘(anonymous)’)
../drivers/gpu/drm/xe/xe_bo.c: In function ‘xe_evict_flags’:
../drivers/gpu/drm/xe/xe_bo.c:255:15: error: ‘struct ttm_placement’ has no member named ‘num_busy_placement’; did you mean ‘num_placement’?
  255 |    placement->num_busy_placement = 0;
      |               ^~~~~~~~~~~~~~~~~~
      |               num_placement
../drivers/gpu/drm/xe/xe_bo.c: In function ‘__xe_bo_fixed_placement’:
../drivers/gpu/drm/xe/xe_bo.c:1390:4: error: ‘struct ttm_placement’ has no member named ‘num_busy_placement’; did you mean ‘num_placement’?
 1390 |   .num_busy_placement = 1,
      |    ^~~~~~~~~~~~~~~~~~
      |    num_placement
../drivers/gpu/drm/xe/xe_bo.c:1390:25: warning: excess elements in struct initializer
 1390 |   .num_busy_placement = 1,
      |                         ^
../drivers/gpu/drm/xe/xe_bo.c:1390:25: note: (near initialization for ‘(anonymous)’)
../drivers/gpu/drm/xe/xe_bo.c:1391:4: error: ‘struct ttm_placement’ has no member named ‘busy_placement’; did you mean ‘num_placement’?
 1391 |   .busy_placement = place,
      |    ^~~~~~~~~~~~~~
      |    num_placement
../drivers/gpu/drm/xe/xe_bo.c:1391:21: warning: excess elements in struct initializer
 1391 |   .busy_placement = place,
      |                     ^~~~~
../drivers/gpu/drm/xe/xe_bo.c:1391:21: note: (near initialization for ‘(anonymous)’)
../drivers/gpu/drm/xe/xe_bo.c: In function ‘xe_bo_migrate’:
../drivers/gpu/drm/xe/xe_bo.c:2149:12: error: ‘struct ttm_placement’ has no member named ‘num_busy_placement’; did you mean ‘num_placement’?
 2149 |  placement.num_busy_placement = 1;
      |            ^~~~~~~~~~~~~~~~~~
      |            num_placement
../drivers/gpu/drm/xe/xe_bo.c:2151:12: error: ‘struct ttm_placement’ has no member named ‘busy_placement’; did you mean ‘num_placement’?
 2151 |  placement.busy_placement = &requested;
      |            ^~~~~~~~~~~~~~
      |            num_placement
make[7]: *** [../scripts/Makefile.build:243: drivers/gpu/drm/xe/xe_bo.o] Error 1
make[7]: *** Waiting for unfinished jobs....
../arch/x86/um/syscalls_64.c:48:6: warning: no previous prototype for ‘arch_switch_to’ [-Wmissing-prototypes]
   48 | void arch_switch_to(struct task_struct *to)
      |      ^~~~~~~~~~~~~~
../arch/um/kernel/tlb.c:579:6: warning: no previous prototype for ‘flush_tlb_mm_range’ [-Wmissing-prototypes]
  579 | void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
      |      ^~~~~~~~~~~~~~~~~~
../arch/um/kernel/tlb.c:594:6: warning: no previous prototype for ‘force_flush_all’ [-Wmissing-prototypes]
  594 | void force_flush_all(void)
      |      ^~~~~~~~~~~~~~~
../arch/um/kernel/kmsg_dump.c:60:12: warning: no previous prototype for ‘kmsg_dumper_stdout_init’ [-Wmissing-prototypes]
   60 | int __init kmsg_dumper_stdout_init(void)
      |            ^~~~~~~~~~~~~~~~~~~~~~~
../arch/um/kernel/um_arch.c:408:19: warning: no previous prototype for ‘read_initrd’ [-Wmissing-prototypes]
  408 | int __init __weak read_initrd(void)
      |                   ^~~~~~~~~~~
../arch/um/kernel/um_arch.c:461:7: warning: no previous prototype for ‘text_poke’ [-Wmissing-prototypes]
  461 | void *text_poke(void *addr, const void *opcode, size_t len)
      |       ^~~~~~~~~
../arch/um/kernel/um_arch.c:473:6: warning: no previous prototype for ‘text_poke_sync’ [-Wmissing-prototypes]
  473 | void text_poke_sync(void)
      |      ^~~~~~~~~~~~~~
make[6]: *** [../scripts/Makefile.build:481: drivers/gpu/drm/xe] Error 2
make[6]: *** Waiting for unfinished jobs....
make[5]: *** [../scripts/Makefile.build:481: drivers/gpu/drm] Error 2
make[4]: *** [../scripts/Makefile.build:481: drivers/gpu] Error 2
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [../scripts/Makefile.build:481: drivers] Error 2
make[3]: *** Waiting for unfinished jobs....
../lib/iomap.c:156:5: warning: no previous prototype for ‘ioread64_lo_hi’ [-Wmissing-prototypes]
  156 | u64 ioread64_lo_hi(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~
../lib/iomap.c:163:5: warning: no previous prototype for ‘ioread64_hi_lo’ [-Wmissing-prototypes]
  163 | u64 ioread64_hi_lo(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~
../lib/iomap.c:170:5: warning: no previous prototype for ‘ioread64be_lo_hi’ [-Wmissing-prototypes]
  170 | u64 ioread64be_lo_hi(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~~~
../lib/iomap.c:178:5: warning: no previous prototype for ‘ioread64be_hi_lo’ [-Wmissing-prototypes]
  178 | u64 ioread64be_hi_lo(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~~~
../lib/iomap.c:264:6: warning: no previous prototype for ‘iowrite64_lo_hi’ [-Wmissing-prototypes]
  264 | void iowrite64_lo_hi(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~
../lib/iomap.c:272:6: warning: no previous prototype for ‘iowrite64_hi_lo’ [-Wmissing-prototypes]
  272 | void iowrite64_hi_lo(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~
../lib/iomap.c:280:6: warning: no previous prototype for ‘iowrite64be_lo_hi’ [-Wmissing-prototypes]
  280 | void iowrite64be_lo_hi(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~~~
../lib/iomap.c:288:6: warning: no previous prototype for ‘iowrite64be_hi_lo’ [-Wmissing-prototypes]
  288 | void iowrite64be_hi_lo(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~~~
make[2]: *** [/kernel/Makefile:1917: .] Error 2
make[1]: *** [/kernel/Makefile:240: __sub-make] Error 2
make: *** [Makefile:240: __sub-make] Error 2

[20:40:06] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[20:40:10] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 02/34] drm/xe: Document Xe PM component
  2024-01-26 20:30 ` [RFC 02/34] drm/xe: Document Xe PM component Rodrigo Vivi
@ 2024-01-29 10:38   ` Francois Dugast
  0 siblings, 0 replies; 77+ messages in thread
From: Francois Dugast @ 2024-01-29 10:38 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-xe

On Fri, Jan 26, 2024 at 03:30:11PM -0500, Rodrigo Vivi wrote:
> Replace outdated information with a proper PM documentation.
> Already establish the rules for the runtime PM get and put that
> Xe needs to follow.
> 
> Also add missing function documentation to all the "exported" functions.
> 
> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Hi,

Very helpful summary, I think the level of detail is spot on. Minor nits below.

Acked-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_pm.c | 107 +++++++++++++++++++++++++++++++++----
>  1 file changed, 96 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index d5f219796d7e..bd35fe9f6227 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -25,21 +25,45 @@
>  /**
>   * DOC: Xe Power Management
>   *
> - * Xe PM shall be guided by the simplicity.
> - * Use the simplest hook options whenever possible.
> - * Let's not reinvent the runtime_pm references and hooks.
> - * Shall have a clear separation of display and gt underneath this component.
> + * Xe PM implements the main routines for both system level suspend states and
> + * for the opportunistic runtime suspend states.
>   *
> - * What's next:
> + * System Level Suspend (S-States) - In general this is OS initiated suspend
> + * driven by ACPI for achieving S0ix (a.k.a. S2idle, freeze), S3 (suspend to ram),
> + * S4 (disk). The main functions here are `xe_pm_suspend` and `xe_pm_resume` that

The sentence seems a bit long to me, maybe s/that/. They/?

> + * are the main point for the suspend to and resume from these states.
>   *
> - * For now s2idle and s3 are only working in integrated devices. The next step
> - * is to iterate through all VRAM's BO backing them up into the system memory
> - * before allowing the system suspend.
> + * Runtime Suspend (D-States) - This is the opportunistic PCIe device low power
> + * state D3. Xe PM component provides `xe_pm_runtime_suspend` and
> + * `xe_pm_runtime_resume` systems that PCI subsystem will call before transition
> + * to D3. Also, Xe PM provides get and put functions that Xe driver will use to
> + * indicate activity. In order to avoid locking complications with the memory
> + * management, whenever possible, these get and put functions needs to be called
> + * from the higher/outer levels.
>   *
> - * Also runtime_pm needs to be here from the beginning.
> + * The main cases that need to be protected from the outer levels are: IOCLT,

s/IOCLT/IOCTL/

> + * sysfs, debugfs, dma-buf sharing, GPU execution.
>   *
> - * RC6/RPS are also critical PM features. Let's start with GuCRC and GuC SLPC
> - * and no wait boost. Frequency optimizations should come on a next stage.
> + * PCI D3 is special and can mean D3hot, where Vcc power is on for keeping memory
> + * alive and quicker low latency resume or D3Cold where Vcc power is off for
> + * better power savings.
> + * The Vcc control of PCI hierarchy can only be controlled at the PCI root port
> + * level, while the device driver can be behind multiple bridges/switches and
> + * paired with other devices. For this reason, the PCI subsystem cannot perform
> + * the transition towards D3Cold. The lowest runtime PM possible from the PCI
> + * subsystem is D3hot. Then, if all these paired devices in the same root port
> + * are in D3hot, ACPI will assist here and run its _PR3 and _OFF methods to

If the "_PR3 and _OFF" reference is needed, maybe provide a pointer to those?

> + * perform the transition from D3hot to D3cold. Xe may disallow this transition

How?

Francois

> + * based on runtime conditions such as VRAM usage for a quick and low latency
> + * resume for instance.
> + *
> + * Intel systems are capable of taking the system to S0ix when devices are on
> + * D3hot through the runtime PM. This is also called as 'opportunistic-S0iX'.
> + * But in this case, the `xe_pm_suspend` and `xe_pm_resume` won't be called for
> + * S0ix.
> + *
> + * This component is no responsible for GT idleness (RC6) nor GT frequency
> + * management (RPS).
>   */
>  
>  /**
> @@ -169,6 +193,12 @@ void xe_pm_init_early(struct xe_device *xe)
>  	drmm_mutex_init(&xe->drm, &xe->mem_access.vram_userfault.lock);
>  }
>  
> +/**
> + * xe_pm_init - Initialize Xe Power Management
> + * @xe: xe device instance
> + *
> + * This component is responsible for System and Device sleep states.
> + */
>  void xe_pm_init(struct xe_device *xe)
>  {
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> @@ -189,6 +219,10 @@ void xe_pm_init(struct xe_device *xe)
>  	xe_pm_runtime_init(xe);
>  }
>  
> +/**
> + * xe_pm_runtime_fini - Finalize Runtime PM
> + * @xe: xe device instance
> + */
>  void xe_pm_runtime_fini(struct xe_device *xe)
>  {
>  	struct device *dev = xe->drm.dev;
> @@ -218,6 +252,12 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe)
>  	return READ_ONCE(xe->pm_callback_task);
>  }
>  
> +/**
> + * xe_pm_runtime_suspend - Prepare our device for D3hot/D3Cold
> + * @xe: xe device instance
> + *
> + * Returns 0 for success, negative error code otherwise.
> + */
>  int xe_pm_runtime_suspend(struct xe_device *xe)
>  {
>  	struct xe_bo *bo, *on;
> @@ -283,6 +323,12 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>  	return err;
>  }
>  
> +/**
> + * xe_pm_runtime_resume - Waking up from D3hot/D3Cold
> + * @xe: xe device instance
> + *
> + * Returns 0 for success, negative error code otherwise.
> + */
>  int xe_pm_runtime_resume(struct xe_device *xe)
>  {
>  	struct xe_gt *gt;
> @@ -334,22 +380,47 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>  	return err;
>  }
>  
> +/**
> + * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
> + * @xe: xe device instance
> + *
> + * Returns: Any number grater than or equal to 0 for success, negative error
> + * code otherwise.
> + */
>  int xe_pm_runtime_get(struct xe_device *xe)
>  {
>  	return pm_runtime_get_sync(xe->drm.dev);
>  }
>  
> +/**
> + * xe_pm_runtime_put - Put the runtime_pm reference back and mark as idle
> + * @xe: xe device instance
> + *
> + * Returns: Any number grater than or equal to 0 for success, negative error
> + * code otherwise.
> + */
>  int xe_pm_runtime_put(struct xe_device *xe)
>  {
>  	pm_runtime_mark_last_busy(xe->drm.dev);
>  	return pm_runtime_put(xe->drm.dev);
>  }
>  
> +/**
> + * xe_pm_runtime_get_if_active - Get a runtime_pm reference if device active
> + * @xe: xe device instance
> + *
> + * Returns: Any number grater than or equal to 0 for success, negative error
> + * code otherwise.
> + */
>  int xe_pm_runtime_get_if_active(struct xe_device *xe)
>  {
>  	return pm_runtime_get_if_active(xe->drm.dev, true);
>  }
>  
> +/**
> + * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
> + * @xe: xe device instance
> + */
>  void xe_pm_assert_unbounded_bridge(struct xe_device *xe)
>  {
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> @@ -364,6 +435,13 @@ void xe_pm_assert_unbounded_bridge(struct xe_device *xe)
>  	}
>  }
>  
> +/**
> + * xe_pm_set_vram_threshold - Set a vram threshold for allowing/blocking D3Cold
> + * @xe: xe device instance
> + * @threshold: VRAM size in bites for the D3cold threshold
> + *
> + * Returns 0 for success, negative error code otherwise.
> + */
>  int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold)
>  {
>  	struct ttm_resource_manager *man;
> @@ -388,6 +466,13 @@ int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold)
>  	return 0;
>  }
>  
> +/**
> + * xe_pm_d3cold_allowed_toggle - Check conditions to toggle d3cold.allowed
> + * @xe: xe device instance
> + *
> + * To be called during runtime_pm idle callback.
> + * Check for all the D3Cold conditions ahead of runtime suspend.
> + */
>  void xe_pm_d3cold_allowed_toggle(struct xe_device *xe)
>  {
>  	struct ttm_resource_manager *man;
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization
  2024-01-26 20:30 ` [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization Rodrigo Vivi
@ 2024-01-29 12:12   ` Matthew Auld
  2024-01-29 19:01     ` Vivi, Rodrigo
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-01-29 12:12 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Now that we eliminated all the mem_access get/put with its
> locking issues from the inner calls of migration, we can
> allow D3Cold.
> 
> Enable it when VRAM utilization is lower then 300Mb. On
> higher utilization we only allow D3hot so we don't increase
> so much the latency on runtime resume due to the memory
> restoration.

Note that nothing in CI is d3cold capable it seems, so they will never 
trigger this path AFAIK. All platforms return:

[drm:xe_pci_probe [xe]] d3cold: capable=no

Is that a bug, or perhaps there is something else needed? Main question 
is how to get CI test coverage for this here.

> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_pm.h | 7 +------
>   1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> index 0ae4f6a60c6d..bc6bd2a01189 100644
> --- a/drivers/gpu/drm/xe/xe_pm.h
> +++ b/drivers/gpu/drm/xe/xe_pm.h
> @@ -8,12 +8,7 @@
>   
>   #include <linux/pm_runtime.h>
>   
> -/*
> - * TODO: Threshold = 0 will block D3Cold.
> - *       Before we can move this to a higher value (like 300), we need to:
> - *           1. rewrite the VRAM save / restore to avoid buffer object locks
> - */
> -#define DEFAULT_VRAM_THRESHOLD 0 /* in MB */
> +#define DEFAULT_VRAM_THRESHOLD 300 /* in MB */
>   
>   struct xe_device;
>   

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization
  2024-01-29 12:12   ` Matthew Auld
@ 2024-01-29 19:01     ` Vivi, Rodrigo
  2024-01-30 15:01       ` Gupta, Anshuman
  0 siblings, 1 reply; 77+ messages in thread
From: Vivi, Rodrigo @ 2024-01-29 19:01 UTC (permalink / raw)
  To: intel-xe@lists.freedesktop.org, Knop,  Ryszard, Gupta, Anshuman,
	Auld, Matthew, Musial, Ewelina

On Mon, 2024-01-29 at 12:12 +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > Now that we eliminated all the mem_access get/put with its
> > locking issues from the inner calls of migration, we can
> > allow D3Cold.
> > 
> > Enable it when VRAM utilization is lower then 300Mb. On
> > higher utilization we only allow D3hot so we don't increase
> > so much the latency on runtime resume due to the memory
> > restoration.
> 
> Note that nothing in CI is d3cold capable it seems, so they will
> never 
> trigger this path AFAIK. All platforms return:
> 
> [drm:xe_pci_probe [xe]] d3cold: capable=no

Bummer... :/

> 
> Is that a bug, or perhaps there is something else needed?

No, it is not a bug, it is a limitation in the DG2 parts that we have
in our CI. Anshuman, do you know exactly why these CI parts are like
this? And what should we put as a requirement for our CI to get parts
that do support D3Cold?

>  Main question 
> is how to get CI test coverage for this here.

Well, that's tricky anyway. There are many other factors that can
block D3cold underneath. To validate here I sometimes need to force
runtime PM to 'auto' for all the devices in the same root port. We
have a tool in IGT to do that... it probably needs some adaptation
for xe.

But still, there's no reliable way to ensure that the entire CI run
always enters D3Cold. Unless we can adapt that tool and ask the CI
folks to run it as a pre-script before every run on DG2?

> 
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_pm.h | 7 +------
> >   1 file changed, 1 insertion(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pm.h
> > b/drivers/gpu/drm/xe/xe_pm.h
> > index 0ae4f6a60c6d..bc6bd2a01189 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.h
> > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > @@ -8,12 +8,7 @@
> >   
> >   #include <linux/pm_runtime.h>
> >   
> > -/*
> > - * TODO: Threshold = 0 will block D3Cold.
> > - *       Before we can move this to a higher value (like 300), we
> > need to:
> > - *           1. rewrite the VRAM save / restore to avoid buffer
> > object locks
> > - */
> > -#define DEFAULT_VRAM_THRESHOLD 0 /* in MB */
> > +#define DEFAULT_VRAM_THRESHOLD 300 /* in MB */
> >   
> >   struct xe_device;
> >   


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization
  2024-01-29 19:01     ` Vivi, Rodrigo
@ 2024-01-30 15:01       ` Gupta, Anshuman
  0 siblings, 0 replies; 77+ messages in thread
From: Gupta, Anshuman @ 2024-01-30 15:01 UTC (permalink / raw)
  To: Vivi, Rodrigo, intel-xe@lists.freedesktop.org, Knop, Ryszard,
	Auld, Matthew, Musial, Ewelina



> -----Original Message-----
> From: Vivi, Rodrigo <rodrigo.vivi@intel.com>
> Sent: Tuesday, January 30, 2024 12:31 AM
> To: intel-xe@lists.freedesktop.org; Knop, Ryszard <ryszard.knop@intel.com>;
> Gupta, Anshuman <anshuman.gupta@intel.com>; Auld, Matthew
> <matthew.auld@intel.com>; Musial, Ewelina <ewelina.musial@intel.com>
> Subject: Re: [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization
> 
> On Mon, 2024-01-29 at 12:12 +0000, Matthew Auld wrote:
> > On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > > Now that we eliminated all the mem_access get/put with its locking
> > > issues from the inner calls of migration, we can allow D3Cold.
> > >
> > > Enable it when VRAM utilization is lower then 300Mb. On higher
> > > utilization we only allow D3hot so we don't increase so much the
> > > latency on runtime resume due to the memory restoration.
> >
> > Note that nothing in CI is d3cold capable it seems, so they will never
> > trigger this path AFAIK. All platforms return:
> >
> > [drm:xe_pci_probe [xe]] d3cold: capable=no
> 
> Bummer... :/
> 
> >
> > Is that a bug, or perhaps there is something else needed? 
> 
> No, it is not a bug, it is a limitation in the DG2 parts that we have in our CI.
> Anshuman, do you know exactly why these CI parts are like this? And what
> should we put as a requirement for our CI to get parts that do support
> D3Cold?
There are a couple of prerequisites, from the card and from the host.

PCIe card: PMC - Power Management Capabilities (offset = 2),
bit 15 (PME_Support = 1XXXXb) - PME# can be asserted from D3cold.

Host:
ACPI _PR3 resources, which depend on the BIOS.
I have noticed that if ACPI D3cold is disabled from the BIOS, we don't
see the _PR3 resources. So we need to review the host BIOS settings in CI.

Probably we should add a drm_dbg log for both of these requirements?
        if (!pci_pme_capable(root_pdev, PCI_D3cold) || !pci_pr3_present(root_pdev))
                return false;
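
For illustration, a minimal sketch of what that logging could look like
(the helper name and message strings here are hypothetical; only
pci_pme_capable() and pci_pr3_present() come from the snippet above):

        static bool d3cold_capable(struct xe_device *xe, struct pci_dev *root_pdev)
        {
                /* Card side: PME# must be assertable from D3cold */
                if (!pci_pme_capable(root_pdev, PCI_D3cold)) {
                        drm_dbg(&xe->drm, "d3cold: PME# not assertable from D3cold\n");
                        return false;
                }

                /* Host side: BIOS must expose ACPI _PR3 on the root port */
                if (!pci_pr3_present(root_pdev)) {
                        drm_dbg(&xe->drm, "d3cold: no ACPI _PR3 resources\n");
                        return false;
                }

                return true;
        }
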
Thanks,
Anshuman Gupta.
> 
> >  Main question
> > is how to get CI test coverage for this here.
> 
> Well, that's tricky anyway. There are many other factors that can block
> D3cold underneath. To validate here I sometimes need to force runtime PM
> to 'auto' for all the devices in the same root port. We have a tool in IGT
> to do that... it probably needs some adaptation for xe.
> 
> But still, there's no reliable way to ensure that the entire CI run always
> enters D3Cold. Unless we can adapt that tool and ask the CI folks to run it
> as a pre-script before every run on DG2?
> 
> >
> > >
> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_pm.h | 7 +------
> > >   1 file changed, 1 insertion(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > > index 0ae4f6a60c6d..bc6bd2a01189 100644
> > > --- a/drivers/gpu/drm/xe/xe_pm.h
> > > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > > @@ -8,12 +8,7 @@
> > >
> > >   #include <linux/pm_runtime.h>
> > >
> > > -/*
> > > - * TODO: Threshold = 0 will block D3Cold.
> > > - *       Before we can move this to a higher value (like 300), we
> > > need to:
> > > - *           1. rewrite the VRAM save / restore to avoid buffer
> > > object locks
> > > - */
> > > -#define DEFAULT_VRAM_THRESHOLD 0 /* in MB */
> > > +#define DEFAULT_VRAM_THRESHOLD 300 /* in MB */
> > >
> > >   struct xe_device;
> > >


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 03/34] drm/xe: Fix display runtime_pm handling
  2024-01-26 20:30 ` [RFC 03/34] drm/xe: Fix display runtime_pm handling Rodrigo Vivi
@ 2024-02-05  9:11   ` Matthew Auld
  2024-02-14 18:05     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05  9:11 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> i915's intel_runtime_pm_get_if_in_use actually calls the
> pm_runtime_get_if_active() with ign_usage_count = false, but Xe
> was erroneously calling it with true because of the mem_access cases.

Good catch.

> This can lead to unbalanced references.

Is there an actual imbalance here though? Is it not just a case of being 
overzealous in keeping the device awake when it is not currently 
"in_use" vs "if_active"? If the api increments the usage count we will 
still decrement it later, regardless of active vs in-use, AFAICT.
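
(For reference, a sketch of the two semantics this hinges on, per
linux/pm_runtime.h; dev here stands for xe->drm.dev:)

        /* Takes a ref only if the device is RPM_ACTIVE, even if usage_count == 0 */
        ret = pm_runtime_get_if_active(dev, true);

        /* Takes a ref only if the device is RPM_ACTIVE *and* usage_count > 0 */
        ret = pm_runtime_get_if_in_use(dev);

        /* Either way, a ref is taken only when the call returns 1, so the
         * caller must skip the later put for 0 or negative returns. */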

> 
> Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
> 
> Also, already start this new function protected from the runtime
> recursion, since runtime_pm will need to call for display functions
> for a proper D3Cold flow.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  2 +-
>   drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
>   drivers/gpu/drm/xe/xe_pm.h                      |  1 +
>   3 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> index 420eba0e4be0..ad5864d1dd74 100644
> --- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> +++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> @@ -177,7 +177,7 @@ static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_p
>   {
>   	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
>   
> -	return xe_pm_runtime_get_if_active(xe);
> +	return xe_pm_runtime_get_if_in_use(xe);
>   }
>   
>   static inline void intel_runtime_pm_put_unchecked(struct xe_runtime_pm *pm)
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index bd35fe9f6227..19f88cb7715b 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -417,6 +417,23 @@ int xe_pm_runtime_get_if_active(struct xe_device *xe)
>   	return pm_runtime_get_if_active(xe->drm.dev, true);
>   }
>   
> +/**
> + * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference and resume if needed
> + * @xe: xe device instance
> + *
> + * Returns: True if device is awake and the reference was taken, false otherwise.
> + */
> +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
> +{
> +	if (xe_pm_read_callback_task(xe) == current) {
> +		/* The device is awake, grab the ref and move on */
> +		pm_runtime_get_noresume(xe->drm.dev);
> +		return true;
> +	}
> +
> +	return pm_runtime_get_if_in_use(xe->drm.dev) >= 0;

This is doing atomic_inc_not_zero() underneath for the "in_use" case 
AFAICT. If the usage count is zero it doesn't increment it and returns 
0. Does that not lead to an imbalance? Should this rather be > 0?
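
(A sketch of the distinction being raised; the return values follow
linux/pm_runtime.h:)

        ret = pm_runtime_get_if_in_use(xe->drm.dev);
        /*
         * ret == 1:  device in use, usage count incremented
         * ret == 0:  usage count was zero, nothing incremented
         * ret  < 0:  runtime PM disabled, nothing incremented
         */
        return ret > 0; /* with >= 0, a later put could underflow */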

> +}
> +
>   /**
>    * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
>    * @xe: xe device instance
> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> index 64a97c6726a7..9d372cbf388b 100644
> --- a/drivers/gpu/drm/xe/xe_pm.h
> +++ b/drivers/gpu/drm/xe/xe_pm.h
> @@ -28,6 +28,7 @@ int xe_pm_runtime_resume(struct xe_device *xe);
>   int xe_pm_runtime_get(struct xe_device *xe);
>   int xe_pm_runtime_put(struct xe_device *xe);
>   int xe_pm_runtime_get_if_active(struct xe_device *xe);
> +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
>   void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
>   int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
>   void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL
  2024-01-26 20:30 ` [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL Rodrigo Vivi
@ 2024-02-05  9:39   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05  9:39 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's ensure our PCI device is awaken on every IOCTL entry.
> Let's increase the runtime_pm protection and start moving
> that to the outer bounds.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c | 32 ++++++++++++++++++++++++++++++--
>   drivers/gpu/drm/xe/xe_pm.c     | 15 +++++++++++++++
>   drivers/gpu/drm/xe/xe_pm.h     |  1 +
>   3 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 01db34f06a7d..ab41202ecaf8 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -141,15 +141,43 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
>   			  DRM_RENDER_ALLOW),
>   };
>   
> +static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> +	struct drm_file *file_priv = file->private_data;
> +	struct xe_device *xe = to_xe_device(file_priv->minor->dev);
> +	long ret;
> +
> +	ret = xe_pm_runtime_get_sync(xe);
> +	if (ret >= 0)
> +		ret = drm_ioctl(file, cmd, arg);
> +	xe_pm_runtime_put(xe);
> +
> +	return ret;
> +}
> +
> +static long xe_drm_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> +	struct drm_file *file_priv = file->private_data;
> +	struct xe_device *xe = to_xe_device(file_priv->minor->dev);
> +	long ret;
> +
> +	ret = xe_pm_runtime_get_sync(xe);
> +	if (ret >= 0)
> +		ret = drm_compat_ioctl(file, cmd, arg);
> +	xe_pm_runtime_put(xe);
> +
> +	return ret;
> +}
> +
>   static const struct file_operations xe_driver_fops = {
>   	.owner = THIS_MODULE,
>   	.open = drm_open,
>   	.release = drm_release_noglobal,
> -	.unlocked_ioctl = drm_ioctl,
> +	.unlocked_ioctl = xe_drm_ioctl,
>   	.mmap = drm_gem_mmap,
>   	.poll = drm_poll,
>   	.read = drm_read,
> -	.compat_ioctl = drm_compat_ioctl,
> +	.compat_ioctl = xe_drm_compat_ioctl,
>   	.llseek = noop_llseek,
>   #ifdef CONFIG_PROC_FS
>   	.show_fdinfo = drm_show_fdinfo,
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 3ef14937d5d2..d98f4bb3ad02 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -429,6 +429,21 @@ void xe_pm_runtime_put(struct xe_device *xe)
>   	}
>   }
>   
> +/**
> + * xe_pm_runtime_get_sync - Get a runtime_pm reference and resume synchronously
> + * @xe: xe device instance
> + *
> + * Returns: Any number grater than or equal to 0 for success, negative error

s/grater/greater/

> + * code otherwise.
> + */
> +int xe_pm_runtime_get_sync(struct xe_device *xe)

AFAICT we now have the following API:

xe_pm_runtime_get()
xe_pm_runtime_get_sync()

Which one should users pick? The xe_pm_runtime_get() makes it sound like 
it's async(), if there is also a _sync() version? Is 
xe_pm_runtime_get_sync() just for the ioctl level? Maybe we should add 
_ioctl or something and make it clear?

Otherwise I think LGTM.
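
(A hypothetical sketch of that renaming; the _ioctl name is only a
suggestion here, the body is the one quoted below:)

        int xe_pm_runtime_get_ioctl(struct xe_device *xe)
        {
                if (WARN_ON(xe_pm_read_callback_task(xe) == current))
                        return -ELOOP;

                return pm_runtime_get_sync(xe->drm.dev);
        }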

> +{
> +	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
> +		return -ELOOP;
> +
> +	return pm_runtime_get_sync(xe->drm.dev);
> +}
> +
>   /**
>    * xe_pm_runtime_get_if_active - Get a runtime_pm reference if device active
>    * @xe: xe device instance
> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> index a672adffd0e1..7d8cf87b95d2 100644
> --- a/drivers/gpu/drm/xe/xe_pm.h
> +++ b/drivers/gpu/drm/xe/xe_pm.h
> @@ -27,6 +27,7 @@ bool xe_pm_runtime_suspended(struct xe_device *xe);
>   int xe_pm_runtime_suspend(struct xe_device *xe);
>   int xe_pm_runtime_resume(struct xe_device *xe);
>   void xe_pm_runtime_get(struct xe_device *xe);
> +int xe_pm_runtime_get_sync(struct xe_device *xe);
>   void xe_pm_runtime_put(struct xe_device *xe);
>   int xe_pm_runtime_get_if_active(struct xe_device *xe);
>   bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state
  2024-01-26 20:30 ` [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state Rodrigo Vivi
@ 2024-02-05  9:55   ` Matthew Auld
  2024-02-14 18:15     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05  9:55 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> The mem_access helpers are going away and getting replaced by
> direct calls of the xe_pm_runtime_{get,put} functions. However, an
> assertion with a warning splat is desired when we hit the worst
> case of a memory access with the device really in the 'suspended'
> state.
> 
> Also, this needs to be the first step. Otherwise, the upcoming
> conversion would be really noise with warn splats of missing mem_access
> gets.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c | 13 ++++++++++++-
>   drivers/gpu/drm/xe/xe_pm.c     | 16 ++++++++++++++++
>   drivers/gpu/drm/xe/xe_pm.h     |  1 +
>   3 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 6faa7865b1aa..01db34f06a7d 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -653,9 +653,20 @@ bool xe_device_mem_access_ongoing(struct xe_device *xe)
>   	return atomic_read(&xe->mem_access.ref);
>   }
>   
> +/**
> + * xe_device_assert_mem_access - Inspect the current runtime_pm state.
> + * @xe: xe device instance
> + *
> + * To be used before any kind of memory access. It will splat a debug warning
> + * if the device is currently sleeping. But it doesn't guarantee in any way
> + * that the device is going to continue awake. Xe PM runtime get and put

s/continue awake/remain awake/ ?

> + * functions might be added to the outer bound of the memory access, while
> + * this check is intended for inner usage to splat some warning if the worst
> + * case has just happened.
> + */
>   void xe_device_assert_mem_access(struct xe_device *xe)
>   {
> -	XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> +	XE_WARN_ON(xe_pm_runtime_suspended(xe));

I guess could also check if we are inside a callback and it doesn't 
match current? Sorry if that was already suggested, and there were good 
reasons to avoid.
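
(Roughly what that extra check might look like, as a sketch combining
the existing helpers; not something proposed in the patch itself:)

        void xe_device_assert_mem_access(struct xe_device *xe)
        {
                struct task_struct *pm_task = xe_pm_read_callback_task(xe);

                /* A PM callback running in another task means a transition
                 * is in flight, even if we are not suspended yet. */
                XE_WARN_ON(pm_task && pm_task != current);
                XE_WARN_ON(xe_pm_runtime_suspended(xe));
        }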

>   }
>   
>   bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe)
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 9910b748adab..3ef14937d5d2 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -252,6 +252,22 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe)
>   	return READ_ONCE(xe->pm_callback_task);
>   }
>   
> +/**
> + * xe_pm_runtime_suspended - Inspect the current runtime_pm state.

"Check if runtime_pm state is suspended" ?

> + * @xe: xe device instance
> + *
> + * This does not provide any guarantee that the device is going to continue
> + * suspended as it might be racing with the runtime state transitions.

s/continue suspended/remain suspended/ ?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> + * It can be used only as a non-reliable assertion, to ensure that we are not in
> + * the sleep state while trying to access some memory for instance.
> + *
> + * Returns true if PCI device is suspended, false otherwise.
> + */
> +bool xe_pm_runtime_suspended(struct xe_device *xe)
> +{
> +	return pm_runtime_suspended(xe->drm.dev);
> +}
> +
>   /**
>    * xe_pm_runtime_suspend - Prepare our device for D3hot/D3Cold
>    * @xe: xe device instance
> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> index ecb19ee10db6..a672adffd0e1 100644
> --- a/drivers/gpu/drm/xe/xe_pm.h
> +++ b/drivers/gpu/drm/xe/xe_pm.h
> @@ -23,6 +23,7 @@ int xe_pm_resume(struct xe_device *xe);
>   void xe_pm_init_early(struct xe_device *xe);
>   void xe_pm_init(struct xe_device *xe);
>   void xe_pm_runtime_fini(struct xe_device *xe);
> +bool xe_pm_runtime_suspended(struct xe_device *xe);
>   int xe_pm_runtime_suspend(struct xe_device *xe);
>   int xe_pm_runtime_resume(struct xe_device *xe);
>   void xe_pm_runtime_get(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
  2024-01-26 20:30 ` [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-02-05  9:57   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05  9:57 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's convert the kunit tests that are currently relying on
> xe_device_mem_access_{get,put} towards the direct xe_pm_runtime_{get,put}.
> While doing this we need to move the get/put calls towards the outer
> bounds of the tests to ensure consistency with the other usages of
> pm_runtime on the regular paths.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime
  2024-01-26 20:30 ` [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime Rodrigo Vivi
@ 2024-02-05 10:46   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 10:46 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe, Matthew Brost

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's ensure our PCI device is awaken on every GT execution to
> the end of the execution.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_exec_queue.c          | 19 -------------------
>   drivers/gpu/drm/xe/xe_gpu_scheduler.c       |  8 ++++++--
>   drivers/gpu/drm/xe/xe_gpu_scheduler.h       |  3 ++-
>   drivers/gpu/drm/xe/xe_gpu_scheduler_types.h |  2 ++
>   drivers/gpu/drm/xe/xe_guc_submit.c          |  2 +-
>   drivers/gpu/drm/xe/xe_sched_job.c           | 12 +++++++-----
>   6 files changed, 18 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index c0b7434e78f1..04bd6a639d41 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -111,7 +111,6 @@ static void __xe_exec_queue_free(struct xe_exec_queue *q)
>   
>   static int __xe_exec_queue_init(struct xe_exec_queue *q)
>   {
> -	struct xe_device *xe = gt_to_xe(q->gt);
>   	int i, err;
>   
>   	for (i = 0; i < q->width; ++i) {
> @@ -124,17 +123,6 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q)
>   	if (err)
>   		goto err_lrc;
>   
> -	/*
> -	 * Normally the user vm holds an rpm ref to keep the device
> -	 * awake, and the context holds a ref for the vm, however for
> -	 * some engines we use the kernels migrate vm underneath which offers no
> -	 * such rpm ref, or we lack a vm. Make sure we keep a ref here, so we
> -	 * can perform GuC CT actions when needed. Caller is expected to have
> -	 * already grabbed the rpm ref outside any sensitive locks.
> -	 */
> -	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
> -		drm_WARN_ON(&xe->drm, !xe_device_mem_access_get_if_ongoing(xe));
> -
>   	return 0;
>   
>   err_lrc:
> @@ -221,8 +209,6 @@ void xe_exec_queue_fini(struct xe_exec_queue *q)
>   
>   	for (i = 0; i < q->width; ++i)
>   		xe_lrc_finish(q->lrc + i);
> -	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
> -		xe_device_mem_access_put(gt_to_xe(q->gt));
>   	__xe_exec_queue_free(q);
>   }
>   
> @@ -704,9 +690,6 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>   			if (XE_IOCTL_DBG(xe, !hwe))
>   				return -EINVAL;
>   
> -			/* The migration vm doesn't hold rpm ref */
> -			xe_device_mem_access_get(xe);
> -
>   			flags = EXEC_QUEUE_FLAG_PERSISTENT | EXEC_QUEUE_FLAG_VM |
>   				(id ? EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD : 0);
>   
> @@ -715,8 +698,6 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>   						   args->width, hwe, flags,
>   						   args->extensions);
>   
> -			xe_device_mem_access_put(xe); /* now held by engine */
> -
>   			xe_vm_put(migrate_vm);
>   			if (IS_ERR(new)) {
>   				err = PTR_ERR(new);
> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> index e4ad1d6ce1d5..457f814bfee8 100644
> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> @@ -4,6 +4,7 @@
>    */
>   
>   #include "xe_gpu_scheduler.h"
> +#include "xe_pm.h"
>   
>   static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
>   {
> @@ -48,13 +49,15 @@ static void xe_sched_process_msg_work(struct work_struct *w)
>   
>   	msg = xe_sched_get_msg(sched);
>   	if (msg) {
> +		xe_pm_runtime_get(sched->xe);

This seems very deep? I get the feeling this should be if_active() or 
done higher up.

What does lockdep say about this with our lockdep pm annotations? I 
think rpm suspend code wants to flush this worker, but if device is 
SUSPENDING then we are already in our pm callback, and it won't be able 
to make forward progress since worker here is potentially trying to wake 
up the device. Seems like potential deadlocks.

See suspend -> xe_guc_submit_stop() -> guc_exec_queue_stop() -> 
xe_sched_submission_stop().
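
(One possible shape of the if_active() variant, as a sketch; a real fix
would also need to avoid losing the message when no ref is taken:)

        msg = xe_sched_get_msg(sched);
        if (msg) {
                /* Never wake the device from this worker: a suspend
                 * callback flushing it would deadlock against us. */
                if (xe_pm_runtime_get_if_active(sched->xe) <= 0)
                        return;

                sched->ops->process_msg(msg);
                xe_sched_process_msg_queue_if_ready(sched);
                xe_pm_runtime_put(sched->xe);
        }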

>   		sched->ops->process_msg(msg);
> -
>   		xe_sched_process_msg_queue_if_ready(sched);
> +		xe_pm_runtime_put(sched->xe);
>   	}
>   }
>   
> -int xe_sched_init(struct xe_gpu_scheduler *sched,
> +int xe_sched_init(struct xe_device *xe,
> +		  struct xe_gpu_scheduler *sched,
>   		  const struct drm_sched_backend_ops *ops,
>   		  const struct xe_sched_backend_ops *xe_ops,
>   		  struct workqueue_struct *submit_wq,
> @@ -63,6 +66,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
>   		  atomic_t *score, const char *name,
>   		  struct device *dev)
>   {
> +	sched->xe = xe;
>   	sched->ops = xe_ops;
>   	INIT_LIST_HEAD(&sched->msgs);
>   	INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work);
> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> index 10c6bb9c9386..bac37c311859 100644
> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> @@ -9,7 +9,8 @@
>   #include "xe_gpu_scheduler_types.h"
>   #include "xe_sched_job_types.h"
>   
> -int xe_sched_init(struct xe_gpu_scheduler *sched,
> +int xe_sched_init(struct xe_device *xe,
> +		  struct xe_gpu_scheduler *sched,
>   		  const struct drm_sched_backend_ops *ops,
>   		  const struct xe_sched_backend_ops *xe_ops,
>   		  struct workqueue_struct *submit_wq,
> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
> index 6731b13da8bb..f14b88264025 100644
> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
> @@ -43,6 +43,8 @@ struct xe_sched_backend_ops {
>   struct xe_gpu_scheduler {
>   	/** @base: DRM GPU scheduler */
>   	struct drm_gpu_scheduler		base;
> +	/** @xe: back pointer to Xe Device */
> +	struct xe_device			*xe;
>   	/** @ops: Xe scheduler ops */
>   	const struct xe_sched_backend_ops	*ops;
>   	/** @msgs: list of messages to be processed in @work_process_msg */
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 2b008ec1b6de..6b313b9b351a 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -1219,7 +1219,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
>   
>   	timeout = (q->vm && xe_vm_in_lr_mode(q->vm)) ? MAX_SCHEDULE_TIMEOUT :
>   		  q->sched_props.job_timeout_ms;
> -	err = xe_sched_init(&ge->sched, &drm_sched_ops, &xe_sched_ops,
> +	err = xe_sched_init(xe, &ge->sched, &drm_sched_ops, &xe_sched_ops,
>   			    get_submit_wq(guc),
>   			    q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES, 64,
>   			    timeout, guc_to_gt(guc)->ordered_wq, NULL,
> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
> index 01106a1156ad..c93284ec66c5 100644
> --- a/drivers/gpu/drm/xe/xe_sched_job.c
> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
> @@ -15,6 +15,7 @@
>   #include "xe_hw_fence.h"
>   #include "xe_lrc.h"
>   #include "xe_macros.h"
> +#include "xe_pm.h"
>   #include "xe_trace.h"
>   #include "xe_vm.h"
>   
> @@ -67,6 +68,8 @@ static void job_free(struct xe_sched_job *job)
>   	struct xe_exec_queue *q = job->q;
>   	bool is_migration = xe_sched_job_is_migration(q);
>   
> +	xe_pm_runtime_put(gt_to_xe(q->gt));
> +
>   	kmem_cache_free(xe_exec_queue_is_parallel(job->q) || is_migration ?
>   			xe_sched_job_parallel_slab : xe_sched_job_slab, job);
>   }
> @@ -79,6 +82,7 @@ static struct xe_device *job_to_xe(struct xe_sched_job *job)
>   struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>   					 u64 *batch_addr)
>   {
> +	struct xe_device *xe = gt_to_xe(q->gt);
>   	struct xe_sched_job *job;
>   	struct dma_fence **fences;
>   	bool is_migration = xe_sched_job_is_migration(q);
> @@ -86,6 +90,9 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>   	int i, j;
>   	u32 width;
>   
> +	if (!xe_pm_runtime_get_if_in_use(xe))
> +		drm_err(&xe->drm, "Failed to grab RPM ref for sched_job\n");

Could just return an error here also?
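
(i.e. something like this sketch, letting callers see the failure
instead of only logging it; the errno choice is hypothetical:)

        if (!xe_pm_runtime_get_if_in_use(xe))
                return ERR_PTR(-ENODEV);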

> +
>   	/* only a kernel context can submit a vm-less job */
>   	XE_WARN_ON(!q->vm && !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
>   
> @@ -155,9 +162,6 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>   	for (i = 0; i < width; ++i)
>   		job->batch_addr[i] = batch_addr[i];
>   
> -	/* All other jobs require a VM to be open which has a ref */
> -	if (unlikely(q->flags & EXEC_QUEUE_FLAG_KERNEL))
> -		xe_device_mem_access_get(job_to_xe(job));
>   	xe_device_assert_mem_access(job_to_xe(job));
>   
>   	trace_xe_sched_job_create(job);
> @@ -189,8 +193,6 @@ void xe_sched_job_destroy(struct kref *ref)
>   	struct xe_sched_job *job =
>   		container_of(ref, struct xe_sched_job, refcount);
>   
> -	if (unlikely(job->q->flags & EXEC_QUEUE_FLAG_KERNEL))
> -		xe_device_mem_access_put(job_to_xe(job));
>   	xe_exec_queue_put(job->q);
>   	dma_fence_put(job->fence);
>   	drm_sched_job_cleanup(&job->drm);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call
  2024-01-26 20:30 ` [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call Rodrigo Vivi
@ 2024-02-05 10:55   ` Matthew Auld
  2024-02-14 18:48     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 10:55 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's ensure our PCI device is awaken on every sysfs call.
> Let's increase the runtime_pm protection and start moving
> that to the outer bounds.
> 
> For now, for the files with small number of attr functions,
> let's only call the runtime pm functions directly.
> For the hw_engines entries with many files, let's add
> the sysfs_ops wrapper.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device_sysfs.c          |  4 ++
>   drivers/gpu/drm/xe/xe_gt_freq.c               | 38 +++++++++++-
>   drivers/gpu/drm/xe/xe_gt_idle.c               | 23 +++++++-
>   drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c     |  3 +
>   drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 58 ++++++++++++++++++-
>   drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h |  7 +++
>   drivers/gpu/drm/xe/xe_tile_sysfs.c            |  1 +
>   7 files changed, 129 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device_sysfs.c b/drivers/gpu/drm/xe/xe_device_sysfs.c
> index 99113a5a2b84..e47c8ad1bb17 100644
> --- a/drivers/gpu/drm/xe/xe_device_sysfs.c
> +++ b/drivers/gpu/drm/xe/xe_device_sysfs.c
> @@ -35,7 +35,9 @@ vram_d3cold_threshold_show(struct device *dev,
>   	if (!xe)
>   		return -EINVAL;
>   
> +	xe_pm_runtime_get(xe);

Does it make sense to use the get_sync/ioctl version?

Assuming CI is happy,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
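
(The trade-off being asked about, sketched with the helpers from this
series, ignoring the -ELOOP recursion case for brevity:)

        xe_pm_runtime_get(xe);          /* void: a failed resume goes unnoticed */

        ret = xe_pm_runtime_get_sync(xe);       /* int: lets the handler bail out */
        if (ret < 0) {
                xe_pm_runtime_put(xe);  /* get_sync keeps its ref on failure */
                return ret;
        }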

>   	ret = sysfs_emit(buf, "%d\n", xe->d3cold.vram_threshold);
> +	xe_pm_runtime_put(xe);
>   
>   	return ret;
>   }
> @@ -58,7 +60,9 @@ vram_d3cold_threshold_store(struct device *dev, struct device_attribute *attr,
>   
>   	drm_dbg(&xe->drm, "vram_d3cold_threshold: %u\n", vram_d3cold_threshold);
>   
> +	xe_pm_runtime_get(xe);
>   	ret = xe_pm_set_vram_threshold(xe, vram_d3cold_threshold);
> +	xe_pm_runtime_put(xe);
>   
>   	return ret ?: count;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_gt_freq.c b/drivers/gpu/drm/xe/xe_gt_freq.c
> index e5b0f4ecdbe8..32b9a743629c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_freq.c
> +++ b/drivers/gpu/drm/xe/xe_gt_freq.c
> @@ -15,6 +15,7 @@
>   #include "xe_gt_sysfs.h"
>   #include "xe_gt_throttle_sysfs.h"
>   #include "xe_guc_pc.h"
> +#include "xe_pm.h"
>   
>   /**
>    * DOC: Xe GT Frequency Management
> @@ -49,12 +50,23 @@ dev_to_pc(struct device *dev)
>   	return &kobj_to_gt(dev->kobj.parent)->uc.guc.pc;
>   }
>   
> +static struct xe_device *
> +dev_to_xe(struct device *dev)
> +{
> +	return gt_to_xe(kobj_to_gt(dev->kobj.parent));
> +}
> +
>   static ssize_t act_freq_show(struct device *dev,
>   			     struct device_attribute *attr, char *buf)
>   {
>   	struct xe_guc_pc *pc = dev_to_pc(dev);
> +	u32 freq;
> +
> +	xe_pm_runtime_get(dev_to_xe(dev));
> +	freq = xe_guc_pc_get_act_freq(pc);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   
> -	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_act_freq(pc));
> +	return sysfs_emit(buf, "%d\n", freq);
>   }
>   static DEVICE_ATTR_RO(act_freq);
>   
> @@ -65,7 +77,9 @@ static ssize_t cur_freq_show(struct device *dev,
>   	u32 freq;
>   	ssize_t ret;
>   
> +	xe_pm_runtime_get(dev_to_xe(dev));
>   	ret = xe_guc_pc_get_cur_freq(pc, &freq);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   	if (ret)
>   		return ret;
>   
> @@ -77,8 +91,13 @@ static ssize_t rp0_freq_show(struct device *dev,
>   			     struct device_attribute *attr, char *buf)
>   {
>   	struct xe_guc_pc *pc = dev_to_pc(dev);
> +	u32 freq;
> +
> +	xe_pm_runtime_get(dev_to_xe(dev));
> +	freq = xe_guc_pc_get_rp0_freq(pc);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   
> -	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_rp0_freq(pc));
> +	return sysfs_emit(buf, "%d\n", freq);
>   }
>   static DEVICE_ATTR_RO(rp0_freq);
>   
> @@ -86,8 +105,13 @@ static ssize_t rpe_freq_show(struct device *dev,
>   			     struct device_attribute *attr, char *buf)
>   {
>   	struct xe_guc_pc *pc = dev_to_pc(dev);
> +	u32 freq;
> +
> +	xe_pm_runtime_get(dev_to_xe(dev));
> +	freq = xe_guc_pc_get_rpe_freq(pc);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   
> -	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_rpe_freq(pc));
> +	return sysfs_emit(buf, "%d\n", freq);
>   }
>   static DEVICE_ATTR_RO(rpe_freq);
>   
> @@ -107,7 +131,9 @@ static ssize_t min_freq_show(struct device *dev,
>   	u32 freq;
>   	ssize_t ret;
>   
> +	xe_pm_runtime_get(dev_to_xe(dev));
>   	ret = xe_guc_pc_get_min_freq(pc, &freq);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   	if (ret)
>   		return ret;
>   
> @@ -125,7 +151,9 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
>   	if (ret)
>   		return ret;
>   
> +	xe_pm_runtime_get(dev_to_xe(dev));
>   	ret = xe_guc_pc_set_min_freq(pc, freq);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   	if (ret)
>   		return ret;
>   
> @@ -140,7 +168,9 @@ static ssize_t max_freq_show(struct device *dev,
>   	u32 freq;
>   	ssize_t ret;
>   
> +	xe_pm_runtime_get(dev_to_xe(dev));
>   	ret = xe_guc_pc_get_max_freq(pc, &freq);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   	if (ret)
>   		return ret;
>   
> @@ -158,7 +188,9 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
>   	if (ret)
>   		return ret;
>   
> +	xe_pm_runtime_get(dev_to_xe(dev));
>   	ret = xe_guc_pc_set_max_freq(pc, freq);
> +	xe_pm_runtime_put(dev_to_xe(dev));
>   	if (ret)
>   		return ret;
>   
> diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c
> index 9358f7336889..824ff458011f 100644
> --- a/drivers/gpu/drm/xe/xe_gt_idle.c
> +++ b/drivers/gpu/drm/xe/xe_gt_idle.c
> @@ -12,6 +12,7 @@
>   #include "xe_guc_pc.h"
>   #include "regs/xe_gt_regs.h"
>   #include "xe_mmio.h"
> +#include "xe_pm.h"
>   
>   /**
>    * DOC: Xe GT Idle
> @@ -40,6 +41,15 @@ static struct xe_guc_pc *gtidle_to_pc(struct xe_gt_idle *gtidle)
>   	return &gtidle_to_gt(gtidle)->uc.guc.pc;
>   }
>   
> +static struct xe_device *
> +pc_to_xe(struct xe_guc_pc *pc)
> +{
> +	struct xe_guc *guc = container_of(pc, struct xe_guc, pc);
> +	struct xe_gt *gt = container_of(guc, struct xe_gt, uc.guc);
> +
> +	return gt_to_xe(gt);
> +}
> +
>   static const char *gt_idle_state_to_string(enum xe_gt_idle_state state)
>   {
>   	switch (state) {
> @@ -86,8 +96,14 @@ static ssize_t name_show(struct device *dev,
>   			 struct device_attribute *attr, char *buff)
>   {
>   	struct xe_gt_idle *gtidle = dev_to_gtidle(dev);
> +	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
> +	ssize_t ret;
> +
> +	xe_pm_runtime_get(pc_to_xe(pc));
> +	ret = sysfs_emit(buff, "%s\n", gtidle->name);
> +	xe_pm_runtime_put(pc_to_xe(pc));
>   
> -	return sysfs_emit(buff, "%s\n", gtidle->name);
> +	return ret;
>   }
>   static DEVICE_ATTR_RO(name);
>   
> @@ -98,7 +114,9 @@ static ssize_t idle_status_show(struct device *dev,
>   	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
>   	enum xe_gt_idle_state state;
>   
> +	xe_pm_runtime_get(pc_to_xe(pc));
>   	state = gtidle->idle_status(pc);
> +	xe_pm_runtime_put(pc_to_xe(pc));
>   
>   	return sysfs_emit(buff, "%s\n", gt_idle_state_to_string(state));
>   }
> @@ -111,7 +129,10 @@ static ssize_t idle_residency_ms_show(struct device *dev,
>   	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
>   	u64 residency;
>   
> +	xe_pm_runtime_get(pc_to_xe(pc));
>   	residency = gtidle->idle_residency(pc);
> +	xe_pm_runtime_put(pc_to_xe(pc));
> +
>   	return sysfs_emit(buff, "%llu\n", get_residency_ms(gtidle, residency));
>   }
>   static DEVICE_ATTR_RO(idle_residency_ms);
> diff --git a/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c b/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
> index 63d640591a52..9c33045ff1ef 100644
> --- a/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
> @@ -11,6 +11,7 @@
>   #include "xe_gt_sysfs.h"
>   #include "xe_gt_throttle_sysfs.h"
>   #include "xe_mmio.h"
> +#include "xe_pm.h"
>   
>   /**
>    * DOC: Xe GT Throttle
> @@ -38,10 +39,12 @@ static u32 read_perf_limit_reasons(struct xe_gt *gt)
>   {
>   	u32 reg;
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	if (xe_gt_is_media_type(gt))
>   		reg = xe_mmio_read32(gt, MTL_MEDIA_PERF_LIMIT_REASONS);
>   	else
>   		reg = xe_mmio_read32(gt, GT0_PERF_LIMIT_REASONS);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return reg;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
> index 2345fb42fa39..9e23ca7f45ad 100644
> --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
> +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
> @@ -9,6 +9,7 @@
>   
>   #include "xe_gt.h"
>   #include "xe_hw_engine_class_sysfs.h"
> +#include "xe_pm.h"
>   
>   #define MAX_ENGINE_CLASS_NAME_LEN    16
>   static int xe_add_hw_engine_class_defaults(struct xe_device *xe,
> @@ -513,6 +514,7 @@ kobj_xe_hw_engine_class(struct xe_device *xe, struct kobject *parent, char *name
>   		kobject_put(&keclass->base);
>   		return NULL;
>   	}
> +	keclass->xe = xe;
>   
>   	err = drmm_add_action_or_reset(&xe->drm, kobj_xe_hw_engine_class_fini,
>   				       &keclass->base);
> @@ -567,9 +569,63 @@ static void xe_hw_engine_sysfs_kobj_release(struct kobject *kobj)
>   	kfree(kobj);
>   }
>   
> +#include "xe_pm.h"
> +
> +static inline struct xe_device *pdev_to_xe_device(struct pci_dev *pdev)
> +{
> +	return pci_get_drvdata(pdev);
> +}
> +
> +static inline struct xe_device *to_xe_device(const struct drm_device *dev)
> +{
> +	return container_of(dev, struct xe_device, drm);
> +}
> +
> +static ssize_t xe_hw_engine_class_sysfs_attr_show(struct kobject *kobj,
> +						  struct attribute *attr,
> +						  char *buf)
> +{
> +	struct xe_device *xe = kobj_to_xe(kobj);
> +	struct kobj_attribute *kattr;
> +	ssize_t ret = -EIO;
> +
> +	kattr = container_of(attr, struct kobj_attribute, attr);
> +	if (kattr->show) {
> +		xe_pm_runtime_get(xe);
> +		ret = kattr->show(kobj, kattr, buf);
> +		xe_pm_runtime_put(xe);
> +	}
> +
> +	return ret;
> +}
> +
> +static ssize_t xe_hw_engine_class_sysfs_attr_store(struct kobject *kobj,
> +						   struct attribute *attr,
> +						   const char *buf,
> +						   size_t count)
> +{
> +	struct xe_device *xe = kobj_to_xe(kobj);
> +	struct kobj_attribute *kattr;
> +	ssize_t ret = -EIO;
> +
> +	kattr = container_of(attr, struct kobj_attribute, attr);
> +	if (kattr->store) {
> +		xe_pm_runtime_get(xe);
> +		ret = kattr->store(kobj, kattr, buf, count);
> +		xe_pm_runtime_put(xe);
> +	}
> +
> +	return ret;
> +}
> +
> +static const struct sysfs_ops xe_hw_engine_class_sysfs_ops = {
> +	.show = xe_hw_engine_class_sysfs_attr_show,
> +	.store = xe_hw_engine_class_sysfs_attr_store,
> +};
> +
>   static const struct kobj_type xe_hw_engine_sysfs_kobj_type = {
>   	.release = xe_hw_engine_sysfs_kobj_release,
> -	.sysfs_ops = &kobj_sysfs_ops,
> +	.sysfs_ops = &xe_hw_engine_class_sysfs_ops,
>   };
>   
>   static void hw_engine_class_sysfs_fini(struct drm_device *drm, void *arg)
> diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
> index ec5ba673b314..28a0d7c909c0 100644
> --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
> +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
> @@ -26,6 +26,8 @@ struct kobj_eclass {
>   	struct kobject base;
>   	/** @eclass: A pointer to the hw engine class interface */
>   	struct xe_hw_engine_class_intf *eclass;
> +	/** @xe: A pointer to the xe device */
> +	struct xe_device *xe;
>   };
>   
>   static inline struct xe_hw_engine_class_intf *kobj_to_eclass(struct kobject *kobj)
> @@ -33,4 +35,9 @@ static inline struct xe_hw_engine_class_intf *kobj_to_eclass(struct kobject *kob
>   	return container_of(kobj, struct kobj_eclass, base)->eclass;
>   }
>   
> +static inline struct xe_device *kobj_to_xe(struct kobject *kobj)
> +{
> +	return container_of(kobj, struct kobj_eclass, base)->xe;
> +}
> +
>   #endif
> diff --git a/drivers/gpu/drm/xe/xe_tile_sysfs.c b/drivers/gpu/drm/xe/xe_tile_sysfs.c
> index 0662968d7bcb..237a0761d3ad 100644
> --- a/drivers/gpu/drm/xe/xe_tile_sysfs.c
> +++ b/drivers/gpu/drm/xe/xe_tile_sysfs.c
> @@ -7,6 +7,7 @@
>   #include <linux/sysfs.h>
>   #include <drm/drm_managed.h>
>   
> +#include "xe_pm.h"
>   #include "xe_tile.h"
>   #include "xe_tile_sysfs.h"
>   #include "xe_vram_freq.h"

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 12/34] drm/xe: Ensure device is awake before removing it
  2024-01-26 20:30 ` [RFC 12/34] drm/xe: Ensure device is awake before removing it Rodrigo Vivi
@ 2024-02-05 11:05   ` Matthew Auld
  2024-02-14 18:51     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:05 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> If the device is suspended, module unload might run into trouble.
> So, let's ensure that the very first thing the "unprobe" (remove)
> path does is to wake the device.

The core kernel already calls pm_runtime_get_sync() on the device 
before invoking probe/remove. Do we still need this?
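
For reference, a rough sketch of what the driver core does around 
remove (simplified from drivers/base/dd.c; treat the exact call chain 
as an assumption to double-check, since it varies by kernel version):

	/* simplified sketch of __device_release_driver() */
	pm_runtime_get_sync(dev);
	...
	drv->remove(dev);	/* ends up in xe_pci_remove() */
	...
	pm_runtime_put_sync(dev);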

> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_pci.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index df6b3b31f419..0f296df729c4 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -686,8 +686,8 @@ static void xe_pci_remove(struct pci_dev *pdev)
>   	if (!xe) /* driver load aborted, nothing to cleanup */
>   		return;
>   
> -	xe_device_remove(xe);
>   	xe_pm_runtime_fini(xe);
> +	xe_device_remove(xe);
>   	pci_set_drvdata(pdev, NULL);
>   }
>   

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls
  2024-01-26 20:30 ` [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls Rodrigo Vivi
@ 2024-02-05 11:08   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:08 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> We are now protected at the init, sysfs, and removal boundaries, and
> don't need these mem_access protections around GuC_PC anymore.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call
  2024-01-26 20:30 ` [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call Rodrigo Vivi
@ 2024-02-05 11:10   ` Matthew Auld
  2024-02-14 18:57     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:10 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's ensure our PCI device is awake on every debugfs call.
> Let's increase the runtime_pm protection and start moving
> it to the outer bounds.
> 
> Also remove the mem_access get_put helpers, now that they are not
> needed anymore.

Wrong commit?

> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Otherwise,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> ---
>   drivers/gpu/drm/xe/xe_debugfs.c     | 10 +++---
>   drivers/gpu/drm/xe/xe_gt_debugfs.c  | 53 ++++++++++++++++++++++++++---
>   drivers/gpu/drm/xe/xe_guc_debugfs.c |  9 ++---
>   drivers/gpu/drm/xe/xe_huc_debugfs.c |  5 +--
>   drivers/gpu/drm/xe/xe_ttm_sys_mgr.c |  5 ++-
>   5 files changed, 66 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index 01db5b27bec5..8abdf3c17e1d 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -12,6 +12,7 @@
>   #include "xe_bo.h"
>   #include "xe_device.h"
>   #include "xe_gt_debugfs.h"
> +#include "xe_pm.h"
>   #include "xe_step.h"
>   
>   #ifdef CONFIG_DRM_XE_DEBUG
> @@ -37,6 +38,8 @@ static int info(struct seq_file *m, void *data)
>   	struct xe_gt *gt;
>   	u8 id;
>   
> +	xe_pm_runtime_get(xe);
> +
>   	drm_printf(&p, "graphics_verx100 %d\n", xe->info.graphics_verx100);
>   	drm_printf(&p, "media_verx100 %d\n", xe->info.media_verx100);
>   	drm_printf(&p, "stepping G:%s M:%s D:%s B:%s\n",
> @@ -63,6 +66,7 @@ static int info(struct seq_file *m, void *data)
>   			   gt->info.engine_mask);
>   	}
>   
> +	xe_pm_runtime_put(xe);
>   	return 0;
>   }
>   
> @@ -76,8 +80,7 @@ static int forcewake_open(struct inode *inode, struct file *file)
>   	struct xe_gt *gt;
>   	u8 id;
>   
> -	xe_device_mem_access_get(xe);
> -
> +	xe_pm_runtime_get(xe);
>   	for_each_gt(gt, xe, id)
>   		XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL));
>   
> @@ -92,8 +95,7 @@ static int forcewake_release(struct inode *inode, struct file *file)
>   
>   	for_each_gt(gt, xe, id)
>   		XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
> -
> -	xe_device_mem_access_put(xe);
> +	xe_pm_runtime_put(xe);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> index c4b67cf09f8f..6b4dc2927727 100644
> --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> @@ -18,6 +18,7 @@
>   #include "xe_lrc.h"
>   #include "xe_macros.h"
>   #include "xe_pat.h"
> +#include "xe_pm.h"
>   #include "xe_reg_sr.h"
>   #include "xe_reg_whitelist.h"
>   #include "xe_uc_debugfs.h"
> @@ -37,10 +38,10 @@ static int hw_engines(struct seq_file *m, void *data)
>   	enum xe_hw_engine_id id;
>   	int err;
>   
> -	xe_device_mem_access_get(xe);
> +	xe_pm_runtime_get(xe);
>   	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>   	if (err) {
> -		xe_device_mem_access_put(xe);
> +		xe_pm_runtime_put(xe);
>   		return err;
>   	}
>   
> @@ -48,7 +49,7 @@ static int hw_engines(struct seq_file *m, void *data)
>   		xe_hw_engine_print(hwe, &p);
>   
>   	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> -	xe_device_mem_access_put(xe);
> +	xe_pm_runtime_put(xe);
>   	if (err)
>   		return err;
>   
> @@ -59,18 +60,23 @@ static int force_reset(struct seq_file *m, void *data)
>   {
>   	struct xe_gt *gt = node_to_gt(m->private);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_gt_reset_async(gt);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return 0;
>   }
>   
>   static int sa_info(struct seq_file *m, void *data)
>   {
> -	struct xe_tile *tile = gt_to_tile(node_to_gt(m->private));
> +	struct xe_gt *gt = node_to_gt(m->private);
> +	struct xe_tile *tile = gt_to_tile(gt);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	drm_suballoc_dump_debug_info(&tile->mem.kernel_bb_pool->base, &p,
>   				     tile->mem.kernel_bb_pool->gpu_addr);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return 0;
>   }
> @@ -80,7 +86,9 @@ static int topology(struct seq_file *m, void *data)
>   	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_gt_topology_dump(gt, &p);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return 0;
>   }
> @@ -90,7 +98,9 @@ static int steering(struct seq_file *m, void *data)
>   	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_gt_mcr_steering_dump(gt, &p);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return 0;
>   }
> @@ -99,8 +109,13 @@ static int ggtt(struct seq_file *m, void *data)
>   {
>   	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
> +	int ret;
> +
> +	xe_pm_runtime_get(gt_to_xe(gt));
> +	ret = xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
> -	return xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
> +	return ret;
>   }
>   
>   static int register_save_restore(struct seq_file *m, void *data)
> @@ -110,6 +125,8 @@ static int register_save_restore(struct seq_file *m, void *data)
>   	struct xe_hw_engine *hwe;
>   	enum xe_hw_engine_id id;
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
> +
>   	xe_reg_sr_dump(&gt->reg_sr, &p);
>   	drm_printf(&p, "\n");
>   
> @@ -127,6 +144,8 @@ static int register_save_restore(struct seq_file *m, void *data)
>   	for_each_hw_engine(hwe, gt, id)
>   		xe_reg_whitelist_dump(&hwe->reg_whitelist, &p);
>   
> +	xe_pm_runtime_put(gt_to_xe(gt));
> +
>   	return 0;
>   }
>   
> @@ -135,7 +154,9 @@ static int workarounds(struct seq_file *m, void *data)
>   	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_wa_dump(gt, &p);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return 0;
>   }
> @@ -145,48 +166,70 @@ static int pat(struct seq_file *m, void *data)
>   	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_pat_dump(gt, &p);
> +	xe_pm_runtime_put(gt_to_xe(gt));
>   
>   	return 0;
>   }
>   
>   static int rcs_default_lrc(struct seq_file *m, void *data)
>   {
> +	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_RENDER);
> +	xe_pm_runtime_put(gt_to_xe(gt));
> +
>   	return 0;
>   }
>   
>   static int ccs_default_lrc(struct seq_file *m, void *data)
>   {
> +	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_COMPUTE);
> +	xe_pm_runtime_put(gt_to_xe(gt));
> +
>   	return 0;
>   }
>   
>   static int bcs_default_lrc(struct seq_file *m, void *data)
>   {
> +	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_COPY);
> +	xe_pm_runtime_put(gt_to_xe(gt));
> +
>   	return 0;
>   }
>   
>   static int vcs_default_lrc(struct seq_file *m, void *data)
>   {
> +	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_VIDEO_DECODE);
> +	xe_pm_runtime_put(gt_to_xe(gt));
> +
>   	return 0;
>   }
>   
>   static int vecs_default_lrc(struct seq_file *m, void *data)
>   {
> +	struct xe_gt *gt = node_to_gt(m->private);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> +	xe_pm_runtime_get(gt_to_xe(gt));
>   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_VIDEO_ENHANCE);
> +	xe_pm_runtime_put(gt_to_xe(gt));
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/xe/xe_guc_debugfs.c b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> index ffd7d53bcc42..d3822cbea273 100644
> --- a/drivers/gpu/drm/xe/xe_guc_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> @@ -14,6 +14,7 @@
>   #include "xe_guc_ct.h"
>   #include "xe_guc_log.h"
>   #include "xe_macros.h"
> +#include "xe_pm.h"
>   
>   static struct xe_guc *node_to_guc(struct drm_info_node *node)
>   {
> @@ -26,9 +27,9 @@ static int guc_info(struct seq_file *m, void *data)
>   	struct xe_device *xe = guc_to_xe(guc);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> -	xe_device_mem_access_get(xe);
> +	xe_pm_runtime_get(xe);
>   	xe_guc_print_info(guc, &p);
> -	xe_device_mem_access_put(xe);
> +	xe_pm_runtime_put(xe);
>   
>   	return 0;
>   }
> @@ -39,9 +40,9 @@ static int guc_log(struct seq_file *m, void *data)
>   	struct xe_device *xe = guc_to_xe(guc);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> -	xe_device_mem_access_get(xe);
> +	xe_pm_runtime_get(xe);
>   	xe_guc_log_print(&guc->log, &p);
> -	xe_device_mem_access_put(xe);
> +	xe_pm_runtime_put(xe);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_huc_debugfs.c b/drivers/gpu/drm/xe/xe_huc_debugfs.c
> index 18585a7eeb9d..3a888a40188b 100644
> --- a/drivers/gpu/drm/xe/xe_huc_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_huc_debugfs.c
> @@ -12,6 +12,7 @@
>   #include "xe_gt.h"
>   #include "xe_huc.h"
>   #include "xe_macros.h"
> +#include "xe_pm.h"
>   
>   static struct xe_gt *
>   huc_to_gt(struct xe_huc *huc)
> @@ -36,9 +37,9 @@ static int huc_info(struct seq_file *m, void *data)
>   	struct xe_device *xe = huc_to_xe(huc);
>   	struct drm_printer p = drm_seq_file_printer(m);
>   
> -	xe_device_mem_access_get(xe);
> +	xe_pm_runtime_get(xe);
>   	xe_huc_print_info(huc, &p);
> -	xe_device_mem_access_put(xe);
> +	xe_pm_runtime_put(xe);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
> index 3e1fa0c832ca..9844a8edbfe1 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
> @@ -73,7 +73,10 @@ static void xe_ttm_sys_mgr_del(struct ttm_resource_manager *man,
>   static void xe_ttm_sys_mgr_debug(struct ttm_resource_manager *man,
>   				 struct drm_printer *printer)
>   {
> -
> +	/*
> +	 * This function is called by debugfs entry and would require
> +	 * pm_runtime_{get,put} wrappers around any operation.
> +	 */
>   }
>   
>   static const struct ttm_resource_manager_func xe_ttm_sys_mgr_func = {

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
  2024-01-26 20:30 ` [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls Rodrigo Vivi
@ 2024-02-05 11:15   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:15 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Continue on the path to entirely remove mem_access helpers in
> favor of the direct xe_pm_runtime calls. This item is one of

favour

> the direct outer bounds of the protection.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm
  2024-01-26 20:30 ` [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm Rodrigo Vivi
@ 2024-02-05 11:23   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:23 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> This is not needed any longer, now that we have all the protection
> in place with the runtime pm itself.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c | 8 --------
>   drivers/gpu/drm/xe/xe_device.h | 1 -
>   drivers/gpu/drm/xe/xe_pm.c     | 3 ---
>   3 files changed, 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index ab41202ecaf8..1711d9064ecc 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -673,14 +673,6 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
>   		DIV_ROUND_UP_ULL(size, NUM_BYTES_PER_CCS_BYTE(xe)) : 0;
>   }
>   
> -bool xe_device_mem_access_ongoing(struct xe_device *xe)
> -{
> -	if (xe_pm_read_callback_task(xe) != NULL)
> -		return true;
> -
> -	return atomic_read(&xe->mem_access.ref);
> -}
> -
>   /**
>    * xe_device_assert_mem_access - Inspect the current runtime_pm state.
>    * @xe: xe device instance
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 270124da1e00..74074939d157 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -146,7 +146,6 @@ bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe);
>   void xe_device_mem_access_put(struct xe_device *xe);
>   
>   void xe_device_assert_mem_access(struct xe_device *xe);
> -bool xe_device_mem_access_ongoing(struct xe_device *xe);
>   
>   static inline bool xe_device_in_fault_mode(struct xe_device *xe)
>   {
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index d98f4bb3ad02..967d3dc0ded5 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -281,9 +281,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	u8 id;
>   	int err = 0;
>   
> -	if (xe->d3cold.allowed && xe_device_mem_access_ongoing(xe))
> -		return -EBUSY;

Note that this condition can never trigger. If access_ongoing is true 
here, something is very broken. Not sure what the point of this is. 
Maybe reword the commit message?

> -
>   	/* Disable access_ongoing asserts and prevent recursive pm calls */
>   	xe_pm_write_callback_task(xe, current);
>   

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
  2024-01-26 20:30 ` [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls Rodrigo Vivi
@ 2024-02-05 11:25   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:25 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Continue the work to kill the mem_access in favor of a pure runtime pm.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime
  2024-01-26 20:30 ` [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-02-05 11:31   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:31 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> The mem_access itself is not holding any lock, but attempting
> to train lockdep with possibly scary locks happening during
> runtime pm. We are soon going to kill the mem_access get and put
> helpers in favor of direct xe_pm_runtime calls, so let's just
> move this lock to where it now belongs.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c | 23 -----------------
>   drivers/gpu/drm/xe/xe_device.h |  4 ---
>   drivers/gpu/drm/xe/xe_pm.c     | 45 ++++++++++++++++++++++++++++------
>   3 files changed, 37 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 1711d9064ecc..0b27e09b3db9 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -45,12 +45,6 @@
>   #include "xe_wait_user_fence.h"
>   #include "xe_hwmon.h"
>   
> -#ifdef CONFIG_LOCKDEP
> -struct lockdep_map xe_device_mem_access_lockdep_map = {
> -	.name = "xe_device_mem_access_lockdep_map"
> -};
> -#endif
> -
>   static int xe_file_open(struct drm_device *dev, struct drm_file *file)
>   {
>   	struct xe_device *xe = to_xe_device(dev);
> @@ -722,23 +716,6 @@ void xe_device_mem_access_get(struct xe_device *xe)
>   	if (xe_pm_read_callback_task(xe) == current)
>   		return;
>   
> -	/*
> -	 * Since the resume here is synchronous it can be quite easy to deadlock
> -	 * if we are not careful. Also in practice it might be quite timing
> -	 * sensitive to ever see the 0 -> 1 transition with the callers locks
> -	 * held, so deadlocks might exist but are hard for lockdep to ever see.
> -	 * With this in mind, help lockdep learn about the potentially scary
> -	 * stuff that can happen inside the runtime_resume callback by acquiring
> -	 * a dummy lock (it doesn't protect anything and gets compiled out on
> -	 * non-debug builds).  Lockdep then only needs to see the
> -	 * mem_access_lockdep_map -> runtime_resume callback once, and then can
> -	 * hopefully validate all the (callers_locks) -> mem_access_lockdep_map.
> -	 * For example if the (callers_locks) are ever grabbed in the
> -	 * runtime_resume callback, lockdep should give us a nice splat.
> -	 */
> -	lock_map_acquire(&xe_device_mem_access_lockdep_map);
> -	lock_map_release(&xe_device_mem_access_lockdep_map);
> -
>   	xe_pm_runtime_get(xe);
>   	ref = atomic_inc_return(&xe->mem_access.ref);
>   
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 74074939d157..050d4b6cdc65 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -16,10 +16,6 @@ struct xe_file;
>   #include "xe_force_wake.h"
>   #include "xe_macros.h"
>   
> -#ifdef CONFIG_LOCKDEP
> -extern struct lockdep_map xe_device_mem_access_lockdep_map;
> -#endif
> -
>   static inline struct xe_device *to_xe_device(const struct drm_device *dev)
>   {
>   	return container_of(dev, struct xe_device, drm);
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 967d3dc0ded5..86bf225dba02 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -66,6 +66,13 @@
>    * management (RPS).
>    */
>   
> +
> +#ifdef CONFIG_LOCKDEP
> +struct lockdep_map xe_pm_runtime_lockdep_map = {
> +	.name = "xe_pm_runtime_lockdep_map"
> +};
> +#endif
> +
>   /**
>    * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
>    * @xe: xe device instance
> @@ -285,11 +292,11 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	xe_pm_write_callback_task(xe, current);
>   
>   	/*
> -	 * The actual xe_device_mem_access_put() is always async underneath, so
> +	 * The actual xe_pm_runtime_put() is always async underneath, so
>   	 * exactly where that is called should makes no difference to us. However
>   	 * we still need to be very careful with the locks that this callback
>   	 * acquires and the locks that are acquired and held by any callers of
> -	 * xe_device_mem_access_get(). We already have the matching annotation
> +	 * xe_runtime_pm_get(). We already have the matching annotation
>   	 * on that side, but we also need it here. For example lockdep should be
>   	 * able to tell us if the following scenario is in theory possible:
>   	 *
> @@ -297,15 +304,15 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	 * lock(A)                       |
>   	 *                               | xe_pm_runtime_suspend()
>   	 *                               |      lock(A)
> -	 * xe_device_mem_access_get()    |
> +	 * xe_pm_runtime_get()           |
>   	 *
>   	 * This will clearly deadlock since rpm core needs to wait for
>   	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
>   	 * on CPU0 which prevents CPU1 making forward progress.  With the
> -	 * annotation here and in xe_device_mem_access_get() lockdep will see
> +	 * annotation here and in xe_pm_runtime_get() lockdep will see
>   	 * the potential lock inversion and give us a nice splat.
>   	 */
> -	lock_map_acquire(&xe_device_mem_access_lockdep_map);
> +	lock_map_acquire(&xe_pm_runtime_lockdep_map);
>   
>   	/*
>   	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
> @@ -334,7 +341,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	if (xe->d3cold.allowed)
>   		xe_display_pm_runtime_suspend(xe);
>   out:
> -	lock_map_release(&xe_device_mem_access_lockdep_map);
> +	lock_map_release(&xe_pm_runtime_lockdep_map);
>   	xe_pm_write_callback_task(xe, NULL);
>   	return err;
>   }
> @@ -354,7 +361,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>   	/* Disable access_ongoing asserts and prevent recursive pm calls */
>   	xe_pm_write_callback_task(xe, current);
>   
> -	lock_map_acquire(&xe_device_mem_access_lockdep_map);
> +	lock_map_acquire(&xe_pm_runtime_lockdep_map);
>   
>   	/*
>   	 * It can be possible that xe has allowed d3cold but other pcie devices
> @@ -393,11 +400,31 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>   			goto out;
>   	}
>   out:
> -	lock_map_release(&xe_device_mem_access_lockdep_map);
> +	lock_map_release(&xe_pm_runtime_lockdep_map);
>   	xe_pm_write_callback_task(xe, NULL);
>   	return err;
>   }
>   
> +/*
> + * For places where resume is synchronous it can be quite easy to deadlock
> + * if we are not careful. Also in practice it might be quite timing
> + * sensitive to ever see the 0 -> 1 transition with the callers locks
> + * held, so deadlocks might exist but are hard for lockdep to ever see.
> + * With this in mind, help lockdep learn about the potentially scary
> + * stuff that can happen inside the runtime_resume callback by acquiring
> + * a dummy lock (it doesn't protect anything and gets compiled out on
> + * non-debug builds).  Lockdep then only needs to see the
> + * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
> + * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
> + * For example if the (callers_locks) are ever grabbed in the
> + * runtime_resume callback, lockdep should give us a nice splat.
> + */
> +static void pm_runtime_lockdep_training(void)

Maybe xe_pm_runtime_lockdep_prime()? We are priming lockdep with all 
the incoming edges/locks.
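
Something like this, for instance (same body as the helper this patch 
adds, just renamed; untested):

	static void xe_pm_runtime_lockdep_prime(void)
	{
		/* dummy acquire/release to teach lockdep the rpm edges */
		lock_map_acquire(&xe_pm_runtime_lockdep_map);
		lock_map_release(&xe_pm_runtime_lockdep_map);
	}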

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> +{
> +	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> +	lock_map_release(&xe_pm_runtime_lockdep_map);
> +}
> +
>   /**
>    * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
>    * @xe: xe device instance
> @@ -409,6 +436,7 @@ void xe_pm_runtime_get(struct xe_device *xe)
>   	if (xe_pm_read_callback_task(xe) == current)
>   		return;
>   
> +	pm_runtime_lockdep_training();
>   	pm_runtime_resume(xe->drm.dev);
>   }
>   
> @@ -438,6 +466,7 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
>   	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
>   		return -ELOOP;
>   
> +	pm_runtime_lockdep_training();
>   	return pm_runtime_get_sync(xe->drm.dev);
>   }
>   

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 19/34] drm/xe: Remove pm_runtime lockdep
  2024-01-26 20:30 ` [RFC 19/34] drm/xe: Remove pm_runtime lockdep Rodrigo Vivi
@ 2024-02-05 11:54   ` Matthew Auld
  2024-02-15 22:47     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 11:54 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> This lockdep map was initially designed for the mem_access,
> where the mem_access needed to run the resume synchronously from
> the inner bounds and count the references.
> 
> With the runtime moving to the outer bounds of the driver
> and the mem_access replaced by the pure rpm get/put
> references, it is no longer needed and is, as a matter

We are also calling it in workers like invalidation_fence_work_func(), 
xe_sched_process_msg_work() etc. It is still quite easy to deadlock with 
that.
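
One hypothetical shape of such a deadlock, assuming the suspend side 
ends up waiting on the same worker (same pattern as the scenario in 
the comment this patch removes):

	CPU0 (kworker)                   | CPU1
	                                 | xe_pm_runtime_suspend()
	                                 |     ... waits for the worker
	invalidation_fence_work_func()   |         to finish
	    xe_pm_runtime_get()          |
	        waits for the suspend    |
	        callback to complete     |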

> of fact, just splatting many false positives such as the following:

Yeah, calling invalidation_fence_work_func() directly in the caller's 
context is a false positive since the ioctl holds the rpm ref. But what 
about calling it in the async case, where invalidation_fence_work_func() 
is only called when the in-fence signals? Are we sure that is safe? It 
should be simple to fix the false positive here, no? If we just delete 
all the annotations we get zero help from lockdep for the more 
complicated cases. Also IIRC lockdep only reports the first splat it 
triggers, so who knows what else is lurking and whether that is also a 
false positive?

> 
>                 -> #1 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
> [  384.778761]        xe_pm_runtime_get+0xa3/0x100 [xe]
> [  384.783871]        invalidation_fence_work_func+0x7f/0x2b0 [xe]
> [  384.789942]        invalidation_fence_init+0x8c2/0xce0 [xe]
> [  384.795671]        __xe_pt_unbind_vma+0x4a7/0x1be0 [xe]
> [  384.801050]        xe_vm_unbind+0x22f/0xc70 [xe]
> [  384.805821]        __xe_vma_op_execute+0xc67/0x1af0 [xe]
> [  384.811286]        xe_vm_bind_ioctl+0x3a36/0x66c0 [xe]
> [  384.816579]        drm_ioctl_kernel+0x14a/0x2c0
> [  384.821132]        drm_ioctl+0x4c6/0xab0
> [  384.825073]        xe_drm_ioctl+0xa1/0xe0 [xe]
> [  384.829651]        __x64_sys_ioctl+0x130/0x1a0
> [  384.834115]        do_syscall_64+0x5c/0xe0
> [  384.838232]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [  384.843829]
>                 -> #0 (reservation_ww_class_mutex){+.+.}-{4:4}:
> [  384.850911]        __lock_acquire+0x3261/0x6330
> [  384.855462]        lock_acquire+0x19b/0x4d0
> [  384.859666]        __ww_mutex_lock.constprop.0+0x1d8/0x3500
> [  384.865263]        ww_mutex_lock+0x38/0x150
> [  384.869465]        xe_bo_lock+0x41/0x70 [xe]
> [  384.873869]        xe_bo_evict_all+0x7ad/0xa40 [xe]
> [  384.878883]        xe_pm_runtime_suspend+0x297/0x340 [xe]
> [  384.884431]        xe_pci_runtime_suspend+0x3b/0x1e0 [xe]
> [  384.889975]        pci_pm_runtime_suspend+0x168/0x540
> [  384.895052]        __rpm_callback+0xa9/0x390
> [  384.899343]        rpm_callback+0x1aa/0x210
> [  384.903543]        rpm_suspend+0x2ea/0x14c0
> [  384.907746]        pm_runtime_work+0x133/0x170
> [  384.912213]        process_one_work+0x73b/0x1230
> [  384.916853]        worker_thread+0x726/0x1320
> [  384.921237]        kthread+0x2ee/0x3d0
> [  384.925005]        ret_from_fork+0x2d/0x70
> [  384.929120]        ret_from_fork_asm+0x1b/0x30
> [  384.933585]
>                 other info that might help us debug this:
> 
> [  384.941625]  Possible unsafe locking scenario:
> 
> [  384.947572]        CPU0                    CPU1
> [  384.952123]        ----                    ----
> [  384.956676]   lock(xe_pm_runtime_lockdep_map);
> [  384.961140]                                lock(reservation_ww_class_mutex);
> [  384.968220]                                lock(xe_pm_runtime_lockdep_map);
> [  384.975214]   lock(reservation_ww_class_mutex);
> [  384.979765]
>                  *** DEADLOCK ***
> 
> As a matter of fact, there's actually a third lock that is not
> in this picture:
> spin_lock_irq(&dev->power.lock);
> and INIT_WORK(&dev->power.work, pm_runtime_work);
> 
> The pm_callback_task will ensure that there are no recursive
> calls of the resume function, and it will increase the
> reference counter anyway.
> 
> Then, the pm_runtime workqueue and spin locks will prevent
> any resume and suspend operations from happening in parallel
> with other resume and suspend operations.
> 
> With that, the only thing that we are actually doing here is
> to wrongly train lockdep, basically saying that we will
> acquire some locks on resume and on suspend concurrently,
> entirely ignoring its serialization and protection.
>
> The above scenario is simply not possible because there is
> serialization with the spin_lock_irq(&dev->power.lock)
> before each operation. However, we are telling lockdep
> that the lock(xe_pm_runtime_lockdep_map) occurs before
> the &dev->power.lock, and lockdep is not capable of seeing
> that other protection.

Can you share some more info here? AFAIK dev->power.lock is an RPM core 
lock that protects internal state like transitioning between ACTIVE, 
SUSPENDING, RESUMING etc. It is never held when calling any of our rpm 
callbacks, so it should never factor into xe_pm_runtime_lockdep_map.

The overall thing should look like this:

xe_rpm_get:
     lock_map_acquire(&xe_pm_runtime_lockdep_map);
     lock_map_release(&xe_pm_runtime_lockdep_map);
     .....
     RPM core grabs dev->power.lock

rpm resume callback: (RPM has dropped dev->power.lock)
     lock_map_acquire(&xe_pm_runtime_lockdep_map);
     ....do resume stuff
     lock_map_release(&xe_pm_runtime_lockdep_map);

rpm suspend callback: (RPM has dropped dev->power.lock)
     lock_map_acquire(&xe_pm_runtime_lockdep_map);
     .....do suspend stuff
     lock_map_release(&xe_pm_runtime_lockdep_map);

> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_pm.c | 55 --------------------------------------
>   1 file changed, 55 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 86bf225dba02..f49e449d9fb7 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -67,12 +67,6 @@
>    */
>   
>   
> -#ifdef CONFIG_LOCKDEP
> -struct lockdep_map xe_pm_runtime_lockdep_map = {
> -	.name = "xe_pm_runtime_lockdep_map"
> -};
> -#endif
> -
>   /**
>    * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
>    * @xe: xe device instance
> @@ -291,29 +285,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	/* Disable access_ongoing asserts and prevent recursive pm calls */
>   	xe_pm_write_callback_task(xe, current);
>   
> -	/*
> -	 * The actual xe_pm_runtime_put() is always async underneath, so
> -	 * exactly where that is called should makes no difference to us. However
> -	 * we still need to be very careful with the locks that this callback
> -	 * acquires and the locks that are acquired and held by any callers of
> -	 * xe_runtime_pm_get(). We already have the matching annotation
> -	 * on that side, but we also need it here. For example lockdep should be
> -	 * able to tell us if the following scenario is in theory possible:
> -	 *
> -	 * CPU0                          | CPU1 (kworker)
> -	 * lock(A)                       |
> -	 *                               | xe_pm_runtime_suspend()
> -	 *                               |      lock(A)
> -	 * xe_pm_runtime_get()           |
> -	 *
> -	 * This will clearly deadlock since rpm core needs to wait for
> -	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
> -	 * on CPU0 which prevents CPU1 making forward progress.  With the
> -	 * annotation here and in xe_pm_runtime_get() lockdep will see
> -	 * the potential lock inversion and give us a nice splat.
> -	 */
> -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> -
>   	/*
>   	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
>   	 * also checks and delets bo entry from user fault list.
> @@ -341,7 +312,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	if (xe->d3cold.allowed)
>   		xe_display_pm_runtime_suspend(xe);
>   out:
> -	lock_map_release(&xe_pm_runtime_lockdep_map);
>   	xe_pm_write_callback_task(xe, NULL);
>   	return err;
>   }
> @@ -361,8 +331,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>   	/* Disable access_ongoing asserts and prevent recursive pm calls */
>   	xe_pm_write_callback_task(xe, current);
>   
> -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> -
>   	/*
>   	 * It can be possible that xe has allowed d3cold but other pcie devices
>   	 * in gfx card soc would have blocked d3cold, therefore card has not
> @@ -400,31 +368,10 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>   			goto out;
>   	}
>   out:
> -	lock_map_release(&xe_pm_runtime_lockdep_map);
>   	xe_pm_write_callback_task(xe, NULL);
>   	return err;
>   }
>   
> -/*
> - * For places where resume is synchronous it can be quite easy to deadlock
> - * if we are not careful. Also in practice it might be quite timing
> - * sensitive to ever see the 0 -> 1 transition with the callers locks
> - * held, so deadlocks might exist but are hard for lockdep to ever see.
> - * With this in mind, help lockdep learn about the potentially scary
> - * stuff that can happen inside the runtime_resume callback by acquiring
> - * a dummy lock (it doesn't protect anything and gets compiled out on
> - * non-debug builds).  Lockdep then only needs to see the
> - * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
> - * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
> - * For example if the (callers_locks) are ever grabbed in the
> - * runtime_resume callback, lockdep should give us a nice splat.
> - */
> -static void pm_runtime_lockdep_training(void)
> -{
> -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> -	lock_map_release(&xe_pm_runtime_lockdep_map);
> -}
> -
>   /**
>    * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
>    * @xe: xe device instance
> @@ -436,7 +383,6 @@ void xe_pm_runtime_get(struct xe_device *xe)
>   	if (xe_pm_read_callback_task(xe) == current)
>   		return;
>   
> -	pm_runtime_lockdep_training();
>   	pm_runtime_resume(xe->drm.dev);
>   }
>   
> @@ -466,7 +412,6 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
>   	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
>   		return -ELOOP;
>   
> -	pm_runtime_lockdep_training();
>   	return pm_runtime_get_sync(xe->drm.dev);
>   }
>   

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
  2024-01-26 20:30 ` [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-02-05 12:23   ` Matthew Auld
  2024-02-28 16:51     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 12:23 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe; +Cc: Matthew Brost

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> For the G2H and TLB invalidation cases, the outer bounds
> protection of the pm_runtime is not going to work. So
> we need to have the inner side protections here.
> 
> Cc: Matthew Auld <matthew.auld@intel.com>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_debugfs.c      |  3 ++
>   drivers/gpu/drm/xe/xe_guc_ct.c       | 79 +++++++++++++---------------
>   drivers/gpu/drm/xe/xe_guc_ct_types.h |  2 +
>   3 files changed, 41 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index 8abdf3c17e1d..b3669eac23c9 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -64,6 +64,9 @@ static int info(struct seq_file *m, void *data)
>   			   xe_force_wake_ref(gt_to_fw(gt), XE_FW_GT));
>   		drm_printf(&p, "gt%d engine_mask 0x%llx\n", id,
>   			   gt->info.engine_mask);
> +		drm_printf(&p, "gt%d g2h_outstanding %d g2h_pm_refs %d\n", id,
> +			   gt->uc.guc.ct.g2h_outstanding,
> +			   gt->uc.guc.ct.g2h_pm_refs);
>   	}
>   
>   	xe_pm_runtime_put(xe);
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> index f3d356383ced..139cfe733661 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> @@ -287,16 +287,49 @@ static int guc_ct_control_toggle(struct xe_guc_ct *ct, bool enable)
>   	return ret > 0 ? -EPROTO : ret;
>   }
>   
> +static void guc_ct_g2h_outstanding_clear(struct xe_guc_ct *ct)
> +{
> +	int i;
> +
> +	for (i = 0; i < ct->g2h_pm_refs; i++)
> +		xe_pm_runtime_put(ct_to_xe(ct));
> +	ct->g2h_outstanding = 0;
> +	ct->g2h_pm_refs = 0;
> +}
> +
> +static void guc_ct_g2h_outstanding_dec(struct xe_guc_ct *ct)
> +{
> +	if (ct->g2h_pm_refs > 0)
> +		xe_pm_runtime_put(ct_to_xe(ct));
> +	ct->g2h_pm_refs--;
> +	ct->g2h_outstanding--;
> +}
> +
> +static void guc_ct_g2h_outstanding_add(struct xe_guc_ct *ct, int num_g2h)
> +{
> +	struct xe_device *xe = ct_to_xe(ct);
> +	int i;
> +
> +	for (i = 0; i < num_g2h; i++) {
> +		if (xe_pm_runtime_get_if_in_use(xe))
> +			ct->g2h_pm_refs++;
> +		else
> +			drm_err(&xe->drm, "Failed to grab RPM ref for outstanding g2h\n");
> +	}

I guess we could maybe drop the g2h_pm_refs, and rather just track a 
single rpm ref for the entire ct when g2h_outstanding goes 0 -> 1, and 
likewise drop the rpm ref when g2h_outstanding reaches zero?
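
A rough sketch of that single-ref idea (hypothetical and untested, 
reusing the helpers this patch adds):

	static void guc_ct_g2h_outstanding_add(struct xe_guc_ct *ct, int num_g2h)
	{
		struct xe_device *xe = ct_to_xe(ct);

		/* one rpm ref for the whole CT on the 0 -> 1 transition */
		if (!ct->g2h_outstanding && num_g2h &&
		    !xe_pm_runtime_get_if_in_use(xe))
			drm_err(&xe->drm, "Failed to grab RPM ref for outstanding g2h\n");

		ct->g2h_outstanding += num_g2h;
	}

	static void guc_ct_g2h_outstanding_dec(struct xe_guc_ct *ct)
	{
		/* drop the ref once the last outstanding G2H is consumed */
		if (!--ct->g2h_outstanding)
			xe_pm_runtime_put(ct_to_xe(ct));
	}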

> +	ct->g2h_outstanding += num_g2h;
> +}
> +
>   static void xe_guc_ct_set_state(struct xe_guc_ct *ct,
>   				enum xe_guc_ct_state state)
>   {
>   	mutex_lock(&ct->lock);		/* Serialise dequeue_one_g2h() */
>   	spin_lock_irq(&ct->fast_lock);	/* Serialise CT fast-path */
>   
> +
>   	xe_gt_assert(ct_to_gt(ct), ct->g2h_outstanding == 0 ||
>   		     state == XE_GUC_CT_STATE_STOPPED);
>   
> -	ct->g2h_outstanding = 0;
> +	guc_ct_g2h_outstanding_clear(ct);
>   	ct->state = state;
>   
>   	spin_unlock_irq(&ct->fast_lock);
> @@ -428,7 +461,7 @@ static void __g2h_reserve_space(struct xe_guc_ct *ct, u32 g2h_len, u32 num_g2h)
>   		lockdep_assert_held(&ct->fast_lock);
>   
>   		ct->ctbs.g2h.info.space -= g2h_len;
> -		ct->g2h_outstanding += num_g2h;
> +		guc_ct_g2h_outstanding_add(ct, num_g2h);
>   	}
>   }
>   
> @@ -439,7 +472,7 @@ static void __g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
>   		  ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space);
>   
>   	ct->ctbs.g2h.info.space += g2h_len;
> -	--ct->g2h_outstanding;
> +	guc_ct_g2h_outstanding_dec(ct);
>   }
>   
>   static void g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
> @@ -1199,14 +1232,8 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
>    */
>   void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
>   {
> -	struct xe_device *xe = ct_to_xe(ct);
> -	bool ongoing;
>   	int len;
>   
> -	ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
> -	if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
> -		return;
> -
>   	spin_lock(&ct->fast_lock);
>   	do {
>   		len = g2h_read(ct, ct->fast_msg, true);
> @@ -1214,9 +1241,6 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
>   			g2h_fast_path(ct, ct->fast_msg, len);
>   	} while (len > 0);
>   	spin_unlock(&ct->fast_lock);
> -
> -	if (ongoing)
> -		xe_device_mem_access_put(xe);
>   }
>   
>   /* Returns less than zero on error, 0 on done, 1 on more available */
> @@ -1247,36 +1271,8 @@ static int dequeue_one_g2h(struct xe_guc_ct *ct)
>   static void g2h_worker_func(struct work_struct *w)
>   {
>   	struct xe_guc_ct *ct = container_of(w, struct xe_guc_ct, g2h_worker);
> -	bool ongoing;
>   	int ret;
>   
> -	/*
> -	 * Normal users must always hold mem_access.ref around CT calls. However
> -	 * during the runtime pm callbacks we rely on CT to talk to the GuC, but
> -	 * at this stage we can't rely on mem_access.ref and even the
> -	 * callback_task will be different than current.  For such cases we just
> -	 * need to ensure we always process the responses from any blocking
> -	 * ct_send requests or where we otherwise expect some response when
> -	 * initiated from those callbacks (which will need to wait for the below
> -	 * dequeue_one_g2h()).  The dequeue_one_g2h() will gracefully fail if
> -	 * the device has suspended to the point that the CT communication has
> -	 * been disabled.
> -	 *
> -	 * If we are inside the runtime pm callback, we can be the only task
> -	 * still issuing CT requests (since that requires having the
> -	 * mem_access.ref).  It seems like it might in theory be possible to
> -	 * receive unsolicited events from the GuC just as we are
> -	 * suspending-resuming, but those will currently anyway be lost when
> -	 * eventually exiting from suspend, hence no need to wake up the device
> -	 * here. If we ever need something stronger than get_if_ongoing() then
> -	 * we need to be careful with blocking the pm callbacks from getting CT
> -	 * responses, if the worker here is blocked on those callbacks
> -	 * completing, creating a deadlock.
> -	 */
> -	ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
> -	if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
> -		return;
> -
>   	do {
>   		mutex_lock(&ct->lock);
>   		ret = dequeue_one_g2h(ct);
> @@ -1290,9 +1286,6 @@ static void g2h_worker_func(struct work_struct *w)
>   			kick_reset(ct);
>   		}
>   	} while (ret == 1);
> -
> -	if (ongoing)
> -		xe_device_mem_access_put(ct_to_xe(ct));
>   }
>   
>   static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h b/drivers/gpu/drm/xe/xe_guc_ct_types.h
> index d29144c9f20b..4220230e7be4 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
> +++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
> @@ -108,6 +108,8 @@ struct xe_guc_ct {
>   	} ctbs;
>   	/** @g2h_outstanding: number of outstanding G2H */
>   	u32 g2h_outstanding;
> +	/** @g2h_pm_refs: number of pm_refs for pending G2H */
> +	u32 g2h_pm_refs;
>   	/** @g2h_worker: worker to process G2H messages */
>   	struct work_struct g2h_worker;
>   	/** @state: CT state */

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation
  2024-01-26 20:30 ` [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation Rodrigo Vivi
@ 2024-02-05 12:41   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 12:41 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's grab the runtime references around TLB invalidation
> to ensure that the hardware is awake in D0.
> 
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_bo.c | 5 +++--
>   drivers/gpu/drm/xe/xe_pt.c | 3 +++
>   2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 686d716c5581..17d3b7f69580 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -22,6 +22,7 @@
>   #include "xe_gt.h"
>   #include "xe_map.h"
>   #include "xe_migrate.h"
> +#include "xe_pm.h"
>   #include "xe_preempt_fence.h"
>   #include "xe_res_cursor.h"
>   #include "xe_trace.h"
> @@ -1136,7 +1137,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
>   	int idx, r = 0;
>   
>   	if (needs_rpm)
> -		xe_device_mem_access_get(xe);
> +		xe_pm_runtime_get(xe);
>   
>   	ret = ttm_bo_vm_reserve(tbo, vmf);
>   	if (ret)
> @@ -1176,7 +1177,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
>   	dma_resv_unlock(tbo->base.resv);
>   out:
>   	if (needs_rpm)
> -		xe_device_mem_access_put(xe);
> +		xe_pm_runtime_put(xe);


This change should be split out somewhere else, or it needs a 
different commit message?

>   
>   	return ret;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index de1030a47588..8d1d4bea7323 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -11,6 +11,7 @@
>   #include "xe_gt.h"
>   #include "xe_gt_tlb_invalidation.h"
>   #include "xe_migrate.h"
> +#include "xe_pm.h"
>   #include "xe_pt_types.h"
>   #include "xe_pt_walk.h"
>   #include "xe_res_cursor.h"
> @@ -1104,8 +1105,10 @@ static void invalidation_fence_work_func(struct work_struct *w)
>   	struct invalidation_fence *ifence =
>   		container_of(w, struct invalidation_fence, work);
>   
> +	xe_pm_runtime_get(gt_to_xe(ifence->gt));
>   	trace_xe_gt_tlb_invalidation_fence_work_func(&ifence->base);
>   	xe_gt_tlb_invalidation_vma(ifence->gt, &ifence->base, ifence->vma);
> +	xe_pm_runtime_put(gt_to_xe(ifence->gt));

Feels slightly scary to call this from a worker? When the caller set up 
the fence stuff, in the prep stage, that would have been a good spot to 
call if_active and attach it.
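
A minimal sketch of that prep-stage idea (hypothetical; it assumes a 
conditional-get helper like xe_pm_runtime_get_if_active() is available 
and that the ref is dropped once the fence signals):

	static int invalidation_fence_init(struct xe_gt *gt, ...)
	{
		/* grab the rpm ref while the caller still guarantees D0,
		 * instead of waking the device from the worker later */
		if (!xe_pm_runtime_get_if_active(gt_to_xe(gt)))
			return -ENODEV;
		...
	}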

>   }
>   
>   static int invalidation_fence_init(struct xe_gt *gt,

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls
  2024-01-26 20:30 ` [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls Rodrigo Vivi
@ 2024-02-05 12:43   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 12:43 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Every IOCTL is already protected on its outer bounds by
> xe_pm_runtime_{get,put} calls, so we can now remove
> these.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Should this be squashed?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> ---
>   drivers/gpu/drm/xe/xe_query.c | 4 ----
>   1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
> index 9b35673b286c..86222c80a874 100644
> --- a/drivers/gpu/drm/xe/xe_query.c
> +++ b/drivers/gpu/drm/xe/xe_query.c
> @@ -147,7 +147,6 @@ query_engine_cycles(struct xe_device *xe,
>   	if (!hwe)
>   		return -EINVAL;
>   
> -	xe_device_mem_access_get(xe);
>   	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>   
>   	__read_timestamps(gt,
> @@ -159,7 +158,6 @@ query_engine_cycles(struct xe_device *xe,
>   			  cpu_clock);
>   
>   	xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> -	xe_device_mem_access_put(xe);
>   	resp.width = 36;
>   
>   	/* Only write to the output fields of user query */
> @@ -437,9 +435,7 @@ static int query_hwconfig(struct xe_device *xe,
>   	if (!hwconfig)
>   		return -ENOMEM;
>   
> -	xe_device_mem_access_get(xe);
>   	xe_guc_hwconfig_copy(&gt->uc.guc, hwconfig);
> -	xe_device_mem_access_put(xe);
>   
>   	if (copy_to_user(query_ptr, hwconfig, size)) {
>   		kfree(hwconfig);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime
  2024-01-26 20:30 ` [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-02-05 13:11   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 13:11 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> Let's directly use xe_pm_runtime_{get,put} instead of the
> mem_access helpers that are going away soon.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 27/34] drm/xe: Remove useless mem_access during probe
  2024-01-26 20:30 ` [RFC 27/34] drm/xe: Remove useless mem_access during probe Rodrigo Vivi
@ 2024-02-05 13:18   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 13:18 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> xe_pm_init is the very last thing during the xe_pci_probe(),
> hence these protections are useless from the point of view
> of ensuring that the device is awake.
> 
> Let's remove it so we continue towards the goal of killing
> xe_device_mem_access.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_ggtt.c |  2 --
>   drivers/gpu/drm/xe/xe_gt.c   |  6 ------
>   drivers/gpu/drm/xe/xe_tile.c | 10 +++-------
>   3 files changed, 3 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index 6fdf830678b3..7faf02523b03 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -201,14 +201,12 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
>   	u64 start, end;
>   
>   	/* Display may have allocated inside ggtt, so be careful with clearing here */
> -	xe_device_mem_access_get(tile_to_xe(ggtt->tile));
>   	mutex_lock(&ggtt->lock);
>   	drm_mm_for_each_hole(hole, &ggtt->mm, start, end)
>   		xe_ggtt_clear(ggtt, start, end - start);
>   
>   	xe_ggtt_invalidate(ggtt);
>   	mutex_unlock(&ggtt->lock);
> -	xe_device_mem_access_put(tile_to_xe(ggtt->tile));
>   }
>   
>   int xe_ggtt_init(struct xe_ggtt *ggtt)
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 675a2927a19e..c6c4292c21f5 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -349,7 +349,6 @@ static int gt_fw_domain_init(struct xe_gt *gt)
>   {
>   	int err, i;
>   
> -	xe_device_mem_access_get(gt_to_xe(gt));
>   	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>   	if (err)
>   		goto err_hw_fence_irq;
> @@ -407,7 +406,6 @@ static int gt_fw_domain_init(struct xe_gt *gt)
>   
>   	err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>   	XE_WARN_ON(err);
> -	xe_device_mem_access_put(gt_to_xe(gt));
>   
>   	return 0;
>   
> @@ -417,7 +415,6 @@ static int gt_fw_domain_init(struct xe_gt *gt)
>   err_hw_fence_irq:
>   	for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
>   		xe_hw_fence_irq_finish(&gt->fence_irq[i]);
> -	xe_device_mem_access_put(gt_to_xe(gt));
>   
>   	return err;
>   }
> @@ -426,7 +423,6 @@ static int all_fw_domain_init(struct xe_gt *gt)
>   {
>   	int err, i;
>   
> -	xe_device_mem_access_get(gt_to_xe(gt));
>   	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>   	if (err)
>   		goto err_hw_fence_irq;
> @@ -489,7 +485,6 @@ static int all_fw_domain_init(struct xe_gt *gt)
>   
>   	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>   	XE_WARN_ON(err);
> -	xe_device_mem_access_put(gt_to_xe(gt));
>   
>   	return 0;
>   
> @@ -498,7 +493,6 @@ static int all_fw_domain_init(struct xe_gt *gt)
>   err_hw_fence_irq:
>   	for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
>   		xe_hw_fence_irq_finish(&gt->fence_irq[i]);
> -	xe_device_mem_access_put(gt_to_xe(gt));
>   
>   	return err;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
> index 044c20881de7..74ecb5f39438 100644
> --- a/drivers/gpu/drm/xe/xe_tile.c
> +++ b/drivers/gpu/drm/xe/xe_tile.c
> @@ -160,23 +160,19 @@ int xe_tile_init_noalloc(struct xe_tile *tile)
>   {
>   	int err;
>   
> -	xe_device_mem_access_get(tile_to_xe(tile));
> -
>   	err = tile_ttm_mgr_init(tile);
>   	if (err)
> -		goto err_mem_access;
> +		return err;
>   
>   	tile->mem.kernel_bb_pool = xe_sa_bo_manager_init(tile, SZ_1M, 16);
>   	if (IS_ERR(tile->mem.kernel_bb_pool))
> -		err = PTR_ERR(tile->mem.kernel_bb_pool);
> +		return PTR_ERR(tile->mem.kernel_bb_pool);

Previously this was just falling through with the err. This deserves to
be a separate change from the other, more mechanical stuff here.
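
A mechanical-only version of that hunk, keeping the old control flow
(including the questionable fall-through that still runs the workaround and
sysfs init on error), would have looked roughly like:

	tile->mem.kernel_bb_pool = xe_sa_bo_manager_init(tile, SZ_1M, 16);
	if (IS_ERR(tile->mem.kernel_bb_pool))
		err = PTR_ERR(tile->mem.kernel_bb_pool);

	xe_wa_apply_tile_workarounds(tile);

	xe_tile_sysfs_init(tile);

	return err;

The early return is arguably the better behavior, but it changes what runs on
failure, which is why it deserves its own patch.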

>   
>   	xe_wa_apply_tile_workarounds(tile);
>   
>   	xe_tile_sysfs_init(tile);
>   
> -err_mem_access:
> -	xe_device_mem_access_put(tile_to_xe(tile));
> -	return err;
> +	return 0;
>   }
>   
>   void xe_tile_migrate_wait(struct xe_tile *tile)

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore
  2024-01-26 20:30 ` [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore Rodrigo Vivi
@ 2024-02-05 13:29   ` Matthew Auld
  2024-02-15 22:37     ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 13:29 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> All the VM operations are now protected either by the IOCTL
> calls, or by sched/exec operations, or by the specific
> xe_pm_runtime_{get,put} around the duration of the LR VMs.
> 
> So, these inner mem_access protections can be removed.

Would this be a good place to split the series? AFAICT a lot of the 
nightmares originate from here. Like first part of the series moves the 
rpm to be the outermost thing, but we still keep the important vm rpm 
protection here, so shouldn't need the CT, sched stuff etc yet. And in 
theory there is still a path to d3cold without this patch, I think. Also 
playing devil's advocate, since we already end up having rpm protection
for LR vm, why not normal vm? Is it worth it?
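
For readers of the archive, the LR-VM protection the commit message relies on
amounts to holding a runtime-pm reference for the whole lifetime of a
long-running VM. A hypothetical sketch (flag and helper names assumed; the
series may spell them differently):

	/* in xe_vm_create(): keep the device awake for the VM's lifetime */
	if (flags & XE_VM_FLAG_LR_MODE)
		xe_pm_runtime_get_noresume(xe);

	/* in the VM destroy path: drop that reference again */
	if (vm->flags & XE_VM_FLAG_LR_MODE)
		xe_pm_runtime_put(xe);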

> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_vm.c | 7 -------
>   1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 9ad154a01ad7..f314ff128028 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -1280,9 +1280,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
>   
>   	vm->pt_ops = &xelp_pt_ops;
>   
> -	if (!(flags & XE_VM_FLAG_MIGRATION))
> -		xe_device_mem_access_get(xe);
> -
>   	vm_resv_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
>   	if (!vm_resv_obj) {
>   		err = -ENOMEM;
> @@ -1391,8 +1388,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
>   	for_each_tile(tile, xe, id)
>   		xe_range_fence_tree_fini(&vm->rftree[id]);
>   	kfree(vm);
> -	if (!(flags & XE_VM_FLAG_MIGRATION))
> -		xe_device_mem_access_put(xe);
>   	return ERR_PTR(err);
>   }
>   
> @@ -1515,8 +1510,6 @@ static void vm_destroy_work_func(struct work_struct *w)
>   	xe_assert(xe, !vm->size);
>   
>   	if (!(vm->flags & XE_VM_FLAG_MIGRATION)) {
> -		xe_device_mem_access_put(xe);
> -
>   		if (xe->info.has_asid && vm->usm.asid) {
>   			mutex_lock(&xe->usm.lock);
>   			lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions
  2024-01-26 20:30 ` [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions Rodrigo Vivi
@ 2024-02-05 13:30   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 13:30 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> At these points, we are sure that the device is awake in D0.
> Likely in the middle of the transition, but awake. So,
> these extra protections are useless. Let's remove them and
> continue with the killing of xe_device_mem_access.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
  2024-01-26 20:30 ` [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime Rodrigo Vivi
@ 2024-02-05 13:33   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 13:33 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> We need to ensure that device is in D0 on any kind of GT reset.
> We are likely already protected by outer bounds like exec,
> but if exec/sched ref gets dropped on a hang, we might transition
> to D3 before we are able to perform the gt_reset and recover.
> 
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
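
A minimal sketch of what the conversion amounts to (simplified; the real
reset worker in xe_gt.c does considerably more around the reset itself):

static void gt_reset_worker(struct work_struct *w)
{
	struct xe_gt *gt = container_of(w, struct xe_gt, reset.worker);

	/*
	 * Hold an rpm ref for the whole reset: a dropped exec/sched ref
	 * must not let the device slip into D3 mid-recovery.
	 */
	xe_pm_runtime_get(gt_to_xe(gt));
	gt_reset(gt);
	xe_pm_runtime_put(gt_to_xe(gt));
}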


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps
  2024-01-26 20:30 ` [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps Rodrigo Vivi
@ 2024-02-05 13:34   ` Matthew Auld
  0 siblings, 0 replies; 77+ messages in thread
From: Matthew Auld @ 2024-02-05 13:34 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-xe

On 26/01/2024 20:30, Rodrigo Vivi wrote:
> PAT dumps are already protected by the xe_pm_runtime_{get,put}
> around the debugfs call. So, these can be removed.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 03/34] drm/xe: Fix display runtime_pm handling
  2024-02-05  9:11   ` Matthew Auld
@ 2024-02-14 18:05     ` Rodrigo Vivi
  2024-02-15  9:30       ` Matthew Auld
  0 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-14 18:05 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Mon, Feb 05, 2024 at 09:11:37AM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > i915's intel_runtime_pm_get_if_in_use actually calls the
> > pm_runtime_get_if_active() with ign_usage_count = false, but Xe
> > was erroneously calling it with true because of the mem_access cases.
> 
> Good catch.
> 
> > This can lead to unbalanced references.
> 
> Is there an actual imbalance here though? Is it not just a case of being
> overzealous in keeping the device awake when it is not currently "in_use" vs
> "if_active"? If the api increments the usage count we will still decrement
> it later, regardless of active vs in-use, AFAICT.

Indeed.

s/This can lead to unbalanced references./This can lead to unnecessary
references being held here and the device never reaching the runtime
suspended state./g
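
For reference, the semantic difference under discussion, as I read the core
API on this kernel (a summary, not driver code; upstream later dropped the
bool argument from pm_runtime_get_if_active()):

/*
 * pm_runtime_get_if_active(dev, true) - take a ref whenever the device is
 *     RPM_ACTIVE, even if the usage count is zero.
 * pm_runtime_get_if_in_use(dev)       - take a ref only if the device is
 *     RPM_ACTIVE *and* already in use (usage count nonzero, via
 *     atomic_inc_not_zero() underneath).
 *
 * Both return 1 on success, 0 when no reference was taken, and -EINVAL
 * when runtime PM is disabled for the device.
 */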

> 
> > 
> > Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
> > 
> > Also, already start this new function protected from the runtime
> > recursion, since runtime_pm will need to call for display functions
> > for a proper D3Cold flow.
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  2 +-
> >   drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
> >   drivers/gpu/drm/xe/xe_pm.h                      |  1 +
> >   3 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > index 420eba0e4be0..ad5864d1dd74 100644
> > --- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > +++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > @@ -177,7 +177,7 @@ static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_p
> >   {
> >   	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
> > -	return xe_pm_runtime_get_if_active(xe);
> > +	return xe_pm_runtime_get_if_in_use(xe);
> >   }
> >   static inline void intel_runtime_pm_put_unchecked(struct xe_runtime_pm *pm)
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index bd35fe9f6227..19f88cb7715b 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -417,6 +417,23 @@ int xe_pm_runtime_get_if_active(struct xe_device *xe)
> >   	return pm_runtime_get_if_active(xe->drm.dev, true);
> >   }
> > +/**
> > + * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference and resume if needed
> > + * @xe: xe device instance
> > + *
> > + * Returns: True if device is awake and the reference was taken, false otherwise.
> > + */
> > +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
> > +{
> > +	if (xe_pm_read_callback_task(xe) == current) {
> > +		/* The device is awake, grab the ref and move on */
> > +		pm_runtime_get_noresume(xe->drm.dev);
> > +		return true;
> > +	}
> > +
> > +	return pm_runtime_get_if_in_use(xe->drm.dev) >= 0;
> 
> This is doing atomic_inc_not_zero() underneath for the "in_use" case AFAICT.
> If the usage count is zero it doesn't increment it and returns 0. Does that
> not lead to an imbalance? Should this rather be > 0?
> 
> > +}
> > +
> >   /**
> >    * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
> >    * @xe: xe device instance
> > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > index 64a97c6726a7..9d372cbf388b 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.h
> > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > @@ -28,6 +28,7 @@ int xe_pm_runtime_resume(struct xe_device *xe);
> >   int xe_pm_runtime_get(struct xe_device *xe);
> >   int xe_pm_runtime_put(struct xe_device *xe);
> >   int xe_pm_runtime_get_if_active(struct xe_device *xe);
> > +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
> >   void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
> >   int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
> >   void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state
  2024-02-05  9:55   ` Matthew Auld
@ 2024-02-14 18:15     ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-14 18:15 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Mon, Feb 05, 2024 at 09:55:30AM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > The mem_access helpers are going away and getting replaced by
> > direct calls of the xe_pm_runtime_{get,put} functions. However, an
> > assertion with a warning splat is desired when we hit the worst
> > case of a memory access with the device really in the 'suspended'
> > state.
> > 
> > Also, this needs to be the first step. Otherwise, the upcoming
> > conversion would be really noisy with warn splats of missing mem_access
> > gets.
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_device.c | 13 ++++++++++++-
> >   drivers/gpu/drm/xe/xe_pm.c     | 16 ++++++++++++++++
> >   drivers/gpu/drm/xe/xe_pm.h     |  1 +
> >   3 files changed, 29 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 6faa7865b1aa..01db34f06a7d 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -653,9 +653,20 @@ bool xe_device_mem_access_ongoing(struct xe_device *xe)
> >   	return atomic_read(&xe->mem_access.ref);
> >   }
> > +/**
> > + * xe_device_assert_mem_access - Inspect the current runtime_pm state.
> > + * @xe: xe device instance
> > + *
> > + * To be used before any kind of memory access. It will splat a debug warning
> > + * if the device is currently sleeping. But it doesn't guarantee in any way
> > + * that the device is going to continue awake. Xe PM runtime get and put
> 
> s/continue awake/remain awake/ ?

done, thanks

> 
> > + * functions might be added to the outer bound of the memory access, while
> > + * this check is intended for inner usage to splat some warning if the worst
> > + * case has just happened.
> > + */
> >   void xe_device_assert_mem_access(struct xe_device *xe)
> >   {
> > -	XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> > +	XE_WARN_ON(xe_pm_runtime_suspended(xe));
> 
> I guess we could also check if we are inside a callback and it doesn't match
> current? Sorry if that was already suggested and there were good reasons to
> avoid it.

We could. But any access coming from the suspend paths would surely not
be in the suspended state but in one of the transition ones (RPM_RESUMING or
RPM_SUSPENDING), so the extra check would be unnecessary imho.
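
To spell that out, from my reading of the runtime-pm core (summary only):

/*
 * RPM_ACTIVE     -> pm_runtime_suspended() == false  (no splat)
 * RPM_RESUMING   -> false  (access from the resume callback: no splat)
 * RPM_SUSPENDING -> false  (access from the suspend callback: no splat)
 * RPM_SUSPENDED  -> true   (the genuine worst case: splat)
 */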

> 
> >   }
> >   bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe)
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index 9910b748adab..3ef14937d5d2 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -252,6 +252,22 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe)
> >   	return READ_ONCE(xe->pm_callback_task);
> >   }
> > +/**
> > + * xe_pm_runtime_suspended - Inspect the current runtime_pm state.
> 
> "Check if runtime_pm state is suspended" ?

done, thanks

> 
> > + * @xe: xe device instance
> > + *
> > + * This does not provide any guarantee that the device is going to continue
> > + * suspended as it might be racing with the runtime state transitions.
> 
> s/continue suspended/remain suspended/ ?

done, thanks

> 
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>

thank you

> 
> > + * It can be used only as a non-reliable assertion, to ensure that we are not in
> > + * the sleep state while trying to access some memory for instance.
> > + *
> > + * Returns true if PCI device is suspended, false otherwise.
> > + */
> > +bool xe_pm_runtime_suspended(struct xe_device *xe)
> > +{
> > +	return pm_runtime_suspended(xe->drm.dev);
> > +}
> > +
> >   /**
> >    * xe_pm_runtime_suspend - Prepare our device for D3hot/D3Cold
> >    * @xe: xe device instance
> > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > index ecb19ee10db6..a672adffd0e1 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.h
> > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > @@ -23,6 +23,7 @@ int xe_pm_resume(struct xe_device *xe);
> >   void xe_pm_init_early(struct xe_device *xe);
> >   void xe_pm_init(struct xe_device *xe);
> >   void xe_pm_runtime_fini(struct xe_device *xe);
> > +bool xe_pm_runtime_suspended(struct xe_device *xe);
> >   int xe_pm_runtime_suspend(struct xe_device *xe);
> >   int xe_pm_runtime_resume(struct xe_device *xe);
> >   void xe_pm_runtime_get(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call
  2024-02-05 10:55   ` Matthew Auld
@ 2024-02-14 18:48     ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-14 18:48 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Mon, Feb 05, 2024 at 10:55:42AM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > Let's ensure our PCI device is awakened on every sysfs call.
> > Let's increase the runtime_pm protection and start moving
> > that to the outer bounds.
> > 
> > For now, for the files with small number of attr functions,
> > let's only call the runtime pm functions directly.
> > For the hw_engines entries with many files, let's add
> > the sysfs_ops wrapper.
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_device_sysfs.c          |  4 ++
> >   drivers/gpu/drm/xe/xe_gt_freq.c               | 38 +++++++++++-
> >   drivers/gpu/drm/xe/xe_gt_idle.c               | 23 +++++++-
> >   drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c     |  3 +
> >   drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 58 ++++++++++++++++++-
> >   drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h |  7 +++
> >   drivers/gpu/drm/xe/xe_tile_sysfs.c            |  1 +
> >   7 files changed, 129 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device_sysfs.c b/drivers/gpu/drm/xe/xe_device_sysfs.c
> > index 99113a5a2b84..e47c8ad1bb17 100644
> > --- a/drivers/gpu/drm/xe/xe_device_sysfs.c
> > +++ b/drivers/gpu/drm/xe/xe_device_sysfs.c
> > @@ -35,7 +35,9 @@ vram_d3cold_threshold_show(struct device *dev,
> >   	if (!xe)
> >   		return -EINVAL;
> > +	xe_pm_runtime_get(xe);
> 
> Does it make sense to use the get_sync/ioctl version?

I renamed the other to _ioctl as you suggested...

> 
> Assuming CI is happy,

In the end it is a get_sync because it will call resume directly,
but without any error handling, and it is prepared to be called from
anywhere, even from inner places that might be on the resume path.
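
Roughly, the shape this series gives the helper (a sketch; the merged
version may differ):

void xe_pm_runtime_get(struct xe_device *xe)
{
	pm_runtime_get_noresume(xe->drm.dev);

	/*
	 * Recursion protection: if we are already running inside our own
	 * suspend/resume callback, do not try to resume again.
	 */
	if (xe_pm_read_callback_task(xe) == current)
		return;

	pm_runtime_resume(xe->drm.dev);	/* synchronous resume, errors ignored */
}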

> Reviewed-by: Matthew Auld <matthew.auld@intel.com>

Thanks,

Perhaps we can come back later and revisit the error check and
the names.

> 
> >   	ret = sysfs_emit(buf, "%d\n", xe->d3cold.vram_threshold);
> > +	xe_pm_runtime_put(xe);
> >   	return ret;
> >   }
> > @@ -58,7 +60,9 @@ vram_d3cold_threshold_store(struct device *dev, struct device_attribute *attr,
> >   	drm_dbg(&xe->drm, "vram_d3cold_threshold: %u\n", vram_d3cold_threshold);
> > +	xe_pm_runtime_get(xe);
> >   	ret = xe_pm_set_vram_threshold(xe, vram_d3cold_threshold);
> > +	xe_pm_runtime_put(xe);
> >   	return ret ?: count;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_gt_freq.c b/drivers/gpu/drm/xe/xe_gt_freq.c
> > index e5b0f4ecdbe8..32b9a743629c 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_freq.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_freq.c
> > @@ -15,6 +15,7 @@
> >   #include "xe_gt_sysfs.h"
> >   #include "xe_gt_throttle_sysfs.h"
> >   #include "xe_guc_pc.h"
> > +#include "xe_pm.h"
> >   /**
> >    * DOC: Xe GT Frequency Management
> > @@ -49,12 +50,23 @@ dev_to_pc(struct device *dev)
> >   	return &kobj_to_gt(dev->kobj.parent)->uc.guc.pc;
> >   }
> > +static struct xe_device *
> > +dev_to_xe(struct device *dev)
> > +{
> > +	return gt_to_xe(kobj_to_gt(dev->kobj.parent));
> > +}
> > +
> >   static ssize_t act_freq_show(struct device *dev,
> >   			     struct device_attribute *attr, char *buf)
> >   {
> >   	struct xe_guc_pc *pc = dev_to_pc(dev);
> > +	u32 freq;
> > +
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> > +	freq = xe_guc_pc_get_act_freq(pc);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> > -	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_act_freq(pc));
> > +	return sysfs_emit(buf, "%d\n", freq);
> >   }
> >   static DEVICE_ATTR_RO(act_freq);
> > @@ -65,7 +77,9 @@ static ssize_t cur_freq_show(struct device *dev,
> >   	u32 freq;
> >   	ssize_t ret;
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> >   	ret = xe_guc_pc_get_cur_freq(pc, &freq);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> >   	if (ret)
> >   		return ret;
> > @@ -77,8 +91,13 @@ static ssize_t rp0_freq_show(struct device *dev,
> >   			     struct device_attribute *attr, char *buf)
> >   {
> >   	struct xe_guc_pc *pc = dev_to_pc(dev);
> > +	u32 freq;
> > +
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> > +	freq = xe_guc_pc_get_rp0_freq(pc);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> > -	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_rp0_freq(pc));
> > +	return sysfs_emit(buf, "%d\n", freq);
> >   }
> >   static DEVICE_ATTR_RO(rp0_freq);
> > @@ -86,8 +105,13 @@ static ssize_t rpe_freq_show(struct device *dev,
> >   			     struct device_attribute *attr, char *buf)
> >   {
> >   	struct xe_guc_pc *pc = dev_to_pc(dev);
> > +	u32 freq;
> > +
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> > +	freq = xe_guc_pc_get_rpe_freq(pc);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> > -	return sysfs_emit(buf, "%d\n", xe_guc_pc_get_rpe_freq(pc));
> > +	return sysfs_emit(buf, "%d\n", freq);
> >   }
> >   static DEVICE_ATTR_RO(rpe_freq);
> > @@ -107,7 +131,9 @@ static ssize_t min_freq_show(struct device *dev,
> >   	u32 freq;
> >   	ssize_t ret;
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> >   	ret = xe_guc_pc_get_min_freq(pc, &freq);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> >   	if (ret)
> >   		return ret;
> > @@ -125,7 +151,9 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
> >   	if (ret)
> >   		return ret;
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> >   	ret = xe_guc_pc_set_min_freq(pc, freq);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> >   	if (ret)
> >   		return ret;
> > @@ -140,7 +168,9 @@ static ssize_t max_freq_show(struct device *dev,
> >   	u32 freq;
> >   	ssize_t ret;
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> >   	ret = xe_guc_pc_get_max_freq(pc, &freq);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> >   	if (ret)
> >   		return ret;
> > @@ -158,7 +188,9 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
> >   	if (ret)
> >   		return ret;
> > +	xe_pm_runtime_get(dev_to_xe(dev));
> >   	ret = xe_guc_pc_set_max_freq(pc, freq);
> > +	xe_pm_runtime_put(dev_to_xe(dev));
> >   	if (ret)
> >   		return ret;
> > diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c
> > index 9358f7336889..824ff458011f 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_idle.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_idle.c
> > @@ -12,6 +12,7 @@
> >   #include "xe_guc_pc.h"
> >   #include "regs/xe_gt_regs.h"
> >   #include "xe_mmio.h"
> > +#include "xe_pm.h"
> >   /**
> >    * DOC: Xe GT Idle
> > @@ -40,6 +41,15 @@ static struct xe_guc_pc *gtidle_to_pc(struct xe_gt_idle *gtidle)
> >   	return &gtidle_to_gt(gtidle)->uc.guc.pc;
> >   }
> > +static struct xe_device *
> > +pc_to_xe(struct xe_guc_pc *pc)
> > +{
> > +	struct xe_guc *guc = container_of(pc, struct xe_guc, pc);
> > +	struct xe_gt *gt = container_of(guc, struct xe_gt, uc.guc);
> > +
> > +	return gt_to_xe(gt);
> > +}
> > +
> >   static const char *gt_idle_state_to_string(enum xe_gt_idle_state state)
> >   {
> >   	switch (state) {
> > @@ -86,8 +96,14 @@ static ssize_t name_show(struct device *dev,
> >   			 struct device_attribute *attr, char *buff)
> >   {
> >   	struct xe_gt_idle *gtidle = dev_to_gtidle(dev);
> > +	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
> > +	ssize_t ret;
> > +
> > +	xe_pm_runtime_get(pc_to_xe(pc));
> > +	ret = sysfs_emit(buff, "%s\n", gtidle->name);
> > +	xe_pm_runtime_put(pc_to_xe(pc));
> > -	return sysfs_emit(buff, "%s\n", gtidle->name);
> > +	return ret;
> >   }
> >   static DEVICE_ATTR_RO(name);
> > @@ -98,7 +114,9 @@ static ssize_t idle_status_show(struct device *dev,
> >   	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
> >   	enum xe_gt_idle_state state;
> > +	xe_pm_runtime_get(pc_to_xe(pc));
> >   	state = gtidle->idle_status(pc);
> > +	xe_pm_runtime_put(pc_to_xe(pc));
> >   	return sysfs_emit(buff, "%s\n", gt_idle_state_to_string(state));
> >   }
> > @@ -111,7 +129,10 @@ static ssize_t idle_residency_ms_show(struct device *dev,
> >   	struct xe_guc_pc *pc = gtidle_to_pc(gtidle);
> >   	u64 residency;
> > +	xe_pm_runtime_get(pc_to_xe(pc));
> >   	residency = gtidle->idle_residency(pc);
> > +	xe_pm_runtime_put(pc_to_xe(pc));
> > +
> >   	return sysfs_emit(buff, "%llu\n", get_residency_ms(gtidle, residency));
> >   }
> >   static DEVICE_ATTR_RO(idle_residency_ms);
> > diff --git a/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c b/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
> > index 63d640591a52..9c33045ff1ef 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c
> > @@ -11,6 +11,7 @@
> >   #include "xe_gt_sysfs.h"
> >   #include "xe_gt_throttle_sysfs.h"
> >   #include "xe_mmio.h"
> > +#include "xe_pm.h"
> >   /**
> >    * DOC: Xe GT Throttle
> > @@ -38,10 +39,12 @@ static u32 read_perf_limit_reasons(struct xe_gt *gt)
> >   {
> >   	u32 reg;
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	if (xe_gt_is_media_type(gt))
> >   		reg = xe_mmio_read32(gt, MTL_MEDIA_PERF_LIMIT_REASONS);
> >   	else
> >   		reg = xe_mmio_read32(gt, GT0_PERF_LIMIT_REASONS);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return reg;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
> > index 2345fb42fa39..9e23ca7f45ad 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
> > @@ -9,6 +9,7 @@
> >   #include "xe_gt.h"
> >   #include "xe_hw_engine_class_sysfs.h"
> > +#include "xe_pm.h"
> >   #define MAX_ENGINE_CLASS_NAME_LEN    16
> >   static int xe_add_hw_engine_class_defaults(struct xe_device *xe,
> > @@ -513,6 +514,7 @@ kobj_xe_hw_engine_class(struct xe_device *xe, struct kobject *parent, char *name
> >   		kobject_put(&keclass->base);
> >   		return NULL;
> >   	}
> > +	keclass->xe = xe;
> >   	err = drmm_add_action_or_reset(&xe->drm, kobj_xe_hw_engine_class_fini,
> >   				       &keclass->base);
> > @@ -567,9 +569,63 @@ static void xe_hw_engine_sysfs_kobj_release(struct kobject *kobj)
> >   	kfree(kobj);
> >   }
> > +#include "xe_pm.h"
> > +
> > +static inline struct xe_device *pdev_to_xe_device(struct pci_dev *pdev)
> > +{
> > +	return pci_get_drvdata(pdev);
> > +}
> > +
> > +static inline struct xe_device *to_xe_device(const struct drm_device *dev)
> > +{
> > +	return container_of(dev, struct xe_device, drm);
> > +}
> > +
> > +static ssize_t xe_hw_engine_class_sysfs_attr_show(struct kobject *kobj,
> > +						  struct attribute *attr,
> > +						  char *buf)
> > +{
> > +	struct xe_device *xe = kobj_to_xe(kobj);
> > +	struct kobj_attribute *kattr;
> > +	ssize_t ret = -EIO;
> > +
> > +	kattr = container_of(attr, struct kobj_attribute, attr);
> > +	if (kattr->show) {
> > +		xe_pm_runtime_get(xe);
> > +		ret = kattr->show(kobj, kattr, buf);
> > +		xe_pm_runtime_put(xe);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t xe_hw_engine_class_sysfs_attr_store(struct kobject *kobj,
> > +						   struct attribute *attr,
> > +						   const char *buf,
> > +						   size_t count)
> > +{
> > +	struct xe_device *xe = kobj_to_xe(kobj);
> > +	struct kobj_attribute *kattr;
> > +	ssize_t ret = -EIO;
> > +
> > +	kattr = container_of(attr, struct kobj_attribute, attr);
> > +	if (kattr->store) {
> > +		xe_pm_runtime_get(xe);
> > +		ret = kattr->store(kobj, kattr, buf, count);
> > +		xe_pm_runtime_put(xe);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static const struct sysfs_ops xe_hw_engine_class_sysfs_ops = {
> > +	.show = xe_hw_engine_class_sysfs_attr_show,
> > +	.store = xe_hw_engine_class_sysfs_attr_store,
> > +};
> > +
> >   static const struct kobj_type xe_hw_engine_sysfs_kobj_type = {
> >   	.release = xe_hw_engine_sysfs_kobj_release,
> > -	.sysfs_ops = &kobj_sysfs_ops,
> > +	.sysfs_ops = &xe_hw_engine_class_sysfs_ops,
> >   };
> >   static void hw_engine_class_sysfs_fini(struct drm_device *drm, void *arg)
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
> > index ec5ba673b314..28a0d7c909c0 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h
> > @@ -26,6 +26,8 @@ struct kobj_eclass {
> >   	struct kobject base;
> >   	/** @eclass: A pointer to the hw engine class interface */
> >   	struct xe_hw_engine_class_intf *eclass;
> > +	/** @xe: A pointer to the xe device */
> > +	struct xe_device *xe;
> >   };
> >   static inline struct xe_hw_engine_class_intf *kobj_to_eclass(struct kobject *kobj)
> > @@ -33,4 +35,9 @@ static inline struct xe_hw_engine_class_intf *kobj_to_eclass(struct kobject *kob
> >   	return container_of(kobj, struct kobj_eclass, base)->eclass;
> >   }
> > +static inline struct xe_device *kobj_to_xe(struct kobject *kobj)
> > +{
> > +	return container_of(kobj, struct kobj_eclass, base)->xe;
> > +}
> > +
> >   #endif
> > diff --git a/drivers/gpu/drm/xe/xe_tile_sysfs.c b/drivers/gpu/drm/xe/xe_tile_sysfs.c
> > index 0662968d7bcb..237a0761d3ad 100644
> > --- a/drivers/gpu/drm/xe/xe_tile_sysfs.c
> > +++ b/drivers/gpu/drm/xe/xe_tile_sysfs.c
> > @@ -7,6 +7,7 @@
> >   #include <linux/sysfs.h>
> >   #include <drm/drm_managed.h>
> > +#include "xe_pm.h"
> >   #include "xe_tile.h"
> >   #include "xe_tile_sysfs.h"
> >   #include "xe_vram_freq.h"

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 12/34] drm/xe: Ensure device is awake before removing it
  2024-02-05 11:05   ` Matthew Auld
@ 2024-02-14 18:51     ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-14 18:51 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Mon, Feb 05, 2024 at 11:05:54AM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > If device is suspended, the module unload might face challenges.
> > So, let's ensure that the very first thing of the "unprobe" (remove)
> > is to wake the device.
> 
> Core kernel already ensures device is called with pm_runtime_get_sync()
> before calling probe/remove. Do we still need this?

hmmm indeed. let me try to remove this.
I believe the issue I was hitting is another display-related issue
that I see once in a while even without any of my patches.
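
For the archive, the core behavior Matthew points at lives in
drivers/pci/pci-driver.c; from memory it is roughly (check the tree for the
exact code):

static void pci_device_remove(struct device *dev)
{
	struct pci_dev *pci_dev = to_pci_dev(dev);
	struct pci_driver *drv = pci_dev->driver;

	if (drv->remove) {
		pm_runtime_get_sync(dev);	/* resume the device... */
		drv->remove(pci_dev);		/* ...so xe_pci_remove() runs awake */
		pm_runtime_put_noidle(dev);
	}
	/* ... */
}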

> 
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_pci.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index df6b3b31f419..0f296df729c4 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -686,8 +686,8 @@ static void xe_pci_remove(struct pci_dev *pdev)
> >   	if (!xe) /* driver load aborted, nothing to cleanup */
> >   		return;
> > -	xe_device_remove(xe);
> >   	xe_pm_runtime_fini(xe);
> > +	xe_device_remove(xe);
> >   	pci_set_drvdata(pdev, NULL);
> >   }

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call
  2024-02-05 11:10   ` Matthew Auld
@ 2024-02-14 18:57     ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-14 18:57 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Mon, Feb 05, 2024 at 11:10:19AM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > Let's ensure our PCI device is awakened on every debugfs call.
> > Let's increase the runtime_pm protection and start moving
> > that to the outer bounds.
> > 
> > Also remove the mem_access get_put helpers, now that they are not
> > needed anymore.
> 
> Wrong commit?

hmm... a bad sentence, probably.

what about:

"Also, let's remove the mem_access_{get,put} from where they are not
needed anymore."


> 
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Otherwise,
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> 
> > ---
> >   drivers/gpu/drm/xe/xe_debugfs.c     | 10 +++---
> >   drivers/gpu/drm/xe/xe_gt_debugfs.c  | 53 ++++++++++++++++++++++++++---
> >   drivers/gpu/drm/xe/xe_guc_debugfs.c |  9 ++---
> >   drivers/gpu/drm/xe/xe_huc_debugfs.c |  5 +--
> >   drivers/gpu/drm/xe/xe_ttm_sys_mgr.c |  5 ++-
> >   5 files changed, 66 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index 01db5b27bec5..8abdf3c17e1d 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -12,6 +12,7 @@
> >   #include "xe_bo.h"
> >   #include "xe_device.h"
> >   #include "xe_gt_debugfs.h"
> > +#include "xe_pm.h"
> >   #include "xe_step.h"
> >   #ifdef CONFIG_DRM_XE_DEBUG
> > @@ -37,6 +38,8 @@ static int info(struct seq_file *m, void *data)
> >   	struct xe_gt *gt;
> >   	u8 id;
> > +	xe_pm_runtime_get(xe);
> > +
> >   	drm_printf(&p, "graphics_verx100 %d\n", xe->info.graphics_verx100);
> >   	drm_printf(&p, "media_verx100 %d\n", xe->info.media_verx100);
> >   	drm_printf(&p, "stepping G:%s M:%s D:%s B:%s\n",
> > @@ -63,6 +66,7 @@ static int info(struct seq_file *m, void *data)
> >   			   gt->info.engine_mask);
> >   	}
> > +	xe_pm_runtime_put(xe);
> >   	return 0;
> >   }
> > @@ -76,8 +80,7 @@ static int forcewake_open(struct inode *inode, struct file *file)
> >   	struct xe_gt *gt;
> >   	u8 id;
> > -	xe_device_mem_access_get(xe);
> > -
> > +	xe_pm_runtime_get(xe);
> >   	for_each_gt(gt, xe, id)
> >   		XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL));
> > @@ -92,8 +95,7 @@ static int forcewake_release(struct inode *inode, struct file *file)
> >   	for_each_gt(gt, xe, id)
> >   		XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
> > -
> > -	xe_device_mem_access_put(xe);
> > +	xe_pm_runtime_put(xe);
> >   	return 0;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> > index c4b67cf09f8f..6b4dc2927727 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> > @@ -18,6 +18,7 @@
> >   #include "xe_lrc.h"
> >   #include "xe_macros.h"
> >   #include "xe_pat.h"
> > +#include "xe_pm.h"
> >   #include "xe_reg_sr.h"
> >   #include "xe_reg_whitelist.h"
> >   #include "xe_uc_debugfs.h"
> > @@ -37,10 +38,10 @@ static int hw_engines(struct seq_file *m, void *data)
> >   	enum xe_hw_engine_id id;
> >   	int err;
> > -	xe_device_mem_access_get(xe);
> > +	xe_pm_runtime_get(xe);
> >   	err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> >   	if (err) {
> > -		xe_device_mem_access_put(xe);
> > +		xe_pm_runtime_put(xe);
> >   		return err;
> >   	}
> > @@ -48,7 +49,7 @@ static int hw_engines(struct seq_file *m, void *data)
> >   		xe_hw_engine_print(hwe, &p);
> >   	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> > -	xe_device_mem_access_put(xe);
> > +	xe_pm_runtime_put(xe);
> >   	if (err)
> >   		return err;
> > @@ -59,18 +60,23 @@ static int force_reset(struct seq_file *m, void *data)
> >   {
> >   	struct xe_gt *gt = node_to_gt(m->private);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_gt_reset_async(gt);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return 0;
> >   }
> >   static int sa_info(struct seq_file *m, void *data)
> >   {
> > -	struct xe_tile *tile = gt_to_tile(node_to_gt(m->private));
> > +	struct xe_gt *gt = node_to_gt(m->private);
> > +	struct xe_tile *tile = gt_to_tile(gt);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	drm_suballoc_dump_debug_info(&tile->mem.kernel_bb_pool->base, &p,
> >   				     tile->mem.kernel_bb_pool->gpu_addr);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return 0;
> >   }
> > @@ -80,7 +86,9 @@ static int topology(struct seq_file *m, void *data)
> >   	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_gt_topology_dump(gt, &p);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return 0;
> >   }
> > @@ -90,7 +98,9 @@ static int steering(struct seq_file *m, void *data)
> >   	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_gt_mcr_steering_dump(gt, &p);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return 0;
> >   }
> > @@ -99,8 +109,13 @@ static int ggtt(struct seq_file *m, void *data)
> >   {
> >   	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	int ret;
> > +
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> > +	ret = xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > -	return xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
> > +	return ret;
> >   }
> >   static int register_save_restore(struct seq_file *m, void *data)
> > @@ -110,6 +125,8 @@ static int register_save_restore(struct seq_file *m, void *data)
> >   	struct xe_hw_engine *hwe;
> >   	enum xe_hw_engine_id id;
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> > +
> >   	xe_reg_sr_dump(&gt->reg_sr, &p);
> >   	drm_printf(&p, "\n");
> > @@ -127,6 +144,8 @@ static int register_save_restore(struct seq_file *m, void *data)
> >   	for_each_hw_engine(hwe, gt, id)
> >   		xe_reg_whitelist_dump(&hwe->reg_whitelist, &p);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > +
> >   	return 0;
> >   }
> > @@ -135,7 +154,9 @@ static int workarounds(struct seq_file *m, void *data)
> >   	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_wa_dump(gt, &p);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return 0;
> >   }
> > @@ -145,48 +166,70 @@ static int pat(struct seq_file *m, void *data)
> >   	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_pat_dump(gt, &p);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> >   	return 0;
> >   }
> >   static int rcs_default_lrc(struct seq_file *m, void *data)
> >   {
> > +	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_RENDER);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > +
> >   	return 0;
> >   }
> >   static int ccs_default_lrc(struct seq_file *m, void *data)
> >   {
> > +	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_COMPUTE);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > +
> >   	return 0;
> >   }
> >   static int bcs_default_lrc(struct seq_file *m, void *data)
> >   {
> > +	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_COPY);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > +
> >   	return 0;
> >   }
> >   static int vcs_default_lrc(struct seq_file *m, void *data)
> >   {
> > +	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_VIDEO_DECODE);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > +
> >   	return 0;
> >   }
> >   static int vecs_default_lrc(struct seq_file *m, void *data)
> >   {
> > +	struct xe_gt *gt = node_to_gt(m->private);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > +	xe_pm_runtime_get(gt_to_xe(gt));
> >   	xe_lrc_dump_default(&p, node_to_gt(m->private), XE_ENGINE_CLASS_VIDEO_ENHANCE);
> > +	xe_pm_runtime_put(gt_to_xe(gt));
> > +
> >   	return 0;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_guc_debugfs.c b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > index ffd7d53bcc42..d3822cbea273 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > @@ -14,6 +14,7 @@
> >   #include "xe_guc_ct.h"
> >   #include "xe_guc_log.h"
> >   #include "xe_macros.h"
> > +#include "xe_pm.h"
> >   static struct xe_guc *node_to_guc(struct drm_info_node *node)
> >   {
> > @@ -26,9 +27,9 @@ static int guc_info(struct seq_file *m, void *data)
> >   	struct xe_device *xe = guc_to_xe(guc);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > -	xe_device_mem_access_get(xe);
> > +	xe_pm_runtime_get(xe);
> >   	xe_guc_print_info(guc, &p);
> > -	xe_device_mem_access_put(xe);
> > +	xe_pm_runtime_put(xe);
> >   	return 0;
> >   }
> > @@ -39,9 +40,9 @@ static int guc_log(struct seq_file *m, void *data)
> >   	struct xe_device *xe = guc_to_xe(guc);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > -	xe_device_mem_access_get(xe);
> > +	xe_pm_runtime_get(xe);
> >   	xe_guc_log_print(&guc->log, &p);
> > -	xe_device_mem_access_put(xe);
> > +	xe_pm_runtime_put(xe);
> >   	return 0;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_huc_debugfs.c b/drivers/gpu/drm/xe/xe_huc_debugfs.c
> > index 18585a7eeb9d..3a888a40188b 100644
> > --- a/drivers/gpu/drm/xe/xe_huc_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_huc_debugfs.c
> > @@ -12,6 +12,7 @@
> >   #include "xe_gt.h"
> >   #include "xe_huc.h"
> >   #include "xe_macros.h"
> > +#include "xe_pm.h"
> >   static struct xe_gt *
> >   huc_to_gt(struct xe_huc *huc)
> > @@ -36,9 +37,9 @@ static int huc_info(struct seq_file *m, void *data)
> >   	struct xe_device *xe = huc_to_xe(huc);
> >   	struct drm_printer p = drm_seq_file_printer(m);
> > -	xe_device_mem_access_get(xe);
> > +	xe_pm_runtime_get(xe);
> >   	xe_huc_print_info(huc, &p);
> > -	xe_device_mem_access_put(xe);
> > +	xe_pm_runtime_put(xe);
> >   	return 0;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
> > index 3e1fa0c832ca..9844a8edbfe1 100644
> > --- a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
> > +++ b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
> > @@ -73,7 +73,10 @@ static void xe_ttm_sys_mgr_del(struct ttm_resource_manager *man,
> >   static void xe_ttm_sys_mgr_debug(struct ttm_resource_manager *man,
> >   				 struct drm_printer *printer)
> >   {
> > -
> > +	/*
> > +	 * This function is called by debugfs entry and would require
> > +	 * pm_runtime_{get,put} wrappers around any operation.
> > +	 */
> >   }
> >   static const struct ttm_resource_manager_func xe_ttm_sys_mgr_func = {

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 03/34] drm/xe: Fix display runtime_pm handling
  2024-02-14 18:05     ` Rodrigo Vivi
@ 2024-02-15  9:30       ` Matthew Auld
  2024-02-15 22:19         ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-15  9:30 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-xe

On 14/02/2024 18:05, Rodrigo Vivi wrote:
> On Mon, Feb 05, 2024 at 09:11:37AM +0000, Matthew Auld wrote:
>> On 26/01/2024 20:30, Rodrigo Vivi wrote:
>>> i915's intel_runtime_pm_get_if_in_use actually calls the
>>> pm_runtime_get_if_active() with ign_usage_count = false, but Xe
>>> was erroneously calling it with true because of the mem_access cases.
>>
>> Good catch.
>>
>>> This can lead to unbalanced references.
>>
>> Is there an actual imbalance here though? Is it not just a case of being
>> overzealous in keeping the device awake when it is not currently "in_use" vs
>> "if_active"? If the api increments the usage count we will still decrement
>> it later, regardless of active vs in-use, AFAICT.
> 
> Indeed.
> 
> s/This can lead to unbalanced references./This can lead to unnecessary
> references being held here and the device never reaching the runtime
> suspended state./g
> 
>>
>>>
>>> Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
>>>
>>> Also, already start this new function protected from the runtime
>>> recursion, since runtime_pm will need to call for display functions
>>> for a proper D3Cold flow.
>>>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> ---
>>>    .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  2 +-
>>>    drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
>>>    drivers/gpu/drm/xe/xe_pm.h                      |  1 +
>>>    3 files changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
>>> index 420eba0e4be0..ad5864d1dd74 100644
>>> --- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
>>> +++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
>>> @@ -177,7 +177,7 @@ static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_p
>>>    {
>>>    	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
>>> -	return xe_pm_runtime_get_if_active(xe);
>>> +	return xe_pm_runtime_get_if_in_use(xe);
>>>    }
>>>    static inline void intel_runtime_pm_put_unchecked(struct xe_runtime_pm *pm)
>>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>>> index bd35fe9f6227..19f88cb7715b 100644
>>> --- a/drivers/gpu/drm/xe/xe_pm.c
>>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>>> @@ -417,6 +417,23 @@ int xe_pm_runtime_get_if_active(struct xe_device *xe)
>>>    	return pm_runtime_get_if_active(xe->drm.dev, true);
>>>    }
>>> +/**
>>> + * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference and resume if needed
>>> + * @xe: xe device instance
>>> + *
>>> + * Returns: True if device is awake and the reference was taken, false otherwise.
>>> + */
>>> +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
>>> +{
>>> +	if (xe_pm_read_callback_task(xe) == current) {
>>> +		/* The device is awake, grab the ref and move on */
>>> +		pm_runtime_get_noresume(xe->drm.dev);
>>> +		return true;
>>> +	}
>>> +
>>> +	return pm_runtime_get_if_in_use(xe->drm.dev) >= 0;
>>
>> This is doing atomic_inc_not_zero() underneath for the "in_use" case AFAICT.
>> If the usage count is zero it doesn't increment it and returns 0. Does that
>> not lead to an imbalance? Should this rather be > 0?

^^ did you see the comment here also?

>>
>>> +}
>>> +
>>>    /**
>>>     * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
>>>     * @xe: xe device instance
>>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>>> index 64a97c6726a7..9d372cbf388b 100644
>>> --- a/drivers/gpu/drm/xe/xe_pm.h
>>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>>> @@ -28,6 +28,7 @@ int xe_pm_runtime_resume(struct xe_device *xe);
>>>    int xe_pm_runtime_get(struct xe_device *xe);
>>>    int xe_pm_runtime_put(struct xe_device *xe);
>>>    int xe_pm_runtime_get_if_active(struct xe_device *xe);
>>> +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
>>>    void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
>>>    int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
>>>    void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 03/34] drm/xe: Fix display runtime_pm handling
  2024-02-15  9:30       ` Matthew Auld
@ 2024-02-15 22:19         ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-15 22:19 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Thu, Feb 15, 2024 at 09:30:08AM +0000, Matthew Auld wrote:
> On 14/02/2024 18:05, Rodrigo Vivi wrote:
> > On Mon, Feb 05, 2024 at 09:11:37AM +0000, Matthew Auld wrote:
> > > On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > > > i915's intel_runtime_pm_get_if_in_use actually calls the
> > > > pm_runtime_get_if_active() with ign_usage_count = false, but Xe
> > > > was erroneously calling it with true because of the mem_access cases.
> > >
> > > Good catch.
> > >
> > > > This can lead to unbalanced references.
> > >
> > > Is there an actual imbalance here though? Is it not just a case of being
> > > overzealous in keeping the device awake when it is not currently "in_use" vs
> > > "if_active"? If the api increments the usage count we will still decrement
> > > it later, regardless of active vs in-use, AFAICT.
> >
> > Indeed.
> >
> > s/This can lead to unbalanced references./This can lead to unnecessary
> > references being held here and the device never reaching the runtime
> > suspended state./g
> >
> > >
> > > >
> > > > Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
> > > >
> > > > Also, already start this new function protected from the runtime
> > > > recursion, since runtime_pm will need to call for display functions
> > > > for a proper D3Cold flow.
> > > >
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > ---
> > > >    .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  2 +-
> > > >    drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
> > > >    drivers/gpu/drm/xe/xe_pm.h                      |  1 +
> > > >    3 files changed, 19 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > > > index 420eba0e4be0..ad5864d1dd74 100644
> > > > --- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > > > +++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > > > @@ -177,7 +177,7 @@ static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_p
> > > >    {
> > > >    	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
> > > > -	return xe_pm_runtime_get_if_active(xe);
> > > > +	return xe_pm_runtime_get_if_in_use(xe);
> > > >    }
> > > >    static inline void intel_runtime_pm_put_unchecked(struct xe_runtime_pm *pm)
> > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > > index bd35fe9f6227..19f88cb7715b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > @@ -417,6 +417,23 @@ int xe_pm_runtime_get_if_active(struct xe_device *xe)
> > > >    	return pm_runtime_get_if_active(xe->drm.dev, true);
> > > >    }
> > > > +/**
> > > > + * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference and resume if needed
> > > > + * @xe: xe device instance
> > > > + *
> > > > + * Returns: True if device is awake and the reference was taken, false otherwise.
> > > > + */
> > > > +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
> > > > +{
> > > > +	if (xe_pm_read_callback_task(xe) == current) {
> > > > +		/* The device is awake, grab the ref and move on */
> > > > +		pm_runtime_get_noresume(xe->drm.dev);
> > > > +		return true;
> > > > +	}
> > > > +
> > > > +	return pm_runtime_get_if_in_use(xe->drm.dev) >= 0;
> > >
> > > This is doing atomic_inc_not_zero() underneath for the "in_use" case AFAICT.
> > > If the usage count is zero it doesn't increment it and returns 0. Does that
> > > not lead to an imbalance? Should this rather be > 0?
>
> ^^ did you see the comment here also?

oh, I had missed this comment, I'm sorry!
And good catch as well... maybe this will even solve some sporadic remaining underflows
that I was seeing...

And it applies not just here, but also in the if_active version.
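
Something like this for the in_use variant, then (a sketch against the code
quoted above; the if_active one needs the same treatment):

bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
{
	if (xe_pm_read_callback_task(xe) == current) {
		/* The device is awake, grab the ref and move on */
		pm_runtime_get_noresume(xe->drm.dev);
		return true;
	}

	/*
	 * 1 means the reference was taken; 0 (not in use) and -EINVAL
	 * (runtime PM disabled) both mean no reference was taken, so
	 * they must not be reported as success.
	 */
	return pm_runtime_get_if_in_use(xe->drm.dev) > 0;
}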

doc for pm_runtime_get_if_active():
"
 * The caller is responsible for decrementing the runtime PM usage counter of
 * @dev after this function has returned a positive value for it.
"

Thanks a lot,
Rodrigo.

>
> > >
> > > > +}
> > > > +
> > > >    /**
> > > >     * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
> > > >     * @xe: xe device instance
> > > > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > > > index 64a97c6726a7..9d372cbf388b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pm.h
> > > > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > > > @@ -28,6 +28,7 @@ int xe_pm_runtime_resume(struct xe_device *xe);
> > > >    int xe_pm_runtime_get(struct xe_device *xe);
> > > >    int xe_pm_runtime_put(struct xe_device *xe);
> > > >    int xe_pm_runtime_get_if_active(struct xe_device *xe);
> > > > +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
> > > >    void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
> > > >    int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
> > > >    void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore
  2024-02-05 13:29   ` Matthew Auld
@ 2024-02-15 22:37     ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-15 22:37 UTC (permalink / raw)
  To: Matthew Auld, Matthew Brost; +Cc: intel-xe

On Mon, Feb 05, 2024 at 01:29:57PM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > All the VM operations are now protected either by the IOCTL
> > calls, or by sched/exec operations, or by the specific
> > xe_pm_runtime_{get,put} around the duration of the LR VMs.
> > 
> > So, these inner mem_access protections can be removed.
> 
> Would this be a good place to split the series?

yeap, it is good to split the series. Let's first try to get
that first part in so we protect the missing cases from the outer
bounds as we continue to discuss the rest and try to come up with
a good solution.

> AFAICT a lot of the
> nightmares originate from here. Like first part of the series moves the rpm
> to be the outermost thing, but we still keep the important vm rpm protection
> here, so shouldn't need the CT, sched stuff etc yet. And in theory there is
> still a path to d3cold without this patch, I think. Also playing devil's
> advocate, since we already end up having rpm protection for LR vm, why not
> normal vm? Is it worth it?

hmmm... do we really have a path with this in place?

Take regular client usage on discrete, for instance.
The userspace creates the VM. Then the user doesn't touch the input,
the screen goes blank, display off, but the VM is still there.
When would we enter D3cold? Really only when we don't have
any compositor running? Only in non-display cases?

For the LR ones that's a good question; Matt Brost was concerned about that.
Apparently we could have a case where the ioctl returned, then the job was
freed because the hw fence is immediately signalled in that case.
So we wouldn't have any other protection.

The job one seemed to be the easiest way to take care of the exec_queue,
but there's this caveat with the LR case. Perhaps we need to find a
better outer place to protect executions?

> 
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_vm.c | 7 -------
> >   1 file changed, 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 9ad154a01ad7..f314ff128028 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -1280,9 +1280,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> >   	vm->pt_ops = &xelp_pt_ops;
> > -	if (!(flags & XE_VM_FLAG_MIGRATION))
> > -		xe_device_mem_access_get(xe);
> > -
> >   	vm_resv_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
> >   	if (!vm_resv_obj) {
> >   		err = -ENOMEM;
> > @@ -1391,8 +1388,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> >   	for_each_tile(tile, xe, id)
> >   		xe_range_fence_tree_fini(&vm->rftree[id]);
> >   	kfree(vm);
> > -	if (!(flags & XE_VM_FLAG_MIGRATION))
> > -		xe_device_mem_access_put(xe);
> >   	return ERR_PTR(err);
> >   }
> > @@ -1515,8 +1510,6 @@ static void vm_destroy_work_func(struct work_struct *w)
> >   	xe_assert(xe, !vm->size);
> >   	if (!(vm->flags & XE_VM_FLAG_MIGRATION)) {
> > -		xe_device_mem_access_put(xe);
> > -
> >   		if (xe->info.has_asid && vm->usm.asid) {
> >   			mutex_lock(&xe->usm.lock);
> >   			lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [RFC 19/34] drm/xe: Remove pm_runtime lockdep
  2024-02-05 11:54   ` Matthew Auld
@ 2024-02-15 22:47     ` Rodrigo Vivi
  2024-02-20 17:48       ` Matthew Auld
  0 siblings, 1 reply; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-15 22:47 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Mon, Feb 05, 2024 at 11:54:45AM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > This lockdep map was initially designed for the mem_access,
> > where the mem_access needed to run the resume synchronously
> > from the inner bounds and count the references.
> > 
> > With the runtime pm moving to the outer bounds of the driver
> > and the mem_access replaced by pure rpm get/put
> > references, it is no longer needed and it is, as a matter
> 
> We are also calling it in workers like invalidation_fence_work_func(),
> xe_sched_process_msg_work() etc. It is still quite easy to deadlock with
> that.
> 
> > of fact, just splatting many false positives such as the following:
> 
> Yeah, calling invalidation_fence_work_func() directly in the caller's context
> is a false positive, since the ioctl holds the rpm ref. But what about calling
> it in the async case, where invalidation_fence_work_func() is only called when
> the in-fence signals? Are we sure that is safe? It should be simple to fix
> the false positive here, no? If we just delete all the annotations we get
> zero help from lockdep for the more complicated cases. Also IIRC lockdep
> doesn't show you any more splats once it has triggered once, so who knows what
> else is lurking and whether that is also a false positive?
> 
> > 
> >                 -> #1 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
> > [  384.778761]        xe_pm_runtime_get+0xa3/0x100 [xe]
> > [  384.783871]        invalidation_fence_work_func+0x7f/0x2b0 [xe]
> > [  384.789942]        invalidation_fence_init+0x8c2/0xce0 [xe]
> > [  384.795671]        __xe_pt_unbind_vma+0x4a7/0x1be0 [xe]
> > [  384.801050]        xe_vm_unbind+0x22f/0xc70 [xe]
> > [  384.805821]        __xe_vma_op_execute+0xc67/0x1af0 [xe]
> > [  384.811286]        xe_vm_bind_ioctl+0x3a36/0x66c0 [xe]
> > [  384.816579]        drm_ioctl_kernel+0x14a/0x2c0
> > [  384.821132]        drm_ioctl+0x4c6/0xab0
> > [  384.825073]        xe_drm_ioctl+0xa1/0xe0 [xe]
> > [  384.829651]        __x64_sys_ioctl+0x130/0x1a0
> > [  384.834115]        do_syscall_64+0x5c/0xe0
> > [  384.838232]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
> > [  384.843829]
> >                 -> #0 (reservation_ww_class_mutex){+.+.}-{4:4}:
> > [  384.850911]        __lock_acquire+0x3261/0x6330
> > [  384.855462]        lock_acquire+0x19b/0x4d0
> > [  384.859666]        __ww_mutex_lock.constprop.0+0x1d8/0x3500
> > [  384.865263]        ww_mutex_lock+0x38/0x150
> > [  384.869465]        xe_bo_lock+0x41/0x70 [xe]
> > [  384.873869]        xe_bo_evict_all+0x7ad/0xa40 [xe]
> > [  384.878883]        xe_pm_runtime_suspend+0x297/0x340 [xe]
> > [  384.884431]        xe_pci_runtime_suspend+0x3b/0x1e0 [xe]
> > [  384.889975]        pci_pm_runtime_suspend+0x168/0x540
> > [  384.895052]        __rpm_callback+0xa9/0x390
> > [  384.899343]        rpm_callback+0x1aa/0x210
> > [  384.903543]        rpm_suspend+0x2ea/0x14c0
> > [  384.907746]        pm_runtime_work+0x133/0x170
> > [  384.912213]        process_one_work+0x73b/0x1230
> > [  384.916853]        worker_thread+0x726/0x1320
> > [  384.921237]        kthread+0x2ee/0x3d0
> > [  384.925005]        ret_from_fork+0x2d/0x70
> > [  384.929120]        ret_from_fork_asm+0x1b/0x30
> > [  384.933585]
> >                 other info that might help us debug this:
> > 
> > [  384.941625]  Possible unsafe locking scenario:
> > 
> > [  384.947572]        CPU0                    CPU1
> > [  384.952123]        ----                    ----
> > [  384.956676]   lock(xe_pm_runtime_lockdep_map);
> > [  384.961140]                                lock(reservation_ww_class_mutex);
> > [  384.968220]                                lock(xe_pm_runtime_lockdep_map);
> > [  384.975214]   lock(reservation_ww_class_mutex);
> > [  384.979765]
> >                  *** DEADLOCK ***
> > 
> > As a matter of fact, there's actually a third lock that is not
> > in this picture:
> > spin_lock_irq(&dev->power.lock);
> > and INIT_WORK(&dev->power.work, pm_runtime_work);
> > 
> > The pm_callback_task will ensure that there are no recursive
> > calls of the resume function, and it will increase the
> > reference counter anyway.
> > 
> > Then, the pm_runtime workqueue and spin locks will prevent
> > any resume and suspend operations from happening in parallel
> > with other resume and suspend operations.
> > 
> > With that, the only thing that we are actually doing here is
> > wrongly training lockdep, basically saying that we will
> > acquire some locks on resume and on suspend concurrently,
> > entirely ignoring its serialization and protection.
> > 
> > The above scenario is simply not possible because there is
> > serialization with the spin_lock_irq(&dev->power.lock)
> > before each operation. However, we are telling lockdep
> > that the lock(xe_pm_runtime_lockdep_map) occurs before
> > &dev->power.lock, and lockdep is not capable of seeing
> > that other protection.
> 
> Can you share some more info here? AFAIK dev->power.lock is an RPM core lock
> that protects internal state like transitioning between ACTIVE, SUSPENDING,
> RESUMING etc. It is never held when calling any of our rpm callbacks, so it
> should never factor in to xe_pm_runtime_lockdep_map.
> 
> The overall thing should look like this:
> 
> xe_rpm_get:
>     lock_map_acquire(&xe_pm_runtime_lockdep_map);
>     lock_map_release(&xe_pm_runtime_lockdep_map);
>     .....
>     RPM core grabs dev->power.lock
> 
> rpm resume callback: (RPM has dropped dev->power.lock)

                       ^ this is exactly the problem.
RPM doesn't drop dev->power.lock while the callback
is called. It relaxes while waiting for another transaction
to finish, but it is held by the time it gets to the callback.

>     lock_map_acquire(&xe_pm_runtime_lockdep_map);
>     ....do resume stuff
>     lock_map_release(&xe_pm_runtime_lockdep_map);
> 
> rpm suspend callback: (RPM has dropped dev->power.lock)
>     lock_map_acquire(&xe_pm_runtime_lockdep_map);
>     .....do suspend stuff
>     lock_map_release(&xe_pm_runtime_lockdep_map);
> 
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_pm.c | 55 --------------------------------------
> >   1 file changed, 55 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index 86bf225dba02..f49e449d9fb7 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -67,12 +67,6 @@
> >    */
> > -#ifdef CONFIG_LOCKDEP
> > -struct lockdep_map xe_pm_runtime_lockdep_map = {
> > -	.name = "xe_pm_runtime_lockdep_map"
> > -};
> > -#endif
> > -
> >   /**
> >    * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
> >    * @xe: xe device instance
> > @@ -291,29 +285,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> >   	/* Disable access_ongoing asserts and prevent recursive pm calls */
> >   	xe_pm_write_callback_task(xe, current);
> > -	/*
> > -	 * The actual xe_pm_runtime_put() is always async underneath, so
> > -	 * exactly where that is called should makes no difference to us. However
> > -	 * we still need to be very careful with the locks that this callback
> > -	 * acquires and the locks that are acquired and held by any callers of
> > -	 * xe_runtime_pm_get(). We already have the matching annotation
> > -	 * on that side, but we also need it here. For example lockdep should be
> > -	 * able to tell us if the following scenario is in theory possible:
> > -	 *
> > -	 * CPU0                          | CPU1 (kworker)
> > -	 * lock(A)                       |
> > -	 *                               | xe_pm_runtime_suspend()
> > -	 *                               |      lock(A)
> > -	 * xe_pm_runtime_get()           |
> > -	 *
> > -	 * This will clearly deadlock since rpm core needs to wait for
> > -	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
> > -	 * on CPU0 which prevents CPU1 making forward progress.  With the
> > -	 * annotation here and in xe_pm_runtime_get() lockdep will see
> > -	 * the potential lock inversion and give us a nice splat.
> > -	 */
> > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > -
> >   	/*
> >   	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
> >   	 * also checks and delets bo entry from user fault list.
> > @@ -341,7 +312,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> >   	if (xe->d3cold.allowed)
> >   		xe_display_pm_runtime_suspend(xe);
> >   out:
> > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> >   	xe_pm_write_callback_task(xe, NULL);
> >   	return err;
> >   }
> > @@ -361,8 +331,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> >   	/* Disable access_ongoing asserts and prevent recursive pm calls */
> >   	xe_pm_write_callback_task(xe, current);
> > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > -
> >   	/*
> >   	 * It can be possible that xe has allowed d3cold but other pcie devices
> >   	 * in gfx card soc would have blocked d3cold, therefore card has not
> > @@ -400,31 +368,10 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> >   			goto out;
> >   	}
> >   out:
> > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> >   	xe_pm_write_callback_task(xe, NULL);
> >   	return err;
> >   }
> > -/*
> > - * For places where resume is synchronous it can be quite easy to deadlock
> > - * if we are not careful. Also in practice it might be quite timing
> > - * sensitive to ever see the 0 -> 1 transition with the callers locks
> > - * held, so deadlocks might exist but are hard for lockdep to ever see.
> > - * With this in mind, help lockdep learn about the potentially scary
> > - * stuff that can happen inside the runtime_resume callback by acquiring
> > - * a dummy lock (it doesn't protect anything and gets compiled out on
> > - * non-debug builds).  Lockdep then only needs to see the
> > - * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
> > - * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
> > - * For example if the (callers_locks) are ever grabbed in the
> > - * runtime_resume callback, lockdep should give us a nice splat.
> > - */
> > -static void pm_runtime_lockdep_training(void)
> > -{
> > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > -}
> > -
> >   /**
> >    * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
> >    * @xe: xe device instance
> > @@ -436,7 +383,6 @@ void xe_pm_runtime_get(struct xe_device *xe)
> >   	if (xe_pm_read_callback_task(xe) == current)
> >   		return;
> > -	pm_runtime_lockdep_training();
> >   	pm_runtime_resume(xe->drm.dev);
> >   }
> > @@ -466,7 +412,6 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
> >   	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
> >   		return -ELOOP;
> > -	pm_runtime_lockdep_training();
> >   	return pm_runtime_get_sync(xe->drm.dev);
> >   }


* Re: [RFC 19/34] drm/xe: Remove pm_runtime lockdep
  2024-02-15 22:47     ` Rodrigo Vivi
@ 2024-02-20 17:48       ` Matthew Auld
  2024-02-28 16:53         ` Rodrigo Vivi
  0 siblings, 1 reply; 77+ messages in thread
From: Matthew Auld @ 2024-02-20 17:48 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-xe

On 15/02/2024 22:47, Rodrigo Vivi wrote:
> On Mon, Feb 05, 2024 at 11:54:45AM +0000, Matthew Auld wrote:
>> On 26/01/2024 20:30, Rodrigo Vivi wrote:
>>> This lockdep map was initially designed for the mem_access,
>>> where the mem_access needed to run the resume synchronously
>>> from the inner bounds and count the references.
>>>
>>> With the runtime pm moving to the outer bounds of the driver
>>> and the mem_access replaced by pure rpm get/put
>>> references, it is no longer needed and it is, as a matter
>>
>> We are also calling it in workers like invalidation_fence_work_func(),
>> xe_sched_process_msg_work() etc. It is still quite easy to deadlock with
>> that.
>>
>>> of fact, just splatting many false positives such as the following:
>>
>> Yeah, calling invalidation_fence_work_func() directly in the caller's context
>> is a false positive, since the ioctl holds the rpm ref. But what about calling
>> it in the async case, where invalidation_fence_work_func() is only called when
>> the in-fence signals? Are we sure that is safe? It should be simple to fix
>> the false positive here, no? If we just delete all the annotations we get
>> zero help from lockdep for the more complicated cases. Also IIRC lockdep
>> doesn't show you any more splats once it has triggered once, so who knows what
>> else is lurking and whether that is also a false positive?
>>
>>>
>>>                  -> #1 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
>>> [  384.778761]        xe_pm_runtime_get+0xa3/0x100 [xe]
>>> [  384.783871]        invalidation_fence_work_func+0x7f/0x2b0 [xe]
>>> [  384.789942]        invalidation_fence_init+0x8c2/0xce0 [xe]
>>> [  384.795671]        __xe_pt_unbind_vma+0x4a7/0x1be0 [xe]
>>> [  384.801050]        xe_vm_unbind+0x22f/0xc70 [xe]
>>> [  384.805821]        __xe_vma_op_execute+0xc67/0x1af0 [xe]
>>> [  384.811286]        xe_vm_bind_ioctl+0x3a36/0x66c0 [xe]
>>> [  384.816579]        drm_ioctl_kernel+0x14a/0x2c0
>>> [  384.821132]        drm_ioctl+0x4c6/0xab0
>>> [  384.825073]        xe_drm_ioctl+0xa1/0xe0 [xe]
>>> [  384.829651]        __x64_sys_ioctl+0x130/0x1a0
>>> [  384.834115]        do_syscall_64+0x5c/0xe0
>>> [  384.838232]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>> [  384.843829]
>>>                  -> #0 (reservation_ww_class_mutex){+.+.}-{4:4}:
>>> [  384.850911]        __lock_acquire+0x3261/0x6330
>>> [  384.855462]        lock_acquire+0x19b/0x4d0
>>> [  384.859666]        __ww_mutex_lock.constprop.0+0x1d8/0x3500
>>> [  384.865263]        ww_mutex_lock+0x38/0x150
>>> [  384.869465]        xe_bo_lock+0x41/0x70 [xe]
>>> [  384.873869]        xe_bo_evict_all+0x7ad/0xa40 [xe]
>>> [  384.878883]        xe_pm_runtime_suspend+0x297/0x340 [xe]
>>> [  384.884431]        xe_pci_runtime_suspend+0x3b/0x1e0 [xe]
>>> [  384.889975]        pci_pm_runtime_suspend+0x168/0x540
>>> [  384.895052]        __rpm_callback+0xa9/0x390
>>> [  384.899343]        rpm_callback+0x1aa/0x210
>>> [  384.903543]        rpm_suspend+0x2ea/0x14c0
>>> [  384.907746]        pm_runtime_work+0x133/0x170
>>> [  384.912213]        process_one_work+0x73b/0x1230
>>> [  384.916853]        worker_thread+0x726/0x1320
>>> [  384.921237]        kthread+0x2ee/0x3d0
>>> [  384.925005]        ret_from_fork+0x2d/0x70
>>> [  384.929120]        ret_from_fork_asm+0x1b/0x30
>>> [  384.933585]
>>>                  other info that might help us debug this:
>>>
>>> [  384.941625]  Possible unsafe locking scenario:
>>>
>>> [  384.947572]        CPU0                    CPU1
>>> [  384.952123]        ----                    ----
>>> [  384.956676]   lock(xe_pm_runtime_lockdep_map);
>>> [  384.961140]                                lock(reservation_ww_class_mutex);
>>> [  384.968220]                                lock(xe_pm_runtime_lockdep_map);
>>> [  384.975214]   lock(reservation_ww_class_mutex);
>>> [  384.979765]
>>>                   *** DEADLOCK ***
>>>
>>> As a matter of fact, there's actually a third lock that is not
>>> in this picture:
>>> spin_lock_irq(&dev->power.lock);
>>> and INIT_WORK(&dev->power.work, pm_runtime_work);
>>>
>>> The pm_callback_task will ensure that there are no recursive
>>> calls of the resume function, and it will increase the
>>> reference counter anyway.
>>>
>>> Then, the pm_runtime workqueue and spin locks will prevent
>>> any resume and suspend operations from happening in parallel
>>> with other resume and suspend operations.
>>>
>>> With that, the only thing that we are actually doing here is
>>> wrongly training lockdep, basically saying that we will
>>> acquire some locks on resume and on suspend concurrently,
>>> entirely ignoring its serialization and protection.
>>>
>>> The above scenario is simply not possible because there is
>>> serialization with the spin_lock_irq(&dev->power.lock)
>>> before each operation. However, we are telling lockdep
>>> that the lock(xe_pm_runtime_lockdep_map) occurs before
>>> &dev->power.lock, and lockdep is not capable of seeing
>>> that other protection.
>>
>> Can you share some more info here? AFAIK dev->power.lock is an RPM core lock
>> that protects internal state like transitioning between ACTIVE, SUSPENDING,
>> RESUMING etc. It is never held when calling any of our rpm callbacks, so it
>> should never factor in to xe_pm_runtime_lockdep_map.
>>
>> The overall thing should look like this:
>>
>> xe_rpm_get:
>>      lock_map_acquire(&xe_pm_runtime_lockdep_map);
>>      lock_map_release(&xe_pm_runtime_lockdep_map);
>>      .....
>>      RPM core grabs dev->power.lock
>>
>> rpm resume callback: (RPM has dropped dev->power.lock)
> 
>                         ^ this is exactly the problem.
> RPM doesn't drop dev->power.lock while the callback
> is called. It relaxes while waiting for another transaction
> to finish, but it is held by the time it gets to the callback.

AFAICT it is never held from the context that is calling the resume or 
suspend callback (spinlock would be very limiting otherwise), and it ofc 
can't be held while sleeping, like when "waiting for other 
transactions". Note that the "waiting for other transactions" thing is 
exactly what lockdep doesn't understand by itself (there are no 
annotations for waitqueue AFAIK), which is part of what the xe_pm 
annotations here help with...

> 
>>      lock_map_acquire(&xe_pm_runtime_lockdep_map);
>>      ....do resume stuff
>>      lock_map_release(&xe_pm_runtime_lockdep_map);
>>
>> rpm suspend callback: (RPM has dropped dev->power.lock)
>>      lock_map_acquire(&xe_pm_runtime_lockdep_map);
>>      .....do suspend stuff
>>      lock_map_release(&xe_pm_runtime_lockdep_map);
>>
>>>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> ---
>>>    drivers/gpu/drm/xe/xe_pm.c | 55 --------------------------------------
>>>    1 file changed, 55 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>>> index 86bf225dba02..f49e449d9fb7 100644
>>> --- a/drivers/gpu/drm/xe/xe_pm.c
>>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>>> @@ -67,12 +67,6 @@
>>>     */
>>> -#ifdef CONFIG_LOCKDEP
>>> -struct lockdep_map xe_pm_runtime_lockdep_map = {
>>> -	.name = "xe_pm_runtime_lockdep_map"
>>> -};
>>> -#endif
>>> -
>>>    /**
>>>     * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
>>>     * @xe: xe device instance
>>> @@ -291,29 +285,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>>>    	/* Disable access_ongoing asserts and prevent recursive pm calls */
>>>    	xe_pm_write_callback_task(xe, current);
>>> -	/*
>>> -	 * The actual xe_pm_runtime_put() is always async underneath, so
>>> -	 * exactly where that is called should makes no difference to us. However
>>> -	 * we still need to be very careful with the locks that this callback
>>> -	 * acquires and the locks that are acquired and held by any callers of
>>> -	 * xe_runtime_pm_get(). We already have the matching annotation
>>> -	 * on that side, but we also need it here. For example lockdep should be
>>> -	 * able to tell us if the following scenario is in theory possible:
>>> -	 *
>>> -	 * CPU0                          | CPU1 (kworker)
>>> -	 * lock(A)                       |
>>> -	 *                               | xe_pm_runtime_suspend()
>>> -	 *                               |      lock(A)
>>> -	 * xe_pm_runtime_get()           |
>>> -	 *
>>> -	 * This will clearly deadlock since rpm core needs to wait for
>>> -	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
>>> -	 * on CPU0 which prevents CPU1 making forward progress.  With the
>>> -	 * annotation here and in xe_pm_runtime_get() lockdep will see
>>> -	 * the potential lock inversion and give us a nice splat.
>>> -	 */
>>> -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
>>> -
>>>    	/*
>>>    	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
>>>    	 * also checks and delets bo entry from user fault list.
>>> @@ -341,7 +312,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>>>    	if (xe->d3cold.allowed)
>>>    		xe_display_pm_runtime_suspend(xe);
>>>    out:
>>> -	lock_map_release(&xe_pm_runtime_lockdep_map);
>>>    	xe_pm_write_callback_task(xe, NULL);
>>>    	return err;
>>>    }
>>> @@ -361,8 +331,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>>>    	/* Disable access_ongoing asserts and prevent recursive pm calls */
>>>    	xe_pm_write_callback_task(xe, current);
>>> -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
>>> -
>>>    	/*
>>>    	 * It can be possible that xe has allowed d3cold but other pcie devices
>>>    	 * in gfx card soc would have blocked d3cold, therefore card has not
>>> @@ -400,31 +368,10 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>>>    			goto out;
>>>    	}
>>>    out:
>>> -	lock_map_release(&xe_pm_runtime_lockdep_map);
>>>    	xe_pm_write_callback_task(xe, NULL);
>>>    	return err;
>>>    }
>>> -/*
>>> - * For places where resume is synchronous it can be quite easy to deadlock
>>> - * if we are not careful. Also in practice it might be quite timing
>>> - * sensitive to ever see the 0 -> 1 transition with the callers locks
>>> - * held, so deadlocks might exist but are hard for lockdep to ever see.
>>> - * With this in mind, help lockdep learn about the potentially scary
>>> - * stuff that can happen inside the runtime_resume callback by acquiring
>>> - * a dummy lock (it doesn't protect anything and gets compiled out on
>>> - * non-debug builds).  Lockdep then only needs to see the
>>> - * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
>>> - * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
>>> - * For example if the (callers_locks) are ever grabbed in the
>>> - * runtime_resume callback, lockdep should give us a nice splat.
>>> - */
>>> -static void pm_runtime_lockdep_training(void)
>>> -{
>>> -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
>>> -	lock_map_release(&xe_pm_runtime_lockdep_map);
>>> -}
>>> -
>>>    /**
>>>     * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
>>>     * @xe: xe device instance
>>> @@ -436,7 +383,6 @@ void xe_pm_runtime_get(struct xe_device *xe)
>>>    	if (xe_pm_read_callback_task(xe) == current)
>>>    		return;
>>> -	pm_runtime_lockdep_training();
>>>    	pm_runtime_resume(xe->drm.dev);
>>>    }
>>> @@ -466,7 +412,6 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
>>>    	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
>>>    		return -ELOOP;
>>> -	pm_runtime_lockdep_training();
>>>    	return pm_runtime_get_sync(xe->drm.dev);
>>>    }


* Re: [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime
  2024-02-05 12:23   ` Matthew Auld
@ 2024-02-28 16:51     ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-28 16:51 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe, Matthew Brost

On Mon, Feb 05, 2024 at 12:23:41PM +0000, Matthew Auld wrote:
> On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > For the G2H and TLB invalidation cases, the outer-bounds
> > protection of the pm_runtime is not going to work, so
> > we need to have the inner protections here.
> > 
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_debugfs.c      |  3 ++
> >   drivers/gpu/drm/xe/xe_guc_ct.c       | 79 +++++++++++++---------------
> >   drivers/gpu/drm/xe/xe_guc_ct_types.h |  2 +
> >   3 files changed, 41 insertions(+), 43 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index 8abdf3c17e1d..b3669eac23c9 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -64,6 +64,9 @@ static int info(struct seq_file *m, void *data)
> >   			   xe_force_wake_ref(gt_to_fw(gt), XE_FW_GT));
> >   		drm_printf(&p, "gt%d engine_mask 0x%llx\n", id,
> >   			   gt->info.engine_mask);
> > +		drm_printf(&p, "gt%d g2h_outstanding %d g2h_pm_refs %d\n", id,
> > +			   gt->uc.guc.ct.g2h_outstanding,
> > +			   gt->uc.guc.ct.g2h_pm_refs);
> >   	}
> >   	xe_pm_runtime_put(xe);
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > index f3d356383ced..139cfe733661 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > @@ -287,16 +287,49 @@ static int guc_ct_control_toggle(struct xe_guc_ct *ct, bool enable)
> >   	return ret > 0 ? -EPROTO : ret;
> >   }
> > +static void guc_ct_g2h_outstanding_clear(struct xe_guc_ct *ct)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ct->g2h_pm_refs; i++)
> > +		xe_pm_runtime_put(ct_to_xe(ct));
> > +	ct->g2h_outstanding = 0;
> > +	ct->g2h_pm_refs = 0;
> > +}
> > +
> > +static void guc_ct_g2h_outstanding_dec(struct xe_guc_ct *ct)
> > +{
> > +	if (ct->g2h_pm_refs > 0)
> > +		xe_pm_runtime_put(ct_to_xe(ct));
> > +	ct->g2h_pm_refs--;
> > +	ct->g2h_outstanding--;
> > +}
> > +
> > +static void guc_ct_g2h_outstanding_add(struct xe_guc_ct *ct, int num_g2h)
> > +{
> > +	struct xe_device *xe = ct_to_xe(ct);
> > +	int i;
> > +
> > +	for (i = 0; i < num_g2h; i++) {
> > +		if (xe_pm_runtime_get_if_in_use(xe))
> > +			ct->g2h_pm_refs++;
> > +		else
> > +			drm_err(&xe->drm, "Failed to grab RPM ref for outstanding g2h\n");
> > +	}
> 
> I guess we could maybe drop g2h_pm_refs, and rather just track a single
> rpm ref for the entire ct when g2h_outstanding goes 0 -> 1, and likewise drop
> the rpm ref when g2h_outstanding reaches zero?

I thought about that, but I was afraid of ending up with an unbalanced
count in the case where get_if_in_use fails.
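
If we go that route, tracking whether the reference was actually taken
should keep it balanced even when get_if_in_use fails. Just a sketch on
top of this patch (g2h_has_pm_ref would be a new bool in struct
xe_guc_ct):

    static void guc_ct_g2h_outstanding_add(struct xe_guc_ct *ct, int num_g2h)
    {
            /* Single rpm ref for the whole CT, taken on the 0 -> 1 transition */
            if (num_g2h && !ct->g2h_outstanding)
                    ct->g2h_has_pm_ref =
                            xe_pm_runtime_get_if_in_use(ct_to_xe(ct));

            ct->g2h_outstanding += num_g2h;
    }

    static void guc_ct_g2h_outstanding_dec(struct xe_guc_ct *ct)
    {
            /* Drop the single ref on the 1 -> 0 transition, if we got it */
            if (!--ct->g2h_outstanding && ct->g2h_has_pm_ref) {
                    xe_pm_runtime_put(ct_to_xe(ct));
                    ct->g2h_has_pm_ref = false;
            }
    }

Both run under ct->fast_lock, like the counters do today, so the flag
wouldn't need its own protection.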

> 
> > +	ct->g2h_outstanding += num_g2h;
> > +}
> > +
> >   static void xe_guc_ct_set_state(struct xe_guc_ct *ct,
> >   				enum xe_guc_ct_state state)
> >   {
> >   	mutex_lock(&ct->lock);		/* Serialise dequeue_one_g2h() */
> >   	spin_lock_irq(&ct->fast_lock);	/* Serialise CT fast-path */
> > +
> >   	xe_gt_assert(ct_to_gt(ct), ct->g2h_outstanding == 0 ||
> >   		     state == XE_GUC_CT_STATE_STOPPED);
> > -	ct->g2h_outstanding = 0;
> > +	guc_ct_g2h_outstanding_clear(ct);
> >   	ct->state = state;
> >   	spin_unlock_irq(&ct->fast_lock);
> > @@ -428,7 +461,7 @@ static void __g2h_reserve_space(struct xe_guc_ct *ct, u32 g2h_len, u32 num_g2h)
> >   		lockdep_assert_held(&ct->fast_lock);
> >   		ct->ctbs.g2h.info.space -= g2h_len;
> > -		ct->g2h_outstanding += num_g2h;
> > +		guc_ct_g2h_outstanding_add(ct, num_g2h);
> >   	}
> >   }
> > @@ -439,7 +472,7 @@ static void __g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
> >   		  ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space);
> >   	ct->ctbs.g2h.info.space += g2h_len;
> > -	--ct->g2h_outstanding;
> > +	guc_ct_g2h_outstanding_dec(ct);
> >   }
> >   static void g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
> > @@ -1199,14 +1232,8 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
> >    */
> >   void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> >   {
> > -	struct xe_device *xe = ct_to_xe(ct);
> > -	bool ongoing;
> >   	int len;
> > -	ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
> > -	if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
> > -		return;
> > -
> >   	spin_lock(&ct->fast_lock);
> >   	do {
> >   		len = g2h_read(ct, ct->fast_msg, true);
> > @@ -1214,9 +1241,6 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> >   			g2h_fast_path(ct, ct->fast_msg, len);
> >   	} while (len > 0);
> >   	spin_unlock(&ct->fast_lock);
> > -
> > -	if (ongoing)
> > -		xe_device_mem_access_put(xe);
> >   }
> >   /* Returns less than zero on error, 0 on done, 1 on more available */
> > @@ -1247,36 +1271,8 @@ static int dequeue_one_g2h(struct xe_guc_ct *ct)
> >   static void g2h_worker_func(struct work_struct *w)
> >   {
> >   	struct xe_guc_ct *ct = container_of(w, struct xe_guc_ct, g2h_worker);
> > -	bool ongoing;
> >   	int ret;
> > -	/*
> > -	 * Normal users must always hold mem_access.ref around CT calls. However
> > -	 * during the runtime pm callbacks we rely on CT to talk to the GuC, but
> > -	 * at this stage we can't rely on mem_access.ref and even the
> > -	 * callback_task will be different than current.  For such cases we just
> > -	 * need to ensure we always process the responses from any blocking
> > -	 * ct_send requests or where we otherwise expect some response when
> > -	 * initiated from those callbacks (which will need to wait for the below
> > -	 * dequeue_one_g2h()).  The dequeue_one_g2h() will gracefully fail if
> > -	 * the device has suspended to the point that the CT communication has
> > -	 * been disabled.
> > -	 *
> > -	 * If we are inside the runtime pm callback, we can be the only task
> > -	 * still issuing CT requests (since that requires having the
> > -	 * mem_access.ref).  It seems like it might in theory be possible to
> > -	 * receive unsolicited events from the GuC just as we are
> > -	 * suspending-resuming, but those will currently anyway be lost when
> > -	 * eventually exiting from suspend, hence no need to wake up the device
> > -	 * here. If we ever need something stronger than get_if_ongoing() then
> > -	 * we need to be careful with blocking the pm callbacks from getting CT
> > -	 * responses, if the worker here is blocked on those callbacks
> > -	 * completing, creating a deadlock.
> > -	 */
> > -	ongoing = xe_device_mem_access_get_if_ongoing(ct_to_xe(ct));
> > -	if (!ongoing && xe_pm_read_callback_task(ct_to_xe(ct)) == NULL)
> > -		return;
> > -
> >   	do {
> >   		mutex_lock(&ct->lock);
> >   		ret = dequeue_one_g2h(ct);
> > @@ -1290,9 +1286,6 @@ static void g2h_worker_func(struct work_struct *w)
> >   			kick_reset(ct);
> >   		}
> >   	} while (ret == 1);
> > -
> > -	if (ongoing)
> > -		xe_device_mem_access_put(ct_to_xe(ct));
> >   }
> >   static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h b/drivers/gpu/drm/xe/xe_guc_ct_types.h
> > index d29144c9f20b..4220230e7be4 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
> > @@ -108,6 +108,8 @@ struct xe_guc_ct {
> >   	} ctbs;
> >   	/** @g2h_outstanding: number of outstanding G2H */
> >   	u32 g2h_outstanding;
> > +	/** @g2h_pm_refs: number of pm_refs for pending G2H */
> > +	u32 g2h_pm_refs;
> >   	/** @g2h_worker: worker to process G2H messages */
> >   	struct work_struct g2h_worker;
> >   	/** @state: CT state */


* Re: [RFC 19/34] drm/xe: Remove pm_runtime lockdep
  2024-02-20 17:48       ` Matthew Auld
@ 2024-02-28 16:53         ` Rodrigo Vivi
  0 siblings, 0 replies; 77+ messages in thread
From: Rodrigo Vivi @ 2024-02-28 16:53 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Tue, Feb 20, 2024 at 05:48:44PM +0000, Matthew Auld wrote:
> On 15/02/2024 22:47, Rodrigo Vivi wrote:
> > On Mon, Feb 05, 2024 at 11:54:45AM +0000, Matthew Auld wrote:
> > > On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > > > This lockdep map was initially designed for the mem_access,
> > > > where the mem_access needed to run the resume synchronously
> > > > from the inner bounds and count the references.
> > > > 
> > > > With the runtime pm moving to the outer bounds of the driver
> > > > and the mem_access replaced by pure rpm get/put
> > > > references, it is no longer needed and it is, as a matter
> > > 
> > > We are also calling it in workers like invalidation_fence_work_func(),
> > > xe_sched_process_msg_work() etc. It is still quite easy to deadlock with
> > > that.
> > > 
> > > > of fact, just splatting many false positives such as the following:
> > > 
> > > Yeah, calling invalidation_fence_work_func() directly in the caller's context
> > > is a false positive, since the ioctl holds the rpm ref. But what about calling
> > > it in the async case, where invalidation_fence_work_func() is only called when
> > > the in-fence signals? Are we sure that is safe? It should be simple to fix
> > > the false positive here, no? If we just delete all the annotations we get
> > > zero help from lockdep for the more complicated cases. Also IIRC lockdep
> > > doesn't show you any more splats once it has triggered once, so who knows what
> > > else is lurking and whether that is also a false positive?
> > > 
> > > > 
> > > >                  -> #1 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
> > > > [  384.778761]        xe_pm_runtime_get+0xa3/0x100 [xe]
> > > > [  384.783871]        invalidation_fence_work_func+0x7f/0x2b0 [xe]
> > > > [  384.789942]        invalidation_fence_init+0x8c2/0xce0 [xe]
> > > > [  384.795671]        __xe_pt_unbind_vma+0x4a7/0x1be0 [xe]
> > > > [  384.801050]        xe_vm_unbind+0x22f/0xc70 [xe]
> > > > [  384.805821]        __xe_vma_op_execute+0xc67/0x1af0 [xe]
> > > > [  384.811286]        xe_vm_bind_ioctl+0x3a36/0x66c0 [xe]
> > > > [  384.816579]        drm_ioctl_kernel+0x14a/0x2c0
> > > > [  384.821132]        drm_ioctl+0x4c6/0xab0
> > > > [  384.825073]        xe_drm_ioctl+0xa1/0xe0 [xe]
> > > > [  384.829651]        __x64_sys_ioctl+0x130/0x1a0
> > > > [  384.834115]        do_syscall_64+0x5c/0xe0
> > > > [  384.838232]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
> > > > [  384.843829]
> > > >                  -> #0 (reservation_ww_class_mutex){+.+.}-{4:4}:
> > > > [  384.850911]        __lock_acquire+0x3261/0x6330
> > > > [  384.855462]        lock_acquire+0x19b/0x4d0
> > > > [  384.859666]        __ww_mutex_lock.constprop.0+0x1d8/0x3500
> > > > [  384.865263]        ww_mutex_lock+0x38/0x150
> > > > [  384.869465]        xe_bo_lock+0x41/0x70 [xe]
> > > > [  384.873869]        xe_bo_evict_all+0x7ad/0xa40 [xe]
> > > > [  384.878883]        xe_pm_runtime_suspend+0x297/0x340 [xe]
> > > > [  384.884431]        xe_pci_runtime_suspend+0x3b/0x1e0 [xe]
> > > > [  384.889975]        pci_pm_runtime_suspend+0x168/0x540
> > > > [  384.895052]        __rpm_callback+0xa9/0x390
> > > > [  384.899343]        rpm_callback+0x1aa/0x210
> > > > [  384.903543]        rpm_suspend+0x2ea/0x14c0
> > > > [  384.907746]        pm_runtime_work+0x133/0x170
> > > > [  384.912213]        process_one_work+0x73b/0x1230
> > > > [  384.916853]        worker_thread+0x726/0x1320
> > > > [  384.921237]        kthread+0x2ee/0x3d0
> > > > [  384.925005]        ret_from_fork+0x2d/0x70
> > > > [  384.929120]        ret_from_fork_asm+0x1b/0x30
> > > > [  384.933585]
> > > >                  other info that might help us debug this:
> > > > 
> > > > [  384.941625]  Possible unsafe locking scenario:
> > > > 
> > > > [  384.947572]        CPU0                    CPU1
> > > > [  384.952123]        ----                    ----
> > > > [  384.956676]   lock(xe_pm_runtime_lockdep_map);
> > > > [  384.961140]                                lock(reservation_ww_class_mutex);
> > > > [  384.968220]                                lock(xe_pm_runtime_lockdep_map);
> > > > [  384.975214]   lock(reservation_ww_class_mutex);
> > > > [  384.979765]
> > > >                   *** DEADLOCK ***
> > > > 
> > > > As a matter of fact, there's actually a third lock that is not
> > > > in this picture:
> > > > spin_lock_irq(&dev->power.lock);
> > > > and INIT_WORK(&dev->power.work, pm_runtime_work);
> > > > 
> > > > The pm_callback_task will ensure that there are no recursive
> > > > calls of the resume function, and it will increase the
> > > > reference counter anyway.
> > > > 
> > > > Then, the pm_runtime workqueue and spin locks will prevent
> > > > any resume and suspend operations from happening in parallel
> > > > with other resume and suspend operations.
> > > > 
> > > > With that, the only thing that we are actually doing here is
> > > > wrongly training lockdep, basically saying that we will
> > > > acquire some locks on resume and on suspend concurrently,
> > > > entirely ignoring its serialization and protection.
> > > > 
> > > > The above scenario is simply not possible because there is
> > > > serialization with the spin_lock_irq(&dev->power.lock)
> > > > before each operation. However, we are telling lockdep
> > > > that the lock(xe_pm_runtime_lockdep_map) occurs before
> > > > &dev->power.lock, and lockdep is not capable of seeing
> > > > that other protection.
> > > 
> > > Can you share some more info here? AFAIK dev->power.lock is an RPM core lock
> > > that protects internal state like transitioning between ACTIVE, SUSPENDING,
> > > RESUMING etc. It is never held when calling any of our rpm callbacks, so it
> > > should never factor in to xe_pm_runtime_lockdep_map.
> > > 
> > > The overall thing should look like this:
> > > 
> > > xe_rpm_get:
> > >      lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > >      lock_map_release(&xe_pm_runtime_lockdep_map);
> > >      .....
> > >      RPM core grabs dev->power.lock
> > > 
> > > rpm resume callback: (RPM has dropped dev->power.lock)
> > 
> >                         ^ this is exactly the problem.
> > RPM doesn't drop dev->power.lock while the callback
> > is called. It relaxes while waiting for another transaction
> > to finish, but it is held by the time it gets to the callback.
> 
> AFAICT it is never held from the context that is calling the resume or
> suspend callback (spinlock would be very limiting otherwise), and it ofc
> can't be held while sleeping, like when "waiting for other transactions".
> Note that the "waiting for other transactions" thing is exactly what lockdep
> doesn't understand by itself (there are no annotations for waitqueue AFAIK),
> which is part of what the xe_pm annotations here help with...

Hmm, you are absolutely right.
rpm_callback gives the impression that it is called in between locked
spinlocks, but if that were indeed the case, our current mutex_lock
in the get function would fail badly.

Let's try to move forward without this patch and work on the false
positives separately.
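
For the invalidation fence false positive specifically, maybe we could
avoid entering the annotated resume path at all from contexts that
should already be awake. A rough sketch with the worker from the splat
above (assuming a noresume-style get helper; purely illustrative):

    static void invalidation_fence_work_func(struct work_struct *w)
    {
            struct invalidation_fence *ifence =
                    container_of(w, struct invalidation_fence, work);
            struct xe_device *xe = gt_to_xe(ifence->gt);

            /*
             * In the direct-call path the bind ioctl already holds a rpm
             * ref, so just hold a reference without resuming, and without
             * the lockdep training. The async (in-fence) path would still
             * need auditing, as you pointed out.
             */
            xe_pm_runtime_get_noresume(xe);
            xe_gt_tlb_invalidation_vma(ifence->gt, &ifence->base, ifence->vma);
            xe_pm_runtime_put(xe);
    }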

> 
> > 
> > >      lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > >      ....do resume stuff
> > >      lock_map_release(&xe_pm_runtime_lockdep_map);
> > > 
> > > rpm suspend callback: (RPM has dropped dev->power.lock)
> > >      lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > >      .....do suspend stuff
> > >      lock_map_release(&xe_pm_runtime_lockdep_map);
> > > 
> > > > 
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > ---
> > > >    drivers/gpu/drm/xe/xe_pm.c | 55 --------------------------------------
> > > >    1 file changed, 55 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > > index 86bf225dba02..f49e449d9fb7 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > @@ -67,12 +67,6 @@
> > > >     */
> > > > -#ifdef CONFIG_LOCKDEP
> > > > -struct lockdep_map xe_pm_runtime_lockdep_map = {
> > > > -	.name = "xe_pm_runtime_lockdep_map"
> > > > -};
> > > > -#endif
> > > > -
> > > >    /**
> > > >     * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
> > > >     * @xe: xe device instance
> > > > @@ -291,29 +285,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> > > >    	/* Disable access_ongoing asserts and prevent recursive pm calls */
> > > >    	xe_pm_write_callback_task(xe, current);
> > > > -	/*
> > > > -	 * The actual xe_pm_runtime_put() is always async underneath, so
> > > > -	 * exactly where that is called should makes no difference to us. However
> > > > -	 * we still need to be very careful with the locks that this callback
> > > > -	 * acquires and the locks that are acquired and held by any callers of
> > > > -	 * xe_runtime_pm_get(). We already have the matching annotation
> > > > -	 * on that side, but we also need it here. For example lockdep should be
> > > > -	 * able to tell us if the following scenario is in theory possible:
> > > > -	 *
> > > > -	 * CPU0                          | CPU1 (kworker)
> > > > -	 * lock(A)                       |
> > > > -	 *                               | xe_pm_runtime_suspend()
> > > > -	 *                               |      lock(A)
> > > > -	 * xe_pm_runtime_get()           |
> > > > -	 *
> > > > -	 * This will clearly deadlock since rpm core needs to wait for
> > > > -	 * xe_pm_runtime_suspend() to complete, but here we are holding lock(A)
> > > > -	 * on CPU0 which prevents CPU1 making forward progress.  With the
> > > > -	 * annotation here and in xe_pm_runtime_get() lockdep will see
> > > > -	 * the potential lock inversion and give us a nice splat.
> > > > -	 */
> > > > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > > > -
> > > >    	/*
> > > >    	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
> > > >    	 * also checks and delets bo entry from user fault list.
> > > > @@ -341,7 +312,6 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> > > >    	if (xe->d3cold.allowed)
> > > >    		xe_display_pm_runtime_suspend(xe);
> > > >    out:
> > > > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > > >    	xe_pm_write_callback_task(xe, NULL);
> > > >    	return err;
> > > >    }
> > > > @@ -361,8 +331,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> > > >    	/* Disable access_ongoing asserts and prevent recursive pm calls */
> > > >    	xe_pm_write_callback_task(xe, current);
> > > > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > > > -
> > > >    	/*
> > > >    	 * It can be possible that xe has allowed d3cold but other pcie devices
> > > >    	 * in gfx card soc would have blocked d3cold, therefore card has not
> > > > @@ -400,31 +368,10 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> > > >    			goto out;
> > > >    	}
> > > >    out:
> > > > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > > >    	xe_pm_write_callback_task(xe, NULL);
> > > >    	return err;
> > > >    }
> > > > -/*
> > > > - * For places where resume is synchronous it can be quite easy to deadlock
> > > > - * if we are not careful. Also in practice it might be quite timing
> > > > - * sensitive to ever see the 0 -> 1 transition with the callers locks
> > > > - * held, so deadlocks might exist but are hard for lockdep to ever see.
> > > > - * With this in mind, help lockdep learn about the potentially scary
> > > > - * stuff that can happen inside the runtime_resume callback by acquiring
> > > > - * a dummy lock (it doesn't protect anything and gets compiled out on
> > > > - * non-debug builds).  Lockdep then only needs to see the
> > > > - * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
> > > > - * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
> > > > - * For example if the (callers_locks) are ever grabbed in the
> > > > - * runtime_resume callback, lockdep should give us a nice splat.
> > > > - */
> > > > -static void pm_runtime_lockdep_training(void)
> > > > -{
> > > > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > > > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > > > -}
> > > > -
> > > >    /**
> > > >     * xe_pm_runtime_get - Get a runtime_pm reference and resume synchronously
> > > >     * @xe: xe device instance
> > > > @@ -436,7 +383,6 @@ void xe_pm_runtime_get(struct xe_device *xe)
> > > >    	if (xe_pm_read_callback_task(xe) == current)
> > > >    		return;
> > > > -	pm_runtime_lockdep_training();
> > > >    	pm_runtime_resume(xe->drm.dev);
> > > >    }
> > > > @@ -466,7 +412,6 @@ int xe_pm_runtime_get_sync(struct xe_device *xe)
> > > >    	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
> > > >    		return -ELOOP;
> > > > -	pm_runtime_lockdep_training();
> > > >    	return pm_runtime_get_sync(xe->drm.dev);
> > > >    }


end of thread, newest message: 2024-02-28 16:54 UTC

Thread overview: 77+ messages
2024-01-26 20:30 [RFC 00/34] Kill mem_access v2 Rodrigo Vivi
2024-01-26 20:30 ` [RFC 01/34] Revert "drm/xe/uc: Store firmware binary in system-memory backed BO" Rodrigo Vivi
2024-01-26 20:30 ` [RFC 02/34] drm/xe: Document Xe PM component Rodrigo Vivi
2024-01-29 10:38   ` Francois Dugast
2024-01-26 20:30 ` [RFC 03/34] drm/xe: Fix display runtime_pm handling Rodrigo Vivi
2024-02-05  9:11   ` Matthew Auld
2024-02-14 18:05     ` Rodrigo Vivi
2024-02-15  9:30       ` Matthew Auld
2024-02-15 22:19         ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 04/34] drm/xe: Create a xe_pm_runtime_resume_and_get variant for display Rodrigo Vivi
2024-01-26 20:30 ` [RFC 05/34] drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion Rodrigo Vivi
2024-01-26 20:30 ` [RFC 06/34] drm/xe: Prepare display for D3Cold Rodrigo Vivi
2024-01-26 20:30 ` [RFC 07/34] drm/xe: Convert mem_access assertion towards the runtime_pm state Rodrigo Vivi
2024-02-05  9:55   ` Matthew Auld
2024-02-14 18:15     ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 08/34] drm/xe: Runtime PM wake on every IOCTL Rodrigo Vivi
2024-02-05  9:39   ` Matthew Auld
2024-01-26 20:30 ` [RFC 09/34] drm/xe: Convert kunit tests from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05  9:57   ` Matthew Auld
2024-01-26 20:30 ` [RFC 10/34] drm/xe: Convert scheduler towards direct pm_runtime Rodrigo Vivi
2024-02-05 10:46   ` Matthew Auld
2024-01-26 20:30 ` [RFC 11/34] drm/xe: Runtime PM wake on every sysfs call Rodrigo Vivi
2024-02-05 10:55   ` Matthew Auld
2024-02-14 18:48     ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 12/34] drm/xe: Ensure device is awake before removing it Rodrigo Vivi
2024-02-05 11:05   ` Matthew Auld
2024-02-14 18:51     ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 13/34] drm/xe: Remove mem_access from guc_pc calls Rodrigo Vivi
2024-02-05 11:08   ` Matthew Auld
2024-01-26 20:30 ` [RFC 14/34] drm/xe: Runtime PM wake on every debugfs call Rodrigo Vivi
2024-02-05 11:10   ` Matthew Auld
2024-02-14 18:57     ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 15/34] drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls Rodrigo Vivi
2024-02-05 11:15   ` Matthew Auld
2024-01-26 20:30 ` [RFC 16/34] drm/xe: Removing extra mem_access protection from runtime pm Rodrigo Vivi
2024-02-05 11:23   ` Matthew Auld
2024-01-26 20:30 ` [RFC 17/34] drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls Rodrigo Vivi
2024-02-05 11:25   ` Matthew Auld
2024-01-26 20:30 ` [RFC 18/34] drm/xe: Move lockdep protection from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 11:31   ` Matthew Auld
2024-01-26 20:30 ` [RFC 19/34] drm/xe: Remove pm_runtime lockdep Rodrigo Vivi
2024-02-05 11:54   ` Matthew Auld
2024-02-15 22:47     ` Rodrigo Vivi
2024-02-20 17:48       ` Matthew Auld
2024-02-28 16:53         ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 20/34] drm/xe: Stop checking for power_lost on D3Cold Rodrigo Vivi
2024-01-26 20:30 ` [RFC 21/34] drm/xe: Convert GuC CT paths from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 12:23   ` Matthew Auld
2024-02-28 16:51     ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 22/34] drm/xe: Keep D0 for the entire duration of a LR VM Rodrigo Vivi
2024-01-26 20:30 ` [RFC 23/34] drm/xe: Ensure D0 on TLB invalidation Rodrigo Vivi
2024-02-05 12:41   ` Matthew Auld
2024-01-26 20:30 ` [RFC 24/34] drm/xe: Remove useless mem_access protection for query ioctls Rodrigo Vivi
2024-02-05 12:43   ` Matthew Auld
2024-01-26 20:30 ` [RFC 25/34] drm/xe: Convert gsc_work from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 13:11   ` Matthew Auld
2024-01-26 20:30 ` [RFC 26/34] drm/xe: VMs don't need the mem_access protection anymore Rodrigo Vivi
2024-02-05 13:29   ` Matthew Auld
2024-02-15 22:37     ` Rodrigo Vivi
2024-01-26 20:30 ` [RFC 27/34] drm/xe: Remove useless mem_access during probe Rodrigo Vivi
2024-02-05 13:18   ` Matthew Auld
2024-01-26 20:30 ` [RFC 28/34] drm/xe: Remove mem_access from suspend and resume functions Rodrigo Vivi
2024-02-05 13:30   ` Matthew Auld
2024-01-26 20:30 ` [RFC 29/34] drm/xe: Convert gt_reset from mem_access to xe_pm_runtime Rodrigo Vivi
2024-02-05 13:33   ` Matthew Auld
2024-01-26 20:30 ` [RFC 30/34] drm/xe: Remove useless mem_access on PAT dumps Rodrigo Vivi
2024-02-05 13:34   ` Matthew Auld
2024-01-26 20:30 ` [RFC 31/34] drm/xe: Remove inner mem_access protections Rodrigo Vivi
2024-01-26 20:30 ` [RFC 32/34] drm/xe: Kill xe_device_mem_access_{get*,put} Rodrigo Vivi
2024-01-26 20:30 ` [RFC 33/34] drm/xe: Remove unused runtime pm helper Rodrigo Vivi
2024-01-26 20:30 ` [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization Rodrigo Vivi
2024-01-29 12:12   ` Matthew Auld
2024-01-29 19:01     ` Vivi, Rodrigo
2024-01-30 15:01       ` Gupta, Anshuman
2024-01-26 20:39 ` ✓ CI.Patch_applied: success for Kill mem_access v2 Patchwork
2024-01-26 20:40 ` ✗ CI.checkpatch: warning " Patchwork
2024-01-26 20:40 ` ✗ CI.KUnit: failure " Patchwork
