dri-devel.lists.freedesktop.org archive mirror
* [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28
@ 2023-10-28 15:59 Stanislaw Gruszka
  2023-10-28 15:59 ` [PATCH 1/8] accel/ivpu/40xx: Allow to change profiling frequency Stanislaw Gruszka
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel; +Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz

Various driver updates:
 - MMU page tables handling optimizations
 - Rebrand to NPU
 - FW profiling frequency knob
 - job done thread suspend handling

This is based on top of previous update:
https://lore.kernel.org/dri-devel/20231028133415.1169975-1-stanislaw.gruszka@linux.intel.com/

Jacek Lawrynowicz (2):
  accel/ivpu: Simplify MMU SYNC command
  accel/ivpu: Rename VPU to NPU in product strings

Karol Wachowski (2):
  accel/ivpu: Print CMDQ errors after consumer timeout
  accel/ivpu: Make DMA allocations for MMU600 write combined

Krystian Pradzynski (1):
  accel/ivpu/40xx: Allow to change profiling frequency

Stanislaw Gruszka (3):
  accel/ivpu: Assure device is off if power up sequence fail
  accel/ivpu: Stop job_done_thread on suspend
  accel/ivpu: Abort pending rx ipc on reset

 drivers/accel/ivpu/Kconfig            |   9 +-
 drivers/accel/ivpu/ivpu_debugfs.c     |  29 +++++++
 drivers/accel/ivpu/ivpu_drv.c         |   4 +-
 drivers/accel/ivpu/ivpu_drv.h         |   2 +-
 drivers/accel/ivpu/ivpu_fw.c          |   7 ++
 drivers/accel/ivpu/ivpu_hw.h          |  12 +++
 drivers/accel/ivpu/ivpu_hw_37xx.c     |  13 +++
 drivers/accel/ivpu/ivpu_hw_40xx.c     |  15 ++++
 drivers/accel/ivpu/ivpu_ipc.c         |  35 +++++++-
 drivers/accel/ivpu/ivpu_ipc.h         |   3 +-
 drivers/accel/ivpu/ivpu_job.c         |  21 ++++-
 drivers/accel/ivpu/ivpu_job.h         |   2 +
 drivers/accel/ivpu/ivpu_mmu.c         |  39 +++++++--
 drivers/accel/ivpu/ivpu_mmu_context.c | 115 ++++++++++++++------------
 drivers/accel/ivpu/ivpu_pm.c          |  30 ++++---
 15 files changed, 249 insertions(+), 87 deletions(-)

-- 
2.25.1



* [PATCH 1/8] accel/ivpu/40xx: Allow to change profiling frequency
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 14:56   ` Jeffrey Hugo
  2023-10-28 15:59 ` [PATCH 2/8] accel/ivpu: Assure device is off if power up sequence fail Stanislaw Gruszka
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz,
	Krystian Pradzynski

From: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>

The profiling frequency is a debug firmware feature. It switches the
default clock to a higher-resolution one for fine-grained and more
accurate firmware task profiling. It is already configured during boot
of VPU4.

Add a debugfs knob and per-HW-generation helpers that allow changing it.
For VPU 37xx the implementation is empty, as the profiling frequency can
only be changed on VPU4 or newer.
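
For readers following the thread, the per-generation dispatch described
above can be sketched in plain userspace C. This is only an illustration
under stated assumptions: every demo_* name is hypothetical, and the
DEMO_FREQ_DEFAULT and DEMO_FREQ_HIGH values are placeholders, since the
driver's PLL_PROFILING_FREQ_* constants are not shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real constants live in the driver. */
#define DEMO_FREQ_FIXED   38400000u  /* mirrors PLL_PROF_CLK_FREQ (38400 * 1000) */
#define DEMO_FREQ_DEFAULT 38400000u  /* hypothetical placeholder */
#define DEMO_FREQ_HIGH    400000000u /* hypothetical placeholder */

struct demo_dev;

/* Per-generation callbacks, modeled on the two ops the patch adds. */
struct demo_hw_ops {
	uint32_t (*profiling_freq_get)(struct demo_dev *dev);
	void (*profiling_freq_drive)(struct demo_dev *dev, bool enable);
};

struct demo_dev {
	const struct demo_hw_ops *ops;
	uint32_t profiling_freq;
};

/* Older generation: fixed frequency, so the knob is a no-op. */
static uint32_t demo_fixed_get(struct demo_dev *dev)
{
	(void)dev;
	return DEMO_FREQ_FIXED;
}

static void demo_fixed_drive(struct demo_dev *dev, bool enable)
{
	(void)dev;
	(void)enable;
}

/* Newer generation: the knob selects the value reported at next boot. */
static uint32_t demo_switchable_get(struct demo_dev *dev)
{
	return dev->profiling_freq;
}

static void demo_switchable_drive(struct demo_dev *dev, bool enable)
{
	dev->profiling_freq = enable ? DEMO_FREQ_HIGH : DEMO_FREQ_DEFAULT;
}

const struct demo_hw_ops demo_fixed_ops = {
	.profiling_freq_get = demo_fixed_get,
	.profiling_freq_drive = demo_fixed_drive,
};

const struct demo_hw_ops demo_switchable_ops = {
	.profiling_freq_get = demo_switchable_get,
	.profiling_freq_drive = demo_switchable_drive,
};

/* Generation-agnostic wrappers: callers never check the HW generation. */
uint32_t demo_profiling_freq_get(struct demo_dev *dev)
{
	return dev->ops->profiling_freq_get(dev);
}

void demo_profiling_freq_drive(struct demo_dev *dev, bool enable)
{
	dev->ops->profiling_freq_drive(dev, enable);
}
```

In the sketch, only the stored value changes when the knob is written;
it would take effect the next time the value is read back for the boot
parameters, which is presumably why the debugfs handler in the patch
schedules a recovery after driving the knob.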

Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_debugfs.c | 29 +++++++++++++++++++++++++++++
 drivers/accel/ivpu/ivpu_fw.c      |  7 +++++++
 drivers/accel/ivpu/ivpu_hw.h      | 12 ++++++++++++
 drivers/accel/ivpu/ivpu_hw_37xx.c | 13 +++++++++++++
 drivers/accel/ivpu/ivpu_hw_40xx.c | 15 +++++++++++++++
 5 files changed, 76 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c b/drivers/accel/ivpu/ivpu_debugfs.c
index 6e0d56823024..19035230563d 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -14,6 +14,7 @@
 #include "ivpu_fw.h"
 #include "ivpu_fw_log.h"
 #include "ivpu_gem.h"
+#include "ivpu_hw.h"
 #include "ivpu_jsm_msg.h"
 #include "ivpu_pm.h"
 
@@ -176,6 +177,30 @@ static const struct file_operations fw_log_fops = {
 	.release = single_release,
 };
 
+static ssize_t
+fw_profiling_freq_fops_write(struct file *file, const char __user *user_buf,
+			     size_t size, loff_t *pos)
+{
+	struct ivpu_device *vdev = file->private_data;
+	bool enable;
+	int ret;
+
+	ret = kstrtobool_from_user(user_buf, size, &enable);
+	if (ret < 0)
+		return ret;
+
+	ivpu_hw_profiling_freq_drive(vdev, enable);
+	ivpu_pm_schedule_recovery(vdev);
+
+	return size;
+}
+
+static const struct file_operations fw_profiling_freq_fops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.write = fw_profiling_freq_fops_write,
+};
+
 static ssize_t
 fw_trace_destination_mask_fops_write(struct file *file, const char __user *user_buf,
 				     size_t size, loff_t *pos)
@@ -319,4 +344,8 @@ void ivpu_debugfs_init(struct ivpu_device *vdev)
 
 	debugfs_create_file("reset_engine", 0200, debugfs_root, vdev,
 			    &ivpu_reset_engine_fops);
+
+	if (ivpu_hw_gen(vdev) >= IVPU_HW_40XX)
+		debugfs_create_file("fw_profiling_freq_drive", 0200,
+				    debugfs_root, vdev, &fw_profiling_freq_fops);
 }
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 4a21be3a0c59..3fd74cd4205f 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -498,6 +498,13 @@ void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, struct vpu_boot_params
 	boot_params->vpu_id = to_pci_dev(vdev->drm.dev)->bus->number;
 	boot_params->frequency = ivpu_hw_reg_pll_freq_get(vdev);
 
+	/*
+	 * This param is a debug firmware feature.  It switches default clock
+	 * to higher resolution one for fine-grained and more accurate firmware
+	 * task profiling.
+	 */
+	boot_params->perf_clk_frequency = ivpu_hw_profiling_freq_get(vdev);
+
 	/*
 	 * Uncached region of VPU address space, covers IPC buffers, job queues
 	 * and log buffers, programmable to L2$ Uncached by VPU MTRR
diff --git a/drivers/accel/ivpu/ivpu_hw.h b/drivers/accel/ivpu/ivpu_hw.h
index b7694b1cbc02..aa52e5c29a65 100644
--- a/drivers/accel/ivpu/ivpu_hw.h
+++ b/drivers/accel/ivpu/ivpu_hw.h
@@ -17,6 +17,8 @@ struct ivpu_hw_ops {
 	int (*wait_for_idle)(struct ivpu_device *vdev);
 	void (*wdt_disable)(struct ivpu_device *vdev);
 	void (*diagnose_failure)(struct ivpu_device *vdev);
+	u32 (*profiling_freq_get)(struct ivpu_device *vdev);
+	void (*profiling_freq_drive)(struct ivpu_device *vdev, bool enable);
 	u32 (*reg_pll_freq_get)(struct ivpu_device *vdev);
 	u32 (*reg_telemetry_offset_get)(struct ivpu_device *vdev);
 	u32 (*reg_telemetry_size_get)(struct ivpu_device *vdev);
@@ -104,6 +106,16 @@ static inline void ivpu_hw_wdt_disable(struct ivpu_device *vdev)
 	vdev->hw->ops->wdt_disable(vdev);
 };
 
+static inline u32 ivpu_hw_profiling_freq_get(struct ivpu_device *vdev)
+{
+	return vdev->hw->ops->profiling_freq_get(vdev);
+};
+
+static inline void ivpu_hw_profiling_freq_drive(struct ivpu_device *vdev, bool enable)
+{
+	return vdev->hw->ops->profiling_freq_drive(vdev, enable);
+};
+
 /* Register indirect accesses */
 static inline u32 ivpu_hw_reg_pll_freq_get(struct ivpu_device *vdev)
 {
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 1c8c5715095b..81f81046d39a 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -29,6 +29,7 @@
 
 #define PLL_REF_CLK_FREQ	     (50 * 1000000)
 #define PLL_SIMULATION_FREQ	     (10 * 1000000)
+#define PLL_PROF_CLK_FREQ	     (38400 * 1000)
 #define PLL_DEFAULT_EPP_VALUE	     0x80
 
 #define TIM_SAFE_ENABLE		     0xf1d0dead
@@ -769,6 +770,16 @@ static void ivpu_hw_37xx_wdt_disable(struct ivpu_device *vdev)
 	REGV_WR32(VPU_37XX_CPU_SS_TIM_GEN_CONFIG, val);
 }
 
+static u32 ivpu_hw_37xx_profiling_freq_get(struct ivpu_device *vdev)
+{
+	return PLL_PROF_CLK_FREQ;
+}
+
+static void ivpu_hw_37xx_profiling_freq_drive(struct ivpu_device *vdev, bool enable)
+{
+	/* Profiling freq - is a debug feature. Unavailable on VPU 37XX. */
+}
+
 static u32 ivpu_hw_37xx_pll_to_freq(u32 ratio, u32 config)
 {
 	u32 pll_clock = PLL_REF_CLK_FREQ * ratio;
@@ -1012,6 +1023,8 @@ const struct ivpu_hw_ops ivpu_hw_37xx_ops = {
 	.boot_fw = ivpu_hw_37xx_boot_fw,
 	.wdt_disable = ivpu_hw_37xx_wdt_disable,
 	.diagnose_failure = ivpu_hw_37xx_diagnose_failure,
+	.profiling_freq_get = ivpu_hw_37xx_profiling_freq_get,
+	.profiling_freq_drive = ivpu_hw_37xx_profiling_freq_drive,
 	.reg_pll_freq_get = ivpu_hw_37xx_reg_pll_freq_get,
 	.reg_telemetry_offset_get = ivpu_hw_37xx_reg_telemetry_offset_get,
 	.reg_telemetry_size_get = ivpu_hw_37xx_reg_telemetry_size_get,
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 6a9672f650d1..a779b905f8b1 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -931,6 +931,19 @@ static void ivpu_hw_40xx_wdt_disable(struct ivpu_device *vdev)
 	REGV_WR32(VPU_40XX_CPU_SS_TIM_GEN_CONFIG, val);
 }
 
+static u32 ivpu_hw_40xx_profiling_freq_get(struct ivpu_device *vdev)
+{
+	return vdev->hw->pll.profiling_freq;
+}
+
+static void ivpu_hw_40xx_profiling_freq_drive(struct ivpu_device *vdev, bool enable)
+{
+	if (enable)
+		vdev->hw->pll.profiling_freq = PLL_PROFILING_FREQ_HIGH;
+	else
+		vdev->hw->pll.profiling_freq = PLL_PROFILING_FREQ_DEFAULT;
+}
+
 /* Register indirect accesses */
 static u32 ivpu_hw_40xx_reg_pll_freq_get(struct ivpu_device *vdev)
 {
@@ -1182,6 +1195,8 @@ const struct ivpu_hw_ops ivpu_hw_40xx_ops = {
 	.boot_fw = ivpu_hw_40xx_boot_fw,
 	.wdt_disable = ivpu_hw_40xx_wdt_disable,
 	.diagnose_failure = ivpu_hw_40xx_diagnose_failure,
+	.profiling_freq_get = ivpu_hw_40xx_profiling_freq_get,
+	.profiling_freq_drive = ivpu_hw_40xx_profiling_freq_drive,
 	.reg_pll_freq_get = ivpu_hw_40xx_reg_pll_freq_get,
 	.reg_telemetry_offset_get = ivpu_hw_40xx_reg_telemetry_offset_get,
 	.reg_telemetry_size_get = ivpu_hw_40xx_reg_telemetry_size_get,
-- 
2.25.1



* [PATCH 2/8] accel/ivpu: Assure device is off if power up sequence fail
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
  2023-10-28 15:59 ` [PATCH 1/8] accel/ivpu/40xx: Allow to change profiling frequency Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 14:59   ` Jeffrey Hugo
  2023-10-28 15:59 ` [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend Stanislaw Gruszka
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz,
	Karol Wachowski

Do not leave the device half-enabled if a failure occurs somewhere in
the power-up sequence. Fix the device init and resume paths.
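
The unwind pattern this relies on can be sketched in userspace C as
follows. It is a simplified model, not the driver code: the demo_* names
and the fail-injection enum are hypothetical, and the cold-boot retry
from ivpu_resume() is left out.

```c
#include <stdbool.h>

/* Hypothetical device state tracked by the sketch. */
struct demo_state {
	bool powered;
	bool mmu_on;
};

/* Which step should fail; used only to exercise the error paths. */
enum demo_fail {
	DEMO_FAIL_NONE,
	DEMO_FAIL_POWER_UP,
	DEMO_FAIL_MMU,
	DEMO_FAIL_BOOT,
};

static int demo_power_up(struct demo_state *s, enum demo_fail fail)
{
	/* A failed power-up may still leave the device partly on. */
	s->powered = true;
	return fail == DEMO_FAIL_POWER_UP ? -1 : 0;
}

static int demo_mmu_enable(struct demo_state *s, enum demo_fail fail)
{
	if (fail == DEMO_FAIL_MMU)
		return -1;
	s->mmu_on = true;
	return 0;
}

static int demo_boot(enum demo_fail fail)
{
	return fail == DEMO_FAIL_BOOT ? -1 : 0;
}

/* Returns 0 on success; on any failure the device ends up fully off. */
int demo_resume(struct demo_state *s, enum demo_fail fail)
{
	int ret;

	ret = demo_power_up(s, fail);
	if (ret)
		goto err_power_down;

	ret = demo_mmu_enable(s, fail);
	if (ret)
		goto err_power_down;

	ret = demo_boot(fail);
	if (ret)
		goto err_mmu_disable;

	return 0;

err_mmu_disable:
	s->mmu_on = false;
err_power_down:
	s->powered = false;
	return ret;
}
```

Each label undoes one step and falls through to the next, so a failure
at any point leaves both flags cleared, including a failure in the
power-up step itself.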

Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_drv.c |  2 +-
 drivers/accel/ivpu/ivpu_pm.c  | 30 +++++++++++++++++-------------
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 39bac45d88b5..064cabef41bb 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -543,7 +543,7 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
 	/* Power up early so the rest of init code can access VPU registers */
 	ret = ivpu_hw_power_up(vdev);
 	if (ret)
-		goto err_xa_destroy;
+		goto err_power_down;
 
 	ret = ivpu_mmu_global_context_init(vdev);
 	if (ret)
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 74688cc57583..7d17f7ea4949 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -70,27 +70,31 @@ static int ivpu_resume(struct ivpu_device *vdev)
 	ret = ivpu_hw_power_up(vdev);
 	if (ret) {
 		ivpu_err(vdev, "Failed to power up HW: %d\n", ret);
-		return ret;
+		goto err_power_down;
 	}
 
 	ret = ivpu_mmu_enable(vdev);
 	if (ret) {
 		ivpu_err(vdev, "Failed to resume MMU: %d\n", ret);
-		ivpu_hw_power_down(vdev);
-		return ret;
+		goto err_power_down;
 	}
 
 	ret = ivpu_boot(vdev);
-	if (ret) {
-		ivpu_mmu_disable(vdev);
-		ivpu_hw_power_down(vdev);
-		if (!ivpu_fw_is_cold_boot(vdev)) {
-			ivpu_warn(vdev, "Failed to resume the FW: %d. Retrying cold boot..\n", ret);
-			ivpu_pm_prepare_cold_boot(vdev);
-			goto retry;
-		} else {
-			ivpu_err(vdev, "Failed to resume the FW: %d\n", ret);
-		}
+	if (ret)
+		goto err_mmu_disable;
+
+	return 0;
+
+err_mmu_disable:
+	ivpu_mmu_disable(vdev);
+err_power_down:
+	ivpu_hw_power_down(vdev);
+
+	if (!ivpu_fw_is_cold_boot(vdev)) {
+		ivpu_pm_prepare_cold_boot(vdev);
+		goto retry;
+	} else {
+		ivpu_err(vdev, "Failed to resume the FW: %d\n", ret);
 	}
 
 	return ret;
-- 
2.25.1



* [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
  2023-10-28 15:59 ` [PATCH 1/8] accel/ivpu/40xx: Allow to change profiling frequency Stanislaw Gruszka
  2023-10-28 15:59 ` [PATCH 2/8] accel/ivpu: Assure device is off if power up sequence fail Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 15:03   ` Jeffrey Hugo
  2023-11-13  7:39   ` Daniel Vetter
  2023-10-28 15:59 ` [PATCH 4/8] accel/ivpu: Abort pending rx ipc on reset Stanislaw Gruszka
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz,
	Karol Wachowski

Stop the job_done thread when going to suspend. Use kthread_park()
instead of kthread_stop() to avoid memory allocation and potential
failure on resume.

Use a separate function as the thread wake-up condition. Use a spin lock
to ensure rx_msg_list is properly protected against concurrent access.

Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_drv.c |  2 ++
 drivers/accel/ivpu/ivpu_ipc.c | 17 +++++++++++++++--
 drivers/accel/ivpu/ivpu_job.c | 20 ++++++++++++++++----
 drivers/accel/ivpu/ivpu_job.h |  2 ++
 4 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 064cabef41bb..60277ff6af69 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -378,6 +378,7 @@ int ivpu_boot(struct ivpu_device *vdev)
 	enable_irq(vdev->irq);
 	ivpu_hw_irq_enable(vdev);
 	ivpu_ipc_enable(vdev);
+	ivpu_job_done_thread_enable(vdev);
 	return 0;
 }
 
@@ -389,6 +390,7 @@ int ivpu_shutdown(struct ivpu_device *vdev)
 	disable_irq(vdev->irq);
 	ivpu_ipc_disable(vdev);
 	ivpu_mmu_disable(vdev);
+	ivpu_job_done_thread_disable(vdev);
 
 	ret = ivpu_hw_power_down(vdev);
 	if (ret)
diff --git a/drivers/accel/ivpu/ivpu_ipc.c b/drivers/accel/ivpu/ivpu_ipc.c
index d069d1e1f91d..270caef789bf 100644
--- a/drivers/accel/ivpu/ivpu_ipc.c
+++ b/drivers/accel/ivpu/ivpu_ipc.c
@@ -202,6 +202,20 @@ ivpu_ipc_send(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, struct v
 	return ret;
 }
 
+static int ivpu_ipc_rx_need_wakeup(struct ivpu_ipc_consumer *cons)
+{
+	int ret = 0;
+
+	if (IS_KTHREAD())
+		ret |= (kthread_should_stop() || kthread_should_park());
+
+	spin_lock_irq(&cons->rx_msg_lock);
+	ret |= !list_empty(&cons->rx_msg_list);
+	spin_unlock_irq(&cons->rx_msg_lock);
+
+	return ret;
+}
+
 int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
 		     struct ivpu_ipc_hdr *ipc_buf,
 		     struct vpu_jsm_msg *ipc_payload, unsigned long timeout_ms)
@@ -211,8 +225,7 @@ int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
 	int wait_ret, ret = 0;
 
 	wait_ret = wait_event_interruptible_timeout(cons->rx_msg_wq,
-						    (IS_KTHREAD() && kthread_should_stop()) ||
-						    !list_empty(&cons->rx_msg_list),
+						    ivpu_ipc_rx_need_wakeup(cons),
 						    msecs_to_jiffies(timeout_ms));
 
 	if (IS_KTHREAD() && kthread_should_stop())
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 6e96c921547d..a245b2d44db7 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -590,6 +590,11 @@ static int ivpu_job_done_thread(void *arg)
 				ivpu_pm_schedule_recovery(vdev);
 			}
 		}
+		if (kthread_should_park()) {
+			ivpu_dbg(vdev, JOB, "Parked %s\n", __func__);
+			kthread_parkme();
+			ivpu_dbg(vdev, JOB, "Unparked %s\n", __func__);
+		}
 	}
 
 	ivpu_ipc_consumer_del(vdev, &cons);
@@ -610,9 +615,6 @@ int ivpu_job_done_thread_init(struct ivpu_device *vdev)
 		return -EIO;
 	}
 
-	get_task_struct(thread);
-	wake_up_process(thread);
-
 	vdev->job_done_thread = thread;
 
 	return 0;
@@ -620,6 +622,16 @@ int ivpu_job_done_thread_init(struct ivpu_device *vdev)
 
 void ivpu_job_done_thread_fini(struct ivpu_device *vdev)
 {
+	kthread_unpark(vdev->job_done_thread);
 	kthread_stop(vdev->job_done_thread);
-	put_task_struct(vdev->job_done_thread);
+}
+
+void ivpu_job_done_thread_disable(struct ivpu_device *vdev)
+{
+	kthread_park(vdev->job_done_thread);
+}
+
+void ivpu_job_done_thread_enable(struct ivpu_device *vdev)
+{
+	kthread_unpark(vdev->job_done_thread);
 }
diff --git a/drivers/accel/ivpu/ivpu_job.h b/drivers/accel/ivpu/ivpu_job.h
index aa1f0b9479b0..a8e914e5affc 100644
--- a/drivers/accel/ivpu/ivpu_job.h
+++ b/drivers/accel/ivpu/ivpu_job.h
@@ -61,6 +61,8 @@ void ivpu_cmdq_reset_all_contexts(struct ivpu_device *vdev);
 
 int ivpu_job_done_thread_init(struct ivpu_device *vdev);
 void ivpu_job_done_thread_fini(struct ivpu_device *vdev);
+void ivpu_job_done_thread_disable(struct ivpu_device *vdev);
+void ivpu_job_done_thread_enable(struct ivpu_device *vdev);
 
 void ivpu_jobs_abort_all(struct ivpu_device *vdev);
 
-- 
2.25.1



* [PATCH 4/8] accel/ivpu: Abort pending rx ipc on reset
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
                   ` (2 preceding siblings ...)
  2023-10-28 15:59 ` [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 15:04   ` Jeffrey Hugo
  2023-10-28 15:59 ` [PATCH 5/8] accel/ivpu: Print CMDQ errors after consumer timeout Stanislaw Gruszka
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz,
	Karol Wachowski

A process waiting for a particular condition goes back to sleep after
wake_up() if the condition is still not met. Add an abort flag to wake
up IPC receivers, which will then finish with an -ECANCELED error.

This is only needed for reset; runtime power management prevents
suspending the VPU while there is pending IPC processing or a pending
job.
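
The reasoning can be illustrated with a single-threaded userspace model
of what a waiter does each time it is woken. This is a sketch under
assumptions, not the driver's control flow: the demo_* names are
hypothetical, a counter stands in for rx_msg_list, and locking is left
out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical consumer: a message counter stands in for rx_msg_list. */
struct demo_consumer {
	int pending;
	bool aborted;
};

/* Wake-up predicate: without the abort term a woken waiter sleeps again. */
bool demo_need_wakeup(const struct demo_consumer *c)
{
	return c->pending > 0 || c->aborted;
}

/*
 * One pass of a woken waiter: re-check the predicate first, then decide
 * between cancelling and consuming a message.
 */
int demo_receive_once(struct demo_consumer *c)
{
	if (!demo_need_wakeup(c))
		return -EAGAIN;		/* condition not met: would sleep again */

	if (c->aborted)
		return -ECANCELED;	/* reset in progress: give up */

	c->pending--;			/* consume one message */
	return 0;
}

/* Models the reset path setting the flag before waking the waiters. */
void demo_abort(struct demo_consumer *c)
{
	c->aborted = true;
}
```

A bare wake-up with no message and no flag yields -EAGAIN in the model,
which corresponds to the waiter going back to sleep; only after
demo_abort() does the same wake-up make the receiver return.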

Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_ipc.c | 20 +++++++++++++++++---
 drivers/accel/ivpu/ivpu_ipc.h |  3 ++-
 drivers/accel/ivpu/ivpu_job.c |  1 +
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_ipc.c b/drivers/accel/ivpu/ivpu_ipc.c
index 270caef789bf..255f2b8b0b5e 100644
--- a/drivers/accel/ivpu/ivpu_ipc.c
+++ b/drivers/accel/ivpu/ivpu_ipc.c
@@ -148,6 +148,7 @@ ivpu_ipc_consumer_add(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
 	cons->channel = channel;
 	cons->tx_vpu_addr = 0;
 	cons->request_id = 0;
+	cons->aborted = false;
 	spin_lock_init(&cons->rx_msg_lock);
 	INIT_LIST_HEAD(&cons->rx_msg_list);
 	init_waitqueue_head(&cons->rx_msg_wq);
@@ -169,7 +170,8 @@ void ivpu_ipc_consumer_del(struct ivpu_device *vdev, struct ivpu_ipc_consumer *c
 	spin_lock_irq(&cons->rx_msg_lock);
 	list_for_each_entry_safe(rx_msg, r, &cons->rx_msg_list, link) {
 		list_del(&rx_msg->link);
-		ivpu_ipc_rx_mark_free(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg);
+		if (!cons->aborted)
+			ivpu_ipc_rx_mark_free(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg);
 		atomic_dec(&ipc->rx_msg_count);
 		kfree(rx_msg);
 	}
@@ -210,7 +212,7 @@ static int ivpu_ipc_rx_need_wakeup(struct ivpu_ipc_consumer *cons)
 		ret |= (kthread_should_stop() || kthread_should_park());
 
 	spin_lock_irq(&cons->rx_msg_lock);
-	ret |= !list_empty(&cons->rx_msg_list);
+	ret |= !list_empty(&cons->rx_msg_list) || cons->aborted;
 	spin_unlock_irq(&cons->rx_msg_lock);
 
 	return ret;
@@ -244,6 +246,12 @@ int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
 		return -EAGAIN;
 	}
 	list_del(&rx_msg->link);
+	if (cons->aborted) {
+		spin_unlock_irq(&cons->rx_msg_lock);
+		ret = -ECANCELED;
+		goto out;
+	}
+
 	spin_unlock_irq(&cons->rx_msg_lock);
 
 	if (ipc_buf)
@@ -261,6 +269,7 @@ int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
 	}
 
 	ivpu_ipc_rx_mark_free(vdev, rx_msg->ipc_hdr, rx_msg->jsm_msg);
+out:
 	atomic_dec(&ipc->rx_msg_count);
 	kfree(rx_msg);
 
@@ -522,8 +531,12 @@ void ivpu_ipc_disable(struct ivpu_device *vdev)
 	mutex_unlock(&ipc->lock);
 
 	spin_lock_irqsave(&ipc->cons_list_lock, flags);
-	list_for_each_entry_safe(cons, c, &ipc->cons_list, link)
+	list_for_each_entry_safe(cons, c, &ipc->cons_list, link) {
+		spin_lock(&cons->rx_msg_lock);
+		cons->aborted = true;
+		spin_unlock(&cons->rx_msg_lock);
 		wake_up(&cons->rx_msg_wq);
+	}
 	spin_unlock_irqrestore(&ipc->cons_list_lock, flags);
 }
 
@@ -532,6 +545,7 @@ void ivpu_ipc_reset(struct ivpu_device *vdev)
 	struct ivpu_ipc_info *ipc = vdev->ipc;
 
 	mutex_lock(&ipc->lock);
+	drm_WARN_ON(&vdev->drm, ipc->on);
 
 	memset(ivpu_bo_vaddr(ipc->mem_tx), 0, ivpu_bo_size(ipc->mem_tx));
 	memset(ivpu_bo_vaddr(ipc->mem_rx), 0, ivpu_bo_size(ipc->mem_rx));
diff --git a/drivers/accel/ivpu/ivpu_ipc.h b/drivers/accel/ivpu/ivpu_ipc.h
index 6918db23daa4..a380787f7222 100644
--- a/drivers/accel/ivpu/ivpu_ipc.h
+++ b/drivers/accel/ivpu/ivpu_ipc.h
@@ -47,8 +47,9 @@ struct ivpu_ipc_consumer {
 	u32 channel;
 	u32 tx_vpu_addr;
 	u32 request_id;
+	bool aborted;
 
-	spinlock_t rx_msg_lock; /* Protects rx_msg_list */
+	spinlock_t rx_msg_lock; /* Protects rx_msg_list and aborted */
 	struct list_head rx_msg_list;
 	wait_queue_head_t rx_msg_wq;
 };
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index a245b2d44db7..15a408fad494 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -578,6 +578,7 @@ static int ivpu_job_done_thread(void *arg)
 	ivpu_ipc_consumer_add(vdev, &cons, VPU_IPC_CHAN_JOB_RET);
 
 	while (!kthread_should_stop()) {
+		cons.aborted = false;
 		timeout = ivpu_tdr_timeout_ms ? ivpu_tdr_timeout_ms : vdev->timeout.tdr;
 		jobs_submitted = !xa_empty(&vdev->submitted_jobs_xa);
 		ret = ivpu_ipc_receive(vdev, &cons, NULL, &jsm_msg, timeout);
-- 
2.25.1



* [PATCH 5/8] accel/ivpu: Print CMDQ errors after consumer timeout
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
                   ` (3 preceding siblings ...)
  2023-10-28 15:59 ` [PATCH 4/8] accel/ivpu: Abort pending rx ipc on reset Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 15:06   ` Jeffrey Hugo
  2023-10-28 15:59 ` [PATCH 6/8] accel/ivpu: Make DMA allocations for MMU600 write combined Stanislaw Gruszka
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz,
	Stanislaw Gruszka

From: Karol Wachowski <karol.wachowski@linux.intel.com>

Check the error reason bits in the IVPU_MMU_REG_CMDQ_CONS register when
waiting for the consumer times out.
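
A standalone sketch of the decoding step is shown here for reference.
The error codes and strings are the ones added by this patch; the field
position (bits 30:24) follows the Arm SMMUv3 CMDQ_CONS layout as used by
the Linux arm-smmu-v3 driver, and it is an assumption that the ivpu
register definition uses the same mask. The demo_* names are
hypothetical.

```c
#include <stdint.h>

/* Assumed field layout: ERR in bits [30:24] of CMDQ_CONS. */
#define DEMO_CMDQ_CONS_ERR_SHIFT 24
#define DEMO_CMDQ_CONS_ERR_MASK  (0x7fu << DEMO_CMDQ_CONS_ERR_SHIFT)

/* Error codes as defined in the patch. */
#define DEMO_CERROR_NONE         0x0
#define DEMO_CERROR_ILL          0x1
#define DEMO_CERROR_ABT          0x2
#define DEMO_CERROR_ATC_INV_SYNC 0x3

/* Extract the error reason from a raw CMDQ_CONS register value. */
uint32_t demo_cmdq_cons_err(uint32_t cons)
{
	return (cons & DEMO_CMDQ_CONS_ERR_MASK) >> DEMO_CMDQ_CONS_ERR_SHIFT;
}

/* Map the error reason to the message printed after a timeout. */
const char *demo_cmdq_err_to_str(uint32_t err)
{
	switch (err) {
	case DEMO_CERROR_NONE:
		return "No CMDQ Error";
	case DEMO_CERROR_ILL:
		return "Illegal command";
	case DEMO_CERROR_ABT:
		return "External abort on CMDQ read";
	case DEMO_CERROR_ATC_INV_SYNC:
		return "Sync failed to complete ATS invalidation";
	default:
		return "Unknown CMDQ Error";
	}
}
```

With this layout, a raw value such as 0x02000010 carries error code 2 in
the ERR field and a read index in the low bits, so the log line would
report an external abort rather than a bare timeout.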

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_mmu.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index 2538c78fbebe..b15cf9cc9c43 100644
--- a/drivers/accel/ivpu/ivpu_mmu.c
+++ b/drivers/accel/ivpu/ivpu_mmu.c
@@ -230,7 +230,12 @@
 				  (REG_FLD(IVPU_MMU_REG_GERROR, MSI_PRIQ_ABT)) | \
 				  (REG_FLD(IVPU_MMU_REG_GERROR, MSI_ABT)))
 
-static char *ivpu_mmu_event_to_str(u32 cmd)
+#define IVPU_MMU_CERROR_NONE         0x0
+#define IVPU_MMU_CERROR_ILL          0x1
+#define IVPU_MMU_CERROR_ABT          0x2
+#define IVPU_MMU_CERROR_ATC_INV_SYNC 0x3
+
+static const char *ivpu_mmu_event_to_str(u32 cmd)
 {
 	switch (cmd) {
 	case IVPU_MMU_EVT_F_UUT:
@@ -276,6 +281,22 @@ static char *ivpu_mmu_event_to_str(u32 cmd)
 	}
 }
 
+static const char *ivpu_mmu_cmdq_err_to_str(u32 err)
+{
+	switch (err) {
+	case IVPU_MMU_CERROR_NONE:
+		return "No CMDQ Error";
+	case IVPU_MMU_CERROR_ILL:
+		return "Illegal command";
+	case IVPU_MMU_CERROR_ABT:
+		return "External abort on CMDQ read";
+	case IVPU_MMU_CERROR_ATC_INV_SYNC:
+		return "Sync failed to complete ATS invalidation";
+	default:
+		return "Unknown CMDQ Error";
+	}
+}
+
 static void ivpu_mmu_config_check(struct ivpu_device *vdev)
 {
 	u32 val_ref;
@@ -492,8 +513,15 @@ static int ivpu_mmu_cmdq_sync(struct ivpu_device *vdev)
 	REGV_WR32(IVPU_MMU_REG_CMDQ_PROD, q->prod);
 
 	ret = ivpu_mmu_cmdq_wait_for_cons(vdev);
-	if (ret)
-		ivpu_err(vdev, "Timed out waiting for consumer: %d\n", ret);
+	if (ret) {
+		u32 err;
+
+		val = REGV_RD32(IVPU_MMU_REG_CMDQ_CONS);
+		err = REG_GET_FLD(IVPU_MMU_REG_CMDQ_CONS, ERR, val);
+
+		ivpu_err(vdev, "Timed out waiting for MMU consumer: %d, error: %s\n", ret,
+			 ivpu_mmu_cmdq_err_to_str(err));
+	}
 
 	return ret;
 }
-- 
2.25.1



* [PATCH 6/8] accel/ivpu: Make DMA allocations for MMU600 write combined
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
                   ` (4 preceding siblings ...)
  2023-10-28 15:59 ` [PATCH 5/8] accel/ivpu: Print CMDQ errors after consumer timeout Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 15:08   ` Jeffrey Hugo
  2023-10-28 15:59 ` [PATCH 7/8] accel/ivpu: Simplify MMU SYNC command Stanislaw Gruszka
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz,
	Stanislaw Gruszka

From: Karol Wachowski <karol.wachowski@linux.intel.com>

Previously, using the dma_alloc_wc() API, we created cache-coherent
(mapped as write-back) mappings.

Because we disable MMU600 snooping, a costly page walk and cache flushes
were required after each page table modification.

With write-combined buffers, a single write memory barrier is enough to
flush the write-combined buffer to memory, which simplifies the driver
and significantly reduces the time of map/unmap operations.

The mapping time for 255 MB is reduced from 2.5 ms to 500 us, roughly a
5x improvement.

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_mmu_context.c | 115 ++++++++++++++------------
 1 file changed, 63 insertions(+), 52 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu_context.c b/drivers/accel/ivpu/ivpu_mmu_context.c
index 0c8c65351919..7d8005bc78d2 100644
--- a/drivers/accel/ivpu/ivpu_mmu_context.c
+++ b/drivers/accel/ivpu/ivpu_mmu_context.c
@@ -5,6 +5,9 @@
 
 #include <linux/bitfield.h>
 #include <linux/highmem.h>
+#include <linux/set_memory.h>
+
+#include <drm/drm_cache.h>
 
 #include "ivpu_drv.h"
 #include "ivpu_hw.h"
@@ -38,12 +41,57 @@
 #define IVPU_MMU_ENTRY_MAPPED  (IVPU_MMU_ENTRY_FLAG_AF | IVPU_MMU_ENTRY_FLAG_USER | \
 				IVPU_MMU_ENTRY_FLAG_NG | IVPU_MMU_ENTRY_VALID)
 
+static void *ivpu_pgtable_alloc_page(struct ivpu_device *vdev, dma_addr_t *dma)
+{
+	dma_addr_t dma_addr;
+	struct page *page;
+	void *cpu;
+
+	page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
+	if (!page)
+		return NULL;
+
+	set_pages_array_wc(&page, 1);
+
+	dma_addr = dma_map_page(vdev->drm.dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(vdev->drm.dev, dma_addr))
+		goto err_free_page;
+
+	cpu = vmap(&page, 1, VM_MAP, pgprot_writecombine(PAGE_KERNEL));
+	if (!cpu)
+		goto err_dma_unmap_page;
+
+
+	*dma = dma_addr;
+	return cpu;
+
+err_dma_unmap_page:
+	dma_unmap_page(vdev->drm.dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+
+err_free_page:
+	put_page(page);
+	return NULL;
+}
+
+static void ivpu_pgtable_free_page(struct ivpu_device *vdev, u64 *cpu_addr, dma_addr_t dma_addr)
+{
+	struct page *page;
+
+	if (cpu_addr) {
+		page = vmalloc_to_page(cpu_addr);
+		vunmap(cpu_addr);
+		dma_unmap_page(vdev->drm.dev, dma_addr & ~IVPU_MMU_ENTRY_FLAGS_MASK, PAGE_SIZE,
+			       DMA_BIDIRECTIONAL);
+		set_pages_array_wb(&page, 1);
+		put_page(page);
+	}
+}
+
 static int ivpu_mmu_pgtable_init(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable)
 {
 	dma_addr_t pgd_dma;
 
-	pgtable->pgd_dma_ptr = dma_alloc_coherent(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pgd_dma,
-						  GFP_KERNEL);
+	pgtable->pgd_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pgd_dma);
 	if (!pgtable->pgd_dma_ptr)
 		return -ENOMEM;
 
@@ -52,13 +100,6 @@ static int ivpu_mmu_pgtable_init(struct ivpu_device *vdev, struct ivpu_mmu_pgtab
 	return 0;
 }
 
-static void ivpu_mmu_pgtable_free(struct ivpu_device *vdev, u64 *cpu_addr, dma_addr_t dma_addr)
-{
-	if (cpu_addr)
-		dma_free_coherent(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, cpu_addr,
-				  dma_addr & ~IVPU_MMU_ENTRY_FLAGS_MASK);
-}
-
 static void ivpu_mmu_pgtables_free(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable)
 {
 	int pgd_idx, pud_idx, pmd_idx;
@@ -83,19 +124,19 @@ static void ivpu_mmu_pgtables_free(struct ivpu_device *vdev, struct ivpu_mmu_pgt
 				pte_dma_ptr = pgtable->pte_ptrs[pgd_idx][pud_idx][pmd_idx];
 				pte_dma = pgtable->pmd_ptrs[pgd_idx][pud_idx][pmd_idx];
 
-				ivpu_mmu_pgtable_free(vdev, pte_dma_ptr, pte_dma);
+				ivpu_pgtable_free_page(vdev, pte_dma_ptr, pte_dma);
 			}
 
 			kfree(pgtable->pte_ptrs[pgd_idx][pud_idx]);
-			ivpu_mmu_pgtable_free(vdev, pmd_dma_ptr, pmd_dma);
+			ivpu_pgtable_free_page(vdev, pmd_dma_ptr, pmd_dma);
 		}
 
 		kfree(pgtable->pmd_ptrs[pgd_idx]);
 		kfree(pgtable->pte_ptrs[pgd_idx]);
-		ivpu_mmu_pgtable_free(vdev, pud_dma_ptr, pud_dma);
+		ivpu_pgtable_free_page(vdev, pud_dma_ptr, pud_dma);
 	}
 
-	ivpu_mmu_pgtable_free(vdev, pgtable->pgd_dma_ptr, pgtable->pgd_dma);
+	ivpu_pgtable_free_page(vdev, pgtable->pgd_dma_ptr, pgtable->pgd_dma);
 }
 
 static u64*
@@ -107,7 +148,7 @@ ivpu_mmu_ensure_pud(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable,
 	if (pud_dma_ptr)
 		return pud_dma_ptr;
 
-	pud_dma_ptr = dma_alloc_wc(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pud_dma, GFP_KERNEL);
+	pud_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pud_dma);
 	if (!pud_dma_ptr)
 		return NULL;
 
@@ -130,7 +171,7 @@ ivpu_mmu_ensure_pud(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable,
 	kfree(pgtable->pmd_ptrs[pgd_idx]);
 
 err_free_pud_dma_ptr:
-	ivpu_mmu_pgtable_free(vdev, pud_dma_ptr, pud_dma);
+	ivpu_pgtable_free_page(vdev, pud_dma_ptr, pud_dma);
 	return NULL;
 }
 
@@ -144,7 +185,7 @@ ivpu_mmu_ensure_pmd(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable,
 	if (pmd_dma_ptr)
 		return pmd_dma_ptr;
 
-	pmd_dma_ptr = dma_alloc_wc(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pmd_dma, GFP_KERNEL);
+	pmd_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pmd_dma);
 	if (!pmd_dma_ptr)
 		return NULL;
 
@@ -159,7 +200,7 @@ ivpu_mmu_ensure_pmd(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable,
 	return pmd_dma_ptr;
 
 err_free_pmd_dma_ptr:
-	ivpu_mmu_pgtable_free(vdev, pmd_dma_ptr, pmd_dma);
+	ivpu_pgtable_free_page(vdev, pmd_dma_ptr, pmd_dma);
 	return NULL;
 }
 
@@ -173,7 +214,7 @@ ivpu_mmu_ensure_pte(struct ivpu_device *vdev, struct ivpu_mmu_pgtable *pgtable,
 	if (pte_dma_ptr)
 		return pte_dma_ptr;
 
-	pte_dma_ptr = dma_alloc_wc(vdev->drm.dev, IVPU_MMU_PGTABLE_SIZE, &pte_dma, GFP_KERNEL);
+	pte_dma_ptr = ivpu_pgtable_alloc_page(vdev, &pte_dma);
 	if (!pte_dma_ptr)
 		return NULL;
 
@@ -248,38 +289,6 @@ static void ivpu_mmu_context_unmap_page(struct ivpu_mmu_context *ctx, u64 vpu_ad
 	ctx->pgtable.pte_ptrs[pgd_idx][pud_idx][pmd_idx][pte_idx] = IVPU_MMU_ENTRY_INVALID;
 }
 
-static void
-ivpu_mmu_context_flush_page_tables(struct ivpu_mmu_context *ctx, u64 vpu_addr, size_t size)
-{
-	struct ivpu_mmu_pgtable *pgtable = &ctx->pgtable;
-	u64 end_addr = vpu_addr + size;
-
-	/* Align to PMD entry (2 MB) */
-	vpu_addr &= ~(IVPU_MMU_PTE_MAP_SIZE - 1);
-
-	while (vpu_addr < end_addr) {
-		int pgd_idx = FIELD_GET(IVPU_MMU_PGD_INDEX_MASK, vpu_addr);
-		u64 pud_end = (pgd_idx + 1) * (u64)IVPU_MMU_PUD_MAP_SIZE;
-
-		while (vpu_addr < end_addr && vpu_addr < pud_end) {
-			int pud_idx = FIELD_GET(IVPU_MMU_PUD_INDEX_MASK, vpu_addr);
-			u64 pmd_end = (pud_idx + 1) * (u64)IVPU_MMU_PMD_MAP_SIZE;
-
-			while (vpu_addr < end_addr && vpu_addr < pmd_end) {
-				int pmd_idx = FIELD_GET(IVPU_MMU_PMD_INDEX_MASK, vpu_addr);
-
-				clflush_cache_range(pgtable->pte_ptrs[pgd_idx][pud_idx][pmd_idx],
-						    IVPU_MMU_PGTABLE_SIZE);
-				vpu_addr += IVPU_MMU_PTE_MAP_SIZE;
-			}
-			clflush_cache_range(pgtable->pmd_ptrs[pgd_idx][pud_idx],
-					    IVPU_MMU_PGTABLE_SIZE);
-		}
-		clflush_cache_range(pgtable->pud_ptrs[pgd_idx], IVPU_MMU_PGTABLE_SIZE);
-	}
-	clflush_cache_range(pgtable->pgd_dma_ptr, IVPU_MMU_PGTABLE_SIZE);
-}
-
 static int
 ivpu_mmu_context_map_pages(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 			   u64 vpu_addr, dma_addr_t dma_addr, size_t size, u64 prot)
@@ -352,10 +361,11 @@ ivpu_mmu_context_map_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 			mutex_unlock(&ctx->lock);
 			return ret;
 		}
-		ivpu_mmu_context_flush_page_tables(ctx, vpu_addr, size);
 		vpu_addr += size;
 	}
 
+	/* Ensure page table modifications are flushed from wc buffers to memory */
+	wmb();
 	mutex_unlock(&ctx->lock);
 
 	ret = ivpu_mmu_invalidate_tlb(vdev, ctx->id);
@@ -381,10 +391,11 @@ ivpu_mmu_context_unmap_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ct
 		size_t size = sg_dma_len(sg) + sg->offset;
 
 		ivpu_mmu_context_unmap_pages(ctx, vpu_addr, size);
-		ivpu_mmu_context_flush_page_tables(ctx, vpu_addr, size);
 		vpu_addr += size;
 	}
 
+	/* Ensure page table modifications are flushed from wc buffers to memory */
+	wmb();
 	mutex_unlock(&ctx->lock);
 
 	ret = ivpu_mmu_invalidate_tlb(vdev, ctx->id);
-- 
2.25.1
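
For a rough sense of what the removed ivpu_mmu_context_flush_page_tables()
walk cost, the userspace sketch below counts how many tables such a walk
would flush for one contiguous range. It is not driver code: the 2 MB span
of a PTE table comes from the comment in the removed function, while the
4 KB table size, the 1 GB and 512 GB spans and the count_table_flushes()
name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry (hypothetical): 4 KB tables, 512 entries per level. */
#define PAGE_SZ      4096ULL
#define PTE_MAP_SIZE (512ULL * PAGE_SZ)      /* 2 MB covered by one PTE table */
#define PMD_MAP_SIZE (512ULL * PTE_MAP_SIZE) /* 1 GB covered by one PMD table */
#define PUD_MAP_SIZE (512ULL * PMD_MAP_SIZE) /* 512 GB covered by one PUD table */

/* Count the table flushes a per-range walk would issue (model only). */
static uint64_t count_table_flushes(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;
	uint64_t flushes = 0;

	assert(size > 0);
	addr &= ~(PTE_MAP_SIZE - 1); /* align down to a 2 MB boundary */

	while (addr < end) {
		uint64_t pud_end = (addr / PUD_MAP_SIZE + 1) * PUD_MAP_SIZE;

		while (addr < end && addr < pud_end) {
			uint64_t pmd_end = (addr / PMD_MAP_SIZE + 1) * PMD_MAP_SIZE;

			while (addr < end && addr < pmd_end) {
				flushes++; /* one PTE table */
				addr += PTE_MAP_SIZE;
			}
			flushes++; /* the PMD table above those PTE tables */
		}
		flushes++; /* the PUD table */
	}
	return flushes + 1; /* plus the PGD */
}
```

Under these assumptions count_table_flushes(0, 255ULL * 1024 * 1024) gives
131: 128 PTE tables plus one PMD, one PUD and the PGD. The driver ran the
walk once per scatterlist entry, so the real count depends on how the buffer
is fragmented; with this patch the whole map or unmap ends with one wmb().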


* [PATCH 7/8] accel/ivpu: Simplify MMU SYNC command
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
                   ` (5 preceding siblings ...)
  2023-10-28 15:59 ` [PATCH 6/8] accel/ivpu: Make DMA allocations for MMU600 write combined Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 15:09   ` Jeffrey Hugo
  2023-10-28 15:59 ` [PATCH 8/8] accel/ivpu: Rename VPU to NPU in product strings Stanislaw Gruszka
  2023-10-31 16:06 ` [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel; +Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz

From: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>

CMD_SYNC does not need any args as we poll for completion anyway.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_mmu.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index b15cf9cc9c43..b46f3743c0e7 100644
--- a/drivers/accel/ivpu/ivpu_mmu.c
+++ b/drivers/accel/ivpu/ivpu_mmu.c
@@ -500,10 +500,7 @@ static int ivpu_mmu_cmdq_sync(struct ivpu_device *vdev)
 	u64 val;
 	int ret;
 
-	val = FIELD_PREP(IVPU_MMU_CMD_OPCODE, CMD_SYNC) |
-	      FIELD_PREP(IVPU_MMU_CMD_SYNC_0_CS, 0x2) |
-	      FIELD_PREP(IVPU_MMU_CMD_SYNC_0_MSH, 0x3) |
-	      FIELD_PREP(IVPU_MMU_CMD_SYNC_0_MSI_ATTR, 0xf);
+	val = FIELD_PREP(IVPU_MMU_CMD_OPCODE, CMD_SYNC);
 
 	ret = ivpu_mmu_cmdq_cmd_write(vdev, "SYNC", val, 0);
 	if (ret)
-- 
2.25.1
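
To make the effect of this simplification concrete, here is a userspace
sketch of how FIELD_PREP()-style packing builds the 64-bit command word
before and after the change. The field positions and the CMD_SYNC opcode
value are assumptions chosen for illustration (this patch does not show
them), and GENMASK64() and field_prep() are local stand-ins for the kernel
macros, not the kernel implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GENMASK_ULL(h, l): bits h..l set, 0 <= l <= h <= 63. */
#define GENMASK64(h, l) ((~0ULL >> (63 - (h))) & (~0ULL << (l)))

/* Assumed layout, for illustration only. */
#define CMD_OPCODE        GENMASK64(7, 0)
#define CMD_SYNC_CS       GENMASK64(13, 12)
#define CMD_SYNC_MSH      GENMASK64(23, 22)
#define CMD_SYNC_MSI_ATTR GENMASK64(27, 24)
#define CMD_SYNC          0x46ULL

/* Stand-in for FIELD_PREP(): shift val to the mask's lowest bit, then mask. */
static uint64_t field_prep(uint64_t mask, uint64_t val)
{
	assert(mask != 0);
	/* mask & (~mask + 1) isolates the lowest set bit of the mask */
	return (val * (mask & (~mask + 1))) & mask;
}

/* Command word as built before the patch. */
static uint64_t sync_cmd_old(void)
{
	return field_prep(CMD_OPCODE, CMD_SYNC) |
	       field_prep(CMD_SYNC_CS, 0x2) |
	       field_prep(CMD_SYNC_MSH, 0x3) |
	       field_prep(CMD_SYNC_MSI_ATTR, 0xf);
}

/* Command word as built after the patch: opcode only. */
static uint64_t sync_cmd_new(void)
{
	return field_prep(CMD_OPCODE, CMD_SYNC);
}
```

With this assumed layout the old word carries three extra fields on top of
the opcode, and the new word is the opcode alone. Completion is still
detected by polling, as the commit message says.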


* [PATCH 8/8] accel/ivpu: Rename VPU to NPU in product strings
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
                   ` (6 preceding siblings ...)
  2023-10-28 15:59 ` [PATCH 7/8] accel/ivpu: Simplify MMU SYNC command Stanislaw Gruszka
@ 2023-10-28 15:59 ` Stanislaw Gruszka
  2023-10-30 15:10   ` Jeffrey Hugo
  2023-10-31 16:06 ` [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
  8 siblings, 1 reply; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-28 15:59 UTC (permalink / raw)
  To: dri-devel; +Cc: Stanislaw Gruszka, Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz

From: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>

VPU was rebranded as NPU (Neural Processing Unit) so user facing
strings have to be updated but the code remains as is and the module
is still called intel_vpu.ko.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/Kconfig    | 9 +++++----
 drivers/accel/ivpu/ivpu_drv.h | 2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/accel/ivpu/Kconfig b/drivers/accel/ivpu/Kconfig
index 1a4c4ed9d113..82749e0da524 100644
--- a/drivers/accel/ivpu/Kconfig
+++ b/drivers/accel/ivpu/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 config DRM_ACCEL_IVPU
-	tristate "Intel VPU for Meteor Lake and newer"
+	tristate "Intel NPU (Neural Processing Unit)"
 	depends on DRM_ACCEL
 	depends on X86_64 && !UML
 	depends on PCI && PCI_MSI
@@ -9,8 +9,9 @@ config DRM_ACCEL_IVPU
 	select SHMEM
 	select GENERIC_ALLOCATOR
 	help
-	  Choose this option if you have a system that has an 14th generation Intel CPU
-	  or newer. VPU stands for Versatile Processing Unit and it's a CPU-integrated
-	  inference accelerator for Computer Vision and Deep Learning applications.
+	  Choose this option if you have a system with an 14th generation
+	  Intel CPU (Meteor Lake) or newer. Intel NPU (formerly called Intel VPU)
+	  is a CPU-integrated inference accelerator for Computer Vision
+	  and Deep Learning applications.
 
 	  If "M" is selected, the module will be called intel_vpu.
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 1b482d1d66d9..79b193ffbc26 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -19,7 +19,7 @@
 #include "ivpu_mmu_context.h"
 
 #define DRIVER_NAME "intel_vpu"
-#define DRIVER_DESC "Driver for Intel Versatile Processing Unit (VPU)"
+#define DRIVER_DESC "Driver for Intel NPU (Neural Processing Unit)"
 #define DRIVER_DATE "20230117"
 
 #define PCI_DEVICE_ID_MTL   0x7d1d
-- 
2.25.1


* Re: [PATCH 1/8] accel/ivpu/40xx: Allow to change profiling frequency
  2023-10-28 15:59 ` [PATCH 1/8] accel/ivpu/40xx: Allow to change profiling frequency Stanislaw Gruszka
@ 2023-10-30 14:56   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 14:56 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel
  Cc: Oded Gabbay, Krystian Pradzynski, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> From: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
> 
> Profiling freq is a debug firmware feature. It switches default clock
> to higher resolution for fine-grained and more accurate firmware task
> profiling. We already configure it during boot up of VPU4.
> 
> Add a debugfs knob and helpers per HW generation that allow changing it.
> For vpu37xx the implementation is empty as profiling frequency can only
> be changed on VPU4 or newer.
> 
> Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 2/8] accel/ivpu: Assure device is off if power up sequence fail
  2023-10-28 15:59 ` [PATCH 2/8] accel/ivpu: Assure device is off if power up sequence fail Stanislaw Gruszka
@ 2023-10-30 14:59   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 14:59 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> We should not leave the device half enabled if there is a failure somewhere
> in the power up sequence. Fix device init and resume paths.
> 
> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend
  2023-10-28 15:59 ` [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend Stanislaw Gruszka
@ 2023-10-30 15:03   ` Jeffrey Hugo
  2023-11-13  7:39   ` Daniel Vetter
  1 sibling, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 15:03 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> Stop job_done thread when going to suspend. Use kthread_park() instead
> of kthread_stop() to avoid memory allocation and potential failure
> on resume.
> 
> Use separate function as thread wake up condition. Use spin lock to assure
> rx_msg_list is properly protected against concurrent access.

This tells me what you are doing (and the changes seem to make sense),
but doesn't say "why".  A quick, one-sentence justification would fix
this, I think.

> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 4/8] accel/ivpu: Abort pending rx ipc on reset
  2023-10-28 15:59 ` [PATCH 4/8] accel/ivpu: Abort pending rx ipc on reset Stanislaw Gruszka
@ 2023-10-30 15:04   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 15:04 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> A process that waits for a particular condition goes back to sleep
> after wake_up() if the condition is still not met. Add an abort flag
> to wake up IPC receivers, which then finish with an -ECANCELED error.
> 
> This is only needed for reset; runtime power management prevents
> suspending the VPU while there is pending IPC processing or a pending job.
> 
> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
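
The point about the wait condition can be shown with a userspace analogue.
The sketch below uses a pthread mutex and condition variable instead of the
kernel wait queue; struct rx_queue and the rx_*() helpers are hypothetical
names, not the driver's. A woken receiver re-checks its predicate, so a
plain wake-up is not enough to release it: the abort flag has to be part of
the predicate.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct rx_queue {
	pthread_mutex_t lock;
	pthread_cond_t wq;
	int pending;  /* number of queued messages */
	bool aborted; /* set when the device is reset */
};

static void rx_queue_init(struct rx_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->wq, NULL);
	q->pending = 0;
	q->aborted = false;
}

/* Producer side: queue one message and wake one receiver. */
static void rx_post(struct rx_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->pending++;
	pthread_cond_signal(&q->wq);
	pthread_mutex_unlock(&q->lock);
}

/* Reset side: set the flag first, then wake every receiver. */
static void rx_abort(struct rx_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->aborted = true;
	pthread_cond_broadcast(&q->wq);
	pthread_mutex_unlock(&q->lock);
}

/* Returns 0 after consuming a message, -ECANCELED once aborted. */
static int rx_receive(struct rx_queue *q)
{
	int ret = 0;

	pthread_mutex_lock(&q->lock);
	/* Without "!q->aborted" here, a woken receiver would sleep again. */
	while (q->pending == 0 && !q->aborted)
		pthread_cond_wait(&q->wq, &q->lock);
	if (q->aborted) {
		ret = -ECANCELED;
	} else {
		assert(q->pending > 0);
		q->pending--;
	}
	pthread_mutex_unlock(&q->lock);
	return ret;
}
```

In this sketch an abort takes priority over queued messages; whether the
driver makes the same choice is not stated in the commit message.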

* Re: [PATCH 5/8] accel/ivpu: Print CMDQ errors after consumer timeout
  2023-10-28 15:59 ` [PATCH 5/8] accel/ivpu: Print CMDQ errors after consumer timeout Stanislaw Gruszka
@ 2023-10-30 15:06   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 15:06 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> From: Karol Wachowski <karol.wachowski@linux.intel.com>
> 
> Check the error reason bits in the IVPU_MMU_CMDQ_CONS register
> when waiting for the consumer times out.
> 
> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 6/8] accel/ivpu: Make DMA allocations for MMU600 write combined
  2023-10-28 15:59 ` [PATCH 6/8] accel/ivpu: Make DMA allocations for MMU600 write combined Stanislaw Gruszka
@ 2023-10-30 15:08   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 15:08 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel
  Cc: Karol Wachowski, Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> From: Karol Wachowski <karol.wachowski@linux.intel.com>
> 
> Previously, using the dma_alloc_wc() API, we created cache-coherent
> (mapped as write-back) mappings.
> 
> Because we disable MMU600 snooping, a costly page walk and cache
> flushes were required after each page table modification.
> 
> With write-combined buffers, a single write memory barrier is enough
> to flush the write-combined buffers to memory, which simplifies the
> driver and significantly reduces the time of map/unmap operations.
> 
> Mapping time of 255 MB is reduced from 2.5 ms to 500 us.
> 
> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 7/8] accel/ivpu: Simplify MMU SYNC command
  2023-10-28 15:59 ` [PATCH 7/8] accel/ivpu: Simplify MMU SYNC command Stanislaw Gruszka
@ 2023-10-30 15:09   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 15:09 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel; +Cc: Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> From: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> 
> CMD_SYNC does not need any args as we poll for completion anyway.
> 
> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 8/8] accel/ivpu: Rename VPU to NPU in product strings
  2023-10-28 15:59 ` [PATCH 8/8] accel/ivpu: Rename VPU to NPU in product strings Stanislaw Gruszka
@ 2023-10-30 15:10   ` Jeffrey Hugo
  0 siblings, 0 replies; 20+ messages in thread
From: Jeffrey Hugo @ 2023-10-30 15:10 UTC (permalink / raw)
  To: Stanislaw Gruszka, dri-devel; +Cc: Oded Gabbay, Jacek Lawrynowicz

On 10/28/2023 9:59 AM, Stanislaw Gruszka wrote:
> From: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> 
> VPU was rebranded as NPU (Neural Processing Unit) so user facing
> strings have to be updated but the code remains as is and the module
> is still called intel_vpu.ko.
> 
> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

* Re: [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28
  2023-10-28 15:59 [PATCH 0/8] accel/ivpu: Update for -next 2023-10-28 Stanislaw Gruszka
                   ` (7 preceding siblings ...)
  2023-10-28 15:59 ` [PATCH 8/8] accel/ivpu: Rename VPU to NPU in product strings Stanislaw Gruszka
@ 2023-10-31 16:06 ` Stanislaw Gruszka
  8 siblings, 0 replies; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-10-31 16:06 UTC (permalink / raw)
  To: dri-devel; +Cc: Oded Gabbay, Jeffrey Hugo, Jacek Lawrynowicz

On Sat, Oct 28, 2023 at 05:59:28PM +0200, Stanislaw Gruszka wrote:
> Various driver updates:
>  - MMU page tables handling optimizations
>  - Rebrand to NPU
>  - FW profiling frequency knob
>  - job done thread suspend handling
> 
> This is based on top of previous update:
> https://lore.kernel.org/dri-devel/20231028133415.1169975-1-stanislaw.gruszka@linux.intel.com/

Applied to drm-misc-next

Regards
Stanislaw

* Re: [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend
  2023-10-28 15:59 ` [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend Stanislaw Gruszka
  2023-10-30 15:03   ` Jeffrey Hugo
@ 2023-11-13  7:39   ` Daniel Vetter
  2023-11-13  8:46     ` Stanislaw Gruszka
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2023-11-13  7:39 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Jeffrey Hugo, dri-devel, Oded Gabbay, Maxime Ripard,
	Karol Wachowski, Jacek Lawrynowicz, Thomas Zimmermann

This conflicts with the kthread change in 6.7-rc1 6309727ef271
("kthread: add kthread_stop_put")

Please double-check that the conflict resolution I've done in drm-tip
is correct and then ask drm-misc/accel maintainers to backmerge -rc2
to bake this in properly. Adding them as fyi.
-Sima


On Sat, 28 Oct 2023 at 18:00, Stanislaw Gruszka
<stanislaw.gruszka@linux.intel.com> wrote:
>
> Stop job_done thread when going to suspend. Use kthread_park() instead
> of kthread_stop() to avoid memory allocation and potential failure
> on resume.
>
> Use separate function as thread wake up condition. Use spin lock to assure
> rx_msg_list is properly protected against concurrent access.
>
> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> ---
>  drivers/accel/ivpu/ivpu_drv.c |  2 ++
>  drivers/accel/ivpu/ivpu_ipc.c | 17 +++++++++++++++--
>  drivers/accel/ivpu/ivpu_job.c | 20 ++++++++++++++++----
>  drivers/accel/ivpu/ivpu_job.h |  2 ++
>  4 files changed, 35 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
> index 064cabef41bb..60277ff6af69 100644
> --- a/drivers/accel/ivpu/ivpu_drv.c
> +++ b/drivers/accel/ivpu/ivpu_drv.c
> @@ -378,6 +378,7 @@ int ivpu_boot(struct ivpu_device *vdev)
>         enable_irq(vdev->irq);
>         ivpu_hw_irq_enable(vdev);
>         ivpu_ipc_enable(vdev);
> +       ivpu_job_done_thread_enable(vdev);
>         return 0;
>  }
>
> @@ -389,6 +390,7 @@ int ivpu_shutdown(struct ivpu_device *vdev)
>         disable_irq(vdev->irq);
>         ivpu_ipc_disable(vdev);
>         ivpu_mmu_disable(vdev);
> +       ivpu_job_done_thread_disable(vdev);
>
>         ret = ivpu_hw_power_down(vdev);
>         if (ret)
> diff --git a/drivers/accel/ivpu/ivpu_ipc.c b/drivers/accel/ivpu/ivpu_ipc.c
> index d069d1e1f91d..270caef789bf 100644
> --- a/drivers/accel/ivpu/ivpu_ipc.c
> +++ b/drivers/accel/ivpu/ivpu_ipc.c
> @@ -202,6 +202,20 @@ ivpu_ipc_send(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons, struct v
>         return ret;
>  }
>
> +static int ivpu_ipc_rx_need_wakeup(struct ivpu_ipc_consumer *cons)
> +{
> +       int ret = 0;
> +
> +       if (IS_KTHREAD())
> +               ret |= (kthread_should_stop() || kthread_should_park());
> +
> +       spin_lock_irq(&cons->rx_msg_lock);
> +       ret |= !list_empty(&cons->rx_msg_list);
> +       spin_unlock_irq(&cons->rx_msg_lock);
> +
> +       return ret;
> +}
> +
>  int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
>                      struct ivpu_ipc_hdr *ipc_buf,
>                      struct vpu_jsm_msg *ipc_payload, unsigned long timeout_ms)
> @@ -211,8 +225,7 @@ int ivpu_ipc_receive(struct ivpu_device *vdev, struct ivpu_ipc_consumer *cons,
>         int wait_ret, ret = 0;
>
>         wait_ret = wait_event_interruptible_timeout(cons->rx_msg_wq,
> -                                                   (IS_KTHREAD() && kthread_should_stop()) ||
> -                                                   !list_empty(&cons->rx_msg_list),
> +                                                   ivpu_ipc_rx_need_wakeup(cons),
>                                                     msecs_to_jiffies(timeout_ms));
>
>         if (IS_KTHREAD() && kthread_should_stop())
> diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
> index 6e96c921547d..a245b2d44db7 100644
> --- a/drivers/accel/ivpu/ivpu_job.c
> +++ b/drivers/accel/ivpu/ivpu_job.c
> @@ -590,6 +590,11 @@ static int ivpu_job_done_thread(void *arg)
>                                 ivpu_pm_schedule_recovery(vdev);
>                         }
>                 }
> +               if (kthread_should_park()) {
> +                       ivpu_dbg(vdev, JOB, "Parked %s\n", __func__);
> +                       kthread_parkme();
> +                       ivpu_dbg(vdev, JOB, "Unparked %s\n", __func__);
> +               }
>         }
>
>         ivpu_ipc_consumer_del(vdev, &cons);
> @@ -610,9 +615,6 @@ int ivpu_job_done_thread_init(struct ivpu_device *vdev)
>                 return -EIO;
>         }
>
> -       get_task_struct(thread);
> -       wake_up_process(thread);
> -
>         vdev->job_done_thread = thread;
>
>         return 0;
> @@ -620,6 +622,16 @@ int ivpu_job_done_thread_init(struct ivpu_device *vdev)
>
>  void ivpu_job_done_thread_fini(struct ivpu_device *vdev)
>  {
> +       kthread_unpark(vdev->job_done_thread);
>         kthread_stop(vdev->job_done_thread);
> -       put_task_struct(vdev->job_done_thread);
> +}
> +
> +void ivpu_job_done_thread_disable(struct ivpu_device *vdev)
> +{
> +       kthread_park(vdev->job_done_thread);
> +}
> +
> +void ivpu_job_done_thread_enable(struct ivpu_device *vdev)
> +{
> +       kthread_unpark(vdev->job_done_thread);
>  }
> diff --git a/drivers/accel/ivpu/ivpu_job.h b/drivers/accel/ivpu/ivpu_job.h
> index aa1f0b9479b0..a8e914e5affc 100644
> --- a/drivers/accel/ivpu/ivpu_job.h
> +++ b/drivers/accel/ivpu/ivpu_job.h
> @@ -61,6 +61,8 @@ void ivpu_cmdq_reset_all_contexts(struct ivpu_device *vdev);
>
>  int ivpu_job_done_thread_init(struct ivpu_device *vdev);
>  void ivpu_job_done_thread_fini(struct ivpu_device *vdev);
> +void ivpu_job_done_thread_disable(struct ivpu_device *vdev);
> +void ivpu_job_done_thread_enable(struct ivpu_device *vdev);
>
>  void ivpu_jobs_abort_all(struct ivpu_device *vdev);
>
> --
> 2.25.1
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
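
The park/unpark handling in the quoted patch can be pictured with a
userspace analogue. The sketch below uses pthreads rather than the kernel
kthread API; struct worker and the worker_*() helpers are hypothetical
names, and the behaviour is simplified. worker_park() returns only once the
worker is idle, and the thread itself survives, so resuming needs no new
allocation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct worker {
	pthread_t tid;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool should_park;
	bool should_stop;
	bool parked;
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	while (!w->should_stop) {
		if (w->should_park) {
			w->parked = true;
			pthread_cond_broadcast(&w->cond); /* tell worker_park() we are idle */
			while (w->should_park && !w->should_stop)
				pthread_cond_wait(&w->cond, &w->lock);
			w->parked = false;
			continue;
		}
		/* Real work would go here; the sketch only waits for a request. */
		pthread_cond_wait(&w->cond, &w->lock);
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

static int worker_start(struct worker *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->should_park = w->should_stop = w->parked = false;
	return pthread_create(&w->tid, NULL, worker_fn, w);
}

/* Ask the worker to go idle and wait until it has done so. */
static void worker_park(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_park = true;
	pthread_cond_broadcast(&w->cond);
	while (!w->parked)
		pthread_cond_wait(&w->cond, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

/* Let a parked worker continue; the same thread is reused. */
static void worker_unpark(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_park = false;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Final teardown: request exit and join the thread. */
static void worker_stop(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_stop = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
	assert(pthread_join(w->tid, NULL) == 0);
}
```

A suspend/resume cycle then maps to worker_park() followed by
worker_unpark(), and only the final teardown calls worker_stop().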

* Re: [PATCH 3/8] accel/ivpu: Stop job_done_thread on suspend
  2023-11-13  7:39   ` Daniel Vetter
@ 2023-11-13  8:46     ` Stanislaw Gruszka
  0 siblings, 0 replies; 20+ messages in thread
From: Stanislaw Gruszka @ 2023-11-13  8:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jeffrey Hugo, Oded Gabbay, Maxime Ripard, Karol Wachowski,
	dri-devel, Thomas Zimmermann, Jacek Lawrynowicz

Hi 
On Mon, Nov 13, 2023 at 08:39:42AM +0100, Daniel Vetter wrote:
> This conflicts with the kthread change in 6.7-rc1 6309727ef271
> ("kthread: add kthread_stop_put")
> 
> Please double-check that the conflict resolution I've done in drm-tip
> is correct and then ask drm-misc/accel maintainers to backmerge -rc2
> to bake this in properly. Adding them as fyi.

I cannot rebuild drm-tip due to a core-to-CI merge problem. However, the resolution
in drm-rerere commit fa53f5a2888265d883eedb83a943613a410b9fc9 is correct.

Thomas, please do the backmerge.

Regards
Stanislaw
