public inbox for stable@vger.kernel.org
* [PATCH] accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW
@ 2025-04-08  9:57 Jacek Lawrynowicz
  2025-04-10  7:49 ` Jacek Lawrynowicz
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jacek Lawrynowicz @ 2025-04-08  9:57 UTC
  To: linux-kernel; +Cc: stable, Karol Wachowski, Jacek Lawrynowicz

From: Karol Wachowski <karol.wachowski@intel.com>

commit dad945c27a42dfadddff1049cf5ae417209a8996 upstream.

Trigger recovery of the NPU upon receiving a HW context violation from
the firmware. The context violation error is a fatal error that prevents
any subsequent jobs from being executed. Without this fix it is
necessary to reload the driver to restore the NPU operational state.

This is a simplified version of the upstream commit, as the full
implementation would require all engine reset/resume logic to be backported.

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-13-maciej.falkowski@linux.intel.com
Fixes: 0adff3b0ef12 ("accel/ivpu: Share NPU busy time in sysfs")
Cc: <stable@vger.kernel.org> # v6.11+
---
 drivers/accel/ivpu/ivpu_job.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index be2e2bf0f43f0..70b3676974407 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -482,6 +482,8 @@ static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *
 	return job;
 }
 
+#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
 {
 	struct ivpu_job *job;
@@ -490,6 +492,9 @@ static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32
 	if (!job)
 		return -ENOENT;
 
+	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW)
+		ivpu_pm_trigger_recovery(vdev, "HW context violation");
+
 	if (job->file_priv->has_mmu_faults)
 		job_status = DRM_IVPU_JOB_STATUS_ABORTED;
 
-- 
2.45.1

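For readers who want to see the control flow outside the driver, the check added by this patch can be sketched as a small userspace mock. Only the status value 0xEU and the reason string "HW context violation" come from the patch. The names mock_device, mock_trigger_recovery and mock_job_done are hypothetical stand-ins for struct ivpu_device, ivpu_pm_trigger_recovery() and the relevant part of ivpu_job_signal_and_destroy(). This is an illustration under those assumptions, not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Status value taken from the patch; everything else below is a mock. */
#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU

/* Hypothetical stand-in for struct ivpu_device: only tracks recovery requests. */
struct mock_device {
	unsigned int recovery_requests;
	const char *last_reason;
};

/* Hypothetical stand-in for ivpu_pm_trigger_recovery(): records the request. */
static void mock_trigger_recovery(struct mock_device *dev, const char *reason)
{
	dev->recovery_requests++;
	dev->last_reason = reason;
	printf("recovery requested: %s\n", reason);
}

/*
 * Mirrors the check added to ivpu_job_signal_and_destroy(): a HW context
 * violation reported by the firmware requests a device recovery.
 * Returns true when recovery was requested for this job status.
 */
static bool mock_job_done(struct mock_device *dev, uint32_t job_status)
{
	assert(dev != NULL);

	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW) {
		mock_trigger_recovery(dev, "HW context violation");
		return true;
	}

	return false;
}
```

Calling mock_job_done() with any status other than 0xEU leaves the recovery counter untouched, while 0xEU bumps it once per completed job. In the patch itself, the job is still signalled and destroyed after the check, so the recovery request does not replace normal completion handling.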


end of thread, other threads:[~2025-04-30 12:50 UTC | newest]

Thread overview: 10+ messages
2025-04-08  9:57 [PATCH] accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW Jacek Lawrynowicz
2025-04-10  7:49 ` Jacek Lawrynowicz
2025-04-10  8:03   ` Greg Kroah-Hartman
2025-04-10  9:27     ` Jacek Lawrynowicz
2025-04-10 15:53 ` Sasha Levin
2025-04-22 12:17 ` Greg KH
2025-04-24 10:22   ` Jacek Lawrynowicz
2025-04-24 10:34     ` Greg KH
2025-04-30 12:50       ` Jacek Lawrynowicz
  -- strict thread matches above, loose matches on Subject: below --
2025-04-01 11:08 Jacek Lawrynowicz
