From: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
To: stable@vger.kernel.org
Cc: Karol Wachowski <karol.wachowski@intel.com>,
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Subject: [PATCH 2/3] accel/ivpu: Fix locking order in ivpu_job_submit
Date: Wed, 30 Apr 2025 14:36:52 +0200
Message-ID: <20250430123653.3748811-3-jacek.lawrynowicz@linux.intel.com>
In-Reply-To: <20250430123653.3748811-1-jacek.lawrynowicz@linux.intel.com>

From: Karol Wachowski <karol.wachowski@intel.com>

commit ab680dc6c78aa035e944ecc8c48a1caab9f39924 upstream.

Fix deadlock in job submission and abort handling.
When a thread aborts currently executing jobs due to a fault,
it first locks the global lock protecting submitted_jobs (#1).
After the last job is destroyed, it proceeds to release the related context
and locks file_priv (#2). Meanwhile, in the job submission thread,
the file_priv lock (#2) is taken first, and then the submitted_jobs
lock (#1) is obtained when a job is added to the submitted jobs list.

      CPU0                                 CPU1
      ----                                 ----
 (for example due to a fault)         (jobs submissions keep coming)

 lock(&vdev->submitted_jobs_lock) #1
 ivpu_jobs_abort_all()
 job_destroy()
                                      lock(&file_priv->lock)           #2
                                      lock(&vdev->submitted_jobs_lock) #1
 file_priv_release()
 lock(&vdev->context_list_lock)
 lock(&file_priv->lock)           #2

This order of locking causes a deadlock. To resolve this issue,
change the order of locking in ivpu_job_submit().

Cc: <stable@vger.kernel.org> # v6.14
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-12-maciej.falkowski@linux.intel.com
---
drivers/accel/ivpu/ivpu_job.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index fc91681469e33..5b6d93c20b2da 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -532,6 +532,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 	if (ret < 0)
 		return ret;

+	mutex_lock(&vdev->submitted_jobs_lock);
 	mutex_lock(&file_priv->lock);

 	cmdq = ivpu_cmdq_acquire(file_priv, priority);
@@ -539,11 +540,9 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 		ivpu_warn_ratelimited(vdev, "Failed to get job queue, ctx %d engine %d prio %d\n",
 				      file_priv->ctx.id, job->engine_idx, priority);
 		ret = -EINVAL;
-		goto err_unlock_file_priv;
+		goto err_unlock;
 	}

-	mutex_lock(&vdev->submitted_jobs_lock);
-
 	is_first_job = xa_empty(&vdev->submitted_jobs_xa);
 	ret = xa_alloc_cyclic(&vdev->submitted_jobs_xa, &job->job_id, job, file_priv->job_limit,
 			      &file_priv->job_id_next, GFP_KERNEL);
@@ -551,7 +550,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 		ivpu_dbg(vdev, JOB, "Too many active jobs in ctx %d\n",
 			 file_priv->ctx.id);
 		ret = -EBUSY;
-		goto err_unlock_submitted_jobs;
+		goto err_unlock;
 	}

 	ret = ivpu_cmdq_push_job(cmdq, job);
@@ -574,22 +573,20 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 		 job->job_id, file_priv->ctx.id, job->engine_idx, priority,
 		 job->cmd_buf_vpu_addr, cmdq->jobq->header.tail);

-	mutex_unlock(&vdev->submitted_jobs_lock);
 	mutex_unlock(&file_priv->lock);

 	if (unlikely(ivpu_test_mode & IVPU_TEST_MODE_NULL_HW)) {
-		mutex_lock(&vdev->submitted_jobs_lock);
 		ivpu_job_signal_and_destroy(vdev, job->job_id, VPU_JSM_STATUS_SUCCESS);
-		mutex_unlock(&vdev->submitted_jobs_lock);
 	}

+	mutex_unlock(&vdev->submitted_jobs_lock);
+
 	return 0;

 err_erase_xa:
 	xa_erase(&vdev->submitted_jobs_xa, job->job_id);
-err_unlock_submitted_jobs:
+err_unlock:
 	mutex_unlock(&vdev->submitted_jobs_lock);
-err_unlock_file_priv:
 	mutex_unlock(&file_priv->lock);
 	ivpu_rpm_put(vdev);
 	return ret;
--
2.45.1
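
The ordering rule this patch establishes (take submitted_jobs_lock, #1, before file_priv->lock, #2, on every path) can be sketched in userspace with POSIX mutexes. This is only an illustration under stated assumptions, not driver code: the two pthread mutexes stand in for the kernel mutexes, and pending_jobs, ctx_job_count, submit_path(), abort_path() and run_lock_order_demo() are hypothetical names that do not exist in ivpu. Because both threads acquire #1 before #2, neither can hold one lock while waiting on a holder of the other, so the run is expected to terminate.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

/* Hypothetical stand-ins for vdev->submitted_jobs_lock (#1) and file_priv->lock (#2). */
static pthread_mutex_t submitted_jobs_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t file_priv_lock = PTHREAD_MUTEX_INITIALIZER;

static long pending_jobs;  /* protected by submitted_jobs_lock (#1) */
static long ctx_job_count; /* protected by file_priv_lock (#2) */

/* Models the patched submission path: #1 is taken first, then #2. */
static void *submit_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&submitted_jobs_lock);
		pthread_mutex_lock(&file_priv_lock);
		pending_jobs++;
		ctx_job_count++;
		pthread_mutex_unlock(&file_priv_lock);
		pthread_mutex_unlock(&submitted_jobs_lock);
	}
	return NULL;
}

/* Models the abort path: #1 is held while a job is dropped, #2 is taken under it. */
static void *abort_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&submitted_jobs_lock);
		if (pending_jobs > 0) {
			pending_jobs--;
			pthread_mutex_lock(&file_priv_lock);
			ctx_job_count--;
			pthread_mutex_unlock(&file_priv_lock);
		}
		pthread_mutex_unlock(&submitted_jobs_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns 0 if both threads finish and the counters agree. */
int run_lock_order_demo(void)
{
	pthread_t submitter, aborter;

	if (pthread_create(&submitter, NULL, submit_path, NULL) != 0)
		return -1;
	if (pthread_create(&aborter, NULL, abort_path, NULL) != 0) {
		pthread_join(submitter, NULL);
		return -1;
	}
	pthread_join(submitter, NULL);
	pthread_join(aborter, NULL);

	/* Both counters only change while #1 is held, so they must match. */
	assert(pending_jobs >= 0);
	return pending_jobs == ctx_job_count ? 0 : -1;
}
```

Calling run_lock_order_demo() from a main() and building with -pthread is enough to try it. Swapping the two pthread_mutex_lock() calls in submit_path() reproduces the inverted order described in the commit message, and the program may then hang.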