From: Bart Van Assche <bvanassche@acm.org>
To: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org,
"Bart Van Assche" <bvanassche@acm.org>,
"Neil Armstrong" <neil.armstrong@linaro.org>,
"André Draszik" <andre.draszik@linaro.org>,
"Peter Wang" <peter.wang@mediatek.com>,
"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
"Matthias Brugger" <matthias.bgg@gmail.com>,
"AngeloGioacchino Del Regno"
<angelogioacchino.delregno@collabora.com>,
"Avri Altman" <avri.altman@sandisk.com>,
"Bean Huo" <beanhuo@micron.com>,
"Bao D. Nguyen" <quic_nguyenb@quicinc.com>,
"Adrian Hunter" <adrian.hunter@intel.com>
Subject: [PATCH v3 1/4] ufs: core: Fix IRQ lock inversion for the SCSI host lock
Date: Fri, 15 Aug 2025 08:58:23 -0700 [thread overview]
Message-ID: <20250815155842.472867-2-bvanassche@acm.org> (raw)
In-Reply-To: <20250815155842.472867-1-bvanassche@acm.org>
Commit 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service
routine to a threaded IRQ handler") introduced an IRQ lock inversion
issue. Fix this lock inversion by changing the spin_lock_irq() calls into
spin_lock_irqsave() calls in code that can be called either from interrupt
context or from thread context. This patch fixes the following lockdep
complaint:
WARNING: possible irq lock inversion dependency detected
6.12.30-android16-5-maybe-dirty-4k #1 Tainted: G W OE
--------------------------------------------------------
kworker/u28:0/12 just changed the state of lock:
ffffff881e29dd60 (&hba->clk_gating.lock){-...}-{2:2}, at: ufshcd_release_scsi_cmd+0x60/0x110
but this lock took another, HARDIRQ-unsafe lock in the past:
(shost->host_lock){+.+.}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(shost->host_lock);
local_irq_disable();
lock(&hba->clk_gating.lock);
lock(shost->host_lock);
<Interrupt>
lock(&hba->clk_gating.lock);
*** DEADLOCK ***
4 locks held by kworker/u28:0/12:
#0: ffffff8800ac6158 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x1bc/0x65c
#1: ffffffc085c93d70 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1e4/0x65c
#2: ffffff881e29c0e0 (&shost->scan_mutex){+.+.}-{3:3}, at: __scsi_add_device+0x74/0x120
#3: ffffff881960ea00 (&hwq->cq_lock){-...}-{2:2}, at: ufshcd_mcq_poll_cqe_lock+0x28/0x104
the shortest dependencies between 2nd lock and 1st lock:
-> (shost->host_lock){+.+.}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire+0x134/0x2b4
_raw_spin_lock+0x48/0x64
ufshcd_sl_intr+0x4c/0xa08
ufshcd_threaded_intr+0x70/0x12c
irq_thread_fn+0x48/0xa8
irq_thread+0x130/0x1ec
kthread+0x110/0x134
ret_from_fork+0x10/0x20
SOFTIRQ-ON-W at:
lock_acquire+0x134/0x2b4
_raw_spin_lock+0x48/0x64
ufshcd_sl_intr+0x4c/0xa08
ufshcd_threaded_intr+0x70/0x12c
irq_thread_fn+0x48/0xa8
irq_thread+0x130/0x1ec
kthread+0x110/0x134
ret_from_fork+0x10/0x20
INITIAL USE at:
lock_acquire+0x134/0x2b4
_raw_spin_lock+0x48/0x64
ufshcd_sl_intr+0x4c/0xa08
ufshcd_threaded_intr+0x70/0x12c
irq_thread_fn+0x48/0xa8
irq_thread+0x130/0x1ec
kthread+0x110/0x134
ret_from_fork+0x10/0x20
}
... key at: [<ffffffc085ba1a98>] scsi_host_alloc.__key+0x0/0x10
... acquired at:
_raw_spin_lock_irqsave+0x5c/0x80
__ufshcd_release+0x78/0x118
ufshcd_send_uic_cmd+0xe4/0x118
ufshcd_dme_set_attr+0x88/0x1c8
ufs_google_phy_initialization+0x68/0x418 [ufs]
ufs_google_link_startup_notify+0x78/0x27c [ufs]
ufshcd_link_startup+0x84/0x720
ufshcd_init+0xf3c/0x1330
ufshcd_pltfrm_init+0x728/0x7d8
ufs_google_probe+0x30/0x84 [ufs]
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa4/0x160
driver_probe_device+0x44/0x23c
__driver_attach_async_helper+0x60/0xd4
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
-> (&hba->clk_gating.lock){-...}-{2:2} {
IN-HARDIRQ-W at:
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_release_scsi_cmd+0x60/0x110
ufshcd_compl_one_cqe+0x2c0/0x3f4
ufshcd_mcq_poll_cqe_lock+0xb0/0x104
ufs_google_mcq_intr+0x80/0xa0 [ufs]
__handle_irq_event_percpu+0x104/0x32c
handle_irq_event+0x40/0x9c
handle_fasteoi_irq+0x170/0x2e8
generic_handle_domain_irq+0x58/0x80
gic_handle_irq+0x48/0x104
call_on_irq_stack+0x3c/0x50
do_interrupt_handler+0x7c/0xd8
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x3c/0x6c
debug_object_assert_init+0x16c/0x21c
__mod_timer+0x4c/0x48c
schedule_timeout+0xd4/0x16c
io_schedule_timeout+0x48/0x70
do_wait_for_common+0x100/0x194
wait_for_completion_io_timeout+0x48/0x6c
blk_execute_rq+0x124/0x17c
scsi_execute_cmd+0x18c/0x3f8
scsi_probe_and_add_lun+0x204/0xd74
__scsi_add_device+0xbc/0x120
ufshcd_async_scan+0x80/0x3c0
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
INITIAL USE at:
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_hold+0x34/0x14c
ufshcd_send_uic_cmd+0x28/0x118
ufshcd_dme_set_attr+0x88/0x1c8
ufs_google_phy_initialization+0x68/0x418 [ufs]
ufs_google_link_startup_notify+0x78/0x27c [ufs]
ufshcd_link_startup+0x84/0x720
ufshcd_init+0xf3c/0x1330
ufshcd_pltfrm_init+0x728/0x7d8
ufs_google_probe+0x30/0x84 [ufs]
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa4/0x160
driver_probe_device+0x44/0x23c
__driver_attach_async_helper+0x60/0xd4
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
}
... key at: [<ffffffc085ba6fe8>] ufshcd_init.__key+0x0/0x10
... acquired at:
mark_lock+0x1c4/0x224
__lock_acquire+0x438/0x2e1c
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_release_scsi_cmd+0x60/0x110
ufshcd_compl_one_cqe+0x2c0/0x3f4
ufshcd_mcq_poll_cqe_lock+0xb0/0x104
ufs_google_mcq_intr+0x80/0xa0 [ufs]
__handle_irq_event_percpu+0x104/0x32c
handle_irq_event+0x40/0x9c
handle_fasteoi_irq+0x170/0x2e8
generic_handle_domain_irq+0x58/0x80
gic_handle_irq+0x48/0x104
call_on_irq_stack+0x3c/0x50
do_interrupt_handler+0x7c/0xd8
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x3c/0x6c
debug_object_assert_init+0x16c/0x21c
__mod_timer+0x4c/0x48c
schedule_timeout+0xd4/0x16c
io_schedule_timeout+0x48/0x70
do_wait_for_common+0x100/0x194
wait_for_completion_io_timeout+0x48/0x6c
blk_execute_rq+0x124/0x17c
scsi_execute_cmd+0x18c/0x3f8
scsi_probe_and_add_lun+0x204/0xd74
__scsi_add_device+0xbc/0x120
ufshcd_async_scan+0x80/0x3c0
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
stack backtrace:
CPU: 6 UID: 0 PID: 12 Comm: kworker/u28:0 Tainted: G W OE 6.12.30-android16-5-maybe-dirty-4k #1 ccd4020fe444bdf629efc3b86df6be920b8df7d0
Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Spacecraft board based on MALIBU (DT)
Workqueue: async async_run_entry_fn
Call trace:
dump_backtrace+0xfc/0x17c
show_stack+0x18/0x28
dump_stack_lvl+0x40/0xa0
dump_stack+0x18/0x24
print_irq_inversion_bug+0x2fc/0x304
mark_lock_irq+0x388/0x4fc
mark_lock+0x1c4/0x224
__lock_acquire+0x438/0x2e1c
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_release_scsi_cmd+0x60/0x110
ufshcd_compl_one_cqe+0x2c0/0x3f4
ufshcd_mcq_poll_cqe_lock+0xb0/0x104
ufs_google_mcq_intr+0x80/0xa0 [ufs dd6f385554e109da094ab91d5f7be18625a2222a]
__handle_irq_event_percpu+0x104/0x32c
handle_irq_event+0x40/0x9c
handle_fasteoi_irq+0x170/0x2e8
generic_handle_domain_irq+0x58/0x80
gic_handle_irq+0x48/0x104
call_on_irq_stack+0x3c/0x50
do_interrupt_handler+0x7c/0xd8
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x3c/0x6c
debug_object_assert_init+0x16c/0x21c
__mod_timer+0x4c/0x48c
schedule_timeout+0xd4/0x16c
io_schedule_timeout+0x48/0x70
do_wait_for_common+0x100/0x194
wait_for_completion_io_timeout+0x48/0x6c
blk_execute_rq+0x124/0x17c
scsi_execute_cmd+0x18c/0x3f8
scsi_probe_and_add_lun+0x204/0xd74
__scsi_add_device+0xbc/0x120
ufshcd_async_scan+0x80/0x3c0
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: André Draszik <andre.draszik@linaro.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Fixes: 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service routine to a threaded IRQ handler")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufshcd.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 96ad57c3144b..ec4e860f5c53 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5559,7 +5559,7 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
irqreturn_t retval = IRQ_NONE;
struct uic_command *cmd;
- spin_lock(hba->host->host_lock);
+ guard(spinlock_irqsave)(hba->host->host_lock);
cmd = hba->active_uic_cmd;
if (WARN_ON_ONCE(!cmd))
goto unlock;
@@ -5586,8 +5586,6 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
ufshcd_add_uic_command_trace(hba, cmd, UFS_CMD_COMP);
unlock:
- spin_unlock(hba->host->host_lock);
-
return retval;
}
@@ -6920,7 +6918,7 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba, u32 intr_status)
bool queue_eh_work = false;
irqreturn_t retval = IRQ_NONE;
- spin_lock(hba->host->host_lock);
+ guard(spinlock_irqsave)(hba->host->host_lock);
hba->errors |= UFSHCD_ERROR_MASK & intr_status;
if (hba->errors & INT_FATAL_ERRORS) {
@@ -6979,7 +6977,7 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba, u32 intr_status)
*/
hba->errors = 0;
hba->uic_error = 0;
- spin_unlock(hba->host->host_lock);
+
return retval;
}
next prev parent reply other threads:[~2025-08-15 15:59 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-15 15:58 [PATCH v3 0/4] UFS driver bug fixes Bart Van Assche
2025-08-15 15:58 ` Bart Van Assche [this message]
2025-08-15 15:58 ` [PATCH v3 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl() Bart Van Assche
2025-08-15 15:58 ` [PATCH v3 3/4] ufs: core: Fix the return value documentation Bart Van Assche
2025-08-18 11:55 ` Peter Wang (王信友)
2025-08-15 15:58 ` [PATCH v3 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr() Bart Van Assche
2025-08-19 2:33 ` [PATCH v3 0/4] UFS driver bug fixes Martin K. Petersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250815155842.472867-2-bvanassche@acm.org \
--to=bvanassche@acm.org \
--cc=James.Bottomley@HansenPartnership.com \
--cc=adrian.hunter@intel.com \
--cc=andre.draszik@linaro.org \
--cc=angelogioacchino.delregno@collabora.com \
--cc=avri.altman@sandisk.com \
--cc=beanhuo@micron.com \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=matthias.bgg@gmail.com \
--cc=neil.armstrong@linaro.org \
--cc=peter.wang@mediatek.com \
--cc=quic_nguyenb@quicinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).