* [PATCH 0/4] UFS driver bug fixes
@ 2025-08-11 15:46 Bart Van Assche
2025-08-11 15:46 ` [PATCH 1/4] ufs: core: Fix IRQ lock inversion for the SCSI host lock Bart Van Assche
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Bart Van Assche @ 2025-08-11 15:46 UTC (permalink / raw)
To: Martin K . Petersen; +Cc: linux-scsi, Bart Van Assche
Hi Martin,
This patch series includes three bug fixes and one rename patch for the UFS
driver. Please consider this patch series for the next merge window.
Thanks,
Bart.
Bart Van Assche (4):
ufs: core: Fix IRQ lock inversion for the SCSI host lock
ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl()
ufs: core: Fix the return value documentation
ufs: core: Rename ufshcd_wait_for_doorbell_clr()
drivers/ufs/core/ufshcd.c | 76 ++++++++++++++++++++++-----------------
1 file changed, 44 insertions(+), 32 deletions(-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] ufs: core: Fix IRQ lock inversion for the SCSI host lock
2025-08-11 15:46 [PATCH 0/4] UFS driver bug fixes Bart Van Assche
@ 2025-08-11 15:46 ` Bart Van Assche
2025-08-11 15:46 ` [PATCH 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl() Bart Van Assche
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Bart Van Assche @ 2025-08-11 15:46 UTC (permalink / raw)
To: Martin K . Petersen
Cc: linux-scsi, Bart Van Assche, Neil Armstrong, André Draszik,
Peter Wang, James E.J. Bottomley, Matthias Brugger,
AngeloGioacchino Del Regno, Avri Altman, Bean Huo,
Manivannan Sadhasivam, Bao D. Nguyen, Adrian Hunter
Commit 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service
routine to a threaded IRQ handler") introduced an IRQ lock inversion
issue. Fix this lock inversion by changing the spin_lock_irq() calls into
spin_lock_irqsave() calls in code that can be called either from interrupt
context or from thread context. This patch fixes the following lockdep
complaint:
WARNING: possible irq lock inversion dependency detected
6.12.30-android16-5-maybe-dirty-4k #1 Tainted: G W OE
--------------------------------------------------------
kworker/u28:0/12 just changed the state of lock:
ffffff881e29dd60 (&hba->clk_gating.lock){-...}-{2:2}, at: ufshcd_release_scsi_cmd+0x60/0x110
but this lock took another, HARDIRQ-unsafe lock in the past:
(shost->host_lock){+.+.}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(shost->host_lock);
local_irq_disable();
lock(&hba->clk_gating.lock);
lock(shost->host_lock);
<Interrupt>
lock(&hba->clk_gating.lock);
*** DEADLOCK ***
4 locks held by kworker/u28:0/12:
#0: ffffff8800ac6158 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x1bc/0x65c
#1: ffffffc085c93d70 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1e4/0x65c
#2: ffffff881e29c0e0 (&shost->scan_mutex){+.+.}-{3:3}, at: __scsi_add_device+0x74/0x120
#3: ffffff881960ea00 (&hwq->cq_lock){-...}-{2:2}, at: ufshcd_mcq_poll_cqe_lock+0x28/0x104
the shortest dependencies between 2nd lock and 1st lock:
-> (shost->host_lock){+.+.}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire+0x134/0x2b4
_raw_spin_lock+0x48/0x64
ufshcd_sl_intr+0x4c/0xa08
ufshcd_threaded_intr+0x70/0x12c
irq_thread_fn+0x48/0xa8
irq_thread+0x130/0x1ec
kthread+0x110/0x134
ret_from_fork+0x10/0x20
SOFTIRQ-ON-W at:
lock_acquire+0x134/0x2b4
_raw_spin_lock+0x48/0x64
ufshcd_sl_intr+0x4c/0xa08
ufshcd_threaded_intr+0x70/0x12c
irq_thread_fn+0x48/0xa8
irq_thread+0x130/0x1ec
kthread+0x110/0x134
ret_from_fork+0x10/0x20
INITIAL USE at:
lock_acquire+0x134/0x2b4
_raw_spin_lock+0x48/0x64
ufshcd_sl_intr+0x4c/0xa08
ufshcd_threaded_intr+0x70/0x12c
irq_thread_fn+0x48/0xa8
irq_thread+0x130/0x1ec
kthread+0x110/0x134
ret_from_fork+0x10/0x20
}
... key at: [<ffffffc085ba1a98>] scsi_host_alloc.__key+0x0/0x10
... acquired at:
_raw_spin_lock_irqsave+0x5c/0x80
__ufshcd_release+0x78/0x118
ufshcd_send_uic_cmd+0xe4/0x118
ufshcd_dme_set_attr+0x88/0x1c8
ufs_google_phy_initialization+0x68/0x418 [ufs]
ufs_google_link_startup_notify+0x78/0x27c [ufs]
ufshcd_link_startup+0x84/0x720
ufshcd_init+0xf3c/0x1330
ufshcd_pltfrm_init+0x728/0x7d8
ufs_google_probe+0x30/0x84 [ufs]
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa4/0x160
driver_probe_device+0x44/0x23c
__driver_attach_async_helper+0x60/0xd4
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
-> (&hba->clk_gating.lock){-...}-{2:2} {
IN-HARDIRQ-W at:
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_release_scsi_cmd+0x60/0x110
ufshcd_compl_one_cqe+0x2c0/0x3f4
ufshcd_mcq_poll_cqe_lock+0xb0/0x104
ufs_google_mcq_intr+0x80/0xa0 [ufs]
__handle_irq_event_percpu+0x104/0x32c
handle_irq_event+0x40/0x9c
handle_fasteoi_irq+0x170/0x2e8
generic_handle_domain_irq+0x58/0x80
gic_handle_irq+0x48/0x104
call_on_irq_stack+0x3c/0x50
do_interrupt_handler+0x7c/0xd8
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x3c/0x6c
debug_object_assert_init+0x16c/0x21c
__mod_timer+0x4c/0x48c
schedule_timeout+0xd4/0x16c
io_schedule_timeout+0x48/0x70
do_wait_for_common+0x100/0x194
wait_for_completion_io_timeout+0x48/0x6c
blk_execute_rq+0x124/0x17c
scsi_execute_cmd+0x18c/0x3f8
scsi_probe_and_add_lun+0x204/0xd74
__scsi_add_device+0xbc/0x120
ufshcd_async_scan+0x80/0x3c0
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
INITIAL USE at:
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_hold+0x34/0x14c
ufshcd_send_uic_cmd+0x28/0x118
ufshcd_dme_set_attr+0x88/0x1c8
ufs_google_phy_initialization+0x68/0x418 [ufs]
ufs_google_link_startup_notify+0x78/0x27c [ufs]
ufshcd_link_startup+0x84/0x720
ufshcd_init+0xf3c/0x1330
ufshcd_pltfrm_init+0x728/0x7d8
ufs_google_probe+0x30/0x84 [ufs]
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa4/0x160
driver_probe_device+0x44/0x23c
__driver_attach_async_helper+0x60/0xd4
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
}
... key at: [<ffffffc085ba6fe8>] ufshcd_init.__key+0x0/0x10
... acquired at:
mark_lock+0x1c4/0x224
__lock_acquire+0x438/0x2e1c
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_release_scsi_cmd+0x60/0x110
ufshcd_compl_one_cqe+0x2c0/0x3f4
ufshcd_mcq_poll_cqe_lock+0xb0/0x104
ufs_google_mcq_intr+0x80/0xa0 [ufs]
__handle_irq_event_percpu+0x104/0x32c
handle_irq_event+0x40/0x9c
handle_fasteoi_irq+0x170/0x2e8
generic_handle_domain_irq+0x58/0x80
gic_handle_irq+0x48/0x104
call_on_irq_stack+0x3c/0x50
do_interrupt_handler+0x7c/0xd8
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x3c/0x6c
debug_object_assert_init+0x16c/0x21c
__mod_timer+0x4c/0x48c
schedule_timeout+0xd4/0x16c
io_schedule_timeout+0x48/0x70
do_wait_for_common+0x100/0x194
wait_for_completion_io_timeout+0x48/0x6c
blk_execute_rq+0x124/0x17c
scsi_execute_cmd+0x18c/0x3f8
scsi_probe_and_add_lun+0x204/0xd74
__scsi_add_device+0xbc/0x120
ufshcd_async_scan+0x80/0x3c0
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
stack backtrace:
CPU: 6 UID: 0 PID: 12 Comm: kworker/u28:0 Tainted: G W OE 6.12.30-android16-5-maybe-dirty-4k #1 ccd4020fe444bdf629efc3b86df6be920b8df7d0
Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Spacecraft board based on MALIBU (DT)
Workqueue: async async_run_entry_fn
Call trace:
dump_backtrace+0xfc/0x17c
show_stack+0x18/0x28
dump_stack_lvl+0x40/0xa0
dump_stack+0x18/0x24
print_irq_inversion_bug+0x2fc/0x304
mark_lock_irq+0x388/0x4fc
mark_lock+0x1c4/0x224
__lock_acquire+0x438/0x2e1c
lock_acquire+0x134/0x2b4
_raw_spin_lock_irqsave+0x5c/0x80
ufshcd_release_scsi_cmd+0x60/0x110
ufshcd_compl_one_cqe+0x2c0/0x3f4
ufshcd_mcq_poll_cqe_lock+0xb0/0x104
ufs_google_mcq_intr+0x80/0xa0 [ufs dd6f385554e109da094ab91d5f7be18625a2222a]
__handle_irq_event_percpu+0x104/0x32c
handle_irq_event+0x40/0x9c
handle_fasteoi_irq+0x170/0x2e8
generic_handle_domain_irq+0x58/0x80
gic_handle_irq+0x48/0x104
call_on_irq_stack+0x3c/0x50
do_interrupt_handler+0x7c/0xd8
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x3c/0x6c
debug_object_assert_init+0x16c/0x21c
__mod_timer+0x4c/0x48c
schedule_timeout+0xd4/0x16c
io_schedule_timeout+0x48/0x70
do_wait_for_common+0x100/0x194
wait_for_completion_io_timeout+0x48/0x6c
blk_execute_rq+0x124/0x17c
scsi_execute_cmd+0x18c/0x3f8
scsi_probe_and_add_lun+0x204/0xd74
__scsi_add_device+0xbc/0x120
ufshcd_async_scan+0x80/0x3c0
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: André Draszik <andre.draszik@linaro.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Fixes: 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service routine to a threaded IRQ handler")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufshcd.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 2e1fa8cf83f5..3e94364f45d5 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5559,7 +5559,7 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
irqreturn_t retval = IRQ_NONE;
struct uic_command *cmd;
- spin_lock(hba->host->host_lock);
+ guard(spinlock_irqsave)(hba->host->host_lock);
cmd = hba->active_uic_cmd;
if (WARN_ON_ONCE(!cmd))
goto unlock;
@@ -5586,8 +5586,6 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
ufshcd_add_uic_command_trace(hba, cmd, UFS_CMD_COMP);
unlock:
- spin_unlock(hba->host->host_lock);
-
return retval;
}
@@ -6915,7 +6913,7 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba, u32 intr_status)
bool queue_eh_work = false;
irqreturn_t retval = IRQ_NONE;
- spin_lock(hba->host->host_lock);
+ guard(spinlock_irqsave)(hba->host->host_lock);
hba->errors |= UFSHCD_ERROR_MASK & intr_status;
if (hba->errors & INT_FATAL_ERRORS) {
@@ -6974,7 +6972,7 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba, u32 intr_status)
*/
hba->errors = 0;
hba->uic_error = 0;
- spin_unlock(hba->host->host_lock);
+
return retval;
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl()
2025-08-11 15:46 [PATCH 0/4] UFS driver bug fixes Bart Van Assche
2025-08-11 15:46 ` [PATCH 1/4] ufs: core: Fix IRQ lock inversion for the SCSI host lock Bart Van Assche
@ 2025-08-11 15:46 ` Bart Van Assche
2025-08-12 9:01 ` Peter Wang (王信友)
2025-08-11 15:46 ` [PATCH 3/4] ufs: core: Fix the return value documentation Bart Van Assche
2025-08-11 15:46 ` [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr() Bart Van Assche
3 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2025-08-11 15:46 UTC (permalink / raw)
To: Martin K . Petersen
Cc: linux-scsi, Bart Van Assche, James E.J. Bottomley, Peter Wang,
Avri Altman, Bean Huo, Manivannan Sadhasivam, Bao D. Nguyen,
Adrian Hunter
The UIC completion interrupt may be disabled while an UIC command is
being processed. When the UIC completion interrupt is reenabled, an
UIC interrupt is triggered and the WARN_ON_ONCE(!cmd) statement is hit.
Hence this patch that removes this kernel warning.
Fixes: fcd8b0450a9a ("scsi: ufs: core: Make ufshcd_uic_cmd_compl() easier to analyze")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufshcd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 3e94364f45d5..12af4e0824ce 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5561,7 +5561,7 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
guard(spinlock_irqsave)(hba->host->host_lock);
cmd = hba->active_uic_cmd;
- if (WARN_ON_ONCE(!cmd))
+ if (!cmd)
goto unlock;
if (ufshcd_is_auto_hibern8_error(hba, intr_status))
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] ufs: core: Fix the return value documentation
2025-08-11 15:46 [PATCH 0/4] UFS driver bug fixes Bart Van Assche
2025-08-11 15:46 ` [PATCH 1/4] ufs: core: Fix IRQ lock inversion for the SCSI host lock Bart Van Assche
2025-08-11 15:46 ` [PATCH 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl() Bart Van Assche
@ 2025-08-11 15:46 ` Bart Van Assche
2025-08-12 9:02 ` Peter Wang (王信友)
2025-08-11 15:46 ` [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr() Bart Van Assche
3 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2025-08-11 15:46 UTC (permalink / raw)
To: Martin K . Petersen
Cc: linux-scsi, Bart Van Assche, James E.J. Bottomley, Peter Wang,
Avri Altman, Bean Huo, Manivannan Sadhasivam, Bao D. Nguyen,
Adrian Hunter
ufshcd_wait_for_dev_cmd() and all its callers can return an OCS error.
OCS errors are represented by positive integers. Remove the WARN_ONCE()
statements that complain about positive error codes and update the
documentation.
Keep the behavior of ufshcd_wait_for_dev_cmd() because this return value
may end be passed as the second argument of bsg_job_done() and
bsg_job_done() handles positive and negative error codes differently.
Fixes: cc59f3b68542 ("scsi: ufs: core: Improve return value documentation")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufshcd.c | 62 ++++++++++++++++++++++++---------------
1 file changed, 38 insertions(+), 24 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 12af4e0824ce..b5057ce27aa9 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -3199,7 +3199,8 @@ ufshcd_dev_cmd_completion(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
}
/*
- * Return: 0 upon success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba,
struct ufshcd_lrb *lrbp, int max_timeout)
@@ -3275,7 +3276,6 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba,
}
}
- WARN_ONCE(err > 0, "Incorrect return value %d > 0\n", err);
return err;
}
@@ -3294,7 +3294,8 @@ static void ufshcd_dev_man_unlock(struct ufs_hba *hba)
}
/*
- * Return: 0 upon success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int ufshcd_issue_dev_cmd(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
const u32 tag, int timeout)
@@ -3317,7 +3318,8 @@ static int ufshcd_issue_dev_cmd(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
* @cmd_type: specifies the type (NOP, Query...)
* @timeout: timeout in milliseconds
*
- * Return: 0 upon success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*
* NOTE: Since there is only one available tag for device management commands,
* it is expected you hold the hba->dev_cmd.lock mutex.
@@ -3363,6 +3365,10 @@ static inline void ufshcd_init_query(struct ufs_hba *hba,
(*request)->upiu_req.selector = selector;
}
+/*
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
+ */
static int ufshcd_query_flag_retry(struct ufs_hba *hba,
enum query_opcode opcode, enum flag_idn idn, u8 index, bool *flag_res)
{
@@ -3383,7 +3389,6 @@ static int ufshcd_query_flag_retry(struct ufs_hba *hba,
dev_err(hba->dev,
"%s: query flag, opcode %d, idn %d, failed with error %d after %d retries\n",
__func__, opcode, idn, ret, retries);
- WARN_ONCE(ret > 0, "Incorrect return value %d > 0\n", ret);
return ret;
}
@@ -3395,7 +3400,8 @@ static int ufshcd_query_flag_retry(struct ufs_hba *hba,
* @index: flag index to access
* @flag_res: the flag value after the query request completes
*
- * Return: 0 for success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode,
enum flag_idn idn, u8 index, bool *flag_res)
@@ -3451,7 +3457,6 @@ int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode,
out_unlock:
ufshcd_dev_man_unlock(hba);
- WARN_ONCE(err > 0, "Incorrect return value %d > 0\n", err);
return err;
}
@@ -3464,8 +3469,9 @@ int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode,
* @selector: selector field
* @attr_val: the attribute value after the query request completes
*
- * Return: 0 upon success; < 0 upon failure.
-*/
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
+ */
int ufshcd_query_attr(struct ufs_hba *hba, enum query_opcode opcode,
enum attr_idn idn, u8 index, u8 selector, u32 *attr_val)
{
@@ -3513,7 +3519,6 @@ int ufshcd_query_attr(struct ufs_hba *hba, enum query_opcode opcode,
out_unlock:
ufshcd_dev_man_unlock(hba);
- WARN_ONCE(err > 0, "Incorrect return value %d > 0\n", err);
return err;
}
@@ -3528,8 +3533,9 @@ int ufshcd_query_attr(struct ufs_hba *hba, enum query_opcode opcode,
* @attr_val: the attribute value after the query request
* completes
*
- * Return: 0 for success; < 0 upon failure.
-*/
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
+ */
int ufshcd_query_attr_retry(struct ufs_hba *hba,
enum query_opcode opcode, enum attr_idn idn, u8 index, u8 selector,
u32 *attr_val)
@@ -3551,12 +3557,12 @@ int ufshcd_query_attr_retry(struct ufs_hba *hba,
dev_err(hba->dev,
"%s: query attribute, idn %d, failed with error %d after %d retries\n",
__func__, idn, ret, QUERY_REQ_RETRIES);
- WARN_ONCE(ret > 0, "Incorrect return value %d > 0\n", ret);
return ret;
}
/*
- * Return: 0 if successful; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int __ufshcd_query_descriptor(struct ufs_hba *hba,
enum query_opcode opcode, enum desc_idn idn, u8 index,
@@ -3615,7 +3621,6 @@ static int __ufshcd_query_descriptor(struct ufs_hba *hba,
out_unlock:
hba->dev_cmd.query.descriptor = NULL;
ufshcd_dev_man_unlock(hba);
- WARN_ONCE(err > 0, "Incorrect return value %d > 0\n", err);
return err;
}
@@ -3632,7 +3637,8 @@ static int __ufshcd_query_descriptor(struct ufs_hba *hba,
* The buf_len parameter will contain, on return, the length parameter
* received on the response.
*
- * Return: 0 for success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
int ufshcd_query_descriptor_retry(struct ufs_hba *hba,
enum query_opcode opcode,
@@ -3650,7 +3656,6 @@ int ufshcd_query_descriptor_retry(struct ufs_hba *hba,
break;
}
- WARN_ONCE(err > 0, "Incorrect return value %d > 0\n", err);
return err;
}
@@ -3663,7 +3668,8 @@ int ufshcd_query_descriptor_retry(struct ufs_hba *hba,
* @param_read_buf: pointer to buffer where parameter would be read
* @param_size: sizeof(param_read_buf)
*
- * Return: 0 in case of success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
int ufshcd_read_desc_param(struct ufs_hba *hba,
enum desc_idn desc_id,
@@ -3730,7 +3736,6 @@ int ufshcd_read_desc_param(struct ufs_hba *hba,
out:
if (is_kmalloc)
kfree(desc_buf);
- WARN_ONCE(ret > 0, "Incorrect return value %d > 0\n", ret);
return ret;
}
@@ -4781,7 +4786,8 @@ EXPORT_SYMBOL_GPL(ufshcd_config_pwr_mode);
*
* Set fDeviceInit flag and poll until device toggles it.
*
- * Return: 0 upon success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int ufshcd_complete_dev_init(struct ufs_hba *hba)
{
@@ -5135,7 +5141,8 @@ static int ufshcd_link_startup(struct ufs_hba *hba)
* not respond with NOP IN UPIU within timeout of %NOP_OUT_TIMEOUT
* and we retry sending NOP OUT for %NOP_OUT_RETRIES iterations.
*
- * Return: 0 upon success; < 0 upon failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int ufshcd_verify_dev_init(struct ufs_hba *hba)
{
@@ -5867,7 +5874,8 @@ static inline int ufshcd_enable_ee(struct ufs_hba *hba, u16 mask)
* as the device is allowed to manage its own way of handling background
* operations.
*
- * Return: zero on success, non-zero on failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int ufshcd_enable_auto_bkops(struct ufs_hba *hba)
{
@@ -5906,7 +5914,8 @@ static int ufshcd_enable_auto_bkops(struct ufs_hba *hba)
* host is idle so that BKOPS are managed effectively without any negative
* impacts.
*
- * Return: zero on success, non-zero on failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
static int ufshcd_disable_auto_bkops(struct ufs_hba *hba)
{
@@ -6056,6 +6065,10 @@ static void ufshcd_bkops_exception_event_handler(struct ufs_hba *hba)
__func__, err);
}
+/*
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
+ */
int ufshcd_read_device_lvl_exception_id(struct ufs_hba *hba, u64 *exception_id)
{
struct utp_upiu_query_v4_0 *upiu_resp;
@@ -7447,7 +7460,8 @@ int ufshcd_exec_raw_upiu_cmd(struct ufs_hba *hba,
* @sg_list: Pointer to SG list when DATA IN/OUT UPIU is required in ARPMB operation
* @dir: DMA direction
*
- * Return: zero on success, non-zero on failure.
+ * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS device
+ * reported an OCS error.
*/
int ufshcd_advanced_rpmb_req_handler(struct ufs_hba *hba, struct utp_upiu_req *req_upiu,
struct utp_upiu_req *rsp_upiu, struct ufs_ehs *req_ehs,
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr()
2025-08-11 15:46 [PATCH 0/4] UFS driver bug fixes Bart Van Assche
` (2 preceding siblings ...)
2025-08-11 15:46 ` [PATCH 3/4] ufs: core: Fix the return value documentation Bart Van Assche
@ 2025-08-11 15:46 ` Bart Van Assche
2025-08-12 9:03 ` Peter Wang (王信友)
3 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2025-08-11 15:46 UTC (permalink / raw)
To: Martin K . Petersen
Cc: linux-scsi, Bart Van Assche, James E.J. Bottomley, Peter Wang,
Avri Altman, Bean Huo, Manivannan Sadhasivam, Bao D. Nguyen,
Adrian Hunter
The name ufshcd_wait_for_doorbell_clr() refers to legacy mode. Commit
8d077ede48c1 ("scsi: ufs: Optimize the command queueing code") added
support for MCQ mode in this function. Since then the name of this
function is misleading. Hence change the name of this function into
something that is appropriate for both legacy and MCQ mode.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufshcd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index b5057ce27aa9..1b8e2e68c600 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1303,7 +1303,7 @@ static u32 ufshcd_pending_cmds(struct ufs_hba *hba)
*
* Return: 0 upon success; -EBUSY upon timeout.
*/
-static int ufshcd_wait_for_doorbell_clr(struct ufs_hba *hba,
+static int ufshcd_wait_for_pending_cmds(struct ufs_hba *hba,
u64 wait_timeout_us)
{
int ret = 0;
@@ -1431,7 +1431,7 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
down_write(&hba->clk_scaling_lock);
if (!hba->clk_scaling.is_allowed ||
- ufshcd_wait_for_doorbell_clr(hba, timeout_us)) {
+ ufshcd_wait_for_pending_cmds(hba, timeout_us)) {
ret = -EBUSY;
up_write(&hba->clk_scaling_lock);
mutex_unlock(&hba->wb_mutex);
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl()
2025-08-11 15:46 ` [PATCH 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl() Bart Van Assche
@ 2025-08-12 9:01 ` Peter Wang (王信友)
0 siblings, 0 replies; 11+ messages in thread
From: Peter Wang (王信友) @ 2025-08-12 9:01 UTC (permalink / raw)
To: bvanassche@acm.org, martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, beanhuo@micron.com, mani@kernel.org,
avri.altman@sandisk.com, James.Bottomley@HansenPartnership.com,
quic_nguyenb@quicinc.com, adrian.hunter@intel.com
On Mon, 2025-08-11 at 08:46 -0700, Bart Van Assche wrote:
>
>
> The UIC completion interrupt may be disabled while an UIC command is
> being processed. When the UIC completion interrupt is reenabled, an
> UIC interrupt is triggered and the WARN_ON_ONCE(!cmd) statement is
> hit.
> Hence this patch that removes this kernel warning.
>
> Fixes: fcd8b0450a9a ("scsi: ufs: core: Make ufshcd_uic_cmd_compl()
> easier to analyze")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/ufs/core/ufshcd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 3e94364f45d5..12af4e0824ce 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -5561,7 +5561,7 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct
> ufs_hba *hba, u32 intr_status)
>
> guard(spinlock_irqsave)(hba->host->host_lock);
> cmd = hba->active_uic_cmd;
> - if (WARN_ON_ONCE(!cmd))
> + if (!cmd)
> goto unlock;
>
> if (ufshcd_is_auto_hibern8_error(hba, intr_status))
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/4] ufs: core: Fix the return value documentation
2025-08-11 15:46 ` [PATCH 3/4] ufs: core: Fix the return value documentation Bart Van Assche
@ 2025-08-12 9:02 ` Peter Wang (王信友)
2025-08-12 15:27 ` Bart Van Assche
0 siblings, 1 reply; 11+ messages in thread
From: Peter Wang (王信友) @ 2025-08-12 9:02 UTC (permalink / raw)
To: bvanassche@acm.org, martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, beanhuo@micron.com, mani@kernel.org,
avri.altman@sandisk.com, James.Bottomley@HansenPartnership.com,
quic_nguyenb@quicinc.com, adrian.hunter@intel.com
On Mon, 2025-08-11 at 08:46 -0700, Bart Van Assche wrote:
> ufshcd_wait_for_dev_cmd() and all its callers can return an OCS
> error.
> OCS errors are represented by positive integers. Remove the
> WARN_ONCE()
> statements that complain about positive error codes and update the
> documentation.
>
> Keep the behavior of ufshcd_wait_for_dev_cmd() because this return
> value
> may end be passed as the second argument of bsg_job_done() and
> bsg_job_done() handles positive and negative error codes differently.
>
> Fixes: cc59f3b68542 ("scsi: ufs: core: Improve return value
> documentation")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/ufs/core/ufshcd.c | 62 ++++++++++++++++++++++++-------------
> --
> 1 file changed, 38 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 12af4e0824ce..b5057ce27aa9 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -3199,7 +3199,8 @@ ufshcd_dev_cmd_completion(struct ufs_hba *hba,
> struct ufshcd_lrb *lrbp)
> }
>
> /*
> - * Return: 0 upon success; < 0 upon failure.
> + * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS
> device
> + * reported an OCS error.
> */
>
Hi Bart,
A return value less than 0 shouldn’t only indicate a timeout,
There should also be cases like -EAGAIN and -EINVAL, right?
Thanks.
Peter
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr()
2025-08-11 15:46 ` [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr() Bart Van Assche
@ 2025-08-12 9:03 ` Peter Wang (王信友)
2025-08-13 16:04 ` Bart Van Assche
0 siblings, 1 reply; 11+ messages in thread
From: Peter Wang (王信友) @ 2025-08-12 9:03 UTC (permalink / raw)
To: bvanassche@acm.org, martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, beanhuo@micron.com, mani@kernel.org,
avri.altman@sandisk.com, James.Bottomley@HansenPartnership.com,
quic_nguyenb@quicinc.com, adrian.hunter@intel.com
On Mon, 2025-08-11 at 08:46 -0700, Bart Van Assche wrote:
> The name ufshcd_wait_for_doorbell_clr() refers to legacy mode. Commit
> 8d077ede48c1 ("scsi: ufs: Optimize the command queueing code") added
> support for MCQ mode in this function. Since then the name of this
> function is misleading. Hence change the name of this function into
> something that is appropriate for both legacy and MCQ mode.
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/ufs/core/ufshcd.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index b5057ce27aa9..1b8e2e68c600 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -1303,7 +1303,7 @@ static u32 ufshcd_pending_cmds(struct ufs_hba
> *hba)
> *
> * Return: 0 upon success; -EBUSY upon timeout.
> */
> -static int ufshcd_wait_for_doorbell_clr(struct ufs_hba *hba,
> +static int ufshcd_wait_for_pending_cmds(struct ufs_hba *hba,
> u64 wait_timeout_us)
> {
> int ret = 0;
> @@ -1431,7 +1431,7 @@ static int ufshcd_clock_scaling_prepare(struct
> ufs_hba *hba, u64 timeout_us)
> down_write(&hba->clk_scaling_lock);
>
> if (!hba->clk_scaling.is_allowed ||
> - ufshcd_wait_for_doorbell_clr(hba, timeout_us)) {
> + ufshcd_wait_for_pending_cmds(hba, timeout_us)) {
> ret = -EBUSY;
> up_write(&hba->clk_scaling_lock);
> mutex_unlock(&hba->wb_mutex);
Hi Bart,
I think this patch is fine, but there is another related question.
According to the UFS specification, switching gears does not
require waiting for IO to complete. Therefore, should we consider
removing this function entirely?
Thanks.
Peter
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/4] ufs: core: Fix the return value documentation
2025-08-12 9:02 ` Peter Wang (王信友)
@ 2025-08-12 15:27 ` Bart Van Assche
0 siblings, 0 replies; 11+ messages in thread
From: Bart Van Assche @ 2025-08-12 15:27 UTC (permalink / raw)
To: Peter Wang (王信友), martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, beanhuo@micron.com, mani@kernel.org,
avri.altman@sandisk.com, James.Bottomley@HansenPartnership.com,
quic_nguyenb@quicinc.com, adrian.hunter@intel.com
On 8/12/25 2:02 AM, Peter Wang (王信友) wrote:
> On Mon, 2025-08-11 at 08:46 -0700, Bart Van Assche wrote:
>> /*
>> - * Return: 0 upon success; < 0 upon failure.
>> + * Return: 0 upon success; < 0 upon timeout; > 0 in case the UFS
>> device
>> + * reported an OCS error.
>> */
>
> A return value less than 0 shouldn’t only indicate a timeout,
> There should also be cases like -EAGAIN and -EINVAL, right?
Agreed. Let me double check which comments need to be refined further.
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr()
2025-08-12 9:03 ` Peter Wang (王信友)
@ 2025-08-13 16:04 ` Bart Van Assche
2025-08-14 8:00 ` Peter Wang (王信友)
0 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2025-08-13 16:04 UTC (permalink / raw)
To: Peter Wang (王信友), martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, beanhuo@micron.com, mani@kernel.org,
avri.altman@sandisk.com, James.Bottomley@HansenPartnership.com,
quic_nguyenb@quicinc.com, adrian.hunter@intel.com
On 8/12/25 2:03 AM, Peter Wang (王信友) wrote:
> According to the UFS specification, switching gears does not
> require waiting for IO to complete. Therefore, should we consider
> removing this function entirely?
I don't think so. Switching gears involves submitting an UIC power mode
change. From the UFSHCI 4.1 standard: "The Adapt is always performed in
conjunction with Power Mode Change. During the Power Mode Change
(and hence for the whole duration of the Adapt sequence) there will be
no data traffic on the link."
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr()
2025-08-13 16:04 ` Bart Van Assche
@ 2025-08-14 8:00 ` Peter Wang (王信友)
0 siblings, 0 replies; 11+ messages in thread
From: Peter Wang (王信友) @ 2025-08-14 8:00 UTC (permalink / raw)
To: bvanassche@acm.org, martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, beanhuo@micron.com, mani@kernel.org,
avri.altman@sandisk.com, James.Bottomley@HansenPartnership.com,
quic_nguyenb@quicinc.com, adrian.hunter@intel.com
On Wed, 2025-08-13 at 09:04 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>
>
> On 8/12/25 2:03 AM, Peter Wang (王信友) wrote:
> > According to the UFS specification, switching gears does not
> > require waiting for IO to complete. Therefore, should we consider
> > removing this function entirely?
>
> I don't think so. Switching gears involves submitting an UIC power
> mode
> change. From the UFSHCI 4.1 standard: "The Adapt is always performed
> in
> conjunction with Power Mode Change. During the Power Mode Change
> (and hence for the whole duration of the Adapt sequence) there will
> be
> no data traffic on the link."
>
> Bart.
Hi Bart,
According to the MIPI UniPro Specification v2.0:
5.3.2.3 PA_DL_PAUSE.ind
This primitive informs the PA Service User, the DL Layer in this case,
that the PA Layer was requested to
execute a operation that requires the usage of the Link, e.g. Power
Mode change or PACP frame transmission.
This primitive is one in the group of primitives defining the handshake
procedure used between PA Layer
and DL Layer to coordinate the Link usage. After generating this
primitive, the PA Layer shall wait for the
reception of the PA_DL_PAUSE.rsp_L before taking any further action.
5.3.2.5 PA_DL_RESUME.ind
This primitive informs the PA Service User, the DL Layer in this case,
that 1125 the PA Layer has completed its
operation and the DL Layer may continue to use the Link.
The semantics of this primitive are:
PA_DL_RESUME.ind( PAResult )
When Generated
This primitive shall be generated by the PA Layer when it has completed
its operation. The parameter is set
to TRUE if the operation completed successfully, otherwise the
parameter is set to FALSE.
For the detailed flow, please refer to Figure 52: Power Mode Change
Using PACP_PWR_req and PACP_PWR_cnf.
In short,
The PA has IO data traffic (power mode change)
-> The DL layer receives PA_DL_PAUSE.ind
-> The DL layer pauses and responds The PA layer sends
PA_DL_PAUSE.rsp_L
-> waits until the PA’s work is completed
-> The PA then notifies the DL layer with PA_DL_Resume.ind
That is, if a power mode change happens during ongoing IO data traffic,
the PA and DL layers will execute pause and resume operations to ensure
that the ongoing IO is properly halted and then continued.
However, I believe this is a separate topic, and I also doubt that
every UFSHCI
host or device can perfectly implement the flow as described in the
specification.
Therefore, it’s better to keep the current implementation for now.
Mediatek will submit a patch to optimize this flow in the future, and
may use a
quirk to retain the original behavior.
Hence,
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Thanks.
Peter
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-08-14 8:00 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-11 15:46 [PATCH 0/4] UFS driver bug fixes Bart Van Assche
2025-08-11 15:46 ` [PATCH 1/4] ufs: core: Fix IRQ lock inversion for the SCSI host lock Bart Van Assche
2025-08-11 15:46 ` [PATCH 2/4] ufs: core: Remove WARN_ON_ONCE() call from ufshcd_uic_cmd_compl() Bart Van Assche
2025-08-12 9:01 ` Peter Wang (王信友)
2025-08-11 15:46 ` [PATCH 3/4] ufs: core: Fix the return value documentation Bart Van Assche
2025-08-12 9:02 ` Peter Wang (王信友)
2025-08-12 15:27 ` Bart Van Assche
2025-08-11 15:46 ` [PATCH 4/4] ufs: core: Rename ufshcd_wait_for_doorbell_clr() Bart Van Assche
2025-08-12 9:03 ` Peter Wang (王信友)
2025-08-13 16:04 ` Bart Van Assche
2025-08-14 8:00 ` Peter Wang (王信友)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).