* [PATCH v2 0/2] ufs-qcom: Reduce interrupt latency
@ 2026-04-02 17:14 Bart Van Assche
2026-04-02 17:14 ` [PATCH v2 1/2] ufs: core: Introduce ufshcd_mcq_poll_n_cqe_lock() Bart Van Assche
2026-04-02 17:14 ` [PATCH v2 2/2] ufs: qcom: Reduce interrupt latency Bart Van Assche
0 siblings, 2 replies; 3+ messages in thread
From: Bart Van Assche @ 2026-04-02 17:14 UTC (permalink / raw)
To: Martin K. Petersen; +Cc: linux-scsi, Bart Van Assche
Hi Martin,
On Android systems it is important to keep the time spent in interrupt handlers
short: this keeps the user interface responsive and prevents audio stuttering.
Hence this patch series, which reduces the time spent in the UFS interrupt
handler. Please consider this patch series for the next merge window after test
results have been shared by Qualcomm.
Thanks,
Bart.
Changes compared to v1:
- Dropped the ufshcd_mcq_compl_all_cqes_lock() changes.
- Renamed ufshcd_mcq_poll_cqe_lock_n() to ufshcd_mcq_poll_n_cqe_lock().
Bart Van Assche (2):
ufs: core: Introduce ufshcd_mcq_poll_n_cqe_lock()
ufs: qcom: Reduce interrupt latency
drivers/ufs/core/ufs-mcq.c | 14 +++++++++++---
drivers/ufs/host/ufs-qcom.c | 33 +++++++++++++++++++++++++++++----
include/ufs/ufshcd.h | 3 +++
3 files changed, 43 insertions(+), 7 deletions(-)
* [PATCH v2 1/2] ufs: core: Introduce ufshcd_mcq_poll_n_cqe_lock()
From: Bart Van Assche @ 2026-04-02 17:14 UTC (permalink / raw)
To: Martin K. Petersen
Cc: linux-scsi, Bart Van Assche, Peter Wang, James E.J. Bottomley,
Matthias Brugger, AngeloGioacchino Del Regno, vamshi gajjela,
Alok Tiwari, Chenyuan Yang, ping.gao
Introduce a new function for processing completions that accepts an
upper limit for the number of completions to poll. Tell
ufshcd_mcq_poll_cqe_lock() to poll at most hwq->max_entries. This is
sufficient to poll all pending completions since there are never more
than hwq->max_entries - 1 completions on a completion queue. This patch
prepares for reducing the interrupt latency.
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufs-mcq.c | 14 +++++++++++---
include/ufs/ufshcd.h | 3 +++
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 1b3062577945..e6081b97ed74 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -340,15 +340,16 @@ void ufshcd_mcq_compl_all_cqes_lock(struct ufs_hba *hba,
spin_unlock_irqrestore(&hwq->cq_lock, flags);
}
-unsigned long ufshcd_mcq_poll_cqe_lock(struct ufs_hba *hba,
- struct ufs_hw_queue *hwq)
+unsigned long ufshcd_mcq_poll_n_cqe_lock(struct ufs_hba *hba,
+ struct ufs_hw_queue *hwq,
+ unsigned int max_compl)
{
unsigned long completed_reqs = 0;
unsigned long flags;
spin_lock_irqsave(&hwq->cq_lock, flags);
ufshcd_mcq_update_cq_tail_slot(hwq);
- while (!ufshcd_mcq_is_cq_empty(hwq)) {
+ while (!ufshcd_mcq_is_cq_empty(hwq) && completed_reqs < max_compl) {
ufshcd_mcq_process_cqe(hba, hwq);
ufshcd_mcq_inc_cq_head_slot(hwq);
completed_reqs++;
@@ -360,6 +361,13 @@ unsigned long ufshcd_mcq_poll_cqe_lock(struct ufs_hba *hba,
return completed_reqs;
}
+EXPORT_SYMBOL_GPL(ufshcd_mcq_poll_n_cqe_lock);
+
+unsigned long ufshcd_mcq_poll_cqe_lock(struct ufs_hba *hba,
+ struct ufs_hw_queue *hwq)
+{
+ return ufshcd_mcq_poll_n_cqe_lock(hba, hwq, hwq->max_entries);
+}
EXPORT_SYMBOL_GPL(ufshcd_mcq_poll_cqe_lock);
void ufshcd_mcq_make_queues_operational(struct ufs_hba *hba)
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index cfbc75d8df83..aa5f9b31ea86 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -1475,6 +1475,9 @@ void ufshcd_mcq_config_mac(struct ufs_hba *hba, u32 max_active_cmds);
unsigned int ufshcd_mcq_queue_cfg_addr(struct ufs_hba *hba);
u32 ufshcd_mcq_read_cqis(struct ufs_hba *hba, int i);
void ufshcd_mcq_write_cqis(struct ufs_hba *hba, u32 val, int i);
+unsigned long ufshcd_mcq_poll_n_cqe_lock(struct ufs_hba *hba,
+ struct ufs_hw_queue *hwq,
+ unsigned int max_compl);
unsigned long ufshcd_mcq_poll_cqe_lock(struct ufs_hba *hba,
struct ufs_hw_queue *hwq);
void ufshcd_mcq_make_queues_operational(struct ufs_hba *hba);
* [PATCH v2 2/2] ufs: qcom: Reduce interrupt latency
From: Bart Van Assche @ 2026-04-02 17:14 UTC (permalink / raw)
To: Martin K. Petersen
Cc: linux-scsi, Bart Van Assche, Manivannan Sadhasivam,
James E.J. Bottomley
Defer completion processing to thread context on slower CPU cores to
prevent interrupt latency spikes. On the fastest CPU cores, keep
processing all completions in interrupt context.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/host/ufs-qcom.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/drivers/ufs/host/ufs-qcom.c b/drivers/ufs/host/ufs-qcom.c
index 5a58ffef3d27..4aabbd14631e 100644
--- a/drivers/ufs/host/ufs-qcom.c
+++ b/drivers/ufs/host/ufs-qcom.c
@@ -2370,6 +2370,16 @@ struct ufs_qcom_irq {
struct ufs_hba *hba;
};
+static irqreturn_t ufs_qcom_mcq_threaded_esi_handler(int irq, void *data)
+{
+ struct ufs_qcom_irq *qi = data;
+ struct ufs_hba *hba = qi->hba;
+
+ ufshcd_mcq_poll_cqe_lock(hba, &hba->uhq[qi->idx]);
+
+ return IRQ_HANDLED;
+}
+
static irqreturn_t ufs_qcom_mcq_esi_handler(int irq, void *data)
{
struct ufs_qcom_irq *qi = data;
@@ -2377,9 +2387,22 @@ static irqreturn_t ufs_qcom_mcq_esi_handler(int irq, void *data)
struct ufs_hw_queue *hwq = &hba->uhq[qi->idx];
ufshcd_mcq_write_cqis(hba, 0x1, qi->idx);
- ufshcd_mcq_poll_cqe_lock(hba, hwq);
- return IRQ_HANDLED;
+ if (arch_scale_cpu_capacity(raw_smp_processor_id()) ==
+ SCHED_CAPACITY_SCALE) {
+ ufshcd_mcq_poll_cqe_lock(hba, hwq);
+ return IRQ_HANDLED;
+ }
+
+ if (ufshcd_mcq_poll_n_cqe_lock(hba, hwq, 4) < 4)
+ return IRQ_HANDLED;
+
+ /*
+ * Defer further completion processing to thread context because
+ * processing a large number of completions in interrupt context on
+ * slower CPU cores can result in unacceptably high interrupt latencies.
+ */
+ return IRQ_WAKE_THREAD;
}
static int ufs_qcom_config_esi(struct ufs_hba *hba)
@@ -2415,8 +2438,10 @@ static int ufs_qcom_config_esi(struct ufs_hba *hba)
qi[idx].idx = idx;
qi[idx].hba = hba;
- ret = devm_request_irq(hba->dev, qi[idx].irq, ufs_qcom_mcq_esi_handler,
- IRQF_SHARED, "qcom-mcq-esi", qi + idx);
+ ret = devm_request_threaded_irq(hba->dev, qi[idx].irq,
+ ufs_qcom_mcq_esi_handler,
+ ufs_qcom_mcq_threaded_esi_handler,
+ IRQF_SHARED | IRQF_ONESHOT, "qcom-mcq-esi", qi + idx);
if (ret) {
dev_err(hba->dev, "%s: Failed to request IRQ for %d, err = %d\n",
__func__, qi[idx].irq, ret);