Linux ARM-MSM sub-architecture
 help / color / mirror / Atom feed
From: Neil Armstrong <neil.armstrong@linaro.org>
To: Bart Van Assche <bvanassche@acm.org>,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	linux-arm-msm@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] ufs: delegate the interrupt service routine to a threaded irq handler
Date: Mon, 24 Mar 2025 10:31:33 +0100	[thread overview]
Message-ID: <d084e50e-8b2b-4820-a5e7-25ec440d128e@linaro.org> (raw)
In-Reply-To: <31b46812-72d5-4f9d-b55d-16a6e10afe7d@acm.org>

Hi,

On 21/03/2025 17:20, Bart Van Assche wrote:
> On 3/21/25 9:08 AM, Neil Armstrong wrote:
>> On systems with a large number request slots and unavailable MCQ,
>> the current design of the interrupt handler can delay handling of
>> other subsystems interrupts causing display artifacts, GPU stalls
>> or system firmware requests timeouts.
>>
>> Since the interrupt routine can take quite some time, it's
>> preferable to move it to a threaded handler and leave the
>> hard interrupt handler save the status and disable the irq
>> until processing is finished in the thread.
>>
>> This fixes all encountered issued when running FIO tests
>> on the Qualcomm SM8650 platform.
>>
>> Example of errors reported on a loaded system:
>>   [drm:dpu_encoder_frame_done_timeout:2706] [dpu error]enc32 frame done timeout
>>   msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1: hangcheck detected gpu lockup rb 2!
>>   msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1:     completed fence: 74285
>>   msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1:     submitted fence: 74286
>>   Error sending AMC RPMH requests (-110)
>>
>> Reported bandwidth is not affected on various tests.
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/ufs/core/ufshcd.c | 43 ++++++++++++++++++++++++++++++++++++-------
>>   include/ufs/ufshcd.h      |  2 ++
>>   2 files changed, 38 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
>> index 0534390c2a35d0671156d79a4b1981a257d2fbfa..0fa3cb48ce0e39439afb0f6d334b835d9e496387 100644
>> --- a/drivers/ufs/core/ufshcd.c
>> +++ b/drivers/ufs/core/ufshcd.c
>> @@ -6974,7 +6974,7 @@ static irqreturn_t ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status)
>>   }
>>   /**
>> - * ufshcd_intr - Main interrupt service routine
>> + * ufshcd_intr - Threaded interrupt service routine
>>    * @irq: irq number
>>    * @__hba: pointer to adapter instance
>>    *
>> @@ -6982,7 +6982,7 @@ static irqreturn_t ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status)
>>    *  IRQ_HANDLED - If interrupt is valid
>>    *  IRQ_NONE    - If invalid interrupt
>>    */
>> -static irqreturn_t ufshcd_intr(int irq, void *__hba)
>> +static irqreturn_t ufshcd_threaded_intr(int irq, void *__hba)
>>   {
>>       u32 intr_status, enabled_intr_status = 0;
>>       irqreturn_t retval = IRQ_NONE;
>> @@ -6990,8 +6990,6 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
>>       int retries = hba->nutrs;
>>       intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
>> -    hba->ufs_stats.last_intr_status = intr_status;
>> -    hba->ufs_stats.last_intr_ts = local_clock();
>>       /*
>>        * There could be max of hba->nutrs reqs in flight and in worst case
>> @@ -7000,8 +6998,7 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
>>        * again in a loop until we process all of the reqs before returning.
>>        */
>>       while (intr_status && retries--) {
>> -        enabled_intr_status =
>> -            intr_status & ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
>> +        enabled_intr_status = intr_status & hba->intr_en;
>>           ufshcd_writel(hba, intr_status, REG_INTERRUPT_STATUS);
>>           if (enabled_intr_status)
>>               retval |= ufshcd_sl_intr(hba, enabled_intr_status);
>> @@ -7020,9 +7017,40 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
>>           ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");
>>       }
>> +    ufshcd_writel(hba, hba->intr_en, REG_INTERRUPT_ENABLE);
>> +
>>       return retval;
>>   }
>> +/**
>> + * ufshcd_intr - Main interrupt service routine
>> + * @irq: irq number
>> + * @__hba: pointer to adapter instance
>> + *
>> + * Return:
>> + *  IRQ_WAKE_THREAD - If interrupt is valid
>> + *  IRQ_NONE        - If invalid interrupt
>> + */
>> +static irqreturn_t ufshcd_intr(int irq, void *__hba)
>> +{
>> +    u32 intr_status, enabled_intr_status = 0;
>> +    irqreturn_t retval = IRQ_NONE;
>> +    struct ufs_hba *hba = __hba;
>> +    int retries = hba->nutrs;
>> +
>> +    intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
>> +    hba->ufs_stats.last_intr_status = intr_status;
>> +    hba->ufs_stats.last_intr_ts = local_clock();
>> +
>> +    if (unlikely(!intr_status))
>> +        return IRQ_NONE;
>> +
>> +    hba->intr_en = ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
>> +    ufshcd_writel(hba, 0, REG_INTERRUPT_ENABLE);
>> +
>> +    return IRQ_WAKE_THREAD;
>> +}
>> +
>>   static int ufshcd_clear_tm_cmd(struct ufs_hba *hba, int tag)
>>   {
>>       int err = 0;
>> @@ -10581,7 +10609,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
>>       ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
>>       /* IRQ registration */
>> -    err = devm_request_irq(dev, irq, ufshcd_intr, IRQF_SHARED, UFSHCD, hba);
>> +    err = devm_request_threaded_irq(dev, irq, ufshcd_intr, ufshcd_threaded_intr,
>> +                    IRQF_SHARED, UFSHCD, hba);
>>       if (err) {
>>           dev_err(hba->dev, "request irq failed\n");
>>           goto out_disable;
>> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
>> index e3909cc691b2a854a270279901edacaa5c5120d6..03a7216b89fd63c297479422d1213e497ce85d8e 100644
>> --- a/include/ufs/ufshcd.h
>> +++ b/include/ufs/ufshcd.h
>> @@ -893,6 +893,7 @@ enum ufshcd_mcq_opr {
>>    * @ufshcd_state: UFSHCD state
>>    * @eh_flags: Error handling flags
>>    * @intr_mask: Interrupt Mask Bits
>> + * @intr_en: Saved Interrupt Enable Bits
>>    * @ee_ctrl_mask: Exception event control mask
>>    * @ee_drv_mask: Exception event mask for driver
>>    * @ee_usr_mask: Exception event mask for user (set via debugfs)
>> @@ -1040,6 +1041,7 @@ struct ufs_hba {
>>       enum ufshcd_state ufshcd_state;
>>       u32 eh_flags;
>>       u32 intr_mask;
>> +    u32 intr_en;
>>       u16 ee_ctrl_mask;
>>       u16 ee_drv_mask;
>>       u16 ee_usr_mask;
> 
> I don't like this patch because:
> - It reduces performance (IOPS) for systems on which MCQ is supported
>    and enabled. Please only use threaded interrupts if MCQ is not used.

OK I guess in the MCQ case only some interrupts should be handled
in the primary handler then, because in PREEMPT_RT it will be forced
to be fully threaded, but it's another subject.

> - It introduces race conditions on the REG_INTERRUPT_ENABLE register.
>    There are plenty of ufshcd_(enable|disable)_intr() calls in the UFS
>    driver. Please remove all code that modifies REG_INTERRUPT_ENABLE
>    from this patch.

Ack, I'll switch to use the default irq_default_primary_handler() and
stop touching the REG_INTERRUPT_ENABLE register.

> - Instead of retaining hba->ufs_stats.last_intr_status and
>    hba->ufs_stats.last_intr_ts, please remove both members and also
>    the debug code that reports the values of these member variables.
>    Please also remove hba->intr_en.

Hmm ok so no need for the IRQ debug code anymore ? I guess this should
be in a separate cleanup patch.

Thanks,
Neil

> 
> Thanks,
> 
> Bart.


  reply	other threads:[~2025-03-24  9:31 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-21 16:08 [PATCH RFC] ufs: delegate the interrupt service routine to a threaded irq handler Neil Armstrong
2025-03-21 16:20 ` Bart Van Assche
2025-03-24  9:31   ` Neil Armstrong [this message]
2025-03-24 11:02     ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d084e50e-8b2b-4820-a5e7-25ec440d128e@linaro.org \
    --to=neil.armstrong@linaro.org \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=alim.akhtar@samsung.com \
    --cc=avri.altman@wdc.com \
    --cc=bvanassche@acm.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=manivannan.sadhasivam@linaro.org \
    --cc=martin.petersen@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox