From: Will Deacon <will@kernel.org>
To: Connor Abbott <cwabbott0@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>,
Robin Murphy <robin.murphy@arm.com>,
Joerg Roedel <joro@8bytes.org>, Sean Paul <sean@poorly.run>,
Konrad Dybcio <konradybcio@kernel.org>,
Abhinav Kumar <quic_abhinavk@quicinc.com>,
Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>,
Marijn Suijten <marijn.suijten@somainline.org>,
iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
freedreno@lists.freedesktop.org
Subject: Re: [PATCH v5 3/5] iommu/arm-smmu: Fix spurious interrupts with stall-on-fault
Date: Tue, 6 May 2025 15:53:25 +0100 [thread overview]
Message-ID: <20250506145324.GA1246@willie-the-truck> (raw)
In-Reply-To: <CACu1E7FA0M_0Un3qPRNtqy4R_NbaMks6FSkpQZBuyqJpuT-p7w@mail.gmail.com>
On Tue, May 06, 2025 at 10:08:05AM -0400, Connor Abbott wrote:
> On Tue, May 6, 2025 at 8:24 AM Will Deacon <will@kernel.org> wrote:
> > On Wed, Mar 19, 2025 at 10:44:02AM -0400, Connor Abbott wrote:
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > index c7b5d7c093e71050d29a834c8d33125e96b04d81..9927f3431a2eab913750e6079edc6393d1938c98 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > @@ -470,13 +470,52 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
> > > if (!(cfi->fsr & ARM_SMMU_CB_FSR_FAULT))
> > > return IRQ_NONE;
> > >
> > > + /*
> > > + * On some implementations FSR.SS asserts a context fault
> > > + * interrupt. We do not want this behavior, because resolving the
> > > + * original context fault typically requires operations that cannot be
> > > + * performed in IRQ context but leaving the stall unacknowledged will
> > > + * immediately lead to another spurious interrupt as FSR.SS is still
> > > + * set. Work around this by disabling interrupts for this context bank.
> > > + * It's expected that interrupts are re-enabled after resuming the
> > > + * translation.
> >
> > s/translation/transaction/
> >
> > > + *
> > > + * We have to do this before report_iommu_fault() so that we don't
> > > + * leave interrupts disabled in case the downstream user decides the
> > > + * fault can be resolved inside its fault handler.
> > > + *
> > > + * There is a possible race if there are multiple context banks sharing
> > > + * the same interrupt and both signal an interrupt in between writing
> > > + * RESUME and SCTLR. We could disable interrupts here before we
> > > + * re-enable them in the resume handler, leaving interrupts enabled.
> > > + * Lock the write to serialize it with the resume handler.
> > > + */
> >
> > I'm struggling to understand this last part. If the resume handler runs
> > synchronously from report_iommu_fault(), then there's no need for
> > locking because we're in interrupt context. If the resume handler can
> > run asynchronously from report_iommu_fault(), then the locking doesn't
> > help because the code below could clear CFIE right after the resume
> > handler has set it.
>
> The problem is indeed when the resume handler runs asynchronously.
> Clearing CFIE right after the resume handler has set it is normal and
> expected. The issue is the opposite, i.e. something like:
>
> - Resume handler writes RESUME and stalls for some reason
> - The interrupt handler runs through and clears CFIE while it's already cleared
> - Resume handler sets CFIE, assuming that the handler hasn't run yet
> but it actually has
>
> This wouldn't happen with only one context bank, because we wouldn't
> get an interrupt until the resume handler sets CFIE, but with multiple
> context banks and a shared interrupt line we could get a "spurious"
> interrupt due to a fault in an earlier context bank that becomes not
> spurious if the resume handler writes RESUME before the context fault
> handler for this bank reads FSR above.
Ah, gotcha. Thanks for the explanation.
If we moved the RESUME+CFIE into the interrupt handler after the call
to report_iommu_fault(), would it be possible to run the handler as a
threaded irq (see 'context_fault_needs_threaded_irq') and handle the
callback synchronously? In that case, I think we could avoid taking the
lock if we wrote CFIE _before_ RESUME.
Will
next prev parent reply other threads:[~2025-05-06 18:48 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-19 14:43 [PATCH v5 0/5] iommu/arm-smmu, drm/msm: Fixes for stall-on-fault Connor Abbott
2025-03-19 14:44 ` [PATCH v5 1/5] iommu/arm-smmu: Save additional information on context fault Connor Abbott
2025-05-06 11:32 ` Will Deacon
2025-05-06 16:26 ` Connor Abbott
2025-03-19 14:44 ` [PATCH v5 2/5] iommu/arm-smmu-qcom: Don't read fault registers directly Connor Abbott
2025-03-19 14:44 ` [PATCH v5 3/5] iommu/arm-smmu: Fix spurious interrupts with stall-on-fault Connor Abbott
2025-05-06 12:24 ` Will Deacon
2025-05-06 14:08 ` Connor Abbott
2025-05-06 14:53 ` Will Deacon [this message]
2025-05-06 15:18 ` Connor Abbott
2025-05-15 14:46 ` Will Deacon
2025-05-15 17:13 ` Connor Abbott
2025-03-19 14:44 ` [PATCH v5 4/5] iommu/arm-smmu-qcom: Make set_stall work when the device is on Connor Abbott
2025-03-19 14:44 ` [PATCH v5 5/5] drm/msm: Temporarily disable stall-on-fault after a page fault Connor Abbott
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250506145324.GA1246@willie-the-truck \
--to=will@kernel.org \
--cc=cwabbott0@gmail.com \
--cc=dmitry.baryshkov@oss.qualcomm.com \
--cc=freedreno@lists.freedesktop.org \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=konradybcio@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=marijn.suijten@somainline.org \
--cc=quic_abhinavk@quicinc.com \
--cc=robdclark@gmail.com \
--cc=robin.murphy@arm.com \
--cc=sean@poorly.run \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox