Re: [PATCH v1 2/2] iommu/arm-smmu-v3: Recover ATC invalidate timeouts

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Nicolin Chen <nicolinc@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: <will@kernel.org>, <joro@8bytes.org>, <bhelgaas@google.com>,
	<jgg@nvidia.com>, <rafael@kernel.org>, <lenb@kernel.org>,
	<praan@google.com>, <kees@kernel.org>, <baolu.lu@linux.intel.com>,
	<smostafa@google.com>, <Alexander.Grest@microsoft.com>,
	<kevin.tian@intel.com>, <miko.lenczewski@arm.com>,
	<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>, <linux-acpi@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, <vsethi@nvidia.com>
Subject: Re: [PATCH v1 2/2] iommu/arm-smmu-v3: Recover ATC invalidate timeouts
Date: Thu, 5 Mar 2026 15:30:19 -0800	[thread overview]
Message-ID: <aaoSC74sKispAUtl@Asurada-Nvidia> (raw)
In-Reply-To: <aanwTQtMMplXY3du@Asurada-Nvidia>

On Thu, Mar 05, 2026 at 01:06:24PM -0800, Nicolin Chen wrote:
> On Thu, Mar 05, 2026 at 03:24:30PM +0000, Robin Murphy wrote:
> > On 2026-03-05 5:21 am, Nicolin Chen wrote:
> > > Though it'd be ideal to block it immediately in the ISR, it cannot be done
> > > because an STE update would require another CFIG_STE command that couldn't
> > > finish in the context of an ISR handling a CMDQ error.
> > 
> > Why not? As soon as we've acked GERRORN.CMDQ_ERR, command consumption will
> > resume and we're free to do whatever we fancy. Admittedly this probably
> > represents more work than we *want* to be doing in the SMMU's IRQ handler
> > (arguably even in a thread, since all the PCI housekeeping isn't really the
> > SMMU driver's own problem), but I would say the workqueue is a definite
> > design choice, not a functional requirement.
> 
> Hmm, you are right, after writel(gerror, ARM_SMMU_GERRORN), it
> would be doable.
> 
> Though SID->pdev conversion requires streams_mutex, which can't
> happen in ISR?

I forgot that (if we don't mind the core-level domain attachment
and the driver-level ATS state being temporarily out of sync with
the physical STE) SID alone was enough for a surgical STE update
in the ISR to unset EATS.

Thanks
Nicolin

> > > +static void arm_smmu_atc_recovery_worker(struct work_struct *work)
> > > +{
> > > +	struct arm_smmu_atc_recovery_param *param =
> > > +		container_of(work, struct arm_smmu_atc_recovery_param, work);
> > > +	struct pci_dev *pdev;
> > > +
> > > +	scoped_guard(mutex, &param->smmu->streams_mutex) {
> > > +		struct arm_smmu_master *master;
> > > +
> > > +		master = arm_smmu_find_master(param->smmu, param->sid);
> > 
> > The only thing SMMU-specific about this seems to be the use of
> > arm_smmu_find_master() to resolve the device, which could just as well be
> > done upon submission anyway - why isn't this a generic IOMMU/IOPF mechanism?
> 
> You mean treating this as a page fault? That's an interesting idea.
> 
> So, the IOMMU-level fence could be done in arm_smmu_page_response().
> 
> > > +static int arm_smmu_sched_atc_recovery(struct arm_smmu_device *smmu, u32 sid)
> > > +{
> > > +	struct arm_smmu_atc_recovery_param *param;
> > > +
> > > +	param = kzalloc_obj(*param, GFP_ATOMIC);
> > > +	if (!param)
> > > +		return -ENOMEM;
> > > +	param->smmu = smmu;
> > > +	param->sid = sid;
> > > +
> > > +	INIT_WORK(&param->work, arm_smmu_atc_recovery_worker);
> > > +	queue_work(system_unbound_wq, &param->work);
> > 
> > Might it make more sense to have a single work item associated with the
> > list_head and use the latter as an actual queue, such that the work runs
> > until the list is empty, then here at submisison time we do the list_add()
> > and schedule_work()? That could perhaps even be a global queue, since ATS
> > timeouts can hardly be expected to be a highly-contended high-perfoamnce
> > concern.
> 
> That sounds like the IOPF implementation. Maybe inventing another
> IOMMU_FAULT_ATC_TIMEOUT to reuse the existing infrastructure would
> make things cleaner.
> 
> > Right now it seems the list is barely doing anything - a "deduplication"
> > mechanism that only works if multiple resets for the same device happen to
> > have their work scheduled concurrently seems pretty ineffective.
> 
> From my testing results, it does effectively block duplications, so
> I think it's somehow meaningful.
> 
> If I wasn't wrong, the duplicated ATC timeout might come from this
> exact reset thread, as it did a CMDQ_OP_CFGI_STE while the CONS was
> still pointing to previous CMDQ_OP_ATC_INV.
> 
> IIUIC, the IOPF queue doesn't expect duplicated PRI events. So, it
> might invoke the handler twice if there is a duplication. The core
> handling list_add() might need a deduplication.
> 
> > > @@ -456,6 +555,27 @@ void __arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu,
> > >   	 * not to touch any of the shadow cmdq state.
> > >   	 */
> > >   	queue_read(cmd, Q_ENT(q, cons), q->ent_dwords);
> > > +
> > > +	if (idx == CMDQ_ERR_CERROR_ATC_INV_IDX) {
> > > +		/*
> > > +		 * Since commands can be issued in batch making it difficult to
> > > +		 * identify which CMDQ_OP_ATC_INV actually timed out, the driver
> > > +		 * must ensure only CMDQ_OP_ATC_INV commands for the same device
> > > +		 * can be batched.
> > 
> > But this *is* "the driver" - arm_smmu_atc_inv_domain() is literally further
> > down this same C file, and does not do what this comment is saying it must
> > do, so how are you expecting this to work correctly?
> 
> Oh, that's a big miss..
> 
> I imaged these changes to be based on the arm_smmu_invs series,
> where it could break the ATC in the batching function. Here, it
> should have made some change to arm_smmu_atc_inv_domain().
> 
> Thanks
> Nicolin

next prev parent reply	other threads:[~2026-03-05 23:30 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-05  5:21 [PATCH v1 0/2] iommu/arm-smmu-v3: Reset PCI device upon ATC invalidate timeout Nicolin Chen
2026-03-05  5:21 ` [PATCH v1 1/2] iommu: Do not call pci_dev_reset_iommu_done() unless reset succeeds Nicolin Chen
2026-03-05  5:21 ` [PATCH v1 2/2] iommu/arm-smmu-v3: Recover ATC invalidate timeouts Nicolin Chen
2026-03-05 15:15   ` kernel test robot
2026-03-05 15:24   ` Robin Murphy
2026-03-05 21:06     ` Nicolin Chen
2026-03-05 23:30       ` Nicolin Chen [this message]
2026-03-05 23:52       ` Jason Gunthorpe
2026-03-06 15:24         ` Robin Murphy
2026-03-06 15:56           ` Jason Gunthorpe
2026-03-10 19:34             ` Pranjal Shrivastava
2026-03-05 15:39   ` Jason Gunthorpe
2026-03-05 21:15     ` Nicolin Chen
2026-03-05 23:41       ` Jason Gunthorpe
2026-03-06  1:29         ` Nicolin Chen
2026-03-06  1:33           ` Jason Gunthorpe
2026-03-06  5:06             ` Nicolin Chen
2026-03-06 13:02               ` Jason Gunthorpe
2026-03-06 19:20                 ` Nicolin Chen
2026-03-06 19:22                   ` Jason Gunthorpe
2026-03-06 19:39                     ` Nicolin Chen
2026-03-06 19:47                       ` Jason Gunthorpe
2026-03-10 19:40                 ` Pranjal Shrivastava
2026-03-10 19:57                   ` Nicolin Chen
2026-03-10 20:04                     ` Pranjal Shrivastava
2026-03-06 13:22         ` Robin Murphy
2026-03-06 14:01           ` Jason Gunthorpe
2026-03-06 20:18             ` Nicolin Chen
2026-03-06 20:22               ` Jason Gunthorpe
2026-03-06 20:34                 ` Nicolin Chen
2026-03-06  3:22     ` Baolu Lu
2026-03-06 13:00       ` Jason Gunthorpe
2026-03-06 19:35         ` Samiullah Khawaja
2026-03-06 19:43           ` Jason Gunthorpe
2026-03-06 19:59             ` Samiullah Khawaja
2026-03-06 20:03               ` Jason Gunthorpe
2026-03-06 20:22                 ` Samiullah Khawaja
2026-03-06 20:26                   ` Jason Gunthorpe
2026-03-10 20:00                     ` Samiullah Khawaja
2026-03-11 12:12                       ` Jason Gunthorpe
2026-03-06  2:35   ` kernel test robot
2026-03-10 19:16   ` Pranjal Shrivastava
2026-03-10 19:51     ` Nicolin Chen
2026-03-10 20:00       ` Pranjal Shrivastava

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaoSC74sKispAUtl@Asurada-Nvidia \
    --to=nicolinc@nvidia.com \
    --cc=Alexander.Grest@microsoft.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kees@kernel.org \
    --cc=kevin.tian@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=miko.lenczewski@arm.com \
    --cc=praan@google.com \
    --cc=rafael@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=vsethi@nvidia.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.