public inbox for linux-arm-kernel@lists.infradead.org
From: Robin Murphy <robin.murphy@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Nicolin Chen <nicolinc@nvidia.com>
Cc: will@kernel.org, joro@8bytes.org, bhelgaas@google.com,
	rafael@kernel.org, lenb@kernel.org, praan@google.com,
	kees@kernel.org, baolu.lu@linux.intel.com, smostafa@google.com,
	Alexander.Grest@microsoft.com, kevin.tian@intel.com,
	miko.lenczewski@arm.com, linux-arm-kernel@lists.infradead.org,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org,
	vsethi@nvidia.com
Subject: Re: [PATCH v1 2/2] iommu/arm-smmu-v3: Recover ATC invalidate timeouts
Date: Fri, 6 Mar 2026 15:24:20 +0000
Message-ID: <03461707-783e-403a-86fa-ae7a5107fa30@arm.com>
In-Reply-To: <20260305235252.GC1651202@nvidia.com>

On 2026-03-05 11:52 pm, Jason Gunthorpe wrote:
> On Thu, Mar 05, 2026 at 01:06:21PM -0800, Nicolin Chen wrote:
>> That sounds like the IOPF implementation. Maybe inventing another
>> IOMMU_FAULT_ATC_TIMEOUT to reuse the existing infrastructure would
>> make things cleaner.
> 
> I think the routing is quite different, IOPF wants to route an event
> to the domain creator, here you want to route an event to the IOMMU core
> then the PCIe RAS callbacks.
> 
> IDK if there is much to be reused there, especially since IOPF
> requires a memory allocation and ideally we should not be allocating
> memory to resolve this critical error condition.

Yeah, sorry, for a moment there I somehow forgot that we can expect to 
use ATS without PRI, so indeed tying this to IOPF wouldn't be 
appropriate. And given the general difficulty of trying to infer what 
went wrong and what to do from the CMDQ contents alone, I do like your 
idea of trying to return a new kind of sync failure back to 
arm_smmu_atc_inv_{master,domain}() so that we can take any defensive 
action from there, with all the information to hand. We'd just have to 
ensure that if a large set of ATCI commands needs to span multiple 
batches, every batch contains its own sync (since if some other 
batch of unrelated commands could get interleaved in the middle and 
issue a sync that then fails due to someone else's ATC timeout, 
everything's likely to get confused and go wrong).
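
To make the batching constraint concrete, here is a rough userspace sketch, not driver code. Every name in it (toy_batch, toy_submit_with_sync(), toy_atc_inv(), TOY_BATCH_MAX) is hypothetical, and the "timeout" is simulated by a single bad StreamID. The only point it illustrates is that each batch is closed by its own sync, so a failure is reported back to the caller that issued those particular ATCI commands:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BATCH_MAX 4	/* hypothetical number of ATCI commands per batch */

enum toy_sync_result { TOY_SYNC_OK = 0, TOY_SYNC_ATC_TIMEOUT = -1 };

struct toy_batch {
	int sids[TOY_BATCH_MAX];	/* StreamIDs targeted by the queued ATCIs */
	size_t num;
};

/*
 * Stand-in for "push the batch followed by its own CMD_SYNC and wait".
 * The sync fails if the batch targets bad_sid, which models a device
 * that never answers its ATC invalidation.
 */
static enum toy_sync_result
toy_submit_with_sync(const struct toy_batch *b, int bad_sid)
{
	for (size_t i = 0; i < b->num; i++)
		if (b->sids[i] == bad_sid)
			return TOY_SYNC_ATC_TIMEOUT;
	return TOY_SYNC_OK;
}

/*
 * Invalidate n StreamIDs, splitting them into batches that each end in
 * their own sync. Returns the number of batches submitted, and stores the
 * index of the first batch whose sync failed (or -1) in *failed_batch, so
 * the caller can take defensive action with all the information to hand.
 */
static int toy_atc_inv(const int *sids, size_t n, int bad_sid,
		       int *failed_batch)
{
	struct toy_batch b = { .num = 0 };
	int batches = 0;

	*failed_batch = -1;
	for (size_t i = 0; i < n; i++) {
		assert(b.num < TOY_BATCH_MAX);
		b.sids[b.num++] = sids[i];

		/* Flush when the batch is full or the input is exhausted */
		if (b.num == TOY_BATCH_MAX || i == n - 1) {
			if (toy_submit_with_sync(&b, bad_sid) != TOY_SYNC_OK &&
			    *failed_batch < 0)
				*failed_batch = batches;
			batches++;
			b.num = 0;
		}
	}
	return batches;
}
```

With nine StreamIDs 1..9 and StreamID 6 misbehaving, this submits three batches and reports the second one (index 1) as failed. Whether the real code should stop at the first failure or carry on is left open here.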

The fiddly thing then is that we might also have to be prepared to 
"handle" CMD_SYNC timeout by manually checking for GERRORs, in case the 
whole invalidation is in the context of a dma_unmap within some other 
device's IRQ handler, which happens to be on the same CPU where the 
GERROR IRQ is now pending, but can't be taken until we can complete the 
inv and return out of the current IRQ :/
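
As a rough illustration of that fallback, the sketch below models the registers as plain struct fields in userspace. The names (toy_smmu, toy_poll_gerror(), toy_wait_sync()) are hypothetical. The layout assumed here, an error being active while a GERROR bit differs from the matching GERRORN bit, CMDQ_ERR in bit 0, and error code 3 for an ATC invalidation timeout, reflects a reading of the SMMUv3 architecture and should be checked against the spec. Real recovery would also have to coordinate with the actual GERROR IRQ handler, which this ignores:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_GERROR_CMDQ_ERR	(1u << 0)	/* assumed bit position */
#define TOY_CERROR_ATC_INV	3u		/* assumed CMDQ error code */

struct toy_smmu {
	uint32_t gerror;	/* bits toggled by "hardware" to raise errors */
	uint32_t gerrorn;	/* bits toggled by software to acknowledge */
	uint32_t cmdq_err;	/* error code latched for the command queue */
	bool sync_done;		/* stand-in for "CMD_SYNC has completed" */
};

enum toy_sync_status {
	TOY_SYNC_DONE,
	TOY_SYNC_ATC_TIMEOUT,
	TOY_SYNC_TIMEOUT,
};

/*
 * Check for a pending command queue error by hand instead of waiting for
 * the GERROR IRQ. Returns true if it was an ATC invalidation timeout.
 */
static bool toy_poll_gerror(struct toy_smmu *smmu)
{
	uint32_t active = smmu->gerror ^ smmu->gerrorn;
	bool atc;

	if (!(active & TOY_GERROR_CMDQ_ERR))
		return false;

	atc = smmu->cmdq_err == TOY_CERROR_ATC_INV;
	/* Acknowledge by making GERRORN match GERROR again */
	smmu->gerrorn ^= TOY_GERROR_CMDQ_ERR;
	return atc;
}

static enum toy_sync_status
toy_wait_sync(struct toy_smmu *smmu, unsigned int max_polls)
{
	/* A real driver would re-read queue state under a proper timeout */
	for (unsigned int i = 0; i < max_polls; i++)
		if (smmu->sync_done)
			return TOY_SYNC_DONE;

	/*
	 * Timed out. The GERROR IRQ may be pending on this very CPU but
	 * unable to run because we are inside another IRQ handler, so look
	 * at the error state directly before giving up.
	 */
	if (toy_poll_gerror(smmu))
		return TOY_SYNC_ATC_TIMEOUT;
	return TOY_SYNC_TIMEOUT;
}
```

The distinction between TOY_SYNC_ATC_TIMEOUT and a plain TOY_SYNC_TIMEOUT is what would let the caller tell "this device's ATC is stuck" apart from "something else is wrong with the queue".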

Thanks,
Robin.
