From: Baolu Lu <baolu.lu@linux.intel.com>
To: Nicolin Chen <nicolinc@nvidia.com>,
	robin.murphy@arm.com, joro@8bytes.org, bhelgaas@google.com,
	jgg@nvidia.com
Cc: will@kernel.org, robin.clark@oss.qualcomm.com,
	yong.wu@mediatek.com, matthias.bgg@gmail.com,
	angelogioacchino.delregno@collabora.com,
	thierry.reding@gmail.com, vdumpa@nvidia.com,
	jonathanh@nvidia.com, rafael@kernel.org, lenb@kernel.org,
	kevin.tian@intel.com, yi.l.liu@intel.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-mediatek@lists.infradead.org, linux-tegra@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org,
	patches@lists.linux.dev, pjaroszynski@nvidia.com,
	vsethi@nvidia.com, helgaas@kernel.org, etzhao1900@gmail.com
Subject: Re: [PATCH v3 4/5] iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done()
Date: Fri, 15 Aug 2025 13:49:55 +0800
Message-ID: <7b8d8bfa-ca6b-4a07-8a4d-a30d8993c7c7@linux.intel.com>
In-Reply-To: <5ba556fc54777853c499186f494f3411d7a4a5a9.1754952762.git.nicolinc@nvidia.com>

On 8/12/25 06:59, Nicolin Chen wrote:
> PCIe permits a device to ignore ATS invalidation TLPs, while processing a
> reset. This creates a problem visible to the OS where an ATS invalidation
> command will time out: e.g. an SVA domain will have no coordination with a
> reset event and can racily issue ATS invalidations to a resetting device.
> 
> The OS should do something to mitigate this as we do not want production
> systems to be reporting critical ATS failures, especially in a hypervisor
> environment. Broadly, OS could arrange to ignore the timeouts, block page
> table mutations to prevent invalidations, or disable and block ATS.
> 
> The PCIe spec in sec 10.3.1 IMPLEMENTATION NOTE recommends to disable and
> block ATS before initiating a Function Level Reset. It also mentions that
> other reset methods could have the same vulnerability as well.
> 
> Provide a callback from the PCI subsystem that will enclose the reset and
> have the iommu core temporarily change all the attached domain to BLOCKED.
> After attaching a BLOCKED domain, IOMMU drivers should fence any incoming

Nit, my understanding is that it's not the "IOMMU drivers" but the
"IOMMU hardware" that fences any further incoming translation requests,
right?

> ATS queries, synchronously stop issuing new ATS invalidations, and wait
> for all ATS invalidations to complete. This can avoid any ATS invalidation
> timeouts.
> 
> However, if there is a domain attachment/replacement happening during an
> ongoing reset, ATS routines may be re-activated between the two function
> calls. So, introduce a new pending_reset flag in group_device to defer an
> attachment during a reset, allowing iommu core to cache target domains in
> the SW level while bypassing the driver. The iommu_dev_reset_done() will
> re-attach these soft-attached domains, once the device reset is finished.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

The code looks good to me:

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
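
For readers following the thread, the deferral described in the quoted
commit message can be modeled in plain userspace C. This is only a
sketch under stated assumptions, not the patch itself: every name that
starts with model_ or DOMAIN_ is hypothetical, there is no locking and
no group handling, and the "driver" is reduced to a counter. Only the
pending_reset flag and the prepare/done pairing come from the commit
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum model_domain { DOMAIN_BLOCKED, DOMAIN_DMA, DOMAIN_SVA };

struct model_gdev {
	enum model_domain hw_domain;     /* what the "driver" last programmed */
	enum model_domain target_domain; /* what the core considers attached */
	bool pending_reset;              /* set between prepare and done */
	int driver_attach_calls;         /* how often the driver was invoked */
};

/* Models the driver-level attach that actually touches the hardware. */
static void model_driver_attach(struct model_gdev *g, enum model_domain d)
{
	g->hw_domain = d;
	g->driver_attach_calls++;
}

/*
 * Models a core-level attach. During a reset the target is only cached
 * in software and the driver is bypassed, so ATS stays disabled.
 */
static void model_attach(struct model_gdev *g, enum model_domain d)
{
	g->target_domain = d;
	if (g->pending_reset)
		return;
	model_driver_attach(g, d);
}

/* Models the prepare step: mark the reset and move to BLOCKED. */
static void model_reset_prepare(struct model_gdev *g)
{
	g->pending_reset = true;
	model_driver_attach(g, DOMAIN_BLOCKED);
}

/* Models the done step: clear the flag and re-attach the cached target. */
static void model_reset_done(struct model_gdev *g)
{
	g->pending_reset = false;
	model_driver_attach(g, g->target_domain);
}
```

In this model, an attach issued between model_reset_prepare() and
model_reset_done() leaves hw_domain at DOMAIN_BLOCKED and only updates
target_domain; the done step then programs that cached target once.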
