From: Jason Gunthorpe <jgg@nvidia.com>
To: Will Deacon <will@kernel.org>
Cc: Nicolin Chen <nicolinc@nvidia.com>,
jean-philippe@linaro.org, robin.murphy@arm.com, joro@8bytes.org,
balbirs@nvidia.com, miko.lenczewski@arm.com,
peterz@infradead.org, kevin.tian@intel.com, praan@google.com,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()
Date: Mon, 26 Jan 2026 11:20:19 -0400 [thread overview]
Message-ID: <20260126152019.GK1134360@nvidia.com> (raw)
In-Reply-To: <aXdlnLLFUBwjT0V5@willie-the-truck>
On Mon, Jan 26, 2026 at 01:01:16PM +0000, Will Deacon wrote:
> On Fri, Jan 23, 2026 at 04:03:27PM -0400, Jason Gunthorpe wrote:
> > On Fri, Jan 23, 2026 at 05:10:52PM +0000, Will Deacon wrote:
> > > On Fri, Jan 23, 2026 at 05:05:31PM +0000, Will Deacon wrote:
> > > > On Fri, Dec 19, 2025 at 12:11:28PM -0800, Nicolin Chen wrote:
> > > > > + /*
> > > > > + * We are committed to updating the STE. Ensure the invalidation array
> > > > > + * is visible to concurrent map/unmap threads, and acquire any racing
> > > > > + * IOPTE updates.
> > > > > + *
> > > > > + * [CPU0] | [CPU1]
> > > > > + * |
> > > > > + * change IOPTEs and TLB flush: |
> > > > > + * arm_smmu_domain_inv_range() { | arm_smmu_install_old_domain_invs {
> > > > > + * ... | rcu_assign_pointer(new_invs);
> > > > > + * smp_mb(); // ensure IOPTEs | smp_mb(); // ensure new_invs
> > > > > + * ... | kfree_rcu(old_invs, rcu);
> > > > > + * // load invalidation array | }
> > > > > + * invs = rcu_dereference(); | arm_smmu_install_ste_for_dev {
> > > > > + * | STE = TTB0 // read new IOPTEs
> > > > > + */
> > > > > + smp_mb();
> > > >
> > > > I don't think we need to duplicate this comment three times, you can just
> > > > refer to the first function (e.g. "See ordering comment in
> > > > arm_smmu_domain_inv_range()").
> > > >
> > > > However, isn't the comment above misleading for this case?
> > > > arm_smmu_install_old_domain_invs() has the sequencing the other way
> > > > around on CPU 1: we should update the STE first.
> > >
> > > I also think we probably want a dma_mb() instead of an smp_mb() for all
> > > of these examples? It won't make any practical difference but I think it
> > > helps readability given that one of the readers is the PTW.
> >
> > The only actual dma_wmb() is inside arm_smmu_install_ste_for_dev()
> > after updating the STE. Adding that line explicitly would help as that
> > is the only point where we must have the writes actually visible to
> > the DMA HW.
> >
> > The ones written here as smp_mb() are not required to be DMA ones and
> > could all be NOP's on UP..
>
> Hmm, I'm not sure about that.
>
> If we've written a new (i.e. previously invalid) valid PTE to a
> page-table and then we install that page-table into an STE hitlessly
> (let's say we write the S2TTB field) then isn't there a window before we
> do the STE invalidation where the page-table might be accessible to the
> SMMU but the new PTE is still sitting in the CPU?
Hmm! Yes seems like it.
However, that's seems like a general bug, if we allocate an
iommu_domain and immediately hitlessly install it, then there would be
no dma_wmb() for the page table memory prior to the earliest point the
HW is able to read the STE.
What I wrote is is how things are intended to work, so lets fix it
with this?
@@ -1173,6 +1173,13 @@ void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer, __le64 *entry,
__le64 unused_update[NUM_ENTRY_QWORDS];
u8 used_qword_diff;
+ /*
+ * Many of the entry structures have pointers to other structures that
+ * need to have their updates be visible before any writes of the entry
+ * happen.
+ */
+ dma_wmb();
+
used_qword_diff =
arm_smmu_entry_qword_diff(writer, entry, target, unused_update);
if (hweight8(used_qword_diff) == 1) {
Jason
next prev parent reply other threads:[~2026-01-26 15:20 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-19 20:11 [PATCH v9 0/7] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
2025-12-19 20:11 ` [PATCH v9 1/7] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
2026-01-23 9:49 ` Pranjal Shrivastava
2025-12-19 20:11 ` [PATCH v9 2/7] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
2026-01-23 9:50 ` Pranjal Shrivastava
2025-12-19 20:11 ` [PATCH v9 3/7] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
2026-01-23 9:53 ` Pranjal Shrivastava
2026-01-23 17:03 ` Will Deacon
2026-01-23 17:35 ` Nicolin Chen
2026-01-23 17:51 ` Will Deacon
2026-01-23 17:56 ` Nicolin Chen
2026-01-23 19:16 ` Jason Gunthorpe
2026-01-23 19:18 ` Nicolin Chen
2026-01-26 14:54 ` Will Deacon
2026-01-26 15:21 ` Jason Gunthorpe
2025-12-19 20:11 ` [PATCH v9 4/7] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
2026-01-23 9:54 ` Pranjal Shrivastava
2025-12-19 20:11 ` [PATCH v9 5/7] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
2025-12-19 20:11 ` [PATCH v9 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
2026-01-23 9:48 ` Pranjal Shrivastava
2026-01-23 13:56 ` Jason Gunthorpe
2026-01-27 16:38 ` Nicolin Chen
2026-01-27 17:08 ` Jason Gunthorpe
2026-01-27 18:07 ` Nicolin Chen
2026-01-27 18:23 ` Jason Gunthorpe
2026-01-27 18:37 ` Nicolin Chen
2026-01-27 19:19 ` Jason Gunthorpe
2026-01-27 20:14 ` Nicolin Chen
2026-01-28 0:05 ` Jason Gunthorpe
2026-01-23 17:05 ` Will Deacon
2026-01-23 17:10 ` Will Deacon
2026-01-23 17:43 ` Nicolin Chen
2026-01-23 20:03 ` Jason Gunthorpe
2026-01-26 13:01 ` Will Deacon
2026-01-26 15:20 ` Jason Gunthorpe [this message]
2026-01-26 16:02 ` Will Deacon
2026-01-26 16:09 ` Jason Gunthorpe
2026-01-26 18:56 ` Will Deacon
2026-01-27 3:14 ` Nicolin Chen
2026-01-26 17:50 ` Nicolin Chen
2025-12-19 20:11 ` [PATCH v9 7/7] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs Nicolin Chen
2026-01-23 17:07 ` Will Deacon
2026-01-23 17:47 ` Nicolin Chen
2026-01-23 19:59 ` Jason Gunthorpe
2026-01-19 17:10 ` [PATCH v9 0/7] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260126152019.GK1134360@nvidia.com \
--to=jgg@nvidia.com \
--cc=balbirs@nvidia.com \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=miko.lenczewski@arm.com \
--cc=nicolinc@nvidia.com \
--cc=peterz@infradead.org \
--cc=praan@google.com \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox