public inbox for linux-arm-kernel@lists.infradead.org
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Will Deacon <will@kernel.org>
Cc: Evangelos Petrongonas <epetron@amazon.de>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Pranjal Shrivastava <praan@google.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, nh-open-source@amazon.com,
	Zeev Zilberman <zeev@amazon.com>
Subject: Re: [PATCH] iommu/arm-smmu-v3: Allow disabling Stage 1 translation
Date: Fri, 24 Apr 2026 13:39:21 -0300
Message-ID: <20260424163921.GG3611611@ziepe.ca>
In-Reply-To: <aeuT1-TB6dOT5ZQ2@willie-the-truck>

On Fri, Apr 24, 2026 at 05:01:27PM +0100, Will Deacon wrote:
> On Fri, Apr 24, 2026 at 12:42:56PM -0300, Jason Gunthorpe wrote:
> > On Fri, Apr 24, 2026 at 04:16:17PM +0100, Will Deacon wrote:
> > > > > > STE/CD is pretty simple now, there is only one place to put the CMO
> > > > > > and the ordering is all handled with that shared code. We no longer
> > > > > > care about ordering beyond all the writes must be visible to HW before
> > > > > > issuing the CMDQ invalidation command - which is the same environment
> > > > > > as the pagetable.
> > > > > 
> > > > > You presumably rely on 64-bit single-copy atomicity for hitless updates,
> > > > > no?
> > > > 
> > > > Yes, just like the page table does..
> > > > 
> > > > I hope that's not a problem or we have an issue with the PTW :)
> > > 
> > > You trimmed the part from my reply where I think we _do_ have an issue
> > > with the PTW. Here it is again:
> > > 
> > >   The non-coherent case looks more fragile, because I don't _think_ the
> > >   architecture provides any ordering or atomicity guarantees about cache
> > >   cleaning to the PoC. Presumably, the correct sequence would be to write
> > >   the PTE with the valid bit clear, do the CMO (with completion barrier),
> > >   *then* write the bottom byte with the valid bit set and do another CMO.
> > 
> > I wasn't sure if you are being serious.
> > 
> > CMO + barriers must provide an ordering guarantee about cache cleaning
> > to POC otherwise the entire Linux DMA API is broken. dma_sync must
> > order with following device DMA. IMHO that's not negotiable for Linux.
> 
> The problem is with concurrent DMA (from the page-table walker) and I
> don't see anything that guarantees that in the CPU architecture. I don't
> think the streaming DMA API pretends to handle that case, does it? It
> relies on a pretty rigid ownership concept from what I understand.

I think you pointed out two things, ordering and tearing.

Ordering is OK. If I write a PTE, dma_sync, then command a device to
use that IOVA the PTW must observe the new PTE value. Otherwise
dma_sync isn't doing what Linux requires.
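
A toy userspace sketch of that ordering, not kernel code. Everything
here is a made-up name for illustration (struct pte_model,
model_write_pte(), model_dma_sync(), model_walker_read()), and the
"sync" is modelled as a plain copy from a cached view to a PoC view
rather than a real CMO plus barrier. It only shows that the walker's
view changes at the sync, so the sync has to complete before the device
is told to use the IOVA:

```c
#include <stdint.h>

/* Toy model: one PTE as held in the CPU cache and as visible at the PoC. */
struct pte_model {
	uint64_t cached;	/* what the CPU last wrote */
	uint64_t at_poc;	/* what a non-coherent walker would read */
};

/* CPU store: lands in the cache only. */
static void model_write_pte(struct pte_model *m, uint64_t val)
{
	m->cached = val;
}

/* Hypothetical stand-in for cache clean to PoC plus completion barrier. */
static void model_dma_sync(struct pte_model *m)
{
	m->at_poc = m->cached;
}

/* Stand-in for the walker fetching the PTE once the device uses the IOVA. */
static uint64_t model_walker_read(const struct pte_model *m)
{
	return m->at_poc;
}

/* Write, sync, then let the device go: the walker must see the new value. */
static uint64_t model_map_then_use(struct pte_model *m, uint64_t val)
{
	model_write_pte(m, val);
	model_dma_sync(m);
	return model_walker_read(m);
}
```

In this model, skipping model_dma_sync() leaves the walker reading the
old value, which is the case the ordering rule excludes.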

Tearing is a different issue: if the device uses the IOVA and races
with the PTE write changing it, then you say it may mis-read the PTE
with tearing.

However, this race only happens if the PTE is currently non-valid or
being changed to non-valid. That means you will randomly be getting an
invalid IOVA event.
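
For reference, the two-step sequence quoted above for the non-coherent
case (write with valid clear, CMO, write the bottom byte with valid set,
CMO) might look roughly like the userspace sketch below. It is not the
driver's code. fake_cmo(), poc_snapshot[] and install_pte_two_step() are
hypothetical names, PTE_VALID as bit 0 is an assumption, and the
bottom-byte store assumes a little-endian host. fake_cmo() only records
what a walker could observe after each clean:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PTE_VALID	0x1ULL	/* assumption: bit 0 is the valid bit */
#define MAX_SNAPSHOTS	4

/* What a non-coherent walker could see after each (pretend) clean to PoC. */
static uint64_t poc_snapshot[MAX_SNAPSHOTS];
static int nr_snapshots;

/* Hypothetical stand-in for a cache clean to PoC plus completion barrier. */
static void fake_cmo(const uint64_t *ptep)
{
	assert(nr_snapshots < MAX_SNAPSHOTS);
	poc_snapshot[nr_snapshots++] = *ptep;
}

static void install_pte_two_step(uint64_t *ptep, uint64_t new_pte)
{
	uint8_t low = (uint8_t)(new_pte | PTE_VALID);

	/* Step 1: publish every bit except valid, then clean. */
	*ptep = new_pte & ~PTE_VALID;
	fake_cmo(ptep);

	/*
	 * Step 2: rewrite only the bottom byte with valid set, then clean.
	 * Byte 0 is the low byte only on little-endian (assumed here).
	 */
	memcpy(ptep, &low, sizeof(low));
	fake_cmo(ptep);
}
```

The point of the split is that the only store that flips the entry to
valid is a single byte, so a walker sees either a complete but invalid
entry or the complete valid one.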

In non-coherent mode we don't allow SVA and we don't allow VFIO. Only
the DMA API and drivers open coding things.

For VFIO and SVA, yes, we need the HW to work properly: userspace
can trigger an invalid IOVA, so we can't tolerate a corrupted PTE.

In embedded I suppose you could argue that you don't care about it:
an invalid IOVA would have to be caused by a buggy kernel driver, so it
should never happen, and thus this is really a debug feature. If the
race will never be hit in a working system, maybe it is fine to leave
it as is.

Would be good to document this detail :)

> Of course I'd rather that the architecture said that our current code
> is fine, but if it doesn't then I don't have much choice, really. At the
> very least, we should minimise the number of places where we rely on
> non-architected behaviour and so keeping the CDs and STEs non-cacheable
> remains my preference.

So, I am convinced: the PTW has the escape above, which doesn't apply to
STE/CD. Those can be accessed truly at any time and we can't ever
leave a 64-bit value in a strange state.
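
To make "strange state" concrete, here is a small userspace sketch using
C11 atomics. qword_set() and torn_after_low_half() are made-up helpers,
not the driver's functions. C11 only promises atomicity between C11
atomic accesses, so this models the idea rather than proving anything
about a device reading non-coherent memory (kernel code would typically
use WRITE_ONCE() for the single store):

```c
#include <stdatomic.h>
#include <stdint.h>

/* One 64-bit store: a concurrent atomic reader sees old or new, never a mix. */
static void qword_set(_Atomic uint64_t *q, uint64_t val)
{
	atomic_store_explicit(q, val, memory_order_release);
}

/*
 * The value a reader could observe if the update were instead done as two
 * 32-bit stores, low half first, and the read landed between them.
 */
static uint64_t torn_after_low_half(uint64_t old, uint64_t new_val)
{
	return (old & 0xffffffff00000000ULL) |
	       (new_val & 0x00000000ffffffffULL);
}
```

The torn value matches neither the old nor the new qword, which is
exactly what must never be visible for an STE or CD.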

Jason


Thread overview: 16+ messages
2026-04-20 12:32 [PATCH] iommu/arm-smmu-v3: Allow disabling Stage 1 translation Evangelos Petrongonas
2026-04-20 12:40 ` Jason Gunthorpe
2026-04-22  6:44   ` Evangelos Petrongonas
2026-04-22 15:44     ` Pranjal Shrivastava
2026-04-22 16:23     ` Jason Gunthorpe
2026-04-22 16:36       ` Robin Murphy
2026-04-23  9:44       ` Will Deacon
2026-04-23  9:47         ` Will Deacon
2026-04-23 14:23           ` Jason Gunthorpe
2026-04-23 17:07             ` Will Deacon
2026-04-23 18:43               ` Samiullah Khawaja
2026-04-23 22:37               ` Jason Gunthorpe
2026-04-24 15:16                 ` Will Deacon
2026-04-24 15:42                   ` Jason Gunthorpe
2026-04-24 16:01                     ` Will Deacon
2026-04-24 16:39                       ` Jason Gunthorpe [this message]
