linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Mostafa Saleh <smostafa@google.com>
Cc: acpica-devel@lists.linux.dev,
	Alex Williamson <alex.williamson@redhat.com>,
	Hanjun Guo <guohanjun@huawei.com>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	Kevin Tian <kevin.tian@intel.com>,
	kvm@vger.kernel.org, Len Brown <lenb@kernel.org>,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Robert Moore <robert.moore@intel.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Will Deacon <will@kernel.org>, Eric Auger <eric.auger@redhat.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Moritz Fischer <mdf@kernel.org>,
	Michael Shavit <mshavit@google.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	patches@lists.linux.dev,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available
Date: Tue, 20 Aug 2024 17:21:38 -0300	[thread overview]
Message-ID: <20240820202138.GH3773488@nvidia.com> (raw)
In-Reply-To: <ZsT0Fd5FHS47gm0-@google.com>

On Tue, Aug 20, 2024 at 07:52:53PM +0000, Mostafa Saleh wrote:
> On Tue, Aug 20, 2024 at 09:01:02AM -0300, Jason Gunthorpe wrote:
> > On Tue, Aug 20, 2024 at 08:30:05AM +0000, Mostafa Saleh wrote:
> > > Hi Jason,
> > > 
> > > On Tue, Aug 06, 2024 at 08:41:15PM -0300, Jason Gunthorpe wrote:
> > > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field
> > > > works. When S2FWB is supported and enabled the IOPTE will force cachable
> > > > access to IOMMU_CACHE memory and deny cachable access otherwise.
> > > > 
> > > > This is not especially meaningful for simple S2 domains, it apparently
> > > > doesn't even force PCI no-snoop access to be coherent.
> > > > 
> > > > However, when used with a nested S1, FWB has the effect of preventing the
> > > > guest from choosing a MemAttr that would cause ordinary DMA to bypass the
> > > > cache. Consistent with KVM we wish to deny the guest the ability to become
> > > > incoherent with cached memory the hypervisor believes is cachable so we
> > > > don't have to flush it.
> > > > 
> > > > Turn on S2FWB whenever the SMMU supports it and use it for all S2
> > > > mappings.
> > > 
> > > I have been looking into this recently from the KVM side as it will
> > > use FWB for the CPU stage-2 unconditionally for guests(if supported),
> > > however that breaks for non-coherent devices when assigned, and
> > > limiting assigned devices to be coherent seems too restrictive.
> > 
> > kvm's CPU S2 doesn't care about non-DMA-coherent devices though? That
> > concept is only relevant to the SMMU.
>
> Why not? That would be a problem if a device is not dma coherent,
> and the VM knows that and maps it’s DMA memory as non cacheable.
> But it would be overridden by FWB in stage-2 to be cacheable,
> it would lead to coherency issues.

Oh, from that perspective yes, but the entire point of S2FWB is that
VM's can not create non-coherent access so it is a bit nonsense to ask
for both S2FWB and try to assign a non-DMA coherent device.

> Yes, that also breaks (although I think this is an easier problem to
> solve)

Well, it is easy to solve, just don't use S2FWB and manually flush the
caches before the hypervisor touches any memory. :)

> What I mean is the master itself not the SMMU (the SID basically),
> so in that case the STE shouldn’t have FWB enabled.

That doesn't matter, those cases will not pass in IOMMU_CACHE and they
will work fine with S2FWB turned on.

> > Also bear in mind VFIO won't run unless ARM_SMMU_FEAT_COHERENCY is set
> > so we won't even get a chance to ask for a S2 domain.
> 
> Oh, I think that is only for the SMMU, not for the master, the
> SMMU can be coherent (for pte, ste …) but the master can still be
> non coherent. Looking at how VFIO uses it, that seems to be a bug?

If there are mixes of SMMU feature and dev_is_dma_coherent() then it
would be a bug yes..

I recall we started out trying to use dev_is_dma_coherent() but
Christoph explained it doesn't work that generally:

https://lore.kernel.org/kvm/20220406135150.GA21532@lst.de/

Seems we sort of gave up on it, too complicated. Robin had a nice
observation of the complexity:

    Disregarding the complete disaster of PCIe No Snoop on Arm-Based 
    systems, there's the more interesting effectively-opposite scenario 
    where an SMMU bridges non-coherent devices to a coherent interconnect. 
    It's not something we take advantage of yet in Linux, and it can only be 
    properly described in ACPI, but there do exist situations where 
    IOMMU_CACHE is capable of making the device's traffic snoop, but 
    dev_is_dma_coherent() - and device_get_dma_attr() for external users - 
    would still say non-coherent because they can't assume that the SMMU is 
    enabled and programmed in just the right way.

Anyhow, for the purposes of KVM and VFIO, devices that don't work with
IOMMU_CACHE are not allowed. From an API perspective
IOMMU_CAP_CACHE_COHERENCY is supposed to return if the struct device
can use IOMMU_CACHE.

The corner case where we have a ARM_SMMU_FEAT_COHERENCY SMMU but
somehow specific devices don't support IOMMU_CACHE is not properly
reflected in IOMMU_CAP_CACHE_COHERENCY. I don't know how to fix that,
and we've been ignoring it for a long time now :)

Jason


  reply	other threads:[~2024-08-20 20:22 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-06 23:41 [PATCH 0/8] Initial support for SMMUv3 nested translation Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 1/8] vfio: Remove VFIO_TYPE1_NESTING_IOMMU Jason Gunthorpe
2024-08-07 17:52   ` Alex Williamson
2024-08-20  8:23   ` Mostafa Saleh
2024-08-06 23:41 ` [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available Jason Gunthorpe
2024-08-09 14:26   ` Shameerali Kolothum Thodi
2024-08-09 15:12     ` Jason Gunthorpe
2024-08-15 16:14       ` Shameerali Kolothum Thodi
2024-08-15 16:18         ` Jason Gunthorpe
2024-08-20  8:30   ` Mostafa Saleh
2024-08-20 12:01     ` Jason Gunthorpe
2024-08-20 13:01       ` Jason Gunthorpe
2024-08-20 19:52       ` Mostafa Saleh
2024-08-20 20:21         ` Jason Gunthorpe [this message]
2024-08-21  9:53           ` Mostafa Saleh
2024-08-21 12:06             ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 3/8] ACPI/IORT: Support CANWBS memory access flag Jason Gunthorpe
2024-08-09 14:36   ` Shameerali Kolothum Thodi
2024-08-09 14:59     ` Jason Gunthorpe
2024-08-09 15:15       ` Shameerali Kolothum Thodi
2024-08-09 20:14         ` Nicolin Chen
2024-08-06 23:41 ` [PATCH 4/8] iommu/arm-smmu-v3: Report IOMMU_CAP_ENFORCE_CACHE_COHERENCY for CANWBS Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 5/8] iommu/arm-smmu-v3: Support IOMMU_GET_HW_INFO via struct arm_smmu_hw_info Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 6/8] iommu/arm-smmu-v3: Implement IOMMU_HWPT_ALLOC_NEST_PARENT Jason Gunthorpe
2024-08-09 15:06   ` Robin Murphy
2024-08-09 16:09     ` Jason Gunthorpe
2024-08-09 18:34       ` Robin Murphy
2024-08-13 14:35         ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 7/8] iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED Jason Gunthorpe
2024-08-09 16:05   ` Robin Murphy
2024-08-09 18:03     ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 8/8] iommu/arm-smmu-v3: Add arm_smmu_cache_invalidate_user Jason Gunthorpe
2024-08-20  8:20 ` [PATCH 0/8] Initial support for SMMUv3 nested translation Mostafa Saleh
2024-08-20 15:24   ` Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240820202138.GH3773488@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=acpica-devel@lists.linux.dev \
    --cc=alex.williamson@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=guohanjun@huawei.com \
    --cc=iommu@lists.linux.dev \
    --cc=jean-philippe@linaro.org \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=lpieralisi@kernel.org \
    --cc=mdf@kernel.org \
    --cc=mshavit@google.com \
    --cc=nicolinc@nvidia.com \
    --cc=patches@lists.linux.dev \
    --cc=rafael@kernel.org \
    --cc=robert.moore@intel.com \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=smostafa@google.com \
    --cc=sudeep.holla@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).