From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: acpica-devel@lists.linux.dev,
Alex Williamson <alex.williamson@redhat.com>,
Hanjun Guo <guohanjun@huawei.com>,
iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Kevin Tian <kevin.tian@intel.com>,
kvm@vger.kernel.org, Len Brown <lenb@kernel.org>,
linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Robert Moore <robert.moore@intel.com>,
Robin Murphy <robin.murphy@arm.com>,
Sudeep Holla <sudeep.holla@arm.com>,
Will Deacon <will@kernel.org>, Eric Auger <eric.auger@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Moritz Fischer <mdf@kernel.org>,
Michael Shavit <mshavit@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
patches@lists.linux.dev,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Marc Zyngier <maz@kernel.org>
Subject: Re: [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available
Date: Wed, 21 Aug 2024 09:53:33 +0000 [thread overview]
Message-ID: <ZsW5HRZj2O2hGQYc@google.com> (raw)
In-Reply-To: <20240820202138.GH3773488@nvidia.com>
On Tue, Aug 20, 2024 at 05:21:38PM -0300, Jason Gunthorpe wrote:
> On Tue, Aug 20, 2024 at 07:52:53PM +0000, Mostafa Saleh wrote:
> > On Tue, Aug 20, 2024 at 09:01:02AM -0300, Jason Gunthorpe wrote:
> > > On Tue, Aug 20, 2024 at 08:30:05AM +0000, Mostafa Saleh wrote:
> > > > Hi Jason,
> > > >
> > > > On Tue, Aug 06, 2024 at 08:41:15PM -0300, Jason Gunthorpe wrote:
> > > > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field
> > > > > works. When S2FWB is supported and enabled the IOPTE will force cachable
> > > > > access to IOMMU_CACHE memory and deny cachable access otherwise.
> > > > >
> > > > > This is not especially meaningful for simple S2 domains, it apparently
> > > > > doesn't even force PCI no-snoop access to be coherent.
> > > > >
> > > > > However, when used with a nested S1, FWB has the effect of preventing the
> > > > > guest from choosing a MemAttr that would cause ordinary DMA to bypass the
> > > > > cache. Consistent with KVM we wish to deny the guest the ability to become
> > > > > incoherent with cached memory the hypervisor believes is cachable so we
> > > > > don't have to flush it.
> > > > >
> > > > > Turn on S2FWB whenever the SMMU supports it and use it for all S2
> > > > > mappings.
> > > >
> > > > I have been looking into this recently from the KVM side as it will
> > > > use FWB for the CPU stage-2 unconditionally for guests(if supported),
> > > > however that breaks for non-coherent devices when assigned, and
> > > > limiting assigned devices to be coherent seems too restrictive.
> > >
> > > kvm's CPU S2 doesn't care about non-DMA-coherent devices though? That
> > > concept is only relevant to the SMMU.
> >
> > Why not? That would be a problem if a device is not dma coherent,
> > and the VM knows that and maps it’s DMA memory as non cacheable.
> > But it would be overridden by FWB in stage-2 to be cacheable,
> > it would lead to coherency issues.
>
> Oh, from that perspective yes, but the entire point of S2FWB is that
> VM's can not create non-coherent access so it is a bit nonsense to ask
> for both S2FWB and try to assign a non-DMA coherent device.
Yes, but KVM sets FWB unconditionally and would use cacheable mapping
for stage-2, and I expect the same for the nested SMMU.
>
> > Yes, that also breaks (although I think this is an easier problem to
> > solve)
>
> Well, it is easy to solve, just don't use S2FWB and manually flush the
> caches before the hypervisor touches any memory. :)
Yes, although that means virtualized devices would have worse
performance :/ but I guess there is nothing more to do here.
I have some ideas about that, I can send patches to the kvm list
as an RFC.
>
> > What I mean is the master itself not the SMMU (the SID basically),
> > so in that case the STE shouldn’t have FWB enabled.
>
> That doesn't matter, those cases will not pass in IOMMU_CACHE and they
> will work fine with S2FWB turned on.
>
But that won’t be the case in nested? Otherwise why we use FWB in the
first place.
> > > Also bear in mind VFIO won't run unless ARM_SMMU_FEAT_COHERENCY is set
> > > so we won't even get a chance to ask for a S2 domain.
> >
> > Oh, I think that is only for the SMMU, not for the master, the
> > SMMU can be coherent (for pte, ste …) but the master can still be
> > non coherent. Looking at how VFIO uses it, that seems to be a bug?
>
> If there are mixes of SMMU feature and dev_is_dma_coherent() then it
> would be a bug yes..
>
I think there is a bug, I was able to assign a “non-coherent” device with
VFIO with no issues, and it allows it as long as the SMMU is coherent.
> I recall we started out trying to use dev_is_dma_coherent() but
> Christoph explained it doesn't work that generally:
>
> https://lore.kernel.org/kvm/20220406135150.GA21532@lst.de/
>
> Seems we sort of gave up on it, too complicated. Robin had a nice
> observation of the complexity:
>
> Disregarding the complete disaster of PCIe No Snoop on Arm-Based
> systems, there's the more interesting effectively-opposite scenario
> where an SMMU bridges non-coherent devices to a coherent interconnect.
> It's not something we take advantage of yet in Linux, and it can only be
> properly described in ACPI, but there do exist situations where
> IOMMU_CACHE is capable of making the device's traffic snoop, but
> dev_is_dma_coherent() - and device_get_dma_attr() for external users -
> would still say non-coherent because they can't assume that the SMMU is
> enabled and programmed in just the right way.
>
> Anyhow, for the purposes of KVM and VFIO, devices that don't work with
> IOMMU_CACHE are not allowed. From an API perspective
> IOMMU_CAP_CACHE_COHERENCY is supposed to return if the struct device
> can use IOMMU_CACHE.
>
> The corner case where we have a ARM_SMMU_FEAT_COHERENCY SMMU but
> somehow specific devices don't support IOMMU_CACHE is not properly
> reflected in IOMMU_CAP_CACHE_COHERENCY. I don't know how to fix that,
> and we've been ignoring it for a long time now :)
Thanks a lot for the extra context!
Maybe the SMMUv3 .capable, should be changed to check if the device is
coherent (instead of using dev_is_dma_coherent, it can use lower level
functions from the supported buses)
Also, I think supporting IOMMU_CACHE is not enough, as the SMMU can
support it but the device is still not coherent.
Thanks,
Mostafa
>
> Jason
next prev parent reply other threads:[~2024-08-21 9:53 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-06 23:41 [PATCH 0/8] Initial support for SMMUv3 nested translation Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 1/8] vfio: Remove VFIO_TYPE1_NESTING_IOMMU Jason Gunthorpe
2024-08-07 17:52 ` Alex Williamson
2024-08-20 8:23 ` Mostafa Saleh
2024-08-06 23:41 ` [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available Jason Gunthorpe
2024-08-09 14:26 ` Shameerali Kolothum Thodi
2024-08-09 15:12 ` Jason Gunthorpe
2024-08-15 16:14 ` Shameerali Kolothum Thodi
2024-08-15 16:18 ` Jason Gunthorpe
2024-08-20 8:30 ` Mostafa Saleh
2024-08-20 12:01 ` Jason Gunthorpe
2024-08-20 13:01 ` Jason Gunthorpe
2024-08-20 19:52 ` Mostafa Saleh
2024-08-20 20:21 ` Jason Gunthorpe
2024-08-21 9:53 ` Mostafa Saleh [this message]
2024-08-21 12:06 ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 3/8] ACPI/IORT: Support CANWBS memory access flag Jason Gunthorpe
2024-08-09 14:36 ` Shameerali Kolothum Thodi
2024-08-09 14:59 ` Jason Gunthorpe
2024-08-09 15:15 ` Shameerali Kolothum Thodi
2024-08-09 20:14 ` Nicolin Chen
2024-08-06 23:41 ` [PATCH 4/8] iommu/arm-smmu-v3: Report IOMMU_CAP_ENFORCE_CACHE_COHERENCY for CANWBS Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 5/8] iommu/arm-smmu-v3: Support IOMMU_GET_HW_INFO via struct arm_smmu_hw_info Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 6/8] iommu/arm-smmu-v3: Implement IOMMU_HWPT_ALLOC_NEST_PARENT Jason Gunthorpe
2024-08-09 15:06 ` Robin Murphy
2024-08-09 16:09 ` Jason Gunthorpe
2024-08-09 18:34 ` Robin Murphy
2024-08-13 14:35 ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 7/8] iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED Jason Gunthorpe
2024-08-09 16:05 ` Robin Murphy
2024-08-09 18:03 ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 8/8] iommu/arm-smmu-v3: Add arm_smmu_cache_invalidate_user Jason Gunthorpe
2024-08-20 8:20 ` [PATCH 0/8] Initial support for SMMUv3 nested translation Mostafa Saleh
2024-08-20 15:24 ` Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZsW5HRZj2O2hGQYc@google.com \
--to=smostafa@google.com \
--cc=acpica-devel@lists.linux.dev \
--cc=alex.williamson@redhat.com \
--cc=eric.auger@redhat.com \
--cc=guohanjun@huawei.com \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=lpieralisi@kernel.org \
--cc=maz@kernel.org \
--cc=mdf@kernel.org \
--cc=mshavit@google.com \
--cc=nicolinc@nvidia.com \
--cc=patches@lists.linux.dev \
--cc=rafael@kernel.org \
--cc=robert.moore@intel.com \
--cc=robin.murphy@arm.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=sudeep.holla@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.