From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: acpica-devel@lists.linux.dev,
Alex Williamson <alex.williamson@redhat.com>,
Hanjun Guo <guohanjun@huawei.com>,
iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Kevin Tian <kevin.tian@intel.com>,
kvm@vger.kernel.org, Len Brown <lenb@kernel.org>,
linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Robert Moore <robert.moore@intel.com>,
Robin Murphy <robin.murphy@arm.com>,
Sudeep Holla <sudeep.holla@arm.com>,
Will Deacon <will@kernel.org>, Eric Auger <eric.auger@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Moritz Fischer <mdf@kernel.org>,
Michael Shavit <mshavit@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
patches@lists.linux.dev,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Marc Zyngier <maz@kernel.org>
Subject: Re: [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available
Date: Wed, 21 Aug 2024 09:53:33 +0000 [thread overview]
Message-ID: <ZsW5HRZj2O2hGQYc@google.com> (raw)
In-Reply-To: <20240820202138.GH3773488@nvidia.com>
On Tue, Aug 20, 2024 at 05:21:38PM -0300, Jason Gunthorpe wrote:
> On Tue, Aug 20, 2024 at 07:52:53PM +0000, Mostafa Saleh wrote:
> > On Tue, Aug 20, 2024 at 09:01:02AM -0300, Jason Gunthorpe wrote:
> > > On Tue, Aug 20, 2024 at 08:30:05AM +0000, Mostafa Saleh wrote:
> > > > Hi Jason,
> > > >
> > > > On Tue, Aug 06, 2024 at 08:41:15PM -0300, Jason Gunthorpe wrote:
> > > > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field
> > > > > works. When S2FWB is supported and enabled the IOPTE will force cachable
> > > > > access to IOMMU_CACHE memory and deny cachable access otherwise.
> > > > >
> > > > > This is not especially meaningful for simple S2 domains, it apparently
> > > > > doesn't even force PCI no-snoop access to be coherent.
> > > > >
> > > > > However, when used with a nested S1, FWB has the effect of preventing the
> > > > > guest from choosing a MemAttr that would cause ordinary DMA to bypass the
> > > > > cache. Consistent with KVM we wish to deny the guest the ability to become
> > > > > incoherent with cached memory the hypervisor believes is cachable so we
> > > > > don't have to flush it.
> > > > >
> > > > > Turn on S2FWB whenever the SMMU supports it and use it for all S2
> > > > > mappings.
> > > >
> > > > I have been looking into this recently from the KVM side as it will
> > > > use FWB for the CPU stage-2 unconditionally for guests(if supported),
> > > > however that breaks for non-coherent devices when assigned, and
> > > > limiting assigned devices to be coherent seems too restrictive.
> > >
> > > kvm's CPU S2 doesn't care about non-DMA-coherent devices though? That
> > > concept is only relevant to the SMMU.
> >
> > Why not? That would be a problem if a device is not dma coherent,
> > and the VM knows that and maps it’s DMA memory as non cacheable.
> > But it would be overridden by FWB in stage-2 to be cacheable,
> > it would lead to coherency issues.
>
> Oh, from that perspective yes, but the entire point of S2FWB is that
> VM's can not create non-coherent access so it is a bit nonsense to ask
> for both S2FWB and try to assign a non-DMA coherent device.
Yes, but KVM sets FWB unconditionally and would use cacheable mapping
for stage-2, and I expect the same for the nested SMMU.
>
> > Yes, that also breaks (although I think this is an easier problem to
> > solve)
>
> Well, it is easy to solve, just don't use S2FWB and manually flush the
> caches before the hypervisor touches any memory. :)
Yes, although that means virtualized devices would have worse
performance :/ but I guess there is nothing more to do here.
I have some ideas about that, I can send patches to the kvm list
as an RFC.
>
> > What I mean is the master itself not the SMMU (the SID basically),
> > so in that case the STE shouldn’t have FWB enabled.
>
> That doesn't matter, those cases will not pass in IOMMU_CACHE and they
> will work fine with S2FWB turned on.
>
But that won’t be the case in nested? Otherwise why we use FWB in the
first place.
> > > Also bear in mind VFIO won't run unless ARM_SMMU_FEAT_COHERENCY is set
> > > so we won't even get a chance to ask for a S2 domain.
> >
> > Oh, I think that is only for the SMMU, not for the master, the
> > SMMU can be coherent (for pte, ste …) but the master can still be
> > non coherent. Looking at how VFIO uses it, that seems to be a bug?
>
> If there are mixes of SMMU feature and dev_is_dma_coherent() then it
> would be a bug yes..
>
I think there is a bug, I was able to assign a “non-coherent” device with
VFIO with no issues, and it allows it as long as the SMMU is coherent.
> I recall we started out trying to use dev_is_dma_coherent() but
> Christoph explained it doesn't work that generally:
>
> https://lore.kernel.org/kvm/20220406135150.GA21532@lst.de/
>
> Seems we sort of gave up on it, too complicated. Robin had a nice
> observation of the complexity:
>
> Disregarding the complete disaster of PCIe No Snoop on Arm-Based
> systems, there's the more interesting effectively-opposite scenario
> where an SMMU bridges non-coherent devices to a coherent interconnect.
> It's not something we take advantage of yet in Linux, and it can only be
> properly described in ACPI, but there do exist situations where
> IOMMU_CACHE is capable of making the device's traffic snoop, but
> dev_is_dma_coherent() - and device_get_dma_attr() for external users -
> would still say non-coherent because they can't assume that the SMMU is
> enabled and programmed in just the right way.
>
> Anyhow, for the purposes of KVM and VFIO, devices that don't work with
> IOMMU_CACHE are not allowed. From an API perspective
> IOMMU_CAP_CACHE_COHERENCY is supposed to return if the struct device
> can use IOMMU_CACHE.
>
> The corner case where we have a ARM_SMMU_FEAT_COHERENCY SMMU but
> somehow specific devices don't support IOMMU_CACHE is not properly
> reflected in IOMMU_CAP_CACHE_COHERENCY. I don't know how to fix that,
> and we've been ignoring it for a long time now :)
Thanks a lot for the extra context!
Maybe the SMMUv3 .capable, should be changed to check if the device is
coherent (instead of using dev_is_dma_coherent, it can use lower level
functions from the supported buses)
Also, I think supporting IOMMU_CACHE is not enough, as the SMMU can
support it but the device is still not coherent.
Thanks,
Mostafa
>
> Jason
next prev parent reply other threads:[~2024-08-21 10:11 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-06 23:41 [PATCH 0/8] Initial support for SMMUv3 nested translation Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 1/8] vfio: Remove VFIO_TYPE1_NESTING_IOMMU Jason Gunthorpe
2024-08-07 17:52 ` Alex Williamson
2024-08-20 8:23 ` Mostafa Saleh
2024-08-06 23:41 ` [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available Jason Gunthorpe
2024-08-09 14:26 ` Shameerali Kolothum Thodi
2024-08-09 15:12 ` Jason Gunthorpe
2024-08-15 16:14 ` Shameerali Kolothum Thodi
2024-08-15 16:18 ` Jason Gunthorpe
2024-08-20 8:30 ` Mostafa Saleh
2024-08-20 12:01 ` Jason Gunthorpe
2024-08-20 13:01 ` Jason Gunthorpe
2024-08-20 19:52 ` Mostafa Saleh
2024-08-20 20:21 ` Jason Gunthorpe
2024-08-21 9:53 ` Mostafa Saleh [this message]
2024-08-21 12:06 ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 3/8] ACPI/IORT: Support CANWBS memory access flag Jason Gunthorpe
2024-08-09 14:36 ` Shameerali Kolothum Thodi
2024-08-09 14:59 ` Jason Gunthorpe
2024-08-09 15:15 ` Shameerali Kolothum Thodi
2024-08-09 20:14 ` Nicolin Chen
2024-08-06 23:41 ` [PATCH 4/8] iommu/arm-smmu-v3: Report IOMMU_CAP_ENFORCE_CACHE_COHERENCY for CANWBS Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 5/8] iommu/arm-smmu-v3: Support IOMMU_GET_HW_INFO via struct arm_smmu_hw_info Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 6/8] iommu/arm-smmu-v3: Implement IOMMU_HWPT_ALLOC_NEST_PARENT Jason Gunthorpe
2024-08-09 15:06 ` Robin Murphy
2024-08-09 16:09 ` Jason Gunthorpe
2024-08-09 18:34 ` Robin Murphy
2024-08-13 14:35 ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 7/8] iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED Jason Gunthorpe
2024-08-09 16:05 ` Robin Murphy
2024-08-09 18:03 ` Jason Gunthorpe
2024-08-06 23:41 ` [PATCH 8/8] iommu/arm-smmu-v3: Add arm_smmu_cache_invalidate_user Jason Gunthorpe
2024-08-20 8:20 ` [PATCH 0/8] Initial support for SMMUv3 nested translation Mostafa Saleh
2024-08-20 15:24 ` Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZsW5HRZj2O2hGQYc@google.com \
--to=smostafa@google.com \
--cc=acpica-devel@lists.linux.dev \
--cc=alex.williamson@redhat.com \
--cc=eric.auger@redhat.com \
--cc=guohanjun@huawei.com \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=lpieralisi@kernel.org \
--cc=maz@kernel.org \
--cc=mdf@kernel.org \
--cc=mshavit@google.com \
--cc=nicolinc@nvidia.com \
--cc=patches@lists.linux.dev \
--cc=rafael@kernel.org \
--cc=robert.moore@intel.com \
--cc=robin.murphy@arm.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=sudeep.holla@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).