From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
maz@kernel.org, oliver.upton@linux.dev, joey.gouly@arm.com,
suzuki.poulose@arm.com, yuzenghui@huawei.com,
catalin.marinas@arm.com, will@kernel.org, robin.murphy@arm.com,
jean-philippe@linaro.org, qperret@google.com, tabba@google.com,
mark.rutland@arm.com, praan@google.com
Subject: Re: [PATCH v3 29/29] iommu/arm-smmu-v3-kvm: Add IOMMU ops
Date: Mon, 4 Aug 2025 14:41:31 +0000 [thread overview]
Message-ID: <aJDGm02ihZyrBalY@google.com> (raw)
In-Reply-To: <20250801185930.GH26511@ziepe.ca>
On Fri, Aug 01, 2025 at 03:59:30PM -0300, Jason Gunthorpe wrote:
> On Thu, Jul 31, 2025 at 05:44:55PM +0000, Mostafa Saleh wrote:
> > > > They are not random, as part of this series the SMMUv3 driver is split
> > > > where some of the code goes to “arm-smmu-v3-common.c” which is used by
> > > > both drivers, this reduces a lot of duplication.
> > >
> > > I find it very confusing.
> > >
> > > It made sense to factor some of the code out so that pKVM can have
> > > it's own smmv3 HW driver, sure.
> > >
> > > But I don't understand why a paravirtualized iommu driver for pKVM has
> > > any relation to smmuv3. Shouldn't it just be calling some hypercalls
> > > to set IDENTITY/BLOCKING?
> >
> > Well it’s not really “paravirtualized” as virtio-iommu, this is an SMMUv3
> > driver (it uses the same binding a the smmu-v3)
>
> > It re-use the same probe code, fw/hw parsing and so on (inside the kernel),
> > also re-use the same structs to make that possible.
>
> I think this is not quite true, I think you have some part of the smmu driver
> bootstrap the pkvm protected driver.
>
> But then the pkvm takes over all the registers and the command queue.
>
> Are you saying the event queue is left behind for the kernel? How does
> that work if it doesn't have access to the registers?
The evtq itself will be owned by the kernel, However, MMIO access would be
trapped and emulated, here the PoC for part-2 of this series (as mentioned in
the cover letter) This is very close to how nesting will work.
https://android-kvm.googlesource.com/linux/+/refs/heads/for-upstream/pkvm-smmu-v3-part-2/drivers/iommu/arm/arm-smmu-v3/pkvm/arm-smmu-v3.c#744
>
> So what is left of the actual *iommu subsystem* driver is just some
> pkvm hypercalls?
Yes at the moment there are only 2 hypercalls and one hypervisor callback
to shadow the page table, when we go to nesting, the hypercalls will be
removed and there will be only data abort callback for MMIO emulation.
>
> It seems more sensible to me to have a pkvm HW driver for SMMUv3 that
> is split between pkvm and kernel, that operates the HW - but is NOT an
> iommu subsystem driver
>
> Then an iommu subsystem driver that does the hypercalls, that is NOT
> connected to SMMUv3 at all.
>
> In other words you have two cleanly seperate concerns here, an "pkvm
> iommu subsystem" that lets pkvm control iommu HW - and the current
> "iommu subsystem" that lets the kernel control iommu HW. The same
> driver should not register to both.
>
I am not sure how that would work exactly, for example how would probe_device
work, xlate... in a generic way? same for other ops. We can make some of these
functions (hypercalls wrappers) in a separate file. Also I am not sure how that
looks from the kernel perspective (do we have 2 struct devices per SMMU?)
But, tbh, i’d prefer to drop iommu_ops at all, check my answer below.
> > As mentioned in the cover letter, we can also still build nesting on top of
> > this driver, and I plan to post an RFC for that, once this one is sorted.
>
> I would expect nesting to present an actual paravirtualized SMMUv3
> though, with a proper normal SMMUv3 IOMMU subystem driver. This is how
> ARM architecture is built to work, why mess it up?
>
> So my advice above seems cleaner, you have the pkvm iommu HW driver
> that turns around and presents a completely normal SMMUv3 HW API which
> is bound by the ordinately SMMUv3 iommu subsystem driver.
>
I think we are on the same page about how that will look at the end.
For nesting there will be a pKVM driver (as mentioned in the cover letter)
to probe register the SMMUs, then it will unbind itself to let the current
(ARM_SMMU_V3) driver probe the SMMUs and it can run unmodified. Which
will be full transparent.
Then the hypervisor driver will use trap and emulate to handle SMMU access
to the MMIO, providing an architecturally accurate SMMUv3 emualation, and
it will not register iommu_ops.
Nor will it use any hypercalls, as the main reason I added those is to tell
the hypervisor what SIDs are used in identity while others remain blocked, as
enabling all the SID space doesn’t only require a lot of memory but also
doesn't feel secure.
With nesting, we don’t need those, as the hypervisor will trap CFGI and will
know what SID to shadow.
However, based on the feedback on my v2 series, it was better to split pKVM
support, so the initial series only establishes DMA isolation. Then when we
can enable full translating domains (either nesting or pv which is another
discussion)
So, I will happily drop the hypercalls and the iommu_ops from this series,
if there is a better way to enlighten the hypervisor about the SIDs to be
in identity.
Otherwise I can’t see any other way to move forward other than going back to
posting large serieses.
Thanks,
Mostafa
> Jason
next prev parent reply other threads:[~2025-08-04 14:41 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-28 17:52 [PATCH v3 00/29] KVM: arm64: SMMUv3 driver for pKVM Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 01/29] KVM: arm64: Add a new function to donate memory with prot Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 02/29] KVM: arm64: Donate MMIO to the hypervisor Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 03/29] KVM: arm64: pkvm: Add pkvm_time_get() Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 04/29] iommu/io-pgtable-arm: Split the page table driver Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 05/29] iommu/io-pgtable-arm: Split initialization Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 06/29] iommu/arm-smmu-v3: Move some definitions to a new common file Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 07/29] iommu/arm-smmu-v3: Extract driver-specific bits from probe function Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 08/29] iommu/arm-smmu-v3: Move some functions to arm-smmu-v3-common.c Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 09/29] iommu/arm-smmu-v3: Move queue and table allocation " Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 10/29] iommu/arm-smmu-v3: Move firmware probe to arm-smmu-v3-common Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 11/29] iommu/arm-smmu-v3: Move IOMMU registration to arm-smmu-v3-common.c Mostafa Saleh
2025-07-28 17:52 ` [PATCH v3 12/29] iommu/arm-smmu-v3: Split cmdq code with hyp Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 13/29] iommu/arm-smmu-v3: Move TLB range invalidation into a macro Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 14/29] KVM: arm64: iommu: Introduce IOMMU driver infrastructure Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 15/29] KVM: arm64: iommu: Shadow host stage-2 page table Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 16/29] KVM: arm64: iommu: Add a memory pool Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 17/29] KVM: arm64: iommu: Add enable/disable hypercalls Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 18/29] iommu/arm-smmu-v3-kvm: Add SMMUv3 driver Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 19/29] iommu/arm-smmu-v3-kvm: Initialize registers Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 20/29] iommu/arm-smmu-v3-kvm: Setup command queue Mostafa Saleh
2025-07-29 6:44 ` Krzysztof Kozlowski
2025-07-29 9:55 ` Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 21/29] iommu/arm-smmu-v3-kvm: Setup stream table Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 22/29] iommu/arm-smmu-v3-kvm: Reset the device Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 23/29] iommu/arm-smmu-v3-kvm: Support io-pgtable Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 24/29] iommu/arm-smmu-v3-kvm: Shadow the CPU stage-2 page table Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 25/29] iommu/arm-smmu-v3-kvm: Add enable/disable device HVCs Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 26/29] iommu/arm-smmu-v3-kvm: Add host driver for pKVM Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 27/29] iommu/arm-smmu-v3-kvm: Pass a list of SMMU devices to the hypervisor Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 28/29] iommu/arm-smmu-v3-kvm: Allocate structures and reset device Mostafa Saleh
2025-07-28 17:53 ` [PATCH v3 29/29] iommu/arm-smmu-v3-kvm: Add IOMMU ops Mostafa Saleh
2025-07-30 14:42 ` Jason Gunthorpe
2025-07-30 15:07 ` Mostafa Saleh
2025-07-30 16:47 ` Jason Gunthorpe
2025-07-31 14:17 ` Mostafa Saleh
2025-07-31 16:57 ` Jason Gunthorpe
2025-07-31 17:44 ` Mostafa Saleh
2025-08-01 18:59 ` Jason Gunthorpe
2025-08-04 14:41 ` Mostafa Saleh [this message]
2025-08-05 17:57 ` Jason Gunthorpe
2025-08-06 14:10 ` Mostafa Saleh
2025-08-11 18:55 ` Jason Gunthorpe
2025-08-12 10:29 ` Mostafa Saleh
2025-08-12 12:10 ` Jason Gunthorpe
2025-08-12 12:37 ` Mostafa Saleh
2025-08-12 13:48 ` Jason Gunthorpe
2025-08-13 13:52 ` Mostafa Saleh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aJDGm02ihZyrBalY@google.com \
--to=smostafa@google.com \
--cc=catalin.marinas@arm.com \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=jgg@ziepe.ca \
--cc=joey.gouly@arm.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=praan@google.com \
--cc=qperret@google.com \
--cc=robin.murphy@arm.com \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).