From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Aneesh Kumar K.V (Arm)" <aneesh.kumar@kernel.org>,
iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
Robin Murphy <robin.murphy@arm.com>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Steven Price <steven.price@arm.com>,
Suzuki K Poulose <Suzuki.Poulose@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Jiri Pirko <jiri@resnulli.us>, Petr Tesarik <ptesarik@suse.com>,
Alexey Kardashevskiy <aik@amd.com>,
Dan Williams <dan.j.williams@intel.com>,
Xu Yilun <yilun.xu@linux.intel.com>,
linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
x86@kernel.org
Subject: Re: [PATCH v4 04/13] dma: swiotlb: track pool encryption state and honor DMA_ATTR_CC_SHARED
Date: Thu, 14 May 2026 14:43:39 +0000
Message-ID: <agXfm3mS_M3fvRrN@google.com>
In-Reply-To: <20260514123529.GZ7702@ziepe.ca>
On Thu, May 14, 2026 at 09:35:29AM -0300, Jason Gunthorpe wrote:
> > > How will pKVM signal what kind of memory the DMA needs then?
> > >
> > > Does it use set_memory_decrypted()? How can it use
> > > set_memory_decrypted() without offering CC_ATTR_MEM_ENCRYPT ?
> >
> > pKVM (hypervisor) doesn’t signal anything.
> > The VMM when running protected guests will use restricted dma-pools
> > for emulated virtio devices in the guest, which gets decrypted by
> > the guest kernel and hence shared with the host kernel, and then
> > traffic is bounced via the pool.
>
> That really does sound like CC and set_memory_decrypted() to me..
>
> > It’s also worth noting that bouncing here isn't just about visibility.
> > Because memory sharing operates at page granularity, bouncing sub-page
> > allocations through the restricted pool prevents adjacent, sensitive
> > guest data from being exposed to the untrusted host.
>
> That's a somewhat different problem, we have the dev->trusted stuff
> that is supposed to deal with this kind of security. We need it for
> IOMMU based systems too, eg hot plug thunderbolt should have it.
I see that it is used only by dma-iommu and only for PCI devices.
However, I think this would also be a problem for other CCA solutions
with emulated devices, as those devices are untrusted, and I'd expect
such guests to have virtio devices.
>
> Then CC issue is more that the DMA API can't decrypt random passed in
> memory because doing so often requires changing the PTEs pointing at
> the page so it would break everything if done transparently.
>
> > > > I believe that the pool should have a way to control its property
> > > > (encrypted or decrypted) and that takes priority over whatever
> > > > attributes come from allocation.
> > >
> > > We should get here because dma_capable() fails, and then swiotlb needs
> > > to return something that makes dma_capable() succeed. Yes, it should
> > > return details about the thing it decided, but it shouldn't have been
> > > pre-created with some idea how to make dma_capable() work.
> >
> > That sounds neat, but at the end we have force_dma_unencrypted() in
> > dma_capable() which is just hardcoded to true/false by the platform.
>
> For now, the next step is it becomes per-device and dynamic during the
> device lifecycle.
>
> > How is that different from having the state static by the pool?
>
> statically attached pools to the device are not so flexible when
> devices have dynamically changing capabilities..
Pools can also be per-device. A device can have multiple pools with
different memory attrs, which can then be matched by the DMA code
at runtime. It's not as flexible, but it removes some complexity from
the guest code.
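A minimal user-space sketch of that runtime matching, to make the idea
concrete. Every name here (sketch_pool, sketch_dev, sketch_pick_pool,
SKETCH_ATTR_SHARED) is hypothetical and only stands in for the real
swiotlb structures and DMA_ATTR_CC_SHARED; it is not the kernel API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute bit, standing in for DMA_ATTR_CC_SHARED. */
#define SKETCH_ATTR_SHARED (1UL << 0)

/* A pool whose encryption state is fixed when the pool is created. */
struct sketch_pool {
	const char *name;
	bool decrypted;
};

/* A device that owns one or more pools with different states. */
struct sketch_dev {
	const struct sketch_pool *pools;
	size_t nr_pools;
};

/*
 * Return the first pool whose state matches what the mapping asked for,
 * or NULL if the device has no suitable pool.
 */
static const struct sketch_pool *
sketch_pick_pool(const struct sketch_dev *dev, unsigned long attrs)
{
	bool want_shared = (attrs & SKETCH_ATTR_SHARED) != 0;

	for (size_t i = 0; i < dev->nr_pools; i++) {
		if (dev->pools[i].decrypted == want_shared)
			return &dev->pools[i];
	}
	return NULL;
}
```

The pool state stays static here; only the choice between pools depends
on the attrs of each mapping, which is the trade-off described above.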
>
> > > If dma_capable() can fail, then swiotlb should know exactly what to do
> > > to fix it.
> >
> > dma_capable() returns a bool, I don’t think it can know what exactly
> > went wrong (based on address, size, attrs, dev...)
>
> Yes, but I think the design is swiotlb is supposed to re-inspect what
> is going on against the limits dma_capable checks and then select the
> correct remedy..
I see, but that’s not part of this series, and it would probably require
some rework so that dma_capable() can return an error code (ERANGE,
EPERM...) that the caller can then deal with.
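As a rough illustration of what such a rework could look like, here is a
user-space sketch. It is not the real dma_capable() signature: the
sketch_caps struct, the needs_shared flag (standing in for
force_dma_unencrypted()) and the remedy enum are all assumptions made up
for the example:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device limits checked by the sketch. */
struct sketch_caps {
	uint64_t dma_mask;   /* highest address the device can reach */
	bool needs_shared;   /* device can only DMA to shared memory */
};

/* Hypothetical remedies a caller such as swiotlb could choose from. */
enum sketch_remedy {
	SKETCH_MAP_DIRECT,
	SKETCH_BOUNCE_ADDRESSABLE,
	SKETCH_BOUNCE_SHARED,
	SKETCH_FAIL,
};

/* Return 0 if the buffer is usable as is, or a negative errno saying why not. */
static int sketch_dma_check(const struct sketch_caps *caps, uint64_t addr,
			    uint64_t size, bool mem_is_shared)
{
	uint64_t end = addr + size - 1;

	/* Empty, wrapping, or out-of-mask ranges are an addressing problem. */
	if (size == 0 || end < addr || end > caps->dma_mask)
		return -ERANGE;
	/* Private memory handed to a device that needs shared memory. */
	if (caps->needs_shared && !mem_is_shared)
		return -EPERM;
	return 0;
}

/* Map the reason for the failure to a remedy instead of guessing. */
static enum sketch_remedy sketch_pick_remedy(int err)
{
	switch (err) {
	case 0:
		return SKETCH_MAP_DIRECT;
	case -ERANGE:
		return SKETCH_BOUNCE_ADDRESSABLE;
	case -EPERM:
		return SKETCH_BOUNCE_SHARED;
	default:
		return SKETCH_FAIL;
	}
}
```

With a bool return the caller would have to re-derive which of the two
checks failed; with an errno it can pick the bounce target directly.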
>
> > While we can debate the aesthetics of the setup, this is
> > the existing behaviour for Linux, which has existed for years,
> > which pKVM relies on, and which is used extensively.
> > And, this patch alters that long-standing logic and introduces
> > a functional regression.
>
> Yeah, Aneesh needs to do something here, I'm pointing out it is an
> entirely separate thing from the CC path we are working on which is
> decoupling CC from relying on force swiotlb.
I am looking into converting pKVM to use the CC stuff, I replied with
a patch to Aneesh in this thread. However, I need to do more testing
and make sure there are not any unwanted consequences.
>
> > We can address this by either adjusting this patch or by changing
> > pKVM guests to be more aligned with other CCA guests which is
> > something I have been wondering about if it would help reduce
> > bouncing.
>
> Every time I look at pkvm I think it is just ARM CCA with a different
> design and no access to the unique HW features..
>
> > > If we can make that work then maybe the flows are designed correctly.
> >
> > Mmm, I am not sure I understand this one, shouldn’t the device also be
> > notified about the switch in memory state, if it expects to read/write
> > decrypted memory, how would that work if the kernel changes it to an
> > encrypted one?
>
> Nothing on the device changes. In a CC world we put the device in a
> T=0 or T=1 state before the driver loads and the expectation from the
> DMA API is that the device will only use that T=x DMA type during
> operation.
>
> A T=1 state device can access all of memory, private or shared. Any
> information the platform may need is encoded in the dma_addr_t or in
> the S1 IOPTEs.
>
> So we never need to tell the device driver what kind of memory the DMA
> is targetting, and we NEVER expect a device in T=1 mode to have to
> issue a T=0 DMA to use the DMA API.
>
> In a pkvm world it should be the same, the S2 table for the SMMU will
> control what the device can access, and if the SMMU points to a
> "private" or "shared" page is not something the device needs to know
> or care about.
I see, that's because dma-iommu chooses the attrs for iommu_map().
In pKVM, dma_addr_t and IOPTE are the same for private and shared,
so nothing differs in that case.
We don’t expect pass-through devices to interact with shared
memory (T=0) at the moment.
However, I can see use cases for that, where the host and the guest
collaborate with device passthrough and require zero copy.
One other interesting case for device-passthrough is non-coherent
devices which then require private pools for bouncing.
Thanks,
Mostafa
>
> Jason