* [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
@ 2026-04-15 15:08 Peng Fan (OSS)
2026-04-20 12:19 ` Jürgen Groß
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Peng Fan (OSS) @ 2026-04-15 15:08 UTC (permalink / raw)
To: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Cc: xen-devel, iommu, linux-kernel, Peng Fan
From: Peng Fan <peng.fan@nxp.com>
On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
physical address passed in refers to normal RAM that is part of the
kernel linear (direct) mapping, as it unconditionally derives a CPU
virtual address via phys_to_virt().
With Xen swiotlb, devices may use per-device coherent DMA memory,
such as reserved-memory regions described by 'shared-dma-pool',
which are assigned to dev->dma_mem. These regions may be marked
no-map in DT and therefore are not part of the kernel linear map.
In such cases, pfn_valid() still returns true, but phys_to_virt()
is not valid and cache maintenance via arch_sync_dma_* will fault.
Prevent this by excluding devices with a private DMA memory pool
(dev->dma_mem) from the arch_sync_dma_* fast path and always falling
back to xen_dma_sync_* for those devices, avoiding invalid
phys_to_virt() conversions for no-map DMA memory while preserving the
existing fast path for normal, linear-mapped RAM.
Signed-off-by: Peng Fan <peng.fan@nxp.com>
---
drivers/xen/swiotlb-xen.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2cbf2b588f5b20cfbf9e83a8339dc22092c9559a..b1445df99d9a8f1d18a83b8c413bada6e5579209 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -195,6 +195,11 @@ xen_swiotlb_free_coherent(struct device *dev, size_t size, void *vaddr,
 }
 #endif /* CONFIG_X86 */
 
+static inline bool dev_has_private_dma_pool(struct device *dev)
+{
+	return dev && dev->dma_mem;
+}
+
 /*
  * Map a single buffer of the indicated size for DMA in streaming mode. The
  * physical address to use is returned.
@@ -262,7 +267,8 @@ static dma_addr_t xen_swiotlb_map_phys(struct device *dev, phys_addr_t phys,
 
 done:
 	if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
-		if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr)))) {
+		if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr))) &&
+		    !dev_has_private_dma_pool(dev)) {
 			arch_sync_dma_for_device(phys, size, dir);
 			arch_sync_dma_flush();
 		} else {
@@ -289,7 +295,8 @@ static void xen_swiotlb_unmap_phys(struct device *hwdev, dma_addr_t dev_addr,
 	BUG_ON(dir == DMA_NONE);
 
 	if (!dev_is_dma_coherent(hwdev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
-		if (pfn_valid(PFN_DOWN(dma_to_phys(hwdev, dev_addr)))) {
+		if (pfn_valid(PFN_DOWN(dma_to_phys(hwdev, dev_addr))) &&
+		    !dev_has_private_dma_pool(hwdev)) {
 			arch_sync_dma_for_cpu(paddr, size, dir);
 			arch_sync_dma_flush();
 		} else {
@@ -312,7 +319,8 @@ xen_swiotlb_sync_single_for_cpu(struct device *dev, dma_addr_t dma_addr,
 	struct io_tlb_pool *pool;
 
 	if (!dev_is_dma_coherent(dev)) {
-		if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr)))) {
+		if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr))) &&
+		    !dev_has_private_dma_pool(dev)) {
 			arch_sync_dma_for_cpu(paddr, size, dir);
 			arch_sync_dma_flush();
 		} else {
@@ -337,7 +345,8 @@ xen_swiotlb_sync_single_for_device(struct device *dev, dma_addr_t dma_addr,
 	__swiotlb_sync_single_for_device(dev, paddr, size, dir, pool);
 
 	if (!dev_is_dma_coherent(dev)) {
-		if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr)))) {
+		if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr))) &&
+		    !dev_has_private_dma_pool(dev)) {
 			arch_sync_dma_for_device(paddr, size, dir);
 			arch_sync_dma_flush();
 		} else {
---
base-commit: 66672af7a095d89f082c5327f3b15bc2f93d558e
change-id: 20260415-xen-swiotlb-34a198b6c1d6
Best regards,
--
Peng Fan <peng.fan@nxp.com>
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-15 15:08 [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory Peng Fan (OSS)
@ 2026-04-20 12:19 ` Jürgen Groß
2026-04-24 7:03 ` Peng Fan
2026-04-20 23:01 ` Jason Gunthorpe
2026-04-24 23:26 ` Stefano Stabellini
2 siblings, 1 reply; 8+ messages in thread
From: Jürgen Groß @ 2026-04-20 12:19 UTC (permalink / raw)
To: Peng Fan (OSS), Stefano Stabellini, Oleksandr Tyshchenko
Cc: xen-devel, iommu, linux-kernel, Peng Fan
On 15.04.26 17:08, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
> physical address passed in refers to normal RAM that is part of the
> kernel linear(direct) mapping, as it unconditionally derives a CPU
> virtual address via phys_to_virt().
>
> With Xen swiotlb, devices may use per-device coherent DMA memory,
> such as reserved-memory regions described by 'shared-dma-pool',
> which are assigned to dev->dma_mem. These regions may be marked
> no-map in DT and therefore are not part of the kernel linear map.
> In such cases, pfn_valid() still returns true, but phys_to_virt()
> is not valid and cache maintenance via arch_sync_dma_* will fault.
>
> Prevent this by excluding devices with a private DMA memory pool
> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
> fall back to xen_dma_sync_* for those devices to avoid invalid
> phys_to_virt() conversions for no-map DMA memory while preserving the
> existing fast path for normal, linear-mapped RAM.
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
> drivers/xen/swiotlb-xen.c | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 2cbf2b588f5b20cfbf9e83a8339dc22092c9559a..b1445df99d9a8f1d18a83b8c413bada6e5579209 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -195,6 +195,11 @@ xen_swiotlb_free_coherent(struct device *dev, size_t size, void *vaddr,
> }
> #endif /* CONFIG_X86 */
>
> +static inline bool dev_has_private_dma_pool(struct device *dev)
> +{
> + return dev && dev->dma_mem;
> +}
> +
I don't think this will compile on x86.
Juergen
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-15 15:08 [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory Peng Fan (OSS)
2026-04-20 12:19 ` Jürgen Groß
@ 2026-04-20 23:01 ` Jason Gunthorpe
2026-04-24 6:57 ` Peng Fan
2026-04-24 23:26 ` Stefano Stabellini
2 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2026-04-20 23:01 UTC (permalink / raw)
To: Peng Fan (OSS)
Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
xen-devel, iommu, linux-kernel, Peng Fan
On Wed, Apr 15, 2026 at 11:08:36PM +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
> physical address passed in refers to normal RAM that is part of the
> kernel linear(direct) mapping, as it unconditionally derives a CPU
> virtual address via phys_to_virt().
>
> With Xen swiotlb, devices may use per-device coherent DMA memory,
> such as reserved-memory regions described by 'shared-dma-pool',
> which are assigned to dev->dma_mem. These regions may be marked
> no-map in DT and therefore are not part of the kernel linear map.
> In such cases, pfn_valid() still returns true, but phys_to_virt()
> is not valid and cache maintenance via arch_sync_dma_* will fault.
>
> Prevent this by excluding devices with a private DMA memory pool
> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
> fall back to xen_dma_sync_* for those devices to avoid invalid
> phys_to_virt() conversions for no-map DMA memory while preserving the
> existing fast path for normal, linear-mapped RAM.
I think this is the same sort of weirdness the other two CC threads are
dealing with... We already have two different flags indicating that the
cache flush should be skipped; it would make more sense to have the
swiotlb mangle the flags, just like for cc.
https://lore.kernel.org/r/20260420061415.3650870-1-aneesh.kumar@kernel.org
Then you know that the swiotlb was used and it should flow down to
here.
> * physical address to use is returned.
> @@ -262,7 +267,8 @@ static dma_addr_t xen_swiotlb_map_phys(struct device *dev, phys_addr_t phys,
>
> done:
> if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
> - if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr)))) {
> + if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr))) &&
> + !dev_has_private_dma_pool(dev)) {
Also this pfn_valid() is totally bogus. Unless DMA_ATTR_MMIO is set, the
phys must have a struct page, be pfn_valid, etc.
This is why you are getting into trouble here, because swiotlb created
a non-struct-page address and passed it to lower layers without
setting something like DMA_ATTR_MMIO.
Jason
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-20 23:01 ` Jason Gunthorpe
@ 2026-04-24 6:57 ` Peng Fan
2026-04-24 13:44 ` Jason Gunthorpe
0 siblings, 1 reply; 8+ messages in thread
From: Peng Fan @ 2026-04-24 6:57 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
xen-devel, iommu, linux-kernel, Peng Fan
Hi Jason,
On Mon, Apr 20, 2026 at 08:01:37PM -0300, Jason Gunthorpe wrote:
>On Wed, Apr 15, 2026 at 11:08:36PM +0800, Peng Fan (OSS) wrote:
>> From: Peng Fan <peng.fan@nxp.com>
>>
>> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
>> physical address passed in refers to normal RAM that is part of the
>> kernel linear(direct) mapping, as it unconditionally derives a CPU
>> virtual address via phys_to_virt().
>>
>> With Xen swiotlb, devices may use per-device coherent DMA memory,
>> such as reserved-memory regions described by 'shared-dma-pool',
>> which are assigned to dev->dma_mem. These regions may be marked
>> no-map in DT and therefore are not part of the kernel linear map.
>> In such cases, pfn_valid() still returns true, but phys_to_virt()
>> is not valid and cache maintenance via arch_sync_dma_* will fault.
>>
>> Prevent this by excluding devices with a private DMA memory pool
>> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
>> fall back to xen_dma_sync_* for those devices to avoid invalid
>> phys_to_virt() conversions for no-map DMA memory while preserving the
>> existing fast path for normal, linear-mapped RAM.
>
>I think this is the same sort of weirdness the other two CC threads are
>dealing with.. We already have two different flags indicating the
>cache flush should be skipped, it would make more sense to have the
>swiotlb mangle the flags, just like for cc.
>
>https://lore.kernel.org/r/20260420061415.3650870-1-aneesh.kumar@kernel.org
>
>Then you know that the swiotlb was used and it should flow down to
>here.
Xen fully implements dev->dma_ops and does not leak hypervisor-specific
semantics outside of it.
The issue is that the existing DMA attribute model only distinguishes
between "CPU sync required" and "no sync required at all". Xen needs
a third case: CPU cache sync must be skipped, but platform-level DMA
synchronization remains mandatory.
This is not a generic DMA extension but a constraint of Xen's DMA
model when mapping private or foreign memory that is not CPU-mapped.
>
>> * physical address to use is returned.
>> @@ -262,7 +267,8 @@ static dma_addr_t xen_swiotlb_map_phys(struct device *dev, phys_addr_t phys,
>>
>> done:
>> if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
>> - if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr)))) {
>> + if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr))) &&
>> + !dev_has_private_dma_pool(dev)) {
>
>Also this pfn_valid() is totally bogus. Unless DMA_ATTR_MMIO the phys
>must have a struct page, be pfn_valid, etc.
>
>This is why you are getting into trouble here, beacuse swiotlb created
>a non-struct page address and passed it to lower layers without
>setting something like DMA_ATTR_MMIO..
See above.
The name xen-swiotlb may be a bit misleading; it is not the same thing as
the generic Linux swiotlb.
Juergen, Stefano, please correct me if I am wrong or have missed
something.
Thanks,
Peng
>
>Jason
>
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-20 12:19 ` Jürgen Groß
@ 2026-04-24 7:03 ` Peng Fan
2026-04-24 23:15 ` Stefano Stabellini
0 siblings, 1 reply; 8+ messages in thread
From: Peng Fan @ 2026-04-24 7:03 UTC (permalink / raw)
To: Jürgen Groß
Cc: Stefano Stabellini, Oleksandr Tyshchenko, xen-devel, iommu,
linux-kernel, Peng Fan
Hi Juergen,
On Mon, Apr 20, 2026 at 02:19:34PM +0200, Jürgen Groß wrote:
>On 15.04.26 17:08, Peng Fan (OSS) wrote:
>> From: Peng Fan <peng.fan@nxp.com>
>>
>> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
>> physical address passed in refers to normal RAM that is part of the
>> kernel linear(direct) mapping, as it unconditionally derives a CPU
>> virtual address via phys_to_virt().
>>
>> With Xen swiotlb, devices may use per-device coherent DMA memory,
>> such as reserved-memory regions described by 'shared-dma-pool',
>> which are assigned to dev->dma_mem. These regions may be marked
>> no-map in DT and therefore are not part of the kernel linear map.
>> In such cases, pfn_valid() still returns true, but phys_to_virt()
>> is not valid and cache maintenance via arch_sync_dma_* will fault.
>>
>> Prevent this by excluding devices with a private DMA memory pool
>> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
>> fall back to xen_dma_sync_* for those devices to avoid invalid
>> phys_to_virt() conversions for no-map DMA memory while preserving the
>> existing fast path for normal, linear-mapped RAM.
>>
>> Signed-off-by: Peng Fan <peng.fan@nxp.com>
>> ---
>> drivers/xen/swiotlb-xen.c | 17 +++++++++++++----
>> 1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>> index 2cbf2b588f5b20cfbf9e83a8339dc22092c9559a..b1445df99d9a8f1d18a83b8c413bada6e5579209 100644
>> --- a/drivers/xen/swiotlb-xen.c
>> +++ b/drivers/xen/swiotlb-xen.c
>> @@ -195,6 +195,11 @@ xen_swiotlb_free_coherent(struct device *dev, size_t size, void *vaddr,
>> }
>> #endif /* CONFIG_X86 */
>> +static inline bool dev_has_private_dma_pool(struct device *dev)
>> +{
>> + return dev && dev->dma_mem;
>> +}
>> +
>
>I don't think this will compile on x86.
My bad, I only build-tested on ARM64. Does the change below look
good?
#ifdef CONFIG_ARM64
static inline bool dev_has_private_dma_pool(struct device *dev)
{
	return dev && dev->dma_mem;
}
#else
static inline bool dev_has_private_dma_pool(struct device *dev)
{
	return false;
}
#endif
Thanks,
Peng
>
>
>Juergen
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-24 6:57 ` Peng Fan
@ 2026-04-24 13:44 ` Jason Gunthorpe
0 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2026-04-24 13:44 UTC (permalink / raw)
To: Peng Fan
Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
xen-devel, iommu, linux-kernel, Peng Fan
On Fri, Apr 24, 2026 at 02:57:19PM +0800, Peng Fan wrote:
> Hi Jason,
>
> On Mon, Apr 20, 2026 at 08:01:37PM -0300, Jason Gunthorpe wrote:
> >On Wed, Apr 15, 2026 at 11:08:36PM +0800, Peng Fan (OSS) wrote:
> >> From: Peng Fan <peng.fan@nxp.com>
> >>
> >> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
> >> physical address passed in refers to normal RAM that is part of the
> >> kernel linear(direct) mapping, as it unconditionally derives a CPU
> >> virtual address via phys_to_virt().
> >>
> >> With Xen swiotlb, devices may use per-device coherent DMA memory,
> >> such as reserved-memory regions described by 'shared-dma-pool',
> >> which are assigned to dev->dma_mem. These regions may be marked
> >> no-map in DT and therefore are not part of the kernel linear map.
> >> In such cases, pfn_valid() still returns true, but phys_to_virt()
> >> is not valid and cache maintenance via arch_sync_dma_* will fault.
> >>
> >> Prevent this by excluding devices with a private DMA memory pool
> >> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
> >> fall back to xen_dma_sync_* for those devices to avoid invalid
> >> phys_to_virt() conversions for no-map DMA memory while preserving the
> >> existing fast path for normal, linear-mapped RAM.
> >
> >I think this is the same sort of weirdness the other two CC threads are
> >dealing with.. We already have two different flags indicating the
> >cache flush should be skipped, it would make more sense to have the
> >swiotlb mangle the flags, just like for cc.
> >
> >https://lore.kernel.org/r/20260420061415.3650870-1-aneesh.kumar@kernel.org
> >
> >Then you know that the swiotlb was used and it should flow down to
> >here.
>
> Xen fully implements dev->dma_ops and does not leak hypervisor-specific
> semantics outside of it.
It may have its own re-implementation but the same remarks apply. The
flags in attrs should be correct and you should not be putting random
boolean checks all over the place to make up for incorrect flags.
Jason
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-24 7:03 ` Peng Fan
@ 2026-04-24 23:15 ` Stefano Stabellini
0 siblings, 0 replies; 8+ messages in thread
From: Stefano Stabellini @ 2026-04-24 23:15 UTC (permalink / raw)
To: Peng Fan
Cc: Jürgen Groß, Stefano Stabellini, Oleksandr Tyshchenko,
xen-devel, iommu, linux-kernel, Peng Fan
On Fri, 24 Apr 2026, Peng Fan wrote:
> Hi Juergen,
>
> On Mon, Apr 20, 2026 at 02:19:34PM +0200, Jürgen Groß wrote:
> >On 15.04.26 17:08, Peng Fan (OSS) wrote:
> >> From: Peng Fan <peng.fan@nxp.com>
> >>
> >> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
> >> physical address passed in refers to normal RAM that is part of the
> >> kernel linear(direct) mapping, as it unconditionally derives a CPU
> >> virtual address via phys_to_virt().
> >>
> >> With Xen swiotlb, devices may use per-device coherent DMA memory,
> >> such as reserved-memory regions described by 'shared-dma-pool',
> >> which are assigned to dev->dma_mem. These regions may be marked
> >> no-map in DT and therefore are not part of the kernel linear map.
> >> In such cases, pfn_valid() still returns true, but phys_to_virt()
> >> is not valid and cache maintenance via arch_sync_dma_* will fault.
> >>
> >> Prevent this by excluding devices with a private DMA memory pool
> >> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
> >> fall back to xen_dma_sync_* for those devices to avoid invalid
> >> phys_to_virt() conversions for no-map DMA memory while preserving the
> >> existing fast path for normal, linear-mapped RAM.
> >>
> >> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> >> ---
> >> drivers/xen/swiotlb-xen.c | 17 +++++++++++++----
> >> 1 file changed, 13 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> >> index 2cbf2b588f5b20cfbf9e83a8339dc22092c9559a..b1445df99d9a8f1d18a83b8c413bada6e5579209 100644
> >> --- a/drivers/xen/swiotlb-xen.c
> >> +++ b/drivers/xen/swiotlb-xen.c
> >> @@ -195,6 +195,11 @@ xen_swiotlb_free_coherent(struct device *dev, size_t size, void *vaddr,
> >> }
> >> #endif /* CONFIG_X86 */
> >> +static inline bool dev_has_private_dma_pool(struct device *dev)
> >> +{
> >> + return dev && dev->dma_mem;
> >> +}
> >> +
> >
> >I don't think this will compile on x86.
>
> My bad, I only build-tested on ARM64. Does the change below look
> good?
>
> #ifdef CONFIG_ARM64
> static inline bool dev_has_private_dma_pool(struct device *dev)
> {
> 	return dev && dev->dma_mem;
> }
> #else
> static inline bool dev_has_private_dma_pool(struct device *dev)
> {
> 	return false;
> }
> #endif
Please use CONFIG_DMA_DECLARE_COHERENT
* Re: [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory
2026-04-15 15:08 [PATCH RFC] xen/swiotlb: avoid arch_sync_dma_* on per-device DMA memory Peng Fan (OSS)
2026-04-20 12:19 ` Jürgen Groß
2026-04-20 23:01 ` Jason Gunthorpe
@ 2026-04-24 23:26 ` Stefano Stabellini
2 siblings, 0 replies; 8+ messages in thread
From: Stefano Stabellini @ 2026-04-24 23:26 UTC (permalink / raw)
To: Peng Fan (OSS)
Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
xen-devel, iommu, linux-kernel, Peng Fan, michal.orzel
+Michal
On Wed, 15 Apr 2026, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> On ARM64, arch_sync_dma_for_{cpu,device}() assumes that the
> physical address passed in refers to normal RAM that is part of the
> kernel linear(direct) mapping, as it unconditionally derives a CPU
> virtual address via phys_to_virt().
>
> With Xen swiotlb, devices may use per-device coherent DMA memory,
> such as reserved-memory regions described by 'shared-dma-pool',
> which are assigned to dev->dma_mem. These regions may be marked
> no-map in DT and therefore are not part of the kernel linear map.
> In such cases, pfn_valid() still returns true, but phys_to_virt()
> is not valid and cache maintenance via arch_sync_dma_* will fault.
>
> Prevent this by excluding devices with a private DMA memory pool
> (dev->dma_mem) from the arch_sync_dma_* fast path, and always
> fall back to xen_dma_sync_* for those devices to avoid invalid
> phys_to_virt() conversions for no-map DMA memory while preserving the
> existing fast path for normal, linear-mapped RAM.
This might not work either: the Xen side implementation is
xen/common/grant_table.c:_cache_flush.
Could you please check? From looking at the code,
page_get_owner_and_reference might return NULL for pages part of
reserved-memory regions marked as no-map, in which case the Xen
hypercall should return -EPERM.
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
> drivers/xen/swiotlb-xen.c | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 2cbf2b588f5b20cfbf9e83a8339dc22092c9559a..b1445df99d9a8f1d18a83b8c413bada6e5579209 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -195,6 +195,11 @@ xen_swiotlb_free_coherent(struct device *dev, size_t size, void *vaddr,
> }
> #endif /* CONFIG_X86 */
>
> +static inline bool dev_has_private_dma_pool(struct device *dev)
> +{
> + return dev && dev->dma_mem;
> +}
> +
> /*
> * Map a single buffer of the indicated size for DMA in streaming mode. The
> * physical address to use is returned.
> @@ -262,7 +267,8 @@ static dma_addr_t xen_swiotlb_map_phys(struct device *dev, phys_addr_t phys,
>
> done:
> if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
> - if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr)))) {
> + if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dev_addr))) &&
> + !dev_has_private_dma_pool(dev)) {
> arch_sync_dma_for_device(phys, size, dir);
> arch_sync_dma_flush();
> } else {
> @@ -289,7 +295,8 @@ static void xen_swiotlb_unmap_phys(struct device *hwdev, dma_addr_t dev_addr,
> BUG_ON(dir == DMA_NONE);
>
> if (!dev_is_dma_coherent(hwdev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
> - if (pfn_valid(PFN_DOWN(dma_to_phys(hwdev, dev_addr)))) {
> + if (pfn_valid(PFN_DOWN(dma_to_phys(hwdev, dev_addr))) &&
> + !dev_has_private_dma_pool(hwdev)) {
> arch_sync_dma_for_cpu(paddr, size, dir);
> arch_sync_dma_flush();
> } else {
> @@ -312,7 +319,8 @@ xen_swiotlb_sync_single_for_cpu(struct device *dev, dma_addr_t dma_addr,
> struct io_tlb_pool *pool;
>
> if (!dev_is_dma_coherent(dev)) {
> - if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr)))) {
> + if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr))) &&
> + !dev_has_private_dma_pool(dev)) {
> arch_sync_dma_for_cpu(paddr, size, dir);
> arch_sync_dma_flush();
> } else {
> @@ -337,7 +345,8 @@ xen_swiotlb_sync_single_for_device(struct device *dev, dma_addr_t dma_addr,
> __swiotlb_sync_single_for_device(dev, paddr, size, dir, pool);
>
> if (!dev_is_dma_coherent(dev)) {
> - if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr)))) {
> + if (pfn_valid(PFN_DOWN(dma_to_phys(dev, dma_addr))) &&
> + !dev_has_private_dma_pool(dev)) {
> arch_sync_dma_for_device(paddr, size, dir);
> arch_sync_dma_flush();
> } else {
>
> ---
> base-commit: 66672af7a095d89f082c5327f3b15bc2f93d558e
> change-id: 20260415-xen-swiotlb-34a198b6c1d6
>
> Best regards,
> --
> Peng Fan <peng.fan@nxp.com>
>