* [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
@ 2026-03-27 5:58 Mikhail Gavrilov
2026-03-27 6:37 ` Harry Yoo (Oracle)
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Mikhail Gavrilov @ 2026-03-27 5:58 UTC (permalink / raw)
To: vbabka, harry.yoo, akpm
Cc: hao.li, cl, rientjes, roman.gushchin, linux-mm, linux-kernel,
linux-usb, stern, linux, andy.shevchenko, hch, Jeff.kirsher,
Mikhail Gavrilov
When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
tracks active mappings per cacheline and warns if two different DMA
mappings share the same cacheline ("cacheline tracking EEXIST,
overlapping mappings aren't supported").
On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
hub driver) frequently land in the same 64-byte cacheline. When both
are DMA-mapped, this triggers a false positive warning.
This has been reported repeatedly since v5.14 (when the EEXIST check
was added) across various USB host controllers and devices including
xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
Raise ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES when CONFIG_DMA_API_DEBUG
is enabled, ensuring each kmalloc allocation occupies its own cacheline
and eliminating the false positive.
Verified with a kernel module reproducer that performs two kmalloc(8)
allocations back-to-back and DMA-maps both:
Before: allocations share a cacheline, EEXIST fires within ~50 pairs
After: 64 pairs allocated, all in separate cachelines, no warning
Fixes: 2b4bbc6231d7 ("dma-debug: report -EEXIST errors in add_dma_entry")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215740
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Jeff Kirsher <Jeff.kirsher@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
Reproducer module that triggers the bug reliably:
https://bugzilla.kernel.org/attachment.cgi?id=309769
include/linux/slab.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 15a60b501b95..f044956e17c1 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -536,6 +536,19 @@ static inline bool kmem_dump_obj(void *object) { return false; }
#endif
#endif
+/*
+ * Align memory allocations to cache lines if DMA API debugging is active
+ * to avoid false positive DMA overlapping error messages.
+ */
+#ifdef CONFIG_DMA_API_DEBUG
+#ifndef ARCH_KMALLOC_MINALIGN
+#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
+#elif ARCH_KMALLOC_MINALIGN < L1_CACHE_BYTES
+#undef ARCH_KMALLOC_MINALIGN
+#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
+#endif
+#endif
+
#ifndef ARCH_KMALLOC_MINALIGN
#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
#elif ARCH_KMALLOC_MINALIGN > 8
--
2.53.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 5:58 [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active Mikhail Gavrilov
@ 2026-03-27 6:37 ` Harry Yoo (Oracle)
2026-03-27 6:50 ` Mikhail Gavrilov
2026-03-27 6:41 ` Guenter Roeck
2026-03-27 12:26 ` Catalin Marinas
2 siblings, 1 reply; 16+ messages in thread
From: Harry Yoo (Oracle) @ 2026-03-27 6:37 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher
On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
> When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
> tracks active mappings per cacheline and warns if two different DMA
> mappings share the same cacheline ("cacheline tracking EEXIST,
> overlapping mappings aren't supported").
>
> On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
> allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
> hub driver) frequently land in the same 64-byte cacheline. When both
> are DMA-mapped, this triggers a false positive warning.
Is it feasible to suppress the warning if dma_get_cache_alignment() is
smaller than L1_CACHE_BYTES?
> This has been reported repeatedly since v5.14 (when the EEXIST check
> was added) across various USB host controllers and devices including
> xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
>
> Raise ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES when CONFIG_DMA_API_DEBUG
> is enabled, ensuring each kmalloc allocation occupies its own cacheline
> and eliminating the false positive.
>
> Verified with a kernel module reproducer that performs two kmalloc(8)
> allocations back-to-back and DMA-maps both:
>
> Before: allocations share a cacheline, EEXIST fires within ~50 pairs
> After: 64 pairs allocated, all in separate cachelines, no warning
>
> Fixes: 2b4bbc6231d7 ("dma-debug: report -EEXIST errors in add_dma_entry")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215740
> Suggested-by: Alan Stern <stern@rowland.harvard.edu>
> Suggested-by: Guenter Roeck <linux@roeck-us.net>
> Tested-by: Jeff Kirsher <Jeff.kirsher@gmail.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
--
Cheers,
Harry / Hyeonggon

* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 5:58 [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active Mikhail Gavrilov
2026-03-27 6:37 ` Harry Yoo (Oracle)
@ 2026-03-27 6:41 ` Guenter Roeck
2026-03-27 12:26 ` Catalin Marinas
2 siblings, 0 replies; 16+ messages in thread
From: Guenter Roeck @ 2026-03-27 6:41 UTC (permalink / raw)
To: Mikhail Gavrilov, vbabka, harry.yoo, akpm
Cc: hao.li, cl, rientjes, roman.gushchin, linux-mm, linux-kernel,
linux-usb, stern, andy.shevchenko, hch, Jeff.kirsher
On 3/26/26 22:58, Mikhail Gavrilov wrote:
> When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
> tracks active mappings per cacheline and warns if two different DMA
> mappings share the same cacheline ("cacheline tracking EEXIST,
> overlapping mappings aren't supported").
>
> On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
> allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
> hub driver) frequently land in the same 64-byte cacheline. When both
> are DMA-mapped, this triggers a false positive warning.
>
> This has been reported repeatedly since v5.14 (when the EEXIST check
> was added) across various USB host controllers and devices including
> xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
>
> Raise ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES when CONFIG_DMA_API_DEBUG
> is enabled, ensuring each kmalloc allocation occupies its own cacheline
> and eliminating the false positive.
>
> Verified with a kernel module reproducer that performs two kmalloc(8)
> allocations back-to-back and DMA-maps both:
>
> Before: allocations share a cacheline, EEXIST fires within ~50 pairs
> After: 64 pairs allocated, all in separate cachelines, no warning
>
> Fixes: 2b4bbc6231d7 ("dma-debug: report -EEXIST errors in add_dma_entry")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215740
> Suggested-by: Alan Stern <stern@rowland.harvard.edu>
> Suggested-by: Guenter Roeck <linux@roeck-us.net>
> Tested-by: Jeff Kirsher <Jeff.kirsher@gmail.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Thanks a lot for taking care of this!
FWIW:
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
> ---
>
> Reproducer module that triggers the bug reliably:
> https://bugzilla.kernel.org/attachment.cgi?id=309769
>
> include/linux/slab.h | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 15a60b501b95..f044956e17c1 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -536,6 +536,19 @@ static inline bool kmem_dump_obj(void *object) { return false; }
> #endif
> #endif
>
> +/*
> + * Align memory allocations to cache lines if DMA API debugging is active
> + * to avoid false positive DMA overlapping error messages.
> + */
> +#ifdef CONFIG_DMA_API_DEBUG
> +#ifndef ARCH_KMALLOC_MINALIGN
> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
> +#elif ARCH_KMALLOC_MINALIGN < L1_CACHE_BYTES
> +#undef ARCH_KMALLOC_MINALIGN
> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
> +#endif
> +#endif
> +
> #ifndef ARCH_KMALLOC_MINALIGN
> #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
> #elif ARCH_KMALLOC_MINALIGN > 8
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 6:37 ` Harry Yoo (Oracle)
@ 2026-03-27 6:50 ` Mikhail Gavrilov
2026-03-27 8:00 ` Harry Yoo (Oracle)
0 siblings, 1 reply; 16+ messages in thread
From: Mikhail Gavrilov @ 2026-03-27 6:50 UTC (permalink / raw)
To: Harry Yoo (Oracle)
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher
On Fri, Mar 27, 2026 at 11:38 AM Harry Yoo (Oracle) <harry@kernel.org> wrote:
>
> On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
> > When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
> > tracks active mappings per cacheline and warns if two different DMA
> > mappings share the same cacheline ("cacheline tracking EEXIST,
> > overlapping mappings aren't supported").
> >
> > On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
> > allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
> > hub driver) frequently land in the same 64-byte cacheline. When both
> > are DMA-mapped, this triggers a false positive warning.
>
> Is it feasible to suppress the warning if dma_get_cache_alignment() is
> smaller than L1_CACHE_BYTES?
Hi Harry,
Good question. I considered the dma-debug side, but the issue is
that the cacheline overlap check in add_dma_entry() is intentionally
strict -- it catches real bugs on non-coherent architectures where
two DMA buffers sharing a cacheline can corrupt data.
The check already has suppressions for DMA_ATTR_SKIP_CPU_SYNC,
DMA_ATTR_CPU_CACHE_CLEAN, and CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC +
swiotlb. Adding another special case (e.g. dev_is_dma_coherent())
would weaken the check for all coherent platforms, potentially
hiding real bugs on devices behind non-coherent IOMMUs.
Alan Stern discussed this in the bugzilla [1] and concluded that
the slab alignment approach "seems reasonable" [2], noting that
"turning on debugging should not affect the way the kernel behaves --
otherwise what you're debugging isn't the same as what normally
happens. But given the way the DMA API debugging is set up, I don't
see any alternative."
The memory overhead is only present when CONFIG_DMA_API_DEBUG is
enabled, which is a debug-only option not used in production.
That said, if you'd prefer a dma-debug side fix, I'm happy to
explore that direction instead.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215740#c31
[2] https://bugzilla.kernel.org/show_bug.cgi?id=215740#c44
--
Best Regards,
Mike Gavrilov.
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 6:50 ` Mikhail Gavrilov
@ 2026-03-27 8:00 ` Harry Yoo (Oracle)
2026-03-27 8:07 ` Mikhail Gavrilov
0 siblings, 1 reply; 16+ messages in thread
From: Harry Yoo (Oracle) @ 2026-03-27 8:00 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher, Catalin Marinas
On Fri, Mar 27, 2026 at 11:50:07AM +0500, Mikhail Gavrilov wrote:
> On Fri, Mar 27, 2026 at 11:38 AM Harry Yoo (Oracle) <harry@kernel.org> wrote:
> >
> > On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
> > > When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
> > > tracks active mappings per cacheline and warns if two different DMA
> > > mappings share the same cacheline ("cacheline tracking EEXIST,
> > > overlapping mappings aren't supported").
> > >
> > > On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
> > > allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
> > > hub driver) frequently land in the same 64-byte cacheline. When both
> > > are DMA-mapped, this triggers a false positive warning.
> >
> > Is it feasible to suppress the warning if dma_get_cache_alignment() is
> > smaller than L1_CACHE_BYTES?
>
> Hi Harry,
Hi Mikhail,
Please keep in mind that I have limited understanding of DMA API,
but just wanted to double check if there is (or isn't) a sane way to
fix it on dma-debug side.
> Good question. I considered the dma-debug side, but the issue is
> that the cacheline overlap check in add_dma_entry() is intentionally
> strict -- it catches real bugs on non-coherent architectures where
> two DMA buffers sharing a cacheline can corrupt data.
But dma_get_cache_alignment() < L1_CACHE_BYTES means the architecture
actually allows overlapping cachelines, no?
A non-coherent architecture where two DMA buffers sharing a cacheline
could corrupt data should define ARCH_DMA_MINALIGN >= L1_CACHE_BYTES.
I'm not sure what kind of a real bug this will hide,
or am I missing something?
> Alan Stern discussed this in the bugzilla [1] and concluded that
> the slab alignment approach "seems reasonable" [2],
As long as there's no good alternative way to fix, yeah.
> noting that turning on debugging should not affect the way the kernel
> behaves -- otherwise what you're debugging isn't the same as what normally
> happens.
Yeah, this is why I'm trying to double check if there's no feasible
alternative.
> But given the way the DMA API debugging is set up, I don't
> see any alternative."
I'm trying to say adding the (dma_get_cache_alignment() <
L1_CACHE_BYTES) check might be considered as an alternative ;)
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=215740#c31
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=215740#c44
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 8:00 ` Harry Yoo (Oracle)
@ 2026-03-27 8:07 ` Mikhail Gavrilov
2026-03-27 8:43 ` Harry Yoo (Oracle)
0 siblings, 1 reply; 16+ messages in thread
From: Mikhail Gavrilov @ 2026-03-27 8:07 UTC (permalink / raw)
To: Harry Yoo (Oracle)
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher, Catalin Marinas
On Fri, Mar 27, 2026 at 1:00 PM Harry Yoo (Oracle) <harry@kernel.org> wrote:
>
> But dma_get_cache_alignment() < L1_CACHE_BYTES means the architecture
> actually allows overlapping cachelines, no?
Hi Harry,
On x86_64, dma_get_cache_alignment() returns L1_CACHE_BYTES (both
are 64). The condition (dma_get_cache_alignment() < L1_CACHE_BYTES)
would be false, so the check wouldn't suppress the warning.
The problem isn't that the architecture allows overlapping -- it's
that kmalloc returns 8-byte aligned buffers that happen to land in
the same 64-byte cacheline. The DMA debug code correctly identifies
that two DMA mappings share a cacheline, but on coherent platforms
this is harmless.
Adding a dev_is_dma_coherent() check in dma-debug would fix x86 but
would also silence the warning for any coherent device, including
ones behind IOMMUs that might have non-coherent paths. That's why
Alan's conclusion was that fixing the allocator side is safer --
it doesn't weaken any debug checks, it just ensures the situation
never arises.
--
Best Regards,
Mike Gavrilov.
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 8:07 ` Mikhail Gavrilov
@ 2026-03-27 8:43 ` Harry Yoo (Oracle)
2026-03-27 10:25 ` Mikhail Gavrilov
0 siblings, 1 reply; 16+ messages in thread
From: Harry Yoo (Oracle) @ 2026-03-27 8:43 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher, Catalin Marinas
On Fri, Mar 27, 2026 at 01:07:21PM +0500, Mikhail Gavrilov wrote:
> On Fri, Mar 27, 2026 at 1:00 PM Harry Yoo (Oracle) <harry@kernel.org> wrote:
> >
> > But dma_get_cache_alignment() < L1_CACHE_BYTES means the architecture
> > actually allows overlapping cachelines, no?
>
> Hi Harry,
>
> On x86_64, dma_get_cache_alignment() returns L1_CACHE_BYTES (both
> are 64). The condition (dma_get_cache_alignment() < L1_CACHE_BYTES)
> would be false, so the check wouldn't suppress the warning.
How does dma_get_cache_alignment() return L1_CACHE_BYTES when
x86_64 doesn't define ARCH_HAS_DMA_MINALIGN?
> The problem isn't that the architecture allows overlapping --
Probably what I said was misleading...
I didn't mean "the architecture is fine with overlapping cacheline".
I meant "not defining ARCH_DMA_MINALIGN or defining it as smaller than
L1_CACHE_BYTES is how architectures tell kmalloc subsystem that
kmalloc objects don't have to be aligned with cacheline size."
> it's that kmalloc returns 8-byte aligned buffers that happen to land in
> the same 64-byte cacheline.
> The DMA debug code correctly identifies that two DMA mappings share
> a cacheline, but on coherent platforms this is harmless.
That happens only when the architecture can live with that.
> Adding a dev_is_dma_coherent() check in dma-debug would fix x86
> but would also silence the warning for any coherent device, including
> ones behind IOMMUs that might have non-coherent paths.
Sorry, I don't understand where the idea of adding a
dev_is_dma_coherent() check comes from ...
> That's why Alan's conclusion was that fixing the allocator side is safer --
> it doesn't weaken any debug checks, it just ensures the situation
> never arises.
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 8:43 ` Harry Yoo (Oracle)
@ 2026-03-27 10:25 ` Mikhail Gavrilov
2026-03-27 10:39 ` Harry Yoo (Oracle)
0 siblings, 1 reply; 16+ messages in thread
From: Mikhail Gavrilov @ 2026-03-27 10:25 UTC (permalink / raw)
To: Harry Yoo (Oracle)
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher, Catalin Marinas
On Fri, Mar 27, 2026 at 1:43 PM Harry Yoo (Oracle) <harry@kernel.org> wrote:
>
> Probably what I said was misleading...
>
> I didn't mean "the architecture is fine with overlapping cacheline".
>
> I meant "not defining ARCH_DMA_MINALIGN or defining it as smaller than
> L1_CACHE_BYTES is how architectures tell kmalloc subsystem that
> kmalloc objects don't have to be aligned with cacheline size."
>
Hi Harry,
You're right, I was wrong about dma_get_cache_alignment() -- on
x86_64 without ARCH_HAS_DMA_MINALIGN it returns 1, not
L1_CACHE_BYTES. Sorry for the confusion.
So your suggestion to suppress the warning in dma-debug when
dma_get_cache_alignment() < L1_CACHE_BYTES would indeed work
on x86_64 and other coherent platforms.
I don't have a strong preference either way. Both approaches
solve the problem:
- slab side: prevents the overlap from happening
- dma-debug side: tolerates the overlap when the arch says
cacheline alignment isn't required for DMA
Would you prefer I send a v2 with the dma-debug approach instead?
Happy to go whichever direction the maintainers prefer.
--
Best Regards,
Mike Gavrilov.
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 10:25 ` Mikhail Gavrilov
@ 2026-03-27 10:39 ` Harry Yoo (Oracle)
0 siblings, 0 replies; 16+ messages in thread
From: Harry Yoo (Oracle) @ 2026-03-27 10:39 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: vbabka, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher, Catalin Marinas
On Fri, Mar 27, 2026 at 03:25:00PM +0500, Mikhail Gavrilov wrote:
> On Fri, Mar 27, 2026 at 1:43 PM Harry Yoo (Oracle) <harry@kernel.org> wrote:
> >
> > Probably what I said was misleading...
> >
> > I didn't mean "the architecture is fine with overlapping cacheline".
> >
> > I meant "not defining ARCH_DMA_MINALIGN or defining it as smaller than
> > L1_CACHE_BYTES is how architectures tell kmalloc subsystem that
> > kmalloc objects don't have to be aligned with cacheline size."
>
> Hi Harry,
Hi Mikhail,
> You're right, I was wrong about dma_get_cache_alignment() -- on
> x86_64 without ARCH_HAS_DMA_MINALIGN it returns 1, not
> L1_CACHE_BYTES. Sorry for the confusion.
Don't worry!
> So your suggestion to suppress the warning in dma-debug when
> dma_get_cache_alignment() < L1_CACHE_BYTES would indeed work
> on x86_64 and other coherent platforms.
Thanks for confirming.
> I don't have a strong preference either way. Both approaches
> solve the problem:
>
> - slab side: prevents the overlap from happening
> - dma-debug side: tolerates the overlap when the arch says
> cacheline alignment isn't required for DMA
>
> Would you prefer I send a v2 with the dma-debug approach instead?
Yes please. I think keeping the same behavior regardless of the debug
option will be better in the long term.
> Happy to go whichever direction the maintainers prefer.
Thanks a lot for working on this!
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 5:58 [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active Mikhail Gavrilov
2026-03-27 6:37 ` Harry Yoo (Oracle)
2026-03-27 6:41 ` Guenter Roeck
@ 2026-03-27 12:26 ` Catalin Marinas
2026-03-27 12:34 ` Andy Shevchenko
2026-03-27 14:09 ` Marek Szyprowski
2 siblings, 2 replies; 16+ messages in thread
From: Catalin Marinas @ 2026-03-27 12:26 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: vbabka, harry.yoo, akpm, hao.li, cl, rientjes, roman.gushchin,
linux-mm, linux-kernel, linux-usb, stern, linux, andy.shevchenko,
hch, Jeff.kirsher, Marek Szyprowski, Robin Murphy
+ Marek, Robin
On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
> When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
> tracks active mappings per cacheline and warns if two different DMA
> mappings share the same cacheline ("cacheline tracking EEXIST,
> overlapping mappings aren't supported").
>
> On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
> allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
> hub driver) frequently land in the same 64-byte cacheline. When both
> are DMA-mapped, this triggers a false positive warning.
>
> This has been reported repeatedly since v5.14 (when the EEXIST check
> was added) across various USB host controllers and devices including
> xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
This has indeed come up regularly over the past years.
> +/*
> + * Align memory allocations to cache lines if DMA API debugging is active
> + * to avoid false positive DMA overlapping error messages.
> + */
> +#ifdef CONFIG_DMA_API_DEBUG
> +#ifndef ARCH_KMALLOC_MINALIGN
> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
> +#elif ARCH_KMALLOC_MINALIGN < L1_CACHE_BYTES
> +#undef ARCH_KMALLOC_MINALIGN
> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
> +#endif
> +#endif
TL;DR: I think this is fine:
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
I'm not sure that's the best way to hide the warning but there
are no great solutions either. On one hand, we want the DMA debug to
capture potential problems on architectures it's not running on. OTOH,
we also want to avoid false positives on coherent architectures/devices.
I don't think reconciling the two requirements is easy.
When DMA_API_DEBUG is enabled, the above will change the x86 behaviour
that could have implications beyond DMA (e.g. may not catch some buffer
overflow because it's within L1_CACHE_BYTES). Similarly for non-coherent
architectures that select DMA_BOUNCE_UNALIGNED_KMALLOC (arm64 and riscv
currently). arm64 defines ARCH_DMA_MINALIGN to 128 but
ARCH_KMALLOC_MINALIGN to 8 (why 128 is larger than L1_CACHE_BYTES is
another matter but let's ignore it for now).
More of a thinking out loud, we have:
1. Coherent architectures - alignment doesn't matter
2. Non-coherent architectures with:
a) Sufficiently large ARCH_KMALLOC_MINALIGN
b) Small ARCH_KMALLOC_MINALIGN but DMA_BOUNCE_UNALIGNED_KMALLOC
c) Broken config - forgot to set ARCH_DMA_MINALIGN or bouncing
We can ignore (2.c), the aim of the DMA debug is to catch wrong uses in
drivers. If drivers is the only goal, the above change will do when
running on (1) or (2.a) hardware - it will catch sub-L1_CACHE_BYTES
buffers from drivers while assuming kmalloc() machinery is safe.
However, if running on (2.b) it won't catch anything that may be
problematic on (2.a) since the DMA debug ignores the overlap.
We could make DMA_BOUNCE_UNALIGNED_KMALLOC dependent on !DMA_API_DEBUG
but it would be nice to be able to sanity-check the bouncing logic.
Well, it wasn't checking it before and with commit 03521c892bb8
("dma-debug: don't report false positives with
DMA_BOUNCE_UNALIGNED_KMALLOC"), we made this clear that overlapping will
be ignored.
Irrespective of whether we disable bouncing with DMA_API_DEBUG, maybe we
could replace the above commit with:
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 3928a509c44c..488045ef6245 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -175,7 +175,7 @@ dma_addr_t dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
if (!is_mmio)
kmsan_handle_dma(phys, size, dir);
trace_dma_map_phys(dev, phys, addr, size, dir, attrs);
- debug_dma_map_phys(dev, phys, size, dir, addr, attrs);
+ debug_dma_map_phys(dev, dma_to_phys(addr), size, dir, addr, attrs);
return addr;
}
Anyway, this I think is unrelated to the proposed change affecting x86,
more of a how to make the DMA API debugging more useful when running on
arm64 or riscv.
--
Catalin
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 12:26 ` Catalin Marinas
@ 2026-03-27 12:34 ` Andy Shevchenko
2026-03-27 14:09 ` Marek Szyprowski
1 sibling, 0 replies; 16+ messages in thread
From: Andy Shevchenko @ 2026-03-27 12:34 UTC (permalink / raw)
To: Catalin Marinas
Cc: Mikhail Gavrilov, vbabka, harry.yoo, akpm, hao.li, cl, rientjes,
roman.gushchin, linux-mm, linux-kernel, linux-usb, stern, linux,
hch, Jeff.kirsher, Marek Szyprowski, Robin Murphy
On Fri, Mar 27, 2026 at 2:26 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
> TL;DR: I think this is fine:
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>
> I'm not sure that's the best way to hide the warning but there
> are no great solutions either. On one hand, we want the DMA debug to
> capture potential problems on architectures it's not running on. OTOH,
> we also want to avoid false positives on coherent architectures/devices.
> I don't think reconciling the two requirements is easy.
>
> When DMA_API_DEBUG is enabled, the above will change the x86 behaviour
> that could have implications beyond DMA (e.g. may not catch some buffer
> overflow because it's within L1_CACHE_BYTES). Similarly for non-coherent
> architectures that select DMA_BOUNCE_UNALIGNED_KMALLOC (arm64 and riscv
> currently). arm64 defines ARCH_DMA_MINALIGN to 128 but
> ARCH_KMALLOC_MINALIGN to 8 (why 128 is larger than L1_CACHE_BYTES is
> another matter but let's ignore it for now).
Maybe for the cases where we do not warn we should introduce a
dev_dbg_/pr_debug_once()? At least users may be informed about potential issues.
--
With Best Regards,
Andy Shevchenko
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 12:26 ` Catalin Marinas
2026-03-27 12:34 ` Andy Shevchenko
@ 2026-03-27 14:09 ` Marek Szyprowski
2026-03-27 14:30 ` Vlastimil Babka (SUSE)
2026-03-27 14:55 ` Marek Szyprowski
1 sibling, 2 replies; 16+ messages in thread
From: Marek Szyprowski @ 2026-03-27 14:09 UTC (permalink / raw)
To: Catalin Marinas, Mikhail Gavrilov
Cc: vbabka, harry.yoo, akpm, hao.li, cl, rientjes, roman.gushchin,
linux-mm, linux-kernel, linux-usb, stern, linux, andy.shevchenko,
hch, Jeff.kirsher, Robin Murphy
Hi
On 27.03.2026 13:26, Catalin Marinas wrote:
> + Marek, Robin
Thanks for adding me to the loop.
> On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
>> When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
>> tracks active mappings per cacheline and warns if two different DMA
>> mappings share the same cacheline ("cacheline tracking EEXIST,
>> overlapping mappings aren't supported").
>>
>> On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
>> allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
>> hub driver) frequently land in the same 64-byte cacheline. When both
>> are DMA-mapped, this triggers a false positive warning.
>>
>> This has been reported repeatedly since v5.14 (when the EEXIST check
>> was added) across various USB host controllers and devices including
>> xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
> This indeed has come up regularly in the past years.
>
>> +/*
>> + * Align memory allocations to cache lines if DMA API debugging is active
>> + * to avoid false positive DMA overlapping error messages.
>> + */
>> +#ifdef CONFIG_DMA_API_DEBUG
>> +#ifndef ARCH_KMALLOC_MINALIGN
>> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
>> +#elif ARCH_KMALLOC_MINALIGN < L1_CACHE_BYTES
>> +#undef ARCH_KMALLOC_MINALIGN
>> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
>> +#endif
>> +#endif
> TL;DR: I think this is fine:
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>
> I'm not sure that's the best way to hide the warning but there
> are no great solutions either. On one hand, we want the DMA debug to
> capture potential problems on architectures it's not running on. OTOH,
> we also want to avoid false positives on coherent architectures/devices.
> I don't think reconciling the two requirements is easy.
>
> When DMA_API_DEBUG is enabled, the above will change the x86 behaviour
> that could have implications beyond DMA (e.g. may not catch some buffer
> overflow because it's within L1_CACHE_BYTES). Similarly for non-coherent
> architectures that select DMA_BOUNCE_UNALIGNED_KMALLOC (arm64 and riscv
> currently). arm64 defines ARCH_DMA_MINALIGN to 128 but
> ARCH_KMALLOC_MINALIGN to 8 (why 128 is larger than L1_CACHE_BYTES is
> another matter but let's ignore it for now).
IMHO enabling DMA_API_DEBUG should not change the kernel behavior, so I
would prefer fixing this in DMA-debug code somehow.
> More of a thinking out loud, we have:
>
> 1. Coherent architectures - alignment doesn't matter
>
> 2. Non-coherent architectures with:
> a) Sufficiently large ARCH_KMALLOC_MINALIGN
> b) Small ARCH_KMALLOC_MINALIGN but DMA_BOUNCE_UNALIGNED_KMALLOC
> c) Broken config - forgot to set ARCH_DMA_MINALIGN or bouncing
>
> We can ignore (2.c), the aim of the DMA debug is to catch wrong uses in
> drivers. If drivers is the only goal, the above change will do when
> running on (1) or (2.a) hardware - it will catch sub-L1_CACHE_BYTES
> buffers from drivers while assuming kmalloc() machinery is safe.
> However, if running on (2.b) it won't catch anything that may be
> problematic on (2.a) since the DMA debug ignores the overlap.
>
> We could make DMA_BOUNCE_UNALIGNED_KMALLOC dependent on !DMA_API_DEBUG
> but it would be nice to be able to sanity-check the bouncing logic.
> Well, it wasn't checking it before and with commit 03521c892bb8
> ("dma-debug: don't report false positives with
> DMA_BOUNCE_UNALIGNED_KMALLOC"), we made this clear that overlapping will
> be ignored.
>
> Irrespective of whether we disable bouncing with DMA_API_DEBUG, maybe we
> could replace the above commit with:
>
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 3928a509c44c..488045ef6245 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -175,7 +175,7 @@ dma_addr_t dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
> if (!is_mmio)
> kmsan_handle_dma(phys, size, dir);
> trace_dma_map_phys(dev, phys, addr, size, dir, attrs);
> - debug_dma_map_phys(dev, phys, size, dir, addr, attrs);
> + debug_dma_map_phys(dev, dma_to_phys(dev, addr), size, dir, addr, attrs);
>
> return addr;
> }
>
> Anyway, this I think is unrelated to the proposed change affecting x86,
> more of a how to make the DMA API debugging more useful when running on
> arm64 or riscv.
This is not enough, there is also a dma_map_sg_attrs() path.
I've reverted 03521c892bb8 and added the following change:
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 55e7ca8ceb86..bbada41143ea 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -18,6 +18,7 @@
 #include <linux/uaccess.h>
 #include <linux/export.h>
 #include <linux/device.h>
+#include <linux/dma-direct.h>
 #include <linux/types.h>
 #include <linux/sched.h>
 #include <linux/ctype.h>
@@ -1241,7 +1242,8 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
 	entry->dev = dev;
 	entry->type = dma_debug_phy;
-	entry->paddr = phys;
+	entry->paddr = IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ?
+		dma_to_phys(dev, dma_addr) : phys;
 	entry->dev_addr = dma_addr;
 	entry->size = size;
 	entry->direction = direction;
@@ -1335,7 +1337,9 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	entry->type = dma_debug_sg;
 	entry->dev = dev;
-	entry->paddr = sg_phys(s);
+	entry->paddr =
+		IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ?
+		dma_to_phys(dev, sg_dma_address(s)) : sg_phys(s);
 	entry->size = sg_dma_len(s);
 	entry->dev_addr = sg_dma_address(s);
 	entry->direction = direction;
then ran my tests on ARM64 and RV64 boards. Only one new warning has been
reported (I didn't analyze it yet), so this might indeed be a better
solution than skipping overlapping cache line warnings when
DMA_BOUNCE_UNALIGNED_KMALLOC is set.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 14:09 ` Marek Szyprowski
@ 2026-03-27 14:30 ` Vlastimil Babka (SUSE)
2026-03-27 14:37 ` Mikhail Gavrilov
2026-03-27 14:55 ` Marek Szyprowski
1 sibling, 1 reply; 16+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-03-27 14:30 UTC (permalink / raw)
To: Marek Szyprowski, Catalin Marinas, Mikhail Gavrilov
Cc: harry.yoo, akpm, hao.li, cl, rientjes, roman.gushchin, linux-mm,
linux-kernel, linux-usb, stern, linux, andy.shevchenko, hch,
Jeff.kirsher, Robin Murphy
On 3/27/26 15:09, Marek Szyprowski wrote:
> Hi
>
> On 27.03.2026 13:26, Catalin Marinas wrote:
>> + Marek, Robin
>
> Thanks for adding me to the loop.
>
>> On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
>>> When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
>>> tracks active mappings per cacheline and warns if two different DMA
>>> mappings share the same cacheline ("cacheline tracking EEXIST,
>>> overlapping mappings aren't supported").
>>>
>>> On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
>>> allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
>>> hub driver) frequently land in the same 64-byte cacheline. When both
>>> are DMA-mapped, this triggers a false positive warning.
>>>
>>> This has been reported repeatedly since v5.14 (when the EEXIST check
>>> was added) across various USB host controllers and devices including
>>> xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
>> This indeed has come up regularly in the past years.
>>
>>> +/*
>>> + * Align memory allocations to cache lines if DMA API debugging is active
>>> + * to avoid false positive DMA overlapping error messages.
>>> + */
>>> +#ifdef CONFIG_DMA_API_DEBUG
>>> +#ifndef ARCH_KMALLOC_MINALIGN
>>> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
>>> +#elif ARCH_KMALLOC_MINALIGN < L1_CACHE_BYTES
>>> +#undef ARCH_KMALLOC_MINALIGN
>>> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
>>> +#endif
>>> +#endif
>> TL;DR: I think this is fine:
>>
>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>>
>> I'm not sure that's the best way to hide the warning but there
>> are no great solutions either. On one hand, we want the DMA debug to
>> capture potential problems on architectures it's not running on. OTOH,
>> we also want to avoid false positives on coherent architectures/devices.
>> I don't think reconciling the two requirements is easy.
>>
>> When DMA_API_DEBUG is enabled, the above will change the x86 behaviour
>> that could have implications beyond DMA (e.g. may not catch some buffer
>> overflow because it's within L1_CACHE_BYTES). Similarly for non-coherent
>> architectures that select DMA_BOUNCE_UNALIGNED_KMALLOC (arm64 and riscv
>> currently). arm64 defines ARCH_DMA_MINALIGN to 128 but
>> ARCH_KMALLOC_MINALIGN to 8 (why 128 is larger than L1_CACHE_BYTES is
>> another matter but let's ignore it for now).
>
> IMHO enabling DMA_API_DEBUG should not change the kernel behavior, so I
> would prefer fixing this in DMA-debug code somehow.
So what about Harry's proposal [1]? Mikhail seems to be on board? [2]
It seems it would achieve the goal that enabling DMA_API_DEBUG doesn't
change the kernel behavior? But I don't know this area too well so
maybe there's a catch.
[1] https://lore.kernel.org/all/acYlxRBhSMcwBnja@hyeyoo/
[2] https://lore.kernel.org/all/CABXGCsO_C8%2B%2B4%2BoPfZ%2BbQyrBnEGy5JFpXHkGNpfy%2B8%3D5BvVNfg@mail.gmail.com/
>
>> More of a thinking out loud, we have:
>>
>> 1. Coherent architectures - alignment doesn't matter
>>
>> 2. Non-coherent architectures with:
>> a) Sufficiently large ARCH_KMALLOC_MINALIGN
>> b) Small ARCH_KMALLOC_MINALIGN but DMA_BOUNCE_UNALIGNED_KMALLOC
>> c) Broken config - forgot to set ARCH_DMA_MINALIGN or bouncing
>>
>> We can ignore (2.c), the aim of the DMA debug is to catch wrong uses in
>> drivers. If drivers are the only goal, the above change will do when
>> running on (1) or (2.a) hardware - it will catch sub-L1_CACHE_BYTES
>> buffers from drivers while assuming kmalloc() machinery is safe.
>> However, if running on (2.b) it won't catch anything that may be
>> problematic on (2.a) since the DMA debug ignores the overlap.
>>
>> We could make DMA_BOUNCE_UNALIGNED_KMALLOC dependent on !DMA_API_DEBUG
>> but it would be nice to be able to sanity-check the bouncing logic.
>> Well, it wasn't checking it before and with commit 03521c892bb8
>> ("dma-debug: don't report false positives with
>> DMA_BOUNCE_UNALIGNED_KMALLOC"), we made this clear that overlapping will
>> be ignored.
>>
>> Irrespective of whether we disable bouncing with DMA_API_DEBUG, maybe we
>> could replace the above commit with:
>>
>> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
>> index 3928a509c44c..488045ef6245 100644
>> --- a/kernel/dma/mapping.c
>> +++ b/kernel/dma/mapping.c
>> @@ -175,7 +175,7 @@ dma_addr_t dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
>> if (!is_mmio)
>> kmsan_handle_dma(phys, size, dir);
>> trace_dma_map_phys(dev, phys, addr, size, dir, attrs);
>> - debug_dma_map_phys(dev, phys, size, dir, addr, attrs);
>> + debug_dma_map_phys(dev, dma_to_phys(dev, addr), size, dir, addr, attrs);
>>
>> return addr;
>> }
>>
>> Anyway, this I think is unrelated to the proposed change affecting x86,
>> more of a how to make the DMA API debugging more useful when running on
>> arm64 or riscv.
>
> This is not enough, there is also a dma_map_sg_attrs() path.
>
> I've reverted 03521c892bb8 and added the following change:
>
> diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
> index 55e7ca8ceb86..bbada41143ea 100644
> --- a/kernel/dma/debug.c
> +++ b/kernel/dma/debug.c
> @@ -18,6 +18,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/export.h>
>  #include <linux/device.h>
> +#include <linux/dma-direct.h>
>  #include <linux/types.h>
>  #include <linux/sched.h>
>  #include <linux/ctype.h>
> @@ -1241,7 +1242,8 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
>  	entry->dev = dev;
>  	entry->type = dma_debug_phy;
> -	entry->paddr = phys;
> +	entry->paddr = IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ?
> +		dma_to_phys(dev, dma_addr) : phys;
>  	entry->dev_addr = dma_addr;
>  	entry->size = size;
>  	entry->direction = direction;
> @@ -1335,7 +1337,9 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
>  	entry->type = dma_debug_sg;
>  	entry->dev = dev;
> -	entry->paddr = sg_phys(s);
> +	entry->paddr =
> +		IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ?
> +		dma_to_phys(dev, sg_dma_address(s)) : sg_phys(s);
>  	entry->size = sg_dma_len(s);
>  	entry->dev_addr = sg_dma_address(s);
>  	entry->direction = direction;
>
> then ran my tests on ARM64 and RV64 boards. Only one new warning has been
> reported (I didn't analyze it yet), so this might indeed be a better
> solution than skipping overlapping cache line warnings when
> DMA_BOUNCE_UNALIGNED_KMALLOC is set.
>
> Best regards
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 14:30 ` Vlastimil Babka (SUSE)
@ 2026-03-27 14:37 ` Mikhail Gavrilov
2026-03-27 14:41 ` Marek Szyprowski
0 siblings, 1 reply; 16+ messages in thread
From: Mikhail Gavrilov @ 2026-03-27 14:37 UTC (permalink / raw)
To: Vlastimil Babka (SUSE)
Cc: Marek Szyprowski, Catalin Marinas, harry.yoo, akpm, hao.li, cl,
rientjes, roman.gushchin, linux-mm, linux-kernel, linux-usb,
stern, linux, andy.shevchenko, hch, Jeff.kirsher, Robin Murphy
On Fri, Mar 27, 2026 at 7:30 PM Vlastimil Babka (SUSE)
<vbabka@kernel.org> wrote:
>
> So what about Harry's proposal [1]? Mikhail seems to be on board? [2]
>
> It seems it would achieve the goal that enabling DMA_API_DEBUG doesn't
> change the kernel behavior? But I don't know this area too well so
> maybe there's a catch.
>
> [1] https://lore.kernel.org/all/acYlxRBhSMcwBnja@hyeyoo/
> [2] https://lore.kernel.org/all/CABXGCsO_C8%2B%2B4%2BoPfZ%2BbQyrBnEGy5JFpXHkGNpfy%2B8%3D5BvVNfg@mail.gmail.com/
Hi Vlastimil,
Yes, I've already sent v2 based on Harry's suggestion:
https://lore.kernel.org/all/20260327124156.24820-1-mikhail.v.gavrilov@gmail.com/
It adds a dma_get_cache_alignment() >= L1_CACHE_BYTES check in
add_dma_entry() instead of changing ARCH_KMALLOC_MINALIGN, so
enabling DMA_API_DEBUG no longer affects allocator behavior.
--
Best Regards,
Mike Gavrilov.
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 14:37 ` Mikhail Gavrilov
@ 2026-03-27 14:41 ` Marek Szyprowski
0 siblings, 0 replies; 16+ messages in thread
From: Marek Szyprowski @ 2026-03-27 14:41 UTC (permalink / raw)
To: Mikhail Gavrilov, Vlastimil Babka (SUSE)
Cc: Catalin Marinas, harry.yoo, akpm, hao.li, cl, rientjes,
roman.gushchin, linux-mm, linux-kernel, linux-usb, stern, linux,
andy.shevchenko, hch, Jeff.kirsher, Robin Murphy
On 27.03.2026 15:37, Mikhail Gavrilov wrote:
> On Fri, Mar 27, 2026 at 7:30 PM Vlastimil Babka (SUSE)
> <vbabka@kernel.org> wrote:
>> So what about Harry's proposal [1]? Mikhail seems to be on board? [2]
>>
>> It seems it would achieve the goal that enabling DMA_API_DEBUG doesn't
>> change the kernel behavior? But I don't know this area too well so
>> maybe there's a catch.
>>
>> [1] https://lore.kernel.org/all/acYlxRBhSMcwBnja@hyeyoo/
>> [2] https://lore.kernel.org/all/CABXGCsO_C8%2B%2B4%2BoPfZ%2BbQyrBnEGy5JFpXHkGNpfy%2B8%3D5BvVNfg@mail.gmail.com/
> Hi Vlastimil,
>
> Yes, I've already sent v2 based on Harry's suggestion:
> https://lore.kernel.org/all/20260327124156.24820-1-mikhail.v.gavrilov@gmail.com/
>
> It adds a dma_get_cache_alignment() >= L1_CACHE_BYTES check in
> add_dma_entry() instead of changing ARCH_KMALLOC_MINALIGN, so
> enabling DMA_API_DEBUG no longer affects allocator behavior.
This looks like a good fix, but let me think a bit more about all
possible cases.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active
2026-03-27 14:09 ` Marek Szyprowski
2026-03-27 14:30 ` Vlastimil Babka (SUSE)
@ 2026-03-27 14:55 ` Marek Szyprowski
1 sibling, 0 replies; 16+ messages in thread
From: Marek Szyprowski @ 2026-03-27 14:55 UTC (permalink / raw)
To: Catalin Marinas, Mikhail Gavrilov
Cc: vbabka, harry.yoo, akpm, hao.li, cl, rientjes, roman.gushchin,
linux-mm, linux-kernel, linux-usb, stern, linux, andy.shevchenko,
hch, Jeff.kirsher, Robin Murphy
On 27.03.2026 15:09, Marek Szyprowski wrote:
> On 27.03.2026 13:26, Catalin Marinas wrote:
>> + Marek, Robin
>
> Thanks for adding me to the loop.
>
>> On Fri, Mar 27, 2026 at 10:58:46AM +0500, Mikhail Gavrilov wrote:
>>> When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
>>> tracks active mappings per cacheline and warns if two different DMA
>>> mappings share the same cacheline ("cacheline tracking EEXIST,
>>> overlapping mappings aren't supported").
>>>
>>> On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
>>> allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
>>> hub driver) frequently land in the same 64-byte cacheline. When both
>>> are DMA-mapped, this triggers a false positive warning.
>>>
>>> This has been reported repeatedly since v5.14 (when the EEXIST check
>>> was added) across various USB host controllers and devices including
>>> xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
>> This indeed has come up regularly in the past years.
>>
>>> +/*
>>> + * Align memory allocations to cache lines if DMA API debugging is active
>>> + * to avoid false positive DMA overlapping error messages.
>>> + */
>>> +#ifdef CONFIG_DMA_API_DEBUG
>>> +#ifndef ARCH_KMALLOC_MINALIGN
>>> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
>>> +#elif ARCH_KMALLOC_MINALIGN < L1_CACHE_BYTES
>>> +#undef ARCH_KMALLOC_MINALIGN
>>> +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
>>> +#endif
>>> +#endif
>> TL;DR: I think this is fine:
>>
>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>>
>> I'm not sure that's the best way to hide the warning but there
>> are no great solutions either. On one hand, we want the DMA debug to
>> capture potential problems on architectures it's not running on. OTOH,
>> we also want to avoid false positives on coherent architectures/devices.
>> I don't think reconciling the two requirements is easy.
>>
>> When DMA_API_DEBUG is enabled, the above will change the x86 behaviour
>> that could have implications beyond DMA (e.g. may not catch some buffer
>> overflow because it's within L1_CACHE_BYTES). Similarly for non-coherent
>> architectures that select DMA_BOUNCE_UNALIGNED_KMALLOC (arm64 and riscv
>> currently). arm64 defines ARCH_DMA_MINALIGN to 128 but
>> ARCH_KMALLOC_MINALIGN to 8 (why 128 is larger than L1_CACHE_BYTES is
>> another matter but let's ignore it for now).
>
> IMHO enabling DMA_API_DEBUG should not change the kernel behavior, so I would prefer fixing this in DMA-debug code somehow.
>
>> More of a thinking out loud, we have:
>>
>> 1. Coherent architectures - alignment doesn't matter
>>
>> 2. Non-coherent architectures with:
>> a) Sufficiently large ARCH_KMALLOC_MINALIGN
>> b) Small ARCH_KMALLOC_MINALIGN but DMA_BOUNCE_UNALIGNED_KMALLOC
>> c) Broken config - forgot to set ARCH_DMA_MINALIGN or bouncing
>>
>> We can ignore (2.c), the aim of the DMA debug is to catch wrong uses in
>> drivers. If drivers are the only goal, the above change will do when
>> running on (1) or (2.a) hardware - it will catch sub-L1_CACHE_BYTES
>> buffers from drivers while assuming kmalloc() machinery is safe.
>> However, if running on (2.b) it won't catch anything that may be
>> problematic on (2.a) since the DMA debug ignores the overlap.
>>
>> We could make DMA_BOUNCE_UNALIGNED_KMALLOC dependent on !DMA_API_DEBUG
>> but it would be nice to be able to sanity-check the bouncing logic.
>> Well, it wasn't checking it before and with commit 03521c892bb8
>> ("dma-debug: don't report false positives with
>> DMA_BOUNCE_UNALIGNED_KMALLOC"), we made this clear that overlapping will
>> be ignored.
>>
>> Irrespective of whether we disable bouncing with DMA_API_DEBUG, maybe we
>> could replace the above commit with:
>>
>> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
>> index 3928a509c44c..488045ef6245 100644
>> --- a/kernel/dma/mapping.c
>> +++ b/kernel/dma/mapping.c
>> @@ -175,7 +175,7 @@ dma_addr_t dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
>> if (!is_mmio)
>> kmsan_handle_dma(phys, size, dir);
>> trace_dma_map_phys(dev, phys, addr, size, dir, attrs);
>> - debug_dma_map_phys(dev, phys, size, dir, addr, attrs);
>> + debug_dma_map_phys(dev, dma_to_phys(dev, addr), size, dir, addr, attrs);
>>
>> return addr;
>> }
>>
>> Anyway, this I think is unrelated to the proposed change affecting x86,
>> more of a how to make the DMA API debugging more useful when running on
>> arm64 or riscv.
>
> This is not enough, there is also a dma_map_sg_attrs() path.
>
> I've reverted 03521c892bb8 and added the following change:
>
> diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index 55e7ca8ceb86..bbada41143ea 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -18,6 +18,7 @@ #include <linux/uaccess.h> #include <linux/export.h> #include <linux/device.h> +#include <linux/dma-direct.h> #include <linux/types.h> #include <linux/sched.h> #include <linux/ctype.h> @@ -1241,7 +1242,8 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size, entry->dev = dev; entry->type = dma_debug_phy; - entry->paddr = phys; + entry->paddr = IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ? + dma_to_phys(dev, dma_addr) : phys; entry->dev_addr = dma_addr; entry->size = size; entry->direction = direction; @@ -1335,7 +1337,9 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg, entry->type = dma_debug_sg; entry->dev = dev; - entry->paddr = sg_phys(s); + entry->paddr = + IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ? + dma_to_phys(dev, sg_dma_address(s)) : sg_phys(s);
> entry->size = sg_dma_len(s); entry->dev_addr = sg_dma_address(s); entry->direction = direction;
>
> then ran my tests on ARM64 and RV64 boards. Only one new warning has been reported (I didn't analyze it yet), so this might indeed be a better solution than skipping overlapping cache line warnings when DMA_BOUNCE_UNALIGNED_KMALLOC is set.
>
Huh, the diff has been malformed by my mail client. Let's try again:
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 55e7ca8ceb86..bbada41143ea 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -18,6 +18,7 @@
#include <linux/uaccess.h>
#include <linux/export.h>
#include <linux/device.h>
+#include <linux/dma-direct.h>
#include <linux/types.h>
#include <linux/sched.h>
#include <linux/ctype.h>
@@ -1241,7 +1242,8 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
entry->dev = dev;
entry->type = dma_debug_phy;
- entry->paddr = phys;
+ entry->paddr = IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ?
+ dma_to_phys(dev, dma_addr) : phys;
entry->dev_addr = dma_addr;
entry->size = size;
entry->direction = direction;
@@ -1335,7 +1337,9 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
entry->type = dma_debug_sg;
entry->dev = dev;
- entry->paddr = sg_phys(s);
+ entry->paddr =
+ IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ?
+ dma_to_phys(dev, sg_dma_address(s)) : sg_phys(s);
entry->size = sg_dma_len(s);
entry->dev_addr = sg_dma_address(s);
entry->direction = direction;
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
end of thread, other threads:[~2026-03-27 14:55 UTC | newest]
Thread overview: 16+ messages
-- links below jump to the message on this page --
2026-03-27 5:58 [PATCH] mm/slab: align kmalloc to cacheline when DMA API debugging is active Mikhail Gavrilov
2026-03-27 6:37 ` Harry Yoo (Oracle)
2026-03-27 6:50 ` Mikhail Gavrilov
2026-03-27 8:00 ` Harry Yoo (Oracle)
2026-03-27 8:07 ` Mikhail Gavrilov
2026-03-27 8:43 ` Harry Yoo (Oracle)
2026-03-27 10:25 ` Mikhail Gavrilov
2026-03-27 10:39 ` Harry Yoo (Oracle)
2026-03-27 6:41 ` Guenter Roeck
2026-03-27 12:26 ` Catalin Marinas
2026-03-27 12:34 ` Andy Shevchenko
2026-03-27 14:09 ` Marek Szyprowski
2026-03-27 14:30 ` Vlastimil Babka (SUSE)
2026-03-27 14:37 ` Mikhail Gavrilov
2026-03-27 14:41 ` Marek Szyprowski
2026-03-27 14:55 ` Marek Szyprowski