* [PATCH v2 00/15] fix DMA alignment issues around virtio
@ 2026-01-05  8:22 Michael S. Tsirkin
  2026-01-05  8:22 ` [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end() Michael S. Tsirkin
                   ` (14 more replies)
  0 siblings, 15 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev


Cong Wang reported DMA debug warnings with virtio-vsock
and proposed a patch, see:

https://lore.kernel.org/all/20251228015451.1253271-1-xiyou.wangcong@gmail.com/

However, the issue is more widespread; this series is an attempt to
fix it systematically.
Note: i2c and gio might also be affected; I am still looking
into it. Help from maintainers welcome.
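
To illustrate the class of bug being fixed (a minimal sketch; the
struct and field names are made up for illustration):

	struct fragile_device {
		spinlock_t lock;
		u8 status;	/* CPU-written while DMA is in flight */
		u8 rx_buf[16];	/* device-written via DMA_FROM_DEVICE;
				 * may share a cacheline with status, so
				 * CPU stores to status can corrupt
				 * device data on non-coherent systems
				 */
	};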

Lightly tested.  Cursor/Claude used liberally, mostly for
refactoring, API updates, and English.

DMA maintainers, could you please confirm the DMA core changes
are ok with you?

Thanks!


Michael S. Tsirkin (15):
  dma-mapping: add __dma_from_device_group_begin()/end()
  docs: dma-api: document __dma_from_device_group_begin()/end()
  dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
  docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
  dma-debug: track cache clean flag in entries
  virtio: add virtqueue_add_inbuf_cache_clean API
  vsock/virtio: fix DMA alignment for event_list
  vsock/virtio: use virtqueue_add_inbuf_cache_clean for events
  virtio_input: fix DMA alignment for evts
  virtio_scsi: fix DMA cacheline issues for events
  virtio-rng: fix DMA alignment for data buffer
  virtio_input: use virtqueue_add_inbuf_cache_clean for events
  vsock/virtio: reorder fields to reduce padding
  gpio: virtio: fix DMA alignment
  gpio: virtio: reorder fields to reduce struct padding

 Documentation/core-api/dma-api-howto.rst  | 52 ++++++++++++++
 Documentation/core-api/dma-attributes.rst |  9 +++
 drivers/char/hw_random/virtio-rng.c       |  3 +
 drivers/gpio/gpio-virtio.c                | 15 ++--
 drivers/scsi/virtio_scsi.c                | 17 +++--
 drivers/virtio/virtio_input.c             |  5 +-
 drivers/virtio/virtio_ring.c              | 83 ++++++++++++++++-------
 include/linux/dma-mapping.h               | 20 ++++++
 include/linux/virtio.h                    |  5 ++
 kernel/dma/debug.c                        | 28 ++++++--
 net/vmw_vsock/virtio_transport.c          |  8 ++-
 11 files changed, 205 insertions(+), 40 deletions(-)

-- 
MST



* [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end()
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
@ 2026-01-05  8:22 ` Michael S. Tsirkin
  2026-01-05  9:40   ` Petr Tesarik
  2026-01-05 18:27   ` Marek Szyprowski
  2026-01-05  8:22 ` [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end() Michael S. Tsirkin
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

When a structure contains a buffer that DMA writes to alongside fields
that the CPU writes to, cache line sharing between the DMA buffer and
CPU-written fields can cause data corruption on non-cache-coherent
platforms.

Add __dma_from_device_group_begin()/end() annotations to ensure proper
alignment to prevent this:

struct my_device {
	spinlock_t lock1;
	__dma_from_device_group_begin();
	char dma_buffer1[16];
	char dma_buffer2[16];
	__dma_from_device_group_end();
	spinlock_t lock2;
};
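
For reference, with ARCH_HAS_DMA_MINALIGN defined this expands to
roughly the following (a sketch assuming ARCH_DMA_MINALIGN == 128;
the markers are zero-size fields, so no data is added):

struct my_device {
	spinlock_t lock1;
	/* begin marker: pushes the buffers out of lock1's cacheline */
	__u8 __cacheline_group_begin__[0] __aligned(128);
	char dma_buffer1[16];
	char dma_buffer2[16];
	/* end marker: pushes lock2 into the next DMA cacheline */
	__u8 __cacheline_group_end__[0] __aligned(128);
	spinlock_t lock2;
};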

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/dma-mapping.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index aa36a0d1d9df..29ad2ce700f0 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -7,6 +7,7 @@
 #include <linux/dma-direction.h>
 #include <linux/scatterlist.h>
 #include <linux/bug.h>
+#include <linux/cache.h>
 
 /**
  * List of possible attributes associated with a DMA mapping. The semantics
@@ -703,6 +704,18 @@ static inline int dma_get_cache_alignment(void)
 }
 #endif
 
+#ifdef ARCH_HAS_DMA_MINALIGN
+#define ____dma_from_device_aligned __aligned(ARCH_DMA_MINALIGN)
+#else
+#define ____dma_from_device_aligned
+#endif
+/* Mark start of DMA buffer */
+#define __dma_from_device_group_begin(GROUP)			\
+	__cacheline_group_begin(GROUP) ____dma_from_device_aligned
+/* Mark end of DMA buffer */
+#define __dma_from_device_group_end(GROUP)			\
+	__cacheline_group_end(GROUP) ____dma_from_device_aligned
+
 static inline void *dmam_alloc_coherent(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, gfp_t gfp)
 {
-- 
MST



* [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end()
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
  2026-01-05  8:22 ` [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end() Michael S. Tsirkin
@ 2026-01-05  8:22 ` Michael S. Tsirkin
  2026-01-05  9:48   ` Petr Tesarik
  2026-01-05 18:28   ` Marek Szyprowski
  2026-01-05  8:23 ` [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

Document the __dma_from_device_group_begin()/end() annotations.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/core-api/dma-api-howto.rst | 52 ++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/Documentation/core-api/dma-api-howto.rst b/Documentation/core-api/dma-api-howto.rst
index 96fce2a9aa90..e97743ab0f26 100644
--- a/Documentation/core-api/dma-api-howto.rst
+++ b/Documentation/core-api/dma-api-howto.rst
@@ -146,6 +146,58 @@ What about block I/O and networking buffers?  The block I/O and
 networking subsystems make sure that the buffers they use are valid
 for you to DMA from/to.
 
+__dma_from_device_group_begin/end annotations
+=============================================
+
+As explained previously, when a structure contains a DMA_FROM_DEVICE /
+DMA_BIDIRECTIONAL buffer (device writes to memory) alongside fields that the
+CPU writes to, cache line sharing between the DMA buffer and CPU-written fields
+can cause data corruption on CPUs with DMA-incoherent caches.
+
+The ``__dma_from_device_group_begin(GROUP)/__dma_from_device_group_end(GROUP)``
+macros ensure proper alignment to prevent this::
+
+	struct my_device {
+		spinlock_t lock1;
+		__dma_from_device_group_begin();
+		char dma_buffer1[16];
+		char dma_buffer2[16];
+		__dma_from_device_group_end();
+		spinlock_t lock2;
+	};
+
+To isolate a DMA buffer from adjacent fields, use
+``__dma_from_device_group_begin(GROUP)`` before the first DMA buffer
+field and ``__dma_from_device_group_end(GROUP)`` after the last DMA
+buffer field (with the same GROUP name). This protects both the head
+and tail of the buffer from cache line sharing.
+
+The GROUP parameter is an optional identifier that names the DMA buffer group
+(in case you have several in the same structure)::
+
+	struct my_device {
+		spinlock_t lock1;
+		__dma_from_device_group_begin(buffer1);
+		char dma_buffer1[16];
+		__dma_from_device_group_end(buffer1);
+		spinlock_t lock2;
+		__dma_from_device_group_begin(buffer2);
+		char dma_buffer2[16];
+		__dma_from_device_group_end(buffer2);
+	};
+
+On cache-coherent platforms these macros expand to zero-length array markers.
+On non-coherent platforms, they also ensure the minimal DMA alignment, which
+can be as large as 128 bytes.
+
+.. note::
+
+        It is allowed (though somewhat fragile) to include extra fields, not
+        intended for DMA from the device, within the group (in order to pack the
+        structure tightly) - but only as long as the CPU does not write these
+        fields while any fields in the group are mapped for DMA_FROM_DEVICE or
+        DMA_BIDIRECTIONAL.
+
 DMA addressing capabilities
 ===========================
 
-- 
MST



* [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
  2026-01-05  8:22 ` [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end() Michael S. Tsirkin
  2026-01-05  8:22 ` [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end() Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  9:50   ` Petr Tesarik
  2026-01-08 13:57   ` Marek Szyprowski
  2026-01-05  8:23 ` [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

When multiple small DMA_FROM_DEVICE or DMA_BIDIRECTIONAL buffers share a
cacheline and DMA_API_DEBUG is enabled, we get this warning:
	cacheline tracking EEXIST, overlapping mappings aren't supported.

This is because, when one of the mappings is removed while another one
is still active, the CPU might write into the buffer.

Add an attribute by which the driver promises not to do this, making the
overlap safe and suppressing the warning.
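
For example (a sketch; assume buf and buf + 4 fall within one cacheline,
dev and buf are set up elsewhere, and error handling is omitted):

	/* Both overlapping mappings carry the attribute, so dma-debug
	 * accepts the overlap instead of warning.
	 */
	dma_addr_t a = dma_map_single_attrs(dev, buf, 4, DMA_FROM_DEVICE,
					    DMA_ATTR_CPU_CACHE_CLEAN);
	dma_addr_t b = dma_map_single_attrs(dev, buf + 4, 4, DMA_FROM_DEVICE,
					    DMA_ATTR_CPU_CACHE_CLEAN);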

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/dma-mapping.h | 7 +++++++
 kernel/dma/debug.c          | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 29ad2ce700f0..29973baa0581 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -79,6 +79,13 @@
  */
 #define DMA_ATTR_MMIO		(1UL << 10)
 
+/*
+ * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline
+ * overlapping this buffer while it is mapped for DMA. All mappings sharing
+ * a cacheline must have this attribute for this to be considered safe.
+ */
+#define DMA_ATTR_CPU_CACHE_CLEAN	(1UL << 11)
+
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform.  It can
  * be given to a device to use as a DMA source or target.  It is specific to a
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 138ede653de4..7e66d863d573 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -595,7 +595,8 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
 	if (rc == -ENOMEM) {
 		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
 		global_disable = true;
-	} else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+	} else if (rc == -EEXIST &&
+		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
 		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
 		     is_swiotlb_active(entry->dev))) {
 		err_printk(entry->dev, entry,
-- 
MST



* [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  9:51   ` Petr Tesarik
  2026-01-08 13:59   ` Marek Szyprowski
  2026-01-05  8:23 ` [PATCH v2 05/15] dma-debug: track cache clean flag in entries Michael S. Tsirkin
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

Document DMA_ATTR_CPU_CACHE_CLEAN as implemented in the
previous patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/core-api/dma-attributes.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 0bdc2be65e57..1d7bfad73b1c 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -148,3 +148,12 @@ DMA_ATTR_MMIO is appropriate.
 For architectures that require cache flushing for DMA coherence
 DMA_ATTR_MMIO will not perform any cache flushing. The address
 provided must never be mapped cacheable into the CPU.
+
+DMA_ATTR_CPU_CACHE_CLEAN
+------------------------
+
+This attribute indicates the CPU will not dirty any cacheline overlapping this
+DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
+multiple small buffers to safely share a cacheline without risk of data
+corruption, suppressing DMA debug warnings about overlapping mappings.
+All mappings sharing a cacheline should have this attribute.
-- 
MST



* [PATCH v2 05/15] dma-debug: track cache clean flag in entries
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  9:54   ` Petr Tesarik
  2026-01-05  8:23 ` [PATCH v2 06/15] virtio: add virtqueue_add_inbuf_cache_clean API Michael S. Tsirkin
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

If a driver is buggy and has two overlapping mappings but only sets the
cache clean flag on the first of them, we warn. But if it only sets the
flag on the second one, we don't.

Fix this by tracking the cache clean flag in the entry.
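
A sketch of the buggy pattern this catches (illustrative; buf sits
within a single cacheline):

	/* The first mapping makes no cache clean promise ... */
	dma_map_single_attrs(dev, buf, 4, DMA_FROM_DEVICE, 0);
	/* ... so this overlap now warns regardless of mapping order. */
	dma_map_single_attrs(dev, buf + 4, 4, DMA_FROM_DEVICE,
			     DMA_ATTR_CPU_CACHE_CLEAN);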

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 kernel/dma/debug.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 7e66d863d573..43d6a996d7a7 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -63,6 +63,7 @@ enum map_err_types {
  * @sg_mapped_ents: 'mapped_ents' from dma_map_sg
  * @paddr: physical start address of the mapping
  * @map_err_type: track whether dma_mapping_error() was checked
+ * @is_cache_clean: driver promises not to write to buffer while mapped
  * @stack_len: number of backtrace entries in @stack_entries
  * @stack_entries: stack of backtrace history
  */
@@ -76,7 +77,8 @@ struct dma_debug_entry {
 	int		 sg_call_ents;
 	int		 sg_mapped_ents;
 	phys_addr_t	 paddr;
-	enum map_err_types  map_err_type;
+	enum map_err_types map_err_type;
+	bool		 is_cache_clean;
 #ifdef CONFIG_STACKTRACE
 	unsigned int	stack_len;
 	unsigned long	stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
@@ -472,12 +474,15 @@ static int active_cacheline_dec_overlap(phys_addr_t cln)
 	return active_cacheline_set_overlap(cln, --overlap);
 }
 
-static int active_cacheline_insert(struct dma_debug_entry *entry)
+static int active_cacheline_insert(struct dma_debug_entry *entry,
+				   bool *overlap_cache_clean)
 {
 	phys_addr_t cln = to_cacheline_number(entry);
 	unsigned long flags;
 	int rc;
 
+	*overlap_cache_clean = false;
+
 	/* If the device is not writing memory then we don't have any
 	 * concerns about the cpu consuming stale data.  This mitigates
 	 * legitimate usages of overlapping mappings.
@@ -487,8 +492,16 @@ static int active_cacheline_insert(struct dma_debug_entry *entry)
 
 	spin_lock_irqsave(&radix_lock, flags);
 	rc = radix_tree_insert(&dma_active_cacheline, cln, entry);
-	if (rc == -EEXIST)
+	if (rc == -EEXIST) {
+		struct dma_debug_entry *existing;
+
 		active_cacheline_inc_overlap(cln);
+		existing = radix_tree_lookup(&dma_active_cacheline, cln);
+		/* A lookup failure here after we got -EEXIST is unexpected. */
+		WARN_ON(!existing);
+		if (existing)
+			*overlap_cache_clean = existing->is_cache_clean;
+	}
 	spin_unlock_irqrestore(&radix_lock, flags);
 
 	return rc;
@@ -583,20 +596,24 @@ DEFINE_SHOW_ATTRIBUTE(dump);
  */
 static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
 {
+	bool overlap_cache_clean;
 	struct hash_bucket *bucket;
 	unsigned long flags;
 	int rc;
 
+	entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
+
 	bucket = get_hash_bucket(entry, &flags);
 	hash_bucket_add(bucket, entry);
 	put_hash_bucket(bucket, flags);
 
-	rc = active_cacheline_insert(entry);
+	rc = active_cacheline_insert(entry, &overlap_cache_clean);
 	if (rc == -ENOMEM) {
 		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
 		global_disable = true;
 	} else if (rc == -EEXIST &&
-		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
+		   !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+		   !(entry->is_cache_clean && overlap_cache_clean) &&
 		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
 		     is_swiotlb_active(entry->dev))) {
 		err_printk(entry->dev, entry,
-- 
MST



* [PATCH v2 06/15] virtio: add virtqueue_add_inbuf_cache_clean API
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (4 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 05/15] dma-debug: track cache clean flag in entries Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  8:23 ` [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list Michael S. Tsirkin
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

Add virtqueue_add_inbuf_cache_clean() for passing DMA_ATTR_CPU_CACHE_CLEAN
to virtqueue operations. This suppresses DMA debug cacheline overlap
warnings for buffers where proper cache management is ensured by the
caller.
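
Typical usage looks like this (a sketch; event is a small
device-writable buffer that the CPU does not touch while it sits in
the virtqueue):

	struct scatterlist sg;
	int err;

	sg_init_one(&sg, event, sizeof(*event));
	err = virtqueue_add_inbuf_cache_clean(vq, &sg, 1, event, GFP_ATOMIC);
	if (!err)
		virtqueue_kick(vq);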

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c | 83 ++++++++++++++++++++++++++----------
 include/linux/virtio.h       |  5 +++
 2 files changed, 65 insertions(+), 23 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 95e320b23624..4fe0f78df5ec 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -174,7 +174,8 @@ struct virtqueue_ops {
 	int (*add)(struct vring_virtqueue *vq, struct scatterlist *sgs[],
 		   unsigned int total_sg, unsigned int out_sgs,
 		   unsigned int in_sgs,	void *data,
-		   void *ctx, bool premapped, gfp_t gfp);
+		   void *ctx, bool premapped, gfp_t gfp,
+		   unsigned long attr);
 	void *(*get)(struct vring_virtqueue *vq, unsigned int *len, void **ctx);
 	bool (*kick_prepare)(struct vring_virtqueue *vq);
 	void (*disable_cb)(struct vring_virtqueue *vq);
@@ -444,7 +445,7 @@ static int vring_mapping_error(const struct vring_virtqueue *vq,
 /* Map one sg entry. */
 static int vring_map_one_sg(const struct vring_virtqueue *vq, struct scatterlist *sg,
 			    enum dma_data_direction direction, dma_addr_t *addr,
-			    u32 *len, bool premapped)
+			    u32 *len, bool premapped, unsigned long attr)
 {
 	if (premapped) {
 		*addr = sg_dma_address(sg);
@@ -472,7 +473,7 @@ static int vring_map_one_sg(const struct vring_virtqueue *vq, struct scatterlist
 	 */
 	*addr = virtqueue_map_page_attrs(&vq->vq, sg_page(sg),
 					 sg->offset, sg->length,
-					 direction, 0);
+					 direction, attr);
 
 	if (vring_mapping_error(vq, *addr))
 		return -ENOMEM;
@@ -603,7 +604,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
 				      void *data,
 				      void *ctx,
 				      bool premapped,
-				      gfp_t gfp)
+				      gfp_t gfp,
+				      unsigned long attr)
 {
 	struct vring_desc_extra *extra;
 	struct scatterlist *sg;
@@ -675,7 +677,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
 			if (++sg_count != total_sg)
 				flags |= VRING_DESC_F_NEXT;
 
-			if (vring_map_one_sg(vq, sg, DMA_TO_DEVICE, &addr, &len, premapped))
+			if (vring_map_one_sg(vq, sg, DMA_TO_DEVICE, &addr, &len,
+					     premapped, attr))
 				goto unmap_release;
 
 			/* Note that we trust indirect descriptor
@@ -694,7 +697,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
 			if (++sg_count != total_sg)
 				flags |= VRING_DESC_F_NEXT;
 
-			if (vring_map_one_sg(vq, sg, DMA_FROM_DEVICE, &addr, &len, premapped))
+			if (vring_map_one_sg(vq, sg, DMA_FROM_DEVICE, &addr, &len,
+					     premapped, attr))
 				goto unmap_release;
 
 			/* Note that we trust indirect descriptor
@@ -1487,7 +1491,8 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 					 void *data,
 					 bool premapped,
 					 gfp_t gfp,
-					 u16 id)
+					 u16 id,
+					 unsigned long attr)
 {
 	struct vring_desc_extra *extra;
 	struct vring_packed_desc *desc;
@@ -1516,7 +1521,7 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
 			if (vring_map_one_sg(vq, sg, n < out_sgs ?
 					     DMA_TO_DEVICE : DMA_FROM_DEVICE,
-					     &addr, &len, premapped))
+					     &addr, &len, premapped, attr))
 				goto unmap_release;
 
 			desc[i].flags = cpu_to_le16(n < out_sgs ?
@@ -1615,7 +1620,8 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
 				       void *data,
 				       void *ctx,
 				       bool premapped,
-				       gfp_t gfp)
+				       gfp_t gfp,
+				       unsigned long attr)
 {
 	struct vring_packed_desc *desc;
 	struct scatterlist *sg;
@@ -1642,8 +1648,8 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
 		id = vq->free_head;
 		BUG_ON(id == vq->packed.vring.num);
 		err = virtqueue_add_indirect_packed(vq, sgs, total_sg, out_sgs,
-						    in_sgs, data, premapped,
-						    gfp, id);
+						    in_sgs, data, premapped, gfp,
+						    id, attr);
 		if (err != -ENOMEM) {
 			END_USE(vq);
 			return err;
@@ -1679,7 +1685,7 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
 
 			if (vring_map_one_sg(vq, sg, n < out_sgs ?
 					     DMA_TO_DEVICE : DMA_FROM_DEVICE,
-					     &addr, &len, premapped))
+					     &addr, &len, premapped, attr))
 				goto unmap_release;
 
 			flags = cpu_to_le16(vq->packed.avail_used_flags |
@@ -1772,7 +1778,8 @@ static inline int virtqueue_add_packed_in_order(struct vring_virtqueue *vq,
 						void *data,
 						void *ctx,
 						bool premapped,
-						gfp_t gfp)
+						gfp_t gfp,
+						unsigned long attr)
 {
 	struct vring_packed_desc *desc;
 	struct scatterlist *sg;
@@ -1799,7 +1806,8 @@ static inline int virtqueue_add_packed_in_order(struct vring_virtqueue *vq,
 	if (virtqueue_use_indirect(vq, total_sg)) {
 		err = virtqueue_add_indirect_packed(vq, sgs, total_sg, out_sgs,
 						    in_sgs, data, premapped, gfp,
-						    vq->packed.next_avail_idx);
+						    vq->packed.next_avail_idx,
+						    attr);
 		if (err != -ENOMEM) {
 			END_USE(vq);
 			return err;
@@ -1838,7 +1846,7 @@ static inline int virtqueue_add_packed_in_order(struct vring_virtqueue *vq,
 
 			if (vring_map_one_sg(vq, sg, n < out_sgs ?
 					     DMA_TO_DEVICE : DMA_FROM_DEVICE,
-					     &addr, &len, premapped))
+					     &addr, &len, premapped, attr))
 				goto unmap_release;
 
 			flags |= cpu_to_le16(vq->packed.avail_used_flags);
@@ -2781,13 +2789,14 @@ static inline int virtqueue_add(struct virtqueue *_vq,
 				void *data,
 				void *ctx,
 				bool premapped,
-				gfp_t gfp)
+				gfp_t gfp,
+				unsigned long attr)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
 	return VIRTQUEUE_CALL(vq, add, sgs, total_sg,
 			      out_sgs, in_sgs, data,
-			      ctx, premapped, gfp);
+			      ctx, premapped, gfp, attr);
 }
 
 /**
@@ -2825,7 +2834,7 @@ int virtqueue_add_sgs(struct virtqueue *_vq,
 			total_sg++;
 	}
 	return virtqueue_add(_vq, sgs, total_sg, out_sgs, in_sgs,
-			     data, NULL, false, gfp);
+			     data, NULL, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_sgs);
 
@@ -2847,7 +2856,7 @@ int virtqueue_add_outbuf(struct virtqueue *vq,
 			 void *data,
 			 gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, false, gfp);
+	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_outbuf);
 
@@ -2870,7 +2879,7 @@ int virtqueue_add_outbuf_premapped(struct virtqueue *vq,
 				   void *data,
 				   gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, true, gfp);
+	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, true, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_outbuf_premapped);
 
@@ -2892,10 +2901,38 @@ int virtqueue_add_inbuf(struct virtqueue *vq,
 			void *data,
 			gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp);
+	return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf);
 
+/**
+ * virtqueue_add_inbuf_cache_clean - expose input buffers with cache clean
+ * @vq: the struct virtqueue we're talking about.
+ * @sg: scatterlist (must be well-formed and terminated!)
+ * @num: the number of entries in @sg writable by other side
+ * @data: the token identifying the buffer.
+ * @gfp: how to do memory allocations (if necessary).
+ *
+ * Same as virtqueue_add_inbuf but passes DMA_ATTR_CPU_CACHE_CLEAN to indicate
+ * that the CPU will not dirty any cacheline overlapping this buffer while it
+ * is available, and to suppress overlapping cacheline warnings in DMA debug
+ * builds.
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ *
+ * Returns zero or a negative error (ie. ENOSPC, ENOMEM, EIO).
+ */
+int virtqueue_add_inbuf_cache_clean(struct virtqueue *vq,
+				    struct scatterlist *sg, unsigned int num,
+				    void *data,
+				    gfp_t gfp)
+{
+	return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp,
+			     DMA_ATTR_CPU_CACHE_CLEAN);
+}
+EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_cache_clean);
+
 /**
  * virtqueue_add_inbuf_ctx - expose input buffers to other end
  * @vq: the struct virtqueue we're talking about.
@@ -2916,7 +2953,7 @@ int virtqueue_add_inbuf_ctx(struct virtqueue *vq,
 			void *ctx,
 			gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, false, gfp);
+	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_ctx);
 
@@ -2941,7 +2978,7 @@ int virtqueue_add_inbuf_premapped(struct virtqueue *vq,
 				  void *ctx,
 				  gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, true, gfp);
+	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, true, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_premapped);
 
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 3626eb694728..63bb05ece8c5 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -62,6 +62,11 @@ int virtqueue_add_inbuf(struct virtqueue *vq,
 			void *data,
 			gfp_t gfp);
 
+int virtqueue_add_inbuf_cache_clean(struct virtqueue *vq,
+				    struct scatterlist sg[], unsigned int num,
+				    void *data,
+				    gfp_t gfp);
+
 int virtqueue_add_inbuf_ctx(struct virtqueue *vq,
 			    struct scatterlist sg[], unsigned int num,
 			    void *data,
-- 
MST



* [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (5 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 06/15] virtio: add virtqueue_add_inbuf_cache_clean API Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-08 14:04   ` Stefano Garzarella
  2026-01-05  8:23 ` [PATCH v2 08/15] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On non-cache-coherent platforms, when a structure contains a buffer
used for DMA alongside fields that the CPU writes to, cacheline sharing
can cause data corruption.

The event_list array is used for DMA_FROM_DEVICE operations via
virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are
written by the CPU while the buffer is available, i.e. mapped for the
device. If these share cachelines with event_list, CPU writes can
corrupt DMA data.

Add __dma_from_device_group_begin()/end() annotations to ensure event_list
is isolated in its own cachelines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 8c867023a2e5..bb94baadfd8b 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -17,6 +17,7 @@
 #include <linux/virtio_ids.h>
 #include <linux/virtio_config.h>
 #include <linux/virtio_vsock.h>
+#include <linux/dma-mapping.h>
 #include <net/sock.h>
 #include <linux/mutex.h>
 #include <net/af_vsock.h>
@@ -59,8 +60,9 @@ struct virtio_vsock {
 	 */
 	struct mutex event_lock;
 	bool event_run;
+	__dma_from_device_group_begin();
 	struct virtio_vsock_event event_list[8];
-
+	__dma_from_device_group_end();
 	u32 guest_cid;
 	bool seqpacket_allow;
 
-- 
MST



* [PATCH v2 08/15] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (6 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-08 14:08   ` Stefano Garzarella
  2026-01-05  8:23 ` [PATCH v2 09/15] virtio_input: fix DMA alignment for evts Michael S. Tsirkin
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

The event_list array contains 8 small (4-byte) events that share
cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
this can trigger warnings about overlapping DMA mappings within
the same cacheline.

The previous patch isolated event_list in its own cachelines,
so the warnings are now spurious.

Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
write into these fields, suppressing the warnings.

Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index bb94baadfd8b..ef983c36cb66 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -392,7 +392,7 @@ static int virtio_vsock_event_fill_one(struct virtio_vsock *vsock,
 
 	sg_init_one(&sg, event, sizeof(*event));
 
-	return virtqueue_add_inbuf(vq, &sg, 1, event, GFP_KERNEL);
+	return virtqueue_add_inbuf_cache_clean(vq, &sg, 1, event, GFP_KERNEL);
 }
 
 /* event_lock must be held */
-- 
MST



* [PATCH v2 09/15] virtio_input: fix DMA alignment for evts
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (7 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 08/15] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  8:23 ` [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events Michael S. Tsirkin
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On non-cache-coherent platforms, when a structure contains a buffer
used for DMA alongside fields that the CPU writes to, cacheline sharing
can cause data corruption.

The evts array is used for DMA_FROM_DEVICE operations via
virtqueue_add_inbuf(). The adjacent lock and ready fields are written
by the CPU during normal operation. If these share cachelines with evts,
CPU writes can corrupt DMA data.

Add __dma_from_device_group_begin()/end() annotations to ensure evts is
isolated in its own cachelines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_input.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index d0728285b6ce..9f13de1f1d77 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -4,6 +4,7 @@
 #include <linux/virtio_config.h>
 #include <linux/input.h>
 #include <linux/slab.h>
+#include <linux/dma-mapping.h>
 
 #include <uapi/linux/virtio_ids.h>
 #include <uapi/linux/virtio_input.h>
@@ -16,7 +17,9 @@ struct virtio_input {
 	char                       serial[64];
 	char                       phys[64];
 	struct virtqueue           *evt, *sts;
+	__dma_from_device_group_begin();
 	struct virtio_input_event  evts[64];
+	__dma_from_device_group_end();
 	spinlock_t                 lock;
 	bool                       ready;
 };
-- 
MST



* [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (8 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 09/15] virtio_input: fix DMA alignment for evts Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05 18:19   ` Stefan Hajnoczi
  2026-01-05  8:23 ` [PATCH v2 11/15] virtio-rng: fix DMA alignment for data buffer Michael S. Tsirkin
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

The current struct virtio_scsi_event_node layout has two problems:

The event (DMA_FROM_DEVICE) and work (CPU-written via
INIT_WORK/queue_work) fields share a cacheline.
On non-cache-coherent platforms, CPU writes to work can
corrupt device-written event data.

If ARCH_DMA_MINALIGN is large enough, the 8 events in event_list share
cachelines, triggering CONFIG_DMA_API_DEBUG warnings.

Fix the corruption by moving event buffers to a separate array and
aligning using __dma_from_device_group_begin()/end().

Suppress the (now spurious) DMA debug warnings using
virtqueue_add_inbuf_cache_clean().
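
In short, each event node now keeps a pointer into one central,
DMA-aligned array, so only that array needs cacheline isolation
(a sketch summarizing the diff below):

	struct virtio_scsi_event_node {
		struct virtio_scsi *vscsi;
		struct virtio_scsi_event *event;	/* -> vscsi->events[i] */
		struct work_struct work;	/* CPU-written; now harmless */
	};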

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 96a69edddbe5..6ff53fc8adb0 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -29,6 +29,7 @@
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsi_devinfo.h>
 #include <linux/seqlock.h>
+#include <linux/dma-mapping.h>
 
 #include "sd.h"
 
@@ -61,7 +62,7 @@ struct virtio_scsi_cmd {
 
 struct virtio_scsi_event_node {
 	struct virtio_scsi *vscsi;
-	struct virtio_scsi_event event;
+	struct virtio_scsi_event *event;
 	struct work_struct work;
 };
 
@@ -89,6 +90,11 @@ struct virtio_scsi {
 
 	struct virtio_scsi_vq ctrl_vq;
 	struct virtio_scsi_vq event_vq;
+
+	__dma_from_device_group_begin();
+	struct virtio_scsi_event events[VIRTIO_SCSI_EVENT_LEN];
+	__dma_from_device_group_end();
+
 	struct virtio_scsi_vq req_vqs[];
 };
 
@@ -237,12 +243,12 @@ static int virtscsi_kick_event(struct virtio_scsi *vscsi,
 	unsigned long flags;
 
 	INIT_WORK(&event_node->work, virtscsi_handle_event);
-	sg_init_one(&sg, &event_node->event, sizeof(struct virtio_scsi_event));
+	sg_init_one(&sg, event_node->event, sizeof(struct virtio_scsi_event));
 
 	spin_lock_irqsave(&vscsi->event_vq.vq_lock, flags);
 
-	err = virtqueue_add_inbuf(vscsi->event_vq.vq, &sg, 1, event_node,
-				  GFP_ATOMIC);
+	err = virtqueue_add_inbuf_cache_clean(vscsi->event_vq.vq, &sg, 1, event_node,
+					      GFP_ATOMIC);
 	if (!err)
 		virtqueue_kick(vscsi->event_vq.vq);
 
@@ -257,6 +263,7 @@ static int virtscsi_kick_event_all(struct virtio_scsi *vscsi)
 
 	for (i = 0; i < VIRTIO_SCSI_EVENT_LEN; i++) {
 		vscsi->event_list[i].vscsi = vscsi;
+		vscsi->event_list[i].event = &vscsi->events[i];
 		virtscsi_kick_event(vscsi, &vscsi->event_list[i]);
 	}
 
@@ -380,7 +387,7 @@ static void virtscsi_handle_event(struct work_struct *work)
 	struct virtio_scsi_event_node *event_node =
 		container_of(work, struct virtio_scsi_event_node, work);
 	struct virtio_scsi *vscsi = event_node->vscsi;
-	struct virtio_scsi_event *event = &event_node->event;
+	struct virtio_scsi_event *event = event_node->event;
 
 	if (event->event &
 	    cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
-- 
MST



* [PATCH v2 11/15] virtio-rng: fix DMA alignment for data buffer
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (9 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  8:23 ` [PATCH v2 12/15] virtio_input: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

The data buffer in struct virtrng_info is used for DMA_FROM_DEVICE via
virtqueue_add_inbuf() and shares cachelines with the adjacent
CPU-written fields (data_avail, data_idx).

The device writing to the DMA buffer and the CPU writing to adjacent
fields could corrupt each other's data on non-cache-coherent platforms.

Add __dma_from_device_group_begin()/end() annotations to place these
in distinct cache lines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/char/hw_random/virtio-rng.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index dd998f4fe4f2..eb80a031c7be 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -11,6 +11,7 @@
 #include <linux/spinlock.h>
 #include <linux/virtio.h>
 #include <linux/virtio_rng.h>
+#include <linux/dma-mapping.h>
 #include <linux/module.h>
 #include <linux/slab.h>
 
@@ -28,11 +29,13 @@ struct virtrng_info {
 	unsigned int data_avail;
 	unsigned int data_idx;
 	/* minimal size returned by rng_buffer_size() */
+	__dma_from_device_group_begin();
 #if SMP_CACHE_BYTES < 32
 	u8 data[32];
 #else
 	u8 data[SMP_CACHE_BYTES];
 #endif
+	__dma_from_device_group_end();
 };
 
 static void random_recv_done(struct virtqueue *vq)
-- 
MST



* [PATCH v2 12/15] virtio_input: use virtqueue_add_inbuf_cache_clean for events
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (10 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 11/15] virtio-rng: fix DMA alignment for data buffer Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  8:23 ` [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding Michael S. Tsirkin
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

The evts array contains 64 small (8-byte) input events that share
cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
this can trigger warnings about overlapping DMA mappings within
the same cacheline.

The previous patch isolated the array in its own cachelines,
so the warnings are now spurious.

Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
write into these cache lines, suppressing these warnings.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index 9f13de1f1d77..74df16677da8 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -30,7 +30,7 @@ static void virtinput_queue_evtbuf(struct virtio_input *vi,
 	struct scatterlist sg[1];
 
 	sg_init_one(sg, evtbuf, sizeof(*evtbuf));
-	virtqueue_add_inbuf(vi->evt, sg, 1, evtbuf, GFP_ATOMIC);
+	virtqueue_add_inbuf_cache_clean(vi->evt, sg, 1, evtbuf, GFP_ATOMIC);
 }
 
 static void virtinput_recv_events(struct virtqueue *vq)
-- 
MST



* [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (11 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 12/15] virtio_input: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-08 14:11   ` Stefano Garzarella
  2026-01-05  8:23 ` [PATCH v2 14/15] gpio: virtio: fix DMA alignment Michael S. Tsirkin
  2026-01-05  8:23 ` [PATCH v2 15/15] gpio: virtio: reorder fields to reduce struct padding Michael S. Tsirkin
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
last. This eliminates the padding from aligning the struct size on
ARCH_DMA_MINALIGN.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index ef983c36cb66..964d25e11858 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -60,9 +60,7 @@ struct virtio_vsock {
 	 */
 	struct mutex event_lock;
 	bool event_run;
-	__dma_from_device_group_begin();
-	struct virtio_vsock_event event_list[8];
-	__dma_from_device_group_end();
+
 	u32 guest_cid;
 	bool seqpacket_allow;
 
@@ -76,6 +74,10 @@ struct virtio_vsock {
 	 */
 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
+
+	__dma_from_device_group_begin();
+	struct virtio_vsock_event event_list[8];
+	__dma_from_device_group_end();
 };
 
 static u32 virtio_transport_get_local_cid(void)
-- 
MST



* [PATCH v2 14/15] gpio: virtio: fix DMA alignment
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (12 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  9:48   ` Bartosz Golaszewski
  2026-01-05  8:23 ` [PATCH v2 15/15] gpio: virtio: reorder fields to reduce struct padding Michael S. Tsirkin
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev, Viresh Kumar,
	Enrico Weigelt, metux IT consult, Viresh Kumar, Linus Walleij,
	linux-gpio

The res and ires buffers in struct virtio_gpio_line and struct
vgpio_irq_line respectively are used for DMA_FROM_DEVICE via
virtqueue_add_sgs().  However, within these structs, even though these
elements are tagged as ____cacheline_aligned, adjacent struct elements
can share DMA cachelines on platforms where ARCH_DMA_MINALIGN >
L1_CACHE_BYTES (e.g., arm64 with 128-byte DMA alignment but 64-byte
cache lines).

The existing ____cacheline_aligned annotation aligns to L1_CACHE_BYTES,
which is not always sufficient for DMA alignment. For example, with
L1_CACHE_BYTES = 32 and ARCH_DMA_MINALIGN = 128:
  - irq_lines[0].ires is at offset 128
  - irq_lines[1].type is at offset 192
Both fall in the same 128-byte DMA cacheline [128, 256).

When the device writes to irq_lines[0].ires and the CPU concurrently
modifies one of irq_lines[1].type/disabled/masked/queued flags,
corruption can occur on non-cache-coherent platforms.

Fix by using __dma_from_device_group_begin()/end() annotations on the
DMA buffers. Drop the ____cacheline_aligned annotations - they are not
required to isolate request and response, and keeping them would
increase the memory cost.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/gpio/gpio-virtio.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-virtio.c b/drivers/gpio/gpio-virtio.c
index 17e040991e46..b70294626770 100644
--- a/drivers/gpio/gpio-virtio.c
+++ b/drivers/gpio/gpio-virtio.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/completion.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/gpio/driver.h>
 #include <linux/io.h>
@@ -24,8 +25,11 @@
 struct virtio_gpio_line {
 	struct mutex lock; /* Protects line operation */
 	struct completion completion;
-	struct virtio_gpio_request req ____cacheline_aligned;
-	struct virtio_gpio_response res ____cacheline_aligned;
+
+	__dma_from_device_group_begin();
+	struct virtio_gpio_request req;
+	struct virtio_gpio_response res;
+	__dma_from_device_group_end();
 	unsigned int rxlen;
 };
 
@@ -37,8 +41,10 @@ struct vgpio_irq_line {
 	bool update_pending;
 	bool queue_pending;
 
-	struct virtio_gpio_irq_request ireq ____cacheline_aligned;
-	struct virtio_gpio_irq_response ires ____cacheline_aligned;
+	__dma_from_device_group_begin();
+	struct virtio_gpio_irq_request ireq;
+	struct virtio_gpio_irq_response ires;
+	__dma_from_device_group_end();
 };
 
 struct virtio_gpio {
-- 
MST



* [PATCH v2 15/15] gpio: virtio: reorder fields to reduce struct padding
  2026-01-05  8:22 [PATCH v2 00/15] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (13 preceding siblings ...)
  2026-01-05  8:23 ` [PATCH v2 14/15] gpio: virtio: fix DMA alignment Michael S. Tsirkin
@ 2026-01-05  8:23 ` Michael S. Tsirkin
  2026-01-05  9:49   ` Bartosz Golaszewski
  14 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05  8:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev, Viresh Kumar,
	Enrico Weigelt, metux IT consult, Viresh Kumar, Linus Walleij,
	linux-gpio

Reorder struct virtio_gpio_line fields to place the DMA buffers
(req/res) last.

This eliminates the padding from aligning struct size on
ARCH_DMA_MINALIGN.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/gpio/gpio-virtio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-virtio.c b/drivers/gpio/gpio-virtio.c
index b70294626770..ed6e0e90fa8a 100644
--- a/drivers/gpio/gpio-virtio.c
+++ b/drivers/gpio/gpio-virtio.c
@@ -26,11 +26,12 @@ struct virtio_gpio_line {
 	struct mutex lock; /* Protects line operation */
 	struct completion completion;
 
+	unsigned int rxlen;
+
 	__dma_from_device_group_begin();
 	struct virtio_gpio_request req;
 	struct virtio_gpio_response res;
 	__dma_from_device_group_end();
-	unsigned int rxlen;
 };
 
 struct vgpio_irq_line {
-- 
MST



* Re: [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end()
  2026-01-05  8:22 ` [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end() Michael S. Tsirkin
@ 2026-01-05  9:40   ` Petr Tesarik
  2026-01-05 18:27   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2026-01-05  9:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, 5 Jan 2026 03:22:54 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> When a structure contains a buffer that DMA writes to alongside fields
> that the CPU writes to, cache line sharing between the DMA buffer and
> CPU-written fields can cause data corruption on non-cache-coherent
> platforms.
> 
> Add __dma_from_device_group_begin()/end() annotations to ensure proper
> alignment to prevent this:
> 
> struct my_device {
> 	spinlock_t lock1;
> 	__dma_from_device_group_begin();
> 	char dma_buffer1[16];
> 	char dma_buffer2[16];
> 	__dma_from_device_group_end();
> 	spinlock_t lock2;
> };
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

LGTM. I'm not formally a reviewer, but FWIW:

Reviewed-by: Petr Tesarik <ptesarik@suse.com>

> ---
>  include/linux/dma-mapping.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index aa36a0d1d9df..29ad2ce700f0 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -7,6 +7,7 @@
>  #include <linux/dma-direction.h>
>  #include <linux/scatterlist.h>
>  #include <linux/bug.h>
> +#include <linux/cache.h>
>  
>  /**
>   * List of possible attributes associated with a DMA mapping. The semantics
> @@ -703,6 +704,18 @@ static inline int dma_get_cache_alignment(void)
>  }
>  #endif
>  
> +#ifdef ARCH_HAS_DMA_MINALIGN
> +#define ____dma_from_device_aligned __aligned(ARCH_DMA_MINALIGN)
> +#else
> +#define ____dma_from_device_aligned
> +#endif
> +/* Mark start of DMA buffer */
> +#define __dma_from_device_group_begin(GROUP)			\
> +	__cacheline_group_begin(GROUP) ____dma_from_device_aligned
> +/* Mark end of DMA buffer */
> +#define __dma_from_device_group_end(GROUP)			\
> +	__cacheline_group_end(GROUP) ____dma_from_device_aligned
> +
>  static inline void *dmam_alloc_coherent(struct device *dev, size_t size,
>  		dma_addr_t *dma_handle, gfp_t gfp)
>  {


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end()
  2026-01-05  8:22 ` [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end() Michael S. Tsirkin
@ 2026-01-05  9:48   ` Petr Tesarik
  2026-01-05 18:28   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2026-01-05  9:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, 5 Jan 2026 03:22:57 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> Document the __dma_from_device_group_begin()/end() annotations.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

I really like your wording ("CPU does not write"), which rightly refers
to what happens on the bus rather than what may or may not make a
specific CPU architecture initiate a bus write.

I'm not formally a reviewer, but FWIW:

Reviewed-by: Petr Tesarik <ptesarik@suse.com>

> ---
>  Documentation/core-api/dma-api-howto.rst | 52 ++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/Documentation/core-api/dma-api-howto.rst b/Documentation/core-api/dma-api-howto.rst
> index 96fce2a9aa90..e97743ab0f26 100644
> --- a/Documentation/core-api/dma-api-howto.rst
> +++ b/Documentation/core-api/dma-api-howto.rst
> @@ -146,6 +146,58 @@ What about block I/O and networking buffers?  The block I/O and
>  networking subsystems make sure that the buffers they use are valid
>  for you to DMA from/to.
>  
> +__dma_from_device_group_begin/end annotations
> +=============================================
> +
> +As explained previously, when a structure contains a DMA_FROM_DEVICE /
> +DMA_BIDIRECTIONAL buffer (device writes to memory) alongside fields that the
> +CPU writes to, cache line sharing between the DMA buffer and CPU-written fields
> +can cause data corruption on CPUs with DMA-incoherent caches.
> +
> +The ``__dma_from_device_group_begin(GROUP)/__dma_from_device_group_end(GROUP)``
> +macros ensure proper alignment to prevent this::
> +
> +	struct my_device {
> +		spinlock_t lock1;
> +		__dma_from_device_group_begin();
> +		char dma_buffer1[16];
> +		char dma_buffer2[16];
> +		__dma_from_device_group_end();
> +		spinlock_t lock2;
> +	};
> +
> +To isolate a DMA buffer from adjacent fields, use
> +``__dma_from_device_group_begin(GROUP)`` before the first DMA buffer
> +field and ``__dma_from_device_group_end(GROUP)`` after the last DMA
> +buffer field (with the same GROUP name). This protects both the head
> +and tail of the buffer from cache line sharing.
> +
> +The GROUP parameter is an optional identifier that names the DMA buffer group
> +(in case you have several in the same structure)::
> +
> +	struct my_device {
> +		spinlock_t lock1;
> +		__dma_from_device_group_begin(buffer1);
> +		char dma_buffer1[16];
> +		__dma_from_device_group_end(buffer1);
> +		spinlock_t lock2;
> +		__dma_from_device_group_begin(buffer2);
> +		char dma_buffer2[16];
> +		__dma_from_device_group_end(buffer2);
> +	};
> +
> +On cache-coherent platforms these macros expand to zero-length array markers.
> +On non-coherent platforms, they also ensure the minimal DMA alignment, which
> +can be as large as 128 bytes.
> +
> +.. note::
> +
> +        It is allowed (though somewhat fragile) to include extra fields, not
> +        intended for DMA from the device, within the group (in order to pack the
> +        structure tightly) - but only as long as the CPU does not write these
> +        fields while any fields in the group are mapped for DMA_FROM_DEVICE or
> +        DMA_BIDIRECTIONAL.
> +
>  DMA addressing capabilities
>  ===========================
>  


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 14/15] gpio: virtio: fix DMA alignment
  2026-01-05  8:23 ` [PATCH v2 14/15] gpio: virtio: fix DMA alignment Michael S. Tsirkin
@ 2026-01-05  9:48   ` Bartosz Golaszewski
  0 siblings, 0 replies; 42+ messages in thread
From: Bartosz Golaszewski @ 2026-01-05  9:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev, Viresh Kumar,
	Enrico Weigelt, metux IT consult, Viresh Kumar, Linus Walleij,
	linux-gpio, linux-kernel

On Mon, 5 Jan 2026 09:23:45 +0100, "Michael S. Tsirkin" <mst@redhat.com> said:
> The res and ires buffers in struct virtio_gpio_line and struct
> vgpio_irq_line respectively are used for DMA_FROM_DEVICE via
> virtqueue_add_sgs().  However, within these structs, even though these
> elements are tagged as ____cacheline_aligned, adjacent struct elements
> can share DMA cachelines on platforms where ARCH_DMA_MINALIGN >
> L1_CACHE_BYTES (e.g., arm64 with 128-byte DMA alignment but 64-byte
> cache lines).
>
> The existing ____cacheline_aligned annotation aligns to L1_CACHE_BYTES
> which is not always sufficient for DMA alignment. For example, with
> L1_CACHE_BYTES = 32 and ARCH_DMA_MINALIGN = 128
>   - irq_lines[0].ires at offset 128
>   - irq_lines[1].type at offset 192
> both in same 128-byte DMA cacheline [128-256)
>
> When the device writes to irq_lines[0].ires and the CPU concurrently
> modifies one of irq_lines[1].type/disabled/masked/queued flags,
> corruption can occur on non-cache-coherent platforms.
>
> Fix by using __dma_from_device_group_begin()/end() annotations on the
> DMA buffers. Drop ____cacheline_aligned - it's not required to isolate
> request and response, and keeping them would increase the memory cost.
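> 
> For illustration, the resulting shape is roughly the following (a
> sketch, not the full diff; the field list is abridged and the request
> buffer name is assumed):
> 
> 	struct vgpio_irq_line {
> 		u8 type;
> 		bool disabled;
> 		bool masked;
> 		bool queued;
> 
> 		__dma_from_device_group_begin();
> 		struct virtio_gpio_irq_request ireq; /* name assumed */
> 		struct virtio_gpio_irq_response ires;
> 		__dma_from_device_group_end();
> 	};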
>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---

Acked-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 15/15] gpio: virtio: reorder fields to reduce struct padding
  2026-01-05  8:23 ` [PATCH v2 15/15] gpio: virtio: reorder fields to reduce struct padding Michael S. Tsirkin
@ 2026-01-05  9:49   ` Bartosz Golaszewski
  0 siblings, 0 replies; 42+ messages in thread
From: Bartosz Golaszewski @ 2026-01-05  9:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev, Viresh Kumar,
	Enrico Weigelt, metux IT consult, Viresh Kumar, Linus Walleij,
	linux-gpio, linux-kernel

On Mon, 5 Jan 2026 09:23:49 +0100, "Michael S. Tsirkin" <mst@redhat.com> said:
> Reorder struct virtio_gpio_line fields to place the DMA buffers
> (req/res) last.
>
> This eliminates the padding otherwise needed to align the struct size
> to ARCH_DMA_MINALIGN.
>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/gpio/gpio-virtio.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-virtio.c b/drivers/gpio/gpio-virtio.c
> index b70294626770..ed6e0e90fa8a 100644
> --- a/drivers/gpio/gpio-virtio.c
> +++ b/drivers/gpio/gpio-virtio.c
> @@ -26,11 +26,12 @@ struct virtio_gpio_line {
>  	struct mutex lock; /* Protects line operation */
>  	struct completion completion;
>
> +	unsigned int rxlen;
> +
>  	__dma_from_device_group_begin();
>  	struct virtio_gpio_request req;
>  	struct virtio_gpio_response res;
>  	__dma_from_device_group_end();
> -	unsigned int rxlen;
>  };
>
>  struct vgpio_irq_line {
> --
> MST
>
>

Acked-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
  2026-01-05  8:23 ` [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
@ 2026-01-05  9:50   ` Petr Tesarik
  2026-01-08 13:57   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2026-01-05  9:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, 5 Jan 2026 03:23:01 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> When multiple small DMA_FROM_DEVICE or DMA_BIDIRECTIONAL buffers share a
> cacheline, and DMA_API_DEBUG is enabled, we get this warning:
> 	cacheline tracking EEXIST, overlapping mappings aren't supported.
> 
> This is because when one of the mappings is removed, while another one
> is active, CPU might write into the buffer.
> 
> Add an attribute for the driver to promise not to do this, making the
> overlapping safe, and suppressing the warning.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

LGTM. I'm not formally a reviewer, but FWIW:

Reviewed-by: Petr Tesarik <ptesarik@suse.com>

> ---
>  include/linux/dma-mapping.h | 7 +++++++
>  kernel/dma/debug.c          | 3 ++-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 29ad2ce700f0..29973baa0581 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -79,6 +79,13 @@
>   */
>  #define DMA_ATTR_MMIO		(1UL << 10)
>  
> +/*
> + * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline
> + * overlapping this buffer while it is mapped for DMA. All mappings sharing
> + * a cacheline must have this attribute for this to be considered safe.
> + */
> +#define DMA_ATTR_CPU_CACHE_CLEAN	(1UL << 11)
> +
>  /*
>   * A dma_addr_t can hold any valid DMA or bus address for the platform.  It can
>   * be given to a device to use as a DMA source or target.  It is specific to a
> diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
> index 138ede653de4..7e66d863d573 100644
> --- a/kernel/dma/debug.c
> +++ b/kernel/dma/debug.c
> @@ -595,7 +595,8 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
>  	if (rc == -ENOMEM) {
>  		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
>  		global_disable = true;
> -	} else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> +	} else if (rc == -EEXIST &&
> +		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
>  		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
>  		     is_swiotlb_active(entry->dev))) {
>  		err_printk(entry->dev, entry,


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
  2026-01-05  8:23 ` [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
@ 2026-01-05  9:51   ` Petr Tesarik
  2026-01-08 13:59   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2026-01-05  9:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, 5 Jan 2026 03:23:05 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> Document DMA_ATTR_CPU_CACHE_CLEAN as implemented in the
> previous patch.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

LGTM. I'm not formally a reviewer, but FWIW:

Reviewed-by: Petr Tesarik <ptesarik@suse.com>

> ---
>  Documentation/core-api/dma-attributes.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
> index 0bdc2be65e57..1d7bfad73b1c 100644
> --- a/Documentation/core-api/dma-attributes.rst
> +++ b/Documentation/core-api/dma-attributes.rst
> @@ -148,3 +148,12 @@ DMA_ATTR_MMIO is appropriate.
>  For architectures that require cache flushing for DMA coherence
>  DMA_ATTR_MMIO will not perform any cache flushing. The address
>  provided must never be mapped cacheable into the CPU.
> +
> +DMA_ATTR_CPU_CACHE_CLEAN
> +------------------------
> +
> +This attribute indicates the CPU will not dirty any cacheline overlapping this
> +DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> +multiple small buffers to safely share a cacheline without risk of data
> +corruption, suppressing DMA debug warnings about overlapping mappings.
> +All mappings sharing a cacheline should have this attribute.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/15] dma-debug: track cache clean flag in entries
  2026-01-05  8:23 ` [PATCH v2 05/15] dma-debug: track cache clean flag in entries Michael S. Tsirkin
@ 2026-01-05  9:54   ` Petr Tesarik
  2026-01-05 12:37     ` Michael S. Tsirkin
  0 siblings, 1 reply; 42+ messages in thread
From: Petr Tesarik @ 2026-01-05  9:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, 5 Jan 2026 03:23:10 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> If a driver is buggy and has 2 overlapping mappings but only
> sets cache clean flag on the 1st one of them, we warn.
> But if it only does it for the 2nd one, we don't.
> 
> Fix by tracking cache clean flag in the entry.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  kernel/dma/debug.c | 27 ++++++++++++++++++++++-----
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
> index 7e66d863d573..43d6a996d7a7 100644
> --- a/kernel/dma/debug.c
> +++ b/kernel/dma/debug.c
> @@ -63,6 +63,7 @@ enum map_err_types {
>   * @sg_mapped_ents: 'mapped_ents' from dma_map_sg
>   * @paddr: physical start address of the mapping
>   * @map_err_type: track whether dma_mapping_error() was checked
> + * @is_cache_clean: driver promises not to write to buffer while mapped
>   * @stack_len: number of backtrace entries in @stack_entries
>   * @stack_entries: stack of backtrace history
>   */
> @@ -76,7 +77,8 @@ struct dma_debug_entry {
>  	int		 sg_call_ents;
>  	int		 sg_mapped_ents;
>  	phys_addr_t	 paddr;
> -	enum map_err_types  map_err_type;
> +	enum map_err_types map_err_type;

*nitpick* unnecessary change in white space (breaks git-blame).

Other than that, LGTM. I'm not formally a reviewer, but FWIW:

Reviewed-by: Petr Tesarik <ptesarik@suse.com>

Petr T

> +	bool		 is_cache_clean;
>  #ifdef CONFIG_STACKTRACE
>  	unsigned int	stack_len;
>  	unsigned long	stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
> @@ -472,12 +474,15 @@ static int active_cacheline_dec_overlap(phys_addr_t cln)
>  	return active_cacheline_set_overlap(cln, --overlap);
>  }
>  
> -static int active_cacheline_insert(struct dma_debug_entry *entry)
> +static int active_cacheline_insert(struct dma_debug_entry *entry,
> +				   bool *overlap_cache_clean)
>  {
>  	phys_addr_t cln = to_cacheline_number(entry);
>  	unsigned long flags;
>  	int rc;
>  
> +	*overlap_cache_clean = false;
> +
>  	/* If the device is not writing memory then we don't have any
>  	 * concerns about the cpu consuming stale data.  This mitigates
>  	 * legitimate usages of overlapping mappings.
> @@ -487,8 +492,16 @@ static int active_cacheline_insert(struct dma_debug_entry *entry)
>  
>  	spin_lock_irqsave(&radix_lock, flags);
>  	rc = radix_tree_insert(&dma_active_cacheline, cln, entry);
> -	if (rc == -EEXIST)
> +	if (rc == -EEXIST) {
> +		struct dma_debug_entry *existing;
> +
>  		active_cacheline_inc_overlap(cln);
> +		existing = radix_tree_lookup(&dma_active_cacheline, cln);
> +		/* A lookup failure here after we got -EEXIST is unexpected. */
> +		WARN_ON(!existing);
> +		if (existing)
> +			*overlap_cache_clean = existing->is_cache_clean;
> +	}
>  	spin_unlock_irqrestore(&radix_lock, flags);
>  
>  	return rc;
> @@ -583,20 +596,24 @@ DEFINE_SHOW_ATTRIBUTE(dump);
>   */
>  static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
>  {
> +	bool overlap_cache_clean;
>  	struct hash_bucket *bucket;
>  	unsigned long flags;
>  	int rc;
>  
> +	entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
> +
>  	bucket = get_hash_bucket(entry, &flags);
>  	hash_bucket_add(bucket, entry);
>  	put_hash_bucket(bucket, flags);
>  
> -	rc = active_cacheline_insert(entry);
> +	rc = active_cacheline_insert(entry, &overlap_cache_clean);
>  	if (rc == -ENOMEM) {
>  		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
>  		global_disable = true;
>  	} else if (rc == -EEXIST &&
> -		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
> +		   !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> +		   !(entry->is_cache_clean && overlap_cache_clean) &&
>  		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
>  		     is_swiotlb_active(entry->dev))) {
>  		err_printk(entry->dev, entry,


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/15] dma-debug: track cache clean flag in entries
  2026-01-05  9:54   ` Petr Tesarik
@ 2026-01-05 12:37     ` Michael S. Tsirkin
  2026-01-05 13:40       ` Petr Tesarik
  0 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-05 12:37 UTC (permalink / raw)
  To: Petr Tesarik
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, Jan 05, 2026 at 10:54:33AM +0100, Petr Tesarik wrote:
> On Mon, 5 Jan 2026 03:23:10 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > If a driver is buggy and has 2 overlapping mappings but only
> > sets cache clean flag on the 1st one of them, we warn.
> > But if it only does it for the 2nd one, we don't.
> > 
> > Fix by tracking cache clean flag in the entry.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  kernel/dma/debug.c | 27 ++++++++++++++++++++++-----
> >  1 file changed, 22 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
> > index 7e66d863d573..43d6a996d7a7 100644
> > --- a/kernel/dma/debug.c
> > +++ b/kernel/dma/debug.c
> > @@ -63,6 +63,7 @@ enum map_err_types {
> >   * @sg_mapped_ents: 'mapped_ents' from dma_map_sg
> >   * @paddr: physical start address of the mapping
> >   * @map_err_type: track whether dma_mapping_error() was checked
> > + * @is_cache_clean: driver promises not to write to buffer while mapped
> >   * @stack_len: number of backtrace entries in @stack_entries
> >   * @stack_entries: stack of backtrace history
> >   */
> > @@ -76,7 +77,8 @@ struct dma_debug_entry {
> >  	int		 sg_call_ents;
> >  	int		 sg_mapped_ents;
> >  	phys_addr_t	 paddr;
> > -	enum map_err_types  map_err_type;
> > +	enum map_err_types map_err_type;
> 
> *nitpick* unnecessary change in white space (breaks git-blame).
> 
> Other than that, LGTM. I'm not formally a reviewer, but FWIW:
> 
> Reviewed-by: Petr Tesarik <ptesarik@suse.com>
> 
> Petr T


I mean, yes it's not really required here, but the padding we had before
was broken (two spaces not aligning to anything).

> > +	bool		 is_cache_clean;
> >  #ifdef CONFIG_STACKTRACE
> >  	unsigned int	stack_len;
> >  	unsigned long	stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
> > @@ -472,12 +474,15 @@ static int active_cacheline_dec_overlap(phys_addr_t cln)
> >  	return active_cacheline_set_overlap(cln, --overlap);
> >  }
> >  
> > -static int active_cacheline_insert(struct dma_debug_entry *entry)
> > +static int active_cacheline_insert(struct dma_debug_entry *entry,
> > +				   bool *overlap_cache_clean)
> >  {
> >  	phys_addr_t cln = to_cacheline_number(entry);
> >  	unsigned long flags;
> >  	int rc;
> >  
> > +	*overlap_cache_clean = false;
> > +
> >  	/* If the device is not writing memory then we don't have any
> >  	 * concerns about the cpu consuming stale data.  This mitigates
> >  	 * legitimate usages of overlapping mappings.
> > @@ -487,8 +492,16 @@ static int active_cacheline_insert(struct dma_debug_entry *entry)
> >  
> >  	spin_lock_irqsave(&radix_lock, flags);
> >  	rc = radix_tree_insert(&dma_active_cacheline, cln, entry);
> > -	if (rc == -EEXIST)
> > +	if (rc == -EEXIST) {
> > +		struct dma_debug_entry *existing;
> > +
> >  		active_cacheline_inc_overlap(cln);
> > +		existing = radix_tree_lookup(&dma_active_cacheline, cln);
> > +		/* A lookup failure here after we got -EEXIST is unexpected. */
> > +		WARN_ON(!existing);
> > +		if (existing)
> > +			*overlap_cache_clean = existing->is_cache_clean;
> > +	}
> >  	spin_unlock_irqrestore(&radix_lock, flags);
> >  
> >  	return rc;
> > @@ -583,20 +596,24 @@ DEFINE_SHOW_ATTRIBUTE(dump);
> >   */
> >  static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
> >  {
> > +	bool overlap_cache_clean;
> >  	struct hash_bucket *bucket;
> >  	unsigned long flags;
> >  	int rc;
> >  
> > +	entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
> > +
> >  	bucket = get_hash_bucket(entry, &flags);
> >  	hash_bucket_add(bucket, entry);
> >  	put_hash_bucket(bucket, flags);
> >  
> > -	rc = active_cacheline_insert(entry);
> > +	rc = active_cacheline_insert(entry, &overlap_cache_clean);
> >  	if (rc == -ENOMEM) {
> >  		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
> >  		global_disable = true;
> >  	} else if (rc == -EEXIST &&
> > -		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
> > +		   !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> > +		   !(entry->is_cache_clean && overlap_cache_clean) &&
> >  		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
> >  		     is_swiotlb_active(entry->dev))) {
> >  		err_printk(entry->dev, entry,


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/15] dma-debug: track cache clean flag in entries
  2026-01-05 12:37     ` Michael S. Tsirkin
@ 2026-01-05 13:40       ` Petr Tesarik
  0 siblings, 0 replies; 42+ messages in thread
From: Petr Tesarik @ 2026-01-05 13:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	Stefano Garzarella, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, 5 Jan 2026 07:37:31 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Jan 05, 2026 at 10:54:33AM +0100, Petr Tesarik wrote:
> > On Mon, 5 Jan 2026 03:23:10 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > If a driver is buggy and has 2 overlapping mappings but only
> > > sets cache clean flag on the 1st one of them, we warn.
> > > But if it only does it for the 2nd one, we don't.
> > > 
> > > Fix by tracking cache clean flag in the entry.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > >  kernel/dma/debug.c | 27 ++++++++++++++++++++++-----
> > >  1 file changed, 22 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
> > > index 7e66d863d573..43d6a996d7a7 100644
> > > --- a/kernel/dma/debug.c
> > > +++ b/kernel/dma/debug.c
> > > @@ -63,6 +63,7 @@ enum map_err_types {
> > >   * @sg_mapped_ents: 'mapped_ents' from dma_map_sg
> > >   * @paddr: physical start address of the mapping
> > >   * @map_err_type: track whether dma_mapping_error() was checked
> > > + * @is_cache_clean: driver promises not to write to buffer while mapped
> > >   * @stack_len: number of backtrace entries in @stack_entries
> > >   * @stack_entries: stack of backtrace history
> > >   */
> > > @@ -76,7 +77,8 @@ struct dma_debug_entry {
> > >  	int		 sg_call_ents;
> > >  	int		 sg_mapped_ents;
> > >  	phys_addr_t	 paddr;
> > > -	enum map_err_types  map_err_type;
> > > +	enum map_err_types map_err_type;  
> > 
> > *nitpick* unnecessary change in white space (breaks git-blame).
> > 
> > Other than that, LGTM. I'm not formally a reviewer, but FWIW:
> > 
> > Reviewed-by: Petr Tesarik <ptesarik@suse.com>
> > 
> > Petr T  
> 
> 
> I mean, yes it's not really required here, but the padding we had before
> was broken (two spaces not aligning to anything).

Oh, you're right! Yes, then let's fix it now, because you touch the
neighbouring line.

Sorry for the noise.

Petr T

> > > +	bool		 is_cache_clean;
> > >  #ifdef CONFIG_STACKTRACE
> > >  	unsigned int	stack_len;
> > >  	unsigned long	stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
> > > @@ -472,12 +474,15 @@ static int active_cacheline_dec_overlap(phys_addr_t cln)
> > >  	return active_cacheline_set_overlap(cln, --overlap);
> > >  }
> > >  
> > > -static int active_cacheline_insert(struct dma_debug_entry *entry)
> > > +static int active_cacheline_insert(struct dma_debug_entry *entry,
> > > +				   bool *overlap_cache_clean)
> > >  {
> > >  	phys_addr_t cln = to_cacheline_number(entry);
> > >  	unsigned long flags;
> > >  	int rc;
> > >  
> > > +	*overlap_cache_clean = false;
> > > +
> > >  	/* If the device is not writing memory then we don't have any
> > >  	 * concerns about the cpu consuming stale data.  This mitigates
> > >  	 * legitimate usages of overlapping mappings.
> > > @@ -487,8 +492,16 @@ static int active_cacheline_insert(struct dma_debug_entry *entry)
> > >  
> > >  	spin_lock_irqsave(&radix_lock, flags);
> > >  	rc = radix_tree_insert(&dma_active_cacheline, cln, entry);
> > > -	if (rc == -EEXIST)
> > > +	if (rc == -EEXIST) {
> > > +		struct dma_debug_entry *existing;
> > > +
> > >  		active_cacheline_inc_overlap(cln);
> > > +		existing = radix_tree_lookup(&dma_active_cacheline, cln);
> > > +		/* A lookup failure here after we got -EEXIST is unexpected. */
> > > +		WARN_ON(!existing);
> > > +		if (existing)
> > > +			*overlap_cache_clean = existing->is_cache_clean;
> > > +	}
> > >  	spin_unlock_irqrestore(&radix_lock, flags);
> > >  
> > >  	return rc;
> > > @@ -583,20 +596,24 @@ DEFINE_SHOW_ATTRIBUTE(dump);
> > >   */
> > >  static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
> > >  {
> > > +	bool overlap_cache_clean;
> > >  	struct hash_bucket *bucket;
> > >  	unsigned long flags;
> > >  	int rc;
> > >  
> > > +	entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
> > > +
> > >  	bucket = get_hash_bucket(entry, &flags);
> > >  	hash_bucket_add(bucket, entry);
> > >  	put_hash_bucket(bucket, flags);
> > >  
> > > -	rc = active_cacheline_insert(entry);
> > > +	rc = active_cacheline_insert(entry, &overlap_cache_clean);
> > >  	if (rc == -ENOMEM) {
> > >  		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
> > >  		global_disable = true;
> > >  	} else if (rc == -EEXIST &&
> > > -		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
> > > +		   !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> > > +		   !(entry->is_cache_clean && overlap_cache_clean) &&
> > >  		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
> > >  		     is_swiotlb_active(entry->dev))) {
> > >  		err_printk(entry->dev, entry,  
> 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events
  2026-01-05  8:23 ` [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events Michael S. Tsirkin
@ 2026-01-05 18:19   ` Stefan Hajnoczi
  2026-01-06 14:50     ` Michael S. Tsirkin
  2026-01-06 14:51     ` Michael S. Tsirkin
  0 siblings, 2 replies; 42+ messages in thread
From: Stefan Hajnoczi @ 2026-01-05 18:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

[-- Attachment #1: Type: text/plain, Size: 1072 bytes --]

On Mon, Jan 05, 2026 at 03:23:29AM -0500, Michael S. Tsirkin wrote:
> @@ -61,7 +62,7 @@ struct virtio_scsi_cmd {
>  
>  struct virtio_scsi_event_node {
>  	struct virtio_scsi *vscsi;
> -	struct virtio_scsi_event event;
> +	struct virtio_scsi_event *event;
>  	struct work_struct work;
>  };
>  
> @@ -89,6 +90,11 @@ struct virtio_scsi {
>  
>  	struct virtio_scsi_vq ctrl_vq;
>  	struct virtio_scsi_vq event_vq;
> +
> +	__dma_from_device_group_begin();
> +	struct virtio_scsi_event events[VIRTIO_SCSI_EVENT_LEN];
> +	__dma_from_device_group_end();

If the device emits two events in rapid succession, could the CPU see
stale data for the second event because it already holds the cache line
for reading the first event?

In other words, it's not obvious to me that the DMA warnings are indeed
spurious and should be silenced here.

It seems safer and simpler to align and pad the struct virtio_scsi_event
field in struct virtio_scsi_event_node rather than packing these structs
into a single array where they might share cache lines.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end()
  2026-01-05  8:22 ` [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end() Michael S. Tsirkin
  2026-01-05  9:40   ` Petr Tesarik
@ 2026-01-05 18:27   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Marek Szyprowski @ 2026-01-05 18:27 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Robin Murphy, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On 05.01.2026 09:22, Michael S. Tsirkin wrote:
> When a structure contains a buffer that DMA writes to alongside fields
> that the CPU writes to, cache line sharing between the DMA buffer and
> CPU-written fields can cause data corruption on non-cache-coherent
> platforms.
>
> Add __dma_from_device_group_begin()/end() annotations to ensure proper
> alignment to prevent this:
>
> struct my_device {
> 	spinlock_t lock1;
> 	__dma_from_device_group_begin();
> 	char dma_buffer1[16];
> 	char dma_buffer2[16];
> 	__dma_from_device_group_end();
> 	spinlock_t lock2;
> };
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   include/linux/dma-mapping.h | 13 +++++++++++++
>   1 file changed, 13 insertions(+)

Right, this has been one of the long-standing issues: how to make DMA
to buffers embedded into structures safe. This solution looks really
nice.

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>


> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index aa36a0d1d9df..29ad2ce700f0 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -7,6 +7,7 @@
>   #include <linux/dma-direction.h>
>   #include <linux/scatterlist.h>
>   #include <linux/bug.h>
> +#include <linux/cache.h>
>   
>   /**
>    * List of possible attributes associated with a DMA mapping. The semantics
> @@ -703,6 +704,18 @@ static inline int dma_get_cache_alignment(void)
>   }
>   #endif
>   
> +#ifdef ARCH_HAS_DMA_MINALIGN
> +#define ____dma_from_device_aligned __aligned(ARCH_DMA_MINALIGN)
> +#else
> +#define ____dma_from_device_aligned
> +#endif
> +/* Mark start of DMA buffer */
> +#define __dma_from_device_group_begin(GROUP)			\
> +	__cacheline_group_begin(GROUP) ____dma_from_device_aligned
> +/* Mark end of DMA buffer */
> +#define __dma_from_device_group_end(GROUP)			\
> +	__cacheline_group_end(GROUP) ____dma_from_device_aligned
> +
>   static inline void *dmam_alloc_coherent(struct device *dev, size_t size,
>   		dma_addr_t *dma_handle, gfp_t gfp)
>   {

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end()
  2026-01-05  8:22 ` [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end() Michael S. Tsirkin
  2026-01-05  9:48   ` Petr Tesarik
@ 2026-01-05 18:28   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Marek Szyprowski @ 2026-01-05 18:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Robin Murphy, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On 05.01.2026 09:22, Michael S. Tsirkin wrote:
> Document the __dma_from_device_group_begin()/end() annotations.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   Documentation/core-api/dma-api-howto.rst | 52 ++++++++++++++++++++++++
>   1 file changed, 52 insertions(+)

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>

> diff --git a/Documentation/core-api/dma-api-howto.rst b/Documentation/core-api/dma-api-howto.rst
> index 96fce2a9aa90..e97743ab0f26 100644
> --- a/Documentation/core-api/dma-api-howto.rst
> +++ b/Documentation/core-api/dma-api-howto.rst
> @@ -146,6 +146,58 @@ What about block I/O and networking buffers?  The block I/O and
>   networking subsystems make sure that the buffers they use are valid
>   for you to DMA from/to.
>   
> +__dma_from_device_group_begin/end annotations
> +=============================================
> +
> +As explained previously, when a structure contains a DMA_FROM_DEVICE /
> +DMA_BIDIRECTIONAL buffer (device writes to memory) alongside fields that the
> +CPU writes to, cache line sharing between the DMA buffer and CPU-written fields
> +can cause data corruption on CPUs with DMA-incoherent caches.
> +
> +The ``__dma_from_device_group_begin(GROUP)/__dma_from_device_group_end(GROUP)``
> +macros ensure proper alignment to prevent this::
> +
> +	struct my_device {
> +		spinlock_t lock1;
> +		__dma_from_device_group_begin();
> +		char dma_buffer1[16];
> +		char dma_buffer2[16];
> +		__dma_from_device_group_end();
> +		spinlock_t lock2;
> +	};
> +
> +To isolate a DMA buffer from adjacent fields, use
> +``__dma_from_device_group_begin(GROUP)`` before the first DMA buffer
> +field and ``__dma_from_device_group_end(GROUP)`` after the last DMA
> +buffer field (with the same GROUP name). This protects both the head
> +and tail of the buffer from cache line sharing.
> +
> +The GROUP parameter is an optional identifier that names the DMA buffer group
> +(in case you have several in the same structure)::
> +
> +	struct my_device {
> +		spinlock_t lock1;
> +		__dma_from_device_group_begin(buffer1);
> +		char dma_buffer1[16];
> +		__dma_from_device_group_end(buffer1);
> +		spinlock_t lock2;
> +		__dma_from_device_group_begin(buffer2);
> +		char dma_buffer2[16];
> +		__dma_from_device_group_end(buffer2);
> +	};
> +
> +On cache-coherent platforms these macros expand to zero-length array markers.
> +On non-coherent platforms, they also ensure the minimal DMA alignment, which
> +can be as large as 128 bytes.
> +
> +.. note::
> +
> +        It is allowed (though somewhat fragile) to include extra fields, not
> +        intended for DMA from the device, within the group (in order to pack the
> +        structure tightly) - but only as long as the CPU does not write these
> +        fields while any fields in the group are mapped for DMA_FROM_DEVICE or
> +        DMA_BIDIRECTIONAL.
> +
>   DMA addressing capabilities
>   ===========================
>   

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events
  2026-01-05 18:19   ` Stefan Hajnoczi
@ 2026-01-06 14:50     ` Michael S. Tsirkin
  2026-01-07 16:29       ` Stefan Hajnoczi
  2026-01-06 14:51     ` Michael S. Tsirkin
  1 sibling, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-06 14:50 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, Jan 05, 2026 at 01:19:39PM -0500, Stefan Hajnoczi wrote:
> On Mon, Jan 05, 2026 at 03:23:29AM -0500, Michael S. Tsirkin wrote:
> > @@ -61,7 +62,7 @@ struct virtio_scsi_cmd {
> >  
> >  struct virtio_scsi_event_node {
> >  	struct virtio_scsi *vscsi;
> > -	struct virtio_scsi_event event;
> > +	struct virtio_scsi_event *event;
> >  	struct work_struct work;
> >  };
> >  
> > @@ -89,6 +90,11 @@ struct virtio_scsi {
> >  
> >  	struct virtio_scsi_vq ctrl_vq;
> >  	struct virtio_scsi_vq event_vq;
> > +
> > +	__dma_from_device_group_begin();
> > +	struct virtio_scsi_event events[VIRTIO_SCSI_EVENT_LEN];
> > +	__dma_from_device_group_end();
> 
> If the device emits two events in rapid succession, could the CPU see
> stale data for the second event because it already holds the cache line
> for reading the first event?

No, because virtio unmaps the buffer, and the unmap syncs the cache line.

In other words, CPU reads cause no issues.

The issues are exclusively around CPU writes dirtying the
cache and writeback overwriting DMA data.
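
As a sketch of what this means in practice (receive_event/dev/buf/len
are placeholder names, this is not the literal virtio_ring code, and
the exact cache maintenance is arch-specific), the per-buffer flow on
a non-coherent platform is:

	/* assumes <linux/dma-mapping.h> */
	static void receive_event(struct device *dev, void *buf, size_t len)
	{
		dma_addr_t dma;

		/* map: the CPU cache is invalidated for the buffer */
		dma = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
		if (dma_mapping_error(dev, dma))
			return;

		/* ... device writes buf via DMA ... */

		/* unmap: invalidate again, so later CPU reads fetch the
		 * DMA-written data from memory rather than stale cache
		 */
		dma_unmap_single(dev, dma, len, DMA_FROM_DEVICE);
	}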

> In other words, it's not obvious to me that the DMA warnings are indeed
> spurious and should be silenced here.
> 
> It seems safer and simpler to align and pad the struct virtio_scsi_event
> field in struct virtio_scsi_event_node rather than packing these structs
> into a single array where they might share cache lines.
> 
> Stefan



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events
  2026-01-05 18:19   ` Stefan Hajnoczi
  2026-01-06 14:50     ` Michael S. Tsirkin
@ 2026-01-06 14:51     ` Michael S. Tsirkin
  1 sibling, 0 replies; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-06 14:51 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, Jan 05, 2026 at 01:19:39PM -0500, Stefan Hajnoczi wrote:
> On Mon, Jan 05, 2026 at 03:23:29AM -0500, Michael S. Tsirkin wrote:
> > @@ -61,7 +62,7 @@ struct virtio_scsi_cmd {
> >  
> >  struct virtio_scsi_event_node {
> >  	struct virtio_scsi *vscsi;
> > -	struct virtio_scsi_event event;
> > +	struct virtio_scsi_event *event;
> >  	struct work_struct work;
> >  };
> >  
> > @@ -89,6 +90,11 @@ struct virtio_scsi {
> >  
> >  	struct virtio_scsi_vq ctrl_vq;
> >  	struct virtio_scsi_vq event_vq;
> > +
> > +	__dma_from_device_group_begin();
> > +	struct virtio_scsi_event events[VIRTIO_SCSI_EVENT_LEN];
> > +	__dma_from_device_group_end();
> 
> If the device emits two events in rapid succession, could the CPU see
> stale data for the second event because it already holds the cache line
> for reading the first event?
> 
> In other words, it's not obvious to me that the DMA warnings are indeed
> spurious and should be silenced here.
> 
> It seems safer and simpler to align and pad the struct virtio_scsi_event
> field in struct virtio_scsi_event_node rather than packing these structs
> into a single array where they might share cache lines.
> 
> Stefan



To add to what I wrote, that's a lot of overhead: 8 events each padded
to ARCH_DMA_MINALIGN = 128 bytes, i.e. 8 * 128 = 1024 bytes - about 1K
on some platforms, and these happen to be low-end ones.

-- 
MST


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events
  2026-01-06 14:50     ` Michael S. Tsirkin
@ 2026-01-07 16:29       ` Stefan Hajnoczi
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Hajnoczi @ 2026-01-07 16:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

[-- Attachment #1: Type: text/plain, Size: 1305 bytes --]

On Tue, Jan 06, 2026 at 09:50:00AM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 05, 2026 at 01:19:39PM -0500, Stefan Hajnoczi wrote:
> > On Mon, Jan 05, 2026 at 03:23:29AM -0500, Michael S. Tsirkin wrote:
> > > @@ -61,7 +62,7 @@ struct virtio_scsi_cmd {
> > >  
> > >  struct virtio_scsi_event_node {
> > >  	struct virtio_scsi *vscsi;
> > > -	struct virtio_scsi_event event;
> > > +	struct virtio_scsi_event *event;
> > >  	struct work_struct work;
> > >  };
> > >  
> > > @@ -89,6 +90,11 @@ struct virtio_scsi {
> > >  
> > >  	struct virtio_scsi_vq ctrl_vq;
> > >  	struct virtio_scsi_vq event_vq;
> > > +
> > > +	__dma_from_device_group_begin();
> > > +	struct virtio_scsi_event events[VIRTIO_SCSI_EVENT_LEN];
> > > +	__dma_from_device_group_end();
> > 
> > If the device emits two events in rapid succession, could the CPU see
> > stale data for the second event because it already holds the cache line
> > for reading the first event?
> 
> No, because virtio unmaps the buffer, and the unmap syncs the cache line.
> 
> In other words, CPU reads cause no issues.
> 
> The issues are exclusively around CPU writes dirtying the
> cache and writeback overwriting DMA data.

I see. In that case I'm happy with the virtio-scsi change:

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
  2026-01-05  8:23 ` [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
  2026-01-05  9:50   ` Petr Tesarik
@ 2026-01-08 13:57   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Marek Szyprowski @ 2026-01-08 13:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Robin Murphy, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On 05.01.2026 09:23, Michael S. Tsirkin wrote:
> When multiple small DMA_FROM_DEVICE or DMA_BIDIRECTIONAL buffers share a
> cacheline, and DMA_API_DEBUG is enabled, we get this warning:
> 	cacheline tracking EEXIST, overlapping mappings aren't supported.
>
> This is because when one of the mappings is removed while another one
> is still active, the CPU might write into the buffer.
>
> Add an attribute for the driver to promise not to do this, making the
> overlapping safe, and suppressing the warning.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

It is somewhat similar in concept to DMA_ATTR_SKIP_CPU_SYNC, so I see 
no reason not to accept it.
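
For reference, a driver opting in would pass it at map time along these
lines (a sketch with placeholder dev/buf/len names, not code from this
series; here the virtio core passes the attribute on behalf of its
drivers via virtqueue_add_inbuf_cache_clean()):

	dma_addr_t dma;

	/* promise: CPU will not dirty this cacheline while mapped */
	dma = dma_map_single_attrs(dev, buf, len, DMA_FROM_DEVICE,
				   DMA_ATTR_CPU_CACHE_CLEAN);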

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---
>   include/linux/dma-mapping.h | 7 +++++++
>   kernel/dma/debug.c          | 3 ++-
>   2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 29ad2ce700f0..29973baa0581 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -79,6 +79,13 @@
>    */
>   #define DMA_ATTR_MMIO		(1UL << 10)
>   
> +/*
> + * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline
> + * overlapping this buffer while it is mapped for DMA. All mappings sharing
> + * a cacheline must have this attribute for this to be considered safe.
> + */
> +#define DMA_ATTR_CPU_CACHE_CLEAN	(1UL << 11)
> +
>   /*
>    * A dma_addr_t can hold any valid DMA or bus address for the platform.  It can
>    * be given to a device to use as a DMA source or target.  It is specific to a
> diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
> index 138ede653de4..7e66d863d573 100644
> --- a/kernel/dma/debug.c
> +++ b/kernel/dma/debug.c
> @@ -595,7 +595,8 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
>   	if (rc == -ENOMEM) {
>   		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
>   		global_disable = true;
> -	} else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> +	} else if (rc == -EEXIST &&
> +		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
>   		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
>   		     is_swiotlb_active(entry->dev))) {
>   		err_printk(entry->dev, entry,

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
  2026-01-05  8:23 ` [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
  2026-01-05  9:51   ` Petr Tesarik
@ 2026-01-08 13:59   ` Marek Szyprowski
  1 sibling, 0 replies; 42+ messages in thread
From: Marek Szyprowski @ 2026-01-08 13:59 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Robin Murphy, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On 05.01.2026 09:23, Michael S. Tsirkin wrote:
> Document DMA_ATTR_CPU_CACHE_CLEAN as implemented in the
> previous patch.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---
>   Documentation/core-api/dma-attributes.rst | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
> index 0bdc2be65e57..1d7bfad73b1c 100644
> --- a/Documentation/core-api/dma-attributes.rst
> +++ b/Documentation/core-api/dma-attributes.rst
> @@ -148,3 +148,12 @@ DMA_ATTR_MMIO is appropriate.
>   For architectures that require cache flushing for DMA coherence
>   DMA_ATTR_MMIO will not perform any cache flushing. The address
>   provided must never be mapped cacheable into the CPU.
> +
> +DMA_ATTR_CPU_CACHE_CLEAN
> +------------------------
> +
> +This attribute indicates the CPU will not dirty any cacheline overlapping this
> +DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> +multiple small buffers to safely share a cacheline without risk of data
> +corruption, suppressing DMA debug warnings about overlapping mappings.
> +All mappings sharing a cacheline should have this attribute.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list
  2026-01-05  8:23 ` [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list Michael S. Tsirkin
@ 2026-01-08 14:04   ` Stefano Garzarella
  2026-01-08 14:07     ` Michael S. Tsirkin
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Garzarella @ 2026-01-08 14:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, Jan 05, 2026 at 03:23:17AM -0500, Michael S. Tsirkin wrote:
>On non-cache-coherent platforms, when a structure contains a buffer
>used for DMA alongside fields that the CPU writes to, cacheline sharing
>can cause data corruption.
>
>The event_list array is used for DMA_FROM_DEVICE operations via
>virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are
>written by the CPU while the buffer is made available to, and thus
>mapped for, the device. If these share cachelines with event_list, CPU
>writes can corrupt DMA data.
>
>Add __dma_from_device_group_begin()/end() annotations to ensure event_list
>is isolated in its own cachelines.
>
>Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>---
> net/vmw_vsock/virtio_transport.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index 8c867023a2e5..bb94baadfd8b 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -17,6 +17,7 @@
> #include <linux/virtio_ids.h>
> #include <linux/virtio_config.h>
> #include <linux/virtio_vsock.h>
>+#include <linux/dma-mapping.h>
> #include <net/sock.h>
> #include <linux/mutex.h>
> #include <net/af_vsock.h>
>@@ -59,8 +60,9 @@ struct virtio_vsock {
> 	 */
> 	struct mutex event_lock;
> 	bool event_run;
>+	__dma_from_device_group_begin();
> 	struct virtio_vsock_event event_list[8];
>-
>+	__dma_from_device_group_end();

Can we keep the blank line before `guest_cid` so that the comment before 
this section makes sense? (regarding the lock required to access these 
fields)

Thanks,
Stefano

> 	u32 guest_cid;
> 	bool seqpacket_allow;
>
>-- 
>MST
>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list
  2026-01-08 14:04   ` Stefano Garzarella
@ 2026-01-08 14:07     ` Michael S. Tsirkin
  2026-01-08 14:18       ` Stefano Garzarella
  0 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-08 14:07 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Thu, Jan 08, 2026 at 03:04:07PM +0100, Stefano Garzarella wrote:
> On Mon, Jan 05, 2026 at 03:23:17AM -0500, Michael S. Tsirkin wrote:
> > On non-cache-coherent platforms, when a structure contains a buffer
> > used for DMA alongside fields that the CPU writes to, cacheline sharing
> > can cause data corruption.
> > 
> > The event_list array is used for DMA_FROM_DEVICE operations via
> > virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are
> > written by the CPU while the buffer is available, so mapped for the
> > device. If these share cachelines with event_list, CPU writes can
> > corrupt DMA data.
> > 
> > Add __dma_from_device_group_begin()/end() annotations to ensure event_list
> > is isolated in its own cachelines.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > net/vmw_vsock/virtio_transport.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > index 8c867023a2e5..bb94baadfd8b 100644
> > --- a/net/vmw_vsock/virtio_transport.c
> > +++ b/net/vmw_vsock/virtio_transport.c
> > @@ -17,6 +17,7 @@
> > #include <linux/virtio_ids.h>
> > #include <linux/virtio_config.h>
> > #include <linux/virtio_vsock.h>
> > +#include <linux/dma-mapping.h>
> > #include <net/sock.h>
> > #include <linux/mutex.h>
> > #include <net/af_vsock.h>
> > @@ -59,8 +60,9 @@ struct virtio_vsock {
> > 	 */
> > 	struct mutex event_lock;
> > 	bool event_run;
> > +	__dma_from_device_group_begin();
> > 	struct virtio_vsock_event event_list[8];
> > -
> > +	__dma_from_device_group_end();
> 
> Can we keep the blank line before `guest_cid` so that the comment before
> this section makes sense? (regarding the lock required to access these
> fields)
> 
> Thanks,
> Stefano

A follow-up patch re-introduces it, so I don't think it matters?
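
FWIW, the end result after patch 13 is roughly this (sketch of the
struct tail only, per the diff in that patch):

	struct virtio_vsock {
		...
		struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
		struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];

		__dma_from_device_group_begin();
		struct virtio_vsock_event event_list[8];
		__dma_from_device_group_end();
	};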

> > 	u32 guest_cid;
> > 	bool seqpacket_allow;
> > 
> > -- 
> > MST
> > 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 08/15] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events
  2026-01-05  8:23 ` [PATCH v2 08/15] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
@ 2026-01-08 14:08   ` Stefano Garzarella
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Garzarella @ 2026-01-08 14:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, Jan 05, 2026 at 03:23:21AM -0500, Michael S. Tsirkin wrote:
>The event_list array contains 8 small (4-byte) events that share
>cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
>this can trigger warnings about overlapping DMA mappings within
>the same cacheline.
>
>The previous patch isolated event_list in its own cachelines,
>so the warnings are spurious.
>
>Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
>write into these fields, suppressing the warnings.
>
>Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
>Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>---
> net/vmw_vsock/virtio_transport.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Stefano Garzarella <sgarzare@redhat.com>

>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index bb94baadfd8b..ef983c36cb66 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -392,7 +392,7 @@ static int virtio_vsock_event_fill_one(struct virtio_vsock *vsock,
>
> 	sg_init_one(&sg, event, sizeof(*event));
>
>-	return virtqueue_add_inbuf(vq, &sg, 1, event, GFP_KERNEL);
>+	return virtqueue_add_inbuf_cache_clean(vq, &sg, 1, event, GFP_KERNEL);
> }
>
> /* event_lock must be held */
>-- 
>MST
>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding
  2026-01-05  8:23 ` [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding Michael S. Tsirkin
@ 2026-01-08 14:11   ` Stefano Garzarella
  2026-01-08 14:17     ` Michael S. Tsirkin
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Garzarella @ 2026-01-08 14:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Mon, Jan 05, 2026 at 03:23:41AM -0500, Michael S. Tsirkin wrote:
>Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
>last. This eliminates the padding from aligning the struct size on
>ARCH_DMA_MINALIGN.
>
>Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>---
> net/vmw_vsock/virtio_transport.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index ef983c36cb66..964d25e11858 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -60,9 +60,7 @@ struct virtio_vsock {
> 	 */
> 	struct mutex event_lock;
> 	bool event_run;
>-	__dma_from_device_group_begin();
>-	struct virtio_vsock_event event_list[8];
>-	__dma_from_device_group_end();
>+
> 	u32 guest_cid;
> 	bool seqpacket_allow;
>
>@@ -76,6 +74,10 @@ struct virtio_vsock {
> 	 */
> 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
> 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
>+

IIUC we would like to have these fields always at the bottom of this 
struct, so it would be better to add a comment here to make sure we do 
not add other fields after this in the future?

Maybe we should also add a comment about the `event_lock` requirement we 
have in the section above.

Thanks,
Stefano

>+	__dma_from_device_group_begin();
>+	struct virtio_vsock_event event_list[8];
>+	__dma_from_device_group_end();
> };
>
> static u32 virtio_transport_get_local_cid(void)
>-- 
>MST
>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding
  2026-01-08 14:11   ` Stefano Garzarella
@ 2026-01-08 14:17     ` Michael S. Tsirkin
  2026-01-08 14:27       ` Stefano Garzarella
  0 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-08 14:17 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Thu, Jan 08, 2026 at 03:11:36PM +0100, Stefano Garzarella wrote:
> On Mon, Jan 05, 2026 at 03:23:41AM -0500, Michael S. Tsirkin wrote:
> > Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
> > last. This eliminates the padding from aligning the struct size on
> > ARCH_DMA_MINALIGN.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > net/vmw_vsock/virtio_transport.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > index ef983c36cb66..964d25e11858 100644
> > --- a/net/vmw_vsock/virtio_transport.c
> > +++ b/net/vmw_vsock/virtio_transport.c
> > @@ -60,9 +60,7 @@ struct virtio_vsock {
> > 	 */
> > 	struct mutex event_lock;
> > 	bool event_run;
> > -	__dma_from_device_group_begin();
> > -	struct virtio_vsock_event event_list[8];
> > -	__dma_from_device_group_end();
> > +
> > 	u32 guest_cid;
> > 	bool seqpacket_allow;
> > 
> > @@ -76,6 +74,10 @@ struct virtio_vsock {
> > 	 */
> > 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
> > 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
> > +
> 
> IIUC we would like to have these fields always at the bottom of this struct,
> so it would be better to add a comment here to make sure we do not add other
> fields after this in the future?

not necessarily - you can add fields after, too - it's just that
__dma_from_device_group_begin already adds a bunch of padding, so adding
fields in this padding is cheaper.
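
E.g., assuming ARCH_DMA_MINALIGN is 128 (hypothetical struct, layout
sketch only, not the exact macro expansion):

	struct foo {
		u32 a;			/* offset 0 */
		__dma_from_device_group_begin();
		u8 buf[32];		/* offset 128: own cachelines */
		__dma_from_device_group_end();
		u32 b;			/* offset 256 */
		/*
		 * The group raised the struct alignment to 128, so
		 * sizeof(struct foo) is 384 and offsets 260..383 are
		 * padding anyway: small fields added here are free.
		 */
	};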


do we really need to add comments to teach people about the art of
struct packing?

> Maybe we should also add a comment about the `event_lock` requirement we
> have in the section above.
> 
> Thanks,
> Stefano

hmm which requirement do you mean?

> 
> > +	__dma_from_device_group_begin();
> > +	struct virtio_vsock_event event_list[8];
> > +	__dma_from_device_group_end();
> > };
> > 
> > static u32 virtio_transport_get_local_cid(void)
> > -- 
> > MST
> > 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list
  2026-01-08 14:07     ` Michael S. Tsirkin
@ 2026-01-08 14:18       ` Stefano Garzarella
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Garzarella @ 2026-01-08 14:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Thu, Jan 08, 2026 at 09:07:53AM -0500, Michael S. Tsirkin wrote:
>On Thu, Jan 08, 2026 at 03:04:07PM +0100, Stefano Garzarella wrote:
>> On Mon, Jan 05, 2026 at 03:23:17AM -0500, Michael S. Tsirkin wrote:
>> > On non-cache-coherent platforms, when a structure contains a buffer
>> > used for DMA alongside fields that the CPU writes to, cacheline sharing
>> > can cause data corruption.
>> >
>> > The event_list array is used for DMA_FROM_DEVICE operations via
>> > virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are
>> > written by the CPU while the buffer is available (i.e. mapped for
>> > the device). If these share cachelines with event_list, CPU writes can
>> > corrupt DMA data.
>> >
>> > Add __dma_from_device_group_begin()/end() annotations to ensure event_list
>> > is isolated in its own cachelines.
>> >
>> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > ---
>> > net/vmw_vsock/virtio_transport.c | 4 +++-
>> > 1 file changed, 3 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> > index 8c867023a2e5..bb94baadfd8b 100644
>> > --- a/net/vmw_vsock/virtio_transport.c
>> > +++ b/net/vmw_vsock/virtio_transport.c
>> > @@ -17,6 +17,7 @@
>> > #include <linux/virtio_ids.h>
>> > #include <linux/virtio_config.h>
>> > #include <linux/virtio_vsock.h>
>> > +#include <linux/dma-mapping.h>
>> > #include <net/sock.h>
>> > #include <linux/mutex.h>
>> > #include <net/af_vsock.h>
>> > @@ -59,8 +60,9 @@ struct virtio_vsock {
>> > 	 */
>> > 	struct mutex event_lock;
>> > 	bool event_run;
>> > +	__dma_from_device_group_begin();
>> > 	struct virtio_vsock_event event_list[8];
>> > -
>> > +	__dma_from_device_group_end();
>>
>> Can we keep the blank line before `guest_cid` so that the comment before
>> this section makes sense? (regarding the lock required to access these
>> fields)
>>
>> Thanks,
>> Stefano
>
>A follow-up patch re-introduces it, so I don't think it matters?

Yes, I saw it later. Of course I don't want you to resend the whole 
series just for this. But if you have to resend the series for other 
reasons, I would avoid removing the line here, because I don't see any 
value in removing it and adding it back later.

In both cases:

Acked-by: Stefano Garzarella <sgarzare@redhat.com>

>
>> > 	u32 guest_cid;
>> > 	bool seqpacket_allow;
>> >
>> > --
>> > MST
>> >
>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding
  2026-01-08 14:17     ` Michael S. Tsirkin
@ 2026-01-08 14:27       ` Stefano Garzarella
  2026-01-08 14:32         ` Michael S. Tsirkin
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Garzarella @ 2026-01-08 14:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Thu, Jan 08, 2026 at 09:17:49AM -0500, Michael S. Tsirkin wrote:
>On Thu, Jan 08, 2026 at 03:11:36PM +0100, Stefano Garzarella wrote:
>> On Mon, Jan 05, 2026 at 03:23:41AM -0500, Michael S. Tsirkin wrote:
>> > Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
>> > last. This eliminates the padding from aligning the struct size on
>> > ARCH_DMA_MINALIGN.
>> >
>> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > ---
>> > net/vmw_vsock/virtio_transport.c | 8 +++++---
>> > 1 file changed, 5 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> > index ef983c36cb66..964d25e11858 100644
>> > --- a/net/vmw_vsock/virtio_transport.c
>> > +++ b/net/vmw_vsock/virtio_transport.c
>> > @@ -60,9 +60,7 @@ struct virtio_vsock {
>> > 	 */
>> > 	struct mutex event_lock;
>> > 	bool event_run;
>> > -	__dma_from_device_group_begin();
>> > -	struct virtio_vsock_event event_list[8];
>> > -	__dma_from_device_group_end();
>> > +
>> > 	u32 guest_cid;
>> > 	bool seqpacket_allow;
>> >
>> > @@ -76,6 +74,10 @@ struct virtio_vsock {
>> > 	 */
>> > 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
>> > 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
>> > +
>>
>> IIUC we would like to have these fields always at the bottom of this struct,
>> so it would be better to add a comment here to make sure we do not add other
>> fields after this in the future?
>
>not necessarily - you can add fields after, too - it's just that
>__dma_from_device_group_begin already adds a bunch of padding, so adding
>fields in this padding is cheaper.
>

Okay, I see.

>
>do we really need to add comments to teach people about the art of
>struct packing?

I can do it later if you prefer, I don't want to block this work, but 
yes, I'd prefer to have a comment because otherwise I'll have to ask 
people every time to avoid it, especially new contributors xD

>
>> Maybe we should also add a comment about the `event_lock` requirement 
>> we
>> have in the section above.
>>
>> Thanks,
>> Stefano
>
>hmm which requirement do you mean?

That `event_list` must be accessed with `event_lock`.

So maybe we can also move `event_lock` and `event_run`, so we can just 
move that comment. I mean something like this:


@@ -74,6 +67,15 @@ struct virtio_vsock {
          */
         struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
         struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
+
+       /* The following fields are protected by event_lock.
+        * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held.
+        */
+       struct mutex event_lock;
+       bool event_run;
+       __dma_from_device_group_begin();
+       struct virtio_vsock_event event_list[8];
+       __dma_from_device_group_end();
  };

  static u32 virtio_transport_get_local_cid(void)


Thanks,
Stefano


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding
  2026-01-08 14:27       ` Stefano Garzarella
@ 2026-01-08 14:32         ` Michael S. Tsirkin
  2026-01-08 14:45           ` Stefano Garzarella
  0 siblings, 1 reply; 42+ messages in thread
From: Michael S. Tsirkin @ 2026-01-08 14:32 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Thu, Jan 08, 2026 at 03:27:04PM +0100, Stefano Garzarella wrote:
> On Thu, Jan 08, 2026 at 09:17:49AM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 08, 2026 at 03:11:36PM +0100, Stefano Garzarella wrote:
> > > On Mon, Jan 05, 2026 at 03:23:41AM -0500, Michael S. Tsirkin wrote:
> > > > Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
> > > > last. This eliminates the padding from aligning the struct size on
> > > > ARCH_DMA_MINALIGN.
> > > >
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > ---
> > > > net/vmw_vsock/virtio_transport.c | 8 +++++---
> > > > 1 file changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index ef983c36cb66..964d25e11858 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -60,9 +60,7 @@ struct virtio_vsock {
> > > > 	 */
> > > > 	struct mutex event_lock;
> > > > 	bool event_run;
> > > > -	__dma_from_device_group_begin();
> > > > -	struct virtio_vsock_event event_list[8];
> > > > -	__dma_from_device_group_end();
> > > > +
> > > > 	u32 guest_cid;
> > > > 	bool seqpacket_allow;
> > > >
> > > > @@ -76,6 +74,10 @@ struct virtio_vsock {
> > > > 	 */
> > > > 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
> > > > 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
> > > > +
> > > 
> > > IIUC we would like to have these fields always at the bottom of this struct,
> > > so it would be better to add a comment here to make sure we do not add other
> > > fields after this in the future?
> > 
> > not necessarily - you can add fields after, too - it's just that
> > __dma_from_device_group_begin already adds a bunch of padding, so adding
> > fields in this padding is cheaper.
> > 
> 
> Okay, I see.
> 
> > 
> > do we really need to add comments to teach people about the art of
> > struct packing?
> 
> I can do it later if you prefer, I don't want to block this work, but yes,
> I'd prefer to have a comment because otherwise I'll have to ask people
> every time to avoid it, especially new contributors xD

On the one hand you are right; on the other, I don't want it
duplicated each time __dma_from_device_group_begin is invoked.
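
E.g. (hypothetical blurb) I'd rather not see every use site grow
something like:

	/*
	 * DMA_FROM_DEVICE buffer, isolated in its own cachelines:
	 * don't add CPU-written fields between the markers, and
	 * prefer the tail padding when adding new struct members.
	 */
	__dma_from_device_group_begin();
	struct virtio_vsock_event event_list[8];
	__dma_from_device_group_end();
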
Pls come up with something you like, and we'll discuss.

> > 
> > > Maybe we should also add a comment about the `event_lock`
> > > requirement we
> > > have in the section above.
> > > 
> > > Thanks,
> > > Stefano
> > 
> > hmm which requirement do you mean?
> 
> That `event_list` must be accessed with `event_lock`.
> 
> So maybe we can also move `event_lock` and `event_run`, so we can just move
> that comment. I mean something like this:
> 
> 
> @@ -74,6 +67,15 @@ struct virtio_vsock {
>          */
>         struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
>         struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
> +
> +       /* The following fields are protected by event_lock.
> +        * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held.
> +        */
> +       struct mutex event_lock;
> +       bool event_run;
> +       __dma_from_device_group_begin();
> +       struct virtio_vsock_event event_list[8];
> +       __dma_from_device_group_end();
>  };
> 
>  static u32 virtio_transport_get_local_cid(void)

Yea this makes sense.

> 
> Thanks,
> Stefano


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding
  2026-01-08 14:32         ` Michael S. Tsirkin
@ 2026-01-08 14:45           ` Stefano Garzarella
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Garzarella @ 2026-01-08 14:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Cong Wang, Jonathan Corbet, Olivia Mackall,
	Herbert Xu, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
	Eugenio Pérez, James E.J. Bottomley, Martin K. Petersen,
	Gerd Hoffmann, Xuan Zhuo, Marek Szyprowski, Robin Murphy,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	Bartosz Golaszewski, linux-doc, linux-crypto, virtualization,
	linux-scsi, iommu, kvm, netdev

On Thu, Jan 08, 2026 at 09:32:23AM -0500, Michael S. Tsirkin wrote:
>On Thu, Jan 08, 2026 at 03:27:04PM +0100, Stefano Garzarella wrote:
>> On Thu, Jan 08, 2026 at 09:17:49AM -0500, Michael S. Tsirkin wrote:
>> > On Thu, Jan 08, 2026 at 03:11:36PM +0100, Stefano Garzarella wrote:
>> > > On Mon, Jan 05, 2026 at 03:23:41AM -0500, Michael S. Tsirkin wrote:
>> > > > Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
>> > > > last. This eliminates the padding from aligning the struct size on
>> > > > ARCH_DMA_MINALIGN.
>> > > >
>> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > > > ---
>> > > > net/vmw_vsock/virtio_transport.c | 8 +++++---
>> > > > 1 file changed, 5 insertions(+), 3 deletions(-)
>> > > >
>> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> > > > index ef983c36cb66..964d25e11858 100644
>> > > > --- a/net/vmw_vsock/virtio_transport.c
>> > > > +++ b/net/vmw_vsock/virtio_transport.c
>> > > > @@ -60,9 +60,7 @@ struct virtio_vsock {
>> > > > 	 */
>> > > > 	struct mutex event_lock;
>> > > > 	bool event_run;
>> > > > -	__dma_from_device_group_begin();
>> > > > -	struct virtio_vsock_event event_list[8];
>> > > > -	__dma_from_device_group_end();
>> > > > +
>> > > > 	u32 guest_cid;
>> > > > 	bool seqpacket_allow;
>> > > >
>> > > > @@ -76,6 +74,10 @@ struct virtio_vsock {
>> > > > 	 */
>> > > > 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
>> > > > 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
>> > > > +
>> > >
>> > > IIUC we would like to have these fields always at the bottom of this struct,
>> > > so it would be better to add a comment here to make sure we do not add other
>> > > fields after this in the future?
>> >
>> > not necessarily - you can add fields after, too - it's just that
>> > __dma_from_device_group_begin already adds a bunch of padding, so adding
>> > fields in this padding is cheaper.
>> >
>>
>> Okay, I see.
>>
>> >
>> > do we really need to add comments to teach people about the art of
>> > struct packing?
>>
>> I can do it later if you prefer, I don't want to block this work, but yes,
>> I'd prefer to have a comment because otherwise I'll have to ask people
>> every time to avoid it, especially new contributors xD
>
>On the one hand you are right; on the other, I don't want it
>duplicated each time __dma_from_device_group_begin is invoked.

yeah, I see.

>Pls come up with something you like, and we'll discuss.

sure, I'll look at some similar cases for inspiration.

>
>> >
>> > > Maybe we should also add a comment about the `event_lock`
>> > > requirement we
>> > > have in the section above.
>> > >
>> > > Thanks,
>> > > Stefano
>> >
>> > hmm which requirement do you mean?
>>
>> That `event_list` must be accessed with `event_lock`.
>>
>> So maybe we can also move `event_lock` and `event_run`, so we can just move
>> that comment. I mean something like this:
>>
>>
>> @@ -74,6 +67,15 @@ struct virtio_vsock {
>>          */
>>         struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
>>         struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
>> +
>> +       /* The following fields are protected by event_lock.
>> +        * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held.
>> +        */
>> +       struct mutex event_lock;
>> +       bool event_run;
>> +       __dma_from_device_group_begin();
>> +       struct virtio_vsock_event event_list[8];
>> +       __dma_from_device_group_end();
>>  };
>>
>>  static u32 virtio_transport_get_local_cid(void)
>
>Yea this makes sense.

Thanks for that!
Stefano


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2026-01-08 14:45 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-05  8:22 [PATCH v2 00/15] fix DMA aligment issues around virtio Michael S. Tsirkin
2026-01-05  8:22 ` [PATCH v2 01/15] dma-mapping: add __dma_from_device_group_begin()/end() Michael S. Tsirkin
2026-01-05  9:40   ` Petr Tesarik
2026-01-05 18:27   ` Marek Szyprowski
2026-01-05  8:22 ` [PATCH v2 02/15] docs: dma-api: document __dma_from_device_group_begin()/end() Michael S. Tsirkin
2026-01-05  9:48   ` Petr Tesarik
2026-01-05 18:28   ` Marek Szyprowski
2026-01-05  8:23 ` [PATCH v2 03/15] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
2026-01-05  9:50   ` Petr Tesarik
2026-01-08 13:57   ` Marek Szyprowski
2026-01-05  8:23 ` [PATCH v2 04/15] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
2026-01-05  9:51   ` Petr Tesarik
2026-01-08 13:59   ` Marek Szyprowski
2026-01-05  8:23 ` [PATCH v2 05/15] dma-debug: track cache clean flag in entries Michael S. Tsirkin
2026-01-05  9:54   ` Petr Tesarik
2026-01-05 12:37     ` Michael S. Tsirkin
2026-01-05 13:40       ` Petr Tesarik
2026-01-05  8:23 ` [PATCH v2 06/15] virtio: add virtqueue_add_inbuf_cache_clean API Michael S. Tsirkin
2026-01-05  8:23 ` [PATCH v2 07/15] vsock/virtio: fix DMA alignment for event_list Michael S. Tsirkin
2026-01-08 14:04   ` Stefano Garzarella
2026-01-08 14:07     ` Michael S. Tsirkin
2026-01-08 14:18       ` Stefano Garzarella
2026-01-05  8:23 ` [PATCH v2 08/15] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
2026-01-08 14:08   ` Stefano Garzarella
2026-01-05  8:23 ` [PATCH v2 09/15] virtio_input: fix DMA alignment for evts Michael S. Tsirkin
2026-01-05  8:23 ` [PATCH v2 10/15] virtio_scsi: fix DMA cacheline issues for events Michael S. Tsirkin
2026-01-05 18:19   ` Stefan Hajnoczi
2026-01-06 14:50     ` Michael S. Tsirkin
2026-01-07 16:29       ` Stefan Hajnoczi
2026-01-06 14:51     ` Michael S. Tsirkin
2026-01-05  8:23 ` [PATCH v2 11/15] virtio-rng: fix DMA alignment for data buffer Michael S. Tsirkin
2026-01-05  8:23 ` [PATCH v2 12/15] virtio_input: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
2026-01-05  8:23 ` [PATCH v2 13/15] vsock/virtio: reorder fields to reduce padding Michael S. Tsirkin
2026-01-08 14:11   ` Stefano Garzarella
2026-01-08 14:17     ` Michael S. Tsirkin
2026-01-08 14:27       ` Stefano Garzarella
2026-01-08 14:32         ` Michael S. Tsirkin
2026-01-08 14:45           ` Stefano Garzarella
2026-01-05  8:23 ` [PATCH v2 14/15] gpio: virtio: fix DMA alignment Michael S. Tsirkin
2026-01-05  9:48   ` Bartosz Golaszewski
2026-01-05  8:23 ` [PATCH v2 15/15] gpio: virtio: reorder fields to reduce struct padding Michael S. Tsirkin
2026-01-05  9:49   ` Bartosz Golaszewski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).