* [PATCH RFC 00/13] fix DMA alignment issues around virtio
@ 2025-12-30 10:15 Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 01/13] dma-mapping: add __dma_from_device_aligned_begin/end Michael S. Tsirkin
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev


Cong Wang reported DMA debug warnings with virtio-vsock
and proposed a patch, see:

https://lore.kernel.org/all/20251228015451.1253271-1-xiyou.wangcong@gmail.com/

However, the issue is more widespread.
This is an attempt to fix it systematically.
Note: i2c and gpio might also be affected; I am still looking
into them. Help from maintainers is welcome.

Early RFC, compile-tested only. Sending for early feedback/flames.
Cursor/Claude were used liberally, mostly for refactoring and English.

DMA maintainers, could you please confirm the DMA core changes
are ok with you?

Thanks!


Michael S. Tsirkin (13):
  dma-mapping: add __dma_from_device_aligned_begin/end
  docs: dma-api: document __dma_from_device_aligned_begin/end
  dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
  docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
  dma-debug: track cache clean flag in entries
  virtio: add virtqueue_add_inbuf_cache_clean API
  vsock/virtio: fix DMA alignment for event_list
  vsock/virtio: use virtqueue_add_inbuf_cache_clean for events
  virtio_input: fix DMA alignment for evts
  virtio_scsi: fix DMA cacheline issues for events
  virtio-rng: fix DMA alignment for data buffer
  virtio_input: use virtqueue_add_inbuf_cache_clean for events
  vsock/virtio: reorder fields to reduce struct padding

 Documentation/core-api/dma-api-howto.rst  | 42 +++++++++++++
 Documentation/core-api/dma-attributes.rst |  9 +++
 drivers/char/hw_random/virtio-rng.c       |  2 +
 drivers/scsi/virtio_scsi.c                | 18 ++++--
 drivers/virtio/virtio_input.c             |  5 +-
 drivers/virtio/virtio_ring.c              | 72 +++++++++++++++++------
 include/linux/dma-mapping.h               | 17 ++++++
 include/linux/virtio.h                    |  5 ++
 kernel/dma/debug.c                        | 26 ++++++--
 net/vmw_vsock/virtio_transport.c          |  8 ++-
 10 files changed, 172 insertions(+), 32 deletions(-)

-- 
MST



* [PATCH RFC 01/13] dma-mapping: add __dma_from_device_aligned_begin/end
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
@ 2025-12-30 10:15 ` Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 02/13] docs: dma-api: document __dma_from_device_aligned_begin/end Michael S. Tsirkin
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

When a structure contains a buffer that DMA writes to alongside fields
that the CPU writes to, cache line sharing between the DMA buffer and
CPU-written fields can cause data corruption on non-cache-coherent
platforms.

Add __dma_from_device_aligned_begin/__dma_from_device_aligned_end
annotations to ensure proper alignment to prevent this:

struct my_device {
	spinlock_t lock1;
	__dma_from_device_aligned_begin char dma_buffer1[16];
	char dma_buffer2[16];
	__dma_from_device_aligned_end spinlock_t lock2;
};

When the DMA buffer is the last field in the structure, just
__dma_from_device_aligned_begin is enough - the compiler's struct
padding protects the tail:

struct my_device {
	spinlock_t lock;
	struct mutex mlock;
	__dma_from_device_aligned_begin char dma_buffer1[16];
	char dma_buffer2[16];
};
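
For illustration, the tail-padding claim can be checked at build time;
a minimal sketch (reusing the example struct above, under the same
ARCH_HAS_DMA_MINALIGN guard this patch introduces):

#ifdef ARCH_HAS_DMA_MINALIGN
/* dma_buffer1 starts a DMA cacheline; tail padding rounds the struct
 * size up to a multiple of one */
static_assert(offsetof(struct my_device, dma_buffer1) % ARCH_DMA_MINALIGN == 0);
static_assert(sizeof(struct my_device) % ARCH_DMA_MINALIGN == 0);
#endif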

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/dma-mapping.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index aa36a0d1d9df..47b7de3786a1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -703,6 +703,16 @@ static inline int dma_get_cache_alignment(void)
 }
 #endif
 
+#ifdef ARCH_HAS_DMA_MINALIGN
+#define ____dma_from_device_aligned __aligned(ARCH_DMA_MINALIGN)
+#else
+#define ____dma_from_device_aligned
+#endif
+/* Apply to the 1st field of the DMA buffer */
+#define __dma_from_device_aligned_begin ____dma_from_device_aligned
+/* Apply to the 1st field beyond the DMA buffer */
+#define __dma_from_device_aligned_end ____dma_from_device_aligned
+
 static inline void *dmam_alloc_coherent(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, gfp_t gfp)
 {
-- 
MST



* [PATCH RFC 02/13] docs: dma-api: document __dma_from_device_aligned_begin/end
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 01/13] dma-mapping: add __dma_from_device_aligned_begin/end Michael S. Tsirkin
@ 2025-12-30 10:15 ` Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 03/13] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

Document the __dma_from_device_aligned_begin/__dma_from_device_aligned_end
annotations introduced by the previous patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/core-api/dma-api-howto.rst | 42 ++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/Documentation/core-api/dma-api-howto.rst b/Documentation/core-api/dma-api-howto.rst
index 96fce2a9aa90..99eda4c5c8e7 100644
--- a/Documentation/core-api/dma-api-howto.rst
+++ b/Documentation/core-api/dma-api-howto.rst
@@ -146,6 +146,48 @@ What about block I/O and networking buffers?  The block I/O and
 networking subsystems make sure that the buffers they use are valid
 for you to DMA from/to.
 
+__dma_from_device_aligned_begin/end annotations
+===============================================
+
+As explained previously, when a structure contains a DMA_FROM_DEVICE buffer
+(device writes to memory) alongside fields that the CPU writes to, cache line
+sharing between the DMA buffer and CPU-written fields can cause data corruption
+on CPUs with DMA-incoherent caches.
+
+The ``__dma_from_device_aligned_begin/__dma_from_device_aligned_end``
+annotations ensure proper alignment to prevent this::
+
+	struct my_device {
+		spinlock_t lock1;
+		__dma_from_device_aligned_begin char dma_buffer1[16];
+		char dma_buffer2[16];
+		__dma_from_device_aligned_end spinlock_t lock2;
+	};
+
+On cache-coherent platforms these macros expand to nothing. On non-coherent
+platforms, they ensure the minimal DMA alignment, which can be as large as 128
+bytes.
+
+.. note::
+
+	To isolate a DMA buffer from adjacent fields, you must apply
+	``__dma_from_device_aligned_begin`` to the first DMA buffer field
+	**and additionally** apply ``__dma_from_device_aligned_end`` to the
+	**next** field in the structure, **beyond** the DMA buffer (as opposed
+	to the last field of the DMA buffer!).  This protects both the head and
+	tail of the buffer from cache line sharing.
+
+	When the DMA buffer is the **last field** in the structure, just
+	``__dma_from_device_aligned_begin`` is enough - the compiler's struct
+	padding protects the tail::
+
+		struct my_device {
+			spinlock_t lock;
+			struct mutex mlock;
+			__dma_from_device_aligned_begin char dma_buffer1[16];
+			char dma_buffer2[16];
+		};
+
 DMA addressing capabilities
 ===========================
 
-- 
MST



* [PATCH RFC 03/13] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 01/13] dma-mapping: add __dma_from_device_aligned_begin/end Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 02/13] docs: dma-api: document __dma_from_device_aligned_begin/end Michael S. Tsirkin
@ 2025-12-30 10:15 ` Michael S. Tsirkin
  2025-12-30 10:15 ` [PATCH RFC 04/13] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

When multiple small DMA_FROM_DEVICE or DMA_BIDIRECTIONAL buffers share a
cacheline, and DMA_API_DEBUG is enabled, we get this warning:

	cacheline tracking EEXIST, overlapping mappings aren't supported.

This is because, when one of the mappings is removed while another
one is still active, the CPU might write into the buffer.

Add an attribute by which the driver promises not to do this,
making the overlap safe and suppressing the warning.
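
For illustration, a minimal sketch of legitimate use (dev, buf and the
4-byte sizes are hypothetical; note both overlapping mappings must carry
the attribute):

	/* two sub-cacheline buffers the CPU never dirties while mapped */
	dma_addr_t a = dma_map_single_attrs(dev, buf, 4, DMA_FROM_DEVICE,
					    DMA_ATTR_CPU_CACHE_CLEAN);
	dma_addr_t b = dma_map_single_attrs(dev, buf + 4, 4, DMA_FROM_DEVICE,
					    DMA_ATTR_CPU_CACHE_CLEAN);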

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/dma-mapping.h | 7 +++++++
 kernel/dma/debug.c          | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 47b7de3786a1..8216a86cd0c2 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -78,6 +78,13 @@
  */
 #define DMA_ATTR_MMIO		(1UL << 10)
 
+/*
+ * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline
+ * overlapping this buffer while it is mapped for DMA. All mappings sharing
+ * a cacheline must have this attribute for this to be considered safe.
+ */
+#define DMA_ATTR_CPU_CACHE_CLEAN	(1UL << 11)
+
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform.  It can
  * be given to a device to use as a DMA source or target.  It is specific to a
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 138ede653de4..7e66d863d573 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -595,7 +595,8 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
 	if (rc == -ENOMEM) {
 		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
 		global_disable = true;
-	} else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+	} else if (rc == -EEXIST &&
+		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
 		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
 		     is_swiotlb_active(entry->dev))) {
 		err_printk(entry->dev, entry,
-- 
MST



* [PATCH RFC 04/13] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2025-12-30 10:15 ` [PATCH RFC 03/13] dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
@ 2025-12-30 10:15 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 05/13] dma-debug: track cache clean flag in entries Michael S. Tsirkin
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

Document DMA_ATTR_CPU_CACHE_CLEAN as implemented in the
previous patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/core-api/dma-attributes.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 0bdc2be65e57..1d7bfad73b1c 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -148,3 +148,12 @@ DMA_ATTR_MMIO is appropriate.
 For architectures that require cache flushing for DMA coherence
 DMA_ATTR_MMIO will not perform any cache flushing. The address
 provided must never be mapped cacheable into the CPU.
+
+DMA_ATTR_CPU_CACHE_CLEAN
+------------------------
+
+This attribute indicates the CPU will not dirty any cacheline overlapping this
+DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
+multiple small buffers to safely share a cacheline without risk of data
+corruption, suppressing DMA debug warnings about overlapping mappings.
+All mappings sharing a cacheline should have this attribute.
-- 
MST



* [PATCH RFC 05/13] dma-debug: track cache clean flag in entries
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2025-12-30 10:15 ` [PATCH RFC 04/13] docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 06/13] virtio: add virtqueue_add_inbuf_cache_clean API Michael S. Tsirkin
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

If a driver is buggy and has two overlapping mappings but only
sets the cache clean flag on the first one, we warn.
But if it only sets it on the second one, we don't.

Fix by tracking the cache clean flag in the entry, so that the
warning is suppressed only when both mappings carry the flag.
Shrink map_err_type to u8 to avoid bloating up the struct.
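
For illustration, a hypothetical sequence (dev, buf and sizes made up)
that the previous patch missed but this one catches:

	/* the first mapping of the cacheline lacks the attribute... */
	dma_map_single_attrs(dev, buf, 4, DMA_FROM_DEVICE, 0);
	/* ...so this overlapping mapping warns even though it carries
	 * DMA_ATTR_CPU_CACHE_CLEAN itself */
	dma_map_single_attrs(dev, buf + 4, 4, DMA_FROM_DEVICE,
			     DMA_ATTR_CPU_CACHE_CLEAN);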

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 kernel/dma/debug.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 7e66d863d573..9bd14fd4c51b 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -63,6 +63,7 @@ enum map_err_types {
  * @sg_mapped_ents: 'mapped_ents' from dma_map_sg
  * @paddr: physical start address of the mapping
  * @map_err_type: track whether dma_mapping_error() was checked
+ * @is_cache_clean: driver promises not to write to buffer while mapped
  * @stack_len: number of backtrace entries in @stack_entries
  * @stack_entries: stack of backtrace history
  */
@@ -76,7 +77,8 @@ struct dma_debug_entry {
 	int		 sg_call_ents;
 	int		 sg_mapped_ents;
 	phys_addr_t	 paddr;
-	enum map_err_types  map_err_type;
+	u8		 map_err_type;
+	bool		 is_cache_clean;
 #ifdef CONFIG_STACKTRACE
 	unsigned int	stack_len;
 	unsigned long	stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
@@ -472,12 +474,15 @@ static int active_cacheline_dec_overlap(phys_addr_t cln)
 	return active_cacheline_set_overlap(cln, --overlap);
 }
 
-static int active_cacheline_insert(struct dma_debug_entry *entry)
+static int active_cacheline_insert(struct dma_debug_entry *entry,
+				   bool *overlap_cache_clean)
 {
 	phys_addr_t cln = to_cacheline_number(entry);
 	unsigned long flags;
 	int rc;
 
+	*overlap_cache_clean = false;
+
 	/* If the device is not writing memory then we don't have any
 	 * concerns about the cpu consuming stale data.  This mitigates
 	 * legitimate usages of overlapping mappings.
@@ -487,8 +492,14 @@ static int active_cacheline_insert(struct dma_debug_entry *entry)
 
 	spin_lock_irqsave(&radix_lock, flags);
 	rc = radix_tree_insert(&dma_active_cacheline, cln, entry);
-	if (rc == -EEXIST)
+	if (rc == -EEXIST) {
+		struct dma_debug_entry *existing;
+
 		active_cacheline_inc_overlap(cln);
+		existing = radix_tree_lookup(&dma_active_cacheline, cln);
+		if (existing)
+			*overlap_cache_clean = existing->is_cache_clean;
+	}
 	spin_unlock_irqrestore(&radix_lock, flags);
 
 	return rc;
@@ -583,20 +594,24 @@ DEFINE_SHOW_ATTRIBUTE(dump);
  */
 static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
 {
+	bool overlap_cache_clean;
 	struct hash_bucket *bucket;
 	unsigned long flags;
 	int rc;
 
+	entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
+
 	bucket = get_hash_bucket(entry, &flags);
 	hash_bucket_add(bucket, entry);
 	put_hash_bucket(bucket, flags);
 
-	rc = active_cacheline_insert(entry);
+	rc = active_cacheline_insert(entry, &overlap_cache_clean);
 	if (rc == -ENOMEM) {
 		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
 		global_disable = true;
 	} else if (rc == -EEXIST &&
-		   !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_CPU_CACHE_CLEAN)) &&
+		   !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+		   !(entry->is_cache_clean && overlap_cache_clean) &&
 		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
 		     is_swiotlb_active(entry->dev))) {
 		err_printk(entry->dev, entry,
-- 
MST



* [PATCH RFC 06/13] virtio: add virtqueue_add_inbuf_cache_clean API
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (4 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 05/13] dma-debug: track cache clean flag in entries Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 07/13] vsock/virtio: fix DMA alignment for event_list Michael S. Tsirkin
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

Add virtqueue_add_inbuf_cache_clean() for passing DMA_ATTR_CPU_CACHE_CLEAN
to virtqueue operations. This suppresses DMA debug cacheline overlap
warnings for buffers where proper cache management is ensured by the
caller.
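
Usage mirrors virtqueue_add_inbuf(); for example, as in the vsock
conversion later in this series (vq and event as in that caller):

	struct scatterlist sg;

	sg_init_one(&sg, event, sizeof(*event));
	err = virtqueue_add_inbuf_cache_clean(vq, &sg, 1, event, GFP_KERNEL);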

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c | 72 ++++++++++++++++++++++++++----------
 include/linux/virtio.h       |  5 +++
 2 files changed, 58 insertions(+), 19 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 1832ea7982a6..19a4a8cd22f9 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -382,7 +382,7 @@ static int vring_mapping_error(const struct vring_virtqueue *vq,
 /* Map one sg entry. */
 static int vring_map_one_sg(const struct vring_virtqueue *vq, struct scatterlist *sg,
 			    enum dma_data_direction direction, dma_addr_t *addr,
-			    u32 *len, bool premapped)
+			    u32 *len, bool premapped, unsigned long attr)
 {
 	if (premapped) {
 		*addr = sg_dma_address(sg);
@@ -410,7 +410,7 @@ static int vring_map_one_sg(const struct vring_virtqueue *vq, struct scatterlist
 	 */
 	*addr = virtqueue_map_page_attrs(&vq->vq, sg_page(sg),
 					 sg->offset, sg->length,
-					 direction, 0);
+					 direction, attr);
 
 	if (vring_mapping_error(vq, *addr))
 		return -ENOMEM;
@@ -539,7 +539,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
 				      void *data,
 				      void *ctx,
 				      bool premapped,
-				      gfp_t gfp)
+				      gfp_t gfp,
+				      unsigned long attr)
 {
 	struct vring_desc_extra *extra;
 	struct scatterlist *sg;
@@ -605,7 +606,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
 			dma_addr_t addr;
 			u32 len;
 
-			if (vring_map_one_sg(vq, sg, DMA_TO_DEVICE, &addr, &len, premapped))
+			if (vring_map_one_sg(vq, sg, DMA_TO_DEVICE, &addr, &len,
+					     premapped, attr))
 				goto unmap_release;
 
 			prev = i;
@@ -622,7 +624,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
 			dma_addr_t addr;
 			u32 len;
 
-			if (vring_map_one_sg(vq, sg, DMA_FROM_DEVICE, &addr, &len, premapped))
+			if (vring_map_one_sg(vq, sg, DMA_FROM_DEVICE, &addr, &len,
+					     premapped, attr))
 				goto unmap_release;
 
 			prev = i;
@@ -1315,7 +1318,8 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 					 unsigned int in_sgs,
 					 void *data,
 					 bool premapped,
-					 gfp_t gfp)
+					 gfp_t gfp,
+					 unsigned long attr)
 {
 	struct vring_desc_extra *extra;
 	struct vring_packed_desc *desc;
@@ -1346,7 +1350,7 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
 			if (vring_map_one_sg(vq, sg, n < out_sgs ?
 					     DMA_TO_DEVICE : DMA_FROM_DEVICE,
-					     &addr, &len, premapped))
+					     &addr, &len, premapped, attr))
 				goto unmap_release;
 
 			desc[i].flags = cpu_to_le16(n < out_sgs ?
@@ -1441,7 +1445,8 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
 				       void *data,
 				       void *ctx,
 				       bool premapped,
-				       gfp_t gfp)
+				       gfp_t gfp,
+				       unsigned long attr)
 {
 	struct vring_packed_desc *desc;
 	struct scatterlist *sg;
@@ -1466,7 +1471,7 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
 
 	if (virtqueue_use_indirect(vq, total_sg)) {
 		err = virtqueue_add_indirect_packed(vq, sgs, total_sg, out_sgs,
-						    in_sgs, data, premapped, gfp);
+						    in_sgs, data, premapped, gfp, attr);
 		if (err != -ENOMEM) {
 			END_USE(vq);
 			return err;
@@ -1502,7 +1507,7 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
 
 			if (vring_map_one_sg(vq, sg, n < out_sgs ?
 					     DMA_TO_DEVICE : DMA_FROM_DEVICE,
-					     &addr, &len, premapped))
+					     &addr, &len, premapped, attr))
 				goto unmap_release;
 
 			flags = cpu_to_le16(vq->packed.avail_used_flags |
@@ -2244,14 +2249,17 @@ static inline int virtqueue_add(struct virtqueue *_vq,
 				void *data,
 				void *ctx,
 				bool premapped,
-				gfp_t gfp)
+				gfp_t gfp,
+				unsigned long attr)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
 	return vq->packed_ring ? virtqueue_add_packed(vq, sgs, total_sg,
-					out_sgs, in_sgs, data, ctx, premapped, gfp) :
+					out_sgs, in_sgs, data, ctx, premapped, gfp,
+					attr) :
 				 virtqueue_add_split(vq, sgs, total_sg,
-					out_sgs, in_sgs, data, ctx, premapped, gfp);
+					out_sgs, in_sgs, data, ctx, premapped, gfp,
+					attr);
 }
 
 /**
@@ -2289,7 +2297,7 @@ int virtqueue_add_sgs(struct virtqueue *_vq,
 			total_sg++;
 	}
 	return virtqueue_add(_vq, sgs, total_sg, out_sgs, in_sgs,
-			     data, NULL, false, gfp);
+			     data, NULL, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_sgs);
 
@@ -2311,7 +2319,7 @@ int virtqueue_add_outbuf(struct virtqueue *vq,
 			 void *data,
 			 gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, false, gfp);
+	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_outbuf);
 
@@ -2334,7 +2342,7 @@ int virtqueue_add_outbuf_premapped(struct virtqueue *vq,
 				   void *data,
 				   gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, true, gfp);
+	return virtqueue_add(vq, &sg, num, 1, 0, data, NULL, true, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_outbuf_premapped);
 
@@ -2356,10 +2364,36 @@ int virtqueue_add_inbuf(struct virtqueue *vq,
 			void *data,
 			gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp);
+	return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf);
 
+/**
+ * virtqueue_add_inbuf_cache_clean - expose input buffers with cache clean hint
+ * @vq: the struct virtqueue we're talking about.
+ * @sg: scatterlist (must be well-formed and terminated!)
+ * @num: the number of entries in @sg writable by other side
+ * @data: the token identifying the buffer.
+ * @gfp: how to do memory allocations (if necessary).
+ *
+ * Adds DMA_ATTR_CPU_CACHE_CLEAN attribute to suppress overlapping cacheline
+ * warnings in DMA debug builds. Has no effect in production builds.
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ *
+ * Returns zero or a negative error (ie. ENOSPC, ENOMEM, EIO).
+ */
+int virtqueue_add_inbuf_cache_clean(struct virtqueue *vq,
+				    struct scatterlist *sg, unsigned int num,
+				    void *data,
+				    gfp_t gfp)
+{
+	return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp,
+			     DMA_ATTR_CPU_CACHE_CLEAN);
+}
+EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_cache_clean);
+
 /**
  * virtqueue_add_inbuf_ctx - expose input buffers to other end
  * @vq: the struct virtqueue we're talking about.
@@ -2380,7 +2414,7 @@ int virtqueue_add_inbuf_ctx(struct virtqueue *vq,
 			void *ctx,
 			gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, false, gfp);
+	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, false, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_ctx);
 
@@ -2405,7 +2439,7 @@ int virtqueue_add_inbuf_premapped(struct virtqueue *vq,
 				  void *ctx,
 				  gfp_t gfp)
 {
-	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, true, gfp);
+	return virtqueue_add(vq, &sg, num, 0, 1, data, ctx, true, gfp, 0);
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_premapped);
 
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 3626eb694728..63bb05ece8c5 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -62,6 +62,11 @@ int virtqueue_add_inbuf(struct virtqueue *vq,
 			void *data,
 			gfp_t gfp);
 
+int virtqueue_add_inbuf_cache_clean(struct virtqueue *vq,
+				    struct scatterlist sg[], unsigned int num,
+				    void *data,
+				    gfp_t gfp);
+
 int virtqueue_add_inbuf_ctx(struct virtqueue *vq,
 			    struct scatterlist sg[], unsigned int num,
 			    void *data,
-- 
MST



* [PATCH RFC 07/13] vsock/virtio: fix DMA alignment for event_list
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (5 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 06/13] virtio: add virtqueue_add_inbuf_cache_clean API Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 08/13] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

On non-cache-coherent platforms, when a structure contains a buffer
used for DMA alongside fields that the CPU writes to, cacheline sharing
can cause data corruption.

The event_list array is used for DMA_FROM_DEVICE operations via
virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are
written by the CPU while the buffer is available to (i.e. mapped for)
the device. If they share cachelines with event_list, CPU writes can
corrupt DMA data.

Add __dma_from_device_aligned_begin/end annotations to ensure event_list
is isolated in its own cachelines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 8c867023a2e5..76099f7dc040 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -17,6 +17,7 @@
 #include <linux/virtio_ids.h>
 #include <linux/virtio_config.h>
 #include <linux/virtio_vsock.h>
+#include <linux/dma-mapping.h>
 #include <net/sock.h>
 #include <linux/mutex.h>
 #include <net/af_vsock.h>
@@ -59,8 +60,10 @@ struct virtio_vsock {
 	 */
 	struct mutex event_lock;
 	bool event_run;
+	__dma_from_device_aligned_begin
 	struct virtio_vsock_event event_list[8];
 
+	__dma_from_device_aligned_end
 	u32 guest_cid;
 	bool seqpacket_allow;
 
-- 
MST



* [PATCH RFC 08/13] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (6 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 07/13] vsock/virtio: fix DMA alignment for event_list Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 09/13] virtio_input: fix DMA alignment for evts Michael S. Tsirkin
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

The event_list array contains 8 small (4-byte) events that share
cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
this can trigger warnings about overlapping DMA mappings within
the same cacheline.

The previous patch isolated event_list in its own cache lines,
so the warnings are now spurious.

Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
write into these fields, suppressing the warnings.

Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 76099f7dc040..f1589db5d190 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -393,7 +393,7 @@ static int virtio_vsock_event_fill_one(struct virtio_vsock *vsock,
 
 	sg_init_one(&sg, event, sizeof(*event));
 
-	return virtqueue_add_inbuf(vq, &sg, 1, event, GFP_KERNEL);
+	return virtqueue_add_inbuf_cache_clean(vq, &sg, 1, event, GFP_KERNEL);
 }
 
 /* event_lock must be held */
-- 
MST



* [PATCH RFC 09/13] virtio_input: fix DMA alignment for evts
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (7 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 08/13] vsock/virtio: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 10/13] virtio_scsi: fix DMA cacheline issues for events Michael S. Tsirkin
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

On non-cache-coherent platforms, when a structure contains a buffer
used for DMA alongside fields that the CPU writes to, cacheline sharing
can cause data corruption.

The evts array is used for DMA_FROM_DEVICE operations via
virtqueue_add_inbuf(). The adjacent lock and ready fields are written
by the CPU during normal operation. If these share cachelines with evts,
CPU writes can corrupt DMA data.

Add __dma_from_device_aligned_begin/end annotations to ensure evts is
isolated in its own cachelines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_input.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index d0728285b6ce..774494754a99 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -4,6 +4,7 @@
 #include <linux/virtio_config.h>
 #include <linux/input.h>
 #include <linux/slab.h>
+#include <linux/dma-mapping.h>
 
 #include <uapi/linux/virtio_ids.h>
 #include <uapi/linux/virtio_input.h>
@@ -16,7 +17,9 @@ struct virtio_input {
 	char                       serial[64];
 	char                       phys[64];
 	struct virtqueue           *evt, *sts;
+	__dma_from_device_aligned_begin
 	struct virtio_input_event  evts[64];
+	__dma_from_device_aligned_end
 	spinlock_t                 lock;
 	bool                       ready;
 };
-- 
MST



* [PATCH RFC 10/13] virtio_scsi: fix DMA cacheline issues for events
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (8 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 09/13] virtio_input: fix DMA alignment for evts Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 11/13] virtio-rng: fix DMA alignment for data buffer Michael S. Tsirkin
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

Current struct virtio_scsi_event_node layout has two problems:

The event (DMA_FROM_DEVICE) and work (CPU-written via
INIT_WORK/queue_work) fields share a cacheline.
On non-cache-coherent platforms, CPU writes to work can
corrupt device-written event data.

If ARCH_DMA_MINALIGN is large enough, the 8 events in event_list share
cachelines, triggering CONFIG_DMA_API_DEBUG warnings.

Fix the corruption by moving event buffers to a separate array and
aligning using __dma_from_device_aligned_begin/end.

Suppress the (now spurious) DMA debug warnings using
virtqueue_add_inbuf_cache_clean().

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 96a69edddbe5..b0ce3884e22a 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -29,6 +29,7 @@
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsi_devinfo.h>
 #include <linux/seqlock.h>
+#include <linux/dma-mapping.h>
 
 #include "sd.h"
 
@@ -61,7 +62,7 @@ struct virtio_scsi_cmd {
 
 struct virtio_scsi_event_node {
 	struct virtio_scsi *vscsi;
-	struct virtio_scsi_event event;
+	struct virtio_scsi_event *event;
 	struct work_struct work;
 };
 
@@ -89,6 +90,12 @@ struct virtio_scsi {
 
 	struct virtio_scsi_vq ctrl_vq;
 	struct virtio_scsi_vq event_vq;
+
+	/* DMA buffers for events - aligned, kept separate from CPU-written fields */
+	__dma_from_device_aligned_begin
+	struct virtio_scsi_event events[VIRTIO_SCSI_EVENT_LEN];
+	__dma_from_device_aligned_end
+
 	struct virtio_scsi_vq req_vqs[];
 };
 
@@ -237,12 +244,12 @@ static int virtscsi_kick_event(struct virtio_scsi *vscsi,
 	unsigned long flags;
 
 	INIT_WORK(&event_node->work, virtscsi_handle_event);
-	sg_init_one(&sg, &event_node->event, sizeof(struct virtio_scsi_event));
+	sg_init_one(&sg, event_node->event, sizeof(struct virtio_scsi_event));
 
 	spin_lock_irqsave(&vscsi->event_vq.vq_lock, flags);
 
-	err = virtqueue_add_inbuf(vscsi->event_vq.vq, &sg, 1, event_node,
-				  GFP_ATOMIC);
+	err = virtqueue_add_inbuf_cache_clean(vscsi->event_vq.vq, &sg, 1, event_node,
+					      GFP_ATOMIC);
 	if (!err)
 		virtqueue_kick(vscsi->event_vq.vq);
 
@@ -257,6 +264,7 @@ static int virtscsi_kick_event_all(struct virtio_scsi *vscsi)
 
 	for (i = 0; i < VIRTIO_SCSI_EVENT_LEN; i++) {
 		vscsi->event_list[i].vscsi = vscsi;
+		vscsi->event_list[i].event = &vscsi->events[i];
 		virtscsi_kick_event(vscsi, &vscsi->event_list[i]);
 	}
 
@@ -380,7 +388,7 @@ static void virtscsi_handle_event(struct work_struct *work)
 	struct virtio_scsi_event_node *event_node =
 		container_of(work, struct virtio_scsi_event_node, work);
 	struct virtio_scsi *vscsi = event_node->vscsi;
-	struct virtio_scsi_event *event = &event_node->event;
+	struct virtio_scsi_event *event = event_node->event;
 
 	if (event->event &
 	    cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
-- 
MST



* [PATCH RFC 11/13] virtio-rng: fix DMA alignment for data buffer
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (9 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 10/13] virtio_scsi: fix DMA cacheline issues for events Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 12/13] virtio_input: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

The data buffer in struct virtrng_info is used for DMA_FROM_DEVICE via
virtqueue_add_inbuf() and shares cachelines with the adjacent
CPU-written fields (data_avail, data_idx).

The device writing to the DMA buffer and the CPU writing to adjacent
fields could corrupt each other's data on non-cache-coherent platforms.

Add the __dma_from_device_aligned_begin annotation to place the buffer
in its own cache lines.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/char/hw_random/virtio-rng.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index dd998f4fe4f2..fb3c57bee3b1 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -11,6 +11,7 @@
 #include <linux/spinlock.h>
 #include <linux/virtio.h>
 #include <linux/virtio_rng.h>
+#include <linux/dma-mapping.h>
 #include <linux/module.h>
 #include <linux/slab.h>
 
@@ -28,6 +29,7 @@ struct virtrng_info {
 	unsigned int data_avail;
 	unsigned int data_idx;
 	/* minimal size returned by rng_buffer_size() */
+	__dma_from_device_aligned_begin
 #if SMP_CACHE_BYTES < 32
 	u8 data[32];
 #else
-- 
MST



* [PATCH RFC 12/13] virtio_input: use virtqueue_add_inbuf_cache_clean for events
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (10 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 11/13] virtio-rng: fix DMA alignment for data buffer Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 10:16 ` [PATCH RFC 13/13] vsock/virtio: reorder fields to reduce struct padding Michael S. Tsirkin
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

The evts array contains 64 small (8-byte) input events that share
cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
this can trigger warnings about overlapping DMA mappings within
the same cacheline.

The previous patch isolated the array in its own cachelines,
so the warnings are now spurious.

Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
write into these cache lines, suppressing these warnings.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index 774494754a99..b26db7d6a49f 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -30,7 +30,7 @@ static void virtinput_queue_evtbuf(struct virtio_input *vi,
 	struct scatterlist sg[1];
 
 	sg_init_one(sg, evtbuf, sizeof(*evtbuf));
-	virtqueue_add_inbuf(vi->evt, sg, 1, evtbuf, GFP_ATOMIC);
+	virtqueue_add_inbuf_cache_clean(vi->evt, sg, 1, evtbuf, GFP_ATOMIC);
 }
 
 static void virtinput_recv_events(struct virtqueue *vq)
-- 
MST



* [PATCH RFC 13/13] vsock/virtio: reorder fields to reduce struct padding
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (11 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 12/13] virtio_input: use virtqueue_add_inbuf_cache_clean for events Michael S. Tsirkin
@ 2025-12-30 10:16 ` Michael S. Tsirkin
  2025-12-30 16:40 ` [PATCH RFC 14/13] gpio: virtio: fix DMA alignment Michael S. Tsirkin
  2025-12-30 16:40 ` [PATCH RFC 15/13] gpio: virtio: reorder fields to reduce struct padding Michael S. Tsirkin
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 10:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev

Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
last. This eliminates the need for __dma_from_device_aligned_end padding
after the DMA buffer, since struct tail padding naturally protects it,
making the struct a bit smaller.

Size reduction estimation when ARCH_DMA_MINALIGN=128:
- event_list is 32 bytes
- removing _end saves up to 128-32=96 bytes of padding to align the next field

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index f1589db5d190..2e34581f1143 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -60,10 +60,7 @@ struct virtio_vsock {
 	 */
 	struct mutex event_lock;
 	bool event_run;
-	__dma_from_device_aligned_begin
-	struct virtio_vsock_event event_list[8];
 
-	__dma_from_device_aligned_end
 	u32 guest_cid;
 	bool seqpacket_allow;
 
@@ -77,6 +74,10 @@ struct virtio_vsock {
 	 */
 	struct scatterlist *out_sgs[MAX_SKB_FRAGS + 1];
 	struct scatterlist out_bufs[MAX_SKB_FRAGS + 1];
+
+	/* DMA buffer - must be last, aligned for non-cache-coherent DMA */
+	__dma_from_device_aligned_begin
+	struct virtio_vsock_event event_list[8];
 };
 
 static u32 virtio_transport_get_local_cid(void)
-- 
MST



* [PATCH RFC 14/13] gpio: virtio: fix DMA alignment
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (12 preceding siblings ...)
  2025-12-30 10:16 ` [PATCH RFC 13/13] vsock/virtio: reorder fields to reduce struct padding Michael S. Tsirkin
@ 2025-12-30 16:40 ` Michael S. Tsirkin
  2025-12-30 16:40 ` [PATCH RFC 15/13] gpio: virtio: reorder fields to reduce struct padding Michael S. Tsirkin
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 16:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev, Enrico Weigelt, metux IT consult, Viresh Kumar,
	Linus Walleij, Bartosz Golaszewski, linux-gpio

The res and ires buffers in struct virtio_gpio_line and struct
vgpio_irq_line respectively are used for DMA_FROM_DEVICE via virtqueue_add_sgs().
However, within these structs, even though these elements are tagged
as ____cacheline_aligned, adjacent struct elements
can share DMA cachelines on platforms where ARCH_DMA_MINALIGN >
L1_CACHE_BYTES (e.g., arm64 with 128-byte DMA alignment but 64-byte
cache lines).

The existing ____cacheline_aligned annotation aligns to L1_CACHE_BYTES,
which is not always sufficient for DMA alignment. For example,
with L1_CACHE_BYTES = 32 and ARCH_DMA_MINALIGN = 128:
  - irq_lines[0].ires at offset 128
  - irq_lines[1].type at offset 192
both fall in the same 128-byte DMA cacheline [128, 256).

When the device writes to irq_lines[0].ires and the CPU concurrently
modifies one of the irq_lines[1].type/disabled/masked/queued flags,
corruption can occur on non-cache-coherent platforms.

Fix by using __dma_from_device_aligned_begin/end annotations on the
DMA buffers. Drop ____cacheline_aligned - it's not required to isolate
request and response, and keeping them would increase the memory cost.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/gpio/gpio-virtio.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-virtio.c b/drivers/gpio/gpio-virtio.c
index 17e040991e46..32b578b46df8 100644
--- a/drivers/gpio/gpio-virtio.c
+++ b/drivers/gpio/gpio-virtio.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/completion.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/gpio/driver.h>
 #include <linux/io.h>
@@ -24,8 +25,12 @@
 struct virtio_gpio_line {
 	struct mutex lock; /* Protects line operation */
 	struct completion completion;
-	struct virtio_gpio_request req ____cacheline_aligned;
-	struct virtio_gpio_response res ____cacheline_aligned;
+
+	__dma_from_device_aligned_begin
+	struct virtio_gpio_request req;
+	struct virtio_gpio_response res;
+
+	__dma_from_device_aligned_end
 	unsigned int rxlen;
 };
 
@@ -37,8 +42,9 @@ struct vgpio_irq_line {
 	bool update_pending;
 	bool queue_pending;
 
-	struct virtio_gpio_irq_request ireq ____cacheline_aligned;
-	struct virtio_gpio_irq_response ires ____cacheline_aligned;
+	__dma_from_device_aligned_begin
+	struct virtio_gpio_irq_request ireq;
+	struct virtio_gpio_irq_response ires;
 };
 
 struct virtio_gpio {
-- 
MST



* [PATCH RFC 15/13] gpio: virtio: reorder fields to reduce struct padding
  2025-12-30 10:15 [PATCH RFC 00/13] fix DMA alignment issues around virtio Michael S. Tsirkin
                   ` (13 preceding siblings ...)
  2025-12-30 16:40 ` [PATCH RFC 14/13] gpio: virtio: fix DMA alignment Michael S. Tsirkin
@ 2025-12-30 16:40 ` Michael S. Tsirkin
  14 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2025-12-30 16:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jonathan Corbet, Olivia Mackall, Herbert Xu,
	Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Pérez,
	James E.J. Bottomley, Martin K. Petersen, Gerd Hoffmann,
	Xuan Zhuo, Marek Szyprowski, Robin Murphy, Stefano Garzarella,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Petr Tesarik, Leon Romanovsky, Jason Gunthorpe,
	linux-doc, linux-crypto, virtualization, linux-scsi, iommu, kvm,
	netdev, Enrico Weigelt, metux IT consult, Viresh Kumar,
	Linus Walleij, Bartosz Golaszewski, linux-gpio

Reorder struct virtio_gpio_line fields to place the DMA buffers (req/res)
last. This eliminates the need for __dma_from_device_aligned_end padding
after the DMA buffer, since struct tail padding naturally protects it,
making the struct a bit smaller.

Size reduction estimation when ARCH_DMA_MINALIGN=128:
- request is 8 bytes
- response is 2 bytes
- removing _end saves up to 128-10=118 bytes of padding to align the rxlen field

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/gpio/gpio-virtio.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpio/gpio-virtio.c b/drivers/gpio/gpio-virtio.c
index 32b578b46df8..8b30a94e4625 100644
--- a/drivers/gpio/gpio-virtio.c
+++ b/drivers/gpio/gpio-virtio.c
@@ -26,12 +26,11 @@ struct virtio_gpio_line {
 	struct mutex lock; /* Protects line operation */
 	struct completion completion;
 
+	unsigned int rxlen;
+
 	__dma_from_device_aligned_begin
 	struct virtio_gpio_request req;
 	struct virtio_gpio_response res;
-
-	__dma_from_device_aligned_end
-	unsigned int rxlen;
 };
 
 struct vgpio_irq_line {
-- 
MST


