netdev.vger.kernel.org archive mirror

* [PATCH AUTOSEL 6.16-5.4] vhost: fail early when __vhost_add_used() fails
       [not found] <20250805130945.471732-1-sashal@kernel.org>
@ 2025-08-05 13:08 ` Sasha Levin
  2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.1] vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page Sasha Levin
  1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-08-05 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Jason Wang, Eugenio Pérez, Michael S . Tsirkin, Lei Yang,
	Sasha Levin, kvm, virtualization, netdev

From: Jason Wang <jasowang@redhat.com>

[ Upstream commit b4ba1207d45adaafa2982c035898b36af2d3e518 ]

This patch fails vhost_add_used_n() early when __vhost_add_used()
fails to make sure used idx is not updated with stale used ring
information.

Reported-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250714084755.11921-2-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees. Here's my
extensive analysis:

## Bug Analysis

The commit fixes a critical error handling bug in `vhost_add_used_n()`
where the function continues execution even after `__vhost_add_used_n()`
fails on line 2776. Looking at the code:

1. **The Bug**: In the original code at drivers/vhost/vhost.c:2776, when
   the second call to `__vhost_add_used_n()` fails and returns an error,
   the function doesn't immediately return. Instead, it continues to:
   - Execute memory barrier (`smp_wmb()`) at line 2779
   - Update the used index via `vhost_put_used_idx()` at line 2780
   - Potentially log the used index update at lines 2784-2791

2. **Impact**: This means the vhost driver updates the used ring index
   even when the actual used ring entries weren't successfully written.
   This creates a **data corruption scenario** where:
   - The guest sees an updated used index
   - But the corresponding used ring entries contain stale/invalid data
   - This can lead to guest crashes, data corruption, or unpredictable
     behavior

3. **The Fix**: The patch adds a simple but crucial check at lines
   2778-2779 (after applying):
   ```c
   if (r < 0)
           return r;
   ```
   This ensures the function returns immediately upon failure, preventing
   the index from being updated with invalid ring state.
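
The control flow described above can be sketched as a small userspace
model. This is only an illustration, not the kernel code: `struct toy_vq`,
`toy_write_used()`, `toy_add_used_n()` and the `early_return` switch are
hypothetical names invented here, and the injected failure stands in for
`vhost_put_used()` returning an error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtqueue; NOT struct vhost_virtqueue. */
struct toy_vq {
	uint16_t last_used_idx;  /* host-private count of written entries */
	uint16_t published_idx;  /* used index as visible to the guest */
	int calls;               /* number of ring writes attempted so far */
	int fail_on_call;        /* which write should fail (0 = never) */
};

/* Models __vhost_add_used_n(): write count entries or fail with -EFAULT. */
static int toy_write_used(struct toy_vq *vq, unsigned int count)
{
	if (++vq->calls == vq->fail_on_call)
		return -EFAULT;  /* nothing written, index not advanced */
	vq->last_used_idx += (uint16_t)count;
	return 0;
}

/*
 * Models vhost_add_used_n(): a batch that wraps the ring is written in
 * two parts. early_return == true models the patched behaviour,
 * early_return == false the behaviour before the patch.
 */
static int toy_add_used_n(struct toy_vq *vq, unsigned int count,
			  unsigned int ring_size, bool early_return)
{
	unsigned int start = vq->last_used_idx % ring_size;
	unsigned int n = ring_size - start;
	int r;

	if (n < count) {
		r = toy_write_used(vq, n);  /* first part: already checked */
		if (r < 0)
			return r;
		count -= n;
	}
	r = toy_write_used(vq, count);      /* second part */
	if (early_return && r < 0)
		return r;                   /* the check the patch adds */

	vq->published_idx = vq->last_used_idx;  /* publish the used index */
	return r;
}
```

With a four-entry ring, `last_used_idx` at 3, and a failure injected on
the second write, the unpatched path still publishes a new index while
returning `-EFAULT`, whereas the patched path leaves the published index
untouched.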

## Stable Backport Criteria Assessment

1. **Bug Fix**: ✓ This fixes a real bug that can cause data corruption
   in vhost operations
2. **Small and Contained**: ✓ The fix adds only 3 lines (a two-line
   check plus a blank line), which is extremely minimal
3. **No Side Effects**: ✓ The change only adds proper error handling, no
   behavioral changes for success cases
4. **No Architectural Changes**: ✓ Simple error check addition, no
   design changes
5. **Critical Subsystem**: ✓ vhost is used for virtualization (virtio
   devices), affecting VMs and containers
6. **Clear Bug Impact**: ✓ Data corruption in guest-host communication
   is a serious issue
7. **Follows Stable Rules**: ✓ Important bugfix with minimal regression
   risk

## Additional Evidence

- The bug was reported by Eugenio Pérez of Red Hat (see the Reported-by
  tag); the commit message does not say how it was found
- The fix has been tested (as indicated by "Tested-by: Lei Yang")
- The function `__vhost_add_used_n()` can fail with -EFAULT when
  `vhost_put_used()` fails (lines 2738-2740)
- The first call to `__vhost_add_used_n()` already has proper error
  handling (lines 2770-2772), making this an inconsistency bug

This is a textbook example of a stable backport candidate: a small,
obvious fix for a real bug that can cause data corruption in a critical
kernel subsystem.

 drivers/vhost/vhost.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 3a5ebb973dba..d1d3912f4804 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2775,6 +2775,9 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 	}
 	r = __vhost_add_used_n(vq, heads, count);

+	if (r < 0)
+		return r;
+
 	/* Make sure buffer is written before we update index. */
 	smp_wmb();
 	if (vhost_put_used_idx(vq)) {
-- 
2.39.5


* [PATCH AUTOSEL 6.16-6.1] vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page
       [not found] <20250805130945.471732-1-sashal@kernel.org>
  2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] vhost: fail early when __vhost_add_used() fails Sasha Levin
@ 2025-08-05 13:09 ` Sasha Levin
  1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-08-05 13:09 UTC (permalink / raw)
  To: patches, stable
  Cc: Will Deacon, Stefano Garzarella, Michael S . Tsirkin, Sasha Levin,
	stefanha, jasowang, kvm, virtualization, netdev

From: Will Deacon <will@kernel.org>

[ Upstream commit 03a92f036a04fed2b00d69f5f46f1a486e70dc5c ]

When allocating receive buffers for the vsock virtio RX virtqueue, an
SKB is allocated with a 4140 data payload (the 44-byte packet header +
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE). Even when factoring in the SKB
overhead, the resulting 8KiB allocation thanks to the rounding in
kmalloc_reserve() is wasteful (~3700 unusable bytes) and results in a
higher-order page allocation on systems with 4KiB pages just for the
sake of a few hundred bytes of packet data.

Limit the vsock virtio RX buffers to 4KiB per SKB, resulting in much
better memory utilisation and removing the need to allocate higher-order
pages entirely.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
Message-Id: <20250717090116.11987-5-will@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit is a good candidate for stable backporting based on the
following analysis:

## Reasons for backporting:

1. **Memory efficiency fix**: The commit addresses a significant memory
   waste issue where ~3700 bytes out of each 8KiB allocation were
   unusable. That is roughly 45% of every RX buffer allocation wasted.

2. **Avoids higher-order page allocations**: The original code forces
   the kernel to allocate higher-order pages (8KiB) on systems with 4KiB
   pages, which can lead to:
   - Memory fragmentation issues
   - Allocation failures under memory pressure
   - Degraded system performance

3. **Simple, contained fix**: The change is minimal and well-contained:
   - Changes `VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE` from a hardcoded 4096 to
     `SKB_WITH_OVERHEAD(1024 * 4)`
   - Removes the addition of `VIRTIO_VSOCK_SKB_HEADROOM` in
     `virtio_vsock_rx_fill()`
   - The SKB_WITH_OVERHEAD macro (defined as `((X) -
     SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))`) ensures the
     entire SKB fits in a 4K page

4. **No protocol changes**: This is purely an allocation optimization
   that:
   - Doesn't change the protocol behavior
   - Doesn't introduce new features
   - Maintains backward compatibility
   - Only affects memory allocation patterns and the size of each RX
     buffer, which shrinks slightly

5. **Clear performance benefit**: This provides immediate benefits to
   all vsock users by:
   - Halving the allocation behind each RX buffer (from 8KiB to 4KiB)
   - Eliminating pressure on higher-order page allocations
   - Improving overall system memory efficiency

6. **Low risk**: The change is straightforward with minimal risk of
   regression:
   - The logic remains the same
   - Only the buffer sizing calculation changes
   - Has been reviewed by the vsock maintainer (Stefano Garzarella)

The commit meets all the stable tree criteria: it fixes a real problem
(memory waste and higher-order allocations), is minimal in scope,
doesn't introduce new features, and has clear benefits with low risk of
regression.
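
The sizing arithmetic behind these points can be checked with a short
sketch. It is a simplified model under stated assumptions: the 320-byte
value for the aligned `struct skb_shared_info` is an assumed figure for a
typical 64-bit build (it varies with kernel configuration), the
power-of-two rounding only approximates what the slab allocator does for
sizes in this range, and all `toy_`/`TOY_` names are hypothetical.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE   4096u  /* 4KiB page, as in the commit message */
#define TOY_HDR_LEN     44u    /* packet header size from the commit message */
/* ASSUMPTION: SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) == 320 */
#define TOY_SHINFO_SIZE 320u

/* Round n up to the next power of two (simplified allocator model). */
static size_t toy_round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes actually allocated for an SKB data area of data_len bytes. */
static size_t toy_alloc_size(size_t data_len)
{
	return toy_round_up_pow2(data_len + TOY_SHINFO_SIZE);
}

/* Old request: header + 4096-byte VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE. */
static size_t toy_old_len(void)
{
	return TOY_HDR_LEN + TOY_PAGE_SIZE;
}

/* New request: models SKB_WITH_OVERHEAD(1024 * 4). */
static size_t toy_new_len(void)
{
	return TOY_PAGE_SIZE - TOY_SHINFO_SIZE;
}

/* Bytes allocated but neither requested nor used by shared info. */
static size_t toy_waste(size_t data_len)
{
	return toy_alloc_size(data_len) - (data_len + TOY_SHINFO_SIZE);
}
```

Under these assumptions the old 4140-byte request lands in an 8192-byte
allocation with 3732 bytes unused, in line with the "~3700 unusable
bytes" in the commit message, while the new request fills a 4096-byte
allocation exactly.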

 include/linux/virtio_vsock.h     | 7 ++++++-
 net/vmw_vsock/virtio_transport.c | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 36fb3edfa403..6c00687539cf 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -111,7 +111,12 @@ static inline size_t virtio_vsock_skb_len(struct sk_buff *skb)
 	return (size_t)(skb_end_pointer(skb) - skb->head);
 }
 
-#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
+/* Dimension the RX SKB so that the entire thing fits exactly into
+ * a single 4KiB page. This avoids wasting memory due to alloc_skb()
+ * rounding up to the next page order and also means that we
+ * don't leave higher-order pages sitting around in the RX queue.
+ */
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	SKB_WITH_OVERHEAD(1024 * 4)
 #define VIRTIO_VSOCK_MAX_BUF_SIZE		0xFFFFFFFFUL
 #define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)
 
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index f0e48e6911fc..f01f9e878106 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -307,7 +307,7 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
 
 static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
 {
-	int total_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM;
+	int total_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
 	struct scatterlist pkt, *p;
 	struct virtqueue *vq;
 	struct sk_buff *skb;
-- 
2.39.5

