* [PATCH 0/5] vsock/virtio: SKB allocation improvements
@ 2025-06-25 13:15 Will Deacon
2025-06-25 13:15 ` [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs Will Deacon
` (5 more replies)
0 siblings, 6 replies; 20+ messages in thread
From: Will Deacon @ 2025-06-25 13:15 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, netdev, virtualization
Hi folks,
We're using vsock extensively in Android as a channel over which we can
route binder transactions to/from virtual machines managed by the
Android Virtualisation Framework. However, we have been observing some
issues in production builds when using vsock in a low-memory environment
(on the host and the guest) such as:
* The host receive path hanging forever, despite the guest performing
a successful write to the socket.
* Page allocation failures in the vhost receive path (this is a likely
contributor to the above)
* -ENOMEM coming back from sendmsg()
This series aims to improve the vsock SKB allocation for both the host
(vhost) and the guest when using the virtio transport to help mitigate
these issues. Specifically:
- Avoid single allocations of order > PAGE_ALLOC_COSTLY_ORDER
- Use non-linear SKBs for the transmit and vhost receive paths
- Reduce the guest RX buffers to a single page
There are more details in the individual commit messages but overall
this results in less wasted memory and puts less pressure on the
allocator.
This is my first time looking at this stuff, so all feedback is welcome.
Patches based on v6.16-rc3.
Cheers,
Will
Cc: Keir Fraser <keirf@google.com>
Cc: Steven Moreland <smoreland@google.com>
Cc: Frederick Mayle <fmayle@google.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: netdev@vger.kernel.org
Cc: virtualization@lists.linux.dev
--->8
Will Deacon (5):
vhost/vsock: Avoid allocating arbitrarily-sized SKBs
vsock/virtio: Resize receive buffers so that each SKB fits in a page
vhost/vsock: Allocate nonlinear SKBs for handling large receive
buffers
vsock/virtio: Rename virtio_vsock_skb_rx_put() to
virtio_vsock_skb_put()
vhost/vsock: Allocate nonlinear SKBs for handling large transmit
buffers
drivers/vhost/vsock.c | 21 +++++++++------
include/linux/virtio_vsock.h | 36 +++++++++++++++++++------
net/vmw_vsock/virtio_transport.c | 2 +-
net/vmw_vsock/virtio_transport_common.c | 9 +++++--
4 files changed, 49 insertions(+), 19 deletions(-)
--
2.50.0.714.g196bf9f422-goog
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
@ 2025-06-25 13:15 ` Will Deacon
2025-06-27 10:36 ` Stefano Garzarella
2025-06-25 13:15 ` [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page Will Deacon
` (4 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-25 13:15 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, netdev, virtualization
vhost_vsock_alloc_skb() returns NULL for packets advertising a length
larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE in the packet header. However,
this is only checked once the SKB has been allocated and, if the length
in the packet header is zero, the SKB may not be freed immediately.
Hoist the size check before the SKB allocation so that an iovec larger
than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + the header size is rejected
outright. The subsequent check on the length field in the header can
then simply check that the allocated SKB is indeed large enough to hold
the packet.
Signed-off-by: Will Deacon <will@kernel.org>
---
drivers/vhost/vsock.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 802153e23073..66a0f060770e 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -344,6 +344,9 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
len = iov_length(vq->iov, out);
+ if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
+ return NULL;
+
/* len contains both payload and hdr */
skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
if (!skb)
@@ -367,8 +370,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
return skb;
/* The pkt is too big or the length in the header is invalid */
- if (payload_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE ||
- payload_len + sizeof(*hdr) > len) {
+ if (payload_len + sizeof(*hdr) > len) {
kfree_skb(skb);
return NULL;
}
--
2.50.0.714.g196bf9f422-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
2025-06-25 13:15 ` [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs Will Deacon
@ 2025-06-25 13:15 ` Will Deacon
2025-06-27 10:41 ` Stefano Garzarella
2025-06-25 13:15 ` [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers Will Deacon
` (3 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-25 13:15 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, netdev, virtualization
When allocating receive buffers for the vsock virtio RX virtqueue, an
SKB is allocated with a 4140 data payload (the 44-byte packet header +
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE). Even when factoring in the SKB
overhead, the resulting 8KiB allocation thanks to the rounding in
kmalloc_reserve() is wasteful (~3700 unusable bytes) and results in a
higher-order page allocation for the sake of a few hundred bytes of
packet data.
Limit the vsock virtio RX buffers to a page per SKB, resulting in much
better memory utilisation and removing the need to allocate higher-order
pages entirely.
Signed-off-by: Will Deacon <will@kernel.org>
---
include/linux/virtio_vsock.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 36fb3edfa403..67ffb64325ef 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -111,7 +111,8 @@ static inline size_t virtio_vsock_skb_len(struct sk_buff *skb)
return (size_t)(skb_end_pointer(skb) - skb->head);
}
-#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (1024 * 4)
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (SKB_WITH_OVERHEAD(PAGE_SIZE) \
+ - VIRTIO_VSOCK_SKB_HEADROOM)
#define VIRTIO_VSOCK_MAX_BUF_SIZE 0xFFFFFFFFUL
#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE (1024 * 64)
--
2.50.0.714.g196bf9f422-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
2025-06-25 13:15 ` [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs Will Deacon
2025-06-25 13:15 ` [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page Will Deacon
@ 2025-06-25 13:15 ` Will Deacon
2025-06-27 10:45 ` Stefano Garzarella
2025-06-25 13:15 ` [PATCH 4/5] vsock/virtio: Rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put() Will Deacon
` (2 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-25 13:15 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, netdev, virtualization
When receiving a packet from a guest, vhost_vsock_handle_tx_kick()
calls vhost_vsock_alloc_skb() to allocate and fill an SKB with the
receive data. Unfortunately, these are always linear allocations and can
therefore result in significant pressure on kmalloc() considering that
the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
allocation for each packet.
Rework the vsock SKB allocation so that, for sizes with page order
greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
instead with the packet header in the SKB and the receive data in the
fragments.
Signed-off-by: Will Deacon <will@kernel.org>
---
drivers/vhost/vsock.c | 15 +++++++++------
include/linux/virtio_vsock.h | 31 +++++++++++++++++++++++++------
2 files changed, 34 insertions(+), 12 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 66a0f060770e..cfa4e1bcf367 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -344,11 +344,16 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
len = iov_length(vq->iov, out);
- if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
+ if (len < VIRTIO_VSOCK_SKB_HEADROOM ||
+ len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
return NULL;
/* len contains both payload and hdr */
- skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
+ if (len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ skb = virtio_vsock_alloc_skb_with_frags(len, GFP_KERNEL);
+ else
+ skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
+
if (!skb)
return NULL;
@@ -377,10 +382,8 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
virtio_vsock_skb_rx_put(skb);
- nbytes = copy_from_iter(skb->data, payload_len, &iov_iter);
- if (nbytes != payload_len) {
- vq_err(vq, "Expected %zu byte payload, got %zu bytes\n",
- payload_len, nbytes);
+ if (skb_copy_datagram_from_iter(skb, 0, &iov_iter, payload_len)) {
+ vq_err(vq, "Failed to copy %zu byte payload\n", payload_len);
kfree_skb(skb);
return NULL;
}
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 67ffb64325ef..8f9fa1cab32a 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -51,27 +51,46 @@ static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb)
{
u32 len;
+ DEBUG_NET_WARN_ON_ONCE(skb->len);
len = le32_to_cpu(virtio_vsock_hdr(skb)->len);
- if (len > 0)
+ if (skb_is_nonlinear(skb))
+ skb->len = len;
+ else
skb_put(skb, len);
}
-static inline struct sk_buff *virtio_vsock_alloc_skb(unsigned int size, gfp_t mask)
+static inline struct sk_buff *
+__virtio_vsock_alloc_skb_with_frags(unsigned int header_len,
+ unsigned int data_len,
+ gfp_t mask)
{
struct sk_buff *skb;
+ int err;
- if (size < VIRTIO_VSOCK_SKB_HEADROOM)
- return NULL;
-
- skb = alloc_skb(size, mask);
+ skb = alloc_skb_with_frags(header_len, data_len,
+ PAGE_ALLOC_COSTLY_ORDER, &err, mask);
if (!skb)
return NULL;
skb_reserve(skb, VIRTIO_VSOCK_SKB_HEADROOM);
+ skb->data_len = data_len;
return skb;
}
+static inline struct sk_buff *
+virtio_vsock_alloc_skb_with_frags(unsigned int size, gfp_t mask)
+{
+ size -= VIRTIO_VSOCK_SKB_HEADROOM;
+ return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
+ size, mask);
+}
+
+static inline struct sk_buff *virtio_vsock_alloc_skb(unsigned int size, gfp_t mask)
+{
+ return __virtio_vsock_alloc_skb_with_frags(size, 0, mask);
+}
+
static inline void
virtio_vsock_skb_queue_head(struct sk_buff_head *list, struct sk_buff *skb)
{
--
2.50.0.714.g196bf9f422-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/5] vsock/virtio: Rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put()
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
` (2 preceding siblings ...)
2025-06-25 13:15 ` [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers Will Deacon
@ 2025-06-25 13:15 ` Will Deacon
2025-06-27 10:46 ` Stefano Garzarella
2025-06-25 13:15 ` [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers Will Deacon
2025-06-27 10:51 ` [PATCH 0/5] vsock/virtio: SKB allocation improvements Stefano Garzarella
5 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-25 13:15 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, netdev, virtualization
In preparation for using virtio_vsock_skb_rx_put() when populating SKBs
on the vsock TX path, rename virtio_vsock_skb_rx_put() to
virtio_vsock_skb_put().
No functional change.
Signed-off-by: Will Deacon <will@kernel.org>
---
drivers/vhost/vsock.c | 2 +-
include/linux/virtio_vsock.h | 2 +-
net/vmw_vsock/virtio_transport.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index cfa4e1bcf367..3799c0aeeec5 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -380,7 +380,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
return NULL;
}
- virtio_vsock_skb_rx_put(skb);
+ virtio_vsock_skb_put(skb);
if (skb_copy_datagram_from_iter(skb, 0, &iov_iter, payload_len)) {
vq_err(vq, "Failed to copy %zu byte payload\n", payload_len);
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 8f9fa1cab32a..d237ca0fc320 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -47,7 +47,7 @@ static inline void virtio_vsock_skb_clear_tap_delivered(struct sk_buff *skb)
VIRTIO_VSOCK_SKB_CB(skb)->tap_delivered = false;
}
-static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb)
+static inline void virtio_vsock_skb_put(struct sk_buff *skb)
{
u32 len;
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index f0e48e6911fc..3319be2ee3aa 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -648,7 +648,7 @@ static void virtio_transport_rx_work(struct work_struct *work)
continue;
}
- virtio_vsock_skb_rx_put(skb);
+ virtio_vsock_skb_put(skb);
virtio_transport_deliver_tap_pkt(skb);
virtio_transport_recv_pkt(&virtio_transport, skb);
}
--
2.50.0.714.g196bf9f422-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
` (3 preceding siblings ...)
2025-06-25 13:15 ` [PATCH 4/5] vsock/virtio: Rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put() Will Deacon
@ 2025-06-25 13:15 ` Will Deacon
2025-06-27 10:50 ` Stefano Garzarella
2025-06-27 10:51 ` [PATCH 0/5] vsock/virtio: SKB allocation improvements Stefano Garzarella
5 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-25 13:15 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, netdev, virtualization
When transmitting a vsock packet, virtio_transport_send_pkt_info() calls
virtio_transport_alloc_skb() to allocate and fill SKBs with the transmit
data. Unfortunately, these are always linear allocations and can
therefore result in significant pressure on kmalloc() considering that
the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
allocation for each packet.
Rework the vsock SKB allocation so that, for sizes with page order
greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
instead with the packet header in the SKB and the transmit data in the
fragments.
Signed-off-by: Will Deacon <will@kernel.org>
---
net/vmw_vsock/virtio_transport_common.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 1b5d9896edae..424eb69e84f9 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -109,7 +109,8 @@ static int virtio_transport_fill_skb(struct sk_buff *skb,
return __zerocopy_sg_from_iter(info->msg, NULL, skb,
&info->msg->msg_iter, len, NULL);
- return memcpy_from_msg(skb_put(skb, len), info->msg, len);
+ virtio_vsock_skb_put(skb);
+ return skb_copy_datagram_from_iter(skb, 0, &info->msg->msg_iter, len);
}
static void virtio_transport_init_hdr(struct sk_buff *skb,
@@ -261,7 +262,11 @@ static struct sk_buff *virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *
if (!zcopy)
skb_len += payload_len;
- skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
+ if (skb_len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ skb = virtio_vsock_alloc_skb_with_frags(skb_len, GFP_KERNEL);
+ else
+ skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
+
if (!skb)
return NULL;
--
2.50.0.714.g196bf9f422-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs
2025-06-25 13:15 ` [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs Will Deacon
@ 2025-06-27 10:36 ` Stefano Garzarella
2025-06-30 12:51 ` Will Deacon
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Garzarella @ 2025-06-27 10:36 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Wed, Jun 25, 2025 at 02:15:39PM +0100, Will Deacon wrote:
>vhost_vsock_alloc_skb() returns NULL for packets advertising a length
>larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE in the packet header. However,
>this is only checked once the SKB has been allocated and, if the length
>in the packet header is zero, the SKB may not be freed immediately.
>
>Hoist the size check before the SKB allocation so that an iovec larger
>than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + the header size is rejected
>outright. The subsequent check on the length field in the header can
>then simply check that the allocated SKB is indeed large enough to hold
>the packet.
LGTM, but should we consider this as stable material adding a Fixes tag?
Thanks,
Stefano
>
>Signed-off-by: Will Deacon <will@kernel.org>
>---
> drivers/vhost/vsock.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index 802153e23073..66a0f060770e 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -344,6 +344,9 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
>
> len = iov_length(vq->iov, out);
>
>+ if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
>+ return NULL;
>+
> /* len contains both payload and hdr */
> skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
> if (!skb)
>@@ -367,8 +370,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
> return skb;
>
> /* The pkt is too big or the length in the header is invalid */
>- if (payload_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE ||
>- payload_len + sizeof(*hdr) > len) {
>+ if (payload_len + sizeof(*hdr) > len) {
> kfree_skb(skb);
> return NULL;
> }
>--
>2.50.0.714.g196bf9f422-goog
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page
2025-06-25 13:15 ` [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page Will Deacon
@ 2025-06-27 10:41 ` Stefano Garzarella
2025-06-30 13:06 ` Will Deacon
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Garzarella @ 2025-06-27 10:41 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Wed, Jun 25, 2025 at 02:15:40PM +0100, Will Deacon wrote:
>When allocating receive buffers for the vsock virtio RX virtqueue, an
>SKB is allocated with a 4140 data payload (the 44-byte packet header +
>VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE). Even when factoring in the SKB
>overhead, the resulting 8KiB allocation thanks to the rounding in
>kmalloc_reserve() is wasteful (~3700 unusable bytes) and results in a
>higher-order page allocation for the sake of a few hundred bytes of
>packet data.
>
>Limit the vsock virtio RX buffers to a page per SKB, resulting in much
>better memory utilisation and removing the need to allocate higher-order
>pages entirely.
>
>Signed-off-by: Will Deacon <will@kernel.org>
>---
> include/linux/virtio_vsock.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 36fb3edfa403..67ffb64325ef 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -111,7 +111,8 @@ static inline size_t virtio_vsock_skb_len(struct sk_buff *skb)
> return (size_t)(skb_end_pointer(skb) - skb->head);
> }
>
>-#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (1024 * 4)
>+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (SKB_WITH_OVERHEAD(PAGE_SIZE) \
>+ - VIRTIO_VSOCK_SKB_HEADROOM)
This is only used in net/vmw_vsock/virtio_transport.c :
static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
{
int total_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM;
What about just remove VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE and use
`SKB_WITH_OVERHEAD(PAGE_SIZE)` there? (maybe with a comment summarizing
the issue we found).
Thanks,
Stefano
> #define VIRTIO_VSOCK_MAX_BUF_SIZE 0xFFFFFFFFUL
> #define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE (1024 * 64)
>
>--
>2.50.0.714.g196bf9f422-goog
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
2025-06-25 13:15 ` [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers Will Deacon
@ 2025-06-27 10:45 ` Stefano Garzarella
2025-06-30 14:20 ` Will Deacon
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Garzarella @ 2025-06-27 10:45 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Wed, Jun 25, 2025 at 02:15:41PM +0100, Will Deacon wrote:
>When receiving a packet from a guest, vhost_vsock_handle_tx_kick()
>calls vhost_vsock_alloc_skb() to allocate and fill an SKB with the
>receive data. Unfortunately, these are always linear allocations and can
>therefore result in significant pressure on kmalloc() considering that
>the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
>VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
>allocation for each packet.
>
>Rework the vsock SKB allocation so that, for sizes with page order
>greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
>instead with the packet header in the SKB and the receive data in the
>fragments.
>
>Signed-off-by: Will Deacon <will@kernel.org>
>---
> drivers/vhost/vsock.c | 15 +++++++++------
> include/linux/virtio_vsock.h | 31 +++++++++++++++++++++++++------
> 2 files changed, 34 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index 66a0f060770e..cfa4e1bcf367 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -344,11 +344,16 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
>
> len = iov_length(vq->iov, out);
>
>- if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
>+ if (len < VIRTIO_VSOCK_SKB_HEADROOM ||
Why moving this check here?
>+ len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
> return NULL;
>
> /* len contains both payload and hdr */
>- skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
>+ if (len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>+ skb = virtio_vsock_alloc_skb_with_frags(len, GFP_KERNEL);
>+ else
>+ skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
Can we do this directly in virtio_vsock_alloc_skb() so we don't need
to duplicate code on virtio/vhost code?
>+
> if (!skb)
> return NULL;
>
>@@ -377,10 +382,8 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
>
> virtio_vsock_skb_rx_put(skb);
>
>- nbytes = copy_from_iter(skb->data, payload_len, &iov_iter);
>- if (nbytes != payload_len) {
>- vq_err(vq, "Expected %zu byte payload, got %zu bytes\n",
>- payload_len, nbytes);
>+ if (skb_copy_datagram_from_iter(skb, 0, &iov_iter, payload_len)) {
>+ vq_err(vq, "Failed to copy %zu byte payload\n", payload_len);
> kfree_skb(skb);
> return NULL;
> }
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 67ffb64325ef..8f9fa1cab32a 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -51,27 +51,46 @@ static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb)
> {
> u32 len;
>
>+ DEBUG_NET_WARN_ON_ONCE(skb->len);
Should we mention in the commit message?
> len = le32_to_cpu(virtio_vsock_hdr(skb)->len);
>
>- if (len > 0)
Why removing this check?
Thanks,
Stefano
>+ if (skb_is_nonlinear(skb))
>+ skb->len = len;
>+ else
> skb_put(skb, len);
> }
>
>-static inline struct sk_buff *virtio_vsock_alloc_skb(unsigned int size, gfp_t mask)
>+static inline struct sk_buff *
>+__virtio_vsock_alloc_skb_with_frags(unsigned int header_len,
>+ unsigned int data_len,
>+ gfp_t mask)
> {
> struct sk_buff *skb;
>+ int err;
>
>- if (size < VIRTIO_VSOCK_SKB_HEADROOM)
>- return NULL;
>-
>- skb = alloc_skb(size, mask);
>+ skb = alloc_skb_with_frags(header_len, data_len,
>+ PAGE_ALLOC_COSTLY_ORDER, &err, mask);
> if (!skb)
> return NULL;
>
> skb_reserve(skb, VIRTIO_VSOCK_SKB_HEADROOM);
>+ skb->data_len = data_len;
> return skb;
> }
>
>+static inline struct sk_buff *
>+virtio_vsock_alloc_skb_with_frags(unsigned int size, gfp_t mask)
>+{
>+ size -= VIRTIO_VSOCK_SKB_HEADROOM;
>+ return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
>+ size, mask);
>+}
>+
>+static inline struct sk_buff *virtio_vsock_alloc_skb(unsigned int size, gfp_t mask)
>+{
>+ return __virtio_vsock_alloc_skb_with_frags(size, 0, mask);
>+}
>+
> static inline void
> virtio_vsock_skb_queue_head(struct sk_buff_head *list, struct sk_buff *skb)
> {
>--
>2.50.0.714.g196bf9f422-goog
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/5] vsock/virtio: Rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put()
2025-06-25 13:15 ` [PATCH 4/5] vsock/virtio: Rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put() Will Deacon
@ 2025-06-27 10:46 ` Stefano Garzarella
0 siblings, 0 replies; 20+ messages in thread
From: Stefano Garzarella @ 2025-06-27 10:46 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Wed, Jun 25, 2025 at 02:15:42PM +0100, Will Deacon wrote:
>In preparation for using virtio_vsock_skb_rx_put() when populating SKBs
>on the vsock TX path, rename virtio_vsock_skb_rx_put() to
>virtio_vsock_skb_put().
>
>No functional change.
>
>Signed-off-by: Will Deacon <will@kernel.org>
>---
> drivers/vhost/vsock.c | 2 +-
> include/linux/virtio_vsock.h | 2 +-
> net/vmw_vsock/virtio_transport.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
LGMT!
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index cfa4e1bcf367..3799c0aeeec5 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -380,7 +380,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
> return NULL;
> }
>
>- virtio_vsock_skb_rx_put(skb);
>+ virtio_vsock_skb_put(skb);
>
> if (skb_copy_datagram_from_iter(skb, 0, &iov_iter, payload_len)) {
> vq_err(vq, "Failed to copy %zu byte payload\n", payload_len);
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 8f9fa1cab32a..d237ca0fc320 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -47,7 +47,7 @@ static inline void virtio_vsock_skb_clear_tap_delivered(struct sk_buff *skb)
> VIRTIO_VSOCK_SKB_CB(skb)->tap_delivered = false;
> }
>
>-static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb)
>+static inline void virtio_vsock_skb_put(struct sk_buff *skb)
> {
> u32 len;
>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index f0e48e6911fc..3319be2ee3aa 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -648,7 +648,7 @@ static void virtio_transport_rx_work(struct work_struct *work)
> continue;
> }
>
>- virtio_vsock_skb_rx_put(skb);
>+ virtio_vsock_skb_put(skb);
> virtio_transport_deliver_tap_pkt(skb);
> virtio_transport_recv_pkt(&virtio_transport, skb);
> }
>--
>2.50.0.714.g196bf9f422-goog
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers
2025-06-25 13:15 ` [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers Will Deacon
@ 2025-06-27 10:50 ` Stefano Garzarella
2025-06-30 14:21 ` Will Deacon
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Garzarella @ 2025-06-27 10:50 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
nit: I'd use `vsock/virtio: ` prefix since we are touching the virtio
transport common code. Maybe we can mention that this will affect both
virtio and vhost transports.
On Wed, Jun 25, 2025 at 02:15:43PM +0100, Will Deacon wrote:
>When transmitting a vsock packet, virtio_transport_send_pkt_info() calls
>virtio_transport_alloc_skb() to allocate and fill SKBs with the transmit
>data. Unfortunately, these are always linear allocations and can
>therefore result in significant pressure on kmalloc() considering that
>the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
>VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
>allocation for each packet.
>
>Rework the vsock SKB allocation so that, for sizes with page order
>greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
>instead with the packet header in the SKB and the transmit data in the
>fragments.
>
>Signed-off-by: Will Deacon <will@kernel.org>
>---
> net/vmw_vsock/virtio_transport_common.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 1b5d9896edae..424eb69e84f9 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -109,7 +109,8 @@ static int virtio_transport_fill_skb(struct sk_buff *skb,
> return __zerocopy_sg_from_iter(info->msg, NULL, skb,
> &info->msg->msg_iter, len, NULL);
>
>- return memcpy_from_msg(skb_put(skb, len), info->msg, len);
>+ virtio_vsock_skb_put(skb);
>+ return skb_copy_datagram_from_iter(skb, 0, &info->msg->msg_iter, len);
> }
>
> static void virtio_transport_init_hdr(struct sk_buff *skb,
>@@ -261,7 +262,11 @@ static struct sk_buff *virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *
> if (!zcopy)
> skb_len += payload_len;
>
>- skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
>+ if (skb_len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>+ skb = virtio_vsock_alloc_skb_with_frags(skb_len, GFP_KERNEL);
>+ else
>+ skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
>+
As I mentioned in the other patch, we may avoid this code duplication
hiding this in virtio_vsock_alloc_skb() or adding a new function that
we can use when we want to allocate frags or not.
Thanks,
Stefano
> if (!skb)
> return NULL;
>
>--
>2.50.0.714.g196bf9f422-goog
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/5] vsock/virtio: SKB allocation improvements
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
` (4 preceding siblings ...)
2025-06-25 13:15 ` [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers Will Deacon
@ 2025-06-27 10:51 ` Stefano Garzarella
2025-06-30 12:50 ` Will Deacon
5 siblings, 1 reply; 20+ messages in thread
From: Stefano Garzarella @ 2025-06-27 10:51 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Wed, Jun 25, 2025 at 02:15:38PM +0100, Will Deacon wrote:
>Hi folks,
>
>We're using vsock extensively in Android as a channel over which we can
>route binder transactions to/from virtual machines managed by the
>Android Virtualisation Framework. However, we have been observing some
>issues in production builds when using vsock in a low-memory environment
>(on the host and the guest) such as:
>
> * The host receive path hanging forever, despite the guest performing
> a successful write to the socket.
>
> * Page allocation failures in the vhost receive path (this is a likely
> contributor to the above)
>
> * -ENOMEM coming back from sendmsg()
>
>This series aims to improve the vsock SKB allocation for both the host
>(vhost) and the guest when using the virtio transport to help mitigate
>these issues. Specifically:
>
> - Avoid single allocations of order > PAGE_ALLOC_COSTLY_ORDER
>
> - Use non-linear SKBs for the transmit and vhost receive paths
>
> - Reduce the guest RX buffers to a single page
>
>There are more details in the individual commit messages but overall
>this results in less wasted memory and puts less pressure on the
>allocator.
>
>This is my first time looking at this stuff, so all feedback is welcome.
Thank you very much for this series!
I left some minor comments, but overall LGTM!
Thanks,
Stefano
>
>Patches based on v6.16-rc3.
>
>Cheers,
>
>Will
>
>Cc: Keir Fraser <keirf@google.com>
>Cc: Steven Moreland <smoreland@google.com>
>Cc: Frederick Mayle <fmayle@google.com>
>Cc: Stefan Hajnoczi <stefanha@redhat.com>
>Cc: Stefano Garzarella <sgarzare@redhat.com>
>Cc: "Michael S. Tsirkin" <mst@redhat.com>
>Cc: Jason Wang <jasowang@redhat.com>
>Cc: "Eugenio Pérez" <eperezma@redhat.com>
>Cc: netdev@vger.kernel.org
>Cc: virtualization@lists.linux.dev
>
>--->8
>
>Will Deacon (5):
> vhost/vsock: Avoid allocating arbitrarily-sized SKBs
> vsock/virtio: Resize receive buffers so that each SKB fits in a page
> vhost/vsock: Allocate nonlinear SKBs for handling large receive
> buffers
> vsock/virtio: Rename virtio_vsock_skb_rx_put() to
> virtio_vsock_skb_put()
> vhost/vsock: Allocate nonlinear SKBs for handling large transmit
> buffers
>
> drivers/vhost/vsock.c | 21 +++++++++------
> include/linux/virtio_vsock.h | 36 +++++++++++++++++++------
> net/vmw_vsock/virtio_transport.c | 2 +-
> net/vmw_vsock/virtio_transport_common.c | 9 +++++--
> 4 files changed, 49 insertions(+), 19 deletions(-)
>
>--
>2.50.0.714.g196bf9f422-goog
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/5] vsock/virtio: SKB allocation improvements
2025-06-27 10:51 ` [PATCH 0/5] vsock/virtio: SKB allocation improvements Stefano Garzarella
@ 2025-06-30 12:50 ` Will Deacon
0 siblings, 0 replies; 20+ messages in thread
From: Will Deacon @ 2025-06-30 12:50 UTC (permalink / raw)
To: Stefano Garzarella
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Fri, Jun 27, 2025 at 12:51:45PM +0200, Stefano Garzarella wrote:
> On Wed, Jun 25, 2025 at 02:15:38PM +0100, Will Deacon wrote:
> > We're using vsock extensively in Android as a channel over which we can
> > route binder transactions to/from virtual machines managed by the
> > Android Virtualisation Framework. However, we have been observing some
> > issues in production builds when using vsock in a low-memory environment
> > (on the host and the guest) such as:
> >
> > * The host receive path hanging forever, despite the guest performing
> > a successful write to the socket.
> >
> > * Page allocation failures in the vhost receive path (this is a likely
> > contributor to the above)
> >
> > * -ENOMEM coming back from sendmsg()
> >
> > This series aims to improve the vsock SKB allocation for both the host
> > (vhost) and the guest when using the virtio transport to help mitigate
> > these issues. Specifically:
> >
> > - Avoid single allocations of order > PAGE_ALLOC_COSTLY_ORDER
> >
> > - Use non-linear SKBs for the transmit and vhost receive paths
> >
> > - Reduce the guest RX buffers to a single page
> >
> > There are more details in the individual commit messages but overall
> > this results in less wasted memory and puts less pressure on the
> > allocator.
> >
> > This is my first time looking at this stuff, so all feedback is welcome.
>
> Thank you very much for this series!
>
> I left some minor comments, but overall LGTM!
Cheers for going through it! I'll work through your comments now...
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs
2025-06-27 10:36 ` Stefano Garzarella
@ 2025-06-30 12:51 ` Will Deacon
2025-07-01 10:37 ` Stefano Garzarella
0 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-30 12:51 UTC (permalink / raw)
To: Stefano Garzarella
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Fri, Jun 27, 2025 at 12:36:46PM +0200, Stefano Garzarella wrote:
> On Wed, Jun 25, 2025 at 02:15:39PM +0100, Will Deacon wrote:
> > vhost_vsock_alloc_skb() returns NULL for packets advertising a length
> > larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE in the packet header. However,
> > this is only checked once the SKB has been allocated and, if the length
> > in the packet header is zero, the SKB may not be freed immediately.
> >
> > Hoist the size check before the SKB allocation so that an iovec larger
> > than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + the header size is rejected
> > outright. The subsequent check on the length field in the header can
> > then simply check that the allocated SKB is indeed large enough to hold
> > the packet.
>
> LGTM, but should we consider this as stable material adding a Fixes tag?
Yup, absolutely. I put it first so that it can be backported easily but,
for some reason, I thought networking didn't CC stable. I have no idea
_why_ I thought that, so I'll add it (and a Fixes: line) for v2!
That seems to be:
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
from what I can tell.
Cheers,
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page
2025-06-27 10:41 ` Stefano Garzarella
@ 2025-06-30 13:06 ` Will Deacon
0 siblings, 0 replies; 20+ messages in thread
From: Will Deacon @ 2025-06-30 13:06 UTC (permalink / raw)
To: Stefano Garzarella
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Fri, Jun 27, 2025 at 12:41:48PM +0200, Stefano Garzarella wrote:
> On Wed, Jun 25, 2025 at 02:15:40PM +0100, Will Deacon wrote:
> > When allocating receive buffers for the vsock virtio RX virtqueue, an
> > SKB is allocated with a 4140 data payload (the 44-byte packet header +
> > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE). Even when factoring in the SKB
> > overhead, the resulting 8KiB allocation thanks to the rounding in
> > kmalloc_reserve() is wasteful (~3700 unusable bytes) and results in a
> > higher-order page allocation for the sake of a few hundred bytes of
> > packet data.
> >
> > Limit the vsock virtio RX buffers to a page per SKB, resulting in much
> > better memory utilisation and removing the need to allocate higher-order
> > pages entirely.
> >
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> > include/linux/virtio_vsock.h | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> > index 36fb3edfa403..67ffb64325ef 100644
> > --- a/include/linux/virtio_vsock.h
> > +++ b/include/linux/virtio_vsock.h
> > @@ -111,7 +111,8 @@ static inline size_t virtio_vsock_skb_len(struct sk_buff *skb)
> > return (size_t)(skb_end_pointer(skb) - skb->head);
> > }
> >
> > -#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (1024 * 4)
> > +#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (SKB_WITH_OVERHEAD(PAGE_SIZE) \
> > + - VIRTIO_VSOCK_SKB_HEADROOM)
>
> This is only used in net/vmw_vsock/virtio_transport.c :
>
> static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> {
> int total_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM;
>
>
> What about just remove VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE and use
> `SKB_WITH_OVERHEAD(PAGE_SIZE)` there? (maybe with a comment summarizing
> the issue we found).
Sure, works for me. That gets rid of the funny +- VIRTIO_VSOCK_SKB_HEADROOM
too.
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
2025-06-27 10:45 ` Stefano Garzarella
@ 2025-06-30 14:20 ` Will Deacon
2025-07-01 10:44 ` Stefano Garzarella
0 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2025-06-30 14:20 UTC (permalink / raw)
To: Stefano Garzarella
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Fri, Jun 27, 2025 at 12:45:45PM +0200, Stefano Garzarella wrote:
> On Wed, Jun 25, 2025 at 02:15:41PM +0100, Will Deacon wrote:
> > When receiving a packet from a guest, vhost_vsock_handle_tx_kick()
> > calls vhost_vsock_alloc_skb() to allocate and fill an SKB with the
> > receive data. Unfortunately, these are always linear allocations and can
> > therefore result in significant pressure on kmalloc() considering that
> > the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
> > VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
> > allocation for each packet.
> >
> > Rework the vsock SKB allocation so that, for sizes with page order
> > greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
> > instead with the packet header in the SKB and the receive data in the
> > fragments.
> >
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> > drivers/vhost/vsock.c | 15 +++++++++------
> > include/linux/virtio_vsock.h | 31 +++++++++++++++++++++++++------
> > 2 files changed, 34 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > index 66a0f060770e..cfa4e1bcf367 100644
> > --- a/drivers/vhost/vsock.c
> > +++ b/drivers/vhost/vsock.c
> > @@ -344,11 +344,16 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
> >
> > len = iov_length(vq->iov, out);
> >
> > - if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
> > + if (len < VIRTIO_VSOCK_SKB_HEADROOM ||
>
> Why moving this check here?
I moved it here because virtio_vsock_alloc_skb_with_frags() does:
+ size -= VIRTIO_VSOCK_SKB_HEADROOM;
+ return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
+ size, mask);
and so having the check in __virtio_vsock_alloc_skb_with_frags() looks
strange as, by then, it really only applies to the linear case. It also
feels weird to me to have the upper-bound of the length checked by the
caller but the lower-bound checked in the callee. I certainly find it
easier to reason about if they're in the same place.
Additionally, the lower-bound check is only needed by the vhost receive
code, as the transmit path uses virtio_vsock_alloc_skb(), which never
passes a size smaller than VIRTIO_VSOCK_SKB_HEADROOM.
Given all that, moving it to the one place that needs it seemed like the
best option. What do you think?
> > + len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
> > return NULL;
> >
> > /* len contains both payload and hdr */
> > - skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
> > + if (len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > + skb = virtio_vsock_alloc_skb_with_frags(len, GFP_KERNEL);
> > + else
> > + skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
>
> Can we do this directly in virtio_vsock_alloc_skb() so we don't need
> to duplicate code on virtio/vhost code?
We can, but then I think we should do something different for the
rx_fill() path -- it feels fragile to rely on that using small-enough
buffers to guarantee linear allocations. How about I:
1. Add virtio_vsock_alloc_linear_skb(), which always performs a linear
allocation.
2. Change virtio_vsock_alloc_skb() to use nonlinear SKBs for sizes
greater than SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)
3. Use virtio_vsock_alloc_linear_skb() to fill the guest RX buffers
4. Use virtio_vsock_alloc_skb() for everything else
If you like the idea, I'll rework the series along those lines.
Diff below... (see end of mail)
> > diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> > index 67ffb64325ef..8f9fa1cab32a 100644
> > --- a/include/linux/virtio_vsock.h
> > +++ b/include/linux/virtio_vsock.h
> > @@ -51,27 +51,46 @@ static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb)
> > {
> > u32 len;
> >
> > + DEBUG_NET_WARN_ON_ONCE(skb->len);
>
> Should we mention in the commit message?
Sure, I'll add something. The non-linear handling doesn't accumulate len,
so it's a debug check to ensure that len hasn't been messed with between
allocation and here.
> > len = le32_to_cpu(virtio_vsock_hdr(skb)->len);
> >
> > - if (len > 0)
>
> Why removing this check?
I think it's redundant: len is a u32, so we're basically just checking
to see if it's non-zero. All the callers have already checked for this
but, even if they didn't, skb_put(skb, 0) is harmless afaict.
Will
--->8
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 3799c0aeeec5..a6cd72a32f63 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -349,11 +349,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
return NULL;
/* len contains both payload and hdr */
- if (len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
- skb = virtio_vsock_alloc_skb_with_frags(len, GFP_KERNEL);
- else
- skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
-
+ skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
if (!skb)
return NULL;
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 0e265921be03..ed5eab46e3dc 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -79,16 +79,19 @@ __virtio_vsock_alloc_skb_with_frags(unsigned int header_len,
}
static inline struct sk_buff *
-virtio_vsock_alloc_skb_with_frags(unsigned int size, gfp_t mask)
+virtio_vsock_alloc_linear_skb(unsigned int size, gfp_t mask)
{
- size -= VIRTIO_VSOCK_SKB_HEADROOM;
- return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
- size, mask);
+ return __virtio_vsock_alloc_skb_with_frags(size, 0, mask);
}
static inline struct sk_buff *virtio_vsock_alloc_skb(unsigned int size, gfp_t mask)
{
- return __virtio_vsock_alloc_skb_with_frags(size, 0, mask);
+ if (size <= SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ return virtio_vsock_alloc_linear_skb(size, mask);
+
+ size -= VIRTIO_VSOCK_SKB_HEADROOM;
+ return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
+ size, mask);
}
static inline void
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 4ae714397ca3..8c9ca0cb0d4e 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -321,7 +321,7 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
vq = vsock->vqs[VSOCK_VQ_RX];
do {
- skb = virtio_vsock_alloc_skb(total_len, GFP_KERNEL);
+ skb = virtio_vsock_alloc_linear_skb(total_len, GFP_KERNEL);
if (!skb)
break;
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 424eb69e84f9..f74677c3511e 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -262,11 +262,7 @@ static struct sk_buff *virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *
if (!zcopy)
skb_len += payload_len;
- if (skb_len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
- skb = virtio_vsock_alloc_skb_with_frags(skb_len, GFP_KERNEL);
- else
- skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
-
+ skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
if (!skb)
return NULL;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers
2025-06-27 10:50 ` Stefano Garzarella
@ 2025-06-30 14:21 ` Will Deacon
0 siblings, 0 replies; 20+ messages in thread
From: Will Deacon @ 2025-06-30 14:21 UTC (permalink / raw)
To: Stefano Garzarella
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Fri, Jun 27, 2025 at 12:50:27PM +0200, Stefano Garzarella wrote:
> nit: I'd use `vsock/virtio: ` prefix since we are touching the virtio
> transport common code. Maybe we can mention that this will affect both
> virtio and vhost transports.
Sure, I'll do that.
> On Wed, Jun 25, 2025 at 02:15:43PM +0100, Will Deacon wrote:
> > When transmitting a vsock packet, virtio_transport_send_pkt_info() calls
> > virtio_transport_alloc_skb() to allocate and fill SKBs with the transmit
> > data. Unfortunately, these are always linear allocations and can
> > therefore result in significant pressure on kmalloc() considering that
> > the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
> > VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
> > allocation for each packet.
> >
> > Rework the vsock SKB allocation so that, for sizes with page order
> > greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
> > instead with the packet header in the SKB and the transmit data in the
> > fragments.
> >
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> > net/vmw_vsock/virtio_transport_common.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > index 1b5d9896edae..424eb69e84f9 100644
> > --- a/net/vmw_vsock/virtio_transport_common.c
> > +++ b/net/vmw_vsock/virtio_transport_common.c
> > @@ -109,7 +109,8 @@ static int virtio_transport_fill_skb(struct sk_buff *skb,
> > return __zerocopy_sg_from_iter(info->msg, NULL, skb,
> > &info->msg->msg_iter, len, NULL);
> >
> > - return memcpy_from_msg(skb_put(skb, len), info->msg, len);
> > + virtio_vsock_skb_put(skb);
> > + return skb_copy_datagram_from_iter(skb, 0, &info->msg->msg_iter, len);
> > }
> >
> > static void virtio_transport_init_hdr(struct sk_buff *skb,
> > @@ -261,7 +262,11 @@ static struct sk_buff *virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *
> > if (!zcopy)
> > skb_len += payload_len;
> >
> > - skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
> > + if (skb_len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > + skb = virtio_vsock_alloc_skb_with_frags(skb_len, GFP_KERNEL);
> > + else
> > + skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
> > +
>
> As I mentioned in the other patch, we may avoid this code duplication hiding
> this in virtio_vsock_alloc_skb() or adding a new function that
> we can use when we want to allocate frags or not.
That would be good. I had a crack at it in the diff I sent in reply to
the earlier patch, so please take a look.
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs
2025-06-30 12:51 ` Will Deacon
@ 2025-07-01 10:37 ` Stefano Garzarella
0 siblings, 0 replies; 20+ messages in thread
From: Stefano Garzarella @ 2025-07-01 10:37 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Mon, Jun 30, 2025 at 01:51:07PM +0100, Will Deacon wrote:
>On Fri, Jun 27, 2025 at 12:36:46PM +0200, Stefano Garzarella wrote:
>> On Wed, Jun 25, 2025 at 02:15:39PM +0100, Will Deacon wrote:
>> > vhost_vsock_alloc_skb() returns NULL for packets advertising a length
>> > larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE in the packet header. However,
>> > this is only checked once the SKB has been allocated and, if the length
>> > in the packet header is zero, the SKB may not be freed immediately.
>> >
>> > Hoist the size check before the SKB allocation so that an iovec larger
>> > than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + the header size is rejected
>> > outright. The subsequent check on the length field in the header can
>> > then simply check that the allocated SKB is indeed large enough to hold
>> > the packet.
>>
>> LGTM, but should we consider this as stable material adding a Fixes tag?
>
>Yup, absolutely. I put it first so that it can be backported easily but,
>for some reason, I thought networking didn't CC stable. I have no idea
>_why_ I thought that, so I'll add it (and a Fixes: line) for v2!
yeah, this was the case till last year IIRC, but we always used Fixes
tag, also if we didn't cc stable.
>
>That seems to be:
>
> Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
>
>from what I can tell.
I think so!
Thanks,
Stefano
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
2025-06-30 14:20 ` Will Deacon
@ 2025-07-01 10:44 ` Stefano Garzarella
2025-07-01 13:52 ` Will Deacon
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Garzarella @ 2025-07-01 10:44 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Mon, Jun 30, 2025 at 03:20:57PM +0100, Will Deacon wrote:
>On Fri, Jun 27, 2025 at 12:45:45PM +0200, Stefano Garzarella wrote:
>> On Wed, Jun 25, 2025 at 02:15:41PM +0100, Will Deacon wrote:
>> > When receiving a packet from a guest, vhost_vsock_handle_tx_kick()
>> > calls vhost_vsock_alloc_skb() to allocate and fill an SKB with the
>> > receive data. Unfortunately, these are always linear allocations and can
>> > therefore result in significant pressure on kmalloc() considering that
>> > the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE +
>> > VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB
>> > allocation for each packet.
>> >
>> > Rework the vsock SKB allocation so that, for sizes with page order
>> > greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated
>> > instead with the packet header in the SKB and the receive data in the
>> > fragments.
>> >
>> > Signed-off-by: Will Deacon <will@kernel.org>
>> > ---
>> > drivers/vhost/vsock.c | 15 +++++++++------
>> > include/linux/virtio_vsock.h | 31 +++++++++++++++++++++++++------
>> > 2 files changed, 34 insertions(+), 12 deletions(-)
>> >
>> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> > index 66a0f060770e..cfa4e1bcf367 100644
>> > --- a/drivers/vhost/vsock.c
>> > +++ b/drivers/vhost/vsock.c
>> > @@ -344,11 +344,16 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
>> >
>> > len = iov_length(vq->iov, out);
>> >
>> > - if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
>> > + if (len < VIRTIO_VSOCK_SKB_HEADROOM ||
>>
>> Why moving this check here?
>
>I moved it here because virtio_vsock_alloc_skb_with_frags() does:
>
>+ size -= VIRTIO_VSOCK_SKB_HEADROOM;
>+ return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
>+ size, mask);
>
>and so having the check in __virtio_vsock_alloc_skb_with_frags() looks
>strange as, by then, it really only applies to the linear case. It also
>feels weird to me to have the upper-bound of the length checked by the
>caller but the lower-bound checked in the callee. I certainly find it
>easier to reason about if they're in the same place.
>
>Additionally, the lower-bound check is only needed by the vhost receive
>code, as the transmit path uses virtio_vsock_alloc_skb(), which never
>passes a size smaller than VIRTIO_VSOCK_SKB_HEADROOM.
>
>Given all that, moving it to the one place that needs it seemed like the
>best option. What do you think?
Okay, I see now. Yep, it's fine, but please mention in the commit
description.
>
>> > + len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
>> > return NULL;
>> >
>> > /* len contains both payload and hdr */
>> > - skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
>> > + if (len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>> > + skb = virtio_vsock_alloc_skb_with_frags(len, GFP_KERNEL);
>> > + else
>> > + skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
>>
>> Can we do this directly in virtio_vsock_alloc_skb() so we don't need
>> to duplicate code on virtio/vhost code?
>
>We can, but then I think we should do something different for the
>rx_fill() path -- it feels fragile to rely on that using small-enough
>buffers to guarantee linear allocations. How about I:
>
> 1. Add virtio_vsock_alloc_linear_skb(), which always performs a linear
> allocation.
>
> 2. Change virtio_vsock_alloc_skb() to use nonlinear SKBs for sizes
> greater than SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)
>
> 3. Use virtio_vsock_alloc_linear_skb() to fill the guest RX buffers
>
> 4. Use virtio_vsock_alloc_skb() for everything else
>
>If you like the idea, I'll rework the series along those lines.
>Diff below... (see end of mail)
I really like it :-) let's go in that direction!
>
>> > diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>> > index 67ffb64325ef..8f9fa1cab32a 100644
>> > --- a/include/linux/virtio_vsock.h
>> > +++ b/include/linux/virtio_vsock.h
>> > @@ -51,27 +51,46 @@ static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb)
>> > {
>> > u32 len;
>> >
>> > + DEBUG_NET_WARN_ON_ONCE(skb->len);
>>
>> Should we mention in the commit message?
>
>Sure, I'll add something. The non-linear handling doesn't accumulate len,
>so it's a debug check to ensure that len hasn't been messed with between
>allocation and here.
>
>> > len = le32_to_cpu(virtio_vsock_hdr(skb)->len);
>> >
>> > - if (len > 0)
>>
>> Why removing this check?
>
>I think it's redundant: len is a u32, so we're basically just checking
>to see if it's non-zero. All the callers have already checked for this
>but, even if they didn't, skb_put(skb, 0) is harmless afaict.
Yep, I see, but now I don't remember why we have it, could it be more
expensive to call `skb_put(skb, 0)`, instead of just having the if for
control packets with no payload?
Thanks,
Stefano
>
>Will
>
>--->8
>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index 3799c0aeeec5..a6cd72a32f63 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -349,11 +349,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
> return NULL;
>
> /* len contains both payload and hdr */
>- if (len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>- skb = virtio_vsock_alloc_skb_with_frags(len, GFP_KERNEL);
>- else
>- skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
>-
>+ skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
> if (!skb)
> return NULL;
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 0e265921be03..ed5eab46e3dc 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -79,16 +79,19 @@ __virtio_vsock_alloc_skb_with_frags(unsigned int header_len,
> }
>
> static inline struct sk_buff *
>-virtio_vsock_alloc_skb_with_frags(unsigned int size, gfp_t mask)
>+virtio_vsock_alloc_linear_skb(unsigned int size, gfp_t mask)
> {
>- size -= VIRTIO_VSOCK_SKB_HEADROOM;
>- return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
>- size, mask);
>+ return __virtio_vsock_alloc_skb_with_frags(size, 0, mask);
> }
>
> static inline struct sk_buff *virtio_vsock_alloc_skb(unsigned int size, gfp_t mask)
> {
>- return __virtio_vsock_alloc_skb_with_frags(size, 0, mask);
>+ if (size <= SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>+ return virtio_vsock_alloc_linear_skb(size, mask);
>+
>+ size -= VIRTIO_VSOCK_SKB_HEADROOM;
>+ return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
>+ size, mask);
> }
>
> static inline void
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index 4ae714397ca3..8c9ca0cb0d4e 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -321,7 +321,7 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
> vq = vsock->vqs[VSOCK_VQ_RX];
>
> do {
>- skb = virtio_vsock_alloc_skb(total_len, GFP_KERNEL);
>+ skb = virtio_vsock_alloc_linear_skb(total_len, GFP_KERNEL);
> if (!skb)
> break;
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 424eb69e84f9..f74677c3511e 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -262,11 +262,7 @@ static struct sk_buff *virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *
> if (!zcopy)
> skb_len += payload_len;
>
>- if (skb_len > SKB_WITH_OVERHEAD(PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>- skb = virtio_vsock_alloc_skb_with_frags(skb_len, GFP_KERNEL);
>- else
>- skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
>-
>+ skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL);
> if (!skb)
> return NULL;
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
2025-07-01 10:44 ` Stefano Garzarella
@ 2025-07-01 13:52 ` Will Deacon
0 siblings, 0 replies; 20+ messages in thread
From: Will Deacon @ 2025-07-01 13:52 UTC (permalink / raw)
To: Stefano Garzarella
Cc: linux-kernel, Keir Fraser, Steven Moreland, Frederick Mayle,
Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang,
Eugenio Pérez, netdev, virtualization
On Tue, Jul 01, 2025 at 12:44:58PM +0200, Stefano Garzarella wrote:
> On Mon, Jun 30, 2025 at 03:20:57PM +0100, Will Deacon wrote:
> > On Fri, Jun 27, 2025 at 12:45:45PM +0200, Stefano Garzarella wrote:
> > > On Wed, Jun 25, 2025 at 02:15:41PM +0100, Will Deacon wrote:
> > > > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > > > index 66a0f060770e..cfa4e1bcf367 100644
> > > > --- a/drivers/vhost/vsock.c
> > > > +++ b/drivers/vhost/vsock.c
> > > > @@ -344,11 +344,16 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
> > > >
> > > > len = iov_length(vq->iov, out);
> > > >
> > > > - if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
> > > > + if (len < VIRTIO_VSOCK_SKB_HEADROOM ||
> > >
> > > Why moving this check here?
> >
> > I moved it here because virtio_vsock_alloc_skb_with_frags() does:
> >
> > + size -= VIRTIO_VSOCK_SKB_HEADROOM;
> > + return __virtio_vsock_alloc_skb_with_frags(VIRTIO_VSOCK_SKB_HEADROOM,
> > + size, mask);
> >
> > and so having the check in __virtio_vsock_alloc_skb_with_frags() looks
> > strange as, by then, it really only applies to the linear case. It also
> > feels weird to me to have the upper-bound of the length checked by the
> > caller but the lower-bound checked in the callee. I certainly find it
> > easier to reason about if they're in the same place.
> >
> > Additionally, the lower-bound check is only needed by the vhost receive
> > code, as the transmit path uses virtio_vsock_alloc_skb(), which never
> > passes a size smaller than VIRTIO_VSOCK_SKB_HEADROOM.
> >
> > Given all that, moving it to the one place that needs it seemed like the
> > best option. What do you think?
>
> Okay, I see now. Yep, it's fine, but please mention in the commit
> description.
Great, I'll do that.
> > > > len = le32_to_cpu(virtio_vsock_hdr(skb)->len);
> > > >
> > > > - if (len > 0)
> > >
> > > Why removing this check?
> >
> > I think it's redundant: len is a u32, so we're basically just checking
> > to see if it's non-zero. All the callers have already checked for this
> > but, even if they didn't, skb_put(skb, 0) is harmless afaict.
>
> Yep, I see, but now I don't remember why we have it, could it be more
> expensive to call `skb_put(skb, 0)`, instead of just having the if for
> control packets with no payload?
That sounds like a questionable optimisation, but I can preserve it in
the only caller that doesn't already check for a non-zero size
(virtio_transport_rx_work()). I mistakenly thought that it was already
checking it, but on closer inspection it only checks the size of the
virtqueue buffer and doesn't look at the packet header at all.
In fact, that is itself a bug because nothing prevents an SKB overflow
on the put path...
I'll add an extra fix for that in v2 so that it can be backported
independently.
Will
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-07-01 13:53 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-25 13:15 [PATCH 0/5] vsock/virtio: SKB allocation improvements Will Deacon
2025-06-25 13:15 ` [PATCH 1/5] vhost/vsock: Avoid allocating arbitrarily-sized SKBs Will Deacon
2025-06-27 10:36 ` Stefano Garzarella
2025-06-30 12:51 ` Will Deacon
2025-07-01 10:37 ` Stefano Garzarella
2025-06-25 13:15 ` [PATCH 2/5] vsock/virtio: Resize receive buffers so that each SKB fits in a page Will Deacon
2025-06-27 10:41 ` Stefano Garzarella
2025-06-30 13:06 ` Will Deacon
2025-06-25 13:15 ` [PATCH 3/5] vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers Will Deacon
2025-06-27 10:45 ` Stefano Garzarella
2025-06-30 14:20 ` Will Deacon
2025-07-01 10:44 ` Stefano Garzarella
2025-07-01 13:52 ` Will Deacon
2025-06-25 13:15 ` [PATCH 4/5] vsock/virtio: Rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put() Will Deacon
2025-06-27 10:46 ` Stefano Garzarella
2025-06-25 13:15 ` [PATCH 5/5] vhost/vsock: Allocate nonlinear SKBs for handling large transmit buffers Will Deacon
2025-06-27 10:50 ` Stefano Garzarella
2025-06-30 14:21 ` Will Deacon
2025-06-27 10:51 ` [PATCH 0/5] vsock/virtio: SKB allocation improvements Stefano Garzarella
2025-06-30 12:50 ` Will Deacon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).