From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Will Deacon <will@kernel.org>,
Stefano Garzarella <sgarzare@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
Sasha Levin <sashal@kernel.org>,
stefanha@redhat.com, jasowang@redhat.com, kvm@vger.kernel.org,
virtualization@lists.linux.dev, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 6.16-6.1] vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page
Date: Tue, 5 Aug 2025 09:09:12 -0400 [thread overview]
Message-ID: <20250805130945.471732-37-sashal@kernel.org> (raw)
In-Reply-To: <20250805130945.471732-1-sashal@kernel.org>
From: Will Deacon <will@kernel.org>
[ Upstream commit 03a92f036a04fed2b00d69f5f46f1a486e70dc5c ]
When allocating receive buffers for the vsock virtio RX virtqueue, an
SKB is allocated with a 4140 data payload (the 44-byte packet header +
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE). Even when factoring in the SKB
overhead, the resulting 8KiB allocation thanks to the rounding in
kmalloc_reserve() is wasteful (~3700 unusable bytes) and results in a
higher-order page allocation on systems with 4KiB pages just for the
sake of a few hundred bytes of packet data.
Limit the vsock virtio RX buffers to 4KiB per SKB, resulting in much
better memory utilisation and removing the need to allocate higher-order
pages entirely.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
Message-Id: <20250717090116.11987-5-will@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Let me analyze what we know about this commit:
**Backport Status: YES**
This commit is a good candidate for stable backporting based on the
following analysis:
## Reasons for backporting:
1. **Memory efficiency fix**: The commit addresses a significant memory
waste issue where ~3700 bytes out of 8KiB allocations were unusable.
This is a ~46% memory waste for every RX buffer allocation.
2. **Avoids higher-order page allocations**: The original code forces
the kernel to allocate higher-order pages (8KiB) on systems with 4KiB
pages, which can lead to:
- Memory fragmentation issues
- Allocation failures under memory pressure
- Degraded system performance
3. **Simple, contained fix**: The change is minimal and well-contained:
- Changes `VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE` from a hardcoded 4096 to
`SKB_WITH_OVERHEAD(1024 * 4)`
- Removes the addition of `VIRTIO_VSOCK_SKB_HEADROOM` in
`virtio_vsock_rx_fill()`
- The SKB_WITH_OVERHEAD macro (defined as `((X) -
SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))`) ensures the
entire SKB fits in a 4K page
4. **No functional changes**: This is purely an optimization that:
- Doesn't change the protocol behavior
- Doesn't introduce new features
- Maintains backward compatibility
- Only affects memory allocation patterns
5. **Clear performance benefit**: This provides immediate benefits to
all vsock users by:
- Reducing memory consumption by ~46% per RX buffer
- Eliminating pressure on higher-order page allocations
- Improving overall system memory efficiency
6. **Low risk**: The change is straightforward with minimal risk of
regression:
- The logic remains the same
- Only the buffer sizing calculation changes
- Has been reviewed by the vsock maintainer (Stefano Garzarella)
The commit meets all the stable tree criteria: it fixes a real problem
(memory waste and higher-order allocations), is minimal in scope,
doesn't introduce new features, and has clear benefits with low risk of
regression.
include/linux/virtio_vsock.h | 7 ++++++-
net/vmw_vsock/virtio_transport.c | 2 +-
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 36fb3edfa403..6c00687539cf 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -111,7 +111,12 @@ static inline size_t virtio_vsock_skb_len(struct sk_buff *skb)
return (size_t)(skb_end_pointer(skb) - skb->head);
}
-#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (1024 * 4)
+/* Dimension the RX SKB so that the entire thing fits exactly into
+ * a single 4KiB page. This avoids wasting memory due to alloc_skb()
+ * rounding up to the next page order and also means that we
+ * don't leave higher-order pages sitting around in the RX queue.
+ */
+#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE SKB_WITH_OVERHEAD(1024 * 4)
#define VIRTIO_VSOCK_MAX_BUF_SIZE 0xFFFFFFFFUL
#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE (1024 * 64)
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index f0e48e6911fc..f01f9e878106 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -307,7 +307,7 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
{
- int total_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM;
+ int total_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
struct scatterlist pkt, *p;
struct virtqueue *vq;
struct sk_buff *skb;
--
2.39.5
next prev parent reply other threads:[~2025-08-05 13:11 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-05 13:08 [PATCH AUTOSEL 6.16-6.6] mfd: axp20x: Set explicit ID for AXP313 regulator Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] MIPS: vpe-mt: add missing prototypes for vpe_{alloc,start,stop,free} Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.10] leds: leds-lp50xx: Handle reg to get correct multi_index Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] scsi: bfa: Double-free fix Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] pinctrl: stm32: Manage irq affinity settings Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16] PCI: dw-rockchip: Delay link training after hot reset in EP mode Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.6] phy: rockchip-pcie: Properly disable TEST_WRITE strobe signal Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.6] soundwire: Move handle_nested_irq outside of sdw_dev_lock Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] media: uvcvideo: Fix bandwidth issue for Alcor camera Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.15] crypto: hisilicon/hpre - fix dma unmap sequence Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.6] soundwire: amd: serialize amd manager resume sequence during pm_prepare Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.15] watchdog: sbsa: Adjust keepalive timeout to avoid MediaTek WS0 race condition Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.6] clk: qcom: ipq5018: keep XO clock always on Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16] media: i2c: vd55g1: Fix RATE macros not being expressed in bps Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] media: usb: hdpvr: disable zero-length read messages Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.15] media: raspberrypi: cfe: Fix min_reqbufs_allocation Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.1] hwmon: (emc2305) Set initial PWM minimum value during probe based on thermal state Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.12] media: uvcvideo: Add quirk for HP Webcam HD 2300 Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.1] drm/amd/display: Only finalize atomic_obj if it was initialized Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] vhost: fail early when __vhost_add_used() fails Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-6.12] scsi: lpfc: Ensure HBA_SETUP flag is used only for SLI4 in dev_loss_tmo_callbk Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16] ext4: limit the maximum folio order Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16-5.4] fs/orangefs: use snprintf() instead of sprintf() Sasha Levin
2025-08-05 13:08 ` [PATCH AUTOSEL 6.16] crypto: caam - Support iMX8QXP and variants thereof Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] crypto: ccp - Add missing bootloader info reg for pspv6 Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] scsi: lpfc: Check for hdwq null ptr when cleaning up lpfc_vport structure Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] media: dvb-frontends: dib7090p: fix null-ptr-deref in dib7090p_rw_on_apb() Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] scsi: pm80xx: Free allocated tags after failure Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] HID: rate-limit hid_warn to prevent log flooding Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16] media: i2c: vd55g1: Setup sensor external clock before patching Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.15] watchdog: iTCO_wdt: Report error if timeout configuration fails Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] media: iris: Add handling for corrupt and drop frames Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] phy: rockchip-pcie: Enable all four lanes if required Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] watchdog: dw_wdt: Fix default timeout Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] MIPS: Don't crash in stack_top() for tasks without ABI or vDSO Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.6] crypto: jitter - fix intermediary handling Sasha Levin
2025-08-05 13:09 ` Sasha Levin [this message]
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.1] MIPS: lantiq: falcon: sysctrl: fix request memory check logic Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] media: tc358743: Check I2C succeeded during probe Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.1] scsi: mpi3mr: Correctly handle ATA device errors Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] clk: renesas: rzg2l: Postpone updating priv->clks[] Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] scsi: mpt3sas: Correctly handle ATA device errors Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] smb: client: fix session setup against servers that require SPN Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.1] fbdev: fix potential buffer overflow in do_register_framebuffer() Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] sphinx: kernel_abi: fix performance regression with O=<dir> Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] media: tc358743: Return an appropriate colorspace from tc358743_set_fmt Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.6] drm/amd/display: Avoid configuring PSR granularity if PSR-SU not supported Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] media: tc358743: Increase FIFO trigger level to 374 Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] jfs: truncate good inode pages when hard link is 0 Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.15] media: v4l2-common: Reduce warnings about missing V4L2_CID_LINK_FREQ control Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.1] dmaengine: stm32-dma: configure next sg only if there are more than 2 sgs Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] RDMA/bnxt_re: Fix size of uverbs_copy_to() in BNXT_RE_METHOD_GET_TOGGLE_MEM Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] cifs: Fix calling CIFSFindFirst() for root path without msearch Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.10] RDMA/core: reduce stack using in nldev_stat_get_doit() Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] scsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] media: dvb-frontends: w7090p: fix null-ptr-deref in w7090p_tuner_write_serpar and w7090p_tuner_read_serpar Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] soundwire: amd: cancel pending slave status handling workqueue during remove sequence Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.6] PCI: xgene-msi: Resend an MSI racing with itself on a different CPU Sasha Levin
2025-08-05 13:20 ` Marc Zyngier
2025-08-05 13:59 ` Sasha Levin
2025-08-05 18:09 ` Marc Zyngier
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.6] clk: tegra: periph: Fix error handling and resolve unsigned compare warning Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] drm/amd/display: Disable dsc_power_gate for dcn314 by default Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask() Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.15] crypto: octeontx2 - add timeout for load_fvc completion poll Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.6] power: supply: qcom_battmgr: Add lithium-polymer entry Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] media: ipu-bridge: Add _HID for OV5670 Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] media: hi556: Fix reset GPIO timings Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.12] clk: thead: Mark essential bus clocks as CLK_IGNORE_UNUSED Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-6.15] media: uvcvideo: Set V4L2_CTRL_FLAG_DISABLED during queryctrl errors Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] jfs: Regular file corruption check Sasha Levin
2025-08-05 13:09 ` [PATCH AUTOSEL 6.16-5.4] jfs: upper bound check of tree index in dbAllocAG Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250805130945.471732-37-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=jasowang@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=patches@lists.linux.dev \
--cc=sgarzare@redhat.com \
--cc=stable@vger.kernel.org \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux.dev \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox