netdev.vger.kernel.org archive mirror
* [PATCH net-next] vhost: use "checked" versions of get_user() and put_user()
@ 2025-11-13  0:55 Jon Kohler
  2025-11-13  1:09 ` Jason Wang
  2025-11-14 18:54 ` David Laight
  0 siblings, 2 replies; 27+ messages in thread
From: Jon Kohler @ 2025-11-13  0:55 UTC
  To: Michael S. Tsirkin, Jason Wang, Eugenio Pérez, kvm,
	virtualization, netdev, linux-kernel
  Cc: Jon Kohler, Linus Torvalds, Borislav Petkov, Sean Christopherson

vhost_get_user and vhost_put_user leverage __get_user and __put_user,
respectively, which were both added in 2016 by commit 6b1e6cc7855b
("vhost: new device IOTLB API"). In a heavy UDP transmit workload on a
vhost-net backed tap device, these functions showed up as ~11.6% of
samples in a flamegraph of the underlying vhost worker thread.

Quoting Linus from [1]:
    Anyway, every single __get_user() call I looked at looked like
    historical garbage. [...] End result: I get the feeling that we
    should just do a global search-and-replace of the __get_user/
    __put_user users, replace them with plain get_user/put_user instead,
    and then fix up any fallout (eg the coco code).

Switch to plain get_user/put_user in vhost, which results in a slight
throughput speedup. get_user now accounts for ~8.4% of samples in the
flamegraph.

Basic iperf3 test on an Intel 5416S CPU with Ubuntu 25.10 guest:
TX: taskset -c 2 iperf3 -c <rx_ip> -t 60 -p 5200 -b 0 -u -i 5
RX: taskset -c 2 iperf3 -s -p 5200 -D
Before: 6.08 Gbits/sec
After:  6.32 Gbits/sec

As to what drives the speedup, Sean's patch [2] explains:
	Use the normal, checked versions for get_user() and put_user() instead of
	the double-underscore versions that omit range checks, as the checked
	versions are actually measurably faster on modern CPUs (12%+ on Intel,
	25%+ on AMD).

	The performance hit on the unchecked versions is almost entirely due to
	the added LFENCE on CPUs where LFENCE is serializing (which is effectively
	all modern CPUs), which was added by commit 304ec1b05031 ("x86/uaccess:
	Use __uaccess_begin_nospec() and uaccess_try_nospec").  The small
	optimizations done by commit b19b74bc99b1 ("x86/mm: Rework address range
	check in get_user() and put_user()") likely shave a few cycles off, but
	the bulk of the extra latency comes from the LFENCE.

[1] https://lore.kernel.org/all/CAHk-=wiJiDSPZJTV7z3Q-u4DfLgQTNWqUqqrwSBHp0+Dh016FA@mail.gmail.com/
[2] https://lore.kernel.org/all/20251106210206.221558-1-seanjc@google.com/

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 drivers/vhost/vhost.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 8570fdf2e14a..ffbd0a9a7a03 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1442,13 +1442,13 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
 ({ \
 	int ret; \
 	if (!vq->iotlb) { \
-		ret = __put_user(x, ptr); \
+		ret = put_user(x, ptr); \
 	} else { \
 		__typeof__(ptr) to = \
 			(__typeof__(ptr)) __vhost_get_user(vq, ptr,	\
 					  sizeof(*ptr), VHOST_ADDR_USED); \
 		if (to != NULL) \
-			ret = __put_user(x, to); \
+			ret = put_user(x, to); \
 		else \
 			ret = -EFAULT;	\
 	} \
@@ -1487,14 +1487,14 @@ static inline int vhost_put_used_idx(struct vhost_virtqueue *vq)
 ({ \
 	int ret; \
 	if (!vq->iotlb) { \
-		ret = __get_user(x, ptr); \
+		ret = get_user(x, ptr); \
 	} else { \
 		__typeof__(ptr) from = \
 			(__typeof__(ptr)) __vhost_get_user(vq, ptr, \
 							   sizeof(*ptr), \
 							   type); \
 		if (from != NULL) \
-			ret = __get_user(x, from); \
+			ret = get_user(x, from); \
 		else \
 			ret = -EFAULT; \
 	} \
-- 
2.43.0



Thread overview: 27+ messages
2025-11-13  0:55 [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() Jon Kohler
2025-11-13  1:09 ` Jason Wang
2025-11-14 14:53   ` Jon Kohler
2025-11-14 17:48     ` Linus Torvalds
2025-11-14 19:08       ` David Laight
2025-11-14 20:48         ` Linus Torvalds
2025-11-14 21:38           ` David Laight
2025-11-17  4:32     ` Jason Wang
2025-11-17 17:34       ` Jon Kohler
2025-11-20  1:57         ` Jason Wang
2025-11-25 19:45           ` Jon Kohler
2025-11-26  6:04             ` Jason Wang
2025-11-26 10:25               ` Arnd Bergmann
2025-11-26 19:47                 ` Jon Kohler
2025-11-26 19:58                   ` Arnd Bergmann
2025-11-26 21:42                     ` Jon Kohler
2025-11-26 21:45                       ` Linus Torvalds
2025-11-27  2:58                         ` Jon Kohler
2025-11-27  1:08                   ` Jason Wang
2025-11-27  3:11                     ` Jon Kohler
2025-11-27  6:31                       ` Michael S. Tsirkin
2025-11-27  6:32                       ` Michael S. Tsirkin
2025-12-02 16:54                         ` Jon Kohler
2025-11-14 18:54 ` David Laight
2025-11-14 19:30   ` Jon Kohler
2025-11-14 20:32     ` David Laight
2025-11-16  6:32     ` Michael S. Tsirkin
