* [PATCH RFC v5 0/5] virtio-net: Offload hashing without eBPF
@ 2025-05-30 4:39 Akihiko Odaki
2025-05-30 4:39 ` [PATCH RFC v5 1/5] net: Allow configuring virtio hashing Akihiko Odaki
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Akihiko Odaki @ 2025-05-30 4:39 UTC (permalink / raw)
To: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
Cc: Akihiko Odaki
I'm proposing to add a feature to offload virtio-net RSS/hash report to
Linux. This series contains patches to utilize the proposed Linux
feature. The patches for Linux are available at:
https://lore.kernel.org/r/20250307-rss-v9-0-df76624025eb@daynix.com/
Note that the referenced patches for Linux implement UAPIs that are
compatible with older versions of this series but not compatible with
this version. Compatible patches for Linux will be posted shortly.
This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/
Patch "docs/devel/ebpf_rss.rst: Update for peer RSS" provides a comparison
of the existing RSS mechanisms and the new one (called "peer RSS") and
explains how QEMU selects one.
---
Changes in v5:
- Changed UAPIs.
- Rebased.
- Link to v4: https://lore.kernel.org/qemu-devel/20250313-hash-v4-0-c75c494b495e@daynix.com
Changes in v4:
- Rebased.
- Added a reference to the documentation to the cover letter.
- Link to v3: https://lore.kernel.org/r/20240915-hash-v3-0-79cb08d28647@daynix.com
---
Akihiko Odaki (5):
net: Allow configuring virtio hashing
virtio-net: Offload hashing to peer
virtio-net: Offload hashing without vhost
tap: Report virtio-net hashing support on Linux
docs/devel/ebpf_rss.rst: Update for peer RSS
docs/devel/ebpf_rss.rst | 22 ++++++++-----
include/net/net.h | 13 ++++++++
net/tap-linux.h | 4 +++
net/tap_int.h | 4 +++
hw/net/virtio-net.c | 84 ++++++++++++++++++++++++++++++++++++++-----------
net/net.c | 11 +++++++
net/tap-bsd.c | 15 +++++++++
net/tap-linux.c | 16 ++++++++++
net/tap-solaris.c | 15 +++++++++
net/tap-stub.c | 15 +++++++++
net/tap.c | 23 ++++++++++++++
net/vhost-vdpa.c | 13 ++++++++
12 files changed, 209 insertions(+), 26 deletions(-)
---
base-commit: f0737158b483e7ec2b2512145aeab888b85cc1f7
change-id: 20240828-hash-628329a45d4d
prerequisite-change-id: 20250530-vdpa-2c481c64ce45:v1
prerequisite-patch-id: a8c36f25d07b2b1a658ec3189bdf8bf12ef5ce8d
prerequisite-patch-id: 414474fbb325338338f2d84f7065be241c27df2b
prerequisite-patch-id: 6db7f777008d6d1fae86df2a1fffcd2ddb394e05
prerequisite-patch-id: 762aad5a811163b3d9326870bb751fbe7853c21c
prerequisite-patch-id: 9dfcea59addbdebfc1cc8fda26c91ebf21482d21
prerequisite-patch-id: 3d391c14efe64541a881421735cf9ff04bb69713
Best regards,
--
Akihiko Odaki <akihiko.odaki@daynix.com>
* [PATCH RFC v5 1/5] net: Allow configuring virtio hashing
From: Akihiko Odaki @ 2025-05-30 4:39 UTC (permalink / raw)
To: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
Cc: Akihiko Odaki
This adds functions to configure virtio hashing and implements them
for Linux's tap. vDPA gets empty functions because its virtio hashing
is configured in load().
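
For readers unfamiliar with the three NetVnetRss fields, the following is a
minimal sketch of how an RSS implementation would typically consume them,
assuming the semantics of the identically named fields in the virtio
specification's RSS configuration. The helper rss_select_queue() and its
example function are hypothetical and not part of this series; the actual
lookup is done by the peer.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the NetVnetRss structure added to include/net/net.h. */
typedef struct NetVnetRss {
    uint32_t hash_types;
    uint16_t indirection_table_mask;
    uint16_t unclassified_queue;
} NetVnetRss;

/*
 * Hypothetical helper (assumption, not in the series): pick a receive
 * queue for a packet whose hash type and hash value are already known.
 */
static uint16_t rss_select_queue(const NetVnetRss *rss, const uint16_t *table,
                                 uint32_t packet_hash_type, uint32_t hash)
{
    /* Packets whose hash type is not enabled go to the fallback queue. */
    if (!(rss->hash_types & packet_hash_type)) {
        return rss->unclassified_queue;
    }

    /* The table length is a power of two, so masking bounds the index. */
    return table[hash & rss->indirection_table_mask];
}

static void rss_select_queue_examples(void)
{
    const NetVnetRss rss = {
        .hash_types = 0x1,
        .indirection_table_mask = 3,
        .unclassified_queue = 7,
    };
    const uint16_t table[4] = { 0, 1, 2, 3 };

    /* 0x12345 & 3 == 1, so the second table entry is used. */
    assert(rss_select_queue(&rss, table, 0x1, 0x12345) == 1);
    /* Hash type 0x2 is not enabled, so the fallback queue is used. */
    assert(rss_select_queue(&rss, table, 0x2, 0x12345) == 7);
}
```

Storing a mask rather than a length keeps the per-packet lookup to a single
AND, which is why the table length has to be a power of two.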
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
include/net/net.h | 13 +++++++++++++
net/tap-linux.h | 3 +++
net/tap_int.h | 3 +++
net/net.c | 11 +++++++++++
net/tap-bsd.c | 10 ++++++++++
net/tap-linux.c | 11 +++++++++++
net/tap-solaris.c | 10 ++++++++++
net/tap-stub.c | 10 ++++++++++
net/tap.c | 15 +++++++++++++++
net/vhost-vdpa.c | 13 +++++++++++++
10 files changed, 99 insertions(+)
diff --git a/include/net/net.h b/include/net/net.h
index 545f4339cec8..779cee7f4a22 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -35,6 +35,12 @@ typedef struct NICConf {
int32_t bootindex;
} NICConf;
+typedef struct NetVnetRss {
+ uint32_t hash_types;
+ uint16_t indirection_table_mask;
+ uint16_t unclassified_queue;
+} NetVnetRss;
+
#define DEFINE_NIC_PROPERTIES(_state, _conf) \
DEFINE_PROP_MACADDR("mac", _state, _conf.macaddr), \
DEFINE_PROP_NETDEV("netdev", _state, _conf.peers)
@@ -61,6 +67,8 @@ typedef void (SetOffload)(NetClientState *, int, int, int, int, int, int, int);
typedef int (GetVnetHdrLen)(NetClientState *);
typedef void (SetVnetHdrLen)(NetClientState *, int);
typedef bool (GetVnetHashSupportedTypes)(NetClientState *, uint32_t *);
+typedef void (SetVnetAutomq)(NetClientState *, uint32_t);
+typedef void (SetVnetRss)(NetClientState *, const NetVnetRss *, bool);
typedef int (SetVnetLE)(NetClientState *, bool);
typedef int (SetVnetBE)(NetClientState *, bool);
typedef struct SocketReadState SocketReadState;
@@ -91,6 +99,8 @@ typedef struct NetClientInfo {
SetVnetLE *set_vnet_le;
SetVnetBE *set_vnet_be;
GetVnetHashSupportedTypes *get_vnet_hash_supported_types;
+ SetVnetAutomq *set_vnet_automq;
+ SetVnetRss *set_vnet_rss;
NetAnnounce *announce;
SetSteeringEBPF *set_steering_ebpf;
NetCheckPeerType *check_peer_type;
@@ -192,6 +202,9 @@ void qemu_set_offload(NetClientState *nc, int csum, int tso4, int tso6,
int qemu_get_vnet_hdr_len(NetClientState *nc);
void qemu_set_vnet_hdr_len(NetClientState *nc, int len);
bool qemu_get_vnet_hash_supported_types(NetClientState *nc, uint32_t *types);
+void qemu_set_vnet_automq(NetClientState *nc, uint32_t hash_types);
+void qemu_set_vnet_rss(NetClientState *nc, const NetVnetRss *rss,
+ bool hash_report);
int qemu_set_vnet_le(NetClientState *nc, bool is_le);
int qemu_set_vnet_be(NetClientState *nc, bool is_be);
void qemu_macaddr_default_if_unset(MACAddr *macaddr);
diff --git a/net/tap-linux.h b/net/tap-linux.h
index 9a58cecb7f47..5bca6cab1867 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -32,6 +32,9 @@
#define TUNSETVNETLE _IOW('T', 220, int)
#define TUNSETVNETBE _IOW('T', 222, int)
#define TUNSETSTEERINGEBPF _IOR('T', 224, int)
+#define TUNSETVNETREPORTINGAUTOMQ _IOR('T', 229, __u32)
+#define TUNSETVNETREPORTINGRSS _IOR('T', 230, NetVnetRss)
+#define TUNSETVNETRSS _IOR('T', 231, struct NetVnetRss)
#endif
diff --git a/net/tap_int.h b/net/tap_int.h
index 8857ff299d22..248d1efa51a0 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -27,6 +27,7 @@
#define NET_TAP_INT_H
#include "qapi/qapi-types-net.h"
+#include "net/net.h"
int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
int vnet_hdr_required, int mq_required, Error **errp);
@@ -40,6 +41,8 @@ int tap_probe_has_uso(int fd);
void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo,
int uso4, int uso6);
void tap_fd_set_vnet_hdr_len(int fd, int len);
+void tap_fd_set_vnet_automq(int fd, uint32_t hash_types);
+void tap_fd_set_vnet_rss(int fd, const NetVnetRss *rss, bool hash_report);
int tap_fd_set_vnet_le(int fd, int vnet_is_le);
int tap_fd_set_vnet_be(int fd, int vnet_is_be);
int tap_fd_enable(int fd);
diff --git a/net/net.c b/net/net.c
index d0ae3db0d864..7e21e1a373ab 100644
--- a/net/net.c
+++ b/net/net.c
@@ -582,6 +582,17 @@ bool qemu_get_vnet_hash_supported_types(NetClientState *nc, uint32_t *types)
return nc->info->get_vnet_hash_supported_types(nc, types);
}
+void qemu_set_vnet_automq(NetClientState *nc, uint32_t hash_types)
+{
+ nc->info->set_vnet_automq(nc, hash_types);
+}
+
+void qemu_set_vnet_rss(NetClientState *nc, const NetVnetRss *rss,
+ bool hash_report)
+{
+ nc->info->set_vnet_rss(nc, rss, hash_report);
+}
+
int qemu_set_vnet_le(NetClientState *nc, bool is_le)
{
#if HOST_BIG_ENDIAN
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index b4c84441ba8b..8ed384f02c5b 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -221,6 +221,16 @@ void tap_fd_set_vnet_hdr_len(int fd, int len)
{
}
+void tap_fd_set_vnet_automq(int fd, uint32_t hash_types)
+{
+ g_assert_not_reached();
+}
+
+void tap_fd_set_vnet_rss(int fd, const NetVnetRss *rss, bool hash_report)
+{
+ g_assert_not_reached();
+}
+
int tap_fd_set_vnet_le(int fd, int is_le)
{
return -EINVAL;
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 22ec2f45d2b7..d0adb168e977 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -205,6 +205,17 @@ void tap_fd_set_vnet_hdr_len(int fd, int len)
}
}
+void tap_fd_set_vnet_automq(int fd, uint32_t hash_types)
+{
+ assert(!ioctl(fd, TUNSETVNETREPORTINGAUTOMQ, &hash_types));
+}
+
+void tap_fd_set_vnet_rss(int fd, const NetVnetRss *rss, bool hash_report)
+{
+ unsigned int cmd = hash_report ? TUNSETVNETREPORTINGRSS : TUNSETVNETRSS;
+ assert(!ioctl(fd, cmd, rss));
+}
+
int tap_fd_set_vnet_le(int fd, int is_le)
{
int arg = is_le ? 1 : 0;
diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index 51b7830bef1d..bc76a030e7f9 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -225,6 +225,16 @@ void tap_fd_set_vnet_hdr_len(int fd, int len)
{
}
+void tap_fd_set_vnet_automq(int fd, uint32_t hash_types)
+{
+ g_assert_not_reached();
+}
+
+void tap_fd_set_vnet_rss(int fd, const NetVnetRss *rss, bool hash_report)
+{
+ g_assert_not_reached();
+}
+
int tap_fd_set_vnet_le(int fd, int is_le)
{
return -EINVAL;
diff --git a/net/tap-stub.c b/net/tap-stub.c
index 38673434cbd6..511ddfc707eb 100644
--- a/net/tap-stub.c
+++ b/net/tap-stub.c
@@ -56,6 +56,16 @@ void tap_fd_set_vnet_hdr_len(int fd, int len)
{
}
+void tap_fd_set_vnet_automq(int fd, uint32_t hash_types)
+{
+ g_assert_not_reached();
+}
+
+void tap_fd_set_vnet_rss(int fd, const NetVnetRss *rss, bool hash_report)
+{
+ g_assert_not_reached();
+}
+
int tap_fd_set_vnet_le(int fd, int is_le)
{
return -EINVAL;
diff --git a/net/tap.c b/net/tap.c
index ae1c7e398321..e93f5f951057 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -248,6 +248,19 @@ static void tap_set_vnet_hdr_len(NetClientState *nc, int len)
s->using_vnet_hdr = true;
}
+static void tap_set_vnet_automq(NetClientState *nc, uint32_t hash_types)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+ return tap_fd_set_vnet_automq(s->fd, hash_types);
+}
+
+static void tap_set_vnet_rss(NetClientState *nc, const NetVnetRss *rss,
+ bool hash_report)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+ return tap_fd_set_vnet_rss(s->fd, rss, hash_report);
+}
+
static int tap_set_vnet_le(NetClientState *nc, bool is_le)
{
TAPState *s = DO_UPCAST(TAPState, nc, nc);
@@ -344,6 +357,8 @@ static NetClientInfo net_tap_info = {
.has_vnet_hdr_len = tap_has_vnet_hdr_len,
.set_offload = tap_set_offload,
.set_vnet_hdr_len = tap_set_vnet_hdr_len,
+ .set_vnet_automq = tap_set_vnet_automq,
+ .set_vnet_rss = tap_set_vnet_rss,
.set_vnet_le = tap_set_vnet_le,
.set_vnet_be = tap_set_vnet_be,
.set_steering_ebpf = tap_set_steering_ebpf,
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 149c0f7f1766..43822f1f79da 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -271,6 +271,15 @@ static bool vhost_vdpa_get_vnet_hash_supported_types(NetClientState *nc,
return true;
}
+static void vhost_vdpa_set_vnet_automq(NetClientState *nc, uint32_t hash_types)
+{
+}
+
+static void vhost_vdpa_set_vnet_rss(NetClientState *nc, const NetVnetRss *rss,
+ bool hash_report)
+{
+}
+
static bool vhost_vdpa_has_ufo(NetClientState *nc)
{
assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
@@ -461,6 +470,8 @@ static NetClientInfo net_vhost_vdpa_info = {
.cleanup = vhost_vdpa_cleanup,
.has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
.get_vnet_hash_supported_types = vhost_vdpa_get_vnet_hash_supported_types,
+ .set_vnet_automq = vhost_vdpa_set_vnet_automq,
+ .set_vnet_rss = vhost_vdpa_set_vnet_rss,
.has_ufo = vhost_vdpa_has_ufo,
.set_vnet_le = vhost_vdpa_set_vnet_le,
.check_peer_type = vhost_vdpa_check_peer_type,
@@ -1335,6 +1346,8 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
.cleanup = vhost_vdpa_cleanup,
.has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
.get_vnet_hash_supported_types = vhost_vdpa_get_vnet_hash_supported_types,
+ .set_vnet_automq = vhost_vdpa_set_vnet_automq,
+ .set_vnet_rss = vhost_vdpa_set_vnet_rss,
.has_ufo = vhost_vdpa_has_ufo,
.check_peer_type = vhost_vdpa_check_peer_type,
};
--
2.49.0
* [PATCH RFC v5 2/5] virtio-net: Offload hashing to peer
From: Akihiko Odaki @ 2025-05-30 4:39 UTC (permalink / raw)
To: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
Cc: Akihiko Odaki
This allows offloading hash reporting and RSS to tap.
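
The buffer handed to qemu_set_vnet_rss() in the diff below is a NetVnetRss
header followed directly by the indirection table and then the hash key. The
following standalone sketch reproduces that layout outside QEMU so the sizes
are easy to check. The helper rss_pack() and its example function are
hypothetical names used only for illustration, and plain malloc() stands in
for g_malloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the NetVnetRss structure from patch 1/5. */
typedef struct NetVnetRss {
    uint32_t hash_types;
    uint16_t indirection_table_mask;
    uint16_t unclassified_queue;
} NetVnetRss;

/*
 * Hypothetical helper (assumption, not in the series): build
 * [header][indirection table][key] in one allocation.
 * The caller frees the result with free().
 */
static uint8_t *rss_pack(uint32_t hash_types, uint16_t default_queue,
                         const uint16_t *table, size_t table_len,
                         const uint8_t *key, size_t key_len, size_t *out_len)
{
    size_t table_size = table_len * sizeof(*table);
    size_t total = sizeof(NetVnetRss) + table_size + key_len;
    NetVnetRss hdr;
    uint8_t *buf;

    /* The mask encoding only works for power-of-two table lengths. */
    assert(table_len != 0 && table_len <= 65536);
    assert((table_len & (table_len - 1)) == 0);

    buf = malloc(total);
    if (!buf) {
        return NULL;
    }

    hdr.hash_types = hash_types;
    hdr.indirection_table_mask = (uint16_t)(table_len - 1);
    hdr.unclassified_queue = default_queue;

    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), table, table_size);
    memcpy(buf + sizeof(hdr) + table_size, key, key_len);

    *out_len = total;
    return buf;
}

static void rss_pack_example(void)
{
    const uint16_t table[4] = { 0, 1, 2, 3 };
    const uint8_t key[40] = { 0 };
    size_t len = 0;
    NetVnetRss hdr;
    uint8_t *buf = rss_pack(0x3, 0, table, 4, key, sizeof(key), &len);

    assert(buf != NULL);
    assert(len == sizeof(NetVnetRss) + sizeof(table) + sizeof(key));
    memcpy(&hdr, buf, sizeof(hdr));
    assert(hdr.indirection_table_mask == 3);
    free(buf);
}
```

Because the table and key lengths are not carried in the header, the receiver
presumably derives the table size from indirection_table_mask and must agree
with QEMU on the key size; that reading is an inference from the layout, not
something the patch states.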
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
hw/net/virtio-net.c | 69 +++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 56 insertions(+), 13 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 52fe404b3431..0a333d560d7b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1213,20 +1213,58 @@ static void virtio_net_detach_ebpf_rss(VirtIONet *n)
static void virtio_net_commit_rss_config(VirtIONet *n)
{
- if (n->rss_data.peer_hash_available) {
- return;
- }
-
if (n->rss_data.enabled) {
- n->rss_data.enabled_software_rss = n->rss_data.populate_hash;
- if (n->rss_data.populate_hash) {
- virtio_net_detach_ebpf_rss(n);
- } else if (!virtio_net_attach_ebpf_rss(n)) {
- if (get_vhost_net(qemu_get_queue(n->nic)->peer)) {
- warn_report("Can't load eBPF RSS for vhost");
+ if (n->rss_data.peer_hash_available &&
+ (n->rss_data.peer_hash_types & n->rss_data.runtime_hash_types) ==
+ n->rss_data.runtime_hash_types) {
+ if (n->rss_data.redirect) {
+ size_t indirection_table_size =
+ n->rss_data.indirections_len *
+ sizeof(*n->rss_data.indirections_table);
+
+ size_t hash_size = sizeof(NetVnetRss) +
+ indirection_table_size +
+ sizeof(n->rss_data.key);
+
+ g_autofree struct {
+ NetVnetRss hdr;
+ uint8_t footer[];
+ } *rss = g_malloc(hash_size);
+
+ rss->hdr.hash_types = n->rss_data.runtime_hash_types;
+ rss->hdr.indirection_table_mask =
+ n->rss_data.indirections_len - 1;
+ rss->hdr.unclassified_queue = n->rss_data.default_queue;
+
+ memcpy(rss->footer, n->rss_data.indirections_table,
+ indirection_table_size);
+
+ memcpy(rss->footer + indirection_table_size, n->rss_data.key,
+ sizeof(n->rss_data.key));
+
+ qemu_set_vnet_rss(qemu_get_queue(n->nic)->peer, &rss->hdr,
+ n->rss_data.populate_hash);
} else {
- warn_report("Can't load eBPF RSS - fallback to software RSS");
- n->rss_data.enabled_software_rss = true;
+ qemu_set_vnet_automq(qemu_get_queue(n->nic)->peer,
+ n->rss_data.runtime_hash_types);
+ }
+
+ n->rss_data.enabled_software_rss = false;
+ } else {
+ if (n->rss_data.peer_hash_available) {
+ qemu_set_vnet_automq(qemu_get_queue(n->nic)->peer, 0);
+ }
+
+ n->rss_data.enabled_software_rss = n->rss_data.populate_hash;
+ if (n->rss_data.populate_hash) {
+ virtio_net_detach_ebpf_rss(n);
+ } else if (!virtio_net_attach_ebpf_rss(n)) {
+ if (get_vhost_net(qemu_get_queue(n->nic)->peer)) {
+ warn_report("Can't load eBPF RSS for vhost");
+ } else {
+ warn_report("Can't load eBPF RSS - fallback to software RSS");
+ n->rss_data.enabled_software_rss = true;
+ }
}
}
@@ -1235,7 +1273,12 @@ static void virtio_net_commit_rss_config(VirtIONet *n)
n->rss_data.indirections_len,
sizeof(n->rss_data.key));
} else {
- virtio_net_detach_ebpf_rss(n);
+ if (n->rss_data.peer_hash_available) {
+ qemu_set_vnet_automq(qemu_get_queue(n->nic)->peer, 0);
+ } else {
+ virtio_net_detach_ebpf_rss(n);
+ }
+
trace_virtio_net_rss_disable(n);
}
}
--
2.49.0
* [PATCH RFC v5 3/5] virtio-net: Offload hashing without vhost
From: Akihiko Odaki @ 2025-05-30 4:39 UTC (permalink / raw)
To: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
Cc: Akihiko Odaki
This is necessary to offload hashing to tap.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
hw/net/virtio-net.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 0a333d560d7b..3469c211b13a 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1968,7 +1968,8 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
}
receive_header(n, sg, elem->in_num, buf, size);
- if (n->rss_data.populate_hash) {
+ if (n->rss_data.enabled_software_rss &&
+ n->rss_data.populate_hash) {
offset = offsetof(typeof(extra_hdr), hash_value);
iov_from_buf(sg, elem->in_num, offset,
(char *)&extra_hdr + offset,
@@ -3099,11 +3100,13 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,
}
if (!get_vhost_net(nc->peer)) {
- if (!use_own_hash) {
- virtio_clear_feature(&features, VIRTIO_NET_F_HASH_REPORT);
- virtio_clear_feature(&features, VIRTIO_NET_F_RSS);
- } else if (virtio_has_feature(features, VIRTIO_NET_F_RSS)) {
- virtio_net_load_ebpf(n, errp);
+ if (!use_peer_hash) {
+ if (!use_own_hash) {
+ virtio_clear_feature(&features, VIRTIO_NET_F_HASH_REPORT);
+ virtio_clear_feature(&features, VIRTIO_NET_F_RSS);
+ } else if (virtio_has_feature(features, VIRTIO_NET_F_RSS)) {
+ virtio_net_load_ebpf(n, errp);
+ }
}
return features;
--
2.49.0
* [PATCH RFC v5 4/5] tap: Report virtio-net hashing support on Linux
From: Akihiko Odaki @ 2025-05-30 4:39 UTC (permalink / raw)
To: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
Cc: Akihiko Odaki
This allows offloading virtio-net hashing to tap on Linux.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
net/tap-linux.h | 1 +
net/tap_int.h | 1 +
net/tap-bsd.c | 5 +++++
net/tap-linux.c | 5 +++++
net/tap-solaris.c | 5 +++++
net/tap-stub.c | 5 +++++
net/tap.c | 8 ++++++++
7 files changed, 30 insertions(+)
diff --git a/net/tap-linux.h b/net/tap-linux.h
index 5bca6cab1867..fe30f4f27788 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -32,6 +32,7 @@
#define TUNSETVNETLE _IOW('T', 220, int)
#define TUNSETVNETBE _IOW('T', 222, int)
#define TUNSETSTEERINGEBPF _IOR('T', 224, int)
+#define TUNGETVNETHASHTYPES _IOR('T', 228, __u32)
#define TUNSETVNETREPORTINGAUTOMQ _IOR('T', 229, __u32)
#define TUNSETVNETREPORTINGRSS _IOR('T', 230, NetVnetRss)
#define TUNSETVNETRSS _IOR('T', 231, struct NetVnetRss)
diff --git a/net/tap_int.h b/net/tap_int.h
index 248d1efa51a0..5ff9ca721928 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -36,6 +36,7 @@ ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen);
void tap_set_sndbuf(int fd, const NetdevTapOptions *tap, Error **errp);
int tap_probe_vnet_hdr(int fd, Error **errp);
+bool tap_probe_vnet_hash_supported_types(int fd, uint32_t *types);
int tap_probe_has_ufo(int fd);
int tap_probe_has_uso(int fd);
void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo,
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index 8ed384f02c5b..749732138502 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -217,6 +217,11 @@ int tap_probe_has_uso(int fd)
return 0;
}
+bool tap_probe_vnet_hash_supported_types(int fd, uint32_t *types)
+{
+ return false;
+}
+
void tap_fd_set_vnet_hdr_len(int fd, int len)
{
}
diff --git a/net/tap-linux.c b/net/tap-linux.c
index d0adb168e977..76fc88acaa18 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -196,6 +196,11 @@ int tap_probe_has_uso(int fd)
return 1;
}
+bool tap_probe_vnet_hash_supported_types(int fd, uint32_t *types)
+{
+ return !ioctl(fd, TUNGETVNETHASHTYPES, types);
+}
+
void tap_fd_set_vnet_hdr_len(int fd, int len)
{
if (ioctl(fd, TUNSETVNETHDRSZ, &len) == -1) {
diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index bc76a030e7f9..65234c49a196 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -221,6 +221,11 @@ int tap_probe_has_uso(int fd)
return 0;
}
+bool tap_probe_vnet_hash_supported_types(int fd, uint32_t *types)
+{
+ return false;
+}
+
void tap_fd_set_vnet_hdr_len(int fd, int len)
{
}
diff --git a/net/tap-stub.c b/net/tap-stub.c
index 511ddfc707eb..281bae2615d2 100644
--- a/net/tap-stub.c
+++ b/net/tap-stub.c
@@ -52,6 +52,11 @@ int tap_probe_has_uso(int fd)
return 0;
}
+bool tap_probe_vnet_hash_supported_types(int fd, uint32_t *types)
+{
+ return false;
+}
+
void tap_fd_set_vnet_hdr_len(int fd, int len)
{
}
diff --git a/net/tap.c b/net/tap.c
index e93f5f951057..4a8adcf447eb 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -248,6 +248,13 @@ static void tap_set_vnet_hdr_len(NetClientState *nc, int len)
s->using_vnet_hdr = true;
}
+static bool tap_get_vnet_hash_supported_types(NetClientState *nc,
+ uint32_t *types)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+ return tap_probe_vnet_hash_supported_types(s->fd, types);
+}
+
static void tap_set_vnet_automq(NetClientState *nc, uint32_t hash_types)
{
TAPState *s = DO_UPCAST(TAPState, nc, nc);
@@ -357,6 +364,7 @@ static NetClientInfo net_tap_info = {
.has_vnet_hdr_len = tap_has_vnet_hdr_len,
.set_offload = tap_set_offload,
.set_vnet_hdr_len = tap_set_vnet_hdr_len,
+ .get_vnet_hash_supported_types = tap_get_vnet_hash_supported_types,
.set_vnet_automq = tap_set_vnet_automq,
.set_vnet_rss = tap_set_vnet_rss,
.set_vnet_le = tap_set_vnet_le,
--
2.49.0
* [PATCH RFC v5 5/5] docs/devel/ebpf_rss.rst: Update for peer RSS
From: Akihiko Odaki @ 2025-05-30 4:39 UTC (permalink / raw)
To: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
Cc: Akihiko Odaki
eBPF RSS virtio-net support was written under the assumption that there is
only one alternative RSS implementation: 'in-qemu' RSS. This is no longer
true: there is now another implementation, namely peer RSS.
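
The selection order described by the updated documentation, combined with the
checks in patch 2/5, can be sketched roughly as follows. This is a simplified
illustration rather than the code in hw/net/virtio-net.c: the enum, the
function choose_rss_impl() and its parameters are hypothetical, and feature
negotiation is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; they do not appear in the series. */
typedef enum {
    RSS_IMPL_PEER,
    RSS_IMPL_EBPF,
    RSS_IMPL_IN_QEMU,
    RSS_IMPL_NONE,
} RssImpl;

static RssImpl choose_rss_impl(bool peer_available, uint32_t peer_types,
                               uint32_t wanted_types, bool hash_report,
                               bool ebpf_loaded, bool vhost)
{
    /* Peer RSS is preferred, but only if it covers every wanted hash type. */
    if (peer_available && (peer_types & wanted_types) == wanted_types) {
        return RSS_IMPL_PEER;
    }

    /* eBPF RSS does not support hash reporting. */
    if (!hash_report && ebpf_loaded) {
        return RSS_IMPL_EBPF;
    }

    /* In-QEMU RSS needs the packets to pass through QEMU, so no vhost. */
    if (!vhost) {
        return RSS_IMPL_IN_QEMU;
    }

    return RSS_IMPL_NONE;
}

static void choose_rss_impl_examples(void)
{
    /* The peer supports a superset of the wanted types. */
    assert(choose_rss_impl(true, 0x3f, 0x0f, true, true, false) ==
           RSS_IMPL_PEER);
    /* The peer lacks some wanted types; eBPF works without hash report. */
    assert(choose_rss_impl(true, 0x03, 0x0f, false, true, true) ==
           RSS_IMPL_EBPF);
    /* Hash report without peer support falls back to in-QEMU RSS. */
    assert(choose_rss_impl(false, 0, 0x0f, true, true, false) ==
           RSS_IMPL_IN_QEMU);
}
```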
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
docs/devel/ebpf_rss.rst | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/docs/devel/ebpf_rss.rst b/docs/devel/ebpf_rss.rst
index ed5d33767bd5..739d0259a168 100644
--- a/docs/devel/ebpf_rss.rst
+++ b/docs/devel/ebpf_rss.rst
@@ -7,9 +7,21 @@ eBPF RSS virtio-net support
RSS(Receive Side Scaling) is used to distribute network packets to guest virtqueues
by calculating packet hash. Usually every queue is processed then by a specific guest CPU core.
-For now there are 2 RSS implementations in qemu:
-- 'in-qemu' RSS (functions if qemu receives network packets, i.e. vhost=off)
-- eBPF RSS (can function with also with vhost=on)
+For now there are 3 RSS implementations in qemu:
+1. Peer RSS
+2. eBPF RSS
+3. 'In-QEMU' RSS
+
+'In-QEMU' RSS is incompatible with vhost since the packets are not routed to
+QEMU. eBPF RSS requires Linux 5.8+. Peer RSS requires the peer to implement RSS.
+Currently QEMU can use the RSS implementation of vDPA and Linux's TUN module,
+which is currently being upstreamed.
+
+eBPF RSS does not support hash reporting. Peer RSS may support limited hash
+types.
+
+virtio-net automatically chooses the RSS implementation to use. Peer RSS is
+the most preferred, and 'in-QEMU' RSS is the least.
eBPF support (CONFIG_EBPF) is enabled by 'configure' script.
To enable eBPF RSS support use './configure --enable-bpf'.
@@ -49,9 +61,6 @@ eBPF RSS turned on by different combinations of vhost-net, vitrio-net and tap co
tap,vhost=on & virtio-net-pci,rss=on,hash=on
-If CONFIG_EBPF is not set then only 'in-qemu' RSS is supported.
-Also 'in-qemu' RSS, as a fallback, is used if the eBPF program failed to load or set to TUN.
-
RSS eBPF program
----------------
@@ -67,7 +76,6 @@ Prerequisites to recompile the eBPF program (regenerate ebpf/rss.bpf.skeleton.h)
$ make -f Makefile.ebpf
Current eBPF RSS implementation uses 'bounded loops' with 'backward jump instructions' which present in the last kernels.
-Overall eBPF RSS works on kernels 5.8+.
eBPF RSS implementation
-----------------------
--
2.49.0
* Re: [PATCH RFC v5 0/5] virtio-net: Offload hashing without eBPF
From: Lei Yang @ 2025-06-03 6:35 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, Yuri Benditovich, Andrew Melnychenko,
Michael S . Tsirkin, Jason Wang, Paolo Abeni, devel
I tested this series of patches with the virtio-net regression tests;
everything works fine.
Tested-by: Lei Yang <leiyang@redhat.com>
On Fri, May 30, 2025 at 12:40 PM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>
> I'm proposing to add a feature to offload virtio-net RSS/hash report to
> Linux. This series contains patches to utilize the proposed Linux
> feature. The patches for Linux are available at:
> https://lore.kernel.org/r/20250307-rss-v9-0-df76624025eb@daynix.com/
>
> Note that the referenced patches for Linux implement UAPIs that are
> compatible with older versions of this series but not compatible with
> this version. Compatible patches for Linux will be posted shortly.
>
> This work was presented at LPC 2024:
> https://lpc.events/event/18/contributions/1963/
>
> Patch "docs/devel/ebpf_rss.rst: Update for peer RSS" provides a comparison
> of the existing RSS mechanisms and the new one (called "peer RSS") and
> explains how QEMU selects one.
>
> ---
> Changes in v5:
> - Changed UAPIs.
> - Rebased.
> - Link to v4: https://lore.kernel.org/qemu-devel/20250313-hash-v4-0-c75c494b495e@daynix.com
>
> Changes in v4:
> - Rebased.
> - Added a reference to the documentation to the cover letter.
> - Link to v3: https://lore.kernel.org/r/20240915-hash-v3-0-79cb08d28647@daynix.com
>
> ---
> Akihiko Odaki (5):
> net: Allow configuring virtio hashing
> virtio-net: Offload hashing to peer
> virtio-net: Offload hashing without vhost
> tap: Report virtio-net hashing support on Linux
> docs/devel/ebpf_rss.rst: Update for peer RSS
>
> docs/devel/ebpf_rss.rst | 22 ++++++++-----
> include/net/net.h | 13 ++++++++
> net/tap-linux.h | 4 +++
> net/tap_int.h | 4 +++
> hw/net/virtio-net.c | 84 ++++++++++++++++++++++++++++++++++++++-----------
> net/net.c | 11 +++++++
> net/tap-bsd.c | 15 +++++++++
> net/tap-linux.c | 16 ++++++++++
> net/tap-solaris.c | 15 +++++++++
> net/tap-stub.c | 15 +++++++++
> net/tap.c | 23 ++++++++++++++
> net/vhost-vdpa.c | 13 ++++++++
> 12 files changed, 209 insertions(+), 26 deletions(-)
> ---
> base-commit: f0737158b483e7ec2b2512145aeab888b85cc1f7
> change-id: 20240828-hash-628329a45d4d
> prerequisite-change-id: 20250530-vdpa-2c481c64ce45:v1
> prerequisite-patch-id: a8c36f25d07b2b1a658ec3189bdf8bf12ef5ce8d
> prerequisite-patch-id: 414474fbb325338338f2d84f7065be241c27df2b
> prerequisite-patch-id: 6db7f777008d6d1fae86df2a1fffcd2ddb394e05
> prerequisite-patch-id: 762aad5a811163b3d9326870bb751fbe7853c21c
> prerequisite-patch-id: 9dfcea59addbdebfc1cc8fda26c91ebf21482d21
> prerequisite-patch-id: 3d391c14efe64541a881421735cf9ff04bb69713
>
> Best regards,
> --
> Akihiko Odaki <akihiko.odaki@daynix.com>
>
>