* [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net
@ 2026-03-12 7:09 Cindy Lu
2026-03-12 7:09 ` [RFC v2 1/9] net/filter: allow redirector on vhost TAP backends Cindy Lu
` (9 more replies)
0 siblings, 10 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Hi all,
This series adds AF_PACKET support for vhost TAP devices to
filter-redirector and filter-buffer. When vhost=on, the filters use
AF_PACKET sockets to capture and inject packets.
Example Usage (unchanged from the existing upstream code)
=============
Primary VM (mirror incoming packets to secondary via chardev socket):
-netdev tap,id=net0,vhost=on,...
-chardev socket,id=mirror0,host=...,port=...,server=on,wait=off
-object filter-redirector,id=vm1redir,netdev=net0,outdev=mirror0...
Secondary VM (receive mirrored packets):
-netdev tap,id=net0,vhost=on,...
-chardev socket,id=red0,host=...,port=...,reconnect-ms=..
-object filter-buffer,id=swbuf,netdev=net0,queue=tx,interval=1000000,status=off.....
-object filter-redirector,id=r1,netdev=net0,queue=tx,indev=red0,status=off,enable_when_stopped=true....
TODO
=======
This series is still based on the TAP device. vhost-vdpa support is in progress and will be sent soon.
Changelog
=========
Changes in v2:
1. Add support for filter-buffer.
2. Remove in_netdev and out_netdev for the AF_PACKET bind; only netdev is used now.
With vhost=on, AF_PACKET is used to capture and inject; with vhost=off, the
existing code path is used.
3. Add a CAP_NET_RAW check.
4. Address review comments.
Testing
=======
- Tested with vhost=on/off TAP netdev on x86_64
Cindy Lu (9):
net/filter: allow redirector on vhost TAP backends
net/filter-redirector: add role helpers for AF_PACKET paths
net/filter-redirector: add AF_PACKET socket setup and input handler
net/filter-redirector: add send helpers and netdev counters
net/filter-redirector: route chardev and AF_PACKET receive paths
net/filter: Add support for filter-buffer
virtio-net: keep tap read polling disabled while vhost owns RX
virtio-net: handle short vnet headers on replay RX
net/filter-redirector: check CAP_NET_RAW before creating AF_PACKET
hw/net/virtio-net.c | 66 +++++-
include/net/queue.h | 5 +
net/filter-mirror.c | 493 ++++++++++++++++++++++++++++++++++++++++++--
net/filter.c | 16 +-
4 files changed, 551 insertions(+), 29 deletions(-)
--
2.52.0
* [RFC v2 1/9] net/filter: allow redirector on vhost TAP backends
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 2/9] net/filter-redirector: add role helpers for AF_PACKET paths Cindy Lu
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
netfilter_complete() currently rejects every filter attached to a
vhost-backed netdev. That prevents filter-redirector from being used on
the TAP backends that handle switchover capture and replay.
Permit filter-redirector on vhost-backed TAP netdevs, but keep the gate
narrow: other filters are still rejected and non-TAP backends remain
unsupported. Later commits can widen the filter set without duplicating
the backend restriction.
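
For illustration, the gate described above can be modeled as a small
standalone function. This is only a sketch under stated assumptions, not
QEMU code: the enums and the name vhost_filter_allowed() are hypothetical
stand-ins for the object_dynamic_cast() and NET_CLIENT_DRIVER_TAP checks
in netfilter_complete().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's filter object types and netdev drivers. */
enum filter_kind { FILTER_REDIRECTOR, FILTER_BUFFER, FILTER_OTHER };
enum backend_kind { BACKEND_TAP, BACKEND_OTHER };

/*
 * Model of the gate at this point in the series: without vhost every
 * filter is accepted; with vhost only filter-redirector is accepted,
 * and only on a TAP backend.
 */
bool vhost_filter_allowed(bool vhost, enum filter_kind filter,
                          enum backend_kind backend)
{
    if (!vhost) {
        return true;
    }
    if (filter != FILTER_REDIRECTOR) {
        return false;
    }
    return backend == BACKEND_TAP;
}
```

Keeping the filter check and the backend check separate is what would let
a later change widen the filter set without touching the TAP restriction.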
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
net/filter.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/net/filter.c b/net/filter.c
index 76345c1a9d..b9646b9e00 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -255,8 +255,19 @@ static void netfilter_complete(UserCreatable *uc, Error **errp)
}
if (get_vhost_net(ncs[0])) {
- error_setg(errp, "Vhost is not supported");
- return;
+ bool redirector = object_dynamic_cast(OBJECT(uc),
+ "filter-redirector");
+ bool buffer = object_dynamic_cast(OBJECT(uc), "filter-buffer");
+ bool vhost_filter = redirector || buffer;
+
+ if (!redirector) {
+ error_setg(errp, "Vhost is not supported");
+ return;
+ }
+ if (vhost_filter && ncs[0]->info->type != NET_CLIENT_DRIVER_TAP) {
+ error_setg(errp, "Vhost filter support requires a TAP backend");
+ return;
+ }
}
if (strcmp(nf->position, "head") && strcmp(nf->position, "tail")) {
--
2.52.0
* [RFC v2 2/9] net/filter-redirector: add role helpers for AF_PACKET paths
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
2026-03-12 7:09 ` [RFC v2 1/9] net/filter: allow redirector on vhost TAP backends Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 3/9] net/filter-redirector: add AF_PACKET socket setup and input handler Cindy Lu
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Add helpers that tell whether a redirector instance should create an
AF_PACKET capture socket or inject socket. Later commits use them when
wiring up the TAP datapath.
While here, let the indev-only inject role enable
allow_send_when_stopped, and guard
filter_redirector_vm_state_change() against a missing nc.
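
As a rough sketch of the role selection, the two helpers can be viewed as
one classification over the redirector properties. The struct, enum, and
function names below are hypothetical and exist only for this example;
the real helpers take a NetFilterState and call get_vhost_net().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the redirector configuration. */
struct redir_cfg {
    const char *indev;   /* chardev id to read from, or NULL */
    const char *outdev;  /* chardev id to write to, or NULL */
    bool vhost;          /* netdev exists and is vhost-backed */
};

enum redir_role { ROLE_NONE, ROLE_CAPTURE, ROLE_INJECT };

/*
 * outdev only -> capture from the TAP device and forward to the chardev.
 * indev only  -> read from the chardev and inject into the TAP device.
 * Anything else, or a non-vhost netdev, uses no AF_PACKET socket.
 */
enum redir_role redir_classify(const struct redir_cfg *cfg)
{
    if (!cfg->vhost) {
        return ROLE_NONE;
    }
    if (cfg->outdev && !cfg->indev) {
        return ROLE_CAPTURE;
    }
    if (cfg->indev && !cfg->outdev) {
        return ROLE_INJECT;
    }
    return ROLE_NONE;
}
```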
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
net/filter-mirror.c | 33 ++++++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 7 deletions(-)
diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index ab711e8835..376b7da025 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -22,6 +22,7 @@
#include "qemu/error-report.h"
#include "trace.h"
#include "chardev/char-fe.h"
+#include "net/vhost_net.h"
#include "qemu/iov.h"
#include "qemu/sockets.h"
#include "block/aio-wait.h"
@@ -62,6 +63,24 @@ typedef struct FilterSendCo {
int ret;
} FilterSendCo;
+static bool filter_redirector_use_inject_netdev(NetFilterState *nf)
+{
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+
+ return s->indev && !s->outdev &&
+ nf->netdev &&
+ get_vhost_net(nf->netdev);
+}
+
+static bool filter_redirector_use_capture_netdev(NetFilterState *nf)
+{
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+
+ return s->outdev && !s->indev &&
+ nf->netdev &&
+ get_vhost_net(nf->netdev);
+}
+
static int _filter_send(MirrorState *s,
char *buf,
ssize_t size)
@@ -318,13 +337,13 @@ filter_redirector_refresh_allow_send_when_stopped(NetFilterState *nf)
/*
* Allow sending when stopped if enable_when_stopped is set and we have
- * an outdev. This must be independent of nf->on (status) so that packets
- * can still flow through the filter chain to other filters even when this
- * redirector is disabled. Otherwise, tap_send() will disable read_poll
- * when qemu_can_send_packet() returns false, preventing further packet
- * processing.
+ * a redirector output endpoint and the redirector is enabled.
+ * Keeping this active while redirector status=off can unexpectedly
+ * drain packets in migration stop windows and perturb vhost ring state.
*/
- nc->allow_send_when_stopped = (s->enable_when_stopped && s->outdev);
+ nc->allow_send_when_stopped = (s->enable_when_stopped &&
+ (s->outdev ||
+ filter_redirector_use_inject_netdev(nf)));
}
static void filter_redirector_vm_state_change(void *opaque, bool running,
@@ -334,7 +353,7 @@ static void filter_redirector_vm_state_change(void *opaque, bool running,
MirrorState *s = FILTER_REDIRECTOR(nf);
NetClientState *nc = nf->netdev;
- if (!running && s->enable_when_stopped && nc->info->read_poll) {
+ if (!running && nc && s->enable_when_stopped && nc->info->read_poll) {
nc->info->read_poll(nc, true);
}
}
--
2.52.0
* [RFC v2 3/9] net/filter-redirector: add AF_PACKET socket setup and input handler
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
2026-03-12 7:09 ` [RFC v2 1/9] net/filter: allow redirector on vhost TAP backends Cindy Lu
2026-03-12 7:09 ` [RFC v2 2/9] net/filter-redirector: add role helpers for AF_PACKET paths Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 4/9] net/filter-redirector: add send helpers and netdev counters Cindy Lu
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Add the AF_PACKET plumbing that lets filter-redirector bypass vhost and
talk to the TAP device directly.
Resolve the TAP ifname from the backend fd, create a nonblocking raw
socket, bind it to the interface, and store it as either the capture or
inject endpoint depending on the redirector role.
Also add the capture-side fd handler, which drains PACKET_OUTGOING
frames and forwards them into the filter chain.
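
The socket setup steps above can be sketched outside QEMU as follows.
This is a minimal example, assuming a Linux host; open_packet_socket() is
a hypothetical name, and it uses plain socket() where the patch uses
qemu_socket(). Creating the socket needs CAP_NET_RAW, so an unprivileged
caller should expect a negative return value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>

/*
 * Open a nonblocking AF_PACKET raw socket bound to the named interface.
 * Returns the fd on success or a negative errno value on failure.
 */
int open_packet_socket(const char *ifname)
{
    struct sockaddr_ll sll;
    unsigned int ifindex;
    int fd;

    /* Resolve the interface name to its kernel index. */
    ifindex = if_nametoindex(ifname);
    if (ifindex == 0) {
        return errno ? -errno : -ENODEV;
    }

    /* ETH_P_ALL receives every protocol seen on the interface. */
    fd = socket(AF_PACKET, SOCK_RAW | SOCK_NONBLOCK, htons(ETH_P_ALL));
    if (fd < 0) {
        return -errno;
    }

    /* Bind so that only traffic on this interface is delivered. */
    memset(&sll, 0, sizeof(sll));
    sll.sll_family = AF_PACKET;
    sll.sll_ifindex = (int)ifindex;
    sll.sll_protocol = htons(ETH_P_ALL);
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}
```

A capture-side reader would then call recvfrom() with a struct
sockaddr_ll and keep only frames whose sll_pkttype is PACKET_OUTGOING.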
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
net/filter-mirror.c | 179 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 170 insertions(+), 9 deletions(-)
diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index 376b7da025..915f2f8b35 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -27,6 +27,13 @@
#include "qemu/sockets.h"
#include "block/aio-wait.h"
#include "system/runstate.h"
+#include "net/tap.h"
+#include "net/tap_int.h"
+
+#include <sys/socket.h>
+#include <net/if.h>
+#include <linux/if_packet.h>
+#include <netinet/if_ether.h>
typedef struct MirrorState MirrorState;
DECLARE_INSTANCE_CHECKER(MirrorState, FILTER_MIRROR,
@@ -41,6 +48,10 @@ struct MirrorState {
NetFilterState parent_obj;
char *indev;
char *outdev;
+ NetClientState *out_net;
+ int in_netfd;
+ uint8_t *in_netbuf;
+ int out_netfd;
CharFrontend chr_in;
CharFrontend chr_out;
SocketReadState rs;
@@ -189,6 +200,17 @@ static int redirector_chr_can_read(void *opaque)
return REDIRECTOR_MAX_LEN;
}
+static bool filter_redirector_input_active(NetFilterState *nf, bool enable)
+{
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+
+ if (!enable) {
+ return false;
+ }
+
+ return runstate_is_running() || s->enable_when_stopped;
+}
+
static void redirector_chr_read(void *opaque, const uint8_t *buf, int size)
{
NetFilterState *nf = opaque;
@@ -225,6 +247,40 @@ static void redirector_chr_event(void *opaque, QEMUChrEvent event)
}
}
+static void filter_redirector_netdev_read(void *opaque)
+{
+ NetFilterState *nf = opaque;
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+ struct sockaddr_ll sll;
+ socklen_t sll_len;
+ ssize_t len;
+
+ if (!s->in_netbuf || s->in_netfd < 0) {
+ return;
+ }
+
+ for (;;) {
+ sll_len = sizeof(sll);
+ len = recvfrom(s->in_netfd, s->in_netbuf, REDIRECTOR_MAX_LEN, 0,
+ (struct sockaddr *)&sll, &sll_len);
+ if (len <= 0) {
+ break;
+ }
+
+ if (sll.sll_pkttype != PACKET_OUTGOING) {
+ continue;
+ }
+
+ redirector_to_filter(nf, s->in_netbuf, len);
+ }
+
+ if (len < 0 && errno != EAGAIN && errno != EWOULDBLOCK &&
+ errno != EINTR) {
+ error_report("filter redirector read netdev failed(%s)",
+ strerror(errno));
+ }
+}
+
static ssize_t filter_mirror_receive_iov(NetFilterState *nf,
NetClientState *sender,
unsigned flags,
@@ -285,7 +341,19 @@ static void filter_redirector_cleanup(NetFilterState *nf)
qemu_chr_fe_deinit(&s->chr_in, false);
qemu_chr_fe_deinit(&s->chr_out, false);
- qemu_del_vm_change_state_handler(s->vmsentry);
+ if (s->vmsentry) {
+ qemu_del_vm_change_state_handler(s->vmsentry);
+ s->vmsentry = NULL;
+ }
+ if (s->in_netfd >= 0) {
+ qemu_set_fd_handler(s->in_netfd, NULL, NULL, NULL);
+ close(s->in_netfd);
+ s->in_netfd = -1;
+ }
+ if (s->out_netfd >= 0) {
+ close(s->out_netfd);
+ s->out_netfd = -1;
+ }
if (nf->netdev) {
nf->netdev->allow_send_when_stopped = 0;
@@ -352,6 +420,14 @@ static void filter_redirector_vm_state_change(void *opaque, bool running,
NetFilterState *nf = opaque;
MirrorState *s = FILTER_REDIRECTOR(nf);
NetClientState *nc = nf->netdev;
+ bool active = filter_redirector_input_active(nf, nf->on);
+
+ if (s->in_netfd >= 0) {
+ qemu_set_fd_handler(s->in_netfd,
+ active ? filter_redirector_netdev_read : NULL,
+ NULL,
+ active ? nf : NULL);
+ }
if (!running && nc && s->enable_when_stopped && nc->info->read_poll) {
nc->info->read_poll(nc, true);
@@ -379,21 +455,83 @@ static void filter_redirector_maybe_enable_read_poll(NetFilterState *nf)
}
}
+static bool filter_redirector_netdev_setup(NetFilterState *nf, Error **errp)
+{
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+ struct sockaddr_ll sll = { 0 };
+ char ifname[IFNAMSIZ] = { 0 };
+ int ifindex;
+ int fd;
+ NetClientState *nc = nf->netdev;
+ int tapfd;
+ bool capture = filter_redirector_use_capture_netdev(nf);
+ bool inject = filter_redirector_use_inject_netdev(nf);
+
+ if (!capture && !inject) {
+ return true;
+ }
+
+ if (!nc || nc->info->type != NET_CLIENT_DRIVER_TAP) {
+ return true;
+ }
+
+ tapfd = tap_get_fd(nc);
+ if (tapfd < 0 || tap_fd_get_ifname(tapfd, ifname) != 0) {
+ error_setg(errp, "failed to resolve TAP ifname for netdev '%s'",
+ nf->netdev_id);
+ return false;
+ }
+
+ ifindex = if_nametoindex(ifname);
+ if (!ifindex) {
+ error_setg_errno(errp, errno,
+ "failed to resolve ifindex for '%s'", ifname);
+ return false;
+ }
+
+ fd = qemu_socket(AF_PACKET, SOCK_RAW | SOCK_NONBLOCK, htons(ETH_P_ALL));
+ if (fd < 0) {
+ error_setg_errno(errp, errno, "failed to create AF_PACKET socket");
+ return false;
+ }
+
+ sll.sll_family = AF_PACKET;
+ sll.sll_ifindex = ifindex;
+ sll.sll_protocol = htons(ETH_P_ALL);
+ if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
+ error_setg_errno(errp, errno,
+ "failed to bind AF_PACKET socket for ifname '%s'",
+ ifname);
+ close(fd);
+ return false;
+ }
+
+ if (capture) {
+ s->in_netfd = fd;
+ g_free(s->in_netbuf);
+ s->in_netbuf = g_malloc(REDIRECTOR_MAX_LEN);
+ } else if (inject) {
+ s->out_netfd = fd;
+ s->out_net = nc;
+ }
+ return true;
+}
+
static void filter_redirector_setup(NetFilterState *nf, Error **errp)
{
MirrorState *s = FILTER_REDIRECTOR(nf);
Chardev *chr;
if (!s->indev && !s->outdev) {
- error_setg(errp, "filter redirector needs 'indev' or "
- "'outdev' at least one property set");
+ error_setg(errp, "filter redirector needs at least one of "
+ "'indev' or 'outdev'");
+ return;
+ }
+
+ if (s->indev && s->outdev && !strcmp(s->indev, s->outdev)) {
+ error_setg(errp, "'indev' and 'outdev' could not be same "
+ "for filter redirector");
return;
- } else if (s->indev && s->outdev) {
- if (!strcmp(s->indev, s->outdev)) {
- error_setg(errp, "'indev' and 'outdev' could not be same "
- "for filter redirector");
- return;
- }
}
net_socket_rs_init(&s->rs, redirector_rs_finalize, s->vnet_hdr);
@@ -429,9 +567,21 @@ static void filter_redirector_setup(NetFilterState *nf, Error **errp)
}
}
+ if (!filter_redirector_netdev_setup(nf, errp)) {
+ return;
+ }
+
s->vmsentry = qemu_add_vm_change_state_handler(
filter_redirector_vm_state_change, nf);
+ if (s->in_netfd >= 0) {
+ bool active = filter_redirector_input_active(nf, nf->on);
+
+ qemu_set_fd_handler(s->in_netfd,
+ active ? filter_redirector_netdev_read : NULL,
+ NULL,
+ active ? nf : NULL);
+ }
filter_redirector_maybe_enable_read_poll(nf);
filter_redirector_refresh_allow_send_when_stopped(nf);
@@ -440,6 +590,7 @@ static void filter_redirector_setup(NetFilterState *nf, Error **errp)
static void filter_redirector_status_changed(NetFilterState *nf, Error **errp)
{
MirrorState *s = FILTER_REDIRECTOR(nf);
+ bool active = filter_redirector_input_active(nf, nf->on);
if (s->indev) {
if (nf->on) {
@@ -452,6 +603,13 @@ static void filter_redirector_status_changed(NetFilterState *nf, Error **errp)
}
}
+ if (s->in_netfd >= 0) {
+ qemu_set_fd_handler(s->in_netfd,
+ active ? filter_redirector_netdev_read : NULL,
+ NULL,
+ active ? nf : NULL);
+ }
+
if (nf->on) {
filter_redirector_maybe_enable_read_poll(nf);
}
@@ -642,6 +800,8 @@ static void filter_redirector_init(Object *obj)
MirrorState *s = FILTER_REDIRECTOR(obj);
s->vnet_hdr = false;
+ s->in_netfd = -1;
+ s->out_netfd = -1;
}
static void filter_mirror_fini(Object *obj)
@@ -657,6 +817,7 @@ static void filter_redirector_fini(Object *obj)
g_free(s->indev);
g_free(s->outdev);
+ g_free(s->in_netbuf);
}
static const TypeInfo filter_redirector_info = {
--
2.52.0
* [RFC v2 4/9] net/filter-redirector: add send helpers and netdev counters
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (2 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 3/9] net/filter-redirector: add AF_PACKET socket setup and input handler Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 5/9] net/filter-redirector: route chardev and AF_PACKET receive paths Cindy Lu
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Add helper functions for sending packets through the AF_PACKET out
socket or the chardev backend, and add netdev RX/TX packet and byte
counters to MirrorState.
The follow-up receive-path changes use these helpers and expose the new
statistics via filter_redirector_get_stats().
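
The send path can be illustrated with a standalone sketch that flattens
an iovec into one buffer, sends it on a socket, and updates counters only
on success. The names tx_stats, MAX_FRAME, and send_frame() are
hypothetical; unlike the patch, this sketch also rejects empty packets.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MAX_FRAME (64 * 1024)  /* hypothetical upper bound for one frame */

struct tx_stats {
    uint64_t packets;
    uint64_t bytes;
};

/*
 * Copy all iovec segments into one contiguous buffer and send it.
 * Returns the number of bytes sent or a negative errno value.
 */
ssize_t send_frame(int fd, const struct iovec *iov, int iovcnt,
                   struct tx_stats *st)
{
    size_t total = 0;
    size_t off = 0;
    uint8_t *buf;
    ssize_t ret;
    int i;

    for (i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }
    if (total == 0 || total > MAX_FRAME) {
        return -EINVAL;
    }

    buf = malloc(total);
    if (!buf) {
        return -ENOMEM;
    }
    for (i = 0; i < iovcnt; i++) {
        memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
        off += iov[i].iov_len;
    }

    ret = send(fd, buf, total, 0);
    if (ret < 0) {
        ret = -errno;  /* capture errno before free() can change it */
    } else if (ret > 0) {
        st->packets++;
        st->bytes += (uint64_t)ret;
    }
    free(buf);
    return ret;
}
```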
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
net/filter-mirror.c | 70 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index 915f2f8b35..e57fbc94b8 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -64,6 +64,11 @@ struct MirrorState {
uint64_t indev_bytes;
uint64_t outdev_packets;
uint64_t outdev_bytes;
+ /* netdev replay/capture statistics for filter-redirector */
+ uint64_t netdev_rx_packets;
+ uint64_t netdev_rx_bytes;
+ uint64_t netdev_tx_packets;
+ uint64_t netdev_tx_bytes;
};
typedef struct FilterSendCo {
@@ -175,6 +180,59 @@ static int filter_send(MirrorState *s,
return data.ret;
}
+static ssize_t filter_redirector_send_netdev_packet(MirrorState *s,
+ const struct iovec *iov,
+ int iovcnt)
+{
+ ssize_t size = iov_size(iov, iovcnt);
+ g_autofree uint8_t *buf = NULL;
+
+ if (s->out_netfd < 0) {
+ return -ENODEV;
+ }
+ if (size > NET_BUFSIZE) {
+ return -EINVAL;
+ }
+
+ buf = g_malloc(size);
+ iov_to_buf(iov, iovcnt, 0, buf, size);
+
+ ssize_t ret = send(s->out_netfd, buf, size, 0);
+ if (ret < 0) {
+ return -errno;
+ }
+ if (ret > 0) {
+ s->netdev_tx_packets++;
+ s->netdev_tx_bytes += ret;
+ }
+ return ret;
+}
+static ssize_t filter_redirector_send_chardev_iov(MirrorState *s,
+ const struct iovec *iov,
+ int iovcnt)
+{
+ if (!s->outdev) {
+ return -ENODEV;
+ }
+
+ if (!qemu_chr_fe_backend_connected(&s->chr_out)) {
+ return 0;
+ }
+
+ return filter_send(s, iov, iovcnt);
+}
+
+static ssize_t filter_redirector_send_netdev_iov(MirrorState *s,
+ const struct iovec *iov,
+ int iovcnt)
+{
+ if (s->out_netfd < 0) {
+ return -ENODEV;
+ }
+
+ return filter_redirector_send_netdev_packet(s, iov, iovcnt);
+}
+
static void redirector_to_filter(NetFilterState *nf,
const uint8_t *buf,
int len)
@@ -763,6 +821,18 @@ static GList *filter_redirector_get_stats(NetFilterState *nf)
counter->bytes = s->outdev_bytes;
list = g_list_append(list, counter);
+ counter = g_new0(NetFilterCounter, 1);
+ counter->name = g_strdup("netdev_rx");
+ counter->packets = s->netdev_rx_packets;
+ counter->bytes = s->netdev_rx_bytes;
+ list = g_list_append(list, counter);
+
+ counter = g_new0(NetFilterCounter, 1);
+ counter->name = g_strdup("netdev_tx");
+ counter->packets = s->netdev_tx_packets;
+ counter->bytes = s->netdev_tx_bytes;
+ list = g_list_append(list, counter);
+
return list;
}
--
2.52.0
* [RFC v2 5/9] net/filter-redirector: route chardev and AF_PACKET receive paths
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (3 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 4/9] net/filter-redirector: add send helpers and netdev counters Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 6/9] net/filter: Add support for filter-buffer Cindy Lu
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Packets captured from AF_PACKET now either go to the chardev outdev or
into the filter chain, and both paths update the new netdev statistics.
Use the same routing from redirector_rs_finalize() so replay traffic and
normal receive handling share one dispatch policy.
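
The dispatch policy can be summarized by a small decision sketch. The
enum and function names are hypothetical and only model the choice of
destination as of this patch; the actual code works on MirrorState and
also performs the send and the statistics updates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for where a received packet goes next. */
enum redir_dest {
    DEST_NETDEV_INJECT,  /* write to the AF_PACKET inject socket */
    DEST_CHARDEV_OUT,    /* forward to the outdev chardev */
    DEST_FILTER_CHAIN    /* hand over to the normal filter chain */
};

/* Packet arrived on the indev chardev. */
enum redir_dest route_from_chardev(bool inject_netdev, bool has_outdev)
{
    if (inject_netdev) {
        return DEST_NETDEV_INJECT;
    }
    if (has_outdev) {
        return DEST_CHARDEV_OUT;
    }
    return DEST_FILTER_CHAIN;
}

/* Packet was captured from the TAP device via AF_PACKET. */
enum redir_dest route_from_netdev(bool has_outdev)
{
    return has_outdev ? DEST_CHARDEV_OUT : DEST_FILTER_CHAIN;
}
```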
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
net/filter-mirror.c | 107 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 101 insertions(+), 6 deletions(-)
diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index e57fbc94b8..1ff58e1d27 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -305,6 +305,81 @@ static void redirector_chr_event(void *opaque, QEMUChrEvent event)
}
}
+static void filter_redirector_recv_from_chardev(NetFilterState *nf,
+ const uint8_t *buf,
+ int len)
+{
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+ bool inject_netdev = filter_redirector_use_inject_netdev(nf);
+ ssize_t ret;
+ struct iovec iov = {
+ .iov_base = (void *)buf,
+ .iov_len = len,
+ };
+
+ if (len <= 0) {
+ return;
+ }
+
+ /* chardev indev */
+ s->indev_packets++;
+ s->indev_bytes += len;
+
+ if (inject_netdev) {
+ ret = filter_redirector_send_netdev_iov(s, &iov, 1);
+ if (ret < 0) {
+ error_report("filter redirector send failed(%s)", strerror(-ret));
+ }
+ return;
+ }
+
+ if (s->outdev) {
+ ret = filter_redirector_send_chardev_iov(s, &iov, 1);
+ if (ret < 0) {
+ error_report("filter redirector send failed(%s)", strerror(-ret));
+ } else if (ret > 0) {
+ s->outdev_packets++;
+ s->outdev_bytes += ret;
+ }
+ return;
+ }
+
+ redirector_to_filter(nf, buf, len);
+}
+
+static bool filter_redirector_recv_from_netdev(NetFilterState *nf,
+ const uint8_t *buf,
+ int len)
+{
+ MirrorState *s = FILTER_REDIRECTOR(nf);
+ ssize_t ret;
+ struct iovec iov = {
+ .iov_base = (void *)buf,
+ .iov_len = len,
+ };
+
+ if (len <= 0) {
+ return false;
+ }
+
+ if (s->outdev) {
+ ret = filter_redirector_send_chardev_iov(s, &iov, 1);
+ if (ret > 0) {
+ s->outdev_packets++;
+ s->outdev_bytes += ret;
+ }
+ } else {
+ redirector_to_filter(nf, buf, len);
+ return true;
+ }
+
+ if (ret < 0) {
+ error_report("filter redirector send failed(%s)", strerror(-ret));
+ return false;
+ }
+ return true;
+}
+
static void filter_redirector_netdev_read(void *opaque)
{
NetFilterState *nf = opaque;
@@ -329,7 +404,9 @@ static void filter_redirector_netdev_read(void *opaque)
continue;
}
- redirector_to_filter(nf, s->in_netbuf, len);
+ s->netdev_rx_packets++;
+ s->netdev_rx_bytes += len;
+ filter_redirector_recv_from_netdev(nf, s->in_netbuf, len);
}
if (len < 0 && errno != EAGAIN && errno != EWOULDBLOCK &&
@@ -369,21 +446,34 @@ static ssize_t filter_redirector_receive_iov(NetFilterState *nf,
NetPacketSent *sent_cb)
{
MirrorState *s = FILTER_REDIRECTOR(nf);
+ bool capture_netdev = filter_redirector_use_capture_netdev(nf);
+ bool inject_netdev = filter_redirector_use_inject_netdev(nf);
int ret;
- if (qemu_chr_fe_backend_connected(&s->chr_out)) {
- ret = filter_send(s, iov, iovcnt);
+ if (s->indev || inject_netdev) {
+ return 0;
+ }
+
+ if (capture_netdev || s->outdev) {
+ if (capture_netdev) {
+ return 0;
+ }
+
+ ret = filter_redirector_send_chardev_iov(s, iov, iovcnt);
if (ret < 0) {
error_report("filter redirector send failed(%s)", strerror(-ret));
} else if (ret > 0) {
- /* Update outdev statistics on successful send */
s->outdev_packets++;
s->outdev_bytes += ret;
}
- return iov_size(iov, iovcnt);
- } else {
+ /*
+ * Without an active AF_PACKET capture socket, outdev mirroring is a
+ * sideband copy only and must not consume the guest-bound packet.
+ */
return 0;
}
+
+ return 0;
}
static void filter_mirror_cleanup(NetFilterState *nf)
@@ -444,6 +534,11 @@ static void redirector_rs_finalize(SocketReadState *rs)
MirrorState *s = container_of(rs, MirrorState, rs);
NetFilterState *nf = NETFILTER(s);
+ if (s->outdev || filter_redirector_use_inject_netdev(nf)) {
+ filter_redirector_recv_from_chardev(nf, rs->buf, rs->packet_len);
+ return;
+ }
+
/* Update indev statistics */
s->indev_packets++;
s->indev_bytes += rs->packet_len;
--
2.52.0
* [RFC v2 6/9] net/filter: Add support for filter-buffer
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (4 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 5/9] net/filter-redirector: route chardev and AF_PACKET receive paths Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 7/9] virtio-net: keep tap read polling disabled while vhost owns RX Cindy Lu
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Allow filter-buffer on the same vhost backend as filter-redirector,
add an internal redirector-injected packet flag, and route indev packets
through the preceding filter-buffer before they are reinjected.
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
include/net/queue.h | 5 +++
net/filter-mirror.c | 98 +++++++++++++++++++++++++++++++++++++++++----
net/filter.c | 5 ++-
3 files changed, 98 insertions(+), 10 deletions(-)
diff --git a/include/net/queue.h b/include/net/queue.h
index 2e686b1b61..213abe62ec 100644
--- a/include/net/queue.h
+++ b/include/net/queue.h
@@ -32,6 +32,11 @@ typedef void (NetPacketSent) (NetClientState *sender, ssize_t ret);
#define QEMU_NET_PACKET_FLAG_NONE 0
#define QEMU_NET_PACKET_FLAG_RAW (1<<0)
+/*
+ * Internal marker used by filter-redirector when packets are injected from
+ * indev through filter-buffer before being reinjected.
+ */
+#define QEMU_NET_PACKET_FLAG_REDIRECTOR_INJECT (1<<1)
/* Returns:
* >0 - success
diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index 1ff58e1d27..dabf52275a 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -233,6 +233,73 @@ static ssize_t filter_redirector_send_netdev_iov(MirrorState *s,
return filter_redirector_send_netdev_packet(s, iov, iovcnt);
}
+static NetFilterState *filter_redirector_prev_in_direction(NetFilterState *nf,
+ NetFilterDirection dir)
+{
+ if (dir == NET_FILTER_DIRECTION_TX) {
+ return QTAILQ_PREV(nf, next);
+ }
+ return QTAILQ_NEXT(nf, next);
+}
+
+static NetFilterState *filter_redirector_find_buffer_before(NetFilterState *nf,
+ NetFilterDirection dir)
+{
+ NetFilterState *iter = filter_redirector_prev_in_direction(nf, dir);
+
+ while (iter) {
+ if ((iter->direction == dir ||
+ iter->direction == NET_FILTER_DIRECTION_ALL) &&
+ object_dynamic_cast(OBJECT(iter), "filter-buffer")) {
+ return iter;
+ }
+ iter = filter_redirector_prev_in_direction(iter, dir);
+ }
+
+ return NULL;
+}
+
+static bool filter_redirector_inject_to_buffer(NetFilterState *nf,
+ const uint8_t *buf,
+ int len)
+{
+ struct iovec iov = {
+ .iov_base = (void *)buf,
+ .iov_len = len,
+ };
+ NetFilterState *buffer;
+ bool injected = false;
+
+ if (nf->direction == NET_FILTER_DIRECTION_ALL ||
+ nf->direction == NET_FILTER_DIRECTION_TX) {
+ buffer = filter_redirector_find_buffer_before(nf,
+ NET_FILTER_DIRECTION_TX);
+ if (buffer) {
+ qemu_netfilter_receive(buffer, NET_FILTER_DIRECTION_TX,
+ nf->netdev,
+ QEMU_NET_PACKET_FLAG_REDIRECTOR_INJECT,
+ &iov, 1, NULL);
+ injected = true;
+ }
+ }
+
+ if ((nf->direction == NET_FILTER_DIRECTION_ALL ||
+ nf->direction == NET_FILTER_DIRECTION_RX) &&
+ nf->netdev->peer) {
+ buffer = filter_redirector_find_buffer_before(nf,
+ NET_FILTER_DIRECTION_RX);
+ if (buffer) {
+ qemu_netfilter_receive(buffer, NET_FILTER_DIRECTION_RX,
+ nf->netdev->peer,
+ QEMU_NET_PACKET_FLAG_REDIRECTOR_INJECT,
+ &iov, 1, NULL);
+ injected = true;
+ }
+ }
+
+ return injected;
+}
+
static void redirector_to_filter(NetFilterState *nf,
const uint8_t *buf,
int len)
@@ -310,7 +377,6 @@ static void filter_redirector_recv_from_chardev(NetFilterState *nf,
int len)
{
MirrorState *s = FILTER_REDIRECTOR(nf);
- bool inject_netdev = filter_redirector_use_inject_netdev(nf);
ssize_t ret;
struct iovec iov = {
.iov_base = (void *)buf,
@@ -325,7 +391,11 @@ static void filter_redirector_recv_from_chardev(NetFilterState *nf,
s->indev_packets++;
s->indev_bytes += len;
- if (inject_netdev) {
+ if (!s->outdev && filter_redirector_inject_to_buffer(nf, buf, len)) {
+ return;
+ }
+
+ if (s->out_netfd >= 0) {
ret = filter_redirector_send_netdev_iov(s, &iov, 1);
if (ret < 0) {
error_report("filter redirector send failed(%s)", strerror(-ret));
@@ -446,16 +516,22 @@ static ssize_t filter_redirector_receive_iov(NetFilterState *nf,
NetPacketSent *sent_cb)
{
MirrorState *s = FILTER_REDIRECTOR(nf);
- bool capture_netdev = filter_redirector_use_capture_netdev(nf);
- bool inject_netdev = filter_redirector_use_inject_netdev(nf);
int ret;
- if (s->indev || inject_netdev) {
- return 0;
+ if (s->out_netfd >= 0) {
+ if (!(flags & QEMU_NET_PACKET_FLAG_REDIRECTOR_INJECT)) {
+ return 0;
+ }
+
+ ret = filter_redirector_send_netdev_iov(s, iov, iovcnt);
+ if (ret < 0) {
+ error_report("filter redirector send failed(%s)", strerror(-ret));
+ }
+ return iov_size(iov, iovcnt);
}
- if (capture_netdev || s->outdev) {
- if (capture_netdev) {
+ if (s->outdev) {
+ if (s->in_netfd >= 0) {
return 0;
}
@@ -473,6 +549,12 @@ static ssize_t filter_redirector_receive_iov(NetFilterState *nf,
return 0;
}
+ if (s->indev) {
+ if (!(flags & QEMU_NET_PACKET_FLAG_REDIRECTOR_INJECT)) {
+ return 0;
+ }
+ }
+
return 0;
}
diff --git a/net/filter.c b/net/filter.c
index b9646b9e00..cc23e743cf 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -260,8 +260,9 @@ static void netfilter_complete(UserCreatable *uc, Error **errp)
bool buffer = object_dynamic_cast(OBJECT(uc), "filter-buffer");
bool vhost_filter = redirector || buffer;
- if (!redirector) {
- error_setg(errp, "Vhost is not supported");
+ if (!vhost_filter) {
+ error_setg(errp, "Vhost only supports filter-redirector and "
+ "filter-buffer");
return;
}
if (vhost_filter && ncs[0]->info->type != NET_CLIENT_DRIVER_TAP) {
--
2.52.0
* [RFC v2 7/9] virtio-net: keep tap read polling disabled while vhost owns RX
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (5 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 6/9] net/filter: Add support for filter-buffer Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 8/9] virtio-net: handle short vnet headers on replay RX Cindy Lu
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
virtio_net_backend_read_poll_set() re-enables TAP read polling on
vmstart even when kernel vhost has already taken over RX.
That lets QEMU userspace and vhost race on the same tap fd and can
corrupt the restored virtqueue state during migration switchover.
Keep read_poll disabled for TAP backends with a started vhost_net, while
leaving pure userspace backends unchanged.
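
Put as a decision rule, the change amounts to the following sketch.
effective_read_poll() is a hypothetical helper used only for this
example; the patch computes the same value inline as backend_enable.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Value actually passed to the backend's read_poll callback.
 * A request to enable polling is suppressed while a started vhost
 * owns the TAP RX path; a request to disable is always honored.
 */
bool effective_read_poll(bool enable, bool has_vhost, bool vhost_started)
{
    if (enable && has_vhost && vhost_started) {
        return false;
    }
    return enable;
}
```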
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
hw/net/virtio-net.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index d6d2188863..616590fb82 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1322,9 +1322,19 @@ static void virtio_net_backend_read_poll_set(VirtIONet *n, bool enable)
for (i = 0; i < (int)n->max_ncs; i++) {
NetClientState *frontend = qemu_get_subqueue(n->nic, i);
NetClientState *backend = frontend ? frontend->peer : NULL;
+ bool backend_enable = enable;
if (backend && backend->info && backend->info->read_poll) {
- backend->info->read_poll(backend, enable);
+ /*
+ * When vhost is active, the kernel backend owns the tap RX path.
+ * Re-enabling QEMU read_poll on vmstart makes userspace and vhost
+ * race on the same tap fd, which can corrupt the restored RX ring
+ * during migration switchover replay.
+ */
+ if (enable && get_vhost_net(backend) && n->vhost_started) {
+ backend_enable = false;
+ }
+ backend->info->read_poll(backend, backend_enable);
}
}
}
--
2.52.0
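The gating rule this patch adds to virtio_net_backend_read_poll_set() can be read as a small truth table. The sketch below restates it as a standalone pure function. The name effective_read_poll() and its boolean parameters are hypothetical stand-ins for the patch's `enable`, `get_vhost_net(backend)` and `n->vhost_started`; it is not QEMU code.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): returns the value that
 * would be handed to the backend's read_poll() callback.
 */
static bool effective_read_poll(bool enable, bool backend_has_vhost,
                                bool vhost_started)
{
    /*
     * Only an "enable" request is overridden, and only when a started
     * vhost_net owns the RX path. A "disable" request passes through.
     */
    if (enable && backend_has_vhost && vhost_started) {
        return false;
    }
    return enable;
}
```

Under these assumptions, a pure userspace backend (no vhost) sees no change in behavior, which matches the intent stated in the commit message.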
* [RFC v2 8/9] virtio-net: handle short vnet headers on replay RX
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (6 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 7/9] virtio-net: keep tap read polling disabled while vhost owns RX Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-12 7:09 ` [RFC v2 9/9] net/filter-redirector: check CAP_NET_RAW before creating AF_PACKET Cindy Lu
2026-03-13 6:05 ` [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Jason Wang
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
During switchover replay, packets injected through AF_PACKET can come
back from the bridge with a 10-byte virtio_net_hdr even though QEMU
expects a 12-byte mergeable-rxbuf header (virtio_net_hdr_mrg_rxbuf). The
missing two bytes shift the Ethernet frame and corrupt the packet seen by
the guest.
Detect this case by comparing the EtherType at the expected position
with the value two bytes earlier. When only the shifted position
contains a recognized protocol, reduce the effective host header length
by two for this packet.
Only apply the heuristic while vhost is running, and carry the adjusted
header length through the normal receive path without copying the
buffer.
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
hw/net/virtio-net.c | 54 ++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 49 insertions(+), 5 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 616590fb82..29dbe3d8d5 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -144,6 +144,36 @@ static int vq2q(int queue_index)
return queue_index / 2;
}
+static bool virtio_net_rx_known_ethertype(uint16_t proto)
+{
+ if (proto <= 1500) {
+ /* IEEE 802.3 length field */
+ return true;
+ }
+
+ switch (proto) {
+ case ETH_P_IP:
+ case ETH_P_IPV6:
+ case ETH_P_ARP:
+ case ETH_P_VLAN:
+ case ETH_P_DVLAN:
+ case 0x0842: /* Wake-on-LAN */
+ case 0x22f0: /* IEEE 1722 AVTP / TSN */
+ case 0x8809: /* Slow protocols / LACP */
+ case 0x8863: /* PPPoE discovery */
+ case 0x8864: /* PPPoE session */
+ case 0x8906: /* FCoE */
+ case 0x8914: /* FCoE Init */
+ case 0x88cc: /* LLDP */
+ case 0x88e1: /* HomePlug AV */
+ case 0x88f7: /* PTP */
+ case 0x8915: /* RoCE */
+ return true;
+ default:
+ return false;
+ }
+}
+
static void flush_or_purge_queued_packets(NetClientState *nc)
{
if (!nc->peer) {
@@ -1780,7 +1810,8 @@ static void receive_header(VirtIONet *n, const struct iovec *iov, int iov_cnt,
}
}
-static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
+static int receive_filter(VirtIONet *n, const uint8_t *buf, int size,
+ size_t host_hdr_len)
{
static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
static const uint8_t vlan[] = {0x81, 0x00};
@@ -1790,7 +1821,7 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
if (n->promisc)
return 1;
- ptr += n->host_hdr_len;
+ ptr += host_hdr_len;
if (!memcmp(&ptr[12], vlan, sizeof(vlan))) {
int vid = lduw_be_p(ptr + 14) & 0xfff;
@@ -1955,12 +1986,25 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
QEMU_UNINITIALIZED size_t lens[VIRTQUEUE_MAX_SIZE];
QEMU_UNINITIALIZED struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
struct virtio_net_hdr_v1_hash extra_hdr;
+ size_t host_hdr_len = n->host_hdr_len;
unsigned mhdr_cnt = 0;
size_t offset, i, guest_offset, j;
ssize_t err;
memset(&extra_hdr, 0, sizeof(extra_hdr));
+ if (n->vhost_started &&
+ host_hdr_len >= 12 &&
+ size >= host_hdr_len + ETH_HLEN) {
+ uint16_t et_at_host = lduw_be_p(buf + host_hdr_len + 12);
+ uint16_t et_at_m2 = lduw_be_p(buf + host_hdr_len + 10);
+
+ if (!virtio_net_rx_known_ethertype(et_at_host) &&
+ virtio_net_rx_known_ethertype(et_at_m2)) {
+ host_hdr_len -= 2;
+ }
+ }
+
if (n->rss_data.enabled && n->rss_data.enabled_software_rss) {
int index = virtio_net_process_rss(nc, buf, size, &extra_hdr);
if (index >= 0) {
@@ -1975,11 +2019,11 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
q = virtio_net_get_subqueue(nc);
/* hdr_len refers to the header we supply to the guest */
- if (!virtio_net_has_buffers(q, size + n->guest_hdr_len - n->host_hdr_len)) {
+ if (!virtio_net_has_buffers(q, size + n->guest_hdr_len - host_hdr_len)) {
return 0;
}
- if (!receive_filter(n, buf, size))
+ if (!receive_filter(n, buf, size, host_hdr_len))
return size;
offset = i = 0;
@@ -2041,7 +2085,7 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
sizeof(extra_hdr.hash_value) +
sizeof(extra_hdr.hash_report));
}
- offset = n->host_hdr_len;
+ offset = host_hdr_len;
total += n->guest_hdr_len;
guest_offset = n->guest_hdr_len;
} else {
--
2.52.0
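The header-length heuristic in this patch can be tried outside QEMU. The sketch below is a simplified standalone version: load_be16() stands in for QEMU's lduw_be_p(), known_ethertype() carries only a reduced allow-list (the patch recognizes more values), and effective_host_hdr_len() and SKETCH_ETH_HLEN are hypothetical names introduced here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HLEN 14 /* destination MAC + source MAC + EtherType */

/* Big-endian 16-bit load, standing in for QEMU's lduw_be_p(). */
static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Reduced allow-list for illustration only. */
static bool known_ethertype(uint16_t proto)
{
    if (proto <= 1500) {
        return true; /* IEEE 802.3 length field */
    }
    switch (proto) {
    case 0x0800: /* IPv4 */
    case 0x86dd: /* IPv6 */
    case 0x0806: /* ARP */
    case 0x8100: /* 802.1Q VLAN */
        return true;
    default:
        return false;
    }
}

/*
 * Returns the header length to use for this packet: host_hdr_len, or
 * host_hdr_len - 2 when only the position two bytes earlier holds a
 * recognized EtherType.
 */
static size_t effective_host_hdr_len(const uint8_t *buf, size_t size,
                                     size_t host_hdr_len, bool vhost_started)
{
    if (vhost_started && host_hdr_len >= 12 &&
        size >= host_hdr_len + SKETCH_ETH_HLEN) {
        uint16_t at_expected = load_be16(buf + host_hdr_len + 12);
        uint16_t two_earlier = load_be16(buf + host_hdr_len + 10);

        if (!known_ethertype(at_expected) && known_ethertype(two_earlier)) {
            return host_hdr_len - 2;
        }
    }
    return host_hdr_len;
}
```

With a 10-byte header followed by an IPv4 frame, the expected EtherType position holds the first two payload bytes (0x45 0x00 for a typical IPv4 header), which are not recognized, so the function returns 10. One consequence of accepting every value up to 1500 as a length field, at least in this sketch, is that a short-header ARP frame goes undetected: its payload begins with hardware type 0x0001, which passes the check at the expected position.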
* [RFC v2 9/9] net/filter-redirector: check CAP_NET_RAW before creating AF_PACKET
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (7 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 8/9] virtio-net: handle short vnet headers on replay RX Cindy Lu
@ 2026-03-12 7:09 ` Cindy Lu
2026-03-13 6:05 ` [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Jason Wang
9 siblings, 0 replies; 11+ messages in thread
From: Cindy Lu @ 2026-03-12 7:09 UTC (permalink / raw)
To: lulu, mst, jasowang, zhangckid, lizhijian, jmarcin, qemu-devel
Creating an AF_PACKET SOCK_RAW socket requires the CAP_NET_RAW
capability. Without it the qemu_socket() call fails with EPERM,
producing a generic error that gives no hint about the missing
capability.
Add an explicit capget()-based check in filter_redirector_netdev_setup()
before the socket call.
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
net/filter-mirror.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index dabf52275a..a07ae61b2d 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -34,6 +34,8 @@
#include <net/if.h>
#include <linux/if_packet.h>
#include <netinet/if_ether.h>
+#include <linux/capability.h>
+#include <sys/syscall.h>
typedef struct MirrorState MirrorState;
DECLARE_INSTANCE_CHECKER(MirrorState, FILTER_MIRROR,
@@ -690,6 +692,21 @@ static void filter_redirector_maybe_enable_read_poll(NetFilterState *nf)
}
}
+static bool filter_redirector_has_cap_net_raw(void)
+{
+ struct __user_cap_header_struct hdr = {
+ .version = _LINUX_CAPABILITY_VERSION_3,
+ .pid = 0,
+ };
+ struct __user_cap_data_struct data[2] = {};
+
+ if (syscall(SYS_capget, &hdr, data) < 0) {
+ return false;
+ }
+
+ return data[CAP_NET_RAW >> 5].effective & (1u << (CAP_NET_RAW & 31));
+}
+
static bool filter_redirector_netdev_setup(NetFilterState *nf, Error **errp)
{
MirrorState *s = FILTER_REDIRECTOR(nf);
@@ -724,6 +741,13 @@ static bool filter_redirector_netdev_setup(NetFilterState *nf, Error **errp)
return false;
}
+ if (!filter_redirector_has_cap_net_raw()) {
+ error_setg(errp,
+ "AF_PACKET raw socket requires CAP_NET_RAW; "
+ "run with 'setcap cap_net_raw+ep <qemu-binary>' or as root");
+ return false;
+ }
+
fd = qemu_socket(AF_PACKET, SOCK_RAW | SOCK_NONBLOCK, htons(ETH_P_ALL));
if (fd < 0) {
error_setg_errno(errp, errno, "failed to create AF_PACKET socket");
--
2.52.0
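The capability test in this patch has two parts: the capget() call and the bit lookup in the returned 2 x 32-bit array. The Linux-only sketch below separates them so the bit arithmetic can be checked on its own. The names cap_effective_in() and process_has_cap() are hypothetical and not part of the patch; the structures and constants come from <linux/capability.h>.

```c
#include <linux/capability.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Pure bit test over the two 32-bit words filled in by capget() with
 * _LINUX_CAPABILITY_VERSION_3: word index is cap / 32, bit is cap % 32.
 */
static bool cap_effective_in(const struct __user_cap_data_struct data[2],
                             unsigned int cap)
{
    if (cap >= 64) {
        return false; /* outside the two words this ABI version covers */
    }
    return (data[cap >> 5].effective & (1u << (cap & 31))) != 0;
}

/* Queries the effective set of the calling thread via the raw syscall. */
static bool process_has_cap(unsigned int cap)
{
    struct __user_cap_header_struct hdr = {
        .version = _LINUX_CAPABILITY_VERSION_3,
        .pid = 0, /* 0 refers to the calling thread */
    };
    struct __user_cap_data_struct data[2] = { { 0 } };

    if (syscall(SYS_capget, &hdr, data) < 0) {
        return false; /* treat a failed query as "capability missing" */
    }
    return cap_effective_in(data, cap);
}
```

A caller would use process_has_cap(CAP_NET_RAW) before attempting to create the AF_PACKET socket. As in the patch, a capget() failure is reported as a missing capability, so the socket() error path remains the final authority on whether the socket can be created.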
* Re: [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net
2026-03-12 7:09 [RFC v2 0/9] net/filter-redirector: Add AF_PACKET support for vhost-net Cindy Lu
` (8 preceding siblings ...)
2026-03-12 7:09 ` [RFC v2 9/9] net/filter-redirector: check CAP_NET_RAW before creating AF_PACKET Cindy Lu
@ 2026-03-13 6:05 ` Jason Wang
9 siblings, 0 replies; 11+ messages in thread
From: Jason Wang @ 2026-03-13 6:05 UTC (permalink / raw)
To: Cindy Lu; +Cc: mst, zhangckid, lizhijian, jmarcin, qemu-devel
On Thu, Mar 12, 2026 at 3:14 PM Cindy Lu <lulu@redhat.com> wrote:
>
> Hi, All
>
> This series adds AF_PACKET support for vhost tap devices in
> filter-redirector/filter-buffer. When vhost=on is set, AF_PACKET is
> used to capture and inject packets.
>
> Example Usage (unchanged from existing upstream code)
> =============
> Primary VM (mirror incoming packets to secondary via chardev socket):
>
> -netdev tap,id=net0,vhost=on,...
> -chardev socket,id=mirror0,host=...,port=...,server=on,wait=off
> -object filter-redirector,id=vm1redir,netdev=net0,outdev=mirror0...
>
> Secondary VM (receive mirrored packets):
>
> -netdev tap,id=net0,vhost=on,...
> -chardev socket,id=red0,host=...,port=...,reconnect-ms=..
> -object filter-buffer,id=swbuf,netdev=net0,queue=tx,interval=1000000,status=off.....
> -object filter-redirector,id=r1,netdev=net0,queue=tx,indev=red0,status=off,enable_when_stopped=true.... \
>
> TODO
> =======
> This series is still based on the tap device. vhost-vdpa support is ongoing and will be sent soon.
>
Thanks for the series. But I still have the same question as in v1:
is there any reason to tightly couple the packet socket to the netfilter?
Couldn't we reuse chardev for that?
Thanks
> Changeset
> ===========
> Changes in v2:
> 1. add support for filter-buffer
> 2. remove in_netdev and out_netdev for the AF_PACKET bind port; only netdev is used now.
> With vhost=on, AF_PACKET is used to capture and inject; with vhost=off, the existing
> code is used
> 3. add CAP_NET_RAW check
> 4. address review comments
>
>
> Testing
> =======
> - Tested with vhost=on/off TAP netdev on x86_64
>
> Cindy Lu (9):
> net/filter: allow redirector on vhost TAP backends
> net/filter-redirector: add role helpers for AF_PACKET paths
> net/filter-redirector: add AF_PACKET socket setup and input handler
> net/filter-redirector: add send helpers and netdev counters
> net/filter-redirector: route chardev and AF_PACKET receive paths
> net/filter: Add support for filter-buffer
> virtio-net: keep tap read polling disabled while vhost owns RX
> virtio-net: handle short vnet headers on replay RX
> net/filter-redirector: check CAP_NET_RAW before creating AF_PACKET
>
> hw/net/virtio-net.c | 66 +++++-
> include/net/queue.h | 5 +
> net/filter-mirror.c | 493 ++++++++++++++++++++++++++++++++++++++++++--
> net/filter.c | 16 +-
> 4 files changed, 551 insertions(+), 29 deletions(-)
>
> --
> 2.52.0
>