netdev.vger.kernel.org archive mirror
* [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx
@ 2025-04-15  7:43 Bui Quang Minh
  2025-04-15  7:43 ` [PATCH v3 1/3] " Bui Quang Minh
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Bui Quang Minh @ 2025-04-15  7:43 UTC (permalink / raw)
  To: virtualization
  Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, Andrew Lunn,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf,
	Bui Quang Minh

Hi everyone,

This series fixes a deadlock in virtio-net when binding/unbinding an
XDP program or XDP socket, or when resizing the rx queue.

When pausing rx (e.g. to set up XDP, an xsk pool, or an rx resize), we
call napi_disable() on the receive queue's napi. The delayed refill_work
also calls napi_disable() on the receive queue's napi. When
napi_disable() is called on an already disabled napi, it will sleep in
napi_disable_locked() while still holding the netdev_lock. As a result,
a later napi_enable() gets stuck too, as it cannot acquire the
netdev_lock. This leaves refill_work and the pause-then-resume path
stuck altogether.

This scenario can be reproduced by binding an XDP socket to a
virtio-net interface without setting up the fill ring. In that case,
try_fill_recv() fails until the fill ring is set up, so refill_work is
scheduled.

This fix adds virtnet_rx_(pause/resume)_all helpers and fixes up
virtnet_rx_pause to disable future delayed refill_work and cancel any
in-flight instances before calling napi_disable() to pause rx.
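
The locking interaction can be sketched with a user-space toy model
(plain Python threads; the names below only mimic the kernel primitives
and are not the real API):

```python
import threading
import time

# Toy user-space model of the deadlock: napi_disable() on an already
# disabled NAPI sleeps *while still holding* netdev_lock, so a later
# napi_enable(), which also needs netdev_lock, can never make progress.
# These names only mimic the kernel primitives; this is not the real API.

netdev_lock = threading.Lock()
napi = {"disabled": False}

def napi_disable():
    netdev_lock.acquire()
    # Models napi_disable_locked(): if the NAPI is already disabled,
    # sleep holding netdev_lock until someone re-enables it.
    while napi["disabled"]:
        time.sleep(0.01)
    napi["disabled"] = True
    netdev_lock.release()

def try_napi_enable(timeout):
    # napi_enable() must take netdev_lock first; report whether it could.
    if not netdev_lock.acquire(timeout=timeout):
        return False  # stuck: the refill worker sleeps inside napi_disable()
    napi["disabled"] = False
    netdev_lock.release()
    return True

def run(cancel_refill_first):
    napi["disabled"] = False
    refill = threading.Thread(target=napi_disable)  # stands in for refill_work
    napi_disable()                 # rx pause path disables the queue's NAPI
    if not cancel_refill_first:
        # Buggy ordering: delayed refill_work races in and disables again.
        refill.start()
        time.sleep(0.05)           # let it block while holding netdev_lock
    ok = try_napi_enable(timeout=0.2)
    napi["disabled"] = False       # unwedge the demo thread so it can exit
    if refill.is_alive():
        refill.join(timeout=1)
    return ok

print(run(cancel_refill_first=False))  # False: napi_enable is stuck (the bug)
print(run(cancel_refill_first=True))   # True: refill cancelled first (the fix)
```

With the buggy ordering, the enable step times out; skipping the second
disable, which mirrors disabling and cancelling refill_work before
napi_disable(), lets it proceed.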

Version 3 changes:
- Patch 1: refactor to avoid code duplication

Version 2 changes:
- Add selftest for deadlock scenario

Thanks,
Quang Minh.

Bui Quang Minh (3):
  virtio-net: disable delayed refill when pausing rx
  selftests: net: move xdp_helper to net/lib
  selftests: net: add a virtio_net deadlock selftest

 drivers/net/virtio_net.c                      | 69 +++++++++++++++----
 tools/testing/selftests/Makefile              |  2 +-
 tools/testing/selftests/drivers/net/Makefile  |  2 -
 tools/testing/selftests/drivers/net/queues.py |  4 +-
 .../selftests/drivers/net/virtio_net/Makefile |  2 +
 .../selftests/drivers/net/virtio_net/config   |  1 +
 .../drivers/net/virtio_net/lib/py/__init__.py | 16 +++++
 .../drivers/net/virtio_net/xsk_pool.py        | 52 ++++++++++++++
 tools/testing/selftests/net/lib/.gitignore    |  1 +
 tools/testing/selftests/net/lib/Makefile      |  1 +
 .../{drivers/net => net/lib}/xdp_helper.c     |  0
 11 files changed, 133 insertions(+), 17 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/virtio_net/lib/py/__init__.py
 create mode 100755 tools/testing/selftests/drivers/net/virtio_net/xsk_pool.py
 rename tools/testing/selftests/{drivers/net => net/lib}/xdp_helper.c (100%)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/3] virtio-net: disable delayed refill when pausing rx
  2025-04-15  7:43 [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Bui Quang Minh
@ 2025-04-15  7:43 ` Bui Quang Minh
  2025-04-15 13:19   ` Michael S. Tsirkin
  2025-04-15  7:43 ` [PATCH v3 2/3] selftests: net: move xdp_helper to net/lib Bui Quang Minh
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Bui Quang Minh @ 2025-04-15  7:43 UTC (permalink / raw)
  To: virtualization
  Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, Andrew Lunn,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf,
	Bui Quang Minh

When pausing rx (e.g. to set up XDP, an xsk pool, or an rx resize), we
call napi_disable() on the receive queue's napi. The delayed refill_work
also calls napi_disable() on the receive queue's napi. When
napi_disable() is called on an already disabled napi, it will sleep in
napi_disable_locked() while still holding the netdev_lock. As a result,
a later napi_enable() gets stuck too, as it cannot acquire the
netdev_lock. This leaves refill_work and the pause-then-resume path
stuck altogether.

This scenario can be reproduced by binding an XDP socket to a
virtio-net interface without setting up the fill ring. In that case,
try_fill_recv() fails until the fill ring is set up, so refill_work is
scheduled.

This commit adds virtnet_rx_(pause/resume)_all helpers and fixes up
virtnet_rx_pause to disable future delayed refill_work and cancel any
in-flight instances before calling napi_disable() to pause rx.

Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
 drivers/net/virtio_net.c | 69 +++++++++++++++++++++++++++++++++-------
 1 file changed, 57 insertions(+), 12 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7e4617216a4b..848fab51dfa1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3342,7 +3342,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
-static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
+static void __virtnet_rx_pause(struct virtnet_info *vi,
+			       struct receive_queue *rq)
 {
 	bool running = netif_running(vi->dev);
 
@@ -3352,17 +3353,63 @@ static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
 	}
 }
 
-static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq)
+static void virtnet_rx_pause_all(struct virtnet_info *vi)
+{
+	int i;
+
+	/*
+	 * Make sure refill_work does not run concurrently to
+	 * avoid napi_disable race which leads to deadlock.
+	 */
+	disable_delayed_refill(vi);
+	cancel_delayed_work_sync(&vi->refill);
+	for (i = 0; i < vi->max_queue_pairs; i++)
+		__virtnet_rx_pause(vi, &vi->rq[i]);
+}
+
+static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
+{
+	/*
+	 * Make sure refill_work does not run concurrently to
+	 * avoid napi_disable race which leads to deadlock.
+	 */
+	disable_delayed_refill(vi);
+	cancel_delayed_work_sync(&vi->refill);
+	__virtnet_rx_pause(vi, rq);
+}
+
+static void __virtnet_rx_resume(struct virtnet_info *vi,
+				struct receive_queue *rq,
+				bool refill)
 {
 	bool running = netif_running(vi->dev);
 
-	if (!try_fill_recv(vi, rq, GFP_KERNEL))
+	if (refill && !try_fill_recv(vi, rq, GFP_KERNEL))
 		schedule_delayed_work(&vi->refill, 0);
 
 	if (running)
 		virtnet_napi_enable(rq);
 }
 
+static void virtnet_rx_resume_all(struct virtnet_info *vi)
+{
+	int i;
+
+	enable_delayed_refill(vi);
+	for (i = 0; i < vi->max_queue_pairs; i++) {
+		if (i < vi->curr_queue_pairs)
+			__virtnet_rx_resume(vi, &vi->rq[i], true);
+		else
+			__virtnet_rx_resume(vi, &vi->rq[i], false);
+	}
+}
+
+static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq)
+{
+	enable_delayed_refill(vi);
+	__virtnet_rx_resume(vi, rq, true);
+}
+
 static int virtnet_rx_resize(struct virtnet_info *vi,
 			     struct receive_queue *rq, u32 ring_num)
 {
@@ -5959,12 +6006,12 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 	if (prog)
 		bpf_prog_add(prog, vi->max_queue_pairs - 1);
 
+	virtnet_rx_pause_all(vi);
+
 	/* Make sure NAPI is not using any XDP TX queues for RX. */
 	if (netif_running(dev)) {
-		for (i = 0; i < vi->max_queue_pairs; i++) {
-			virtnet_napi_disable(&vi->rq[i]);
+		for (i = 0; i < vi->max_queue_pairs; i++)
 			virtnet_napi_tx_disable(&vi->sq[i]);
-		}
 	}
 
 	if (!prog) {
@@ -5996,13 +6043,12 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 		vi->xdp_enabled = false;
 	}
 
+	virtnet_rx_resume_all(vi);
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		if (old_prog)
 			bpf_prog_put(old_prog);
-		if (netif_running(dev)) {
-			virtnet_napi_enable(&vi->rq[i]);
+		if (netif_running(dev))
 			virtnet_napi_tx_enable(&vi->sq[i]);
-		}
 	}
 
 	return 0;
@@ -6014,11 +6060,10 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 			rcu_assign_pointer(vi->rq[i].xdp_prog, old_prog);
 	}
 
+	virtnet_rx_resume_all(vi);
 	if (netif_running(dev)) {
-		for (i = 0; i < vi->max_queue_pairs; i++) {
-			virtnet_napi_enable(&vi->rq[i]);
+		for (i = 0; i < vi->max_queue_pairs; i++)
 			virtnet_napi_tx_enable(&vi->sq[i]);
-		}
 	}
 	if (prog)
 		bpf_prog_sub(prog, vi->max_queue_pairs - 1);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/3] selftests: net: move xdp_helper to net/lib
  2025-04-15  7:43 [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Bui Quang Minh
  2025-04-15  7:43 ` [PATCH v3 1/3] " Bui Quang Minh
@ 2025-04-15  7:43 ` Bui Quang Minh
  2025-04-15  7:43 ` [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest Bui Quang Minh
  2025-04-15 14:04 ` [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Michael S. Tsirkin
  3 siblings, 0 replies; 11+ messages in thread
From: Bui Quang Minh @ 2025-04-15  7:43 UTC (permalink / raw)
  To: virtualization
  Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, Andrew Lunn,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf,
	Bui Quang Minh

Move xdp_helper to net/lib to make it easier for other selftests to use
the helper.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
 tools/testing/selftests/drivers/net/Makefile                  | 2 --
 tools/testing/selftests/drivers/net/queues.py                 | 4 ++--
 tools/testing/selftests/net/lib/.gitignore                    | 1 +
 tools/testing/selftests/net/lib/Makefile                      | 1 +
 tools/testing/selftests/{drivers/net => net/lib}/xdp_helper.c | 0
 5 files changed, 4 insertions(+), 4 deletions(-)
 rename tools/testing/selftests/{drivers/net => net/lib}/xdp_helper.c (100%)

diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile
index 0c95bd944d56..cd74f1eb3193 100644
--- a/tools/testing/selftests/drivers/net/Makefile
+++ b/tools/testing/selftests/drivers/net/Makefile
@@ -6,8 +6,6 @@ TEST_INCLUDES := $(wildcard lib/py/*.py) \
 		 ../../net/net_helper.sh \
 		 ../../net/lib.sh \
 
-TEST_GEN_FILES := xdp_helper
-
 TEST_PROGS := \
 	netcons_basic.sh \
 	netcons_fragmented_msg.sh \
diff --git a/tools/testing/selftests/drivers/net/queues.py b/tools/testing/selftests/drivers/net/queues.py
index 06abd3f233e1..236005290a33 100755
--- a/tools/testing/selftests/drivers/net/queues.py
+++ b/tools/testing/selftests/drivers/net/queues.py
@@ -26,13 +26,13 @@ def nl_get_queues(cfg, nl, qtype='rx'):
 
 def check_xsk(cfg, nl, xdp_queue_id=0) -> None:
     # Probe for support
-    xdp = cmd(f'{cfg.test_dir / "xdp_helper"} - -', fail=False)
+    xdp = cmd(f'{cfg.net_lib_dir / "xdp_helper"} - -', fail=False)
     if xdp.ret == 255:
         raise KsftSkipEx('AF_XDP unsupported')
     elif xdp.ret > 0:
         raise KsftFailEx('unable to create AF_XDP socket')
 
-    with bkg(f'{cfg.test_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
+    with bkg(f'{cfg.net_lib_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
              ksft_wait=3):
 
         rx = tx = False
diff --git a/tools/testing/selftests/net/lib/.gitignore b/tools/testing/selftests/net/lib/.gitignore
index 1ebc6187f421..bbc97d6bf556 100644
--- a/tools/testing/selftests/net/lib/.gitignore
+++ b/tools/testing/selftests/net/lib/.gitignore
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 csum
+xdp_helper
diff --git a/tools/testing/selftests/net/lib/Makefile b/tools/testing/selftests/net/lib/Makefile
index c22623b9a2a5..88c4bc461459 100644
--- a/tools/testing/selftests/net/lib/Makefile
+++ b/tools/testing/selftests/net/lib/Makefile
@@ -10,6 +10,7 @@ TEST_FILES += ../../../../net/ynl
 
 TEST_GEN_FILES += csum
 TEST_GEN_FILES += $(patsubst %.c,%.o,$(wildcard *.bpf.c))
+TEST_GEN_FILES += xdp_helper
 
 TEST_INCLUDES := $(wildcard py/*.py sh/*.sh)
 
diff --git a/tools/testing/selftests/drivers/net/xdp_helper.c b/tools/testing/selftests/net/lib/xdp_helper.c
similarity index 100%
rename from tools/testing/selftests/drivers/net/xdp_helper.c
rename to tools/testing/selftests/net/lib/xdp_helper.c
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest
  2025-04-15  7:43 [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Bui Quang Minh
  2025-04-15  7:43 ` [PATCH v3 1/3] " Bui Quang Minh
  2025-04-15  7:43 ` [PATCH v3 2/3] selftests: net: move xdp_helper to net/lib Bui Quang Minh
@ 2025-04-15  7:43 ` Bui Quang Minh
  2025-04-16  4:27   ` Jakub Kicinski
  2025-04-15 14:04 ` [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Michael S. Tsirkin
  3 siblings, 1 reply; 11+ messages in thread
From: Bui Quang Minh @ 2025-04-15  7:43 UTC (permalink / raw)
  To: virtualization
  Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, Andrew Lunn,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf,
	Bui Quang Minh

The selftest reproduces the deadlock scenario when binding/unbinding an
XDP program or XDP socket, or resizing the rx ring, on a virtio_net
interface.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
 tools/testing/selftests/Makefile              |  2 +-
 .../selftests/drivers/net/virtio_net/Makefile |  2 +
 .../selftests/drivers/net/virtio_net/config   |  1 +
 .../drivers/net/virtio_net/lib/py/__init__.py | 16 ++++++
 .../drivers/net/virtio_net/xsk_pool.py        | 52 +++++++++++++++++++
 5 files changed, 72 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/drivers/net/virtio_net/lib/py/__init__.py
 create mode 100755 tools/testing/selftests/drivers/net/virtio_net/xsk_pool.py

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index c77c8c8e3d9b..0a6b096f98b7 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -130,7 +130,7 @@ TARGETS_HOTPLUG = cpu-hotplug
 TARGETS_HOTPLUG += memory-hotplug
 
 # Networking tests want the net/lib target, include it automatically
-ifneq ($(filter net drivers/net drivers/net/hw,$(TARGETS)),)
+ifneq ($(filter net drivers/net drivers/net/hw drivers/net/virtio_net,$(TARGETS)),)
 ifeq ($(filter net/lib,$(TARGETS)),)
 	INSTALL_DEP_TARGETS := net/lib
 endif
diff --git a/tools/testing/selftests/drivers/net/virtio_net/Makefile b/tools/testing/selftests/drivers/net/virtio_net/Makefile
index 7ec7cd3ab2cc..82adda96ef15 100644
--- a/tools/testing/selftests/drivers/net/virtio_net/Makefile
+++ b/tools/testing/selftests/drivers/net/virtio_net/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0+ OR MIT
 
 TEST_PROGS = basic_features.sh \
+        xsk_pool.py \
         #
 
 TEST_FILES = \
@@ -8,6 +9,7 @@ TEST_FILES = \
         #
 
 TEST_INCLUDES = \
+        $(wildcard lib/py/*.py ../lib/py/*.py) \
         ../../../net/forwarding/lib.sh \
         ../../../net/lib.sh \
         #
diff --git a/tools/testing/selftests/drivers/net/virtio_net/config b/tools/testing/selftests/drivers/net/virtio_net/config
index bcf7555eaffe..12e8caa22613 100644
--- a/tools/testing/selftests/drivers/net/virtio_net/config
+++ b/tools/testing/selftests/drivers/net/virtio_net/config
@@ -6,3 +6,4 @@ CONFIG_NET_L3_MASTER_DEV=y
 CONFIG_NET_VRF=m
 CONFIG_VIRTIO_DEBUG=y
 CONFIG_VIRTIO_NET=y
+CONFIG_XDP_SOCKETS=y
diff --git a/tools/testing/selftests/drivers/net/virtio_net/lib/py/__init__.py b/tools/testing/selftests/drivers/net/virtio_net/lib/py/__init__.py
new file mode 100644
index 000000000000..b582885786f5
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/virtio_net/lib/py/__init__.py
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+
+import sys
+from pathlib import Path
+
+KSFT_DIR = (Path(__file__).parent / "../../../../..").resolve()
+
+try:
+    sys.path.append(KSFT_DIR.as_posix())
+    from net.lib.py import *
+    from drivers.net.lib.py import *
+except ModuleNotFoundError as e:
+    ksft_pr("Failed importing `net` library from kernel sources")
+    ksft_pr(str(e))
+    ktap_result(True, comment="SKIP")
+    sys.exit(4)
diff --git a/tools/testing/selftests/drivers/net/virtio_net/xsk_pool.py b/tools/testing/selftests/drivers/net/virtio_net/xsk_pool.py
new file mode 100755
index 000000000000..3f80afd97d4e
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/virtio_net/xsk_pool.py
@@ -0,0 +1,52 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+# This is intended to be run on a virtio-net guest interface.
+# The test binds the XDP socket to the interface without setting
+# the fill ring to trigger delayed refill_work. This helps to
+# make it easier to reproduce the deadlock when XDP program,
+# XDP socket bind/unbind, rx ring resize race with refill_work on
+# the buggy kernel.
+
+from lib.py import ksft_exit, ksft_run
+from lib.py import KsftSkipEx, KsftFailEx
+from lib.py import NetDrvEnv
+from lib.py import bkg, ip, cmd, ethtool
+import re
+
+def _get_rx_ring_entries(cfg):
+    output = ethtool(f"-g {cfg.ifname}").stdout
+    values = re.findall(r'RX:\s+(\d+)', output)
+    return int(values[1])
+
+def setup_xsk(cfg, xdp_queue_id = 0) -> bkg:
+    # Probe for support
+    xdp = cmd(f'{cfg.net_lib_dir / "xdp_helper"} - -', fail=False)
+    if xdp.ret == 255:
+        raise KsftSkipEx('AF_XDP unsupported')
+    elif xdp.ret > 0:
+        raise KsftFailEx('unable to create AF_XDP socket')
+
+    return bkg(f'{cfg.net_lib_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
+               ksft_wait=3)
+
+def check_xdp_bind(cfg):
+    ip(f"link set dev %s xdp obj %s sec xdp" %
+       (cfg.ifname, cfg.net_lib_dir / "xdp_dummy.bpf.o"))
+    ip(f"link set dev %s xdp off" % cfg.ifname)
+
+def check_rx_resize(cfg, queue_size = 128):
+    rx_ring = _get_rx_ring_entries(cfg)
+    ethtool(f"-G %s rx %d" % (cfg.ifname, queue_size))
+    ethtool(f"-G %s rx %d" % (cfg.ifname, rx_ring))
+
+def main():
+    with NetDrvEnv(__file__, nsim_test=False) as cfg:
+        xsk_bkg = setup_xsk(cfg)
+
+        ksft_run([check_xdp_bind, check_rx_resize],
+                 args=(cfg, ))
+    ksft_exit()
+
+if __name__ == "__main__":
+    main()
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread
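
The _get_rx_ring_entries() helper in the selftest above parses
"ethtool -g" output with a regex. A standalone sketch against assumed
sample output ("ethtool -g" prints the preset maximums before the
current hardware settings, so the second RX match is the live ring
size):

```python
import re

# Standalone sketch of the _get_rx_ring_entries() helper from the
# selftest, run against assumed `ethtool -g` output: the preset-maximums
# block comes first, the current hardware settings second, so the second
# "RX:" match is the live ring size.
sample = """Ring parameters for eth0:
Pre-set maximums:
RX:             1024
RX Mini:        n/a
RX Jumbo:       n/a
TX:             1024
Current hardware settings:
RX:             256
RX Mini:        n/a
RX Jumbo:       n/a
TX:             256
"""

def get_rx_ring_entries(output):
    # "RX Mini:"/"RX Jumbo:" lines do not match the literal "RX:" prefix
    values = re.findall(r'RX:\s+(\d+)', output)
    return int(values[1])  # index 1 = current setting, not the maximum

print(get_rx_ring_entries(sample))  # 256
```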

* Re: [PATCH v3 1/3] virtio-net: disable delayed refill when pausing rx
  2025-04-15  7:43 ` [PATCH v3 1/3] " Bui Quang Minh
@ 2025-04-15 13:19   ` Michael S. Tsirkin
  0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2025-04-15 13:19 UTC (permalink / raw)
  To: Bui Quang Minh
  Cc: virtualization, Jason Wang, Xuan Zhuo, Andrew Lunn, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Eugenio Pérez,
	David S . Miller, netdev, linux-kernel, bpf

On Tue, Apr 15, 2025 at 02:43:39PM +0700, Bui Quang Minh wrote:
> When pausing rx (e.g. to set up XDP, an xsk pool, or an rx resize), we
> call napi_disable() on the receive queue's napi. The delayed refill_work
> also calls napi_disable() on the receive queue's napi. When
> napi_disable() is called on an already disabled napi, it will sleep in
> napi_disable_locked() while still holding the netdev_lock. As a result,
> a later napi_enable() gets stuck too, as it cannot acquire the
> netdev_lock. This leaves refill_work and the pause-then-resume path
> stuck altogether.
> 
> This scenario can be reproduced by binding an XDP socket to a
> virtio-net interface without setting up the fill ring. In that case,
> try_fill_recv() fails until the fill ring is set up, so refill_work is
> scheduled.
> 
> This commit adds virtnet_rx_(pause/resume)_all helpers and fixes up
> virtnet_rx_pause to disable future delayed refill_work and cancel any
> in-flight instances before calling napi_disable() to pause rx.
> 
> Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 69 +++++++++++++++++++++++++++++++++-------
>  1 file changed, 57 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7e4617216a4b..848fab51dfa1 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3342,7 +3342,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	return NETDEV_TX_OK;
>  }
>  
> -static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
> +static void __virtnet_rx_pause(struct virtnet_info *vi,
> +			       struct receive_queue *rq)
>  {
>  	bool running = netif_running(vi->dev);
>  
> @@ -3352,17 +3353,63 @@ static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
>  	}
>  }
>  
> -static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq)
> +static void virtnet_rx_pause_all(struct virtnet_info *vi)
> +{
> +	int i;
> +
> +	/*
> +	 * Make sure refill_work does not run concurrently to
> +	 * avoid napi_disable race which leads to deadlock.
> +	 */
> +	disable_delayed_refill(vi);
> +	cancel_delayed_work_sync(&vi->refill);
> +	for (i = 0; i < vi->max_queue_pairs; i++)
> +		__virtnet_rx_pause(vi, &vi->rq[i]);
> +}
> +
> +static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
> +{
> +	/*
> +	 * Make sure refill_work does not run concurrently to
> +	 * avoid napi_disable race which leads to deadlock.
> +	 */
> +	disable_delayed_refill(vi);
> +	cancel_delayed_work_sync(&vi->refill);
> +	__virtnet_rx_pause(vi, rq);
> +}
> +
> +static void __virtnet_rx_resume(struct virtnet_info *vi,
> +				struct receive_queue *rq,
> +				bool refill)
>  {
>  	bool running = netif_running(vi->dev);
>  
> -	if (!try_fill_recv(vi, rq, GFP_KERNEL))
> +	if (refill && !try_fill_recv(vi, rq, GFP_KERNEL))
>  		schedule_delayed_work(&vi->refill, 0);
>  
>  	if (running)
>  		virtnet_napi_enable(rq);
>  }
>  
> +static void virtnet_rx_resume_all(struct virtnet_info *vi)
> +{
> +	int i;
> +
> +	enable_delayed_refill(vi);
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
> +		if (i < vi->curr_queue_pairs)
> +			__virtnet_rx_resume(vi, &vi->rq[i], true);
> +		else
> +			__virtnet_rx_resume(vi, &vi->rq[i], false);
> +	}
> +}
> +
> +static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq)
> +{
> +	enable_delayed_refill(vi);
> +	__virtnet_rx_resume(vi, rq, true);
> +}
> +
>  static int virtnet_rx_resize(struct virtnet_info *vi,
>  			     struct receive_queue *rq, u32 ring_num)
>  {
> @@ -5959,12 +6006,12 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  	if (prog)
>  		bpf_prog_add(prog, vi->max_queue_pairs - 1);
>  
> +	virtnet_rx_pause_all(vi);
> +
>  	/* Make sure NAPI is not using any XDP TX queues for RX. */
>  	if (netif_running(dev)) {
> -		for (i = 0; i < vi->max_queue_pairs; i++) {
> -			virtnet_napi_disable(&vi->rq[i]);
> +		for (i = 0; i < vi->max_queue_pairs; i++)
>  			virtnet_napi_tx_disable(&vi->sq[i]);
> -		}
>  	}
>  
>  	if (!prog) {
> @@ -5996,13 +6043,12 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  		vi->xdp_enabled = false;
>  	}
>  
> +	virtnet_rx_resume_all(vi);
>  	for (i = 0; i < vi->max_queue_pairs; i++) {
>  		if (old_prog)
>  			bpf_prog_put(old_prog);
> -		if (netif_running(dev)) {
> -			virtnet_napi_enable(&vi->rq[i]);
> +		if (netif_running(dev))
>  			virtnet_napi_tx_enable(&vi->sq[i]);
> -		}
>  	}
>  
>  	return 0;
> @@ -6014,11 +6060,10 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  			rcu_assign_pointer(vi->rq[i].xdp_prog, old_prog);
>  	}
>  
> +	virtnet_rx_resume_all(vi);
>  	if (netif_running(dev)) {
> -		for (i = 0; i < vi->max_queue_pairs; i++) {
> -			virtnet_napi_enable(&vi->rq[i]);
> +		for (i = 0; i < vi->max_queue_pairs; i++)
>  			virtnet_napi_tx_enable(&vi->sq[i]);
> -		}
>  	}
>  	if (prog)
>  		bpf_prog_sub(prog, vi->max_queue_pairs - 1);
> -- 
> 2.43.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx
  2025-04-15  7:43 [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Bui Quang Minh
                   ` (2 preceding siblings ...)
  2025-04-15  7:43 ` [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest Bui Quang Minh
@ 2025-04-15 14:04 ` Michael S. Tsirkin
  3 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2025-04-15 14:04 UTC (permalink / raw)
  To: Bui Quang Minh
  Cc: virtualization, Jason Wang, Xuan Zhuo, Andrew Lunn, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Eugenio Pérez,
	David S . Miller, netdev, linux-kernel, bpf

On Tue, Apr 15, 2025 at 02:43:38PM +0700, Bui Quang Minh wrote:
> Hi everyone,
> 
> This series fixes a deadlock in virtio-net when binding/unbinding an
> XDP program or XDP socket, or when resizing the rx queue.
> 
> When pausing rx (e.g. to set up XDP, an xsk pool, or an rx resize), we
> call napi_disable() on the receive queue's napi. The delayed refill_work
> also calls napi_disable() on the receive queue's napi. When
> napi_disable() is called on an already disabled napi, it will sleep in
> napi_disable_locked() while still holding the netdev_lock. As a result,
> a later napi_enable() gets stuck too, as it cannot acquire the
> netdev_lock. This leaves refill_work and the pause-then-resume path
> stuck altogether.
> 
> This scenario can be reproduced by binding an XDP socket to a
> virtio-net interface without setting up the fill ring. In that case,
> try_fill_recv() fails until the fill ring is set up, so refill_work is
> scheduled.
> 
> This fix adds virtnet_rx_(pause/resume)_all helpers and fixes up
> virtnet_rx_pause to disable future delayed refill_work and cancel any
> in-flight instances before calling napi_disable() to pause rx.
> 
> Version 3 changes:
> - Patch 1: refactor to avoid code duplication
> 
> Version 2 changes:
> - Add selftest for deadlock scenario
> 
> Thanks,
> Quang Minh.


Acked-by: Michael S. Tsirkin <mst@redhat.com>

> Bui Quang Minh (3):
>   virtio-net: disable delayed refill when pausing rx
>   selftests: net: move xdp_helper to net/lib
>   selftests: net: add a virtio_net deadlock selftest
> 
>  drivers/net/virtio_net.c                      | 69 +++++++++++++++----
>  tools/testing/selftests/Makefile              |  2 +-
>  tools/testing/selftests/drivers/net/Makefile  |  2 -
>  tools/testing/selftests/drivers/net/queues.py |  4 +-
>  .../selftests/drivers/net/virtio_net/Makefile |  2 +
>  .../selftests/drivers/net/virtio_net/config   |  1 +
>  .../drivers/net/virtio_net/lib/py/__init__.py | 16 +++++
>  .../drivers/net/virtio_net/xsk_pool.py        | 52 ++++++++++++++
>  tools/testing/selftests/net/lib/.gitignore    |  1 +
>  tools/testing/selftests/net/lib/Makefile      |  1 +
>  .../{drivers/net => net/lib}/xdp_helper.c     |  0
>  11 files changed, 133 insertions(+), 17 deletions(-)
>  create mode 100644 tools/testing/selftests/drivers/net/virtio_net/lib/py/__init__.py
>  create mode 100755 tools/testing/selftests/drivers/net/virtio_net/xsk_pool.py
>  rename tools/testing/selftests/{drivers/net => net/lib}/xdp_helper.c (100%)
> 
> -- 
> 2.43.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest
  2025-04-15  7:43 ` [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest Bui Quang Minh
@ 2025-04-16  4:27   ` Jakub Kicinski
  2025-04-16  6:54     ` Bui Quang Minh
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2025-04-16  4:27 UTC (permalink / raw)
  To: Bui Quang Minh
  Cc: virtualization, Michael S . Tsirkin, Jason Wang, Xuan Zhuo,
	Andrew Lunn, Eric Dumazet, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf

On Tue, 15 Apr 2025 14:43:41 +0700 Bui Quang Minh wrote:
> +def setup_xsk(cfg, xdp_queue_id = 0) -> bkg:
> +    # Probe for support
> +    xdp = cmd(f'{cfg.net_lib_dir / "xdp_helper"} - -', fail=False)
> +    if xdp.ret == 255:
> +        raise KsftSkipEx('AF_XDP unsupported')
> +    elif xdp.ret > 0:
> +        raise KsftFailEx('unable to create AF_XDP socket')
> +
> +    return bkg(f'{cfg.net_lib_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
> +               ksft_wait=3)
> +
> +def check_xdp_bind(cfg):
> +    ip(f"link set dev %s xdp obj %s sec xdp" %
> +       (cfg.ifname, cfg.net_lib_dir / "xdp_dummy.bpf.o"))
> +    ip(f"link set dev %s xdp off" % cfg.ifname)
> +
> +def check_rx_resize(cfg, queue_size = 128):
> +    rx_ring = _get_rx_ring_entries(cfg)
> +    ethtool(f"-G %s rx %d" % (cfg.ifname, queue_size))
> +    ethtool(f"-G %s rx %d" % (cfg.ifname, rx_ring))

Unfortunately this doesn't work on a basic QEMU setup:

# ethtool -G eth0 rx 128
[   15.680655][  T287] virtio_net virtio2 eth0: resize rx fail: rx queue index: 0 err: -2
netlink error: No such file or directory

Is there a way to enable more capable virtio_net with QEMU?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest
  2025-04-16  4:27   ` Jakub Kicinski
@ 2025-04-16  6:54     ` Bui Quang Minh
  2025-04-16  7:46       ` Jason Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Bui Quang Minh @ 2025-04-16  6:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: virtualization, Michael S . Tsirkin, Jason Wang, Xuan Zhuo,
	Andrew Lunn, Eric Dumazet, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf

On 4/16/25 11:27, Jakub Kicinski wrote:
> On Tue, 15 Apr 2025 14:43:41 +0700 Bui Quang Minh wrote:
>> +def setup_xsk(cfg, xdp_queue_id = 0) -> bkg:
>> +    # Probe for support
>> +    xdp = cmd(f'{cfg.net_lib_dir / "xdp_helper"} - -', fail=False)
>> +    if xdp.ret == 255:
>> +        raise KsftSkipEx('AF_XDP unsupported')
>> +    elif xdp.ret > 0:
>> +        raise KsftFailEx('unable to create AF_XDP socket')
>> +
>> +    return bkg(f'{cfg.net_lib_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
>> +               ksft_wait=3)
>> +
>> +def check_xdp_bind(cfg):
>> +    ip(f"link set dev %s xdp obj %s sec xdp" %
>> +       (cfg.ifname, cfg.net_lib_dir / "xdp_dummy.bpf.o"))
>> +    ip(f"link set dev %s xdp off" % cfg.ifname)
>> +
>> +def check_rx_resize(cfg, queue_size = 128):
>> +    rx_ring = _get_rx_ring_entries(cfg)
>> +    ethtool(f"-G %s rx %d" % (cfg.ifname, queue_size))
>> +    ethtool(f"-G %s rx %d" % (cfg.ifname, rx_ring))
> Unfortunately this doesn't work on a basic QEMU setup:
>
> # ethtool -G eth0 rx 128
> [   15.680655][  T287] virtio_net virtio2 eth0: resize rx fail: rx queue index: 0 err: -2
> netlink error: No such file or directory
>
> Is there a way to enable more capable virtio_net with QEMU?

I guess that virtio-pci-legacy is used in your setup.

Here is how I setup virtio-net with Qemu

     -netdev tap,id=hostnet1,vhost=on,script=$NETWORK_SCRIPT,downscript=no \
     -device virtio-net-pci,netdev=hostnet1,iommu_platform=on,disable-legacy=on \

The iommu_platform=on is necessary to make the vring use the DMA API, 
which is a requirement for enabling xsk_pool in virtio-net (the XDP 
socket runs in zerocopy mode in this case). Otherwise, the XDP socket 
falls back to copy mode, xsk_pool is not enabled in virtio-net, and the 
probability of reproducing the bug becomes very small. Currently, 
without iommu_platform=on, the test can pass even before the fix, so I 
think I will harden the selftest to make it return skip in this case.
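
For reference, a complete invocation combining these options might look 
like the sketch below. The kernel image, rootfs, and ifup script paths 
are illustrative; only the -netdev/-device fragment comes from the 
setup above:

```shell
# Hypothetical full QEMU command line (paths/names are placeholders).
# The virtio-net-pci flags mirror the fragment above: a modern-only
# device (disable-legacy=on) with the platform IOMMU enabled so the
# vring goes through the DMA API, which xsk_pool (zerocopy) requires.
qemu-system-x86_64 \
    -machine q35,accel=kvm -m 2G -smp 2 \
    -kernel ./bzImage -append "console=ttyS0 root=/dev/vda" \
    -drive file=./rootfs.img,if=virtio,format=raw \
    -netdev tap,id=hostnet1,vhost=on,script=./qemu-ifup,downscript=no \
    -device virtio-net-pci,netdev=hostnet1,iommu_platform=on,disable-legacy=on \
    -nographic
```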

Thanks,
Quang Minh.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest
  2025-04-16  6:54     ` Bui Quang Minh
@ 2025-04-16  7:46       ` Jason Wang
  2025-04-16  9:00         ` Bui Quang Minh
  2025-04-17  0:08         ` Jakub Kicinski
  0 siblings, 2 replies; 11+ messages in thread
From: Jason Wang @ 2025-04-16  7:46 UTC (permalink / raw)
  To: Bui Quang Minh
  Cc: Jakub Kicinski, virtualization, Michael S . Tsirkin, Xuan Zhuo,
	Andrew Lunn, Eric Dumazet, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf

On Wed, Apr 16, 2025 at 2:54 PM Bui Quang Minh <minhquangbui99@gmail.com> wrote:
>
> On 4/16/25 11:27, Jakub Kicinski wrote:
> > On Tue, 15 Apr 2025 14:43:41 +0700 Bui Quang Minh wrote:
> >> +def setup_xsk(cfg, xdp_queue_id = 0) -> bkg:
> >> +    # Probe for support
> >> +    xdp = cmd(f'{cfg.net_lib_dir / "xdp_helper"} - -', fail=False)
> >> +    if xdp.ret == 255:
> >> +        raise KsftSkipEx('AF_XDP unsupported')
> >> +    elif xdp.ret > 0:
> >> +        raise KsftFailEx('unable to create AF_XDP socket')
> >> +
> >> +    return bkg(f'{cfg.net_lib_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
> >> +               ksft_wait=3)
> >> +
> >> +def check_xdp_bind(cfg):
> >> +    ip(f"link set dev %s xdp obj %s sec xdp" %
> >> +       (cfg.ifname, cfg.net_lib_dir / "xdp_dummy.bpf.o"))
> >> +    ip(f"link set dev %s xdp off" % cfg.ifname)
> >> +
> >> +def check_rx_resize(cfg, queue_size = 128):
> >> +    rx_ring = _get_rx_ring_entries(cfg)
> >> +    ethtool(f"-G %s rx %d" % (cfg.ifname, queue_size))
> >> +    ethtool(f"-G %s rx %d" % (cfg.ifname, rx_ring))
> > Unfortunately this doesn't work on a basic QEMU setup:
> >
> > # ethtool -G eth0 rx 128
> > [   15.680655][  T287] virtio_net virtio2 eth0: resize rx fail: rx queue index: 0 err: -2
> > netlink error: No such file or directory
> >
> > Is there a way to enable more capable virtio_net with QEMU?

What's the qemu command line and version?

Resize depends on queue_reset which should be supported from Qemu 7.2

>
> I guess that virtio-pci-legacy is used in your setup.

Note that modern devices are used by default.

>
> Here is how I setup virtio-net with Qemu
>
>      -netdev tap,id=hostnet1,vhost=on,script=$NETWORK_SCRIPT,downscript=no \
>      -device
> virtio-net-pci,netdev=hostnet1,iommu_platform=on,disable-legacy=on \
>
> The iommu_platform=on is necessary to make vring use dma API which is a
> requirement to enable xsk_pool in virtio-net (XDP socket will be in
> zerocopy mode for this case). Otherwise, the XDP socket will fallback to
> copy mode, xsk_pool is not enabled in virtio-net that makes the
> probability to reproduce bug to be very small. Currently, when you don't
> have iommu_platform=on, you can pass the test even before the fix, so I
> think I will try to harden the selftest to make it return skip in this case.

I would like to keep the resize test as it doesn't require iommu_platform.

Thanks

>
> Thanks,
> Quang Minh.
>



* Re: [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest
  2025-04-16  7:46       ` Jason Wang
@ 2025-04-16  9:00         ` Bui Quang Minh
  2025-04-17  0:08         ` Jakub Kicinski
  1 sibling, 0 replies; 11+ messages in thread
From: Bui Quang Minh @ 2025-04-16  9:00 UTC (permalink / raw)
  To: Jason Wang
  Cc: Jakub Kicinski, virtualization, Michael S . Tsirkin, Xuan Zhuo,
	Andrew Lunn, Eric Dumazet, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf

On 4/16/25 14:46, Jason Wang wrote:
> On Wed, Apr 16, 2025 at 2:54 PM Bui Quang Minh <minhquangbui99@gmail.com> wrote:
>> On 4/16/25 11:27, Jakub Kicinski wrote:
>>> On Tue, 15 Apr 2025 14:43:41 +0700 Bui Quang Minh wrote:
>>>> +def setup_xsk(cfg, xdp_queue_id = 0) -> bkg:
>>>> +    # Probe for support
>>>> +    xdp = cmd(f'{cfg.net_lib_dir / "xdp_helper"} - -', fail=False)
>>>> +    if xdp.ret == 255:
>>>> +        raise KsftSkipEx('AF_XDP unsupported')
>>>> +    elif xdp.ret > 0:
>>>> +        raise KsftFailEx('unable to create AF_XDP socket')
>>>> +
>>>> +    return bkg(f'{cfg.net_lib_dir / "xdp_helper"} {cfg.ifindex} {xdp_queue_id}',
>>>> +               ksft_wait=3)
>>>> +
>>>> +def check_xdp_bind(cfg):
>>>> +    ip(f"link set dev %s xdp obj %s sec xdp" %
>>>> +       (cfg.ifname, cfg.net_lib_dir / "xdp_dummy.bpf.o"))
>>>> +    ip(f"link set dev %s xdp off" % cfg.ifname)
>>>> +
>>>> +def check_rx_resize(cfg, queue_size = 128):
>>>> +    rx_ring = _get_rx_ring_entries(cfg)
>>>> +    ethtool(f"-G %s rx %d" % (cfg.ifname, queue_size))
>>>> +    ethtool(f"-G %s rx %d" % (cfg.ifname, rx_ring))
>>> Unfortunately this doesn't work on a basic QEMU setup:
>>>
>>> # ethtool -G eth0 rx 128
>>> [   15.680655][  T287] virtio_net virtio2 eth0: resize rx fail: rx queue index: 0 err: -2
>>> netlink error: No such file or directory
>>>
>>> Is there a way to enable more capable virtio_net with QEMU?
> What's the qemu command line and version?
>
> Resize depends on queue_reset which should be supported from Qemu 7.2
>
>> I guess that virtio-pci-legacy is used in your setup.
> Note that modern devices are used by default.
>
>> Here is how I setup virtio-net with Qemu
>>
>>       -netdev tap,id=hostnet1,vhost=on,script=$NETWORK_SCRIPT,downscript=no \
>>       -device
>> virtio-net-pci,netdev=hostnet1,iommu_platform=on,disable-legacy=on \
>>
>> The iommu_platform=on is necessary to make vring use dma API which is a
>> requirement to enable xsk_pool in virtio-net (XDP socket will be in
>> zerocopy mode for this case). Otherwise, the XDP socket will fallback to
>> copy mode, xsk_pool is not enabled in virtio-net that makes the
>> probability to reproduce bug to be very small. Currently, when you don't
>> have iommu_platform=on, you can pass the test even before the fix, so I
>> think I will try to harden the selftest to make it return skip in this case.
> I would like to keep the resize test as it doesn't require iommu_platform.

Okay, in the next version I will force the XDP socket binding to 
zerocopy so that xsk_pool gets set up. When the binding fails, the 2 
tests will still run but I will print a warning message.
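
The intended logic could be sketched roughly like this (a hypothetical 
helper, not the actual kselftest code; the exit-code convention for the 
probe comes from the quoted setup_xsk above):

```python
# Hypothetical sketch: decide what the selftest should do from the
# xdp_helper probe result, forcing zerocopy and downgrading a failed
# zerocopy bind to a warning instead of a hard failure.
XSK_UNSUPPORTED = 255  # assumed exit code meaning AF_XDP unsupported

def classify_bind(ret: int, zerocopy: bool) -> str:
    """Map the helper's exit status to a test action."""
    if ret == XSK_UNSUPPORTED:
        return "skip"   # no AF_XDP support at all -> skip the tests
    if ret != 0 and zerocopy:
        return "warn"   # zerocopy bind failed: run tests, warn that
                        # xsk_pool is not exercised (copy-mode fallback)
    if ret != 0:
        return "fail"   # plain bind failure is a real test failure
    return "run"
```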

Thanks,
Quang Minh.


* Re: [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest
  2025-04-16  7:46       ` Jason Wang
  2025-04-16  9:00         ` Bui Quang Minh
@ 2025-04-17  0:08         ` Jakub Kicinski
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2025-04-17  0:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: Bui Quang Minh, virtualization, Michael S . Tsirkin, Xuan Zhuo,
	Andrew Lunn, Eric Dumazet, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Eugenio Pérez, David S . Miller, netdev, linux-kernel, bpf

On Wed, 16 Apr 2025 15:46:42 +0800 Jason Wang wrote:
> On Wed, Apr 16, 2025 at 2:54 PM Bui Quang Minh <minhquangbui99@gmail.com> wrote:
> > On 4/16/25 11:27, Jakub Kicinski wrote:  
> > > Unfortunately this doesn't work on a basic QEMU setup:
> > >
> > > # ethtool -G eth0 rx 128
> > > [   15.680655][  T287] virtio_net virtio2 eth0: resize rx fail: rx queue index: 0 err: -2
> > > netlink error: No such file or directory
> > >
> > > Is there a way to enable more capable virtio_net with QEMU?  
> 
> What's the qemu command line and version?
> 
> Resize depends on queue_reset which should be supported from Qemu 7.2

I'm using virtme-ng with --net loop and:

QEMU emulator version 9.1.3 (qemu-9.1.3-2.fc41)

--net loop resolves to:

	-device virtio-net-device,netdev=n0 \
	-netdev hubport,id=n0,hubid=0 \
	-device virtio-net-device,netdev=n1 \
	-netdev hubport,id=n1,hubid=0

> > I guess that virtio-pci-legacy is used in your setup.  
> 
> Note that modern devices are used by default.
> 
> >
> > Here is how I setup virtio-net with Qemu
> >
> >      -netdev tap,id=hostnet1,vhost=on,script=$NETWORK_SCRIPT,downscript=no \
> >      -device
> > virtio-net-pci,netdev=hostnet1,iommu_platform=on,disable-legacy=on \

That works! I rejigged the CI; for posterity, I used the following two
times:

	-device	virtio-net-pci,netdev=n0,iommu_platform=on,disable-legacy=on,mq=on,vectors=18
	-netdev tap,id=n0,ifname=tap4,vhost=on,script=no,downscript=no,queues=8 

and then manually bridged the taps together on the hypervisor side.
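
The hypervisor-side bridging step can be done with standard iproute2 
commands, roughly as follows (the tap and bridge names are 
illustrative; tap4/tap5 assume the two -netdev lines above use those 
ifnames):

```shell
# Illustrative: bridge the two QEMU tap interfaces on the hypervisor
# so the guests' virtio-net ports can reach each other.
ip link add name br0 type bridge
ip link set dev tap4 master br0
ip link set dev tap5 master br0
ip link set dev tap4 up
ip link set dev tap5 up
ip link set dev br0 up
```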

> > The iommu_platform=on is necessary to make vring use dma API which is a
> > requirement to enable xsk_pool in virtio-net (XDP socket will be in
> > zerocopy mode for this case). Otherwise, the XDP socket will fallback to
> > copy mode, xsk_pool is not enabled in virtio-net that makes the
> > probability to reproduce bug to be very small. Currently, when you don't
> > have iommu_platform=on, you can pass the test even before the fix, so I
> > think I will try to harden the selftest to make it return skip in this case.  
> 
> I would like to keep the resize test as it doesn't require iommu_platform.

Sounds good, but let's just add them to the drivers/net/hw directory.
I don't think there's anything virtio-specific in the test itself?

Right now drivers/net/virtio_net has a test which expects to see
both netdevs in the VM, while the drivers/ Python-based tests expect
the env to be prepared so that only one end is on the local machine
and the other is accessible over SSH or in another netns. So it's a bit
painful to marry the two kinds of tests in the CI. At least our netdev
CI does not know how to figure this out :( It preps the env and then
runs the whole kselftest TARGET in the same setup.


end of thread, other threads:[~2025-04-17  0:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-15  7:43 [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Bui Quang Minh
2025-04-15  7:43 ` [PATCH v3 1/3] " Bui Quang Minh
2025-04-15 13:19   ` Michael S. Tsirkin
2025-04-15  7:43 ` [PATCH v3 2/3] selftests: net: move xdp_helper to net/lib Bui Quang Minh
2025-04-15  7:43 ` [PATCH v3 3/3] selftests: net: add a virtio_net deadlock selftest Bui Quang Minh
2025-04-16  4:27   ` Jakub Kicinski
2025-04-16  6:54     ` Bui Quang Minh
2025-04-16  7:46       ` Jason Wang
2025-04-16  9:00         ` Bui Quang Minh
2025-04-17  0:08         ` Jakub Kicinski
2025-04-15 14:04 ` [PATCH v3 0/3] virtio-net: disable delayed refill when pausing rx Michael S. Tsirkin
