[PATCH net-next 0/6] selftests: net: page_poll allocation error injection

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net-next 0/6] selftests: net: page_poll allocation error injection
@ 2024-04-26 23:23 Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 1/6] net: page_pool: support " Jakub Kicinski
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

Add a test for exercising driver memory allocation failure paths.
page pool is a bit tricky to inject errors into at the page allocator
level because of the bulk alloc and recycling, so add explicit error
injection support "in front" of the caches.

Add a test to exercise that using only the standard APIs.
This is the first useful test for the new tests with an endpoint.
There's no point testing netdevsim here, so this is also the first
HW-only test in Python.

I'm not super happy with the traffic generation using iperf3,
my initial approach was to use mausezahn. But it turned out to be
5x slower in terms of PPS. Hopefully this is good enough for now.

Jakub Kicinski (6):
  net: page_pool: support error injection
  selftests: drv-net-hw: support using Python from net hw tests
  selftests: net: py: extract tool logic
  selftests: net: py: avoid all ports < 10k
  selftests: drv-net: support generating iperf3 load
  selftests: drv-net-hw: add test for memory allocation failures with
    page pool

 net/core/page_pool.c                          |   2 +
 tools/testing/selftests/Makefile              |   2 +-
 .../testing/selftests/drivers/net/hw/Makefile |   2 +
 .../drivers/net/hw/lib/py/__init__.py         |  16 +++
 .../selftests/drivers/net/hw/pp_alloc_fail.py | 129 ++++++++++++++++++
 .../selftests/drivers/net/lib/py/__init__.py  |   1 +
 .../selftests/drivers/net/lib/py/env.py       |  10 +-
 .../selftests/drivers/net/lib/py/load.py      |  41 ++++++
 tools/testing/selftests/net/lib/py/ksft.py    |   4 +
 tools/testing/selftests/net/lib/py/utils.py   |  14 +-
 10 files changed, 214 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/hw/lib/py/__init__.py
 create mode 100755 tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
 create mode 100644 tools/testing/selftests/drivers/net/lib/py/load.py

-- 
2.44.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH net-next 1/6] net: page_pool: support error injection
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
@ 2024-04-26 23:23 ` Jakub Kicinski
  2024-04-30 15:27   ` Jesper Dangaard Brouer
  2024-04-26 23:23 ` [PATCH net-next 2/6] selftests: drv-net-hw: support using Python from net hw tests Jakub Kicinski
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski, hawk, ilias.apalodimas

Because of caching / recycling using the general page allocation
failures to induce errors in page pool allocation is very hard.
Add direct error injection support to page_pool_alloc_pages().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: hawk@kernel.org
CC: ilias.apalodimas@linaro.org
---
 net/core/page_pool.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 273c24429bce..8bcc7014a61a 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -5,6 +5,7 @@
  *	Copyright (C) 2016 Red Hat, Inc.
  */
 
+#include <linux/error-injection.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/slab.h>
@@ -550,6 +551,7 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
 	return page;
 }
 EXPORT_SYMBOL(page_pool_alloc_pages);
+ALLOW_ERROR_INJECTION(page_pool_alloc_pages, NULL);
 
 /* Calculate distance between two u32 values, valid if distance is below 2^(31)
  *  https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next 2/6] selftests: drv-net-hw: support using Python from net hw tests
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 1/6] net: page_pool: support " Jakub Kicinski
@ 2024-04-26 23:23 ` Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 3/6] selftests: net: py: extract tool logic Jakub Kicinski
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

We created a separate directory for HW-only tests, recently.
Glue in the Python test library there, Python is a bit annoying
when it comes to using library code located "lower"
in the directory structure.

Reuse the Env class, but let tests require non-nsim setup.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/Makefile                 |  2 +-
 tools/testing/selftests/drivers/net/hw/Makefile  |  1 +
 .../selftests/drivers/net/hw/lib/py/__init__.py  | 16 ++++++++++++++++
 .../testing/selftests/drivers/net/lib/py/env.py  | 10 ++++++++--
 4 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/hw/lib/py/__init__.py

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 2c940e9c4ced..9039f3709aff 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -119,7 +119,7 @@ TARGETS_HOTPLUG = cpu-hotplug
 TARGETS_HOTPLUG += memory-hotplug
 
 # Networking tests want the net/lib target, include it automatically
-ifneq ($(filter net drivers/net,$(TARGETS)),)
+ifneq ($(filter net drivers/net drivers/net/hw,$(TARGETS)),)
 ifeq ($(filter net/lib,$(TARGETS)),)
 	INSTALL_DEP_TARGETS := net/lib
 endif
diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
index 2259a39a70ed..95f32158b095 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -16,6 +16,7 @@ TEST_FILES := \
 	#
 
 TEST_INCLUDES := \
+	$(wildcard lib/py/*.py ../lib/py/*.py) \
 	../../../net/lib.sh \
 	../../../net/forwarding/lib.sh \
 	../../../net/forwarding/ipip_lib.sh \
diff --git a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py
new file mode 100644
index 000000000000..b582885786f5
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+
+import sys
+from pathlib import Path
+
+KSFT_DIR = (Path(__file__).parent / "../../../../..").resolve()
+
+try:
+    sys.path.append(KSFT_DIR.as_posix())
+    from net.lib.py import *
+    from drivers.net.lib.py import *
+except ModuleNotFoundError as e:
+    ksft_pr("Failed importing `net` library from kernel sources")
+    ksft_pr(str(e))
+    ktap_result(True, comment="SKIP")
+    sys.exit(4)
diff --git a/tools/testing/selftests/drivers/net/lib/py/env.py b/tools/testing/selftests/drivers/net/lib/py/env.py
index e2ab637e56dc..5c8f695b2536 100644
--- a/tools/testing/selftests/drivers/net/lib/py/env.py
+++ b/tools/testing/selftests/drivers/net/lib/py/env.py
@@ -2,7 +2,7 @@
 
 import os
 from pathlib import Path
-from lib.py import KsftSkipEx
+from lib.py import KsftSkipEx, KsftXfailEx
 from lib.py import cmd, ip
 from lib.py import NetNS, NetdevSimDev
 from .remote import Remote
@@ -76,7 +76,7 @@ from .remote import Remote
     nsim_v4_pfx = "192.0.2."
     nsim_v6_pfx = "2001:db8::"
 
-    def __init__(self, src_path):
+    def __init__(self, src_path, nsim_test=None):
 
         self.env = _load_env_file(src_path)
 
@@ -88,7 +88,10 @@ from .remote import Remote
         self._ns_peer = None
 
         if "NETIF" in self.env:
+            if nsim_test is True:
+                raise KsftXfailEx("Test only works on netdevsim")
             self._check_env()
+
             self.dev = ip("link show dev " + self.env['NETIF'], json=True)[0]
 
             self.v4 = self.env.get("LOCAL_V4")
@@ -98,6 +101,9 @@ from .remote import Remote
             kind = self.env["REMOTE_TYPE"]
             args = self.env["REMOTE_ARGS"]
         else:
+            if nsim_test is False:
+                raise KsftXfailEx("Test does not work on netdevsim")
+
             self.create_local()
 
             self.dev = self._ns.nsims[0].dev
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next 3/6] selftests: net: py: extract tool logic
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 1/6] net: page_pool: support " Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 2/6] selftests: drv-net-hw: support using Python from net hw tests Jakub Kicinski
@ 2024-04-26 23:23 ` Jakub Kicinski
  2024-04-27 13:51   ` Willem de Bruijn
  2024-04-26 23:23 ` [PATCH net-next 4/6] selftests: net: py: avoid all ports < 10k Jakub Kicinski
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

The main use of the ip() wrapper over cmd() is that it can parse JSON.
cmd("ip -j link show") will return stdout as a string, and test has
to call json.loads(). With ip("link show", json=True) the return value
will be already parsed.

More tools (ethtool, bpftool etc.) support the --json switch.
To avoid having to wrap all of them individually create a tool()
helper.

Switch from -j to --json (for ethtool).
While at it consume the netns attribute at the ip() level.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/net/lib/py/utils.py | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/net/lib/py/utils.py b/tools/testing/selftests/net/lib/py/utils.py
index d3715e6c21f2..11b588a2bb9d 100644
--- a/tools/testing/selftests/net/lib/py/utils.py
+++ b/tools/testing/selftests/net/lib/py/utils.py
@@ -56,10 +56,10 @@ import time
         return self.process(terminate=self.terminate)
 
 
-def ip(args, json=None, ns=None, host=None):
-    cmd_str = "ip "
+def tool(name, args, json=None, ns=None, host=None):
+    cmd_str = name + " "
     if json:
-        cmd_str += '-j '
+        cmd_str += '--json '
     cmd_str += args
     cmd_obj = cmd(cmd_str, ns=ns, host=host)
     if json:
@@ -67,6 +67,12 @@ import time
     return cmd_obj
 
 
+def ip(args, json=None, ns=None, host=None):
+    if ns:
+        args = '-netns ' + ns + " " + args
+    return tool("ip", args, json=json, host=host)
+
+
 def rand_port():
     """
     Get unprivileged port, for now just random, one day we may decide to check if used.
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next 4/6] selftests: net: py: avoid all ports < 10k
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
                   ` (2 preceding siblings ...)
  2024-04-26 23:23 ` [PATCH net-next 3/6] selftests: net: py: extract tool logic Jakub Kicinski
@ 2024-04-26 23:23 ` Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 5/6] selftests: drv-net: support generating iperf3 load Jakub Kicinski
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

When picking TCP ports to use, avoid all below 10k.
This should lower the chance of collision or running
afoul whatever random policies may be on the host.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/net/lib/py/utils.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/lib/py/utils.py b/tools/testing/selftests/net/lib/py/utils.py
index 11b588a2bb9d..074b792ec943 100644
--- a/tools/testing/selftests/net/lib/py/utils.py
+++ b/tools/testing/selftests/net/lib/py/utils.py
@@ -77,7 +77,7 @@ import time
     """
     Get unprivileged port, for now just random, one day we may decide to check if used.
     """
-    return random.randint(1024, 65535)
+    return random.randint(10000, 65535)
 
 
 def wait_port_listen(port, proto="tcp", ns=None, host=None, sleep=0.005, deadline=5):
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next 5/6] selftests: drv-net: support generating iperf3 load
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
                   ` (3 preceding siblings ...)
  2024-04-26 23:23 ` [PATCH net-next 4/6] selftests: net: py: avoid all ports < 10k Jakub Kicinski
@ 2024-04-26 23:23 ` Jakub Kicinski
  2024-04-26 23:23 ` [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool Jakub Kicinski
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

While we are not very interested in testing performance
it's useful to be able to generate a lot of traffic.
iperf is the simplest way of getting relatively high PPS.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 .../selftests/drivers/net/lib/py/__init__.py  |  1 +
 .../selftests/drivers/net/lib/py/load.py      | 41 +++++++++++++++++++
 2 files changed, 42 insertions(+)
 create mode 100644 tools/testing/selftests/drivers/net/lib/py/load.py

diff --git a/tools/testing/selftests/drivers/net/lib/py/__init__.py b/tools/testing/selftests/drivers/net/lib/py/__init__.py
index 4789c1a4282d..2a2dbb1b4ad7 100644
--- a/tools/testing/selftests/drivers/net/lib/py/__init__.py
+++ b/tools/testing/selftests/drivers/net/lib/py/__init__.py
@@ -2,6 +2,7 @@
 
 import sys
 from pathlib import Path
+from .load import *
 
 KSFT_DIR = (Path(__file__).parent / "../../../..").resolve()
 
diff --git a/tools/testing/selftests/drivers/net/lib/py/load.py b/tools/testing/selftests/drivers/net/lib/py/load.py
new file mode 100644
index 000000000000..abdb677bdb1c
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/lib/py/load.py
@@ -0,0 +1,41 @@
+# SPDX-License-Identifier: GPL-2.0
+
+import time
+
+from lib.py import ksft_pr, cmd, ip, rand_port, wait_port_listen
+
+class GenerateTraffic:
+    def __init__(self, env):
+        env.require_cmd("iperf3", remote=True)
+
+        self.env = env
+
+        port = rand_port()
+        self._iperf_server = cmd(f"iperf3 -s -p {port}", background=True)
+        wait_port_listen(port)
+        time.sleep(0.1)
+        self._iperf_client = cmd(f"iperf3 -c {env.addr} -P 16 -p {port} -t 86400",
+                                 background=True, host=env.remote)
+
+        # Wait for traffic to ramp up
+        pkt = ip("-s link show dev " + env.ifname, json=True)[0]["stats64"]["rx"]["packets"]
+        for _ in range(50):
+            time.sleep(0.1)
+            now = ip("-s link show dev " + env.ifname, json=True)[0]["stats64"]["rx"]["packets"]
+            if now - pkt > 1000:
+                return
+            pkt = now
+        self.stop(verbose=True)
+        raise Exception("iperf3 traffic did not ramp up")
+
+    def stop(self, verbose=None):
+        self._iperf_client.process(terminate=True)
+        if verbose:
+            ksft_pr(">> Client:")
+            ksft_pr(self._iperf_client.stdout)
+            ksft_pr(self._iperf_client.stderr)
+        self._iperf_server.process(terminate=True)
+        if verbose:
+            ksft_pr(">> Server:")
+            ksft_pr(self._iperf_server.stdout)
+            ksft_pr(self._iperf_server.stderr)
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
                   ` (4 preceding siblings ...)
  2024-04-26 23:23 ` [PATCH net-next 5/6] selftests: drv-net: support generating iperf3 load Jakub Kicinski
@ 2024-04-26 23:23 ` Jakub Kicinski
  2024-04-27 13:49   ` Willem de Bruijn
  2024-04-29 15:12   ` Andrew Lunn
  2024-04-27 13:55 ` [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Willem de Bruijn
  2024-04-29 15:01 ` Andrew Lunn
  7 siblings, 2 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-26 23:23 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

Bugs in memory allocation failure paths are quite common.
Add a test exercising those paths based on qstat and page pool
failure hook.

Running on bnxt:

  # ./drivers/net/hw/pp_alloc_fail.py
  KTAP version 1
  1..1
  # ethtool -G change retval: success
  ok 1 pp_alloc_fail.test_pp_alloc
  # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

I initially wrote this test to validate commit be43b7489a3c ("net/mlx5e:
RX, Fix page_pool allocation failure recovery for striding rq") but mlx5
still doesn't have qstat. So I run it on bnxt, and while bnxt survives
I found the problem fixed in commit 730117730709 ("eth: bnxt: fix counting
packets discarded due to OOM and netpoll").

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 .../testing/selftests/drivers/net/hw/Makefile |   1 +
 .../selftests/drivers/net/hw/pp_alloc_fail.py | 129 ++++++++++++++++++
 tools/testing/selftests/net/lib/py/ksft.py    |   4 +
 3 files changed, 134 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py

diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
index 95f32158b095..1dd732855d76 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -9,6 +9,7 @@ TEST_PROGS = \
 	hw_stats_l3.sh \
 	hw_stats_l3_gre.sh \
 	loopback.sh \
+	pp_alloc_fail.py \
 	#
 
 TEST_FILES := \
diff --git a/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
new file mode 100755
index 000000000000..026d98976c35
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
@@ -0,0 +1,129 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import time
+import os
+from lib.py import ksft_run, ksft_exit, ksft_pr
+from lib.py import KsftSkipEx, KsftFailEx
+from lib.py import NetdevFamily, NlError
+from lib.py import NetDrvEpEnv
+from lib.py import cmd, tool, GenerateTraffic
+
+
+def _write_fail_config(config):
+    for key, value in config.items():
+        with open("/sys/kernel/debug/fail_function/" + key, "w") as fp:
+            fp.write(str(value) + "\n")
+
+
+def _enable_pp_allocation_fail():
+    if not os.path.exists("/sys/kernel/debug/fail_function"):
+        raise KsftSkipEx("Kernel built without function error injection (or DebugFS)")
+
+    if not os.path.exists("/sys/kernel/debug/fail_function/page_pool_alloc_pages"):
+        with open("/sys/kernel/debug/fail_function/inject", "w") as fp:
+            fp.write("page_pool_alloc_pages\n")
+
+    _write_fail_config({
+        "verbose": 0,
+        "interval": 511,
+        "probability": 100,
+        "times": -1,
+    })
+
+
+def _disable_pp_allocation_fail():
+    if not os.path.exists("/sys/kernel/debug/fail_function"):
+        return
+
+    if os.path.exists("/sys/kernel/debug/fail_function/page_pool_alloc_pages"):
+        with open("/sys/kernel/debug/fail_function/inject", "w") as fp:
+            fp.write("\n")
+
+    _write_fail_config({
+        "probability": 0,
+        "times": 0,
+    })
+
+
+def test_pp_alloc(cfg, netdevnl):
+    def get_stats():
+        return netdevnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
+
+    def check_traffic_flowing():
+        stat1 = get_stats()
+        time.sleep(1)
+        stat2 = get_stats()
+        if stat2['rx-packets'] - stat1['rx-packets'] < 15000:
+            raise KsftFailEx("Traffic seems low:", stat2['rx-packets'] - stat1['rx-packets'])
+
+
+    try:
+        stats = get_stats()
+    except NlError as e:
+        if e.nl_msg.error == -95:
+            stats = {}
+        else:
+            raise
+    if 'rx-alloc-fail' not in stats:
+        raise KsftSkipEx("Driver does not report 'rx-alloc-fail' via qstats")
+
+    set_g = False
+    traffic = None
+    try:
+        traffic = GenerateTraffic(cfg)
+
+        check_traffic_flowing()
+
+        _enable_pp_allocation_fail()
+
+        s1 = get_stats()
+        time.sleep(3)
+        s2 = get_stats()
+
+        if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < 1:
+            raise KsftSkipEx("Allocation failures not increasing")
+        if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < 100:
+            raise KsftSkipEx("Allocation increasing too slowly", s2['rx-alloc-fail'] - s1['rx-alloc-fail'],
+                             "packets:", s2['rx-packets'] - s1['rx-packets'])
+
+        # Basic failures are fine, try to wobble some settings to catch extra failures
+        check_traffic_flowing()
+        g = tool("ethtool", "-g " + cfg.ifname, json=True)[0]
+        if 'rx' in g and g["rx"] * 2 <= g["rx-max"]:
+            new_g = g['rx'] * 2
+        elif 'rx' in g:
+            new_g = g['rx'] // 2
+        else:
+            new_g = None
+
+        if new_g:
+            set_g = cmd(f"ethtool -G {cfg.ifname} rx {new_g}", fail=False).ret == 0
+            if set_g:
+                ksft_pr("ethtool -G change retval: success")
+            else:
+                ksft_pr("ethtool -G change retval: did not succeed", new_g)
+        else:
+                ksft_pr("ethtool -G change retval: did not try")
+
+        time.sleep(0.1)
+        check_traffic_flowing()
+    finally:
+        _disable_pp_allocation_fail()
+        if traffic:
+            traffic.stop()
+        time.sleep(0.1)
+        if set_g:
+            cmd(f"ethtool -G {cfg.ifname} rx {g['rx']}")
+
+
+def main() -> None:
+    netdevnl = NetdevFamily()
+    with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
+
+        ksft_run([test_pp_alloc], args=(cfg, netdevnl, ))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/net/lib/py/ksft.py b/tools/testing/selftests/net/lib/py/ksft.py
index f84e9fdd0032..4769b4eb1ea1 100644
--- a/tools/testing/selftests/net/lib/py/ksft.py
+++ b/tools/testing/selftests/net/lib/py/ksft.py
@@ -11,6 +11,10 @@ KSFT_RESULT = None
 KSFT_RESULT_ALL = True
 
 
+class KsftFailEx(Exception):
+    pass
+
+
 class KsftSkipEx(Exception):
     pass
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-26 23:23 ` [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool Jakub Kicinski
@ 2024-04-27 13:49   ` Willem de Bruijn
  2024-04-29 14:51     ` Jakub Kicinski
  2024-04-29 15:12   ` Andrew Lunn
  1 sibling, 1 reply; 19+ messages in thread
From: Willem de Bruijn @ 2024-04-27 13:49 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

Jakub Kicinski wrote:
> Bugs in memory allocation failure paths are quite common.
> Add a test exercising those paths based on qstat and page pool
> failure hook.
> 
> Running on bnxt:
> 
>   # ./drivers/net/hw/pp_alloc_fail.py
>   KTAP version 1
>   1..1
>   # ethtool -G change retval: success
>   ok 1 pp_alloc_fail.test_pp_alloc
>   # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> 
> I initially wrote this test to validate commit be43b7489a3c ("net/mlx5e:
> RX, Fix page_pool allocation failure recovery for striding rq") but mlx5
> still doesn't have qstat. So I run it on bnxt, and while bnxt survives
> I found the problem fixed in commit 730117730709 ("eth: bnxt: fix counting
> packets discarded due to OOM and netpoll").
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  .../testing/selftests/drivers/net/hw/Makefile |   1 +
>  .../selftests/drivers/net/hw/pp_alloc_fail.py | 129 ++++++++++++++++++
>  tools/testing/selftests/net/lib/py/ksft.py    |   4 +
>  3 files changed, 134 insertions(+)
>  create mode 100755 tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
> 
> diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
> index 95f32158b095..1dd732855d76 100644
> --- a/tools/testing/selftests/drivers/net/hw/Makefile
> +++ b/tools/testing/selftests/drivers/net/hw/Makefile
> @@ -9,6 +9,7 @@ TEST_PROGS = \
>  	hw_stats_l3.sh \
>  	hw_stats_l3_gre.sh \
>  	loopback.sh \
> +	pp_alloc_fail.py \
>  	#
>  
>  TEST_FILES := \
> diff --git a/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
> new file mode 100755
> index 000000000000..026d98976c35
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
> @@ -0,0 +1,129 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +
> +import time
> +import os
> +from lib.py import ksft_run, ksft_exit, ksft_pr
> +from lib.py import KsftSkipEx, KsftFailEx
> +from lib.py import NetdevFamily, NlError
> +from lib.py import NetDrvEpEnv
> +from lib.py import cmd, tool, GenerateTraffic
> +
> +
> +def _write_fail_config(config):
> +    for key, value in config.items():
> +        with open("/sys/kernel/debug/fail_function/" + key, "w") as fp:
> +            fp.write(str(value) + "\n")
> +
> +
> +def _enable_pp_allocation_fail():
> +    if not os.path.exists("/sys/kernel/debug/fail_function"):
> +        raise KsftSkipEx("Kernel built without function error injection (or DebugFS)")
> +
> +    if not os.path.exists("/sys/kernel/debug/fail_function/page_pool_alloc_pages"):
> +        with open("/sys/kernel/debug/fail_function/inject", "w") as fp:
> +            fp.write("page_pool_alloc_pages\n")
> +
> +    _write_fail_config({
> +        "verbose": 0,
> +        "interval": 511,
> +        "probability": 100,
> +        "times": -1,
> +    })
> +
> +
> +def _disable_pp_allocation_fail():
> +    if not os.path.exists("/sys/kernel/debug/fail_function"):
> +        return
> +
> +    if os.path.exists("/sys/kernel/debug/fail_function/page_pool_alloc_pages"):
> +        with open("/sys/kernel/debug/fail_function/inject", "w") as fp:
> +            fp.write("\n")
> +
> +    _write_fail_config({
> +        "probability": 0,
> +        "times": 0,
> +    })
> +
> +
> +def test_pp_alloc(cfg, netdevnl):
> +    def get_stats():
> +        return netdevnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
> +
> +    def check_traffic_flowing():
> +        stat1 = get_stats()
> +        time.sleep(1)
> +        stat2 = get_stats()
> +        if stat2['rx-packets'] - stat1['rx-packets'] < 15000:
> +            raise KsftFailEx("Traffic seems low:", stat2['rx-packets'] - stat1['rx-packets'])
> +
> +
> +    try:
> +        stats = get_stats()
> +    except NlError as e:
> +        if e.nl_msg.error == -95:
> +            stats = {}
> +        else:
> +            raise
> +    if 'rx-alloc-fail' not in stats:
> +        raise KsftSkipEx("Driver does not report 'rx-alloc-fail' via qstats")
> +
> +    set_g = False
> +    traffic = None
> +    try:
> +        traffic = GenerateTraffic(cfg)
> +
> +        check_traffic_flowing()
> +
> +        _enable_pp_allocation_fail()
> +
> +        s1 = get_stats()
> +        time.sleep(3)
> +        s2 = get_stats()
> +
> +        if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < 1:
> +            raise KsftSkipEx("Allocation failures not increasing")
> +        if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < 100:
> +            raise KsftSkipEx("Allocation increasing too slowly", s2['rx-alloc-fail'] - s1['rx-alloc-fail'],
> +                             "packets:", s2['rx-packets'] - s1['rx-packets'])
> +
> +        # Basic failures are fine, try to wobble some settings to catch extra failures
> +        check_traffic_flowing()
> +        g = tool("ethtool", "-g " + cfg.ifname, json=True)[0]
> +        if 'rx' in g and g["rx"] * 2 <= g["rx-max"]:
> +            new_g = g['rx'] * 2
> +        elif 'rx' in g:
> +            new_g = g['rx'] // 2
> +        else:
> +            new_g = None
> +
> +        if new_g:
> +            set_g = cmd(f"ethtool -G {cfg.ifname} rx {new_g}", fail=False).ret == 0
> +            if set_g:
> +                ksft_pr("ethtool -G change retval: success")
> +            else:
> +                ksft_pr("ethtool -G change retval: did not succeed", new_g)
> +        else:
> +                ksft_pr("ethtool -G change retval: did not try")
> +
> +        time.sleep(0.1)
> +        check_traffic_flowing()
> +    finally:
> +        _disable_pp_allocation_fail()
> +        if traffic:
> +            traffic.stop()

Very cool!

Eventually probably want a more generic fault injection class.

And for both fault injection and background traffic the with object
construct to ensure cleanup in all cases.

Maybe even the same for ethtool, as ip and ethtool config changes that
need to be reverted to original state will be common.

To be clear, not at all suggesting to revise this series for that.

> +        time.sleep(0.1)
> +        if set_g:
> +            cmd(f"ethtool -G {cfg.ifname} rx {g['rx']}")
> +
> +
> +def main() -> None:
> +    netdevnl = NetdevFamily()
> +    with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
> +
> +        ksft_run([test_pp_alloc], args=(cfg, netdevnl, ))
> +    ksft_exit()
> +
> +
> +if __name__ == "__main__":
> +    main()
> diff --git a/tools/testing/selftests/net/lib/py/ksft.py b/tools/testing/selftests/net/lib/py/ksft.py
> index f84e9fdd0032..4769b4eb1ea1 100644
> --- a/tools/testing/selftests/net/lib/py/ksft.py
> +++ b/tools/testing/selftests/net/lib/py/ksft.py
> @@ -11,6 +11,10 @@ KSFT_RESULT = None
>  KSFT_RESULT_ALL = True
>  
>  
> +class KsftFailEx(Exception):
> +    pass
> +
> +
>  class KsftSkipEx(Exception):
>      pass
>  
> -- 
> 2.44.0
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 3/6] selftests: net: py: extract tool logic
  2024-04-26 23:23 ` [PATCH net-next 3/6] selftests: net: py: extract tool logic Jakub Kicinski
@ 2024-04-27 13:51   ` Willem de Bruijn
  2024-04-29 14:44     ` Jakub Kicinski
  0 siblings, 1 reply; 19+ messages in thread
From: Willem de Bruijn @ 2024-04-27 13:51 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

Jakub Kicinski wrote:
> The main use of the ip() wrapper over cmd() is that it can parse JSON.
> cmd("ip -j link show") will return stdout as a string, and test has
> to call json.loads(). With ip("link show", json=True) the return value
> will be already parsed.
> 
> More tools (ethtool, bpftool etc.) support the --json switch.
> To avoid having to wrap all of them individually create a tool()
> helper.
> 
> Switch from -j to --json (for ethtool).
> While at it consume the netns attribute at the ip() level.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  tools/testing/selftests/net/lib/py/utils.py | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/lib/py/utils.py b/tools/testing/selftests/net/lib/py/utils.py
> index d3715e6c21f2..11b588a2bb9d 100644
> --- a/tools/testing/selftests/net/lib/py/utils.py
> +++ b/tools/testing/selftests/net/lib/py/utils.py
> @@ -56,10 +56,10 @@ import time
>          return self.process(terminate=self.terminate)
>  
>  
> -def ip(args, json=None, ns=None, host=None):
> -    cmd_str = "ip "
> +def tool(name, args, json=None, ns=None, host=None):
> +    cmd_str = name + " "
>      if json:
> -        cmd_str += '-j '
> +        cmd_str += '--json '
>      cmd_str += args
>      cmd_obj = cmd(cmd_str, ns=ns, host=host)
>      if json:
> @@ -67,6 +67,12 @@ import time
>      return cmd_obj
>  
>  
> +def ip(args, json=None, ns=None, host=None):
> +    if ns:
> +        args = '-netns ' + ns + " " + args

Minor: inconsistent use of single and double comma strings. Maybe
there's a reasoning that I'm just missing.

> +    return tool("ip", args, json=json, host=host)
> +
> +
>  def rand_port():
>      """
>      Get unprivileged port, for now just random, one day we may decide to check if used.
> -- 
> 2.44.0
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 0/6] selftests: net: page_poll allocation error injection
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
                   ` (5 preceding siblings ...)
  2024-04-26 23:23 ` [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool Jakub Kicinski
@ 2024-04-27 13:55 ` Willem de Bruijn
  2024-04-29 15:01 ` Andrew Lunn
  7 siblings, 0 replies; 19+ messages in thread
From: Willem de Bruijn @ 2024-04-27 13:55 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	Jakub Kicinski

Jakub Kicinski wrote:
> Add a test for exercising driver memory allocation failure paths.
> page pool is a bit tricky to inject errors into at the page allocator
> level because of the bulk alloc and recycling, so add explicit error
> injection support "in front" of the caches.
> 
> Add a test to exercise that using only the standard APIs.
> This is the first useful test for the new tests with an endpoint.
> There's no point testing netdevsim here, so this is also the first
> HW-only test in Python.
> 
> I'm not super happy with the traffic generation using iperf3,
> my initial approach was to use mausezahn. But it turned out to be
> 5x slower in terms of PPS. Hopefully this is good enough for now.
> 
> Jakub Kicinski (6):
>   net: page_pool: support error injection
>   selftests: drv-net-hw: support using Python from net hw tests
>   selftests: net: py: extract tool logic
>   selftests: net: py: avoid all ports < 10k
>   selftests: drv-net: support generating iperf3 load
>   selftests: drv-net-hw: add test for memory allocation failures with
>     page pool

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 3/6] selftests: net: py: extract tool logic
  2024-04-27 13:51   ` Willem de Bruijn
@ 2024-04-29 14:44     ` Jakub Kicinski
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-29 14:44 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: davem, netdev, edumazet, pabeni, linux-kselftest

On Sat, 27 Apr 2024 09:51:01 -0400 Willem de Bruijn wrote:
> > +def ip(args, json=None, ns=None, host=None):
> > +    if ns:
> > +        args = '-netns ' + ns + " " + args  
> 
> Minor: inconsistent use of single and double comma strings. Maybe
> there's a reasoning that I'm just missing.

I also need to coerce ns to be a string, it may be class NetNS.
v2 coming in 3... 2... 1...

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-27 13:49   ` Willem de Bruijn
@ 2024-04-29 14:51     ` Jakub Kicinski
  2024-04-29 15:02       ` Willem de Bruijn
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-29 14:51 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: davem, netdev, edumazet, pabeni, linux-kselftest

On Sat, 27 Apr 2024 09:49:28 -0400 Willem de Bruijn wrote:
> Eventually probably want a more generic fault injection class.
> 
> And for both fault injection and background traffic the with object
> construct to ensure cleanup in all cases.
> 
> Maybe even the same for ethtool, as ip and ethtool config changes that
> need to be reverted to original state will be common.

Agreed, the nice way of wrapping all that has not revealed itself to me
yet. When we discussed it with Petr a while back he was suggesting
"with", and I was thinking of creating an object with test as the
parent. The with is nicer but here we'd end up doing:

	with a():
		# some code
		with b():
			# more code
				with c():
					# check traffic

which offends my sensibilities.

There are many options, hard to say which one is best without having 
a bunch of tests to convert as a litmus test :S So I stuck to "finally"

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 0/6] selftests: net: page_poll allocation error injection
  2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
                   ` (6 preceding siblings ...)
  2024-04-27 13:55 ` [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Willem de Bruijn
@ 2024-04-29 15:01 ` Andrew Lunn
  2024-04-29 16:25   ` Jakub Kicinski
  7 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2024-04-29 15:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, linux-kselftest,
	willemdebruijn.kernel

> I'm not super happy with the traffic generation using iperf3,
> my initial approach was to use mausezahn. But it turned out to be
> 5x slower in terms of PPS. Hopefully this is good enough for now.

How important is PPS? In order to get 'Maintained' status, automotive
vendors are going to want to test their 10Mbps T1 links.

	Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-29 14:51     ` Jakub Kicinski
@ 2024-04-29 15:02       ` Willem de Bruijn
  2024-04-29 16:27         ` Jakub Kicinski
  0 siblings, 1 reply; 19+ messages in thread
From: Willem de Bruijn @ 2024-04-29 15:02 UTC (permalink / raw)
  To: Jakub Kicinski, Willem de Bruijn
  Cc: davem, netdev, edumazet, pabeni, linux-kselftest

Jakub Kicinski wrote:
> On Sat, 27 Apr 2024 09:49:28 -0400 Willem de Bruijn wrote:
> > Eventually probably want a more generic fault injection class.
> > 
> > And for both fault injection and background traffic the with object
> > construct to ensure cleanup in all cases.
> > 
> > Maybe even the same for ethtool, as ip and ethtool config changes that
> > need to be reverted to original state will be common.
> 
> Agreed, the nice way of wrapping all that has not revealed itself to me
> yet. When we discussed it with Petr a while back he was suggesting
> "with", and I was thinking of creating an object with test as the
> parent. The with is nicer but here we'd end up doing:
> 
> 	with a():
> 		# some code
> 		with b():
> 			# more code
> 				with c():
> 					# check traffic
> 
> which offends my sensibilities.
> 
> There are many options, hard to say which one is best without having 
> a bunch of tests to convert as a litmus test :S So I stuck to "finally"

Entirely reasonable.

Btw, I have a preliminary tools/testing/selftests/net/csum test on
top of this series.

The only interesting points so far are the use of deploy (which I
assume you have on some internal patch already) and that with bkg
would not fail the test if the background process exits with error.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-26 23:23 ` [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool Jakub Kicinski
  2024-04-27 13:49   ` Willem de Bruijn
@ 2024-04-29 15:12   ` Andrew Lunn
  2024-04-29 16:16     ` Jakub Kicinski
  1 sibling, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2024-04-29 15:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, linux-kselftest,
	willemdebruijn.kernel

On Fri, Apr 26, 2024 at 04:23:59PM -0700, Jakub Kicinski wrote:
> Bugs in memory allocation failure paths are quite common.
> Add a test exercising those paths based on qstat and page pool
> failure hook.
> 
> Running on bnxt:
> 
>   # ./drivers/net/hw/pp_alloc_fail.py
>   KTAP version 1
>   1..1
>   # ethtool -G change retval: success
>   ok 1 pp_alloc_fail.test_pp_alloc
>   # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

If i'm reading the traffic generator correctly, this test runs for 24
hours. Do we want some sort of warning here about the test duration?
Some sort of alive indication very minute?

     Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-29 15:12   ` Andrew Lunn
@ 2024-04-29 16:16     ` Jakub Kicinski
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-29 16:16 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem, netdev, edumazet, pabeni, linux-kselftest,
	willemdebruijn.kernel

On Mon, 29 Apr 2024 17:12:29 +0200 Andrew Lunn wrote:
> >   # ./drivers/net/hw/pp_alloc_fail.py
> >   KTAP version 1
> >   1..1
> >   # ethtool -G change retval: success
> >   ok 1 pp_alloc_fail.test_pp_alloc
> >   # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0  
> 
> If i'm reading the traffic generator correctly, this test runs for 24
> hours. Do we want some sort of warning here about the test duration?
> Some sort of alive indication very minute?

That's just the max value for the time param.
The generator is stopped / killed after we go thru the test steps.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 0/6] selftests: net: page_poll allocation error injection
  2024-04-29 15:01 ` Andrew Lunn
@ 2024-04-29 16:25   ` Jakub Kicinski
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-29 16:25 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem, netdev, edumazet, pabeni, linux-kselftest,
	willemdebruijn.kernel

On Mon, 29 Apr 2024 17:01:55 +0200 Andrew Lunn wrote:
> > I'm not super happy with the traffic generation using iperf3,
> > my initial approach was to use mausezahn. But it turned out to be
> > 5x slower in terms of PPS. Hopefully this is good enough for now.  
> 
> How important is PPS? In order to get 'Maintained' status, automotive
> vendors are going to want to test their 10Mbps T1 links.

s/Maintained/Supported/ ?

PPS isn't important in itself, that said, I wanted to set a floor to
make sure that the failure path is actually well exercised. 
Some drivers may be doing internal recycling or whatever other magic,
which would make them barely call the page_pool alloc.

Even though this is not a performance tests the check is based on
expected perf. My thinking is that once we have some data points about
various system we can abstract the perf expectations a bit more
systematically than if speed < 10GE: pps //= 10

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool
  2024-04-29 15:02       ` Willem de Bruijn
@ 2024-04-29 16:27         ` Jakub Kicinski
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2024-04-29 16:27 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: davem, netdev, edumazet, pabeni, linux-kselftest

On Mon, 29 Apr 2024 11:02:51 -0400 Willem de Bruijn wrote:
> The only interesting points so far are the use of deploy (which I
> assume you have on some internal patch already)

Yup, they need a touch more cleaning up but the PSP tests use it.

> and that with bkg would not fail the test if the background process
> exits with error.

Ah, that's a bug, yes. We should record the value of fail from 
the constructor and use it in __exit__().

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next 1/6] net: page_pool: support error injection
  2024-04-26 23:23 ` [PATCH net-next 1/6] net: page_pool: support " Jakub Kicinski
@ 2024-04-30 15:27   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2024-04-30 15:27 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, linux-kselftest, willemdebruijn.kernel,
	ilias.apalodimas



On 27/04/2024 01.23, Jakub Kicinski wrote:
> Because of caching / recycling using the general page allocation
> failures to induce errors in page pool allocation is very hard.
> Add direct error injection support to page_pool_alloc_pages().
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>


Sounds good to me :-)

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>

> ---
> CC: hawk@kernel.org
> CC: ilias.apalodimas@linaro.org
> ---
>   net/core/page_pool.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 273c24429bce..8bcc7014a61a 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -5,6 +5,7 @@
>    *	Copyright (C) 2016 Red Hat, Inc.
>    */
>   
> +#include <linux/error-injection.h>
>   #include <linux/types.h>
>   #include <linux/kernel.h>
>   #include <linux/slab.h>
> @@ -550,6 +551,7 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>   	return page;
>   }
>   EXPORT_SYMBOL(page_pool_alloc_pages);
> +ALLOW_ERROR_INJECTION(page_pool_alloc_pages, NULL);
>   
>   /* Calculate distance between two u32 values, valid if distance is below 2^(31)
>    *  https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-04-30 15:27 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-26 23:23 [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Jakub Kicinski
2024-04-26 23:23 ` [PATCH net-next 1/6] net: page_pool: support " Jakub Kicinski
2024-04-30 15:27   ` Jesper Dangaard Brouer
2024-04-26 23:23 ` [PATCH net-next 2/6] selftests: drv-net-hw: support using Python from net hw tests Jakub Kicinski
2024-04-26 23:23 ` [PATCH net-next 3/6] selftests: net: py: extract tool logic Jakub Kicinski
2024-04-27 13:51   ` Willem de Bruijn
2024-04-29 14:44     ` Jakub Kicinski
2024-04-26 23:23 ` [PATCH net-next 4/6] selftests: net: py: avoid all ports < 10k Jakub Kicinski
2024-04-26 23:23 ` [PATCH net-next 5/6] selftests: drv-net: support generating iperf3 load Jakub Kicinski
2024-04-26 23:23 ` [PATCH net-next 6/6] selftests: drv-net-hw: add test for memory allocation failures with page pool Jakub Kicinski
2024-04-27 13:49   ` Willem de Bruijn
2024-04-29 14:51     ` Jakub Kicinski
2024-04-29 15:02       ` Willem de Bruijn
2024-04-29 16:27         ` Jakub Kicinski
2024-04-29 15:12   ` Andrew Lunn
2024-04-29 16:16     ` Jakub Kicinski
2024-04-27 13:55 ` [PATCH net-next 0/6] selftests: net: page_poll allocation error injection Willem de Bruijn
2024-04-29 15:01 ` Andrew Lunn
2024-04-29 16:25   ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).