* [PATCH net-next v2 0/5] ynl/ethtool/netlink: fix nla_len overflow for large string sets
@ 2026-04-08 7:08 Hangbin Liu
2026-04-08 7:08 ` [PATCH net-next v2 1/5] tools: ynl: move ethtool.py to selftest Hangbin Liu
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Hangbin Liu @ 2026-04-08 7:08 UTC
To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Andrew Lunn
Cc: netdev, linux-kernel, Hangbin Liu
This series addresses a silent data corruption issue triggered when ynl
retrieves string sets from NICs with a large number of statistics entries
(e.g. mlx5_core with thousands of ETH_SS_STATS strings).
The root cause is that struct nlattr.nla_len is a __u16 (max 65535
bytes). When a NIC exports enough statistics strings, the
ETHTOOL_A_STRINGSET_STRINGS nest built by strset_fill_set() exceeds
this limit. nla_nest_end() silently truncates the length on assignment,
producing a corrupted netlink message.
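To make the truncation concrete, here is a small Python sketch of the arithmetic. The string count and the per-string size are made-up illustrative numbers, not measurements taken from mlx5_core, and stored_nla_len() is a hypothetical name that only models what assigning to a __u16 does.

```python
U16_MAX = 0xFFFF  # largest value a __u16 nla_len can hold


def stored_nla_len(real_len: int) -> int:
    """Model assigning an int to a __u16 field: only the low 16 bits survive."""
    return real_len & U16_MAX


# Hypothetical nest: 3000 strings, each taking 40 bytes once wrapped in
# its own attributes, plus the 4-byte header of the enclosing nest.
real_len = 4 + 3000 * 40
print(real_len, stored_nla_len(real_len))  # prints: 120004 54468
```

The receiver only ever sees the second number, so it has no way to tell that the nest was larger than 65535 bytes.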
Patch 1 moves ethtool.py to selftest.
Patch 2 improves the ethtool tool: rename the doit/dumpit helpers
to do_set/do_get and convert do_get to use ynl.do() with an
explicit device header instead of a full dump with client-side filtering.
Patch 3 adds a --dbg-small-recv option to the YNL ethtool tool,
matching the same option already present in cli.py, to help debug netlink
message size issues.
Patch 4 adds a new helper, nla_nest_end_safe(), which checks whether the
nest length would overflow nla_len and returns -EMSGSIZE early if so.
Patch 5 uses the new helper in ethtool so that ethtool never replies with
a corrupted netlink message.
---
Changes in v2:
- move ethtool.py to selftest (Jakub Kicinski)
- add a new helper nla_nest_end_safe (Jakub Kicinski)
- Link to v1: https://lore.kernel.org/r/20260331-b4-ynl_ethtool-v1-0-dda2a9b55df8@gmail.com
---
Hangbin Liu (5):
tools: ynl: move ethtool.py to selftest
tools: ynl: ethtool: use doit instead of dumpit for per-device GET
tools: ynl: ethtool: add --dbg-small-recv option
netlink: add a nla_nest_end_safe() helper
ethtool: strset: check nla_len overflow
include/net/netlink.h | 19 ++++++++
net/ethtool/strset.c | 3 +-
tools/net/ynl/tests/Makefile | 5 +-
tools/net/ynl/{pyynl => tests}/ethtool.py | 79 ++++++++++++++++---------------
tools/net/ynl/tests/test_ynl_ethtool.sh | 2 +-
5 files changed, 68 insertions(+), 40 deletions(-)
---
base-commit: e65d8b6f3092398efd7c74e722cb7a516d9a0d6d
change-id: 20260324-b4-ynl_ethtool-f87cd42f572c
Best regards,
--
Hangbin Liu <liuhangbin@gmail.com>
* [PATCH net-next v2 1/5] tools: ynl: move ethtool.py to selftest
From: Hangbin Liu @ 2026-04-08 7:08 UTC
To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Andrew Lunn
Cc: netdev, linux-kernel, Hangbin Liu
We have converted all the samples to selftests. This script is
the last piece of random "PoC" code we still have lying around.
Let's move it to tests.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
tools/net/ynl/tests/Makefile | 5 ++++-
tools/net/ynl/{pyynl => tests}/ethtool.py | 2 +-
tools/net/ynl/tests/test_ynl_ethtool.sh | 2 +-
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/net/ynl/tests/Makefile b/tools/net/ynl/tests/Makefile
index 2a02958c7039..94bf0346b54d 100644
--- a/tools/net/ynl/tests/Makefile
+++ b/tools/net/ynl/tests/Makefile
@@ -36,7 +36,10 @@ TEST_GEN_FILES := \
rt-route \
# end of TEST_GEN_FILES
-TEST_FILES := ynl_nsim_lib.sh
+TEST_FILES := \
+ ethtool.py \
+ ynl_nsim_lib.sh \
+# end of TEST_FILES
CFLAGS_netdev:=$(CFLAGS_netdev) $(CFLAGS_rt-link)
CFLAGS_ovs:=$(CFLAGS_ovs_datapath)
diff --git a/tools/net/ynl/pyynl/ethtool.py b/tools/net/ynl/tests/ethtool.py
similarity index 99%
rename from tools/net/ynl/pyynl/ethtool.py
rename to tools/net/ynl/tests/ethtool.py
index f1a2a2a89985..6eeeb867edcf 100755
--- a/tools/net/ynl/pyynl/ethtool.py
+++ b/tools/net/ynl/tests/ethtool.py
@@ -14,7 +14,7 @@ import re
import os
# pylint: disable=no-name-in-module,wrong-import-position
-sys.path.append(pathlib.Path(__file__).resolve().parent.as_posix())
+sys.path.append(pathlib.Path(__file__).resolve().parent.parent.joinpath('pyynl').as_posix())
# pylint: disable=import-error
from cli import schema_dir, spec_dir
from lib import YnlFamily
diff --git a/tools/net/ynl/tests/test_ynl_ethtool.sh b/tools/net/ynl/tests/test_ynl_ethtool.sh
index b826269017f4..b4480e9be7b7 100755
--- a/tools/net/ynl/tests/test_ynl_ethtool.sh
+++ b/tools/net/ynl/tests/test_ynl_ethtool.sh
@@ -8,7 +8,7 @@ KSELFTEST_KTAP_HELPERS="$(dirname "$(realpath "$0")")/../../../testing/selftests
source "$KSELFTEST_KTAP_HELPERS"
# Default ynl-ethtool path for direct execution, can be overridden by make install
-ynl_ethtool="../pyynl/ethtool.py"
+ynl_ethtool="./ethtool.py"
readonly NSIM_ID="1337"
readonly NSIM_DEV_NAME="nsim${NSIM_ID}"
--
* [PATCH net-next v2 2/5] tools: ynl: ethtool: use doit instead of dumpit for per-device GET
From: Hangbin Liu @ 2026-04-08 7:08 UTC
To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Andrew Lunn
Cc: netdev, linux-kernel, Hangbin Liu
Rename the local helper doit() to do_set() and dumpit() to do_get() to
better reflect their purpose.
Convert do_get() to use ynl.do() with an explicit device header instead
of ynl.dump() followed by client-side filtering. This is more efficient
as the kernel only processes and returns data for the requested device,
rather than dumping all devices across the netns.
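As a rough illustration of how the per-device request is assembled, the sketch below mirrors the merge logic of do_get() as a standalone function. build_get_request() is a hypothetical name, the device name and attribute values are placeholders, and unlike the patch it works on a copy of the caller's dict so the caller's argument is left untouched.

```python
def build_get_request(device, extra=None):
    """Return a per-device request dict, merging caller-supplied fields."""
    extra = dict(extra or {})                       # shallow copy of the caller's dict
    req = {'header': {'dev-name': device}}
    req['header'].update(extra.pop('header', {}))   # merge extra header fields
    req.update(extra)                               # remaining top-level attributes
    return req


# Placeholder values, for illustration only.
print(build_get_request('eth0'))
print(build_get_request('eth0', {'header': {'flags': 'stats'},
                                 'groups': {'size': 1}}))
```

Because the device name is already in the header, the reply can be used directly and no client-side filtering loop is needed.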
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
tools/net/ynl/tests/ethtool.py | 68 ++++++++++++++++++++----------------------
1 file changed, 33 insertions(+), 35 deletions(-)
diff --git a/tools/net/ynl/tests/ethtool.py b/tools/net/ynl/tests/ethtool.py
index 6eeeb867edcf..63854d21818c 100755
--- a/tools/net/ynl/tests/ethtool.py
+++ b/tools/net/ynl/tests/ethtool.py
@@ -84,9 +84,9 @@ def print_speed(name, value):
speed = [ k for k, v in value.items() if v and speed_re.match(k) ]
print(f'{name}: {" ".join(speed)}')
-def doit(ynl, args, op_name):
+def do_set(ynl, args, op_name):
"""
- Prepare request header, parse arguments and doit.
+ Prepare request header, parse arguments and do a set operation.
"""
req = {
'header': {
@@ -97,26 +97,24 @@ def doit(ynl, args, op_name):
args_to_req(ynl, op_name, args.args, req)
ynl.do(op_name, req)
-def dumpit(ynl, args, op_name, extra=None):
+def do_get(ynl, args, op_name, extra=None):
"""
- Prepare request header, parse arguments and dumpit (filtering out the
- devices we're not interested in).
+ Prepare request header and get info for a specific device using doit.
"""
extra = extra or {}
- reply = ynl.dump(op_name, { 'header': {} } | extra)
+ req = {'header': {'dev-name': args.device}}
+ req['header'].update(extra.pop('header', {}))
+ req.update(extra)
+
+ reply = ynl.do(op_name, req)
if not reply:
return {}
- for msg in reply:
- if msg['header']['dev-name'] == args.device:
- if args.json:
- pprint.PrettyPrinter().pprint(msg)
- sys.exit(0)
- msg.pop('header', None)
- return msg
-
- print(f"Not supported for device {args.device}")
- sys.exit(1)
+ if args.json:
+ pprint.PrettyPrinter().pprint(reply)
+ sys.exit(0)
+ reply.pop('header', None)
+ return reply
def bits_to_dict(attr):
"""
@@ -181,15 +179,15 @@ def main():
return
if args.set_eee:
- doit(ynl, args, 'eee-set')
+ do_set(ynl, args, 'eee-set')
return
if args.set_pause:
- doit(ynl, args, 'pause-set')
+ do_set(ynl, args, 'pause-set')
return
if args.set_coalesce:
- doit(ynl, args, 'coalesce-set')
+ do_set(ynl, args, 'coalesce-set')
return
if args.set_features:
@@ -198,20 +196,20 @@ def main():
return
if args.set_channels:
- doit(ynl, args, 'channels-set')
+ do_set(ynl, args, 'channels-set')
return
if args.set_ring:
- doit(ynl, args, 'rings-set')
+ do_set(ynl, args, 'rings-set')
return
if args.show_priv_flags:
- flags = bits_to_dict(dumpit(ynl, args, 'privflags-get')['flags'])
+ flags = bits_to_dict(do_get(ynl, args, 'privflags-get')['flags'])
print_field(flags)
return
if args.show_eee:
- eee = dumpit(ynl, args, 'eee-get')
+ eee = do_get(ynl, args, 'eee-get')
ours = bits_to_dict(eee['modes-ours'])
peer = bits_to_dict(eee['modes-peer'])
@@ -232,18 +230,18 @@ def main():
return
if args.show_pause:
- print_field(dumpit(ynl, args, 'pause-get'),
+ print_field(do_get(ynl, args, 'pause-get'),
('autoneg', 'Autonegotiate', 'bool'),
('rx', 'RX', 'bool'),
('tx', 'TX', 'bool'))
return
if args.show_coalesce:
- print_field(dumpit(ynl, args, 'coalesce-get'))
+ print_field(do_get(ynl, args, 'coalesce-get'))
return
if args.show_features:
- reply = dumpit(ynl, args, 'features-get')
+ reply = do_get(ynl, args, 'features-get')
available = bits_to_dict(reply['hw'])
requested = bits_to_dict(reply['wanted']).keys()
active = bits_to_dict(reply['active']).keys()
@@ -270,7 +268,7 @@ def main():
return
if args.show_channels:
- reply = dumpit(ynl, args, 'channels-get')
+ reply = do_get(ynl, args, 'channels-get')
print(f'Channel parameters for {args.device}:')
print('Pre-set maximums:')
@@ -290,7 +288,7 @@ def main():
return
if args.show_ring:
- reply = dumpit(ynl, args, 'channels-get')
+ reply = do_get(ynl, args, 'channels-get')
print(f'Ring parameters for {args.device}:')
@@ -319,7 +317,7 @@ def main():
print('NIC statistics:')
# TODO: pass id?
- strset = dumpit(ynl, args, 'strset-get')
+ strset = do_get(ynl, args, 'strset-get')
pprint.PrettyPrinter().pprint(strset)
req = {
@@ -338,7 +336,7 @@ def main():
},
}
- rsp = dumpit(ynl, args, 'stats-get', req)
+ rsp = do_get(ynl, args, 'stats-get', req)
pprint.PrettyPrinter().pprint(rsp)
return
@@ -349,7 +347,7 @@ def main():
},
}
- tsinfo = dumpit(ynl, args, 'tsinfo-get', req)
+ tsinfo = do_get(ynl, args, 'tsinfo-get', req)
print(f'Time stamping parameters for {args.device}:')
@@ -377,7 +375,7 @@ def main():
return
print(f'Settings for {args.device}:')
- linkmodes = dumpit(ynl, args, 'linkmodes-get')
+ linkmodes = do_get(ynl, args, 'linkmodes-get')
ours = bits_to_dict(linkmodes['ours'])
supported_ports = ('TP', 'AUI', 'BNC', 'MII', 'FIBRE', 'Backplane')
@@ -425,7 +423,7 @@ def main():
5: 'Directly Attached Copper',
0xef: 'None',
}
- linkinfo = dumpit(ynl, args, 'linkinfo-get')
+ linkinfo = do_get(ynl, args, 'linkinfo-get')
print(f'Port: {ports.get(linkinfo["port"], "Other")}')
print_field(linkinfo, ('phyaddr', 'PHYAD'))
@@ -447,11 +445,11 @@ def main():
mdix = mdix_ctrl.get(linkinfo['tp-mdix'], 'Unknown (auto)')
print(f'MDI-X: {mdix}')
- debug = dumpit(ynl, args, 'debug-get')
+ debug = do_get(ynl, args, 'debug-get')
msgmask = bits_to_dict(debug.get("msgmask", [])).keys()
print(f'Current message level: {" ".join(msgmask)}')
- linkstate = dumpit(ynl, args, 'linkstate-get')
+ linkstate = do_get(ynl, args, 'linkstate-get')
detected_states = {
0: 'no',
1: 'yes',
--
* [PATCH net-next v2 3/5] tools: ynl: ethtool: add --dbg-small-recv option
From: Hangbin Liu @ 2026-04-08 7:08 UTC
To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Andrew Lunn
Cc: netdev, linux-kernel, Hangbin Liu
Add a --dbg-small-recv debug option to control the recv() buffer size
used by YNL, matching the same option already present in cli.py. This
is useful when the user needs to receive large netlink messages.
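For readers unfamiliar with the nargs='?' plus const combination, the following standalone sketch shows the three values the option can take. It reuses the same add_argument() parameters as the patch (minus the help text and the argument group); recv_size() is a hypothetical helper added only for the demonstration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--dbg-small-recv', default=0, const=4000,
                    action='store', nargs='?', type=int, metavar='INT')


def recv_size(argv):
    """Parse argv and return the value that would be passed as recv_size."""
    return parser.parse_args(argv).dbg_small_recv


print(recv_size([]))                           # option absent: default (0)
print(recv_size(['--dbg-small-recv']))         # bare option: const (4000)
print(recv_size(['--dbg-small-recv', '512']))  # explicit value (512)
```

A value of 0 is falsy, so the `if args.dbg_small_recv:` check in the patch only enables receive debugging when the option was actually given.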
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
tools/net/ynl/tests/ethtool.py | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/net/ynl/tests/ethtool.py b/tools/net/ynl/tests/ethtool.py
index 63854d21818c..db3b62c652e7 100755
--- a/tools/net/ynl/tests/ethtool.py
+++ b/tools/net/ynl/tests/ethtool.py
@@ -166,12 +166,19 @@ def main():
parser.add_argument('device', metavar='device', type=str)
parser.add_argument('args', metavar='args', type=str, nargs='*')
+ dbg_group = parser.add_argument_group('Debug options')
+ dbg_group.add_argument('--dbg-small-recv', default=0, const=4000,
+ action='store', nargs='?', type=int, metavar='INT',
+ help="Length of buffers used for recv()")
+
args = parser.parse_args()
spec = os.path.join(spec_dir(), 'ethtool.yaml')
schema = os.path.join(schema_dir(), 'genetlink-legacy.yaml')
- ynl = YnlFamily(spec, schema)
+ ynl = YnlFamily(spec, schema, recv_size=args.dbg_small_recv)
+ if args.dbg_small_recv:
+ ynl.set_recv_dbg(True)
if args.set_priv_flags:
# TODO: parse the bitmask
--
* [PATCH net-next v2 4/5] netlink: add a nla_nest_end_safe() helper
From: Hangbin Liu @ 2026-04-08 7:08 UTC
To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Andrew Lunn
Cc: netdev, linux-kernel, Hangbin Liu
The nla_len field in struct nlattr is a __u16, which can only hold
values up to 65535. If a nested attribute grows beyond this limit,
nla_nest_end() silently truncates the length, producing a corrupted
netlink message with no indication of the problem.
Since nla_nest_end() is used everywhere and this issue rarely happens,
let's add a new helper to check the length.
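The intended behaviour of the check can be modelled in a few lines of Python. This is only a userspace model under simplifying assumptions: the skb is reduced to two byte offsets (where the nest header starts and where the tail currently is), and nest_end_safe() is a hypothetical stand-in, not the kernel function.

```python
import errno

U16_MAX = 0xFFFF  # upper bound of a __u16 nla_len


def nest_end_safe(nest_start: int, tail: int) -> int:
    """Return the total length so far, or -EMSGSIZE if the nest is too big.

    nest_start: offset of the nest's attribute header in the buffer
    tail:       offset one past the last byte written so far
    """
    if tail - nest_start > U16_MAX:
        return -errno.EMSGSIZE
    return tail


print(nest_end_safe(100, 100 + 65535))  # fits: returns the tail offset
print(nest_end_safe(100, 100 + 65536))  # one byte too many: negative errno
```

Callers can then treat a negative return value like any other failed put and cancel the message instead of sending a corrupted one.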
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
include/net/netlink.h | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/include/net/netlink.h b/include/net/netlink.h
index 1a8356ca4b78..546d10586576 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -2264,6 +2264,25 @@ static inline int nla_nest_end(struct sk_buff *skb, struct nlattr *start)
return skb->len;
}
+/**
+ * nla_nest_end_safe - Validate and finalize nesting of attributes
+ * @skb: socket buffer the attributes are stored in
+ * @start: container attribute
+ *
+ * Corrects the container attribute header to include all appended
+ * attributes.
+ *
+ * Returns: the total data length of the skb, or -EMSGSIZE if the
+ * nested attribute length exceeds U16_MAX.
+ */
+static inline int nla_nest_end_safe(struct sk_buff *skb, struct nlattr *start)
+{
+ if (skb_tail_pointer(skb) - (unsigned char *)start > U16_MAX)
+ return -EMSGSIZE;
+
+ return nla_nest_end(skb, start);
+}
+
/**
* nla_nest_cancel - Cancel nesting of attributes
* @skb: socket buffer the message is stored in
--
* [PATCH net-next v2 5/5] ethtool: strset: check nla_len overflow
From: Hangbin Liu @ 2026-04-08 7:08 UTC
To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Andrew Lunn
Cc: netdev, linux-kernel, Hangbin Liu
The netlink attribute length field nla_len is a __u16, which can only
represent values up to 65535 bytes. NICs with a large number of
statistics strings (e.g. mlx5_core with thousands of ETH_SS_STATS
entries) can produce an ETHTOOL_A_STRINGSET_STRINGS nest that exceeds
this limit.
When nla_nest_end() writes the actual nest size back to nla_len, the
value is silently truncated. This results in a corrupted netlink message
being sent to userspace: the parser reads a wrong (truncated) attribute
length and misaligns all subsequent attribute boundaries, causing decode
errors.
Fix this by using the new helper nla_nest_end_safe() and erroring out if
the size exceeds U16_MAX.
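The misalignment described above can be reproduced in userspace with a toy encoder and parser. Everything in this sketch is an assumption made for illustration: the attribute types (1, 2, 3), the string contents and count, and little-endian byte order; put_attr() and walk() are hypothetical helpers, not YNL or kernel code.

```python
import struct

NLA_HDRLEN = 4  # u16 nla_len + u16 nla_type


def align4(n: int) -> int:
    return (n + 3) & ~3


def put_attr(attr_type: int, payload: bytes) -> bytes:
    """Encode one attribute; nla_len is masked to 16 bits like a __u16 store."""
    length = NLA_HDRLEN + len(payload)
    pad = b'\0' * (align4(length) - length)
    return struct.pack('<HH', length & 0xFFFF, attr_type) + payload + pad


def walk(buf: bytes):
    """Return (offset, type, nla_len) for each attribute, trusting nla_len."""
    out, off = [], 0
    while off + NLA_HDRLEN <= len(buf):
        nla_len, nla_type = struct.unpack_from('<HH', buf, off)
        if nla_len < NLA_HDRLEN or off + nla_len > len(buf):
            break  # malformed attribute, stop parsing
        out.append((off, nla_type, nla_len))
        off += align4(nla_len)
    return out


# 2000 inner attributes of 40 bytes each: an 80000-byte payload, too big for u16.
strings = b''.join(put_attr(3, b'x' * 36) for _ in range(2000))
msg = put_attr(1, strings) + put_attr(2, b'next')
sibling_offset = len(msg) - 8  # where the type-2 attribute really starts

for off, nla_type, nla_len in walk(msg):
    print(off, nla_type, nla_len)
print('real sibling offset:', sibling_offset)
```

In this toy run the parser jumps into the middle of the string payload after the first attribute and never reaches the sibling attribute at its real offset, which is the kind of decode error the fix prevents by failing the fill instead.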
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
net/ethtool/strset.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c
index 9271aba8255e..bb1e829ba099 100644
--- a/net/ethtool/strset.c
+++ b/net/ethtool/strset.c
@@ -443,7 +443,8 @@ static int strset_fill_set(struct sk_buff *skb,
if (strset_fill_string(skb, set_info, i) < 0)
goto nla_put_failure;
}
- nla_nest_end(skb, strings_attr);
+ if (nla_nest_end_safe(skb, strings_attr) < 0)
+ goto nla_put_failure;
}
nla_nest_end(skb, stringset_attr);
--