* [PATCH 1/2 net v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump
@ 2026-04-02 7:26 Fernando Fernandez Mancera
2026-04-02 7:26 ` [PATCH 2/2 net v3] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop() Fernando Fernandez Mancera
2026-04-03 22:50 ` [PATCH 1/2 net v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump patchwork-bot+netdevbpf
0 siblings, 2 replies; 4+ messages in thread
From: Fernando Fernandez Mancera @ 2026-04-02 7:26 UTC (permalink / raw)
To: netdev
Cc: idosch, petrm, horms, pabeni, kuba, edumazet, davem, dsahern,
kees, Fernando Fernandez Mancera
Currently NHA_HW_STATS_ENABLE is included twice every time a nexthop
group is dumped with NHA_OP_FLAG_DUMP_STATS. As all the stats querying
was moved to nla_put_nh_group_stats(), keep only the instance of the
attribute emitted there.
Fixes: 5072ae00aea4 ("net: nexthop: Expose nexthop group HW stats to user space")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
v2: patch added
v3: no changes
---
net/ipv4/nexthop.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index c942f1282236..a0c694583299 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -902,8 +902,7 @@ static int nla_put_nh_group(struct sk_buff *skb, struct nexthop *nh,
 		goto nla_put_failure;

 	if (op_flags & NHA_OP_FLAG_DUMP_STATS &&
-	    (nla_put_u32(skb, NHA_HW_STATS_ENABLE, nhg->hw_stats) ||
-	     nla_put_nh_group_stats(skb, nh, op_flags)))
+	    nla_put_nh_group_stats(skb, nh, op_flags))
 		goto nla_put_failure;

 	return 0;
--
2.53.0
* [PATCH 2/2 net v3] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop()
2026-04-02 7:26 [PATCH 1/2 net v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump Fernando Fernandez Mancera
@ 2026-04-02 7:26 ` Fernando Fernandez Mancera
2026-04-03 9:41 ` Ido Schimmel
2026-04-03 22:50 ` [PATCH 1/2 net v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump patchwork-bot+netdevbpf
1 sibling, 1 reply; 4+ messages in thread
From: Fernando Fernandez Mancera @ 2026-04-02 7:26 UTC (permalink / raw)
To: netdev
Cc: idosch, petrm, horms, pabeni, kuba, edumazet, davem, dsahern,
kees, Fernando Fernandez Mancera, Yiming Qian
When querying a nexthop object via RTM_GETNEXTHOP, the kernel currently
allocates a fixed-size skb using NLMSG_GOODSIZE. While sufficient for
single nexthops and small Equal-Cost Multi-Path groups, this fixed
allocation is too small for large nexthop groups, such as a group with
512 nexthops.
This results in the following warning splat:
WARNING: net/ipv4/nexthop.c:3395 at rtm_get_nexthop+0x176/0x1c0, CPU#20: rep/4608
[...]
RIP: 0010:rtm_get_nexthop (net/ipv4/nexthop.c:3395)
[...]
Call Trace:
<TASK>
rtnetlink_rcv_msg (net/core/rtnetlink.c:6989)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1894)
____sys_sendmsg (net/socket.c:721 net/socket.c:736 net/socket.c:2585)
___sys_sendmsg (net/socket.c:2641)
__sys_sendmsg (net/socket.c:2671)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
</TASK>
Fix this by computing the required size with nh_nlmsg_size() and
allocating the skb with nlmsg_new(), which is consistent with
nexthop_notify(). In addition, adjust nh_nlmsg_size_grp() so that it
calculates the size needed based on the flags passed. While at it, also
add the size of NHA_FDB to the nexthop group size calculation, as it was
missing too.
This cannot be reproduced via iproute2 as the group size is currently
limited and the command fails as follows:
addattr_l ERROR: message exceeded bound of 1048
Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Closes: https://lore.kernel.org/netdev/CAL_bE8Li2h4KO+AQFXW4S6Yb_u5X4oSKnkywW+LPFjuErhqELA@mail.gmail.com/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
v2: adjust nh_nlmsg_size_grp() to handle size for stats and add symbols
to the trace in commit message
v3: include NHA_FDB size into nh_nlmsg_size_grp() calculation
---
net/ipv4/nexthop.c | 38 +++++++++++++++++++++++++++-----------
1 file changed, 27 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index a0c694583299..2c9036c719b6 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1003,16 +1003,32 @@ static size_t nh_nlmsg_size_grp_res(struct nh_group *nhg)
 		nla_total_size_64bit(8);/* NHA_RES_GROUP_UNBALANCED_TIME */
 }

-static size_t nh_nlmsg_size_grp(struct nexthop *nh)
+static size_t nh_nlmsg_size_grp(struct nexthop *nh, u32 op_flags)
 {
 	struct nh_group *nhg = rtnl_dereference(nh->nh_grp);
 	size_t sz = sizeof(struct nexthop_grp) * nhg->num_nh;
 	size_t tot = nla_total_size(sz) +
-		nla_total_size(2); /* NHA_GROUP_TYPE */
+		nla_total_size(2) + /* NHA_GROUP_TYPE */
+		nla_total_size(0); /* NHA_FDB */

 	if (nhg->resilient)
 		tot += nh_nlmsg_size_grp_res(nhg);

+	if (op_flags & NHA_OP_FLAG_DUMP_STATS) {
+		tot += nla_total_size(0) + /* NHA_GROUP_STATS */
+		       nla_total_size(4); /* NHA_HW_STATS_ENABLE */
+		tot += nhg->num_nh *
+		       (nla_total_size(0) + /* NHA_GROUP_STATS_ENTRY */
+			nla_total_size(4) + /* NHA_GROUP_STATS_ENTRY_ID */
+			nla_total_size_64bit(8)); /* NHA_GROUP_STATS_ENTRY_PACKETS */
+
+		if (op_flags & NHA_OP_FLAG_DUMP_HW_STATS) {
+			tot += nhg->num_nh *
+			       nla_total_size_64bit(8); /* NHA_GROUP_STATS_ENTRY_PACKETS_HW */
+			tot += nla_total_size(4); /* NHA_HW_STATS_USED */
+		}
+	}
+
 	return tot;
 }

@@ -1047,14 +1063,14 @@ static size_t nh_nlmsg_size_single(struct nexthop *nh)
 	return sz;
 }

-static size_t nh_nlmsg_size(struct nexthop *nh)
+static size_t nh_nlmsg_size(struct nexthop *nh, u32 op_flags)
 {
 	size_t sz = NLMSG_ALIGN(sizeof(struct nhmsg));

 	sz += nla_total_size(4); /* NHA_ID */

 	if (nh->is_group)
-		sz += nh_nlmsg_size_grp(nh) +
+		sz += nh_nlmsg_size_grp(nh, op_flags) +
 		      nla_total_size(4) + /* NHA_OP_FLAGS */
 		      0;
 	else
@@ -1070,7 +1086,7 @@ static void nexthop_notify(int event, struct nexthop *nh, struct nl_info *info)
 	struct sk_buff *skb;
 	int err = -ENOBUFS;

-	skb = nlmsg_new(nh_nlmsg_size(nh), gfp_any());
+	skb = nlmsg_new(nh_nlmsg_size(nh, 0), gfp_any());
 	if (!skb)
 		goto errout;

@@ -3376,15 +3392,15 @@ static int rtm_get_nexthop(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 	if (err)
 		return err;

-	err = -ENOBUFS;
-	skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
-	if (!skb)
-		goto out;
-
 	err = -ENOENT;
 	nh = nexthop_find_by_id(net, id);
 	if (!nh)
-		goto errout_free;
+		goto out;
+
+	err = -ENOBUFS;
+	skb = nlmsg_new(nh_nlmsg_size(nh, op_flags), GFP_KERNEL);
+	if (!skb)
+		goto out;

 	err = nh_fill_node(skb, nh, RTM_NEWNEXTHOP, NETLINK_CB(in_skb).portid,
 			   nlh->nlmsg_seq, 0, op_flags);
--
2.53.0
* Re: [PATCH 2/2 net v3] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop()
2026-04-02 7:26 ` [PATCH 2/2 net v3] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop() Fernando Fernandez Mancera
@ 2026-04-03 9:41 ` Ido Schimmel
0 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-04-03 9:41 UTC (permalink / raw)
To: Fernando Fernandez Mancera
Cc: netdev, petrm, horms, pabeni, kuba, edumazet, davem, dsahern,
kees, Yiming Qian
On Thu, Apr 02, 2026 at 09:26:13AM +0200, Fernando Fernandez Mancera wrote:
> When querying a nexthop object via RTM_GETNEXTHOP, the kernel currently
> allocates a fixed-size skb using NLMSG_GOODSIZE. While sufficient for
> single nexthops and small Equal-Cost Multi-Path groups, this fixed
> allocation is too small for large nexthop groups, such as a group with
> 512 nexthops.
>
> This results in the following warning splat:
>
> WARNING: net/ipv4/nexthop.c:3395 at rtm_get_nexthop+0x176/0x1c0, CPU#20: rep/4608
> [...]
> RIP: 0010:rtm_get_nexthop (net/ipv4/nexthop.c:3395)
> [...]
> Call Trace:
> <TASK>
> rtnetlink_rcv_msg (net/core/rtnetlink.c:6989)
> netlink_rcv_skb (net/netlink/af_netlink.c:2550)
> netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
> netlink_sendmsg (net/netlink/af_netlink.c:1894)
> ____sys_sendmsg (net/socket.c:721 net/socket.c:736 net/socket.c:2585)
> ___sys_sendmsg (net/socket.c:2641)
> __sys_sendmsg (net/socket.c:2671)
> do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> </TASK>
>
> Fix this by computing the required size with nh_nlmsg_size() and
> allocating the skb with nlmsg_new(), which is consistent with
> nexthop_notify(). In addition, adjust nh_nlmsg_size_grp() so that it
> calculates the size needed based on the flags passed. While at it, also
> add the size of NHA_FDB to the nexthop group size calculation, as it was
> missing too.
>
> This cannot be reproduced via iproute2 as the group size is currently
> limited and the command fails as follows:
>
> addattr_l ERROR: message exceeded bound of 1048
>
> Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Closes: https://lore.kernel.org/netdev/CAL_bE8Li2h4KO+AQFXW4S6Yb_u5X4oSKnkywW+LPFjuErhqELA@mail.gmail.com/
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
* Re: [PATCH 1/2 net v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump
2026-04-02 7:26 [PATCH 1/2 net v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump Fernando Fernandez Mancera
2026-04-02 7:26 ` [PATCH 2/2 net v3] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop() Fernando Fernandez Mancera
@ 2026-04-03 22:50 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-04-03 22:50 UTC (permalink / raw)
To: Fernando Fernandez Mancera
Cc: netdev, idosch, petrm, horms, pabeni, kuba, edumazet, davem,
dsahern, kees
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 2 Apr 2026 09:26:12 +0200 you wrote:
> Currently NHA_HW_STATS_ENABLE is included twice every time a nexthop
> group is dumped with NHA_OP_FLAG_DUMP_STATS. As all the stats querying
> was moved to nla_put_nh_group_stats(), keep only the instance of the
> attribute emitted there.
>
> Fixes: 5072ae00aea4 ("net: nexthop: Expose nexthop group HW stats to user space")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>
> [...]
Here is the summary with links:
- [1/2,net,v3] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump
https://git.kernel.org/netdev/net/c/06aaf04ca815
- [2/2,net,v3] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop()
https://git.kernel.org/netdev/net/c/14cf0cd35361
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html