public inbox for netdev@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Fernando Fernandez Mancera <fmancera@suse.de>,
	Yiming Qian <yimingqian591@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	Ido Schimmel <idosch@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>, Sasha Levin <sashal@kernel.org>,
	dsahern@kernel.org, davem@davemloft.net, pabeni@redhat.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.18] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop()
Date: Mon, 20 Apr 2026 09:18:42 -0400
Message-ID: <20260420132314.1023554-128-sashal@kernel.org>
In-Reply-To: <20260420132314.1023554-1-sashal@kernel.org>

From: Fernando Fernandez Mancera <fmancera@suse.de>

[ Upstream commit 14cf0cd35361f4e94824bf8a42f72713d7702a73 ]

When querying a nexthop object via RTM_GETNEXTHOP, the kernel currently
allocates a fixed-size skb using NLMSG_GOODSIZE. While sufficient for
single nexthops and small Equal-Cost Multi-Path groups, this fixed
allocation is too small for large nexthop groups, such as a group with
512 nexthops.
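
As a rough illustration of why 512 members do not fit, the userspace
sketch below reproduces the attribute size arithmetic for the group
attribute alone. Everything prefixed with sk_ is a hypothetical stand-in
written for this example, not kernel API; the struct only mirrors the
8-byte layout of struct nexthop_grp, and the comparison against 4096
assumes a system with 4 KiB pages, where NLMSG_GOODSIZE is a page minus
skb overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the 8-byte uapi layout of struct nexthop_grp. */
struct sk_nexthop_grp {
	uint32_t id;
	uint8_t  weight;
	uint8_t  weight_high;
	uint16_t resvd2;
};

/* Stand-in for nla_total_size(): 4-byte attribute header plus payload,
 * rounded up to the 4-byte netlink attribute alignment. */
static inline size_t sk_nla_total_size(size_t payload)
{
	return (4 + payload + 3) & ~(size_t)3;
}

/* Size of the group membership attribute for num_nh members. */
static inline size_t sk_group_attr_size(size_t num_nh)
{
	return sk_nla_total_size(sizeof(struct sk_nexthop_grp) * num_nh);
}
```

Under these assumptions sk_group_attr_size(512) is 4100 bytes, which is
already larger than a whole 4 KiB page before the netlink header and the
remaining attributes are counted.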

This results in the following warning splat:

 WARNING: net/ipv4/nexthop.c:3395 at rtm_get_nexthop+0x176/0x1c0, CPU#20: rep/4608
 [...]
 RIP: 0010:rtm_get_nexthop (net/ipv4/nexthop.c:3395)
 [...]
 Call Trace:
  <TASK>
  rtnetlink_rcv_msg (net/core/rtnetlink.c:6989)
  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
  netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
  netlink_sendmsg (net/netlink/af_netlink.c:1894)
  ____sys_sendmsg (net/socket.c:721 net/socket.c:736 net/socket.c:2585)
  ___sys_sendmsg (net/socket.c:2641)
  __sys_sendmsg (net/socket.c:2671)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  </TASK>

Fix this by computing the required size with nh_nlmsg_size() and
allocating the skb with nlmsg_new(), which is consistent with what
nexthop_notify() does. In addition, adjust nh_nlmsg_size_grp() so that
it calculates the size needed based on the flags passed. While at it,
also account for NHA_FDB in the nexthop group size calculation, as it
was missing too.
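
To give a feel for how much the flags change the required size, here is
a minimal userspace sketch that follows the same arithmetic as the
patched nh_nlmsg_size_grp() for a non-resilient group. All sk_ names are
hypothetical stand-ins for this example, the flag bit values are assumed
rather than taken from the uapi header, and the 64-bit helper assumes a
configuration with efficient unaligned access, where no padding
attribute is added.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the dump flag bits (values are illustrative). */
#define SK_OP_FLAG_DUMP_STATS    (1u << 0)
#define SK_OP_FLAG_DUMP_HW_STATS (1u << 1)

/* Stand-in for nla_total_size(): header plus payload, 4-byte aligned. */
static inline size_t sk_nla_total(size_t payload)
{
	return (4 + payload + 3) & ~(size_t)3;
}

/* Stand-in for nla_total_size_64bit(), assuming no padding attribute. */
static inline size_t sk_nla_total_64bit(size_t payload)
{
	return sk_nla_total(payload);
}

/* Group size for num_nh members, each an 8-byte nexthop_grp entry. */
static inline size_t sk_grp_size(size_t num_nh, uint32_t op_flags)
{
	size_t tot = sk_nla_total(8 * num_nh) +	/* group members */
		     sk_nla_total(2) +		/* group type */
		     sk_nla_total(0);		/* FDB flag */

	if (op_flags & SK_OP_FLAG_DUMP_STATS) {
		tot += sk_nla_total(0) +	/* stats nest */
		       sk_nla_total(4);		/* hw stats enable */
		tot += num_nh * (sk_nla_total(0) +	/* entry nest */
				 sk_nla_total(4) +	/* entry id */
				 sk_nla_total_64bit(8)); /* packets */

		/* HW counters are only sized when stats are dumped. */
		if (op_flags & SK_OP_FLAG_DUMP_HW_STATS) {
			tot += num_nh * sk_nla_total_64bit(8);
			tot += sk_nla_total(4);	/* hw stats used */
		}
	}
	return tot;
}
```

With these assumptions, a 512-member group needs 4112 bytes with no
flags and 22564 bytes with both stats flags set, so a single fixed
buffer size cannot cover every request.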

This cannot be reproduced via iproute2 as the group size is currently
limited and the command fails as follows:

addattr_l ERROR: message exceeded bound of 1048

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Closes: https://lore.kernel.org/netdev/CAL_bE8Li2h4KO+AQFXW4S6Yb_u5X4oSKnkywW+LPFjuErhqELA@mail.gmail.com/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260402072613.25262-2-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

 net/ipv4/nexthop.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index aa53a74ac2389..c958b8edfe540 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1006,16 +1006,32 @@ static size_t nh_nlmsg_size_grp_res(struct nh_group *nhg)
 		nla_total_size_64bit(8);/* NHA_RES_GROUP_UNBALANCED_TIME */
 }
 
-static size_t nh_nlmsg_size_grp(struct nexthop *nh)
+static size_t nh_nlmsg_size_grp(struct nexthop *nh, u32 op_flags)
 {
 	struct nh_group *nhg = rtnl_dereference(nh->nh_grp);
 	size_t sz = sizeof(struct nexthop_grp) * nhg->num_nh;
 	size_t tot = nla_total_size(sz) +
-		nla_total_size(2); /* NHA_GROUP_TYPE */
+		nla_total_size(2) +	/* NHA_GROUP_TYPE */
+		nla_total_size(0);	/* NHA_FDB */
 
 	if (nhg->resilient)
 		tot += nh_nlmsg_size_grp_res(nhg);
 
+	if (op_flags & NHA_OP_FLAG_DUMP_STATS) {
+		tot += nla_total_size(0) +	  /* NHA_GROUP_STATS */
+		       nla_total_size(4);	  /* NHA_HW_STATS_ENABLE */
+		tot += nhg->num_nh *
+		       (nla_total_size(0) +	  /* NHA_GROUP_STATS_ENTRY */
+			nla_total_size(4) +	  /* NHA_GROUP_STATS_ENTRY_ID */
+			nla_total_size_64bit(8)); /* NHA_GROUP_STATS_ENTRY_PACKETS */
+
+		if (op_flags & NHA_OP_FLAG_DUMP_HW_STATS) {
+			tot += nhg->num_nh *
+			       nla_total_size_64bit(8); /* NHA_GROUP_STATS_ENTRY_PACKETS_HW */
+			tot += nla_total_size(4);	/* NHA_HW_STATS_USED */
+		}
+	}
+
 	return tot;
 }
 
@@ -1050,14 +1066,14 @@ static size_t nh_nlmsg_size_single(struct nexthop *nh)
 	return sz;
 }
 
-static size_t nh_nlmsg_size(struct nexthop *nh)
+static size_t nh_nlmsg_size(struct nexthop *nh, u32 op_flags)
 {
 	size_t sz = NLMSG_ALIGN(sizeof(struct nhmsg));
 
 	sz += nla_total_size(4); /* NHA_ID */
 
 	if (nh->is_group)
-		sz += nh_nlmsg_size_grp(nh) +
+		sz += nh_nlmsg_size_grp(nh, op_flags) +
 		      nla_total_size(4) +	/* NHA_OP_FLAGS */
 		      0;
 	else
@@ -1073,7 +1089,7 @@ static void nexthop_notify(int event, struct nexthop *nh, struct nl_info *info)
 	struct sk_buff *skb;
 	int err = -ENOBUFS;
 
-	skb = nlmsg_new(nh_nlmsg_size(nh), gfp_any());
+	skb = nlmsg_new(nh_nlmsg_size(nh, 0), gfp_any());
 	if (!skb)
 		goto errout;
 
@@ -3379,15 +3395,15 @@ static int rtm_get_nexthop(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 	if (err)
 		return err;
 
-	err = -ENOBUFS;
-	skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
-	if (!skb)
-		goto out;
-
 	err = -ENOENT;
 	nh = nexthop_find_by_id(net, id);
 	if (!nh)
-		goto errout_free;
+		goto out;
+
+	err = -ENOBUFS;
+	skb = nlmsg_new(nh_nlmsg_size(nh, op_flags), GFP_KERNEL);
+	if (!skb)
+		goto out;
 
 	err = nh_fill_node(skb, nh, RTM_NEWNEXTHOP, NETLINK_CB(in_skb).portid,
 			   nlh->nlmsg_seq, 0, op_flags);
-- 
2.53.0

