From: Michael Bommarito
To: "David S . Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev@vger.kernel.org
Cc: Simon Horman, Kuniyuki Iwashima, Kees Cook, Feng Yang,
	linux-kernel@vger.kernel.org
Subject: [PATCH net-next] netlink: clean up failed initial dump-start state
Date: Mon, 20 Apr 2026 12:27:34 -0400
Message-ID: <20260420162734.854587-1-michael.bommarito@gmail.com>

When __netlink_dump_start() has already installed cb->skb, taken the
module reference and set cb_running, a failure from the first
netlink_dump(sk, true) call returns via errout_skb without unwinding
the callback lifetime. That leaves cb_running set and defers
module_put() and consume_skb(cb->skb) until userspace drains the
socket or closes it.

Share the normal callback teardown in a helper and use it on
successful completion and on the initial lock_taken=true failure
path. Keep the lock_taken=false continuation path unchanged, because
recvmsg()-driven retries legitimately preserve cb_running when they
run out of receive room.

Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.")
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito
---
Validation inside a UML guest on current mainline:

- An unprivileged local task (uid=65534, no CAP_NET_ADMIN) opens a
  plain NETLINK_ROUTE socket, preloads sk_rmem_alloc with echoed
  NLMSG_ERROR replies from an unsupported rtnetlink type, then issues
  RTM_GETLINK | NLM_F_DUMP | NLM_F_ACK.

- Stock kernel: the initial __netlink_dump_start() hits the rmem gate
  and returns via errout_skb with cb_running stuck at 1 until
  recvmsg() or close() drives forward progress.

- Patched kernel: the same probe leaves cb_running clear immediately
  on the lock_taken=true failure, and the larger-rcvbuf continuation
  path (legitimate dump in progress) is unchanged.

A scaling pass on 3500 such wedged sockets in a 256M UML guest shows
about 3.8-3.9 MiB of extra unreclaimable slab (/proc/meminfo
SUnreclaim) beyond the visible queued rmem on the vulnerable kernel,
roughly 1.1 KiB/socket. Real accumulation, but the test hits
RLIMIT_NOFILE long before the guest approaches OOM, so this still
looks like a local availability cleanup rather than an exhaustion
primitive.

No Cc: stable@ on the theory that the bug self-heals on
recvmsg()/close and the accumulation is mild. Happy to add it and
route to net if you'd rather see it backported.

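For reviewers who want the lifetime rule in isolation, here is a
minimal userspace sketch of it. This is a toy model, not kernel code
and not the probe used above: struct model_sock and the model_*()
functions are hypothetical names, plain counters stand in for the
module and skb references, nl_cb_mutex is not modeled, and
"rmem >= rcvbuf" is a simplified stand-in for the receive-memory
gate. The "patched" flag selects between the old and new errout_skb
behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the dump callback state kept in struct netlink_sock. */
struct model_sock {
	bool cb_running;	/* mirrors nlk->cb_running */
	int module_refs;	/* references held on the dump's module */
	int skb_refs;		/* references held on cb->skb */
	int rmem;		/* bytes already queued to the receiver */
	int rcvbuf;		/* receive buffer limit */
};

/* Models __netlink_dump_start() installing the callback. */
static void model_dump_install(struct model_sock *s)
{
	s->module_refs++;
	s->skb_refs++;
	s->cb_running = true;
}

/* Models the shared teardown: clear the flag, then drop both references. */
static void model_dump_cleanup(struct model_sock *s)
{
	assert(s->cb_running);
	s->cb_running = false;
	s->module_refs--;
	s->skb_refs--;
}

/* Models one netlink_dump() call; assumes the whole dump fits in one skb. */
static int model_dump(struct model_sock *s, bool lock_taken, bool patched)
{
	if (s->rmem >= s->rcvbuf) {
		/* errout_skb: only the initial call unwinds the callback. */
		if (patched && lock_taken)
			model_dump_cleanup(s);
		return -ENOBUFS;
	}

	/* Successful completion always tears the callback down. */
	model_dump_cleanup(s);
	return 0;
}

/* Models the initial dump request: install, then first dump attempt. */
static int model_dump_start(struct model_sock *s, bool patched)
{
	model_dump_install(s);
	return model_dump(s, true, patched);
}
```

In this model, calling model_dump_start() on a socket whose rmem
already equals rcvbuf returns -ENOBUFS in both modes. With
patched=false the flag and both counters stay held; with patched=true
they are released before returning. A model_dump() call with
lock_taken=false keeps cb_running set in both modes, matching the
continuation path the patch leaves alone.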
 net/netlink/af_netlink.c | 30 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 4d609d5cf406..7019c17e6879 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2250,6 +2250,20 @@ static int netlink_dump_done(struct netlink_sock *nlk, struct sk_buff *skb,
 	return 0;
 }
 
+static void netlink_dump_cleanup(struct netlink_sock *nlk)
+{
+	struct module *module = nlk->cb.module;
+	struct sk_buff *skb = nlk->cb.skb;
+
+	if (nlk->cb.done)
+		nlk->cb.done(&nlk->cb);
+
+	WRITE_ONCE(nlk->cb_running, false);
+	mutex_unlock(&nlk->nl_cb_mutex);
+	module_put(module);
+	consume_skb(skb);
+}
+
 static int netlink_dump(struct sock *sk, bool lock_taken)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
@@ -2258,7 +2272,6 @@ static int netlink_dump(struct sock *sk, bool lock_taken)
 	struct sk_buff *skb = NULL;
 	unsigned int rmem, rcvbuf;
 	size_t max_recvmsg_len;
-	struct module *module;
 	int err = -ENOBUFS;
 	int alloc_min_size;
 	int alloc_size;
@@ -2366,19 +2379,14 @@ static int netlink_dump(struct sock *sk, bool lock_taken)
 	else
 		__netlink_sendskb(sk, skb);
 
-	if (cb->done)
-		cb->done(cb);
-
-	WRITE_ONCE(nlk->cb_running, false);
-	module = cb->module;
-	skb = cb->skb;
-	mutex_unlock(&nlk->nl_cb_mutex);
-	module_put(module);
-	consume_skb(skb);
+	netlink_dump_cleanup(nlk);
 	return 0;
 
 errout_skb:
-	mutex_unlock(&nlk->nl_cb_mutex);
+	if (lock_taken)
+		netlink_dump_cleanup(nlk);
+	else
+		mutex_unlock(&nlk->nl_cb_mutex);
 	kfree_skb(skb);
 	return err;
 }
-- 
2.53.0