From: Jay Vosburgh <jv@jvosburgh.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Calvin Owens <calvin@wbinvd.org>,
Breno Leitao <leitao@debian.org>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Shuah Khan <shuah@kernel.org>,
Simon Horman <horms@kernel.org>,
david decotigny <decot@googlers.com>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-kselftest@vger.kernel.org, asantostc@gmail.com,
efault@gmx.de, kernel-team@meta.com, stable@vger.kernel.org
Subject: Re: [PATCH net v3 1/3] netpoll: fix incorrect refcount handling causing incorrect cleanup
Date: Tue, 09 Sep 2025 17:18:26 -0700
Message-ID: <2930648.1757463506@famine>
In-Reply-To: <20250908182958.23dc4ba0@kernel.org>
Jakub Kicinski <kuba@kernel.org> wrote:
>On Mon, 8 Sep 2025 13:47:24 -0700 Calvin Owens wrote:
>> I wonder if there might be a demon lurking in bonding+netpoll that this
>> was papering over? Not a reason not to fix the leaks IMO, I'm just
>> curious, I don't want to spend time on it if you already did :)
>
>+1, I also feel like it'd be good to have some bonding tests in place
>when we're removing a hack added specifically for bonding.

I'll preface this by saying that I'm not super familiar with
the innards of netpoll.
That said, I looked at commit efa95b01da18 ("netpoll: fix use
after free") and the relevant upstream discussion, and I'm not sure the
assertion that "After a bonding master reclaims the netpoll info struct,
slaves could still hold a pointer to the reclaimed data" is correct.
First, I'm not sure the efa9 patch's reference count math is
correct (more on that below).
Second, I'm a bit unsure what's going on with the struct netpoll
*np parameter of __netpoll_setup for the second and subsequent netpoll
instances (i.e., the second and later calls), as the function will
unconditionally do

	npinfo->netpoll = np;

which it seems like would overwrite the "np" supplied by any
prior calls to __netpoll_setup. In bonding, slave_enable_netpoll()
stashes the "np" it allocates as slave->np, and slave_disable_netpoll
relies on __netpoll_free to free it, so I don't think it's lost, but it
seems like netpoll internally only tracks one of these at a time,
regardless of the reference count.
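To make the concern concrete, here is a rough user-space sketch of the pattern as I read it. All of the toy_* names are made up for illustration and are not the kernel's definitions; the only thing taken from the real code is the unconditional "npinfo->netpoll = np" store.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy user-space stand-ins; these are NOT the kernel's definitions. */
struct toy_netpoll {
	const char *name;
};

struct toy_npinfo {
	int refcnt;			/* number of toy_netpoll users */
	struct toy_netpoll *netpoll;	/* single back-pointer */
};

struct toy_dev {
	struct toy_npinfo *npinfo;
};

/* Allocate the per-device info on first use, otherwise take another
 * reference, then store the caller's np unconditionally. */
static int toy_setup(struct toy_netpoll *np, struct toy_dev *dev)
{
	struct toy_npinfo *npinfo = dev->npinfo;

	if (!npinfo) {
		npinfo = calloc(1, sizeof(*npinfo));
		if (!npinfo)
			return -1;
		npinfo->refcnt = 1;
		dev->npinfo = npinfo;
	} else {
		npinfo->refcnt++;
	}

	npinfo->netpoll = np;	/* overwrites what an earlier call stored */
	return 0;
}

/* Two users of one device: the count says 2, the pointer knows one. */
void toy_demo(void)
{
	struct toy_dev dev = { NULL };
	struct toy_netpoll a = { "a" }, b = { "b" };
	int rc;

	rc = toy_setup(&a, &dev);
	assert(rc == 0);
	assert(dev.npinfo->netpoll == &a);

	rc = toy_setup(&b, &dev);
	assert(rc == 0);
	assert(dev.npinfo->refcnt == 2);
	assert(dev.npinfo->netpoll == &b);	/* "a" is no longer tracked */

	(void)rc;
	free(dev.npinfo);
}
```

In this toy model the caller still owns "a" (as bonding does via slave->np), so nothing leaks; the info struct simply remembers only the most recent user.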
On the reference counting, the upstream example from the prior
discussion includes:

	mkdir /sys/kernel/config/netconsole/blah
	echo 0 > /sys/kernel/config/netconsole/blah/enabled
	echo bond0 > /sys/kernel/config/netconsole/blah/dev_name
	echo 192.168.56.42 > /sys/kernel/config/netconsole/blah/remote_ip
	echo 1 > /sys/kernel/config/netconsole/blah/enabled
	# npinfo refcnt ->1
	ifenslave bond0 eth1
	# npinfo refcnt ->2
	ifenslave bond0 eth0
	# (this should be optional, preventing ndo_cleanup_nepoll below)
	# npinfo refcnt ->3

I'm suspicious of the refcnt values here; both then and now, the
npinfo for each of the relevant interfaces is a separate per-interface
allocation in __netpoll_setup, so I'm not sure what exactly is supposed
to be getting a refcnt of 3.
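As a sanity check on my reading, a toy model of "one allocation per interface" looks like the following. The pi_* names are hypothetical and this is not kernel code; it only assumes that the first setup on a device allocates that device's own info and that each of the three steps above targets a different device.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins, not kernel types. */
struct pi_npinfo {
	int refcnt;
};

struct pi_dev {
	const char *name;
	struct pi_npinfo *npinfo;
};

/* The first user of a device allocates that device's own info; later
 * users of the same device only bump its count. */
int pi_setup(struct pi_dev *dev)
{
	if (!dev->npinfo) {
		dev->npinfo = calloc(1, sizeof(*dev->npinfo));
		if (!dev->npinfo)
			return -1;
	}
	dev->npinfo->refcnt++;
	return 0;
}

void pi_demo(void)
{
	struct pi_dev bond0 = { "bond0", NULL };
	struct pi_dev eth1 = { "eth1", NULL };
	struct pi_dev eth0 = { "eth0", NULL };
	int rc;

	rc = pi_setup(&bond0);	/* netconsole target enabled on bond0 */
	assert(rc == 0);
	rc = pi_setup(&eth1);	/* ifenslave bond0 eth1 */
	assert(rc == 0);
	rc = pi_setup(&eth0);	/* ifenslave bond0 eth0 */
	assert(rc == 0);

	/* Three separate allocations, each with a count of 1. */
	assert(bond0.npinfo != eth1.npinfo);
	assert(eth1.npinfo != eth0.npinfo);
	assert(bond0.npinfo != eth0.npinfo);
	assert(bond0.npinfo->refcnt == 1);
	assert(eth1.npinfo->refcnt == 1);
	assert(eth0.npinfo->refcnt == 1);

	(void)rc;
	free(bond0.npinfo);
	free(eth1.npinfo);
	free(eth0.npinfo);
}
```

Under those assumptions no single count ever reaches 3; a count above 1 would need a second user on the same device.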
If there are two netpoll instances using the slave in question
(either directly or via the bond itself), then clearing the
np->dev->npinfo pointer looks like the wrong thing to do until the last
reference is released.
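In sketch form, the behavior I'd expect is something like the following. Again, this is a user-space toy with invented lr_* names, not a proposed patch; it only illustrates "don't clear the device's pointer until the final put".

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins, not kernel types. */
struct lr_npinfo {
	int refcnt;
};

struct lr_dev {
	struct lr_npinfo *npinfo;
};

/* Take a reference, allocating the info on first use. */
int lr_get(struct lr_dev *dev)
{
	if (!dev->npinfo) {
		dev->npinfo = calloc(1, sizeof(*dev->npinfo));
		if (!dev->npinfo)
			return -1;
	}
	dev->npinfo->refcnt++;
	return 0;
}

/* Drop one reference; only the final put clears dev->npinfo. */
void lr_put(struct lr_dev *dev)
{
	struct lr_npinfo *npinfo = dev->npinfo;

	if (!npinfo)
		return;
	if (--npinfo->refcnt > 0)
		return;		/* other users remain: leave the pointer */

	dev->npinfo = NULL;
	free(npinfo);
}

void lr_demo(void)
{
	struct lr_dev slave = { NULL };
	int rc;

	rc = lr_get(&slave);	/* first netpoll instance */
	assert(rc == 0);
	rc = lr_get(&slave);	/* second netpoll instance */
	assert(rc == 0);

	lr_put(&slave);
	assert(slave.npinfo != NULL);	/* still in use by the other one */
	assert(slave.npinfo->refcnt == 1);

	lr_put(&slave);
	assert(slave.npinfo == NULL);	/* last reference gone */
	(void)rc;
}
```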
-J
---
-Jay Vosburgh, jv@jvosburgh.net