From: Jay Vosburgh <jv@jvosburgh.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Calvin Owens <calvin@wbinvd.org>,
Breno Leitao <leitao@debian.org>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Shuah Khan <shuah@kernel.org>,
Simon Horman <horms@kernel.org>,
david decotigny <decot@googlers.com>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-kselftest@vger.kernel.org, asantostc@gmail.com,
efault@gmx.de, kernel-team@meta.com, stable@vger.kernel.org
Subject: Re: [PATCH net v3 1/3] netpoll: fix incorrect refcount handling causing incorrect cleanup
Date: Tue, 09 Sep 2025 17:18:26 -0700
Message-ID: <2930648.1757463506@famine>
In-Reply-To: <20250908182958.23dc4ba0@kernel.org>
Jakub Kicinski <kuba@kernel.org> wrote:
>On Mon, 8 Sep 2025 13:47:24 -0700 Calvin Owens wrote:
>> I wonder if there might be a demon lurking in bonding+netpoll that this
>> was papering over? Not a reason not to fix the leaks IMO, I'm just
>> curious, I don't want to spend time on it if you already did :)
>
>+1, I also feel like it'd be good to have some bonding tests in place
>when we're removing a hack added specifically for bonding.
I'll preface this by saying up front that I'm not super
familiar with the innards of netpoll.
That said, I looked at commit efa95b01da18 ("netpoll: fix use
after free") and the relevant upstream discussion, and I'm not sure the
assertion that "After a bonding master reclaims the netpoll info struct,
slaves could still hold a pointer to the reclaimed data" is correct.
First, I'm not sure the efa9 patch's reference count math is
correct (more on that below).
Second, I'm a bit unsure what's going on with the struct netpoll
*np parameter of __netpoll_setup for the second and subsequent netpoll
instances (i.e., the second and later calls), as the function will
unconditionally do
	npinfo->netpoll = np;
which looks like it would overwrite the "np" supplied by any
prior call to __netpoll_setup. In bonding, slave_enable_netpoll()
stashes the "np" it allocates as slave->np, and slave_disable_netpoll()
relies on __netpoll_free() to free it, so I don't think it's lost, but it
seems like netpoll internally only tracks one of these at a time,
regardless of the reference count.
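To make the overwrite concern concrete, here is a toy userspace model.
It is only a sketch: the slot_* names are hypothetical stand-ins, not the
kernel's struct netpoll or struct netpoll_info, and it models just the
two effects described above (a reference is taken, and the single
back-pointer slot is assigned unconditionally).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a netpoll instance (not the kernel struct). */
struct slot_netpoll {
	const char *name;
};

/* Hypothetical stand-in for the per-device info: one count, one slot. */
struct slot_npinfo {
	int refcnt;
	struct slot_netpoll *netpoll;	/* single back-pointer */
};

/*
 * Models only what the text describes: every setup call takes a
 * reference and then overwrites the back-pointer, so the count can
 * grow while only the most recent instance stays reachable here.
 */
static void slot_setup(struct slot_npinfo *npinfo, struct slot_netpoll *np)
{
	npinfo->refcnt++;
	npinfo->netpoll = np;
}
```

In this model, two setup calls leave the count at 2 while the slot points
only at the second instance, which is the mismatch I'm describing.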
On the reference counting, the upstream example from the prior
discussion includes:
mkdir /sys/kernel/config/netconsole/blah
echo 0 > /sys/kernel/config/netconsole/blah/enabled
echo bond0 > /sys/kernel/config/netconsole/blah/dev_name
echo 192.168.56.42 > /sys/kernel/config/netconsole/blah/remote_ip
echo 1 > /sys/kernel/config/netconsole/blah/enabled
# npinfo refcnt ->1
ifenslave bond0 eth1
# npinfo refcnt ->2
ifenslave bond0 eth0
# (this should be optional, preventing ndo_cleanup_nepoll below)
# npinfo refcnt ->3
I'm suspicious of the refcnt values here; both then and now, the
npinfo for each of the relevant interfaces is a separate per-interface
allocation in __netpoll_setup, so I'm not sure what exactly is supposed
to be getting a refcnt of 3.
If there are two netpoll instances using the slave in question
(either directly or via the bond itself), then clearing the
np->dev->npinfo pointer looks like the wrong thing to do until the last
reference is released.
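As a rough illustration of the behavior I'd expect, the following toy
userspace model gives each device its own info allocation and clears the
device's pointer only when the last reference goes away. All ref_* names
are hypothetical and this is not the kernel implementation; it just
shows per-device counting under those assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-device info; each device gets its own allocation. */
struct ref_npinfo {
	int refcnt;
};

/* Hypothetical device holding a pointer to its own info. */
struct ref_dev {
	struct ref_npinfo *npinfo;
};

/* Allocate the info on first attach, otherwise just take a reference. */
static int ref_attach(struct ref_dev *dev)
{
	if (!dev->npinfo) {
		dev->npinfo = calloc(1, sizeof(*dev->npinfo));
		if (!dev->npinfo)
			return -1;
	}
	dev->npinfo->refcnt++;
	return 0;
}

/* Drop a reference; free and clear the pointer only on the last one. */
static void ref_detach(struct ref_dev *dev)
{
	if (!dev->npinfo)
		return;
	if (--dev->npinfo->refcnt == 0) {
		free(dev->npinfo);
		dev->npinfo = NULL;
	}
}
```

With two users attached to the same device, the first detach leaves the
pointer in place and only the second one clears it. Attaching a second
device does not change the first device's count, since the allocations
are separate.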
-J
---
-Jay Vosburgh, jv@jvosburgh.net