From: Jay Vosburgh <jv@jvosburgh.net>
To: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>,
	netdev@vger.kernel.org, jiayuan.chen@shopee.com,
	syzbot+80e046b8da2820b6ba73@syzkaller.appspotmail.com,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	KP Singh <kpsingh@kernel.org>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Clark Williams <clrkwllms@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Jussi Maki <joamaki@gmail.com>,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-rt-devel@lists.linux.dev
Subject: Re: [PATCH net v4 1/2] bonding: fix null-ptr-deref in bond_rr_gen_slave_id()
Date: Thu, 05 Mar 2026 13:03:51 -0800
Message-ID: <1356528.1772744631@famine>
In-Reply-To: <aahsuPt7cY8LxETN@penguin>

Nikolay Aleksandrov <razor@blackwall.org> wrote:

>On Wed, Mar 04, 2026 at 09:27:28AM -0800, Jay Vosburgh wrote:
>> Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> 
>> >On Wed, Mar 04, 2026 at 03:42:57PM +0800, Jiayuan Chen wrote:
>> >> From: Jiayuan Chen <jiayuan.chen@shopee.com>
>> >> 
>> >> bond_rr_gen_slave_id() dereferences bond->rr_tx_counter without a NULL
>> >> check. rr_tx_counter is a per-CPU counter only allocated in bond_open()
>> >> when the bond mode is round-robin. If the bond device was never brought
>> >> up, rr_tx_counter remains NULL, causing a null-ptr-deref.
>> >> 
>> >> The XDP redirect path can reach this code even when the bond is not up:
>> >> bpf_master_redirect_enabled_key is a global static key, so when any bond
>> >> device has native XDP attached, the XDP_TX -> xdp_master_redirect()
>> >> interception is enabled for all bond slaves system-wide. This allows the
>> >> path xdp_master_redirect() -> bond_xdp_get_xmit_slave() ->
>> >> bond_xdp_xmit_roundrobin_slave_get() -> bond_rr_gen_slave_id() to be
>> >> reached on a bond that was never opened.
>> >> 
>> >> Fix this by allocating rr_tx_counter unconditionally in bond_init()
>> >> (ndo_init), which is called by register_netdevice() and covers both
>> >> device creation paths (bond_create() and bond_newlink()). This also
>> >> handles the case where bond mode is changed to round-robin after device
>> >> creation. The conditional allocation in bond_open() is removed. Since
>> >> bond_destructor() already unconditionally calls
>> >> free_percpu(bond->rr_tx_counter), the lifecycle is clean: allocate at
>> >> ndo_init, free at destructor.
>> >> 
>> >> Note: rr_tx_counter is only used by round-robin mode, so this
>> >> deliberately allocates a per-cpu u32 that goes unused for other modes.
>> >> Conditional allocation (e.g., in bond_option_mode_set) was considered
>> >> but rejected: the XDP path can race with mode changes on a downed bond,
>> >> and adding memory barriers to the XDP hot path is not justified for
>> >> saving 4 bytes per CPU.
>> >> 
>> >> Fixes: 879af96ffd72 ("net, core: Add support for XDP redirection to slave device")
>> >> Reported-by: syzbot+80e046b8da2820b6ba73@syzkaller.appspotmail.com
>> >> Closes: https://lore.kernel.org/all/698f84c6.a70a0220.2c38d7.00cc.GAE@google.com/T/
>> >> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
>> >> ---
>> >>  drivers/net/bonding/bond_main.c | 19 +++++++++++++------
>> >>  1 file changed, 13 insertions(+), 6 deletions(-)
>> >> 
>> >
>> >IMO it's not worth it to waste memory in all modes, for an unpopular mode.
>> >I think it'd be better to add a null check in bond_rr_gen_slave_id(),
>> >READ/WRITE_ONCE() should be enough since it is allocated only once, and
>> >freed when the xmit code cannot be reachable anymore (otherwise we'd have
>> >more bugs now). The branch will be successfully predicted practically always,
>> >and you can also mark the ptr being null as unlikely. That way only RR takes
>> >a very minimal hit, if any.
>> 
>> 	Is what you're suggesting different from Jiayuan's proposal[0],
>> in the sense of needing barriers in the XDP hot path to ensure ordering?
>> 
>> 	If I understand correctly, your suggestion is something like
>> (totally untested):
>> 
>
>Basically yes, that is what I'm proposing + an unlikely() around that
>null check since it is really unlikely and will be always predicted
>correctly, this way it's only for RR mode.

	Jiayuan,

	Do you agree that the patch below (including Nikolay's
suggestion to add "unlikely") resolves the original issue without memory
waste, and without introducing performance issues (barriers) into the
XDP path?

	-J


>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index eb27cacc26d7..ac2a4fc0aad0 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -4273,13 +4273,17 @@ void bond_work_cancel_all(struct bonding *bond)
>>  static int bond_open(struct net_device *bond_dev)
>>  {
>>  	struct bonding *bond = netdev_priv(bond_dev);
>> +	u32 __percpu *rr_tx_tmp;
>>  	struct list_head *iter;
>>  	struct slave *slave;
>>  
>> -	if (BOND_MODE(bond) == BOND_MODE_ROUNDROBIN && !bond->rr_tx_counter) {
>> -		bond->rr_tx_counter = alloc_percpu(u32);
>> -		if (!bond->rr_tx_counter)
>> +	if (BOND_MODE(bond) == BOND_MODE_ROUNDROBIN &&
>> +	    !READ_ONCE(bond->rr_tx_counter)) {
>> +		rr_tx_tmp = alloc_percpu(u32);
>> +		if (!rr_tx_tmp)
>>  			return -ENOMEM;
>> +		WRITE_ONCE(bond->rr_tx_counter, rr_tx_tmp);
>> +
>>  	}
>>  
>>  	/* reset slave->backup and slave->inactive */
>> @@ -4866,6 +4870,9 @@ static u32 bond_rr_gen_slave_id(struct bonding *bond)
>>  	struct reciprocal_value reciprocal_packets_per_slave;
>>  	int packets_per_slave = bond->params.packets_per_slave;
>>  
>> +	if (!READ_ONCE(bond->rr_tx_counter))
>> +		packets_per_slave = 0;
>> +
>>  	switch (packets_per_slave) {
>>  	case 0:
>>  		slave_id = get_random_u32();
>> 
>> 	-J
>> 
>> 
>> [0] https://lore.kernel.org/netdev/e4a2a652784ec206728eb3a929a9892238c61f06@linux.dev/
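
For readers following the diff above, here is a minimal userspace sketch of
the same pattern, with Nikolay's unlikely() hint added. It is not kernel
code: struct bond_model, bond_model_open() and bond_model_gen_slave_id() are
hypothetical names, the macros are simplified stand-ins for the kernel ones
(they rely on GCC/Clang extensions), a plain heap u32 stands in for the
per-CPU counter, rand() stands in for get_random_u32(), and an ordinary
division stands in for the counter arithmetic in the non-zero cases.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel macros (assumptions, not the kernel's). */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define unlikely(x)      __builtin_expect(!!(x), 0)

/* Hypothetical model of the two fields the thread discusses. */
struct bond_model {
	uint32_t *rr_tx_counter;	/* stands in for the u32 __percpu pointer */
	int packets_per_slave;
};

/* Models the bond_open() hunk: allocate once, then publish the pointer. */
static int bond_model_open(struct bond_model *bond)
{
	uint32_t *tmp;

	if (!READ_ONCE(bond->rr_tx_counter)) {
		tmp = calloc(1, sizeof(*tmp));
		if (!tmp)
			return -1;	/* the kernel code returns -ENOMEM */
		WRITE_ONCE(bond->rr_tx_counter, tmp);
	}
	return 0;
}

/*
 * Models the bond_rr_gen_slave_id() hunk: with no counter allocated, fall
 * back to the random-ID case. *used_counter reports which path was taken.
 */
static uint32_t bond_model_gen_slave_id(struct bond_model *bond,
					int *used_counter)
{
	uint32_t *counter = READ_ONCE(bond->rr_tx_counter);
	int packets_per_slave = bond->packets_per_slave;

	if (unlikely(!counter))
		packets_per_slave = 0;

	*used_counter = 0;
	switch (packets_per_slave) {
	case 0:
		return (uint32_t)rand();
	default:
		assert(packets_per_slave > 0);
		*used_counter = 1;
		return ++(*counter) / (uint32_t)packets_per_slave;
	}
}
```

One deliberate difference from the diff: the sketch reads the pointer once
into a local and uses that copy afterwards, so the NULL check and the later
dereference see the same value.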

---
	-Jay Vosburgh, jv@jvosburgh.net
