From: Jay Vosburgh <jv@jvosburgh.net>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: netdev@vger.kernel.org, Jiayuan Chen <jiayuan.chen@shopee.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Pravin B Shelar <pshelar@nicira.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave()
Date: Tue, 03 Mar 2026 21:23:58 -0800 [thread overview]
Message-ID: <1268309.1772601838@famine> (raw)
In-Reply-To: <20260228095854.391093-1-jiayuan.chen@linux.dev>
Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>From: Jiayuan Chen <jiayuan.chen@shopee.com>
>
>kernel BUG at net/core/skbuff.c:2306!
>Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
>RIP: 0010:pskb_expand_head+0xa08/0xfe0 net/core/skbuff.c:2306
>RSP: 0018:ffffc90004aff760 EFLAGS: 00010293
>RAX: 0000000000000000 RBX: ffff88807e3c8780 RCX: ffffffff89593e0e
>RDX: ffff88807b7c4900 RSI: ffffffff89594747 RDI: ffff88807b7c4900
>RBP: 0000000000000820 R08: 0000000000000005 R09: 0000000000000000
>R10: 00000000961a63e0 R11: 0000000000000000 R12: ffff88807e3c8780
>R13: 00000000961a6560 R14: dffffc0000000000 R15: 00000000961a63e0
>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>CR2: 00007fe1a0ed8df0 CR3: 000000002d816000 CR4: 00000000003526f0
>Call Trace:
> <TASK>
> ipgre_header+0xdd/0x540 net/ipv4/ip_gre.c:900
> dev_hard_header include/linux/netdevice.h:3439 [inline]
> packet_snd net/packet/af_packet.c:3028 [inline]
> packet_sendmsg+0x3ae5/0x53c0 net/packet/af_packet.c:3108
> sock_sendmsg_nosec net/socket.c:727 [inline]
> __sock_sendmsg net/socket.c:742 [inline]
> ____sys_sendmsg+0xa54/0xc30 net/socket.c:2592
> ___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
> __sys_sendmsg+0x170/0x220 net/socket.c:2678
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>RIP: 0033:0x7fe1a0e6c1a9
>
>When a device (e.g. GRE tunnel) is enslaved to a bond,
>bond_setup_by_slave() directly copies the slave's header_ops to the
>bond device:
>
> bond_dev->header_ops = slave_dev->header_ops;
This is true only for devices that are not ARPHRD_ETHER.
>This causes a type confusion when dev_hard_header() is later called
>on the bond device. Functions like ipgre_header(), ip6gre_header(),
>vlan_dev_hard_header(), and macvlan_hard_header() all use
>netdev_priv(dev) to access their device-specific private data. When
>called with the bond device, netdev_priv() returns the bond's private
>data (struct bonding) instead of the expected type (e.g. struct
>ip_tunnel), leading to garbage values being read and kernel crashes.
I believe that vlan and macvlan are both ARPHRD_ETHER, and thus
won't take the bond_setup_by_slave path (which was originally
implemented for Infiniband).
That said, your description for the GRE paths seems correct.
>Fix this by introducing bond_header_ops with wrapper functions that
>delegate to the active slave's header_ops using the slave's own
>device. This ensures netdev_priv() in the slave's header functions
>always receives the correct device.
>
>The fix is placed in the bonding driver rather than individual device
>drivers, as the root cause is bond blindly inheriting header_ops from
>the slave without considering that these callbacks expect a specific
>netdev_priv() layout.
>
>The type confusion can be observed by adding a printk in
>ipgre_header() and running the following commands:
>
> ip link add dummy0 type dummy
> ip addr add 10.0.0.1/24 dev dummy0
> ip link set dummy0 up
> ip link add gre1 type gre local 10.0.0.1
> ip link add bond1 type bond mode active-backup
> ip link set gre1 master bond1
> ip link set gre1 up
> ip link set bond1 up
> ip addr add fe80::1/64 dev bond1
>
>Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Has something relevant changed more recently than this? I seem
to recall hearing of folks using bonding with GRE more recently than the
2013 date on c54419321455. I didn't do an exhaustive search, but I feel
like someone would have run into this in the prior 13-ish years if it's
been broken the whole time.
>Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
>---
> drivers/net/bonding/bond_main.c | 41 ++++++++++++++++++++++++++++++++-
> 1 file changed, 40 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 0ccf928eecde..e1d1e3b49cb9 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1509,6 +1509,44 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
> return features;
> }
>
>+static int bond_header_create(struct sk_buff *skb, struct net_device *bond_dev,
>+ unsigned short type, const void *daddr,
>+ const void *saddr, unsigned int len)
>+{
>+ struct bonding *bond = netdev_priv(bond_dev);
>+ struct slave *slave;
>+ int ret = 0;
>+
>+ rcu_read_lock();
>+ slave = rcu_dereference(bond->curr_active_slave);
>+ if (slave && slave->dev->header_ops &&
>+ slave->dev->header_ops->create)
>+ ret = slave->dev->header_ops->create(skb, slave->dev,
>+ type, daddr, saddr, len);
>+ rcu_read_unlock();
>+ return ret;
>+}
>+
>+static int bond_header_parse(const struct sk_buff *skb, unsigned char *haddr)
>+{
>+ struct bonding *bond = netdev_priv(skb->dev);
>+ struct slave *slave;
>+ int ret = 0;
>+
>+ rcu_read_lock();
>+ slave = rcu_dereference(bond->curr_active_slave);
>+ if (slave && slave->dev->header_ops &&
>+ slave->dev->header_ops->parse)
>+ ret = slave->dev->header_ops->parse(skb, haddr);
>+ rcu_read_unlock();
>+ return ret;
>+}
>+
>+static const struct header_ops bond_header_ops = {
>+ .create = bond_header_create,
>+ .parse = bond_header_parse,
>+};
>+
> static void bond_setup_by_slave(struct net_device *bond_dev,
> struct net_device *slave_dev)
> {
>@@ -1516,7 +1554,8 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
>
> dev_close(bond_dev);
>
>- bond_dev->header_ops = slave_dev->header_ops;
>+ bond_dev->header_ops = slave_dev->header_ops ?
>+ &bond_header_ops : NULL;
Can slave_dev->header_ops actually be NULL?
-J
> bond_dev->type = slave_dev->type;
> bond_dev->hard_header_len = slave_dev->hard_header_len;
>--
>2.43.0
>
---
-Jay Vosburgh, jv@jvosburgh.net
next prev parent reply other threads:[~2026-03-04 5:24 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-28 9:58 [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave() Jiayuan Chen
2026-03-04 5:23 ` Jay Vosburgh [this message]
2026-03-04 10:14 ` Jiayuan Chen
2026-03-04 7:09 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1268309.1772601838@famine \
--to=jv@jvosburgh.net \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=jiayuan.chen@linux.dev \
--cc=jiayuan.chen@shopee.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pshelar@nicira.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox