public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jay Vosburgh <jv@jvosburgh.net>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: netdev@vger.kernel.org, Jiayuan Chen <jiayuan.chen@shopee.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Pravin B Shelar <pshelar@nicira.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave()
Date: Tue, 03 Mar 2026 21:23:58 -0800	[thread overview]
Message-ID: <1268309.1772601838@famine> (raw)
In-Reply-To: <20260228095854.391093-1-jiayuan.chen@linux.dev>

Jiayuan Chen <jiayuan.chen@linux.dev> wrote:

>From: Jiayuan Chen <jiayuan.chen@shopee.com>
>
>kernel BUG at net/core/skbuff.c:2306!
>Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
>RIP: 0010:pskb_expand_head+0xa08/0xfe0 net/core/skbuff.c:2306
>RSP: 0018:ffffc90004aff760 EFLAGS: 00010293
>RAX: 0000000000000000 RBX: ffff88807e3c8780 RCX: ffffffff89593e0e
>RDX: ffff88807b7c4900 RSI: ffffffff89594747 RDI: ffff88807b7c4900
>RBP: 0000000000000820 R08: 0000000000000005 R09: 0000000000000000
>R10: 00000000961a63e0 R11: 0000000000000000 R12: ffff88807e3c8780
>R13: 00000000961a6560 R14: dffffc0000000000 R15: 00000000961a63e0
>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>CR2: 00007fe1a0ed8df0 CR3: 000000002d816000 CR4: 00000000003526f0
>Call Trace:
> <TASK>
> ipgre_header+0xdd/0x540 net/ipv4/ip_gre.c:900
> dev_hard_header include/linux/netdevice.h:3439 [inline]
> packet_snd net/packet/af_packet.c:3028 [inline]
> packet_sendmsg+0x3ae5/0x53c0 net/packet/af_packet.c:3108
> sock_sendmsg_nosec net/socket.c:727 [inline]
> __sock_sendmsg net/socket.c:742 [inline]
> ____sys_sendmsg+0xa54/0xc30 net/socket.c:2592
> ___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
> __sys_sendmsg+0x170/0x220 net/socket.c:2678
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>RIP: 0033:0x7fe1a0e6c1a9
>
>When a device (e.g. GRE tunnel) is enslaved to a bond,
>bond_setup_by_slave() directly copies the slave's header_ops to the
>bond device:
>
>    bond_dev->header_ops = slave_dev->header_ops;

	This is true only for devices that are not ARPHRD_ETHER.

>This causes a type confusion when dev_hard_header() is later called
>on the bond device. Functions like ipgre_header(), ip6gre_header(),
>vlan_dev_hard_header(), and macvlan_hard_header() all use
>netdev_priv(dev) to access their device-specific private data. When
>called with the bond device, netdev_priv() returns the bond's private
>data (struct bonding) instead of the expected type (e.g. struct
>ip_tunnel), leading to garbage values being read and kernel crashes.

	I believe that vlan and macvlan are both ARPHRD_ETHER, and thus
won't take the bond_setup_by_slave path (which was originally
implemented for Infiniband).

	That said, your description for the GRE paths seems correct.

>Fix this by introducing bond_header_ops with wrapper functions that
>delegate to the active slave's header_ops using the slave's own
>device. This ensures netdev_priv() in the slave's header functions
>always receives the correct device.
>
>The fix is placed in the bonding driver rather than individual device
>drivers, as the root cause is bond blindly inheriting header_ops from
>the slave without considering that these callbacks expect a specific
>netdev_priv() layout.
>
>The type confusion can be observed by adding a printk in
>ipgre_header() and running the following commands:
>
>    ip link add dummy0 type dummy
>    ip addr add 10.0.0.1/24 dev dummy0
>    ip link set dummy0 up
>    ip link add gre1 type gre local 10.0.0.1
>    ip link add bond1 type bond mode active-backup
>    ip link set gre1 master bond1
>    ip link set gre1 up
>    ip link set bond1 up
>    ip addr add fe80::1/64 dev bond1
>
>Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")

	Has something relevant changed more recently than this?  I seem
to recall hearing of folks using bonding with GRE more recently than the
2013 date on c54419321455.  I didn't do an exhaustive search, but I feel
like someone would have run into this in the prior 13-ish years if it's
been broken the whole time.

>Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
>---
> drivers/net/bonding/bond_main.c | 41 ++++++++++++++++++++++++++++++++-
> 1 file changed, 40 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 0ccf928eecde..e1d1e3b49cb9 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1509,6 +1509,44 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
> 	return features;
> }
> 
>+static int bond_header_create(struct sk_buff *skb, struct net_device *bond_dev,
>+			      unsigned short type, const void *daddr,
>+			      const void *saddr, unsigned int len)
>+{
>+	struct bonding *bond = netdev_priv(bond_dev);
>+	struct slave *slave;
>+	int ret = 0;
>+
>+	rcu_read_lock();
>+	slave = rcu_dereference(bond->curr_active_slave);
>+	if (slave && slave->dev->header_ops &&
>+	    slave->dev->header_ops->create)
>+		ret = slave->dev->header_ops->create(skb, slave->dev,
>+						     type, daddr, saddr, len);
>+	rcu_read_unlock();
>+	return ret;
>+}
>+
>+static int bond_header_parse(const struct sk_buff *skb, unsigned char *haddr)
>+{
>+	struct bonding *bond = netdev_priv(skb->dev);
>+	struct slave *slave;
>+	int ret = 0;
>+
>+	rcu_read_lock();
>+	slave = rcu_dereference(bond->curr_active_slave);
>+	if (slave && slave->dev->header_ops &&
>+	    slave->dev->header_ops->parse)
>+		ret = slave->dev->header_ops->parse(skb, haddr);
>+	rcu_read_unlock();
>+	return ret;
>+}
>+
>+static const struct header_ops bond_header_ops = {
>+	.create	= bond_header_create,
>+	.parse	= bond_header_parse,
>+};
>+
> static void bond_setup_by_slave(struct net_device *bond_dev,
> 				struct net_device *slave_dev)
> {
>@@ -1516,7 +1554,8 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
> 
> 	dev_close(bond_dev);
> 
>-	bond_dev->header_ops	    = slave_dev->header_ops;
>+	bond_dev->header_ops	    = slave_dev->header_ops ?
>+				      &bond_header_ops : NULL;

	Can slave_dev->header_ops actually be NULL?

	-J


> 	bond_dev->type		    = slave_dev->type;
> 	bond_dev->hard_header_len   = slave_dev->hard_header_len;
>-- 
>2.43.0
>

---
	-Jay Vosburgh, jv@jvosburgh.net

  reply	other threads:[~2026-03-04  5:24 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-28  9:58 [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave() Jiayuan Chen
2026-03-04  5:23 ` Jay Vosburgh [this message]
2026-03-04 10:14   ` Jiayuan Chen
2026-03-04  7:09 ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1268309.1772601838@famine \
    --to=jv@jvosburgh.net \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jiayuan.chen@linux.dev \
    --cc=jiayuan.chen@shopee.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pshelar@nicira.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox