All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jay Vosburgh <jv@jvosburgh.net>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: netdev@vger.kernel.org, Jiayuan Chen <jiayuan.chen@shopee.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Pravin B Shelar <pshelar@nicira.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave()
Date: Tue, 03 Mar 2026 21:23:58 -0800	[thread overview]
Message-ID: <1268309.1772601838@famine> (raw)
In-Reply-To: <20260228095854.391093-1-jiayuan.chen@linux.dev>

Jiayuan Chen <jiayuan.chen@linux.dev> wrote:

>From: Jiayuan Chen <jiayuan.chen@shopee.com>
>
>kernel BUG at net/core/skbuff.c:2306!
>Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
>RIP: 0010:pskb_expand_head+0xa08/0xfe0 net/core/skbuff.c:2306
>RSP: 0018:ffffc90004aff760 EFLAGS: 00010293
>RAX: 0000000000000000 RBX: ffff88807e3c8780 RCX: ffffffff89593e0e
>RDX: ffff88807b7c4900 RSI: ffffffff89594747 RDI: ffff88807b7c4900
>RBP: 0000000000000820 R08: 0000000000000005 R09: 0000000000000000
>R10: 00000000961a63e0 R11: 0000000000000000 R12: ffff88807e3c8780
>R13: 00000000961a6560 R14: dffffc0000000000 R15: 00000000961a63e0
>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>CR2: 00007fe1a0ed8df0 CR3: 000000002d816000 CR4: 00000000003526f0
>Call Trace:
> <TASK>
> ipgre_header+0xdd/0x540 net/ipv4/ip_gre.c:900
> dev_hard_header include/linux/netdevice.h:3439 [inline]
> packet_snd net/packet/af_packet.c:3028 [inline]
> packet_sendmsg+0x3ae5/0x53c0 net/packet/af_packet.c:3108
> sock_sendmsg_nosec net/socket.c:727 [inline]
> __sock_sendmsg net/socket.c:742 [inline]
> ____sys_sendmsg+0xa54/0xc30 net/socket.c:2592
> ___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
> __sys_sendmsg+0x170/0x220 net/socket.c:2678
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>RIP: 0033:0x7fe1a0e6c1a9
>
>When a device (e.g. GRE tunnel) is enslaved to a bond,
>bond_setup_by_slave() directly copies the slave's header_ops to the
>bond device:
>
>    bond_dev->header_ops = slave_dev->header_ops;

	This is true only for devices that are not ARPHRD_ETHER.

>This causes a type confusion when dev_hard_header() is later called
>on the bond device. Functions like ipgre_header(), ip6gre_header(),
>vlan_dev_hard_header(), and macvlan_hard_header() all use
>netdev_priv(dev) to access their device-specific private data. When
>called with the bond device, netdev_priv() returns the bond's private
>data (struct bonding) instead of the expected type (e.g. struct
>ip_tunnel), leading to garbage values being read and kernel crashes.

	I believe that vlan and macvlan are both ARPHRD_ETHER, and thus
won't take the bond_setup_by_slave path (which was originally
implemented for Infiniband).

	That said, your description for the GRE paths seems correct.

>Fix this by introducing bond_header_ops with wrapper functions that
>delegate to the active slave's header_ops using the slave's own
>device. This ensures netdev_priv() in the slave's header functions
>always receives the correct device.
>
>The fix is placed in the bonding driver rather than individual device
>drivers, as the root cause is bond blindly inheriting header_ops from
>the slave without considering that these callbacks expect a specific
>netdev_priv() layout.
>
>The type confusion can be observed by adding a printk in
>ipgre_header() and running the following commands:
>
>    ip link add dummy0 type dummy
>    ip addr add 10.0.0.1/24 dev dummy0
>    ip link set dummy0 up
>    ip link add gre1 type gre local 10.0.0.1
>    ip link add bond1 type bond mode active-backup
>    ip link set gre1 master bond1
>    ip link set gre1 up
>    ip link set bond1 up
>    ip addr add fe80::1/64 dev bond1
>
>Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")

	Has something relevant changed more recently than this?  I seem
to recall hearing of folks using bonding with GRE more recently than the
2013 date on c54419321455.  I didn't do an exhaustive search, but I feel
like someone would have run into this in the prior 13-ish years if it's
been broken the whole time.

>Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
>---
> drivers/net/bonding/bond_main.c | 41 ++++++++++++++++++++++++++++++++-
> 1 file changed, 40 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 0ccf928eecde..e1d1e3b49cb9 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1509,6 +1509,44 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
> 	return features;
> }
> 
>+static int bond_header_create(struct sk_buff *skb, struct net_device *bond_dev,
>+			      unsigned short type, const void *daddr,
>+			      const void *saddr, unsigned int len)
>+{
>+	struct bonding *bond = netdev_priv(bond_dev);
>+	struct slave *slave;
>+	int ret = 0;
>+
>+	rcu_read_lock();
>+	slave = rcu_dereference(bond->curr_active_slave);
>+	if (slave && slave->dev->header_ops &&
>+	    slave->dev->header_ops->create)
>+		ret = slave->dev->header_ops->create(skb, slave->dev,
>+						     type, daddr, saddr, len);
>+	rcu_read_unlock();
>+	return ret;
>+}
>+
>+static int bond_header_parse(const struct sk_buff *skb, unsigned char *haddr)
>+{
>+	struct bonding *bond = netdev_priv(skb->dev);
>+	struct slave *slave;
>+	int ret = 0;
>+
>+	rcu_read_lock();
>+	slave = rcu_dereference(bond->curr_active_slave);
>+	if (slave && slave->dev->header_ops &&
>+	    slave->dev->header_ops->parse)
>+		ret = slave->dev->header_ops->parse(skb, haddr);
>+	rcu_read_unlock();
>+	return ret;
>+}
>+
>+static const struct header_ops bond_header_ops = {
>+	.create	= bond_header_create,
>+	.parse	= bond_header_parse,
>+};
>+
> static void bond_setup_by_slave(struct net_device *bond_dev,
> 				struct net_device *slave_dev)
> {
>@@ -1516,7 +1554,8 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
> 
> 	dev_close(bond_dev);
> 
>-	bond_dev->header_ops	    = slave_dev->header_ops;
>+	bond_dev->header_ops	    = slave_dev->header_ops ?
>+				      &bond_header_ops : NULL;

	Can slave_dev->header_ops actually be NULL?

	-J


> 	bond_dev->type		    = slave_dev->type;
> 	bond_dev->hard_header_len   = slave_dev->hard_header_len;
>-- 
>2.43.0
>

---
	-Jay Vosburgh, jv@jvosburgh.net

  reply	other threads:[~2026-03-04  5:24 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-28  9:58 [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave() Jiayuan Chen
2026-03-04  5:23 ` Jay Vosburgh [this message]
2026-03-04 10:14   ` Jiayuan Chen
2026-03-04  7:09 ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1268309.1772601838@famine \
    --to=jv@jvosburgh.net \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jiayuan.chen@linux.dev \
    --cc=jiayuan.chen@shopee.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pshelar@nicira.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.