From: Jakub Kicinski <kuba@kernel.org>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com,
	andrew+netdev@lunn.ch, horms@kernel.org, jv@jvosburgh.net,
	sdf@fomichev.me, dongchenchen2@huawei.com, idosch@nvidia.com,
	n05ec@lzu.edu.cn, yuantan098@gmail.com, kuniyu@google.com,
	nb@tipi-net.de, aleksandr.loktionov@intel.com,
	dtatulea@nvidia.com, Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH net 4/4] selftests: bonding: add a test for VLAN propagation over a bonded real device
Date: Wed, 24 Jun 2026 11:20:18 -0700
Message-ID: <20260624182018.2445732-5-kuba@kernel.org>
In-Reply-To: <20260624182018.2445732-1-kuba@kernel.org>

Add a regression test for the VLAN notifier handling that the netdev_work
deferral fixed.

A VLAN's real device propagates its UP/DOWN, MTU and feature changes onto
the VLANs stacked on top of it. This used to be done synchronously from the
real device's notifier and deadlocked when the real device was brought up
while enslaved to a bond (instance lock held across NETDEV_UP) and the VLAN
on top was itself a bond member: the synchronous propagation re-entered the
stack and took the same instance lock again.

The test covers both halves:
 - that the deferred UP/DOWN, MTU and feature propagation actually lands on
   the VLAN (link state and MTU use an ops-locked dummy, i.e. the deferral
   path; features use veth, which exports vlan_features to inherit), and
 - that the deadlock-prone topology - a VLAN on a dummy, with the VLAN and
   the dummy each enslaved to a different bond - can be built without
   hanging.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
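Illustrative note, not part of the commit: the script's helpers decide
pass/fail by parsing `ip link show` text and by polling, since the
propagation is deferred and may not have landed when the triggering command
returns. The sketch below exercises the same grep and sed expressions on a
made-up sample line, so it runs without root or a netns. The names has_flag,
parse_mtu and poll_until are hypothetical; poll_until is only a stand-in for
the idea behind lib.sh's busywait, not its implementation.

```shell
#!/bin/bash
# Illustrative only: the sample text stands in for `ip link show` output.
# The line is made up; real output varies with the iproute2 version.
sample='7: ls_vlan@ls_dummy: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP'

# Same pattern as link_has_flag(): the flag must be delimited by '<' or ','
# on the left and ',' or '>' on the right, so UP does not match LOWER_UP.
has_flag()
{
	printf '%s\n' "$1" | grep -q "[<,]${2}[,>]"
}

# Same sed expression as link_mtu_is(): print the number following " mtu ".
parse_mtu()
{
	printf '%s\n' "$1" | sed -n 's/.* mtu \([0-9]\+\).*/\1/p'
}

# Hypothetical stand-in for a busywait-style helper: retry a command until
# it succeeds or the timeout (in milliseconds) expires.
poll_until()
{
	local timeout_ms=$1; shift
	local deadline=$(( $(date +%s%N) / 1000000 + timeout_ms ))

	while ! "$@"; do
		[ $(( $(date +%s%N) / 1000000 )) -ge "$deadline" ] && return 1
		sleep 0.05
	done
	return 0
}

poll_until 1000 has_flag "$sample" UP && echo "UP flag present"
has_flag "$sample" RUNNING || echo "RUNNING flag absent"
echo "mtu: $(parse_mtu "$sample")"
```

Anchoring the flag on both sides is what keeps LOWER_UP, and the "state UP"
field later in the line, from satisfying a check for UP.
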
 .../selftests/drivers/net/bonding/Makefile    |   1 +
 .../drivers/net/bonding/bond_vlan_real_dev.sh | 180 ++++++++++++++++++
 2 files changed, 181 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh

diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile
index be130bf585a4..6364ca02642d 100644
--- a/tools/testing/selftests/drivers/net/bonding/Makefile
+++ b/tools/testing/selftests/drivers/net/bonding/Makefile
@@ -13,6 +13,7 @@ TEST_PROGS := \
 	bond_options.sh \
 	bond_passive_lacp.sh \
 	bond_stacked_header_parse.sh \
+	bond_vlan_real_dev.sh \
 	dev_addr_lists.sh \
 	mode-1-recovery-updelay.sh \
 	mode-2-recovery-updelay.sh \
diff --git a/tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh b/tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh
new file mode 100755
index 000000000000..542d9ffc4819
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh
@@ -0,0 +1,180 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test propagation of a real device's state to the VLANs stacked on top of it
+# when the real device is (or becomes) a bond member.
+#
+# The kernel mirrors a real device's UP/DOWN, MTU and feature changes onto its
+# VLANs.  This is done asynchronously (netdev_work): doing it synchronously from
+# the real device's notifier could deadlock.  If the real device is brought up
+# while enslaved to a bond - so its instance lock is held across NETDEV_UP - and
+# a VLAN on top of it is itself a bond member, the synchronous propagation
+# would re-enter the stack and try to take the same instance lock again.
+#
+# Cover both halves:
+#  - the deferred UP/DOWN, MTU and feature propagation actually lands on the
+#    VLAN (link state and MTU use an ops-locked dummy, i.e. the deferral path),
+#  - the deadlock-prone topology - a VLAN on a dummy, with the VLAN and the
+#    dummy each enslaved to a different bond - can be built without hanging.
+
+ALL_TESTS="
+	vlan_link_state
+	vlan_mtu
+	vlan_features
+	vlan_real_dev_enslave
+"
+
+REQUIRE_MZ=no
+NUM_NETIFS=0
+lib_dir=$(dirname "$0")
+source "$lib_dir"/../../../net/forwarding/lib.sh
+
+# Return 0 if $dev in netns $ns has flag $flag set (e.g. UP) in its <...> flags.
+link_has_flag()
+{
+	local ns=$1 dev=$2 flag=$3
+
+	ip -n "$ns" link show dev "$dev" 2>/dev/null | grep -q "[<,]${flag}[,>]"
+}
+
+link_lacks_flag()
+{
+	! link_has_flag "$@"
+}
+
+link_mtu_is()
+{
+	local ns=$1 dev=$2 want=$3 cur
+
+	cur=$(ip -n "$ns" link show dev "$dev" 2>/dev/null | \
+		sed -n 's/.* mtu \([0-9]\+\).*/\1/p')
+	[ "$cur" = "$want" ]
+}
+
+vlan_feature_is()
+{
+	local ns=$1 dev=$2 feature=$3 value=$4
+
+	ip netns exec "$ns" ethtool -k "$dev" 2>/dev/null | \
+		grep -q "^$feature: $value"
+}
+
+link_has_master()
+{
+	local ns=$1 dev=$2 master=$3
+
+	ip -n "$ns" -o link show dev "$dev" 2>/dev/null | grep -q "master $master"
+}
+
+vlan_link_state()
+{
+	RET=0
+
+	ip -n "$NS" link add ls_dummy type dummy
+	ip -n "$NS" link add link ls_dummy name ls_vlan type vlan id 100
+
+	# Bringing the real device up must propagate UP to the VLAN.
+	ip -n "$NS" link set ls_dummy up
+	busywait "$BUSYWAIT_TIMEOUT" link_has_flag "$NS" ls_vlan UP
+	check_err $? "VLAN did not go UP after the real device went UP"
+
+	# ... and likewise for DOWN.
+	ip -n "$NS" link set ls_dummy down
+	busywait "$BUSYWAIT_TIMEOUT" link_lacks_flag "$NS" ls_vlan UP
+	check_err $? "VLAN did not go DOWN after the real device went DOWN"
+
+	ip -n "$NS" link del ls_vlan
+	ip -n "$NS" link del ls_dummy
+
+	log_test "VLAN link state follows the real device"
+}
+
+vlan_mtu()
+{
+	RET=0
+
+	# The VLAN inherits the real device's MTU (2000) at creation time.
+	ip -n "$NS" link add mtu_dummy mtu 2000 type dummy
+	ip -n "$NS" link add link mtu_dummy name mtu_vlan type vlan id 100
+
+	# Shrinking the real device's MTU must clamp the VLAN's MTU.
+	ip -n "$NS" link set mtu_dummy mtu 1500
+	busywait "$BUSYWAIT_TIMEOUT" link_mtu_is "$NS" mtu_vlan 1500
+	check_err $? "VLAN MTU not clamped after the real device's MTU shrank"
+
+	ip -n "$NS" link del mtu_vlan
+	ip -n "$NS" link del mtu_dummy
+
+	log_test "VLAN MTU clamped to the real device"
+}
+
+vlan_features()
+{
+	RET=0
+
+	# Use veth as the real device: unlike dummy it exports vlan_features, so
+	# the VLAN actually inherits a toggleable offload to assert on.
+	ip -n "$NS" link add ft_veth type veth peer name ft_veth_pr
+	ip -n "$NS" link add link ft_veth name ft_vlan type vlan id 100
+
+	vlan_feature_is "$NS" ft_vlan scatter-gather on
+	check_err $? "VLAN did not inherit scatter-gather from the real device"
+
+	# Toggling the offload on the real device must propagate to the VLAN.
+	ip netns exec "$NS" ethtool -K ft_veth sg off
+	busywait "$BUSYWAIT_TIMEOUT" \
+		vlan_feature_is "$NS" ft_vlan scatter-gather off
+	check_err $? "VLAN scatter-gather still on after disabling it on real dev"
+
+	ip netns exec "$NS" ethtool -K ft_veth sg on
+	busywait "$BUSYWAIT_TIMEOUT" \
+		vlan_feature_is "$NS" ft_vlan scatter-gather on
+	check_err $? "VLAN scatter-gather still off after enabling it on real dev"
+
+	ip -n "$NS" link del ft_vlan
+	ip -n "$NS" link del ft_veth
+
+	log_test "VLAN features follow the real device"
+}
+
+vlan_real_dev_enslave()
+{
+	RET=0
+
+	# dummy <- VLAN -> bond0, then enslave the dummy itself to bond1.  The
+	# last step brings the dummy up under bond1's instance lock, which used
+	# to deadlock while synchronously propagating UP to the (bond-enslaved)
+	# VLAN on top.
+	ip -n "$NS" link add dl_dummy type dummy
+	ip -n "$NS" link set dl_dummy up
+	ip -n "$NS" link add link dl_dummy name dl_vlan type vlan id 100
+
+	ip -n "$NS" link add dl_bond0 type bond mode active-backup
+	ip -n "$NS" link set dl_vlan down
+	ip -n "$NS" link set dl_vlan master dl_bond0
+	check_err $? "could not enslave the VLAN to bond0"
+
+	ip -n "$NS" link add dl_bond1 type bond mode active-backup
+	ip -n "$NS" link set dl_dummy down
+	ip -n "$NS" link set dl_dummy master dl_bond1
+	check_err $? "could not enslave the real device to bond1"
+
+	# If we got here the kernel did not deadlock; make sure it is still
+	# responsive and the enslave really took effect.
+	link_has_master "$NS" dl_dummy dl_bond1
+	check_err $? "real device not enslaved to bond1"
+
+	ip -n "$NS" link del dl_bond1
+	ip -n "$NS" link del dl_bond0
+	ip -n "$NS" link del dl_vlan
+	ip -n "$NS" link del dl_dummy
+
+	log_test "VLAN real device enslaved to a second bond"
+}
+
+setup_ns NS
+trap 'cleanup_ns $NS' EXIT
+
+tests_run
+
+exit "$EXIT_STATUS"
-- 
2.54.0


Thread overview:
2026-06-24 18:20 [PATCH net 0/4] net: avoid nested UP notifier events Jakub Kicinski
2026-06-24 18:20 ` [PATCH net 1/4] net: turn the rx_mode work into a generic netdev_work facility Jakub Kicinski
2026-06-24 18:20 ` [PATCH net 2/4] net: add the driver-facing netdev_work scheduling API Jakub Kicinski
2026-06-24 18:20 ` [PATCH net 3/4] vlan: defer real device state propagation to netdev_work Jakub Kicinski
2026-06-24 18:20 ` Jakub Kicinski [this message]
