public inbox for netdev@vger.kernel.org
* [PATCH net-next v2 0/3] net: Fix protodown with macvlan
@ 2026-05-05  8:16 Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 1/3] net: Do not inherit operational state when protodown is on Ido Schimmel
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-05  8:16 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, pabeni, edumazet, horms, petrm, Ido Schimmel

When protodown is enabled on a macvlan, two bugs cause the macvlan to
incorrectly report an UP operational state:

1. Toggling the lower device's carrier while protodown is enabled on the
macvlan causes the macvlan to inherit the UP operational state,
effectively bypassing the protodown mechanism.

2. Toggling protodown on and then off on the macvlan while the lower
device has no carrier causes the macvlan to report UP instead of
LOWERLAYERDOWN, since netif_change_proto_down() unconditionally turns
the carrier on.

Patch #1 solves the first problem by making
netif_stacked_transfer_operstate() return early when protodown is on.

Patch #2 solves the second problem by calling
netif_stacked_transfer_operstate() instead of netif_carrier_on() when
protodown is disabled on a net device that has a linked net device.

Patch #3 adds a selftest covering both bugs and the basic protodown
functionality.

Targeting net-next since these are not regressions (i.e., this never
worked).

Note that while these changes are in the core, they should only affect
macvlan as protodown is only supported by macvlan and vxlan and only the
former has a linked net device.

v2:
- Move protodown handling away from drivers to the core (Jakub).
- Add a new test case for vxlan.
v1: https://lore.kernel.org/netdev/20260429124624.835335-1-idosch@nvidia.com/

Ido Schimmel (3):
  net: Do not inherit operational state when protodown is on
  net: Do not unconditionally turn on carrier when turning off protodown
  selftests: net: Add protodown tests

 net/core/dev.c                           |  28 +++-
 tools/testing/selftests/net/Makefile     |   1 +
 tools/testing/selftests/net/protodown.sh | 182 +++++++++++++++++++++++
 3 files changed, 209 insertions(+), 2 deletions(-)
 create mode 100755 tools/testing/selftests/net/protodown.sh

-- 
2.54.0



* [PATCH net-next v2 1/3] net: Do not inherit operational state when protodown is on
  2026-05-05  8:16 [PATCH net-next v2 0/3] net: Fix protodown with macvlan Ido Schimmel
@ 2026-05-05  8:16 ` Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 2/3] net: Do not unconditionally turn on carrier when turning off protodown Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 3/3] selftests: net: Add protodown tests Ido Schimmel
  2 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-05  8:16 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, pabeni, edumazet, horms, petrm, Ido Schimmel

The protodown functionality allows user space to turn off the carrier of
a net device:

 # ip link add name dummy1 up type dummy
 # ip link add name macvlan1 up link dummy1 type macvlan mode bridge
 # ip link set dev macvlan1 protodown on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  DOWN           0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>

Different applications can set different protodown reasons, which
prevents an application from turning on the carrier of a net device as
long as others want it down:

 # ip link set dev macvlan1 protodown_reason 1 on
 # ip link set dev macvlan1 protodown_reason 2 on
 # ip link set dev macvlan1 protodown off
 Error: Cannot clear protodown, active reasons.
 # ip link set dev macvlan1 protodown_reason 2 off
 # ip link set dev macvlan1 protodown off
 Error: Cannot clear protodown, active reasons.
 # ip link set dev macvlan1 protodown_reason 1 off
 # ip link set dev macvlan1 protodown off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>
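
The reason semantics shown above can be pictured as a bitmask that gates
clearing the flag. The following is a minimal userspace sketch, not
kernel code: struct pd_model, pd_set_reason(), pd_set_protodown() and
the -EBUSY return value are assumptions made purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; not the kernel's struct net_device. */
struct pd_model {
	bool proto_down;	/* protodown flag */
	uint32_t reasons;	/* one bit per protodown reason */
};

/* Set or clear a single reason bit (0..31). */
static void pd_set_reason(struct pd_model *m, unsigned int bit, bool on)
{
	assert(bit < 32);

	if (on)
		m->reasons |= UINT32_C(1) << bit;
	else
		m->reasons &= ~(UINT32_C(1) << bit);
}

/*
 * Change the protodown flag. Clearing is refused while any reason bit is
 * still set. The -EBUSY value is an assumption of this model.
 */
static int pd_set_protodown(struct pd_model *m, bool on)
{
	if (!on && m->reasons)
		return -EBUSY;

	m->proto_down = on;
	return 0;
}
```

Replaying the commands above against this model, pd_set_protodown(&m,
false) keeps failing until both reason bits 1 and 2 have been cleared.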

Unfortunately, this mechanism is not very useful when the carrier of a
net device can be toggled by toggling the carrier of its lower device:

 # ip link set dev macvlan1 protodown on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  DOWN           0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
 # ip link set dev dummy1 carrier off
 # ip link set dev dummy1 carrier on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>

Obviously, this is not the intended behavior and it is unlikely to be
relied on by anyone. In fact, it is a problem for applications like FRR
that use protodown with macvlan on top of a bridge as part of Virtual
Router Redundancy Protocol (VRRP).

Solve this by preventing a net device configured with protodown on from
inheriting the operational state of its lower device. Note that
READ_ONCE() is not needed as RTNL is held.
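
The effect of the guard can be illustrated outside the kernel with a
minimal sketch. This is a userspace model, not the code in
net/core/dev.c: struct dev_model and model_transfer_operstate() are
hypothetical names, and only the carrier bit is modelled (dormant
handling is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; only the carrier bit is modelled. */
struct dev_model {
	bool carrier;		/* true when the device has a carrier */
	bool proto_down;	/* true when protodown is on */
};

/*
 * Model of the patched transfer step: copy the lower device's carrier to
 * the upper device, unless the upper device is held down by protodown.
 */
static void model_transfer_operstate(const struct dev_model *lower,
				     struct dev_model *upper)
{
	assert(lower != NULL && upper != NULL);

	if (upper->proto_down)
		return;

	upper->carrier = lower->carrier;
}
```

With upper->proto_down set, toggling lower->carrier and calling
model_transfer_operstate() after each change leaves upper->carrier
untouched, which mirrors the "Output with the patch" sequence.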

Output with the patch:

 # ip link add name dummy1 up type dummy
 # ip link add name macvlan1 up link dummy1 type macvlan mode bridge
 # ip link set dev macvlan1 protodown on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  DOWN           0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
 # ip link set dev dummy1 carrier off
 # ip link set dev dummy1 carrier on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  DOWN           0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
 # ip link set dev macvlan1 protodown off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/dev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 06c195906231..bfb0f297b234 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11113,6 +11113,9 @@ EXPORT_SYMBOL(netdev_change_features);
 void netif_stacked_transfer_operstate(const struct net_device *rootdev,
 					struct net_device *dev)
 {
+	if (dev->proto_down)
+		return;
+
 	if (rootdev->operstate == IF_OPER_DORMANT)
 		netif_dormant_on(dev);
 	else
-- 
2.54.0



* [PATCH net-next v2 2/3] net: Do not unconditionally turn on carrier when turning off protodown
  2026-05-05  8:16 [PATCH net-next v2 0/3] net: Fix protodown with macvlan Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 1/3] net: Do not inherit operational state when protodown is on Ido Schimmel
@ 2026-05-05  8:16 ` Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 3/3] selftests: net: Add protodown tests Ido Schimmel
  2 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-05  8:16 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, pabeni, edumazet, horms, petrm, Ido Schimmel

The protodown functionality allows user space to turn off the carrier of
a net device:

 # ip link add name dummy1 up type dummy
 # ip link add name macvlan1 up link dummy1 type macvlan mode bridge
 # ip link set dev macvlan1 protodown on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  DOWN           0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>

When protodown is turned off, the core unconditionally turns on the
carrier of the net device:

 # ip link set dev macvlan1 protodown off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>

This is wrong as it means that a macvlan can end up with a carrier when
its lower device does not have a carrier:

 # ip link set dev dummy1 carrier off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  LOWERLAYERDOWN 0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
 # ip link set dev macvlan1 protodown on
 # ip link set dev macvlan1 protodown off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>

Solve this by resolving the linked net device and, if one exists,
inheriting its operational state when protodown is turned off.
Otherwise, as before, simply turn on the carrier. Set 'dev->proto_down'
before calling netif_stacked_transfer_operstate() as this function is a
NOP when protodown is turned on.

Resolve the linked net device using a new helper and have it return the
device itself (in a similar fashion to dev_get_iflink()) unless the
device implements both ndo_get_iflink() and get_link_net(). If the
latter is not implemented, it is unclear in which network namespace we
should look up the linked net device. Currently, this helper is only
used for net devices that support protodown (macvlan and vxlan) and it
returns the correct result for both.
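
Why the flag must be written before the transfer step can be seen in a
small userspace sketch. It is a model only, not the kernel
implementation: struct link_model, link_transfer_operstate() and
link_change_proto_down() are hypothetical names, a device without a
linked device is represented by pointing 'link' at itself, and only the
carrier bit is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; only the carrier bit is modelled. */
struct link_model {
	bool carrier;
	bool proto_down;
	struct link_model *link;	/* linked device, or the device itself */
};

/* Inherit the carrier from the linked device; a NOP under protodown. */
static void link_transfer_operstate(const struct link_model *lower,
				    struct link_model *upper)
{
	if (upper->proto_down)
		return;

	upper->carrier = lower->carrier;
}

/*
 * Model of the patched protodown change. The flag is written first so
 * that the transfer step already sees protodown as off when clearing it.
 */
static void link_change_proto_down(struct link_model *dev, bool proto_down)
{
	assert(dev != NULL && dev->link != NULL);

	dev->proto_down = proto_down;

	if (proto_down)
		dev->carrier = false;
	else if (dev->link == dev)
		dev->carrier = true;	/* no linked device: as before */
	else
		link_transfer_operstate(dev->link, dev);
}
```

If the flag were written after the transfer step instead, the transfer
would still see proto_down as true and return early, so the device would
stay without a carrier even though its linked device has one.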

Output with the patch:

 # ip link add name dummy1 up type dummy
 # ip link add name macvlan1 up link dummy1 type macvlan mode bridge
 # ip link set dev dummy1 carrier off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  LOWERLAYERDOWN 0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
 # ip link set dev macvlan1 protodown on
 # ip link set dev macvlan1 protodown off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  LOWERLAYERDOWN 0a:5c:a3:05:c7:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
 # ip link set dev dummy1 carrier on
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>
 # ip link set dev macvlan1 protodown on
 # ip link set dev macvlan1 protodown off
 $ ip -br link show dev macvlan1
 macvlan1@dummy1  UP             0a:5c:a3:05:c7:86 <BROADCAST,MULTICAST,UP,LOWER_UP>

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/dev.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index bfb0f297b234..46f8a2efd982 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10141,17 +10141,38 @@ bool netdev_port_same_parent_id(struct net_device *a, struct net_device *b)
 }
 EXPORT_SYMBOL(netdev_port_same_parent_id);
 
+static struct net_device *dev_get_iflink_dev(struct net_device *dev)
+{
+	struct net *net;
+
+	ASSERT_RTNL();
+
+	if (!dev->netdev_ops->ndo_get_iflink || !dev->rtnl_link_ops ||
+	    !dev->rtnl_link_ops->get_link_net)
+		return dev;
+
+	net = dev->rtnl_link_ops->get_link_net(dev);
+	return __dev_get_by_index(net, dev_get_iflink(dev));
+}
+
 int netif_change_proto_down(struct net_device *dev, bool proto_down)
 {
+	struct net_device *iflink_dev;
+
 	if (!dev->change_proto_down)
 		return -EOPNOTSUPP;
 	if (!netif_device_present(dev))
 		return -ENODEV;
+	iflink_dev = dev_get_iflink_dev(dev);
+	if (!iflink_dev)
+		return -ENODEV;
+	WRITE_ONCE(dev->proto_down, proto_down);
 	if (proto_down)
 		netif_carrier_off(dev);
-	else
+	else if (dev == iflink_dev)
 		netif_carrier_on(dev);
-	WRITE_ONCE(dev->proto_down, proto_down);
+	else
+		netif_stacked_transfer_operstate(iflink_dev, dev);
 	return 0;
 }
 
-- 
2.54.0



* [PATCH net-next v2 3/3] selftests: net: Add protodown tests
  2026-05-05  8:16 [PATCH net-next v2 0/3] net: Fix protodown with macvlan Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 1/3] net: Do not inherit operational state when protodown is on Ido Schimmel
  2026-05-05  8:16 ` [PATCH net-next v2 2/3] net: Do not unconditionally turn on carrier when turning off protodown Ido Schimmel
@ 2026-05-05  8:16 ` Ido Schimmel
  2 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-05  8:16 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, pabeni, edumazet, horms, petrm, Ido Schimmel

Add a selftest for the protodown mechanism.

Five test cases are included:

1. Basic protodown toggling: Verify that setting protodown on macvlan
   results in DOWN operational state and clearing it restores UP.

2. Same as the previous test case, but with vxlan.

3. Protodown reasons: Verify that protodown cannot be cleared while
   there are active protodown reasons, but can be cleared once all
   reasons are removed.

4. Operational state inheritance: Verify that toggling the lower
   device's carrier while protodown is on does not cause the macvlan to
   inherit the UP operational state.

5. Lower layer down: Verify that toggling protodown while the lower
   device has no carrier does not cause the macvlan to transition to UP
   operational state.
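
The macvlan expectations checked by these test cases can be summarized
as a small decision function. The sketch below is illustrative only and
is not part of the selftest: expected_operstate() and
operstate_matches() are hypothetical helpers, and the mapping applies to
a macvlan on top of a lower device, not to vxlan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Expected macvlan operational state, given the protodown flag and the
 * lower device's carrier. Hypothetical helper derived from the checks
 * in the test cases.
 */
static const char *expected_operstate(bool proto_down, bool lower_carrier)
{
	if (!lower_carrier)
		return "LOWERLAYERDOWN";

	return proto_down ? "DOWN" : "UP";
}

/* Compare a reported state with the expected one, as a string match. */
static bool operstate_matches(const char *current, bool proto_down,
			      bool lower_carrier)
{
	assert(current != NULL);

	return strcmp(current,
		      expected_operstate(proto_down, lower_carrier)) == 0;
}
```

For example, operstate_matches("DOWN", true, true) holds for test case
4, while test case 5 corresponds to "LOWERLAYERDOWN" with the lower
carrier off, regardless of the protodown flag.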

Note that the last two test cases fail without "net: Do not inherit
operational state when protodown is on" and "net: Do not unconditionally
turn on carrier when turning off protodown":

 # ./protodown.sh
 TEST: Basic protodown on/off with macvlan                           [ OK ]
 TEST: Basic protodown on/off with vxlan                             [ OK ]
 TEST: Protodown reasons                                             [ OK ]
 TEST: Inheriting operational state with protodown                   [FAIL]
         Macvlan operational state is not DOWN despite protodown
 TEST: Protodown with lower layer down                               [FAIL]
         Macvlan is not LOWERLAYERDOWN after clearing protodown

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 tools/testing/selftests/net/Makefile     |   1 +
 tools/testing/selftests/net/protodown.sh | 182 +++++++++++++++++++++++
 2 files changed, 183 insertions(+)
 create mode 100755 tools/testing/selftests/net/protodown.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index baa30287cf22..c6ff7b504e97 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -69,6 +69,7 @@ TEST_PROGS := \
 	nl_netdev.py \
 	nl_nlctrl.py \
 	pmtu.sh \
+	protodown.sh \
 	psock_snd.sh \
 	reuseaddr_ports_exhausted.sh \
 	reuseport_addr_any.sh \
diff --git a/tools/testing/selftests/net/protodown.sh b/tools/testing/selftests/net/protodown.sh
new file mode 100755
index 000000000000..de6ab90c521a
--- /dev/null
+++ b/tools/testing/selftests/net/protodown.sh
@@ -0,0 +1,182 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test the "protodown" mechanism. Verify basic protodown toggling, protodown
+# reasons, operational state inheritance when the lower device carrier changes,
+# and correct operational state when the lower device has no carrier.
+
+# shellcheck disable=SC1091,SC2034,SC2154,SC2317
+source lib.sh
+
+require_command jq
+
+ALL_TESTS="
+	protodown_basic_macvlan
+	protodown_basic_vxlan
+	protodown_reasons
+	protodown_inherit_operstate
+	protodown_lower_layer_down
+"
+
+operstate_get()
+{
+	local ns=$1; shift
+	local dev=$1; shift
+
+	ip -n "$ns" -j link show dev "$dev" | jq -r '.[].operstate'
+}
+
+operstate_check()
+{
+	local ns=$1; shift
+	local dev=$1; shift
+	local expected=$1; shift
+
+	local current
+	current=$(operstate_get "$ns" "$dev")
+
+	[ "$current" = "$expected" ]
+}
+
+setup_prepare()
+{
+	setup_ns NS
+	defer cleanup_all_ns
+
+	ip -n "$NS" link add name dummy0 up type dummy
+
+	ip -n "$NS" link add name macvlan0 link dummy0 up type macvlan mode bridge
+
+	ip -n "$NS" link add name vxlan0 up type vxlan id 10010 dstport 4789
+}
+
+protodown_basic()
+{
+	local dev=$1; shift
+
+	ip -n "$NS" link set dev "$dev" protodown on
+	check_err $? "Failed to set protodown on"
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" "$dev" DOWN
+	check_err $? "Operational state is not DOWN after setting protodown"
+
+	ip -n "$NS" link set dev "$dev" protodown off
+	check_err $? "Failed to set protodown off"
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" "$dev" UP
+	check_err $? "Operational state is not UP after clearing protodown"
+}
+
+protodown_basic_macvlan()
+{
+	RET=0
+
+	protodown_basic macvlan0
+
+	log_test "Basic protodown on/off with macvlan"
+}
+
+protodown_basic_vxlan()
+{
+	RET=0
+
+	protodown_basic vxlan0
+
+	log_test "Basic protodown on/off with vxlan"
+}
+
+protodown_reasons()
+{
+	RET=0
+
+	ip -n "$NS" link set dev macvlan0 protodown on
+
+	ip -n "$NS" link set dev macvlan0 protodown_reason 0 on
+	check_err $? "Failed to set protodown reason bit 0"
+
+	# Cannot clear protodown while reasons are active.
+	ip -n "$NS" link set dev macvlan0 protodown off 2>/dev/null
+	check_fail $? "Clearing protodown succeeded with active reasons"
+
+	ip -n "$NS" link set dev macvlan0 protodown_reason 0 off
+	check_err $? "Failed to clear protodown reason bit 0"
+
+	# Can clear protodown when no reasons are active.
+	ip -n "$NS" link set dev macvlan0 protodown off
+	check_err $? "Failed to clear protodown with no active reasons"
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 UP
+	check_err $? "Operational state is not UP after clearing protodown"
+
+	log_test "Protodown reasons"
+}
+
+protodown_inherit_operstate()
+{
+	RET=0
+
+	ip -n "$NS" link set dev macvlan0 protodown on
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 DOWN
+	check_err $? "Operational state is not DOWN after setting protodown"
+
+	# Toggle carrier on the lower device. The macvlan should stay DOWN
+	# because protodown is on.
+	ip -n "$NS" link set dev dummy0 carrier off
+	ip -n "$NS" link set dev dummy0 carrier on
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" dummy0 UP
+	check_err $? "Lower device is not UP after carrier on"
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 DOWN
+	check_err $? "Macvlan operational state is not DOWN despite protodown"
+
+	# Clear protodown and verify the macvlan comes back up.
+	ip -n "$NS" link set dev macvlan0 protodown off
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 UP
+	check_err $? "Operational state is not UP after clearing protodown"
+
+	log_test "Inheriting operational state with protodown"
+}
+
+protodown_lower_layer_down()
+{
+	RET=0
+
+	# Bring the lower device carrier down first.
+	ip -n "$NS" link set dev dummy0 carrier off
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 LOWERLAYERDOWN
+	check_err $? "Macvlan is not LOWERLAYERDOWN with lower carrier off"
+
+	# Toggle protodown on and off while lower has no carrier. The macvlan
+	# should not transition to UP.
+	ip -n "$NS" link set dev macvlan0 protodown on
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 LOWERLAYERDOWN
+	check_err $? "Macvlan is not LOWERLAYERDOWN after setting protodown"
+
+	ip -n "$NS" link set dev macvlan0 protodown off
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 LOWERLAYERDOWN
+	check_err $? "Macvlan is not LOWERLAYERDOWN after clearing protodown"
+
+	# Bring the lower device carrier up. The macvlan should transition to
+	# UP.
+	ip -n "$NS" link set dev dummy0 carrier on
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" dummy0 UP
+	check_err $? "Lower device is not UP after carrier on"
+
+	busywait "$BUSYWAIT_TIMEOUT" operstate_check "$NS" macvlan0 UP
+	check_err $? "Macvlan is not UP after lower device is UP"
+
+	log_test "Protodown with lower layer down"
+}
+
+trap defer_scopes_cleanup EXIT
+setup_prepare
+tests_run
+
+exit "$EXIT_STATUS"
-- 
2.54.0

