netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 1/2] ipvlan: Make the addrs_lock be per port
  2025-12-02 14:11 [PATCH net 0/2] ipvlan: make addrs_lock be per port Dmitry Skorodumov
@ 2025-12-02 14:11 ` Dmitry Skorodumov
  0 siblings, 0 replies; 8+ messages in thread
From: Dmitry Skorodumov @ 2025-12-02 14:11 UTC (permalink / raw)
  To: netdev, Dmitry Skorodumov, Xiao Liang, Kuniyuki Iwashima,
	Jakub Kicinski, Stanislav Fomichev, Etienne Champetier,
	Paolo Abeni, David S. Miller, linux-kernel
  Cc: Andrew Lunn, Eric Dumazet

Make the addrs_lock be per port, not per ipvlan dev.

This appears to be a very minor problem though.
Since it's highly unlikely that ipvlan_add_addr() will
be called on 2 CPU simultaneously. But nevertheless,
this may cause:

1. False-negative of ipvlan_addr_busy(): one interface
iterated through all port->ipvlans + ipvlan->addrs
under some ipvlan spinlock, and another added IP
under its own lock. Though this is only possible
for IPv6, since looks like only ipvlan_addr6_event() can be
called without rtnl_lock.

2. Race since ipvlan_ht_addr_add(port) is called under
different ipvlan->addrs_lock locks

This should not affect performance, since add/remove IP
is a rare situation and spinlock is not locked on fast
paths.

Fixes: 8230819494b3 ("ipvlan: use per device spinlock to protect addrs list updates")
Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
CC: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/ipvlan/ipvlan.h      |  2 +-
 drivers/net/ipvlan/ipvlan_main.c | 20 ++++++++++----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 50de3ee204db..80f84fc87008 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -69,7 +69,6 @@ struct ipvl_dev {
 	DECLARE_BITMAP(mac_filters, IPVLAN_MAC_FILTER_SIZE);
 	netdev_features_t	sfeatures;
 	u32			msg_enable;
-	spinlock_t		addrs_lock;
 };
 
 struct ipvl_addr {
@@ -90,6 +89,7 @@ struct ipvl_port {
 	struct net_device	*dev;
 	possible_net_t		pnet;
 	struct hlist_head	hlhead[IPVLAN_HASH_SIZE];
+	spinlock_t		addrs_lock; /* guards hash-table and addrs */
 	struct list_head	ipvlans;
 	u16			mode;
 	u16			flags;
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 660f3db11766..c390f4241621 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -75,6 +75,7 @@ static int ipvlan_port_create(struct net_device *dev)
 	for (idx = 0; idx < IPVLAN_HASH_SIZE; idx++)
 		INIT_HLIST_HEAD(&port->hlhead[idx]);
 
+	spin_lock_init(&port->addrs_lock);
 	skb_queue_head_init(&port->backlog);
 	INIT_WORK(&port->wq, ipvlan_process_multicast);
 	ida_init(&port->ida);
@@ -579,7 +580,6 @@ int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params,
 	if (!tb[IFLA_MTU])
 		ipvlan_adjust_mtu(ipvlan, phy_dev);
 	INIT_LIST_HEAD(&ipvlan->addrs);
-	spin_lock_init(&ipvlan->addrs_lock);
 
 	/* TODO Probably put random address here to be presented to the
 	 * world but keep using the physical-dev address for the outgoing
@@ -657,13 +657,13 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct ipvl_addr *addr, *next;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	list_for_each_entry_safe(addr, next, &ipvlan->addrs, anode) {
 		ipvlan_ht_addr_del(addr);
 		list_del_rcu(&addr->anode);
 		kfree_rcu(addr, rcu);
 	}
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 
 	ida_free(&ipvlan->port->ida, dev->dev_id);
 	list_del_rcu(&ipvlan->pnode);
@@ -847,16 +847,16 @@ static void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 {
 	struct ipvl_addr *addr;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	addr = ipvlan_find_addr(ipvlan, iaddr, is_v6);
 	if (!addr) {
-		spin_unlock_bh(&ipvlan->addrs_lock);
+		spin_unlock_bh(&ipvlan->port->addrs_lock);
 		return;
 	}
 
 	ipvlan_ht_addr_del(addr);
 	list_del_rcu(&addr->anode);
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 	kfree_rcu(addr, rcu);
 }
 
@@ -878,14 +878,14 @@ static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 {
 	int ret = -EINVAL;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	if (ipvlan_addr_busy(ipvlan->port, ip6_addr, true))
 		netif_err(ipvlan, ifup, ipvlan->dev,
 			  "Failed to add IPv6=%pI6c addr for %s intf\n",
 			  ip6_addr, ipvlan->dev->name);
 	else
 		ret = ipvlan_add_addr(ipvlan, ip6_addr, true);
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 	return ret;
 }
 
@@ -946,14 +946,14 @@ static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr)
 {
 	int ret = -EINVAL;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	if (ipvlan_addr_busy(ipvlan->port, ip4_addr, false))
 		netif_err(ipvlan, ifup, ipvlan->dev,
 			  "Failed to add IPv4=%pI4 on %s intf.\n",
 			  ip4_addr, ipvlan->dev->name);
 	else
 		ret = ipvlan_add_addr(ipvlan, ip4_addr, false);
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 	return ret;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 net 0/2] ipvlan: addrs_lock made per port
@ 2025-12-25 18:55 Dmitry Skorodumov
  2025-12-25 18:55 ` [PATCH net 1/2] ipvlan: Make the addrs_lock be " Dmitry Skorodumov
  2025-12-25 18:55 ` [PATCH net 2/2] selftests: net: simple selftest for ipvtap Dmitry Skorodumov
  0 siblings, 2 replies; 8+ messages in thread
From: Dmitry Skorodumov @ 2025-12-25 18:55 UTC (permalink / raw)
  To: netdev; +Cc: Dmitry Skorodumov

First patch fixes a rather minor issues that sometimes
ipvlan-addrs are modified without lock (because
for IPv6 addr can be sometimes added without RTNL)

diff from v2:
- Added a small self-test
- added early return in ipvlan_find_addr()
- the iterations over ipvlans in ipvlan_addr_busy()
must be protected by RCU
- Added simple self-test. I haven't invented anything
more sophisticated that this.

Dmitry Skorodumov (2):
  ipvlan: Make the addrs_lock be per port
  selftests: net: simple selftest for ipvtap

 drivers/net/ipvlan/ipvlan.h                |   2 +-
 drivers/net/ipvlan/ipvlan_core.c           |  16 +-
 drivers/net/ipvlan/ipvlan_main.c           |  49 +++---
 tools/testing/selftests/net/Makefile       |   1 +
 tools/testing/selftests/net/config         |   2 +
 tools/testing/selftests/net/ipvtap_test.sh | 167 +++++++++++++++++++++
 6 files changed, 207 insertions(+), 30 deletions(-)
 create mode 100755 tools/testing/selftests/net/ipvtap_test.sh

-- 
2.43.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH net 1/2] ipvlan: Make the addrs_lock be per port
  2025-12-25 18:55 [PATCH v3 net 0/2] ipvlan: addrs_lock made per port Dmitry Skorodumov
@ 2025-12-25 18:55 ` Dmitry Skorodumov
  2025-12-28  9:46   ` Paolo Abeni
  2025-12-25 18:55 ` [PATCH net 2/2] selftests: net: simple selftest for ipvtap Dmitry Skorodumov
  1 sibling, 1 reply; 8+ messages in thread
From: Dmitry Skorodumov @ 2025-12-25 18:55 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, Dmitry Skorodumov, Xiao Liang,
	Kuniyuki Iwashima, Julian Vetter, Guillaume Nault, Ido Schimmel,
	Eric Dumazet, Stanislav Fomichev, Etienne Champetier,
	David S. Miller, Paolo Abeni, linux-kernel
  Cc: Andrew Lunn

Make the addrs_lock be per port, not per ipvlan dev.

Initial code seems to be written in the assumption,
that any address change must occur under RTNL.
But it is not so for the case of IPv6. So

1) Introduce per-port addrs_lock.

2) It was needed to fix places where it was forgotten
to take lock (ipvlan_open/ipvlan_close)

This appears to be a very minor problem though.
Since it's highly unlikely that ipvlan_add_addr() will
be called on 2 CPU simultaneously. But nevertheless,
this could cause:

1) False-negative of ipvlan_addr_busy(): one interface
iterated through all port->ipvlans + ipvlan->addrs
under some ipvlan spinlock, and another added IP
under its own lock. Though this is only possible
for IPv6, since looks like only ipvlan_addr6_event() can be
called without rtnl_lock.

2) Race since ipvlan_ht_addr_add(port) is called under
different ipvlan->addrs_lock locks

This should not affect performance, since add/remove IP
is a rare situation and spinlock is not taken on fast
paths.

Fixes: 8230819494b3 ("ipvlan: use per device spinlock to protect addrs list updates")
Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
CC: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/ipvlan/ipvlan.h      |  2 +-
 drivers/net/ipvlan/ipvlan_core.c | 16 +++++------
 drivers/net/ipvlan/ipvlan_main.c | 49 +++++++++++++++++++-------------
 3 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 50de3ee204db..80f84fc87008 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -69,7 +69,6 @@ struct ipvl_dev {
 	DECLARE_BITMAP(mac_filters, IPVLAN_MAC_FILTER_SIZE);
 	netdev_features_t	sfeatures;
 	u32			msg_enable;
-	spinlock_t		addrs_lock;
 };
 
 struct ipvl_addr {
@@ -90,6 +89,7 @@ struct ipvl_port {
 	struct net_device	*dev;
 	possible_net_t		pnet;
 	struct hlist_head	hlhead[IPVLAN_HASH_SIZE];
+	spinlock_t		addrs_lock; /* guards hash-table and addrs */
 	struct list_head	ipvlans;
 	u16			mode;
 	u16			flags;
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 2efa3ba148aa..bdb3a46b327c 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -107,17 +107,15 @@ void ipvlan_ht_addr_del(struct ipvl_addr *addr)
 struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev *ipvlan,
 				   const void *iaddr, bool is_v6)
 {
-	struct ipvl_addr *addr, *ret = NULL;
+	struct ipvl_addr *addr;
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(addr, &ipvlan->addrs, anode) {
-		if (addr_equal(is_v6, addr, iaddr)) {
-			ret = addr;
-			break;
-		}
+	assert_spin_locked(&ipvlan->port->addrs_lock);
+
+	list_for_each_entry(addr, &ipvlan->addrs, anode) {
+		if (addr_equal(is_v6, addr, iaddr))
+			return addr;
 	}
-	rcu_read_unlock();
-	return ret;
+	return NULL;
 }
 
 bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6)
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 660f3db11766..baccdad695fd 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -75,6 +75,7 @@ static int ipvlan_port_create(struct net_device *dev)
 	for (idx = 0; idx < IPVLAN_HASH_SIZE; idx++)
 		INIT_HLIST_HEAD(&port->hlhead[idx]);
 
+	spin_lock_init(&port->addrs_lock);
 	skb_queue_head_init(&port->backlog);
 	INIT_WORK(&port->wq, ipvlan_process_multicast);
 	ida_init(&port->ida);
@@ -181,6 +182,7 @@ static void ipvlan_uninit(struct net_device *dev)
 static int ipvlan_open(struct net_device *dev)
 {
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
+	struct ipvl_port *port = ipvlan->port;
 	struct ipvl_addr *addr;
 
 	if (ipvlan->port->mode == IPVLAN_MODE_L3 ||
@@ -189,10 +191,10 @@ static int ipvlan_open(struct net_device *dev)
 	else
 		dev->flags &= ~IFF_NOARP;
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(addr, &ipvlan->addrs, anode)
+	spin_lock_bh(&port->addrs_lock);
+	list_for_each_entry(addr, &ipvlan->addrs, anode)
 		ipvlan_ht_addr_add(ipvlan, addr);
-	rcu_read_unlock();
+	spin_unlock_bh(&port->addrs_lock);
 
 	return 0;
 }
@@ -206,10 +208,10 @@ static int ipvlan_stop(struct net_device *dev)
 	dev_uc_unsync(phy_dev, dev);
 	dev_mc_unsync(phy_dev, dev);
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(addr, &ipvlan->addrs, anode)
+	spin_lock_bh(&ipvlan->port->addrs_lock);
+	list_for_each_entry(addr, &ipvlan->addrs, anode)
 		ipvlan_ht_addr_del(addr);
-	rcu_read_unlock();
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 
 	return 0;
 }
@@ -579,7 +581,6 @@ int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params,
 	if (!tb[IFLA_MTU])
 		ipvlan_adjust_mtu(ipvlan, phy_dev);
 	INIT_LIST_HEAD(&ipvlan->addrs);
-	spin_lock_init(&ipvlan->addrs_lock);
 
 	/* TODO Probably put random address here to be presented to the
 	 * world but keep using the physical-dev address for the outgoing
@@ -657,13 +658,13 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct ipvl_addr *addr, *next;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	list_for_each_entry_safe(addr, next, &ipvlan->addrs, anode) {
 		ipvlan_ht_addr_del(addr);
 		list_del_rcu(&addr->anode);
 		kfree_rcu(addr, rcu);
 	}
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 
 	ida_free(&ipvlan->port->ida, dev->dev_id);
 	list_del_rcu(&ipvlan->pnode);
@@ -817,6 +818,8 @@ static int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 {
 	struct ipvl_addr *addr;
 
+	assert_spin_locked(&ipvlan->port->addrs_lock);
+
 	addr = kzalloc(sizeof(struct ipvl_addr), GFP_ATOMIC);
 	if (!addr)
 		return -ENOMEM;
@@ -847,16 +850,16 @@ static void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 {
 	struct ipvl_addr *addr;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	addr = ipvlan_find_addr(ipvlan, iaddr, is_v6);
 	if (!addr) {
-		spin_unlock_bh(&ipvlan->addrs_lock);
+		spin_unlock_bh(&ipvlan->port->addrs_lock);
 		return;
 	}
 
 	ipvlan_ht_addr_del(addr);
 	list_del_rcu(&addr->anode);
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 	kfree_rcu(addr, rcu);
 }
 
@@ -878,14 +881,14 @@ static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 {
 	int ret = -EINVAL;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	if (ipvlan_addr_busy(ipvlan->port, ip6_addr, true))
 		netif_err(ipvlan, ifup, ipvlan->dev,
 			  "Failed to add IPv6=%pI6c addr for %s intf\n",
 			  ip6_addr, ipvlan->dev->name);
 	else
 		ret = ipvlan_add_addr(ipvlan, ip6_addr, true);
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 	return ret;
 }
 
@@ -924,21 +927,24 @@ static int ipvlan_addr6_validator_event(struct notifier_block *unused,
 	struct in6_validator_info *i6vi = (struct in6_validator_info *)ptr;
 	struct net_device *dev = (struct net_device *)i6vi->i6vi_dev->dev;
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
+	int ret = NOTIFY_OK;
 
 	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
 	switch (event) {
 	case NETDEV_UP:
+		spin_lock_bh(&ipvlan->port->addrs_lock);
 		if (ipvlan_addr_busy(ipvlan->port, &i6vi->i6vi_addr, true)) {
 			NL_SET_ERR_MSG(i6vi->extack,
 				       "Address already assigned to an ipvlan device");
-			return notifier_from_errno(-EADDRINUSE);
+			ret = notifier_from_errno(-EADDRINUSE);
 		}
+		spin_unlock_bh(&ipvlan->port->addrs_lock);
 		break;
 	}
 
-	return NOTIFY_OK;
+	return ret;
 }
 #endif
 
@@ -946,14 +952,14 @@ static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr)
 {
 	int ret = -EINVAL;
 
-	spin_lock_bh(&ipvlan->addrs_lock);
+	spin_lock_bh(&ipvlan->port->addrs_lock);
 	if (ipvlan_addr_busy(ipvlan->port, ip4_addr, false))
 		netif_err(ipvlan, ifup, ipvlan->dev,
 			  "Failed to add IPv4=%pI4 on %s intf.\n",
 			  ip4_addr, ipvlan->dev->name);
 	else
 		ret = ipvlan_add_addr(ipvlan, ip4_addr, false);
-	spin_unlock_bh(&ipvlan->addrs_lock);
+	spin_unlock_bh(&ipvlan->port->addrs_lock);
 	return ret;
 }
 
@@ -995,21 +1001,24 @@ static int ipvlan_addr4_validator_event(struct notifier_block *unused,
 	struct in_validator_info *ivi = (struct in_validator_info *)ptr;
 	struct net_device *dev = (struct net_device *)ivi->ivi_dev->dev;
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
+	int ret = NOTIFY_OK;
 
 	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
 	switch (event) {
 	case NETDEV_UP:
+		spin_lock_bh(&ipvlan->port->addrs_lock);
 		if (ipvlan_addr_busy(ipvlan->port, &ivi->ivi_addr, false)) {
 			NL_SET_ERR_MSG(ivi->extack,
 				       "Address already assigned to an ipvlan device");
-			return notifier_from_errno(-EADDRINUSE);
+			ret = notifier_from_errno(-EADDRINUSE);
 		}
+		spin_unlock_bh(&ipvlan->port->addrs_lock);
 		break;
 	}
 
-	return NOTIFY_OK;
+	return ret;
 }
 
 static struct notifier_block ipvlan_addr4_notifier_block __read_mostly = {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net 2/2] selftests: net: simple selftest for ipvtap
  2025-12-25 18:55 [PATCH v3 net 0/2] ipvlan: addrs_lock made per port Dmitry Skorodumov
  2025-12-25 18:55 ` [PATCH net 1/2] ipvlan: Make the addrs_lock be " Dmitry Skorodumov
@ 2025-12-25 18:55 ` Dmitry Skorodumov
  2025-12-26 17:52   ` Stephen Hemminger
  2025-12-28 12:14   ` Paolo Abeni
  1 sibling, 2 replies; 8+ messages in thread
From: Dmitry Skorodumov @ 2025-12-25 18:55 UTC (permalink / raw)
  To: netdev, Simon Horman, linux-kernel, linux-kselftest
  Cc: Dmitry Skorodumov, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Shuah Khan

This is a simple ipvtap test to do simple test
on handling IP-address assigning to ipvlan interface.

It creates a veth-interface and then creates several
network-namespace with ipvlan0 interface in it linked to veth.

Then it starts to add/remove addresses on ipvlan0 interfaces
in several threads.

At finish, it checks that there is no duplicated addresses.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 tools/testing/selftests/net/Makefile       |   1 +
 tools/testing/selftests/net/config         |   2 +
 tools/testing/selftests/net/ipvtap_test.sh | 167 +++++++++++++++++++++
 3 files changed, 170 insertions(+)
 create mode 100755 tools/testing/selftests/net/ipvtap_test.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index b66ba04f19d9..45c4ea381bc3 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -48,6 +48,7 @@ TEST_PROGS := \
 	ipv6_flowlabel.sh \
 	ipv6_force_forwarding.sh \
 	ipv6_route_update_soft_lockup.sh \
+	ipvtap_test.sh \
 	l2_tos_ttl_inherit.sh \
 	l2tp.sh \
 	link_netns.py \
diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
index 1e1f253118f5..5702ab8fa5ad 100644
--- a/tools/testing/selftests/net/config
+++ b/tools/testing/selftests/net/config
@@ -48,6 +48,7 @@ CONFIG_IPV6_SEG6_LWTUNNEL=y
 CONFIG_IPV6_SIT=y
 CONFIG_IPV6_VTI=y
 CONFIG_IPVLAN=m
+CONFIG_IPVTAP=m
 CONFIG_KALLSYMS=y
 CONFIG_L2TP=m
 CONFIG_L2TP_ETH=m
@@ -122,6 +123,7 @@ CONFIG_TEST_BPF=m
 CONFIG_TLS=m
 CONFIG_TRACEPOINTS=y
 CONFIG_TUN=y
+CONFIG_TAP=m
 CONFIG_USER_NS=y
 CONFIG_VETH=y
 CONFIG_VLAN_8021Q=y
diff --git a/tools/testing/selftests/net/ipvtap_test.sh b/tools/testing/selftests/net/ipvtap_test.sh
new file mode 100755
index 000000000000..751793f26fed
--- /dev/null
+++ b/tools/testing/selftests/net/ipvtap_test.sh
@@ -0,0 +1,168 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Simple tests for ipvtap
+
+
+#
+# The testing environment looks this way:
+#
+# |------HOST------|     |------PHY-------|
+# |      veth<----------------->veth      |
+# |------|--|------|     |----------------|
+#        |  |
+#        |  |            |-----TST0-------|
+#        |  |------------|----ipvlan      |
+#        |               |----------------|
+#        |
+#        |               |-----TST1-------|
+#        |---------------|----ipvlan      |
+#                        |----------------|
+#
+
+ALL_TESTS="
+	test_ip_set
+"
+
+source lib.sh
+
+DEBUG=0
+
+VETH_HOST=vethtst.h
+VETH_PHY=vethtst.p
+
+NS_COUNT=32
+IP_ITERATIONS=1024
+
+ns_run() {
+	ns=$1
+	shift
+	if [[ "$ns" == "global" ]]; then
+		"$@" >/dev/null
+	else
+		ip netns exec "$ns" "$@" >/dev/null
+	fi
+}
+
+test_ip_setup_env() {
+	modprobe -q tap
+	modprobe -q ipvlan
+	modprobe -q ipvtap
+
+	setup_ns NS_PHY
+
+	# setup simulated other-host (phy) and host itself
+	ip link add $VETH_HOST type veth peer name $VETH_PHY \
+		netns "$NS_PHY" >/dev/null
+	ip link set $VETH_HOST up
+	ns_run "$NS_PHY" ip link set $VETH_PHY up
+
+	for ((i=0; i<NS_COUNT; i++)); do
+		setup_ns ipvlan_ns_$i
+		ns="ipvlan_ns_$i"
+		if [ "$DEBUG" = "1" ]; then
+			echo "created NS ${!ns}"
+		fi
+		if ! ip link add netns ${!ns} ipvlan0 link $VETH_HOST \
+		    type ipvtap mode l2 bridge; then
+			exit_error "FAIL: Failed to configure ipvlan link."
+		fi
+	done
+}
+
+test_ip_cleanup_env() {
+	ip link del $VETH_HOST
+	cleanup_all_ns
+}
+
+exit_error() {
+	echo "$1"
+	exit $ksft_fail
+}
+
+rnd() {
+	echo $(( RANDOM % 32 + 16 ))
+}
+
+test_ip_set_thread() {
+	ip link set ipvlan0 up
+	for ((i=0; i<IP_ITERATIONS; i++)); do
+		v=$(rnd)
+		ip a a "172.25.0.$v/24" dev ipvlan0 2>/dev/null
+		ip a a "fc00::$v/64" dev ipvlan0 2>/dev/null
+		v=$(rnd)
+		ip a d "172.25.0.$v/24" dev ipvlan0 2>/dev/null
+		ip a d "fc00::$v/64" dev ipvlan0 2>/dev/null
+	done
+}
+
+test_ip_set() {
+	RET=0
+
+	modprobe -q tap
+	modprobe -q ipvlan
+	modprobe -q ipvtap
+
+	trap test_ip_cleanup_env EXIT
+
+	test_ip_setup_env
+
+	declare -A ns_pids
+	for ((i=0; i<NS_COUNT; i++)); do
+		ns="ipvlan_ns_$i"
+		ns_run ${!ns} bash -c "$0 test_ip_set_thread"&
+		ns_pids[$i]=$!
+	done
+
+	for ((i=0; i<NS_COUNT; i++)); do
+		wait "${ns_pids[$i]}"
+	done
+
+	declare -A all_ips
+	for ((i=0; i<NS_COUNT; i++)); do
+		ns="ipvlan_ns_$i"
+		ip_output=$(ip netns exec ${!ns} ip a l dev ipvlan0 | grep inet)
+		while IFS= read -r nsip_out; do
+			if [[ -z $nsip_out ]]; then
+				continue;
+			fi
+			nsip=$(awk '{print $2}' <<< "$nsip_out")
+			if [[ -v all_ips[$nsip] ]]; then
+				RET=$ksft_fail
+				log_test "conflict for $nsip"
+				return "$RET"
+			else
+				all_ips[$nsip]=$i
+			fi
+		done <<< "$ip_output"
+	done
+
+	if [ "$DEBUG" = "1" ]; then
+		for key in "${!all_ips[@]}"; do
+			echo "$key: ${all_ips[$key]}"
+		done
+	fi
+
+	trap - EXIT
+	test_ip_cleanup_env
+
+	log_test "test multithreaded ip set"
+}
+
+if [[ "$1" == "-d" ]]; then
+	DEBUG=1
+	shift
+fi
+
+if [[ "$1" == "-t" ]]; then
+	shift
+	TESTS="$*"
+fi
+
+if [[ "$1" == "test_ip_set_thread" ]]; then
+	test_ip_set_thread
+else
+	require_command ip
+
+	tests_run
+fi
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net 2/2] selftests: net: simple selftest for ipvtap
  2025-12-25 18:55 ` [PATCH net 2/2] selftests: net: simple selftest for ipvtap Dmitry Skorodumov
@ 2025-12-26 17:52   ` Stephen Hemminger
  2025-12-28 12:14   ` Paolo Abeni
  1 sibling, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2025-12-26 17:52 UTC (permalink / raw)
  To: Dmitry Skorodumov
  Cc: netdev, Simon Horman, linux-kernel, linux-kselftest,
	Dmitry Skorodumov, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Shuah Khan

On Thu, 25 Dec 2025 21:55:34 +0300
Dmitry Skorodumov <dskr99@gmail.com> wrote:

> +test_ip_set() {
> +	RET=0
> +
> +	modprobe -q tap
> +	modprobe -q ipvlan
> +	modprobe -q ipvtap
> +

Did you know that kernel will auto load the module when device is created
so in most cases modprobe is not needed.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net 1/2] ipvlan: Make the addrs_lock be per port
  2025-12-25 18:55 ` [PATCH net 1/2] ipvlan: Make the addrs_lock be " Dmitry Skorodumov
@ 2025-12-28  9:46   ` Paolo Abeni
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Abeni @ 2025-12-28  9:46 UTC (permalink / raw)
  To: Dmitry Skorodumov, netdev, Jakub Kicinski, Dmitry Skorodumov,
	Xiao Liang, Kuniyuki Iwashima, Julian Vetter, Guillaume Nault,
	Ido Schimmel, Eric Dumazet, Stanislav Fomichev,
	Etienne Champetier, David S. Miller, linux-kernel
  Cc: Andrew Lunn

On 12/25/25 7:55 PM, Dmitry Skorodumov wrote:
> Make the addrs_lock be per port, not per ipvlan dev.
> 
> Initial code seems to be written in the assumption,
> that any address change must occur under RTNL.
> But it is not so for the case of IPv6. So
> 
> 1) Introduce per-port addrs_lock.
> 
> 2) It was needed to fix places where it was forgotten
> to take lock (ipvlan_open/ipvlan_close)
> 
> This appears to be a very minor problem though.
> Since it's highly unlikely that ipvlan_add_addr() will
> be called on 2 CPU simultaneously. But nevertheless,
> this could cause:
> 
> 1) False-negative of ipvlan_addr_busy(): one interface
> iterated through all port->ipvlans + ipvlan->addrs
> under some ipvlan spinlock, and another added IP
> under its own lock. Though this is only possible
> for IPv6, since looks like only ipvlan_addr6_event() can be
> called without rtnl_lock.
> 
> 2) Race since ipvlan_ht_addr_add(port) is called under
> different ipvlan->addrs_lock locks
> 
> This should not affect performance, since add/remove IP
> is a rare situation and spinlock is not taken on fast
> paths.
> 
> Fixes: 8230819494b3 ("ipvlan: use per device spinlock to protect addrs list updates")
> Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
> CC: Paolo Abeni <pabeni@redhat.com>

Not so minor process nits: you should include the revision number in the
subj prefix, and you should include the main changes vs the previous
revision (and possibly even link to the previous revisions after the
'---' separator,  it will help reviewers greatly.

Patch contents LGTM.

/P


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net 2/2] selftests: net: simple selftest for ipvtap
  2025-12-25 18:55 ` [PATCH net 2/2] selftests: net: simple selftest for ipvtap Dmitry Skorodumov
  2025-12-26 17:52   ` Stephen Hemminger
@ 2025-12-28 12:14   ` Paolo Abeni
  2025-12-30 10:06     ` Dmitry Skorodumov
  1 sibling, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2025-12-28 12:14 UTC (permalink / raw)
  To: Dmitry Skorodumov, netdev, Simon Horman, linux-kernel,
	linux-kselftest
  Cc: Dmitry Skorodumov, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Shuah Khan

On 12/25/25 7:55 PM, Dmitry Skorodumov wrote:
> diff --git a/tools/testing/selftests/net/ipvtap_test.sh b/tools/testing/selftests/net/ipvtap_test.sh
> new file mode 100755
> index 000000000000..751793f26fed
> --- /dev/null
> +++ b/tools/testing/selftests/net/ipvtap_test.sh
> @@ -0,0 +1,168 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Simple tests for ipvtap
> +
> +
> +#
> +# The testing environment looks this way:
> +#
> +# |------HOST------|     |------PHY-------|
> +# |      veth<----------------->veth      |
> +# |------|--|------|     |----------------|
> +#        |  |
> +#        |  |            |-----TST0-------|
> +#        |  |------------|----ipvlan      |
> +#        |               |----------------|
> +#        |
> +#        |               |-----TST1-------|
> +#        |---------------|----ipvlan      |
> +#                        |----------------|
> +#
> +
> +ALL_TESTS="
> +	test_ip_set
> +"
> +
> +source lib.sh
> +
> +DEBUG=0
> +
> +VETH_HOST=vethtst.h
> +VETH_PHY=vethtst.p
> +
> +NS_COUNT=32
> +IP_ITERATIONS=1024
> +
> +ns_run() {
> +	ns=$1
> +	shift
> +	if [[ "$ns" == "global" ]]; then
> +		"$@" >/dev/null
> +	else
> +		ip netns exec "$ns" "$@" >/dev/null
> +	fi
> +}
> +
> +test_ip_setup_env() {
> +	modprobe -q tap
> +	modprobe -q ipvlan
> +	modprobe -q ipvtap
> +
> +	setup_ns NS_PHY
> +
> +	# setup simulated other-host (phy) and host itself
> +	ip link add $VETH_HOST type veth peer name $VETH_PHY \
> +		netns "$NS_PHY" >/dev/null

It would be better to avoid creating devices in the main netns.

> +	ip link set $VETH_HOST up
> +	ns_run "$NS_PHY" ip link set $VETH_PHY up
> +
> +	for ((i=0; i<NS_COUNT; i++)); do
> +		setup_ns ipvlan_ns_$i
> +		ns="ipvlan_ns_$i"
> +		if [ "$DEBUG" = "1" ]; then
> +			echo "created NS ${!ns}"
> +		fi
> +		if ! ip link add netns ${!ns} ipvlan0 link $VETH_HOST \
> +		    type ipvtap mode l2 bridge; then
> +			exit_error "FAIL: Failed to configure ipvlan link."
> +		fi
> +	done
> +}
> +
> +test_ip_cleanup_env() {
> +	ip link del $VETH_HOST
> +	cleanup_all_ns
> +}
> +
> +exit_error() {
> +	echo "$1"
> +	exit $ksft_fail
> +}
> +
> +rnd() {
> +	echo $(( RANDOM % 32 + 16 ))
> +}
> +
> +test_ip_set_thread() {
> +	ip link set ipvlan0 up
> +	for ((i=0; i<IP_ITERATIONS; i++)); do
> +		v=$(rnd)
> +		ip a a "172.25.0.$v/24" dev ipvlan0 2>/dev/null
> +		ip a a "fc00::$v/64" dev ipvlan0 2>/dev/null
> +		v=$(rnd)
> +		ip a d "172.25.0.$v/24" dev ipvlan0 2>/dev/null
> +		ip a d "fc00::$v/64" dev ipvlan0 2>/dev/null

It's unclear to me why the above tries to remove random addresses
different from the ones just added (possibly not existing)

> +	done
> +}
> +
> +test_ip_set() {
> +	RET=0
> +
> +	modprobe -q tap
> +	modprobe -q ipvlan
> +	modprobe -q ipvtap
> +
> +	trap test_ip_cleanup_env EXIT
> +
> +	test_ip_setup_env
> +
> +	declare -A ns_pids
> +	for ((i=0; i<NS_COUNT; i++)); do
> +		ns="ipvlan_ns_$i"
> +		ns_run ${!ns} bash -c "$0 test_ip_set_thread"&
> +		ns_pids[$i]=$!
> +	done
> +
> +	for ((i=0; i<NS_COUNT; i++)); do
> +		wait "${ns_pids[$i]}"
> +	done

This tests fails quite often in debug build due to timeout, see:

https://netdev.bots.linux.dev/flakes.html?tn-needle=ipvtap

and i.e.

https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/447821/5-ipvtap-test-sh/

You should likely decrease the thread and/or iterations count, at least
for debug builds

/P


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net 2/2] selftests: net: simple selftest for ipvtap
  2025-12-28 12:14   ` Paolo Abeni
@ 2025-12-30 10:06     ` Dmitry Skorodumov
  0 siblings, 0 replies; 8+ messages in thread
From: Dmitry Skorodumov @ 2025-12-30 10:06 UTC (permalink / raw)
  To: Paolo Abeni, Dmitry Skorodumov, netdev, Simon Horman,
	linux-kernel, linux-kselftest
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Shuah Khan


>> +test_ip_set_thread() {
>> +	ip link set ipvlan0 up
>> +	for ((i=0; i<IP_ITERATIONS; i++)); do
>> +		v=$(rnd)
>> +		ip a a "172.25.0.$v/24" dev ipvlan0 2>/dev/null
>> +		ip a a "fc00::$v/64" dev ipvlan0 2>/dev/null
>> +		v=$(rnd)
>> +		ip a d "172.25.0.$v/24" dev ipvlan0 2>/dev/null
>> +		ip a d "fc00::$v/64" dev ipvlan0 2>/dev/null
> It's unclear to me why the above tries to remove random addresses
> different from the ones just added (possibly not existing)

The idea is that we are trying to create conflicts between namespaces. If we add random address, and then delete the same address - nothing interesting happens. But if we delete some another random address - we will eventually occupy some share of IP-addresses - and conflicts start to appear. I'll mention this in comment in next version.


PS: I agree with other comments from your side. Will fix them.


Dmitry


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-12-30 10:07 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-25 18:55 [PATCH v3 net 0/2] ipvlan: addrs_lock made per port Dmitry Skorodumov
2025-12-25 18:55 ` [PATCH net 1/2] ipvlan: Make the addrs_lock be " Dmitry Skorodumov
2025-12-28  9:46   ` Paolo Abeni
2025-12-25 18:55 ` [PATCH net 2/2] selftests: net: simple selftest for ipvtap Dmitry Skorodumov
2025-12-26 17:52   ` Stephen Hemminger
2025-12-28 12:14   ` Paolo Abeni
2025-12-30 10:06     ` Dmitry Skorodumov
  -- strict thread matches above, loose matches on Subject: below --
2025-12-02 14:11 [PATCH net 0/2] ipvlan: make addrs_lock be per port Dmitry Skorodumov
2025-12-02 14:11 ` [PATCH net 1/2] ipvlan: Make the " Dmitry Skorodumov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).