From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Petr Machata <petrm@nvidia.com>, Ido Schimmel <idosch@nvidia.com>,
	Nikolay Aleksandrov <razor@blackwall.org>,
	Paolo Abeni <pabeni@redhat.com>, Sasha Levin <sashal@kernel.org>,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, menglong8.dong@gmail.com, gnault@redhat.com,
	netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 6.12 315/486] vxlan: Join / leave MC group after remote changes
Date: Mon,  5 May 2025 18:36:31 -0400
Message-ID: <20250505223922.2682012-315-sashal@kernel.org>
In-Reply-To: <20250505223922.2682012-1-sashal@kernel.org>

From: Petr Machata <petrm@nvidia.com>

[ Upstream commit d42d543368343c0449a4e433b5f02e063a86209c ]

When a vxlan netdevice is brought up, if its default remote is a multicast
address, the device joins the indicated group.

Therefore when the multicast remote address changes, the device should
leave the current group and subscribe to the new one. The same applies when
the interface used for endpoint communication changes while a multicast
remote is configured. Neither is currently done.

Both vxlan_igmp_join() and vxlan_igmp_leave() can however fail. So it is
possible that with such a fix, the netdevice will end up in an inconsistent
situation where the old group is not joined anymore, but joining the new
group fails. Should we join the new group first, and leave the old one
second, we might end up in the opposite situation, where both groups are
joined. Undoing any of this during rollback is going to be similarly
problematic.
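
The two orderings discussed above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: struct membership,
enum order and switch_group() are hypothetical names, and the model assumes
that the sequence stops at the first step that fails.

```c
#include <stdbool.h>

/* Hypothetical model: which groups the device belongs to afterwards. */
struct membership {
	bool old_joined;
	bool new_joined;
};

enum order { LEAVE_THEN_JOIN, JOIN_THEN_LEAVE };

/* Run one ordering, stopping at the first failing step (an assumption). */
static struct membership switch_group(enum order o, bool leave_fails,
				      bool join_fails)
{
	/* Start out as a member of the old group only. */
	struct membership m = { .old_joined = true, .new_joined = false };

	if (o == LEAVE_THEN_JOIN) {
		if (leave_fails)
			return m;	/* still in the old group */
		m.old_joined = false;
		if (join_fails)
			return m;	/* in neither group */
		m.new_joined = true;
	} else {
		if (join_fails)
			return m;	/* still in the old group */
		m.new_joined = true;
		if (leave_fails)
			return m;	/* in both groups */
		m.old_joined = false;
	}
	return m;
}
```

With leave-first, a failed join leaves the device in neither group; with
join-first, a failed leave leaves it in both.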

One solution would be to just forbid the change when the netdevice is up.
However in vnifilter mode, changing the group address is allowed, and these
problems are simply ignored (see vxlan_vni_update_group()):

 # ip link add name br up type bridge vlan_filtering 1
 # ip link add vx1 up master br type vxlan external vnifilter local 192.0.2.1 dev lo dstport 4789
 # bridge vni add dev vx1 vni 200 group 224.0.0.1
 # tcpdump -i lo &
 # bridge vni add dev vx1 vni 200 group 224.0.0.2
 18:55:46.523438 IP 0.0.0.0 > 224.0.0.22: igmp v3 report, 1 group record(s)
 18:55:46.943447 IP 0.0.0.0 > 224.0.0.22: igmp v3 report, 1 group record(s)
 # bridge vni
 dev               vni                group/remote
 vx1               200                224.0.0.2

Having two different modes of operation for conceptually the same interface
is silly, so in this patch, just do what the vnifilter code does and deal
with the errors by crossing fingers real hard.

The vnifilter code leaves old before joining new, and in case of join /
leave failures does not roll back the configuration changes that have
already been applied, but bails out of joining if it could not leave. Do
the same here: leave before join, apply changes unconditionally and do not
attempt to join if we couldn't leave.
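
The resulting control flow can be sketched as a standalone userspace model.
Everything prefixed toy_ is a hypothetical stand-in introduced for
illustration (the real code operates on struct vxlan_dev and calls
vxlan_multicast_leave() / vxlan_multicast_join()); the error injection
fields exist only so that the failure paths can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_GROUP (-1)

/* Hypothetical stand-in for the relevant parts of the configuration. */
struct toy_conf {
	int remote;		/* opaque id of the default remote */
	bool remote_is_mcast;	/* is that remote a multicast address? */
	int ifindex;		/* interface used for endpoint communication */
};

/* Hypothetical stand-in for the device; not struct vxlan_dev. */
struct toy_dev {
	bool up;		/* models IFF_UP */
	struct toy_conf cfg;	/* currently applied configuration */
	int joined;		/* group currently joined, or TOY_NO_GROUP */
	int leave_err;		/* injected return value of toy_leave() */
	int join_err;		/* injected return value of toy_join() */
};

static int toy_leave(struct toy_dev *d)
{
	if (d->leave_err)
		return d->leave_err;
	d->joined = TOY_NO_GROUP;
	return 0;
}

static int toy_join(struct toy_dev *d)
{
	if (d->join_err)
		return d->join_err;
	d->joined = d->cfg.remote;
	return 0;
}

/* Leave before join, apply unconditionally, skip join if leave failed. */
static int toy_changelink(struct toy_dev *d, const struct toy_conf *conf)
{
	bool change_igmp;
	int err = 0;

	assert(d != NULL && conf != NULL);

	change_igmp = d->up && (conf->remote != d->cfg.remote ||
				conf->ifindex != d->cfg.ifindex);

	/* Leave the old group while the old configuration is still set. */
	if (change_igmp && d->cfg.remote_is_mcast)
		err = toy_leave(d);

	/* The new configuration is applied even if the leave failed. */
	d->cfg = *conf;

	/* Join the new group only if the leave did not fail. */
	if (!err && change_igmp && d->cfg.remote_is_mcast)
		err = toy_join(d);

	return err;
}
```

In this model a failed leave still returns an error to the caller, but the
new configuration is already in place and no join is attempted.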

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/vxlan/vxlan_core.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 5e7cdd1b806fb..01f66760e1328 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -4340,6 +4340,7 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 			    struct netlink_ext_ack *extack)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
+	bool rem_ip_changed, change_igmp;
 	struct net_device *lowerdev;
 	struct vxlan_config conf;
 	struct vxlan_rdst *dst;
@@ -4363,8 +4364,13 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 	if (err)
 		return err;
 
+	rem_ip_changed = !vxlan_addr_equal(&conf.remote_ip, &dst->remote_ip);
+	change_igmp = vxlan->dev->flags & IFF_UP &&
+		      (rem_ip_changed ||
+		       dst->remote_ifindex != conf.remote_ifindex);
+
 	/* handle default dst entry */
-	if (!vxlan_addr_equal(&conf.remote_ip, &dst->remote_ip)) {
+	if (rem_ip_changed) {
 		u32 hash_index = fdb_head_index(vxlan, all_zeros_mac, conf.vni);
 
 		spin_lock_bh(&vxlan->hash_lock[hash_index]);
@@ -4408,6 +4414,9 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 		}
 	}
 
+	if (change_igmp && vxlan_addr_multicast(&dst->remote_ip))
+		err = vxlan_multicast_leave(vxlan);
+
 	if (conf.age_interval != vxlan->cfg.age_interval)
 		mod_timer(&vxlan->age_timer, jiffies);
 
@@ -4415,7 +4424,12 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 	if (lowerdev && lowerdev != dst->remote_dev)
 		dst->remote_dev = lowerdev;
 	vxlan_config_apply(dev, &conf, lowerdev, vxlan->net, true);
-	return 0;
+
+	if (!err && change_igmp &&
+	    vxlan_addr_multicast(&dst->remote_ip))
+		err = vxlan_multicast_join(vxlan);
+
+	return err;
 }
 
 static void vxlan_dellink(struct net_device *dev, struct list_head *head)
-- 
2.39.5

