From: Ido Schimmel <idosch@nvidia.com>
To: <netdev@vger.kernel.org>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<edumazet@google.com>, <andrew+netdev@lunn.ch>,
	<horms@kernel.org>, <petrm@nvidia.com>, <razor@blackwall.org>,
	Ido Schimmel <idosch@nvidia.com>
Subject: [PATCH net-next 02/15] vxlan: Simplify creation of default FDB entry
Date: Tue, 15 Apr 2025 15:11:30 +0300
Message-ID: <20250415121143.345227-3-idosch@nvidia.com>
In-Reply-To: <20250415121143.345227-1-idosch@nvidia.com>

There is asymmetry in how the default FDB entry (all-zeroes) is created
and destroyed in the VXLAN driver. It is created as part of the driver's
newlink() routine, but destroyed as part of its ndo_uninit() routine.

This caused multiple problems in the past. First, commit 0241b836732f
("vxlan: fix default fdb entry netlink notify ordering during netdev
create") split the notification about the entry from its creation so
that user space is not notified about the entry before the VXLAN device
is registered.

Then, commit 6db924687139 ("vxlan: Fix error path in
__vxlan_dev_create()") made the error path in __vxlan_dev_create()
asymmetric by destroying the FDB entry before unregistering the net
device. Otherwise, the FDB entry would have been freed twice: By
ndo_uninit() as part of unregister_netdevice() and by
vxlan_fdb_destroy() in the error path.

Finally, commit 7c31e54aeee5 ("vxlan: do not destroy fdb if
register_netdevice() is failed") split the insertion of the FDB entry
into the hash table from its creation, moving the insertion after the
registration of the net device. Otherwise, like before, the FDB entry
would have been freed twice: By ndo_uninit() as part of
register_netdevice()'s error path and by vxlan_fdb_destroy() in the
error path of __vxlan_dev_create().

The end result is that the code is unnecessarily complex. In addition,
the fixed-size hash table cannot be converted to rhashtable because
vxlan_fdb_insert() currently cannot fail, which will no longer be true
with rhashtable.

Solve this by making the addition and deletion of the default FDB entry
completely symmetric. Namely, as part of the newlink() routine, create
the entry, insert it into the hash table and send a notification to user
space, all after the net device is registered. Note that at this stage
the net device is still administratively down and cannot transmit /
receive packets.

Move the deletion from ndo_uninit() to the dellink() routine: flush the
default entry together with all the other entries, before unregistering
the net device.
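
The ordering described above can be illustrated with a small user-space
sketch. It is not the driver code: every name in it (toy_dev, toy_fdb,
toy_dev_create(), toy_dev_delete(), the toy_fail values) is hypothetical,
and the remote device link is modelled as unconditional for brevity. It
only shows the idea that each setup step has exactly one undo step, run
in reverse order through goto labels, so no resource is freed twice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
struct toy_fdb {
	int vni;
};

struct toy_dev {
	bool registered;		/* models register_netdevice() */
	bool linked;			/* models the upper device link */
	struct toy_fdb *default_fdb;	/* models the all-zeroes FDB entry */
};

/* Selects which setup step should fail, to exercise each unwind path. */
enum toy_fail {
	FAIL_NONE,
	FAIL_REGISTER,
	FAIL_LINK,
	FAIL_FDB_CREATE,
	FAIL_NOTIFY,
};

static int toy_dev_create(struct toy_dev *dev, enum toy_fail fail)
{
	int err;

	/* Step 1: register first; nothing to undo if this fails. */
	if (fail == FAIL_REGISTER)
		return -1;
	dev->registered = true;

	/* Step 2: link to the remote device. */
	if (fail == FAIL_LINK) {
		err = -2;
		goto unregister;
	}
	dev->linked = true;

	/* Step 3: create the default entry only after registration. */
	if (fail == FAIL_FDB_CREATE) {
		err = -3;
		goto unlink;
	}
	dev->default_fdb = malloc(sizeof(*dev->default_fdb));
	if (!dev->default_fdb) {
		err = -3;
		goto unlink;
	}
	dev->default_fdb->vni = 0;

	/* Step 4: notify user space about the new entry. */
	if (fail == FAIL_NOTIFY) {
		err = -4;
		goto fdb_destroy;
	}
	return 0;

	/* Unwind in reverse order; each label undoes exactly one step. */
fdb_destroy:
	free(dev->default_fdb);
	dev->default_fdb = NULL;
unlink:
	dev->linked = false;
unregister:
	dev->registered = false;
	return err;
}

/* Teardown mirrors creation: drop the entry first, unregister last. */
static void toy_dev_delete(struct toy_dev *dev)
{
	free(dev->default_fdb);
	dev->default_fdb = NULL;
	dev->linked = false;
	dev->registered = false;
}

static bool toy_dev_is_clean(const struct toy_dev *dev)
{
	return !dev->registered && !dev->linked && !dev->default_fdb;
}
```

In this sketch the entry is owned by a single path at any time: either
the error unwind in toy_dev_create() or toy_dev_delete(). That is the
property the asymmetric scheme lacked.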

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 78 +++++++++++-----------------------
 1 file changed, 25 insertions(+), 53 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 7872b85e890e..3df86927b1ec 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -2930,18 +2930,6 @@ static int vxlan_init(struct net_device *dev)
 	return err;
 }
 
-static void vxlan_fdb_delete_default(struct vxlan_dev *vxlan, __be32 vni)
-{
-	struct vxlan_fdb *f;
-	u32 hash_index = fdb_head_index(vxlan, all_zeros_mac, vni);
-
-	spin_lock_bh(&vxlan->hash_lock[hash_index]);
-	f = __vxlan_find_mac(vxlan, all_zeros_mac, vni);
-	if (f)
-		vxlan_fdb_destroy(vxlan, f, true, true);
-	spin_unlock_bh(&vxlan->hash_lock[hash_index]);
-}
-
 static void vxlan_uninit(struct net_device *dev)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
@@ -2952,8 +2940,6 @@ static void vxlan_uninit(struct net_device *dev)
 		vxlan_vnigroup_uninit(vxlan);
 
 	gro_cells_destroy(&vxlan->gro_cells);
-
-	vxlan_fdb_delete_default(vxlan, vxlan->cfg.vni);
 }
 
 /* Start ageing timer and join group when device is brought up */
@@ -3187,7 +3173,7 @@ static int vxlan_stop(struct net_device *dev)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct vxlan_fdb_flush_desc desc = {
-		/* Default entry is deleted at vxlan_uninit. */
+		/* Default entry is deleted at vxlan_dellink. */
 		.ignore_default_entry = true,
 		.state = 0,
 		.state_mask = NUD_PERMANENT | NUD_NOARP,
@@ -3963,7 +3949,6 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct net_device *remote_dev = NULL;
 	struct vxlan_fdb *f = NULL;
-	bool unregister = false;
 	struct vxlan_rdst *dst;
 	int err;
 
@@ -3974,72 +3959,62 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 
 	dev->ethtool_ops = &vxlan_ethtool_ops;
 
-	/* create an fdb entry for a valid default destination */
-	if (!vxlan_addr_any(&dst->remote_ip)) {
-		err = vxlan_fdb_create(vxlan, all_zeros_mac,
-				       &dst->remote_ip,
-				       NUD_REACHABLE | NUD_PERMANENT,
-				       vxlan->cfg.dst_port,
-				       dst->remote_vni,
-				       dst->remote_vni,
-				       dst->remote_ifindex,
-				       NTF_SELF, 0, &f, extack);
-		if (err)
-			return err;
-	}
-
 	err = register_netdevice(dev);
 	if (err)
-		goto errout;
-	unregister = true;
+		return err;
 
 	if (dst->remote_ifindex) {
 		remote_dev = __dev_get_by_index(net, dst->remote_ifindex);
 		if (!remote_dev) {
 			err = -ENODEV;
-			goto errout;
+			goto unregister;
 		}
 
 		err = netdev_upper_dev_link(remote_dev, dev, extack);
 		if (err)
-			goto errout;
+			goto unregister;
 	}
 
 	err = rtnl_configure_link(dev, NULL, 0, NULL);
 	if (err < 0)
 		goto unlink;
 
+	/* create an fdb entry for a valid default destination */
+	if (!vxlan_addr_any(&dst->remote_ip)) {
+		err = vxlan_fdb_create(vxlan, all_zeros_mac,
+				       &dst->remote_ip,
+				       NUD_REACHABLE | NUD_PERMANENT,
+				       vxlan->cfg.dst_port,
+				       dst->remote_vni,
+				       dst->remote_vni,
+				       dst->remote_ifindex,
+				       NTF_SELF, 0, &f, extack);
+		if (err)
+			goto unlink;
+	}
+
 	if (f) {
 		vxlan_fdb_insert(vxlan, all_zeros_mac, dst->remote_vni, f);
 
 		/* notify default fdb entry */
 		err = vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f),
 				       RTM_NEWNEIGH, true, extack);
-		if (err) {
-			vxlan_fdb_destroy(vxlan, f, false, false);
-			if (remote_dev)
-				netdev_upper_dev_unlink(remote_dev, dev);
-			goto unregister;
-		}
+		if (err)
+			goto fdb_destroy;
 	}
 
 	list_add(&vxlan->next, &vn->vxlan_list);
 	if (remote_dev)
 		dst->remote_dev = remote_dev;
 	return 0;
+
+fdb_destroy:
+	vxlan_fdb_destroy(vxlan, f, false, false);
 unlink:
 	if (remote_dev)
 		netdev_upper_dev_unlink(remote_dev, dev);
-errout:
-	/* unregister_netdevice() destroys the default FDB entry with deletion
-	 * notification. But the addition notification was not sent yet, so
-	 * destroy the entry by hand here.
-	 */
-	if (f)
-		__vxlan_fdb_free(f);
 unregister:
-	if (unregister)
-		unregister_netdevice(dev);
+	unregister_netdevice(dev);
 	return err;
 }
 
@@ -4520,10 +4495,7 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 static void vxlan_dellink(struct net_device *dev, struct list_head *head)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
-	struct vxlan_fdb_flush_desc desc = {
-		/* Default entry is deleted at vxlan_uninit. */
-		.ignore_default_entry = true,
-	};
+	struct vxlan_fdb_flush_desc desc = {};
 
 	vxlan_flush(vxlan, &desc);
 
-- 
2.49.0

