netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic
@ 2025-02-04 14:55 Ido Schimmel
  2025-02-04 14:55 ` [PATCH net-next 1/8] vxlan: Annotate FDB data races Ido Schimmel
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

tl;dr - This patchset prevents VXLAN FDB entries from lingering if
traffic is only forwarded to a silent host.

The VXLAN driver maintains two timestamps for each FDB entry: 'used' and
'updated'. The first is refreshed by both the Rx and Tx paths and the
second is refreshed upon migration.

The driver ages out entries according to their 'used' time which means
that an entry can linger when traffic is only forwarded to a silent host
that might have migrated to a different remote.

This patchset solves the problem by adjusting the above semantics and
aligning them to those of the bridge driver. That is, 'used' time is
refreshed by the Tx path, 'updated' time is refresh by Rx path or user
space updates and entries are aged out according to their 'updated'
time.

Patches #1-#2 perform small changes in how the 'used' and 'updated'
fields are accessed.

Patches #3-#5 refresh the 'updated' time where needed.

Patch #6 flips the driver to age out FDB entries according to their
'updated' time.

Patch #7 removes unnecessary updates to the 'used' time.

Patch #8 extends a test case to cover aging of FDB entries in the
presence of Tx traffic.

Ido Schimmel (8):
  vxlan: Annotate FDB data races
  vxlan: Read jiffies once when updating FDB 'used' time
  vxlan: Always refresh FDB 'updated' time when learning is enabled
  vxlan: Refresh FDB 'updated' time upon 'NTF_USE'
  vxlan: Refresh FDB 'updated' time upon user space updates
  vxlan: Age out FDB entries based on 'updated' time
  vxlan: Avoid unnecessary updates to FDB 'used' time
  selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding

 drivers/net/vxlan/vxlan_core.c                | 32 +++++++++++--------
 .../net/forwarding/vxlan_bridge_1d.sh         |  2 ++
 2 files changed, 21 insertions(+), 13 deletions(-)

-- 
2.48.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net-next 1/8] vxlan: Annotate FDB data races
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 15:41   ` Eric Dumazet
  2025-02-04 16:36   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time Ido Schimmel
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

The 'used' and 'updated' fields in the FDB entry structure can be
accessed concurrently by multiple threads, leading to reports such as
[1]. Can be reproduced using [2].

Suppress these reports by annotating these accesses using
READ_ONCE() / WRITE_ONCE().

[1]
BUG: KCSAN: data-race in vxlan_xmit / vxlan_xmit

write to 0xffff942604d263a8 of 8 bytes by task 286 on cpu 0:
 vxlan_xmit+0xb29/0x2380
 dev_hard_start_xmit+0x84/0x2f0
 __dev_queue_xmit+0x45a/0x1650
 packet_xmit+0x100/0x150
 packet_sendmsg+0x2114/0x2ac0
 __sys_sendto+0x318/0x330
 __x64_sys_sendto+0x76/0x90
 x64_sys_call+0x14e8/0x1c00
 do_syscall_64+0x9e/0x1a0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff942604d263a8 of 8 bytes by task 287 on cpu 2:
 vxlan_xmit+0xadf/0x2380
 dev_hard_start_xmit+0x84/0x2f0
 __dev_queue_xmit+0x45a/0x1650
 packet_xmit+0x100/0x150
 packet_sendmsg+0x2114/0x2ac0
 __sys_sendto+0x318/0x330
 __x64_sys_sendto+0x76/0x90
 x64_sys_call+0x14e8/0x1c00
 do_syscall_64+0x9e/0x1a0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0x00000000fffbac6e -> 0x00000000fffbac6f

Reported by Kernel Concurrency Sanitizer on:
CPU: 2 UID: 0 PID: 287 Comm: mausezahn Not tainted 6.13.0-rc7-01544-gb4b270f11a02 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014

[2]
 #!/bin/bash

 set +H
 echo whitelist > /sys/kernel/debug/kcsan
 echo !vxlan_xmit > /sys/kernel/debug/kcsan

 ip link add name vx0 up type vxlan id 10010 dstport 4789 local 192.0.2.1
 bridge fdb add 00:11:22:33:44:55 dev vx0 self static dst 198.51.100.1
 taskset -c 0 mausezahn vx0 -a own -b 00:11:22:33:44:55 -c 0 -q &
 taskset -c 2 mausezahn vx0 -a own -b 00:11:22:33:44:55 -c 0 -q &

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 05c10acb2a57..2f2c6606f719 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -227,9 +227,9 @@ static int vxlan_fdb_info(struct sk_buff *skb, struct vxlan_dev *vxlan,
 			be32_to_cpu(fdb->vni)))
 		goto nla_put_failure;
 
-	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
+	ci.ndm_used	 = jiffies_to_clock_t(now - READ_ONCE(fdb->used));
 	ci.ndm_confirmed = 0;
-	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
+	ci.ndm_updated	 = jiffies_to_clock_t(now - READ_ONCE(fdb->updated));
 	ci.ndm_refcnt	 = 0;
 
 	if (nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci))
@@ -434,8 +434,8 @@ static struct vxlan_fdb *vxlan_find_mac(struct vxlan_dev *vxlan,
 	struct vxlan_fdb *f;
 
 	f = __vxlan_find_mac(vxlan, mac, vni);
-	if (f && f->used != jiffies)
-		f->used = jiffies;
+	if (f && READ_ONCE(f->used) != jiffies)
+		WRITE_ONCE(f->used, jiffies);
 
 	return f;
 }
@@ -1009,12 +1009,12 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
 	    !(f->flags & NTF_VXLAN_ADDED_BY_USER)) {
 		if (f->state != state) {
 			f->state = state;
-			f->updated = jiffies;
+			WRITE_ONCE(f->updated, jiffies);
 			notify = 1;
 		}
 		if (f->flags != fdb_flags) {
 			f->flags = fdb_flags;
-			f->updated = jiffies;
+			WRITE_ONCE(f->updated, jiffies);
 			notify = 1;
 		}
 	}
@@ -1048,7 +1048,7 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
 	}
 
 	if (ndm_flags & NTF_USE)
-		f->used = jiffies;
+		WRITE_ONCE(f->used, jiffies);
 
 	if (notify) {
 		if (rd == NULL)
@@ -1481,7 +1481,7 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
 				    src_mac, &rdst->remote_ip.sa, &src_ip->sa);
 
 		rdst->remote_ip = *src_ip;
-		f->updated = jiffies;
+		WRITE_ONCE(f->updated, jiffies);
 		vxlan_fdb_notify(vxlan, f, rdst, RTM_NEWNEIGH, true, NULL);
 	} else {
 		u32 hash_index = fdb_head_index(vxlan, src_mac, vni);
@@ -2852,7 +2852,7 @@ static void vxlan_cleanup(struct timer_list *t)
 			if (f->flags & NTF_EXT_LEARNED)
 				continue;
 
-			timeout = f->used + vxlan->cfg.age_interval * HZ;
+			timeout = READ_ONCE(f->used) + vxlan->cfg.age_interval * HZ;
 			if (time_before_eq(timeout, jiffies)) {
 				netdev_dbg(vxlan->dev,
 					   "garbage collect %pM\n",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
  2025-02-04 14:55 ` [PATCH net-next 1/8] vxlan: Annotate FDB data races Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 15:42   ` Eric Dumazet
  2025-02-04 16:36   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled Ido Schimmel
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

Avoid two volatile reads in the data path. Instead, read jiffies once
and only if an FDB entry was found.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 2f2c6606f719..676a93ce3a19 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -434,8 +434,12 @@ static struct vxlan_fdb *vxlan_find_mac(struct vxlan_dev *vxlan,
 	struct vxlan_fdb *f;
 
 	f = __vxlan_find_mac(vxlan, mac, vni);
-	if (f && READ_ONCE(f->used) != jiffies)
-		WRITE_ONCE(f->used, jiffies);
+	if (f) {
+		unsigned long now = jiffies;
+
+		if (READ_ONCE(f->used) != now)
+			WRITE_ONCE(f->used, now);
+	}
 
 	return f;
 }
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
  2025-02-04 14:55 ` [PATCH net-next 1/8] vxlan: Annotate FDB data races Ido Schimmel
  2025-02-04 14:55 ` [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 16:37   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE' Ido Schimmel
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

Currently, when learning is enabled and a packet is received from the
expected remote, the 'updated' field of the FDB entry is not refreshed.
This will become a problem when we switch the VXLAN driver to age out
entries based on the 'updated' field.

Solve this by always refreshing an FDB entry when we receive a packet
with a matching source MAC address, regardless if it was received via
the expected remote or not as it indicates the host is alive. This is
consistent with the bridge driver's FDB.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 676a93ce3a19..36cb06a56aca 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1466,6 +1466,10 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
 	f = vxlan_find_mac(vxlan, src_mac, vni);
 	if (likely(f)) {
 		struct vxlan_rdst *rdst = first_remote_rcu(f);
+		unsigned long now = jiffies;
+
+		if (READ_ONCE(f->updated) != now)
+			WRITE_ONCE(f->updated, now);
 
 		if (likely(vxlan_addr_equal(&rdst->remote_ip, src_ip) &&
 			   rdst->remote_ifindex == ifindex))
@@ -1485,7 +1489,6 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
 				    src_mac, &rdst->remote_ip.sa, &src_ip->sa);
 
 		rdst->remote_ip = *src_ip;
-		WRITE_ONCE(f->updated, jiffies);
 		vxlan_fdb_notify(vxlan, f, rdst, RTM_NEWNEIGH, true, NULL);
 	} else {
 		u32 hash_index = fdb_head_index(vxlan, src_mac, vni);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE'
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
                   ` (2 preceding siblings ...)
  2025-02-04 14:55 ` [PATCH net-next 3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 16:37   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 5/8] vxlan: Refresh FDB 'updated' time upon user space updates Ido Schimmel
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

The 'NTF_USE' flag can be used by user space to refresh FDB entries so
that they will not age out. Currently, the VXLAN driver implements it by
refreshing the 'used' field in the FDB entry as this is the field
according to which FDB entries are aged out.

Subsequent patches will switch the VXLAN driver to age out entries based
on the 'updated' field. Prepare for this change by refreshing the
'updated' field upon 'NTF_USE'. This is consistent with the bridge
driver's FDB:

 # ip link add name br1 up type bridge
 # ip link add name swp1 master br1 up type dummy
 # bridge fdb add 00:11:22:33:44:55 dev swp1 master dynamic vlan 1
 # sleep 10
 # bridge fdb replace 00:11:22:33:44:55 dev swp1 master dynamic vlan 1
 # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
 10
 # sleep 10
 # bridge fdb replace 00:11:22:33:44:55 dev swp1 master use dynamic vlan 1
 # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
 0

Before:

 # ip link add name vx1 up type vxlan id 10010 dstport 4789
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # sleep 10
 # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 10
 # sleep 10
 # bridge fdb replace 00:11:22:33:44:55 dev vx1 self use dynamic dst 198.51.100.1
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 20

After:

 # ip link add name vx1 up type vxlan id 10010 dstport 4789
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # sleep 10
 # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 10
 # sleep 10
 # bridge fdb replace 00:11:22:33:44:55 dev vx1 self use dynamic dst 198.51.100.1
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 0

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 36cb06a56aca..c73138647110 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1051,8 +1051,10 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
 		notify |= rc;
 	}
 
-	if (ndm_flags & NTF_USE)
+	if (ndm_flags & NTF_USE) {
 		WRITE_ONCE(f->used, jiffies);
+		WRITE_ONCE(f->updated, jiffies);
+	}
 
 	if (notify) {
 		if (rd == NULL)
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 5/8] vxlan: Refresh FDB 'updated' time upon user space updates
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
                   ` (3 preceding siblings ...)
  2025-02-04 14:55 ` [PATCH net-next 4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE' Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 16:37   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 6/8] vxlan: Age out FDB entries based on 'updated' time Ido Schimmel
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

When a host migrates to a different remote and a packet is received from
the new remote, the corresponding FDB entry is updated and its 'updated'
time is refreshed.

However, when user space replaces the remote of an FDB entry, its
'updated' time is not refreshed:

 # ip link add name vx1 up type vxlan id 10010 dstport 4789
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # sleep 10
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 10
 # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.2
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 10

This can lead to the entry being aged out prematurely and it is also
inconsistent with the bridge driver:

 # ip link add name br1 up type bridge
 # ip link add name swp1 master br1 up type dummy
 # ip link add name swp2 master br1 up type dummy
 # bridge fdb add 00:11:22:33:44:55 dev swp1 master dynamic vlan 1
 # sleep 10
 # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
 10
 # bridge fdb replace 00:11:22:33:44:55 dev swp2 master dynamic vlan 1
 # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
 0

Adjust the VXLAN driver to refresh the 'updated' time of an FDB entry
whenever one of its attributes is changed by user space:

 # ip link add name vx1 up type vxlan id 10010 dstport 4789
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # sleep 10
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 10
 # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.2
 # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
 0

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index c73138647110..c75fcb0679ac 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1013,12 +1013,10 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
 	    !(f->flags & NTF_VXLAN_ADDED_BY_USER)) {
 		if (f->state != state) {
 			f->state = state;
-			WRITE_ONCE(f->updated, jiffies);
 			notify = 1;
 		}
 		if (f->flags != fdb_flags) {
 			f->flags = fdb_flags;
-			WRITE_ONCE(f->updated, jiffies);
 			notify = 1;
 		}
 	}
@@ -1060,6 +1058,7 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
 		if (rd == NULL)
 			rd = first_remote_rtnl(f);
 
+		WRITE_ONCE(f->updated, jiffies);
 		err = vxlan_fdb_notify(vxlan, f, rd, RTM_NEWNEIGH,
 				       swdev_notify, extack);
 		if (err)
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 6/8] vxlan: Age out FDB entries based on 'updated' time
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
                   ` (4 preceding siblings ...)
  2025-02-04 14:55 ` [PATCH net-next 5/8] vxlan: Refresh FDB 'updated' time upon user space updates Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 16:38   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 7/8] vxlan: Avoid unnecessary updates to FDB 'used' time Ido Schimmel
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

Currently, the VXLAN driver ages out FDB entries based on their 'used'
time which is refreshed by both the Tx and Rx paths. This means that an
FDB entry will not age out if traffic is only forwarded to the target
host:

 # ip link add name vx1 up type vxlan id 10010 local 192.0.2.1 dstport 4789 learning ageing 10
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # bridge fdb get 00:11:22:33:44:55 br vx1 self
 00:11:22:33:44:55 dev vx1 dst 198.51.100.1 self
 # mausezahn vx1 -a own -b 00:11:22:33:44:55 -c 0 -p 100 -q &
 # sleep 20
 # bridge fdb get 00:11:22:33:44:55 br vx1 self
 00:11:22:33:44:55 dev vx1 dst 198.51.100.1 self

This is wrong as an FDB entry will remain present when we no longer have
an indication that the host is still behind the current remote. It is
also inconsistent with the bridge driver:

 # ip link add name br1 up type bridge ageing_time $((10 * 100))
 # ip link add name swp1 up master br1 type dummy
 # bridge fdb add 00:11:22:33:44:55 dev swp1 master dynamic
 # bridge fdb get 00:11:22:33:44:55 br br1
 00:11:22:33:44:55 dev swp1 master br1
 # mausezahn br1 -a own -b 00:11:22:33:44:55 -c 0 -p 100 -q &
 # sleep 20
 # bridge fdb get 00:11:22:33:44:55 br br1
 Error: Fdb entry not found.

Solve this by aging out entries based on their 'updated' time, which is
not refreshed by the Tx path:

 # ip link add name vx1 up type vxlan id 10010 local 192.0.2.1 dstport 4789 learning ageing 10
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
 # bridge fdb get 00:11:22:33:44:55 br vx1 self
 00:11:22:33:44:55 dev vx1 dst 198.51.100.1 self
 # mausezahn vx1 -a own -b 00:11:22:33:44:55 -c 0 -p 100 -q &
 # sleep 20
 # bridge fdb get 00:11:22:33:44:55 br vx1 self
 Error: Fdb entry not found.

But is refreshed by the Rx path:

 # ip address add 192.0.2.1/32 dev lo
 # ip link add name vx1 up type vxlan id 10010 local 192.0.2.1 dstport 4789 localbypass
 # ip link add name vx2 up type vxlan id 20010 local 192.0.2.1 dstport 4789 learning ageing 10
 # bridge fdb add 00:11:22:33:44:55 dev vx1 self static dst 127.0.0.1 vni 20010
 # mausezahn vx1 -a 00:aa:bb:cc:dd:ee -b 00:11:22:33:44:55 -c 0 -p 100 -q &
 # sleep 20
 # bridge fdb get 00:aa:bb:cc:dd:ee br vx2 self
 00:aa:bb:cc:dd:ee dev vx2 dst 127.0.0.1 self
 # pkill mausezahn
 # sleep 20
 # bridge fdb get 00:aa:bb:cc:dd:ee br vx2 self
 Error: Fdb entry not found.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index c75fcb0679ac..01797becae09 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -2860,7 +2860,7 @@ static void vxlan_cleanup(struct timer_list *t)
 			if (f->flags & NTF_EXT_LEARNED)
 				continue;
 
-			timeout = READ_ONCE(f->used) + vxlan->cfg.age_interval * HZ;
+			timeout = READ_ONCE(f->updated) + vxlan->cfg.age_interval * HZ;
 			if (time_before_eq(timeout, jiffies)) {
 				netdev_dbg(vxlan->dev,
 					   "garbage collect %pM\n",
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 7/8] vxlan: Avoid unnecessary updates to FDB 'used' time
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
                   ` (5 preceding siblings ...)
  2025-02-04 14:55 ` [PATCH net-next 6/8] vxlan: Age out FDB entries based on 'updated' time Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 16:38   ` Nikolay Aleksandrov
  2025-02-04 14:55 ` [PATCH net-next 8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding Ido Schimmel
  2025-02-06  4:40 ` [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic patchwork-bot+netdevbpf
  8 siblings, 1 reply; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

Now that the VXLAN driver ages out FDB entries based on their 'updated'
time we can remove unnecessary updates of the 'used' time from the Rx
path and the control path, so that the 'used' time is only updated by
the Tx path.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan/vxlan_core.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 01797becae09..ece5415f9013 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1049,10 +1049,8 @@ static int vxlan_fdb_update_existing(struct vxlan_dev *vxlan,
 		notify |= rc;
 	}
 
-	if (ndm_flags & NTF_USE) {
-		WRITE_ONCE(f->used, jiffies);
+	if (ndm_flags & NTF_USE)
 		WRITE_ONCE(f->updated, jiffies);
-	}
 
 	if (notify) {
 		if (rd == NULL)
@@ -1297,7 +1295,7 @@ int __vxlan_fdb_delete(struct vxlan_dev *vxlan,
 	struct vxlan_fdb *f;
 	int err = -ENOENT;
 
-	f = vxlan_find_mac(vxlan, addr, src_vni);
+	f = __vxlan_find_mac(vxlan, addr, src_vni);
 	if (!f)
 		return err;
 
@@ -1464,7 +1462,7 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
 		ifindex = src_ifindex;
 #endif
 
-	f = vxlan_find_mac(vxlan, src_mac, vni);
+	f = __vxlan_find_mac(vxlan, src_mac, vni);
 	if (likely(f)) {
 		struct vxlan_rdst *rdst = first_remote_rcu(f);
 		unsigned long now = jiffies;
@@ -4773,7 +4771,7 @@ vxlan_fdb_offloaded_set(struct net_device *dev,
 
 	spin_lock_bh(&vxlan->hash_lock[hash_index]);
 
-	f = vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
+	f = __vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
 	if (!f)
 		goto out;
 
@@ -4829,7 +4827,7 @@ vxlan_fdb_external_learn_del(struct net_device *dev,
 	hash_index = fdb_head_index(vxlan, fdb_info->eth_addr, fdb_info->vni);
 	spin_lock_bh(&vxlan->hash_lock[hash_index]);
 
-	f = vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
+	f = __vxlan_find_mac(vxlan, fdb_info->eth_addr, fdb_info->vni);
 	if (!f)
 		err = -ENOENT;
 	else if (f->flags & NTF_EXT_LEARNED)
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
                   ` (6 preceding siblings ...)
  2025-02-04 14:55 ` [PATCH net-next 7/8] vxlan: Avoid unnecessary updates to FDB 'used' time Ido Schimmel
@ 2025-02-04 14:55 ` Ido Schimmel
  2025-02-04 16:38   ` Nikolay Aleksandrov
  2025-02-06  4:40 ` [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic patchwork-bot+netdevbpf
  8 siblings, 1 reply; 20+ messages in thread
From: Ido Schimmel @ 2025-02-04 14:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm, razor,
	Ido Schimmel

Extend the VXLAN FDB aging test case to verify that FDB entries are aged
out when they only forward traffic and not refreshed by received
traffic.

The test fails before "vxlan: Age out FDB entries based on 'updated'
time":

 # ./vxlan_bridge_1d.sh
 [...]
 TEST: VXLAN: Ageing of learned FDB entry                            [FAIL]
 [...]
 # echo $?
 1

And passes after it:

 # ./vxlan_bridge_1d.sh
 [...]
 TEST: VXLAN: Ageing of learned FDB entry                            [ OK ]
 [...]
 # echo $?
 0

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
index 3f9d50f1ef9e..180c5eca556f 100755
--- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
+++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
@@ -740,6 +740,8 @@ test_learning()
 
 	vxlan_flood_test $mac $dst 0 10 0
 
+	# The entry should age out when it only forwards traffic
+	$MZ $h1 -c 50 -d 1sec -p 64 -b $mac -B $dst -t icmp -q &
 	sleep 60
 
 	bridge fdb show brport vx1 | grep $mac | grep -q self
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 1/8] vxlan: Annotate FDB data races
  2025-02-04 14:55 ` [PATCH net-next 1/8] vxlan: Annotate FDB data races Ido Schimmel
@ 2025-02-04 15:41   ` Eric Dumazet
  2025-02-04 16:36   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2025-02-04 15:41 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, andrew+netdev, horms, petrm, razor

On Tue, Feb 4, 2025 at 3:56 PM Ido Schimmel <idosch@nvidia.com> wrote:
>
> The 'used' and 'updated' fields in the FDB entry structure can be
> accessed concurrently by multiple threads, leading to reports such as
> [1]. Can be reproduced using [2].
>
> Suppress these reports by annotating these accesses using
> READ_ONCE() / WRITE_ONCE().
>
>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time
  2025-02-04 14:55 ` [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time Ido Schimmel
@ 2025-02-04 15:42   ` Eric Dumazet
  2025-02-04 16:36   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2025-02-04 15:42 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, andrew+netdev, horms, petrm, razor

On Tue, Feb 4, 2025 at 3:56 PM Ido Schimmel <idosch@nvidia.com> wrote:
>
> Avoid two volatile reads in the data path. Instead, read jiffies once
> and only if an FDB entry was found.
>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 1/8] vxlan: Annotate FDB data races
  2025-02-04 14:55 ` [PATCH net-next 1/8] vxlan: Annotate FDB data races Ido Schimmel
  2025-02-04 15:41   ` Eric Dumazet
@ 2025-02-04 16:36   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:36 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> The 'used' and 'updated' fields in the FDB entry structure can be
> accessed concurrently by multiple threads, leading to reports such as
> [1]. Can be reproduced using [2].
> 
> Suppress these reports by annotating these accesses using
> READ_ONCE() / WRITE_ONCE().
> 
> [1]
> BUG: KCSAN: data-race in vxlan_xmit / vxlan_xmit
> 
> write to 0xffff942604d263a8 of 8 bytes by task 286 on cpu 0:
>  vxlan_xmit+0xb29/0x2380
>  dev_hard_start_xmit+0x84/0x2f0
>  __dev_queue_xmit+0x45a/0x1650
>  packet_xmit+0x100/0x150
>  packet_sendmsg+0x2114/0x2ac0
>  __sys_sendto+0x318/0x330
>  __x64_sys_sendto+0x76/0x90
>  x64_sys_call+0x14e8/0x1c00
>  do_syscall_64+0x9e/0x1a0
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> read to 0xffff942604d263a8 of 8 bytes by task 287 on cpu 2:
>  vxlan_xmit+0xadf/0x2380
>  dev_hard_start_xmit+0x84/0x2f0
>  __dev_queue_xmit+0x45a/0x1650
>  packet_xmit+0x100/0x150
>  packet_sendmsg+0x2114/0x2ac0
>  __sys_sendto+0x318/0x330
>  __x64_sys_sendto+0x76/0x90
>  x64_sys_call+0x14e8/0x1c00
>  do_syscall_64+0x9e/0x1a0
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> value changed: 0x00000000fffbac6e -> 0x00000000fffbac6f
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 2 UID: 0 PID: 287 Comm: mausezahn Not tainted 6.13.0-rc7-01544-gb4b270f11a02 #5
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
> 
> [2]
>  #!/bin/bash
> 
>  set +H
>  echo whitelist > /sys/kernel/debug/kcsan
>  echo !vxlan_xmit > /sys/kernel/debug/kcsan
> 
>  ip link add name vx0 up type vxlan id 10010 dstport 4789 local 192.0.2.1
>  bridge fdb add 00:11:22:33:44:55 dev vx0 self static dst 198.51.100.1
>  taskset -c 0 mausezahn vx0 -a own -b 00:11:22:33:44:55 -c 0 -q &
>  taskset -c 2 mausezahn vx0 -a own -b 00:11:22:33:44:55 -c 0 -q &
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time
  2025-02-04 14:55 ` [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time Ido Schimmel
  2025-02-04 15:42   ` Eric Dumazet
@ 2025-02-04 16:36   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:36 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> Avoid two volatile reads in the data path. Instead, read jiffies once
> and only if an FDB entry was found.
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled
  2025-02-04 14:55 ` [PATCH net-next 3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled Ido Schimmel
@ 2025-02-04 16:37   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:37 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> Currently, when learning is enabled and a packet is received from the
> expected remote, the 'updated' field of the FDB entry is not refreshed.
> This will become a problem when we switch the VXLAN driver to age out
> entries based on the 'updated' field.
> 
> Solve this by always refreshing an FDB entry when we receive a packet
> with a matching source MAC address, regardless if it was received via
> the expected remote or not as it indicates the host is alive. This is
> consistent with the bridge driver's FDB.
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE'
  2025-02-04 14:55 ` [PATCH net-next 4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE' Ido Schimmel
@ 2025-02-04 16:37   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:37 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> The 'NTF_USE' flag can be used by user space to refresh FDB entries so
> that they will not age out. Currently, the VXLAN driver implements it by
> refreshing the 'used' field in the FDB entry as this is the field
> according to which FDB entries are aged out.
> 
> Subsequent patches will switch the VXLAN driver to age out entries based
> on the 'updated' field. Prepare for this change by refreshing the
> 'updated' field upon 'NTF_USE'. This is consistent with the bridge
> driver's FDB:
> 
>  # ip link add name br1 up type bridge
>  # ip link add name swp1 master br1 up type dummy
>  # bridge fdb add 00:11:22:33:44:55 dev swp1 master dynamic vlan 1
>  # sleep 10
>  # bridge fdb replace 00:11:22:33:44:55 dev swp1 master dynamic vlan 1
>  # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
>  10
>  # sleep 10
>  # bridge fdb replace 00:11:22:33:44:55 dev swp1 master use dynamic vlan 1
>  # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
>  0
> 
> Before:
> 
>  # ip link add name vx1 up type vxlan id 10010 dstport 4789
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # sleep 10
>  # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  10
>  # sleep 10
>  # bridge fdb replace 00:11:22:33:44:55 dev vx1 self use dynamic dst 198.51.100.1
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  20
> 
> After:
> 
>  # ip link add name vx1 up type vxlan id 10010 dstport 4789
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # sleep 10
>  # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  10
>  # sleep 10
>  # bridge fdb replace 00:11:22:33:44:55 dev vx1 self use dynamic dst 198.51.100.1
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  0
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 5/8] vxlan: Refresh FDB 'updated' time upon user space updates
  2025-02-04 14:55 ` [PATCH net-next 5/8] vxlan: Refresh FDB 'updated' time upon user space updates Ido Schimmel
@ 2025-02-04 16:37   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:37 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> When a host migrates to a different remote and a packet is received from
> the new remote, the corresponding FDB entry is updated and its 'updated'
> time is refreshed.
> 
> However, when user space replaces the remote of an FDB entry, its
> 'updated' time is not refreshed:
> 
>  # ip link add name vx1 up type vxlan id 10010 dstport 4789
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # sleep 10
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  10
>  # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.2
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  10
> 
> This can lead to the entry being aged out prematurely and it is also
> inconsistent with the bridge driver:
> 
>  # ip link add name br1 up type bridge
>  # ip link add name swp1 master br1 up type dummy
>  # ip link add name swp2 master br1 up type dummy
>  # bridge fdb add 00:11:22:33:44:55 dev swp1 master dynamic vlan 1
>  # sleep 10
>  # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
>  10
>  # bridge fdb replace 00:11:22:33:44:55 dev swp2 master dynamic vlan 1
>  # bridge -s -j fdb get 00:11:22:33:44:55 br br1 vlan 1 | jq '.[]["updated"]'
>  0
> 
> Adjust the VXLAN driver to refresh the 'updated' time of an FDB entry
> whenever one of its attributes is changed by user space:
> 
>  # ip link add name vx1 up type vxlan id 10010 dstport 4789
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # sleep 10
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  10
>  # bridge fdb replace 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.2
>  # bridge -s -j -p fdb get 00:11:22:33:44:55 br vx1 self | jq '.[]["updated"]'
>  0
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 6/8] vxlan: Age out FDB entries based on 'updated' time
  2025-02-04 14:55 ` [PATCH net-next 6/8] vxlan: Age out FDB entries based on 'updated' time Ido Schimmel
@ 2025-02-04 16:38   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:38 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> Currently, the VXLAN driver ages out FDB entries based on their 'used'
> time which is refreshed by both the Tx and Rx paths. This means that an
> FDB entry will not age out if traffic is only forwarded to the target
> host:
> 
>  # ip link add name vx1 up type vxlan id 10010 local 192.0.2.1 dstport 4789 learning ageing 10
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # bridge fdb get 00:11:22:33:44:55 br vx1 self
>  00:11:22:33:44:55 dev vx1 dst 198.51.100.1 self
>  # mausezahn vx1 -a own -b 00:11:22:33:44:55 -c 0 -p 100 -q &
>  # sleep 20
>  # bridge fdb get 00:11:22:33:44:55 br vx1 self
>  00:11:22:33:44:55 dev vx1 dst 198.51.100.1 self
> 
> This is wrong as an FDB entry will remain present when we no longer have
> an indication that the host is still behind the current remote. It is
> also inconsistent with the bridge driver:
> 
>  # ip link add name br1 up type bridge ageing_time $((10 * 100))
>  # ip link add name swp1 up master br1 type dummy
>  # bridge fdb add 00:11:22:33:44:55 dev swp1 master dynamic
>  # bridge fdb get 00:11:22:33:44:55 br br1
>  00:11:22:33:44:55 dev swp1 master br1
>  # mausezahn br1 -a own -b 00:11:22:33:44:55 -c 0 -p 100 -q &
>  # sleep 20
>  # bridge fdb get 00:11:22:33:44:55 br br1
>  Error: Fdb entry not found.
> 
> Solve this by aging out entries based on their 'updated' time, which is
> not refreshed by the Tx path:
> 
>  # ip link add name vx1 up type vxlan id 10010 local 192.0.2.1 dstport 4789 learning ageing 10
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self dynamic dst 198.51.100.1
>  # bridge fdb get 00:11:22:33:44:55 br vx1 self
>  00:11:22:33:44:55 dev vx1 dst 198.51.100.1 self
>  # mausezahn vx1 -a own -b 00:11:22:33:44:55 -c 0 -p 100 -q &
>  # sleep 20
>  # bridge fdb get 00:11:22:33:44:55 br vx1 self
>  Error: Fdb entry not found.
> 
> But is refreshed by the Rx path:
> 
>  # ip address add 192.0.2.1/32 dev lo
>  # ip link add name vx1 up type vxlan id 10010 local 192.0.2.1 dstport 4789 localbypass
>  # ip link add name vx2 up type vxlan id 20010 local 192.0.2.1 dstport 4789 learning ageing 10
>  # bridge fdb add 00:11:22:33:44:55 dev vx1 self static dst 127.0.0.1 vni 20010
>  # mausezahn vx1 -a 00:aa:bb:cc:dd:ee -b 00:11:22:33:44:55 -c 0 -p 100 -q &
>  # sleep 20
>  # bridge fdb get 00:aa:bb:cc:dd:ee br vx2 self
>  00:aa:bb:cc:dd:ee dev vx2 dst 127.0.0.1 self
>  # pkill mausezahn
>  # sleep 20
>  # bridge fdb get 00:aa:bb:cc:dd:ee br vx2 self
>  Error: Fdb entry not found.
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 7/8] vxlan: Avoid unnecessary updates to FDB 'used' time
  2025-02-04 14:55 ` [PATCH net-next 7/8] vxlan: Avoid unnecessary updates to FDB 'used' time Ido Schimmel
@ 2025-02-04 16:38   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:38 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> Now that the VXLAN driver ages out FDB entries based on their 'updated'
> time we can remove unnecessary updates of the 'used' time from the Rx
> path and the control path, so that the 'used' time is only updated by
> the Tx path.
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan/vxlan_core.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding
  2025-02-04 14:55 ` [PATCH net-next 8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding Ido Schimmel
@ 2025-02-04 16:38   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 20+ messages in thread
From: Nikolay Aleksandrov @ 2025-02-04 16:38 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, horms, petrm

On 2/4/25 16:55, Ido Schimmel wrote:
> Extend the VXLAN FDB aging test case to verify that FDB entries are aged
> out when they only forward traffic and not refreshed by received
> traffic.
> 
> The test fails before "vxlan: Age out FDB entries based on 'updated'
> time":
> 
>  # ./vxlan_bridge_1d.sh
>  [...]
>  TEST: VXLAN: Ageing of learned FDB entry                            [FAIL]
>  [...]
>  # echo $?
>  1
> 
> And passes after it:
> 
>  # ./vxlan_bridge_1d.sh
>  [...]
>  TEST: VXLAN: Ageing of learned FDB entry                            [ OK ]
>  [...]
>  # echo $?
>  0
> 
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh | 2 ++
>  1 file changed, 2 insertions(+)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic
  2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
                   ` (7 preceding siblings ...)
  2025-02-04 14:55 ` [PATCH net-next 8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding Ido Schimmel
@ 2025-02-06  4:40 ` patchwork-bot+netdevbpf
  8 siblings, 0 replies; 20+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-02-06  4:40 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, andrew+netdev, horms,
	petrm, razor

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 4 Feb 2025 16:55:41 +0200 you wrote:
> tl;dr - This patchset prevents VXLAN FDB entries from lingering if
> traffic is only forwarded to a silent host.
> 
> The VXLAN driver maintains two timestamps for each FDB entry: 'used' and
> 'updated'. The first is refreshed by both the Rx and Tx paths and the
> second is refreshed upon migration.
> 
> [...]

Here is the summary with links:
  - [net-next,1/8] vxlan: Annotate FDB data races
    https://git.kernel.org/netdev/net-next/c/f6205f8215f1
  - [net-next,2/8] vxlan: Read jiffies once when updating FDB 'used' time
    https://git.kernel.org/netdev/net-next/c/1370c45d6e7e
  - [net-next,3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled
    https://git.kernel.org/netdev/net-next/c/c4f2082bf641
  - [net-next,4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE'
    https://git.kernel.org/netdev/net-next/c/40a9994f2fbd
  - [net-next,5/8] vxlan: Refresh FDB 'updated' time upon user space updates
    https://git.kernel.org/netdev/net-next/c/fb2f449eca51
  - [net-next,6/8] vxlan: Age out FDB entries based on 'updated' time
    https://git.kernel.org/netdev/net-next/c/b4a1d98b0fa5
  - [net-next,7/8] vxlan: Avoid unnecessary updates to FDB 'used' time
    https://git.kernel.org/netdev/net-next/c/9722f834fe9a
  - [net-next,8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding
    https://git.kernel.org/netdev/net-next/c/c467a98e1de0

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-02-06  4:40 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-04 14:55 [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic Ido Schimmel
2025-02-04 14:55 ` [PATCH net-next 1/8] vxlan: Annotate FDB data races Ido Schimmel
2025-02-04 15:41   ` Eric Dumazet
2025-02-04 16:36   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 2/8] vxlan: Read jiffies once when updating FDB 'used' time Ido Schimmel
2025-02-04 15:42   ` Eric Dumazet
2025-02-04 16:36   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 3/8] vxlan: Always refresh FDB 'updated' time when learning is enabled Ido Schimmel
2025-02-04 16:37   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 4/8] vxlan: Refresh FDB 'updated' time upon 'NTF_USE' Ido Schimmel
2025-02-04 16:37   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 5/8] vxlan: Refresh FDB 'updated' time upon user space updates Ido Schimmel
2025-02-04 16:37   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 6/8] vxlan: Age out FDB entries based on 'updated' time Ido Schimmel
2025-02-04 16:38   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 7/8] vxlan: Avoid unnecessary updates to FDB 'used' time Ido Schimmel
2025-02-04 16:38   ` Nikolay Aleksandrov
2025-02-04 14:55 ` [PATCH net-next 8/8] selftests: forwarding: vxlan_bridge_1d: Check aging while forwarding Ido Schimmel
2025-02-04 16:38   ` Nikolay Aleksandrov
2025-02-06  4:40 ` [PATCH net-next 0/8] vxlan: Age FDB entries based on Rx traffic patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).