netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries
@ 2023-06-19  7:14 Johannes Nixdorf
  2023-06-19  7:14 ` [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry Johannes Nixdorf
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Johannes Nixdorf @ 2023-06-19  7:14 UTC (permalink / raw)
  To: bridge
  Cc: netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu,
	Ido Schimmel, Johannes Nixdorf

Introduce a limit on the amount of learned FDB entries on a bridge,
configured by netlink with a build time default on bridge creation in
the kernel config.

For backwards compatibility the kernel config default is disabling the
limit (0).

Without any limit a malicious actor may OOM a kernel by spamming packets
with changing MAC addresses on their bridge port, so allow the bridge
creator to limit the number of entries.

Currently the manual entries are identified by the bridge flags
BR_FDB_LOCAL or BR_FDB_ADDED_BY_USER, and changes to those flags are
protected under a lock. This means the limit also applies to entries
created with BR_FDB_ADDED_BY_EXT_LEARN but none of the other two,
e.g. ones added by SWITCHDEV_FDB_ADD_TO_BRIDGE.

v1: https://lore.kernel.org/netdev/20230515085046.4457-1-jnixdorf-oss@avm.de/

Changes since v1:
 - Added BR_FDB_ADDED_BY_USER earlier in fdb_add_entry to ensure the
   limit is not applied.
 - Do not initialize fdb_*_entries to 0. (from review)
 - Do not skip decrementing on 0. (from review)
 - Moved the counters to a conditional hole in struct net_bridge to
   avoid growing the struct. (from review, it still grows the struct as
   there are 2 32-bit values)
 - Add IFLA_BR_FDB_CUR_LEARNED_ENTRIES (from review)
 - Fix br_get_size() with the added attributes.
 - Only limit learned entries, rename to
   *_(CUR|MAX)_LEARNED_ENTRIES. (from review)
 - Added a default limit in Kconfig. (deemed acceptable in review
   comments, helps with embedded use-cases where a special purpose kernel
   is built anyways)
 - Added an iproute2 patch for easier testing.

Obsolete v1 review comments:
 - Return better errors to users: Due to limiting the limit to
   automatically created entries, netlink fdb add requests and changing
   bridge ports are never rejected, so they do not yet need a more
   friendly error returned.

net-next:

Johannes Nixdorf (3):
  bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry
  bridge: Add a limit on learned FDB entries
  net: bridge: Add a configurable default FDB learning limit

 include/uapi/linux/if_link.h |  2 +
 net/bridge/Kconfig           | 13 +++++++
 net/bridge/br_device.c       |  2 +
 net/bridge/br_fdb.c          | 73 ++++++++++++++++++++++++++++++++----
 net/bridge/br_netlink.c      | 13 ++++++-
 net/bridge/br_private.h      |  6 +++
 6 files changed, 101 insertions(+), 8 deletions(-)

iproute2-next:

Johannes Nixdorf (1):
  iplink: bridge: Add support for bridge FDB learning limits

 include/uapi/linux/if_link.h |  2 ++
 ip/iplink_bridge.c           | 21 +++++++++++++++++++++
 man/man8/ip-link.8.in        |  9 +++++++++
 3 files changed, 32 insertions(+)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry
  2023-06-19  7:14 [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries Johannes Nixdorf
@ 2023-06-19  7:14 ` Johannes Nixdorf
  2023-06-19 14:50   ` Ido Schimmel
  2023-06-19  7:14 ` [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries Johannes Nixdorf
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Johannes Nixdorf @ 2023-06-19  7:14 UTC (permalink / raw)
  To: bridge
  Cc: netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu,
	Ido Schimmel, Johannes Nixdorf

This allows the called fdb_create to detect that the entry was added by
the user early in the process. This is in preparation to adding limits
in fdb_create that should not apply to user created fdb entries.

Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>

---

Changes since v1:
 - Added this change to ensure user added entries are not limited.

 net/bridge/br_fdb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e69a872bfc1d..ac1dc8723b9c 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1056,7 +1056,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 		if (!(flags & NLM_F_CREATE))
 			return -ENOENT;
 
-		fdb = fdb_create(br, source, addr, vid, 0);
+		fdb = fdb_create(br, source, addr, vid, BR_FDB_ADDED_BY_USER);
 		if (!fdb)
 			return -ENOMEM;
 
@@ -1069,6 +1069,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 			WRITE_ONCE(fdb->dst, source);
 			modified = true;
 		}
+
+		set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 	}
 
 	if (fdb_to_nud(br, fdb) != state) {
@@ -1100,8 +1102,6 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 	if (fdb_handle_notify(fdb, notify))
 		modified = true;
 
-	set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
-
 	fdb->used = jiffies;
 	if (modified) {
 		if (refresh)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
  2023-06-19  7:14 [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries Johannes Nixdorf
  2023-06-19  7:14 ` [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry Johannes Nixdorf
@ 2023-06-19  7:14 ` Johannes Nixdorf
  2023-06-19 15:34   ` Ido Schimmel
  2023-06-20  6:55   ` Nikolay Aleksandrov
  2023-06-19  7:14 ` [PATCH net-next v2 3/3] net: bridge: Add a configurable default FDB learning limit Johannes Nixdorf
  2023-06-19  7:14 ` [PATCH iproute2-next 1/1] iplink: bridge: Add support for bridge FDB learning limits Johannes Nixdorf
  3 siblings, 2 replies; 13+ messages in thread
From: Johannes Nixdorf @ 2023-06-19  7:14 UTC (permalink / raw)
  To: bridge
  Cc: netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu,
	Ido Schimmel, Johannes Nixdorf

A malicious actor behind one bridge port may spam the kernel with packets
with a random source MAC address, each of which will create an FDB entry,
each of which is a dynamic allocation in the kernel.

There are roughly 2^48 different MAC addresses, further limited by the
rhashtable they are stored in to 2^31. Each entry is of the type struct
net_bridge_fdb_entry, which is currently 128 bytes big. This means the
maximum amount of memory allocated for FDB entries is 2^31 * 128B =
256GiB, which is too much for most computers.

Mitigate this by adding a bridge netlink setting
IFLA_BR_FDB_MAX_LEARNED_ENTRIES, which, if nonzero, limits the amount
of learned entries to a user specified maximum.

For backwards compatibility the default setting of 0 disables the limit.

User-added entries by netlink or from bridge or bridge port addresses
are never blocked and do not count towards that limit.

All changes to fdb_n_entries are under br->hash_lock, which means we do
not need additional locking. The call paths are (✓ denotes that
br->hash_lock is taken around the next call):

 - fdb_delete <-+- fdb_delete_local <-+- br_fdb_changeaddr ✓
                |                     +- br_fdb_change_mac_address ✓
                |                     +- br_fdb_delete_by_port ✓
                +- br_fdb_find_delete_local ✓
                +- fdb_add_local <-+- br_fdb_changeaddr ✓
                |                  +- br_fdb_change_mac_address ✓
                |                  +- br_fdb_add_local ✓
                +- br_fdb_cleanup ✓
                +- br_fdb_flush ✓
                +- br_fdb_delete_by_port ✓
                +- fdb_delete_by_addr_and_port <--- __br_fdb_delete ✓
                +- br_fdb_external_learn_del ✓
 - fdb_create <-+- fdb_add_local <-+- br_fdb_changeaddr ✓
                |                  +- br_fdb_change_mac_address ✓
                |                  +- br_fdb_add_local ✓
                +- br_fdb_update ✓
                +- fdb_add_entry <--- __br_fdb_add ✓
                +- br_fdb_external_learn_add ✓

The flags that imply an entry does not come from learning
(BR_FDB_NOT_LEARNED_MASK) are now only set or cleared under br->hash_lock
as well, and when the boolean value of (fdb->flags &
BR_FDB_NOT_LEARNED_MASK) changes the accounting is updated.

This introduces one additional locked update in br_fdb_update if
BR_FDB_ADDED_BY_USER was set. This is only the case when creating a new
entry via netlink, and never in the packet handling fast path.

Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>

---

Changes since v1:
 - Do not initialize fdb_*_entries to 0. (from review)
 - Do not skip decrementing on 0. (from review)
 - Moved the counters to a conditional hole in struct net_bridge to
   avoid growing the struct. (from review, it still grows the struct as
   there are 2 32-bit values)
 - Add IFLA_BR_FDB_CUR_LEARNED_ENTRIES (from review)
 - Fix br_get_size()
 - Only limit learned entries, rename to
   *_(CUR|MAX)_LEARNED_ENTRIES. (from review)

Obsolete v1 review comments:
 - Return better errors to users: Due to limiting the limit to
   automatically created entries, netlink fdb add requests and changing
   bridge ports are never rejected, so they do not yet need a more
   friendly error returned.

 include/uapi/linux/if_link.h |  2 ++
 net/bridge/br_fdb.c          | 67 +++++++++++++++++++++++++++++++++---
 net/bridge/br_netlink.c      | 13 ++++++-
 net/bridge/br_private.h      |  6 ++++
 4 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 4ac1000b0ef2..165b9014379b 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -510,6 +510,8 @@ enum {
 	IFLA_BR_VLAN_STATS_PER_PORT,
 	IFLA_BR_MULTI_BOOLOPT,
 	IFLA_BR_MCAST_QUERIER_STATE,
+	IFLA_BR_FDB_CUR_LEARNED_ENTRIES,
+	IFLA_BR_FDB_MAX_LEARNED_ENTRIES,
 	__IFLA_BR_MAX,
 };
 
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index ac1dc8723b9c..bc61d1fd5fcf 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -301,6 +301,38 @@ static void fdb_add_hw_addr(struct net_bridge *br, const unsigned char *addr)
 	}
 }
 
+/* Set a FDB flag that implies the entry was not learned, and account
+ * for changes in the learned status.
+ */
+static void __fdb_set_flag_not_learned(struct net_bridge *br,
+				       struct net_bridge_fdb_entry *fdb,
+				       long nr)
+{
+	WARN_ON_ONCE(!(BIT(nr) & BR_FDB_NOT_LEARNED_MASK));
+
+	/* learned before, but we set a flag that implies it's manually added */
+	if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
+		br->fdb_cur_learned_entries--;
+	set_bit(nr, &fdb->flags);
+}
+
+/* Set a FDB flag that implies the entry was not learned, and account
+ * for changes in the learned status.
+ *
+ * This function takes a lock, so ensure it is not called in the fast
+ * path.
+ */
+static void fdb_set_flag_not_learned(struct net_bridge *br,
+				     struct net_bridge_fdb_entry *fdb,
+				     long nr)
+{
+	spin_lock_bh(&br->hash_lock);
+
+	__fdb_set_flag_not_learned(br, fdb, nr);
+
+	spin_unlock_bh(&br->hash_lock);
+}
+
 /* When a static FDB entry is deleted, the HW address from that entry is
  * also removed from the bridge private HW address list and updates all
  * the ports with needed information.
@@ -321,6 +353,8 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
 static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
 		       bool swdev_notify)
 {
+	bool learned = !(f->flags & BR_FDB_NOT_LEARNED_MASK);
+
 	trace_fdb_delete(br, f);
 
 	if (test_bit(BR_FDB_STATIC, &f->flags))
@@ -329,11 +363,16 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
 	hlist_del_init_rcu(&f->fdb_node);
 	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
 			       br_fdb_rht_params);
+	br->fdb_cur_learned_entries -= learned;
 	fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
 	call_rcu(&f->rcu, fdb_rcu_free);
 }
 
-/* Delete a local entry if no other port had the same address. */
+/* Delete a local entry if no other port had the same address.
+ *
+ * This function should only be called on entries with BR_FDB_LOCAL set,
+ * so clear_bit never removes the last bit in BR_FDB_NOT_LEARNED_MASK.
+ */
 static void fdb_delete_local(struct net_bridge *br,
 			     const struct net_bridge_port *p,
 			     struct net_bridge_fdb_entry *f)
@@ -390,6 +429,11 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
 {
 	struct net_bridge_fdb_entry *fdb;
 	int err;
+	bool learned = !(flags & BR_FDB_NOT_LEARNED_MASK);
+
+	if (unlikely(learned && br->fdb_max_learned_entries &&
+		     br->fdb_cur_learned_entries >= br->fdb_max_learned_entries))
+		return NULL;
 
 	fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC);
 	if (!fdb)
@@ -409,6 +453,8 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
 
 	hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list);
 
+	br->fdb_cur_learned_entries += learned;
+
 	return fdb;
 }
 
@@ -894,7 +940,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 			}
 
 			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
-				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+				fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
 			if (unlikely(fdb_modified)) {
 				trace_br_fdb_update(br, source, addr, vid, flags);
 				fdb_notify(br, fdb, RTM_NEWNEIGH, true);
@@ -1070,6 +1116,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 			modified = true;
 		}
 
+		if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
+			br->fdb_cur_learned_entries--;
 		set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 	}
 
@@ -1440,10 +1488,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		}
 
 		if (swdev_notify)
-			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+			__fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
 
 		if (!p)
-			set_bit(BR_FDB_LOCAL, &fdb->flags);
+			__fdb_set_flag_not_learned(br, fdb, BR_FDB_LOCAL);
 
 		if (modified)
 			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
@@ -1508,3 +1556,14 @@ void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
 	spin_unlock_bh(&p->br->hash_lock);
 }
 EXPORT_SYMBOL_GPL(br_fdb_clear_offload);
+
+u32 br_fdb_get_cur_learned_entries(struct net_bridge *br)
+{
+	u32 ret;
+
+	spin_lock_bh(&br->hash_lock);
+	ret = br->fdb_cur_learned_entries;
+	spin_unlock_bh(&br->hash_lock);
+
+	return ret;
+}
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 05c5863d2e20..954c468d52ec 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1527,6 +1527,12 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
 			return err;
 	}
 
+	if (data[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]) {
+		u32 val = nla_get_u32(data[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]);
+
+		br->fdb_max_learned_entries = val;
+	}
+
 	return 0;
 }
 
@@ -1581,6 +1587,8 @@ static size_t br_get_size(const struct net_device *brdev)
 	       nla_total_size_64bit(sizeof(u64)) + /* IFLA_BR_TOPOLOGY_CHANGE_TIMER */
 	       nla_total_size_64bit(sizeof(u64)) + /* IFLA_BR_GC_TIMER */
 	       nla_total_size(ETH_ALEN) +       /* IFLA_BR_GROUP_ADDR */
+	       nla_total_size(sizeof(u32)) +    /* IFLA_BR_FDB_CUR_LEARNED_ENTRIES */
+	       nla_total_size(sizeof(u32)) +    /* IFLA_BR_FDB_MAX_LEARNED_ENTRIES */
 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 	       nla_total_size(sizeof(u8)) +     /* IFLA_BR_MCAST_ROUTER */
 	       nla_total_size(sizeof(u8)) +     /* IFLA_BR_MCAST_SNOOPING */
@@ -1620,6 +1628,7 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
 	u32 stp_enabled = br->stp_enabled;
 	u16 priority = (br->bridge_id.prio[0] << 8) | br->bridge_id.prio[1];
 	u8 vlan_enabled = br_vlan_enabled(br->dev);
+	u32 fdb_cur_learned_entries = br_fdb_get_cur_learned_entries(br);
 	struct br_boolopt_multi bm;
 	u64 clockval;
 
@@ -1656,7 +1665,9 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
 	    nla_put_u8(skb, IFLA_BR_TOPOLOGY_CHANGE_DETECTED,
 		       br->topology_change_detected) ||
 	    nla_put(skb, IFLA_BR_GROUP_ADDR, ETH_ALEN, br->group_addr) ||
-	    nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm))
+	    nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm) ||
+	    nla_put_u32(skb, IFLA_BR_FDB_CUR_LEARNED_ENTRIES, fdb_cur_learned_entries) ||
+	    nla_put_u32(skb, IFLA_BR_FDB_MAX_LEARNED_ENTRIES, br->fdb_max_learned_entries))
 		return -EMSGSIZE;
 
 #ifdef CONFIG_BRIDGE_VLAN_FILTERING
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 2119729ded2b..df079191479e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -275,6 +275,8 @@ enum {
 	BR_FDB_LOCKED,
 };
 
+#define BR_FDB_NOT_LEARNED_MASK (BIT(BR_FDB_LOCAL) | BIT(BR_FDB_ADDED_BY_USER))
+
 struct net_bridge_fdb_key {
 	mac_addr addr;
 	u16 vlan_id;
@@ -553,6 +555,9 @@ struct net_bridge {
 	struct kobject			*ifobj;
 	u32				auto_cnt;
 
+	u32				fdb_max_learned_entries;
+	u32				fdb_cur_learned_entries;
+
 #ifdef CONFIG_NET_SWITCHDEV
 	/* Counter used to make sure that hardware domains get unique
 	 * identifiers in case a bridge spans multiple switchdev instances.
@@ -838,6 +843,7 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      bool swdev_notify);
 void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
 			  const unsigned char *addr, u16 vid, bool offloaded);
+u32 br_fdb_get_cur_learned_entries(struct net_bridge *br);
 
 /* br_forward.c */
 enum br_pkt_type {
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next v2 3/3] net: bridge: Add a configurable default FDB learning limit
  2023-06-19  7:14 [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries Johannes Nixdorf
  2023-06-19  7:14 ` [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry Johannes Nixdorf
  2023-06-19  7:14 ` [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries Johannes Nixdorf
@ 2023-06-19  7:14 ` Johannes Nixdorf
  2023-06-20  6:56   ` Nikolay Aleksandrov
  2023-06-19  7:14 ` [PATCH iproute2-next 1/1] iplink: bridge: Add support for bridge FDB learning limits Johannes Nixdorf
  3 siblings, 1 reply; 13+ messages in thread
From: Johannes Nixdorf @ 2023-06-19  7:14 UTC (permalink / raw)
  To: bridge
  Cc: netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu,
	Ido Schimmel, Johannes Nixdorf

This adds a Kconfig option to configure a default FDB learning limit
system wide, so a distributor building a special purpose kernel can
limit all created bridges by default.

The limit is only a soft default setting and overridable per bridge
using netlink.

Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>

---

Changes since v1:
 - Added a default limit in Kconfig. (deemed acceptable in review
   comments)

 net/bridge/Kconfig     | 13 +++++++++++++
 net/bridge/br_device.c |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/net/bridge/Kconfig b/net/bridge/Kconfig
index 3c8ded7d3e84..c0d9c08088c4 100644
--- a/net/bridge/Kconfig
+++ b/net/bridge/Kconfig
@@ -84,3 +84,16 @@ config BRIDGE_CFM
 	  Say N to exclude this support and reduce the binary size.
 
 	  If unsure, say N.
+
+config BRIDGE_DEFAULT_FDB_MAX_LEARNED
+	int "Default FDB learning limit"
+	default 0
+	depends on BRIDGE
+	help
+	  Sets a default limit on the number of learned FDB entries on
+	  new bridges. This limit can be overwritten via netlink on a
+	  per bridge basis.
+
+	  The default of 0 disables the limit.
+
+	  If unsure, say 0.
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8eca8a5c80c6..93f081ce8195 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -530,6 +530,8 @@ void br_dev_setup(struct net_device *dev)
 	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
 	dev->max_mtu = ETH_MAX_MTU;
 
+	br->fdb_max_learned_entries = CONFIG_BRIDGE_DEFAULT_FDB_MAX_LEARNED;
+
 	br_netfilter_rtable_init(br);
 	br_stp_timer_init(br);
 	br_multicast_init(br);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH iproute2-next 1/1] iplink: bridge: Add support for bridge FDB learning limits
  2023-06-19  7:14 [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries Johannes Nixdorf
                   ` (2 preceding siblings ...)
  2023-06-19  7:14 ` [PATCH net-next v2 3/3] net: bridge: Add a configurable default FDB learning limit Johannes Nixdorf
@ 2023-06-19  7:14 ` Johannes Nixdorf
  2023-06-19 14:37   ` Ido Schimmel
  3 siblings, 1 reply; 13+ messages in thread
From: Johannes Nixdorf @ 2023-06-19  7:14 UTC (permalink / raw)
  To: bridge
  Cc: netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu,
	Ido Schimmel, Johannes Nixdorf

Support setting the FDB limit through ip link. The arguments is:
 - fdb_max_learned_entries: A 32-bit unsigned integer specifying the
                            maximum number of learned FDB entries, with 0
                            disabling the limit.

Also support reading back the current number of learned FDB entries in
the bridge by this count. The returned value's name is:
 - fdb_cur_learned_entries: A 32-bit unsigned integer specifying the
                             current number of learned FDB entries.

Example:

 # ip -d -j -p link show br0
[ {
...
        "linkinfo": {
            "info_kind": "bridge",
            "info_data": {
...
                "fdb_cur_learned_entries": 2,
                "fdb_max_learned_entries": 0,
...
            }
        },
...
    } ]
 # ip link set br0 type bridge fdb_max_learned_entries 1024
 # ip -d -j -p link show br0
[ {
...
        "linkinfo": {
            "info_kind": "bridge",
            "info_data": {
...
                "fdb_cur_learned_entries": 2,
                "fdb_max_learned_entries": 1024,
...
            }
        },
...
    } ]

Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
---
 include/uapi/linux/if_link.h |  2 ++
 ip/iplink_bridge.c           | 21 +++++++++++++++++++++
 man/man8/ip-link.8.in        |  9 +++++++++
 3 files changed, 32 insertions(+)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 94fb7ef9e226..5ad1e2727e0d 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -508,6 +508,8 @@ enum {
 	IFLA_BR_VLAN_STATS_PER_PORT,
 	IFLA_BR_MULTI_BOOLOPT,
 	IFLA_BR_MCAST_QUERIER_STATE,
+	IFLA_BR_FDB_CUR_LEARNED_ENTRIES,
+	IFLA_BR_FDB_MAX_LEARNED_ENTRIES,
 	__IFLA_BR_MAX,
 };
 
diff --git a/ip/iplink_bridge.c b/ip/iplink_bridge.c
index 7e4e62c81c0c..68ed3c251945 100644
--- a/ip/iplink_bridge.c
+++ b/ip/iplink_bridge.c
@@ -34,6 +34,7 @@ static void print_explain(FILE *f)
 		"		  [ group_fwd_mask MASK ]\n"
 		"		  [ group_address ADDRESS ]\n"
 		"		  [ no_linklocal_learn NO_LINKLOCAL_LEARN ]\n"
+		"		  [ fdb_max_learned_entries FDB_MAX_LEARNED_ENTRIES ]\n"
 		"		  [ vlan_filtering VLAN_FILTERING ]\n"
 		"		  [ vlan_protocol VLAN_PROTOCOL ]\n"
 		"		  [ vlan_default_pvid VLAN_DEFAULT_PVID ]\n"
@@ -168,6 +169,14 @@ static int bridge_parse_opt(struct link_util *lu, int argc, char **argv,
 				bm.optval |= no_ll_learn_bit;
 			else
 				bm.optval &= ~no_ll_learn_bit;
+		} else if (matches(*argv, "fdb_max_learned_entries") == 0) {
+			__u32 fdb_max_learned_entries;
+
+			NEXT_ARG();
+			if (get_u32(&fdb_max_learned_entries, *argv, 0))
+				invarg("invalid fdb_max_learned_entries", *argv);
+
+			addattr32(n, 1024, IFLA_BR_FDB_MAX_LEARNED_ENTRIES, fdb_max_learned_entries);
 		} else if (matches(*argv, "fdb_flush") == 0) {
 			addattr(n, 1024, IFLA_BR_FDB_FLUSH);
 		} else if (matches(*argv, "vlan_default_pvid") == 0) {
@@ -544,6 +553,18 @@ static void bridge_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 	if (tb[IFLA_BR_GC_TIMER])
 		_bridge_print_timer(f, "gc_timer", tb[IFLA_BR_GC_TIMER]);
 
+	if (tb[IFLA_BR_FDB_CUR_LEARNED_ENTRIES])
+		print_uint(PRINT_ANY,
+			   "fdb_cur_learned_entries",
+			   "fdb_cur_learned_entries %u ",
+			   rta_getattr_u32(tb[IFLA_BR_FDB_CUR_LEARNED_ENTRIES]));
+
+	if (tb[IFLA_BR_FDB_MAX_LEARNED_ENTRIES])
+		print_uint(PRINT_ANY,
+			   "fdb_max_learned_entries",
+			   "fdb_max_learned_entries %u ",
+			   rta_getattr_u32(tb[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]));
+
 	if (tb[IFLA_BR_VLAN_DEFAULT_PVID])
 		print_uint(PRINT_ANY,
 			   "vlan_default_pvid",
diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index bf3605a9fa2e..a29595858a51 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -1620,6 +1620,8 @@ the following additional arguments are supported:
 ] [
 .BI no_linklocal_learn " NO_LINKLOCAL_LEARN "
 ] [
+.BI fdb_max_entries " FDB_MAX_ENTRIES "
+] [
 .BI vlan_filtering " VLAN_FILTERING "
 ] [
 .BI vlan_protocol " VLAN_PROTOCOL "
@@ -1731,6 +1733,13 @@ or off
 When disabled, the bridge will not learn from link-local frames (default:
 enabled).
 
+.BI fdb_max_learned_entries " FDB_MAX_LEARNED_ENTRIES "
+- set the maximum number of learned FDB entries linux may create. If
+.RI ( FDB_MAX_LEARNED_ENTRIES " == 0) "
+the feature is disabled.
+.I FDB_MAX_LEARNED_ENTRIES
+is a 32bit unsigned integer.
+
 .BI vlan_filtering " VLAN_FILTERING "
 - turn VLAN filtering on
 .RI ( VLAN_FILTERING " > 0) "
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH iproute2-next 1/1] iplink: bridge: Add support for bridge FDB learning limits
  2023-06-19  7:14 ` [PATCH iproute2-next 1/1] iplink: bridge: Add support for bridge FDB learning limits Johannes Nixdorf
@ 2023-06-19 14:37   ` Ido Schimmel
  0 siblings, 0 replies; 13+ messages in thread
From: Ido Schimmel @ 2023-06-19 14:37 UTC (permalink / raw)
  To: Johannes Nixdorf
  Cc: bridge, netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu

Please see the following link regarding posting of iproute2 patches:

https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html#co-posting-changes-to-user-space-components

On Mon, Jun 19, 2023 at 09:14:44AM +0200, Johannes Nixdorf wrote:
> Support setting the FDB limit through ip link. The arguments is:
>  - fdb_max_learned_entries: A 32-bit unsigned integer specifying the
>                             maximum number of learned FDB entries, with 0
>                             disabling the limit.
> 
> Also support reading back the current number of learned FDB entries in
> the bridge by this count. The returned value's name is:
>  - fdb_cur_learned_entries: A 32-bit unsigned integer specifying the
>                              current number of learned FDB entries.

MDB has "mcast_n_groups" and "mcast_max_groups". Maybe use
"fdb_n_learned_entries" to be consistent?

> 
> Example:
> 
>  # ip -d -j -p link show br0
> [ {
> ...
>         "linkinfo": {
>             "info_kind": "bridge",
>             "info_data": {
> ...
>                 "fdb_cur_learned_entries": 2,
>                 "fdb_max_learned_entries": 0,
> ...
>             }
>         },
> ...
>     } ]
>  # ip link set br0 type bridge fdb_max_learned_entries 1024
>  # ip -d -j -p link show br0
> [ {
> ...
>         "linkinfo": {
>             "info_kind": "bridge",
>             "info_data": {
> ...
>                 "fdb_cur_learned_entries": 2,
>                 "fdb_max_learned_entries": 1024,
> ...
>             }
>         },
> ...
>     } ]
> 
> Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
> ---
>  include/uapi/linux/if_link.h |  2 ++
>  ip/iplink_bridge.c           | 21 +++++++++++++++++++++
>  man/man8/ip-link.8.in        |  9 +++++++++
>  3 files changed, 32 insertions(+)
> 
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index 94fb7ef9e226..5ad1e2727e0d 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -508,6 +508,8 @@ enum {
>  	IFLA_BR_VLAN_STATS_PER_PORT,
>  	IFLA_BR_MULTI_BOOLOPT,
>  	IFLA_BR_MCAST_QUERIER_STATE,
> +	IFLA_BR_FDB_CUR_LEARNED_ENTRIES,
> +	IFLA_BR_FDB_MAX_LEARNED_ENTRIES,
>  	__IFLA_BR_MAX,
>  };
>  
> diff --git a/ip/iplink_bridge.c b/ip/iplink_bridge.c
> index 7e4e62c81c0c..68ed3c251945 100644
> --- a/ip/iplink_bridge.c
> +++ b/ip/iplink_bridge.c
> @@ -34,6 +34,7 @@ static void print_explain(FILE *f)
>  		"		  [ group_fwd_mask MASK ]\n"
>  		"		  [ group_address ADDRESS ]\n"
>  		"		  [ no_linklocal_learn NO_LINKLOCAL_LEARN ]\n"
> +		"		  [ fdb_max_learned_entries FDB_MAX_LEARNED_ENTRIES ]\n"
>  		"		  [ vlan_filtering VLAN_FILTERING ]\n"
>  		"		  [ vlan_protocol VLAN_PROTOCOL ]\n"
>  		"		  [ vlan_default_pvid VLAN_DEFAULT_PVID ]\n"
> @@ -168,6 +169,14 @@ static int bridge_parse_opt(struct link_util *lu, int argc, char **argv,
>  				bm.optval |= no_ll_learn_bit;
>  			else
>  				bm.optval &= ~no_ll_learn_bit;
> +		} else if (matches(*argv, "fdb_max_learned_entries") == 0) {

New code is expected to use strcmp() instead of matches().

> +			__u32 fdb_max_learned_entries;
> +
> +			NEXT_ARG();
> +			if (get_u32(&fdb_max_learned_entries, *argv, 0))
> +				invarg("invalid fdb_max_learned_entries", *argv);
> +
> +			addattr32(n, 1024, IFLA_BR_FDB_MAX_LEARNED_ENTRIES, fdb_max_learned_entries);
>  		} else if (matches(*argv, "fdb_flush") == 0) {
>  			addattr(n, 1024, IFLA_BR_FDB_FLUSH);
>  		} else if (matches(*argv, "vlan_default_pvid") == 0) {
> @@ -544,6 +553,18 @@ static void bridge_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
>  	if (tb[IFLA_BR_GC_TIMER])
>  		_bridge_print_timer(f, "gc_timer", tb[IFLA_BR_GC_TIMER]);
>  
> +	if (tb[IFLA_BR_FDB_CUR_LEARNED_ENTRIES])
> +		print_uint(PRINT_ANY,
> +			   "fdb_cur_learned_entries",
> +			   "fdb_cur_learned_entries %u ",
> +			   rta_getattr_u32(tb[IFLA_BR_FDB_CUR_LEARNED_ENTRIES]));
> +
> +	if (tb[IFLA_BR_FDB_MAX_LEARNED_ENTRIES])
> +		print_uint(PRINT_ANY,
> +			   "fdb_max_learned_entries",
> +			   "fdb_max_learned_entries %u ",
> +			   rta_getattr_u32(tb[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]));
> +
>  	if (tb[IFLA_BR_VLAN_DEFAULT_PVID])
>  		print_uint(PRINT_ANY,
>  			   "vlan_default_pvid",
> diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
> index bf3605a9fa2e..a29595858a51 100644
> --- a/man/man8/ip-link.8.in
> +++ b/man/man8/ip-link.8.in
> @@ -1620,6 +1620,8 @@ the following additional arguments are supported:
>  ] [
>  .BI no_linklocal_learn " NO_LINKLOCAL_LEARN "
>  ] [
> +.BI fdb_max_entries " FDB_MAX_ENTRIES "

Inconsistent with actual name.

> +] [
>  .BI vlan_filtering " VLAN_FILTERING "
>  ] [
>  .BI vlan_protocol " VLAN_PROTOCOL "
> @@ -1731,6 +1733,13 @@ or off
>  When disabled, the bridge will not learn from link-local frames (default:
>  enabled).
>  
> +.BI fdb_max_learned_entries " FDB_MAX_LEARNED_ENTRIES "
> +- set the maximum number of learned FDB entries linux may create. If

You can drop "linux may create".

> +.RI ( FDB_MAX_LEARNED_ENTRIES " == 0) "
> +the feature is disabled.

Please mention it's the default.

> +.I FDB_MAX_LEARNED_ENTRIES
> +is a 32bit unsigned integer.
> +
>  .BI vlan_filtering " VLAN_FILTERING "
>  - turn VLAN filtering on
>  .RI ( VLAN_FILTERING " > 0) "
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry
  2023-06-19  7:14 ` [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry Johannes Nixdorf
@ 2023-06-19 14:50   ` Ido Schimmel
  0 siblings, 0 replies; 13+ messages in thread
From: Ido Schimmel @ 2023-06-19 14:50 UTC (permalink / raw)
  To: Johannes Nixdorf
  Cc: bridge, netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu

On Mon, Jun 19, 2023 at 09:14:41AM +0200, Johannes Nixdorf wrote:
> This allows the called fdb_create to detect that the entry was added by
> the user early in the process. This is in preparation to adding limits
> in fdb_create that should not apply to user created fdb entries.

Use imperative mood:
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

> 
> Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
> 

Remove the blank line

> ---
> 
> Changes since v1:
>  - Added this change to ensure user added entries are not limited.
> 
>  net/bridge/br_fdb.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index e69a872bfc1d..ac1dc8723b9c 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -1056,7 +1056,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>  		if (!(flags & NLM_F_CREATE))
>  			return -ENOENT;
>  
> -		fdb = fdb_create(br, source, addr, vid, 0);
> +		fdb = fdb_create(br, source, addr, vid, BR_FDB_ADDED_BY_USER);

BIT(BR_FDB_ADDED_BY_USER)

>  		if (!fdb)
>  			return -ENOMEM;
>  
> @@ -1069,6 +1069,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>  			WRITE_ONCE(fdb->dst, source);
>  			modified = true;
>  		}
> +
> +		set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
>  	}
>  
>  	if (fdb_to_nud(br, fdb) != state) {
> @@ -1100,8 +1102,6 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>  	if (fdb_handle_notify(fdb, notify))
>  		modified = true;
>  
> -	set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> -
>  	fdb->used = jiffies;
>  	if (modified) {
>  		if (refresh)
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
  2023-06-19  7:14 ` [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries Johannes Nixdorf
@ 2023-06-19 15:34   ` Ido Schimmel
  2023-06-20  6:55   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 13+ messages in thread
From: Ido Schimmel @ 2023-06-19 15:34 UTC (permalink / raw)
  To: Johannes Nixdorf
  Cc: bridge, netdev, David Ahern, Nikolay Aleksandrov, Vladimir Oltean,
	Andrew Lunn, Florian Fainelli, Oleksij Rempel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Roopa Prabhu

On Mon, Jun 19, 2023 at 09:14:42AM +0200, Johannes Nixdorf wrote:
> A malicious actor behind one bridge port may spam the kernel with packets
> with a random source MAC address, each of which will create an FDB entry,
> each of which is a dynamic allocation in the kernel.
> 
> There are roughly 2^48 different MAC addresses, further limited by the
> rhashtable they are stored in to 2^31. Each entry is of the type struct
> net_bridge_fdb_entry, which is currently 128 bytes big. This means the
> maximum amount of memory allocated for FDB entries is 2^31 * 128B =
> 256GiB, which is too much for most computers.
> 
> Mitigate this by adding a bridge netlink setting
> IFLA_BR_FDB_MAX_LEARNED_ENTRIES, which, if nonzero, limits the amount
> of learned entries to a user specified maximum.
> 
> For backwards compatibility the default setting of 0 disables the limit.
> 
> User-added entries by netlink or from bridge or bridge port addresses
> are never blocked and do not count towards that limit.
> 
> All changes to fdb_n_entries are under br->hash_lock, which means we do
> not need additional locking. The call paths are (✓ denotes that
> br->hash_lock is taken around the next call):
> 
>  - fdb_delete <-+- fdb_delete_local <-+- br_fdb_changeaddr ✓
>                 |                     +- br_fdb_change_mac_address ✓
>                 |                     +- br_fdb_delete_by_port ✓
>                 +- br_fdb_find_delete_local ✓
>                 +- fdb_add_local <-+- br_fdb_changeaddr ✓
>                 |                  +- br_fdb_change_mac_address ✓
>                 |                  +- br_fdb_add_local ✓
>                 +- br_fdb_cleanup ✓
>                 +- br_fdb_flush ✓
>                 +- br_fdb_delete_by_port ✓
>                 +- fdb_delete_by_addr_and_port <--- __br_fdb_delete ✓
>                 +- br_fdb_external_learn_del ✓
>  - fdb_create <-+- fdb_add_local <-+- br_fdb_changeaddr ✓
>                 |                  +- br_fdb_change_mac_address ✓
>                 |                  +- br_fdb_add_local ✓
>                 +- br_fdb_update ✓
>                 +- fdb_add_entry <--- __br_fdb_add ✓
>                 +- br_fdb_external_learn_add ✓
> 
> The flags that imply an entry does not come from learning
> (BR_FDB_NOT_LEARNED_MASK) are now only set or cleared under br->hash_lock
> as well, and when the boolean value of (fdb->flags &
> BR_FDB_NOT_LEARNED_MASK) changes the accounting is updated.
> 
> This introduces one additional locked update in br_fdb_update if
> BR_FDB_ADDED_BY_USER was set. This is only the case when creating a new
> entry via netlink, and never in the packet handling fast path.
> 
> Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
> 
> ---
> 
> Changes since v1:
>  - Do not initialize fdb_*_entries to 0. (from review)
>  - Do not skip decrementing on 0. (from review)
>  - Moved the counters to a conditional hole in struct net_bridge to
>    avoid growing the struct. (from review, it still grows the struct as
>    there are 2 32-bit values)
>  - Add IFLA_BR_FDB_CUR_LEARNED_ENTRIES (from review)
>  - Fix br_get_size()
>  - Only limit learned entries, rename to
>    *_(CUR|MAX)_LEARNED_ENTRIES. (from review)
> 
> Obsolete v1 review comments:
>  - Return better errors to users: Due to limiting the limit to
>    automatically created entries, netlink fdb add requests and changing
>    bridge ports are never rejected, so they do not yet need a more
>    friendly error returned.
> 
>  include/uapi/linux/if_link.h |  2 ++
>  net/bridge/br_fdb.c          | 67 +++++++++++++++++++++++++++++++++---
>  net/bridge/br_netlink.c      | 13 ++++++-
>  net/bridge/br_private.h      |  6 ++++

To minimize the number of changes per patch and make review easier, try
to first maintain the count and the maximum and then in a separate patch
expose them via netlink. See b57e8d870d52 and a1aee20d5db2, for example.
Merge commit is cb3086cee656.

>  4 files changed, 83 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index 4ac1000b0ef2..165b9014379b 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -510,6 +510,8 @@ enum {
>  	IFLA_BR_VLAN_STATS_PER_PORT,
>  	IFLA_BR_MULTI_BOOLOPT,
>  	IFLA_BR_MCAST_QUERIER_STATE,
> +	IFLA_BR_FDB_CUR_LEARNED_ENTRIES,
> +	IFLA_BR_FDB_MAX_LEARNED_ENTRIES,
>  	__IFLA_BR_MAX,
>  };
>  
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index ac1dc8723b9c..bc61d1fd5fcf 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -301,6 +301,38 @@ static void fdb_add_hw_addr(struct net_bridge *br, const unsigned char *addr)
>  	}
>  }
>  
> +/* Set a FDB flag that implies the entry was not learned, and account
> + * for changes in the learned status.
> + */
> +static void __fdb_set_flag_not_learned(struct net_bridge *br,
> +				       struct net_bridge_fdb_entry *fdb,
> +				       long nr)
> +{
> +	WARN_ON_ONCE(!(BIT(nr) & BR_FDB_NOT_LEARNED_MASK));
> +
> +	/* learned before, but we set a flag that implies it's manually added */
> +	if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
> +		br->fdb_cur_learned_entries--;
> +	set_bit(nr, &fdb->flags);
> +}
> +
> +/* Set a FDB flag that implies the entry was not learned, and account
> + * for changes in the learned status.
> + *
> + * This function takes a lock, so ensure it is not called in the fast
> + * path.
> + */
> +static void fdb_set_flag_not_learned(struct net_bridge *br,
> +				     struct net_bridge_fdb_entry *fdb,
> +				     long nr)
> +{
> +	spin_lock_bh(&br->hash_lock);
> +
> +	__fdb_set_flag_not_learned(br, fdb, nr);
> +
> +	spin_unlock_bh(&br->hash_lock);
> +}
> +
>  /* When a static FDB entry is deleted, the HW address from that entry is
>   * also removed from the bridge private HW address list and updates all
>   * the ports with needed information.
> @@ -321,6 +353,8 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
>  static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>  		       bool swdev_notify)
>  {
> +	bool learned = !(f->flags & BR_FDB_NOT_LEARNED_MASK);
> +
>  	trace_fdb_delete(br, f);
>  
>  	if (test_bit(BR_FDB_STATIC, &f->flags))
> @@ -329,11 +363,16 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>  	hlist_del_init_rcu(&f->fdb_node);
>  	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
>  			       br_fdb_rht_params);
> +	br->fdb_cur_learned_entries -= learned;
>  	fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
>  	call_rcu(&f->rcu, fdb_rcu_free);
>  }
>  
> -/* Delete a local entry if no other port had the same address. */
> +/* Delete a local entry if no other port had the same address.
> + *
> + * This function should only be called on entries with BR_FDB_LOCAL set,
> + * so clear_bit never removes the last bit in BR_FDB_NOT_LEARNED_MASK.
> + */
>  static void fdb_delete_local(struct net_bridge *br,
>  			     const struct net_bridge_port *p,
>  			     struct net_bridge_fdb_entry *f)
> @@ -390,6 +429,11 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
>  {
>  	struct net_bridge_fdb_entry *fdb;
>  	int err;
> +	bool learned = !(flags & BR_FDB_NOT_LEARNED_MASK);

https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html#local-variable-ordering-reverse-xmas-tree-rcs

> +
> +	if (unlikely(learned && br->fdb_max_learned_entries &&
> +		     br->fdb_cur_learned_entries >= br->fdb_max_learned_entries))

This function is not run with RTNL held and so 'fdb_max_learned_entries'
can be updated concurrently from user space. You will need READ_ONCE() /
WRITE_ONCE() to silence KCSAN.

> +		return NULL;
>  
>  	fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC);
>  	if (!fdb)
> @@ -409,6 +453,8 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
>  
>  	hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list);
>  
> +	br->fdb_cur_learned_entries += learned;

Don't remember seeing bool being added to a u32...

> +
>  	return fdb;
>  }
>  
> @@ -894,7 +940,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>  			}
>  
>  			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
> -				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> +				fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
>  			if (unlikely(fdb_modified)) {
>  				trace_br_fdb_update(br, source, addr, vid, flags);
>  				fdb_notify(br, fdb, RTM_NEWNEIGH, true);
> @@ -1070,6 +1116,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>  			modified = true;
>  		}
>  
> +		if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
> +			br->fdb_cur_learned_entries--;
>  		set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
>  	}
>  
> @@ -1440,10 +1488,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  		}
>  
>  		if (swdev_notify)
> -			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> +			__fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
>  
>  		if (!p)
> -			set_bit(BR_FDB_LOCAL, &fdb->flags);
> +			__fdb_set_flag_not_learned(br, fdb, BR_FDB_LOCAL);
>  
>  		if (modified)
>  			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
> @@ -1508,3 +1556,14 @@ void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
>  	spin_unlock_bh(&p->br->hash_lock);
>  }
>  EXPORT_SYMBOL_GPL(br_fdb_clear_offload);
> +
> +u32 br_fdb_get_cur_learned_entries(struct net_bridge *br)
> +{
> +	u32 ret;
> +
> +	spin_lock_bh(&br->hash_lock);
> +	ret = br->fdb_cur_learned_entries;

I believe that for MDB Nik asked not to take a spin lock for each dump
and we used READ_ONCE() / WRITE_ONCE() instead.

> +	spin_unlock_bh(&br->hash_lock);
> +
> +	return ret;
> +}
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 05c5863d2e20..954c468d52ec 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -1527,6 +1527,12 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
>  			return err;
>  	}
>  
> +	if (data[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]) {

Please add this attribute to 'br_policy' and in a separate patch convert
the policy to use 'strict_start_type' so that future submissions will be
forced to update the policy. See c00041cf1cb82 for example.

I suggest writing a selftest to verify the expected behavior and to make
sure this feature does not regress. There are a lot of examples under
tools/testing/selftests/net/ and tools/testing/selftests/net/forwarding
in particular. You can post another version before writing a test, but
mark it as RFC so that it doesn't get applied by mistake.

Thanks

> +		u32 val = nla_get_u32(data[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]);
> +
> +		br->fdb_max_learned_entries = val;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -1581,6 +1587,8 @@ static size_t br_get_size(const struct net_device *brdev)
>  	       nla_total_size_64bit(sizeof(u64)) + /* IFLA_BR_TOPOLOGY_CHANGE_TIMER */
>  	       nla_total_size_64bit(sizeof(u64)) + /* IFLA_BR_GC_TIMER */
>  	       nla_total_size(ETH_ALEN) +       /* IFLA_BR_GROUP_ADDR */
> +	       nla_total_size(sizeof(u32)) +    /* IFLA_BR_FDB_CUR_LEARNED_ENTRIES */
> +	       nla_total_size(sizeof(u32)) +    /* IFLA_BR_FDB_MAX_LEARNED_ENTRIES */
>  #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
>  	       nla_total_size(sizeof(u8)) +     /* IFLA_BR_MCAST_ROUTER */
>  	       nla_total_size(sizeof(u8)) +     /* IFLA_BR_MCAST_SNOOPING */
> @@ -1620,6 +1628,7 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
>  	u32 stp_enabled = br->stp_enabled;
>  	u16 priority = (br->bridge_id.prio[0] << 8) | br->bridge_id.prio[1];
>  	u8 vlan_enabled = br_vlan_enabled(br->dev);
> +	u32 fdb_cur_learned_entries = br_fdb_get_cur_learned_entries(br);
>  	struct br_boolopt_multi bm;
>  	u64 clockval;
>  
> @@ -1656,7 +1665,9 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
>  	    nla_put_u8(skb, IFLA_BR_TOPOLOGY_CHANGE_DETECTED,
>  		       br->topology_change_detected) ||
>  	    nla_put(skb, IFLA_BR_GROUP_ADDR, ETH_ALEN, br->group_addr) ||
> -	    nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm))
> +	    nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm) ||
> +	    nla_put_u32(skb, IFLA_BR_FDB_CUR_LEARNED_ENTRIES, fdb_cur_learned_entries) ||
> +	    nla_put_u32(skb, IFLA_BR_FDB_MAX_LEARNED_ENTRIES, br->fdb_max_learned_entries))
>  		return -EMSGSIZE;
>  
>  #ifdef CONFIG_BRIDGE_VLAN_FILTERING
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 2119729ded2b..df079191479e 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -275,6 +275,8 @@ enum {
>  	BR_FDB_LOCKED,
>  };
>  
> +#define BR_FDB_NOT_LEARNED_MASK (BIT(BR_FDB_LOCAL) | BIT(BR_FDB_ADDED_BY_USER))
> +
>  struct net_bridge_fdb_key {
>  	mac_addr addr;
>  	u16 vlan_id;
> @@ -553,6 +555,9 @@ struct net_bridge {
>  	struct kobject			*ifobj;
>  	u32				auto_cnt;
>  
> +	u32				fdb_max_learned_entries;
> +	u32				fdb_cur_learned_entries;
> +
>  #ifdef CONFIG_NET_SWITCHDEV
>  	/* Counter used to make sure that hardware domains get unique
>  	 * identifiers in case a bridge spans multiple switchdev instances.
> @@ -838,6 +843,7 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
>  			      bool swdev_notify);
>  void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
>  			  const unsigned char *addr, u16 vid, bool offloaded);
> +u32 br_fdb_get_cur_learned_entries(struct net_bridge *br);
>  
>  /* br_forward.c */
>  enum br_pkt_type {
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
  2023-06-19  7:14 ` [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries Johannes Nixdorf
  2023-06-19 15:34   ` Ido Schimmel
@ 2023-06-20  6:55   ` Nikolay Aleksandrov
  2023-06-20 13:35     ` Johannes Nixdorf
  1 sibling, 1 reply; 13+ messages in thread
From: Nikolay Aleksandrov @ 2023-06-20  6:55 UTC (permalink / raw)
  To: Johannes Nixdorf, bridge
  Cc: netdev, David Ahern, Vladimir Oltean, Andrew Lunn,
	Florian Fainelli, Oleksij Rempel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Roopa Prabhu, Ido Schimmel

On 6/19/23 10:14, Johannes Nixdorf wrote:
> A malicious actor behind one bridge port may spam the kernel with packets
> with a random source MAC address, each of which will create an FDB entry,
> each of which is a dynamic allocation in the kernel.
> 
> There are roughly 2^48 different MAC addresses, further limited by the
> rhashtable they are stored in to 2^31. Each entry is of the type struct
> net_bridge_fdb_entry, which is currently 128 bytes big. This means the
> maximum amount of memory allocated for FDB entries is 2^31 * 128B =
> 256GiB, which is too much for most computers.
> 
> Mitigate this by adding a bridge netlink setting
> IFLA_BR_FDB_MAX_LEARNED_ENTRIES, which, if nonzero, limits the amount
> of learned entries to a user specified maximum.
> 
> For backwards compatibility the default setting of 0 disables the limit.
> 
> User-added entries by netlink or from bridge or bridge port addresses
> are never blocked and do not count towards that limit.
> 
> All changes to fdb_n_entries are under br->hash_lock, which means we do
> not need additional locking. The call paths are (✓ denotes that
> br->hash_lock is taken around the next call):
> 
>   - fdb_delete <-+- fdb_delete_local <-+- br_fdb_changeaddr ✓
>                  |                     +- br_fdb_change_mac_address ✓
>                  |                     +- br_fdb_delete_by_port ✓
>                  +- br_fdb_find_delete_local ✓
>                  +- fdb_add_local <-+- br_fdb_changeaddr ✓
>                  |                  +- br_fdb_change_mac_address ✓
>                  |                  +- br_fdb_add_local ✓
>                  +- br_fdb_cleanup ✓
>                  +- br_fdb_flush ✓
>                  +- br_fdb_delete_by_port ✓
>                  +- fdb_delete_by_addr_and_port <--- __br_fdb_delete ✓
>                  +- br_fdb_external_learn_del ✓
>   - fdb_create <-+- fdb_add_local <-+- br_fdb_changeaddr ✓
>                  |                  +- br_fdb_change_mac_address ✓
>                  |                  +- br_fdb_add_local ✓
>                  +- br_fdb_update ✓
>                  +- fdb_add_entry <--- __br_fdb_add ✓
>                  +- br_fdb_external_learn_add ✓
> 
> The flags that imply an entry does not come from learning
> (BR_FDB_NOT_LEARNED_MASK) are now only set or cleared under br->hash_lock
> as well, and when the boolean value of (fdb->flags &
> BR_FDB_NOT_LEARNED_MASK) changes the accounting is updated.
> 
> This introduces one additional locked update in br_fdb_update if
> BR_FDB_ADDED_BY_USER was set. This is only the case when creating a new
> entry via netlink, and never in the packet handling fast path.
> 
> Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
> 
> ---
> 
> Changes since v1:
>   - Do not initialize fdb_*_entries to 0. (from review)
>   - Do not skip decrementing on 0. (from review)
>   - Moved the counters to a conditional hole in struct net_bridge to
>     avoid growing the struct. (from review, it still grows the struct as
>     there are 2 32-bit values)
>   - Add IFLA_BR_FDB_CUR_LEARNED_ENTRIES (from review)
>   - Fix br_get_size()
>   - Only limit learned entries, rename to
>     *_(CUR|MAX)_LEARNED_ENTRIES. (from review)
> 
> Obsolete v1 review comments:
>   - Return better errors to users: Due to limiting the limit to
>     automatically created entries, netlink fdb add requests and changing
>     bridge ports are never rejected, so they do not yet need a more
>     friendly error returned.
> 
>   include/uapi/linux/if_link.h |  2 ++
>   net/bridge/br_fdb.c          | 67 +++++++++++++++++++++++++++++++++---
>   net/bridge/br_netlink.c      | 13 ++++++-
>   net/bridge/br_private.h      |  6 ++++
>   4 files changed, 83 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index 4ac1000b0ef2..165b9014379b 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -510,6 +510,8 @@ enum {
>   	IFLA_BR_VLAN_STATS_PER_PORT,
>   	IFLA_BR_MULTI_BOOLOPT,
>   	IFLA_BR_MCAST_QUERIER_STATE,
> +	IFLA_BR_FDB_CUR_LEARNED_ENTRIES,
> +	IFLA_BR_FDB_MAX_LEARNED_ENTRIES,
>   	__IFLA_BR_MAX,
>   };
>   
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index ac1dc8723b9c..bc61d1fd5fcf 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -301,6 +301,38 @@ static void fdb_add_hw_addr(struct net_bridge *br, const unsigned char *addr)
>   	}
>   }
>   
> +/* Set a FDB flag that implies the entry was not learned, and account
> + * for changes in the learned status.
> + */
> +static void __fdb_set_flag_not_learned(struct net_bridge *br,
> +				       struct net_bridge_fdb_entry *fdb,
> +				       long nr)
> +{
> +	WARN_ON_ONCE(!(BIT(nr) & BR_FDB_NOT_LEARNED_MASK));

Please use *_bit

> +
> +	/* learned before, but we set a flag that implies it's manually added */
> +	if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))

Please use *_bit

> +		br->fdb_cur_learned_entries--;
> +	set_bit(nr, &fdb->flags);
> +}

Having a helper that conditionally decrements only is counterintuitive 
and people can get confused. Either add 2 helpers for inc/dec and use
them where appropriate or don't use helpers at all.

> +
> +/* Set a FDB flag that implies the entry was not learned, and account
> + * for changes in the learned status.
> + *
> + * This function takes a lock, so ensure it is not called in the fast
> + * path.
> + */
> +static void fdb_set_flag_not_learned(struct net_bridge *br,
> +				     struct net_bridge_fdb_entry *fdb,
> +				     long nr)
> +{
> +	spin_lock_bh(&br->hash_lock);
> +
> +	__fdb_set_flag_not_learned(br, fdb, nr);
> +

Unnecessary empty lines

> +	spin_unlock_bh(&br->hash_lock);
> +}
> +
>   /* When a static FDB entry is deleted, the HW address from that entry is
>    * also removed from the bridge private HW address list and updates all
>    * the ports with needed information.
> @@ -321,6 +353,8 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
>   static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>   		       bool swdev_notify)
>   {
> +	bool learned = !(f->flags & BR_FDB_NOT_LEARNED_MASK);

*_bit

> +
>   	trace_fdb_delete(br, f);
>   
>   	if (test_bit(BR_FDB_STATIC, &f->flags))
> @@ -329,11 +363,16 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>   	hlist_del_init_rcu(&f->fdb_node);
>   	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
>   			       br_fdb_rht_params);
> +	br->fdb_cur_learned_entries -= learned;

This looks confusing because of the conditional above, as I mentioned 
please make it explicit.

>   	fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
>   	call_rcu(&f->rcu, fdb_rcu_free);
>   }
>   
> -/* Delete a local entry if no other port had the same address. */
> +/* Delete a local entry if no other port had the same address.
> + *
> + * This function should only be called on entries with BR_FDB_LOCAL set,
> + * so clear_bit never removes the last bit in BR_FDB_NOT_LEARNED_MASK.
> + */
>   static void fdb_delete_local(struct net_bridge *br,
>   			     const struct net_bridge_port *p,
>   			     struct net_bridge_fdb_entry *f)
> @@ -390,6 +429,11 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
>   {
>   	struct net_bridge_fdb_entry *fdb;
>   	int err;
> +	bool learned = !(flags & BR_FDB_NOT_LEARNED_MASK);
> +

*_bit and also use reverse xmas tree order for local variables

> +	if (unlikely(learned && br->fdb_max_learned_entries &&
> +		     br->fdb_cur_learned_entries >= br->fdb_max_learned_entries))
> +		return NULL;
>   
>   	fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC);
>   	if (!fdb)
> @@ -409,6 +453,8 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
>   
>   	hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list);
>   
> +	br->fdb_cur_learned_entries += learned;

I'd prefer to be explicit heere instead of adding a bool

> +
>   	return fdb;
>   }
>   
> @@ -894,7 +940,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>   			}
>   
>   			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
> -				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> +				fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);

Unacceptable to take hash_lock and block all learning here, eventual 
consistency is ok or some other method that is much lighter and doesn't
block all learning or requires a lock.

>   			if (unlikely(fdb_modified)) {
>   				trace_br_fdb_update(br, source, addr, vid, flags);
>   				fdb_notify(br, fdb, RTM_NEWNEIGH, true);
> @@ -1070,6 +1116,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>   			modified = true;
>   		}
>   
> +		if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))

*_bit

> +			br->fdb_cur_learned_entries--; >   		set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);

This looks like open-coded version of the helper above.

>   	}
>   
> @@ -1440,10 +1488,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>   		}
>   
>   		if (swdev_notify)
> -			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> +			__fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
>   
>   		if (!p)
> -			set_bit(BR_FDB_LOCAL, &fdb->flags);
> +			__fdb_set_flag_not_learned(br, fdb, BR_FDB_LOCAL);
>   
>   		if (modified)
>   			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
> @@ -1508,3 +1556,14 @@ void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
>   	spin_unlock_bh(&p->br->hash_lock);
>   }
>   EXPORT_SYMBOL_GPL(br_fdb_clear_offload);
> +
> +u32 br_fdb_get_cur_learned_entries(struct net_bridge *br)
> +{
> +	u32 ret;
> +
> +	spin_lock_bh(&br->hash_lock);
> +	ret = br->fdb_cur_learned_entries;
> +	spin_unlock_bh(&br->hash_lock);
> +
> +	return ret;
> +}

It's unnecessary to take lock to read the value, it'd be better to have
inaccurate result and read it without locking and blocking learning.

> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 05c5863d2e20..954c468d52ec 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -1527,6 +1527,12 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
>   			return err;
>   	}
>   
> +	if (data[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]) {
> +		u32 val = nla_get_u32(data[IFLA_BR_FDB_MAX_LEARNED_ENTRIES]);
> +
> +		br->fdb_max_learned_entries = val;
> +	}

Here you change the value without any locking, while you check it with 
locks held - kcsan won't be happy.

> +
>   	return 0;
>   }
>   
> @@ -1581,6 +1587,8 @@ static size_t br_get_size(const struct net_device *brdev)
>   	       nla_total_size_64bit(sizeof(u64)) + /* IFLA_BR_TOPOLOGY_CHANGE_TIMER */
>   	       nla_total_size_64bit(sizeof(u64)) + /* IFLA_BR_GC_TIMER */
>   	       nla_total_size(ETH_ALEN) +       /* IFLA_BR_GROUP_ADDR */
> +	       nla_total_size(sizeof(u32)) +    /* IFLA_BR_FDB_CUR_LEARNED_ENTRIES */
> +	       nla_total_size(sizeof(u32)) +    /* IFLA_BR_FDB_MAX_LEARNED_ENTRIES */
>   #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
>   	       nla_total_size(sizeof(u8)) +     /* IFLA_BR_MCAST_ROUTER */
>   	       nla_total_size(sizeof(u8)) +     /* IFLA_BR_MCAST_SNOOPING */
> @@ -1620,6 +1628,7 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
>   	u32 stp_enabled = br->stp_enabled;
>   	u16 priority = (br->bridge_id.prio[0] << 8) | br->bridge_id.prio[1];
>   	u8 vlan_enabled = br_vlan_enabled(br->dev);
> +	u32 fdb_cur_learned_entries = br_fdb_get_cur_learned_entries(br);
>   	struct br_boolopt_multi bm;
>   	u64 clockval;
>   
> @@ -1656,7 +1665,9 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
>   	    nla_put_u8(skb, IFLA_BR_TOPOLOGY_CHANGE_DETECTED,
>   		       br->topology_change_detected) ||
>   	    nla_put(skb, IFLA_BR_GROUP_ADDR, ETH_ALEN, br->group_addr) ||
> -	    nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm))
> +	    nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm) ||
> +	    nla_put_u32(skb, IFLA_BR_FDB_CUR_LEARNED_ENTRIES, fdb_cur_learned_entries) ||
> +	    nla_put_u32(skb, IFLA_BR_FDB_MAX_LEARNED_ENTRIES, br->fdb_max_learned_entries))

I'd say export these only if max entries is > 0

>   		return -EMSGSIZE;
>   
>   #ifdef CONFIG_BRIDGE_VLAN_FILTERING
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 2119729ded2b..df079191479e 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -275,6 +275,8 @@ enum {
>   	BR_FDB_LOCKED,
>   };
>   
> +#define BR_FDB_NOT_LEARNED_MASK (BIT(BR_FDB_LOCAL) | BIT(BR_FDB_ADDED_BY_USER))

Not learned sounds confusing and doesn't accurately describe the entry.
BR_FDB_DYNAMIC_LEARNED perhaps or some other name, that doesn't cause
double negatives (not not learned).

> +
>   struct net_bridge_fdb_key {
>   	mac_addr addr;
>   	u16 vlan_id;
> @@ -553,6 +555,9 @@ struct net_bridge {
>   	struct kobject			*ifobj;
>   	u32				auto_cnt;
>   
> +	u32				fdb_max_learned_entries;
> +	u32				fdb_cur_learned_entries;
> +
>   #ifdef CONFIG_NET_SWITCHDEV
>   	/* Counter used to make sure that hardware domains get unique
>   	 * identifiers in case a bridge spans multiple switchdev instances.
> @@ -838,6 +843,7 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
>   			      bool swdev_notify);
>   void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
>   			  const unsigned char *addr, u16 vid, bool offloaded);
> +u32 br_fdb_get_cur_learned_entries(struct net_bridge *br);
>   
>   /* br_forward.c */
>   enum br_pkt_type {


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 3/3] net: bridge: Add a configurable default FDB learning limit
  2023-06-19  7:14 ` [PATCH net-next v2 3/3] net: bridge: Add a configurable default FDB learning limit Johannes Nixdorf
@ 2023-06-20  6:56   ` Nikolay Aleksandrov
  0 siblings, 0 replies; 13+ messages in thread
From: Nikolay Aleksandrov @ 2023-06-20  6:56 UTC (permalink / raw)
  To: Johannes Nixdorf, bridge
  Cc: netdev, David Ahern, Vladimir Oltean, Andrew Lunn,
	Florian Fainelli, Oleksij Rempel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Roopa Prabhu, Ido Schimmel

On 6/19/23 10:14, Johannes Nixdorf wrote:
> This adds a Kconfig option to configure a default FDB learning limit
> system wide, so a distributor building a special purpose kernel can
> limit all created bridges by default.
> 
> The limit is only a soft default setting and overridable per bridge
> using netlink.
> 
> Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
> 
> ---
> 
> Changes since v1:
>   - Added a default limit in Kconfig. (deemed acceptable in review
>     comments)
> 
>   net/bridge/Kconfig     | 13 +++++++++++++
>   net/bridge/br_device.c |  2 ++
>   2 files changed, 15 insertions(+)
> 
> diff --git a/net/bridge/Kconfig b/net/bridge/Kconfig
> index 3c8ded7d3e84..c0d9c08088c4 100644
> --- a/net/bridge/Kconfig
> +++ b/net/bridge/Kconfig
> @@ -84,3 +84,16 @@ config BRIDGE_CFM
>   	  Say N to exclude this support and reduce the binary size.
>   
>   	  If unsure, say N.
> +
> +config BRIDGE_DEFAULT_FDB_MAX_LEARNED
> +	int "Default FDB learning limit"
> +	default 0
> +	depends on BRIDGE
> +	help
> +	  Sets a default limit on the number of learned FDB entries on
> +	  new bridges. This limit can be overwritten via netlink on a
> +	  per bridge basis.
> +
> +	  The default of 0 disables the limit.
> +
> +	  If unsure, say 0.
> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> index 8eca8a5c80c6..93f081ce8195 100644
> --- a/net/bridge/br_device.c
> +++ b/net/bridge/br_device.c
> @@ -530,6 +530,8 @@ void br_dev_setup(struct net_device *dev)
>   	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
>   	dev->max_mtu = ETH_MAX_MTU;
>   
> +	br->fdb_max_learned_entries = CONFIG_BRIDGE_DEFAULT_FDB_MAX_LEARNED;
> +
>   	br_netfilter_rtable_init(br);
>   	br_stp_timer_init(br);
>   	br_multicast_init(br);

IMO this is pointless, noone will set the kconfig option except very 
specific users. I prefer if we leave it to the distribution to set a 
maximum on bridge creation, i.e. make it a distro policy.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
  2023-06-20  6:55   ` Nikolay Aleksandrov
@ 2023-06-20 13:35     ` Johannes Nixdorf
  2023-06-22 12:27       ` Nikolay Aleksandrov
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Nixdorf @ 2023-06-20 13:35 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: bridge, netdev, David Ahern, Vladimir Oltean, Andrew Lunn,
	Florian Fainelli, Oleksij Rempel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Roopa Prabhu, Ido Schimmel

On Tue, Jun 20, 2023 at 09:55:31AM +0300, Nikolay Aleksandrov wrote:
> On 6/19/23 10:14, Johannes Nixdorf wrote:
> > +/* Set a FDB flag that implies the entry was not learned, and account
> > + * for changes in the learned status.
> > + */
> > +static void __fdb_set_flag_not_learned(struct net_bridge *br,
> > +				       struct net_bridge_fdb_entry *fdb,
> > +				       long nr)
> > +{
> > +	WARN_ON_ONCE(!(BIT(nr) & BR_FDB_NOT_LEARNED_MASK));
> 
> Please use *_bit

Can you tell me which *_bit helper you had in mind? The shortest option I could
come up with the ones I found seemed needlessly verbose and wasteful:

  static const unsigned long br_fdb_not_learned_mask = BR_FDB_NOT_LEARNED_MASK;
  ...
  WARN_ON_ONCE(test_bit(nr, &br_fdb_not_learned_mask));

> > +
> > +	/* learned before, but we set a flag that implies it's manually added */
> > +	if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
> 
> Please use *_bit

This will be fixed by the redesign to get rid of my use of hash_lock
(proposed later in this mail), as I'll only have to test one bit and can
use test_and_clear_bit then.

> > +		br->fdb_cur_learned_entries--;
> > +	set_bit(nr, &fdb->flags);
> > +}
> 
> Having a helper that conditionally decrements only is counterintuitive and
> people can get confused. Either add 2 helpers for inc/dec and use
> them where appropriate or don't use helpers at all.

The *_set_bit helper can only cause the count to drop, as there
is currently no flag that could turn a manually added entry back into
a dynamically learned one.

The analogous helper that increments the value would be *_clear_bit,
which I did not add because it has no users.

> > +	spin_unlock_bh(&br->hash_lock);
> > +}
> > +
> >   /* When a static FDB entry is deleted, the HW address from that entry is
> >    * also removed from the bridge private HW address list and updates all
> >    * the ports with needed information.
> > @@ -321,6 +353,8 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
> >   static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
> >   		       bool swdev_notify)
> >   {
> > +	bool learned = !(f->flags & BR_FDB_NOT_LEARNED_MASK);
> 
> *_bit

I do not know a *_bit helper that would help me test the intersection
of multiple bits on both sides. Do you have any in mind?

> > +
> >   	return fdb;
> >   }
> > @@ -894,7 +940,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
> >   			}
> >   			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
> > -				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> > +				fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
> 
> Unacceptable to take hash_lock and block all learning here, eventual
> consistency is ok or some other method that is much lighter and doesn't
> block all learning or requires a lock.

At the time of writing v2, this seemed difficult because we want to test
multiple bits and increment a counter, but remembering that clear_bit
is never called for the bits I care about I came up with the following
approach:

  a) Add a new flag BR_FDB_DYNAMIC_LEARNED, which is set to 1 iff
     BR_FDB_ADDED_BY_USER or BR_FDB_LOCAL are set in br_create.
     Every time BR_FDB_ADDED_BY_USER or BR_FDB_LOCAL is set, also clear
     BR_FDB_DYNAMIC_LEARNED, and decrement the count if it was 1 before.
     This solves the problem of testing two bits at once, and would not
     have been possible if we had a code path that could clear both bits,
     as it is not as easy to decide when to set BR_FDB_DYNAMIC_LEARNED
     again in that case.
  b) Replace the current count with an atomic_t.

I'll change it this way for v3.

> >   		return -EMSGSIZE;
> >   #ifdef CONFIG_BRIDGE_VLAN_FILTERING
> > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> > index 2119729ded2b..df079191479e 100644
> > --- a/net/bridge/br_private.h
> > +++ b/net/bridge/br_private.h
> > @@ -275,6 +275,8 @@ enum {
> >   	BR_FDB_LOCKED,
> >   };
> > +#define BR_FDB_NOT_LEARNED_MASK (BIT(BR_FDB_LOCAL) | BIT(BR_FDB_ADDED_BY_USER))
> 
> Not learned sounds confusing and doesn't accurately describe the entry.
> BR_FDB_DYNAMIC_LEARNED perhaps or some other name, that doesn't cause
> double negatives (not not learned).

Your proposal would not have captured the mask, as it describes all the
opposite cases, which were _not_ dynamically learned.

But with the proposed new flag from the hash_lock comment we can trivially
flip the meaning, so I went with your proposed name there.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
  2023-06-20 13:35     ` Johannes Nixdorf
@ 2023-06-22 12:27       ` Nikolay Aleksandrov
  2023-06-22 12:39         ` Nikolay Aleksandrov
  0 siblings, 1 reply; 13+ messages in thread
From: Nikolay Aleksandrov @ 2023-06-22 12:27 UTC (permalink / raw)
  To: Johannes Nixdorf
  Cc: bridge, netdev, David Ahern, Vladimir Oltean, Andrew Lunn,
	Florian Fainelli, Oleksij Rempel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Roopa Prabhu, Ido Schimmel

On 20/06/2023 16:35, Johannes Nixdorf wrote:
> On Tue, Jun 20, 2023 at 09:55:31AM +0300, Nikolay Aleksandrov wrote:
>> On 6/19/23 10:14, Johannes Nixdorf wrote:
>>> +/* Set a FDB flag that implies the entry was not learned, and account
>>> + * for changes in the learned status.
>>> + */
>>> +static void __fdb_set_flag_not_learned(struct net_bridge *br,
>>> +				       struct net_bridge_fdb_entry *fdb,
>>> +				       long nr)
>>> +{
>>> +	WARN_ON_ONCE(!(BIT(nr) & BR_FDB_NOT_LEARNED_MASK));
>>
>> Please use *_bit
> 
> Can you tell me which *_bit helper you had in mind? The shortest option I could
> come up with the ones I found seemed needlessly verbose and wasteful:
> 
>   static const unsigned long br_fdb_not_learned_mask = BR_FDB_NOT_LEARNED_MASK;
>   ...
>   WARN_ON_ONCE(test_bit(nr, &br_fdb_not_learned_mask));
> 
>>> +
>>> +	/* learned before, but we set a flag that implies it's manually added */
>>> +	if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
>>
>> Please use *_bit
> 
> This will be fixed by the redesign to get rid of my use of hash_lock
> (proposed later in this mail), as I'll only have to test one bit and can
> use test_and_clear_bit then.
> 
>>> +		br->fdb_cur_learned_entries--;
>>> +	set_bit(nr, &fdb->flags);
>>> +}
>>
>> Having a helper that conditionally decrements only is counterintuitive and
>> people can get confused. Either add 2 helpers for inc/dec and use
>> them where appropriate or don't use helpers at all.
> 
> The *_set_bit helper can only cause the count to drop, as there
> is currently no flag that could turn a manually added entry back into
> a dynamically learned one.
> 
> The analogous helper that increments the value would be *_clear_bit,
> which I did not add because it has no users.
> 
>>> +	spin_unlock_bh(&br->hash_lock);
>>> +}
>>> +
>>>   /* When a static FDB entry is deleted, the HW address from that entry is
>>>    * also removed from the bridge private HW address list and updates all
>>>    * the ports with needed information.
>>> @@ -321,6 +353,8 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
>>>   static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>   		       bool swdev_notify)
>>>   {
>>> +	bool learned = !(f->flags & BR_FDB_NOT_LEARNED_MASK);
>>
>> *_bit
> 
> I do not know a *_bit helper that would help me test the intersection
> of multiple bits on both sides. Do you have any in mind?
> 
>>> +
>>>   	return fdb;
>>>   }
>>> @@ -894,7 +940,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>>>   			}
>>>   			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
>>> -				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
>>> +				fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
>>
>> Unacceptable to take hash_lock and block all learning here, eventual
>> consistency is ok or some other method that is much lighter and doesn't
>> block all learning or requires a lock.
> 
> At the time of writing v2, this seemed difficult because we want to test
> multiple bits and increment a counter, but remembering that clear_bit
> is never called for the bits I care about I came up with the following
> approach:
> 
>   a) Add a new flag BR_FDB_DYNAMIC_LEARNED, which is set to 1 iff
>      BR_FDB_ADDED_BY_USER or BR_FDB_LOCAL are set in br_create.
>      Every time BR_FDB_ADDED_BY_USER or BR_FDB_LOCAL is set, also clear
>      BR_FDB_DYNAMIC_LEARNED, and decrement the count if it was 1 before.
>      This solves the problem of testing two bits at once, and would not
>      have been possible if we had a code path that could clear both bits,
>      as it is not as easy to decide when to set BR_FDB_DYNAMIC_LEARNED
>      again in that case.

I think you can try without adding any new flags, the places that add dynamic
entries are known for the inc part of the problem, and an entry can become
local/added_by_user again only through well known paths as well. You may be able to
infer whether to inc/dec and make it work with careful fn argument passing.
Could you please look into that way? I'd prefer that we don't add new flags as
there are already so many.

>   b) Replace the current count with an atomic_t.
> 

Sounds good.

> I'll change it this way for v3.
> 
>>>   		return -EMSGSIZE;
>>>   #ifdef CONFIG_BRIDGE_VLAN_FILTERING
>>> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
>>> index 2119729ded2b..df079191479e 100644
>>> --- a/net/bridge/br_private.h
>>> +++ b/net/bridge/br_private.h
>>> @@ -275,6 +275,8 @@ enum {
>>>   	BR_FDB_LOCKED,
>>>   };
>>> +#define BR_FDB_NOT_LEARNED_MASK (BIT(BR_FDB_LOCAL) | BIT(BR_FDB_ADDED_BY_USER))
>>
>> Not learned sounds confusing and doesn't accurately describe the entry.
>> BR_FDB_DYNAMIC_LEARNED perhaps or some other name, that doesn't cause
>> double negatives (not not learned).
> 
> Your proposal would not have captured the mask, as it describes all the
> opposite cases, which were _not_ dynamically learned.
> 
> But with the proposed new flag from the hash_lock comment we can trivially
> flip the meaning, so I went with your proposed name there.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
  2023-06-22 12:27       ` Nikolay Aleksandrov
@ 2023-06-22 12:39         ` Nikolay Aleksandrov
  0 siblings, 0 replies; 13+ messages in thread
From: Nikolay Aleksandrov @ 2023-06-22 12:39 UTC (permalink / raw)
  To: Johannes Nixdorf
  Cc: bridge, netdev, David Ahern, Vladimir Oltean, Andrew Lunn,
	Florian Fainelli, Oleksij Rempel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Roopa Prabhu, Ido Schimmel

On 22/06/2023 15:27, Nikolay Aleksandrov wrote:
> On 20/06/2023 16:35, Johannes Nixdorf wrote:
>> On Tue, Jun 20, 2023 at 09:55:31AM +0300, Nikolay Aleksandrov wrote:
>>> On 6/19/23 10:14, Johannes Nixdorf wrote:
>>>> +/* Set a FDB flag that implies the entry was not learned, and account
>>>> + * for changes in the learned status.
>>>> + */
>>>> +static void __fdb_set_flag_not_learned(struct net_bridge *br,
>>>> +				       struct net_bridge_fdb_entry *fdb,
>>>> +				       long nr)
>>>> +{
>>>> +	WARN_ON_ONCE(!(BIT(nr) & BR_FDB_NOT_LEARNED_MASK));
>>>
>>> Please use *_bit
>>
>> Can you tell me which *_bit helper you had in mind? The shortest option I could
>> come up with the ones I found seemed needlessly verbose and wasteful:
>>
>>   static const unsigned long br_fdb_not_learned_mask = BR_FDB_NOT_LEARNED_MASK;
>>   ...
>>   WARN_ON_ONCE(test_bit(nr, &br_fdb_not_learned_mask));
>>
>>>> +
>>>> +	/* learned before, but we set a flag that implies it's manually added */
>>>> +	if (!(fdb->flags & BR_FDB_NOT_LEARNED_MASK))
>>>
>>> Please use *_bit
>>
>> This will be fixed by the redesign to get rid of my use of hash_lock
>> (proposed later in this mail), as I'll only have to test one bit and can
>> use test_and_clear_bit then.
>>
>>>> +		br->fdb_cur_learned_entries--;
>>>> +	set_bit(nr, &fdb->flags);
>>>> +}
>>>
>>> Having a helper that conditionally decrements only is counterintuitive and
>>> people can get confused. Either add 2 helpers for inc/dec and use
>>> them where appropriate or don't use helpers at all.
>>
>> The *_set_bit helper can only cause the count to drop, as there
>> is currently no flag that could turn a manually added entry back into
>> a dynamically learned one.
>>
>> The analogous helper that increments the value would be *_clear_bit,
>> which I did not add because it has no users.
>>
>>>> +	spin_unlock_bh(&br->hash_lock);
>>>> +}
>>>> +
>>>>   /* When a static FDB entry is deleted, the HW address from that entry is
>>>>    * also removed from the bridge private HW address list and updates all
>>>>    * the ports with needed information.
>>>> @@ -321,6 +353,8 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
>>>>   static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>   		       bool swdev_notify)
>>>>   {
>>>> +	bool learned = !(f->flags & BR_FDB_NOT_LEARNED_MASK);
>>>
>>> *_bit
>>
>> I do not know a *_bit helper that would help me test the intersection
>> of multiple bits on both sides. Do you have any in mind?
>>
>>>> +
>>>>   	return fdb;
>>>>   }
>>>> @@ -894,7 +940,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
>>>>   			}
>>>>   			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
>>>> -				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
>>>> +				fdb_set_flag_not_learned(br, fdb, BR_FDB_ADDED_BY_USER);
>>>
>>> Unacceptable to take hash_lock and block all learning here, eventual
>>> consistency is ok or some other method that is much lighter and doesn't
>>> block all learning or requires a lock.
>>
>> At the time of writing v2, this seemed difficult because we want to test
>> multiple bits and increment a counter, but remembering that clear_bit
>> is never called for the bits I care about I came up with the following
>> approach:
>>
>>   a) Add a new flag BR_FDB_DYNAMIC_LEARNED, which is set to 1 iff
>>      BR_FDB_ADDED_BY_USER or BR_FDB_LOCAL are set in br_create.
>>      Every time BR_FDB_ADDED_BY_USER or BR_FDB_LOCAL is set, also clear
>>      BR_FDB_DYNAMIC_LEARNED, and decrement the count if it was 1 before.
>>      This solves the problem of testing two bits at once, and would not
>>      have been possible if we had a code path that could clear both bits,
>>      as it is not as easy to decide when to set BR_FDB_DYNAMIC_LEARNED
>>      again in that case.
> 
> I think you can try without adding any new flags, the places that add dynamic
> entries are known for the inc part of the problem, and an entry can become
> local/added_by_user again only through well known paths as well. You may be able to
> infer whether to inc/dec and make it work with careful fn argument passing.
> Could you please look into that way? I'd prefer that we don't add new flags as
> there are already so many.
> 

To clarify  - just look into it if it is possible and looks sane, if not do go
ahead with the new flag.

>>   b) Replace the current count with an atomic_t.
>>
> 
> Sounds good.
> 
>> I'll change it this way for v3.
>>
>>>>   		return -EMSGSIZE;
>>>>   #ifdef CONFIG_BRIDGE_VLAN_FILTERING
>>>> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
>>>> index 2119729ded2b..df079191479e 100644
>>>> --- a/net/bridge/br_private.h
>>>> +++ b/net/bridge/br_private.h
>>>> @@ -275,6 +275,8 @@ enum {
>>>>   	BR_FDB_LOCKED,
>>>>   };
>>>> +#define BR_FDB_NOT_LEARNED_MASK (BIT(BR_FDB_LOCAL) | BIT(BR_FDB_ADDED_BY_USER))
>>>
>>> Not learned sounds confusing and doesn't accurately describe the entry.
>>> BR_FDB_DYNAMIC_LEARNED perhaps or some other name, that doesn't cause
>>> double negatives (not not learned).
>>
>> Your proposal would not have captured the mask, as it describes all the
>> opposite cases, which were _not_ dynamically learned.
>>
>> But with the proposed new flag from the hash_lock comment we can trivially
>> flip the meaning, so I went with your proposed name there.
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-06-22 12:40 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-19  7:14 [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries Johannes Nixdorf
2023-06-19  7:14 ` [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry Johannes Nixdorf
2023-06-19 14:50   ` Ido Schimmel
2023-06-19  7:14 ` [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries Johannes Nixdorf
2023-06-19 15:34   ` Ido Schimmel
2023-06-20  6:55   ` Nikolay Aleksandrov
2023-06-20 13:35     ` Johannes Nixdorf
2023-06-22 12:27       ` Nikolay Aleksandrov
2023-06-22 12:39         ` Nikolay Aleksandrov
2023-06-19  7:14 ` [PATCH net-next v2 3/3] net: bridge: Add a configurable default FDB learning limit Johannes Nixdorf
2023-06-20  6:56   ` Nikolay Aleksandrov
2023-06-19  7:14 ` [PATCH iproute2-next 1/1] iplink: bridge: Add support for bridge FDB learning limits Johannes Nixdorf
2023-06-19 14:37   ` Ido Schimmel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).