[PATCH net v3 1/1] net: hsr: limit node table growth

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH net v3 1/1] net: hsr: limit node table growth
@ 2026-04-21 14:50 Ren Wei
  2026-04-21 15:18 ` Andrew Lunn
  2026-04-22  8:31 ` Felix Maurer
  0 siblings, 2 replies; 5+ messages in thread
From: Ren Wei @ 2026-04-21 14:50 UTC (permalink / raw)
  To: netdev, Felix Maurer, Sebastian Andrzej Siewior
  Cc: davem, edumazet, kuba, pabeni, horms, kees, kexinsun, luka.gejak,
	Arvid.Brodin, m-karicheri2, yuantan098, yifanwucs, tomapufckgml,
	bird, xuyuqiabc, royenheart, n05ec

From: Haoze Xie <royenheart@gmail.com>

The HSR/PRP node learning paths allocate one persistent entry per
previously unseen source MAC. Since learned entries stay alive until the
prune timer catches up, the node tables can otherwise grow without a
bound under high churn of learned senders.

Limit the number of learned entries in each node table and stop adding
new ones once the configured limit is reached. This keeps node-table
resource use bounded across the affected learning paths.

Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Fixes: 451d8123f897 ("net: prp: add packet handling support")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Yuqi Xu <xuyuqiabc@gmail.com>
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
---
changes in v3:
- replace the v2 learning-suppression approach with direct node-table growth limiting
- add a node_table_size module parameter and stop learning new entries once each table reaches the configured limit
- fix the full-table handling so failed learning returns NULL instead of reusing an existing node
- v2 Link: https://lore.kernel.org/all/b053e938014c9bac22f7f687ecc2970f23a2b74a.1775281843.git.royenheart@gmail.com/

changes in v2:
- generalize the fix beyond PRP SAN traffic and cover HSR/PRP tagged sender floods
- decide whether learning is needed from local-exclusive delivery instead of protocol-specific SAN checks
- use the normal NULL return semantics from hsr_get_node() instead of ERR_PTR-based error plumbing
- skip duplicate-discard state checks when no node state exists
- v1 Link: https://lore.kernel.org/all/9c88b4b7844f867d065e7a7aba28b2c026386168.1775056603.git.royenheart@outlook.com/

 net/hsr/hsr_framereg.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
index d09875b33588..8a5a2a54a81f 100644
--- a/net/hsr/hsr_framereg.c
+++ b/net/hsr/hsr_framereg.c
@@ -14,12 +14,18 @@
 #include <kunit/visibility.h>
 #include <linux/if_ether.h>
 #include <linux/etherdevice.h>
+#include <linux/moduleparam.h>
 #include <linux/slab.h>
 #include <linux/rculist.h>
 #include "hsr_main.h"
 #include "hsr_framereg.h"
 #include "hsr_netlink.h"
 
+static unsigned int hsr_node_table_size = 1024;
+module_param_named(node_table_size, hsr_node_table_size, uint, 0644);
+MODULE_PARM_DESC(node_table_size,
+		 "Maximum number of learned entries in each HSR/PRP node table (0 = unlimited)");
+
 bool hsr_addr_is_redbox(struct hsr_priv *hsr, unsigned char *addr)
 {
 	if (!hsr->redbox || !is_valid_ether_addr(hsr->macaddress_redbox))
@@ -189,6 +195,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
 				     enum hsr_port_type rx_port)
 {
 	struct hsr_node *new_node, *node = NULL;
+	unsigned int node_count = 0;
 	unsigned long now;
 	size_t block_sz;
 	int i;
@@ -226,20 +233,31 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
 	spin_lock_bh(&hsr->list_lock);
 	list_for_each_entry_rcu(node, node_db, mac_list,
 				lockdep_is_held(&hsr->list_lock)) {
+		node_count++;
 		if (ether_addr_equal(node->macaddress_A, addr))
-			goto out;
+			goto out_found;
 		if (ether_addr_equal(node->macaddress_B, addr))
-			goto out;
+			goto out_found;
 	}
+
+	if (hsr_node_table_size && node_count >= hsr_node_table_size)
+		goto out_drop;
 	list_add_tail_rcu(&new_node->mac_list, node_db);
 	spin_unlock_bh(&hsr->list_lock);
 	return new_node;
-out:
+out_found:
 	spin_unlock_bh(&hsr->list_lock);
+	xa_destroy(&new_node->seq_blocks);
 	kfree(new_node->block_buf);
-free:
 	kfree(new_node);
 	return node;
+out_drop:
+	spin_unlock_bh(&hsr->list_lock);
+	xa_destroy(&new_node->seq_blocks);
+	kfree(new_node->block_buf);
+free:
+	kfree(new_node);
+	return NULL;
 }
 
 void prp_update_san_info(struct hsr_node *node, bool is_sup)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net v3 1/1] net: hsr: limit node table growth
  2026-04-21 14:50 [PATCH net v3 1/1] net: hsr: limit node table growth Ren Wei
@ 2026-04-21 15:18 ` Andrew Lunn
  2026-04-22  8:31 ` Felix Maurer
  1 sibling, 0 replies; 5+ messages in thread
From: Andrew Lunn @ 2026-04-21 15:18 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, Felix Maurer, Sebastian Andrzej Siewior, davem, edumazet,
	kuba, pabeni, horms, kees, kexinsun, luka.gejak, Arvid.Brodin,
	m-karicheri2, yuantan098, yifanwucs, tomapufckgml, bird,
	xuyuqiabc, royenheart

> +static unsigned int hsr_node_table_size = 1024;
> +module_param_named(node_table_size, hsr_node_table_size, uint, 0644);
> +MODULE_PARM_DESC(node_table_size,
> +		 "Maximum number of learned entries in each HSR/PRP node table (0 = unlimited)");
> +

Please don't use module parameters. Look at other parts of the network
stack where such limits are imposed. They all use sysctl values.

    Andrew

---
pw-bot: cr

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v3 1/1] net: hsr: limit node table growth
  2026-04-21 14:50 [PATCH net v3 1/1] net: hsr: limit node table growth Ren Wei
  2026-04-21 15:18 ` Andrew Lunn
@ 2026-04-22  8:31 ` Felix Maurer
  2026-04-22  8:52   ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 5+ messages in thread
From: Felix Maurer @ 2026-04-22  8:31 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, Sebastian Andrzej Siewior, davem, edumazet, kuba, pabeni,
	horms, kees, kexinsun, luka.gejak, Arvid.Brodin, m-karicheri2,
	yuantan098, yifanwucs, tomapufckgml, bird, xuyuqiabc, royenheart

On Tue, Apr 21, 2026 at 10:50:01PM +0800, Ren Wei wrote:
> From: Haoze Xie <royenheart@gmail.com>
>
> The HSR/PRP node learning paths allocate one persistent entry per
> previously unseen source MAC. Since learned entries stay alive until the
> prune timer catches up, the node tables can otherwise grow without a
> bound under high churn of learned senders.
>
> Limit the number of learned entries in each node table and stop adding
> new ones once the configured limit is reached. This keeps node-table
> resource use bounded across the affected learning paths.
Hi,

thank you for giving this approach a try!

[snip]
> diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> index d09875b33588..8a5a2a54a81f 100644
> --- a/net/hsr/hsr_framereg.c
> +++ b/net/hsr/hsr_framereg.c
> @@ -14,12 +14,18 @@
>  #include <kunit/visibility.h>
>  #include <linux/if_ether.h>
>  #include <linux/etherdevice.h>
> +#include <linux/moduleparam.h>
>  #include <linux/slab.h>
>  #include <linux/rculist.h>
>  #include "hsr_main.h"
>  #include "hsr_framereg.h"
>  #include "hsr_netlink.h"
>
> +static unsigned int hsr_node_table_size = 1024;
> +module_param_named(node_table_size, hsr_node_table_size, uint, 0644);
> +MODULE_PARM_DESC(node_table_size,
> +		 "Maximum number of learned entries in each HSR/PRP node table (0 = unlimited)");
> +
>  bool hsr_addr_is_redbox(struct hsr_priv *hsr, unsigned char *addr)
>  {
>  	if (!hsr->redbox || !is_valid_ether_addr(hsr->macaddress_redbox))
> @@ -189,6 +195,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
>  				     enum hsr_port_type rx_port)
>  {
>  	struct hsr_node *new_node, *node = NULL;
> +	unsigned int node_count = 0;
>  	unsigned long now;
>  	size_t block_sz;
>  	int i;
> @@ -226,20 +233,31 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
>  	spin_lock_bh(&hsr->list_lock);
>  	list_for_each_entry_rcu(node, node_db, mac_list,
>  				lockdep_is_held(&hsr->list_lock)) {
> +		node_count++;

I'm not sure if this on-the-fly node counting is the best solution here.
My concern is that it comes quite late in the process, i.e., after we
already allocated a bunch of memory, etc. As we are discussing a
scenario where a lot of entries are created, maybe we shouldn't even
allocate a new_node if the table is already full? For example by storing
the node_count in hsr_priv and checking it early in the function?

>  		if (ether_addr_equal(node->macaddress_A, addr))
> -			goto out;
> +			goto out_found;
>  		if (ether_addr_equal(node->macaddress_B, addr))
> -			goto out;
> +			goto out_found;
>  	}
> +
> +	if (hsr_node_table_size && node_count >= hsr_node_table_size)
> +		goto out_drop;

I think it would be good to somehow make this situation transparent to
the user, so they can react if this an undesired behavior (for example,
because they simply have a large network and need a large node table).

>  	list_add_tail_rcu(&new_node->mac_list, node_db);
>  	spin_unlock_bh(&hsr->list_lock);
>  	return new_node;
> -out:
> +out_found:
>  	spin_unlock_bh(&hsr->list_lock);
> +	xa_destroy(&new_node->seq_blocks);
>  	kfree(new_node->block_buf);
> -free:
>  	kfree(new_node);
>  	return node;
> +out_drop:
> +	spin_unlock_bh(&hsr->list_lock);
> +	xa_destroy(&new_node->seq_blocks);
> +	kfree(new_node->block_buf);
> +free:
> +	kfree(new_node);
> +	return NULL;
>  }

The two cleanup paths are almost the same now. We usually attempt to
keep them unified to make sure that we do the correct cleanup steps in
all situations. So please keep them unified here as well.

Thanks,
   Felix


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v3 1/1] net: hsr: limit node table growth
  2026-04-22  8:31 ` Felix Maurer
@ 2026-04-22  8:52   ` Sebastian Andrzej Siewior
  2026-04-22  9:45     ` Felix Maurer
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-04-22  8:52 UTC (permalink / raw)
  To: Felix Maurer
  Cc: Ren Wei, netdev, davem, edumazet, kuba, pabeni, horms, kees,
	kexinsun, luka.gejak, Arvid.Brodin, m-karicheri2, yuantan098,
	yifanwucs, tomapufckgml, bird, xuyuqiabc, royenheart

On 2026-04-22 10:31:39 [+0200], Felix Maurer wrote:
> On Tue, Apr 21, 2026 at 10:50:01PM +0800, Ren Wei wrote:
> > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> > index d09875b33588..8a5a2a54a81f 100644
> > --- a/net/hsr/hsr_framereg.c
> > +++ b/net/hsr/hsr_framereg.c
> > @@ -189,6 +195,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> >  				     enum hsr_port_type rx_port)
> >  {
> >  	struct hsr_node *new_node, *node = NULL;
> > +	unsigned int node_count = 0;
> >  	unsigned long now;
> >  	size_t block_sz;
> >  	int i;
> > @@ -226,20 +233,31 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> >  	spin_lock_bh(&hsr->list_lock);
> >  	list_for_each_entry_rcu(node, node_db, mac_list,
> >  				lockdep_is_held(&hsr->list_lock)) {
> > +		node_count++;
> 
> I'm not sure if this on-the-fly node counting is the best solution here.
> My concern is that it comes quite late in the process, i.e., after we
> already allocated a bunch of memory, etc. As we are discussing a
> scenario where a lot of entries are created, maybe we shouldn't even
> allocate a new_node if the table is already full? For example by storing
> the node_count in hsr_priv and checking it early in the function?

The node is allocated upfront. Then it iterates here and we only end up
counting through the full list if there is no match. This is under a
lock so "many clients" are serialized. If we allocate the node later
then we need to do it under the lock.

I don't think the node count exceeds 100 in production. So having a
counter which is incremented while adding to the list and decremented
while removing items from the list would optimize the "worst case". So
instead traversing the list with 1000 we would just give up.

The "oom block" works regardless. This does not affect the common case
where we have far less nodes.

> >  		if (ether_addr_equal(node->macaddress_A, addr))
> > -			goto out;
> > +			goto out_found;
> >  		if (ether_addr_equal(node->macaddress_B, addr))
> > -			goto out;
> > +			goto out_found;
> >  	}
> > +
> > +	if (hsr_node_table_size && node_count >= hsr_node_table_size)
> > +		goto out_drop;
> 
> I think it would be good to somehow make this situation transparent to
> the user, so they can react if this an undesired behavior (for example,
> because they simply have a large network and need a large node table).

netdev_warn_once() probably.

> >  	list_add_tail_rcu(&new_node->mac_list, node_db);
> >  	spin_unlock_bh(&hsr->list_lock);
> >  	return new_node;
> > -out:
> > +out_found:
> >  	spin_unlock_bh(&hsr->list_lock);
> > +	xa_destroy(&new_node->seq_blocks);
> >  	kfree(new_node->block_buf);
> > -free:
> >  	kfree(new_node);
> >  	return node;
> > +out_drop:
> > +	spin_unlock_bh(&hsr->list_lock);
> > +	xa_destroy(&new_node->seq_blocks);
> > +	kfree(new_node->block_buf);
> > +free:
> > +	kfree(new_node);
> > +	return NULL;
> >  }
> 
> The two cleanup paths are almost the same now. We usually attempt to
> keep them unified to make sure that we do the correct cleanup steps in
> all situations. So please keep them unified here as well.
> 
> Thanks,
>    Felix

Sebastian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v3 1/1] net: hsr: limit node table growth
  2026-04-22  8:52   ` Sebastian Andrzej Siewior
@ 2026-04-22  9:45     ` Felix Maurer
  0 siblings, 0 replies; 5+ messages in thread
From: Felix Maurer @ 2026-04-22  9:45 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Ren Wei, netdev, davem, edumazet, kuba, pabeni, horms, kees,
	kexinsun, luka.gejak, Arvid.Brodin, m-karicheri2, yuantan098,
	yifanwucs, tomapufckgml, bird, xuyuqiabc, royenheart

On Wed, Apr 22, 2026 at 10:52:42AM +0200, Sebastian Andrzej Siewior wrote:
> On 2026-04-22 10:31:39 [+0200], Felix Maurer wrote:
> > On Tue, Apr 21, 2026 at 10:50:01PM +0800, Ren Wei wrote:
> > > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> > > index d09875b33588..8a5a2a54a81f 100644
> > > --- a/net/hsr/hsr_framereg.c
> > > +++ b/net/hsr/hsr_framereg.c
> > > @@ -189,6 +195,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> > >  				     enum hsr_port_type rx_port)
> > >  {
> > >  	struct hsr_node *new_node, *node = NULL;
> > > +	unsigned int node_count = 0;
> > >  	unsigned long now;
> > >  	size_t block_sz;
> > >  	int i;
> > > @@ -226,20 +233,31 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> > >  	spin_lock_bh(&hsr->list_lock);
> > >  	list_for_each_entry_rcu(node, node_db, mac_list,
> > >  				lockdep_is_held(&hsr->list_lock)) {
> > > +		node_count++;
> >
> > I'm not sure if this on-the-fly node counting is the best solution here.
> > My concern is that it comes quite late in the process, i.e., after we
> > already allocated a bunch of memory, etc. As we are discussing a
> > scenario where a lot of entries are created, maybe we shouldn't even
> > allocate a new_node if the table is already full? For example by storing
> > the node_count in hsr_priv and checking it early in the function?
>
> The node is allocated upfront. Then it iterates here and we only end up
> counting through the full list if there is no match. This is under a
> lock so "many clients" are serialized. If we allocate the node later
> then we need to do it under the lock.
>
> I don't think the node count exceeds 100 in production. So having a
> counter which is incremented while adding to the list and decremented
> while removing items from the list would optimize the "worst case". So
> instead traversing the list with 1000 we would just give up.

The counter is what I had in mind. I agree that allocating under the
lock isn't what we want.

I'd argue counting through the whole list is the normal case.
hsr_add_node() is only called after the node table has been searched
already (without the lock). Here we go through the whole list again
under the lock to prevent TOCTOU-type situations.

I agree that, overall, it would be optimizing the worst case, but I
think it may be worth it to prevent the memory allocations and walking
the whole list. But I'd go along with the (current) on-the-fly counting
as well.

Thanks,
   Felix


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-04-22  9:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21 14:50 [PATCH net v3 1/1] net: hsr: limit node table growth Ren Wei
2026-04-21 15:18 ` Andrew Lunn
2026-04-22  8:31 ` Felix Maurer
2026-04-22  8:52   ` Sebastian Andrzej Siewior
2026-04-22  9:45     ` Felix Maurer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox