* [PATCHv2 net] hsr: use proper locking when iterating over ports
@ 2025-08-27 9:33 Hangbin Liu
2025-08-28 9:19 ` Paolo Abeni
0 siblings, 1 reply; 4+ messages in thread
From: Hangbin Liu @ 2025-08-27 9:33 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, MD Danish Anwar, Alexander Lobakin,
Jaakko Karrenpalo, Fernando Fernandez Mancera, Murali Karicheri,
WingMan Kwok, Stanislav Fomichev, Xiao Liang, Kuniyuki Iwashima,
Johannes Berg, Hangbin Liu
hsr_for_each_port is called in many places without holding the RCU read
lock, this may trigger warnings on debug kernels like:
[ 40.457015] [ T201] WARNING: suspicious RCU usage
[ 40.457020] [ T201] 6.17.0-rc2-virtme #1 Not tainted
[ 40.457025] [ T201] -----------------------------
[ 40.457029] [ T201] net/hsr/hsr_main.c:137 RCU-list traversed in non-reader section!!
[ 40.457036] [ T201]
other info that might help us debug this:
[ 40.457040] [ T201]
rcu_scheduler_active = 2, debug_locks = 1
[ 40.457045] [ T201] 2 locks held by ip/201:
[ 40.457050] [ T201] #0: ffffffff93040a40 (&ops->srcu){.+.+}-{0:0}, at: rtnl_link_ops_get+0xf2/0x280
[ 40.457080] [ T201] #1: ffffffff92e7f968 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x5e1/0xb20
[ 40.457102] [ T201]
stack backtrace:
[ 40.457108] [ T201] CPU: 2 UID: 0 PID: 201 Comm: ip Not tainted 6.17.0-rc2-virtme #1 PREEMPT(full)
[ 40.457114] [ T201] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 40.457117] [ T201] Call Trace:
[ 40.457120] [ T201] <TASK>
[ 40.457126] [ T201] dump_stack_lvl+0x6f/0xb0
[ 40.457136] [ T201] lockdep_rcu_suspicious.cold+0x4f/0xb1
[ 40.457148] [ T201] hsr_port_get_hsr+0xfe/0x140
[ 40.457158] [ T201] hsr_add_port+0x192/0x940
[ 40.457167] [ T201] ? __pfx_hsr_add_port+0x10/0x10
[ 40.457176] [ T201] ? lockdep_init_map_type+0x5c/0x270
[ 40.457189] [ T201] hsr_dev_finalize+0x4bc/0xbf0
[ 40.457204] [ T201] hsr_newlink+0x3c3/0x8f0
[ 40.457212] [ T201] ? __pfx_hsr_newlink+0x10/0x10
[ 40.457222] [ T201] ? rtnl_create_link+0x173/0xe40
[ 40.457233] [ T201] rtnl_newlink_create+0x2cf/0x750
[ 40.457243] [ T201] ? __pfx_rtnl_newlink_create+0x10/0x10
[ 40.457247] [ T201] ? __dev_get_by_name+0x12/0x50
[ 40.457252] [ T201] ? rtnl_dev_get+0xac/0x140
[ 40.457259] [ T201] ? __pfx_rtnl_dev_get+0x10/0x10
[ 40.457285] [ T201] __rtnl_newlink+0x22c/0xa50
[ 40.457305] [ T201] rtnl_newlink+0x637/0xb20
Adding rcu_read_lock() for all hsr_for_each_port() looks confusing.
Introduce a new helper, hsr_for_each_port_rtnl(), that assumes the
RTNL lock is held. This allows callers in suitable contexts to iterate
ports safely without explicit RCU locking.
Other code paths that rely on RCU protection continue to use
hsr_for_each_port() with rcu_read_lock().
Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
v2: check rtnl lock for net_device_ops (Kuniyuki Iwashima)
hold rcu read lock in timer function (Kuniyuki Iwashima)
---
net/hsr/hsr_device.c | 24 ++++++++++++++----------
net/hsr/hsr_main.c | 12 ++++++++++--
net/hsr/hsr_main.h | 3 +++
net/hsr/hsr_netlink.c | 4 ----
4 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c
index 88657255fec1..c7d2419abecd 100644
--- a/net/hsr/hsr_device.c
+++ b/net/hsr/hsr_device.c
@@ -49,7 +49,7 @@ static bool hsr_check_carrier(struct hsr_port *master)
ASSERT_RTNL();
- hsr_for_each_port(master->hsr, port) {
+ hsr_for_each_port_rtnl(master->hsr, port) {
if (port->type != HSR_PT_MASTER && is_slave_up(port->dev)) {
netif_carrier_on(master->dev);
return true;
@@ -105,7 +105,7 @@ int hsr_get_max_mtu(struct hsr_priv *hsr)
struct hsr_port *port;
mtu_max = ETH_DATA_LEN;
- hsr_for_each_port(hsr, port)
+ hsr_for_each_port_rtnl(hsr, port)
if (port->type != HSR_PT_MASTER)
mtu_max = min(port->dev->mtu, mtu_max);
@@ -139,7 +139,7 @@ static int hsr_dev_open(struct net_device *dev)
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) {
+ hsr_for_each_port_rtnl(hsr, port) {
if (port->type == HSR_PT_MASTER)
continue;
switch (port->type) {
@@ -172,7 +172,7 @@ static int hsr_dev_close(struct net_device *dev)
struct hsr_priv *hsr;
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) {
+ hsr_for_each_port_rtnl(hsr, port) {
if (port->type == HSR_PT_MASTER)
continue;
switch (port->type) {
@@ -205,7 +205,7 @@ static netdev_features_t hsr_features_recompute(struct hsr_priv *hsr,
* may become enabled.
*/
features &= ~NETIF_F_ONE_FOR_ALL;
- hsr_for_each_port(hsr, port)
+ hsr_for_each_port_rtnl(hsr, port)
features = netdev_increment_features(features,
port->dev->features,
mask);
@@ -484,7 +484,7 @@ static void hsr_set_rx_mode(struct net_device *dev)
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) {
+ hsr_for_each_port_rtnl(hsr, port) {
if (port->type == HSR_PT_MASTER)
continue;
switch (port->type) {
@@ -506,7 +506,7 @@ static void hsr_change_rx_flags(struct net_device *dev, int change)
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) {
+ hsr_for_each_port_rtnl(hsr, port) {
if (port->type == HSR_PT_MASTER)
continue;
switch (port->type) {
@@ -534,7 +534,7 @@ static int hsr_ndo_vlan_rx_add_vid(struct net_device *dev,
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) {
+ hsr_for_each_port_rtnl(hsr, port) {
if (port->type == HSR_PT_MASTER ||
port->type == HSR_PT_INTERLINK)
continue;
@@ -580,7 +580,7 @@ static int hsr_ndo_vlan_rx_kill_vid(struct net_device *dev,
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) {
+ hsr_for_each_port_rtnl(hsr, port) {
switch (port->type) {
case HSR_PT_SLAVE_A:
case HSR_PT_SLAVE_B:
@@ -672,9 +672,13 @@ struct net_device *hsr_get_port_ndev(struct net_device *ndev,
struct hsr_priv *hsr = netdev_priv(ndev);
struct hsr_port *port;
+ rcu_read_lock();
hsr_for_each_port(hsr, port)
- if (port->type == pt)
+ if (port->type == pt) {
+ rcu_read_unlock();
return port->dev;
+ }
+ rcu_read_unlock();
return NULL;
}
EXPORT_SYMBOL(hsr_get_port_ndev);
diff --git a/net/hsr/hsr_main.c b/net/hsr/hsr_main.c
index 192893c3f2ec..eec6e20a8494 100644
--- a/net/hsr/hsr_main.c
+++ b/net/hsr/hsr_main.c
@@ -22,9 +22,13 @@ static bool hsr_slave_empty(struct hsr_priv *hsr)
{
struct hsr_port *port;
+ rcu_read_lock();
hsr_for_each_port(hsr, port)
- if (port->type != HSR_PT_MASTER)
+ if (port->type != HSR_PT_MASTER) {
+ rcu_read_unlock();
return false;
+ }
+ rcu_read_unlock();
return true;
}
@@ -134,9 +138,13 @@ struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt)
{
struct hsr_port *port;
+ rcu_read_lock();
hsr_for_each_port(hsr, port)
- if (port->type == pt)
+ if (port->type == pt) {
+ rcu_read_unlock();
return port;
+ }
+ rcu_read_unlock();
return NULL;
}
diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h
index 135ec5fce019..33b0d2460c9b 100644
--- a/net/hsr/hsr_main.h
+++ b/net/hsr/hsr_main.h
@@ -224,6 +224,9 @@ struct hsr_priv {
#define hsr_for_each_port(hsr, port) \
list_for_each_entry_rcu((port), &(hsr)->ports, port_list)
+#define hsr_for_each_port_rtnl(hsr, port) \
+ list_for_each_entry_rcu((port), &(hsr)->ports, port_list, lockdep_rtnl_is_held())
+
struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt);
/* Caller must ensure skb is a valid HSR frame */
diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c
index b120470246cc..f57c289e2322 100644
--- a/net/hsr/hsr_netlink.c
+++ b/net/hsr/hsr_netlink.c
@@ -241,10 +241,8 @@ void hsr_nl_ringerror(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN],
kfree_skb(skb);
fail:
- rcu_read_lock();
master = hsr_port_get_hsr(hsr, HSR_PT_MASTER);
netdev_warn(master->dev, "Could not send HSR ring error message\n");
- rcu_read_unlock();
}
/* This is called when we haven't heard from the node with MAC address addr for
@@ -278,10 +276,8 @@ void hsr_nl_nodedown(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN])
kfree_skb(skb);
fail:
- rcu_read_lock();
master = hsr_port_get_hsr(hsr, HSR_PT_MASTER);
netdev_warn(master->dev, "Could not send HSR node down\n");
- rcu_read_unlock();
}
/* HSR_C_GET_NODE_STATUS lets userspace query the internal HSR node table
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCHv2 net] hsr: use proper locking when iterating over ports
2025-08-27 9:33 [PATCHv2 net] hsr: use proper locking when iterating over ports Hangbin Liu
@ 2025-08-28 9:19 ` Paolo Abeni
2025-08-28 9:52 ` Hangbin Liu
0 siblings, 1 reply; 4+ messages in thread
From: Paolo Abeni @ 2025-08-28 9:19 UTC (permalink / raw)
To: Hangbin Liu, netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Simon Horman,
MD Danish Anwar, Alexander Lobakin, Jaakko Karrenpalo,
Fernando Fernandez Mancera, Murali Karicheri, WingMan Kwok,
Stanislav Fomichev, Xiao Liang, Kuniyuki Iwashima, Johannes Berg
On 8/27/25 11:33 AM, Hangbin Liu wrote:
> @@ -672,9 +672,13 @@ struct net_device *hsr_get_port_ndev(struct net_device *ndev,
> struct hsr_priv *hsr = netdev_priv(ndev);
> struct hsr_port *port;
>
> + rcu_read_lock();
> hsr_for_each_port(hsr, port)
> - if (port->type == pt)
> + if (port->type == pt) {
> + rcu_read_unlock();
> return port->dev;
This is not good enough. At this point accessing `port` could still
cause UaF;
The only callers, in icssg_prueth_hsr_{add,del}_mcast(), can be either
under the RTNL lock or not. A safer option would be acquiring a
reference on dev before releasing the rcu lock and let the caller drop
such reference
> + }
> + rcu_read_unlock();
> return NULL;
> }
> EXPORT_SYMBOL(hsr_get_port_ndev);
> diff --git a/net/hsr/hsr_main.c b/net/hsr/hsr_main.c
> index 192893c3f2ec..eec6e20a8494 100644
> --- a/net/hsr/hsr_main.c
> +++ b/net/hsr/hsr_main.c
> @@ -22,9 +22,13 @@ static bool hsr_slave_empty(struct hsr_priv *hsr)
> {
> struct hsr_port *port;
>
> + rcu_read_lock();
> hsr_for_each_port(hsr, port)
> - if (port->type != HSR_PT_MASTER)
> + if (port->type != HSR_PT_MASTER) {
> + rcu_read_unlock();
> return false;
> + }
> + rcu_read_unlock();
> return true;
> }
AFAICS the only caller of this helper is under the RTNL lock
> @@ -134,9 +138,13 @@ struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt)
> {
> struct hsr_port *port;
>
> + rcu_read_lock();
> hsr_for_each_port(hsr, port)
> - if (port->type == pt)
> + if (port->type == pt) {
> + rcu_read_unlock();
> return port;
The above is not enough.
AFAICS some/most caller are already either under the RTNL lock or the
rcu lock.
I think it would be better rename the hsr_for_each_port_rtnl() helper to
hsr_for_each_port_rcu(), retaining the current semantic, use it here,
and fix the caller as needed.
It will be useful to somehow split the patch in a series, as it's
already quite big and will increase even more.
/P
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCHv2 net] hsr: use proper locking when iterating over ports
2025-08-28 9:19 ` Paolo Abeni
@ 2025-08-28 9:52 ` Hangbin Liu
2025-08-28 10:40 ` Paolo Abeni
0 siblings, 1 reply; 4+ messages in thread
From: Hangbin Liu @ 2025-08-28 9:52 UTC (permalink / raw)
To: Paolo Abeni
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Simon Horman, MD Danish Anwar, Alexander Lobakin,
Jaakko Karrenpalo, Fernando Fernandez Mancera, Murali Karicheri,
WingMan Kwok, Stanislav Fomichev, Xiao Liang, Kuniyuki Iwashima,
Johannes Berg
On Thu, Aug 28, 2025 at 11:19:11AM +0200, Paolo Abeni wrote:
> On 8/27/25 11:33 AM, Hangbin Liu wrote:
> > @@ -672,9 +672,13 @@ struct net_device *hsr_get_port_ndev(struct net_device *ndev,
> > struct hsr_priv *hsr = netdev_priv(ndev);
> > struct hsr_port *port;
> >
> > + rcu_read_lock();
> > hsr_for_each_port(hsr, port)
> > - if (port->type == pt)
> > + if (port->type == pt) {
> > + rcu_read_unlock();
> > return port->dev;
>
> This is not good enough. At this point accessing `port` could still
> cause UaF;
>
> The only callers, in icssg_prueth_hsr_{add,del}_mcast(), can be either
> under the RTNL lock or not. A safer option would be acquiring a
> reference on dev before releasing the rcu lock and let the caller drop
> such reference
OK, thanks for the suggestion.
>
> > + }
> > + rcu_read_unlock();
> > return NULL;
> > }
> > EXPORT_SYMBOL(hsr_get_port_ndev);
> > diff --git a/net/hsr/hsr_main.c b/net/hsr/hsr_main.c
> > index 192893c3f2ec..eec6e20a8494 100644
> > --- a/net/hsr/hsr_main.c
> > +++ b/net/hsr/hsr_main.c
> > @@ -22,9 +22,13 @@ static bool hsr_slave_empty(struct hsr_priv *hsr)
> > {
> > struct hsr_port *port;
> >
> > + rcu_read_lock();
> > hsr_for_each_port(hsr, port)
> > - if (port->type != HSR_PT_MASTER)
> > + if (port->type != HSR_PT_MASTER) {
> > + rcu_read_unlock();
> > return false;
> > + }
> > + rcu_read_unlock();
> > return true;
> > }
>
> AFAICS the only caller of this helper is under the RTNL lock
Thanks, sometimes I not very sure if the caller is under RTNL lock or not.
Is there a good way to check this?
>
> > @@ -134,9 +138,13 @@ struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt)
> > {
> > struct hsr_port *port;
> >
> > + rcu_read_lock();
> > hsr_for_each_port(hsr, port)
> > - if (port->type == pt)
> > + if (port->type == pt) {
> > + rcu_read_unlock();
> > return port;
>
> The above is not enough.
>
> AFAICS some/most caller are already either under the RTNL lock or the
> rcu lock.
>
> I think it would be better rename the hsr_for_each_port_rtnl() helper to
> hsr_for_each_port_rcu(), retaining the current semantic, use it here,
> and fix the caller as needed.
Do you mean to modify like
#define hsr_for_each_port(hsr, port) \
list_for_each_entry_rcu((port), &(hsr)->ports, port_list)
+#define hsr_for_each_port_rcu(hsr, port) \
+ list_for_each_entry_rcu((port), &(hsr)->ports, port_list, lockdep_rtnl_is_held())
I'm not sure if the naming is clear. e.g. rcu_dereference_rtnl() also use rtnl
suffix to check if rtnl is held.
>
> It will be useful to somehow split the patch in a series, as it's
> already quite big and will increase even more.
OK.
Thanks
Hangbin
>
> /P
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCHv2 net] hsr: use proper locking when iterating over ports
2025-08-28 9:52 ` Hangbin Liu
@ 2025-08-28 10:40 ` Paolo Abeni
0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2025-08-28 10:40 UTC (permalink / raw)
To: Hangbin Liu
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Simon Horman, MD Danish Anwar, Alexander Lobakin,
Jaakko Karrenpalo, Fernando Fernandez Mancera, Murali Karicheri,
WingMan Kwok, Stanislav Fomichev, Xiao Liang, Kuniyuki Iwashima,
Johannes Berg
On 8/28/25 11:52 AM, Hangbin Liu wrote:
> On Thu, Aug 28, 2025 at 11:19:11AM +0200, Paolo Abeni wrote:
>> On 8/27/25 11:33 AM, Hangbin Liu wrote:
>>> diff --git a/net/hsr/hsr_main.c b/net/hsr/hsr_main.c
>>> index 192893c3f2ec..eec6e20a8494 100644
>>> --- a/net/hsr/hsr_main.c
>>> +++ b/net/hsr/hsr_main.c
>>> @@ -22,9 +22,13 @@ static bool hsr_slave_empty(struct hsr_priv *hsr)
>>> {
>>> struct hsr_port *port;
>>>
>>> + rcu_read_lock();
>>> hsr_for_each_port(hsr, port)
>>> - if (port->type != HSR_PT_MASTER)
>>> + if (port->type != HSR_PT_MASTER) {
>>> + rcu_read_unlock();
>>> return false;
>>> + }
>>> + rcu_read_unlock();
>>> return true;
>>> }
>>
>> AFAICS the only caller of this helper is under the RTNL lock
>
> Thanks, sometimes I not very sure if the caller is under RTNL lock or not.
> Is there a good way to check this?
I'm not aware of any formal way to do this check. I relay on code
inspection.
>>> @@ -134,9 +138,13 @@ struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt)
>>> {
>>> struct hsr_port *port;
>>>
>>> + rcu_read_lock();
>>> hsr_for_each_port(hsr, port)
>>> - if (port->type == pt)
>>> + if (port->type == pt) {
>>> + rcu_read_unlock();
>>> return port;
>>
>> The above is not enough.
>>
>> AFAICS some/most caller are already either under the RTNL lock or the
>> rcu lock.
>>
>> I think it would be better rename the hsr_for_each_port_rtnl() helper to
>> hsr_for_each_port_rcu(), retaining the current semantic, use it here,
>> and fix the caller as needed.
>
> Do you mean to modify like
>
> #define hsr_for_each_port(hsr, port) \
> list_for_each_entry_rcu((port), &(hsr)->ports, port_list)
>
> +#define hsr_for_each_port_rcu(hsr, port) \
> + list_for_each_entry_rcu((port), &(hsr)->ports, port_list, lockdep_rtnl_is_held())
>
>
> I'm not sure if the naming is clear. e.g. rcu_dereference_rtnl() also use rtnl
> suffix to check if rtnl is held.
My naming suggestions are usually not that good, feel free to opt for a
better name. The more substantial feedback here is to properly address
the relevant callers.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-08-28 10:40 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-27 9:33 [PATCHv2 net] hsr: use proper locking when iterating over ports Hangbin Liu
2025-08-28 9:19 ` Paolo Abeni
2025-08-28 9:52 ` Hangbin Liu
2025-08-28 10:40 ` Paolo Abeni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).