* [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
@ 2023-05-15 5:37 Taehee Yoo
2023-05-15 6:24 ` Nikolay Aleksandrov
2023-05-15 13:11 ` Simon Horman
0 siblings, 2 replies; 7+ messages in thread
From: Taehee Yoo @ 2023-05-15 5:37 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, jiri, j.vosburgh, andy, netdev
Cc: jarod, wangyufen, ap420073, syzbot+60748c96cf5c6df8e581
When the virtual interface's feature is updated, it synchronizes the
updated feature for its own lower interface.
This propagation logic should be worked as the iteration, not recursively.
But it works recursively due to the netdev notification unexpectedly.
This problem occurs when it disables LRO only for the team and bonding
interface type.
team0
|
+------+------+-----+-----+
| | | | |
team1 team2 team3 ... team200
If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
event to its own lower interfaces(team1 ~ team200).
It is worked by netdev_sync_lower_features().
So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
work iteratively.
But generated NETDEV_FEAT_CHANGE event is also sent to the upper
interface too.
upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
lower interfaces again.
lower and upper interfaces receive this event and generate this
event again and again.
So, the stack overflow occurs.
But it is not the infinite loop issue.
Because the netdev_sync_lower_features() updates features before
generating the NETDEV_FEAT_CHANGE event.
Already synchronized lower interfaces skip notification logic.
So, it is just the problem that iteration logic is changed to the
recursive unexpectedly due to the notification mechanism.
Reproducer:
ip link add team0 type team
ethtool -K team0 lro on
for i in {1..200}
do
ip link add team$i master team0 type team
ethtool -K team$i lro on
done
ethtool -K team0 lro off
In order to fix it, the priv_notifier_ctx net_device member is introduced.
This variable can be used by each interface in its own way in the
notification context. The bonding and team interface is going to use it
to avoid duplicated NETDEV_FEAT_CHANGE event handling.
Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
drivers/net/bonding/bond_main.c | 6 +++++-
drivers/net/team/team.c | 6 +++++-
include/linux/netdevice.h | 1 +
net/core/dev.c | 2 ++
4 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3fed888629f7..dc6d5172475c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3947,7 +3947,11 @@ static int bond_slave_netdev_event(unsigned long event,
unblock_netpoll_tx();
break;
case NETDEV_FEAT_CHANGE:
- bond_compute_features(bond);
+ if (!bond_dev->priv_notifier_ctx) {
+ bond_dev->priv_notifier_ctx = NETDEV_FEAT_CHANGE;
+ bond_compute_features(bond);
+ bond_dev->priv_notifier_ctx = 0;
+ }
break;
case NETDEV_RESEND_IGMP:
/* Propagate to master device */
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index d10606f257c4..c7af29d3cda0 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -3022,7 +3022,11 @@ static int team_device_event(struct notifier_block *unused,
team_del_slave(port->team->dev, dev);
break;
case NETDEV_FEAT_CHANGE:
- team_compute_features(port->team);
+ if (!port->team->dev->priv_notifier_ctx) {
+ port->team->dev->priv_notifier_ctx = NETDEV_FEAT_CHANGE;
+ team_compute_features(port->team);
+ port->team->dev->priv_notifier_ctx = 0;
+ }
break;
case NETDEV_PRECHANGEMTU:
/* Forbid to change mtu of underlaying device */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 08fbd4622ccf..ebd49a54f0d5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2393,6 +2393,7 @@ struct net_device {
unsigned threaded:1;
struct list_head net_notifier_list;
+ u32 priv_notifier_ctx;
#if IS_ENABLED(CONFIG_MACSEC)
/* MACsec management functions */
diff --git a/net/core/dev.c b/net/core/dev.c
index b3c13e041935..cc6b5f626054 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10677,6 +10677,8 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
+ dev->priv_notifier_ctx = 0;
+
if (!dev->tx_queue_len) {
dev->priv_flags |= IFF_NO_QUEUE;
dev->tx_queue_len = DEFAULT_TX_QUEUE_LEN;
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
2023-05-15 5:37 [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces Taehee Yoo
@ 2023-05-15 6:24 ` Nikolay Aleksandrov
2023-05-15 9:12 ` Taehee Yoo
2023-05-15 13:11 ` Simon Horman
1 sibling, 1 reply; 7+ messages in thread
From: Nikolay Aleksandrov @ 2023-05-15 6:24 UTC (permalink / raw)
To: Taehee Yoo, davem, kuba, pabeni, edumazet, jiri, j.vosburgh, andy,
netdev
Cc: jarod, wangyufen, syzbot+60748c96cf5c6df8e581
On 15/05/2023 08:37, Taehee Yoo wrote:
> When the virtual interface's feature is updated, it synchronizes the
> updated feature for its own lower interface.
> This propagation logic should be worked as the iteration, not recursively.
> But it works recursively due to the netdev notification unexpectedly.
> This problem occurs when it disables LRO only for the team and bonding
> interface type.
>
> team0
> |
> +------+------+-----+-----+
> | | | | |
> team1 team2 team3 ... team200
>
> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> event to its own lower interfaces(team1 ~ team200).
> It is worked by netdev_sync_lower_features().
> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> work iteratively.
> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
> interface too.
> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
> lower interfaces again.
> lower and upper interfaces receive this event and generate this
> event again and again.
> So, the stack overflow occurs.
>
> But it is not the infinite loop issue.
> Because the netdev_sync_lower_features() updates features before
> generating the NETDEV_FEAT_CHANGE event.
> Already synchronized lower interfaces skip notification logic.
> So, it is just the problem that iteration logic is changed to the
> recursive unexpectedly due to the notification mechanism.
>
> Reproducer:
>
> ip link add team0 type team
> ethtool -K team0 lro on
> for i in {1..200}
> do
> ip link add team$i master team0 type team
> ethtool -K team$i lro on
> done
>
> ethtool -K team0 lro off
>
> In order to fix it, the priv_notifier_ctx net_device member is introduced.
> This variable can be used by each interface in its own way in the
> notification context. The bonding and team interface is going to use it
> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
>
> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
> drivers/net/bonding/bond_main.c | 6 +++++-
> drivers/net/team/team.c | 6 +++++-
> include/linux/netdevice.h | 1 +
> net/core/dev.c | 2 ++
> 4 files changed, 13 insertions(+), 2 deletions(-)
>
Since you're syncing to lower devices, can't you check if the event source device
is lower to the current one (i.e. reverse propagation has happened) in the affected
drivers ? Adding a new struct netdevice member just for this seems unnecessary to me.
Especially for a setup like a bond of bonds or a team of teams, these are corner case
setups that shouldn't exist in general. :)
Cheers,
Nik
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
2023-05-15 6:24 ` Nikolay Aleksandrov
@ 2023-05-15 9:12 ` Taehee Yoo
2023-05-16 8:34 ` Paolo Abeni
0 siblings, 1 reply; 7+ messages in thread
From: Taehee Yoo @ 2023-05-15 9:12 UTC (permalink / raw)
To: Nikolay Aleksandrov, davem, kuba, pabeni, edumazet, jiri,
j.vosburgh, andy, netdev
Cc: jarod, wangyufen, syzbot+60748c96cf5c6df8e581
On 5/15/23 15:24, Nikolay Aleksandrov wrote:
Hi Nikolay,
Thank you so much for the review!
> On 15/05/2023 08:37, Taehee Yoo wrote:
>> When the virtual interface's feature is updated, it synchronizes the
>> updated feature for its own lower interface.
>> This propagation logic should be worked as the iteration, not
recursively.
>> But it works recursively due to the netdev notification unexpectedly.
>> This problem occurs when it disables LRO only for the team and bonding
>> interface type.
>>
>> team0
>> |
>> +------+------+-----+-----+
>> | | | | |
>> team1 team2 team3 ... team200
>>
>> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
>> event to its own lower interfaces(team1 ~ team200).
>> It is worked by netdev_sync_lower_features().
>> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
>> work iteratively.
>> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
>> interface too.
>> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for
its own
>> lower interfaces again.
>> lower and upper interfaces receive this event and generate this
>> event again and again.
>> So, the stack overflow occurs.
>>
>> But it is not the infinite loop issue.
>> Because the netdev_sync_lower_features() updates features before
>> generating the NETDEV_FEAT_CHANGE event.
>> Already synchronized lower interfaces skip notification logic.
>> So, it is just the problem that iteration logic is changed to the
>> recursive unexpectedly due to the notification mechanism.
>>
>> Reproducer:
>>
>> ip link add team0 type team
>> ethtool -K team0 lro on
>> for i in {1..200}
>> do
>> ip link add team$i master team0 type team
>> ethtool -K team$i lro on
>> done
>>
>> ethtool -K team0 lro off
>>
>> In order to fix it, the priv_notifier_ctx net_device member is
introduced.
>> This variable can be used by each interface in its own way in the
>> notification context. The bonding and team interface is going to use it
>> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
>>
>> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
>> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev
features down stack")
>> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>> ---
>> drivers/net/bonding/bond_main.c | 6 +++++-
>> drivers/net/team/team.c | 6 +++++-
>> include/linux/netdevice.h | 1 +
>> net/core/dev.c | 2 ++
>> 4 files changed, 13 insertions(+), 2 deletions(-)
>>
>
> Since you're syncing to lower devices, can't you check if the event
source device
> is lower to the current one (i.e. reverse propagation has happened)
in the affected
> drivers ? Adding a new struct netdevice member just for this seems
unnecessary to me.
> Especially for a setup like a bond of bonds or a team of teams, these
are corner case
> setups that shouldn't exist in general. :)
>
I agree that this new variable is unnecessary right now.
I tried to avoid introducing new variables, but unfortunately, I
couldn't find a solution to detect duplicated notification events.
The reason why I introduced the new member of the net_device is that I
thought there might be similar problems in the future such as mtu.
so, I hoped that it can be used as a general variable to avoid similar
problems.
But I really agree that this new variable is over-spec.
So, adding a new boolean variable into the struct bonding and team, not
net_device would be reasonable if I can't find a proper solution.
Yes, the above interface graph is not a real-world case.
The purpose of the above is just to trigger stack overflow problems for
anyone with just copy-and-paste to make it easy for testing.
It can't reproduce this problem with LRO non-support virtual interfaces
such as dummy, VLAN, and others.
we can reproduce this problem with a team and bonding interface, so I
used team over team as a reproducer.
I will send a v2 patch after trying to find better solution for days,
which would not introduce the new member of net_device.
If I can't find it, v2 would introduce a new member into struct bonding
and struct team.
Of course, any ideas are welcome!
Thank you so much!
Taehee Yoo
> Cheers,
> Nik
>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
2023-05-15 9:12 ` Taehee Yoo
@ 2023-05-16 8:34 ` Paolo Abeni
2023-05-16 11:29 ` Taehee Yoo
0 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2023-05-16 8:34 UTC (permalink / raw)
To: Taehee Yoo, Nikolay Aleksandrov, davem, kuba, edumazet, jiri,
j.vosburgh, andy, netdev
Cc: jarod, wangyufen, syzbot+60748c96cf5c6df8e581
On Mon, 2023-05-15 at 18:12 +0900, Taehee Yoo wrote:
> On 5/15/23 15:24, Nikolay Aleksandrov wrote:
> > On 15/05/2023 08:37, Taehee Yoo wrote:
> >> When the virtual interface's feature is updated, it synchronizes the
> >> updated feature for its own lower interface.
> >> This propagation logic should be worked as the iteration, not
> recursively.
> >> But it works recursively due to the netdev notification unexpectedly.
> >> This problem occurs when it disables LRO only for the team and bonding
> >> interface type.
> >>
> >> team0
> >> |
> >> +------+------+-----+-----+
> >> | | | | |
> >> team1 team2 team3 ... team200
> >>
> >> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> >> event to its own lower interfaces(team1 ~ team200).
> >> It is worked by netdev_sync_lower_features().
> >> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> >> work iteratively.
> >> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
> >> interface too.
> >> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for
> its own
> >> lower interfaces again.
> >> lower and upper interfaces receive this event and generate this
> >> event again and again.
> >> So, the stack overflow occurs.
> >>
> >> But it is not the infinite loop issue.
> >> Because the netdev_sync_lower_features() updates features before
> >> generating the NETDEV_FEAT_CHANGE event.
> >> Already synchronized lower interfaces skip notification logic.
> >> So, it is just the problem that iteration logic is changed to the
> >> recursive unexpectedly due to the notification mechanism.
> >>
> >> Reproducer:
> >>
> >> ip link add team0 type team
> >> ethtool -K team0 lro on
> >> for i in {1..200}
> >> do
> >> ip link add team$i master team0 type team
> >> ethtool -K team$i lro on
> >> done
> >>
> >> ethtool -K team0 lro off
> >>
> >> In order to fix it, the priv_notifier_ctx net_device member is
> introduced.
> >> This variable can be used by each interface in its own way in the
> >> notification context. The bonding and team interface is going to use it
> >> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
> >>
> >> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
> >> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev
> features down stack")
> >> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> >> ---
> >> drivers/net/bonding/bond_main.c | 6 +++++-
> >> drivers/net/team/team.c | 6 +++++-
> >> include/linux/netdevice.h | 1 +
> >> net/core/dev.c | 2 ++
> >> 4 files changed, 13 insertions(+), 2 deletions(-)
> >>
> >
> > Since you're syncing to lower devices, can't you check if the event
> source device
> > is lower to the current one (i.e. reverse propagation has happened)
> in the affected
> > drivers ? Adding a new struct netdevice member just for this seems
> unnecessary to me.
> > Especially for a setup like a bond of bonds or a team of teams, these
> are corner case
> > setups that shouldn't exist in general. :)
> >
>
> I agree that this new variable is unnecessary right now.
> I tried to avoid introducing new variables, but unfortunately, I
> couldn't find a solution to detect duplicated notification events.
>
> The reason why I introduced the new member of the net_device is that I
> thought there might be similar problems in the future such as mtu.
> so, I hoped that it can be used as a general variable to avoid similar
> problems.
> But I really agree that this new variable is over-spec.
> So, adding a new boolean variable into the struct bonding and team, not
> net_device would be reasonable if I can't find a proper solution.
I think adding a bool variable to bonding/team priv would be better, as
it looks like the issues is specific to such kind of devices.
Thanks!
Paolo
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
2023-05-16 8:34 ` Paolo Abeni
@ 2023-05-16 11:29 ` Taehee Yoo
0 siblings, 0 replies; 7+ messages in thread
From: Taehee Yoo @ 2023-05-16 11:29 UTC (permalink / raw)
To: Paolo Abeni, Nikolay Aleksandrov, davem, kuba, edumazet, jiri,
j.vosburgh, andy, netdev
Cc: jarod, wangyufen, syzbot+60748c96cf5c6df8e581
On 5/16/23 17:34, Paolo Abeni wrote:
Hi Paolo,
Thank you so much for the review!
> On Mon, 2023-05-15 at 18:12 +0900, Taehee Yoo wrote:
>> On 5/15/23 15:24, Nikolay Aleksandrov wrote:
>> > On 15/05/2023 08:37, Taehee Yoo wrote:
>> >> When the virtual interface's feature is updated, it
synchronizes the
>> >> updated feature for its own lower interface.
>> >> This propagation logic should be worked as the iteration, not
>> recursively.
>> >> But it works recursively due to the netdev notification
unexpectedly.
>> >> This problem occurs when it disables LRO only for the team and
bonding
>> >> interface type.
>> >>
>> >> team0
>> >> |
>> >> +------+------+-----+-----+
>> >> | | | | |
>> >> team1 team2 team3 ... team200
>> >>
>> >> If team0's LRO feature is updated, it generates the
NETDEV_FEAT_CHANGE
>> >> event to its own lower interfaces(team1 ~ team200).
>> >> It is worked by netdev_sync_lower_features().
>> >> So, the NETDEV_FEAT_CHANGE notification logic of each lower
interface
>> >> work iteratively.
>> >> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
>> >> interface too.
>> >> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for
>> its own
>> >> lower interfaces again.
>> >> lower and upper interfaces receive this event and generate this
>> >> event again and again.
>> >> So, the stack overflow occurs.
>> >>
>> >> But it is not the infinite loop issue.
>> >> Because the netdev_sync_lower_features() updates features before
>> >> generating the NETDEV_FEAT_CHANGE event.
>> >> Already synchronized lower interfaces skip notification logic.
>> >> So, it is just the problem that iteration logic is changed to the
>> >> recursive unexpectedly due to the notification mechanism.
>> >>
>> >> Reproducer:
>> >>
>> >> ip link add team0 type team
>> >> ethtool -K team0 lro on
>> >> for i in {1..200}
>> >> do
>> >> ip link add team$i master team0 type team
>> >> ethtool -K team$i lro on
>> >> done
>> >>
>> >> ethtool -K team0 lro off
>> >>
>> >> In order to fix it, the priv_notifier_ctx net_device member is
>> introduced.
>> >> This variable can be used by each interface in its own way in the
>> >> notification context. The bonding and team interface is going
to use it
>> >> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
>> >>
>> >> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
>> >> Fixes: fd867d51f889 ("net/core: generic support for disabling
netdev
>> features down stack")
>> >> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>> >> ---
>> >> drivers/net/bonding/bond_main.c | 6 +++++-
>> >> drivers/net/team/team.c | 6 +++++-
>> >> include/linux/netdevice.h | 1 +
>> >> net/core/dev.c | 2 ++
>> >> 4 files changed, 13 insertions(+), 2 deletions(-)
>> >>
>> >
>> > Since you're syncing to lower devices, can't you check if the event
>> source device
>> > is lower to the current one (i.e. reverse propagation has happened)
>> in the affected
>> > drivers ? Adding a new struct netdevice member just for this seems
>> unnecessary to me.
>> > Especially for a setup like a bond of bonds or a team of teams,
these
>> are corner case
>> > setups that shouldn't exist in general. :)
>> >
>>
>> I agree that this new variable is unnecessary right now.
>> I tried to avoid introducing new variables, but unfortunately, I
>> couldn't find a solution to detect duplicated notification events.
>>
>> The reason why I introduced the new member of the net_device is that I
>> thought there might be similar problems in the future such as mtu.
>> so, I hoped that it can be used as a general variable to avoid similar
>> problems.
>> But I really agree that this new variable is over-spec.
>> So, adding a new boolean variable into the struct bonding and team, not
>> net_device would be reasonable if I can't find a proper solution.
>
> I think adding a bool variable to bonding/team priv would be better, as
> it looks like the issues is specific to such kind of devices.
>
Thanks, I will add a bool variable to the bonding and team struct in the v2.
Thank you so much!
Taehee Yoo
> Thanks!
>
> Paolo
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
2023-05-15 5:37 [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces Taehee Yoo
2023-05-15 6:24 ` Nikolay Aleksandrov
@ 2023-05-15 13:11 ` Simon Horman
2023-05-15 16:21 ` Taehee Yoo
1 sibling, 1 reply; 7+ messages in thread
From: Simon Horman @ 2023-05-15 13:11 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, jiri, j.vosburgh, andy, netdev,
jarod, wangyufen, syzbot+60748c96cf5c6df8e581
On Mon, May 15, 2023 at 05:37:40AM +0000, Taehee Yoo wrote:
> When the virtual interface's feature is updated, it synchronizes the
> updated feature for its own lower interface.
> This propagation logic should be worked as the iteration, not recursively.
> But it works recursively due to the netdev notification unexpectedly.
> This problem occurs when it disables LRO only for the team and bonding
> interface type.
>
> team0
> |
> +------+------+-----+-----+
> | | | | |
> team1 team2 team3 ... team200
>
> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> event to its own lower interfaces(team1 ~ team200).
> It is worked by netdev_sync_lower_features().
> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> work iteratively.
> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
> interface too.
> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
> lower interfaces again.
> lower and upper interfaces receive this event and generate this
> event again and again.
> So, the stack overflow occurs.
>
> But it is not the infinite loop issue.
> Because the netdev_sync_lower_features() updates features before
> generating the NETDEV_FEAT_CHANGE event.
> Already synchronized lower interfaces skip notification logic.
> So, it is just the problem that iteration logic is changed to the
> recursive unexpectedly due to the notification mechanism.
>
> Reproducer:
>
> ip link add team0 type team
> ethtool -K team0 lro on
> for i in {1..200}
> do
> ip link add team$i master team0 type team
> ethtool -K team$i lro on
> done
>
> ethtool -K team0 lro off
>
> In order to fix it, the priv_notifier_ctx net_device member is introduced.
> This variable can be used by each interface in its own way in the
> notification context. The bonding and team interface is going to use it
> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
>
> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
...
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 08fbd4622ccf..ebd49a54f0d5 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2393,6 +2393,7 @@ struct net_device {
> unsigned threaded:1;
>
> struct list_head net_notifier_list;
> + u32 priv_notifier_ctx;
Hi Taehee,
Please add this new field to the kdoc for struct net_device.
>
> #if IS_ENABLED(CONFIG_MACSEC)
> /* MACsec management functions */
...
---
pw-bot: cr
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
2023-05-15 13:11 ` Simon Horman
@ 2023-05-15 16:21 ` Taehee Yoo
0 siblings, 0 replies; 7+ messages in thread
From: Taehee Yoo @ 2023-05-15 16:21 UTC (permalink / raw)
To: Simon Horman
Cc: davem, kuba, pabeni, edumazet, jiri, j.vosburgh, andy, netdev,
jarod, wangyufen, syzbot+60748c96cf5c6df8e581
On 5/15/23 22:11, Simon Horman wrote:
Hi Simon,
Thank you so much for the review!
> On Mon, May 15, 2023 at 05:37:40AM +0000, Taehee Yoo wrote:
>> When the virtual interface's feature is updated, it synchronizes the
>> updated feature for its own lower interface.
>> This propagation logic should be worked as the iteration, not
recursively.
>> But it works recursively due to the netdev notification unexpectedly.
>> This problem occurs when it disables LRO only for the team and bonding
>> interface type.
>>
>> team0
>> |
>> +------+------+-----+-----+
>> | | | | |
>> team1 team2 team3 ... team200
>>
>> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
>> event to its own lower interfaces(team1 ~ team200).
>> It is worked by netdev_sync_lower_features().
>> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
>> work iteratively.
>> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
>> interface too.
>> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for
its own
>> lower interfaces again.
>> lower and upper interfaces receive this event and generate this
>> event again and again.
>> So, the stack overflow occurs.
>>
>> But it is not the infinite loop issue.
>> Because the netdev_sync_lower_features() updates features before
>> generating the NETDEV_FEAT_CHANGE event.
>> Already synchronized lower interfaces skip notification logic.
>> So, it is just the problem that iteration logic is changed to the
>> recursive unexpectedly due to the notification mechanism.
>>
>> Reproducer:
>>
>> ip link add team0 type team
>> ethtool -K team0 lro on
>> for i in {1..200}
>> do
>> ip link add team$i master team0 type team
>> ethtool -K team$i lro on
>> done
>>
>> ethtool -K team0 lro off
>>
>> In order to fix it, the priv_notifier_ctx net_device member is
introduced.
>> This variable can be used by each interface in its own way in the
>> notification context. The bonding and team interface is going to use it
>> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
>>
>> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
>> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev
features down stack")
>> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>
> ...
>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 08fbd4622ccf..ebd49a54f0d5 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2393,6 +2393,7 @@ struct net_device {
>> unsigned threaded:1;
>>
>> struct list_head net_notifier_list;
>> + u32 priv_notifier_ctx;
>
> Hi Taehee,
>
> Please add this new field to the kdoc for struct net_device.
>
Thanks! I will check this before submitting the v2 patch.
Thank you so much,
Taehee Yoo
>>
>> #if IS_ENABLED(CONFIG_MACSEC)
>> /* MACsec management functions */
>
> ...
>
> ---
> pw-bot: cr
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-05-16 11:29 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-15 5:37 [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces Taehee Yoo
2023-05-15 6:24 ` Nikolay Aleksandrov
2023-05-15 9:12 ` Taehee Yoo
2023-05-16 8:34 ` Paolo Abeni
2023-05-16 11:29 ` Taehee Yoo
2023-05-15 13:11 ` Simon Horman
2023-05-15 16:21 ` Taehee Yoo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).