From: Taehee Yoo <ap420073@gmail.com>
To: Nikolay Aleksandrov <razor@blackwall.org>,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
edumazet@google.com, jiri@resnulli.us, j.vosburgh@gmail.com,
andy@greyhouse.net, netdev@vger.kernel.org
Cc: jarod@redhat.com, wangyufen@huawei.com,
syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
Subject: Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces
Date: Mon, 15 May 2023 18:12:52 +0900
Message-ID: <52da9cd3-508f-eb7d-98b3-cd777acc90eb@gmail.com>
In-Reply-To: <eeff656b-22ac-082d-9b94-62980e806f0f@blackwall.org>
On 5/15/23 15:24, Nikolay Aleksandrov wrote:
Hi Nikolay,
Thank you so much for the review!
> On 15/05/2023 08:37, Taehee Yoo wrote:
>> When the virtual interface's feature is updated, it synchronizes the
>> updated feature for its own lower interface.
>> This propagation logic should be worked as the iteration, not
>> recursively.
>> But it works recursively due to the netdev notification unexpectedly.
>> This problem occurs when it disables LRO only for the team and bonding
>> interface type.
>>
>>                 team0
>>                   |
>>   +------+------+-----+-----+
>>   |      |      |     |     |
>> team1  team2  team3  ...  team200
>>
>> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
>> event to its own lower interfaces(team1 ~ team200).
>> It is worked by netdev_sync_lower_features().
>> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
>> work iteratively.
>> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
>> interface too.
>> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for
>> its own lower interfaces again.
>> lower and upper interfaces receive this event and generate this
>> event again and again.
>> So, the stack overflow occurs.
>>
>> But it is not the infinite loop issue.
>> Because the netdev_sync_lower_features() updates features before
>> generating the NETDEV_FEAT_CHANGE event.
>> Already synchronized lower interfaces skip notification logic.
>> So, it is just the problem that iteration logic is changed to the
>> recursive unexpectedly due to the notification mechanism.
>>
>> Reproducer:
>>
>>     ip link add team0 type team
>>     ethtool -K team0 lro on
>>     for i in {1..200}
>>     do
>>             ip link add team$i master team0 type team
>>             ethtool -K team$i lro on
>>     done
>>
>>     ethtool -K team0 lro off
>>
>> In order to fix it, the priv_notifier_ctx net_device member is
>> introduced.
>> This variable can be used by each interface in its own way in the
>> notification context. The bonding and team interface is going to use it
>> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
>>
>> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
>> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
>> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>> ---
>> drivers/net/bonding/bond_main.c | 6 +++++-
>> drivers/net/team/team.c | 6 +++++-
>> include/linux/netdevice.h | 1 +
>> net/core/dev.c | 2 ++
>> 4 files changed, 13 insertions(+), 2 deletions(-)
>>
>
> Since you're syncing to lower devices, can't you check if the event source device
> is lower to the current one (i.e. reverse propagation has happened) in the affected
> drivers ? Adding a new struct netdevice member just for this seems unnecessary to me.
> Especially for a setup like a bond of bonds or a team of teams, these are corner case
> setups that shouldn't exist in general. :)
>
I agree that this new variable is unnecessary right now.
I tried to avoid introducing a new variable, but unfortunately I
couldn't find another way to detect duplicated notification events.
I added the new net_device member because I thought similar problems
might show up in the future, for example with MTU changes, so I hoped
it could serve as a general-purpose variable for avoiding them.
But I really agree that this new variable is over-engineered.
So, if I can't find a proper solution, adding a new boolean variable to
struct bonding and struct team, instead of net_device, would be reasonable.
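To make the per-driver boolean idea concrete, here is a rough userspace
model in C. It is only a sketch under stated assumptions, not driver code:
struct model_master, in_feat_change, use_guard, and all function names are
hypothetical, and the real notifier chain is reduced to a direct call.
It models how the sync loop becomes nested through the notification, and
how a flag on the master keeps it flat.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_LOWER 200

/* Hypothetical userspace model of a team/bonding master; not kernel code. */
struct model_master {
	bool lower_lro[NUM_LOWER]; /* LRO state of each lower interface */
	bool use_guard;            /* true: apply the proposed flag check */
	bool in_feat_change;       /* hypothetical per-master boolean */
	int depth;                 /* current nesting of the sync routine */
	int max_depth;             /* deepest nesting seen so far */
};

static void master_feat_change_event(struct model_master *m);

/* Models netdev_sync_lower_features(): update each lower, then notify. */
static void sync_lower_features(struct model_master *m)
{
	if (++m->depth > m->max_depth)
		m->max_depth = m->depth;

	for (int i = 0; i < NUM_LOWER; i++) {
		if (!m->lower_lro[i])
			continue;            /* already synchronized, skipped */
		m->lower_lro[i] = false;     /* feature is updated first ... */
		master_feat_change_event(m); /* ... then the event reaches the upper */
	}

	m->depth--;
}

/* Models the master's handler for NETDEV_FEAT_CHANGE from one of its lowers. */
static void master_feat_change_event(struct model_master *m)
{
	if (m->use_guard && m->in_feat_change)
		return;                      /* duplicated event, ignore it */
	sync_lower_features(m);
}

/* Disables LRO on the master and returns the deepest nesting reached. */
int model_max_depth(bool use_guard)
{
	struct model_master m = { .use_guard = use_guard };

	for (int i = 0; i < NUM_LOWER; i++)
		m.lower_lro[i] = true;

	m.in_feat_change = true;   /* set while the master is propagating */
	sync_lower_features(&m);
	m.in_feat_change = false;

	for (int i = 0; i < NUM_LOWER; i++)
		assert(!m.lower_lro[i]); /* every lower ends with LRO off */

	return m.max_depth;
}
```

In this model, model_max_depth(false) returns 201 (one nested call per
lower interface plus the initial one), while model_max_depth(true)
returns 1. The loop terminates in both cases because each lower is
updated before its event is sent.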
Yes, the interface graph above is not a real-world case.
Its only purpose is to let anyone trigger the stack overflow with a
simple copy-and-paste, to make testing easy.
The problem can't be reproduced with virtual interfaces that don't
support LRO, such as dummy, VLAN, and others.
It can be reproduced with team and bonding interfaces, so I used team
over team as the reproducer.
I will send a v2 patch after spending a few days looking for a better
solution that does not add a new member to net_device.
If I can't find one, v2 will add a new member to struct bonding and
struct team instead.
Of course, any ideas are welcome!
Thank you so much!
Taehee Yoo
> Cheers,
> Nik
>