From: Shigeru Yoshida <syoshida@redhat.com>
To: Nikolay Aleksandrov <razor@blackwall.org>
Cc: j.vosburgh@gmail.com, andy@greyhouse.net, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
syzbot+9dfc3f3348729cc82277@syzkaller.appspotmail.com
Subject: Re: [PATCH net] bonding: Fix warning in default_device_exit_batch()
Date: Tue, 14 Mar 2023 11:17:29 +0900 [thread overview]
Message-ID: <ZA/ZOTGGADHe8TL7@kernel-devel> (raw)
In-Reply-To: <45fc873b-9b71-adf2-8f2f-17134344e490@blackwall.org>
Hi Nik,
On Mon, Mar 13, 2023 at 12:52:44PM +0200, Nikolay Aleksandrov wrote:
> On 13/03/2023 11:35, Shigeru Yoshida wrote:
> > Hi Nik,
> >
> > On Sun, Mar 12, 2023 at 10:58:18PM +0200, Nikolay Aleksandrov wrote:
> >> On 12/03/2023 17:21, Shigeru Yoshida wrote:
> >>> syzbot reported warning in default_device_exit_batch() like below [1]:
> >>>
> >>> WARNING: CPU: 1 PID: 56 at net/core/dev.c:10867 unregister_netdevice_many_notify+0x14cf/0x19f0 net/core/dev.c:10867
> >>> ...
> >>> Call Trace:
> >>> <TASK>
> >>> unregister_netdevice_many net/core/dev.c:10897 [inline]
> >>> default_device_exit_batch+0x451/0x5b0 net/core/dev.c:11350
> >>> ops_exit_list+0x125/0x170 net/core/net_namespace.c:174
> >>> cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:613
> >>> process_one_work+0x9bf/0x1820 kernel/workqueue.c:2390
> >>> worker_thread+0x669/0x1090 kernel/workqueue.c:2537
> >>> kthread+0x2e8/0x3a0 kernel/kthread.c:376
> >>> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
> >>> </TASK>
> >>>
> >>> For bond devices which also has a master device, IFF_SLAVE flag is
> >>> cleared at err_undo_flags label in bond_enslave() if it is not
> >>> ARPHRD_ETHER type. In this case, __bond_release_one() is not called
> >>> when bond_netdev_event() received NETDEV_UNREGISTER event. This
> >>> causes the above warning.
> >>>
> >>> This patch fixes this issue by setting IFF_SLAVE flag at
> >>> err_undo_flags label in bond_enslave() if the bond device has a master
> >>> device.
> >>>
> >>
> >> The proper way is to check if the bond device had the IFF_SLAVE flag before the
> >> ether_setup() call which clears it, and restore it after.
> >>
> >>> Fixes: 7d5cd2ce5292 ("bonding: correctly handle bonding type change on enslave failure")
> >>> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>> Link: https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef [1]
> >>> Reported-by: syzbot+9dfc3f3348729cc82277@syzkaller.appspotmail.com
> >>> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
> >>> ---
> >>> drivers/net/bonding/bond_main.c | 2 ++
> >>> include/net/bonding.h | 5 +++++
> >>> 2 files changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >>> index 00646aa315c3..1a8b59e1468d 100644
> >>> --- a/drivers/net/bonding/bond_main.c
> >>> +++ b/drivers/net/bonding/bond_main.c
> >>> @@ -2291,6 +2291,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> >>> dev_close(bond_dev);
> >>> ether_setup(bond_dev);
> >>> bond_dev->flags |= IFF_MASTER;
> >>> + if (bond_has_master(bond))
> >>> + bond_dev->flags |= IFF_SLAVE;
> >>> bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
> >>> }
> >>> }
> >>> diff --git a/include/net/bonding.h b/include/net/bonding.h
> >>> index ea36ab7f9e72..ed0b49501fad 100644
> >>> --- a/include/net/bonding.h
> >>> +++ b/include/net/bonding.h
> >>> @@ -57,6 +57,11 @@
> >>>
> >>> #define bond_has_slaves(bond) !list_empty(bond_slave_list(bond))
> >>>
> >>> +/* master list primitives */
> >>> +#define bond_master_list(bond) (&(bond)->dev->adj_list.upper)
> >>> +
> >>> +#define bond_has_master(bond) !list_empty(bond_master_list(bond))
> >>> +
> >>
> >> This is not the proper way to check for a master device.
> >>
> >>> /* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
> >>> #define bond_first_slave(bond) \
> >>> (bond_has_slaves(bond) ? \
> >>
> >> The device flags are wrong because of ether_setup() which clears IFF_SLAVE, we should
> >> just check if it was present before and restore it after the ether_setup() call.
> >
> > Thank you so much for your comment! I understand your point, and
> > agree that your approach must resolve the issue.
> >
> > BTW, do you mean there is a case where a device has IFF_SLAVE flag but
> > the upper list is empty? I thought a device with IFF_SLAVE flag has a
> > master device in the upper list (that is why I took the above way.)
> >
>
> Hi Shigeru,
> No, that's not what I meant. It's the opposite actually, you may have an upper list
> but you don't have a "master" device or slave flag set. Yes, you can say that if
> a device has IFF_SLAVE set, then it must have a master upper device but that's not
> what you're checking for, you've reversed that logic to check for an upper device instead
> and assume there's a IFF_SLAVE flag set (which may not be true).
> For an upper device to be considered a "master" device, it must have the master bool set to
> true in its netdev_adjacent structure. We already have helpers to check for master devices
> and to retrieve them, e.g. check netdev_master_upper_dev_get* in net/core/dev.c
>
> The most robust way to fix it is to check if the flag was there prior to the ether_setup() call
> and restore it after, also to leave a nice comment about all of this. :)
Thanks for kindly explanation. I've now understand why my fix is not
sufficient to check a master device. And, yes, the most robust and
simple way to fix the issue is to check the flag before it is cleared.
Thanks you~
Shigeru
>
> > Thanks,
> > Shigeru
> >
>
> Cheers,
> Nik
>
prev parent reply other threads:[~2023-03-14 2:18 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-12 15:21 [PATCH net] bonding: Fix warning in default_device_exit_batch() Shigeru Yoshida
2023-03-12 20:58 ` Nikolay Aleksandrov
2023-03-13 9:35 ` Shigeru Yoshida
2023-03-13 10:52 ` Nikolay Aleksandrov
2023-03-14 2:17 ` Shigeru Yoshida [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZA/ZOTGGADHe8TL7@kernel-devel \
--to=syoshida@redhat.com \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=j.vosburgh@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nikolay@cumulusnetworks.com \
--cc=pabeni@redhat.com \
--cc=razor@blackwall.org \
--cc=syzbot+9dfc3f3348729cc82277@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.