From: Leon Romanovsky <leon@kernel.org>
To: Edwin Peer <edwin.peer@broadcom.com>
Cc: Ido Schimmel <idosch@idosch.org>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Ido Schimmel <idosch@mellanox.com>,
Jiri Pirko <jiri@mellanox.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
netdev <netdev@vger.kernel.org>,
syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com,
Michael Chan <michael.chan@broadcom.com>
Subject: Re: [PATCH net-next] netdevsim: Register and unregister devlink traps on probe/remove device
Date: Wed, 27 Oct 2021 09:43:51 +0300
Message-ID: <YXj1J/Z8HYvBWC6Y@unreal>
In-Reply-To: <CAKOOJTzrQYz4FTDU_d_R0RLA4u6pfK9=+=E_uKMr4VCNbmF_kA@mail.gmail.com>
On Tue, Oct 26, 2021 at 01:03:41PM -0700, Edwin Peer wrote:
> On Tue, Oct 26, 2021 at 12:22 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> > At least in mlx5 case, reload_enable() was before register_netdev().
> > It stayed like this after swapping it with devlink_register().
>
> What am I missing here?
>
> err = mlx5_init_one(dev);
> if (err) {
> mlx5_core_err(dev, "mlx5_init_one failed with error code %d\n", err);
> goto err_init_one;
> }
>
> err = mlx5_crdump_enable(dev);
> if (err)
> dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code
> %d\n", err);
>
> pci_save_state(pdev);
> devlink_register(devlink);
>
> Doesn't mlx5_init_one() ultimately result in the netdev being
> presented to user space, even if it is via aux bus?
mlx5_init_one() creates aux devices, and in the Linux kernel the driver
is not always loaded directly. Device creation triggers a udev event,
which is handled by udev (systemd). It reads the various modules.* files
that the kernel provides, and this is how it knows which driver to load.
In our case, the eth driver is part of the mlx5_core module, so at the
device creation phase that module is already loaded and the driver core
will try to autoprobe it.
However, that last step is not always performed; it is controlled by
userspace. Users can disable driver autoprobe and bind manually. This
is pretty standard practice in SR-IOV or VFIO setups.
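For illustration only, a minimal sketch of what "disable autoprobe and
bind manually" amounts to from user space: write 0 to the bus's
drivers_autoprobe attribute, then write the device name to the chosen
driver's bind attribute. The helper names (write_attr, bind_path,
manual_bind) and the example bus and driver names in the comments are
assumptions made for this sketch; they are not taken from mlx5.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a short string to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure. */
int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Build "<bus_dir>/drivers/<driver>/bind" into buf.
 * Returns 0 on success, -1 if buf is too small. */
int bind_path(char *buf, size_t len, const char *bus_dir, const char *driver)
{
	int n = snprintf(buf, len, "%s/drivers/%s/bind", bus_dir, driver);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Disable autoprobe on one bus, then bind one device to one driver by hand.
 * Hypothetical example arguments: bus_dir "/sys/bus/pci",
 * driver "vfio-pci", dev_name "0000:01:00.0". */
int manual_bind(const char *bus_dir, const char *driver, const char *dev_name)
{
	char path[512];
	int n;

	assert(bus_dir && driver && dev_name);

	n = snprintf(path, sizeof(path), "%s/drivers_autoprobe", bus_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (write_attr(path, "0") != 0)
		return -1;
	if (bind_path(path, sizeof(path), bus_dir, driver) != 0)
		return -1;
	return write_attr(path, dev_name);
}
```

With autoprobe off, the device exists but no driver is bound until
something like manual_bind() runs, so nothing can be assumed about when
the netdev will appear relative to device creation.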
>
> > No, it is not requirement, but my suggestion. You need to be aware that
> > after call to devlink_register(), the device will be fully open for devlink
> > netlink access. So it is strongly advised to put devlink_register to be the
> > last command in PCI initialization sequence.
>
> Right, that's the problem. Once we register the netdev, we're in a
> race with user space, which may expect to be able to call devlink
> before we get to devlink_register().
This is why devlink has a monitor mode, where you can see devlink device
addition and removal. It is the job of user space to check that the
device is ready.
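As a rough sketch of such a check, the code below scans monitor output
for the line that announces a given device. It assumes the
"[dev,new] <bus>/<device>" line format printed by the iproute2
"devlink monitor" command; the function names and the example handle
are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `line` announces that devlink device `handle`
 * (e.g. "pci/0000:01:00.0") has been registered.
 * Assumed line format: "[dev,new] <bus>/<device>". */
bool line_announces_dev(const char *line, const char *handle)
{
	static const char prefix[] = "[dev,new] ";
	size_t plen = sizeof(prefix) - 1;
	size_t hlen;

	assert(line && handle);
	hlen = strlen(handle);

	if (strncmp(line, prefix, plen) != 0)
		return false;
	line += plen;
	if (strncmp(line, handle, hlen) != 0)
		return false;
	/* Require the handle to end here, so "pci/0000:01:00.0"
	 * does not match a line for "pci/0000:01:00.01". */
	return line[hlen] == '\0' || line[hlen] == '\n' || line[hlen] == ' ';
}

/* Read monitor output from `in` until the device shows up or input ends. */
bool wait_for_dev(FILE *in, const char *handle)
{
	char line[256];

	while (fgets(line, sizeof(line), in))
		if (line_announces_dev(line, handle))
			return true;
	return false;
}
```

Here `in` could be a stream such as popen("devlink monitor dev", "r").
A caller would presumably also query the current device list once after
starting the monitor, so that a device registered earlier is not missed.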
>
> > You obviously need to fix your code. Upstream version of bnxt driver
> > doesn't have reload_* support, so all this regression blaming it not
> > relevant here.
>
> Right, our timing is unfortunate and that's on us. It's still not
> clear to me how to actually fix the devlink reload code without the
> benefit of something similar to the reload enable API.
>
> > In upstream code, devlink_register() doesn't accept ops like it was
> > before and position of that call does only one thing - opens devlink
> > netlink access. All kernel devlink APIs continue to be accessible even
> > before devlink_register.
>
> This isn't about kernel API. This is precisely about existing user
> space that expects devlink to work immediately after the netdev
> appears.
Can you please share an open source project that makes such an assumption?
>
> > It looks like your failure is in backport code.
>
> Our out-of-tree driver isn't the issue here. I'm talking about the
> proposed upstream code. The issue is what to do in order to get
> something workable upstream for devlink reload. We can't move
> devlink_register() later, that will cause a regression. What do you
> suggest instead?
Fix your test to respect devlink notifications instead of ignoring them.
Thanks
>
> Regards,
> Edwin Peer