From: Jason Gunthorpe <jgg@nvidia.com>
To: Jakub Kicinski <kuba@kernel.org>, Jiri Pirko <jiri@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>,
Ido Schimmel <idosch@idosch.org>, Jiri Pirko <jiri@resnulli.us>,
"David S . Miller" <davem@davemloft.net>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
edwin.peer@broadcom.com
Subject: Re: [PATCH net-next] devlink: Require devlink lock during device reload
Date: Tue, 9 Nov 2021 11:33:35 -0400 [thread overview]
Message-ID: <20211109153335.GH1740502@nvidia.com> (raw)
In-Reply-To: <20211109070702.17364ec7@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
On Tue, Nov 09, 2021 at 07:07:02AM -0800, Jakub Kicinski wrote:
> On Tue, 9 Nov 2021 10:43:58 -0400 Jason Gunthorpe wrote:
> > This becomes all entangled in the aux device stuff we did before.
>
> So entangled in fact that neither of you is willing to elucidate
> the exact need ;)
I haven't been paying attention to this thread, Leon just asked me to
elaborate on why roce is in these traces.
What I know is mlx5 testing has shown an alarming number of crashers
and bugs connected to devlink and Leon has been grinding them
away. mlx5 is quite heavily invested in devlink and mlx5 really needs
it to work robustly.
> The main use case for reload for netns is placing a VF in a namespace,
> for a container to use. Is that right? I've not seen use cases
> requiring the PF to be moved, are there any?
Sure, it can be useful. I wouldn't call it essential, but it is there.
> > I once sketched out fixing this by removing the need to hold the
> > per_net_rwsem just for list iteration, which in turn avoids holding it
> > over the devlink reload paths. It seemed like a reasonable step toward
> > finer grained locking.
>
> Seems to me the locking is just a symptom.
My fear is this reload during net ns destruction is devlink uAPI now
and, yes it may be only a symptom, but the root cause may be unfixable
uAPI constraints. Jiri, what do you think?
Speaking generally, I greatly prefer devlink move toward a design where
the aux bus operations that must be connected with devlink reload are
not invoked under any global locks at all. Not per_net_rwsem, rtnl, or
devlink BKLs.
We know holding global locks in open-ended contexts, like driver core
device_register/unregister, is an dangerous pattern.
Concretely, now that we are pushing virtualization use cases into
using devlink as a control plane mlx5 has pod/VM launch issuing
devlink steps. Some of this stuff is necessarily slow, like attaching
a roce aux driver takes a while. Holding a BKL while doing that means
all VM launches are serialized.
IMHO these needs demand a fine grained locking approach, not a BKL
design.
Jason
next prev parent reply other threads:[~2021-11-09 15:33 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-31 17:35 [PATCH net-next] devlink: Require devlink lock during device reload Leon Romanovsky
2021-11-01 7:12 ` Ido Schimmel
2021-11-01 15:03 ` Jiri Pirko
2021-11-01 20:52 ` Ido Schimmel
2021-11-01 23:11 ` Jakub Kicinski
2021-11-07 17:16 ` Ido Schimmel
2021-11-07 17:54 ` Leon Romanovsky
2021-11-08 16:09 ` Jakub Kicinski
2021-11-08 17:32 ` Leon Romanovsky
2021-11-08 18:16 ` Jakub Kicinski
2021-11-08 18:24 ` Leon Romanovsky
2021-11-08 18:46 ` Jakub Kicinski
2021-11-08 19:58 ` Leon Romanovsky
2021-11-08 23:31 ` Jakub Kicinski
2021-11-09 14:12 ` Leon Romanovsky
2021-11-09 14:17 ` Jakub Kicinski
2021-11-09 14:30 ` Leon Romanovsky
2021-11-09 14:49 ` Jakub Kicinski
2021-11-09 16:29 ` Jiri Pirko
2021-11-09 14:43 ` Jason Gunthorpe
2021-11-09 15:07 ` Jakub Kicinski
2021-11-09 15:33 ` Jason Gunthorpe [this message]
2021-11-09 16:20 ` Jakub Kicinski
2021-11-09 18:24 ` Jason Gunthorpe
2021-11-11 12:05 ` Jiri Pirko
2021-11-11 12:17 ` Leon Romanovsky
2021-11-12 7:38 ` Jiri Pirko
2021-11-14 6:19 ` Leon Romanovsky
2021-11-15 11:20 ` Jiri Pirko
2021-11-15 12:53 ` Jason Gunthorpe
2021-11-15 14:42 ` Jiri Pirko
2021-11-15 15:09 ` Jason Gunthorpe
2021-11-15 15:22 ` Jakub Kicinski
2021-11-16 7:00 ` Jiri Pirko
2021-11-16 13:45 ` Jakub Kicinski
2021-11-16 6:57 ` Jiri Pirko
2021-11-16 12:44 ` Jason Gunthorpe
2021-11-17 14:15 ` Leon Romanovsky
2021-11-10 7:52 ` Leon Romanovsky
2021-11-09 16:15 ` Jiri Pirko
2021-11-09 16:26 ` Jakub Kicinski
2021-11-09 16:30 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211109153335.GH1740502@nvidia.com \
--to=jgg@nvidia.com \
--cc=davem@davemloft.net \
--cc=edwin.peer@broadcom.com \
--cc=idosch@idosch.org \
--cc=jiri@nvidia.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).