From: Jason Gunthorpe <jgg@nvidia.com>
To: Edwin Peer <edwin.peer@broadcom.com>
Cc: Parav Pandit <parav@nvidia.com>,
    Saeed Mahameed <saeed@kernel.org>,
    "David S. Miller" <davem@davemloft.net>,
    Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
    "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
    Alexander Duyck <alexander.duyck@gmail.com>,
    Sridhar Samudrala <sridhar.samudrala@intel.com>,
    David Ahern <dsahern@kernel.org>,
    Kiran Patil <kiran.patil@intel.com>,
    Jacob Keller <jacob.e.keller@intel.com>,
    "Ertman, David M" <david.m.ertman@intel.com>,
    Dan Williams <dan.j.williams@intel.com>,
    Saeed Mahameed <saeedm@nvidia.com>
Subject: Re: [pull request][net-next V10 00/14] Add mlx5 subfunction support
Date: Mon, 25 Jan 2021 19:13:55 -0400
Message-ID: <20210125231355.GC4147@nvidia.com>
In-Reply-To: <CAKOOJTwWUCe+6qkderKY7ojfHWDxkMQyQTR6uYRFNiZJ8zzYbw@mail.gmail.com>

On Mon, Jan 25, 2021 at 01:23:04PM -0800, Edwin Peer wrote:
> On Mon, Jan 25, 2021 at 12:41 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> > > That's an implementation decision. Nothing mandates that the state has
> > > to physically exist in the same structure, only that reads and writes
> > > are appropriately responded to.
> >
> > Yes, PCI does mandate this, you can't store the data on the other side
> > of the PCI link, and if you can't cross the PCI link that only leaves
> > on die/package memory resources.
>
> Going off device was not what I was suggesting at all. I meant the
> data doesn't necessarily need to be stored in the same physical
> layout.

It doesn't change anything: every writable bit must still be stored in
on-die SRAM. You can compute the minimum by summing all writable and
read-reporting bits in the standard SRIOV config space.

Every bit used for SRIOV is a bit that couldn't be used to improve
device performance.

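As a rough sketch of that sum (not taken from any spec or driver): the
field names and bit widths below are invented placeholders, and treating
"read-reporting" as read-only bits that report per-function state is my
own reading. Real numbers would have to come from the SR-IOV config
space definition itself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfigField:
    """One config-space field that needs backing storage (hypothetical model)."""
    name: str
    writable_bits: int        # bits software can write and later read back
    read_reporting_bits: int  # assumed: read-only bits reporting per-function state


def min_state_bits(fields: Iterable[ConfigField], num_vfs: int) -> int:
    """Lower bound on storage: stateful bits per VF times the number of VFs."""
    per_vf = sum(f.writable_bits + f.read_reporting_bits for f in fields)
    return per_vf * num_vfs


def bits_to_bytes(bits: int) -> int:
    """Round a bit count up to whole bytes."""
    return (bits + 7) // 8


# Placeholder fields: the names and widths are invented for illustration only.
EXAMPLE_FIELDS = [
    ConfigField("example_control", writable_bits=3, read_reporting_bits=0),
    ConfigField("example_status", writable_bits=1, read_reporting_bits=4),
]

if __name__ == "__main__":
    bits = min_state_bits(EXAMPLE_FIELDS, num_vfs=256)
    print(f"{bits} bits, about {bits_to_bytes(bits)} bytes of on-die state")
```

The point is only that the floor scales linearly with the VF count, no
matter how the bits are physically laid out.
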
> > > Right, but presumably it still needs to be at least a page. And,
> > > nothing says your device's VF BAR protocol can't be equally simple.
> >
> > Having VFs that are not self-contained would require significant
> > changing of current infrastructure, if we are going to change things
> > then let's fix everything instead of some half measure.
>
> I don't understand what you mean by self-contained.

Self-contained means you can pass the VF to a VM with vfio and run a
driver on it. A VF that only has a write-only doorbell page probably
cannot be self-contained.

> In practice, there will be some kind of configuration channel too,
> but this doesn't necessarily need a lot of room either

I don't know of any device that can run without configuration, even in
a VF case.

So this all costs SRAM too.

> > The actual complexity inside the kernel is small and the user
> > experience to manage them through devlink is dramatically better than
> > SRIOV. I think it is a win even if there isn't any HW savings.
>
> I'm not sure I agree with respect to user experience. Users are
> familiar with SR-IOV.

Sort of. SRIOV is a very bad fit for these sophisticated devices, and
no, users are not familiar with the weird, intricate details of SR-IOV
in the context of the very sophisticated reconfigurable HW we are
seeing now.

Look at the other series about MSI-X reconfiguration for some colour
on where SRIOV runs into limits due to its specific design.

> Now you impose a complementary model for accomplishing the same goal
> (without solving all the problems, as per the previous discussion,
> so we'll need to reinvent it again later).
I'm not sure what you are referring to.
> It's not easier for vendors either. Now we need to get users onto new
> drivers to exploit it, with all the distribution lags that entails
> (where existing drivers would work for SR-IOV).

Compatibility with existing drivers in a VM is a vendor
choice. Drivers can do a lot in a scalable way in hypervisor SW to
present whatever programming interface makes sense to the VM. Intel is
showing this approach in their IDXD SIOV ADI driver.

> Some vendors will support it, some won't, further adding to user
> confusion.

Such is the nature of all things: some vendors supported SRIOV and
others didn't.

Jason