From: Jason Gunthorpe <jgg@nvidia.com>
To: Edwin Peer <edwin.peer@broadcom.com>
Cc: Parav Pandit <parav@nvidia.com>,
Saeed Mahameed <saeed@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
Sridhar Samudrala <sridhar.samudrala@intel.com>,
David Ahern <dsahern@kernel.org>,
Kiran Patil <kiran.patil@intel.com>,
Jacob Keller <jacob.e.keller@intel.com>,
"Ertman, David M" <david.m.ertman@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Saeed Mahameed <saeedm@nvidia.com>
Subject: Re: [pull request][net-next V10 00/14] Add mlx5 subfunction support
Date: Mon, 25 Jan 2021 19:13:55 -0400
Message-ID: <20210125231355.GC4147@nvidia.com>
In-Reply-To: <CAKOOJTwWUCe+6qkderKY7ojfHWDxkMQyQTR6uYRFNiZJ8zzYbw@mail.gmail.com>
On Mon, Jan 25, 2021 at 01:23:04PM -0800, Edwin Peer wrote:
> On Mon, Jan 25, 2021 at 12:41 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> > > That's an implementation decision. Nothing mandates that the state has
> > > to physically exist in the same structure, only that reads and writes
> > > are appropriately responded to.
> >
> > Yes, PCI does mandate this, you can't store the data on the other side
> > of the PCI link, and if you can't cross the PCI link that only leaves
> > on die/package memory resources.
>
> Going off device was not what I was suggesting at all. I meant the
> data doesn't necessarily need to be stored in the same physical
> layout.

It doesn't change anything: every writable bit must still be stored in
on-die SRAM. You can compute the minimum by summing all writable and
read-reporting bits in the standard SRIOV config space.

Every bit used for SRIOV is a bit that couldn't be used to improve
device performance.

> > > Right, but presumably it still needs to be at least a page. And,
> > > nothing says your device's VF BAR protocol can't be equally simple.
> >
> > Having VFs that are not self-contained would require significant
> > changing of current infrastructure, if we are going to change things
> > then let's fix everything instead of some half measure.
>
> I don't understand what you mean by self-contained.

Self-contained means you can pass the VF to a VM with vfio and run a
driver on it. A VF that only has a write-only doorbell page probably
cannot be self-contained.

> In practice, there will be some kind of configuration channel too,
> but this doesn't necessarily need a lot of room either

I don't know of any device that can run without configuration, even in
a VF case.

So this all costs SRAM too.

> > The actual complexity inside the kernel is small and the user
> > experience to manage them through devlink is dramatically better than
> > SRIOV. I think it is a win even if there isn't any HW savings.
>
> I'm not sure I agree with respect to user experience. Users are
> familiar with SR-IOV.

Sort of. SRIOV is a very bad fit for these sophisticated devices, and
no, users are not familiar with the weird intricate details of SR-IOV
in the context of very sophisticated reconfigurable HW like we are
seeing now.

Look at the other series about MSI-X reconfiguration for some colour
on where SRIOV runs into limits due to its specific design.

> Now you impose a complementary model for accomplishing the same goal
> (without solving all the problems, as per the previous discussion,
> so we'll need to reinvent it again later).

I'm not sure what you are referring to.

> It's not easier for vendors either. Now we need to get users onto new
> drivers to exploit it, with all the distribution lags that entails
> (where existing drivers would work for SR-IOV).

Compatibility with existing drivers in a VM is a vendor
choice. Drivers can do a lot in a scalable way in hypervisor SW to
present whatever programming interface makes sense to the VM. Intel is
showing this approach in their IDXD SIOV ADI driver.

> Some vendors will support it, some won't, further adding to user
> confusion.

Such is the nature of all things: some vendors supported SRIOV and
others didn't.

Jason