From: Jakub Kicinski <jakub.kicinski@netronome.com>
To: Yuval Avnery <yuvalav@mellanox.com>
Cc: Andy Gospodarek <andrew.gospodarek@broadcom.com>,
Jiri Pirko <jiri@mellanox.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Saeed Mahameed <saeedm@mellanox.com>,
"leon@kernel.org" <leon@kernel.org>,
"davem@davemloft.net" <davem@davemloft.net>,
"shuah@kernel.org" <shuah@kernel.org>,
Daniel Jurgens <danielj@mellanox.com>,
Parav Pandit <parav@mellanox.com>,
"michael.chan@broadcom.com" <michael.chan@broadcom.com>
Subject: Re: [PATCH net-next v2 00/10] devlink subdev
Date: Tue, 12 Nov 2019 14:19:54 -0800 [thread overview]
Message-ID: <20191112141954.371e8cd2@cakuba> (raw)
In-Reply-To: <AM6PR05MB5142D1ADB06CB0D5FF646F46C5770@AM6PR05MB5142.eurprd05.prod.outlook.com>
On Tue, 12 Nov 2019 18:35:26 +0000, Yuval Avnery wrote:
> > It feels a bit 'unconstrained' to me as well. As Jakub said you added some
> > documentation, but the direction of this long-term is not clear.
> > What seems to happen too often is that we skip creating better infra for
> > existing devices and create it only for the newest shiniest object.
> > These changes are what garners the most attention from those who grant
> > permission to push things upstream (*cough* managers *cough*), but it's
> > not the right choice in this case. I'm not sure if that is part of what bothers
> > Jakub, but it is one thing that bothers me and why this feels incomplete.
> >
> > The thing that has been bouncing around in my head about this (and until I
> > was in front of a good text-based MUA I didn't dare respond) is that we seem
> > to have an overlap in functionality between what you are proposing and
> > existing virtual device configuration, but you are completely ignoring
> > improving upon the existing creation methods.
> > Whether dealing with a SmartNIC (which I used to describe a NIC with
> > general purpose processors that could be running Linux and I think you do,
> > too) or a regular NIC it seems like we should use devlink to create and
> > configure a variety of devices these could be:
> >
> > 1. Number of PFs (there are cases where more than 1 PF per uplink port
> > is required for command/control of the SmartNIC or where a single
> > PCI b/d/f may have many uplink ports that need to be addressed
> > separately)
>
> Yes, uplink ports can be addressed using "devlink port".
>
> Multiple pfs are already supported.
>
> $ devlink subdev show
> pci/0000:03:00.0/1: flavour pcipf pf 0
> pci/0000:03:00.0/1: flavour pcipf pf 1
> pci/0000:03:00.0/1: flavour pcivf pf 1 vf 0
I don't think that covers what Andy was taking about.
> > 2. Device-specific SR-IOV VFs visible to server 3. mdev devices that _may_
> > have representers on the embedded cores of
> > the NIC
>
> Yes, Is also planned for current interface. (will need to add mdev flavor in the future)
>
> > 4. Hardware VirtIO-net devices supported 5. Other non-network devices
> > (storage given as the first example) 6. ...
>
> All is up to driver - what to expose and what flavor to grant the subdev.
>
> > We cannot get rid of the methods for creating hardware VFs via sysfs, but
> > now that we are seeing lots of different flavors of devices that might be
> > created we should start by making sure at a minimum we can create existing
> > hardware devices (VFs) with this devlink interface and move from there. Is
> > there a plan to use this interface to create subdevs that are VFs or just new
> > subdev flavors? I could start to get behind an interface like this if the patches
> > showed that devlink would be the one place where device creation and
> > control could be done for all types of subdevs, but that isn't clear that it is
> > your direction based on the patches.
>
> I don’t see how the SmartNic can force the host PF to create new VFs.
> Isn’t that the host PF responsibility? Doesn't make sense to me.
> Do we want to have an interface that works only for NICs and not from
> the SmartNic embedded system?
>
> What we do here is expose all the PFS and potential VFs/mdevs that the host can enable.
> Unlike ip link which exposes only enabled VFs, here (in mlx5
> implementation) we expose all the VFs that the host PF _can_ enable.
> This allows, for example, to set the MAC of a VF before it is enabled.
So subdev will never be used to create interfaces? Just expose all
_possible_? Ugh. Wouldn't that be something to document?
How things are configured matters. This patch set throws some basics
together to get you the hw_addr from the SmartNIC side, and then you
start talking big plans like NVMe and PCIe control which you don't seem
to have thought through.
How does the PF in your diagram magically have different types of VFs?
PCIe SR-IOV cap has VF device vendor:product, you won't have NVMe VFs
hanging off the main PF like that in a normal world :/
Jiri pitched the subdev object as the "other side of the wire" but you
are back to arguing that you think it's actually the same as the port,
just more general. How does the hw_addr belongs to the ASIC side then?
And for PCIe you'd use this new interface for MSI-X or other resource
allocation. Say VFs. Who controls that in a SmartNIC scenario? Is the
host not allowed to move them around? This starts to touch multi-host
use case which Linux networking never begun to address.
You're creating a new object type with no clear semantics just to throw
in one feature - namely the hw_addr. If this is the quality of design
we can expect - just add the hw addr to devlink ports and let's move on.
> > So just to make sure this is clear, what I'm proposing that devlink
> > is used to create and configure all types of devices that it may be
> > possible to create,
>
> More flavors can be defiantly added in the future. About creation,
> see my comment above. Actually flavor is the only required attribute
> for registration. The rest is optional.
> > configure, or address from a PF and the starting place should be
> > standard, hardware VFs. If that can be done right and we can
> > address some of the issues with the current implementation (more
> > than one hw addr is a _great_ example) then adding new flavours for
> > the server devices and and adding new flavors for SmartNIC-based
> > devices should be easy and feel natural.
>
> So what you are saying that allowing multiple addresses per subdev
> will make a strong case for this patch set?
next prev parent reply other threads:[~2019-11-12 22:20 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-08 16:18 [PATCH net-next v2 00/10] devlink subdev Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 01/10] devlink: Introduce subdev Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 02/10] devlink: Add PCI attributes support for subdev Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 03/10] devlink: Add port with subdev register support Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 04/10] devlink: Support subdev HW address get Yuval Avnery
2019-11-08 18:00 ` Parav Pandit
2019-11-09 17:45 ` Yuval Avnery
2019-11-09 18:13 ` Parav Pandit
2019-11-10 8:12 ` Jiri Pirko
2019-11-08 16:18 ` [PATCH net-next v2 05/10] devlink: Support subdev HW address set Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 06/10] Documentation: Add devlink-subdev documentation Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 07/10] netdevsim: Add max_vfs to bus_dev Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 08/10] netdevsim: Add devlink subdev creation Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 09/10] netdevsim: Add devlink subdev sefltest for netdevsim Yuval Avnery
2019-11-08 16:18 ` [PATCH net-next v2 10/10] net/mlx5e: Add support for devlink subdev and subdev hw_addr set/show Yuval Avnery
2019-11-11 18:00 ` [PATCH net-next v2 00/10] devlink subdev Jakub Kicinski
2019-11-11 18:46 ` Yuval Avnery
2019-11-12 14:55 ` Andy Gospodarek
2019-11-12 18:35 ` Yuval Avnery
2019-11-12 22:19 ` Jakub Kicinski [this message]
2019-11-12 23:32 ` Yuval Avnery
2019-11-13 6:37 ` Parav Pandit
2019-11-14 16:47 ` Yuval Avnery
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191112141954.371e8cd2@cakuba \
--to=jakub.kicinski@netronome.com \
--cc=andrew.gospodarek@broadcom.com \
--cc=danielj@mellanox.com \
--cc=davem@davemloft.net \
--cc=jiri@mellanox.com \
--cc=leon@kernel.org \
--cc=michael.chan@broadcom.com \
--cc=netdev@vger.kernel.org \
--cc=parav@mellanox.com \
--cc=saeedm@mellanox.com \
--cc=shuah@kernel.org \
--cc=yuvalav@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).