From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: davem@davemloft.net, oss-drivers@netronome.com,
	netdev@vger.kernel.org, parav@mellanox.com, jgg@mellanox.com
Subject: Re: [PATCH net-next 4/8] devlink: allow subports on devlink PCI ports
Date: Fri, 1 Mar 2019 08:25:57 +0100
Message-ID: <20190301072557.GG2314@nanopsycho>
In-Reply-To: <20190228082404.5b6d1061@cakuba.netronome.com>

Thu, Feb 28, 2019 at 05:24:04PM CET, jakub.kicinski@netronome.com wrote:
>On Thu, 28 Feb 2019 09:56:24 +0100, Jiri Pirko wrote:
>> Wed, Feb 27, 2019 at 07:30:00PM CET, jakub.kicinski@netronome.com wrote:
>> >On Wed, 27 Feb 2019 13:37:53 +0100, Jiri Pirko wrote:  
>> >> Tue, Feb 26, 2019 at 07:24:32PM CET, jakub.kicinski@netronome.com wrote:  
>> >> >PCI endpoint corresponds to a PCI device, but such device
>> >> >can have one or more logical device ports associated with it.
>> >> >We need a way to distinguish those. Add a PCI subport in the
>> >> >dumps and print the info in phys_port_name appropriately.
>> >> >
>> >> >This is not equivalent to port splitting, there is no split
>> >> >group. It's just a way of representing multiple netdevs on
>> >> >a single PCI function.
>> >> >
>> >> >Note that the quality of being multiport pertains only to
>> >> >the PCI function itself. A PF having multiple netdevs does
>> >> >not mean that its VFs will also have multiple, or that VFs
>> >> >are associated with any particular port of a multiport VF.  
>> >> 
>> >> We've been discussing the problem of subport (we call it "subfunction"
>> >> or "SF") for some time internally. It turns out this is probably a harder
>> >> task to model. Please prove me wrong.
>> >> 
>> >> The nature of VF makes it a logically separate entity. It has a separate
>> >> PCI address, it should therefore have a separate devlink instance.
>> >> You can pass it through to VM, then the same devlink instance should be
>> >> created inside the VM and disappear from the host.  
>> >
>> >Depends what a devlink instance represents :/  On one hand you may want
>> >to create an instance for a VF to allow it to spawn soft ports, on the
>> >other you may want to group multiple functions together.
>> >
>> >IOW if devlink instance is for an ASIC, there should be one per device
>> >per host.  So if we start connecting multiple functions (PFs and/or VFs)
>> >to one host we should probably introduce the notion of devlink aliases
>> >or some such (so that multiple bus addresses can target the same  
>> 
>> Hmm. Like VF address -> PF address alias? That would be confusing to see
>> eswitch ports under VF devlink instance... I probably did not get you
>> right.
>
>No eswitch ports under VF, more in case of multi-PF.  Bus addresses of
>all PFs aliasing to the same devlink instance.

The multi-PF aliasing makes sense to me.
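
As a rough illustration of the aliasing idea (not kernel code; the class and
method names below are hypothetical and the PCI addresses are made up),
several bus addresses can be treated as handles that resolve to one shared
devlink instance:

```python
from dataclasses import dataclass, field


@dataclass
class DevlinkInstance:
    """Hypothetical stand-in for one devlink instance (one ASIC)."""
    name: str
    ports: list = field(default_factory=list)


class DevlinkRegistry:
    """Maps bus addresses (handles) to devlink instances."""

    def __init__(self):
        self._by_handle = {}

    def register(self, handle, instance):
        if handle in self._by_handle:
            raise ValueError(f"{handle} is already registered")
        self._by_handle[handle] = instance

    def add_alias(self, alias, existing_handle):
        # The alias resolves to the very same instance object.
        self.register(alias, self._by_handle[existing_handle])

    def lookup(self, handle):
        return self._by_handle[handle]


registry = DevlinkRegistry()
asic = DevlinkInstance("asic0")
registry.register("pci/0000:05:00.0", asic)                  # first PF
registry.add_alias("pci/0000:05:00.1", "pci/0000:05:00.0")   # second PF
```

Looking up either address returns the same object, so ports added through
one PF are visible through the other.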


>
>> >devlink instance).  Those less pipelined NICs can forward between
>> >ports, but still want a function per port (otherwise user space
>> >sometimes gets confused).  If we have multiple functions which are on
>> >the same "switchid" they should have a single devlink instance if you
>> >ask me.  That instance will have all the ports of the device.  
>> 
>> Okay, that makes sense. But the question is, can the same devlink
>> instance contain ports that do not have "Switchid"?
>
>No strong preference if switchid is different.  To me devlink is an ASIC
>instance, if the multiport card is constructed by copy-pasting the same
>IP twice onto a die, and the ports really are completely separate, there
>is no reason to require single devlink instance.

Okay.


>
>> I think it would be beneficial to have the switchid shown for devlink
>> ports too. Then it is clear that the devlink ports with the same
>> switchid belong to the same switch, and other ports under the same
>> devlink instance (like PF itself) are separate, but still under the same
>> ASIC.
>
>Sure, you mean in terms of UI - user space can do a link dump or get
>that from sysfs, right?

I'm thinking about moving it to devlink. I'll work on it more today.
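
A minimal sketch of what showing the switchid per devlink port would let a
reader work out, assuming each port simply carries an optional switch ID.
This is plain Python for illustration only; the port names and the ID value
are invented, and nothing here reflects an actual devlink attribute layout:

```python
from collections import defaultdict


def group_ports_by_switchid(ports):
    """Group (port_name, switch_id) pairs; switch_id None means 'no switch'."""
    groups = defaultdict(list)
    for port_name, switch_id in ports:
        groups[switch_id].append(port_name)
    return dict(groups)


# Hypothetical ports of a single devlink instance (one ASIC).
ports = [
    ("p0", "00154d130d2f"),
    ("pf0vf0", "00154d130d2f"),
    ("pf0vf1", "00154d130d2f"),
    ("pf0", None),  # the PF itself: same ASIC, but not an eswitch port
]

for switch_id, names in group_ports_by_switchid(ports).items():
    print(switch_id or "(no switchid)", names)
```

Ports sharing a switchid land in one group (the same switch), while ports
without one stay under the same instance but in a group of their own.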


>
>> >You say disappear from the host - what do you mean?  Are you referring
>> >to the VF port disappearing?  But on the switch the port is still  
>> 
>> No, VF itself. eswitch port will still be there on the host.
>> 
>> 
>> >there, and you should show the subports on the PF side IMHO.  Devlink
>> >ports should allow users to understand the topology of the switch.  
>> 
>> What do you mean by "topology"?
>
>Mostly which ports are part of the switch and what's their "flavour".
>Also (less importantly) which host netdevs are "peers" of eswitch ports.

Makes sense.


>
>> >Is spawning VMDq sub-instances the only thing we can think of that VMs
>> >may want to do?  Are there any other uses?
>> >  
>> >> SF (or subport) feels similar to that. Basically it is exactly the same
>> >> thing as VF, only it resides under the PF PCI function.
>> >> 
>> >> That is why I think, for sake of consistency, it should have a separate
>> >> devlink entity as well. The problem is correct sysfs modelling and
>> >> devlink handle derived from that. Parav is working on a simple soft
>> >> bus for this purpose called "subbus". There is an RFC floating around on
>> >> Mellanox internal mailing list, looks like it is time to send it
>> >> upstream.
>> >> 
>> >> Then each PF driver which has SFs would register subbus devices
>> >> according to SFs/subports and they would be properly handled by bus
>> >> probe, devlink and devlink port and netdev instances created.
>> >> 
>> >> Ccing Parav and Jason.  
>> >
>> >You guys come from the RDMA side of the world, with which I'm less
>> >familiar, and the soft bus + spawning devices seems to be a popular
>> >design there.  Could you describe the advantages of that model for 
>> >the sake of the netdev-only folks? :)  
>> 
>> I'll try to draw some ascii art :)
>
>Yess :)
>
>> >Another term that gets thrown into the mix here is mediated devices,
>> >right?  If you wanna pass the sub-spawn-soft-port to a VM.  Or run 
>> >DPDK on some queues.
>> >
>> >To state the obvious AF_XDP and macvlan offload were our previous
>> >answers to some of those use cases.  What is the forwarding model
>> >for those subports?  Are we going to allow flower rules from VMs?
>> >Is it going to be dst MAC only?  Or is the hypervisor going to forward
>> >as it sees appropriate (OvS + "repr"/port netdev)?  

Thread overview: 44+ messages
2019-02-26 18:24 [PATCH net-next 0/8] devlink: add PF and VF port flavours Jakub Kicinski
2019-02-26 18:24 ` [PATCH net-next 1/8] nfp: split devlink port init from registration Jakub Kicinski
2019-02-26 18:24 ` [PATCH net-next 2/8] devlink: add PF and VF port flavours Jakub Kicinski
2019-02-27 12:16   ` Jiri Pirko
2019-03-04  4:59     ` Parav Pandit
2019-03-04  7:30       ` Jiri Pirko
2019-03-20 17:29         ` Abodunrin, Akeem G
2019-03-21 12:26           ` Jiri Pirko
2019-02-27 12:23   ` Jiri Pirko
2019-02-27 12:41     ` Jiri Pirko
2019-02-27 17:23       ` Jakub Kicinski
2019-02-27 20:17         ` Jiri Pirko
2019-02-27 22:42           ` Jakub Kicinski
2019-02-28  8:44             ` Jiri Pirko
2019-02-28 16:08               ` Jakub Kicinski
2019-02-28 16:24             ` David Ahern
2019-02-26 18:24 ` [PATCH net-next 3/8] nfp: register devlink ports of all reprs Jakub Kicinski
2019-02-26 18:24 ` [PATCH net-next 4/8] devlink: allow subports on devlink PCI ports Jakub Kicinski
2019-02-27 12:37   ` Jiri Pirko
2019-02-27 18:30     ` Jakub Kicinski
2019-02-28  8:56       ` Jiri Pirko
2019-02-28 13:32         ` Jiri Pirko
2019-02-28 16:24         ` Jakub Kicinski
2019-03-01  7:25           ` Jiri Pirko [this message]
2019-03-01 16:04             ` Jakub Kicinski
2019-03-01 16:20               ` Jiri Pirko
2019-03-04 16:15       ` Jason Gunthorpe
2019-03-05  1:03         ` Jakub Kicinski
2019-03-05  1:30           ` Jason Gunthorpe
2019-03-05  2:11             ` Jakub Kicinski
2019-03-05 22:11               ` Jason Gunthorpe
2019-03-04  5:00     ` Parav Pandit
2019-02-26 18:24 ` [PATCH net-next 5/8] nfp: switch to devlink_port_get_phys_port_name() Jakub Kicinski
2019-02-26 18:24 ` [PATCH net-next 6/8] devlink: introduce port's peer netdevs Jakub Kicinski
2019-02-27 13:08   ` Jiri Pirko
2019-02-27 18:47     ` Jakub Kicinski
2019-02-28  9:00       ` Jiri Pirko
2019-02-28 16:36         ` Jakub Kicinski
2019-03-01  7:37           ` Jiri Pirko
2019-03-01 16:05             ` Jakub Kicinski
2019-03-04  5:07     ` Parav Pandit
2019-02-26 18:24 ` [PATCH net-next 7/8] nfp: expose PF " Jakub Kicinski
2019-02-26 18:24 ` [PATCH net-next 8/8] devlink: fix kdoc Jakub Kicinski
2019-02-27 13:13   ` Jiri Pirko
