From: Jiri Pirko <jiri@resnulli.us>
To: Parav Pandit <parav@mellanox.com>
Cc: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
Jakub Kicinski <jakub.kicinski@netronome.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"oss-drivers@netronome.com" <oss-drivers@netronome.com>
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Date: Mon, 18 Mar 2019 13:21:05 +0100
Message-ID: <20190318122105.GH2270@nanopsycho>
In-Reply-To: <VI1PR0501MB227109474D520BDCE212A5B3D1440@VI1PR0501MB2271.eurprd05.prod.outlook.com>
Fri, Mar 15, 2019 at 10:59:33PM CET, parav@mellanox.com wrote:
>
>
>> -----Original Message-----
>> From: Jiri Pirko <jiri@resnulli.us>
>> Sent: Friday, March 15, 2019 3:08 PM
>> To: Parav Pandit <parav@mellanox.com>
>> Cc: Samudrala, Sridhar <sridhar.samudrala@intel.com>; Jakub Kicinski
>> <jakub.kicinski@netronome.com>; davem@davemloft.net;
>> netdev@vger.kernel.org; oss-drivers@netronome.com
>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
>> ports
>>
>> Fri, Mar 15, 2019 at 04:32:24PM CET, parav@mellanox.com wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> >> Sent: Friday, March 15, 2019 12:58 AM
>> >> To: Parav Pandit <parav@mellanox.com>; Jakub Kicinski
>> >> <jakub.kicinski@netronome.com>
>> >> Cc: Jiri Pirko <jiri@resnulli.us>; davem@davemloft.net;
>> >> netdev@vger.kernel.org; oss-drivers@netronome.com
>> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >> devlink PCI ports
>> >>
>> >>
>> >> On 3/14/2019 7:40 PM, Parav Pandit wrote:
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> >> >> Sent: Thursday, March 14, 2019 9:16 PM
>> >> >> To: Parav Pandit <parav@mellanox.com>; Jakub Kicinski
>> >> >> <jakub.kicinski@netronome.com>
>> >> >> Cc: Jiri Pirko <jiri@resnulli.us>; davem@davemloft.net;
>> >> >> netdev@vger.kernel.org; oss-drivers@netronome.com
>> >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >> >> devlink PCI ports
>> >> >>
>> >> >>
>> >> >>
>> >> >> On 3/14/2019 6:28 PM, Parav Pandit wrote:
>> >> >>>
>> >> >>>
>> >> >>>> -----Original Message-----
>> >> >>>> From: Jakub Kicinski <jakub.kicinski@netronome.com>
>> >> >>>> Sent: Thursday, March 14, 2019 6:39 PM
>> >> >>>> To: Parav Pandit <parav@mellanox.com>
>> >> >>>> Cc: Jiri Pirko <jiri@resnulli.us>; davem@davemloft.net;
>> >> >>>> netdev@vger.kernel.org; oss-drivers@netronome.com
>> >> >>>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >> >>>> devlink PCI ports
>> >> >>>>
>> >> >>>> On Thu, 14 Mar 2019 22:35:36 +0000, Parav Pandit wrote:
>> >> >>>>>>> Then instances of flavour pci_vf are going to appear in the
>> >> >>>>>>> same devlink instance. Those are the switch ports:
>> >> >>>>>>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
>> >> >>>>>>> flavour pci_vf pf 0 vf 0
>> >> >>>>>>> switch_id 00154d130d2f peer
>> >> >>>>>>> pci/0000:05:10.1/0
>> >> >>>>>>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
>> >> >>>>>>> flavour pci_vf pf 0 vf 0 subport 1
>> >> >>>>>>> switch_id 00154d130d2f peer
>> >> >>>>>>> pci/0000:05:10.1/1
>> >> >>>>>>>
>> >> >>>>>>> With that, peers are going to appear too, and those are the
>> >> >>>>>>> actual VF/VF
>> >> >>>>>>> subport:
>> >> >>>>>>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
>> >> >>>>>>> peer pci/0000:05:00.0/10002
>> >> >>>>>>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
>> >> >>>>>>> peer pci/0000:05:00.0/10003
>> >> >>>>>>>
>> >> >>>>>>> Later you can push this VF along with all subports to VM. So
>> >> >>>>>>> in VM, you are going to see the VF like this:
>> >> >>>>>>> $ devlink dev
>> >> >>>>>>> pci/0000:00:08.0
>> >> >>>>>>> $ devlink port
>> >> >>>>>>> pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
>> >> >>>>>>> pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
>> >> >>>>>>>
>> >> >>>>>>> And back to your question of how are they connected in eswitch.
>> >> >>>>>>> That is totally up to the original user John who did the creation.
>> >> >>>>>>> He is in charge of the eswitch on baremetal, he would
>> >> >>>>>>> configure the forwarding however he likes.
>> >> >>>>>>
>> >> >>>>>> Ack, so I think you're saying VM has to communicate to the
>> >> >>>>>> cloud environment to have this provisioned using some service
>> >> >>>>>> API, not a kernel API. That's what I wanted to confirm.
>> >> >>>>>>
>> >> >>>>>> I don't see any benefit to having the "host ports" under
>> >> >>>>>> devlink, as such I think it's a matter of preference.
>> >> >>>>>
>> >> >>>>> We need 'host ports' to configure parameters of this host port
>> >> >>>>> which is not exposed by the rep-netdev.
>> >> >>>>> Such as mac address.
>> >> >>>>
>> >> >>>> Please look at the quote of what Jiri wrote above - the host
>> >> >>>> port gets passed to the VM, you can't use it as a handle to set the MAC.
>> >> >>>>
>> >> >>>> The way to set the MAC remains:
>> >> >>>>
>> >> >>>> # devlink port set pci/0000:05:00.0/10002 peer mac_addr
>> >> >>>> 00:11:22:33:44:55
>> >> >>>>
>> >> >>> Even though it can be done, I think it is the wrong model to program
>> >> >>> the hostport mac address using the eswitch port.
>> >> >>> All devlink objects are control objects, so what is passed to the VM
>> >> >>> is what is represented by devlink.
>> >> >>> VF in the VM will anyway create its devlink object.
>> >> >>> What is wrong in programming hostport?
>> >> >>> It gives a very clear view to users of topology and objects.
>> >> >>
>> >> >> The VF or any subport MAC address should be configured by the
>> >> >> orchestration layer that is running on the hypervisor and when a
>> >> >> VF is assigned to a VM, the host port is not visible to the hypervisor.
>> >> > What prevents creation of hostport due to which is not visible?
>> >> > Hostport is control port to program host side of parameters.
>> >> > It should be created when user wants to program the parameters.
>> >> >
>> >> > Model is really straight forward.
>> >> > Program host port params using hostport object.
>> >> > Program switchport params using rep-netdev.
>> >>
>> >> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects for each
>> >> port - host facing ports and switch facing ports. This is in addition
>> >> to the netdevs that are created today.
>> >>
>> >I am not proposing any different.
>> >I am proposing only two changes.
>> >1. control hostport params via referring hostport (not via indirect
>> >peer)
>>
>> Not really possible. If you passthrough VF into VM, the hostport goes along
>> with it.
>>
>No.
>I am sorry in showing the enumeration which is the source of confusion.
>
>Below is the right enumeration.
>
>When VF is enumerated initially in the host, where eswitch devlink instance is located.
>Below enumeration is seen.
>
>First two entries shows the link between hostport and switchport.
>$ devlink port show
>pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1
>
>pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
Hostport should not have switch_id.
>
>pci/0000:05:10.1/0 eth netdev flavour hostport
>This entry won't be seen if VF auto probing is disabled, because then the VF is not enumerated.
>
>As a user, I will be programming the mac address of hostport for a VF.
>pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
Hmm, so you are going to have 2 hostports for VF:
1) pci/0000:05:10.1/0
real one, that is going to go to VM - with a separate pci address
and devlink instance.
2) pci/0000:05:00.0/1
dummy one, which is not really a hostport, as there is no netdev
created for it. It only models the other side of the cable, which is
away in the VM.
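
To make the difference between the two concrete, here is a rough sketch in Python. It is not devlink code: the Port class, its fields, the helper names and the netdev names are all made up for illustration, and only the port handles and the switch_id are taken from the listing above. It treats a hostport as "real" only when a netdev is created for it, and it leaves switch_id unset on hostports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for one line of `devlink port show` output."""
    handle: str                       # e.g. "pci/0000:05:00.0/1"
    flavour: str                      # "switchport" or "hostport" (names used in this thread)
    netdev: Optional[str] = None      # None when no netdev is created for the port
    switch_id: Optional[str] = None   # only set on switch ports in this sketch
    peer: Optional[str] = None        # handle of the port on the other end of the "cable"


def devlink_instance(port: Port) -> str:
    """Return the devlink instance part of the handle (everything before the port index)."""
    return port.handle.rsplit("/", 1)[0]


def is_real_hostport(port: Port) -> bool:
    """In this sketch a hostport is 'real' only if a netdev exists for it."""
    return port.flavour == "hostport" and port.netdev is not None


# The netdev names are placeholders; the thread does not give them.
ports = [
    Port("pci/0000:05:00.0/10002", "switchport", netdev="repr0",
         switch_id="00154d130d2f", peer="pci/0000:05:00.0/1"),
    Port("pci/0000:05:00.0/1", "hostport", peer="pci/0000:05:00.0/10002"),
    Port("pci/0000:05:10.1/0", "hostport", netdev="vf0"),
]

for p in ports:
    if p.flavour != "hostport":
        kind = "switch port"
    elif is_real_hostport(p):
        kind = "real hostport"
    else:
        kind = "dummy hostport"
    print(devlink_instance(p), p.handle, kind)
```

In this model the dummy port lives in the eswitch devlink instance, while the real one has its own PCI address and devlink instance and is the one that follows the VF into the VM.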
>
>
>>
>> >2. flavour should not be vf/pf, flavour should be hostport, switchport.
>> >Because switch is flat and agnostic of pf/vf/mdev.
>>
>> Not sure. It's good to have this kind of visibility.
>>
>A port can have a label/attribute indicating that it belongs to VF-1 or an mdev, as long as you agree to have an mdev attribute on the host port
>(and don't ask for it to be abstracted, because mdev is a well-defined kernel object).
Why can't mdev be another flavour?
>
>>
>> >
>> >> Are you suggesting that all the devlink objects should be visible
>> >> only at the hypervisor layer?
>> >>
>> >Of course not.
>> >
>> >Ports and params controlled by the hypervisor should be exposed at the
>> >hypervisor/eswitch, wherever their parent devlink instance exists.
>> >Ports which should be visible inside a VM should be exposed inside a VM.
>> >So for a given VF,
>> >
>> >If eswitch is at hypervisor level,
>> >$ devlink port show
>> >pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id
>> >00154d130d2f peer pci/0000:05:10.1/0
>> >pci/0000:05:10.1/0 eth netdev flavour hostport switch_id 00154d130d2f
>> >peer pci/0000:05:00.0/10002
>> >
>> >where VF is enumerated,
>> >$ devlink port show
>> >pci/0000:05:10.1/0 eth netdev flavour hostport
>>
>> So this is how it looks like in VM, right?
>>
>Yep.
>Once VF is mapped to VM only two entries are seen and hostport can be still controlled.
>
>$ devlink port show
>pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1
>
>pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
>
>This addresses the case for Infiniband where there is no eswitch, but hostports exists and should be managed.
>We shouldn't be inventing new devlink APIs or create a fake sw eswitch object which doesn't exist in hw.
>
>>
>> >This is because unprivileged VF doesn't have visibility to eswitch and its
>> >links.
>> >
>> >> I think the terminology need to be defined clearly so that we are all
>> >> on the same page.
>> >>
>> >> >
>> >> >> Currently we have ndo_set_vf_mac_addr api that works with PF
>> >> >> netdev, but i think we are trying to move away from that API and
>> >> >> do all the configuration via the port representor netdevs.
>> >> > This is fine rep-netdev represents eswitch port.
>> >> > You normally don't go to switch to program host port params.
>> >> >
>> >> >> As the mac address cannot be configured using this netdev, I think
>> >> >> Jakub is suggesting creating a devlink object for each port
>> >> >> representor and using that interface to set the peer mac address.
>> >> >
>> >> > I understand, but it is a convoluted interface.
>> >> > When you program the host NIC mac address, you talk to iLo or BIOS.
>> >> > When you program the switch side mac address, you go to the
>> >> > switch/router/modem.
>> >> >
>> >> > Also, programming host params on the host side doesn't assume
>> >> > that it's connected to an eswitch.
>> >> > It also doesn't assume the same connectivity for its whole life.
>> >> >
>> >> > If you model around how physical devices are configured, it will
>> >> > almost never go wrong and still provides the same level of flexibility.
>> >> >
>> >> >> We should be able use this to configure port vlan too.
>> >> >>
>> >> >> Also, instead of subport, can we call vport and support different
>> >> >> types of vports - sr-iov, siov, vmdq etc.
>> >> >>
>> >> > At switch level there are just ports.
>> >> > sriov, siov, mdev, vmdq are their counterpart (peer) where it is
>> >> > connected.
>> >> >
>> >> >>>
>> >> >>> Also eswitch is flat. There is no need of pf/vf flavour for port.
>> >> >>> It doesn't make sense to define an 'mdev' flavour, which we are
>> >> >>> already working on.
>> >> >>> At eswitch level it is just a port; it happens to be connected to
>> >> >>> a vf or pf or other objects, it doesn't matter.
>> >> >>> Port should be flavoured as 'hostport' or 'switchport'.
>> >> >>>
>> >> >>>
>> >> >>>> (using the port ids from above)