Date: Fri, 15 Mar 2019 21:08:14 +0100
From: Jiri Pirko
To: Parav Pandit
Cc: "Samudrala, Sridhar", Jakub Kicinski, "davem@davemloft.net", "netdev@vger.kernel.org", "oss-drivers@netronome.com"
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Message-ID: <20190315200814.GD2305@nanopsycho>
References: <20190313095555.0f4f92ca@cakuba.attlocal.net> <20190314073840.GA3034@nanopsycho> <20190314150945.031d1b08@cakuba.netronome.com> <20190314163915.24fd2481@cakuba.netronome.com> <4436da3d-4b99-f792-8e77-695d5958794d@intel.com>
X-Mailing-List: netdev@vger.kernel.org

Fri, Mar 15, 2019 at 04:32:24PM CET, parav@mellanox.com wrote:
>
>
>> -----Original Message-----
>> From: Samudrala, Sridhar
>> Sent: Friday, March 15, 2019 12:58 AM
>> To: Parav Pandit; Jakub Kicinski
>> Cc: Jiri Pirko; davem@davemloft.net;
>> netdev@vger.kernel.org; oss-drivers@netronome.com
>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
>> ports
>>
>>
>> On 3/14/2019 7:40 PM, Parav Pandit wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Samudrala, Sridhar
>> >> Sent: Thursday, March 14, 2019 9:16 PM
>> >> To: Parav Pandit; Jakub Kicinski
>> >> Cc: Jiri Pirko; davem@davemloft.net;
>> >> netdev@vger.kernel.org; oss-drivers@netronome.com
>> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >> devlink PCI ports
>> >>
>> >>
>> >>
>> >> On 3/14/2019 6:28 PM, Parav Pandit wrote:
>> >>>
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: Jakub Kicinski
>> >>>> Sent: Thursday, March 14, 2019 6:39 PM
>> >>>> To: Parav Pandit
>> >>>> Cc: Jiri Pirko; davem@davemloft.net;
>> >>>> netdev@vger.kernel.org; oss-drivers@netronome.com
>> >>>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >>>> devlink PCI ports
>> >>>>
>> >>>> On Thu, 14 Mar 2019 22:35:36 +0000, Parav Pandit wrote:
>> >>>>>>> Then instances of flavour pci_vf are going to appear in the same
>> >>>>>>> devlink instance. Those are the switch ports:
>> >>>>>>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
>> >>>>>>>     flavour pci_vf pf 0 vf 0
>> >>>>>>>     switch_id 00154d130d2f peer pci/0000:05:10.1/0
>> >>>>>>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
>> >>>>>>>     flavour pci_vf pf 0 vf 0 subport 1
>> >>>>>>>     switch_id 00154d130d2f peer pci/0000:05:10.1/1
>> >>>>>>>
>> >>>>>>> With that, peers are going to appear too, and those are the
>> >>>>>>> actual VF/VF subport:
>> >>>>>>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
>> >>>>>>>     peer pci/0000:05:00.0/10002
>> >>>>>>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
>> >>>>>>>     peer pci/0000:05:00.0/10003
>> >>>>>>>
>> >>>>>>> Later you can push this VF along with all subports to VM. So in
>> >>>>>>> VM, you are going to see the VF like this:
>> >>>>>>> $ devlink dev
>> >>>>>>> pci/0000:00:08.0
>> >>>>>>> $ devlink port
>> >>>>>>> pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
>> >>>>>>> pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
>> >>>>>>>
>> >>>>>>> And back to your question of how are they connected in eswitch.
>> >>>>>>> That is totally up to the original user John who did the creation.
>> >>>>>>> He is in charge of the eswitch on baremetal, he would configure
>> >>>>>>> the forwarding however he likes.
>> >>>>>>
>> >>>>>> Ack, so I think you're saying VM has to communicate to the cloud
>> >>>>>> environment to have this provisioned using some service API, not
>> >>>>>> a kernel API. That's what I wanted to confirm.
>> >>>>>>
>> >>>>>> I don't see any benefit to having the "host ports" under devlink,
>> >>>>>> as such I think it's a matter of preference.
>> >>>>>
>> >>>>> We need 'host ports' to configure parameters of this host port
>> >>>>> which is not exposed by the rep-netdev.
>> >>>>> Such as mac address.
>> >>>>
>> >>>> Please look at the quote of what Jiri wrote above - the host port
>> >>>> gets passed to the VM, you can't use it as a handle to set the MAC.
>> >>>>
>> >>>> The way to set the MAC remains:
>> >>>>
>> >>>> # devlink port set pci/0000:05:00.0/10002 peer mac_addr 00:11:22:33:44:55
>> >>>>
>> >>> Even though it can be done, I think this is wrong model to program
>> >>> hostport mac address using eswitch port.
>> >>> All devlink objects are control objects, so what is passed to VM is
>> >>> what is represented by devlink.
>> >>> VF in the VM will anyway create its devlink object.
>> >>> What is wrong in programming hostport?
>> >>> It gives a very clear view to users of topology and objects.
>> >>
>> >> The VF or any subport MAC address should be configured by the
>> >> orchestration layer that is running on the hypervisor and when a VF
>> >> is assigned to a VM, the host port is not visible to the hypervisor.
>> > What prevents creation of hostport due to which is not visible?
>> > Hostport is control port to program host side of parameters.
>> > It should be created when user wants to program the parameters.
>> >
>> > Model is really straightforward.
>> > Program host port params using hostport object.
>> > Program switchport params using rep-netdev.
>>
>> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects for each port -
>> host facing ports and switch facing ports. This is in addition to the netdevs
>> that are created today.
>>
>I am not proposing anything different.
>I am proposing only two changes.
>1. control hostport params via referring hostport (not via indirect peer)

Not really possible. If you pass the VF through into a VM, the hostport
goes along with it.

>2. flavour should not be vf/pf, flavour should be hostport, switchport.
>Because switch is flat and agnostic of pf/vf/mdev.

Not sure. It's good to have this kind of visibility.

>
>> Are you suggesting that all the devlink objects should be visible only at the
>> hypervisor layer?
>>
>Of course not.
>
>Ports and params controlled by hypervisor should be exposed at hypervisor/eswitch wherever its parent devlink instance exists.
>Ports which should be visible inside a VM should be exposed inside a VM.
>So for a given VF,
>
>If eswitch is at hypervisor level,
>$ devlink port show
>pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:10.1/0
>pci/0000:05:10.1/0 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
>
>where VF is enumerated,
>$ devlink port show
>pci/0000:05:10.1/0 eth netdev flavour hostport

So this is what it looks like in the VM, right?

>This is because unprivileged VF doesn't have visibility to eswitch and its links.
>
>> I think the terminology needs to be defined clearly so that we are all on the
>> same page.
>>
>> >
>> >> Currently we have ndo_set_vf_mac api that works with PF netdev,
>> >> but I think we are trying to move away from that API and do all the
>> >> configuration via the port representor netdevs.
>> > This is fine, rep-netdev represents eswitch port.
>> > You normally don't go to switch to program host port params.
>> >
>> >> As the mac address cannot be configured using this netdev, I think
>> >> Jakub is suggesting creating a devlink object for each port
>> >> representor and use that interface to set peer mac address.
>> >
>> > I understand but it is a convoluted interface.
>> > When you program host NIC mac address you talk to iLo or BIOS.
>> > When you program switch side mac address, you go to switch/router/modem.
>> >
>> > Also programming host params on host side doesn't make the
>> > assumption that it's connected to eswitch.
>> > It also doesn't assume the same connectivity for its life.
>> >
>> > If you model around how physical devices are configured, it will almost
>> > never go wrong and still provides same level of flexibility.
>> >
>> >> We should be able to use this to configure port vlan too.
>> >>
>> >> Also, instead of subport, can we call vport and support different
>> >> types of vports - sr-iov, siov, vmdq etc.
>> >>
>> > At switch level there are just ports.
>> > sriov, siov, mdev, vmdq are their counterpart (peer) where it is connected.
>> >
>> >>>
>> >>> Also eswitch is flat. There is no need of pf/vf flavour for port.
>> >>> It doesn't make sense to define 'mdev' flavour which we are already
>> >>> working.
>> >>> At eswitch level it is just a port, it happens to be connected to vf
>> >>> or pf or other objects, it doesn't matter.
>> >>> Port should be flavoured as 'hostport' or 'switchport'.
>> >>>
>> >>>
>> >>>> (using the port ids from above)
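
To make the two views in Parav's example concrete, here is a rough Python sketch of the proposed model. It is only an illustration of the discussion, not devlink code: the Port class, the visible_ports() helper and the privileged flag are all hypothetical names, and the "hostport"/"switchport" flavours are the ones proposed in this thread, not existing UAPI. The port handles are the ones quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for a devlink port entry."""
    handle: str                  # e.g. "pci/0000:05:00.0/10002"
    flavour: str                 # "switchport" or "hostport" (proposed names)
    peer: Optional[str] = None   # handle of the port on the other side, if known


# The two ends of one VF link, as in the hypervisor-level listing above.
PORTS = [
    Port("pci/0000:05:00.0/10002", "switchport", peer="pci/0000:05:10.1/0"),
    Port("pci/0000:05:10.1/0", "hostport", peer="pci/0000:05:00.0/10002"),
]


def visible_ports(ports: List[Port], privileged: bool) -> List[Port]:
    """Return what a given side would see under the proposed model.

    Assumption: the eswitch owner (privileged) sees both ends and the peer
    links; an unprivileged VF sees only its hostport, without the peer.
    """
    if privileged:
        return list(ports)
    return [Port(p.handle, p.flavour) for p in ports if p.flavour == "hostport"]


if __name__ == "__main__":
    for label, priv in (("hypervisor", True), ("VM", False)):
        print(label)
        for p in visible_ports(PORTS, priv):
            peer = f" peer {p.peer}" if p.peer else ""
            print(f"  {p.handle} flavour {p.flavour}{peer}")
```

The sketch keeps the same handle in both views, following Parav's listing; in the earlier example the VF shows up under a different PCI address inside the VM, so a real implementation would not share handles this way.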