Date: Fri, 15 Mar 2019 08:00:49 +0100
From: Jiri Pirko
To: Jakub Kicinski
Cc: davem@davemloft.net, netdev@vger.kernel.org, oss-drivers@netronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Message-ID: <20190315070049.GD3034@nanopsycho>
In-Reply-To: <20190314150945.031d1b08@cakuba.netronome.com>

Thu, Mar 14, 2019 at 11:09:45PM CET, jakub.kicinski@netronome.com wrote:
>On Thu, 14 Mar 2019 08:38:40 +0100, Jiri Pirko wrote:
>> Wed, Mar 13, 2019 at 05:55:55PM CET, jakub.kicinski@netronome.com wrote:
>> >On Wed, 13 Mar 2019 17:22:43 +0100, Jiri Pirko wrote:
>> >> Wed, Mar 13, 2019 at 05:17:31PM CET, jakub.kicinski@netronome.com wrote:
>> >> >On Wed, 13 Mar 2019 07:07:01 +0100, Jiri Pirko wrote:
>> >> >> Tue, Mar 12, 2019 at 09:56:28PM CET, jakub.kicinski@netronome.com wrote:
>> >> >> >On Tue, 12 Mar 2019 15:02:39 +0100, Jiri Pirko wrote:
>> >> >> >> Tue, Mar 12, 2019 at 03:10:54AM CET, wrote:
>> >> >> >> >On Mon, 11 Mar 2019 09:52:04 +0100, Jiri Pirko wrote:
>> >> >> >> >> Fri, Mar 08, 2019 at 08:09:43PM CET, wrote:
>> >> >> >> >> >If the switchport is in the hypervisor then only the hypervisor can
>> >> >> >> >> >control switching/forwarding, correct?
>> >> >> >> >>
>> >> >> >> >> Correct.
>> >> >> >> >>
>> >> >> >> >> >The primary use case for partitioning within a VM (of a VF) would be
>> >> >> >> >> >containers (and DPDK)?
>> >> >> >> >>
>> >> >> >> >> Makes sense.
>> >> >> >> >>
>> >> >> >> >> >SR-IOV makes things harder. Splitting a PF is reasonably easy to grasp.
>> >> >> >> >> >What I'm trying to get a sense of is how we would control an SR-IOV
>> >> >> >> >> >environment as a whole.
>> >> >> >> >>
>> >> >> >> >> You mean orchestration?
>> >> >> >> >
>> >> >> >> >Right, orchestration.
>> >> >> >> >
>> >> >> >> >To be clear on where I'm going with this - if we want to allow VFs
>> >> >> >> >to partition themselves then they have to control what is effectively
>> >> >> >> >a "nested" switch. A per-VF set of rules which would then get
>> >> >> >>
>> >> >> >> Wait. If you allow making VF subports (I believe that is what you meant
>> >> >> >> by "VFs partition themselves"), that does not mean they will have a
>> >> >> >> separate nested switch. They would still belong under the same one.
>> >> >> >
>> >> >> >But that existing switch is administered by the hypervisor, how would
>> >> >> >the VF owners install forwarding rules in a switch they don't control?
>> >> >>
>> >> >> They won't.
>> >> >
>> >> >Argh. So how is forwarding configured if there are no rules? Are you
>> >> >going to assume it's switching on MACs? We're supposed to offload
>> >> >software constructs.
>> >> >If it's a software port it needs to be explicitly
>> >> >switched. If it's not explicitly switched - we already have macvlan
>> >> >offload.
>> >>
>> >> Wait a second. You configure the switch. And for that, you have the
>> >> switchports (representors). What we are talking about are the VF (VF
>> >> subport) host legs. Am I missing something?
>> >
>> >Hm :) So when a VM gets a new port, how is it connected? Are we
>> >assuming all ports of a VM are plugged into one big L2 switch?
>> >The use case for those subports is a little murky, sorry about
>> >the endless confusion :)
>>
>> Np. When user John (on bare metal, or wherever the devlink instance
>> with the switch port is) creates a VF or a VF subport by:
>> $ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0
>> or by:
>> $ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0 vf 0
>>
>> then instances of flavour pci_vf are going to appear in the same devlink
>> instance. Those are the switch ports:
>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
>>                         flavour pci_vf pf 0 vf 0
>>                         switch_id 00154d130d2f peer pci/0000:05:10.1/0
>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
>>                         flavour pci_vf pf 0 vf 0 subport 1
>>                         switch_id 00154d130d2f peer pci/0000:05:10.1/1
>>
>> With that, peers are going to appear too, and those are the actual VF/VF
>> subports:
>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
>>                     peer pci/0000:05:00.0/10002
>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
>>                     peer pci/0000:05:00.0/10003
>>
>> Later you can push this VF along with all its subports to a VM. So in
>> the VM, you are going to see the VF like this:
>> $ devlink dev
>> pci/0000:00:08.0
>> $ devlink port
>> pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
>> pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
>>
>> And back to your question of how they are connected in the eswitch:
>> that is totally up to the original user John who did the creation.
>> He is in charge of the eswitch on bare metal; he would configure
>> the forwarding however he likes.
>
>Ack, so I think you're saying the VM has to communicate with the cloud
>environment to have this provisioned using some service API, not
>a kernel API. That's what I wanted to confirm. Okay.
>
>I don't see any benefit to having the "host ports" under devlink,
>so I think it's a matter of preference. I'll try to describe
>the two options to Netronome's FAEs and see which one they find more
>intuitive.

Yeah, the "host ports" are probably not a must. I just like to have
them for visibility purposes. No big deal to implement them.

>
>Makes sense?

Okay. Thanks!
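
For illustration, here is a minimal sketch of how John, as the eswitch
owner on bare metal, might wire up forwarding for the representors from
the example above. The devlink port-add lines use the syntax proposed in
this series (not a merged API); the subport representor name
enp5s0npf0pf0s0 comes from the example output, while the uplink
representor name enp5s0np0 is a hypothetical placeholder. Bridging and
tc flower are the two standard ways to program a representor-based
eswitch, though whether a given NIC offloads these exact rules is
device-specific:

$ # create a VF on PF 0, then a subport on VF 0 (proposed syntax)
$ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0
$ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0 vf 0

$ # option 1: plain L2 switching - enslave the uplink and subport
$ # representors to a bridge and let MAC learning do the forwarding
$ ip link add name br0 type bridge
$ ip link set dev br0 up
$ ip link set dev enp5s0np0 master br0
$ ip link set dev enp5s0npf0pf0s0 master br0

$ # option 2: explicit rules - redirect all traffic between the two
$ # representors with tc flower, requesting hardware offload (skip_sw)
$ tc qdisc add dev enp5s0np0 ingress
$ tc qdisc add dev enp5s0npf0pf0s0 ingress
$ tc filter add dev enp5s0np0 ingress protocol all flower skip_sw \
      action mirred egress redirect dev enp5s0npf0pf0s0
$ tc filter add dev enp5s0npf0pf0s0 ingress protocol all flower skip_sw \
      action mirred egress redirect dev enp5s0np0

Either way, the forwarding policy stays with the eswitch owner; the VF
owner inside the VM never installs rules itself, which matches the
"They won't." answer above.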