Date: Tue, 12 Mar 2019 13:56:28 -0700
From: Jakub Kicinski
To: Jiri Pirko
Cc: davem@davemloft.net, netdev@vger.kernel.org, oss-drivers@netronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Message-ID: <20190312135628.5250135b@cakuba.hsd1.ca.comcast.net>
In-Reply-To: <20190312140239.GA2455@nanopsycho>
References: <20190305110601.GC2314@nanopsycho>
 <20190305091534.36200de6@cakuba.hsd1.ca.comcast.net>
 <20190306122037.GB2819@nanopsycho>
 <20190306095638.7c028bdd@cakuba.hsd1.ca.comcast.net>
 <20190307094816.GA2190@nanopsycho>
 <20190307185202.2db37490@cakuba.hsd1.ca.comcast.net>
 <20190308145421.GA2888@nanopsycho.orion>
 <20190308110943.2ee42bc0@cakuba.hsd1.ca.comcast.net>
 <20190311085204.GA2194@nanopsycho>
 <20190311191054.36b801d6@cakuba.hsd1.ca.comcast.net>
 <20190312140239.GA2455@nanopsycho>
Organization:
 Netronome Systems, Ltd.

On Tue, 12 Mar 2019 15:02:39 +0100, Jiri Pirko wrote:
> Tue, Mar 12, 2019 at 03:10:54AM CET, wrote:
> >On Mon, 11 Mar 2019 09:52:04 +0100, Jiri Pirko wrote:
> >> Fri, Mar 08, 2019 at 08:09:43PM CET, wrote:
> >> >If the switchport is in the hypervisor then only the hypervisor can
> >> >control switching/forwarding, correct?
> >>
> >> Correct.
> >>
> >> >The primary use case for partitioning within a VM (of a VF) would be
> >> >containers (and DPDK)?
> >>
> >> Makes sense.
> >>
> >> >SR-IOV makes things harder. Splitting a PF is reasonably easy to grasp.
> >> >I'm trying to get a sense of is how would we control an SR-IOV
> >> >environment as a whole.
> >>
> >> You mean orchestration?
> >
> >Right, orchestration.
> >
> >To be clear on where I'm going with this - if we want to allow VFs
> >to partition themselves then they have to control what is effectively
> >a "nested" switch. A per-VF set of rules which would the get
>
> Wait. If you allow to make VF subports (I believe that is what you ment
> by VFs partition themselves), that does not mean they will have a
> separate nested switch. They would still belong under the same one.

But that existing switch is administered by the hypervisor. How would
the VF owners install forwarding rules in a switch they don't control?

> >"flattened" into the main eswitch rule set. If I was to choose I'd
> >really rather have this "flattening" be done on the (Linux) hypervisor
> >and not in the vendor driver and firmware.
>
> Agreed. Driver should provide one big switch. User should configure it.

Cool, when you say user - is it the tenant or the provider?

> >I'd much rather have the VM make a "give me another NIC" orchestration
> >call via some high level REST API than devlink.
> >This makes the configuration strictly high level to low level:
> >
> >  VM -> cloud net REST API -> cloud agent -> devlink/Linux -> FW -> HW
> >
> >Without round trips via firmware.
>
> Okay. So the "devlink/Linux -> FW" part is going to happen on baremetal.
> Makes sense.
>
> >This allows for easy policy enforcement, common code to be maintained
> >in user space, in high level languages (no 0.5M LoC drivers and 10M LoC
> >firmware for every driver). It can also be used with software paths
> >like VirtIO..
>
> Agreed.
>
> >Modelling and debugging a nested switch would be a nightmare. What
> >follows is that we probably shouldn't deal with partitioning of VFs,
> >but rather only partition via the PF devlink instance, and reassign
> >the partitions to VMs.
>
> Agreed. That must be misunderstanding, I never suggested nested
> switches.

Cool, yes, I was making sure we weren't going in that direction :)

> >> I originally planned to implement sriov orchestration api in devlink too.
> >
> >Interesting, would you mind elaborating?
>
> I have to think about it. But something like this:
> [...]

I see, thanks for the examples, they make things clear!