Date: Mon, 4 Mar 2019 16:40:07 -0800
From: Jakub Kicinski
To: Jiri Pirko
Cc: davem@davemloft.net, netdev@vger.kernel.org, oss-drivers@netronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Message-ID: <20190304164007.7cef8af9@cakuba.netronome.com>
In-Reply-To: <20190304111902.GX2314@nanopsycho>
References: <20190301180453.17778-1-jakub.kicinski@netronome.com>
 <20190301180453.17778-5-jakub.kicinski@netronome.com>
 <20190302094116.GQ2314@nanopsycho>
 <20190302114847.733759a1@cakuba.netronome.com>
 <20190304111902.GX2314@nanopsycho>
Organization: Netronome Systems, Ltd.
X-Mailing-List: netdev@vger.kernel.org

On Mon, 4 Mar 2019 12:19:02 +0100, Jiri Pirko wrote:
> Sat, Mar 02, 2019 at 08:48:47PM CET, jakub.kicinski@netronome.com wrote:
> >On Sat, 2 Mar 2019 10:41:16 +0100, Jiri Pirko wrote:
> >> Fri, Mar 01, 2019 at 07:04:50PM CET, jakub.kicinski@netronome.com wrote:
> >> >PCI endpoint corresponds to a PCI device, but such device
> >> >can have one or more logical device ports associated with it.
> >> >We need a way to distinguish those. Add a PCI subport in the
> >> >dumps and print the info in phys_port_name appropriately.
> >> >
> >> >This is not equivalent to port splitting, there is no split
> >> >group. It's just a way of representing multiple netdevs on
> >> >a single PCI function.
> >> >
> >> >Note that the quality of being multiport pertains only to
> >> >the PCI function itself. A PF having multiple netdevs does
> >> >not mean that its VFs will also have multiple, or that VFs
> >> >are associated with any particular port of a multiport VF.
> >> >
> >> >Example (bus 05 device has subports, bus 82 has only one port per
> >> >function):
> >> >
> >> >$ devlink port
> >> >pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical
> >> >pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0
> >> >pci/0000:05:00.0/4: type eth netdev enp5s0np1 flavour physical
> >> >pci/0000:05:00.0/11000: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1
> >>
> >> So these subport devlink ports are eswitch ports for subports, right?
> >>
> >> Please see the following drawing:
> >>
> >>                                  +---+      +---+      +---+
> >>                             pfsub| 5 |    vf| 6 |      | 7 |pfsub
> >>                                  +-+-+      +-+-+      +-+-+
> >> physical link <---------+          |          |          |
> >>                         |          |          |          |
> >>                         |          |          |          |
> >>                         |          |          |          |
> >>                       +-+-+      +-+-+      +-+-+      +-+-+
> >>                       | 1 |      | 2 |      | 3 |      | 4 |
> >>                    +--+---+------+---+------+---+------+---+--+
> >>                    | physical    pfsub       vf        pfsub  |
> >>                    | port        port        port      port   |
> >>                    |                                          |
> >>                    |                 eswitch                  |
> >>                    |                                          |
> >>                    |                                          |
> >>                    +------------------------------------------+
> >>
> >> 1) pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
> >> 2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0 switch_id 00154d130d2f
> >> 3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0 flavour pci_vf pf 0 vf 0 switch_id 00154d130d2f
> >> 4) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1 switch_id 00154d130d2f
> >>
> >> This is basically what you have and I think we are in sync with that.
> >> But what about 5,6,7? Should they have devlink port instances too?
> >>
> >> 5) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 0
> >> 6) pci/0000:05:10.1/0: type eth netdev enp5s10f0 flavour ???? pf 0 vf 0
> >> 7) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 1
> >>
> >> These are the "peers".
> >> I think that there could be flavours "pci_pf" and "pci_vf". Then the
> >> "representors" (switch ports) could have flavours "pci_pf_port" and
> >> "pci_vf_port" or something like that. User can see right away
> >> that is not "PF" or "VF" but rather something "on the other end".
> >> Note there is no "switch_id" for these devlink ports that tells the user
> >> these devlink ports are not part of any switch.
> >> What do you think?
> >
> >Hmmm.. Hm. Hm.
> >
> >To me it's neat if the devlink instance matches an ASIC. I think it's
> >kind of clear for people to understand what it stands for then.
> >So if we wanted to do the above we'd have to make the switch_id the first
> >class identifier for devlink instances, rather than the bus? But then
> >VF instances don't have a switch ID so that doesn't work...
> >
> >I need to think about it.
> >
> >It's also kind of strange that we have to add the noun *port* to the
> >flavour of... a port... So I would prefer not to have those showing up
> >as ports. Can we invent a new command (say "partition"?) that'd take
> >the bus info where the partition is to be spawned?
>
> Devlink is not supposed to be only there for switches. From the
> beginning the design was to handle cases where the netdev/ib_dev is not
> the correct handle. Not only in case you have multiple instances (ports)
> for one ASIC, but also in case you have only one. Example use case is
> port-type-change (eth->ib,ib->eth).
>
> I chose word "port" as the parent devlink instance is "dev" and if you
> partition the ASIC you basically got "ports", each of a different flavour.
>
> And as you said, devlink instance matches one ASIC. Therefore the
> devlink instance should contain all bits that are part of that ASIC,
> not only switch/eswitch ports. That would be very limiting.

I could read this as us being in full agreement, but I'm not sure..

I think we agree that all objects of an ASIC should be under one
devlink instance. The question that remains is whether both ends of the
pipe for PCI devices (subdevs or not) should appear under ports, or
whether the "far end" (from the ASIC's perspective), i.e. the "host
end", gets its own category.
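
To make the distinction in the thread concrete, here is a minimal
sketch that splits the quoted `devlink port` lines into eswitch ports
and "peers", using only the rule Jiri states above (no switch_id means
the port is not part of any switch). The helper names
parse_devlink_port and is_switch_port are hypothetical, and the sketch
assumes every attribute is a simple "key value" pair, which real
devlink output may not guarantee. The "subport" attribute is the one
proposed in this series, and the sample lines are copied from the
discussion, including the undecided "????" flavour.

```python
def parse_devlink_port(line):
    """Split one `devlink port` line into (handle, attrs).

    Assumes attributes are plain "key value" pairs (illustrative only).
    """
    # The handle itself contains colons, so split on the first ": ".
    handle, _, rest = line.partition(": ")
    tokens = rest.split()
    attrs = dict(zip(tokens[0::2], tokens[1::2]))
    return handle, attrs


def is_switch_port(attrs):
    # Rule taken from the thread: no switch_id means the devlink port
    # is not part of any switch, i.e. it is the "peer" / host end.
    return "switch_id" in attrs


sample = [
    "pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 "
    "flavour pci_pf pf 0 subport 0 switch_id 00154d130d2f",
    "pci/0000:05:10.1/0: type eth netdev enp5s10f0 flavour ???? pf 0 vf 0",
]

for line in sample:
    handle, attrs = parse_devlink_port(line)
    kind = "switch port" if is_switch_port(attrs) else "peer"
    print(handle, attrs["flavour"], kind)
```

Whether the "peer" entries end up listed under ports or under a
separate category is exactly the open question above; the sketch only
shows that switch_id alone is enough to tell the two ends apart.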