From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [patch net-next v2 00/10] Add support for resource abstraction
Date: Tue, 2 Jan 2018 11:08:17 +0100
Message-ID: <20180102100817.GB2051@nanopsycho.orion>
References: <20171226112359.5313-1-jiri@resnulli.us> <977652df-a0ed-d1a5-f299-1dc433ebd337@mellanox.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org, dsa@cumulusnetworks.com, roopa@cumulusnetworks.com, davem@davemloft.net, mlxsw@mellanox.com, andrew@lunn.ch, vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com, michael.chan@broadcom.com, ganeshgr@chelsio.com, saeedm@mellanox.com, matanb@mellanox.com, leonro@mellanox.com, idosch@mellanox.com, jakub.kicinski@netronome.com, ast@kernel.org, daniel@iogearbox.net, simon.horman@netronome.com, pieter.jansenvanvuuren@netronome.com, john.hurley@netronome.com, alexander.h.duyck@intel.com, linville@tuxdriver.com, gospo@broadcom.com, steven.lin1@broadcom.com, yuvalm@mellanox.com, ogerlitz@mellanox.com
To: Arkadi Sharshevsky
Return-path:
Received: from mail-wm0-f68.google.com ([74.125.82.68]:33331 "EHLO mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751418AbeABKIT (ORCPT ); Tue, 2 Jan 2018 05:08:19 -0500
Received: by mail-wm0-f68.google.com with SMTP id g130so15715720wme.0 for ; Tue, 02 Jan 2018 02:08:18 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <977652df-a0ed-d1a5-f299-1dc433ebd337@mellanox.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Mon, Jan 01, 2018 at 03:58:33PM CET, arkadis@mellanox.com wrote:
>
>
>On 12/26/2017 01:23 PM, Jiri Pirko wrote:
>> From: Jiri Pirko
>>
>> Many of the ASIC's internal resources are limited and are shared between
>> several hardware procedures. For example, unified hash-based memory can
>> be used for many lookup purposes, like FDB and LPM. In many cases the user
>> can provide a partitioning scheme for such a resource in order to perform
>> fine tuning for his application.
>> In such cases performing driver reload is
>> needed for the changes to take place, thus this patchset also adds support
>> for hot reload.
>>
>> Such an abstraction can be coupled with devlink's dpipe interface, which
>> models the ASIC's pipeline as a graph of match/action tables. By modeling
>> the hardware resource object, and by coupling it to several dpipe tables,
>> further visibility can be achieved in order to debug ASIC-wide issues.
>>
>> The proposed interface will provide the user the ability to understand the
>> limitations of the hardware, and receive notification regarding its occupancy.
>> Furthermore, monitoring the resource occupancy can be done in real-time and
>> can be useful in many cases.
>> ---
>> Userspace part prototype can be found at https://github.com/arkadis/iproute2/
>> at resource_dev branch.
>>
>> v1->v2
>> - Add resource size attribute.
>> - Fix split bug.
>>
>
>Just to summarize the current fixes required:
>
>1. ERIF dpipe table size is reporting wrong size. More precisely the
>   ERIF table does not take rifs, so it should not be linked to the rif
>   bank resource (is not part of this patchset, future extension).
>2. Extended ACK user-space bug.
>3. ABI documentation - Not sure we agreed upon it, Jiri?

Question is where to put it. It is an mlxsw-specific thing, moreover a
Spectrum-specific thing, same as dpipe tables etc. Not sure. Perhaps
Documentation/networking/mlxsw.txt ?

>
>If I missed something please respond. None of the fixes mentioned
>above is relevant for this patchset actually.
>
>Couple of key-points:
>
>1. Constraints/trade-offs about setting the sizes - this can be obtained
>   trivially from the resource tree nested structure.
>2. Dpipe provides the mapping of hardware processes to resources.
>3. Units - each resource specifies its units; if a dpipe table's size is
>   X and it is related to some resource, its size is normalized to that
>   resource's basic unit.
>
>IMO this is the most hardware exact interaction, and this is the way it
>should be exported from the kernel, if something is not presented in
>'user' convenient way some utilities can be implemented in userspace
>to easily do it. Furthermore, some examples will be provided for the
>whole kvd tree partition for different cases (IPv6 heavy etc..).
>Advanced users will be able to tweak it as they like.
>
>Regarding the 'switchdev' layer I think that kernel's software tables
>like nexthops/neigh/routes should be mapped to dpipe tables and not
>to resources directly:

Sure. dpipe table -> resource mapping is the only one that makes sense.

>
>kernel_fdb--> dpipe_fdb -->/kvd/hash_single.
>
>> Arkadi Sharshevsky (10):
>>   devlink: Add per devlink instance lock
>>   devlink: Add support for resource abstraction
>>   devlink: Add support for reload
>>   devlink: Add relation between dpipe and resource
>>   mlxsw: pci: Add support for performing bus reset
>>   mlxsw: spectrum: Register KVD resources with devlink
>>   mlxsw: spectrum_dpipe: Connect dpipe tables to resources
>>   mlxsw: spectrum: Add support for getting kvdl occupancy
>>   mlxsw: pci: Add support for getting resource through devlink
>>   mlxsw: core: Add support for reload
>>
>>  drivers/net/ethernet/mellanox/mlxsw/core.c       |  85 ++-
>>  drivers/net/ethernet/mellanox/mlxsw/core.h       |  16 +-
>>  drivers/net/ethernet/mellanox/mlxsw/i2c.c        |   5 +-
>>  drivers/net/ethernet/mellanox/mlxsw/pci.c        |  98 ++--
>>  drivers/net/ethernet/mellanox/mlxsw/spectrum.c   | 205 ++++++++
>>  drivers/net/ethernet/mellanox/mlxsw/spectrum.h   |  13 +
>>  .../net/ethernet/mellanox/mlxsw/spectrum_dpipe.c |  72 ++-
>>  .../net/ethernet/mellanox/mlxsw/spectrum_kvdl.c  |  26 +
>>  include/net/devlink.h                            |  97 ++++
>>  include/uapi/linux/devlink.h                     |  21 +
>>  net/core/devlink.c                               | 573 ++++++++++++++++++---
>>  11 files changed, 1079 insertions(+), 132 deletions(-)
>>
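
For readers following the thread, the key points above (nested resource
sizes, the dpipe table -> resource mapping, and unit normalization) can be
modelled in a few lines of userspace Python. This is only an illustrative
sketch, not devlink or mlxsw code: the class names, all sizes, the
units_per_entry field and the hash_double sibling are assumptions made up
for the example. Only the kernel_fdb --> dpipe_fdb --> /kvd/hash_single
chain is taken from the mail.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    """Node of a hypothetical resource tree (not the kernel structure)."""
    name: str
    size: int  # capacity, expressed in this resource's basic units
    parent: Optional["Resource"] = field(default=None, repr=False, compare=False)
    children: List["Resource"] = field(default_factory=list)

    def add_child(self, name: str, size: int) -> "Resource":
        child = Resource(name, size, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        prefix = self.parent.path() if self.parent else ""
        return f"{prefix}/{self.name}"

    def partition_is_valid(self) -> bool:
        # Nested-structure constraint: the children must fit in the parent.
        fits = sum(c.size for c in self.children) <= self.size
        return fits and all(c.partition_is_valid() for c in self.children)


@dataclass
class DpipeTable:
    """Hypothetical dpipe table linked to the resource that backs it."""
    name: str
    entries: int          # table size, counted in table entries
    units_per_entry: int  # resource units one entry consumes (assumed)
    resource: Resource

    def size_in_resource_units(self) -> int:
        # Normalize the table size to the backing resource's basic unit.
        return self.entries * self.units_per_entry


# Made-up partitioning of a "kvd" resource; all sizes are illustrative.
kvd = Resource("kvd", size=1000)
hash_single = kvd.add_child("hash_single", size=600)
hash_double = kvd.add_child("hash_double", size=400)

dpipe_fdb = DpipeTable("dpipe_fdb", entries=100, units_per_entry=2,
                       resource=hash_single)

# Kernel software tables map to dpipe tables, never to resources directly.
kernel_to_dpipe: Dict[str, DpipeTable] = {"kernel_fdb": dpipe_fdb}


def resolve(kernel_table: str) -> str:
    """Follow kernel table -> dpipe table -> resource path."""
    return kernel_to_dpipe[kernel_table].resource.path()
```

With this model, resolve("kernel_fdb") walks the same chain as in the mail,
and growing one child beyond what the parent can hold makes
partition_is_valid() fail, which is the trade-off a user would read off
the tree.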