From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Ahern Subject: Re: [patch net-next v2 00/10] Add support for resource abstraction Date: Thu, 4 Jan 2018 09:44:29 -0700 Message-ID: <69a6ec3e-b892-ef71-1d5c-5789fc014d04@cumulusnetworks.com> References: <20171226112359.5313-1-jiri@resnulli.us> <977652df-a0ed-d1a5-f299-1dc433ebd337@mellanox.com> <96389ae0-e038-8a26-84ea-0cf1b9fa0a05@cumulusnetworks.com> <0f861e90-63d3-2666-ef2d-0fc91beae957@mellanox.com> <48d2d512-6879-cbce-16a4-3413f6505c3d@cumulusnetworks.com> <21a4f75c-8819-0e3a-b3ab-807dcd44f6a9@mellanox.com> <4321df8d-0c79-102d-5748-913ebd8cf2b3@cumulusnetworks.com> <5a294d33-acf6-7855-524f-bc9b3f9a5005@mellanox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: davem@davemloft.net, mlxsw@mellanox.com, andrew@lunn.ch, vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com, michael.chan@broadcom.com, ganeshgr@chelsio.com, saeedm@mellanox.com, matanb@mellanox.com, leonro@mellanox.com, idosch@mellanox.com, jakub.kicinski@netronome.com, ast@kernel.org, daniel@iogearbox.net, simon.horman@netronome.com, pieter.jansenvanvuuren@netronome.com, john.hurley@netronome.com, alexander.h.duyck@intel.com, linville@tuxdriver.com, gospo@broadcom.com, steven.lin1@broadcom.com, yuvalm@mellanox.com, ogerlitz@mellanox.com To: Arkadi Sharshevsky , Jiri Pirko , netdev@vger.kernel.org, roopa@cumulusnetworks.com Return-path: Received: from mail-pl0-f42.google.com ([209.85.160.42]:40354 "EHLO mail-pl0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752463AbeADQoe (ORCPT ); Thu, 4 Jan 2018 11:44:34 -0500 Received: by mail-pl0-f42.google.com with SMTP id 62so1325293pld.7 for ; Thu, 04 Jan 2018 08:44:34 -0800 (PST) In-Reply-To: <5a294d33-acf6-7855-524f-bc9b3f9a5005@mellanox.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 1/4/18 9:13 AM, Arkadi Sharshevsky wrote: > > > On 01/04/2018 05:58 PM, David Ahern wrote: >> On 1/4/18 2:24 AM, Arkadi Sharshevsky wrote: >>>> Again, my comments all stem from user experience. >>>> >>>> Can you explain what "double_word" means for a unit? I would expect a >>>> units to be kB or count (or items or entries). >>>> >>> >>> Double word is 64 bit, dont understand why this is confusing. >> >> As Andrew pointed out, double word can have a range of sizes. >> >> To my point here, a 'unit' for a number should not be the number of bits >> it is represented by. I believe all of the kvd sizes are in entries >> (ie., a linear size of 98304 means I can have 98,304 entries in that >> resource). >> >>> >>>> $ devlink resource show pci/0000:03:00.0 >>>> pci/0000:03:00.0: >>>> name kvd size 245760 unit double_word size_valid true >>>> resources: >>>> name linear size 98304 occ 0 unit double_word >>>> name hash_double size 60416 unit double_word >>>> name hash_single size 87040 unit double_word >>>> >>>> While that is confusing here from the userspace command it goes hand in >>>> hand with patch 2 and: >>>> >>>> +enum devlink_resource_unit { >>>> + DEVLINK_RESOURCE_UNIT_DOUBLE_WORD, >>>> +}; >>>> >>>> >>>> Also, it seems like the occ of 0 is wrong since we know from past >>>> responses that if I set linear to 0 all of networking breaks. >>> >>> You are clearly misunderstanding what is occupancy of the resource >>> and what kvd linear is good for. As I mentioned in the last response >>> kvd linear is mapped to adjacency table. So in case its 0 no nexthop >>> routes could be configured, this information is provided by the >>> dpipe<-->resource. >>> >>> Occupancy means how much of the resource is used right now, why is >>> this wrong? and how its related to the size 0 exactly? >> >> The summary line above shows the current kvd/linear occupancy is '0'. >> That suggests my L3 only deployment is not using any kvd/linear which >> means I can set its allocation to 0 and devote more kvd resources to the >> hash regions. >> >> But, as I pointed out in previous responses I can not set linear to 0. >> If I do all of networking breaks. That suggests that kvd/linear is used >> by more networking entities than you are currently reporting. Hence, >> telling me the kvd/linear occupancy is 0 is wrong. >> >> I believe the stems from the how you are determining occupancy and the >> fact that not all tables have been implemented. Showing the current >> occupancy of a resource is very helpful, so it should be part of the API. >> >> However, until mlxsw shows a proper value (either by implementing all >> dpipe tables or changing how it is calculated) it should not be >> returning that attribute. As it stands you are giving a user invalid data. >> > > Wait, the occupancy is very precise it goes directly to the linear > allocator inside the driver and gives exact current usage. You assumed > it goes to the dpipe tables which is incorrect. I am trying *guess* based on this discussion as to why it is 0. In past responses you push that the tables expose how the resource is used. > > The linear kvd memory is used by: > 1. The adjacency table is responsible for the ECMP and nexthop > resolution. > 2. ACL flexible actions blocks (dpipe will contain table for this). > > I dont know what you configured but in case you got some remote > routes with a nexthop the occ should not be zero. > > If your router contains only local routes and dont use ACL you can > shrink it to zero, no problem. My current setup is an L3 only deployment, no ACLs and no bridges. We are meeting in 15 minutes to discuss this design; let's use some of the time for debugging this problem. > >>> >>>> >>>> >>>> >>>> How does a user learn the granularity of a resource: >>>> >>>> $ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 50000 >>>> Error: mlxsw_spectrum: resource set with wrong granularity. >>>> >>>> Try again with 51000 and then 52000 and ... Why not export the >>>> granularity read-only? I don't see it in the proposed UAPI. >>>> >>> >>> I would like more adding the granularity size to the extack string >>> instead of adding this to UAPI. The user will try to update once >>> and will get the required granularity in the error message. >> >> A user should not have to run a command to get an error message to know >> essential data to running a command with the right value. Further, do >> not assume 'user' is a person or the devlink command. >> >> Since granularity is essential to running a valid command, it should be >> an attribute for each resource. >> > > This is also your approach towards the min/max for resource sizes? > Am I correct? yes.