Date: Wed, 30 Jan 2019 16:24:08 -0800
From: Jakub Kicinski
To: Roopa Prabhu
Cc: David Miller, oss-drivers@netronome.com, netdev, Jiří Pírko,
 Florian Fainelli, Andrew Lunn, Michal Kubecek, David Ahern,
 Simon Horman, "Brandeburg, Jesse", maciejromanfijalkowski@gmail.com,
 vasundhara-v.volam@broadcom.com, Michael Chan, shalomt@mellanox.com,
 Ido Schimmel
Subject: Re: [RFC 00/14] netlink/hierarchical stats
Message-ID: <20190130162408.60f1f5dc@cakuba.hsd1.ca.comcast.net>
In-Reply-To:
References: <20190128234507.32028-1-jakub.kicinski@netronome.com>
Organization: Netronome Systems, Ltd.
X-Mailing-List: netdev@vger.kernel.org

On Wed, 30 Jan 2019 14:14:34 -0800, Roopa Prabhu wrote:
> On Mon, Jan 28, 2019 at 3:45 PM Jakub Kicinski wrote:
> > Hi!
> >
> > As I tried to explain in my slides at netconf 2018 we are lacking
> > an expressive, standard API to report device statistics.
> >
> > Networking silicon generally maintains some IEEE 802.3 and/or RMON
> > statistics.  Today those all end up in ethtool -S.  Here is a simple
> > attempt (admittedly very imprecise) of counting how many names driver
> > authors invented for IETF RFC2819 etherStatsPkts512to1023Octets
> > statistics (RX and TX):
> >
> > $ git grep '".*512.*1023.*"' -- drivers/net/ | \
> >     sed -e 's/.*"\(.*\)".*/\1/' | sort | uniq | wc -l
> > 63
> >
> > Interestingly only two drivers in the tree use the name the standard
> > gave us (etherStatsPkts512to1023, modulo case).
> >
> > I set out to work on this set in an attempt to give drivers a way
> > to express clearly to user space standard-compliant counters.
> >
> > Second most common use for custom statistics is per-queue counters.
> > This is where the "hierarchical" part of this set comes in, as
> > groups can be nested, and user space tools can handle the aggregation
> > inside the groups if needed.
> >
> > This set also tries to address the problem of users not knowing if
> > a statistic is reported by hardware or the driver.  Many modern drivers
> > use some prefix in ethtool -S to indicate MAC/PHY stats.  At a quick
> > glance: Netronome uses "mac.", Intel "port." and Mellanox "_phy".
> > In this set, netlink attributes describe whether a group of statistics
> > is RX or TX, maintained by device or driver.
> >
> > The purpose of this patch set is _not_ to replace ethtool -S.  It is
> > an incredibly useful tool, and we will certainly continue using it.
> > However, for standard-based and commonly maintained statistics a more
> > structured API seems warranted.
> >
> > There are two things missing from these patches, which I initially
> > planned to address as well: filtering, and refresh rate control.
> >
> > Filtering doesn't need much explanation, users should be able to request
> > only a subset of statistics (like only SW stats or only given ID).  The
> > bitmap of statistics in each group is there for filtering later on.
> >
> > By refresh control I mean the ability for user space to indicate how
> > "fresh" values it expects.  Sometimes reading the HW counters requires
> > slow register reads or FW communication, in such cases drivers may cache
> > the result.  (Privileged) user space should be able to add a "not older
> > than" timestamp to indicate how fresh statistics it expects.  And vice
> > versa, drivers can then also put the timestamp of when the statistics
> > were last refreshed in the dump for more precise bandwidth estimation.
>
> Jakub, Glad to see hw stats in the RTM_*STATS api. I do see you
> mention 'partial' support for ethtool stats. I understand the reason
> you say it's partial.
> But while at it, why not also include the ability to have driver
> extensible stats here? i.e. make it complete. We have talked about
> making all hw stats available via the RTM_*STATS api in the past...,
> so just want to make sure the new HSTATS infra you are adding to the
> RTM_*STATS api covers or at least makes it possible to include driver
> extensible stats in the future where the driver gets to define the
> stats id + value (This is very useful).
> It would be nice if you can account for that in this new HSTATS API.

My thinking was that we should leave truly custom/strange stats to the
ethtool API, which works quite well for that, and at the same time be
very accepting of people adding new IDs to HSTAT (the only requirement
is basically defining the meaning very clearly).
As a first stab I looked at two drivers and added all the stats that
were common.

Given this set is identifying statistics by ID, how would we make that
extensible to drivers?  Would we go back to strings or have some
"driver specific" ID space?

Is there any particular type of statistic you'd expect drivers to want
to add?  For NICs I think IEEE/RMON should pretty much cover the
silicon ones, but I don't know much about switches :)
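
To make the "driver specific" ID space option concrete, here is a rough sketch of one way it could look. Everything in it is an assumption for illustration only: the names (hstat_id, HSTAT_ID_DRV_BASE, drv_stat_desc), the 16-bit ID width and the 0x8000 split are made up and are not taken from the patch set. The idea is that IDs below the base keep a globally defined meaning, while IDs at or above it are only meaningful together with a per-driver table that supplies the name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ID layout; none of these names or values come from the RFC. */
enum hstat_id {
	HSTAT_ID_RX_PKTS_512_TO_1023 = 1, /* stand-in for a standard counter */
	HSTAT_ID_TX_PKTS_512_TO_1023 = 2,
	HSTAT_ID_DRV_BASE = 0x8000,       /* assumed start of the private range */
};

/* A driver would publish one descriptor per private counter. */
struct drv_stat_desc {
	uint16_t id;      /* must be >= HSTAT_ID_DRV_BASE */
	const char *name; /* driver-chosen, like an ethtool -S string */
};

/* True if the ID has no global meaning and needs the driver's table. */
static bool hstat_id_is_driver_private(uint16_t id)
{
	return id >= HSTAT_ID_DRV_BASE;
}

/* Resolve a private ID to its name; NULL for standard or unknown IDs. */
static const char *hstat_drv_name(const struct drv_stat_desc *tbl, size_t n,
				  uint16_t id)
{
	if (!hstat_id_is_driver_private(id))
		return NULL;
	for (size_t i = 0; i < n; i++)
		if (tbl[i].id == id)
			return tbl[i].name;
	return NULL;
}

/* Usage example with a made-up driver exposing two private counters. */
static void hstat_example(void)
{
	static const struct drv_stat_desc tbl[] = {
		{ HSTAT_ID_DRV_BASE + 0, "fw_mailbox_retries" },
		{ HSTAT_ID_DRV_BASE + 1, "rx_cache_misses" },
	};
	const size_t n = sizeof(tbl) / sizeof(tbl[0]);

	/* Standard IDs never resolve through the driver table. */
	assert(hstat_drv_name(tbl, n, HSTAT_ID_RX_PKTS_512_TO_1023) == NULL);
	/* Private IDs resolve to the driver-provided string. */
	assert(strcmp(hstat_drv_name(tbl, n, HSTAT_ID_DRV_BASE + 1),
		      "rx_cache_misses") == 0);
}
```

The trade-off in this sketch is that the private range still falls back to strings for naming, so it does not by itself stop drivers from inventing many names for the same counter; it only keeps the standard range clean.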