netdev.vger.kernel.org archive mirror
From: Florian Fainelli <f.fainelli@gmail.com>
To: Jakub Kicinski <jakub.kicinski@netronome.com>, davem@davemloft.net
Cc: oss-drivers@netronome.com, netdev@vger.kernel.org,
	jiri@resnulli.us, andrew@lunn.ch, mkubecek@suse.cz,
	dsahern@gmail.com, simon.horman@netronome.com,
	jesse.brandeburg@intel.com, maciejromanfijalkowski@gmail.com,
	vasundhara-v.volam@broadcom.com, michael.chan@broadcom.com,
	shalomt@mellanox.com, idosch@mellanox.com
Subject: Re: [RFC 00/14] netlink/hierarchical stats
Date: Wed, 6 Feb 2019 12:12:39 -0800
Message-ID: <c5d98b16-dbf7-25a3-3bc5-a9d5ebca503e@gmail.com>
In-Reply-To: <20190128234507.32028-1-jakub.kicinski@netronome.com>

On 1/28/19 3:44 PM, Jakub Kicinski wrote:
> Hi!
> 
> As I tried to explain in my slides at netconf 2018 we are lacking
> an expressive, standard API to report device statistics.
> 
> Networking silicon generally maintains some IEEE 802.3 and/or RMON
> statistics.  Today those all end up in ethtool -S.  Here is a simple
> attempt (admittedly very imprecise) of counting how many names driver
> authors invented for IETF RFC2819 etherStatsPkts512to1023Octets
> statistics (RX and TX):
> 
> $ git grep '".*512.*1023.*"' -- drivers/net/ | \
>     sed -e 's/.*"\(.*\)".*/\1/' | sort | uniq | wc -l
> 63
> 
> Interestingly only two drivers in the tree use the name the standard
> gave us (etherStatsPkts512to1023, modulo case).
> 
> I set out to working on this set in an attempt to give drivers a way
> to express clearly to user space standard-compliant counters.
> 
> Second most common use for custom statistics is per-queue counters.
> This is where the "hierarchical" part of this set comes in, as
> groups can be nested, and user space tools can handle the aggregation
> inside the groups if needed.
> 
> This set also tries to address the problem of users not knowing if
> a statistic is reported by hardware or the driver.  Many modern drivers
> use some prefix in ethtool -S to indicate MAC/PHY stats.  At a quick
> glance: Netronome uses "mac.", Intel "port." and Mellanox "_phy".
> In this set, netlink attributes describe whether a group of statistics
> is RX or TX, maintained by device or driver.
> 
> The purpose of this patch set is _not_ to replace ethtool -S.  It is
> an incredibly useful tool, and we will certainly continue using it.
> However, for standard-based and commonly maintained statistics a more
> structured API seems warranted.
> 
> There are two things missing from these patches, which I initially
> planned to address as well: filtering, and refresh rate control.
> 
> Filtering doesn't need much explanation, users should be able to request
> only a subset of statistics (like only SW stats or only given ID).  The
> bitmap of statistics in each group is there for filtering later on.
> 
> By refresh control I mean the ability for user space to indicate how
> "fresh" values it expects.  Sometimes reading the HW counters requires
> slow register reads or FW communication, in such cases drivers may cache
> the result.  (Privileged) user space should be able to add a "not older
> than" timestamp to indicate how fresh statistics it expects.  And vice
> versa, drivers can then also put the timestamp of when the statistics
> were last refreshed in the dump for more precise bandwidth estimation.

Another thing that we cannot quite do with ethtool right now, at least
not easily, is something like the following use case.

You have some filtering/classification capable hardware, and the HW can
count the number of times a rule has been hit/missed. The number of
rules programmed into the HW is dynamic and depends on the use case, so
dumping them all is not convenient when there are, e.g., hundreds or
thousands of rules.

You would want to return only the rules that are active/enabled, and not
the full possible range of rules. With ethtool, this is not possible
because you have to define the strings first, and only in a second call
do you get the dump and fill in the data returned to user space.
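
To make the limitation concrete, here is a rough user-space model of the
two approaches. It is only a sketch: MAX_RULES, active_rules and all three
function names are made up for illustration and are not kernel or ethtool
API. The first two functions mimic the "strings first, values second"
flow; the third mimics a single dump that describes only the active rules.

```python
# Toy user-space model; every name here is hypothetical, not kernel API.
from typing import Dict, List, Tuple

MAX_RULES = 1024  # assumed size of the hardware rule table

# Assumed hardware state: rule index -> hit counter, programmed rules only.
active_rules: Dict[int, int] = {3: 120, 17: 0, 900: 45}


def get_strings() -> List[str]:
    """First call: name every possible slot, since the layout is fixed."""
    return [f"rule_{i}_hits" for i in range(MAX_RULES)]


def get_stats() -> List[int]:
    """Second call: one value per name, in the same order as get_strings()."""
    return [active_rules.get(i, 0) for i in range(MAX_RULES)]


def dump_active() -> List[Tuple[int, int]]:
    """Alternative: one self-describing dump of only the active rules."""
    return sorted(active_rules.items())


if __name__ == "__main__":
    names, values = get_strings(), get_stats()
    print(len(names), len(values))  # both equal MAX_RULES
    print(dump_active())            # only the three programmed rules
```

In this model the fixed-layout dump always carries MAX_RULES entries, and
a programmed rule with zero hits (index 17) looks the same as an
unprogrammed slot, whereas the sparse dump carries three entries and keeps
that distinction.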

I will review more in depth, but the idea looks great so far.

> 
> Jakub Kicinski (14):
>   nfp: remove unused structure
>   nfp: constify parameter to nfp_port_from_netdev()
>   net: hstats: add basic/core functionality
>   net: hstats: allow hierarchies to be built
>   nfp: very basic hstat support
>   net: hstats: allow iterators
>   net: hstats: help in iteration over directions
>   nfp: hstats: make use of iteration for direction
>   nfp: hstats: add driver and device per queue statistics
>   net: hstats: add IEEE 802.3 and common IETF MIB/RMON stats
>   nfp: hstats: add IEEE/RMON ethernet port/MAC stats
>   net: hstats: add markers for partial groups
>   nfp: hstats: add a partial group of per-8021Q prio stats
>   Documentation: networking: describe new hstat API
> 
>  Documentation/networking/hstats.rst           | 590 +++++++++++++++
>  .../networking/hstats_flow_example.dot        |  11 +
>  Documentation/networking/index.rst            |   1 +
>  drivers/net/ethernet/netronome/nfp/Makefile   |   1 +
>  .../net/ethernet/netronome/nfp/nfp_hstat.c    | 474 ++++++++++++
>  drivers/net/ethernet/netronome/nfp/nfp_main.c |   1 +
>  drivers/net/ethernet/netronome/nfp/nfp_main.h |   2 +
>  drivers/net/ethernet/netronome/nfp/nfp_net.h  |  10 +-
>  .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +
>  .../net/ethernet/netronome/nfp/nfp_net_repr.h |   2 +-
>  drivers/net/ethernet/netronome/nfp/nfp_port.c |   2 +-
>  drivers/net/ethernet/netronome/nfp/nfp_port.h |   2 +-
>  include/linux/netdevice.h                     |   9 +
>  include/net/hstats.h                          | 176 +++++
>  include/uapi/linux/if_link.h                  | 107 +++
>  net/core/Makefile                             |   2 +-
>  net/core/hstats.c                             | 682 ++++++++++++++++++
>  net/core/rtnetlink.c                          |  21 +
>  18 files changed, 2084 insertions(+), 10 deletions(-)
>  create mode 100644 Documentation/networking/hstats.rst
>  create mode 100644 Documentation/networking/hstats_flow_example.dot
>  create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_hstat.c
>  create mode 100644 include/net/hstats.h
>  create mode 100644 net/core/hstats.c
> 


-- 
Florian
