From: Zhu Yanjun <zyjzyj2000@gmail.com>
To: Joe Damato <jdamato@fastly.com>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
tariqt@nvidia.com, saeedm@nvidia.com
Cc: gal@nvidia.com, nalramli@fastly.com,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
"open list:MELLANOX MLX5 core VPI driver"
<linux-rdma@vger.kernel.org>, Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
Date: Fri, 3 May 2024 12:55:41 +0200 [thread overview]
Message-ID: <c3f4f1a4-303d-4d57-ae83-ed52e5a08f69@linux.dev> (raw)
In-Reply-To: <20240503022549.49852-1-jdamato@fastly.com>
On 03.05.24 04:25, Joe Damato wrote:
> Hi:
>
> This is only 1 patch, so I know a cover letter isn't necessary, but it
> seems there are a few things to mention.
>
> This change adds support for the per queue netdev-genl API to mlx5,
> which seems to output stats:
>
> ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
> --dump qstats-get --json '{"scope": "queue"}'
>
> ...snip
> {'ifindex': 7,
> 'queue-id': 28,
> 'queue-type': 'tx',
> 'tx-bytes': 399462,
> 'tx-packets': 3311},
> ...snip
Ethtool -S ethx can get the above information
"
...
tx-0.packets: 2094
tx-0.bytes: 294141
rx-0.packets: 2200
rx-0.bytes: 267673
...
"
>
> I've tried to use the tooling suggested to verify that the per queue
> stats match the rtnl stats by doing this:
>
> NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
>
> And the tool outputs that there is a failure:
>
> # Exception| Exception: Qstats are lower, fetched later
> not ok 3 stats.pkt_byte_sum
With ethtool, does the above problem still occur?
Zhu Yanjun
>
> The other tests all pass (including stats.qstat_by_ifindex).
>
> This appears to mean that the netdev-genl queue stats have lower numbers
> than the rtnl stats even though the rtnl stats are fetched first. I
> added some debugging and found that both rx and tx bytes and packets are
> slightly lower.
>
> The only explanations I can think of for this are:
>
> 1. tx_ptp_opened and rx_ptp_opened are both true, in which case
> mlx5e_fold_sw_stats64 adds bytes and packets to the rtnl struct and
> might account for the difference. I skip this case in my
> implementation, so that could certainly explain it.
> 2. Maybe I'm just misunderstanding how stats aggregation works in mlx5,
> and that's why the numbers are slightly off?
>
> It appears that the driver uses a workqueue to queue stats updates which
> happen periodically.
>
> 0. the driver occasionally calls queue_work on the update_stats_work
> workqueue.
> 1. This eventually calls MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw),
> in drivers/net/ethernet/mellanox/mlx5/core/en_stats.c, which appears
> to begin by first memsetting the internal stats struct where stats are
> aggregated to zero. This would mean, I think, the get_base_stats
> netdev-genl API implementation that I have is correct: simply set
> everything to 0.... otherwise we'd end up double counting in the
> netdev-genl RX and TX handlers.
> 2. Next, each of the stats helpers are called to collect stats into the
> freshly 0'd internal struct (for example:
> mlx5e_stats_grp_sw_update_stats_rq_stats).
>
> That seems to be how stats are aggregated, which would suggest that if I
> simply .... do what I'm doing in this change the numbers should line up.
>
> But they don't and its either because of PTP or because I am
> misunderstanding/doing something wrong.
>
> Maybe the MLNX folks can suggest a hint?
>
> Thanks,
> Joe
>
> Joe Damato (1):
> net/mlx5e: Add per queue netdev-genl stats
>
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
> 1 file changed, 68 insertions(+)
>
next prev parent reply other threads:[~2024-05-03 10:55 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-03 2:25 [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
2024-05-03 2:25 ` [PATCH net-next 1/1] net/mlx5e: Add per queue netdev-genl stats Joe Damato
2024-05-03 10:55 ` Zhu Yanjun [this message]
2024-05-03 18:43 ` [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
2024-05-03 21:58 ` Jakub Kicinski
2024-05-03 23:53 ` Joe Damato
2024-05-04 0:34 ` Jakub Kicinski
2024-05-06 18:04 ` Joe Damato
2024-05-08 21:40 ` Tariq Toukan
2024-05-08 23:24 ` Joe Damato
2024-05-09 0:56 ` Jakub Kicinski
2024-05-09 1:57 ` Joe Damato
2024-05-09 2:08 ` Jakub Kicinski
2024-05-09 4:11 ` Joe Damato
2024-05-09 6:30 ` Joe Damato
2024-05-09 10:16 ` Tariq Toukan
2024-05-09 23:14 ` Joe Damato
2024-05-10 0:31 ` Joe Damato
2024-05-10 4:27 ` Joe Damato
2024-05-09 9:42 ` Tariq Toukan
2024-05-09 23:06 ` Joe Damato
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c3f4f1a4-303d-4d57-ae83-ed52e5a08f69@linux.dev \
--to=zyjzyj2000@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=gal@nvidia.com \
--cc=jdamato@fastly.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=nalramli@fastly.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=saeedm@nvidia.com \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.