From: Zhu Yanjun <zyjzyj2000@gmail.com>
To: Joe Damato <jdamato@fastly.com>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
tariqt@nvidia.com, saeedm@nvidia.com
Cc: gal@nvidia.com, nalramli@fastly.com,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
"open list:MELLANOX MLX5 core VPI driver"
<linux-rdma@vger.kernel.org>, Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
Date: Fri, 3 May 2024 12:55:41 +0200 [thread overview]
Message-ID: <c3f4f1a4-303d-4d57-ae83-ed52e5a08f69@linux.dev> (raw)
In-Reply-To: <20240503022549.49852-1-jdamato@fastly.com>
On 03.05.24 04:25, Joe Damato wrote:
> Hi:
>
> This is only 1 patch, so I know a cover letter isn't necessary, but it
> seems there are a few things to mention.
>
> This change adds support for the per queue netdev-genl API to mlx5,
> which seems to output stats:
>
> ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
> --dump qstats-get --json '{"scope": "queue"}'
>
> ...snip
> {'ifindex': 7,
> 'queue-id': 28,
> 'queue-type': 'tx',
> 'tx-bytes': 399462,
> 'tx-packets': 3311},
> ...snip
Ethtool -S ethx can get the above information
"
...
tx-0.packets: 2094
tx-0.bytes: 294141
rx-0.packets: 2200
rx-0.bytes: 267673
...
"
>
> I've tried to use the tooling suggested to verify that the per queue
> stats match the rtnl stats by doing this:
>
> NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
>
> And the tool outputs that there is a failure:
>
> # Exception| Exception: Qstats are lower, fetched later
> not ok 3 stats.pkt_byte_sum
With ethtool, does the above problem still occur?
Zhu Yanjun
>
> The other tests all pass (including stats.qstat_by_ifindex).
>
> This appears to mean that the netdev-genl queue stats have lower numbers
> than the rtnl stats even though the rtnl stats are fetched first. I
> added some debugging and found that both rx and tx bytes and packets are
> slightly lower.
>
> The only explanations I can think of for this are:
>
> 1. tx_ptp_opened and rx_ptp_opened are both true, in which case
> mlx5e_fold_sw_stats64 adds bytes and packets to the rtnl struct and
> might account for the difference. I skip this case in my
> implementation, so that could certainly explain it.
> 2. Maybe I'm just misunderstanding how stats aggregation works in mlx5,
> and that's why the numbers are slightly off?
>
> It appears that the driver uses a workqueue to queue stats updates which
> happen periodically.
>
> 0. the driver occasionally calls queue_work on the update_stats_work
> workqueue.
> 1. This eventually calls MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw),
> in drivers/net/ethernet/mellanox/mlx5/core/en_stats.c, which appears
> to begin by first memsetting the internal stats struct where stats are
> aggregated to zero. This would mean, I think, the get_base_stats
> netdev-genl API implementation that I have is correct: simply set
> everything to 0.... otherwise we'd end up double counting in the
> netdev-genl RX and TX handlers.
> 2. Next, each of the stats helpers are called to collect stats into the
> freshly 0'd internal struct (for example:
> mlx5e_stats_grp_sw_update_stats_rq_stats).
>
> That seems to be how stats are aggregated, which would suggest that if I
> simply .... do what I'm doing in this change the numbers should line up.
>
> But they don't and its either because of PTP or because I am
> misunderstanding/doing something wrong.
>
> Maybe the MLNX folks can suggest a hint?
>
> Thanks,
> Joe
>
> Joe Damato (1):
> net/mlx5e: Add per queue netdev-genl stats
>
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
> 1 file changed, 68 insertions(+)
>
next prev parent reply other threads:[~2024-05-03 10:55 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-03 2:25 [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
2024-05-03 2:25 ` [PATCH net-next 1/1] net/mlx5e: Add per queue netdev-genl stats Joe Damato
2024-05-03 10:55 ` Zhu Yanjun [this message]
2024-05-03 18:43 ` [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
2024-05-03 21:58 ` Jakub Kicinski
2024-05-03 23:53 ` Joe Damato
2024-05-04 0:34 ` Jakub Kicinski
2024-05-06 18:04 ` Joe Damato
2024-05-08 21:40 ` Tariq Toukan
2024-05-08 23:24 ` Joe Damato
2024-05-09 0:56 ` Jakub Kicinski
2024-05-09 1:57 ` Joe Damato
2024-05-09 2:08 ` Jakub Kicinski
2024-05-09 4:11 ` Joe Damato
2024-05-09 6:30 ` Joe Damato
2024-05-09 10:16 ` Tariq Toukan
2024-05-09 23:14 ` Joe Damato
2024-05-10 0:31 ` Joe Damato
2024-05-10 4:27 ` Joe Damato
2024-05-09 9:42 ` Tariq Toukan
2024-05-09 23:06 ` Joe Damato
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c3f4f1a4-303d-4d57-ae83-ed52e5a08f69@linux.dev \
--to=zyjzyj2000@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=gal@nvidia.com \
--cc=jdamato@fastly.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=nalramli@fastly.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=saeedm@nvidia.com \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).