* [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Joe Damato @ 2024-05-29  3:16 UTC
  To: linux-kernel, netdev
  Cc: nalramli, Joe Damato, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Richard Cochran, Saeed Mahameed, Tariq Toukan

Greetings:

Switching to an RFC instead of a PATCH because even though Tariq
patiently explained the code to me, I'm sure I probably still missed
something ;)

If this turns out to be right and Tariq agrees, I can send a PATCH
net-next v4.

This change adds support for the per queue netdev-genl API to mlx5,
which can be queried for per queue stats like this:

./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
         --dump qstats-get --json '{"scope": "queue"}'

...snip
 {'ifindex': 7,
  'queue-id': 28,
  'queue-type': 'tx',
  'tx-bytes': 399462,
  'tx-packets': 3311},
...snip

I've used the suggested tooling to verify the per queue stats match
rtnl by doing this:

  NETIF=eth0 tools/testing/selftests/drivers/net/stats.py

I've tested the following scenarios:
  - The machine at boot (default queue configuration)
  - Adjusting the queue configuration to various amounts via ethtool
  - Add mqprio TCs
  - Removing the mqprio TCs

and in each scenario the stats script above reports that the stats match
rtnl.

Worth noting that Tariq suggested I also export HTB/QOS stats in
mlx5e_get_base_stats.

I am open to doing this, but I think if I were to do that, HTB/QOS queue
stats should also be exported by rtnl so that the script above will
continue to show that the output is correct.

I'd like to propose adding HTB/QOS stats to both rtnl *and* the
netdev-genl code together in a later series, separate from this
change.

Thanks,
Joe

v2 -> rfcv3:
 - Added patch 1/2 which creates some helpers for computing the txq_ix
   and ch_ix/tc_ix.

 - Patch 2/2 modified in several ways:
   - Fixed variable declarations in mlx5e_get_queue_stats_rx to be at
     the start of the function.
   - mlx5e_get_queue_stats_tx rewritten to access sq stats directly by
     using the helpers added in the previous patch.
   - mlx5e_get_base_stats modified in several ways:
     - Took the state_lock when accessing priv->channels.
     - For the base RX stats, code was simplified to call
       mlx5e_get_queue_stats_rx instead of repeating the same code.
     - For the base TX stats, I attempted to implement what I think
       Tariq suggested in the previous thread:
         - for available channels, only unavailable TC stats are summed
         - for unavailable channels, all stats for TCs up to
           max_opened_tc are summed.

v1 -> v2:
  - Essentially a full rewrite after comments from Jakub, Tariq, and
    Zhu.

Joe Damato (2):
  net/mlx5e: Add helpers to calculate txq and ch idx
  net/mlx5e: Add per queue netdev-genl stats

 .../net/ethernet/mellanox/mlx5/core/en_main.c | 150 +++++++++++++++++-
 1 file changed, 149 insertions(+), 1 deletion(-)

-- 
2.25.1



* [RFC net-next v3 1/2] net/mlx5e: Add helpers to calculate txq and ch idx
From: Joe Damato @ 2024-05-29  3:16 UTC
  To: linux-kernel, netdev
  Cc: nalramli, Joe Damato, Saeed Mahameed, Tariq Toukan,
	Leon Romanovsky, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, open list:MELLANOX MLX5 core VPI driver

Add two helpers to:

1. Compute the txq_ix given a channel and a tc offset (tc_to_txq_ix).
2. Compute the channel index and tc offset given a txq_ix
   (txq_ix_to_chtc_ix).

The first helper, tc_to_txq_ix, is used in place of the mathematical
expression in mlx5e_open_sqs when txq_ix values are computed.

The second helper, txq_ix_to_chtc_ix, will be used in a following patch.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c  | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b758bc72ac36..ce15805ad55a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2312,6 +2312,22 @@ static int mlx5e_txq_get_qos_node_hw_id(struct mlx5e_params *params, int txq_ix,
 	return 0;
 }
 
+static inline int tc_to_txq_ix(struct mlx5e_channel *c,
+			       struct mlx5e_params *params,
+			       int tc)
+{
+	return c->ix + tc * params->num_channels;
+}
+
+static inline void txq_ix_to_chtc_ix(struct mlx5e_params *params, int txq_ix,
+				     int *ch_ix, int *tc_ix)
+{
+	if (params->num_channels) {
+		*ch_ix = txq_ix % params->num_channels;
+		*tc_ix = txq_ix / params->num_channels;
+	}
+}
+
 static int mlx5e_open_sqs(struct mlx5e_channel *c,
 			  struct mlx5e_params *params,
 			  struct mlx5e_channel_param *cparam)
@@ -2319,7 +2335,7 @@ static int mlx5e_open_sqs(struct mlx5e_channel *c,
 	int err, tc;
 
 	for (tc = 0; tc < mlx5e_get_dcb_num_tc(params); tc++) {
-		int txq_ix = c->ix + tc * params->num_channels;
+		int txq_ix = tc_to_txq_ix(c, params, tc);
 		u32 qos_queue_group_id;
 		u32 tisn;
 
-- 
2.25.1



* [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
From: Joe Damato @ 2024-05-29  3:16 UTC
  To: linux-kernel, netdev
  Cc: nalramli, Joe Damato, Saeed Mahameed, Tariq Toukan,
	Leon Romanovsky, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Richard Cochran,
	open list:MELLANOX MLX5 core VPI driver

Add functions to support the netdev-genl per queue stats API.

./cli.py --spec netlink/specs/netdev.yaml \
         --dump qstats-get --json '{"scope": "queue"}'

...snip

 {'ifindex': 7,
  'queue-id': 62,
  'queue-type': 'rx',
  'rx-alloc-fail': 0,
  'rx-bytes': 105965251,
  'rx-packets': 179790},
 {'ifindex': 7,
  'queue-id': 0,
  'queue-type': 'tx',
  'tx-bytes': 9402665,
  'tx-packets': 17551},

...snip

Also tested with the script tools/testing/selftests/drivers/net/stats.py
in several scenarios to ensure stats tallying was correct:

- on boot (default queue counts)
- adjusting queue count up or down (ethtool -L eth0 combined ...)
- adding mqprio TCs

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 132 ++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index ce15805ad55a..515c16a88a6c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -39,6 +39,7 @@
 #include <linux/debugfs.h>
 #include <linux/if_bridge.h>
 #include <linux/filter.h>
+#include <net/netdev_queues.h>
 #include <net/page_pool/types.h>
 #include <net/pkt_sched.h>
 #include <net/xdp_sock_drv.h>
@@ -5293,6 +5294,136 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
 	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
 }
 
+static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
+				     struct netdev_queue_stats_rx *stats)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5e_channel_stats *channel_stats;
+	struct mlx5e_rq_stats *xskrq_stats;
+	struct mlx5e_rq_stats *rq_stats;
+
+	if (mlx5e_is_uplink_rep(priv))
+		return;
+
+	channel_stats = priv->channel_stats[i];
+	xskrq_stats = &channel_stats->xskrq;
+	rq_stats = &channel_stats->rq;
+
+	stats->packets = rq_stats->packets + xskrq_stats->packets;
+	stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
+	stats->alloc_fail = rq_stats->buff_alloc_err +
+			    xskrq_stats->buff_alloc_err;
+}
+
+static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
+				     struct netdev_queue_stats_tx *stats)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5e_channel_stats *channel_stats;
+	struct mlx5e_sq_stats *sq_stats;
+	int ch_ix, tc_ix;
+
+	mutex_lock(&priv->state_lock);
+	txq_ix_to_chtc_ix(&priv->channels.params, i, &ch_ix, &tc_ix);
+	mutex_unlock(&priv->state_lock);
+
+	channel_stats = priv->channel_stats[ch_ix];
+	sq_stats = &channel_stats->sq[tc_ix];
+
+	stats->packets = sq_stats->packets;
+	stats->bytes = sq_stats->bytes;
+}
+
+static void mlx5e_get_base_stats(struct net_device *dev,
+				 struct netdev_queue_stats_rx *rx,
+				 struct netdev_queue_stats_tx *tx)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	int i, j;
+
+	if (!mlx5e_is_uplink_rep(priv)) {
+		rx->packets = 0;
+		rx->bytes = 0;
+		rx->alloc_fail = 0;
+
+		/* compute stats for deactivated RX queues
+		 *
+		 * if priv->channels.num == 0 the device is down, so compute
+		 * stats for every queue.
+		 *
+		 * otherwise, compute only the queues which have been deactivated.
+		 */
+		mutex_lock(&priv->state_lock);
+		if (priv->channels.num == 0)
+			i = 0;
+		else
+			i = priv->channels.params.num_channels;
+		mutex_unlock(&priv->state_lock);
+
+		for (; i < priv->stats_nch; i++) {
+			struct netdev_queue_stats_rx rx_i = {0};
+
+			mlx5e_get_queue_stats_rx(dev, i, &rx_i);
+
+			rx->packets += rx_i.packets;
+			rx->bytes += rx_i.bytes;
+			rx->alloc_fail += rx_i.alloc_fail;
+		}
+
+		if (priv->rx_ptp_opened) {
+			struct mlx5e_rq_stats *rq_stats = &priv->ptp_stats.rq;
+
+			rx->packets += rq_stats->packets;
+			rx->bytes += rq_stats->bytes;
+		}
+	}
+
+	tx->packets = 0;
+	tx->bytes = 0;
+
+	mutex_lock(&priv->state_lock);
+	for (i = 0; i < priv->stats_nch; i++) {
+		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
+
+		/* while iterating through all channels [0, stats_nch), there
+		 * are two cases to handle:
+		 *
+		 *  1. the channel is available, so sum only the unavailable TCs
+		 *     [mlx5e_get_dcb_num_tc, max_opened_tc).
+		 *
+		 *  2. the channel is unavailable, so sum all TCs [0, max_opened_tc).
+		 */
+		if (i < priv->channels.params.num_channels) {
+			j = mlx5e_get_dcb_num_tc(&priv->channels.params);
+		} else {
+			j = 0;
+		}
+
+		for (; j < priv->max_opened_tc; j++) {
+			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
+
+			tx->packets += sq_stats->packets;
+			tx->bytes += sq_stats->bytes;
+		}
+	}
+	mutex_unlock(&priv->state_lock);
+
+	if (priv->tx_ptp_opened) {
+		for (j = 0; j < priv->max_opened_tc; j++) {
+			struct mlx5e_sq_stats *sq_stats = &priv->ptp_stats.sq[j];
+
+			tx->packets    += sq_stats->packets;
+			tx->bytes      += sq_stats->bytes;
+		}
+	}
+}
+
+static const struct netdev_stat_ops mlx5e_stat_ops = {
+	.get_queue_stats_rx     = mlx5e_get_queue_stats_rx,
+	.get_queue_stats_tx     = mlx5e_get_queue_stats_tx,
+	.get_base_stats         = mlx5e_get_base_stats,
+};
+
 static void mlx5e_build_nic_netdev(struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
@@ -5310,6 +5441,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 
 	netdev->watchdog_timeo    = 15 * HZ;
 
+	netdev->stat_ops          = &mlx5e_stat_ops;
 	netdev->ethtool_ops	  = &mlx5e_ethtool_ops;
 
 	netdev->vlan_features    |= NETIF_F_SG;
-- 
2.25.1



* Re: [RFC net-next v3 1/2] net/mlx5e: Add helpers to calculate txq and ch idx
From: Jacob Keller @ 2024-05-30 21:15 UTC
  To: Joe Damato, linux-kernel, netdev
  Cc: nalramli, Saeed Mahameed, Tariq Toukan, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX5 core VPI driver



On 5/28/2024 8:16 PM, Joe Damato wrote:
> Add two helpers to:
> 
> 1. Compute the txq_ix given a channel and a tc offset (tc_to_txq_ix).
> 2. Compute the channel index and tc offset given a txq_ix
>    (txq_ix_to_chtc_ix).
> 
> The first helper, tc_to_txq_ix, is used in place of the mathematical
> expression in mlx5e_open_sqs when txq_ix values are computed.
> 
> The second helper, txq_ix_to_chtc_ix, will be used in a following patch.
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>


* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
From: Jacob Keller @ 2024-05-30 21:16 UTC
  To: Joe Damato, linux-kernel, netdev
  Cc: nalramli, Saeed Mahameed, Tariq Toukan, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran, open list:MELLANOX MLX5 core VPI driver



On 5/28/2024 8:16 PM, Joe Damato wrote:
> Add functions to support the netdev-genl per queue stats API.
> 
> ./cli.py --spec netlink/specs/netdev.yaml \
>          --dump qstats-get --json '{"scope": "queue"}'
> 
> ...snip
> 
>  {'ifindex': 7,
>   'queue-id': 62,
>   'queue-type': 'rx',
>   'rx-alloc-fail': 0,
>   'rx-bytes': 105965251,
>   'rx-packets': 179790},
>  {'ifindex': 7,
>   'queue-id': 0,
>   'queue-type': 'tx',
>   'tx-bytes': 9402665,
>   'tx-packets': 17551},
> 
> ...snip
> 
> Also tested with the script tools/testing/selftests/drivers/net/stats.py
> in several scenarios to ensure stats tallying was correct:
> 
> - on boot (default queue counts)
> - adjusting queue count up or down (ethtool -L eth0 combined ...)
> - adding mqprio TCs
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Jakub Kicinski @ 2024-05-31  0:11 UTC
  To: Joe Damato
  Cc: linux-kernel, netdev, nalramli, David S. Miller, Eric Dumazet,
	Leon Romanovsky, open list:MELLANOX MLX5 core VPI driver,
	Paolo Abeni, Richard Cochran, Saeed Mahameed, Tariq Toukan

On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote:
> Worth noting that Tariq suggested I also export HTB/QOS stats in
> mlx5e_get_base_stats.

Why to base, and not report them as queue stats?

Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
../mlx5/core/en/htb.c it seems that the driver will update the
real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
it seems like the inverse calculation is:

i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)

But really, isn't it enough to use priv->txq2sq[i] for the active
queues, since you've already covered the non-active ones?
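
Roughly, the inverse as a helper could look like this (untested
sketch, name made up; assuming the same is_ptp pflag check that
mlx5e_qid_from_qos() does):

  static int mlx5e_qos_qid_from_txq(struct mlx5e_channels *chs, int txq_ix)
  {
  	bool is_ptp = MLX5E_GET_PFLAG(&chs->params, MLX5E_PFLAG_TX_PORT_TS);

  	/* undo mlx5e_qid_from_qos(): strip the regular (+ ptp) txqs */
  	return txq_ix - (chs->params.num_channels + is_ptp) *
  			mlx5e_get_dcb_num_tc(&chs->params);
  }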

> I am open to doing this, but I think if I were to do that, HTB/QOS queue
> stats should also be exported by rtnl so that the script above will
> continue to show that the output is correct.
> 
> I'd like to propose adding HTB/QOS stats to both rtnl *and* the
> netdev-genl code together in a later series, separate from this
> change.

Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Joe Damato @ 2024-05-31  0:15 UTC
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, nalramli, David S. Miller, Eric Dumazet,
	Leon Romanovsky, open list:MELLANOX MLX5 core VPI driver,
	Paolo Abeni, Richard Cochran, Saeed Mahameed, Tariq Toukan

On Thu, May 30, 2024 at 05:11:28PM -0700, Jakub Kicinski wrote:
> On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote:
> > Worth noting that Tariq suggested I also export HTB/QOS stats in
> > mlx5e_get_base_stats.
> 
> Why to base, and not report them as queue stats?
> 
> Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> ../mlx5/core/en/htb.c it seems that the driver will update the
> real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> it seems like the inverse calculation is:
> 
> i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> 
> But really, isn't it enough to use priv->txq2sq[i] for the active
> queues, since you've already covered the non-active ones?

This is what I proposed in the thread for the v2, but Tariq
suggested a different approach he liked more, please see this
message for more details:

  https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/

I attempted to implement option 1 as he described in his message.
 
> > I am open to doing this, but I think if I were to do that, HTB/QOS queue
> > stats should also be exported by rtnl so that the script above will
> > continue to show that the output is correct.
> > 
> > I'd like to propose adding HTB/QOS stats to both rtnl *and* the
> > netdev-genl code together in a later series, separate from this
> > change.
> 
> Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.

As far as I can tell (and I could be wrong) I didn't see them
included in the rtnl stats, but I'll take another look now.


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Jakub Kicinski @ 2024-05-31  0:39 UTC
  To: Joe Damato, Saeed Mahameed
  Cc: linux-kernel, netdev, nalramli, David S. Miller, Eric Dumazet,
	Leon Romanovsky, open list:MELLANOX MLX5 core VPI driver,
	Paolo Abeni, Richard Cochran, Tariq Toukan

On Thu, 30 May 2024 17:15:25 -0700 Joe Damato wrote:
> > Why to base, and not report them as queue stats?
> > 
> > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> > ../mlx5/core/en/htb.c it seems that the driver will update the
> > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> > it seems like the inverse calculation is:
> > 
> > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> > 
> > But really, isn't it enough to use priv->txq2sq[i] for the active
> > queues, since you've already covered the non-active ones?
> 
> This is what I proposed in the thread for the v2, but Tariq
> suggested a different approach he liked more, please see this
> message for more details:
> 
>   https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/
> 
> I attempted to implement option 1 as he described in his message.

I see, although it sounds like option 2 would also work.

Saeed, can you shed any light here? I'm worried Tariq is already AFK
for the weekend and we'll make no progress until Monday...


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Joe Damato @ 2024-05-31  0:56 UTC
  To: Jakub Kicinski
  Cc: Saeed Mahameed, linux-kernel, netdev, nalramli, David S. Miller,
	Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Richard Cochran, Tariq Toukan

On Thu, May 30, 2024 at 05:39:02PM -0700, Jakub Kicinski wrote:
> On Thu, 30 May 2024 17:15:25 -0700 Joe Damato wrote:
> > > Why to base, and not report them as queue stats?
> > > 
> > > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> > > ../mlx5/core/en/htb.c it seems that the driver will update the
> > > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> > > it seems like the inverse calculation is:
> > > 
> > > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> > > 
> > > But really, isn't it enough to use priv->txq2sq[i] for the active
> > > queues, since you've already covered the non-active ones?
> > 
> > This is what I proposed in the thread for the v2, but Tariq
> > suggested a different approach he liked more, please see this
> > message for more details:
> > 
> >   https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/
> > 
> > I attempted to implement option 1 as he described in his message.
> 
> I see, although it sounds like option 2 would also work.

I don't really mind either way; from Tariq's message it sounded like
he preferred option 1, so I tried to implement that, thinking it would be
my best bet at getting this done.

If option 2 is easier/preferred for some reason... it seems like
(other than the locking I forgot to include) the base implementation
in v2 was correct and I could use what I proposed in the thread for
the tx stats, which was:

  mutex_lock(&priv->state_lock);
  if (priv->channels.num > 0) {
          sq = priv->txq2sq[i];
          stats->packets = sq->stats->packets;
          stats->bytes = sq->stats->bytes;
  }
  mutex_unlock(&priv->state_lock);

And I would have implemented option 2... IIUC.


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Joe Damato @ 2024-05-31  1:07 UTC
  To: Jakub Kicinski, linux-kernel, netdev, nalramli, David S. Miller,
	Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Richard Cochran, Saeed Mahameed, Tariq Toukan

On Thu, May 30, 2024 at 05:15:25PM -0700, Joe Damato wrote:
> On Thu, May 30, 2024 at 05:11:28PM -0700, Jakub Kicinski wrote:
> > On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote:
> > > Worth noting that Tariq suggested I also export HTB/QOS stats in
> > > mlx5e_get_base_stats.
> > 
> > Why to base, and not report them as queue stats?
> > 
> > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> > ../mlx5/core/en/htb.c it seems that the driver will update the
> > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> > it seems like the inverse calculation is:
> > 
> > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> > 
> > But really, isn't it enough to use priv->txq2sq[i] for the active
> > queues, since you've already covered the non-active ones?
> 
> This is what I proposed in the thread for the v2, but Tariq
> suggested a different approach he liked more, please see this
> message for more details:
> 
>   https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/
> 
> I attempted to implement option 1 as he described in his message.
>  
> > > I am open to doing this, but I think if I were to do that, HTB/QOS queue
> > > stats should also be exported by rtnl so that the script above will
> > > continue to show that the output is correct.
> > > 
> > > I'd like to propose adding HTB/QOS stats to both rtnl *and* the
> > > netdev-genl code together in a later series, separate from this
> > > change.
> > 
> > Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.
> 
> As far as I can tell (and I could be wrong) I didn't see them
> included in the rtnl stats, but I'll take another look now.

I looked, and it seems the htb stats are included in ethtool's
output: in mlx5/core/en_stats.c, mlx5e_stats_grp_sw_update_stats_qos
rolls up the htb stats.

However, the rtnl stats seem to be computed in mlx5/core/en_main.c
via mlx5e_get_stats calling mlx5e_fold_sw_stats64, which appears to
gather stats for rx, tx, and ptp, but not for qos/htb.

Unless I am missing something, I think mlx5e_fold_sw_stats64 would
need code similar to mlx5e_stats_grp_sw_update_stats_qos and then
rtnl would account for htb stats.
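
If I were to wire that up, I imagine it'd be roughly like the below
(completely untested sketch; s being the rtnl_link_stats64 that
mlx5e_fold_sw_stats64 fills in, and assuming the htb_max_qos_sqs /
htb_qos_sq_stats fields that mlx5e_stats_grp_sw_update_stats_qos
reads):

  struct mlx5e_sq_stats **stats;
  u16 max_qos_sqs;
  int i;

  /* pairs with the smp_store_release() when QOS SQs are opened */
  max_qos_sqs = smp_load_acquire(&priv->htb_max_qos_sqs);
  stats = READ_ONCE(priv->htb_qos_sq_stats);

  for (i = 0; i < max_qos_sqs; i++) {
          struct mlx5e_sq_stats *sq_stats = READ_ONCE(stats[i]);

          s->tx_packets += sq_stats->packets;
          s->tx_bytes   += sq_stats->bytes;
          s->tx_dropped += sq_stats->dropped;
  }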

That said: since it seems the htb numbers are not included right
now, I was proposing adding them later to both rtnl and
netdev-genl together, hoping that would keep the proposed change
simpler/easier to get accepted.

LMK what you think.


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Jakub Kicinski @ 2024-05-31  1:26 UTC
  To: Joe Damato
  Cc: linux-kernel, netdev, nalramli, David S. Miller, Eric Dumazet,
	Leon Romanovsky, open list:MELLANOX MLX5 core VPI driver,
	Paolo Abeni, Richard Cochran, Saeed Mahameed, Tariq Toukan

On Thu, 30 May 2024 18:07:31 -0700 Joe Damato wrote:
> Unless I am missing something, I think mlx5e_fold_sw_stats64 would
> need code similar to mlx5e_stats_grp_sw_update_stats_qos and then
> rtnl would account for htb stats.

Hm, I think you're right. I'm just surprised this could have gone
unnoticed for so long.

> That said: since it seems the htb numbers are not included right
> now, I was proposing adding them later to both rtnl and
> netdev-genl together, hoping that would keep the proposed change
> simpler/easier to get accepted.

SGTM.


* Re: [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats
From: Joe Damato @ 2024-05-31 18:30 UTC
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, nalramli, David S. Miller, Eric Dumazet,
	Leon Romanovsky, open list:MELLANOX MLX5 core VPI driver,
	Paolo Abeni, Richard Cochran, Saeed Mahameed, Tariq Toukan

On Thu, May 30, 2024 at 06:26:30PM -0700, Jakub Kicinski wrote:
> On Thu, 30 May 2024 18:07:31 -0700 Joe Damato wrote:
> > Unless I am missing something, I think mlx5e_fold_sw_stats64 would
> > need code similar to mlx5e_stats_grp_sw_update_stats_qos and then
> > rtnl would account for htb stats.
> 
> Hm, I think you're right. I'm just surprised this could have gone
> unnoticed for so long.
> 
> > That said: since it seems the htb numbers are not included right
> > now, I was proposing adding them later to both rtnl and
> > netdev-genl together, hoping that would keep the proposed change
> > simpler/easier to get accepted.
> 
> SGTM.

Cool, so based on that it seems like I just need to figure out an
implementation for base and tx stats that is correct and that will
be accepted.

Hoping to hear back from them soon as I'd personally love to have
this API available on our systems.


* Re: [RFC net-next v3 1/2] net/mlx5e: Add helpers to calculate txq and ch idx
From: Simon Horman @ 2024-06-01 11:35 UTC
  To: Joe Damato
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Tariq Toukan,
	Leon Romanovsky, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, open list:MELLANOX MLX5 core VPI driver

On Wed, May 29, 2024 at 03:16:26AM +0000, Joe Damato wrote:
> Add two helpers to:
> 
> 1. Compute the txq_ix given a channel and a tc offset (tc_to_txq_ix).
> 2. Compute the channel index and tc offset given a txq_ix
>    (txq_ix_to_chtc_ix).
> 
> The first helper, tc_to_txq_ix, is used in place of the mathematical
> expression in mlx5e_open_sqs when txq_ix values are computed.
> 
> The second helper, txq_ix_to_chtc_ix, will be used in a following patch.

Hi Joe,

I think it would be best to add txq_ix_to_chtc_ix as part of the patch that
uses it, because the current arrangement will cause allmodconfigs with
clang-18 and W=1 to fail due to txq_ix_to_chtc_ix being unused.

...


* Re: [RFC net-next v3 1/2] net/mlx5e: Add helpers to calculate txq and ch idx
From: Simon Horman @ 2024-06-01 11:39 UTC
  To: Joe Damato
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Tariq Toukan,
	Leon Romanovsky, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, open list:MELLANOX MLX5 core VPI driver

On Sat, Jun 01, 2024 at 12:35:57PM +0100, Simon Horman wrote:
> On Wed, May 29, 2024 at 03:16:26AM +0000, Joe Damato wrote:
> > Add two helpers to:
> > 
> > 1. Compute the txq_ix given a channel and a tc offset (tc_to_txq_ix).
> > 2. Compute the channel index and tc offset given a txq_ix
> >    (txq_ix_to_chtc_ix).
> > 
> > The first helper, tc_to_txq_ix, is used in place of the mathematical
> > expression in mlx5e_open_sqs when txq_ix values are computed.
> > 
> > The second helper, txq_ix_to_chtc_ix, will be used in a following patch.
> 
> Hi Joe,
> 
> I think it would be best to add txq_ix_to_chtc_ix as part of the patch that
> uses it, because the current arrangement will cause allmodconfigs with
> clang-18 and W=1 to fail due to txq_ix_to_chtc_ix being unused.
> 
> ...

Sorry, one more thing.

Please don't use inline in .c files unless there is a demonstrable
reason - f.e. performance - to do so. Rather, let the compiler figure
out when to inline functions.


* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
From: Tariq Toukan @ 2024-06-02  9:14 UTC
  To: Joe Damato, linux-kernel, netdev
  Cc: nalramli, Saeed Mahameed, Leon Romanovsky, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran,
	open list:MELLANOX MLX5 core VPI driver, Tariq Toukan



On 29/05/2024 6:16, Joe Damato wrote:
> Add functions to support the netdev-genl per queue stats API.
> 
> ./cli.py --spec netlink/specs/netdev.yaml \
>           --dump qstats-get --json '{"scope": "queue"}'
> 
> ...snip
> 
>   {'ifindex': 7,
>    'queue-id': 62,
>    'queue-type': 'rx',
>    'rx-alloc-fail': 0,
>    'rx-bytes': 105965251,
>    'rx-packets': 179790},
>   {'ifindex': 7,
>    'queue-id': 0,
>    'queue-type': 'tx',
>    'tx-bytes': 9402665,
>    'tx-packets': 17551},
> 
> ...snip
> 
> Also tested with the script tools/testing/selftests/drivers/net/stats.py
> in several scenarios to ensure stats tallying was correct:
> 
> - on boot (default queue counts)
> - adjusting queue count up or down (ethtool -L eth0 combined ...)
> - adding mqprio TCs

Please test also with interface down.

> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
>   .../net/ethernet/mellanox/mlx5/core/en_main.c | 132 ++++++++++++++++++
>   1 file changed, 132 insertions(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index ce15805ad55a..515c16a88a6c 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -39,6 +39,7 @@
>   #include <linux/debugfs.h>
>   #include <linux/if_bridge.h>
>   #include <linux/filter.h>
> +#include <net/netdev_queues.h>
>   #include <net/page_pool/types.h>
>   #include <net/pkt_sched.h>
>   #include <net/xdp_sock_drv.h>
> @@ -5293,6 +5294,136 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
>   	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
>   }
>   
> +static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
> +				     struct netdev_queue_stats_rx *stats)
> +{
> +	struct mlx5e_priv *priv = netdev_priv(dev);
> +	struct mlx5e_channel_stats *channel_stats;
> +	struct mlx5e_rq_stats *xskrq_stats;
> +	struct mlx5e_rq_stats *rq_stats;
> +
> +	if (mlx5e_is_uplink_rep(priv))
> +		return;
> +
> +	channel_stats = priv->channel_stats[i];
> +	xskrq_stats = &channel_stats->xskrq;
> +	rq_stats = &channel_stats->rq;
> +
> +	stats->packets = rq_stats->packets + xskrq_stats->packets;
> +	stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
> +	stats->alloc_fail = rq_stats->buff_alloc_err +
> +			    xskrq_stats->buff_alloc_err;
> +}
> +
> +static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
> +				     struct netdev_queue_stats_tx *stats)
> +{
> +	struct mlx5e_priv *priv = netdev_priv(dev);
> +	struct mlx5e_channel_stats *channel_stats;
> +	struct mlx5e_sq_stats *sq_stats;
> +	int ch_ix, tc_ix;
> +
> +	mutex_lock(&priv->state_lock);
> +	txq_ix_to_chtc_ix(&priv->channels.params, i, &ch_ix, &tc_ix);
> +	mutex_unlock(&priv->state_lock);
> +
> +	channel_stats = priv->channel_stats[ch_ix];
> +	sq_stats = &channel_stats->sq[tc_ix];
> +
> +	stats->packets = sq_stats->packets;
> +	stats->bytes = sq_stats->bytes;
> +}
> +
> +static void mlx5e_get_base_stats(struct net_device *dev,
> +				 struct netdev_queue_stats_rx *rx,
> +				 struct netdev_queue_stats_tx *tx)
> +{
> +	struct mlx5e_priv *priv = netdev_priv(dev);
> +	int i, j;
> +
> +	if (!mlx5e_is_uplink_rep(priv)) {
> +		rx->packets = 0;
> +		rx->bytes = 0;
> +		rx->alloc_fail = 0;
> +
> +		/* compute stats for deactivated RX queues
> +		 *
> +		 * if priv->channels.num == 0 the device is down, so compute
> +		 * stats for every queue.
> +		 *
> +		 * otherwise, compute only the queues which have been deactivated.
> +		 */
> +		mutex_lock(&priv->state_lock);
> +		if (priv->channels.num == 0)
> +			i = 0;

This is not consistent with the above implementation of 
mlx5e_get_queue_stats_rx(), which always returns the stats even if the 
channel is down.
This way, you'll double count the down channels.

I think you should always start from priv->channels.params.num_channels.
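
i.e. drop the branch and always do something like (sketch):

	/* queues below num_channels are covered by the per-queue callback */
	for (i = priv->channels.params.num_channels; i < priv->stats_nch; i++) {
		struct netdev_queue_stats_rx rx_i = {0};

		mlx5e_get_queue_stats_rx(dev, i, &rx_i);

		rx->packets += rx_i.packets;
		rx->bytes += rx_i.bytes;
		rx->alloc_fail += rx_i.alloc_fail;
	}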

> +		else
> +			i = priv->channels.params.num_channels;
> +		mutex_unlock(&priv->state_lock);

I understand that you're following the guidelines by taking the lock 
here, I just don't think this improves anything... If channels can be 
modified in between calls to mlx5e_get_base_stats / 
mlx5e_get_queue_stats_rx, then wrapping the priv->channels access with a 
lock can help protect each single deref, but not necessarily in giving a 
consistent "screenshot" of the stats.

The rtnl_lock should take care of that, as the driver holds it when 
changing the number of channels and updating the real_num_rx/tx_queues.

This said, I would carefully say you can drop the mutex after following
the requested changes above.

> +
> +		for (; i < priv->stats_nch; i++) {
> +			struct netdev_queue_stats_rx rx_i = {0};
> +
> +			mlx5e_get_queue_stats_rx(dev, i, &rx_i);
> +
> +			rx->packets += rx_i.packets;
> +			rx->bytes += rx_i.bytes;
> +			rx->alloc_fail += rx_i.alloc_fail;
> +		}
> +
> +		if (priv->rx_ptp_opened) {
> +			struct mlx5e_rq_stats *rq_stats = &priv->ptp_stats.rq;
> +
> +			rx->packets += rq_stats->packets;
> +			rx->bytes += rq_stats->bytes;
> +		}
> +	}
> +
> +	tx->packets = 0;
> +	tx->bytes = 0;
> +
> +	mutex_lock(&priv->state_lock);
> +	for (i = 0; i < priv->stats_nch; i++) {
> +		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
> +
> +		/* while iterating through all channels [0, stats_nch), there
> +		 * are two cases to handle:
> +		 *
> +		 *  1. the channel is available, so sum only the unavailable TCs
> +		 *     [mlx5e_get_dcb_num_tc, max_opened_tc).
> +		 *
> +		 *  2. the channel is unavailable, so sum all TCs [0, max_opened_tc).
> +		 */

I wonder why not call the local var 'tc'?

> +		if (i < priv->channels.params.num_channels) {
> +			j = mlx5e_get_dcb_num_tc(&priv->channels.params);
> +		} else {
> +			j = 0;
> +		}

Remove the braces, or use a ternary op.

> +
> +		for (; j < priv->max_opened_tc; j++) {
> +			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
> +
> +			tx->packets += sq_stats->packets;
> +			tx->bytes += sq_stats->bytes;
> +		}
> +	}
> +	mutex_unlock(&priv->state_lock);
> +

Same comment regarding dropping the mutex.

> +	if (priv->tx_ptp_opened) {
> +		for (j = 0; j < priv->max_opened_tc; j++) {
> +			struct mlx5e_sq_stats *sq_stats = &priv->ptp_stats.sq[j];
> +
> +			tx->packets    += sq_stats->packets;
> +			tx->bytes      += sq_stats->bytes;
> +		}
> +	}
> +}
> +
> +static const struct netdev_stat_ops mlx5e_stat_ops = {
> +	.get_queue_stats_rx     = mlx5e_get_queue_stats_rx,
> +	.get_queue_stats_tx     = mlx5e_get_queue_stats_tx,
> +	.get_base_stats         = mlx5e_get_base_stats,
> +};
> +
>   static void mlx5e_build_nic_netdev(struct net_device *netdev)
>   {
>   	struct mlx5e_priv *priv = netdev_priv(netdev);
> @@ -5310,6 +5441,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
>   
>   	netdev->watchdog_timeo    = 15 * HZ;
>   
> +	netdev->stat_ops          = &mlx5e_stat_ops;
>   	netdev->ethtool_ops	  = &mlx5e_ethtool_ops;
>   
>   	netdev->vlan_features    |= NETIF_F_SG;


* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
From: Joe Damato @ 2024-06-02 19:22 UTC
  To: Tariq Toukan
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran, open list:MELLANOX MLX5 core VPI driver,
	Tariq Toukan

On Sun, Jun 02, 2024 at 12:14:21PM +0300, Tariq Toukan wrote:
> 
> 
> On 29/05/2024 6:16, Joe Damato wrote:
> > Add functions to support the netdev-genl per queue stats API.
> > 
> > ./cli.py --spec netlink/specs/netdev.yaml \
> >           --dump qstats-get --json '{"scope": "queue"}'
> > 
> > ...snip
> > 
> >   {'ifindex': 7,
> >    'queue-id': 62,
> >    'queue-type': 'rx',
> >    'rx-alloc-fail': 0,
> >    'rx-bytes': 105965251,
> >    'rx-packets': 179790},
> >   {'ifindex': 7,
> >    'queue-id': 0,
> >    'queue-type': 'tx',
> >    'tx-bytes': 9402665,
> >    'tx-packets': 17551},
> > 
> > ...snip
> > 
> > Also tested with the script tools/testing/selftests/drivers/net/stats.py
> > in several scenarios to ensure stats tallying was correct:
> > 
> > - on boot (default queue counts)
> > - adjusting queue count up or down (ethtool -L eth0 combined ...)
> > - adding mqprio TCs
> 
> Please test also with interface down.

OK. I'll test with the interface down.

Is there some publicly available Mellanox script I can run to test
all the different cases? That would make this much easier. Maybe
this is something to include in mlnx-tools on github?

The mlnx-tools repo includes some python scripts for setting up
QoS, but they don't seem to work on my system and output vague error
messages. I have no idea if I'm missing some kernel option, if the
device doesn't support it, or if I need some other dependency
installed.

I have been testing these patches on a:

Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
firmware-version: 16.29.2002 (MT_0000000013)

> > 
> > Signed-off-by: Joe Damato <jdamato@fastly.com>
> > ---
> >   .../net/ethernet/mellanox/mlx5/core/en_main.c | 132 ++++++++++++++++++
> >   1 file changed, 132 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > index ce15805ad55a..515c16a88a6c 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > @@ -39,6 +39,7 @@
> >   #include <linux/debugfs.h>
> >   #include <linux/if_bridge.h>
> >   #include <linux/filter.h>
> > +#include <net/netdev_queues.h>
> >   #include <net/page_pool/types.h>
> >   #include <net/pkt_sched.h>
> >   #include <net/xdp_sock_drv.h>
> > @@ -5293,6 +5294,136 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
> >   	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
> >   }
> > +static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
> > +				     struct netdev_queue_stats_rx *stats)
> > +{
> > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > +	struct mlx5e_channel_stats *channel_stats;
> > +	struct mlx5e_rq_stats *xskrq_stats;
> > +	struct mlx5e_rq_stats *rq_stats;
> > +
> > +	if (mlx5e_is_uplink_rep(priv))
> > +		return;
> > +
> > +	channel_stats = priv->channel_stats[i];
> > +	xskrq_stats = &channel_stats->xskrq;
> > +	rq_stats = &channel_stats->rq;
> > +
> > +	stats->packets = rq_stats->packets + xskrq_stats->packets;
> > +	stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
> > +	stats->alloc_fail = rq_stats->buff_alloc_err +
> > +			    xskrq_stats->buff_alloc_err;
> > +}
> > +
> > +static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
> > +				     struct netdev_queue_stats_tx *stats)
> > +{
> > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > +	struct mlx5e_channel_stats *channel_stats;
> > +	struct mlx5e_sq_stats *sq_stats;
> > +	int ch_ix, tc_ix;
> > +
> > +	mutex_lock(&priv->state_lock);
> > +	txq_ix_to_chtc_ix(&priv->channels.params, i, &ch_ix, &tc_ix);
> > +	mutex_unlock(&priv->state_lock);
> > +
> > +	channel_stats = priv->channel_stats[ch_ix];
> > +	sq_stats = &channel_stats->sq[tc_ix];
> > +
> > +	stats->packets = sq_stats->packets;
> > +	stats->bytes = sq_stats->bytes;
> > +}
> > +
> > +static void mlx5e_get_base_stats(struct net_device *dev,
> > +				 struct netdev_queue_stats_rx *rx,
> > +				 struct netdev_queue_stats_tx *tx)
> > +{
> > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > +	int i, j;
> > +
> > +	if (!mlx5e_is_uplink_rep(priv)) {
> > +		rx->packets = 0;
> > +		rx->bytes = 0;
> > +		rx->alloc_fail = 0;
> > +
> > +		/* compute stats for deactivated RX queues
> > +		 *
> > +		 * if priv->channels.num == 0 the device is down, so compute
> > +		 * stats for every queue.
> > +		 *
> > +		 * otherwise, compute only the queues which have been deactivated.
> > +		 */
> > +		mutex_lock(&priv->state_lock);
> > +		if (priv->channels.num == 0)
> > +			i = 0;
> 
> This is not consistent with the above implementation of
> mlx5e_get_queue_stats_rx(), which always returns the stats even if the
> channel is down.
> This way, you'll double count the down channels.
> 
> I think you should always start from priv->channels.params.num_channels.

OK, I'll do that.

> > +		else
> > +			i = priv->channels.params.num_channels;
> > +		mutex_unlock(&priv->state_lock);
> 
> I understand that you're following the guidelines by taking the lock here, I
> just don't think this improves anything... If channels can be modified in
> between calls to mlx5e_get_base_stats / mlx5e_get_queue_stats_rx, then
> wrapping the priv->channels access with a lock can help protect each single
> deref, but not necessarily in giving a consistent "screenshot" of the stats.
> 
> The rtnl_lock should take care of that, as the driver holds it when changing
> the number of channels and updating the real_num_rx/tx_queues.
> 
> This said, I would carefully say you can drop the mutex after following the
> requested changes above.

OK, that makes sense to me.

So then I assume I can drop the mutex in mlx5e_get_queue_stats_tx
above, as well, for the same reasons?

Does this mean then that you are in favor of the implementation for
tx stats provided in this RFC and that I've implemented option 1 as
you described in the previous thread correctly?

> > +
> > +		for (; i < priv->stats_nch; i++) {
> > +			struct netdev_queue_stats_rx rx_i = {0};
> > +
> > +			mlx5e_get_queue_stats_rx(dev, i, &rx_i);
> > +
> > +			rx->packets += rx_i.packets;
> > +			rx->bytes += rx_i.bytes;
> > +			rx->alloc_fail += rx_i.alloc_fail;
> > +		}
> > +
> > +		if (priv->rx_ptp_opened) {
> > +			struct mlx5e_rq_stats *rq_stats = &priv->ptp_stats.rq;
> > +
> > +			rx->packets += rq_stats->packets;
> > +			rx->bytes += rq_stats->bytes;
> > +		}
> > +	}
> > +
> > +	tx->packets = 0;
> > +	tx->bytes = 0;
> > +
> > +	mutex_lock(&priv->state_lock);
> > +	for (i = 0; i < priv->stats_nch; i++) {
> > +		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
> > +
> > +		/* while iterating through all channels [0, stats_nch), there
> > +		 * are two cases to handle:
> > +		 *
> > +		 *  1. the channel is available, so sum only the unavailable TCs
> > +		 *     [mlx5e_get_dcb_num_tc, max_opened_tc).
> > +		 *
> > +		 *  2. the channel is unavailable, so sum all TCs [0, max_opened_tc).
> > +		 */
> 
> I wonder why not call the local var 'tc'?

OK.

> > +		if (i < priv->channels.params.num_channels) {
> > +			j = mlx5e_get_dcb_num_tc(&priv->channels.params);
> > +		} else {
> > +			j = 0;
> > +		}
> 
> Remove the braces, or use a ternary op.

I'll remove the braces; I didn't run checkpatch.pl on this RFC
(which catches this), but I should have.

> > +
> > +		for (; j < priv->max_opened_tc; j++) {
> > +			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
> > +
> > +			tx->packets += sq_stats->packets;
> > +			tx->bytes += sq_stats->bytes;
> > +		}
> > +	}
> > +	mutex_unlock(&priv->state_lock);
> > +
> 
> Same comment regarding dropping the mutex.

OK.


* Re: [RFC net-next v3 1/2] net/mlx5e: Add helpers to calculate txq and ch idx
From: Joe Damato @ 2024-06-02 19:23 UTC
  To: Simon Horman
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Tariq Toukan,
	Leon Romanovsky, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, open list:MELLANOX MLX5 core VPI driver

On Sat, Jun 01, 2024 at 12:39:13PM +0100, Simon Horman wrote:
> On Sat, Jun 01, 2024 at 12:35:57PM +0100, Simon Horman wrote:
> > On Wed, May 29, 2024 at 03:16:26AM +0000, Joe Damato wrote:
> > > Add two helpers to:
> > > 
> > > 1. Compute the txq_ix given a channel and a tc offset (tc_to_txq_ix).
> > > 2. Compute the channel index and tc offset given a txq_ix
> > >    (txq_ix_to_chtc_ix).
> > > 
> > > The first helper, tc_to_txq_ix, is used in place of the mathematical
> > > expression in mlx5e_open_sqs when txq_ix values are computed.
> > > 
> > > The second helper, txq_ix_to_chtc_ix, will be used in a following patch.
> > 
> > Hi Joe,
> > 
> > I think it would be best to add txq_ix_to_chtc_ix as part of the patch that
> > uses it, because the current arrangement will cause allmodconfigs with
> > clang-18 and W=1 to fail due to txq_ix_to_chtc_ix being unused.
> > 
> > ...
> 
> Sorry, one more thing.
> 
> Please don't use inline in .c files unless there is a demonstrable
> reason - f.e. performance - to do so. Rather, let the compiler figure
> out when to inline functions.

Sure, I'll make sure in the next revision to include the second
helper in the second patch instead and avoid using "inline" in both
cases.


* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
From: Tariq Toukan @ 2024-06-03 11:11 UTC
  To: Joe Damato, Tariq Toukan, linux-kernel, netdev, nalramli,
	Saeed Mahameed, Leon Romanovsky, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran,
	open list:MELLANOX MLX5 core VPI driver, Tariq Toukan



On 02/06/2024 22:22, Joe Damato wrote:
> On Sun, Jun 02, 2024 at 12:14:21PM +0300, Tariq Toukan wrote:
>>
>>
>> On 29/05/2024 6:16, Joe Damato wrote:
>>> Add functions to support the netdev-genl per queue stats API.
>>>
>>> ./cli.py --spec netlink/specs/netdev.yaml \
>>>            --dump qstats-get --json '{"scope": "queue"}'
>>>
>>> ...snip
>>>
>>>    {'ifindex': 7,
>>>     'queue-id': 62,
>>>     'queue-type': 'rx',
>>>     'rx-alloc-fail': 0,
>>>     'rx-bytes': 105965251,
>>>     'rx-packets': 179790},
>>>    {'ifindex': 7,
>>>     'queue-id': 0,
>>>     'queue-type': 'tx',
>>>     'tx-bytes': 9402665,
>>>     'tx-packets': 17551},
>>>
>>> ...snip
>>>
>>> Also tested with the script tools/testing/selftests/drivers/net/stats.py
>>> in several scenarios to ensure stats tallying was correct:
>>>
>>> - on boot (default queue counts)
>>> - adjusting queue count up or down (ethtool -L eth0 combined ...)
>>> - adding mqprio TCs
>>
>> Please test also with interface down.
> 
> OK. I'll test with the interface down.
> 
> Is there some publicly available Mellanox script I can run to test
> all the different cases? That would make this much easier. Maybe
> this is something to include in mlnx-tools on github?
> 

You're testing some new functionality. We don't have anything for it.


> The mlnx-tools repo includes some python scripts for setting up
> QoS, but they don't seem to work on my system and output vague error
> messages. I have no idea if I'm missing some kernel option, if the
> device doesn't support it, or if I need some other dependency
> installed.
> 

Can you share the command you use, and the output?

> I have been testing these patches on a:
> 
> Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
> firmware-version: 16.29.2002 (MT_0000000013)
> 
>>>
>>> Signed-off-by: Joe Damato <jdamato@fastly.com>
>>> ---
>>>    .../net/ethernet/mellanox/mlx5/core/en_main.c | 132 ++++++++++++++++++
>>>    1 file changed, 132 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> index ce15805ad55a..515c16a88a6c 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> @@ -39,6 +39,7 @@
>>>    #include <linux/debugfs.h>
>>>    #include <linux/if_bridge.h>
>>>    #include <linux/filter.h>
>>> +#include <net/netdev_queues.h>
>>>    #include <net/page_pool/types.h>
>>>    #include <net/pkt_sched.h>
>>>    #include <net/xdp_sock_drv.h>
>>> @@ -5293,6 +5294,136 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
>>>    	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
>>>    }
>>> +static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
>>> +				     struct netdev_queue_stats_rx *stats)
>>> +{
>>> +	struct mlx5e_priv *priv = netdev_priv(dev);
>>> +	struct mlx5e_channel_stats *channel_stats;
>>> +	struct mlx5e_rq_stats *xskrq_stats;
>>> +	struct mlx5e_rq_stats *rq_stats;
>>> +
>>> +	if (mlx5e_is_uplink_rep(priv))
>>> +		return;
>>> +
>>> +	channel_stats = priv->channel_stats[i];
>>> +	xskrq_stats = &channel_stats->xskrq;
>>> +	rq_stats = &channel_stats->rq;
>>> +
>>> +	stats->packets = rq_stats->packets + xskrq_stats->packets;
>>> +	stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
>>> +	stats->alloc_fail = rq_stats->buff_alloc_err +
>>> +			    xskrq_stats->buff_alloc_err;
>>> +}
>>> +
>>> +static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
>>> +				     struct netdev_queue_stats_tx *stats)
>>> +{
>>> +	struct mlx5e_priv *priv = netdev_priv(dev);
>>> +	struct mlx5e_channel_stats *channel_stats;
>>> +	struct mlx5e_sq_stats *sq_stats;
>>> +	int ch_ix, tc_ix;
>>> +
>>> +	mutex_lock(&priv->state_lock);
>>> +	txq_ix_to_chtc_ix(&priv->channels.params, i, &ch_ix, &tc_ix);
>>> +	mutex_unlock(&priv->state_lock);
>>> +
>>> +	channel_stats = priv->channel_stats[ch_ix];
>>> +	sq_stats = &channel_stats->sq[tc_ix];
>>> +
>>> +	stats->packets = sq_stats->packets;
>>> +	stats->bytes = sq_stats->bytes;
>>> +}
>>> +
>>> +static void mlx5e_get_base_stats(struct net_device *dev,
>>> +				 struct netdev_queue_stats_rx *rx,
>>> +				 struct netdev_queue_stats_tx *tx)
>>> +{
>>> +	struct mlx5e_priv *priv = netdev_priv(dev);
>>> +	int i, j;
>>> +
>>> +	if (!mlx5e_is_uplink_rep(priv)) {
>>> +		rx->packets = 0;
>>> +		rx->bytes = 0;
>>> +		rx->alloc_fail = 0;
>>> +
>>> +		/* compute stats for deactivated RX queues
>>> +		 *
>>> +		 * if priv->channels.num == 0 the device is down, so compute
>>> +		 * stats for every queue.
>>> +		 *
>>> +		 * otherwise, compute only the queues which have been deactivated.
>>> +		 */
>>> +		mutex_lock(&priv->state_lock);
>>> +		if (priv->channels.num == 0)
>>> +			i = 0;
>>
>> This is not consistent with the above implementation of
>> mlx5e_get_queue_stats_rx(), which always returns the stats even if the
>> channel is down.
>> This way, you'll double count the down channels.
>>
>> I think you should always start from priv->channels.params.num_channels.
> 
> OK, I'll do that.
> 
>>> +		else
>>> +			i = priv->channels.params.num_channels;
>>> +		mutex_unlock(&priv->state_lock);
>>
>> I understand that you're following the guidelines by taking the lock here, I
>> just don't think this improves anything... If channels can be modified in
>> between calls to mlx5e_get_base_stats / mlx5e_get_queue_stats_rx, then
>> wrapping the priv->channels access with a lock can help protect each single
>> deref, but not necessarily in giving a consistent "screenshot" of the stats.
>>
>> The rtnl_lock should take care of that, as the driver holds it when changing
>> the number of channels and updating the real_num_rx/tx_queues.
>>
>> This said, I would carefully say you can drop the mutex after following the
>> requested changes above.

I still don't really like this design, so I gave this some more
thought...

I think we should come up with a new mapping array under priv that maps 
i (from real_num_tx_queues) to the matching sq_stats struct.
This array would be maintained in the channels open/close functions, 
similarly to priv->txq2sq.

Then, we would not calculate the mapping per call, but just get the 
proper pointer from the array. This eases the handling of htb and ptp 
queues, which were missed in your txq_ix_to_chtc_ix().

This handles mapped SQs.

Now, regarding unmapped ones, they must be handled in the "base" 
function call.
We'd still need to access channels->params, to:
1. read params.num_channels to iterate until priv->stats_nch, and
2. read mlx5e_get_dcb_num_tc(params) to iterate until priv->max_opened_tc.

I think we can live with this without holding the mutex, given that this 
runs under the rtnl lock.
We can add ASSERT_RTNL() to verify the assumption.
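
Something in this direction (untested sketch, the names are just
placeholders):

/* in struct mlx5e_priv, next to txq2sq: */
	struct mlx5e_sq_stats **txq2sq_stats;

/* kept in sync wherever priv->txq2sq is assigned: */
	priv->txq2sq[sq->txq_ix] = sq;
	priv->txq2sq_stats[sq->txq_ix] = sq->stats;

/* then the per-queue callback becomes a plain lookup: */
static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
				     struct netdev_queue_stats_tx *stats)
{
	struct mlx5e_priv *priv = netdev_priv(dev);
	struct mlx5e_sq_stats *sq_stats;

	ASSERT_RTNL();

	sq_stats = priv->txq2sq_stats[i];

	stats->packets = sq_stats->packets;
	stats->bytes = sq_stats->bytes;
}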


> 
> OK, that makes sense to me.
> 
> So then I assume I can drop the mutex in mlx5e_get_queue_stats_tx
> above, as well, for the same reasons?
> 
> Does this mean then that you are in favor of the implementation for
> tx stats provided in this RFC and that I've implemented option 1 as
> you described in the previous thread correctly?
> 

Yes, but I wasn't happy enough with the design.
Thanks for your contribution.

>>> +
>>> +		for (; i < priv->stats_nch; i++) {
>>> +			struct netdev_queue_stats_rx rx_i = {0};
>>> +
>>> +			mlx5e_get_queue_stats_rx(dev, i, &rx_i);
>>> +
>>> +			rx->packets += rx_i.packets;
>>> +			rx->bytes += rx_i.bytes;
>>> +			rx->alloc_fail += rx_i.alloc_fail;
>>> +		}
>>> +
>>> +		if (priv->rx_ptp_opened) {
>>> +			struct mlx5e_rq_stats *rq_stats = &priv->ptp_stats.rq;
>>> +
>>> +			rx->packets += rq_stats->packets;
>>> +			rx->bytes += rq_stats->bytes;
>>> +		}
>>> +	}
>>> +
>>> +	tx->packets = 0;
>>> +	tx->bytes = 0;
>>> +
>>> +	mutex_lock(&priv->state_lock);
>>> +	for (i = 0; i < priv->stats_nch; i++) {
>>> +		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
>>> +
>>> +		/* while iterating through all channels [0, stats_nch], there
>>> +		 * are two cases to handle:
>>> +		 *
>>> +		 *  1. the channel is available, so sum only the unavailable TCs
>>> +		 *     [mlx5e_get_dcb_num_tc, max_opened_tc).
>>> +		 *
>>> +		 *  2. the channel is unavailable, so sum all TCs [0, max_opened_tc).
>>> +		 */
>>
>> I wonder why not call the local var 'tc'?
> 
> OK.
> 
>>> +		if (i < priv->channels.params.num_channels) {
>>> +			j = mlx5e_get_dcb_num_tc(&priv->channels.params);
>>> +		} else {
>>> +			j = 0;
>>> +		}
>>
>> Remove the braces, or use a ternary op.
> 
> I'll remove the braces; I didn't run checkpatch.pl on this RFC
> (which catches this), but I should have.
> 
>>> +
>>> +		for (; j < priv->max_opened_tc; j++) {
>>> +			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
>>> +
>>> +			tx->packets += sq_stats->packets;
>>> +			tx->bytes += sq_stats->bytes;
>>> +		}
>>> +	}
>>> +	mutex_unlock(&priv->state_lock);
>>> +
>>
>> Same comment regarding dropping the mutex.
> 
> OK.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
  2024-06-03 11:11       ` Tariq Toukan
@ 2024-06-03 18:11         ` Joe Damato
  2024-06-03 19:22         ` Joe Damato
  1 sibling, 0 replies; 22+ messages in thread
From: Joe Damato @ 2024-06-03 18:11 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran, open list:MELLANOX MLX5 core VPI driver,
	Tariq Toukan

On Mon, Jun 03, 2024 at 02:11:14PM +0300, Tariq Toukan wrote:
> 
> 
> On 02/06/2024 22:22, Joe Damato wrote:
> > On Sun, Jun 02, 2024 at 12:14:21PM +0300, Tariq Toukan wrote:
> > > 
> > > 
> > > On 29/05/2024 6:16, Joe Damato wrote:
> > > > Add functions to support the netdev-genl per queue stats API.
> > > > 
> > > > ./cli.py --spec netlink/specs/netdev.yaml \
> > > >            --dump qstats-get --json '{"scope": "queue"}'
> > > > 
> > > > ...snip
> > > > 
> > > >    {'ifindex': 7,
> > > >     'queue-id': 62,
> > > >     'queue-type': 'rx',
> > > >     'rx-alloc-fail': 0,
> > > >     'rx-bytes': 105965251,
> > > >     'rx-packets': 179790},
> > > >    {'ifindex': 7,
> > > >     'queue-id': 0,
> > > >     'queue-type': 'tx',
> > > >     'tx-bytes': 9402665,
> > > >     'tx-packets': 17551},
> > > > 
> > > > ...snip
> > > > 
> > > > Also tested with the script tools/testing/selftests/drivers/net/stats.py
> > > > in several scenarios to ensure stats tallying was correct:
> > > > 
> > > > - on boot (default queue counts)
> > > > - adjusting queue count up or down (ethtool -L eth0 combined ...)
> > > > - adding mqprio TCs
> > > 
> > > Please test also with interface down.
> > 
> > OK. I'll test with the interface down.
> > 
> > Is there some publicly available Mellanox script I can run to test
> > all the different cases? That would make this much easier. Maybe
> > this is something to include in mlnx-tools on github?
> > 
> 
> You're testing some new functionality. We don't have anything for it.
> 
> 
> > The mlnx-tools package, which includes some python scripts for
> > setting up QoS, doesn't seem to work on my system and outputs vague
> > error messages. I have no idea if I'm missing some kernel option, if the
> > device doesn't support it, or if I need some other dependency
> > installed.
> > 
> 
> Can you share the command you use, and the output?

Sure:

jdamato@test:~/mlnx-tools/python$ python --version
Python 3.12.3

jdamato@test:~/mlnx-tools/python$ ./mlnx_qos -i vlan401
Priority trust state is not supported on your system
Buffers commands are not supported on your system
Rate limit is not supported on your system!
ETS features are not supported on your system

jdamato@test:~/mlnx-tools/python$ echo $?
1

This is mlnx-tools at SHA 641718b13f71 ("Merge pull request #87 from
ahlabenadam/no_remove_folder")

Feel free to follow up with me about this off-list if you like.

> > I have been testing these patches on a:
> > 
> > Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
> > firmware-version: 16.29.2002 (MT_0000000013)
> > 
> > > > 
> > > > Signed-off-by: Joe Damato <jdamato@fastly.com>
> > > > ---
> > > >    .../net/ethernet/mellanox/mlx5/core/en_main.c | 132 ++++++++++++++++++
> > > >    1 file changed, 132 insertions(+)
> > > > 
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > > index ce15805ad55a..515c16a88a6c 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > > @@ -39,6 +39,7 @@
> > > >    #include <linux/debugfs.h>
> > > >    #include <linux/if_bridge.h>
> > > >    #include <linux/filter.h>
> > > > +#include <net/netdev_queues.h>
> > > >    #include <net/page_pool/types.h>
> > > >    #include <net/pkt_sched.h>
> > > >    #include <net/xdp_sock_drv.h>
> > > > @@ -5293,6 +5294,136 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
> > > >    	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
> > > >    }
> > > > +static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
> > > > +				     struct netdev_queue_stats_rx *stats)
> > > > +{
> > > > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > > > +	struct mlx5e_channel_stats *channel_stats;
> > > > +	struct mlx5e_rq_stats *xskrq_stats;
> > > > +	struct mlx5e_rq_stats *rq_stats;
> > > > +
> > > > +	if (mlx5e_is_uplink_rep(priv))
> > > > +		return;
> > > > +
> > > > +	channel_stats = priv->channel_stats[i];
> > > > +	xskrq_stats = &channel_stats->xskrq;
> > > > +	rq_stats = &channel_stats->rq;
> > > > +
> > > > +	stats->packets = rq_stats->packets + xskrq_stats->packets;
> > > > +	stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
> > > > +	stats->alloc_fail = rq_stats->buff_alloc_err +
> > > > +			    xskrq_stats->buff_alloc_err;
> > > > +}
> > > > +
> > > > +static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
> > > > +				     struct netdev_queue_stats_tx *stats)
> > > > +{
> > > > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > > > +	struct mlx5e_channel_stats *channel_stats;
> > > > +	struct mlx5e_sq_stats *sq_stats;
> > > > +	int ch_ix, tc_ix;
> > > > +
> > > > +	mutex_lock(&priv->state_lock);
> > > > +	txq_ix_to_chtc_ix(&priv->channels.params, i, &ch_ix, &tc_ix);
> > > > +	mutex_unlock(&priv->state_lock);
> > > > +
> > > > +	channel_stats = priv->channel_stats[ch_ix];
> > > > +	sq_stats = &channel_stats->sq[tc_ix];
> > > > +
> > > > +	stats->packets = sq_stats->packets;
> > > > +	stats->bytes = sq_stats->bytes;
> > > > +}
> > > > +
> > > > +static void mlx5e_get_base_stats(struct net_device *dev,
> > > > +				 struct netdev_queue_stats_rx *rx,
> > > > +				 struct netdev_queue_stats_tx *tx)
> > > > +{
> > > > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > > > +	int i, j;
> > > > +
> > > > +	if (!mlx5e_is_uplink_rep(priv)) {
> > > > +		rx->packets = 0;
> > > > +		rx->bytes = 0;
> > > > +		rx->alloc_fail = 0;
> > > > +
> > > > +		/* compute stats for deactivated RX queues
> > > > +		 *
> > > > +		 * if priv->channels.num == 0 the device is down, so compute
> > > > +		 * stats for every queue.
> > > > +		 *
> > > > +		 * otherwise, compute only the queues which have been deactivated.
> > > > +		 */
> > > > +		mutex_lock(&priv->state_lock);
> > > > +		if (priv->channels.num == 0)
> > > > +			i = 0;
> > > 
> > > This is not consistent with the above implementation of
> > > mlx5e_get_queue_stats_rx(), which always returns the stats even if the
> > > channel is down.
> > > This way, you'll double count the down channels.
> > > 
> > > I think you should always start from priv->channels.params.num_channels.
> > 
> > OK, I'll do that.
> > 
> > > > +		else
> > > > +			i = priv->channels.params.num_channels;
> > > > +		mutex_unlock(&priv->state_lock);
> > > 
> > > I understand that you're following the guidelines by taking the lock here; I
> > > just don't think this improves anything... If channels can be modified in
> > > between calls to mlx5e_get_base_stats / mlx5e_get_queue_stats_rx, then
> > > wrapping the priv->channels access with a lock can help protect each single
> > > deref, but not necessarily in giving a consistent "screenshot" of the stats.
> > > 
> > > The rtnl_lock should take care of that, as the driver holds it when changing
> > > the number of channels and updating the real_num_rx/tx_queues.
> > > 
> > > This said, I would carefully say you can drop the mutex once you follow the
> > > requested changes above.
> 
> I still don't really like this design, so I gave some more thought to
> this...

Thanks again for your careful thoughts and review on this. I do
appreciate it, and this functionality will be extremely useful for me
(and, I suspect, many others).

> I think we should come up with a new mapping array under priv that maps i
> (from real_num_tx_queues) to the matching sq_stats struct.
> This array would be maintained in the channels open/close functions,
> similarly to priv->txq2sq.
> 
> Then, we would not calculate the mapping per call, but just get the proper
> pointer from the array. This eases the handling of htb and ptp queues, which
> were missed in your txq_ix_to_chtc_ix().
> 
> This handles mapped SQs.

OK, the above makes sense. I'll give that a try.

> Now, regarding unmapped ones, they must be handled in the "base" function
> call.
> We'd still need to access channels->params, to:
> 1. read params.num_channels to iterate until priv->stats_nch, and
> 2. read mlx5e_get_dcb_num_tc(params) to iterate until priv->max_opened_tc.
> 
> I think we can live with this without holding the mutex, given that this
> runs under the rtnl lock.
> We can add ASSERT_RTNL() to verify the assumption.

OK. I'll try the above and propose an rfc v4.

> 
> > 
> > OK, that makes sense to me.
> > 
> > So then I assume I can drop the mutex in mlx5e_get_queue_stats_tx
> > above, as well, for the same reasons?
> > 
> > Does this mean then that you are in favor of the implementation for
> > tx stats provided in this RFC and that I've implemented option 1 as
> > you described in the previous thread correctly?
> > 
> 
> Yes, but I wasn't happy enough with the design.
> Thanks for your contribution.

Thanks for your review and patience.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
  2024-06-03 11:11       ` Tariq Toukan
  2024-06-03 18:11         ` Joe Damato
@ 2024-06-03 19:22         ` Joe Damato
  2024-06-03 21:36           ` Tariq Toukan
  1 sibling, 1 reply; 22+ messages in thread
From: Joe Damato @ 2024-06-03 19:22 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran, open list:MELLANOX MLX5 core VPI driver,
	Tariq Toukan

On Mon, Jun 03, 2024 at 02:11:14PM +0300, Tariq Toukan wrote:
> 
> 
> On 02/06/2024 22:22, Joe Damato wrote:
> > On Sun, Jun 02, 2024 at 12:14:21PM +0300, Tariq Toukan wrote:
> > > 
> > > 
> > > On 29/05/2024 6:16, Joe Damato wrote:
> > > > Add functions to support the netdev-genl per queue stats API.
> > > > 
> > > > ./cli.py --spec netlink/specs/netdev.yaml \
> > > >            --dump qstats-get --json '{"scope": "queue"}'
> > > > 
> > > > ...snip
> > > > 
> > > >    {'ifindex': 7,
> > > >     'queue-id': 62,
> > > >     'queue-type': 'rx',
> > > >     'rx-alloc-fail': 0,
> > > >     'rx-bytes': 105965251,
> > > >     'rx-packets': 179790},
> > > >    {'ifindex': 7,
> > > >     'queue-id': 0,
> > > >     'queue-type': 'tx',
> > > >     'tx-bytes': 9402665,
> > > >     'tx-packets': 17551},
> > > > 
> > > > ...snip
> > > > 
> > > > Also tested with the script tools/testing/selftests/drivers/net/stats.py
> > > > in several scenarios to ensure stats tallying was correct:
> > > > 
> > > > - on boot (default queue counts)
> > > > - adjusting queue count up or down (ethtool -L eth0 combined ...)
> > > > - adding mqprio TCs
> > > 
> > > Please test also with interface down.
> > 
> > OK. I'll test with the interface down.
> > 
> > Is there some publicly available Mellanox script I can run to test
> > all the different cases? That would make this much easier. Maybe
> > this is something to include in mlnx-tools on github?
> > 
> 
> You're testing some new functionality. We don't have anything for it.
> 
> 
> > The mlnx-tools package, which includes some python scripts for
> > setting up QoS, doesn't seem to work on my system and outputs vague
> > error messages. I have no idea if I'm missing some kernel option, if the
> > device doesn't support it, or if I need some other dependency
> > installed.
> > 
> 
> Can you share the command you use, and the output?
> 
> > I have been testing these patches on a:
> > 
> > Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
> > firmware-version: 16.29.2002 (MT_0000000013)
> > 
> > > > 
> > > > Signed-off-by: Joe Damato <jdamato@fastly.com>
> > > > ---
> > > >    .../net/ethernet/mellanox/mlx5/core/en_main.c | 132 ++++++++++++++++++
> > > >    1 file changed, 132 insertions(+)
> > > > 
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > > index ce15805ad55a..515c16a88a6c 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > > @@ -39,6 +39,7 @@
> > > >    #include <linux/debugfs.h>
> > > >    #include <linux/if_bridge.h>
> > > >    #include <linux/filter.h>
> > > > +#include <net/netdev_queues.h>
> > > >    #include <net/page_pool/types.h>
> > > >    #include <net/pkt_sched.h>
> > > >    #include <net/xdp_sock_drv.h>
> > > > @@ -5293,6 +5294,136 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
> > > >    	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
> > > >    }
> > > > +static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
> > > > +				     struct netdev_queue_stats_rx *stats)
> > > > +{
> > > > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > > > +	struct mlx5e_channel_stats *channel_stats;
> > > > +	struct mlx5e_rq_stats *xskrq_stats;
> > > > +	struct mlx5e_rq_stats *rq_stats;
> > > > +
> > > > +	if (mlx5e_is_uplink_rep(priv))
> > > > +		return;
> > > > +
> > > > +	channel_stats = priv->channel_stats[i];
> > > > +	xskrq_stats = &channel_stats->xskrq;
> > > > +	rq_stats = &channel_stats->rq;
> > > > +
> > > > +	stats->packets = rq_stats->packets + xskrq_stats->packets;
> > > > +	stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
> > > > +	stats->alloc_fail = rq_stats->buff_alloc_err +
> > > > +			    xskrq_stats->buff_alloc_err;
> > > > +}
> > > > +
> > > > +static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
> > > > +				     struct netdev_queue_stats_tx *stats)
> > > > +{
> > > > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > > > +	struct mlx5e_channel_stats *channel_stats;
> > > > +	struct mlx5e_sq_stats *sq_stats;
> > > > +	int ch_ix, tc_ix;
> > > > +
> > > > +	mutex_lock(&priv->state_lock);
> > > > +	txq_ix_to_chtc_ix(&priv->channels.params, i, &ch_ix, &tc_ix);
> > > > +	mutex_unlock(&priv->state_lock);
> > > > +
> > > > +	channel_stats = priv->channel_stats[ch_ix];
> > > > +	sq_stats = &channel_stats->sq[tc_ix];
> > > > +
> > > > +	stats->packets = sq_stats->packets;
> > > > +	stats->bytes = sq_stats->bytes;
> > > > +}
> > > > +
> > > > +static void mlx5e_get_base_stats(struct net_device *dev,
> > > > +				 struct netdev_queue_stats_rx *rx,
> > > > +				 struct netdev_queue_stats_tx *tx)
> > > > +{
> > > > +	struct mlx5e_priv *priv = netdev_priv(dev);
> > > > +	int i, j;
> > > > +
> > > > +	if (!mlx5e_is_uplink_rep(priv)) {
> > > > +		rx->packets = 0;
> > > > +		rx->bytes = 0;
> > > > +		rx->alloc_fail = 0;
> > > > +
> > > > +		/* compute stats for deactivated RX queues
> > > > +		 *
> > > > +		 * if priv->channels.num == 0 the device is down, so compute
> > > > +		 * stats for every queue.
> > > > +		 *
> > > > +		 * otherwise, compute only the queues which have been deactivated.
> > > > +		 */
> > > > +		mutex_lock(&priv->state_lock);
> > > > +		if (priv->channels.num == 0)
> > > > +			i = 0;
> > > 
> > > This is not consistent with the above implementation of
> > > mlx5e_get_queue_stats_rx(), which always returns the stats even if the
> > > channel is down.
> > > This way, you'll double count the down channels.
> > > 
> > > I think you should always start from priv->channels.params.num_channels.
> > 
> > OK, I'll do that.
> > 
> > > > +		else
> > > > +			i = priv->channels.params.num_channels;
> > > > +		mutex_unlock(&priv->state_lock);
> > > 
> > > I understand that you're following the guidelines by taking the lock here; I
> > > just don't think this improves anything... If channels can be modified in
> > > between calls to mlx5e_get_base_stats / mlx5e_get_queue_stats_rx, then
> > > wrapping the priv->channels access with a lock can help protect each single
> > > deref, but not necessarily in giving a consistent "screenshot" of the stats.
> > > 
> > > The rtnl_lock should take care of that, as the driver holds it when changing
> > > the number of channels and updating the real_num_rx/tx_queues.
> > > 
> > > This said, I would carefully say you can drop the mutex once you follow the
> > > requested changes above.
> 
> I still don't really like this design, so I gave some more thought to
> this...
> 
> I think we should come up with a new mapping array under priv that maps i
> (from real_num_tx_queues) to the matching sq_stats struct.
> This array would be maintained in the channels open/close functions,
> similarly to priv->txq2sq.
> 
> Then, we would not calculate the mapping per call, but just get the proper
> pointer from the array. This eases the handling of htb and ptp queues, which
> were missed in your txq_ix_to_chtc_ix().

Maybe I am just getting way off track here, but I noticed this in
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c in
set_pflag_tx_port_ts:

  if (!err)
    priv->tx_ptp_opened = true;

but what if enable is false, meaning the user is disabling ptp via
ethtool?

In that case, shouldn't this be:

  if (!err)
    priv->tx_ptp_opened = enable;

Because it seems like the user could pass 0 to disable ptp, and with
the current code then trigger mlx5e_close_channels which would call
mlx5e_ptp_close, but priv->tx_ptp_opened might still be true even
though ptp has been closed. Just a guess as I really don't know much
about this code at all, but it seems that the MLX5E_PFLAG_TX_PORT_TS
will be set in set_pflag_tx_port_ts based on enable, which then
affects how mlx5e_open_channels behaves.

Likewise in mlx5e_ptp_close_queues, maybe
rx_ptp_opened and tx_ptp_opened should be set to false there?

It seems like the user could get the driver into an inconsistent
state by enabling then disabling ptp, but maybe I'm just reading
this all wrong.

Is that a bug? If so, I can submit:

1. set tx_ptp_opened = enable in set_pflag_tx_port_ts
2. set rx_ptp_opened = tx_ptp_opened = false in mlx5e_ptp_close_queues

to net with a Fixes tag?

I am asking about this from the perspective of stats, because in the
qos stuff, the txq2sq_stats mapping is created or removed (set to
NULL) in mlx5e_activate_qos_sq and mlx5e_deactivate_qos_sq /
mlx5e_qos_deactivate_queues, respectively.

This means that priv->txq2sq_stats could be a valid sq_stats pointer,
or NULL for deactivated TCs and qos queues. It seems like ptp should
work the same way, but I have no idea.

If ptp worked the same way, then the base code would check whether ptp
is disabled and, if it had previously been enabled, pick up any stats
it accumulated.

That seems like more consistent behavior, but sorry if
I'm totally off on all of this.

> This handles mapped SQs.
> 
> Now, regarding unmapped ones, they must be handled in the "base" function
> call.
> We'd still need to access channels->params, to:
> 1. read params.num_channels to iterate until priv->stats_nch, and
> 2. read mlx5e_get_dcb_num_tc(params) to iterate until priv->max_opened_tc.
> 
> I think we can live with this without holding the mutex, given that this
> runs under the rtnl lock.
> We can add ASSERT_RTNL() to verify the assumption.
> 
> 
> > 
> > OK, that makes sense to me.
> > 
> > So then I assume I can drop the mutex in mlx5e_get_queue_stats_tx
> > above, as well, for the same reasons?
> > 
> > Does this mean then that you are in favor of the implementation for
> > tx stats provided in this RFC and that I've implemented option 1 as
> > you described in the previous thread correctly?
> > 
> 
> Yes, but I wasn't happy enough with the design.
> Thanks for your contribution.
> 
> > > > +
> > > > +		for (; i < priv->stats_nch; i++) {
> > > > +			struct netdev_queue_stats_rx rx_i = {0};
> > > > +
> > > > +			mlx5e_get_queue_stats_rx(dev, i, &rx_i);
> > > > +
> > > > +			rx->packets += rx_i.packets;
> > > > +			rx->bytes += rx_i.bytes;
> > > > +			rx->alloc_fail += rx_i.alloc_fail;
> > > > +		}
> > > > +
> > > > +		if (priv->rx_ptp_opened) {
> > > > +			struct mlx5e_rq_stats *rq_stats = &priv->ptp_stats.rq;
> > > > +
> > > > +			rx->packets += rq_stats->packets;
> > > > +			rx->bytes += rq_stats->bytes;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	tx->packets = 0;
> > > > +	tx->bytes = 0;
> > > > +
> > > > +	mutex_lock(&priv->state_lock);
> > > > +	for (i = 0; i < priv->stats_nch; i++) {
> > > > +		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
> > > > +
> > > > +		/* while iterating through all channels [0, stats_nch], there
> > > > +		 * are two cases to handle:
> > > > +		 *
> > > > +		 *  1. the channel is available, so sum only the unavailable TCs
> > > > +		 *     [mlx5e_get_dcb_num_tc, max_opened_tc).
> > > > +		 *
> > > > +		 *  2. the channel is unavailable, so sum all TCs [0, max_opened_tc).
> > > > +		 */
> > > 
> > > I wonder why not call the local var 'tc'?
> > 
> > OK.
> > 
> > > > +		if (i < priv->channels.params.num_channels) {
> > > > +			j = mlx5e_get_dcb_num_tc(&priv->channels.params);
> > > > +		} else {
> > > > +			j = 0;
> > > > +		}
> > > 
> > > Remove the braces, or use a ternary op.
> > 
> > I'll remove the braces; I didn't run checkpatch.pl on this RFC
> > (which catches this), but I should have.
> > 
> > > > +
> > > > +		for (; j < priv->max_opened_tc; j++) {
> > > > +			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
> > > > +
> > > > +			tx->packets += sq_stats->packets;
> > > > +			tx->bytes += sq_stats->bytes;
> > > > +		}
> > > > +	}
> > > > +	mutex_unlock(&priv->state_lock);
> > > > +
> > > 
> > > Same comment regarding dropping the mutex.
> > 
> > OK.
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
  2024-06-03 19:22         ` Joe Damato
@ 2024-06-03 21:36           ` Tariq Toukan
  2024-06-03 21:50             ` Joe Damato
  0 siblings, 1 reply; 22+ messages in thread
From: Tariq Toukan @ 2024-06-03 21:36 UTC (permalink / raw)
  To: Joe Damato, linux-kernel, netdev, nalramli, Saeed Mahameed,
	Leon Romanovsky, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Richard Cochran,
	open list:MELLANOX MLX5 core VPI driver, Tariq Toukan



On 03/06/2024 22:22, Joe Damato wrote:
> On Mon, Jun 03, 2024 at 02:11:14PM +0300, Tariq Toukan wrote:
>>

...

>>
>> I still don't really like this design, so I gave some more thought to
>> this...
>>
>> I think we should come up with a new mapping array under priv that maps i
>> (from real_num_tx_queues) to the matching sq_stats struct.
>> This array would be maintained in the channels open/close functions,
>> similarly to priv->txq2sq.
>>
>> Then, we would not calculate the mapping per call, but just get the proper
>> pointer from the array. This eases the handling of htb and ptp queues, which
>> were missed in your txq_ix_to_chtc_ix().
> 
> Maybe I am just getting way off track here, but I noticed this in
> drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c in
> set_pflag_tx_port_ts:
> 
>    if (!err)
>      priv->tx_ptp_opened = true;
> 
> but what if enable is false, meaning the user is disabling ptp via
> ethtool?
> 

This bool field under priv acts as a flag, meaning: ptp was ever opened.
It's the same idea as max_opened_tc, but with a boolean instead of a
numeric value.

> In that case, shouldn't this be:
> 
>    if (!err)
>      priv->tx_ptp_opened = enabled;
> 
> Because it seems like the user could pass 0 to disable ptp, and with
> the current code then trigger mlx5e_close_channels which would call
> mlx5e_ptp_close, but priv->tx_ptp_opened might still be true even
> though ptp has been closed. Just a guess as I really don't know much
> about this code at all, but it seems that the MLX5E_PFLAG_TX_PORT_TS
> will be set in set_pflag_tx_port_ts based on enable, which then
> affects how mlx5e_open_channels behaves.
> 
> Likewise in mlx5e_ptp_close_queues, maybe
> rx_ptp_opened and tx_ptp_opened should be set to false there?
> 
> It seems like the user could get the driver into an inconsistent
> state by enabling then disabling ptp, but maybe I'm just reading
> this all wrong.
> 
> Is that a bug? If so, I can submit:
> 
> 1. set tx_ptp_opened = enable in set_pflag_tx_port_ts
> 2. set rx_ptp_opened = tx_ptp_opened = false in mlx5e_ptp_close_queues
> 
> to net with a Fixes tag?
> 
> I am asking about this from the perspective of stats, because in the
> qos stuff, the txq2sq_stats mapping is created or removed (set to
> NULL) in mlx5e_activate_qos_sq and mlx5e_deactivate_qos_sq /
> mlx5e_qos_deactivate_queues, respectively.
> 
> This means that priv->txq2sq_stats could be a valid sq_stats pointer,
> or NULL for deactivated TCs and qos queues. It seems like ptp should
> work the same way, but I have no idea.
> 
> If ptp worked the same way, then the base code would check whether ptp
> is disabled and, if it had previously been enabled, pick up any stats
> it accumulated.
> 
> That seems like more consistent behavior, but sorry if
> I'm totally off on all of this.
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats
  2024-06-03 21:36           ` Tariq Toukan
@ 2024-06-03 21:50             ` Joe Damato
  0 siblings, 0 replies; 22+ messages in thread
From: Joe Damato @ 2024-06-03 21:50 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: linux-kernel, netdev, nalramli, Saeed Mahameed, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran, open list:MELLANOX MLX5 core VPI driver,
	Tariq Toukan

On Tue, Jun 04, 2024 at 12:36:16AM +0300, Tariq Toukan wrote:
> 
> 
> On 03/06/2024 22:22, Joe Damato wrote:
> > On Mon, Jun 03, 2024 at 02:11:14PM +0300, Tariq Toukan wrote:
> > > 
> 
> ...
> 
> > > 
> > > I still don't really like this design, so I gave some more thought to
> > > this...
> > > 
> > > I think we should come up with a new mapping array under priv that maps i
> > > (from real_num_tx_queues) to the matching sq_stats struct.
> > > This array would be maintained in the channels open/close functions,
> > > similarly to priv->txq2sq.
> > > 
> > > Then, we would not calculate the mapping per call, but just get the proper
> > > pointer from the array. This eases the handling of htb and ptp queues, which
> > > were missed in your txq_ix_to_chtc_ix().
> > 
> > Maybe I am just getting way off track here, but I noticed this in
> > drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c in
> > set_pflag_tx_port_ts:
> > 
> >    if (!err)
> >      priv->tx_ptp_opened = true;
> > 
> > but what if enable is false, meaning the user is disabling ptp via
> > ethtool?
> > 
> 
> This bool field under priv acts as a flag, meaning: ptp was ever opened.
> It's the same idea as max_opened_tc, but with a boolean instead of a
> numeric value.

OK, but then to avoid double counting ptp, I suppose we don't
output ptp stats in base, and only report them when the txq_ix
passed to the tx stats function refers to the PTP queue?

It's either that, or we don't include ptp's sq_stats in the
txq2sq_stats mapping you've proposed, if I understand correctly.
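
To sketch the first option (txq2sq_stats being the hypothetical
mapping from above), the relevant lines would be roughly:

  /* in mlx5e_get_queue_stats_tx(): the PTP SQ is reachable through
   * the mapping at its txq_ix, so it gets reported per queue ...
   */
  sq_stats = priv->txq2sq_stats[i];  /* may point into priv->ptp_stats.sq[] */
  stats->packets = sq_stats->packets;
  stats->bytes = sq_stats->bytes;

  /* ... and mlx5e_get_base_stats() would then not sum
   * priv->ptp_stats.sq[] at all, so those bytes aren't counted twice.
   */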

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-06-03 21:50 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
2024-05-29  3:16 [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats Joe Damato
2024-05-29  3:16 ` [RFC net-next v3 1/2] net/mlx5e: Add helpers to calculate txq and ch idx Joe Damato
2024-05-30 21:15   ` Jacob Keller
2024-06-01 11:35   ` Simon Horman
2024-06-01 11:39     ` Simon Horman
2024-06-02 19:23       ` Joe Damato
2024-05-29  3:16 ` [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats Joe Damato
2024-05-30 21:16   ` Jacob Keller
2024-06-02  9:14   ` Tariq Toukan
2024-06-02 19:22     ` Joe Damato
2024-06-03 11:11       ` Tariq Toukan
2024-06-03 18:11         ` Joe Damato
2024-06-03 19:22         ` Joe Damato
2024-06-03 21:36           ` Tariq Toukan
2024-06-03 21:50             ` Joe Damato
2024-05-31  0:11 ` [RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats Jakub Kicinski
2024-05-31  0:15   ` Joe Damato
2024-05-31  0:39     ` Jakub Kicinski
2024-05-31  0:56       ` Joe Damato
2024-05-31  1:07     ` Joe Damato
2024-05-31  1:26       ` Jakub Kicinski
2024-05-31 18:30         ` Joe Damato
