netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, Moshe Shemesh <moshe@nvidia.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>
Subject: [net-next 09/16] net/mlx5: Add debugfs counters for page commands failures
Date: Wed,  9 Mar 2022 13:37:48 -0800	[thread overview]
Message-ID: <20220309213755.610202-10-saeed@kernel.org> (raw)
In-Reply-To: <20220309213755.610202-1-saeed@kernel.org>

From: Moshe Shemesh <moshe@nvidia.com>

Add the following new debugfs counters for debug and verbosity:
fw_pages_alloc_failed - number of pages FW requested but driver failed
to allocate.
give_pages_dropped - number of pages given to FW, but command give pages
failed by FW.
reclaim_pages_discard - number of pages which were about to reclaim back
and FW failed the command.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/debugfs.c  |  4 ++++
 .../net/ethernet/mellanox/mlx5/core/pagealloc.c    | 14 +++++++++++---
 include/linux/mlx5/driver.h                        |  3 +++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
index 8673ba2df910..d69bac93a83b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
@@ -222,6 +222,10 @@ void mlx5_pages_debugfs_init(struct mlx5_core_dev *dev)
 	debugfs_create_u32("fw_pages_total", 0400, pages, &dev->priv.fw_pages);
 	debugfs_create_u32("fw_pages_vfs", 0400, pages, &dev->priv.vfs_pages);
 	debugfs_create_u32("fw_pages_host_pf", 0400, pages, &dev->priv.host_pf_pages);
+	debugfs_create_u32("fw_pages_alloc_failed", 0400, pages, &dev->priv.fw_pages_alloc_failed);
+	debugfs_create_u32("fw_pages_give_dropped", 0400, pages, &dev->priv.give_pages_dropped);
+	debugfs_create_u32("fw_pages_reclaim_discard", 0400, pages,
+			   &dev->priv.reclaim_pages_discard);
 }
 
 void mlx5_pages_debugfs_cleanup(struct mlx5_core_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index 8855fe71d480..e0543b860144 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
@@ -352,8 +352,10 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 		if (err) {
 			if (err == -ENOMEM)
 				err = alloc_system_page(dev, function);
-			if (err)
+			if (err) {
+				dev->priv.fw_pages_alloc_failed += (npages - i);
 				goto out_4k;
+			}
 
 			goto retry;
 		}
@@ -372,14 +374,14 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 		/* if triggered by FW and failed by FW ignore */
 		if (event) {
 			err = 0;
-			goto out_4k;
+			goto out_dropped;
 		}
 	}
 	if (err) {
 		err = mlx5_cmd_check(dev, err, in, out);
 		mlx5_core_warn(dev, "func_id 0x%x, npages %d, err %d\n",
 			       func_id, npages, err);
-		goto out_4k;
+		goto out_dropped;
 	}
 
 	dev->priv.fw_pages += npages;
@@ -394,6 +396,8 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 	kvfree(in);
 	return 0;
 
+out_dropped:
+	dev->priv.give_pages_dropped += npages;
 out_4k:
 	for (i--; i >= 0; i--)
 		free_4k(dev, MLX5_GET64(manage_pages_in, in, pas[i]), function);
@@ -516,6 +520,10 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 	mlx5_core_dbg(dev, "func 0x%x, npages %d, outlen %d\n",
 		      func_id, npages, outlen);
 	err = reclaim_pages_cmd(dev, in, sizeof(in), out, outlen);
+	if (err) {
+		npages = MLX5_GET(manage_pages_in, in, input_num_entries);
+		dev->priv.reclaim_pages_discard += npages;
+	}
 	/* if triggered by FW event and failed by FW then ignore */
 	if (event && err == -EREMOTEIO)
 		err = 0;
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index c5f93b5910ed..00a914b0716e 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -575,6 +575,9 @@ struct mlx5_priv {
 	struct list_head	free_list;
 	u32			vfs_pages;
 	u32			host_pf_pages;
+	u32			fw_pages_alloc_failed;
+	u32			give_pages_dropped;
+	u32			reclaim_pages_discard;
 
 	struct mlx5_core_health health;
 	struct list_head	traps;
-- 
2.35.1


  parent reply	other threads:[~2022-03-09 21:38 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-09 21:37 [pull request][net-next 00/16] mlx5 updates 2022-03-09 Saeed Mahameed
2022-03-09 21:37 ` [net-next 01/16] net/mlx5e: TC, Fix use after free in mlx5e_clone_flow_attr_for_post_act() Saeed Mahameed
2022-03-10 22:50   ` patchwork-bot+netdevbpf
2022-03-09 21:37 ` [net-next 02/16] net/mlx5: Add command failures data to debugfs Saeed Mahameed
2022-03-09 21:37 ` [net-next 03/16] net/mlx5: Remove redundant notify fail on give pages Saeed Mahameed
2022-03-09 21:37 ` [net-next 04/16] net/mlx5: Remove redundant error " Saeed Mahameed
2022-03-09 21:37 ` [net-next 05/16] net/mlx5: Remove redundant error on reclaim pages Saeed Mahameed
2022-03-09 21:37 ` [net-next 06/16] net/mlx5: Change release_all_pages cap bit location Saeed Mahameed
2022-03-09 21:37 ` [net-next 07/16] net/mlx5: Move debugfs entries to separate struct Saeed Mahameed
2022-03-09 21:37 ` [net-next 08/16] net/mlx5: Add pages debugfs Saeed Mahameed
2022-03-09 21:37 ` Saeed Mahameed [this message]
2022-03-09 21:37 ` [net-next 10/16] net/mlx5: DR, Align mlx5dv_dr API vport action with FW behavior Saeed Mahameed
2022-03-09 21:37 ` [net-next 11/16] net/mlx5: DR, Add support for matching on Internet Header Length (IHL) Saeed Mahameed
2022-03-09 21:37 ` [net-next 12/16] net/mlx5: DR, Remove unneeded comments Saeed Mahameed
2022-03-09 21:37 ` [net-next 13/16] net/mlx5: DR, Fix handling of different actions on the same STE in STEv1 Saeed Mahameed
2022-03-09 21:37 ` [net-next 14/16] net/mlx5: DR, Rename action modify fields to reflect naming in HW spec Saeed Mahameed
2022-03-09 21:37 ` [net-next 15/16] net/mlx5: DR, Refactor ste_ctx handling for STE v0/1 Saeed Mahameed
2022-03-09 21:37 ` [net-next 16/16] net/mlx5: DR, Add support for ConnectX-7 steering Saeed Mahameed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220309213755.610202-10-saeed@kernel.org \
    --to=saeed@kernel.org \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=moshe@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).