Linux RDMA and InfiniBand development
* [PATCH net-next 0/2] net/mlx5: frag buffer improvements
@ 2026-05-14 10:49 Tariq Toukan
  2026-05-14 10:49 ` [PATCH net-next 1/2] net/mlx5: use numa_mem_id() for default frag buf allocations Tariq Toukan
  2026-05-14 10:49 ` [PATCH net-next 2/2] net/mlx5: add debugfs stats for frag buf dma pools Tariq Toukan
From: Tariq Toukan @ 2026-05-14 10:49 UTC
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Nimrod Oren,
	Dragos Tatulea

Hi,

This series improves the default NUMA placement policy for mlx5
fragment buffer allocations and adds observability for their DMA
pools.

Patch 1 improves locality of default fragment buffer allocations by
using numa_mem_id() when no explicit NUMA node is requested, so that
allocations prefer the current CPU's local memory node.
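
In code terms, patch 1's one-line change boils down to the fallback
below (lifted from the diff in patch 1, shown here out of context):

	/* Before: always fall back to the first online node, which may
	 * be remote to the allocating CPU. */
	node = node == NUMA_NO_NODE ? first_online_node : node;

	/* After: prefer the nearest memory node of the current CPU. */
	node = node == NUMA_NO_NODE ? numa_mem_id() : node;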

Patch 2 adds a debugfs interface exposing per-node DMA pool usage
statistics for mlx5_frag_buf allocations, to aid debugging and give
visibility into pool utilization.

Together, these changes improve both the memory locality and the
observability of mlx5 fragment buffer allocations.

Regards,
Tariq

Nimrod Oren (2):
  net/mlx5: use numa_mem_id() for default frag buf allocations
  net/mlx5: add debugfs stats for frag buf dma pools

 .../net/ethernet/mellanox/mlx5/core/alloc.c   | 88 ++++++++++++++++++-
 include/linux/mlx5/driver.h                   |  1 +
 2 files changed, 88 insertions(+), 1 deletion(-)


base-commit: 18dc8e6d15d7a30888beec46a1e01ca0f98508fa
-- 
2.44.0



* [PATCH net-next 1/2] net/mlx5: use numa_mem_id() for default frag buf allocations
  2026-05-14 10:49 [PATCH net-next 0/2] net/mlx5: frag buffer improvements Tariq Toukan
@ 2026-05-14 10:49 ` Tariq Toukan
  2026-05-14 10:49 ` [PATCH net-next 2/2] net/mlx5: add debugfs stats for frag buf dma pools Tariq Toukan
From: Tariq Toukan @ 2026-05-14 10:49 UTC
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Nimrod Oren,
	Dragos Tatulea

From: Nimrod Oren <noren@nvidia.com>

Use the current CPU's local memory node when callers do not request a
specific NUMA node for mlx5_frag_buf allocations, instead of always
falling back to first_online_node, which may be remote to the
allocating CPU. numa_mem_id() returns the nearest node with memory,
so this also behaves correctly on systems with memoryless nodes.

Signed-off-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index f19644183828..16d6b126a486 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -305,7 +305,7 @@ int mlx5_frag_buf_alloc_node(struct mlx5_core_dev *dev, int size,
 	struct mlx5_dma_pool *pool;
 	int pool_idx;
 
-	node = node == NUMA_NO_NODE ? first_online_node : node;
+	node = node == NUMA_NO_NODE ? numa_mem_id() : node;
 
 	buf->size = size;
 	buf->npages = DIV_ROUND_UP(size, PAGE_SIZE);
-- 
2.44.0



* [PATCH net-next 2/2] net/mlx5: add debugfs stats for frag buf dma pools
  2026-05-14 10:49 [PATCH net-next 0/2] net/mlx5: frag buffer improvements Tariq Toukan
  2026-05-14 10:49 ` [PATCH net-next 1/2] net/mlx5: use numa_mem_id() for default frag buf allocations Tariq Toukan
@ 2026-05-14 10:49 ` Tariq Toukan
From: Tariq Toukan @ 2026-05-14 10:49 UTC
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Nimrod Oren,
	Dragos Tatulea

From: Nimrod Oren <noren@nvidia.com>

Add a debugfs file exposing per-node DMA pool usage for mlx5_frag_buf
allocations. For each NUMA node and pool block size, the file reports
how many blocks are currently in use and how many are backed by
allocated pool pages:

  # cat /sys/kernel/debug/mlx5/<dev>/frag_buf_dma_pools
  node  block_size  used_blocks  allocated_blocks
     0        4096            0                 0
     0        8192            0                 0
     0       16384            0                 0
     0       32768            0                 0
     0       65536            0                 0
     1        4096            0                 0
     1        8192            0                 0
     1       16384            0                 0
     1       32768            0                 0
     1       65536            0                 0
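
The two counters relate as sketched below (illustrative names, not
driver code; this mirrors the accounting in
mlx5_dma_pool_debugfs_get_stats(), where a set bit in a pool page's
bitmap marks a free block):

	allocated_blocks = nr_pool_pages * blocks_per_page;
	used_blocks      = allocated_blocks - free_blocks;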

Signed-off-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/alloc.c   | 86 +++++++++++++++++++
 include/linux/mlx5/driver.h                   |  1 +
 2 files changed, 87 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index 16d6b126a486..4fe9d7d4f143 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -38,6 +38,8 @@
 #include <linux/dma-mapping.h>
 #include <linux/vmalloc.h>
 #include <linux/nodemask.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
 #include <linux/mlx5/driver.h>
 
 #include "mlx5_core.h"
@@ -74,6 +76,13 @@ struct mlx5_frag_buf_node_pools {
 	struct mlx5_dma_pool *pools[MLX5_FRAG_BUF_POOLS_NUM];
 };
 
+struct mlx5_dma_pool_stats {
+	int node;
+	size_t block_size;
+	size_t used_blocks;
+	size_t allocated_blocks;
+};
+
 /* Handling for queue buffers -- we allocate a bunch of memory and
  * register it in a memory region at HCA virtual address 0.
  */
@@ -225,6 +234,43 @@ static void mlx5_dma_pool_free(struct mlx5_dma_pool *pool,
 	mutex_unlock(&pool->lock);
 }
 
+static void mlx5_dma_pool_debugfs_get_stats(struct mlx5_dma_pool *pool,
+					    struct mlx5_dma_pool_stats *stats)
+{
+	int blocks_per_page = BIT(PAGE_SHIFT - pool->block_shift);
+	struct mlx5_dma_pool_page *page;
+	size_t free_blocks = 0;
+	size_t pages = 0;
+
+	mutex_lock(&pool->lock);
+	list_for_each_entry(page, &pool->page_list, pool_link) {
+		pages++;
+		free_blocks += bitmap_weight(page->bitmap, blocks_per_page);
+	}
+	mutex_unlock(&pool->lock);
+
+	stats->node = pool->node;
+	stats->block_size = BIT(pool->block_shift);
+	stats->allocated_blocks = pages * blocks_per_page;
+	stats->used_blocks = stats->allocated_blocks - free_blocks;
+}
+
+static void mlx5_dma_pool_debugfs_stats_print(struct seq_file *file,
+					      struct mlx5_dma_pool *pool)
+{
+	struct mlx5_dma_pool_stats stats = {};
+
+	mlx5_dma_pool_debugfs_get_stats(pool, &stats);
+	seq_printf(file, "%4d       %5zu      %7zu           %7zu\n",
+		   stats.node, stats.block_size, stats.used_blocks,
+		   stats.allocated_blocks);
+}
+
+static void mlx5_dma_pools_debugfs_print_header(struct seq_file *file)
+{
+	seq_puts(file, "node  block_size  used_blocks  allocated_blocks\n");
+}
+
 static void
 mlx5_frag_buf_node_pools_destroy(struct mlx5_frag_buf_node_pools *node_pools)
 {
@@ -257,11 +303,46 @@ mlx5_frag_buf_node_pools_create(struct mlx5_core_dev *dev, int node)
 	return node_pools;
 }
 
+static int
+mlx5_frag_buf_dma_pools_debugfs_show(struct seq_file *file, void *priv)
+{
+	struct mlx5_core_dev *dev = file->private;
+	int node;
+
+	mlx5_dma_pools_debugfs_print_header(file);
+
+	if (!dev->priv.frag_buf_node_pools)
+		return 0;
+
+	for_each_node_state(node, N_POSSIBLE) {
+		struct mlx5_frag_buf_node_pools *node_pools;
+
+		node_pools = dev->priv.frag_buf_node_pools[node];
+		if (!node_pools)
+			continue;
+
+		for (int i = 0; i < MLX5_FRAG_BUF_POOLS_NUM; i++) {
+			struct mlx5_dma_pool *pool = node_pools->pools[i];
+
+			if (!pool)
+				continue;
+
+			mlx5_dma_pool_debugfs_stats_print(file, pool);
+		}
+	}
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(mlx5_frag_buf_dma_pools_debugfs);
+
 void mlx5_frag_buf_pools_cleanup(struct mlx5_core_dev *dev)
 {
 	struct mlx5_priv *priv = &dev->priv;
 	int node;
 
+	debugfs_remove(priv->dbg.frag_buf_dma_pools_debugfs);
+	priv->dbg.frag_buf_dma_pools_debugfs = NULL;
+
 	for_each_node_state(node, N_POSSIBLE) {
 		struct mlx5_frag_buf_node_pools *node_pools;
 
@@ -296,6 +377,11 @@ int mlx5_frag_buf_pools_init(struct mlx5_core_dev *dev)
 		priv->frag_buf_node_pools[node] = node_pools;
 	}
 
+	priv->dbg.frag_buf_dma_pools_debugfs =
+		debugfs_create_file("frag_buf_dma_pools", 0444,
+				    priv->dbg.dbg_root, dev,
+				    &mlx5_frag_buf_dma_pools_debugfs_fops);
+
 	return 0;
 }
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 8b4d384125d1..9a4bb25d8e0a 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -547,6 +547,7 @@ struct mlx5_debugfs_entries {
 	struct dentry *eq_debugfs;
 	struct dentry *cq_debugfs;
 	struct dentry *cmdif_debugfs;
+	struct dentry *frag_buf_dma_pools_debugfs;
 	struct dentry *pages_debugfs;
 	struct dentry *lag_debugfs;
 };
-- 
2.44.0


